100% break rate #20

Closed
PierreBarre opened this Issue Dec 18, 2015 · 12 comments

PierreBarre commented Dec 18, 2015

Hello,

I made a script that breaks visualCaptcha with a 100% success rate.
Of course, this requires first building a database of the images used by the website. This can be time-consuming, but most websites that use visualCaptcha stick with the default images anyway.

https://github.com/PierreBarre/visualCaptchaBreaker
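
The database approach described above can be sketched roughly as follows. This is not the code from the linked repository; it is a minimal illustration that assumes each image has already been decoded into a flat list of 0-255 components, and the names `DATABASE`, `identify`, and `solve` are hypothetical.

```python
def diff_percentage(pixels_a, pixels_b):
    """Mean absolute difference between two equal-length pixel lists, in percent."""
    if len(pixels_a) != len(pixels_b):
        raise ValueError("images must have the same number of components")
    total = sum(abs(a - b) for a, b in zip(pixels_a, pixels_b))
    return total / 255.0 * 100 / len(pixels_a)


def identify(challenge, database):
    """Return the label of the database image closest to the challenge image."""
    return min(database, key=lambda label: diff_percentage(challenge, database[label]))


def solve(challenge_images, wanted_label, database):
    """Return the index of the first challenge image identified as wanted_label."""
    for index, pixels in enumerate(challenge_images):
        if identify(pixels, database) == wanted_label:
            return index
    return None


# Hypothetical 2x2 grayscale "images", flattened to four components each.
DATABASE = {
    "cat": [0, 0, 255, 255],
    "tree": [255, 0, 255, 0],
    "car": [10, 200, 30, 90],
}

if __name__ == "__main__":
    challenge = [[254, 1, 255, 0], [12, 198, 30, 91], [0, 0, 250, 255]]
    print(solve(challenge, "car", DATABASE))
```

Picking the closest match rather than an exact one tolerates small pixel differences, at the cost of comparing the challenge against every stored image.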

Collaborator

BrunoBernardino commented Dec 18, 2015

Hi @PierreBarre thanks for submitting this!

Wow. Yeah, one of the points of visualCaptcha is to let people customize the images, their names, etc., but most indeed don't change them.

One thing I'd note is that the script would need tweaks for each website you encounter. You're assuming it's always a PHP setup with the default session name (though you'd only need to look up the session cookie's name).

I'm more curious about the resources needed to run those visual comparisons. The images are small, definitely, but still, could you easily/quickly do 1,000 requests, for example?

Anyway, what would you say the solution for this is? I've thought about creating a "generator" which would easily allow people to define questions and images for download and setup, but it would require people to act on it, which I'm not so sure they'd do.

PierreBarre commented Dec 18, 2015

Hi @BrunoBernardino !

Indeed, in its present state the script would have to be adapted to every website we want to attack, but we could just send back all the cookies we received, so we don't have to work out which cookie is the session one.

The CPU consumption issue is a good point; I hadn't thought much about that.
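
As a rough illustration of sending every cookie back, here is a minimal sketch using only the standard library. The function name `echo_all_cookies` and the sample header values are hypothetical; a real client would more likely rely on a cookie jar such as `http.cookiejar` than build the header by hand.

```python
from http.cookies import SimpleCookie


def echo_all_cookies(set_cookie_headers):
    """Build a Cookie header value that returns every cookie the server set."""
    jar = SimpleCookie()
    for header in set_cookie_headers:
        # Each Set-Cookie header contributes one name/value pair;
        # attributes such as Path or HttpOnly are parsed but not echoed.
        jar.load(header)
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


if __name__ == "__main__":
    received = ["PHPSESSID=abc123; Path=/", "lang=fr; Path=/; HttpOnly"]
    print(echo_all_cookies(received))
```

Because every cookie is returned, the client never needs to know which name the server uses for its session.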

I benchmarked using this script:

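import multiprocessing

from PIL import Image


def image_diff_percentage(i):
    """Return the mean absolute difference between 0.png and 1.png, in percent."""
    # The index i is unused; it only gives each pool task an argument.
    image1 = Image.open('0.png')
    image2 = Image.open('1.png')

    pairs = zip(image1.getdata(), image2.getdata())

    bands = len(image1.getbands())
    if bands == 1:
        # Single-band image: each pixel is a plain integer.
        dif = sum(abs(p1 - p2) for p1, p2 in pairs)
    else:
        # Multi-band image: each pixel is a tuple of components.
        dif = sum(abs(c1 - c2) for p1, p2 in pairs for c1, c2 in zip(p1, p2))

    # Normalize by the number of components actually compared.
    ncomponents = image1.size[0] * image1.size[1] * bands
    return (dif / 255.0 * 100) / ncomponents


if __name__ == '__main__':
    pool = multiprocessing.Pool()
    for i in range(100000):
        pool.apply_async(image_diff_percentage, (i,))
        # apply_async returns immediately, so this reports queued tasks.
        print('queued comparison {}'.format(i))
    pool.close()
    pool.join()
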
Results

It took ~7 seconds for an Intel® Xeon® E3 1220 v2 to compare 100,000 pairs of images (about 14,000 per second). I assume, then, that the bottleneck is more likely the bandwidth and computing power allocated to the website itself.

I can think of two solutions:

  • Allowing multiple words to describe one image, which would add some pain for the attacker.
  • Adding much more noise to the images, which would remove the ability to compare them.
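
The first idea could look something like the sketch below. It does not reflect how visualCaptcha stores its images; `CATALOGUE`, `pick_question`, and `is_correct` are hypothetical names, and the word lists are made up for illustration.

```python
import random

# Hypothetical catalogue: each image id maps to several accepted words.
CATALOGUE = {
    "img_01": ["car", "automobile", "vehicle"],
    "img_02": ["tree", "plant"],
    "img_03": ["cat", "kitten", "feline"],
}


def pick_question(catalogue, rng=random):
    """Choose a target image and one of its words at random."""
    image_id = rng.choice(sorted(catalogue))
    word = rng.choice(catalogue[image_id])
    return image_id, word


def is_correct(catalogue, chosen_image_id, word):
    """Check the user's pick against every word accepted for that image."""
    return word in catalogue.get(chosen_image_id, [])


if __name__ == "__main__":
    target, word = pick_question(CATALOGUE)
    print("Click the image showing: {}".format(word))
    print(is_correct(CATALOGUE, target, word))
```

With this layout an attacker's database has to cover every word for every image, not just one label per image.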

The reason I don't like your solution is that it sounds a bit like "use visualCaptcha only if you don't need a captcha; if you need one, someone will break it". In my opinion, a captcha solution should be secure by default.
Furthermore, relying only on the attacker's laziness, or on the attack not being worth its cost, is not sufficient.

Collaborator

BrunoBernardino commented Dec 18, 2015

If you need a lot of security, ReCaptcha is a pretty good choice. visualCaptcha is not competing on security, but on usability.

Google has done something very nice with the checkbox thing, but it doesn't always show up. Eventually I imagine visualCaptcha won't be necessary, but the need for focusing on UX is real.

As for adding noise in the images, it defeats the purpose UX-wise.

Using more than one word to describe an image is a pretty good idea.

I've been thinking of a v6, where there would be no images, only the audio option, which is accessible by default and much harder to bypass. Coupled with some app to generate the audio.json and the audio files.

That would definitely break your app, but I'm wondering if an audio-only captcha is something people would enjoy using... Maybe if the question is typed as well it would work?

What do you think?

PierreBarre commented Dec 18, 2015

I understand that providing a lot of security was not the point of visualCaptcha. But I think there is a world of difference between a break rate of a few percent and a 100% break rate.

You are right about the UX, but not making this compromise would mean visualCaptcha is something fancy that bothers real users while doing nothing against bots.

I believe an audio-only captcha would be a bad thing. On the move in public transport, or just in the presence of other people, it's not always possible to play sounds without disturbing others. Also, not everyone has the hardware required to listen to sounds. It's a matter of user convenience too: if someone is listening to music while trying to register on your website, I don't think they would be pleased to have to pause it first to hear your audio. Moreover, it's really terrible in terms of accessibility; deaf people would not be able to use the solution at all.

A written question is pretty weak too if the attacker is going after a specific target.

Collaborator

BrunoBernardino commented Dec 18, 2015

I see, that makes sense.

Maybe there's no place in the world for a new visualCaptcha version anymore, now that Google has the checkbox thing.

PierreBarre commented Dec 18, 2015

I would not say there is no place for another solution just because reCaptcha exists.

There are a few issues with running reCaptcha anyway. Here is a short list that comes to mind; I might have missed a few points:

  • You don't control who can and can't see the captcha (a few countries or networks may not like Google too much).
  • You share a lot of information with Google.
  • You don't control the availability (OK, that's less of an issue with Google, but still).
  • You don't know when the product will reach its end of life (hello Google Reader and the tons of services Google has discontinued).
  • You don't know whether Google will kick you out some day for some reason.
  • You may disagree with Google's Terms of Service.

The new version of reCaptcha is very good UX-wise: it tries to minimise the impact on real users while being a real pain for attackers. It's quite possible to build a similar product that accomplishes the same goals with relatively little effort.

Collaborator

BrunoBernardino commented Dec 19, 2015

Ok, let me play devil's advocate here:

  • Google might be blocked, but not reCaptcha
  • Right. And the new method actually makes this worse
  • I'd say it's a non-issue :)
  • It's spread through way too many places for Google to just shut it down like that. I think.
  • Hardly happens, though it's not false
  • Yeah, but... have you read them? :)

So. I'm left with a bit of a tough choice...

  1. Launching a new version will be very time-consuming (updating all libraries et al.), and I will not have the time to make that kind of effort in the next 6 months.
  2. I don't see an obvious way to keep a great UX and substantially improve security over what's there now without relying on a SaaS-like model.
  3. While there are no great alternatives to reCaptcha (funCaptcha looks interesting, but suffers from issues similar to the ones you presented above), it serves the need most people have. If you're technical, there are many ways you can create a custom plan for avoiding bots on your apps. There are also many options to install and customize, just like visualCaptcha.

It seems to me it does not make sense to launch a new visualCaptcha version.

Thoughts?

PierreBarre commented Dec 19, 2015

  • Since the reCaptcha scripts are served from the google.com domain and run on Google's IP ranges, the distinction would be quite difficult to make.
  • Indeed.
  • Google has already discontinued very popular services; I wouldn't bet too much on this.
  • Of course.
  • Better to do so, if you don't want to be the victim of the previous point :-)

Unfortunately, I think you're right. visualCaptcha is appealing because it's not too much of a pain for a user to complete, but if it's easy for a human, it's even easier here for a robot.

I don't think you can really enhance visualCaptcha any further security-wise (because security through obscurity is not a solution); the product suffers from its inherent design.

Like you said, doing something that is easy for the user while being complicated for the machine would be hard without a SaaS model. So, no, I don't think it would be worth building another version of visualCaptcha; it would probably run into one of the points mentioned above, even with a new design.

Collaborator

BrunoBernardino commented Dec 19, 2015

  • Right, I haven't looked at how it works in a long time; it used to be on reCaptcha's own domain.

So, I suppose I'll close this, then.

Let me know if you think of something that could solve this!

CrazyPython commented Jul 11, 2016

@BrunoBernardino if we have a 100% break rate, then why have a CAPTCHA at all? It's pretty much security through obscurity. You could use real world images and only prompt when the user appears suspicious.

Collaborator

BrunoBernardino commented Jul 11, 2016

@CrazyPython I'm happy to review and accept a PR with such capabilities, or link to such a project. At this moment I'm not in a position to completely rework visualCaptcha, as it remains a valid option when reCaptcha isn't used (reCaptcha being, to me, the much better option in their current state).

CrazyPython commented Jul 12, 2016

@BrunoBernardino This issue is a manifestation of my theoretical problem.
