[FEATURE]: Captcha Cracking for Resy API #28

Open
1 task done
21Bruce opened this issue Jan 18, 2024 · 5 comments
Assignees
Labels
enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@21Bruce
Owner

21Bruce commented Jan 18, 2024

Is there an existing issue for this feature?

  • I have searched for this issue

Description of the problem

With the increasing prevalence of bot solutions, Resy has started using captcha on higher-demand reservations. If this pattern continues, the critical factor in making reservations will most likely be the ability to complete a captcha. This is a blessing in disguise. First, there are well-documented machine learning algorithms that can classify captchas with very high accuracy. Second, since most of the bots out there are not actively maintained, they'll be useless compared to a bot that handles captcha. Third, and even better, an ML captcha system is probably far faster than the vast majority of people trying to solve a captcha.

Despite these opportunities, implementing captcha handling accurately is tough. We first have to decipher the networking calls, which is hard because we have to trigger a captcha event in the browser that we can monitor and study, and these captchas appear very infrequently. I'd assume they appear on harder reservations, which makes it harder for us to reproduce them and test our implementation. Furthermore, once we have the networking calls down, there are a number of captcha tasks. Some involve typing letters and numbers, some involve selecting photos, and others involve checking a box and then moving the mouse in such a way that Google thinks a human is behind the IO. We will need separate algorithms for each of these tasks.

Planned Solution

Add the networking checks and calls to Resy's reserve function, and create a separate top-level package for the ML code.

Alternatives

None really, maybe somehow displaying the captcha to the user in the terminal, but that seems pathological.

Solution Specifics

There are a few papers online about breaking captchas. For analyzing network calls, we'll start with the common Firefox/Postman method and then modify it if that doesn't work.

@21Bruce added the enhancement, pending, and help wanted labels and removed the pending label on Jan 18, 2024
@21Bruce
Owner Author

21Bruce commented Jan 19, 2024

Confirmed that Resy is using captcha by analyzing the homepage's source. Specifically, it is reCAPTCHA from Google. More analysis of the booking page leads me to believe that v3 is in use.

@tshamz

tshamz commented Jan 23, 2024

Hi! I've been stalking this project for a few months now, and I'd like to possibly contribute by taking a crack at this.

I see a branch has been cut already, but I don't see any work committed to it yet. Not sure if you've started on this and just haven't pushed any of your changes, or if that branch is just a placeholder for work to land eventually, but if this is unclaimed I'd be happy to take a stab at it.

Lmk if the above comments are missing any additional information that might be good to know, like a specific idea or direction for how this should be accomplished/implemented, or any other findings that might be relevant.

@21Bruce
Owner Author

21Bruce commented Jan 23, 2024

@tshamz We'd be happy to review any code contributions you have. However, I have a few caveats. The contribution etiquette document is badly outdated: I wrote it (when this project was private) for new people I recruited IRL. Some of the information is still relevant, for example that I'd like solutions that do not use third-party libraries and that keep the documentation reasonably up to date, but the part about creating branches is not, since not everyone has write access to the source code repo. The way we review code from non-contributors now is via diffs sent in the comments of the GitHub issue (like this one), or pull requests from forks. So you're welcome to take a stab at any part of the bot and send a diff in the right issue thread (if there isn't an issue thread, you can create one and I'd be happy to review that as well) or make a PR to the right issue branch.

Finally, and hopefully this is not taken as discouragement, this specific problem is decently complicated for a first issue and, relative to the knowledge required to make the original bot, incredibly convoluted. It requires some theoretical background in machine learning, specifically enough to understand reinforcement learning, and some more advanced knowledge of HTTP networking. If you do decide to send diffs, perhaps the best strategy would be to pick a problem that needs less background knowledge, since you haven't done any dev work on the bot yet. Then, once you are familiar with the bot internals, and hopefully the new branch is up to speed, you can judge what the right level of challenge is for you. We will be posting relevant information on the dev process of this issue here, much like issue #7. In the meantime, if you want a list of good first issues, here are a few, in order of importance:

  1. Adding table options. We've gotten a request from a user for this specific feature. We'd want it to be something like a command line option specifying indoor vs outdoor etc. This would require effort at the networking layer as well, though I'd imagine not a significant amount of work.

  2. Simplifying the string manipulation code in api/resy/api.go. Specifically, I'd like to see what it looks like with fmt.Sprintf; I think it'd be way simpler.

  3. Integrating the OpenTable API with the rest of the app. Currently the OpenTable API works as a standalone thing, like if you wanted to write a hardcoded program to interact with OpenTable. The issue is that we don't have the right semantics to handle the differences between logins from OpenTable and Resy such that one could integrate this really easily. This isn't super important since we've received no real requests for this from users.
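For item 2, here is a minimal sketch of what the fmt.Sprintf refactor could look like. The function names, the query parameters (`day`, `party_size`), and the string layout are all hypothetical and are not taken from api/resy/api.go; the point is only to compare manual concatenation against a single format string.

```go
package main

import (
	"fmt"
	"strconv"
)

// pad2 left-pads a number to two digits, as manual concatenation requires.
func pad2(n int) string {
	if n < 10 {
		return "0" + strconv.Itoa(n)
	}
	return strconv.Itoa(n)
}

// buildQueryConcat is a hypothetical "before" version that assembles the
// string piece by piece with + and strconv.
func buildQueryConcat(year, month, day, partySize int) string {
	return "day=" + strconv.Itoa(year) + "-" + pad2(month) + "-" + pad2(day) +
		"&party_size=" + strconv.Itoa(partySize)
}

// buildQuerySprintf is the "after" version: one format string, with %02d
// handling the zero padding that pad2 did by hand.
func buildQuerySprintf(year, month, day, partySize int) string {
	return fmt.Sprintf("day=%d-%02d-%02d&party_size=%d", year, month, day, partySize)
}

func main() {
	fmt.Println(buildQueryConcat(2024, 1, 9, 2))
	fmt.Println(buildQuerySprintf(2024, 1, 9, 2))
}
```

Both functions produce the same string, so a refactor along these lines should be checkable with a simple equality test before swapping one for the other.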

And as a very last note: no issue regarding the bot is really 'claimed' by any group. You can submit work to any issue and we'll look at it. The issue threads that are up right now are outdated; I'll remove a decent number of them today.

@21Bruce 21Bruce pinned this issue Jan 26, 2024
@chanyk-joseph

Why don't we just use a 3rd party vendor for solving reCAPTCHA?
Even if we solve it this time, reCAPTCHA is continuously evolving its anti-bot measures (especially since this is an open source project).

To avoid spending endless effort on the task, I would suggest relying on a 3rd party solver solution.

@21Bruce
Owner Author

21Bruce commented May 28, 2024

@chanyk-joseph Yeah, no. I'm not paying monthly for the bot to use a click farm in some third world country so you can get a reservation. As you mention, it's open source, so if you want, you can pay for a third party captcha solver, clone the repo, and add it to the bot for your individual copy...
