Defaults to Chrome Headless #25

Merged
30 commits merged into from
Dec 14, 2021

Conversation

pws1453
Contributor

@pws1453 pws1453 commented Dec 11, 2021

As the title says, Chrome now runs headless.

@big-labor

Is there a reason to use chrome at all? Can accounts be created and applications be sent via API requests?

@pws1453
Contributor Author

pws1453 commented Dec 11, 2021

> Is there a reason to use chrome at all? Can accounts be created and applications be sent via API requests?

That's a great question. This program has used Selenium as the driver and Chrome as the browser for as long as I've been contributing. I know Firefox and Selenium also pair well. I'm sure you could accomplish something similar using GET/POST requests, but I haven't explored that yet (I'm in the middle of finals right now). You might get a better response from another contributor if you open an issue for it?

@BigweldIndustries
Contributor

I'm assuming it won't simply allow POST requests. A measure as simple as blocking those is typically already implemented.

Contributor

@joeyagreco joeyagreco left a comment


Might be nice to have a variable that can be passed in from the main driver (`if __name__ == "__main__"`), something like `debugMode`, that makes it headless when false.

This way during development, you can set it to true and still see what is happening.

Plus in the future if other features are added similar to this, they can also look at this variable to decide how to run

@joeyagreco
Contributor

to answer @big-labor 's question... usually production-level applications require some sort of authentication when sending API requests, such as a token. meaning without a valid token these requests would likely fail with a 401 status code

@pws1453 pws1453 marked this pull request as draft December 12, 2021 05:38
@pws1453
Contributor Author

pws1453 commented Dec 12, 2021

Can do. Moved to draft for the time being. Should the application default to being headless or headed? I can see pros/cons to both.

@joeyagreco
Contributor

> Can do. Moved to draft for the time being. Should the application default to being headless or headed? I can see pros/cons to both.

Headless is a lot more lightweight and allows the logs to be visible. Other than for debugging/development purposes, what reasons would you prefer headed @pws1453 ?

@pws1453
Contributor Author

pws1453 commented Dec 12, 2021

Honestly, the only other thing I could see is if someone wanted to record the applications being submitted (which while a use case, is served by the --debug option).

Additionally, I've added the parsing for the --debug option, so I'm moving this out of draft.
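
A minimal sketch of how a `--debug` flag could decide between headed and headless Chrome. This is not the code from the PR: `parse_args`, `chrome_arguments`, and the window size are hypothetical names and values chosen for illustration. Only the standard library is used, so the result is a plain list of Chrome switches; with Selenium, each one would be handed to `Options.add_argument()`.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line flags; --debug keeps the browser window visible."""
    parser = argparse.ArgumentParser(description="Hypothetical bot entry point")
    parser.add_argument(
        "--debug",
        action="store_true",
        help="run Chrome headed so you can watch what the bot does",
    )
    return parser.parse_args(argv)


def chrome_arguments(debug):
    """Return the Chrome command-line switches for the chosen mode."""
    switches = ["--window-size=1280,800"]  # assumed size, not from the PR
    if not debug:
        # Headless is the default; passing --debug turns it off.
        switches.append("--headless")
    return switches
```

Calling `chrome_arguments(parse_args().debug)` from the main driver would give other features a single place to check how the bot is running, as suggested in the review comment.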

@pws1453 pws1453 marked this pull request as ready for review December 12, 2021 06:30
pws1453 and others added 7 commits December 12, 2021 01:44
Before, in fill_out_application_and_submit(), it would just wait for a certain amount of time before continuing.

Sometimes it would continue before the page was fully loaded (SeanDaBlack#31).

I've instead added three new selenium imports and used WebDriverWait().until to wait for the page to finish loading.
- Replaced all occurrences of `print` with `printf` to remove syntax error.
- Replaced both `match`es with ifs/elifs.
defined info before the loop
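
The commit notes above replace a fixed sleep with Selenium's `WebDriverWait(...).until(...)`. The idea behind that call is a polling wait with a timeout. The sketch below is a standard-library analogue of that pattern, not the PR's code; `wait_until` and its default values are assumptions made for illustration.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)
```

Compared with a fixed sleep, this returns as soon as the page is ready and fails loudly when it never becomes ready, instead of continuing against a half-loaded page.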
@pws1453
Contributor Author

pws1453 commented Dec 12, 2021

Alright, here is where it gets fun: reCAPTCHA is trying to detect whether or not the system is automated, so we somehow need to beat it. I'll attach an image of the HTML file I was able to pull from driver.page_source while debugging (I can't upload the file itself here, sadly, but I will include it in the next commit).
[Image: Screen Shot 214]

@bolshoytoster
Contributor

There's a patched version of the chromedriver that is designed to be undetectable to things like this. You could try to implement that.

@bolshoytoster
Contributor

bolshoytoster commented Dec 12, 2021

@pws1453 I've opened the pr

@pws1453
Contributor Author

pws1453 commented Dec 12, 2021

Have tried also using firefox headless with geckodriver - no difference.

@bolshoytoster
Contributor

Interesting to note, the error occurs when you try to download the recaptcha's audio file, so if we don't find a way to get past this block we could just fall back to making the user do the normal recaptcha if the audio fails to download.

Salary is now a range i.e. 20-25 instead of just 20.

Salary now in multiples of 5.
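
One way the salary change described above could look, as a sketch only: `random_salary_range` and its bounds are hypothetical, and the only behaviour taken from the commit notes is that the result is a range such as 20-25 whose ends are multiples of 5.

```python
import random


def random_salary_range(low=15, high=40, width=5):
    """Return a range such as "20-25" whose bounds are multiples of 5.

    `low`, `high`, and `width` are assumed values and must themselves be
    multiples of 5 for the result to stay on the 5-step grid.
    """
    step = 5
    # Pick the lower bound so that the upper bound never exceeds `high`.
    start = random.randrange(low, high - width + 1, step)
    return f"{start}-{start + width}"
```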
@bolshoytoster bolshoytoster mentioned this pull request Dec 13, 2021
pws1453 and others added 2 commits December 13, 2021 14:43
Make salary more convincing
- Removed the random_email function and constants/email.py

Emails are now obtained by:
- Sending a 'get_email_address' request to guerrilla, which returns our temporary email and session id.
  I use json.loads to parse the JSON for each of the requests.

- I added 'sid' to 'fake_identity', it just makes things easier.

- After it clicks the 'CREATE_ACCOUNT_BUTTON' we start waiting for an email by checking with the 'check_email' guerrilla request.

- We check if the response contains an email, if so we fetch it.

- I didn't want to have to import a whole html parsing library just for this so I used regex to search for the passcode.

- Type the passcode into 'VERIFY_EMAIL_INPUT' then click 'VERIFY_EMAIL_BUTTON' and it's done.
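
The regex step in the notes above could be sketched as follows. The real email format is not shown in this thread, so the six-digit passcode is an assumption, and `extract_passcode` is a hypothetical helper rather than the function from the commit.

```python
import re

# Assumption: the passcode is a standalone run of exactly six digits.
PASSCODE_PATTERN = re.compile(r"(?<!\d)(\d{6})(?!\d)")


def extract_passcode(email_html):
    """Return the first six-digit code in the email body, or None."""
    # Drop tags first so digits inside attributes (widths, colours) are ignored.
    text = re.sub(r"<[^>]+>", " ", email_html)
    match = PASSCODE_PATTERN.search(text)
    return match.group(1) if match else None
```

A regex like this avoids pulling in an HTML parsing library, at the cost of breaking if the email layout or code length changes.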
@bolshoytoster
Contributor

In the recent change they made (adding email verification) they seem to have removed the recaptcha, so once we get around the email verification this problem should be solved.

@bolshoytoster
Contributor

I've removed the recaptcha code from my fork, removing some errors. If @pws1453 accepts my pr to the email-verification branch of his repo, we can merge that branch with main and the bot should work again.

pws1453 and others added 3 commits December 13, 2021 16:13
Hopefully get around the email verification
- I now use a modified version of the `random_email` function for generating emails for mail.tm

Changes to `random_email`:
- I removed the `lambda` expressions with dots in them - mail.tm strips dots from its emails.
- Doesn't pick domain from a preprogrammed list anymore - it gets the domain from mail.tm via its API

- Added a command line option, `--mailtm`. Passing it makes the bot use mail.tm first.
If you don't pass it, the bot automatically falls back to mail.tm if guerrilla
doesn't respond within 180 seconds (let me know if that's a bit much), and vice versa

mail.tm pipeline
- In `random_email`, we get the domain to use from the `/domains` endpoint
- Create the account with the `/accounts` endpoint
- Get the token used to access the account from the `/token` endpoint
...
sign up
...
- Checks inbox with the `/messages?page=1` endpoint every 1.5 seconds until the email arrives
- Fetches the email from `/messages/{id}`
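
The fallback between guerrilla and mail.tm described above can be sketched without any network calls. Everything named here is hypothetical: each provider is modelled as a callable that takes a timeout and raises `TimeoutError` if no email arrives, which is an assumption about how the real functions behave.

```python
def order_providers(guerrilla, mailtm, prefer_mailtm=False):
    """Put the preferred provider first, mirroring the --mailtm option."""
    pair = [("guerrilla", guerrilla), ("mail.tm", mailtm)]
    return list(reversed(pair)) if prefer_mailtm else pair


def fetch_with_fallback(providers, timeout=180):
    """Try each (name, fetch) pair in order and return the first result.

    `fetch(timeout)` is expected to raise TimeoutError if no email arrives
    in time; any other exception is allowed to propagate.
    """
    for name, fetch in providers:
        try:
            return name, fetch(timeout)
        except TimeoutError:
            continue  # this provider stayed silent, so try the next one
    raise TimeoutError("no provider delivered the email in time")
```

Keeping the ordering separate from the retry loop means the `--mailtm` flag only changes which provider goes first, and the "vice versa" case needs no extra code.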
@pws1453 pws1453 marked this pull request as ready for review December 14, 2021 03:33

@ghost ghost left a comment


A few minor questions, plus a mysterious bug that causes start_driver() to crash after starting the driver.

main.py (outdated)
oof.html (outdated)
main.py (outdated)
main.py
@pws1453
Contributor Author

pws1453 commented Dec 14, 2021

Newest commit should resolve above conversations, will await response before resolving

@ghost

ghost commented Dec 14, 2021

Looks good to me!

@ghost ghost merged commit f87bd8c into SeanDaBlack:main Dec 14, 2021
This pull request was closed.