As a web scraper, I can search Google and save results #9

Merged: 9 commits merged into develop from feature/webscraper-worker on Jan 14, 2021

@bterone (Owner) commented Jan 8, 2021

What happened

  • Add search worker to search Google
  • Update result with the response
  • Update timing to be spaced so it's less likely to be flagged as a bot

Still in progress (tracked in #14):
🚧 Create service to extract advertiser information from the response

Insight

Created a search worker using Oban, thanks to @andyduong1920's article on choosing a background job 🕺

Oban gives us greater control over job scheduling and more resilient queues.

Currently, the defaults are at most 5 concurrent jobs every 2 seconds, with only two retries.
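To make those defaults concrete, here is a rough, language-neutral sketch in Python of how the spacing and retry limits could play out. It is not the worker code from this PR and does not use Oban. The names `schedule_offsets` and `run_with_retries` are hypothetical, and it assumes the numbers mean "start up to 5 searches together, wait 2 seconds before the next batch, and retry a failed search at most twice".

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")

# Assumed reading of the defaults described above (not taken from the PR's code).
BATCH_SIZE = 5        # at most 5 searches started together
INTERVAL_SECONDS = 2  # gap between consecutive batches
MAX_RETRIES = 2       # retries allowed after the first failed attempt


def schedule_offsets(
    job_count: int,
    batch_size: int = BATCH_SIZE,
    interval: int = INTERVAL_SECONDS,
) -> List[int]:
    """Return the delay in seconds before each job should start."""
    # Jobs 0..4 start immediately, jobs 5..9 after one interval, and so on.
    return [(index // batch_size) * interval for index in range(job_count)]


def run_with_retries(task: Callable[[], T], max_retries: int = MAX_RETRIES) -> T:
    """Call task once, then retry up to max_retries times if it raises."""
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    last_error: Exception = RuntimeError("task was never attempted")
    for _ in range(1 + max_retries):
        try:
            return task()
        except Exception as error:  # a real worker would catch narrower errors
            last_error = error
    raise last_error
```

With these assumptions, 12 keywords would be spread over three batches starting 0, 2, and 4 seconds after submission, and a search that keeps failing is given up after its third attempt.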

Proof Of Work

Tests pass ✅

Saves HTML cache and our sample data
[Screenshot: Screen Shot 2021-01-12 at 10 17 00 PM]

@bterone self-assigned this Jan 8, 2021
@bterone added the backend, feature, and WIP labels Jan 11, 2021
@bterone removed the WIP label Jan 12, 2021
@bterone marked this pull request as ready for review January 12, 2021 15:19
@bterone merged commit 7f49bd8 into develop Jan 14, 2021
@bterone deleted the feature/webscraper-worker branch January 14, 2021 10:05
@bterone mentioned this pull request Feb 12, 2021
Labels: backend (Anything affecting the application functionality), feature (Something that brings value to the end-user)
3 participants