Migrate to Plucker: Remove Scrapy, use Apify #1285

Merged: honzajavorek merged 10 commits into main from honzajavorek/use-apify on Jan 23, 2024

@honzajavorek commented Jan 23, 2024

  • remove proxies
  • remove scraping and Scrapy
  • load jobs from Apify
  • cache Apify datasets locally, for at least an hour if nothing more elaborate
  • Do we need "first_seen_on" at all? (plucker#12) Can be done later.
  • process legacy_jobs (needs to be done after implementing LLMs)
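
The caching item above could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `load_dataset`, `fetch`, `cache_dir`, and the JSON file layout are all hypothetical names, and the actual Apify download is left to whatever callable is passed in as `fetch`.

```python
import json
import time
from pathlib import Path
from typing import Callable

CACHE_TTL_SECONDS = 60 * 60  # one hour, matching the checklist item


def load_dataset(
    name: str,
    fetch: Callable[[], list[dict]],
    cache_dir: Path,
    ttl: float = CACHE_TTL_SECONDS,
    now: Callable[[], float] = time.time,
) -> list[dict]:
    """Return dataset items, reusing a local JSON copy younger than `ttl`.

    `fetch` is a hypothetical stand-in for the code that downloads
    the dataset items from Apify.
    """
    cache_path = cache_dir / f"{name}.json"

    # Cache hit: the file exists and was written less than `ttl` seconds ago.
    if cache_path.exists() and now() - cache_path.stat().st_mtime < ttl:
        return json.loads(cache_path.read_text(encoding="utf-8"))

    # Cache miss or stale copy: download again and overwrite the local file.
    items = fetch()
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(items), encoding="utf-8")
    return items
```

Using the file's modification time as the timestamp keeps the cache to a single file per dataset. The `now` parameter exists only so that expiry can be exercised without waiting an hour.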

@honzajavorek changed the title from "Use Apify" to "Remove Scrapy, use Apify" Jan 23, 2024
@honzajavorek changed the title from "Remove Scrapy, use Apify" to "Migrate to Plucker: Remove Scrapy, use Apify" Jan 23, 2024
Part of the migration from local Scrapy to Plucker and Apify.
@honzajavorek marked this pull request as ready for review January 23, 2024 13:39
@honzajavorek merged commit febc7df into main Jan 23, 2024
1 check was pending
@honzajavorek deleted the honzajavorek/use-apify branch January 23, 2024 15:51