Add build_request and other helpers #409

Merged: jpmckinney merged 4 commits into master from build-request on Jun 2, 2020

jpmckinney
Member

closes #317

…Add date_range_by_year and date_range_by_month helpers to normalize date loops. Update relevant spiders.
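
To illustrate what "normalizing date loops" could look like, here is a minimal sketch of two such helpers. The signatures, the inclusive bounds, and the most-recent-first ordering are assumptions made for illustration; the actual helpers added in this pull request may differ.

```python
from datetime import date


def date_range_by_year(start, stop):
    """Return the years from stop down to start, inclusive (assumed ordering: most recent first)."""
    return reversed(range(start, stop + 1))


def date_range_by_month(start, stop):
    """Yield the first day of each month from stop's month down to start's month, inclusive."""
    year, month = stop.year, stop.month
    while (year, month) >= (start.year, start.month):
        yield date(year, month, 1)
        # Step back one month, rolling over to December of the previous year.
        month -= 1
        if month == 0:
            year, month = year - 1, 12
```

With helpers like these, a spider's `start_requests` can iterate `for year in date_range_by_year(2009, date.today().year)` rather than hand-rolling its own loop and boundary checks, so every spider treats the end points the same way.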

Other changes to specific spiders:

- australia: Use one instead of multiple date ranges
- australia_nsw: Use the same ResultsPerPage for sample mode
- chile_base: Fold get_year_month_until into start_requests to make logic easier to follow
- chile_base: Move spider-specific logic into chile_compra_records and chile_compra_releases
- chile_compra_bulk: Set the earliest start year to 2009
- colombia: Extract retry() method from parse()
- moldova: Split into multiple callbacks
- uganda_release: Let Scrapy's dupe filter skip duplicates
- canada_montreal, chile_base, mexico_administracion_publica_federal, uk_contracts_finder: Make the code more generic for future method extraction

Other small changes to multiple spiders:

- Change some samples to be more recent
- Use the same variable names consistently across spiders
- Fold class attributes into methods if used only once
- Put literal URLs on separate lines, to make reading and copying easier
- Add comments to explain `kf_filename` value
- Add comments to explain `formatter` value
- Use f-strings if the format string is not reused
- Use XPath instead of CSS selectors
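
The f-string and literal-URL conventions in the list above can be sketched as follows. The URL, template, and function names are hypothetical and are not taken from any spider in this pull request.

```python
# Hypothetical values for illustration only.
# The literal URL sits on its own line so it is easy to read and copy.
BASE_URL = 'https://example.com/api/releases'

# A format string that is reused stays a named template.
PAGE_URL = BASE_URL + '?year={year}&page={page}'


def page_urls(year, pages):
    """Build one URL per page from the shared template."""
    return [PAGE_URL.format(year=year, page=page) for page in range(1, pages + 1)]


def filename(year, page):
    """A format string used only once is written inline as an f-string."""
    return f'year-{year}-page-{page}.json'
```

Keeping the reused template as a constant avoids repeating the pattern, while the one-off f-string keeps the format next to the only place it is used.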
jpmckinney merged commit e32f339 into master on Jun 2, 2020
jpmckinney deleted the build-request branch on June 2, 2020 00:48
Successfully merging this pull request may close these issues.

Should filenames be meaningful?