Improve parsing failure logging #167

Merged
merged 1 commit into from Jul 4, 2019

@rgreinho rgreinho commented Jul 4, 2019

Types of changes

  • New feature (non-breaking change which adds functionality)
  • Code cleanup / Refactoring

Description

Logs more details about the fields that could not be parsed correctly.

The logs now list, for each report, the specific fields that failed to parse:

± scrapd -vv --pages 3 --format count
2019-07-04T12:30:44-0500 scrapd.core.apd:862  Retrieving fatalities from 0001-01-01 to 9999-12-31.
2019-07-04T12:30:44-0500 scrapd.core.apd:867  Fetching page 1...
2019-07-04T12:30:44-0500 scrapd.core.apd:878  4 fatality page(s) to process.
2019-07-04T12:30:46-0500 scrapd.core.apd:809  Fatality report http://austintexas.gov/news/traffic-fatality-36-3 was not parsed correctly:
	 * could not retrieve the location
2019-07-04T12:30:46-0500 scrapd.core.apd:809  Fatality report http://austintexas.gov/news/traffic-fatality-37-4 was not parsed correctly:
	 * could not retrieve the location
2019-07-04T12:30:47-0500 scrapd.core.apd:809  Fatality report http://austintexas.gov/news/traffic-fatality-35-3 was not parsed correctly:
	 * could not retrieve the location
2019-07-04T12:30:47-0500 scrapd.core.apd:911  4 fatality page(s) is/are within the specified time range.
2019-07-04T12:30:47-0500 scrapd.core.apd:867  Fetching page 2...
2019-07-04T12:30:47-0500 scrapd.core.apd:878  9 fatality page(s) to process.
2019-07-04T12:30:47-0500 scrapd.core.apd:809  Fatality report http://austintexas.gov/news/traffic-fatality-34-4 was not parsed correctly:
	 * could not retrieve the location
2019-07-04T12:30:47-0500 scrapd.core.apd:809  Fatality report http://austintexas.gov/news/traffic-fatality-31-3 was not parsed correctly:
	 * could not retrieve the location
2019-07-04T12:30:47-0500 scrapd.core.apd:809  Fatality report http://austintexas.gov/news/traffic-fatality-30-1 was not parsed correctly:
	 * could not retrieve the location
2019-07-04T12:30:47-0500 scrapd.core.apd:809  Fatality report http://austintexas.gov/news/traffic-fatality-25-update was not parsed correctly:
	 * could not retrieve the location
2019-07-04T12:30:47-0500 scrapd.core.apd:809  Fatality report http://austintexas.gov/news/traffic-fatality-28-3 was not parsed correctly:
	 * could not retrieve the location
2019-07-04T12:30:47-0500 scrapd.core.apd:809  Fatality report http://austintexas.gov/news/traffic-fatality-33-5 was not parsed correctly:
	 * could not retrieve the location
2019-07-04T12:30:47-0500 scrapd.core.apd:809  Fatality report http://austintexas.gov/news/traffic-fatality-32-5 was not parsed correctly:
	 * could not retrieve the location
2019-07-04T12:30:47-0500 scrapd.core.apd:809  Fatality report http://austintexas.gov/news/traffic-fatality-27-4 was not parsed correctly:
	 * could not retrieve the location
2019-07-04T12:30:47-0500 scrapd.core.apd:809  Fatality report http://austintexas.gov/news/traffic-fatality-29-3 was not parsed correctly:
	 * could not retrieve the location
2019-07-04T12:30:47-0500 scrapd.core.apd:911  9 fatality page(s) is/are within the specified time range.
2019-07-04T12:30:47-0500 scrapd.core.apd:867  Fetching page 3...
2019-07-04T12:30:48-0500 scrapd.core.apd:878  9 fatality page(s) to process.
2019-07-04T12:30:53-0500 scrapd.core.apd:809  Fatality report http://austintexas.gov/news/traffic-fatality-26-3 was not parsed correctly:
	 * could not retrieve the location
2019-07-04T12:30:54-0500 scrapd.core.apd:809  Fatality report http://austintexas.gov/news/traffic-fatality-22-3 was not parsed correctly:
	 * could not retrieve the location
2019-07-04T12:30:54-0500 scrapd.core.apd:809  Fatality report http://austintexas.gov/news/traffic-fatality-21-3 was not parsed correctly:
	 * could not retrieve the location
2019-07-04T12:30:54-0500 scrapd.core.apd:809  Fatality report http://austintexas.gov/news/traffic-fatality-25-4 was not parsed correctly:
	 * could not retrieve the deceased information
	 * could not retrieve the location
	 * no deceased information found in fatality page
	 * age is invalid: None
2019-07-04T12:30:54-0500 scrapd.core.apd:809  Fatality report http://austintexas.gov/news/traffic-fatality-18-4 was not parsed correctly:
	 * could not retrieve the location
2019-07-04T12:30:54-0500 scrapd.core.apd:809  Fatality report http://austintexas.gov/news/traffic-fatality-23-3 was not parsed correctly:
	 * could not retrieve the location
2019-07-04T12:30:54-0500 scrapd.core.apd:809  Fatality report http://austintexas.gov/news/traffic-fatality-24-5 was not parsed correctly:
	 * could not retrieve the location
2019-07-04T12:30:54-0500 scrapd.core.apd:809  Fatality report http://austintexas.gov/news/traffic-fatality-19-5 was not parsed correctly:
	 * could not retrieve the location
2019-07-04T12:30:54-0500 scrapd.core.apd:809  Fatality report http://austintexas.gov/news/traffic-fatality-20-4 was not parsed correctly:
	 * could not retrieve the location
	 * age is invalid: -1
2019-07-04T12:30:54-0500 scrapd.core.apd:911  9 fatality page(s) is/are within the specified time range.
2019-07-04T12:30:54-0500 scrapd.cli.cli:87   Total: 21
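
For readers who want to see how a message of this shape could be produced, here is a minimal sketch. It is not the code from this PR: the function names (`format_parsing_errors`, `validate_fatality`, `log_parsing_errors`), the field keys, the validation rules, and the log level are all assumptions made for illustration. Only the message wording is taken from the output above.

```python
import logging

# Hypothetical logger name; the real project logs from scrapd.core.apd.
logger = logging.getLogger("parsing_sketch")


def format_parsing_errors(url, errors):
    """Build one multi-line message listing every field that failed to parse."""
    details = "".join(f"\n\t * {error}" for error in errors)
    return f"Fatality report {url} was not parsed correctly:{details}"


def validate_fatality(fields):
    """Return the problems found in a parsed record.

    The checks below are assumptions inferred from the log output,
    not the project's actual validation rules.
    """
    errors = []
    if not fields.get("location"):
        errors.append("could not retrieve the location")
    age = fields.get("age")
    if age is None or age < 0:
        errors.append(f"age is invalid: {age}")
    return errors


def log_parsing_errors(url, fields):
    """Log a single entry per report, and only when something went wrong."""
    errors = validate_fatality(fields)
    if errors:
        # The level is an assumption; the output above was captured with -vv.
        logger.debug(format_parsing_errors(url, errors))
    return errors
```

Collecting the errors into a list first and logging once per report keeps each URL next to all of its failures, which matches the grouped entries in the output above.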

Checklist:

  • [] I have updated the documentation accordingly
  • [] I have written unit tests

Fixes #152

@rgreinho rgreinho self-assigned this Jul 4, 2019
@rgreinho rgreinho merged commit 84ef3c3 into scrapd:master Jul 4, 2019
9 checks passed