Option to make validation errors ERRORs in job #50
Comments
What do you mean by more obvious? Do you have a scenario in mind, and an idea of how to improve this?
Thanks, forgot about these tickets! Basically, right now even if there are dozens of validation errors, the job will finish with "0 errors" on the Scrapy Cloud page. In practice this makes it easy to think a job was okay when it actually had lots of data problems.
@andrewbaxter can you test this again with the master branch?
@andrewbaxter @raphapassini @rennerocha I also believe that is the case with the latest version of spidermon. Does that mean #50 here and #47 can be closed?
Yep, this looks good to me! Thanks!
I guess I'll go ahead and close it.
Sorry @Gallaecio :( I was looking at the wrong logs... Here's what my logs look like when I get a validation error currently:
My settings are:
I also tried without ADD_ERRORS_TO_ITEMS, but in that case I get neither an ERROR nor a warning... in fact, it's not clear what the validation does here other than increase a job stat.
@andrewbaxter What kind of error do you want here?
@andrewbaxter If you want to get a warning on items, you can add the following setting
I think this was more about how the validation errors are logged/interact with the logging system. Ideally, I'd like to see
and also have the errors appear in the logs with log level ERROR. That way, say, in Scrapy Cloud, if you look at a job in the dashboard it will show "1 error" with a link, and when you click on it, it will take you to the logs page with the ERROR filter already applied. Right now, even if there are lots of validation issues, the job appears as finished with 0 errors.
@andrewbaxter Maybe we can convert the warning into an error with the above setting?
In my opinion, when a monitor fails, it includes an error line in the logs, so I'd create a monitor that fails if Enabling
@rennerocha I agree. So should we close this issue?
I think the severity of incorrect items depends on the project, and in projects where data quality needs to be high it's as significant an issue as any other error. How about making whether it's an error or warning an option?
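To make the suggestion concrete, here is a rough sketch of what such an option could look like as a project-side item pipeline. This is not something spidermon provides: the class name and the `VALIDATION_ERRORS_LOG_LEVEL` setting are made up for illustration, and it assumes the validation errors have already been added to the item under the `_validation` key by an earlier pipeline.

```python
import logging

logger = logging.getLogger(__name__)


class ValidationErrorLoggingPipeline:
    """Hypothetical pipeline: log items that carry validation errors.

    The log level is configurable, so a project can decide whether
    invalid items count as ERROR or only as WARNING.
    """

    def __init__(self, level=logging.ERROR, errors_field="_validation"):
        self.level = level
        self.errors_field = errors_field

    @classmethod
    def from_crawler(cls, crawler):
        # VALIDATION_ERRORS_LOG_LEVEL is an invented setting name (assumption).
        name = crawler.settings.get("VALIDATION_ERRORS_LOG_LEVEL", "ERROR")
        # Fall back to ERROR if the configured name is not a known level.
        level = getattr(logging, str(name).upper(), logging.ERROR)
        return cls(level=level)

    def process_item(self, item, spider=None):
        # Items without the errors field (or with an empty one) pass silently.
        errors = item.get(self.errors_field)
        if errors:
            logger.log(self.level, "Item failed validation: %s", dict(errors))
        return item
```

It would have to run after the validation pipeline, so that `_validation` is already populated when `process_item` is called. Since the message goes through the standard logging system at the chosen level, an ERROR here should be counted like any other error log line.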
@rennerocha @raphapassini Do we need a change here?
Validation failures should not be logged as errors and added to If we want a monitor to check the stats and then raise an error in the logs if it fails, we have the built-in monitor ItemValidationMonitor that can be easily included into the project. For more sophisticated validations, a custom monitor should be created.
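As a rough illustration of the kind of check a custom monitor would perform, the sketch below is written as a plain function so that it runs without spidermon installed. In a real monitor the same logic would sit in a test method that reads the job stats. The function name, the threshold parameter, and the stat key are all assumptions made for this example, so check them against the stats your jobs actually report.

```python
def check_validation_errors(stats, max_errors=0,
                            stat_key="spidermon/validation/fields/errors"):
    """Raise AssertionError if the job reported too many validation errors.

    `stats` is a plain dict of job stats. `stat_key` is an assumed name
    for the validation error counter, not a confirmed one.
    """
    errors = stats.get(stat_key, 0)
    if errors > max_errors:
        raise AssertionError(
            f"{errors} validation errors found (maximum allowed: {max_errors})"
        )
    return errors
```

Called with the stats of a finished job, this either returns the error count or fails loudly, which is the behaviour that makes a failed monitor show up as an error line in the logs.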
At the moment validation errors can be added to the items in the _validation key, but there doesn't seem to be any way to make it more obvious that some items are incorrect.