feat: update database with validators report #383
Great PR! It was easy to follow what's going on with all the comments and code structure.
logging.info(f"Validation report {report_id} already exists. Terminating.")
raise Exception(f"Validation report {report_id} already exists.")
Assuming that cloud functions can be called multiple times with the same parameters, it's safe to log a warning and not raise an error. Raising an error could cause infinite loops, or at least repeated calls to the cloud function.
I did change the log from info to warning, but the thrown exception is caught at line 204:
# Generate the database entities required for the report
try:
    entities = generate_report_entities(
        version,
        validated_at,
        json_report,
        dataset_stable_id,
        session,
        feed_stable_id,
    )
except Exception as error:
    return str(error), 409  # Conflict if report already exists
Do you think it still requires refactoring of some sort?
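One possible refinement, sketched here under assumptions rather than taken from the PR: since `except Exception` maps every failure to 409, a dedicated exception type would keep the Conflict status for duplicates only. `ReportAlreadyExistsError`, `handle`, and the simplified `generate_report_entities` signature below are hypothetical stand-ins, not the PR's actual code.

```python
import logging


class ReportAlreadyExistsError(Exception):
    """Hypothetical exception raised only when a report id is already stored."""


def generate_report_entities(report_id, existing_ids):
    # Simplified stand-in for the real function: only the duplicate check is modelled.
    if report_id in existing_ids:
        logging.warning("Validation report %s already exists.", report_id)
        raise ReportAlreadyExistsError(
            f"Validation report {report_id} already exists."
        )
    return [f"entity-for-{report_id}"]


def handle(report_id, existing_ids):
    try:
        entities = generate_report_entities(report_id, existing_ids)
    except ReportAlreadyExistsError as error:
        return str(error), 409  # Conflict: the report is already stored
    except Exception as error:
        return str(error), 500  # Any other failure is not a conflict
    return f"Created {len(entities)} entities.", 200
```

With this split, an unexpected failure (for example a database error) would no longer be reported to the caller as a duplicate.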
The question is: if GCP gets a 409, will it re-send the event?
My suggestion: if the function detects that the report has already been created, don't send an error status back, since duplicated messages are expected in the cloud environment.
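A minimal sketch of that suggestion, assuming an in-memory dict in place of the real database session; `process_report` and `stored_reports` are hypothetical names used only for illustration. A duplicate delivery logs a warning and still returns a success status, so the handler is idempotent.

```python
import logging


def process_report(report_id, stored_reports):
    """Store a report once; treat a repeated delivery as a successful no-op."""
    if report_id in stored_reports:
        # Duplicate delivery: warn, but do not signal an error to the caller.
        logging.warning(
            "Validation report %s already exists. Skipping.", report_id
        )
        return f"Validation report {report_id} already exists.", 200
    stored_reports[report_id] = {"id": report_id}
    return f"Created validation report {report_id}.", 200
```

Calling it twice with the same id leaves a single stored report and returns 200 both times.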
From my understanding, the retry happens when an exception is thrown and not caught (https://cloud.google.com/functions/docs/samples/functions-tips-retry). I tested it, and the function does not seem to retry when the same request is made twice (duplicate id) -- here are the logs
Summary:
This PR introduces a new cloud function tasked with the role of updating the database with entities derived from the GTFS validation report.
Expected Behavior:
Upon completion of the JSON reports' upload to the mobilitydata-feeds-[dev | qa | prod] bucket by the GTFS web validator, this function is triggered via the gtfs_validator_execution workflow. This ensures that the database is promptly updated with the latest GTFS validation results.
Testing Tips:
gtfs_validator_execution workflow as outlined in the testing tips of feat: execute the GTFS Validator after downloading a dataset #342.
Please make sure these boxes are checked before submitting your pull request - thanks!
./scripts/api-tests.sh to make sure you didn't break anything