feat: Lighten end_to_end.yml workflow #1243

Merged
maximearmstrong merged 16 commits into master from issue/1233/lighten-end-to-end-workflow
Aug 30, 2022


Contributor

@maximearmstrong maximearmstrong commented Aug 18, 2022

Summary:

Closes #1233 and #1206

This PR lightens the end-to-end workflow. It concentrates the end-to-end workflow efforts in a single file and samples GTFS Schedule sources from the Mobility Database catalogs. This way, each time the workflow is run, the snapshot validator is tested using different sources.

Using the latest URLs from the Mobility Database fixes issue #1206, since the SSL error was raised when using agency or transitfeeds URLs.

Changes:

  • The end_to_end_big.yml and end_to_end_100.yml workflows are removed so that everything about the end-to-end workflow happens in end_to_end.yml.
  • end_to_end.yml is refactored to mimic the behavior of the acceptance_test.yml workflow. The artifacts are named similarly to those in the acceptance test workflow.
  • harvest_latest_versions.py is updated to allow sampling. 5% of the GTFS Schedule sources in the database are used for the end-to-end workflow (excluding sources requiring authentication for simplicity).
  • queue_runner.sh is updated to add a flag. If the flag is set to true, the queue runner validates the datasets using both the snapshot and master validators; otherwise it uses only the snapshot validator.
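
As a rough illustration of the sampling described in the list above, here is a minimal sketch. It is not the actual code in harvest_latest_versions.py: the record keys (`data_type`, `requires_authentication`), the function name `sample_sources`, and the in-memory catalog are all assumptions made for the example.

```python
import random


def sample_sources(sources, fraction=0.05, seed=None):
    """Return a random sample of GTFS Schedule sources that need no authentication.

    `sources` is a list of dicts. The keys used here are hypothetical and do not
    claim to match the real Mobility Database catalog schema.
    """
    eligible = [
        source
        for source in sources
        if source.get("data_type") == "gtfs"
        and not source.get("requires_authentication", False)
    ]
    if not eligible:
        return []
    # Keep at least one source so the workflow always has something to validate.
    sample_size = max(1, round(len(eligible) * fraction))
    rng = random.Random(seed)  # seed=None gives a different sample on each run
    return rng.sample(eligible, sample_size)


# Toy catalog: 200 sources, every tenth one requires authentication.
catalog = [
    {"id": i, "data_type": "gtfs", "requires_authentication": i % 10 == 0}
    for i in range(1, 201)
]
sampled = sample_sources(catalog)
print(len(sampled))
```

With `seed=None` the selection changes from run to run, which matches the behaviour described in this PR description; passing a fixed seed would make the sample repeatable.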

Expected behavior:

The end-to-end workflow is run on each commit and samples different GTFS Schedule sources from the Mobility Database each time.

Please make sure these boxes are checked before submitting your pull request - thanks!

  • Run the unit tests with gradle test to make sure you didn't break anything
  • Format the title like "feat: [new feature short description]". Title must follow the Conventional Commit Specification(https://www.conventionalcommits.org/en/v1.0.0/).
  • Linked all relevant issues
  • Include screenshot(s) showing how this pull request works and fixes the issue(s)

@github-actions
Contributor

Thank you for this contribution! 🍰✨🦄

Information about source corruption

1 out of 1347 sources are corrupted.
The following sources are corrupted:

Acceptance test details

The changes in this pull request did not trigger any new errors on known GTFS datasets from the MobilityDatabase.
Download the full acceptance test report for commit bdb3361 here (report will disappear after 90 days).

@maximearmstrong maximearmstrong self-assigned this Aug 18, 2022
@maximearmstrong maximearmstrong marked this pull request as ready for review August 18, 2022 21:26
@github-actions
Contributor

Thank you for this contribution! 🍰✨🦄

Information about source corruption

1 out of 1346 sources are corrupted.
The following sources are corrupted:

Acceptance test details

The changes in this pull request did not trigger any new errors on known GTFS datasets from the MobilityDatabase.
Download the full acceptance test report for commit 86ec7b9 here (report will disappear after 90 days).

@bdferris-v2
Collaborator

One high-level comment: am I reading the code correctly that a different sub-set of feeds will be selected with each run? I feel like that could be problematic. For example, if a particular feed causes a PR to fail, you wouldn't necessarily get the same feed on the next run, making it hard to repro. I think I'd be more in favor of a consistent set of feeds for each run.

@maximearmstrong
Contributor Author

@bdferris-v2 Our original idea was to cover more feeds this way, with different feeds being tested at each commit, but you bring up a very good point. I changed the behaviour and added a consistent set of 50 feeds.

@emmambd Do you think the selection makes sense? I've selected feeds from various locations, some with features, some aggregated. Let me know if you think more or other feeds should be added.
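
For readers following along, a minimal sketch of what selecting a consistent set of feeds could look like. This is only an illustration under stated assumptions: the identifiers in `PINNED_FEED_IDS`, the `select_pinned` helper, and the example URLs are hypothetical and do not reproduce the actual list of 50 feeds or the code in this PR.

```python
# Hypothetical pinned identifiers; the real list of 50 feeds is not reproduced here.
PINNED_FEED_IDS = ["feed-a", "feed-c"]


def select_pinned(sources, pinned_ids):
    """Return the pinned sources in a stable order, failing loudly if one is missing."""
    by_id = {source["id"]: source for source in sources}
    missing = [feed_id for feed_id in pinned_ids if feed_id not in by_id]
    if missing:
        raise KeyError(f"Pinned feeds missing from catalog: {missing}")
    return [by_id[feed_id] for feed_id in pinned_ids]


# Toy catalog with placeholder URLs.
catalog = [
    {"id": "feed-a", "url": "https://example.com/a.zip"},
    {"id": "feed-b", "url": "https://example.com/b.zip"},
    {"id": "feed-c", "url": "https://example.com/c.zip"},
]
selected = select_pinned(catalog, PINNED_FEED_IDS)
print([source["id"] for source in selected])
```

Because the same identifiers are used on every run, a failure caused by one feed can be reproduced on the next run.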

@github-actions
Contributor

Thank you for this contribution! 🍰✨🦄

Information about source corruption

1 out of 1347 sources are corrupted.
The following sources are corrupted:

Acceptance test details

The changes in this pull request did not trigger any new errors on known GTFS datasets from the MobilityDatabase.
Download the full acceptance test report for commit 233a3bb here (report will disappear after 90 days).

@isabelle-dr
Contributor

@maximearmstrong do we need to update [END_TO_END.md](https://github.com/MobilityData/gtfs-validator/blob/master/docs/END_TO_END.md)?

@emmambd
Contributor

emmambd commented Aug 24, 2022

> Do you think the selection makes sense? I've selected feeds from various locations, some with features, some aggregated. Let me know if you think more or other feeds should be added.

@maximearmstrong I don't see any feed types missing so this list looks good to me! My one thought/consideration might be to look at end-to-end test failures in the past and see which feeds it failed on. That would be a last "check" to see if there are any other feed types that would be valuable to include.

@maximearmstrong
Contributor Author

maximearmstrong commented Aug 24, 2022

> @maximearmstrong do we need to update [END_TO_END.md](https://github.com/MobilityData/gtfs-validator/blob/master/docs/END_TO_END.md)?

@isabelle-dr Sure, I wasn't aware of that file. Its content is outdated - I suggest we delete it. If you think there's value in having our workflows documented, we can add WORKFLOWS.md in another PR to explain our workflows, including test_package_doc.yml and end_to_end.yml. Would it be a good plan?

@github-actions
Contributor

✅ Rule acceptance tests passed.
New Errors: 0 out of 1345 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
Dropped Errors: 0 out of 1345 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
0 out of 1345 sources (~0 %) are corrupted.
Commit: bd765ab
Download the full acceptance test report here (report will disappear after 90 days).
✅ Rule acceptance tests passed.
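
The threshold check reported above can be sketched as follows. This is an assumption about the arithmetic implied by the report text ("less than the provided threshold of 1%"), not the acceptance test's actual implementation; the function name `passes_threshold` is hypothetical.

```python
def passes_threshold(affected, total, threshold_percent=1.0):
    """Return True when the share of affected datasets is strictly below the threshold."""
    if total <= 0:
        raise ValueError("total must be positive")
    percent = 100.0 * affected / total
    return percent < threshold_percent


# Figures taken from the report above: 0 affected datasets out of 1345.
print(passes_threshold(0, 1345))
```

Under this reading, 13 affected datasets out of 1345 (about 0.97%) would still pass, while 14 (about 1.04%) would not.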

@isabelle-dr
Contributor

@maximearmstrong sounds good, that's a good plan!

@github-actions
Contributor

✅ Rule acceptance tests passed.
New Errors: 0 out of 1346 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
Dropped Errors: 0 out of 1346 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
0 out of 1346 sources (~0 %) are corrupted.
Commit: 1475ba9
Download the full acceptance test report here (report will disappear after 90 days).
✅ Rule acceptance tests passed.

@maximearmstrong maximearmstrong merged commit 0a22069 into master Aug 30, 2022
@maximearmstrong maximearmstrong deleted the issue/1233/lighten-end-to-end-workflow branch August 30, 2022 22:26
Development

Successfully merging this pull request may close these issues.

Lighten end-to-end workflows or remove them
4 participants