Investigate Data Refreshes blocking during popularity steps #1473

Closed
1 task
stacimc opened this issue Aug 17, 2022 · 1 comment
Labels
💻 aspect: code Concerns the software code in the repository 🛠 goal: fix Bug fix 🟨 priority: medium Not blocking but should be addressed soon 🧱 stack: catalog Related to the catalog and Airflow DAGs
Comments

@stacimc
Contributor

stacimc commented Aug 17, 2022

Description

The data refresh DAGs use a SingleRunExternalDAGsSensor to enforce a concurrency constraint: the actual data refresh steps on the ingestion server should only run for one media type at a time. The Sensor accomplishes this by checking whether there is another running data refresh DAG and whether that DAG's own Sensor has passed. If both are true, it blocks until the other DAG is finished.
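
To make the intended behavior concrete, here is a minimal sketch of the blocking rule as described above. It is not the actual SingleRunExternalDAGsSensor code and does not use Airflow; the `RefreshDag` class, the `should_block` function, and the DAG ids are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class RefreshDag:
    """Hypothetical snapshot of one data refresh DAG's state."""
    dag_id: str
    is_running: bool
    sensor_passed: bool  # True once this DAG's own wait sensor has succeeded


def should_block(current_dag_id: str, dags: list[RefreshDag]) -> bool:
    """Return True if the current DAG's sensor should keep waiting.

    Per the description: block only when some *other* refresh DAG is
    running AND that DAG has already passed its own sensor.
    """
    return any(
        d.is_running and d.sensor_passed
        for d in dags
        if d.dag_id != current_dag_id
    )


# Scenario from this issue: the image refresh is running but is still in the
# popularity steps, so its own sensor has not passed yet (assumed DAG ids).
dags = [
    RefreshDag("image_data_refresh", is_running=True, sensor_passed=False),
    RefreshDag("audio_data_refresh", is_running=True, sensor_passed=False),
]
audio_blocked = should_block("audio_data_refresh", dags)

# Once the image refresh passes its sensor, audio is expected to wait.
dags[0].sensor_passed = True
audio_blocked_later = should_block("audio_data_refresh", dags)
```

Under this reading of the rule, `audio_blocked` is `False` in the first scenario, which is why the blocking observed in production looks unexpected.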

Recently we ran an audio data refresh in production while the image refresh was running simultaneously. The audio's wait_for_data_refresh sensor went up for reschedule, even though the image refresh was still at the update_materialized_popularity_view step, meaning that its own sensor had not completed.

My initial read is that the current implementation of the Sensor should work here; we should investigate what happened and try to reproduce it.

Additional context

The workaround for this bug is to manually set the wait_for_data_refresh task to success after verifying the state of the refreshes. This is not sustainable, but it lets the blocked refresh proceed.

Resolution

  • 🙋 I would be interested in resolving this bug.
@stacimc stacimc added 🚦 status: awaiting triage Has not been triaged & therefore, not ready for work 🛠 goal: fix Bug fix 🟧 priority: high Stalls work on the project or its dependents 💻 aspect: code Concerns the software code in the repository and removed 🚦 status: awaiting triage Has not been triaged & therefore, not ready for work labels Aug 17, 2022
@AetherUnbound AetherUnbound added 🟨 priority: medium Not blocking but should be addressed soon and removed 🟧 priority: high Stalls work on the project or its dependents labels Nov 15, 2022
@obulat obulat added the 🧱 stack: catalog Related to the catalog and Airflow DAGs label Feb 24, 2023
@obulat obulat transferred this issue from WordPress/openverse-catalog Apr 17, 2023
@AetherUnbound
Contributor

We no longer have the popularity steps so closely coupled to the data refresh, and this issue hasn't come up since. Going to go ahead and close it!

@AetherUnbound AetherUnbound closed this as not planned Mar 18, 2024
Projects
Archived in project
Openverse
  
Backlog