Fixes #5266 - Add a separate view for live user_reports #5267
This is a follow-up to #5250.
I would like to create a separate live view, mainly for the ML classification job that uses bugbug and is defined in the docker-etl repository.
This ETL job runs every 15 minutes and classifies reports gradually throughout the day. That worked well with the live data, but after the switch to the historical table, which is updated only once a day, the script would try to classify all reports at once when they appear in the table. Given the size of the job (800-1k reports with a non-empty description per day), this would take a long time, and the ETL script likely would not finish within a single run. With a separate live view, a report will already be classified by the time it appears in the historical table.
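As a rough illustration of the load difference, here is a back-of-the-envelope sketch. It assumes reports arrive evenly throughout the day, which is a simplification and not something measured here; the function names are made up for this example and do not come from docker-etl.

```python
# Rough estimate of how many reports each 15-minute run has to classify
# when reading from a live view, versus one daily batch from the
# historical table. Assumes an even arrival rate (a simplification).

RUN_INTERVAL_MINUTES = 15
MINUTES_PER_DAY = 24 * 60


def runs_per_day(interval_minutes: int = RUN_INTERVAL_MINUTES) -> int:
    """Number of scheduled runs in one day for the given interval."""
    return MINUTES_PER_DAY // interval_minutes


def reports_per_run(reports_per_day: int,
                    interval_minutes: int = RUN_INTERVAL_MINUTES) -> float:
    """Average number of new reports per run under an even arrival rate."""
    return reports_per_day / runs_per_day(interval_minutes)


if __name__ == "__main__":
    for daily in (800, 1000):
        live = reports_per_run(daily)
        print(f"{daily} reports/day: ~{live:.1f} per run on live data, "
              f"{daily} in a single run on the daily historical table")
```

Under that assumption, each run on live data handles around ten reports, while a run that first sees the daily historical update would face the full day's volume at once.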
Checklist for reviewer:
<username>:<branch> of the fork as parameter. The parameter will also show up in the logs of the manual-trigger-required-for-fork CI task together with more detailed instructions.
For modifications to schemas in restricted namespaces (see CODEOWNERS):