Roadmap: data engineering 2023 #1285

Closed
26 of 41 tasks
larsyencken opened this issue Jun 28, 2023 · 0 comments

larsyencken commented Jun 28, 2023

↑ Engineering | 2024 →

2023 Q4

Must have

Nice to have

  • An admin link to GitHub to edit or create a YAML metadata file

✅ 2023 Q3

See below
  • Use our new data API client-side
    • Decide how to measure site performance
    • ❓ Trial it on a subset of our work
    • Merge change to use API client-side
    • Check how site performance changed
  • Kill data_values and backporting
    • Make variables read-only in the admin
    • Insert a warning in the importers repo and warn data managers
    • Migrate all the covid scripts to the ETL
    • Decide whether to keep a one-off set of backports
    • Turn off backporting and delete backporting code
    • Remove the data_values table
  • Support derived charts in visualisation
    • Full export of ETL indicators and DAG into MySQL
    • Support for basic relationships in the ETL and MySQL (see: proposal)

✅ 2023 Q2

See below

Not this year

  • Full model of the ETL in MySQL
    • MySQL contains the full DAG
    • Every MySQL variable has an ETL link
  • ❓ Make it easier to see how big data updates are and plan them better
    • ❓ Draft technical catalog explaining indicators and dependencies and downstream usage (e.g. charts)
    • ❓ Create a better dashboard of how datasets are used (in the admin)
  • ❗ Admin support for ETL metadata editing
  • Update the data science API
    • Create a baking step for the new API
    • Update owid-catalog to support the new API
  • Bake new data download options
larsyencken changed the title from "Tracking issue: data engineering roadmap 2023" to "Roadmap: data engineering 2023" on Jun 30, 2023
stale bot added the wontfix label on Oct 21, 2023
stale bot removed the wontfix label on Oct 23, 2023
owid deleted a comment from stale bot on Oct 24, 2023