-
-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(#1328) add guide for creating dbt models #1394
Conversation
I'll complete my review tomorrow. Time got away from me a bit today. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good but left some minor comments.
|
||
The url is set in the `CHT_PIPELINE_BRANCH_URL` environment variable, either in .env if using docker compose, or in values.yaml if useing kubernetes. | ||
|
||
When models are changed, CHT Sync needs to be restarted for the change to take effect; when it restarts, it automatically downloads the latest version of cht pipeline and applies than changes; for models materialized as views, they are are updated instantly, for models materialized as incmrenetal tables, the table is dropped and will be unavailable until it is rebuilt from scratch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When models are changed, CHT Sync needs to be restarted for the change to take effect; when it restarts, it automatically downloads the latest version of cht pipeline and applies than changes; for models materialized as views, they are are updated instantly, for models materialized as incmrenetal tables, the table is dropped and will be unavailable until it is rebuilt from scratch. | |
When models are changed, CHT Sync needs to be restarted for the change to take effect; when it restarts, it automatically downloads the latest version of cht pipeline and applies the changes; for models materialized as views, they are are updated instantly, for models materialized as incremental tables, the table is dropped and will be unavailable until it is rebuilt from scratch. |
|
||
The url is set in the `CHT_PIPELINE_BRANCH_URL` environment variable, either in .env if using docker compose, or in values.yaml if useing kubernetes. | ||
|
||
When models are changed, CHT Sync needs to be restarted for the change to take effect; when it restarts, it automatically downloads the latest version of cht pipeline and applies than changes; for models materialized as views, they are are updated instantly, for models materialized as incmrenetal tables, the table is dropped and will be unavailable until it is rebuilt from scratch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth it to add a sentence discouraging editing of the base models given how long it would take to rebuild the tables if there is already a lot of data in the database.
|--|--| | ||
|`uuid`|Unique identifier of the record. Note this is both the primary key for this table, and a foreign key to the couchdb table.| | ||
|`contact_uuid`| uuid of the `contact` who made the report| | ||
|`parent_contact_uuid`| uuid of the parent of `contact` who made the report (usually a place)| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having this named as contact_parent_uuid
as we have it in the model currently makes it clearer.
- Models to contain aggregates that may be useful for dashboards or analysis | ||
|
||
### Form models | ||
for each form in the CHT app, create one model which selects from `data_record where form = 'theformyouwant'`, moves the fields from the `fields` json into columns, applies any transformations that would be convenient, and if necessary, indexes them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for each form in the CHT app, create one model which selects from `data_record where form = 'theformyouwant'`, moves the fields from the `fields` json into columns, applies any transformations that would be convenient, and if necessary, indexes them. | |
for each form in the CHT app, create one model which selects from `data_record where form = 'theformyouwant'`, moves the fields from the `fields` json into columns, applies any transformations that would be convenient, and if necessary, add indexes to them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@witash this is an excellent start to the DBT model documentation. Thank you for the detailed steps and the example queries.
My suggestions are mainly typo and style correction.
Considering @njuguna-n's latest findings about lambda views, we will need a separate PR for those changes. I suggest:
- fixing the current styles and typos
- creating a separate PR to be merged into this branch with the latests changes from Njuguna
- re-doing a review.
Co-authored-by: Andra Blaj <andra@medic.org> Co-authored-by: njuguna-n <141340177+njuguna-n@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some small style changes.
@njuguna-n, how do you plan to proceed with updating this doc to include the latest changes (e.g. lambda views)? |
@andrablaj I have added a section about the base lambda view here. |
* docs: add a description about the couchdb lambda view * Update content/en/apps/guides/data/analytics/building-dbt-models.md Co-authored-by: Andra Blaj <andra@medic.org> --------- Co-authored-by: Andra Blaj <andra@medic.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good so far but we need to update some sections that have changed since this was first written. I think we should also add a section describing how to handle deleted documents using FK relationships and cascade deletes.
```yml | ||
packages: | ||
- git: "https://github.com/medic/cht-pipeline" | ||
revision: "1.0.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seeing as we do not have versioning on cht-pipeline yet we should suggest using the main branch and updating this once this issue is done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea the thing about using the main branch is they might get changes that they don't want or are not prepared for. Maybe we wait until that issue is done boefre merging this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
## Setup | ||
|
||
To create application specifc models, create a new dbt project as described [here](https://docs.getdbt.com/reference/commands/init) and a Github repository (it may be public or private). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To create application specifc models, create a new dbt project as described [here](https://docs.getdbt.com/reference/commands/init) and a Github repository (it may be public or private). | |
To create application specific models, create a new dbt project as described [here](https://docs.getdbt.com/reference/commands/init) and a Github repository (it may be public or private). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be updated to include the new tables e.g. document_metadata and update the columns in the existing tables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering why we need a new GH repo for dbt and Andra pointed me to these nascent docs. I understand now!
I'm suggesting some minor re-wording for clarity
Co-authored-by: mrjones <8253488+mrjones-plip@users.noreply.github.com> Co-authored-by: Andra Blaj <andra@medic.org> Co-authored-by: njuguna-n <141340177+njuguna-n@users.noreply.github.com>
this now depends on medic/cht-pipeline#155 which contains the contact_type models, changes to person and place models, and macros if there's no changes from review on that, this is ready for re-review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keeping my review very narrowly scoped, trusting others are reviewing this PR more widely. As such, my two minor feedback points were addressed - ship it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comments are mostly typos and formatting-related. Good job at having this done!
Co-authored-by: Andra Blaj <andra@medic.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking very good. left a couple of additional comments.
content/en/apps/guides/data/analytics/building-dbt-models/cht-pipeline-er.png
Outdated
Show resolved
Hide resolved
Co-authored-by: Andra Blaj <andra@medic.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Great job!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Closes #1328
Closes #1447