-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added views to the controller and a fast recreate option #929
Conversation
104780e
to
4705f63
Compare
4705f63
to
be4cf7c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @bashir2 for the bug fixes and the SQL-on-FHIR V2 changes. I am exited that with this change, we now have the SQL-on-FHIR V2 spec implementation done. Kudos for that.
The changes looks fine. I have few comments/suggestions which are minor. Please have a look.
pipelines/batch/src/main/java/com/google/fhir/analytics/ProcessGenericRecords.java
Show resolved
Hide resolved
pipelines/batch/src/main/java/com/google/fhir/analytics/ProcessGenericRecords.java
Outdated
Show resolved
Hide resolved
pipelines/batch/src/main/java/com/google/fhir/analytics/JdbcResourceWriter.java
Outdated
Show resolved
Hide resolved
pipelines/batch/src/main/java/com/google/fhir/analytics/ProcessGenericRecords.java
Show resolved
Hide resolved
bunsen/bunsen-avro/src/test/java/com/cerner/bunsen/avro/R4AvroConverterTest.java
Show resolved
Hide resolved
pipelines/controller/src/main/java/com/google/fhir/analytics/ApiController.java
Show resolved
Hide resolved
pipelines/batch/src/main/java/com/google/fhir/analytics/FhirEtl.java
Outdated
Show resolved
Hide resolved
pipelines/controller/src/main/java/com/google/fhir/analytics/PipelineManager.java
Show resolved
Hide resolved
pipelines/batch/src/main/java/com/google/fhir/analytics/ProcessGenericRecords.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @chandrashekar-s for the review.
bunsen/bunsen-avro/src/test/java/com/cerner/bunsen/avro/R4AvroConverterTest.java
Show resolved
Hide resolved
pipelines/batch/src/main/java/com/google/fhir/analytics/FhirEtl.java
Outdated
Show resolved
Hide resolved
pipelines/batch/src/main/java/com/google/fhir/analytics/JdbcResourceWriter.java
Outdated
Show resolved
Hide resolved
pipelines/batch/src/main/java/com/google/fhir/analytics/ProcessGenericRecords.java
Show resolved
Hide resolved
pipelines/batch/src/main/java/com/google/fhir/analytics/ProcessGenericRecords.java
Outdated
Show resolved
Hide resolved
pipelines/batch/src/main/java/com/google/fhir/analytics/ProcessGenericRecords.java
Show resolved
Hide resolved
pipelines/batch/src/main/java/com/google/fhir/analytics/ProcessGenericRecords.java
Show resolved
Hide resolved
pipelines/controller/src/main/java/com/google/fhir/analytics/ApiController.java
Show resolved
Hide resolved
pipelines/controller/src/main/java/com/google/fhir/analytics/PipelineManager.java
Show resolved
Hide resolved
e705bb9
to
8c65407
Compare
Thanks @bashir2 for the changes. This is a great milestone to be completed. I am awaiting for the adoption of this feature and see the benefits of it. |
Description of what I changed
This is the last piece that fixes #916. Beside adding the incremental view update option to the
controller
, this also implements a fast view recreation option. This is done by reading the whole DWH snapshot (the Parquet files), convert them back to HAPI objects, and apply the new view logic on them. This can be an order of magnitude (or more) faster than refetching resources from the FHIR server. For a fairly large DWH (which originally took ~200 minutes to create by reading from the FHIR server), recreating views only took ~13 to 18 minutes (depending on old views size/presence). From this time, less than 2 minutes was actually converting Avro to HAPI.Note we never used the Avro to HAPI feature of Bunsen in our production code; this first usage uncovered some new bugs which are fixed in this PR as well.
E2E test
TESTED:
Ran the controller and tested the FULL and VIEWS modes. See above for performance testing.
Checklist: I completed these to help reviewers :)
I have read and will follow the review process.
I am familiar with Google Style Guides for the language I have coded in.
No? Please take some time and review Java and Python style guides.
My IDE is configured to follow the Google code styles.
No? Unsure? -> configure your IDE.
I have added tests to cover my changes. (If you refactored existing code that was well tested you do not have to add tests)
I ran
mvn clean package
right before creating this pull request and added all formatting changes to my commit.All new and existing tests passed.
My pull request is based on the latest changes of the master branch.
No? Unsure? -> execute command
git pull --rebase upstream master