Compare Spark+Parquet with PostgreSQL/Alloy DB view based approaches #968

Open
bashir2 opened this issue Feb 26, 2024 · 2 comments
Labels: enhancement, good first issue, P2:should

Comments

@bashir2
Collaborator

bashir2 commented Feb 26, 2024

Now that our support for SQL-on-FHIR-v2 ViewDefinition is complete (#821 and #916), we should run some large-scale comparisons of the Spark+Parquet based approach against relational-DB based ones that use materialized views. We can start with PostgreSQL and establish some guidelines on the scale of data at which a single-node PG DB starts to make less sense (compared to a multi-node Spark+Parquet based approach). Then we should repeat the same experiment with AlloyDB to see the impact of columnar storage (while still single-node).

We should run several experiments using multiple realistic workloads (e.g., calculating program or data quality metrics that involve joins of multiple resource tables). We also recognize that these comparisons will always be somewhat subjective, because the results depend on the choice of workloads.
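
As a rough illustration of how such a comparison could be timed, here is a minimal, backend-agnostic sketch. It is not part of the pipeline code: `time_workload`, `compare_backends`, and the workload and backend names are all hypothetical, and each backend is represented only by a zero-argument callable that would, in a real experiment, run the same query against Spark+Parquet, a PostgreSQL materialized view, or AlloyDB.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_workload(
    run_query: Callable[[], object], repeats: int = 5, warmup: int = 1
) -> Dict[str, float]:
    """Run one workload several times and summarize wall-clock latency (seconds)."""
    for _ in range(warmup):
        run_query()  # warm-up runs are executed but not measured
    samples: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return {
        "median_s": statistics.median(samples),
        "min_s": min(samples),
        "max_s": max(samples),
    }


def compare_backends(
    workloads: Dict[str, Dict[str, Callable[[], object]]], repeats: int = 5
) -> Dict[str, Dict[str, Dict[str, float]]]:
    """`workloads` maps workload name -> {backend name -> zero-arg callable}."""
    return {
        workload: {
            backend: time_workload(fn, repeats=repeats)
            for backend, fn in backends.items()
        }
        for workload, backends in workloads.items()
    }


if __name__ == "__main__":
    # Placeholder callables only: a real experiment would replace these with
    # functions that execute the same query on each backend.
    demo = {
        "hypothetical_metric": {
            "spark_parquet": lambda: sum(range(10_000)),
            "postgres_matview": lambda: sum(range(10_000)),
        }
    }
    for workload, per_backend in compare_backends(demo).items():
        for backend, stats in per_backend.items():
            print(workload, backend, stats)
```

Reporting the median alongside min and max is one simple way to reduce the effect of outlier runs; repeating this across data sizes would give the kind of scale guideline described above.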

@bashir2 added the enhancement, good first issue, and P2:should labels Feb 26, 2024
@jakubadamek
Collaborator

I might be interested as a sequel to #967

@bashir2
Collaborator Author

bashir2 commented Feb 29, 2024

> I might be interested as a sequel to #967

That would be great; please feel free to assign this to yourself once you start working on it.
