Fixes #3423
This PR reduces the time required to load the "Movie Collection" dataset for local as well as remote DBs.
Technical details
Problem: We previously relied on a single SQL dump of the Movie Collection schema to load the entire dataset into the DB. This was fine when the DB ran alongside the Django service, but it was very slow with a remote DB, which ultimately caused a server timeout with a 502 response.
Solution: This PR solves the aforementioned problem by breaking the large SQL dump into parts and extracting all the data to be loaded into multiple .csv files:

- movie_collection_tables.sql: Contains SQL queries for setting up tables for the dataset.
- movies_csv/: Contains all the data required to be loaded into the tables, in separate .csv files.
- movie_collection_fks.sql: Contains SQL queries for setting up PKs and FKs on the tables.

We first execute movie_collection_tables.sql, then load all the data into the tables using SQL COPY instead of INSERT, and finally execute movie_collection_fks.sql to set up PKs and FKs.

Performance (GCP):
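For reviewers who want a picture of the load order described under Solution, here is a minimal sketch of the three steps. It is not the code in this PR. The names `load_dataset` and `quote_ident`, the `resources_dir` layout, and the idea that each CSV is named after its table and starts with a header row are all assumptions made for illustration. The cursor is expected to expose `execute` and `copy_expert`, as psycopg2 cursors do.

```python
from pathlib import Path


def quote_ident(name):
    # Minimal PostgreSQL identifier quoting (hypothetical helper):
    # wrap in double quotes and double any embedded double quote.
    return '"' + name.replace('"', '""') + '"'


def load_dataset(cursor, resources_dir, schema):
    """Create tables, bulk-load CSVs with COPY, then add PKs and FKs."""
    resources = Path(resources_dir)

    # Step 1: create the empty tables, without constraints.
    cursor.execute((resources / "movie_collection_tables.sql").read_text())

    # Step 2: one COPY per CSV file.
    # Assumption: each file is named <table>.csv and has a header row.
    for csv_path in sorted((resources / "movies_csv").glob("*.csv")):
        table = f"{quote_ident(schema)}.{quote_ident(csv_path.stem)}"
        copy_sql = f"COPY {table} FROM STDIN WITH (FORMAT csv, HEADER true)"
        with csv_path.open("r", encoding="utf-8") as f:
            cursor.copy_expert(copy_sql, f)

    # Step 3: add primary and foreign keys once all rows are in place.
    cursor.execute((resources / "movie_collection_fks.sql").read_text())
```

With a psycopg2 connection this would be called as `load_dataset(conn.cursor(), "<resources dir>", "Movie Collection")`. Streaming each file through COPY sends the rows in bulk rather than as individual INSERT statements, and adding keys last means constraints are built once over the loaded data, which is likely where most of the time saving against a remote DB comes from.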
Checklist
… Update index.md).
… develop branch of the repository
… visible errors.
Developer Certificate of Origin