add_data.sh related Python scripts - flexible data loading #53
Comments
@anthonyfok is this done? If not please move to Sprint 33 Milestone.
@jvanulde Thanks for the reminder! Done, and am in the process of moving other outstanding tasks to Sprint 33 too.
Notes
Wed 2021-02-24 Skype meetings
IIRC, this issue was first discussed on Wednesday, 24 February 2021, over Skype meetings with Will and Drew.
add_data.sh is the main orchestration of the whole thing. It has gotten a lot better over time, but it used to be extremely brittle, so anytime anyone changed any little thing, the whole thing would break. So, we've been spending time and effort trying to make this more flexible...
For example: Will wrote the following SQL script to pull in the social vulnerability data:
If the upstream CSV file (created by e.g. Murray or Tiegan) were changed, e.g. the headers
So, instead of explicitly defining those header files, And that way when the CSV files are changed, then we end up just loading the whole CSV as it is with the ...
Some fields are critical...
Mon 2021-05-10 Zoom meeting
About model-factory/scripts/PSRA_copyTables.py
The tables are defined in model-factory/scripts/psra_1.Create_tables.sql. These Python and SQL scripts are called from opendrr-api/python/add_data.sh like so:
# PSRA_1-8
for PT in ${PT_LIST[@]}
do
python3 PSRA_runCreate_tables.py --province=${PT} --sqlScript="psra_1.Create_tables.sql"
python3 PSRA_copyTables.py --province=${PT}
python3 PSRA_sqlWrapper.py --province=${PT} --sqlScript="psra_2.Create_table_updates.sql"
python3 PSRA_sqlWrapper.py --province=${PT} --sqlScript="psra_3.Create_psra_building_all_indicators.sql"
python3 PSRA_sqlWrapper.py --province=${PT} --sqlScript="psra_4.Create_psra_sauid_all_indicators.sql"
python3 PSRA_sqlWrapper.py --province=${PT} --sqlScript="psra_5.Create_psra_sauid_references_indicators.sql"
done
WIP, have yet to add code to read the header from CSV files. [Eventually] Fixes OpenDRR#53
Dynamically load raw model datasets, whose fields may change, by reading the fields from the data and loading them as found. Implement tests on inputs, with some reasonable constraints on what the stack will load.
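As a rough illustration of the idea above (not the code that ended up in the repository), the header can be read from the CSV itself and turned into a table definition, with a check that the fields considered critical are present. The table name, column names, and the `required` list below are hypothetical, and every column is loaded as `text` on the assumption that types get cast in a later SQL step.

```python
import csv
import io


def read_header(csv_file):
    """Return the column names from the first row of an open CSV file."""
    return next(csv.reader(csv_file))


def quote_ident(name):
    """Double-quote an SQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_create_table(table, header, required=()):
    """Build a CREATE TABLE statement with one text column per CSV field.

    Raises ValueError if any required ("critical") field is missing,
    so a changed upstream file fails early with a clear message.
    """
    missing = [col for col in required if col not in header]
    if missing:
        raise ValueError(f"missing required columns: {', '.join(missing)}")
    columns = ",\n".join(f"    {quote_ident(col)} text" for col in header)
    return f"CREATE TABLE {quote_ident(table)} (\n{columns}\n);"


# Hypothetical sample data standing in for an upstream CSV file.
sample = io.StringIO("sauid,field_a,field_b\n1,0.5,x\n")
header = read_header(sample)
sql = build_create_table("example_table", header, required=["sauid"])
print(sql)
```

With this approach a new or renamed non-critical column upstream changes the generated table rather than breaking the load, while a missing critical column stops the run before anything is written.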
See also Issue #48
Major Priorities
PSRA:
sed
add_data.sh
psql opendrr -c 'COPY (SELECT * FROM psra_BC.psra_BC_hcurves_pga WHERE FALSE) TO STDOUT WITH CSV HEADER;'
config.ini and add error checking if file not found (POSTGRES_* variables not defined)
DSRA:
Minor Priorities
Exposure:
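The config.ini item under Major Priorities asks for error checking when the file is missing or the POSTGRES_* variables are not defined. A minimal sketch of such a check is below. The section name `db` and the specific key names are assumptions made for illustration, not the project's actual config.ini layout.

```python
import configparser
import os
import tempfile

# Hypothetical key names; the issue only says "POSTGRES_* variables".
REQUIRED_KEYS = ("POSTGRES_USER", "POSTGRES_PASS", "POSTGRES_HOST", "POSTGRES_PORT")


def load_db_config(path, section="db", required=REQUIRED_KEYS):
    """Read database settings from an INI file, failing loudly on gaps."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"config file not found: {path}")
    # interpolation=None so that a '%' in a password is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)
    if not parser.has_section(section):
        raise ValueError(f"section [{section}] missing from {path}")
    # Treat absent and empty values the same way: both mean "not defined".
    missing = [
        key for key in required
        if not parser.get(section, key, fallback="").strip()
    ]
    if missing:
        raise ValueError(f"undefined settings in {path}: {', '.join(missing)}")
    return {key: parser.get(section, key) for key in required}


# Usage with a throwaway file standing in for the real config.ini.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.ini")
    with open(path, "w") as f:
        f.write("[db]\nPOSTGRES_USER = alice\nPOSTGRES_HOST = localhost\n")
    settings = load_db_config(path, required=("POSTGRES_USER", "POSTGRES_HOST"))
    print(settings)
```

Checking everything up front means each Python script called from add_data.sh can stop with one readable message instead of failing later inside a database call.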