For 1970-1989, PHMSA distribution and transmission data is reported in one .xls spreadsheet per dataset, with one tab containing multiple years of data. This data was not included in the original round of raw data extraction because of several quirks in how it is laid out. These will need to be addressed in order to extract this data:
- Multi-row column headers, with names repeated in the final header row. This makes column mapping substantially more work, even though the available columns are roughly similar to the 1990 data, with slightly fewer fields.
- Tabs correspond to years rather than pages, with no breakdown between table parts. Each tab has a different version of the form. This will probably require some adaptation of our existing extraction infrastructure, which expects one spreadsheet per year with tabs sorted by form section.
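To make the multi-row header problem concrete, here is a minimal sketch of one way to collapse stacked header rows into unique column names. It is plain Python and not part of the existing extraction code. The function name `flatten_headers` and the example header labels are hypothetical, and it assumes that merged parent cells come through as blanks after the first column they span.

```python
import re


def _slug(text):
    """Lowercase a label and replace runs of non-alphanumerics with underscores."""
    return re.sub(r"[^0-9a-z]+", "_", text.lower()).strip("_")


def flatten_headers(header_rows):
    """Collapse stacked header rows into one unique name per column.

    Assumption: every row except the last holds parent labels whose merged
    cells appear as blanks, so those blanks are filled from the left.
    """
    n_cols = max(len(row) for row in header_rows)
    levels = []
    for i, row in enumerate(header_rows):
        cells = ["" if c is None else str(c).strip() for c in row]
        cells += [""] * (n_cols - len(cells))
        if i < len(header_rows) - 1:
            last = ""
            for j, cell in enumerate(cells):
                last = cell or last
                cells[j] = last
        levels.append(cells)

    names, seen = [], {}
    for parts in zip(*levels):
        name = "_".join(s for s in (_slug(p) for p in parts) if s) or "unnamed"
        # Any name that is still repeated gets a numeric suffix.
        seen[name] = seen.get(name, 0) + 1
        names.append(name if seen[name] == 1 else f"{name}_{seen[name]}")
    return names


# Hypothetical header rows, not copied from the actual PHMSA spreadsheets.
example_header = [
    ["Operator", None, "Miles of main", None, "Miles of services", None],
    ["Name", "State", "Steel", "Plastic", "Steel", "Plastic"],
]
example_columns = flatten_headers(example_header)
```

Prefixing each final-row name with its parent label is what keeps the repeated "Steel" and "Plastic" columns distinguishable. The real column mapping would still need to be checked tab by tab, since each tab uses a different version of the form.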
One option is to extract this data into separate raw_phmsagas__transmission_1970_1979, raw_phmsagas__transmission_1980_1981, etc. tables, then split them and concatenate them with the other tables during processing.
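As a rough illustration of the split-and-concatenate step, the sketch below groups the rows of a multi-year table by year and merges the result with per-year tables. It is plain Python rather than the project's actual processing code. The `report_year` key, the function names, and the sample rows are all assumptions made for the example.

```python
from collections import defaultdict


def split_by_year(rows, year_key="report_year"):
    """Group the rows of a multi-year table into one list per year."""
    by_year = defaultdict(list)
    for row in rows:
        by_year[int(row[year_key])].append(row)
    return dict(by_year)


def concat_years(*tables_by_year):
    """Merge several {year: rows} mappings into one, sorted by year."""
    combined = {}
    for tables in tables_by_year:
        for year, rows in tables.items():
            combined.setdefault(year, []).extend(rows)
    return dict(sorted(combined.items()))


# Hypothetical rows standing in for a multi-year raw table and a later table.
multi_year_rows = [
    {"report_year": 1979, "operator_name": "A"},
    {"report_year": 1978, "operator_name": "A"},
    {"report_year": 1979, "operator_name": "B"},
]
later_years = {1990: [{"report_year": 1990, "operator_name": "A"}]}

all_years = concat_years(split_by_year(multi_year_rows), later_years)
```

Keeping the raw tables grouped by year range and deferring the split to processing would mean the raw extraction only has to mirror the spreadsheet layout.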
e-belfer added the new-data (Requests for integration of new data.) and phmsa (Data from the Pipeline and Hazardous Material Safety Administration) labels on Jan 24, 2024
e-belfer changed the title from "Extract PHMSA data from 1970-1989" to "Extract PHMSA distribution and transmission data from 1970-1989" on Jan 24, 2024