Use case:
User U is a member of a team that is doing a data audit of client C, who stores data in multiple tables.
The only common denominator is a column "site", which exists in all tables.
The team has a list of all sites and a team drive T, which holds the team's cleaned data.
User U needs to read through C's data to find tables that nobody else in the team has worked on. U shares the result in an online spreadsheet.
Once U has cleaned the data, s/he uploads it to T.
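Finding the tables nobody has worked on amounts to a set difference: the tables in C's data minus those a teammate has already claimed. A minimal sketch in plain Python, assuming the shared spreadsheet can be exported as a list of claimed table names. The function `unclaimed_tables` and the table names are hypothetical and not part of tablite:

```python
def unclaimed_tables(all_names, claimed_names):
    """Return the table names nobody has claimed yet, preserving source order."""
    claimed = set(claimed_names)
    return [name for name in all_names if name not in claimed]


# Hypothetical table names, for illustration only.
source_tables = ["orders", "inventory", "staff", "shipments"]
claimed_by_team = ["inventory", "shipments"]

todo = unclaimed_tables(source_tables, claimed_by_team)
print(todo)
```

Keeping the result as an ordered list rather than a set means U can work through the client's tables in the order they appear in the source.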
Drives:
/tmp/tablite.hdf5 # local working memory on SSD.
/C/datasources.hdf5 # source data from the client (read only access)
/T/clean_data.hdf5 # team's cleaned data.
Workflow:
prerequisite: User selects site 'Corrusant'
from tablite import Table

site_name = 'Corrusant'
all_tables = Table.reload_stored_tables('/C/datasources.hdf5')  # rapid remote header-level read-only access.
# sift through all tables and create tables for the site on localhost using the default 'H5_STORAGE':
wip = [t.any(**{'site': lambda x: x == site_name}) for t in all_tables]
# undisclosed process used to clean up the data.
# create/append to a super table with all data from all sites. Note that this table may already exist!
super_table = Table(H5_location="/T/clean_data.hdf5")
for table in wip:
    super_table.stack(table)  # append to the team-wide super table.
At this point clean data exists in a super table for the whole team to work on.
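The filter-then-stack pattern in the workflow above can be tried without any HDF5 storage. The sketch below is a plain-Python stand-in, not tablite's implementation: tables are modelled as lists of row dicts, and `rows_for_site`, `stack` and the sample rows (including the second site name) are hypothetical. It assumes that stacking tables with different columns fills the gaps with `None`:

```python
def rows_for_site(table, site_name):
    """Keep only the rows whose 'site' column equals site_name."""
    return [row for row in table if row.get("site") == site_name]


def stack(super_table, table):
    """Return a new table with the rows of both inputs.

    Columns are the union of both tables' columns, in first-seen order;
    values missing from a row are filled with None (an assumption).
    """
    rows = super_table + table
    columns = []
    for row in rows:
        for col in row:
            if col not in columns:
                columns.append(col)
    return [{col: row.get(col) for col in columns} for row in rows]


# Hypothetical client tables sharing only the 'site' column.
client_tables = [
    [{"site": "Corrusant", "sku": 1}, {"site": "Tatooine", "sku": 2}],
    [{"site": "Corrusant", "staff": "A"}],
]

site_name = "Corrusant"
wip = [rows_for_site(t, site_name) for t in client_tables]

super_table = []  # stands in for a super table that may already hold data
for table in wip:
    super_table = stack(super_table, table)

print(super_table)
```

Because `stack` here returns a new list instead of mutating its input, the loop reassigns `super_table` on every pass. Whether the real super table is updated in place or rebuilt is a design decision this sketch does not settle.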