Copy CSV and Storage TODOs for Phase 1 #504

Closed · 7 tasks done
semihsalihoglu-uw opened this issue Mar 8, 2022 · 0 comments
semihsalihoglu-uw commented Mar 8, 2022

Functionality

  • Support for the list data type.
  • Replace the robin_hood_node_id map with our hash_index. Two main things need to be done (a sketch follows this list):
    1. Hash_Index needs to support parallel updates.
    2. Replace the robin_hood_node_id map itself. After the index is constructed, it should be saved to disk and, in later parts of the loader, accessed through the BufferManager.
  • Allow copying a node/rel CSV file with or without a header.
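
Not the actual Hash_Index design; the following is a minimal sketch of one way parallel updates could be supported during loading, using per-shard locks. All names here (ShardedHashIndex, insert, numShards) are hypothetical placeholders for whatever the real API ends up being.

```cpp
#include <cstdint>
#include <mutex>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical sharded index: each key hashes to one shard, and only that
// shard's mutex is held during an insert, so many loader threads can insert
// concurrently with little contention.
class ShardedHashIndex {
public:
    explicit ShardedHashIndex(size_t numShards = 64)
        : shards(numShards), locks(numShards) {}

    // Safe to call from multiple loader threads at once.
    void insert(const std::string& key, uint64_t nodeOffset) {
        const size_t shardIdx = std::hash<std::string>{}(key) % shards.size();
        std::lock_guard<std::mutex> guard(locks[shardIdx]);
        shards[shardIdx].emplace(key, nodeOffset);
    }

private:
    std::vector<std::unordered_map<std::string, uint64_t>> shards;
    std::vector<std::mutex> locks;
};
```

Once construction finishes, the index would be flushed to disk and subsequent loader phases would read it back through the BufferManager rather than keeping it pinned in memory.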

Usability

  • Progress Reporting Mechanism: This piece of code works nicely: https://stackoverflow.com/questions/14539867/how-to-display-a-progress-indicator-in-pure-c-c-cout-printf. We need to wrap it in an updateProgress function, compute progress percentages from the number of files and the number of lines in each file, and call updateProgress in different parts of the code (see the first sketch after this list). [Loader progress bar #507]
  • When large strings are inserted, give a warning instead of failing. => Just keep the 4096-character prefix for now.
  • If a relationship has a single source nodeLabel and a single destination nodeLabel, do not require relationship files to contain START_ID_LABEL and END_ID_LABEL fields.
  • Verify each CSV header as the first step, and if something is wrong, error early with a proper error message (see the second sketch after this list). => Be graceful with capitalization, but let's not accept arbitrary column names that don't match the schema.
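
First sketch: the progress indicator from the linked StackOverflow answer, wrapped in an updateProgress function as proposed above. The function signature and the way the fraction is computed from line counts are assumptions, not the final design.

```cpp
#include <iostream>

// Redraws a text progress bar in place. `progress` is a fraction in [0, 1].
void updateProgress(double progress) {
    constexpr int barWidth = 70;
    const int pos = static_cast<int>(barWidth * progress);
    std::cout << "[";
    for (int i = 0; i < barWidth; ++i) {
        if (i < pos) std::cout << "=";
        else if (i == pos) std::cout << ">";
        else std::cout << " ";
    }
    // '\r' moves the cursor back to the start of the line, so each call
    // overwrites the previous bar instead of printing a new line.
    std::cout << "] " << static_cast<int>(progress * 100.0) << " %\r";
    std::cout.flush();
}

int main() {
    // Example: progress = lines processed so far / total lines across all CSV files.
    const long totalLines = 1'000'000;
    for (long processed = 0; processed <= totalLines; processed += 50'000) {
        updateProgress(static_cast<double>(processed) / totalLines);
    }
    std::cout << std::endl;
    return 0;
}
```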
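
Second sketch: early CSV header verification that is lenient about capitalization but rejects column names that don't match the schema. The schema representation (a vector of expected column names) and the function name are assumptions for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <stdexcept>
#include <string>
#include <vector>

static std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
        [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Throws with a descriptive message before any rows are loaded.
void verifyCSVHeader(const std::vector<std::string>& headerColumns,
                     const std::vector<std::string>& schemaColumns) {
    if (headerColumns.size() != schemaColumns.size()) {
        throw std::invalid_argument("CSV header has " +
            std::to_string(headerColumns.size()) + " columns; schema expects " +
            std::to_string(schemaColumns.size()) + ".");
    }
    for (size_t i = 0; i < headerColumns.size(); ++i) {
        // Case-insensitive comparison: graceful with capitalization only.
        if (toLower(headerColumns[i]) != toLower(schemaColumns[i])) {
            throw std::invalid_argument("Column " + std::to_string(i) + ": '" +
                headerColumns[i] + "' does not match schema column '" +
                schemaColumns[i] + "'.");
        }
    }
}
```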