Use dusql to create a table of file info #9

aidanheerdegen · 2019-06-18T00:22:54Z

When nccompress times out and is re-run on the same directory it ends up trying to open every file to check if it is netCDF. This is slow and inefficient if it has already done that before.

Could use dusql to create a file info database, with a table that stores each file's format and whether it has already been compressed, so the work is not duplicated when nccompress is run again.


ccarouge · 2019-06-18T00:27:28Z

Can it handle independent changes made to the directory well? I mean would nccompress need to run a dusql update each time it runs to make sure the database is up to date? How long would that take? How would the update keep the file info table in sync? Maybe those are trivial questions when dealing with databases but I don't know much about databases.

aidanheerdegen · 2019-06-18T00:42:29Z

Those are good questions, and I don't know all the answers myself. At the very least, dusql can be run to update the database, and we would still know whether a file's modification time has changed since its status was last determined (provided that information is saved in this other proposed table, which it clearly should be).
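As a rough illustration of the modification-time idea, here is a minimal sketch using Python's standard `sqlite3` module. It does not use dusql's actual schema or API: the table name `file_status`, its columns, and the helpers `open_db`, `record_status` and `needs_check` are all hypothetical names made up for this example.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) a database holding the hypothetical status table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS file_status (
               path       TEXT PRIMARY KEY,
               mtime      REAL NOT NULL,     -- mtime when the status was recorded
               is_netcdf  INTEGER NOT NULL,  -- 1 if the file was identified as netCDF
               compressed INTEGER NOT NULL   -- 1 if the file was already compressed
           )"""
    )
    return conn


def record_status(conn, path, mtime, is_netcdf, compressed):
    """Store (or overwrite) the status determined for one file."""
    conn.execute(
        "INSERT OR REPLACE INTO file_status VALUES (?, ?, ?, ?)",
        (path, mtime, int(is_netcdf), int(compressed)),
    )
    conn.commit()


def needs_check(conn, path, mtime):
    """Return True if the file is unknown or has changed since it was recorded."""
    row = conn.execute(
        "SELECT mtime FROM file_status WHERE path = ?", (path,)
    ).fetchone()
    return row is None or row[0] != mtime
```

With something like this, a re-run would only need to open files for which `needs_check` returns True, passing in the current `os.stat(path).st_mtime`.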

ScottWales · 2019-06-18T02:21:53Z

We could add a minimum age to dusql.scan(), so that it only runs a full scan if the last scan was > 1 hour ago or whatever.
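A minimal sketch of what that age check might look like. This is not dusql code: the helper name `scan_is_due` and its parameters are assumptions for illustration, and how the last scan time would be stored is left open.

```python
import time


def scan_is_due(last_scan, min_age_seconds=3600, now=None):
    """Return True if a full scan should run.

    last_scan is the Unix timestamp of the previous scan, or None if
    the directory has never been scanned (hypothetical convention).
    """
    if last_scan is None:
        return True
    if now is None:
        now = time.time()
    return (now - last_scan) > min_age_seconds
```

The caller would then only do the full scan when `scan_is_due(...)` is True, and otherwise reuse the existing database.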

The scan takes however long it takes to os.walk() the directory tree, plus however long it takes to ID the file type of newly added files. File ID isn't currently implemented, but should be doable by looking at the first few bytes of the file.
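For what it's worth, a rough sketch of identifying a file from its first few bytes. The function name `identify_format` and the returned labels are made up for this example. Classic netCDF files start with `CDF` followed by a version byte (1, 2 or 5), and netCDF-4 files are HDF5 files, which normally start with the 8-byte HDF5 signature. Note that the HDF5 signature alone does not prove a file is netCDF-4, and this sketch ignores HDF5 files that have a user block before the signature.

```python
# Magic numbers for the classic netCDF formats (CDF-1, CDF-2, CDF-5)
NETCDF_CLASSIC_MAGIC = (b"CDF\x01", b"CDF\x02", b"CDF\x05")
# Signature at the start of an HDF5 file (the container used by netCDF-4)
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"


def identify_format(path):
    """Guess the file format from the first 8 bytes only."""
    with open(path, "rb") as f:
        header = f.read(8)
    if header[:4] in NETCDF_CLASSIC_MAGIC:
        return "netcdf-classic"
    if header == HDF5_MAGIC:
        return "hdf5"  # possibly netCDF-4, but not guaranteed
    return "unknown"
```

This reads only 8 bytes per file, so it should be much cheaper than opening each file with a netCDF library.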

ScottWales · 2019-06-18T02:24:43Z

See coecms/dusql#29

aidanheerdegen mentioned this issue Aug 2, 2019

Add simple file filtering patterns #13

Open
