Remove the read from duckdb table in fasta action #467

Taepper · 2024-06-07T08:37:26Z

The fasta action currently reads from a DuckDB table, which in turn only reads a .parquet file on disk and joins it with the set of sequence ids to be returned. This is the only place DuckDB is used after preprocessing.

We can make this more efficient and remove that dependency by reading from the files directly, possibly using a different file format that is already ordered by id, which would remove the need for a join.
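To illustrate why an id-ordered file would make the join unnecessary, here is a minimal sketch in Python. It is not the project's implementation: the names `select_sorted` and `to_fasta` are hypothetical, and the records are assumed to arrive as `(id, sequence)` pairs already sorted by id, from whatever file format ends up being chosen.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str]  # (sequence id, sequence); assumed layout, not the real schema


def select_sorted(records: Iterable[Record], wanted_ids: Iterable[str]) -> Iterator[Record]:
    """Yield the wanted records in a single pass over records sorted by id."""
    wanted = iter(sorted(set(wanted_ids)))
    target = next(wanted, None)
    for seq_id, sequence in records:
        # Drop wanted ids smaller than the current record: they are not in the file.
        while target is not None and target < seq_id:
            target = next(wanted, None)
        if target is None:
            break  # every wanted id has been handled, stop reading early
        if target == seq_id:
            yield seq_id, sequence
            target = next(wanted, None)


def to_fasta(rows: Iterable[Record]) -> str:
    """Format (id, sequence) pairs as FASTA text."""
    return "".join(f">{seq_id}\n{sequence}\n" for seq_id, sequence in rows)


# Hypothetical data standing in for an id-ordered file on disk.
records = [("a", "ACGT"), ("b", "GGCC"), ("d", "TTAA")]
selected = list(select_sorted(records, ["d", "a", "c"]))
fasta = to_fasta(selected)
```

Because both inputs are ordered, each record is compared against at most one pending id at a time, so no hash table or join operator is needed and reading can stop as soon as the last requested id has been passed.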
