Release v1.1.2 - Duplicate filename fix + promoter extraction speedup · Ayushmania2002/Cis-GS

What's new in v1.1.2

Bug fix — output overwrite freeze

Re-running promoter extraction with an existing run name caused the app
to enter a "not responding" state on Windows. The root cause was silent
overwriting of a large existing FASTA file: Python's open(..., "w")
truncates and rewrites the file synchronously, stalling the I/O scheduler
for multi-hundred-MB outputs (e.g. Arachis hypogaea promoters).

Fix: a unique_path() guard now auto-increments output names
(run_name → run_name(1) → run_name(2) …) before any write is attempted.
The run-name field in the UI reflects the actual name used. The same guard
applies to genome FASTA and GFF3 saves in the Download tab.
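
As an illustration of the auto-increment behaviour described above, here is a minimal sketch of what such a guard could look like. The body is an assumption written for this note and is not the shipped implementation. The only detail taken from the release is the naming scheme (run_name → run_name(1) → run_name(2) …). Keeping the file extension after the counter is also an assumption.

```python
from pathlib import Path


def unique_path(path: Path) -> Path:
    """Return `path` if nothing exists there, else the first free 'stem(n)suffix' variant.

    Hypothetical sketch: the real unique_path() in the app may differ.
    """
    if not path.exists():
        return path
    n = 1
    while True:
        # run_name.fasta -> run_name(1).fasta -> run_name(2).fasta ...
        candidate = path.with_name(f"{path.stem}({n}){path.suffix}")
        if not candidate.exists():
            return candidate
        n += 1
```

The check and the later write are not atomic, so this sketch suits a single-user desktop tool rather than concurrent writers. The caller would open the returned path for writing and show its name in the run-name field.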

Performance — chromosome-grouped promoter extraction

Previous behaviour: SeqIO.index() was called once per gene record.
For a genome with N genes distributed across C chromosomes, this
produced N independent random-seek file reads: approximately 69,000
disk seeks for A. hypogaea (gnm2, 20 chromosomes, ~69,000 annotated
genes). Wall time: ~57 s on a typical NVMe SSD.

New behaviour: Gene records are first grouped by sequence ID (chromosome).
Each chromosome sequence is then fetched from disk exactly once and
cached as a plain Python str object in memory. Promoter slicing and
reverse-complement operations are performed entirely in RAM against this
cached string. Reverse complement uses str.maketrans + slice reversal
rather than constructing a BioPython Seq object per gene, eliminating
per-record object allocation overhead.
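
The grouping strategy can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the project's actual code. The Gene tuple, the extract_promoters name, the fetch callback and the 1000 bp default upstream length are all hypothetical. Coordinates are assumed to be 1-based and inclusive, as in GFF3.

```python
from collections import defaultdict
from typing import Callable, Iterable, Iterator, NamedTuple, Tuple

# Complement table; characters not listed (e.g. IUPAC codes) pass through unchanged.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


class Gene(NamedTuple):  # hypothetical record type for this sketch
    gene_id: str
    seq_id: str  # chromosome or scaffold name
    start: int   # 1-based inclusive
    end: int     # 1-based inclusive
    strand: str  # "+" or "-"


def reverse_complement(seq: str) -> str:
    """Complement via str.translate, then reverse via slicing."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_promoters(
    genes: Iterable[Gene],
    fetch: Callable[[str], str],
    upstream: int = 1000,  # assumed default, not taken from the release notes
) -> Iterator[Tuple[str, str]]:
    """Yield (gene_id, promoter), loading each chromosome once via `fetch`."""
    by_chrom = defaultdict(list)
    for gene in genes:
        by_chrom[gene.seq_id].append(gene)

    for seq_id, group in by_chrom.items():
        chrom = fetch(seq_id)  # the only disk access for this chromosome
        for gene in group:
            if gene.strand == "+":
                # Region immediately before the gene start, clipped at 0.
                lo = max(0, gene.start - 1 - upstream)
                promoter = chrom[lo:gene.start - 1]
            else:
                # Region immediately after the gene end on the forward strand,
                # clipped at the chromosome end, then reverse-complemented.
                hi = min(len(chrom), gene.end + upstream)
                promoter = reverse_complement(chrom[gene.end:hi])
            yield gene.gene_id, promoter
```

With Biopython, fetch could be something like `lambda sid: str(index[sid].seq)`, where `index = SeqIO.index("genome.fa", "fasta")` is built once and "genome.fa" is a placeholder path. Because chrom is rebound on each outer iteration, only one chromosome string is referenced at a time.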

Complexity: O(N) disk reads reduced to O(C) disk reads.
For A. hypogaea: 69,000 → 20 chromosome reads.

Result: ~57 s (benchmarked) → ~10–20 s (expected), a roughly 3–6× speedup.
Memory overhead: one chromosome sequence string held in RAM at a time
(largest A. hypogaea chromosome ~160 Mb; well within typical RAM limits).
