Fix Transcript.exons crash when GTF lacks exon_id attribute by iskandr · Pull Request #331 · openvax/pyensembl

iskandr · 2026-04-21T16:17:42Z

Summary

Some GTFs (Ensembl release 54 and earlier, plus non-Ensembl GTFs) omit the exon_id attribute. pyensembl's installer already treats that column as optional (see `database.py:134`), but `Transcript.exons` still unconditionally SELECTed exon_id, so any call on such a genome crashed with:
```
sqlite3.OperationalError: no such column: exon_id
```
This surfaced in pirlygenes: its FN1 tests call `transcript.exons`, and the local pyensembl cache (built from an old release) has an exon table without the exon_id column.
`Transcript.exons` now checks `db.column_exists("exon", "exon_id")` and falls back to constructing Exon objects directly from the exon row, with a synthesized per-transcript ID of the form `"<transcript_id>_exon_<n>"`.
Exon objects returned from the fallback path carry the real contig/start/end/strand/gene coordinates from the GTF; only the id is synthetic.
Regression test builds a minimal Ensembl-style GTF with `exon_number` but no `exon_id` and verifies exon ordering and synthesized IDs.
Bumps to 2.6.7.
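
For readers unfamiliar with the pattern, the column check and fallback described above can be sketched with plain `sqlite3`. This is an illustration only, not the code in this PR: the helper names, the table layout, and the assumption that `<n>` is the row's `exon_number` are all hypothetical.

```python
import sqlite3


def column_exists(conn, table, column):
    """Hypothetical stand-in for db.column_exists: inspect the table schema."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def exon_ids_for_transcript(conn, transcript_id):
    """Return exon IDs in exon_number order, synthesizing them if needed."""
    if column_exists(conn, "exon", "exon_id"):
        # Normal path: the GTF supplied real exon IDs.
        query = (
            "SELECT exon_id FROM exon "
            "WHERE transcript_id = ? ORDER BY exon_number"
        )
        return [row[0] for row in conn.execute(query, (transcript_id,))]
    # Fallback path: no exon_id column, so build "<transcript_id>_exon_<n>".
    # Assumption: <n> is taken from exon_number.
    query = (
        "SELECT exon_number FROM exon "
        "WHERE transcript_id = ? ORDER BY exon_number"
    )
    return [
        f"{transcript_id}_exon_{number}"
        for (number,) in conn.execute(query, (transcript_id,))
    ]


def build_demo_db():
    """In-memory exon table that lacks exon_id, as with an old-release GTF."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE exon (transcript_id TEXT, exon_number INTEGER, "
        "start INTEGER, end INTEGER)"
    )
    # Rows are inserted out of order to show that ordering comes from the query.
    conn.executemany(
        "INSERT INTO exon VALUES (?, ?, ?, ?)",
        [("T1", 2, 300, 400), ("T1", 1, 100, 200)],
    )
    return conn


if __name__ == "__main__":
    print(exon_ids_for_transcript(build_demo_db(), "T1"))
```

Checking the schema once per query, rather than catching `sqlite3.OperationalError`, keeps the fallback explicit and avoids masking unrelated SQL errors.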

Test plan

New `test_transcript_exons_without_exon_id` passes.
Full local suite (120 tests excluding HLA-A data-drift ones) passes.
CI green on Python 3.9–3.12 (exercises the normal path — release 75/77/93 GTFs all have exon_id).
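
As a rough picture of the input the new regression test relies on, the sketch below builds a few Ensembl-style GTF exon lines that carry `exon_number` but no `exon_id`, then parses the attribute column to confirm that. The identifiers (`G1`, `T1`), coordinates, and the `parse_attributes` helper are made up for illustration and are not taken from the actual test.

```python
# Hypothetical GTF exon records: nine tab-separated columns, the last of which
# holds 'key "value";' attribute pairs. Note the absence of exon_id.
GTF_LINES = [
    "\t".join([
        "1", "test", "exon", "100", "200", ".", "+", ".",
        'gene_id "G1"; transcript_id "T1"; exon_number "1";',
    ]),
    "\t".join([
        "1", "test", "exon", "300", "400", ".", "+", ".",
        'gene_id "G1"; transcript_id "T1"; exon_number "2";',
    ]),
]


def parse_attributes(field):
    """Parse the GTF attribute column into a dict of key -> unquoted value."""
    attrs = {}
    for item in field.strip().split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition(" ")
        attrs[key] = value.strip('"')
    return attrs


def exon_records(lines):
    """Return (start, end, attributes) for every exon line."""
    records = []
    for line in lines:
        columns = line.split("\t")
        if columns[2] != "exon":
            continue
        records.append((int(columns[3]), int(columns[4]), parse_attributes(columns[8])))
    return records


if __name__ == "__main__":
    for start, end, attrs in exon_records(GTF_LINES):
        print(start, end, attrs)
```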

Ensembl release 54 and some non-Ensembl GTFs (e.g. UCSC refseq/gencode) omit the exon_id attribute. pyensembl's installer already treats the column as optional (database.py:134), but Transcript.exons still unconditionally SELECTed exon_id, crashing with sqlite3.OperationalError: no such column: exon_id. Transcript.exons now checks db.column_exists("exon", "exon_id") and falls back to building Exon objects directly from the exon row with a synthesized per-transcript ID of the form "<transcript_id>_exon_<n>". Adds a regression test that builds an Ensembl-style GTF with exon_number but no exon_id and verifies both exon ordering and synthesized IDs. Bumps to 2.6.7.

coveralls · 2026-04-21T16:24:30Z

coverage: 83.468% (+0.3%) from 83.208% — fix-exon-id-missing into main

iskandr merged commit a35f787 into main Apr 21, 2026
10 checks passed

iskandr deleted the fix-exon-id-missing branch April 21, 2026 16:46
