feat!: UTA access class separation of concerns, postgres lib fixes #465
jsstevenson wants to merge 18 commits into
Will review this tomorrow
resp = await uta_repo.get_alt_ac_start_or_end("NM_152263.3", 822, 892, None)
assert resp == tpm3_1_8_end_genomic

with pytest.raises(NoMatchingAlignmentError):

Consistency?

- with pytest.raises(NoMatchingAlignmentError):
+ with pytest.raises(
+     NoMatchingAlignmentError,
+     match=re.escape(
+         "Unable to find a result where NM_152263.63 has transcript coordinates (tx_exon_start=822, tx_exon_end=892) between an exon's start and end coordinates on gene=None"
+     ),
+ ):
Co-authored-by: Kori Kuzma <korikuzma@gmail.com>
just as a heads-up -- it occurred to me that maybe this class should be tailored to ensure compatibility with the hgvs dataprovider interface so that it can be reused there. so for now i am pausing this to enable reflection
ugh I think I would like to see a change in hgvs rather than adapting this function |
korikuzma left a comment
I had to change uta-setup.sql to the following to get tests to pass. Otherwise, I got this error when running tests: psycopg.errors.InsufficientPrivilege: must be owner of table genomic
\c uta;
GRANT CONNECT ON DATABASE uta TO anonymous;
GRANT USAGE ON SCHEMA uta_20241220 TO anonymous;
CREATE TABLE IF NOT EXISTS uta_20241220.genomic AS
    SELECT
        t.hgnc,
        aes.alt_ac,
        aes.alt_aln_method,
        aes.alt_strand,
        ae.start_i AS alt_start_i,
        ae.end_i AS alt_end_i
    FROM uta_20241220.transcript t
    JOIN uta_20241220.exon_set tes
        ON t.ac = tes.tx_ac
        AND tes.alt_aln_method = 'transcript'
    JOIN uta_20241220.exon_set aes
        ON t.ac = aes.tx_ac
        AND aes.alt_aln_method <> 'transcript'
    JOIN uta_20241220.exon te
        ON tes.exon_set_id = te.exon_set_id
    JOIN uta_20241220.exon ae
        ON aes.exon_set_id = ae.exon_set_id
        AND te.ord = ae.ord
    LEFT JOIN uta_20241220.exon_aln ea
        ON te.exon_id = ea.tx_exon_id
        AND ae.exon_id = ea.alt_exon_id;
CREATE INDEX IF NOT EXISTS alt_pos_index
    ON uta_20241220.genomic (alt_ac, alt_start_i, alt_end_i);
CREATE INDEX IF NOT EXISTS gene_alt_index
    ON uta_20241220.genomic (hgnc, alt_ac);
CREATE INDEX IF NOT EXISTS alt_ac_index
    ON uta_20241220.genomic (alt_ac);
ALTER TABLE uta_20241220.genomic OWNER TO anonymous;
ALTER INDEX uta_20241220.alt_pos_index OWNER TO anonymous;
ALTER INDEX uta_20241220.gene_alt_index OWNER TO anonymous;
ALTER INDEX uta_20241220.alt_ac_index OWNER TO anonymous;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA uta_20241220 TO anonymous;
ALTER DEFAULT PRIVILEGES IN SCHEMA uta_20241220
    GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO anonymous;
ALTER DATABASE uta OWNER TO anonymous;
ALTER SCHEMA uta_20241220 OWNER TO anonymous;

When I changed to this, all tests passed using both DB URLs (new and legacy)
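
For context on the two URL forms: the legacy form names the schema as a second path segment (`/uta/uta_20241220`), while the new form passes it through `options=-csearch_path`. Below is a minimal sketch of how one could be converted into the other using only the standard library. The function name `to_search_path_url` is hypothetical, and this is not the PR's `_normalize_uta_db_url` implementation; it also assumes the URL does not already carry an `options` parameter.

```python
import warnings
from urllib.parse import quote, urlsplit, urlunsplit


def to_search_path_url(url: str) -> str:
    """Convert a legacy ``.../<db>/<schema>`` URL to the ``options=-csearch_path`` form.

    Hypothetical helper for illustration only. URLs whose path does not have
    exactly two segments are returned unchanged.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) != 2:
        return url  # already in the standard form (or not something we recognize)

    db_name, schema = segments
    warnings.warn(
        "The /<db>/<schema> URL form is deprecated; use options=-csearch_path instead",
        DeprecationWarning,
        stacklevel=2,
    )
    # Percent-encode the "=" inside the option value, but keep the comma readable
    option = quote(f"-csearch_path={schema},public", safe=",")
    query = f"{parts.query}&options={option}" if parts.query else f"options={option}"
    return urlunsplit((parts.scheme, parts.netloc, f"/{db_name}", query, parts.fragment))
```

Called on `postgresql://uta_admin@localhost:5432/uta/uta_20241220`, this emits a `DeprecationWarning` and moves the schema into the query string; a URL that is already in the new form passes through untouched.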
@korikuzma what if we could just ditch the genomic table entirely? per friendly chatgpt, with basically no loss of performance (this is important, it's used in the variation normalizer)
Sent in Slack, but posting here. No usage. Let's remove both. 🚀
Ok, I think I covered my bases here?
Basically a lot of tech debt fixes
- asyncpg to psycopg. Honestly, asyncpg might be faster but we are generally using psycopg elsewhere so might as well stick with one thing.
- UtaDatabase class that basically retains this behavior so we can minimize needed changes in the short term (+ less work for me to update tests)
- postgresql://uta_admin@localhost:5432/uta?options=-csearch_path%3Duta_20241220,public rather than /uta/uta_20241220
- postgresql://stuff/(db_name)/(schema_name) thing is nonstandard (I think it was basically invented for the hgvs library) and not in keeping with more typical postgres access patterns. In general, there are two standards for describing postgres connections -- as a key-value string and as a URI. So, I think if we're gonna have users go with the URI way, we should adhere to the standard.
- _normalize_uta_db_url function that converts it to the new way and issues a deprecation warning. If this ends up being really annoying we can also just silence the warning.
- uta_20241220 by default, but this is set at connection pool creation via the connection string)
- %(argname)s syntax
- (result, failure_description) tuple where the result and the failure values were mutually exclusive into a function that returned the result value, and raised an exception in case of failure.
- genomic table initializer method, and also make it optional for the pool factory function to attempt creation of the genomic table (I think this addresses "UTA database recreates schema on startup" #430)

basically I think the ideal FastAPI usage for just UTA should look like
and within coolseqtool itself, or in other contexts where you might want to pass around the entire coolseqtool god class instance, you can do something like
A few misc engineering notes I thought about while working on this
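
As an illustration of the error-handling change listed above (returning the result and raising on failure, instead of returning a `(result, failure_description)` tuple), here is a minimal sketch. Everything in it is hypothetical: the lookup functions, the in-memory table, and the placeholder values stand in for a real UTA query, and `NoMatchingAlignmentError` is redefined locally only so the example runs on its own.

```python
class NoMatchingAlignmentError(Exception):
    """Local stand-in for the exception used in the tests."""


# Placeholder data standing in for a database query (values are made up)
_ALIGNMENTS = {("NM_152263.3", 822, 892): ("ALT_AC_PLACEHOLDER", 0)}


def lookup_old(tx_ac: str, start: int, end: int):
    """Old style: return (result, failure_description); exactly one is None."""
    hit = _ALIGNMENTS.get((tx_ac, start, end))
    if hit is None:
        return None, f"Unable to find a result for {tx_ac}"
    return hit, None


def lookup_new(tx_ac: str, start: int, end: int):
    """New style: return the result directly, raise if nothing matches."""
    hit = _ALIGNMENTS.get((tx_ac, start, end))
    if hit is None:
        raise NoMatchingAlignmentError(f"Unable to find a result for {tx_ac}")
    return hit
```

With the old shape every caller has to unpack the tuple and remember to check the second element; with the new shape the success path stays linear and a test can assert on the failure with `pytest.raises(NoMatchingAlignmentError, match=...)`.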