Skip to content
David Osumi-Sutherland edited this page Dec 11, 2013 · 4 revisions

FB CHADO PSQL DB

We run a local copy of the FB chado PostgresSQL DB for the purposes of running annotation queries and finding details of pubs.

CHADO is a very large and complicated database. In order to use it directly to run queries on VFB, we generate views using a set of SQL scripts that run every time we update our local copy of CHADO.

SQL script used to generate view tables

Challenges

The main problem for our purposes is that expression is recorded for gene and transgene products, but we need to display the relevant genes and constructs/insertions. Mapping across - especially from transgene products to constructs/insertions requires iterative queries of the central CHADO feature_relation table. The logic of design for this table (in particular the choice of relations used) is not at all clear and is very poorly documented. This table has also been subject to poorly documented change in the past. The most important job that our view tables do is to directly map gene and transgene products to genes and insertions/constructs.

Brief summary of view tables:

vfbview_fbbt: A mapping of ontology term names to IDs for FBbt. (The connection is surprisingly circuitous in CHADO).

vfbview_gene_expressed_gp: Maps gene products for which expression data exists to genes.

vfbview_transgene_expressed_gp_pkey: Maps transgene products for which expression data exists to constructs/insertions.

vfbview_phenstatement_feature: Maps phenstatements to features (the mapping in chado is to whole genotypes. There is a reasonable case for displaying full phenotypes instead - perhaps as part of a phenotype report.

vfbview_feature_synonym: Links feature_ids directly to valid names and symbols and their synonyms in both ascii and unicode. Not currently used - but plan to use this in future for: (a) Displaying unicode names for features; (b) Search and query on feature names+synonyms. Note - this table may need some work to add keys indexes.

Queries

Queries used on the site can be found here.

Here are the expression annotation queries:

/* Generic transgene expression Query: */

SELECT DISTINCT fbbt.cvterm_name AS anatomy, fbbt.db_name AS idp, fbbt.accession, teg.transgene_name, teg.transgene_uniquename, pub.miniref, pub.uniquename AS fbrf
    FROM vfbview_fbbt fbbt
    JOIN expression_cvterm ec1 ON (fbbt.cvterm_id = ec1.cvterm_id)
    JOIN expression_cvterm ec2 ON (ec2.expression_id = ec1.expression_id)
    JOIN cvterm stage ON (stage.cvterm_id = ec2.cvterm_id)
    JOIN feature_expression fe ON (ec1.expression_id = fe.expression_id)
    JOIN pub ON (fe.pub_id = pub.pub_id)
    JOIN vfbview_transgene_expressed_gp teg ON (teg.gp_feature_id = fe.feature_id)
    WHERE fbbt.accession IN ()
    AND stage.name = 'adult stage';

/* Generic gene expression query */

SELECT DISTINCT fbbt.cvterm_name AS anatomy, fbbt.db_name AS idp, fbbt.accession, geg.gene_name, geg.gene_uniquename, pub.miniref, pub.uniquename AS fbrf
    FROM vfbview_fbbt fbbt
    JOIN expression_cvterm ec1 ON (fbbt.cvterm_id = ec1.cvterm_id)
    JOIN expression_cvterm ec2 ON (ec2.expression_id = ec1.expression_id)
    JOIN cvterm stage ON (stage.cvterm_id = ec2.cvterm_id)
    JOIN feature_expression fe ON (ec1.expression_id = fe.expression_id)
    JOIN pub ON (fe.pub_id = pub.pub_id)
    JOIN vfbview_gene_expressed_gp geg ON (geg.gp_feature_id = fe.feature_id)
    WHERE fbbt.accession IN ()
    AND stage.name = 'adult stage';

In actual queries fbbt.accession IN () is populated with a list generated by a DL query.

Note the restriction to adult stage. In future versions this restriction should either be dropped completely or extended to => appropriate gross stage restrictions for terms. This will require classification of terms as larval or adult. If we always have appropriate, stage specific terms and these are always used correctly in annotation, this stage restriction would not be necessary. We do have complete or near complete sets of specific terms for embryonic/larval and adult structures - pupal less so. There is some stage inappropriate use of terms in annotation, but this is being cleaned up and is most seriously a problem for the embryo.