Edits to InterMine Checkout

mcwbbc edited this page Feb 8, 2012 · 3 revisions

Before building RatMine, the stock InterMine checkout needs some customization. Most of these edits are to source configuration files, to ensure that data is integrated correctly.

Uniprot
Both the Uniprot configuration and keys files need edits.

MINE/bio/sources/uniprot/main/resources/uniprot_config.properties

10116.primaryIdentifier = RGD

MINE/bio/sources/uniprot/resources/uniprot_keys.properties

Protein.key_primaryaccession = primaryAccession

KEGG Pathway
Both the KEGG Pathway configuration and keys files need edits. KEGG Pathway uses the obsolete key-style definitions; they can be updated to the current style, but this is not needed.

MINE/bio/sources/kegg-pathway/main/resources/kegg_config.properties

# rat
rno.taxonId = 10116
rno.identifier = ncbiGeneNumber

MINE/bio/sources/kegg-pathway/resources/kegg-pathway_keys.properties

Gene=key_primaryidentifier, key_secondaryidentifier, key_ncbigenenumber

FASTA
The FASTA source is missing a keys file for RatMine. Create the properties file and add the key definition to it.

MINE/bio/sources/fasta/resources/rat-chromosome-fasta_keys.properties

Chromosome.key_primaryidentifier=primaryIdentifier
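
The file creation described above can be scripted. The following is only a minimal sketch, assuming Python 3 is available and that MINE is the root of the InterMine checkout; the helper name ensure_key_line is hypothetical and not part of InterMine or RatMine. It creates the keys file if it is missing and appends the key definition only when it is not already present.

```python
from pathlib import Path

# Path (relative to MINE) and key definition as listed in this section.
KEYS_FILE = "bio/sources/fasta/resources/rat-chromosome-fasta_keys.properties"
KEY_LINE = "Chromosome.key_primaryidentifier=primaryIdentifier"


def ensure_key_line(mine_root, rel_path=KEYS_FILE, line=KEY_LINE):
    """Hypothetical helper: make sure `line` is present in the keys file.

    Returns True if the file was created or changed, False if the line
    was already there.
    """
    path = Path(mine_root) / rel_path
    text = path.read_text() if path.exists() else ""
    if line in (existing.strip() for existing in text.splitlines()):
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    # Start on a fresh line if the existing file lacks a trailing newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    path.write_text(text + prefix + line + "\n")
    return True
```

Calling ensure_key_line("/path/to/MINE") a second time leaves the file unchanged, so the script is safe to re-run after updating the checkout.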

BIOGRID
The BioGrid source's configuration file needs to be set to use RGD identifiers.

MINE/bio/sources/biogrid/main/resources/biogrid_config.properties

# rat
10116.xref.primaryIdentifier = rgd

PSI IntAct
The PSI source's configuration file needs to be set to use RGD identifiers.

MINE/bio/sources/psi/main/resources/psi-intact_config.properties

# Rattus norvegicus
10116.identifier = primaryIdentifier
10116.datasource = rgd
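
After a fresh checkout or an InterMine update it is easy to lose these settings. The following is a minimal sketch for spot-checking the BioGrid and PSI entries above, assuming Python 3 and that MINE is the checkout root. The function names are hypothetical, and the parser is deliberately simplified: it handles plain key = value lines and comments only, not the full Java properties syntax (no line continuations or escapes).

```python
from pathlib import Path

# Expected entries copied from the BioGrid and PSI sections above.
# Paths are relative to the MINE checkout root.
EXPECTED = {
    "bio/sources/biogrid/main/resources/biogrid_config.properties": {
        "10116.xref.primaryIdentifier": "rgd",
    },
    "bio/sources/psi/main/resources/psi-intact_config.properties": {
        "10116.identifier": "primaryIdentifier",
        "10116.datasource": "rgd",
    },
}


def parse_properties(text):
    """Simplified parser: 'key = value' lines; '#' and '!' start comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_entries(mine_root, expected=EXPECTED):
    """Return (path, key, wanted, found) for every entry that does not match."""
    problems = []
    for rel_path, entries in expected.items():
        path = Path(mine_root) / rel_path
        props = parse_properties(path.read_text()) if path.exists() else {}
        for key, want in entries.items():
            got = props.get(key)
            if got != want:
                problems.append((rel_path, key, want, got))
    return problems
```

An empty list from missing_entries("/path/to/MINE") means both files contain the values shown above; the same dictionary can be extended with the other configuration edits on this page.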

GFF3
RatMine has its own GFF3 parser; however, to get RatMine to load duplicate IDs, the parent class needs a one-line change. Add the following line at approximately line 125 of InterMine’s GFF3Converter.java file.

MINE/bio/core/main/src/org/intermine/bio/dataconversion/GFF3Converter.java

duplicates = false;
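
To see which IDs in a GFF3 file are actually duplicated before loading, something like the following can be used. This is a minimal sketch, not part of InterMine or RatMine: it assumes Python 3 and standard GFF3 input (nine tab-separated columns, with the ID attribute in the ninth), and the function name duplicate_gff3_ids is hypothetical.

```python
from collections import Counter


def duplicate_gff3_ids(lines):
    """Return {ID: count} for ID attributes used on more than one feature line."""
    counts = Counter()
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # sequence data follows; no more feature lines
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        columns = line.split("\t")
        if len(columns) != 9:
            continue  # not a well-formed feature line
        # Column 9 holds semicolon-separated tag=value attributes.
        for attribute in columns[8].split(";"):
            tag, sep, value = attribute.strip().partition("=")
            if sep and tag == "ID":
                counts[value] += 1
    return {feature_id: n for feature_id, n in counts.items() if n > 1}
```

The function accepts any iterable of lines, so an open file handle can be passed directly. Note that GFF3 allows one ID to appear on several lines for a discontinuous feature, so a reported ID is not necessarily an error in the file.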
