Gene/protein names should be case normalized #2

johnbachman · 2015-02-01T14:29:38Z

Gene names are parsed from BELscript/RDF in whatever case they appear, so some rules refer to, e.g., "Braf" and others to "BRAF". When converted into PySB, these differently cased names become different monomers.

Possible solutions:

Store all names in ALL CAPS
Store all names with only the first letter capitalized
Use the first appearance of a gene name as the convention: store it, then canonicalize every other-case appearance of that gene to it
Look up the name in a database somewhere.
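
To make two of the options above concrete, here is a minimal sketch of the ALL CAPS approach and the first-appearance approach. The names `normalize_upper` and `FirstSeenCanonicalizer` are hypothetical and are not taken from the codebase; the sketch works on plain strings and does not touch BELscript/RDF parsing or PySB.

```python
def normalize_upper(name):
    """ALL CAPS option: store every gene name in upper case."""
    return name.upper()


class FirstSeenCanonicalizer:
    """First-appearance option: the first spelling seen becomes canonical."""

    def __init__(self):
        # Maps a case-folded name to the first spelling encountered for it.
        self._canonical = {}

    def normalize(self, name):
        # setdefault stores `name` only if the case-folded key is new,
        # and returns the stored (first-seen) spelling either way.
        return self._canonical.setdefault(name.casefold(), name)


names = ["Braf", "BRAF", "braf", "Mek1"]

upper_names = [normalize_upper(n) for n in names]

canonicalizer = FirstSeenCanonicalizer()
first_seen_names = [canonicalizer.normalize(n) for n in names]
```

The ALL CAPS option is stateless, so the result does not depend on the order in which statements are parsed. The first-appearance option preserves whatever spelling the source used first, but its output depends on parse order.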

Addresses issue #2. Also identified some duplicate rules, which I addressed in the specific case of the ActivityModifications by checking the model for preexisting rules before adding.
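
The duplicate check described above can be sketched roughly as follows. This is an illustration only: `RuleRegistry` is a hypothetical stand-in for the model's rule collection, it assumes rules are identified by name, and it does not use the PySB API.

```python
class RuleRegistry:
    """Hypothetical stand-in for a model's collection of rules."""

    def __init__(self):
        # Maps rule name to rule body.
        self._rules = {}

    def add_rule(self, name, rule):
        """Add a rule unless one with the same name exists.

        Returns True if the rule was added, False if it was a duplicate.
        """
        if name in self._rules:
            return False
        self._rules[name] = rule
        return True

    def __len__(self):
        return len(self._rules)


registry = RuleRegistry()
added_first = registry.add_rule("BRAF_activates_MEK1", "rule body")
added_again = registry.add_rule("BRAF_activates_MEK1", "rule body")
```

A name-based check like this only catches duplicates once gene names have been case normalized; otherwise "Braf" and "BRAF" would still produce two differently named rules.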

johnbachman · 2015-02-02T17:23:52Z

Closed by c7628db (resolved to ALL CAPS, at least for now)

johnbachman mentioned this issue Feb 1, 2015

Check for duplicate BelPy statements #3

Closed

johnbachman closed this as completed Feb 2, 2015

johnbachman referenced this issue in johnbachman/indra May 18, 2015

Merge pull request #2 from jmuhlich/biopax_demo

44287aa

add heap size setting; fix data source url in comment

bgyori mentioned this issue Oct 12, 2020

Reimplementation / significant speedup of preassembly algorithm #1177

Merged
