Gene/protein names should be case normalized #2

Closed
johnbachman opened this issue Feb 1, 2015 · 1 comment

Comments

@johnbachman
Member

Gene names are parsed from BELscript/RDF in whatever case they appear, so some rules refer to "Braf" while others refer to "BRAF". When the rules are converted into PySB, these case variants produce separate monomers for the same gene.

Possible solutions:

  • Store all names in ALL CAPS
  • Store all names with only the first letter capitalized
  • Treat the first appearance of a gene name as the convention: store it, then canonicalize every other-case appearance of that gene to it
  • Look up the canonical name in a database
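
For illustration, the first three options could be sketched roughly as follows. This is only a sketch, not the code that was committed: the function and class names (`normalize_upper`, `normalize_capitalized`, `FirstSeenNormalizer`) are hypothetical, and the database lookup is left out because no particular database is named here.

```python
def normalize_upper(name):
    """Option 1: store every name in ALL CAPS."""
    return name.upper()


def normalize_capitalized(name):
    """Option 2: first letter upper case, remaining letters lower case."""
    return name.capitalize()


class FirstSeenNormalizer:
    """Option 3: the first spelling seen for a gene becomes canonical.

    Hypothetical helper; spellings are matched case-insensitively.
    """

    def __init__(self):
        # Maps the case-folded name to the first spelling encountered.
        self._canonical = {}

    def normalize(self, name):
        # setdefault stores the spelling only if the key is new,
        # and always returns the stored (first-seen) spelling.
        return self._canonical.setdefault(name.casefold(), name)


names = ["Braf", "BRAF", "braf"]

upper = {normalize_upper(n) for n in names}
capitalized = {normalize_capitalized(n) for n in names}

first_seen = FirstSeenNormalizer()
first = {first_seen.normalize(n) for n in names}
```

With any of the three, the variants collapse to a single name, so they would map to a single monomer. The first-seen approach depends on parse order, which is one reason a fixed convention such as ALL CAPS is simpler to reason about.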
johnbachman referenced this issue in johnbachman/indra Feb 2, 2015
Addresses issue #2. Also identified some duplicate rules, which I
addressed in the specific case of the ActivityModifications by checking
the model for preexisting rules before adding.
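
The "check for preexisting rules before adding" idea in the commit message might look roughly like the sketch below. `ToyModel`, `add_rule_once`, and the rule name are hypothetical stand-ins for illustration, not the PySB or INDRA API, and the actual commit may do this differently.

```python
class ToyModel:
    """Hypothetical stand-in for a model that holds named rules."""

    def __init__(self):
        self.rules = {}  # rule name -> rule definition

    def add_rule_once(self, name, definition):
        """Add the rule unless one with the same name already exists.

        Returns True if the rule was added, False if it was a duplicate.
        """
        if name in self.rules:
            return False
        self.rules[name] = definition
        return True


model = ToyModel()
# Illustrative rule name only; the definition is a placeholder string.
added_first = model.add_rule_once("BRAF_example_rule", "placeholder definition")
added_again = model.add_rule_once("BRAF_example_rule", "placeholder definition")
```

The second call is rejected, so the model ends up with exactly one copy of the rule.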
@johnbachman
Member Author

Closed by c7628db (resolved to ALL CAPS, at least for now)

johnbachman referenced this issue in johnbachman/indra May 18, 2015
add heap size setting; fix data source url in comment