Wrong regular expression for enzyme in mzid 1.2 export #64

Closed
julianu opened this issue Jul 10, 2024 · 1 comment

Comments

@julianu

julianu commented Jul 10, 2024

In the current release (v2024.01), the mzid export writes the regular expression for trypsin as
((?<=[KR]) (?!P))

This is incorrect: the expression contains a blank and therefore does not match the cleavage sites (at least not in common regex interpreters such as Python and Java, on regex101.com, or in my code). Also, the outer parentheses are probably unnecessary. I have only tested trypsin so far, but other enzymes might be affected as well.
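
The mismatch can be reproduced with a minimal Python sketch. The peptide sequence and the helper name `cleavage_sites` are made up for illustration and are not part of Comet; the only inputs taken from this report are the two patterns.

```python
import re

# Pattern as quoted above, including the blank between the two lookarounds.
EXPORTED = r"((?<=[KR]) (?!P))"
# Same pattern with the blank and the outer parentheses removed.
CORRECTED = r"(?<=[KR])(?!P)"


def cleavage_sites(pattern, sequence):
    """Return the 0-based positions at which the pattern matches."""
    return [m.start() for m in re.finditer(pattern, sequence)]


# Hypothetical sequence: K at 2, R at 3 (followed by P), R at 7, K at 10.
peptide = "AAKRPGGRLLK"

print(cleavage_sites(EXPORTED, peptide))   # [] because a literal blank never occurs
print(cleavage_sites(CORRECTED, peptide))  # [3, 8, 11]
```

With the blank in place, the pattern requires a literal space character after K or R, so it never matches inside a plain amino acid sequence. The corrected pattern matches after K (position 3), after the second R (position 8), and at the end of the sequence after the final K (position 11), and it skips the R that is followed by P.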

For the more common enzymes listed in the PSI ontology, it would also be great to export the more complete form like:

<Enzyme nTermGain="H" cTermGain="OH" semiSpecific="false" id="enzyme_1_1">
    <SiteRegexp>(?&lt;=[KR])(?!P)</SiteRegexp>
    <EnzymeName>
        <cvParam cvRef="PSI-MS" accession="MS:1001251" name="Trypsin"/>
    </EnzymeName>
</Enzyme>
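
As a rough check of how a reader of the file would consume this form, here is a sketch that parses the fragment above with Python's standard library and applies the regular expression. It assumes the fragment stands alone without an XML namespace (a full mzIdentML document declares one, so the lookups would need to be namespace-qualified there), and the peptide sequence is again hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# The fragment from this report, parsed on its own (no namespace declared).
FRAGMENT = """<Enzyme nTermGain="H" cTermGain="OH" semiSpecific="false" id="enzyme_1_1">
    <SiteRegexp>(?&lt;=[KR])(?!P)</SiteRegexp>
    <EnzymeName>
        <cvParam cvRef="PSI-MS" accession="MS:1001251" name="Trypsin"/>
    </EnzymeName>
</Enzyme>"""

enzyme = ET.fromstring(FRAGMENT)

# The XML parser turns the escaped "&lt;" back into "<".
site_regexp = enzyme.findtext("SiteRegexp")
enzyme_name = enzyme.find("EnzymeName/cvParam").get("name")

# Splitting on a zero-width match requires Python 3.7 or later.
# Hypothetical sequence; empty trailing pieces are dropped.
peptides = [p for p in re.split(site_regexp, "AAKRPGGRLLK") if p]

print(enzyme_name)  # Trypsin
print(site_regexp)  # (?<=[KR])(?!P)
print(peptides)     # ['AAK', 'RPGGR', 'LLK']
```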
@jke000
Collaborator

jke000 commented Jul 10, 2024

Julian, thanks for reporting this. I removed the extra blank space with commit 754514f. The fix will be available in the next Comet maintenance release, which I'll likely make in the next few weeks to address this and the other recently posted issues. I'll also add the complete form for the common enzymes before the next release.

@jke000 jke000 closed this as completed Jul 10, 2024