Protein coverage CV term #91

Closed
1 task
mwalzer opened this issue Sep 26, 2019 · 6 comments

@mwalzer
Collaborator

mwalzer commented Sep 26, 2019

Name:
Protein coverage

Definition:
The coverage of protein sequences by the identified peptide sequences. The table records the coverage itself (in percent), the protein accession (SHOULD correspond to the accession used in the FASTA document used for sequence-DB-based identification), the length of the protein, and optionally whether it is a 'target' or 'decoy' entry.

Comment: The protein coverage can provide insight into the sensitivity of the instrument/identification/protocols used.

Proposed value type:

  • table
    {'coverage':[...], 'accession':[...], 'length':[...], *'TD':[...]}
    *optional
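
A minimal sketch of how such a table could be filled, assuming coverage means the percentage of residues that fall inside at least one exact occurrence of an identified peptide. The function names and input shapes (`proteins`, `identified`, `td_labels`) are hypothetical and not part of the proposal, and exact substring matching ignores modifications and I/L ambiguity.

```python
def coverage_percent(protein_seq, peptides):
    """Percent of residues covered by at least one identified peptide."""
    if not protein_seq:
        return 0.0
    covered = [False] * len(protein_seq)
    for peptide in peptides:
        if not peptide:
            continue
        start = protein_seq.find(peptide)
        while start != -1:
            # Mark every residue of this occurrence; overlaps count once.
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein_seq.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(protein_seq)


def build_coverage_table(proteins, identified, td_labels=None):
    """Build the table as corresponding lists.

    proteins:   accession -> protein sequence (hypothetical input shape)
    identified: accession -> list of identified peptide sequences
    td_labels:  optional accession -> 'target' or 'decoy'
    """
    table = {"coverage": [], "accession": [], "length": []}
    if td_labels is not None:
        table["TD"] = []  # optional column, only present when labels are given
    for accession, sequence in proteins.items():
        peptides = identified.get(accession, [])
        table["coverage"].append(coverage_percent(sequence, peptides))
        table["accession"].append(accession)
        table["length"].append(len(sequence))
        if td_labels is not None:
            table["TD"].append(td_labels[accession])
    return table
```

Keeping the columns as parallel lists mirrors the proposed value shape, so row `i` of every list describes the same protein.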
@mwalzer mwalzer added the request for new CV entry request for a new entry in the QC-CV label Sep 26, 2019
@mwalzer
Collaborator Author

mwalzer commented Sep 26, 2019

[image: examplenewplot]

@julianu
Contributor

julianu commented Oct 17, 2019

Please add comments (i.e. descriptions on how to interpret the values) for the metrics.

As we also have "table" as type, this would be a better suggestion in my opinion.

@julianu
Contributor

julianu commented Jan 14, 2020

@mwalzer please add a comment (an explanation of how to interpret the values)

@mwalzer
Collaborator Author

mwalzer commented Mar 20, 2020

Indeed, and I refined the definition a little.

Name:
Protein coverage

Definition:
The coverage of protein sequences by the identified peptide sequences. The table records the coverage itself (percentage), the protein accession (SHOULD correspond to the accession used in the document used for sequence-DB-based identification), the length of the protein, and optionally whether it is a decoy entry.

Comment:
Higher is better. Gives insight into how much of the target search space is covered. Low coverage can be caused by the absence of proteins, by the acquisition method's insensitivity to certain peptide species, or by shortcomings of the identification method.

Proposed value type:

  • single value
  • n-tuple
  • table
  • corresponding lists
  • matrix

{'coverage':[...], 'accession':[...], 'length':[...], *'TD':[...]}
*optional
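
As a rough illustration of the shape above, a consumer could check that the three required columns are present, that `TD` is the only extra column, and that all lists line up. This is a sketch under those assumptions; `validate_coverage_table` is a hypothetical helper, not an existing API.

```python
REQUIRED = ("coverage", "accession", "length")
OPTIONAL = ("TD",)  # marked optional in the proposal


def validate_coverage_table(table):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for col in REQUIRED:
        if col not in table:
            problems.append(f"missing required column: {col}")
    for col in table:
        if col not in REQUIRED + OPTIONAL:
            problems.append(f"unexpected column: {col}")
    # Corresponding lists only make sense if every column has the same length.
    lengths = {col: len(values) for col, values in table.items()}
    if len(set(lengths.values())) > 1:
        problems.append(f"columns differ in length: {lengths}")
    # Coverage is stated as a percentage, so it should lie within 0 to 100.
    for value in table.get("coverage", []):
        if not 0 <= value <= 100:
            problems.append(f"coverage value out of range 0-100: {value}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report everything that is wrong with a table in one pass.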

@mwalzer
Collaborator Author

mwalzer commented Mar 20, 2020

@julianu Where would we put something like relationship: has_order MS:1002108 ! higher score better? Stating it as text in the comments is suboptimal. And how do we designate which column is primary/affected by has_order? The same question applies to units.

@mwalzer
Collaborator Author

mwalzer commented Mar 20, 2020

@julianu Maybe we can use this one as an example during PSI2020 to formalise the cv term value situation.

mwalzer added a commit that referenced this issue Jun 16, 2020
from #85, #86, #87, #88, #89, #90, #91, #101
mwalzer added a commit that referenced this issue Nov 11, 2020
from #85, #86, #87, #88, #89, #90, #91, #101
@mwalzer mwalzer closed this as completed Aug 11, 2021