Add support for reading Sage PSM files; various minor fixes #31
Codecov Report
@@ Coverage Diff @@
## main #31 +/- ##
==========================================
- Coverage 41.52% 41.48% -0.05%
==========================================
Files 18 19 +1
Lines 1416 1468 +52
==========================================
+ Hits 588 609 +21
- Misses 828 859 +31
- Write PSM score as cvParam `search engine specific score` instead of userParam `score`.
- Write retention time to the spectrumIdentificationItem as cvParam `retention time` instead of at the Result level as `scan start time`.
- Update documentation notes
I just opened a fork to do this, should've checked here first! Thanks for implementing this
@lazear, great to hear that you wanted to add Sage support! Feel free to take a look at the implementation and give feedback where needed.
Thanks for the catch - this is what I get for using ChatGPT to write docs :)
Added
Changed
- `psm`: The default values of `PSM.provenance_data`, `PSM.metadata`, and `PSM.rescoring_features` are now `dict()` instead of `None`.
- `io.mzid.MzidReader`: Attempt to parse `retention time` or `scan start time` cvParams from both the SpectrumIdentificationResult and SpectrumIdentificationItem levels. Note that according to the mzIdentML specification document (v1.1.1), neither cvParam is expected to be present at either level.
- `io.mzid.MzidReader`: Prefer the `spectrum title` cvParam over the `spectrumID` attribute for `PSM.spectrum_id`, as these titles always match the peak list files. In this case, `spectrumID` is saved in `metadata["mzid_spectrum_id"]`. Fall back to `spectrumID` if `spectrum title` is absent.
- `io.mzid.MzidWriter`: `PSM.retention_time` is now written as cvParam `retention time` instead of `scan start time`, and at the `SpectrumIdentificationItem` level instead of the `SpectrumIdentificationResult` level, because in psm_utils multiple PSMs for the same spectrum can in theory have different values for `retention_time`.
- `io.mzid.MzidWriter`: Write PSM score as cvParam `search engine specific score` instead of userParam `score`.
- `io.percolator.PercolatorTabWriter`: For PIN-style files, use `SpecId` instead of `PSMId` and write the `PSMScore` and `ChargeN` columns by default.
Fixed
- `peptidoform`: ProForma mass modifications are now correctly parsed within the `rename_modifications` function.
- `io.maxquant.MSMSReader`: Correctly parse empty `Proteins` column to `None`.
- `io.mzid.MzidReader`: Set `PSM.retention_time` to `None` instead of `float('nan')` if missing from the PSM file.
- `io.percolator.PercolatorTabReader`: Correctly parse Percolator peptidoform notation if no leading or trailing amino acids are present (e.g. `.ACDK.` instead of `K.ACDK.E`).
- `io.percolator.PercolatorTabWriter`: `ScanNr` is now correctly written as an integer counting from the first PSM in the file.
- `io.percolator.PercolatorTabWriter`: If no protein information is present, write the peptidoform preceded by `PEP_` to the Proteins column.
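
For readers unfamiliar with the Percolator notation mentioned in the `PercolatorTabReader` fix, here is a minimal standalone sketch of how the flanking amino acids could be stripped in both the `K.ACDK.E` and `.ACDK.` forms. The helper name `strip_flanking_residues` and the regular expression are assumptions made for illustration; this is not the psm_utils implementation.

```python
import re

# Hypothetical pattern (an assumption, not taken from psm_utils):
# an optional flanking residue or "-", a dot, the peptide itself,
# a dot, and again an optional flanking residue or "-".
_FLANKED = re.compile(r"^(?P<prev>[A-Z-]?)\.(?P<peptide>.+)\.(?P<next>[A-Z-]?)$")


def strip_flanking_residues(percolator_peptide: str) -> str:
    """Return the peptide without Percolator-style flanking residues."""
    match = _FLANKED.match(percolator_peptide)
    if match is None:
        # No flanking notation recognized: return the input unchanged.
        return percolator_peptide
    return match.group("peptide")


if __name__ == "__main__":
    for notation in ("K.ACDK.E", ".ACDK.", "-.ACDK.-", "ACDK"):
        print(notation, "->", strip_flanking_residues(notation))
```

Because the flanking positions are optional in the pattern, the empty-flank case from the changelog entry is handled by the same code path as the fully flanked case, and dots inside a mass modification such as `[+57.02]` stay part of the peptide.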