oddities in output #6

kevinkovalchik · 2020-04-22T18:22:01Z

Hello,

Thanks for making this tool! I am finding it useful and am planning to use it in a large-scale reanalysis of published data to avoid difficulties with missing/incomplete information on acquisition parameters.

I noticed something that seems odd about the output and am wondering if you can help clarify it. Here are the details of an analysis of some data from a SCIEX TripleTOF:

Details:
  Charge 0
Spectra in same averagine bin as another: 1768
    ... and also within m/z tolerance: 1267
    ... and also within scan range: 557
    ... and also with sufficient in-common fragments: 189
  Charge 2
Spectra in same averagine bin as another: 19037
    ... and also within m/z tolerance: 13767
    ... and also within scan range: 11484
    ... and also with sufficient in-common fragments: 189
  Charge 3
Spectra in same averagine bin as another: 1912
    ... and also within m/z tolerance: 1528
    ... and also within scan range: 1232
    ... and also with sufficient in-common fragments: 189
  Charge 4
Spectra in same averagine bin as another: 489
    ... and also within m/z tolerance: 414
    ... and also within scan range: 350
    ... and also with sufficient in-common fragments: 189

All these numbers make sense to me except "... and also with sufficient in-common fragments", which is exactly the same for each charge state. Is this expected?

Also, when I run the same file with --charges 2, this is the output:

Details:
  Charge 2
Spectra in same averagine bin as another: 19037
    ... and also within m/z tolerance: 13767
    ... and also within scan range: 11484
    ... and also with sufficient in-common fragments: 170

The numbers match charge 2 from above, except that "sufficient in-common fragments" is now different. Is this expected?

Also, I'm aware that I'm seeing these detail reports because there are not enough paired spectra to do the analysis. But I would still like to understand the output here.

Best,
Kevin


dhmay · 2020-04-22T18:34:58Z

Looks like you found a bug in errorcalc.py, on line 254. As you noted, it appears to be giving you the same number of spectra that it's able to use for every charge. What it's actually reporting is the total number of usable spectra across all charges.

I believe I could fix the bug very easily by changing line 254 to report len(percharge_calculator.paired_fragment_peaks) instead of len(precursor_distances_ppm).

However, it's been a long time since I looked at this code, and I'm a little nervous about screwing it up. So, two options for you:

1. As you noticed, if you restrict to a single charge, you'll get a different number than if you run all charges. That number is, in fact, correct for that charge. So, if you want those numbers, you can run each charge separately and sum them up.
2. You could try implementing the fix I suggested above. If you do, please make a pull request!

I'll try to get around to fixing it, but verifying the fix would take me far longer than making it. If I made the fix on a branch, would you be willing to check out the branch and verify it for me? If so, I'll update this issue when it's done on a branch.
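
For readers following along, the kind of bug described above (a per-charge report line that prints a total accumulated across all charges) can be sketched roughly like this. It is only an illustration and not the code in errorcalc.py: PerChargeCalculator, usable_pairs and report_counts are made-up names, and the counts are borrowed from the outputs quoted in this thread.

```python
from dataclasses import dataclass


@dataclass
class PerChargeCalculator:
    """Hypothetical stand-in for a per-charge bookkeeping object."""
    charge: int
    usable_pairs: int  # spectra with sufficient in-common fragments


def report_counts(calculators, buggy):
    """Return {charge: count printed on the 'sufficient fragments' line}."""
    # Total accumulated across every charge state.
    total_across_charges = sum(c.usable_pairs for c in calculators)
    counts = {}
    for calc in calculators:
        # Buggy variant: every charge reports the shared total.
        # Fixed variant: each charge reports its own count.
        counts[calc.charge] = total_across_charges if buggy else calc.usable_pairs
    return counts


calculators = [
    PerChargeCalculator(charge=2, usable_pairs=170),
    PerChargeCalculator(charge=3, usable_pairs=17),
    PerChargeCalculator(charge=4, usable_pairs=2),
]

print(report_counts(calculators, buggy=True))   # {2: 189, 3: 189, 4: 189}
print(report_counts(calculators, buggy=False))  # {2: 170, 3: 17, 4: 2}
```

In the buggy variant the same total shows up under every charge, which matches the symptom in the first report. Restricting the run to one charge makes the total and the per-charge count coincide, which would explain why a single-charge run prints a correct number.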

kevinkovalchik · 2020-04-22T18:54:48Z

Thanks for the quick response.
Hm... that might be the fix. I changed that line and here is the output:

2020-04-22 14:42:10,086 INFO: Need >= 200 peak pairs to fit mixed distribution. Got only 189.
Details:
  Charge 0
Spectra in same averagine bin as another: 1768
    ... and also within m/z tolerance: 1267
    ... and also within scan range: 557
    ... and also with sufficient in-common fragments: 20
  Charge 2
Spectra in same averagine bin as another: 19037
    ... and also within m/z tolerance: 13767
    ... and also within scan range: 11484
    ... and also with sufficient in-common fragments: 850
  Charge 3
Spectra in same averagine bin as another: 1912
    ... and also within m/z tolerance: 1528
    ... and also within scan range: 1232
    ... and also with sufficient in-common fragments: 85
  Charge 4
Spectra in same averagine bin as another: 489
    ... and also within m/z tolerance: 414
    ... and also within scan range: 350
    ... and also with sufficient in-common fragments: 10

which looks more reasonable. But the largest number reported there is 850, which is not the number of peak pairs (189). Is that because 850 represents the total number of paired spectra, not the number of peak pairs?

dhmay · 2020-04-22T20:24:48Z

Ha, that's what I get for trying to barge back into code I haven't looked at in years. I gave you the wrong variable to plug in there. Try it with len(percharge_calculator.paired_precursor_mzs).

kevinkovalchik · 2020-04-23T11:54:53Z

Haha. At least the code is nice and readable! I'll give that a try.


kevinkovalchik · 2020-04-23T13:07:47Z

Okay, this looks good now! Here is the output this time:

2020-04-23 09:04:26,381 INFO: Need >= 200 peak pairs to fit mixed distribution. Got only 189.
Details:
  Charge 0
Spectra in same averagine bin as another: 1768
    ... and also within m/z tolerance: 1267
    ... and also within scan range: 557
    ... and also with sufficient in-common fragments: 4
  Charge 2
Spectra in same averagine bin as another: 19037
    ... and also within m/z tolerance: 13767
    ... and also within scan range: 11484
    ... and also with sufficient in-common fragments: 170
  Charge 3
Spectra in same averagine bin as another: 1912
    ... and also within m/z tolerance: 1528
    ... and also within scan range: 1232
    ... and also with sufficient in-common fragments: 17
  Charge 4
Spectra in same averagine bin as another: 489
    ... and also within m/z tolerance: 414
    ... and also within scan range: 350
    ... and also with sufficient in-common fragments: 2

The numbers for charge 2, 3 and 4 add up to the reported number of peak pairs (189). Charge 0 doesn't seem to be contributing to the number of peak pairs. Are unknown charges not used?
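
The arithmetic behind that observation can be checked with a few lines of Python. This only restates numbers already quoted in this thread; the variable names are made up for illustration, and the ratio at the end is an observation about the two outputs, not a statement about how the tool works internally.

```python
# "Sufficient in-common fragments" counts copied from the two outputs above.
first_attempt = {0: 20, 2: 850, 3: 85, 4: 10}   # paired_fragment_peaks
second_attempt = {0: 4, 2: 170, 3: 17, 4: 2}    # paired_precursor_mzs
logged_peak_pairs = 189                          # from the INFO log line

# Sum over known charges only (charge 0 means unknown charge here).
known_charge_total = sum(n for z, n in second_attempt.items() if z != 0)
# Sum over everything, including charge 0.
all_total = sum(second_attempt.values())

print(known_charge_total)  # 189
print(all_total)           # 193

# Ratio between the first and second attempt, per charge.
ratios = {z: first_attempt[z] / second_attempt[z] for z in second_attempt}
print(ratios)  # {0: 5.0, 2: 5.0, 3: 5.0, 4: 5.0}
```

The total over charges 2, 3 and 4 matches the logged 189, while including charge 0 gives 193. The first attempt's numbers are exactly five times the second attempt's for every charge.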
