Improvements to Bruker PASEF support #50

chambm · 2018-05-03T17:42:31Z

added combineIonMobilitySpectra support for Bruker TIMS data; it has special behavior compared to Waters and Agilent IMS data, and on PASEF MS2 data it's even more special:
- for non-PASEF data, all mobility scans are merged in each frame, but the mobility values are kept as a 3rd binary data array for that merged spectrum; whenever m/z values would overlap between mobility scans, a small jitter (1e-8) is added to the duplicate values so they can be preserved without violating the m/z uniqueness constraint; thus all mobility information is preserved but in a much more compact representation
- for PASEF data, MS1 frames are merged the same way as above, but MS2 frames are merged only within PASEF precursors (i.e. each precursor gets its own merged scan); MS2 scans that do not belong to a PASEF precursor are dropped entirely
- merged spectrum ids are like 'merged=1234'; for PASEF data, the constituent ids (frame and scan) are preserved in the scanList element (but not for full frame merges, that would be too verbose)
added IonMobility field to SpectrumList_TitleMaker
fixed some bugs in SpectrumList_ScanSummer and gave it the ability to combine peaks (within an individual spectrum) that occur in very close proximity (1e-7); (the intent is that SpectrumList_ScanSummer will be used to prepare PASEF data for identification, so at this point the mobility dimension for fragment peaks is collapsed)
updated threshold filter to work on all arrays with the same size as defaultArrayLength, not just the two primary arrays (usually m/z and intensity)
fixed SeeMS issues with showing 'merged=1234' ids, and showing IMS heatmaps for files created with combineIonMobilitySpectra
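
A minimal sketch of the non-PASEF merge described above, written in Python rather than the project's C++. The function name `merge_mobility_scans` and the input layout are hypothetical, and the real reader may order or jitter points differently. It flattens one frame's mobility scans into three parallel arrays and nudges any repeated m/z by 1e-8 so that every m/z stays unique.

```python
JITTER = 1e-8  # value quoted in the description above


def merge_mobility_scans(scans):
    """scans: list of (mobility, [(mz, intensity), ...]) for one frame.

    Returns parallel (mz, intensity, mobility) lists sorted by m/z.
    """
    points = sorted(
        (mz, intensity, mobility)
        for mobility, peaks in scans
        for mz, intensity in peaks
    )
    mzs, intensities, mobilities = [], [], []
    for mz, intensity, mobility in points:
        # Bump a duplicate just past the previous m/z so uniqueness holds.
        if mzs and mz <= mzs[-1]:
            mz = mzs[-1] + JITTER
        mzs.append(mz)
        intensities.append(intensity)
        mobilities.append(mobility)
    return mzs, intensities, mobilities
```

Every input point survives with its mobility value attached, which is what lets a single merged spectrum stand in for the whole frame.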

added MSData::getArrayByCVID() convenience function for retrieving extra array identified by CVID
added <scan> attributes to msdata::TextWriter
removed runtime-debugging=off restriction for using Bruker API
fixed building outside a git clone or without git available
added a <location-prefix> for --without-compassxtract builds
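
For the threshold filter change listed above, here is a rough Python sketch of filtering every array that is parallel to the peak list. The function `threshold_spectrum` and the dictionary layout are hypothetical and are not the pwiz API; the only idea taken from the description is that any array whose length equals defaultArrayLength is filtered in step with m/z and intensity.

```python
def threshold_spectrum(arrays, default_array_length, min_intensity):
    """arrays: dict mapping an array name to a list of values.

    Keeps points whose intensity is >= min_intensity in every array whose
    length equals default_array_length. Returns (filtered arrays, new length).
    """
    intensity = arrays["intensity"]
    keep = [i for i, value in enumerate(intensity) if value >= min_intensity]
    filtered = {}
    for name, values in arrays.items():
        if len(values) == default_array_length:
            # Parallel array (m/z, intensity, mobility, ...): filter in step.
            filtered[name] = [values[i] for i in keep]
        else:
            # Not parallel to the peak list; leave it untouched.
            filtered[name] = list(values)
    return filtered, len(keep)
```

Without this, a mobility array would keep its original length after thresholding and no longer line up with the surviving peaks.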


chambm · 2018-05-03T18:18:48Z

NB: mz5 currently ignores extra binaryDataArrays, so it will drop the mobility array.

bspratt

seems good - was the 1/k0 output for mgf in a different commit? Also, what's the jamroot change about? And we'd discussed adding a mechanism for Skyline to know whether a file is PASEF data or not, I don't think I see that here.

chambm · 2018-05-03T18:28:07Z

For MGFs: added IonMobility field to SpectrumList_TitleMaker
Jamroot change: fixed building outside a git clone or without git available

I forgot about the PASEF determination from Skyline. I'll hack it in there.

bspratt · 2018-05-03T18:59:17Z

I was naively expecting to see a literal "1/k0=" string in the code but I suppose it makes more sense to do it more generally - though it will need to declare units (or perhaps I'm missing something and it already does). It looks like this will get us into trouble if a 3rd type of ion mobility is ever introduced.

In any event it looks like I'll need to go back into BlibBuild to handle whatever we come up with as a 4th variant for Mascot results parsing.

chambm · 2018-05-03T19:41:51Z

TitleMaker allows maximum flexibility. The user can make the format whatever they want. Including:
--filter "titleMaker 1/K0=<IonMobility>"

I think it's fair to leave it up to the user to put the correct quantity type there although there's nothing difficult about adding extra fields like <IonMobilityQuantity> and <IonMobilityUnits>. Is it necessary? The MSData interface will provide units if you're reading straight from the raw data for MS1 quant.
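
To make the format-string idea concrete, here is a small Python sketch of expanding `<Field>` tokens in a title template. `expand_title` and the field dictionary are hypothetical, and leaving unknown tokens untouched is an assumption rather than documented msconvert behavior.

```python
import re


def expand_title(template, fields):
    """Replace each <Name> token with fields[Name]; leave unknown tokens alone."""
    def substitute(match):
        name = match.group(1)
        return str(fields[name]) if name in fields else match.group(0)

    return re.sub(r"<(\w+)>", substitute, template)


title = expand_title("1/K0=<IonMobility>", {"IonMobility": 0.85})
```

The quantity label ("1/K0=") lives entirely in the user's template here, which is the flexibility and the brittleness being discussed.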

bspratt · 2018-05-03T19:47:30Z

I see - that should suffice. The units do matter in the context of BiblioSpec, but they're easily inferred from suitably formatted data as with your example --filter "titleMaker 1/K0=<IonMobility>".

chambm · 2018-05-03T20:26:03Z

From Brendan email:

> Doesn’t that seem a little brittle to make this depend on users getting a format string right?
>
> For our users, I would prefer to have them just use MSConvertGUI and choose MGF and have that work.
>
> I think you are imagining command line users.

The idea of "prepare this PASEF data for search by Mascot", which includes merging across retention time (using the scanSummer filter), is not something that should just happen by default by selecting MGF. That's too much magic for a generic conversion tool's default behavior. But I could see that kind of capability being baked into MSConvertGUI "presets", which would set the filters and other options in a certain way. But yeah, currently the GUI doesn't have combineIonMobilitySpectra, scanSummer, or a customizable titleMaker...it only uses it for the 'TPP compatibility' mode. So I'll need to add those at the very least for this to work from the GUI.

chambm · 2018-05-03T20:34:06Z

From Brian email:

> I agree that some kind of reasonable default for including available ion mobility in the MGF in the absence of a titleMaker filter would be highly desirable. That's what I was initially imagining, which is why I was confused.

It's certainly do-able but doesn't solve the more significant issue with PASEF of merging the MS2s properly. But I'm also pretty sure we could ask 10 people what they think the default MGF title should be and get 10 different answers. This is why we have standards...

brendanx67 · 2018-05-03T20:52:38Z

Right, and that's why, in the absence of a standard, we make software popular enough to set the standard or follow the lead of other software, like the Bruker tool.


…o 1e-2 to more closely approximate the official Bruker peak merging, but eventually this threshold should be configurable
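
The truncated commit message above mentions a merging threshold that should eventually be configurable. A sketch in Python of what a configurable proximity merge could look like follows; `combine_close_peaks` is hypothetical, and summing intensities with an intensity-weighted m/z is an assumption, not a description of what SpectrumList_ScanSummer or the Bruker software actually does.

```python
def _collapse(cluster):
    """Turn a run of nearby peaks into one (mz, intensity) peak."""
    total = sum(intensity for _, intensity in cluster)
    if total == 0:
        return cluster[0][0], 0.0
    weighted_mz = sum(mz * intensity for mz, intensity in cluster) / total
    return weighted_mz, total


def combine_close_peaks(peaks, tolerance=1e-2):
    """peaks: list of (mz, intensity) sorted by m/z.

    A peak joins the current cluster when it lies within `tolerance` of the
    cluster's last member; each cluster is collapsed into a single peak.
    """
    combined, cluster = [], []
    for mz, intensity in peaks:
        if cluster and mz - cluster[-1][0] > tolerance:
            combined.append(_collapse(cluster))
            cluster = []
        cluster.append((mz, intensity))
    if cluster:
        combined.append(_collapse(cluster))
    return combined
```

Passing `tolerance` as a parameter is the point of the sketch: the same routine covers a very tight threshold such as 1e-7 and a looser one such as 1e-2.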

chambm · 2018-05-08T17:38:35Z

pwiz_tools/SeeMS/Dialogs/HeatmapForm.cs

@@ -327,8 +360,24 @@ protected override void OnShown(EventArgs e)

                double scanTime = (double) dgv[scanTimeColumn.Index, i].Value;
                double ionMobility = (double) dgv[ionMobilityColumn.Index, i].Value;
+                if (ionMobility == 0)


@bspratt At this point, when ionMobility == 0, I check whether the spectrum has a mobility array. If it does, then I add all the mobility bins from that array. For an MS1, all heatmap data comes from a single spectrum with 3 arrays rather than 1000 spectra with 2 arrays. The number of data points is the same but the performance is much better with 1 spectrum due to data locality.

chambm · 2018-05-08T17:41:26Z

pwiz_tools/SeeMS/Dialogs/HeatmapForm.cs

-                    bounds.MaxY = Math.Max(bounds.MaxY, bin.IonMobility);
-
-                    heatmapPoints.Add(new Point3D(mz, bin.IonMobility, intensity));
+                    var mobilityArray = mobilityBDA.data;


@bspratt Here's where I am adding each mz/mobility/intensity point to the heatmap. The heatmap has to know all the mobility bins in both cases, but with the combined representation, it populates them all from a single spectrum (for MS1; for PASEF MS2, there will still be multiple spectra, but far fewer).
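
As a rough illustration of the combined representation (in Python, not the C# of HeatmapForm.cs), the sketch below expands one merged spectrum's three parallel arrays into heatmap points and tracks the mobility bounds. The function name and return shape are hypothetical.

```python
def heatmap_points(mz_array, intensity_array, mobility_array):
    """Expand one combined spectrum (three parallel arrays) into
    (mz, mobility, intensity) points plus the (min, max) mobility bounds."""
    if not (len(mz_array) == len(intensity_array) == len(mobility_array)):
        raise ValueError("arrays must be parallel")
    points = list(zip(mz_array, mobility_array, intensity_array))
    if not points:
        return points, None
    bounds = (min(mobility_array), max(mobility_array))
    return points, bounds
```

The point count is the same as iterating over one spectrum per mobility scan; only the number of spectra that must be fetched changes.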

bspratt

FWIW, in TitleMaker ionmobility might also reasonably include FAIMS compensation voltage

chambm · 2018-05-08T18:06:22Z

Agreed. I've got a request in for a CV change to add parent terms for ion mobility concepts so I can just look for a child of a single term. Also it'll allow putting those terms in mzIdentML files, so search engines can start putting it in structured output rather than title strings. The CV change seems unopposed so should be available soon. Title strings give me the heebie jeebies, but they're certainly a necessary evil right now because it's currently necessary from pepXML/mzIdentML to go back to the raw data (e.g. mzML) to get the ion mobility value.
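
A sketch of the "look for a child of a single term" check, in Python. The `is_child_of` helper is hypothetical and the small hierarchy is illustrative only: it is not copied from the CV, and the parent term discussed above did not exist yet at the time of this comment.

```python
def is_child_of(term, ancestor, parents):
    """True if `term` equals or descends from `ancestor` in an is_a graph.

    parents: dict mapping a term to the list of its direct parents.
    """
    stack, seen = [term], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False


# Illustrative hierarchy, not taken from the actual CV file.
PARENTS = {
    "inverse reduced ion mobility": ["ion mobility attribute"],
    "ion mobility drift time": ["ion mobility attribute"],
    "FAIMS compensation voltage": ["ion mobility attribute"],
    "scan start time": ["scan attribute"],
}
```

With a shared parent, code such as TitleMaker would not need a hard-coded list of mobility terms, so a new mobility type would be picked up by adding one CV entry.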

… avoid excessive file open times

…that the client only wants to enumerate a specific MS level (currently only Bruker TDF uses this, for performance reasons) (from Brian)
* added sqlite3pp::has_table()
* reduced memory footprint of TDF index by using on-the-fly logic in SpectrumList_Bruker::find()
* moved Bruker TDF ion mobility metadata from Instant level to Fast, and merged scan numbers to Full
* added hasPASEF() overload for non-MSVC builds

chambm · 2018-05-15T21:10:04Z

@bspratt Did your ion mobility changes cause the Skyline test failures?

bspratt · 2018-05-15T21:22:15Z

Which changes are you thinking of? I don't think I've made any commits yet. But it's perhaps not surprising that there's some churning. I'll take a look.


chambm · 2018-05-15T21:23:30Z

My last push includes most of the patch you sent me, refactored from preferOmitMsLevel to preferOnlyMsLevel.

bspratt · 2018-05-15T21:30:02Z

Oh, OK. At least I know what I'm looking for, and I'm sure that it's all related, yes.


… ignored the need to TIC collection in MS1 data

bspratt · 2018-05-16T15:00:39Z

I'm seeing some test failures with non-PASEF Bruker data, we're not there yet...


bspratt · 2018-05-16T20:37:21Z

Rebuilding now. I believe it's failing with actual TDF data.

> On Wed, May 16, 2018 at 1:23 PM, Matt Chambers wrote:
> Try the test with what I just pushed. I'm wondering if it'll be fixed by the bug fix for non-combined ion mobility. Otherwise I'm perplexed because it's relatively simple code that works in the new unit test. Which part is it failing on, the native TDF or the mz5?

bspratt · 2018-05-16T21:09:54Z

Same effect, with an actual .d file.

> I'm wondering if it'll be fixed by the bug fix for non-combined ion mobility.

I kind of think so, yes. Can you make that happen?


chambm · 2018-05-16T21:28:51Z

The bug fix I mentioned was in my last push.

> filepath
"D:\\test\\Bruker\\tims\\BSA_50fmol_TIMS_InfusionESI_10prec.d\\analysis.tdf"
> var msd = new MSDataFile(filepath);
> var sl = msd.run.spectrumList;
> sl.size()
464013
> sl.spectrumIdentity(300000)
{pwiz.CLI.msdata.SpectrumIdentity}
    base_: 0x0000000027d87440
    id: "frame=306 scan=796"
    index: 300000
    owner_: {pwiz.CLI.msdata.SpectrumList}
    sourceFilePosition: 18446744073709551615
    spotID: ""
> sl.findAbbreviated("306.796")
300000

How do I run your failing test myself?
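
The session above pairs the id "frame=306 scan=796" with the abbreviated form "306.796". A minimal Python sketch of that mapping follows; both helper names are hypothetical, and the rule (keep the values of the key=value pairs, join with a dot) is inferred from this one example rather than taken from the pwiz source.

```python
def abbreviate_id(native_id, delimiter="."):
    """'frame=306 scan=796' -> '306.796': keep only the values of the
    space-separated key=value pairs and join them with `delimiter`."""
    values = []
    for pair in native_id.split():
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"not a key=value pair: {pair!r}")
        values.append(value)
    return delimiter.join(values)


def find_abbreviated(ids, abbreviated):
    """Linear lookup; returns len(ids) when nothing matches."""
    for index, native_id in enumerate(ids):
        if abbreviate_id(native_id) == abbreviated:
            return index
    return len(ids)
```

A real reader would compute the index from the frame and scan numbers instead of scanning hundreds of thousands of ids; the linear search here only shows the contract.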

bspratt · 2018-05-16T21:34:11Z

You'll need to build SkylineTester, and run it from there. My guess is that it's not about findAbbreviated per se, but about having put off some non-PASEF TDF features while working on the MGF combination. That's work we're going to need done anyway, your time might be better spent there than in figuring out our rather exotic test mechanism.


chambm · 2018-05-16T21:44:03Z

Any ideas for how I figure out what aspect of non-PASEF features is the culprit? Non-PASEF functionality should be unaffected by the changes in this PR. The TDF file for the unit test had an out of date mzML (i.e. TDF now has a dedicated nativeID CV term, and using inverse reduced mobility CV term instead of drift time), but it's mostly the same as the mzML I generated today. It has the same number of scans and format of the native ID.

bspratt · 2018-05-16T22:01:36Z

I'll do a pwiz debug build and see what I can figure out.


bspratt · 2018-05-16T22:46:14Z

I'll keep chasing this down, but I don't think it's a problem in the reader - rather it's with the way Skyline calls the reader using preferOnlyMsLevel.


chambm · 2018-05-16T23:02:28Z

That's the missing detail that I didn't test with. I'm sure it's the prefer parameter that is mucking up the frame scan pair to index math. I will fix that tomorrow.


* fixed TimsData::getSpectrumIndex() when preferOnlyMsLevel is non-zero (refactored vector of frames to a flat_map so we no longer use frame id as an index into a vector)
* updated VendorReaderTestHarness to allow testing vendor files with non-default Reader::Configs
* added PASEF data set to Bruker unit tests, and added Config tests for preferOnlyMsLevel 1/2 and combineIonMobilitySpectra on/off
* fixed CompassDataTest build dependencies
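
A sketch, in Python, of why keying frames by id matters once an MS level filter leaves gaps in the frame ids. `FrameIndex` is hypothetical and is not the TimsData implementation; 1-based scan numbers and the "return size when absent" convention are assumptions.

```python
class FrameIndex:
    """Maps (frame id, scan number) to a flat spectrum index.

    Frames are stored in an id -> record mapping (the Python analogue of a
    flat_map), so gaps left by MS-level filtering are harmless.
    """

    def __init__(self, frames, prefer_only_ms_level=0):
        # frames: iterable of (frame_id, ms_level, num_scans)
        self.first_index = {}
        self.num_scans = {}
        next_index = 0
        for frame_id, ms_level, num_scans in sorted(frames):
            if prefer_only_ms_level and ms_level != prefer_only_ms_level:
                continue  # this frame is not enumerated at all
            self.first_index[frame_id] = next_index
            self.num_scans[frame_id] = num_scans
            next_index += num_scans
        self.size = next_index

    def spectrum_index(self, frame_id, scan):
        """Scans are assumed 1-based; returns self.size when the pair is absent."""
        if frame_id not in self.first_index:
            return self.size
        if not 1 <= scan <= self.num_scans[frame_id]:
            return self.size
        return self.first_index[frame_id] + scan - 1
```

With a plain list indexed by `frame_id - 1`, the MS1-only case would look up the wrong frame or run off the end as soon as an MS2 frame was skipped.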

chambm · 2018-05-17T20:33:38Z

OK, pull the latest changes. The find() issues should be fixed. I also updated VendorReaderTestHarness to allow testing with non-default Reader configs, so this kind of thing will be caught in the core tests in the future.
I've also added a PASEF file to the unit tests by trimming down a large tdf and tdf_bin. I edited the TDF SQLite file to delete frames with id > 6 (and references to them), set the number of scans in each frame to a smaller number (i.e. from 985 to 350 for ms2, and 100 for ms1), then used Process Monitor to see what file offsets were being accessed when enumerating the full file. I was able to simply truncate the file after the last accessed offset (plus 100kb of buffer after the last offset). I'm pretty happy with the result: I may try this technique for other vendors to see if we can expand the test coverage for them.
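
The truncation step could be scripted roughly as below (Python). Both function names are hypothetical, the access list is assumed to have been exported by hand from a tool such as Process Monitor, and measuring from the end of the last read rather than its starting offset is an assumption.

```python
import os

BUFFER_BYTES = 100 * 1024  # the "100kb of buffer" mentioned above


def truncation_size(accesses, buffer_bytes=BUFFER_BYTES):
    """accesses: iterable of (offset, length) reads observed while
    enumerating the file. Returns the size to truncate the file to."""
    last_byte = max(offset + length for offset, length in accesses)
    return last_byte + buffer_bytes


def truncate_file(path, accesses, buffer_bytes=BUFFER_BYTES):
    """Shrink `path` in place; never grows a file that is already smaller."""
    new_size = min(truncation_size(accesses, buffer_bytes), os.path.getsize(path))
    os.truncate(path, new_size)
    return new_size
```

Run this only on a copy of the data, since the truncation is destructive.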

bspratt · 2018-05-17T20:35:06Z

> I've also added a PASEF file to the unit tests by trimming down a large tdf and tdf_bin.

Impressive!


bspratt · 2018-05-17T21:05:46Z

Not there yet - that Skyline perftest still fails, albeit in a slightly different way. But clearly something to do with not indexing the MS2. Investigating....


chambm · 2018-05-17T21:15:03Z

ARGH! Of course. Because the ms1 filtered file only has 1 frame! 😱

bspratt · 2018-05-17T21:18:27Z

Sounds like you have a fix in mind?


chambm · 2018-05-18T15:53:50Z

Hmm, false alarm. I can't reproduce the find() bug anymore. Either on the full HeLa file or the trimmed one (or a newer trimmed one where I included 17 frames instead of 6). Can you give more detail about the error? There are some other issues I need to take care of with the unit test, but those are related to mz5 and out of date test mzMLs.

bspratt · 2018-05-18T16:02:17Z

It comes down to an index mismatch - Skyline asks for MS2 to not be indexed, then asks for a frame.scan pair that the reader hasn't indexed because it's MS2, even though Skyline thinks it's asking for MS1 data. So I have to work out whether the problem is with the frame number sense of the asker or the askee.


chambm · 2018-05-18T16:11:31Z

Has Skyline cached the spectrum index somewhere rather than the abbreviated id? Those indexes certainly won't be the same with/without preferOnlyMsLevel set.

bspratt · 2018-05-18T16:31:00Z

It's something like that. I'm running master alongside our branch; in both cases the frame.scan values are the same, so it's not a Skyline caching issue per se. The trouble seems to be that MsDataFileImpl::GetSpectrumIndex(string id) returns a different value depending on whether MS2 prefiltering is on, and MsDataFileImpl::GetSpectrum(int spectrumIndex) seems not to be dealing with that difference. So probably one or the other is dealing or failing to deal with the gaps in the indexing?


chambm · 2018-05-18T17:08:57Z

> The trouble seems to be that MsDataFileImpl::GetSpectrumIndex(string id)
> returns a different value depending on whether MS2 prefiltering is on

That's correct. SpectrumList.find(SpectrumList.spectrumIdentity(SpectrumList.size()-1).id) will return a different number depending on how preferOnlyMsLevel is set, and it should always be less than SpectrumList.size() (a return value equal to size() would mean the id wasn't found).

> and MsDataFileImpl::GetSpectrum(int spectrumIndex) seems not to be dealing
> with that difference.

This is what I don't understand and haven't been able to reproduce since the find() fix. Is there a SpectrumList wrapper in play here?

> So probably one or the other is dealing or failing to deal with the gaps in the indexing?
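
A toy Python model of the find() contract discussed in this comment: the same id resolves to different indexes in differently filtered lists, and a result equal to size() means "not found". The class and the `resolve` helper are hypothetical stand-ins, not the pwiz or Skyline code.

```python
class SimpleSpectrumList:
    """Toy list of native ids where find() returns size() for a missing id."""

    def __init__(self, ids):
        self._ids = list(ids)
        self._index = {native_id: i for i, native_id in enumerate(self._ids)}

    def size(self):
        return len(self._ids)

    def find(self, native_id):
        return self._index.get(native_id, self.size())


def resolve(spectrum_list, native_id):
    """Look an id up on the list that will serve it; never reuse an index
    obtained from a list opened with different options."""
    index = spectrum_list.find(native_id)
    if index == spectrum_list.size():
        raise KeyError(f"{native_id} is not in this spectrum list")
    return index


all_levels = SimpleSpectrumList(["frame=1 scan=1", "frame=2 scan=1", "frame=3 scan=1"])
ms1_only = SimpleSpectrumList(["frame=1 scan=1", "frame=3 scan=1"])
```

Here "frame=3 scan=1" sits at index 2 in `all_levels` and index 1 in `ms1_only`, and "frame=2 scan=1" is simply absent from the filtered list.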

bspratt · 2018-05-18T17:23:13Z

There is a SpectrumList_IonMobility that wraps the original SpectrumList, though it's the original that's being accessed here.


bspratt · 2018-05-18T17:33:52Z

If you want to have a run at it, in Skyline.sln debug TestRunner with these arguments: status=on offscreen=False loop=1 perftests=on test=MeasuredInverseK0ValuesPerfTest


bspratt · 2018-05-18T17:37:23Z

You know, I'm pretty sure it's Skyline screwing up. It really looks like it's managing to ask for an MS2 scan after promising not to do that. Ball is in my court.



…endorReaderTestHarness for non-mzML conversions
- fixed bogus references to CompassXtract API for BAF and TDF files (which now use Baf2Sql and TIMS SDK, respectively)
- removed redundant scan times and ranges from combined PASEF spectra's scanLists (for all but the first scan)

bspratt · 2018-05-18T18:20:52Z

Rerunning all my tests to be sure (including perf tests), but I'm pretty confident that what I just pushed puts us in position to get this work into trunk. I'm still fooling around with some performance optimizations here, but what we have is better than what we had.



chambm requested review from bspratt and brendanx67 May 3, 2018 17:42

chambm added 2 commits May 3, 2018 12:47

Merge branch 'master' into feature/better-Bruker-PASEF-support

bc6efc5

* added CLI bindings for MSData::getArrayByCVID()

7a99df5

bspratt reviewed May 3, 2018

- fixed some issues and tweaked ScanSummer's peak merging threshold to 1e-2 to more closely approximate the official Bruker peak merging, but eventually this threshold should be configurable

45b3d45
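For readers unfamiliar with proximity-based peak merging, here is a minimal sketch of the general idea. It is not the ScanSummer implementation and not Bruker's algorithm: `Peak` and `mergeClosePeaks` are hypothetical names, and the policy (sum the intensities of a run, keep the m/z of its most intense member) is an assumption chosen for simplicity. The tolerance is a parameter, in line with the commit message's note that the threshold should eventually be configurable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical peak type for illustration only.
struct Peak {
    double mz;
    double intensity;
};

// Merge runs of peaks whose m/z lies within `tolerance` of the first peak
// in the run. Input must be sorted by ascending m/z.
// Assumed policy: sum intensities, keep the m/z of the most intense member.
std::vector<Peak> mergeClosePeaks(const std::vector<Peak>& sorted, double tolerance) {
    std::vector<Peak> merged;
    double runStartMz = 0.0;       // m/z of the first peak in the current run
    double runMaxIntensity = 0.0;  // largest single intensity seen in the run
    for (std::size_t i = 0; i < sorted.size(); ++i) {
        const Peak& p = sorted[i];
        assert(i == 0 || sorted[i - 1].mz <= p.mz);  // precondition: ascending m/z
        if (!merged.empty() && p.mz - runStartMz <= tolerance) {
            Peak& last = merged.back();
            last.intensity += p.intensity;
            if (p.intensity > runMaxIntensity) {
                runMaxIntensity = p.intensity;
                last.mz = p.mz;
            }
        } else {
            merged.push_back(p);  // start a new run
            runStartMz = p.mz;
            runMaxIntensity = p.intensity;
        }
    }
    return merged;
}
```

Calling `mergeClosePeaks(peaks, 1e-2)` on peaks at m/z 100.000, 100.005, and 100.5 collapses the first two into one peak and leaves the third alone. Anchoring each run to its first peak keeps a long chain of closely spaced peaks from merging into one arbitrarily wide peak.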

chambm commented May 8, 2018

bspratt reviewed May 8, 2018

chambm added 5 commits May 8, 2018 17:22

* added SpectrumList_Bruker::hasPASEF()

bd15d7c

* changed conversion of avgScanNumber to oneOverK0 to be on-demand to avoid excessive file open times

ff6426f
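The on-demand pattern this commit describes can be sketched as a value that is converted on first access and cached afterwards. This is an illustration under assumptions, not the code from the commit: `LazyOneOverK0` is a hypothetical class, and the conversion is passed in as a callback because the real avgScanNumber to oneOverK0 mapping comes from the vendor library and is not reproduced here.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical wrapper: defers an expensive conversion until first use.
class LazyOneOverK0 {
public:
    LazyOneOverK0(double avgScanNumber, std::function<double(double)> convert)
        : avgScanNumber_(avgScanNumber), convert_(std::move(convert)) {
        assert(static_cast<bool>(convert_));  // a conversion callback is required
    }

    // Converts on the first call only; later calls return the cached result.
    double value() {
        if (!computed_) {
            cached_ = convert_(avgScanNumber_);
            computed_ = true;
        }
        return cached_;
    }

    bool computed() const { return computed_; }

private:
    double avgScanNumber_;
    std::function<double(double)> convert_;
    double cached_ = 0.0;
    bool computed_ = false;
};
```

Constructing many such objects while opening a file costs nothing per conversion; the callback runs only for the precursors whose mobility value is actually requested, and at most once each.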

Merge branch 'master' into feature/better-Bruker-PASEF-support

576fd2e

Merge branch 'master' into feature/better-Bruker-PASEF-support

69fc69b

back out some experimental changes around ion mobility filtering that ignored the need for TIC collection in MS1 data

124ba73

chambm added 2 commits May 17, 2018 15:23

Merge branch 'master' into feature/better-Bruker-PASEF-support

6dc6e58

bspratt and others added 2 commits May 18, 2018 11:12

Skyline: remove an incautious use of preferOnlyMsLevel with MsDataFileImpl which could result in the ion mobility peak detector asking for MS2 data after promising not to do so.

f9fb44a

bspratt and others added 2 commits May 18, 2018 14:16

Skyline: fix a RedundantUsingDirective warning for quiet code inspection

af206ae

- removed redundant scan times and ranges from combined PASEF spectra's scanLists (for all but the first scan)

4979f57

chambm merged commit ec3ef3e into master May 19, 2018

chambm mentioned this pull request May 21, 2018

Confusing Requirements for BiblioSpec Compilation #41

Closed
