read component from IsoDat .dxf files #161

Open
jhowasmrtl opened this issue Mar 16, 2021 · 6 comments

@jhowasmrtl

In IsoDat 3, it is possible to assign a "component" label to peaks. This would be nice to read into the vendor_data_table.

@jhowasmrtl
Author

Are there plans to add other kinds of data, such as peak areas, to what iso_get_vendor_data_table returns for IsoDat .dxf continuous flow files? Or are they stored in an inconvenient data structure in .dxf files?

@sebkopf
Contributor

sebkopf commented Apr 10, 2021

Hi @jhowasmrtl, good idea for the component labels. Do you have a good example file to test this on? Component labels don't seem to get saved consistently, so we couldn't figure out how to pull them out reliably. With more example files, with and without component assignments, we might be able to discern a more robust pattern for extracting the information.

Peak areas are already in the vendor data table, but isodat natively exports them under names like rIntensityxx, which I find very counterintuitive. The isoprocessor package provides some functionality to convert the peak table labels to shorter, more intuitive names; check out the example below:

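library(isoreader)
library(isoprocessor)
# read the bundled example file, then rename the isodat vendor columns
iso_get_reader_example("continuous_flow_example.dxf") %>%
  iso_read_continuous_flow() %>%
  # convert the vendor data table to a peak table with shorter column names
  iso_set_peak_table_from_isodat_vendor_data_table() %>%
  iso_get_peak_table()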

@jhowasmrtl
Author

jhowasmrtl commented Apr 13, 2021 via email

@jhowasmrtl
Author

I'm not overly familiar with GitHub, so I'm not sure if those files were made available to you via email attachments. I've made a repository with the files here

@sebkopf
Contributor

sebkopf commented Apr 13, 2021

Great, thank you! You can attach zip files to comments in issues on GitHub, but it doesn't seem to work from email.
Uploading Compound-Specific-IRMS-main.zip…

@sebkopf
Contributor

sebkopf commented Dec 16, 2021

implementation-in-progress note:

Found a block in the binary dxf files that seems to contain the compound names list from the method. It is not directly associated with the peak table but should contain all the information necessary to match peaks with names (i.e. retention time, window and minimum signal height). It is unclear how best to do this automatically if there is ambiguity (isodat more or less ignores this problem). An example output looks like this (it starts with the text block "Compound Names"):

# A tibble: 6 × 3
  text              start    end
  <chr>             <int>  <int>
1 "Compound Names" 606611 606639
2 ""               606669 606669
3 ""               606673 606673
4 "test component" 606711 606739
5 "test component" 606743 606771
6 "hexane"         606807 606819

Potential implementation approaches:

  • use this compound names list to automatically match peaks to compound names and try to deal with ambiguous assignments the same way isodat does
  • pull out this compound names list to provide users with a peak map they can easily apply themselves with the peak names mapping function (this would need to implement the same window and peak height restrictions as in isodat). It would mean iso file objects need to include an additional compound list field (i.e. peak_map) and all the infrastructure surrounding interactions with it. This could also be used to assign peak maps internally when they are first assigned to files, though, and thus could be quite useful information to have along.

I'm leaning towards the latter at the moment, to keep the peak table free of metadata and allow complete transparency in the workflow.
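
To make the matching problem concrete, here is a minimal base R sketch of how a compound list with retention time, window and minimum signal height could be applied to a peak table, flagging ambiguous assignments instead of resolving them silently. Everything in it is an assumption for illustration: the column names, the units, the example values and the match_peaks() helper are hypothetical. It is not part of isoreader or isoprocessor, and it does not reproduce how isodat handles ambiguity.

```r
# Hypothetical compound list (peak map), e.g. as it might be pulled from the method.
# Column names and units are assumptions, not the dxf layout.
peak_map <- data.frame(
  compound   = c("hexane", "test component"),
  rt         = c(300, 450),  # expected retention time (assumed seconds)
  window     = c(10, 15),    # allowed deviation from rt (assumed seconds)
  min_height = c(50, 50),    # minimum signal height (assumed mV)
  stringsAsFactors = FALSE
)

# Hypothetical peak table with made-up values
peaks <- data.frame(
  peak_nr = 1:4,
  rt      = c(298, 304, 452, 600),
  height  = c(1200, 900, 800, 20)
)

# Assign a compound to each peak that falls inside exactly one window and
# meets that compound's minimum height; flag anything ambiguous.
match_peaks <- function(peaks, peak_map) {
  peaks$compound <- NA_character_
  peaks$n_candidates <- 0L
  for (i in seq_len(nrow(peaks))) {
    hits <- which(
      abs(peak_map$rt - peaks$rt[i]) <= peak_map$window &
        peaks$height[i] >= peak_map$min_height
    )
    peaks$n_candidates[i] <- length(hits)
    if (length(hits) == 1L) peaks$compound[i] <- peak_map$compound[hits]
  }
  # a peak is ambiguous if it matches several compounds, or if its
  # compound is also claimed by another peak
  counts <- table(peaks$compound)
  peaks$ambiguous <- peaks$n_candidates > 1L |
    (!is.na(peaks$compound) & peaks$compound %in% names(counts)[counts > 1L])
  peaks
}

result <- match_peaks(peaks, peak_map)
print(result)
```

In this made-up example, peaks 1 and 2 both fall inside the hexane window, so both are flagged as ambiguous and left for the user to resolve, which is the kind of transparency the second approach is after.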
