Update GNPSExport.cpp #5594

Merged
merged 4 commits into from
Jan 5, 2022

Conversation

eeko-kon
Contributor

I mainly added a few more details on the proposed workflow (command examples). I think it would be great to add a link to an emptyfile.idXML in the IDMapper Requirements section:
"Even in untargeted metabolomics/proteomics, an empty idXML or mzid (peptide annotation format) file is needed as an input, which can be found here: link/to/emptyfile.idXML."

Description

Please include a summary of the change and which issue is fixed.

Checklist:

  • Make sure that you are listed in the AUTHORS file
  • Add relevant changes and new features to the CHANGELOG file
  • I have commented my code, particularly in hard-to-understand areas
  • New and existing unit tests pass locally with my changes
  • Updated or added python bindings for changed or new classes. (Tick if no updates were necessary.)

How can I get additional information on failed tests during CI:

If your PR is failing you can check out

Note:

  • Once you opened a PR try to minimize the number of pushes to it as every push will trigger CI (automated builds and test) and is rather heavy on our infrastructure (e.g., if several pushes per day are performed).


on the consensusXML file and corresponding mzML files to generate the files needed for FBMN on GNPS.
These two files are:

- The MS/MS spectral data file (.MGF format) which is generated with the GNPSExport util.
- The feature quantification table (.CSV format) which is generated with the TextExport util.
- The feature quantification table (.TXT format) which is generated with the TextExport util.

Contributor

Is .txt the correct output format (TextExporter supports tsv, csv, and txt)?

Contributor Author

When selecting OpenMS as a preprocessing tool in the FBMN workflow (GNPS), the required file format of the Feature Quant table is .txt (TextExporter supports that, yes).

Contributor

Is the .txt output of TextExporter compatible with the FBMN workflow? Or does it expect a txt file that has the format of the .csv file?

Contributor Author

It should be compatible because I have used it repeatedly and it works very well. TextExporter generates a txt file that has the format of a tsv basically! So txt with tab-separated data.

Contributor Author

Quoting the documentation from GNPS:

In brief, after running an OpenMS "metabolomics" pipeline, the GNPSExport TOPP tool can be used on the consensusXML file and corresponding mzML files to generate the files needed for FBMN on GNPS. These two files are:

The feature quantification table (.TXT format) which is generated with the TextExport tool.
The MS2 spectral summary file (.MGF format) which is generated with the GNPSExport tool.

Contributor

OK, great. Is there a way we can test that the txt format works?

Contributor Author

I literally just saw this comment, sorry! I have used those txt files generated by GNPSExport (OpenMS) in FBMN-GNPS and they work perfectly. Examples:
https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=441a29dc057747f094330148d40493e0
https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=af126b5cd46840b79acf4b58cece09ec#
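
Besides the GNPS jobs linked above, a quick local sanity check on the exported table is possible. The following is only a minimal sketch: it assumes the TextExporter output uses tabs as the field separator (as described earlier in this thread) and checks nothing about which columns FBMN actually requires. The function name `is_tab_separated` and the sample rows are hypothetical and exist only for illustration.

```python
import csv
import io


def is_tab_separated(text, min_columns=2):
    """Return True if every non-empty line splits into at least
    `min_columns` fields when parsed with a tab delimiter."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    rows = [row for row in reader if row]  # skip blank lines
    return bool(rows) and all(len(row) >= min_columns for row in rows)


# Hypothetical two-line table, NOT real TextExporter output.
sample = "#HEADER\tcol_a\tcol_b\nROW\t1.0\t2.0\n"
print(is_tab_separated(sample))
```

To check a real export, the file content could be read first, for example with `pathlib.Path("FeatureQuantificationTable.txt").read_text()`, and passed to the same function.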

src/topp/GNPSExport.cpp
GNPSExport -ini iniFile-GNPSExport.ini -in_cm filtered.consensusXML -in_mzml inputFile0.mzML inputFile1.mzML -out GNPSExport_output.mgf
9. Run the @ref TOPP_TextExporter on the "filtered consensusXML file" to export a .TXT file.
TextExporter -in FileFilter.consensusXML -out FeatureQuantificationTable.txt
10. Upload your files to GNPS and run the Feature-Based Molecular Networking workflow. Instructions are here:

Contributor

Is there a way to automate that? E.g., we could in principle also upload the data from the tool and download the results.

Contributor Author

That's a great idea. Should I ask Ming during GNPS office hours (tomorrow 6 pm)?
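
Independently of the upload question, the two local export steps can already be chained in a script. The sketch below only rebuilds the GNPSExport and TextExporter commands quoted above, using the same flags and file names; the helper names (`gnps_export_cmd`, `text_exporter_cmd`, `run_if_available`) are hypothetical, and nothing here talks to GNPS.

```python
import shlex
import shutil
import subprocess


def gnps_export_cmd(ini, consensus, mzmls, out_mgf):
    """Build the GNPSExport command line as an argument list."""
    return ["GNPSExport", "-ini", ini, "-in_cm", consensus,
            "-in_mzml", *mzmls, "-out", out_mgf]


def text_exporter_cmd(consensus, out_txt):
    """Build the TextExporter command line as an argument list."""
    return ["TextExporter", "-in", consensus, "-out", out_txt]


def run_if_available(cmd):
    """Run `cmd` only if its executable is on PATH.

    Returns True if the tool was found and ran successfully,
    False if the executable is missing.
    """
    if shutil.which(cmd[0]) is None:
        return False
    subprocess.run(cmd, check=True)
    return True


commands = [
    gnps_export_cmd("iniFile-GNPSExport.ini", "filtered.consensusXML",
                    ["inputFile0.mzML", "inputFile1.mzML"],
                    "GNPSExport_output.mgf"),
    text_exporter_cmd("FileFilter.consensusXML",
                      "FeatureQuantificationTable.txt"),
]

# Only print the commands here; call run_if_available(cmd) to execute them.
for cmd in commands:
    print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.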

Requirements:
- The IDMapper has to be run on the featureXML files, in order to associate MS2 scan(s) (peptide annotation) with each
features. These peptide annotations are used by the GNPSExport.
feature, using a peptide annotation file (idXML). Even in untargeted metabolomics/proteomics, an empty idXML or mzid (peptide annotation format) file is needed as an input.

Contributor

Having the empty idXML seems awkward. Could we make this optional?

Contributor

It is also confusing that it talks about protein/peptide annotations.

Small corrections
@timosachsenberg
Contributor

rebuild jenkins

@timosachsenberg
Contributor

@eeko-kon can you check if this is up-to-date with what you are currently doing? I would give it another quick review so we can merge that.

@timosachsenberg timosachsenberg merged commit 3c38f48 into OpenMS:develop Jan 5, 2022
@eeko-kon eeko-kon deleted the patch-1 branch January 5, 2022 20:34