Workflow for isobaric analysis (untested) #166

Merged: 16 commits, Sep 11, 2020

oliveralka
Contributor

Reduced rt setting in IDMapper
TMT6-plex (K,N-term) in MSGFPlus
The correction Matrix for the TMT experiment has to be set by the user!
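
For readers unfamiliar with the correction matrix, here is a minimal sketch of the idea. It assumes a simple linear model in which the observed reporter intensities are the true intensities mixed by a matrix of isotopic impurities. The 3-channel matrix and the function names are made up for illustration; they are not manufacturer values, and the solver that IsobaricAnalyzer actually uses may differ.

```python
def solve_linear(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("correction matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


# Hypothetical mixing matrix (assumption): entry [i][j] is the fraction of
# channel j's true signal that is observed in channel i.
MIXING = [
    [0.95, 0.03, 0.00],
    [0.05, 0.94, 0.04],
    [0.00, 0.03, 0.96],
]


def correct_isotopes(observed, mixing=MIXING):
    """Recover true channel intensities from observed ones."""
    corrected = solve_linear(mixing, observed)
    # Clamp small negative values caused by noise (a simple choice for this sketch).
    return [max(0.0, value) for value in corrected]
```

Because the matrix depends on the reagent lot, the values have to come from the manufacturer's product sheet, which is why the workflow cannot ship a universal default.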

@jpfeuffer
Contributor

Very nice. I didn't have a look yet (will today).
But:

  • Did you make the input files accept relative paths inside the workspace?
  • Did you add a description of the workflow in the workflow settings for easy upload to the KNIME Hub?

@jpfeuffer
Contributor

I will write a TODO list for PRs here.

@jpfeuffer
Contributor

Ah, this was just a change in one of the WFs, but still, maybe you could add the above-mentioned points. We should do it for every WF here soon anyway.

@oliveralka
Contributor Author

oliveralka commented Aug 2, 2019

> Did you make the input files accept relative paths inside the workspace?

Should we ship the input data with the workflow?

> Did you add a description of the workflow in the workflow settings for easy upload to the KNIME Hub?

Can you please provide a reference or an example of how to do that?

@jpfeuffer
Contributor

No, otherwise it gets too big. Data goes here (assume the workspace root is this folder, i.e. specify paths relative to that root):
https://abibuilder.informatik.uni-tuebingen.de/archive/openms/Tutorials/Data/

Instructions for the description are here:
https://docs.google.com/document/d/1uwVCz-4LH7VUzhcsE5FVyrBbRmzDnGQvdFF2ZsA8BYU/edit#heading=h.3hm698qmdoi3
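
To illustrate what "paths relative to the workspace root" means, here is a minimal sketch. It assumes a local copy of the data folder acts as the workspace root; the function name, the local root, and the file name in the usage note are hypothetical.

```python
from pathlib import PurePosixPath


def to_workspace_relative(path, workspace_root):
    """Return `path` relative to `workspace_root`; raise if it lies outside."""
    candidate = PurePosixPath(path)
    root = PurePosixPath(workspace_root)
    try:
        return candidate.relative_to(root).as_posix()
    except ValueError:
        raise ValueError(f"{path} is outside the workspace root {workspace_root}")
```

For example, with a hypothetical root `/home/user/Data`, the file `/home/user/Data/isobaric_MSV000084264/example.mzML` would be referenced as `isobaric_MSV000084264/example.mzML`, so the workflow keeps working on any machine that mirrors the same folder layout.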

@jpfeuffer
Contributor

The workflows have to run from the command line. You can check by running the shell script executeBatch with arguments:

  1. knime executable
  2. workspace root
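
The contents of executeBatch are not shown in this thread. As a rough sketch of what a command-line run of a single workflow can look like, the following builds a call to KNIME's batch application. The function names and the example paths are hypothetical, and the option set is an assumption about a typical batch invocation, not a copy of the script.

```python
import subprocess


def build_batch_command(knime_executable, workflow_dir):
    """Build the argument list for a headless KNIME run of one workflow."""
    return [
        str(knime_executable),
        "-nosplash",    # no splash screen
        "-consoleLog",  # write log messages to the console
        "-application", "org.knime.product.KNIME_BATCH_APPLICATION",
        f"-workflowDir={workflow_dir}",
        "-reset",       # reset all nodes before executing
        "-nosave",      # do not save the workflow after execution
    ]


def run_workflow(knime_executable, workflow_dir):
    """Run the workflow and return KNIME's exit code (0 means success)."""
    command = build_batch_command(knime_executable, workflow_dir)
    return subprocess.run(command, check=False).returncode
```

A non-zero exit code from `run_workflow` would indicate that the workflow failed to load or execute, which is the kind of check a batch test over all workflows needs.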

@oliveralka
Contributor Author

There are some errors in the workflow that need to be fixed first. I will update the PR when it is ready.

@oliveralka
Contributor Author

We have to check whether the IDMapper is switched to "precursor"; see OpenMS/OpenMS#4216.

@pavel-shliaha

pavel-shliaha commented Oct 24, 2019

will the pipeline work for MS3 SPS?

@oliveralka
Contributor Author

oliveralka commented Oct 24, 2019

The IsobaricAnalyzer should use MS3 automatically if possible (@cbielow).
Please use the current nightly build of OpenMS in KNIME to have full support.

@oliveralka
Contributor Author

@timosachsenberg @jpfeuffer
Added a rough draft for integration into the tutorial. Updated the workflow using epifany; it is runnable, but I still have to check the results.
It would be great if you could take a look and make some comments and suggestions, also in terms of explanation of the experimental design for TMT.

Julianus, could you have a peek at the parameters used for epifany?

@pavel-shliaha

@oliveralka could you please point to where the tutorial for MS3 TMT is and the actual workflow that you recommend?

@pavel-shliaha

@oliveralka. I have found the workflow, but I cannot find the example files (mzMLs and also the MSstatsTMT table). Could you perhaps give a direct link to them?

@oliveralka
Contributor Author

oliveralka commented Mar 6, 2020

@pavel-shliaha

@oliveralka. I downloaded the newest KNIME and OpenMS, the dataset and the database and set file locations on my computer. MSGFPlusAdapter and IsobaricAnalyzer nodes failed. Do they work in your hands?

@oliveralka
Contributor Author

oliveralka commented Mar 8, 2020

@pavel-shliaha
Everything works in my case:
KNIME 4.1.1
OpenMS 2.5.0.20200201806

What are the error messages? Can you run the tools in debug mode?

@pavel-shliaha

@oliveralka. I tried to execute the workflow on two separate computers. On both of them:

KNIME is 4.1.2
OpenMS is 2.5.0.202002241222

How can I run the software in debug mode (googled the question but could not find an answer)? Also if you are interested I can provide access to my computer so you can have a look.

@jpfeuffer
Contributor

What does the Console log in the lower right of KNIME say?

@pavel-shliaha

@jpfeuffer it says I might be missing some of the requirements (trying to install them now). I apologise for this, but I used the website https://www.openms.de/download/knime-plugin/ and it does not mention the prerequisites. I will now install proteowizard as well.

@pavel-shliaha

pavel-shliaha commented Mar 10, 2020

When I load the workflow I get the following message:

[screenshot: error]

I installed all the prerequisites from the suggested link but it did not help

@jpfeuffer
Contributor

that can usually be ignored.

@pavel-shliaha

I installed the prerequisites and the prerequisites for visual studio under this link

https://support.microsoft.com/en-us/help/2977003/the-latest-supported-visual-c-downloads

but it did not help

@jpfeuffer
Contributor

As long as it just shows "... FileConverter requirements" this can also be ignored. You just might not be able to use Raw file conversion inside KNIME. We just check registry keys here. Sometimes it does not work correctly.

@pavel-shliaha

it still fails to execute the workflow

@jpfeuffer
Contributor

I thought so. Without the requirements no OpenMS node would work. That is why you need to check the Console and the Standard Output/Error of the nodes for info now.

@jpfeuffer
Contributor

Just a random guess: did you configure the Input database in the ID Metanode?

@pavel-shliaha

pavel-shliaha commented Mar 10, 2020

I have installed:

  1. KNIME
  2. OpenMS-2.5.0-Win64.exe from https://www.openms.de/download/openms-binaries
  3. OpenMS extension through KNIME
  4. OpenMS-2.5-prerequisites-installer.exe

the workflow still does not run. Should I install anything else? This is the error message.

(or a complete workflow) generated with an older version of the tool.
If you do not reconfigure the node (marked with an exclamation mark),
the current defaults will be loaded instead.
- Entry for parameter MzTabExporter.1.opt_columns not found in settings.xml.

ERROR LoadWorkflowRunnable Errors during load: Status: Error: Identification_quantification_isobaric_MSstatsTMT 0 loaded with errors
ERROR LoadWorkflowRunnable Status: Error: Identification_quantification_isobaric_MSstatsTMT 0
ERROR LoadWorkflowRunnable Status: Error: MzTabExporter 0:227
ERROR LoadWorkflowRunnable Status: Error: Loading model settings failed:
ERROR LoadWorkflowRunnable GenericKNIMENodes:
ERROR LoadWorkflowRunnable Maybe you are loading node settings (or a complete workflow) generated with an older version of the tool.
ERROR LoadWorkflowRunnable If you do not reconfigure the node (marked with an exclamation mark),
ERROR LoadWorkflowRunnable the current defaults will be loaded instead.
ERROR LoadWorkflowRunnable - Entry for parameter MzTabExporter.1.opt_columns not found in settings.xml.
WARN IsobaricAnalyzer 0:235 Can't continue loop as the workflow was restored with the loop being partially executed. Reset loop start and execute entire loop again.
WARN MSGFPlusAdapter 0:237:3 Can't continue loop as the workflow was restored with the loop being partially executed. Reset loop start and execute entire loop again.
WARN IsobaricAnalyzer 0:235 Can't continue loop as the workflow was restored with the loop being partially executed. Reset loop start and execute entire loop again.

@pavel-shliaha

I have reimported the workflow and it seems to be working

@jpfeuffer
Contributor

@oliveralka did you check: #166 (comment)

@oliveralka
Contributor Author

oliveralka commented Mar 11, 2020

@jpfeuffer Yes, checked some time ago - is set to precursor.

Note: The current workflow does not have the protein level anymore, since we have to fix the inference first and input it to the MSstatsConverter.

Contributor

@jpfeuffer jpfeuffer left a comment

Looks good. Minor changes.
Then we can go over the workflow together via Skype.

Handout/isobaric.tex (outdated)
The workflow has three input nodes: the first for the experimental design, which allows an MSstatsTMT-compatible export (.tsv); the second for the .mzML files with the centroided spectra from the isobaric labeling experiment; and the last for the .fasta database used for identification. The quantification (A) is performed using the \KNIMENODE{IsobaricAnalyzer}. The tool can extract and normalize quantitative information from TMT and iTRAQ data. The values can be assessed from centroided MS2 or MS3 spectra, and isotope correction is performed based on the specified correction matrix (as provided by the manufacturer). The identification (C) is performed as known from the previous chapters, using a database search and a target-decoy database.

The workflow is performed on the peptide level (B, D, F, H), where the posterior error probability (PEP) estimation and FDR filtering are performed on PSM level for each file individually (B). Afterwards, the identification (PSM) and quantitative information is combined using the \KNIMENODE{IDMapper}. After all available files have been processed, the intermediate results are aggregated (D) and can be exported via \KNIMENODE{MzTabExporter} (F) or further processed to obtain an MSstatsTMT-compatible
version. Here, the R package MSstatsTMT can be used for further processing. \\
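
As background for the PSM-level FDR filtering mentioned in the excerpt above, here is a minimal sketch of a target-decoy q-value calculation. It assumes each PSM carries a score where higher is better and a flag marking decoy hits; the function names are hypothetical, and this is a generic illustration, not the code used by the OpenMS nodes in the workflow.

```python
def qvalues(psms):
    """psms: list of (score, is_decoy) pairs, higher score = better match.

    Returns (score, is_decoy, q_value) triples sorted by descending score.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR when accepting everything down to this score.
        fdrs.append(decoys / max(targets, 1))
    # The q-value is the smallest FDR at this or any more permissive threshold.
    q_reversed = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q_reversed.append(running_min)
    q = list(reversed(q_reversed))
    return [(score, is_decoy, qv) for (score, is_decoy), qv in zip(ranked, q)]


def filter_psms(psms, max_q=0.01):
    """Keep target PSMs whose q-value does not exceed max_q."""
    return [(score, qv) for score, is_decoy, qv in qvalues(psms)
            if not is_decoy and qv <= max_q]
```

Running this per file, as the excerpt describes for step (B), means each file gets its own score threshold.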
Contributor

Add a TODO that protein inference will be added in a later version of the tutorial.

@pavel-shliaha

pavel-shliaha commented Jun 22, 2020

> Looks good. Minor changes.
> Then we can go over the workflow together via Skype.

Dear Jennifer and Oliver, could you please add me to the Skype discussion as well? I use MS3 SPS a lot, and I think I could provide some valuable input as well as clarify some questions for myself. I wanted to ask whether certain functionalities are implemented in OpenMS. In particular:

  1. Could you clarify why IDConflictResolver is used with isobaric mass tagging?
  2. I understand that the OpenMS workflow suggests using the MSstats node for protein inference, but I actually wanted to export the IDs with quant values and then look through the table myself. Is there a node that will convert the IDConflictResolver output (consensusXML) to a CSV file where the quantitation values are one line per PSM (wide format) or one line per channel (long format)? I tried TextExporter, but it does not seem to generate this output.
  3. Is there a node to estimate precursor contamination?
  4. Is there a node to estimate which of the fragments selected for MS3 originate from MS2 fragments that have been identified as true fragments of the identified peptide?

@jpfeuffer
Contributor

Hi Pavel,

Sorry, the meeting is right now. But you will be in the data clinic tomorrow, right?

  1. It only makes sure that the top hit for the corresponding MS2 spectrum is retained (and not more). You could probably achieve the same with an IDFilter after the search engine.
  2. The MSstats node just rearranges the tables to be compatible with MSstats. It does not perform any inference.
    TextExporter should export one line per PSM in the "FEATURE" section. The intensities should be in the columns intensity_0 - intensity_9 (or similar). An alternative would be the MzTabExporter (but I did not double-check the completeness of the output for TMT yet).
  3. Yes, if you export meta values in the TextExporter, there should be a column with a precursor_purity score.
  4. No, but that is a good idea.
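
Regarding point 2, here is a minimal sketch of turning a wide table (one line per PSM) into the long format (one line per channel). It assumes the intensities sit in columns named `intensity_0`, `intensity_1`, and so on, as suggested above; the sample table, the other column names, and the function name are made up, and a real TextExporter file has additional sections and columns that are not modeled here.

```python
import csv
import io


def wide_to_long(rows, prefix="intensity_"):
    """Turn one row per PSM into one row per PSM and channel."""
    long_rows = []
    for row in rows:
        # Everything that is not an intensity column identifies the PSM.
        ids = {key: value for key, value in row.items() if not key.startswith(prefix)}
        for key, value in row.items():
            if key.startswith(prefix):
                long_rows.append({
                    **ids,
                    "channel": int(key[len(prefix):]),
                    "intensity": float(value),
                })
    return long_rows


# Made-up wide table with three channels (assumption, for illustration only).
WIDE = (
    "sequence\tcharge\tintensity_0\tintensity_1\tintensity_2\n"
    "PEPTIDEK\t2\t100.0\t250.5\t80.0\n"
    "SAMPLER\t3\t10.0\t0.0\t42.0\n"
)

rows = list(csv.DictReader(io.StringIO(WIDE), delimiter="\t"))
long_rows = wide_to_long(rows)
```

Each PSM in the made-up table yields three long-format rows, one per channel, which is the layout most statistics packages expect.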

@jpfeuffer
Contributor

@pavel-shliaha Regarding 4): How do you get the information about the isolated ions in the MS2? I have an SPS mzML here, and all the MS3 spectra report only a single selected precursor ion:

              <selectedIonList count="1">
                <selectedIon>
                  ...

Not sure if this is a bug in the mzML converters (e.g. proteowizard) or if this information is just not available in the raw file.
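
One way to check this across a whole file is to count the `selectedIon` entries per MS3 spectrum. The sketch below assumes standard mzML element names and the `MS:1000511` ("ms level") cvParam; the embedded two-spectrum document and the function name are made up for illustration, and a real file would be better read with a streaming parser.

```python
import xml.etree.ElementTree as ET

MZML_NS = "http://psi.hupo.org/ms/mzml"
NS = {"mz": MZML_NS}
MS_LEVEL_ACCESSION = "MS:1000511"  # PSI-MS term "ms level"


def selected_ion_counts(mzml_text, ms_level=3):
    """Map spectrum id -> number of selectedIon elements for the given MS level."""
    root = ET.fromstring(mzml_text.strip())
    counts = {}
    for spectrum in root.iter(f"{{{MZML_NS}}}spectrum"):
        level = None
        for cv in spectrum.findall("mz:cvParam", NS):
            if cv.get("accession") == MS_LEVEL_ACCESSION:
                level = int(cv.get("value"))
        if level != ms_level:
            continue
        # Counts selected ions across all precursor elements of the spectrum.
        ions = spectrum.findall(
            "mz:precursorList/mz:precursor/mz:selectedIonList/mz:selectedIon", NS
        )
        counts[spectrum.get("id")] = len(ions)
    return counts


# Made-up minimal document: one MS2 and one MS3 spectrum.
EXAMPLE = """
<mzML xmlns="http://psi.hupo.org/ms/mzml">
  <run id="example">
    <spectrumList count="2">
      <spectrum id="scan=1" index="0">
        <cvParam accession="MS:1000511" name="ms level" value="2"/>
        <precursorList count="1"><precursor>
          <selectedIonList count="1"><selectedIon/></selectedIonList>
        </precursor></precursorList>
      </spectrum>
      <spectrum id="scan=2" index="1">
        <cvParam accession="MS:1000511" name="ms level" value="3"/>
        <precursorList count="1"><precursor>
          <selectedIonList count="1"><selectedIon/></selectedIonList>
        </precursor></precursorList>
      </spectrum>
    </spectrumList>
  </run>
</mzML>
"""

counts = selected_ion_counts(EXAMPLE)
```

If every MS3 spectrum in a real SPS file reports a count of 1, the notch information is either not written by the converter or not present in the raw file, which is the open question above.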

@pavel-shliaha

I am looking at the MSGFPlusAdapter and it does not have the TMTPro modification implemented. Could you please have a look?

@jpfeuffer
Contributor

Hi, which Unimod accession is this?

@pavel-shliaha

accession number is 2016

@oliveralka
Contributor Author

I added the data here: https://abibuilder.informatik.uni-tuebingen.de/archive/openms/Tutorials/Data/isobaric_MSV000084264/

@jpfeuffer The path to the data still has to be added to the document, but please let me know what you think of the current version.

handout_isobaric_20200729.pdf

@jpfeuffer jpfeuffer merged commit c636f46 into master Sep 11, 2020
@jpfeuffer jpfeuffer deleted the oliveralka-patch-2 branch September 30, 2020 22:22