Converter from DTASelect to pepXML
This converter produces pepXML files from DTASelect output files, so that spectrum libraries can be created with SpectraST from the results of a ProLuCID + DTASelect workflow.
How to obtain it
Go here and download the jar file. It contains all required dependencies, so no additional libraries are needed.
How to use it
- Download the jar file and save it to your computer.
- In the command line, move to the folder where you downloaded the program and type:
java -jar dtaselect2pepxml.jar pepxml_file
pepxml_file is the input path: either a single DTASelect output file or a folder. If it is a folder, all files with the txt extension will be considered as input files for the conversion. (MANDATORY)
java -jar dtaselect2pepxml.jar c:\users\salva\desktop\data\DTASelect-filter.txt
This will generate a pepXML file named DTASelect-filter.pep.xml. By default, the pepXML file will reference the Trypsin/P enzyme and an mzXML file. To change these two references, see the optional parameters below.
java -jar dtaselect2pepxml.jar c:\users\salva\desktop\data
This will generate a pepXML file for each file in the folder with a .txt extension.
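To preview which files a run would pick up, the selection rule above can be sketched in Python. This is only an illustration, not part of the converter: the function name `expected_outputs` is hypothetical, and the `X.txt` to `X.pep.xml` naming is an assumption extrapolated from the single-file example.

```python
from pathlib import Path


def expected_outputs(input_path):
    """Map each input file name to the pepXML file name we expect for it."""
    path = Path(input_path)
    if path.is_dir():
        # Folder mode: only files with the txt extension count as inputs.
        inputs = sorted(
            p for p in path.iterdir() if p.is_file() and p.suffix == ".txt"
        )
    else:
        # Single-file mode: the given file is the only input.
        inputs = [path]
    # Assumption: X.txt becomes X.pep.xml, as in the single-file example.
    return {p.name: p.with_suffix(".pep.xml").name for p in inputs}
```

Calling `expected_outputs` on the data folder before running the converter gives a quick check that no unrelated .txt files will be converted by accident.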
There are also two optional parameters:
raw_file_extension: Specifies the raw file type that will eventually be used when creating the library with SpectraST. (OPTIONAL)
enzyme_name: Specifies the enzyme in the pepXML file. The following values are allowed: 'Lys-N', 'Lys-C/P', 'Arg-C', 'PepsinA', 'Trypsin_Mod', 'dualArgC_Cathep', 'NoCleavage', 'TrypChymo', 'Chymotrypsin', 'Trypsin', 'Arg-C/P', 'Trypsin/P', 'dualArgC_Cathep/P', 'Asp-N', 'V8-DE', 'Lys-C', 'V8-E', 'None'. (OPTIONAL)
java -jar dtaselect2pepxml.jar c:\users\salva\desktop\data\DTASelect-filter.txt mzML Chymotrypsin
This will generate a pepXML file named DTASelect-filter.pep.xml, annotated with Chymotrypsin as the enzyme and with a reference to an mzML file.
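When the converter is called from a script, it can help to validate the enzyme name before launching Java. The following is a minimal sketch: `build_converter_command` is a hypothetical helper, and it assumes the parameters are purely positional (input, then raw_file_extension, then enzyme_name), as the example above suggests. Whether the converter accepts an enzyme without a raw file extension is not stated, so the sketch rejects that case.

```python
# Enzyme names listed as allowed values for enzyme_name.
ALLOWED_ENZYMES = {
    "Lys-N", "Lys-C/P", "Arg-C", "PepsinA", "Trypsin_Mod", "dualArgC_Cathep",
    "NoCleavage", "TrypChymo", "Chymotrypsin", "Trypsin", "Arg-C/P",
    "Trypsin/P", "dualArgC_Cathep/P", "Asp-N", "V8-DE", "Lys-C", "V8-E", "None",
}


def build_converter_command(input_path, raw_file_extension=None,
                            enzyme_name=None, jar="dtaselect2pepxml.jar"):
    """Return the argument list for one converter run (nothing is executed)."""
    if enzyme_name is not None:
        if enzyme_name not in ALLOWED_ENZYMES:
            raise ValueError(f"enzyme not in the allowed list: {enzyme_name}")
        # Assumption: arguments are positional, so the enzyme can only be
        # given after a raw file extension.
        if raw_file_extension is None:
            raise ValueError("enzyme_name requires raw_file_extension")
    command = ["java", "-jar", jar, str(input_path)]
    if raw_file_extension is not None:
        command.append(raw_file_extension)
    if enzyme_name is not None:
        command.append(enzyme_name)
    return command
```

The returned list can be passed to `subprocess.run(command, check=True)` from the folder that contains the jar file.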
- Download and install SpectraST, which is included in the TPP (Trans-Proteomic Pipeline).
- Go to the bin folder inside the TPP installation folder.
- Follow these steps:
3.A Create a library from all pep.xml files converted from the DTASelect files:
spectrast -cNraw -cP0 file.pep.xml
This will create a spectral library file (raw.splib). The files raw.pepidx, raw.spidx and raw.sptxt will also be created. The raw.sptxt file is compatible with Skyline and can be viewed and edited there.
Note that instead of file.pep.xml you could use wildcards in order to create a library from multiple pep.xml files (i.e.
-cNraw indicates that the output library files will be named as raw.splib.
-cP0 has to be added because the pepXML file coming from DTASelect doesn't have p-values, so we need to include all matches with pvalue>=0 (= option
IMPORTANT Referenced RAW files need to be present in the same folder. As an example:
Having a PSM in the DTASelect file like:
041117_DDA_Rep1.53788.53788.5 6.7315 0.5368 100.0 4584.0947 4584.0986 -0.9 1.068073E7 1 10.015502 4.98 36.7 1 -.MRECISIHVGQAGVQIGNACWELYCLEHGIQPDGQMPSDK.T
And assuming that the raw_file_extension value was "mzXML", a file named 041117_DDA_Rep1.mzXML needs to be present in the same folder so that SpectraST can read the spectrum that matched that peptide.
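The naming rule in this example can be checked before running SpectraST. The sketch below is not part of the converter: both function names are hypothetical, and it assumes that the first column of a PSM line always has the form run.scan.scan.charge, as in the line shown above.

```python
from pathlib import Path


def raw_file_for_psm(psm_line, raw_file_extension="mzXML"):
    """Return the raw file name that a DTASelect PSM line refers to."""
    psm_id = psm_line.split()[0]
    # Assumption: the identifier is <run>.<scan>.<scan>.<charge>, so the
    # run name is everything before the last three dot-separated fields.
    run_name = psm_id.rsplit(".", 3)[0]
    return f"{run_name}.{raw_file_extension}"


def missing_raw_files(psm_lines, folder, raw_file_extension="mzXML"):
    """List the referenced raw files that are not present in the folder."""
    needed = {
        raw_file_for_psm(line, raw_file_extension)
        for line in psm_lines
        if line.strip()
    }
    return sorted(n for n in needed if not (Path(folder) / n).is_file())
```

An empty list from `missing_raw_files` means every run referenced by the given PSM lines has its raw file in place.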
3.B Create a consensus library:
spectrast -cNcons -cAC raw.splib
This will create another set of files for a library named as cons.splib.
-cAC indicates that a consensus library should be created, meaning that a single consensus spectrum will be created for each peptide sequence from all spectra matching that peptide.
3.C Apply a quality control filter to the consensus splib library:
spectrast -cNconsQ -cAQ cons.splib
This will create another set of files for a library named as consQ.splib.
-cAQ applies the quality control filters, which will discard some of the spectra in the library.
3.D Append decoy spectra to the library:
spectrast -cNconsQdecoy -cAD -cc -cy1 consQ.splib
This will create another set of files for a library named as consQdecoy.splib.
-cAD generates decoy spectra for the library.
-cc concatenates the generated decoy spectra to the library.
-cy1 is the proportion of decoys over forward entries.
-cy2 would generate twice as many decoy entries as forward entries.
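Steps 3.A to 3.D can be chained from a script. The following Python sketch only assembles the four commands exactly as written above and runs them in order. The function names are hypothetical, and it assumes that `spectrast` is on the PATH (or that the script runs from the TPP bin folder) and that several pep.xml files can be passed as separate arguments, which is what a shell wildcard would expand to.

```python
import subprocess


def spectrast_steps(pepxml_files):
    """Return the four SpectraST commands of steps 3.A to 3.D, in order."""
    return [
        # 3.A: raw library from the converted pep.xml files.
        ["spectrast", "-cNraw", "-cP0", *pepxml_files],
        # 3.B: consensus library.
        ["spectrast", "-cNcons", "-cAC", "raw.splib"],
        # 3.C: quality control filter.
        ["spectrast", "-cNconsQ", "-cAQ", "cons.splib"],
        # 3.D: append decoys, one decoy per forward entry.
        ["spectrast", "-cNconsQdecoy", "-cAD", "-cc", "-cy1", "consQ.splib"],
    ]


def run_steps(steps):
    """Run each command and stop at the first one that fails."""
    for command in steps:
        subprocess.run(command, check=True)
```

For example, `run_steps(spectrast_steps(["file.pep.xml"]))` would run the whole workflow. Keeping command construction separate from execution makes it possible to print and inspect the commands first.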