This repository contains scripts for SureQuant mass spec method building for analysis of peptides presented on MHCs, and for analyzing data generated by these methods for absolute and relative quantification.
Run the following in the command line to download the code:
git clone https://github.com/oleddy/SureQuant_MHC.git
Install all dependencies as follows:
pip install -r requirements.txt
Using a virtual environment to install dependencies and run the code is recommended. Cloning the repository and installing all dependencies should take on the order of 1-2 minutes. All scripts should take no more than 1-2 minutes to execute on a typical personal computer.
Code was tested with Python 3.11.3 on macOS 12.0.1.
A detailed description of what each script does can be found in the Procedure section of the paper. Below, you can find information on how to format inputs and command line arguments.
To generate the tables needed to build your instrument method, start with a report exported from Skyline containing data from a survey run that includes only the stable isotope labeled (SIL) trigger peptides for your SureQuant panel. An example can be found in method_building/example_inputs.
The generate_method_tables.py script takes the following command line arguments:
Flag | Definition | Required? (Y/N) |
---|---|---|
-i | The path to the input data (a Survey_Run_Results report CSV file exported from Skyline) | Y |
-o | The path to a directory where the output should be generated | Y |
-l | A comma-separated list of the SIL label mass offsets (in Daltons) that occur among your set of targets (e.g., -l 6,7) | Y |
-c | A comma-separated list of the charge states that occur among your set of targets (e.g., -c 2,3). Default value is "2,3". | N |
-n | The number of fragment ions in each pseudospectrum. Must be an integer, not greater than the number of fragment ions per peptide in your spectral library. Default value is 6. | N |
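To illustrate the argument formats described in the table, here is a minimal sketch of a parser that accepts the same flags. It is not taken from generate_method_tables.py; the names `int_list` and `build_parser` are hypothetical, and treating the label mass offsets as integers is an assumption based on the `-l 6,7` example.

```python
import argparse


def int_list(text):
    """Convert a comma-separated string such as "6,7" into [6, 7]."""
    try:
        return [int(item) for item in text.split(",")]
    except ValueError:
        raise argparse.ArgumentTypeError(
            f"expected comma-separated integers, got {text!r}"
        )


def build_parser():
    # Hypothetical stand-in for the script's own parser; flags and
    # defaults mirror the table above.
    parser = argparse.ArgumentParser(description="Method table arguments (sketch)")
    parser.add_argument("-i", required=True, help="Skyline survey run report (CSV)")
    parser.add_argument("-o", required=True, help="output directory")
    parser.add_argument("-l", required=True, type=int_list, help="SIL label mass offsets")
    parser.add_argument("-c", type=int_list, default=[2, 3], help="charge states")
    parser.add_argument("-n", type=int, default=6, help="fragment ions per pseudospectrum")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["-i", "report.csv", "-o", "out", "-l", "6,7"])
    print(args.l, args.c, args.n)
```

With only the required flags supplied, `-c` and `-n` fall back to the defaults listed in the table.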
Example output files can be found in method_building/example_outputs. To generate these outputs from the provided inputs, run the following in the method_building directory:
python3 generate_method_tables.py -i ./example_inputs/Survey_Results_Review.csv -o ./example_outputs -c 2,3 -l 6,7
To compute the normalized, hipMHC-corrected relative intensities of target peptides in a relative quantification SureQuant MHC analysis, start with a report exported from Skyline that contains the following fields:
Isotope Label Type
Modified Sequence
Fragment Ion
Precursor Charge
Replicate
Peptide Modified Sequence
Area
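Before running the script, it can help to confirm that the exported report actually contains these fields. The sketch below is one way to do that with the standard library. The function name `missing_fields` is hypothetical, and it assumes the CSV column headers match the field names above exactly, which may depend on your Skyline export settings.

```python
import csv
import io

REQUIRED_FIELDS = [
    "Isotope Label Type",
    "Modified Sequence",
    "Fragment Ion",
    "Precursor Charge",
    "Replicate",
    "Peptide Modified Sequence",
    "Area",
]


def missing_fields(csv_file, required=REQUIRED_FIELDS):
    """Return the required column names that are absent from the header row."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # empty file -> every field is missing
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


if __name__ == "__main__":
    # Hypothetical report header that lacks the "Area" column.
    sample = io.StringIO(
        "Isotope Label Type,Modified Sequence,Fragment Ion,"
        "Precursor Charge,Replicate,Peptide Modified Sequence\n"
    )
    print(missing_fields(sample))
```

For a real report, open the file with `open(path, newline="")` and pass the file object to `missing_fields`.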
The SQ_relative_quantification.py script takes the following command line arguments:
Flag | Definition | Required? (Y/N) |
---|---|---|
-d | The path to the data (a CSV report exported from Skyline) | Y |
-o | The path to a directory where the output should be generated | Y |
-s | The path to a table of hipMHC standards to use for normalization (see below) | Y |
-t | The path to a table of target peptides | Y |
-c | The path to a table of experimental conditions | Y |
-n | The number of fragment ions to use for quantification (default value is 3) | N |
The script will generate two tables. One contains the final normalized, hipMHC-corrected intensities for the target peptides. The other contains the individual correction factors computed for each hipMHC, for each experimental condition. In calculating the normalized relative intensities of the target peptides, the correction factors for the individual hipMHCs are averaged to obtain the overall correction factor for a given experimental condition.
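The averaging step can be sketched as follows. This is an illustration only, not the script's own code: the function name `overall_correction_factors` and the example values are hypothetical, and an arithmetic mean is assumed. How the resulting factor is then applied to the target intensities is not shown here.

```python
from collections import defaultdict
from statistics import mean


def overall_correction_factors(per_standard_factors):
    """Average the per-hipMHC correction factors within each condition.

    per_standard_factors: iterable of (condition, hipMHC sequence, factor).
    Returns a dict mapping each condition to its mean correction factor.
    """
    grouped = defaultdict(list)
    for condition, _standard, factor in per_standard_factors:
        grouped[condition].append(factor)
    return {condition: mean(factors) for condition, factors in grouped.items()}


if __name__ == "__main__":
    # Hypothetical factors for two standards in two conditions.
    factors = [
        ("Condition A", "standard 1", 1.0),
        ("Condition A", "standard 2", 3.0),
        ("Condition B", "standard 1", 2.0),
    ]
    print(overall_correction_factors(factors))
```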
To generate the example outputs from the example input, run the following in the relative_quantification directory:
python3 SQ_relative_quantification.py -o ./example_outputs -s ./example_inputs/hipMHC_table.csv -t ./example_inputs/target_list.csv -d ./example_inputs/Skyline_report_TAP.csv -c ./example_inputs/conditions_table_TAP.csv
Since Skyline will assign the label type "heavy" to both the "light" (1x SIL labeled) hipMHC and the corresponding "heavy" (2x labeled) trigger peptide, the hipMHC standards table has to separately specify the modified and labeled sequence of the hipMHC peptide and its corresponding trigger peptide. The table has the following columns (one example row is shown):
Light Annotated Seq | Heavy Annotated Seq | Charge |
---|---|---|
ALNEQIARL[+6] | AL[+7]NEQIARL[+6] | 2 |
An example of a full table can be found under relative_quantification/example_inputs.
The target peptide table specifies the sequence and charge state of each target to be quantified as follows:
Peptide | Charge |
---|---|
LLDEGKQSL | 2 |
An example of a full table can be found under relative_quantification/example_inputs.
The experimental conditions table lists which mass spec runs should be included in the quantification and includes a Boolean column to indicate which condition should be considered the reference for normalization (defined to have relative intensity = 1).
The table has the following columns (one example row is shown):
Filename | Condition | Reference |
---|---|---|
080623_THP_SQ_WT_H37Rv | WT +H37Rv | TRUE |
An example of a full table can be found under relative_quantification/example_inputs.
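Because the reference condition defines relative intensity = 1, it is worth checking that the conditions table marks one, and only one, condition as the reference. The sketch below is one way to check this. The function name `reference_conditions` is hypothetical, and it assumes a comma-delimited file with the column headers shown above and the literal value TRUE in the Reference column. The second row of the sample data is invented for illustration.

```python
import csv
import io


def reference_conditions(csv_file):
    """Return the sorted, distinct conditions flagged TRUE in the Reference column."""
    reader = csv.DictReader(csv_file)
    return sorted(
        {
            row["Condition"]
            for row in reader
            if row["Reference"].strip().upper() == "TRUE"
        }
    )


if __name__ == "__main__":
    sample = io.StringIO(
        "Filename,Condition,Reference\n"
        "080623_THP_SQ_WT_H37Rv,WT +H37Rv,TRUE\n"
        "hypothetical_second_run,Hypothetical condition,FALSE\n"
    )
    references = reference_conditions(sample)
    if len(references) != 1:
        raise SystemExit(f"expected one reference condition, found {references}")
    print(references[0])
```

Several runs may share a condition, so the check counts distinct conditions rather than rows.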
To perform absolute quantification of a peptide of interest, start with a Skyline report containing the following fields:
Fragment Ion
Modified Sequence
Product Mz
Peak Rank
Height
Replicate
The Absolute_quant.py script takes the following arguments:
Flag | Definition | Required? (Y/N) |
---|---|---|
--dir | The working directory | Y |
--file | The input Skyline report CSV file | Y |
--output | Linear regression plot name | Y |
--rep | Replicate name of the sample to quantify (same as the original raw mass spec data file name, unless changed in Skyline) | Y |
--light | Modified (if applicable) sequence of the endogenous peptide | Y |
--H1 | Labeled and modified sequence of the first hipMHC standard peptide | Y |
--H2 | Labeled and modified sequence of the second hipMHC standard peptide | Y |
--H3 | Labeled and modified sequence of the third hipMHC standard peptide | Y |
--input | Maximum hipMHC spike-in amount in fmol. | Y |
A 10x dilution series of hipMHC standards is assumed (so if --input is 100 fmol, then the 1H, 2H, and 3H standards are assumed to have been spiked in at molar amounts of 1 fmol, 10 fmol, and 100 fmol, respectively).
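The assumed spike-in amounts follow directly from the maximum amount and the 10x dilution factor. The sketch below reproduces that arithmetic; the function name `standard_amounts` and its parameters are hypothetical and are not part of Absolute_quant.py.

```python
def standard_amounts(max_input_fmol, n_standards=3, dilution=10):
    """Return spike-in amounts in fmol, lowest first, for a serial dilution.

    The highest standard is spiked in at max_input_fmol, and each lower
    standard is reduced by a further factor of `dilution`.
    """
    return [
        max_input_fmol / dilution ** (n_standards - 1 - index)
        for index in range(n_standards)
    ]


if __name__ == "__main__":
    amounts = dict(zip(["1H", "2H", "3H"], standard_amounts(100)))
    print(amounts)  # {'1H': 1.0, '2H': 10.0, '3H': 100.0}
```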
To generate the example outputs from the example input, run the following in the absolute_quantification directory:
python ./Absolute_quant.py --file ./example_inputs/Absolute_Quantification_Report.csv --rep spleenocyte_pulsed_SQ_20231004 --output std_curve --cell 5 --light ITYTWTRL --H1 IT[+5]YTWTRL --H2 IT[+5]YT[+5]WTRL --H3 IT[+5]YT[+5]WT[+5]RL --dir . --input 100
For additional questions or support, please contact owenl [at] mit [dot] edu.