Skip to content

Commit

Permalink
Finished merged protein page documentation, various other documentation
Browse files Browse the repository at this point in the history
fixes.
  • Loading branch information
mriffle committed Oct 30, 2015
1 parent 552e2af commit 15af2c8
Show file tree
Hide file tree
Showing 6 changed files with 290 additions and 3 deletions.
Binary file added docs/images/merged-protein-euler-diagram.png
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/images/merged-protein-page.png
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
1 change: 1 addition & 0 deletions docs/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@ Here you will find documentation for both using and installing ProXL.
using/peptide
using/merged-peptide
using/protein
using/merged-protein
using/spectrum-viewer


Expand Down
5 changes: 3 additions & 2 deletions docs/using/merged-peptide.rst
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ the data from multiple searches and presents the results as an interactive table
The searches do not need to be from the same software pipeline. For example,
different versions of the same program may be compared, or the results from
entirely different programs (e.g., Kojak and XQuest) may be compared. Currently,
the total number of merged runs must be 2 or 3 and must be from the same
the total number of merged searches must be 2 or 3 and must be from the same
project. Note, for the peptide view page seen when viewing a single search,
see :doc:`/using/peptide`.

Expand Down Expand Up @@ -75,7 +75,8 @@ Euler diagram
======================================
.. image:: /images/merged-peptide-euler-diagram.png

The Euler diagram (similar to a Venn diagram) provides a graphical depiction of the overlap
The Euler diagram (similar to a Venn diagram) provides a graphical depiction of the
relative sizes and overlap
between the peptides found in the merged searches. The colors in the diagram match
the colors used for the search list above. The search list is provided to the
left of the diagram with their associated colors as a legend. The labels for each
Expand Down
285 changes: 285 additions & 0 deletions docs/using/merged-protein.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,285 @@
====================================
Merged Protein View Page
====================================

.. image:: /images/merged-protein-page.png

To reach this page, select multiple searches on the project page and click
"View Merged Proteins". (See :doc:`/using/project`.) This page combines and collates
the data from multiple searches and presents the results as an interactive table.
The searches do not need to be from the same software pipeline. For example,
different versions of the same program may be compared, or the results from
entirely different programs (e.g., Kojak and XQuest) may be compared. Currently,
the total number of merged searches must be 2 or 3 and must be from the same
project. For the protein view page seen when viewing a single search,
see :doc:`/using/protein`.

This page is designed to present the data at the protein level, or the UDR level. That is
to say that rows in the table represent distinct proteins linked at distinct positions. Rows
in the table may be clicked on to view the data at the individual search level, and each
of those searches may be expanded to view the underlying proteins, PSMs, and spectra.

**Note**: If any identified peptides map to multiple proteins, those proteins are listed here
as separate rows. For example if peptide 1 is linked to peptide 2, and peptide 1 maps to
protein A and peptide 2 maps to proteins B and C, rows will be present for A-B and A-C.
This may dramatically increase the number of reported crosslinks if your
protein database is redundant in terms of homologous proteins or proteoforms or if small
peptides are mapping to many proteins. The filtering options described below are meant to
help eliminate this redundancy in reported proteins.

Search List
=========================
The list of merged searches is presented below the top navigation. Each search
is shown next to its assigned color for the page, and the color referencing
this search is retained in the Euler diagram and in the peptide table. Clicking the
[+] icon will expand that search to view details:

Search Details
---------------------------
The "Path" is the location on disk from which the data were imported. The "Linker" is the
name of the crossinker used in the experiment. "Search Program(s)" is the name and
version number of the PSM search software used. "Upload date" is the date the data were
uploaded into ProXL. "FASTA file" is the name of the FASTA file used to perform the
PSM search.

Filter Data
=========================
The data presented may be filtered according to the following criteria. Note: Only crosslinks
or looplinks that meet ALL the filter criteria are shown.

PSM q-value cutoff
-------------------------
Only proteins that contain peptides that were identified in the combined data by at least one PSM with a q-value <= to the value
specified will be shown.

Peptide q-value cutoff
-------------------------
Only proteins that contain peptides that have a peptide-level q-value <= the value specified will be shown.
As not all software produces peptide-level q-values, this field will have no effect
on those data.

Exclude xlinks with
-------------------------
Crosslinks or looplinks that have any of the checked attributes will be excluded. The attributes are:

* no unique peptides - If all peptides that ID either one of the crosslinked proteins also map to another protein
* only one PSM - If a given crosslink or looplink was identified by a single PSM
* only one peptide - If a given crosslink or looplink was identifed by a single peptide, where a peptide is the combination of sequence, linked positions, and modifications

Exclude organisms
------------------------
Any links containing a protein that maps to any of the checked organisms will be excluded. The list of
organisms presented was gathered by the proteins found in the search. Useful for filtering out
groups of contaminant proteins.

Exclude protein(s)
------------------------
Any links containing a any of the selected proteins will be excluded. Multiple proteins may be selected
or unselected using control-click (command-click on the mac) or shift-click. Useful for filtering
out individual contaminant proteins.

Update
-------------------------
In order to apply new filter parameters to the shown data, the "Update" button must be clicked. This will
fetch filtered data from the ProXL server and display the data on the web page.

Euler diagram
======================================
.. image:: /images/merged-protein-euler-diagram.png

The Euler diagram (similar to a Venn diagram) provides a graphical depiction of the
relative sizes and overlap
between the proteins/UDRs found in the merged searches. The colors in the diagram match
the colors used for the search list above. The search list is provided to the
left of the diagram with their associated colors as a legend. The labels for each
color include the search ID number and the number of crosslink or looplink UDRs found in each
of the merged searches. The total number of crosslink or looplink UDRs resulting from the merge is presented
in the header above the legend next to "Merged Crosslinks" or "Merged Looplinks".

View Looplinks
=========================
By default, the table shows crosslinks. To switch to looplinks, click the [View Looplinks]
link at the top of the table. To view crosslinks again, click the [View Crosslinks] link
that appears at the top of the table.

Download Data
=========================
All crosslinks and looplinks that meet the current filtering criteria may be downloaded
as tab-delimited text by cliking the [Download Data (#)] link above the table. # indicates
the number of rows in the table.

Download UDRs
=========================
UDR stands for "unique distance restraint", which takes its name from 3D modelling
terminology. A UDR, in ProXL, is any specific position in a protein linked to a
specific position in another protein, whether it is a crosslink or a looplink. The
[Download UDRs (#)] link downloads a non-redundant tab-delimited text table of these UDRs consolidated
from the crosslinks and looplinks. The # is the number of UDRs.

Table Description
=========================
The table presents columns describing the proteins/UDRs and indicates in which of the merged searches
they were found. There is one row per UDR. Each row in the table may be clicked on to expand and view
the protein-level data by search. Each of these searches may then be clicked on to view peptides, PSMs
and spectra from those searches.

Columns
-------------------------
The columns are described below. Note that all column headers may be clicked to toggle between ascending and
descending sorting of that column. Holding the shift key while clicking column headers allow sorting on
multiple columns.

Search Columns
^^^^^^^^^^^^^^^^^^^^^^^^^
The first 1-3 columns will be labeled with search ID numbers as headers, and provide an indication for
whether or not the UDR in that row was found in that search. If found in that search, the cell for
this search in this row will be shaded the same color associated with that search in the Euler diagram
and search list at the top of the page. The column will also contain an asterisk. If not found, this
cell is empty.

Searches
^^^^^^^^^^^^^^^^^^^^^^^^^
The number of the merged searches that contain this UDR. The [+] icon indicates that the row may be clicked on to
be expanded to show underlying searches in which this UDR was found, the peptides and their statistics, and PSMs
and associated spectra.

Protein 1 and 2 (Crosslink-only)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In the case of crosslinks, these are the crosslinked proteins

Position (Crosslink-only)
^^^^^^^^^^^^^^^^^^^^^^^^^^
This is the crosslinked position in the respective proteins, where the
first residue is counted as position 1.

Protein (Looplink-only)
^^^^^^^^^^^^^^^^^^^^^^^^^
In the case of looplinks, this is the looplinked protein

Position 1 and 2 (Looplink-only)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In looplinked proteins, these are the positions in the protein that are linked.

PSMs
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The total number of PSMs (peptide spectrum matches) meeting the cutoff that identified either crosslinked (crosslink view) or looplinked
(looplink view) peptides that mapped to the reported proteins and positions.

# Peptides
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The total number of identified crosslinked (crosslink view) or looplinked (looplink view) peptides
that mapped to the reported proteins and positions. Only peptides with a peptide-level
q-value <= the requested cutoff (if applicable) AND having at least one PSM having a
psm-level cutoff <= the requested cutoff are counted.

**Note**: The individual peptides may be viewed by clicking a row in the table to view a
table of peptides. Rows in that peptide table may also be viewed to view the underlying
PSMs and view spectra.

# Unique Peptides
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Of the # of peptides, the total number that uniquely mapped to this protein pair (crosslink view) or
protein (looplink view).

Best Peptide Q-value
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Of the peptides describe above, the best peptide-level q-value found for those peptides (if available).

Best PSM Q-value
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The best PSM-level q-value among the PSMs described above.

View Search-level summary
=================================
Clicking on a row for a UDR will expand that row and present search-level data for that UDR--such as in which search(es) it was found, how many peptides
were found for it, how many PSMs, and best q-values. Clicking on the search rows will expand to reveal underlying peptides.

View Peptides
=========================
All peptides that meet the q-value cutoffs that were mapped to a protein-level crosslink
or looplink may be seen by clicking on the respective row in the search-level summary. Additionally, all rows
of this peptide table may clicked to view all PSMs associated with that peptide identification.

Columns
-------------------------
The peptides appear in a table with the following columns:

Reported peptide
^^^^^^^^^^^^^^^^^^^^^^^^^
The peptide identificaton as it was reported by the respective search program.

Peptide 1 and 2 (Crosslink-only)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The sequences of the two crosslinked peptides.

Pos (Crosslink-only)
^^^^^^^^^^^^^^^^^^^^^^^^^
The positions in the respective peptides that were crosslinked (starting at 1).

Peptide (Looplink-only)
^^^^^^^^^^^^^^^^^^^^^^^^^
The sequence of the looplinked peptide.

Pos 1 and 2 (Looplink-only)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The positions in the peptide that were looplinked.

Q-value
^^^^^^^^^^^^^^^^^^^^^^^^^
The peptide-level q-value for this peptide identification (if available)

# PSMs
^^^^^^^^^^^^^^^^^^^^^^^^^
The number of PSMs that meet the cutoff criteria that identified this peptide.

View PSMs
=========================
All PSMs with a q-value <= the specified PSM-level cutoff may be viewed for a peptide by clicking on a row
in the peptide table that is shown when clicking a row in the main protein table.

Columns
-------------------------
The PSMs appear in a table with the following columns:

Scan Num.
^^^^^^^^^^^^^^^^^^^^^^^^^
The scan number from the spectral file (e.g., mzML file)

Charge
^^^^^^^^^^^^^^^^^^^^^^^^^
The predicted charge state of the precursor ion.

Obs. m/z
^^^^^^^^^^^^^^^^^^^^^^^^^
The observed m/z of the precursor ion.

RT (min)
^^^^^^^^^^^^^^^^^^^^^^^^^
The retention time in minutes.

Scan Filename
^^^^^^^^^^^^^^^^^^^^^^^^^
The filename of the scan file.

Q-value
^^^^^^^^^^^^^^^^^^^^^^^^^
The q-value for the PSM.

PEP
^^^^^^^^^^^^^^^^^^^^^^^^^
The posterior error probabiliy for this PSM, if available.

SVM Score
^^^^^^^^^^^^^^^^^^^^^^^^^
The support vector machine score for this PSM, if available.

View Spectra
-------------------------
The annotated mass spectrum may be viewed for any PSM by clicking the "View Spectrum" link. For help on our
spectrum viewer, see the :doc:`/using/spectrum-viewer` page.

Sort Data
=========================
All column headers may be clicked to toggle between ascending and
descending sorting of that column. Holding the shift key while clicking column headers allow sorting on
multiple columns.
2 changes: 1 addition & 1 deletion docs/using/protein.rst
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ The protein view page provides a table view of crosslinks or looplinks at the pr
Each row in the table corresponds to a unique crosslink ( specific position in protein A linked
to a specific position in protein B) or a unique looplink (specific pair of positions
in a protein). The data may be filtered according to confidence, taxonomy, or individually
by protein.
by protein. For the view page seen when merging multiple searches, see :doc:`/using/merged-protein`.

**Note**: If any identified peptides map to multiple proteins, those proteins are listed here
as separate rows. For example if peptide 1 is linked to peptide 2, and peptide 1 maps to
Expand Down

0 comments on commit 15af2c8

Please sign in to comment.