LuisFranciscoHS edited this page Apr 2, 2019 · 2 revisions

The output of PathwayMatcher consists of two files: the reaction and pathway search results, and the over-representation analysis of the matched pathways.

Additionally, PathwayMatcher allows the user to generate a protein connection graph.

Search

A tab-separated file with the list of reactions and pathways related to the input data. The columns are:

  • UNIPROT: The UniProt accession number of the protein associated with the input. Note that the proteins reported in this column are not necessarily given explicitly in the input; they can also be implied by the peptides or by the genetic variants affecting the protein after transcription and translation.
  • REACTION_STID: Reaction stable identifier
  • REACTION_DISPLAY_NAME: Reaction name
  • PATHWAY_STID: Pathway stable identifier
  • PATHWAY_DISPLAY_NAME: Pathway name
  • TOP_LEVEL_PATHWAY_STID: Top level pathway stable identifier
  • TOP_LEVEL_PATHWAY_DISPLAY_NAME: Top level pathway name

For gene, genetic variant, Ensembl and proteoform inputs, an extra column is added with the respective name:

  • GENE
  • ENSEMBL
  • RSID
  • Proteoform: The set of post-translational modifications with their PSI-MOD type and coordinate.

If the command line argument "-T" is not used, the two columns about top level pathways (TOP_LEVEL_PATHWAY_STID and TOP_LEVEL_PATHWAY_DISPLAY_NAME) are not shown.
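
As an illustration of how the search file can be consumed, here is a minimal Python sketch that groups pathway identifiers by protein and checks whether the top level pathway columns are present. The column names come from the list above; the sample rows, accessions and stable identifiers are made up for the example, and the helper names `pathways_by_protein` and `has_top_level_columns` are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt of a search file; every row is invented for illustration.
SAMPLE_SEARCH = (
    "UNIPROT\tREACTION_STID\tREACTION_DISPLAY_NAME\tPATHWAY_STID\tPATHWAY_DISPLAY_NAME\n"
    "P00001\tR-HSA-0000001\tExample reaction A\tR-HSA-0000010\tExample pathway X\n"
    "P00001\tR-HSA-0000002\tExample reaction B\tR-HSA-0000010\tExample pathway X\n"
    "P00002\tR-HSA-0000003\tExample reaction C\tR-HSA-0000020\tExample pathway Y\n"
)


def pathways_by_protein(handle):
    """Map each UNIPROT accession to the set of PATHWAY_STID values in its rows."""
    result = defaultdict(set)
    for row in csv.DictReader(handle, delimiter="\t"):
        result[row["UNIPROT"]].add(row["PATHWAY_STID"])
    return dict(result)


def has_top_level_columns(handle):
    """Return True if the header contains the top level pathway columns."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    return (
        "TOP_LEVEL_PATHWAY_STID" in header
        and "TOP_LEVEL_PATHWAY_DISPLAY_NAME" in header
    )


mapping = pathways_by_protein(io.StringIO(SAMPLE_SEARCH))
top_level = has_top_level_columns(io.StringIO(SAMPLE_SEARCH))
```

With a real result, replace `io.StringIO(SAMPLE_SEARCH)` by an open file handle on the search file produced by your run.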

Analysis

A CSV file with the statistical analysis results for each of the hit pathways. Each row of the file corresponds to a pathway containing at least one participant entity of the input. It contains the following columns:

  • Pathway StId: The Reactome stable unique identifier.
  • Pathway Name: The name of the pathway in Reactome.
  • # Entities Found: The number of entities (proteins or proteoforms) found as participants in the pathway.
  • # Entities Total: The total number of entities participating in the pathway.
  • Entities Ratio: The number of entities found divided by the total number of entities in the pathway.
  • Entities P-Value: The probability of finding that number of entities if the entities in the input were selected completely at random and each protein were selected independently.
  • Significant: Whether the p-value is less than or equal to 0.05.
  • Entities FDR: The false discovery rate
  • # Reactions Found: The number of reactions in the pathway with a participating entity of the input.
  • # Reactions Total: The total number of reactions in the pathway.
  • Reactions Ratio: The quotient of the number of reactions found divided by the total number of reactions in the pathway.
  • Entities Found: The UniProt accession numbers of the entities found in the pathway.
  • Reactions Found: The Reactome stable identifiers of the reactions with participating entities found in the pathway.
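
The relation between the columns above can be sketched in Python: the entities ratio is recomputed from the two counts, and the significance flag is derived from the p-value with the 0.05 threshold. The column names are taken from the list above; the sample rows are invented, the comma delimiter is an assumption based on the file being described as CSV, and `summarize` is a hypothetical helper, not part of PathwayMatcher.

```python
import csv
import io

# Hypothetical excerpt of an analysis file with a subset of the columns.
SAMPLE_ANALYSIS = (
    "Pathway StId,Pathway Name,# Entities Found,# Entities Total,Entities P-Value\n"
    "R-HSA-0000010,Example pathway X,5,20,0.01\n"
    "R-HSA-0000020,Example pathway Y,1,50,0.40\n"
)

ALPHA = 0.05  # significance threshold stated in the column description


def summarize(handle, delimiter=","):
    """Recompute the entities ratio and the significance flag for each pathway."""
    rows = []
    for row in csv.DictReader(handle, delimiter=delimiter):
        found = int(row["# Entities Found"])
        total = int(row["# Entities Total"])
        p_value = float(row["Entities P-Value"])
        rows.append(
            {
                "stid": row["Pathway StId"],
                "entities_ratio": found / total,
                "significant": p_value <= ALPHA,
            }
        )
    return rows


summary = summarize(io.StringIO(SAMPLE_ANALYSIS))
```

If your analysis file uses a different separator, pass it through the `delimiter` argument.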

Console output

While PathwayMatcher is running, it displays messages at the main steps of the process. It names the entities differently depending on the stage they are in and the type of input.

Proteins: UniProt

  • Input: accessions in the input that follow the UniProt accession format description
  • Matched: accessions in the input that appear in the database
  • Hit: accessions matched that participate in at least one reaction
  • Vertices:
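
The Input, Matched and Hit stages for proteins form a funnel of successively smaller sets, which the following Python sketch illustrates. It is not PathwayMatcher's implementation: the regular expression is the accession pattern published by UniProt, and whether PathwayMatcher applies exactly this check is an assumption. The `database` dictionary, its contents and the `classify` helper are hypothetical.

```python
import re

# Accession pattern as published by UniProt (assumed to be the format check).
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][0-9][A-Z0-9]{2}[0-9]){1,2}"
)


def classify(accessions, reactions_by_protein):
    """Return the (input, matched, hit) sets for a list of candidate accessions.

    reactions_by_protein is a hypothetical stand-in for the database:
    it maps each known accession to the reactions it participates in.
    """
    # Input: entries that follow the accession format
    input_set = {a for a in accessions if UNIPROT_ACCESSION.fullmatch(a)}
    # Matched: input accessions that appear in the database
    matched = {a for a in input_set if a in reactions_by_protein}
    # Hit: matched accessions that participate in at least one reaction
    hit = {a for a in matched if reactions_by_protein[a]}
    return input_set, matched, hit


# Made-up data: P00002 is known but has no reactions, Q00003 is unknown.
database = {"P00001": ["R-HSA-0000001"], "P00002": []}
lines = ["P00001", "P00002", "Q00003", "not-an-accession"]
input_set, matched, hit = classify(lines, database)
```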

Proteoforms

  • Input: proteoforms in the input file that follow the SIMPLE format for proteoforms
  • Matched: proteoforms in the input that have an equivalent proteoform in the database
  • Hit: proteoforms matched that participate in at least one reaction
  • Vertices:

Proteins Connection Graph

The protein connection graph represents the relations between proteins in the Reactome database according to the data model.

See this page for more information.