This repository contains SDRF (Sample and Data Relationship Format) files for crosslinking-annotated datasets in ProteomeXchange.
The goal of this repository is to provide standardized metadata annotations for crosslinking proteomics datasets submitted to ProteomeXchange. Each dataset has its own SDRF file that describes the experimental design, samples, and data relationships according to the MAGE-TAB SDRF format.
crosslinking-datasets/
├── datasets/
│ ├── PXD000001/
│ │ └── PXD000001.sdrf.tsv
│ ├── PXD000002/
│ │ └── PXD000002.sdrf.tsv
│ └── ...
├── templates/
│ └── sdrf-template.tsv
└── README.md
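The layout above implies a simple naming rule: each SDRF file lives at datasets/<accession>/<accession>.sdrf.tsv. The following is a minimal sketch of that rule in Python. The helper names `expected_sdrf_path` and `follows_layout` are hypothetical and not part of this repository, and the six-digit accession pattern is an assumption based on the examples shown here.

```python
import re
from pathlib import PurePosixPath

# Assumption: accessions are "PXD" followed by six digits, as in the
# examples in this README (PXD000001, PXD012345).
ACCESSION_RE = re.compile(r"PXD\d{6}")


def expected_sdrf_path(accession: str) -> PurePosixPath:
    """Return the repository-relative path where a dataset's SDRF file belongs."""
    if not ACCESSION_RE.fullmatch(accession):
        raise ValueError(f"not a ProteomeXchange accession: {accession!r}")
    return PurePosixPath("datasets") / accession / f"{accession}.sdrf.tsv"


def follows_layout(path: str) -> bool:
    """Check that a path has the form datasets/<accession>/<accession>.sdrf.tsv."""
    parts = PurePosixPath(path).parts
    if len(parts) != 3 or parts[0] != "datasets":
        return False
    accession = parts[1]
    # The directory name and the file name must carry the same accession.
    return (
        ACCESSION_RE.fullmatch(accession) is not None
        and parts[2] == f"{accession}.sdrf.tsv"
    )


print(expected_sdrf_path("PXD000001"))
print(follows_layout("datasets/PXD000001/PXD000002.sdrf.tsv"))
```

The second call is a deliberately mismatched path (directory and file name disagree), which is the kind of mistake this check is meant to catch.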
SDRF (Sample and Data Relationship Format) is a tab-delimited format that describes:
- Sample characteristics and experimental variables
- Protocols and protocol parameters
- Raw and processed data files
- Relationships between samples and data files
Common SDRF columns for crosslinking experiments include:
- source name: Biological source identifier
- characteristics[organism]: Species/organism
- characteristics[cell type]: Cell type (if applicable)
- characteristics[disease]: Disease state (if applicable)
- comment[data file]: Raw data file names
- comment[fraction identifier]: Fraction information
- comment[technical replicate]: Technical replicate number
- comment[biological replicate]: Biological replicate number
- comment[label]: Labeling information
- comment[instrument]: Mass spectrometry instrument
- comment[modification parameters]: PTMs and crosslinker information
- comment[cleavage agent details]: Protease used
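As a quick way to see which of these columns a file already has, the sketch below reads only the header row of a tab-delimited SDRF text and reports the listed columns that are absent. It uses the Python standard library only. `COMMON_COLUMNS` and `missing_columns` are hypothetical names, and the list simply mirrors the columns named above; it is not an authoritative set of required SDRF columns.

```python
import csv
import io

# Columns this README lists as common for crosslinking experiments.
# Assumption: treated here as a checklist, not as the official required set.
COMMON_COLUMNS = [
    "source name",
    "characteristics[organism]",
    "characteristics[cell type]",
    "characteristics[disease]",
    "comment[data file]",
    "comment[fraction identifier]",
    "comment[technical replicate]",
    "comment[biological replicate]",
    "comment[label]",
    "comment[instrument]",
    "comment[modification parameters]",
    "comment[cleavage agent details]",
]


def missing_columns(sdrf_text: str) -> list[str]:
    """Return the common columns that do not appear in the SDRF header row."""
    reader = csv.reader(io.StringIO(sdrf_text), delimiter="\t")
    header = next(reader, [])
    # Compare case-insensitively and ignore stray whitespace around names.
    present = {name.strip().lower() for name in header}
    return [column for column in COMMON_COLUMNS if column not in present]


example = (
    "source name\tcharacteristics[organism]\tcomment[data file]\n"
    "sample 1\tHomo sapiens\trun01.raw\n"
)
for column in missing_columns(example):
    print("missing:", column)
```

A non-empty result is only a prompt to review the file; some of these columns apply only to certain experiments.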
- Create a new directory under datasets/ with the ProteomeXchange accession ID (e.g., PXD012345)
- Create an SDRF file named <accession>.sdrf.tsv in that directory
- Follow the SDRF format guidelines and use the template as reference
- Ensure all required columns are present and properly formatted
- Validate the SDRF file using appropriate validation tools
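The first two steps above can be scripted. The sketch below creates the dataset directory and seeds it with a copy of templates/sdrf-template.tsv, assuming it is run against a checkout of this repository. `scaffold_dataset` is a hypothetical helper, not a tool shipped with the repository.

```python
import shutil
from pathlib import Path


def scaffold_dataset(repo_root: Path, accession: str) -> Path:
    """Create datasets/<accession>/ and copy the SDRF template into it."""
    dataset_dir = repo_root / "datasets" / accession
    dataset_dir.mkdir(parents=True, exist_ok=True)

    target = dataset_dir / f"{accession}.sdrf.tsv"
    if target.exists():
        # Never overwrite an annotation that someone has already started.
        raise FileExistsError(f"{target} already exists; refusing to overwrite")

    # Assumption: the template lives at templates/sdrf-template.tsv,
    # as shown in the directory layout of this README.
    shutil.copyfile(repo_root / "templates" / "sdrf-template.tsv", target)
    return target
```

For example, calling `scaffold_dataset(Path("."), "PXD012345")` from the repository root would produce datasets/PXD012345/PXD012345.sdrf.tsv, ready to be filled in and validated.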
SDRF files should be validated before submission. You can use tools like:
- sdrf-pipelines for validation
- ProteomeXchange submission validation tools
Automated Validation: All pull requests that modify *.sdrf.tsv files are automatically validated using the sdrf-pipelines tool before they can be merged into the main branch. The validation checks:
- File format and structure
- Required columns presence
- Ontology term correctness
- Data consistency
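Before opening a pull request, a lightweight local pre-check can catch basic format and structure problems early. The sketch below is not the sdrf-pipelines validator and does not check ontology terms. It only covers structure, and it assumes that the first column is source name, that every row has as many fields as the header, and that cells are not left empty. `structural_problems` is a hypothetical name.

```python
import csv
import io


def structural_problems(sdrf_text: str) -> list[str]:
    """Return human-readable structural problems found in SDRF text."""
    rows = list(csv.reader(io.StringIO(sdrf_text), delimiter="\t"))
    if not rows:
        return ["file is empty"]

    header, body = rows[0], rows[1:]
    problems = []

    # Assumption: SDRF files start with the "source name" column.
    if not header or header[0].strip().lower() != "source name":
        problems.append("first column should be 'source name'")
    if not body:
        problems.append("no sample rows after the header")

    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(body, start=2):
        if len(row) != len(header):
            problems.append(
                f"line {line_no}: expected {len(header)} fields, found {len(row)}"
            )
        elif any(not cell.strip() for cell in row):
            problems.append(f"line {line_no}: empty cell")
    return problems


sample = "source name\tcharacteristics[organism]\nsample 1\n"
for problem in structural_problems(sample):
    print(problem)
```

An empty result means only that the file is well-formed as a table; the automated validation on the pull request remains the deciding check.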
To contribute a new SDRF file:
- Fork this repository
- Add your SDRF file following the structure above
- Submit a pull request with a description of the dataset
- Ensure your SDRF file passes automated validation
For questions or issues, please open an issue in this repository.