Implementation of the sample sheet in the pipeline #102

marissaDubbelaar · 2021-10-27T11:37:08Z

I suggest the following columns for the input file:

sample ID,
vcf file,
alleles (e.g. A*01:01),
peptide sequences file*,
protein file

For the peptide sequences file, the question is what the id and sequence stand for (UniProt ID and peptide sequence?). Would it not be useful to include the associated protein here as well?

christopher-mohr · 2021-10-27T13:03:22Z

alleles is mandatory, and ONE of vcf file, peptide sequences file, or protein file is mandatory.

marissaDubbelaar · 2021-10-27T13:07:24Z

References to alter the files:

https://github.com/marissaDubbelaar/mhcquant/blob/master/bin/check_samplesheet.py
https://github.com/marissaDubbelaar/mhcquant/blob/master/modules/local/samplesheet_check.nf

jonasscheid · 2021-10-28T09:08:40Z

Proposed solution for the sample sheet structure:

SampleID | Alleles | FileName

where the Alleles column contains EITHER a string of alleles separated by semicolons (A*02:01;A*24:01;B*07:02;B*08:01;C*04:01;C*07:01) OR a text file containing one allele per line (no header)

and FileName contains EITHER a tsv file containing the peptides like
id | sequence
peptid1 | SYFPEITHI

OR a fasta file containing protein sequences
OR an annotated vcf file

The contents of the provided files need to be validated.
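A minimal sketch of what such validation could look like, not the actual check_samplesheet.py. The suffix-to-type mapping, the function names `classify_input` and `check_peptide_tsv`, and the restriction to the 20 standard amino acids are all assumptions made for illustration; the thread does not specify them.

```python
import csv
import io
from pathlib import Path

# Assumed mapping from file suffix to input type (not fixed anywhere in this thread).
SUFFIX_TO_TYPE = {
    ".tsv": "peptide",
    ".fasta": "protein",
    ".fa": "protein",
    ".vcf": "variant",
}

# Assumption: only the 20 standard amino acids are accepted in peptide sequences.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def classify_input(filename):
    """Guess the input type from the file suffix, ignoring a trailing .gz."""
    suffixes = [s.lower() for s in Path(filename).suffixes]
    if suffixes and suffixes[-1] == ".gz":
        suffixes = suffixes[:-1]
    if not suffixes or suffixes[-1] not in SUFFIX_TO_TYPE:
        raise ValueError(f"Cannot determine input type of '{filename}'")
    return SUFFIX_TO_TYPE[suffixes[-1]]


def check_peptide_tsv(handle):
    """Check the 'id' / 'sequence' header and the sequence alphabet.

    Returns the number of peptide rows that passed the check.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    if reader.fieldnames != ["id", "sequence"]:
        raise ValueError(f"Unexpected header: {reader.fieldnames}")
    count = 0
    for line_no, row in enumerate(reader, start=2):
        sequence = (row["sequence"] or "").strip().upper()
        if not sequence or not set(sequence) <= AMINO_ACIDS:
            raise ValueError(f"Line {line_no}: invalid peptide sequence")
        count += 1
    return count


if __name__ == "__main__":
    print(classify_input("sample1.vcf.gz"))
    print(check_peptide_tsv(io.StringIO("id\tsequence\npeptid1\tSYFPEITHI\n")))
```

Checking the suffix first keeps the content checks specific to each input type; similar checks for fasta and vcf files would follow the same pattern.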

marissaDubbelaar · 2021-10-28T09:27:20Z

Maybe we can rename SampleID to ID to keep the annotation uniform.

christopher-mohr · 2021-10-28T15:04:36Z

As discussed, we will stick to the columns sample, alleles and filename.
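For illustration, a sample sheet with these three columns could be read roughly as follows. This is a sketch under assumptions, not the merged script: it assumes a comma-separated file, the function name `read_samplesheet` is hypothetical, and it only handles the case where alleles is a semicolon-separated string (not the allele text file variant).

```python
import csv
import io

REQUIRED_COLUMNS = ["sample", "alleles", "filename"]


def read_samplesheet(handle):
    """Parse a comma-separated sample sheet (assumed format) into a list of dicts."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"Missing column(s): {', '.join(missing)}")

    rows = []
    for line_no, row in enumerate(reader, start=2):
        # DictReader fills short rows with None, so fall back to an empty string.
        values = {col: (row[col] or "").strip() for col in REQUIRED_COLUMNS}
        empty = [col for col, value in values.items() if not value]
        if empty:
            raise ValueError(f"Line {line_no}: empty value for {', '.join(empty)}")
        rows.append(
            {
                "sample": values["sample"],
                # Alleles given as one string, separated by semicolons.
                "alleles": [a.strip() for a in values["alleles"].split(";") if a.strip()],
                "filename": values["filename"],
            }
        )
    return rows


if __name__ == "__main__":
    sheet = "sample,alleles,filename\nsample1,A*02:01;B*07:02,peptides.tsv\n"
    for entry in read_samplesheet(io.StringIO(sheet)):
        print(entry)
```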

Update check_samplesheet.py script for new format #102

marissaDubbelaar · 2021-11-01T10:01:13Z

@jonasscheid, @christopher-mohr
I noticed that check_requested_models.py doesn't accept "A*01:01;A*02:01" as input, only "HLA-A*01:01;HLA-A*02:01".
So we either need to update check_samplesheet.py so that it checks whether the HLA types start with "H-2-" or "HLA-", or we need to remove this check from check_requested_models.py.

jonasscheid · 2021-11-01T10:44:13Z

How about leaving the mouse nomenclature as is and allowing two notations for the HLA alleles: with and without the "HLA-" prefix?
I noticed that I need to allow mouse alleles in check_samplesheet.py as well, so I need to update it anyway. Let's wait for @christopher-mohr's suggestion.
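The proposal could be sketched like this: alleles that already carry an "HLA-" or "H-2-" prefix are kept as is, and everything else gets "HLA-" prepended. The function names `normalize_allele` and `normalize_alleles` are hypothetical, and this is only one way to read the suggestion, not what was eventually merged.

```python
KNOWN_PREFIXES = ("HLA-", "H-2-")


def normalize_allele(allele):
    """Return the allele with an explicit prefix.

    Mouse alleles ("H-2-...") and already prefixed human alleles are left
    unchanged; any other allele is assumed to be human and gets "HLA-".
    """
    allele = allele.strip()
    if not allele:
        raise ValueError("Empty allele name")
    if allele.startswith(KNOWN_PREFIXES):
        return allele
    return "HLA-" + allele


def normalize_alleles(allele_string):
    """Normalize a semicolon-separated allele string."""
    return ";".join(normalize_allele(a) for a in allele_string.split(";"))


if __name__ == "__main__":
    print(normalize_alleles("A*01:01;HLA-A*02:01;H-2-Kb"))
```

Normalizing once in check_samplesheet.py would let check_requested_models.py keep its prefix check unchanged.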

ggabernet · 2021-12-21T13:07:22Z

Done in #124

marissaDubbelaar created this issue from a note in Hackathon-October-2021 (epitopeprediction) Oct 27, 2021

christopher-mohr added the enhancement New feature or request label Oct 27, 2021

marissaDubbelaar assigned jonasscheid Oct 27, 2021

marissaDubbelaar moved this from epitopeprediction to Done - Day 2 in Hackathon-October-2021 Oct 28, 2021

marissaDubbelaar moved this from Done - Day 2 to epitopeprediction in Hackathon-October-2021 Oct 28, 2021

jonasscheid mentioned this issue Oct 29, 2021

Update check_samplesheet.py script for new format #102 #108

Merged

4 tasks

christopher-mohr added a commit that referenced this issue Oct 29, 2021

Merge pull request #108 from jonasscheid/dsl2

f6dbaa9

Update check_samplesheet.py script for new format #102

ggabernet closed this as completed Dec 21, 2021
