Dsl2 #110

Merged: 21 commits, Nov 18, 2021
Commits (21)
a7e4851
create get_samplesheet_paths to check the imput correctly and define …
marissaDubbelaar Oct 29, 2021
f583f10
Change the structure to dsl2, inclusion of modules: check_requested_m…
marissaDubbelaar Nov 1, 2021
f0f61d2
Inclusion of modules: check_requested_models, define_software, gen_pe…
marissaDubbelaar Nov 1, 2021
9b5c7e3
Added to do: somehow there is an empty result generated in the def ma…
marissaDubbelaar Nov 1, 2021
411defb
Merge branch 'marissaDubbelaar-modules' into dsl2
marissaDubbelaar Nov 1, 2021
cc8ff81
resolve check_samplesheet conflicts
marissaDubbelaar Nov 1, 2021
e4f67f0
change the epaa.py script, adjust the 'basic' files and include prope…
marissaDubbelaar Nov 10, 2021
a769a7c
Include local module csvtk_split and change the name of gen_peptides …
marissaDubbelaar Nov 10, 2021
2a17fa6
Include new modules, and workflow according to the DSL2 structure: ca…
marissaDubbelaar Nov 11, 2021
189d747
Include new modules: cat_fasta, cat_vcf, csvtk_concat, and merge_json
marissaDubbelaar Nov 12, 2021
a3813c1
Inclusion of version, appropriate naming and last parts of the conver…
marissaDubbelaar Nov 12, 2021
8870a52
Resolve markdown issues
marissaDubbelaar Nov 18, 2021
d0e19da
Resolve markdown issues
marissaDubbelaar Nov 18, 2021
0e653cb
Merge branch 'nf-core:dsl2' into dsl2
marissaDubbelaar Nov 18, 2021
c4bd543
Resolve markdown issues
marissaDubbelaar Nov 18, 2021
bbba9bb
Resolve markdown issues
marissaDubbelaar Nov 18, 2021
c4359f2
Resolve markdown issues
marissaDubbelaar Nov 18, 2021
c9b5efd
resolve conflict with the allele sheet
marissaDubbelaar Nov 18, 2021
c1e54fa
include .allele files
marissaDubbelaar Nov 18, 2021
34f35d3
Update nextflow version from 20.04 -> 21.04
marissaDubbelaar Nov 18, 2021
7ca3b62
Include right input link for test_peptides_h2
marissaDubbelaar Nov 18, 2021
4 changes: 2 additions & 2 deletions .github/workflows/ci.yml
@@ -44,7 +44,7 @@ jobs:
    if: ${{ github.event_name != 'push' || (github.event_name == 'push' && github.repository == 'nf-core/epitopeprediction') }}
    runs-on: ubuntu-latest
    env:
-     NXF_VER: '20.04.0'
+     NXF_VER: '21.04.0'
      NXF_ANSI_LOG: false
    strategy:
      matrix:
@@ -81,7 +81,7 @@ jobs:
    if: ${{ github.event_name != 'push' || (github.event_name == 'push' && github.repository == 'nf-core/epitopeprediction') }}
    runs-on: ubuntu-latest
    env:
-     NXF_VER: '20.04.0'
+     NXF_VER: '21.04.0'
      NXF_ANSI_LOG: false
    steps:
      - name: Check out pipeline code
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -8,10 +8,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### `Added`

- [#73](https://github.com/nf-core/epitopeprediction/pull/73) - Add support for the non-free netmhc tool family including netMHC 4.0, netMHCpan 4.0, netMHCII 2.2, and netMHCIIpan 3.1
+- [#101](https://github.com/nf-core/epitopeprediction/pull/101) - Inclusion of local modules and DSL2 conversion

### `Changed`

- [#100](https://github.com/nf-core/epitopeprediction/pull/89) - Merge previous template updates up to `v2.1`
+- [#110](https://github.com/nf-core/epitopeprediction/pull/110) - DSL2 conversion

### `Fixed`

2 changes: 1 addition & 1 deletion assets/multiqc_config.yaml
@@ -1,7 +1,7 @@
report_comment: >
  This report has been generated by the <a href="https://github.com/nf-core/epitopeprediction" target="_blank">nf-core/epitopeprediction</a>
  analysis pipeline. For information about how to interpret these results, please see the
- <a href="https://nf-co.re/epitopeprediction" target="_blank">documentation</a>.
+ <a href="https://github.com/nf-core/epitopeprediction" target="_blank">documentation</a>.
report_section_order:
  software_versions:
    order: -1000
12 changes: 7 additions & 5 deletions bin/check_requested_models.py
@@ -36,6 +36,7 @@ def read_peptide_input(filename):

def convert_allele_back(allele):
    name = str(allele)
+   print(name)
    if name.startswith("H-2-"):
        # convert internal Fred2 representation back to the nf-core/epitopeprediction input allele format
        return name.replace("H-2-", "H2-")
@@ -49,23 +50,24 @@ def __main__():
    parser = argparse.ArgumentParser("Write out information about supported models by Fred2 for installed predictor tool versions.")
    parser.add_argument('-p', "--peptides", help="File with one peptide per line")
    parser.add_argument('-c', "--mhcclass", default=1, help="MHC class I or II")
-   parser.add_argument('-l', "--max_length", help="Maximum peptide length")
-   parser.add_argument('-ml', "--min_length", help="Minimum peptide length")
-   parser.add_argument('-a', "--alleles", help="<Required> MHC Alleles", required=True)
+   parser.add_argument('-l', "--max_length", help="Maximum peptide length", type=int)
+   parser.add_argument('-ml', "--min_length", help="Minimum peptide length", type=int)
+   parser.add_argument('-a', "--alleles", help="<Required> MHC Alleles", required=True, type=str)
    parser.add_argument('-t', '--tools', help='Tools requested for peptide predictions', required=True, type=str)
    parser.add_argument('-v', '--versions', help='<Required> File with used software versions.', required=True)
    args = parser.parse_args()

    selected_methods = [item for item in args.tools.split(',')]
    with open(args.versions, 'r') as versions_file:
-       tool_version = [ (row[0], str(row[1][1:])) for row in csv.reader(versions_file, delimiter = "\t") ]
+       tool_version = [ (row[0].split()[0], str(row[1])) for row in csv.reader(versions_file, delimiter = "\t") ]
    # NOTE this needs to be updated, if a newer version will be available via Fred2 and should be used in the future
    tool_version.append(('syfpeithi', '1.0')) # how to handle this?
    # get for each method the corresponding tool version
    methods = { method:version for tool, version in tool_version for method in selected_methods if tool.lower() in method.lower() }

    # get the alleles
-   alleles = FileReader.read_lines(args.alleles, in_type=Allele)
+   # alleles = FileReader.read_lines(args.alleles, in_type=Allele)
+   alleles= [Allele(a) for a in args.alleles.split(";")]

    peptide_lengths = []
    if (args.peptides):
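The two behavioural changes in this script are easy to miss in the flattened hunk: the versions file is now parsed by taking the first whitespace-separated token of the tool name (instead of stripping the first character of the version string), and alleles arrive as a single semicolon-separated string rather than a file read through Fred2's FileReader. A minimal sketch of the new behaviour, using an invented versions table and a plain-string stand-in for Fred2's Allele objects:

```python
import csv
import io

# Illustrative versions file in the "tool<TAB>version" layout consumed by the script
# (the exact contents are an assumption, not taken from this PR).
versions_tsv = "SYFPEITHI (prediction)\t1.0\nmhcflurry\t1.4.3\nmhcnuggets\t2.3.2\n"

# New parsing: keep only the first token of the tool name, take the version as-is.
tool_version = [(row[0].split()[0], str(row[1]))
                for row in csv.reader(io.StringIO(versions_tsv), delimiter="\t")]
print(tool_version)  # [('SYFPEITHI', '1.0'), ('mhcflurry', '1.4.3'), ('mhcnuggets', '2.3.2')]

# Alleles now arrive as one semicolon-separated command-line string instead of a file;
# the real script wraps each entry in Fred2's Allele(...), a plain split shows the shape.
alleles_arg = "H2-Kb;H2-Db"  # hypothetical value for --alleles
alleles = [a for a in alleles_arg.split(";")]
print(alleles)  # ['H2-Kb', 'H2-Db']
```

The `split()[0]` presumably tolerates tool names that carry extra tokens in the versions file, though that rationale is an inference rather than something stated in the PR.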
37 changes: 16 additions & 21 deletions bin/check_samplesheet.py
100644 → 100755
@@ -2,11 +2,12 @@


import os
+import re
import sys
import errno
import argparse
-import re


def parse_args(args=None):
Description = "Reformat nf-core/epitopeprediction samplesheet file and check its contents."
Epilog = "Example usage: python check_samplesheet.py <FILE_IN> <FILE_OUT>"
@@ -30,7 +31,7 @@ def print_error(error, context="Line", context_str=""):
def check_allele_nomenclature(allele):
    pattern = re.compile("(^[A-Z][\*][0-9][0-9][:][0-9][0-9])$")
    return pattern.match(allele) is not None


def make_dir(path):
    if len(path) > 0:
@@ -62,7 +63,7 @@ def check_samplesheet(file_in, file_out):

    Furhter Examples:
    - Class2 allele format => https://raw.githubusercontent.com/nf-core/test-datasets/epitopeprediction/testdata/alleles/alleles.DRB1_01_01.txt
-   - Mouse allele format => https://raw.githubusercontent.com/nf-core/test-datasets/epitopeprediction/testdata/alleles/alleles.H2.txt
+   - Mouse allele format => https://raw.githubusercontent.com/nf-core/test-datasets/epitopeprediction/testdata/alleles/alleles.H2.txt
    - pep.tsv => https://raw.githubusercontent.com/nf-core/test-datasets/epitopeprediction/testdata/peptides/peptides.tsv
    - annotated_variants.tsv => https://raw.githubusercontent.com/nf-core/test-datasets/epitopeprediction/testdata/variants/variants.tsv
    - annotated_variants.vcf => https://raw.githubusercontent.com/nf-core/test-datasets/epitopeprediction/testdata/variants/variants.vcf
@@ -97,33 +98,27 @@ def check_samplesheet(file_in, file_out):
"Line",
line,
)

sample, alleles, filename = lspl[: len(HEADER)]


## Check if the alleles given in the text file are in the right format
if alleles.endswith(".txt"):
with open(alleles, "r") as af:
alleles = ';'.join([al.strip('\n') if check_allele_nomenclature(al) else \
print_error("Allele format is not matching the nomenclature", "Line", line) for al in af.readlines()])

## Check sample name entries
sample, alleles, filename = lspl[: len(HEADER)]

file_extension = os.path.splitext(filename)[1][1::]
## Get annotation of filename column
if filename.endswith(".vcf"):
if filename.endswith(".vcf") | filename.endswith(".vcf.gz"):
anno = "variant"
elif filename.endswith(".tsv"):
elif filename.endswith(".tsv") | filename.endswith(".GSvar"):
## Check if it is a variant annotation file or a peptide file
with open(filename, "r") as tsv:
first_header_col = [col.lower() for col in tsv.readlines()[0].split('\t')][0]
if first_header_col == "id":
anno = "pep"
elif first_header_col == "#chr":
anno = "variant"
elif first_header_col == "#chr":
anno = "variant"
else:
anno = "prot"

sample_info = [sample, alleles, filename, anno]
## Create sample mapping dictionary
sample_info = [sample, alleles, filename, anno, file_extension]
## Create sample mapping dictionary
if sample not in sample_run_dict:
sample_run_dict[sample] = [sample_info]
else:
@@ -137,10 +132,10 @@
    out_dir = os.path.dirname(file_out)
    make_dir(out_dir)
    with open(file_out, "w") as fout:
-       fout.write(",".join(["sample", "alleles", "filename", "anno"]) + "\n")
+       fout.write(",".join(["sample", "alleles", "filename", "anno", "ext"]) + "\n")

        for sample in sorted(sample_run_dict.keys()):
-           for idx, val in enumerate(sample_run_dict[sample]):
+           for val in sample_run_dict[sample]:
                fout.write(",".join(val) + "\n")


@@ -150,4 +145,4 @@ def main(args=None):


if __name__ == "__main__":
-   sys.exit(main())
+   sys.exit(main())
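Taken together, the samplesheet changes mean the reformatted CSV gains an `ext` column and the annotation class is derived from the file name, which now also accepts `.vcf.gz` and `.GSvar`. A small self-contained sketch of that contract follows; the sample name, alleles, and file name are invented, and the TSV header inspection is collapsed into a placeholder:

```python
import os

def classify_input(filename: str) -> str:
    """Rough restatement of the extension-based annotation logic in the updated
    check_samplesheet.py; the peek at the TSV header column is replaced by a
    placeholder here to keep the sketch self-contained."""
    if filename.endswith(".vcf") or filename.endswith(".vcf.gz"):
        return "variant"
    if filename.endswith(".tsv") or filename.endswith(".GSvar"):
        return "variant-or-pep"  # the real script checks the first header column ("id" vs "#chr")
    return "prot"

filename = "sample1.variants.vcf.gz"                # hypothetical input file
file_extension = os.path.splitext(filename)[1][1:]  # -> "gz" for .vcf.gz, as in the script
row = ["sample1", "A*01:01", filename, classify_input(filename), file_extension]
print(",".join(["sample", "alleles", "filename", "anno", "ext"]))
print(",".join(row))
```

Note that `os.path.splitext` only yields the last suffix, which is presumably why the script matches `.vcf.gz` against the full file name rather than the extension column.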
2 changes: 1 addition & 1 deletion bin/check_supported_models.py
@@ -25,7 +25,7 @@ def __main__():
    # NOTE this needs to be updated manually, if other methods should be used in the future
    available_methods = ['syfpeithi', 'mhcflurry', 'mhcnuggets-class-1', 'mhcnuggets-class-2']
    with open(args.versions, 'r') as versions_file:
-       tool_version = [ (row[0], str(row[1][1:])) for row in csv.reader(versions_file, delimiter = "\t") ]
+       tool_version = [ (row[0].split()[0], str(row[1])) for row in csv.reader(versions_file, delimiter = "\t") ]
    # NOTE this needs to be updated, if a newer version will be available via Fred2 and should be used in the future
    tool_version.append(('syfpeithi', '1.0'))
    # get for each method the corresponding tool version
12 changes: 6 additions & 6 deletions bin/decrypt
@@ -2,9 +2,9 @@
# Decrypts stdin -> stdout reading the passphrase from the environment variable
# DECRYPT_PASSPHRASE.
gpg \
-   --quiet \
-   --batch \
-   --yes \
-   --decrypt \
-   --passphrase="$DECRYPT_PASSPHRASE" \
-   --output -
+   --quiet \
+   --batch \
+   --yes \
+   --decrypt \
+   --passphrase="$DECRYPT_PASSPHRASE" \
+   --output -