Commit

Add license
rrwick committed Oct 2, 2017
1 parent 6565d4a commit 5083994
Showing 7 changed files with 748 additions and 10 deletions.
674 changes: 674 additions & 0 deletions LICENSE

Large diffs are not rendered by default.

17 changes: 13 additions & 4 deletions analysis.sh
@@ -1,13 +1,22 @@
#!/usr/bin/env bash

# This script conducts read and assembly analysis on a set of ONT reads, comparing them to a
# reference sequence. It is part of https://github.com/rrwick/Basecalling-comparison.
# Copyright 2017 Ryan Wick (rrwick@gmail.com)
# https://github.com/rrwick/Basecalling-comparison

# This program is free software: you can redistribute it and/or modify it under the terms of the
# GNU General Public License as published by the Free Software Foundation, either version 3 of the
# License, or (at your option) any later version. This program is distributed in the hope that it
# will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
# or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You
# should have received a copy of the GNU General Public License along with this program. If not,
# see <http://www.gnu.org/licenses/>.

# This script expects to find the following in the directory where it's run:
# This script conducts read and assembly analysis on a set of ONT reads, comparing them to a
# reference sequence. It expects to find the following in the directory where it's run:
# * reference.fasta: the reference sequence
# * 01_raw_fast5 directory: has all fast5 files
# * 02_basecalled_reads directory: has one or more fastq.gz/fasta.gz read files
# * read_id_to_fast5: a file with two columns: read_ID in the first and fast5 filename in the second
# * read_id_to_fast5: a file with two columns: read_ID and fast5 filename
# * illumina_1.fastq.gz illumina_2.fastq.gz: Illumina reads for the same sample

# Set this to the desired number of threads (for alignment and polishing).
15 changes: 13 additions & 2 deletions chop_up_assembly.py
@@ -1,7 +1,18 @@
#!/usr/bin/env python3
"""
This script takes an assembly as input and produces an output of 'reads': the assembly chopped into pieces.
It is for assessing the distribution of identity over the assembly.
Copyright 2017 Ryan Wick (rrwick@gmail.com)
https://github.com/rrwick/Basecalling-comparison
This script takes an assembly as input and produces an output of 'reads': the assembly chopped into
pieces. It is for assessing the distribution of identity over the assembly.
This program is free software: you can redistribute it and/or modify it under the terms of the GNU
General Public License as published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version. This program is distributed in the hope that it
will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should
have received a copy of the GNU General Public License along with this program. If not, see
<http://www.gnu.org/licenses/>.
"""

import sys
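
The chopping described in the docstring above can be sketched roughly as follows. This is only an illustration: the body of the script is not shown in this diff, so the function name `chop_sequence`, the `piece_length` parameter and the choice of consecutive, non-overlapping windows are all assumptions.

```python
def chop_sequence(seq, piece_length):
    """Yield (start, piece) tuples that cover seq in consecutive windows.

    Hypothetical helper: the real script's piece size and naming are not
    visible in this commit.
    """
    for start in range(0, len(seq), piece_length):
        # The final piece may be shorter than piece_length.
        yield start, seq[start:start + piece_length]


# Toy 'assembly' chopped into pieces of at most 4 bases.
pieces = list(chop_sequence("ACGTACGTAC", 4))
```

Each piece could then be aligned to the reference like a read, which is what makes a per-piece identity distribution possible.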
19 changes: 15 additions & 4 deletions fix_read_names.py
@@ -1,7 +1,10 @@
#!/usr/bin/env python3
"""
This script adjusts read headers to be consistent between basecallers and compatible with Nanopolish.
After running, each read header should be in this format:
Copyright 2017 Ryan Wick (rrwick@gmail.com)
https://github.com/rrwick/Basecalling-comparison
This script adjusts read headers to be consistent between basecallers and compatible with
Nanopolish. After running, each read header should be in this format:
5a8d447e-84e2-4f6f-922c-5ad7269f688c_Basecall_1D_template 5210_N125509_20170425_FN2002039725_MN19691_sequencing_run_klebs_033_restart_87298_ch152_read14914_strand
It also sorts the reads alphabetically by their new headers and removes 0-length reads.
@@ -11,11 +14,19 @@
It can take either fasta or fastq input and will output in the same format.
The read_id_to_fast5 file is a tab-delimited file with read IDs in the first column and fast5 filenames in the second.
For example:
The read_id_to_fast5 file is a tab-delimited file with read IDs in the first column and fast5
filenames in the second. For example:
0000974e-e5b3-4fc2-8fa5-af721637e66c_Basecall_1D_template 5210_N125509_20170425_FN2002039725_MN19691_sequencing_run_klebs_033_restart_87298_ch173_read25236_strand.fast5
00019174-2937-4e85-b104-0e524d8a7ba7_Basecall_1D_template 5210_N125509_20170424_FN2002039725_MN19691_sequencing_run_klebs_033_75349_ch85_read2360_strand.fast5
000196f6-6041-49a5-9724-77e9d117edbe_Basecall_1D_template 5210_N125509_20170425_FN2002039725_MN19691_sequencing_run_klebs_033_restart_87298_ch200_read1975_strand.fast5
This program is free software: you can redistribute it and/or modify it under the terms of the GNU
General Public License as published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version. This program is distributed in the hope that it
will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should
have received a copy of the GNU General Public License along with this program. If not, see
<http://www.gnu.org/licenses/>.
"""


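As a minimal sketch of the behaviour the fix_read_names.py docstring describes, the snippet below parses a tab-delimited read_id_to_fast5 file into a dictionary, then drops 0-length reads and sorts the rest by header. The function names `load_read_id_to_fast5` and `sort_and_filter`, and the representation of reads as `(header, sequence)` tuples, are assumptions for illustration and are not taken from the script itself.

```python
def load_read_id_to_fast5(lines):
    """Parse tab-delimited lines into a dict of read ID -> fast5 filename."""
    mapping = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # First column is the read ID, second is the fast5 filename.
        read_id, fast5_name = line.split("\t")[:2]
        mapping[read_id] = fast5_name
    return mapping


def sort_and_filter(reads):
    """Drop 0-length reads, then sort the rest alphabetically by header."""
    return sorted((r for r in reads if len(r[1]) > 0), key=lambda r: r[0])


# One line in the format shown in the docstring (columns separated by a tab).
example_lines = [
    "0000974e-e5b3-4fc2-8fa5-af721637e66c_Basecall_1D_template\t"
    "5210_N125509_20170425_FN2002039725_MN19691_sequencing_run_klebs_033"
    "_restart_87298_ch173_read25236_strand.fast5\n",
]
mapping = load_read_id_to_fast5(example_lines)

# Toy reads as (header, sequence) tuples; the empty one is removed.
reads = [("b_read", "ACGT"), ("a_read", "GG"), ("c_read", "")]
kept = sort_and_filter(reads)
```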
11 changes: 11 additions & 0 deletions nanopolish_slurm_wrapper.py
@@ -1,8 +1,19 @@
#!/usr/bin/env python3
"""
Copyright 2017 Ryan Wick (rrwick@gmail.com)
https://github.com/rrwick/Basecalling-comparison
This script is a Nanopolish wrapper I wrote for use on my SLURM-managed cluster. It does the read
alignment, launches Nanopolish jobs, waits for them to finish and merges them together. If any
parts of the assembly fail in Nanopolish it replaces them with Ns so the merge can complete.
This program is free software: you can redistribute it and/or modify it under the terms of the GNU
General Public License as published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version. This program is distributed in the hope that it
will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should
have received a copy of the GNU General Public License along with this program. If not, see
<http://www.gnu.org/licenses/>.
"""

import sys
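
The fallback mentioned in the docstring (replacing failed parts with Ns so the merge can complete) might look something like the sketch below. The function name `merge_polished_ranges` and the `(length, polished_sequence_or_None)` input format are hypothetical; the wrapper's actual data structures are not shown in this diff.

```python
def merge_polished_ranges(ranges):
    """Concatenate polished ranges, substituting Ns for any that failed.

    `ranges` is a list of (original_length, polished_sequence_or_None)
    tuples in assembly order. This input format is an assumption.
    """
    merged = []
    for length, polished in ranges:
        if polished is None:
            # Failed range: keep a placeholder of the original length.
            merged.append("N" * length)
        else:
            merged.append(polished)
    return "".join(merged)


# The middle range failed, so it becomes three Ns.
merged = merge_polished_ranges([(4, "ACGT"), (3, None), (2, "TT")])
```

Padding with Ns keeps a placeholder for each failed region at its original length, so the merged sequence still spans the whole assembly even though those bases are unknown.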
11 changes: 11 additions & 0 deletions read_length_identity.py
@@ -1,5 +1,8 @@
#!/usr/bin/env python3
"""
Copyright 2017 Ryan Wick (rrwick@gmail.com)
https://github.com/rrwick/Basecalling-comparison
This script produces a table with information for each read.
Inputs:
@@ -13,6 +16,14 @@
half aligns, only the aligned parts are used to determine the read identity.
Relative length is included to see if the reads are systematically too short or too long.
This program is free software: you can redistribute it and/or modify it under the terms of the GNU
General Public License as published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version. This program is distributed in the hope that it
will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should
have received a copy of the GNU General Public License along with this program. If not, see
<http://www.gnu.org/licenses/>.
"""


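The two per-read metrics mentioned in the read_length_identity.py docstring could be computed along these lines. Part of the docstring is elided in this diff, so the exact definitions are assumptions: identity is taken here as matching bases over alignment columns, and relative length as aligned read bases over aligned reference bases, both as percentages. The function names are hypothetical.

```python
def read_identity(matches, alignment_length):
    """Percent identity over the aligned part of a read (assumed definition)."""
    if alignment_length == 0:
        return 0.0  # nothing aligned, so report zero identity
    return 100.0 * matches / alignment_length


def relative_length(read_aligned_bases, ref_aligned_bases):
    """Aligned read length as a percentage of the reference span it covers.

    Values below 100 suggest reads that are systematically too short,
    values above 100 reads that are too long (assumed definition).
    """
    return 100.0 * read_aligned_bases / ref_aligned_bases


identity = read_identity(matches=90, alignment_length=100)
rel_len = relative_length(read_aligned_bases=98, ref_aligned_bases=100)
```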
11 changes: 11 additions & 0 deletions read_table.py
@@ -1,7 +1,18 @@
#!/usr/bin/env python3
"""
Copyright 2017 Ryan Wick (rrwick@gmail.com)
https://github.com/rrwick/Basecalling-comparison
This script takes a single argument (a directory containing fast5 files) and prints a table to
stdout with some basic read info.
This program is free software: you can redistribute it and/or modify it under the terms of the GNU
General Public License as published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version. This program is distributed in the hope that it
will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should
have received a copy of the GNU General Public License along with this program. If not, see
<http://www.gnu.org/licenses/>.
"""

import h5py
