Code for the assembly approach presented in "Haplotype-resolved assembly of a tetraploid potato genome using long reads and low-depth offspring data".
Run `snakemake` in the `coverage-analysis` directory. Requires `minimap2` and `samtools`.
Note: Requires an installation of the `jellyfish` package. If it is not installed yet, you can install it via `conda install jellyfish`.
`git clone git@github.com:rebeccaserramari/polyploid-potato-assembly.git`
`cd polyploid-potato-assembly/kmer-counting`
`mkdir build; cd build; cmake ..; make`
1. To run the full procedure, which finds unique k-mers in <targetfile>, counts those k-mers in a set of sequencing samples, and merges the resulting files, run `snakemake` within the `kmer-counting` directory.
Make sure to update the config files accordingly!
2. To run the first step individually, i.e. find k-mers of length <len> that are uniquely present in <targetfile> and not in <comparisonfile>:
`./polyassembly_findkmers find_kmers -r <targetfile> -s <comparisonfile> -k <kmerfile> -l <len>`
The resulting k-mers are stored in <kmerfile>.
3. To run the second step individually, i.e. count the unique k-mers in <samplefile>:
`./polyassembly_findkmers count_kmers -s <samplefile> -k <kmerfile> -c <output> -l <len>`
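The two individual steps above can be illustrated with a small conceptual sketch in Python. This is not the repository's implementation: the function names are hypothetical, sequences are plain in-memory strings rather than files, only the forward strand is considered, and "unique" is assumed to mean "present in the target and absent from the comparison" (whether the tool applies further criteria is not stated here).

```python
def kmers(seq, k):
    """Yield every length-k substring of seq (forward strand only)."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def find_unique_kmers(target_seqs, comparison_seqs, k):
    """Step 1 (assumed semantics): k-mers seen in the target but never in the comparison."""
    target = {km for seq in target_seqs for km in kmers(seq, k)}
    comparison = {km for seq in comparison_seqs for km in kmers(seq, k)}
    return target - comparison


def count_kmers(sample_seqs, kmer_set, k):
    """Step 2: count occurrences of each selected k-mer across the sample sequences."""
    counts = dict.fromkeys(kmer_set, 0)
    for seq in sample_seqs:
        for km in kmers(seq, k):
            if km in counts:
                counts[km] += 1
    return counts


# Toy data standing in for <targetfile>, <comparisonfile> and <samplefile>.
unique = find_unique_kmers(["ACGTTG"], ["CGTTGA"], k=3)
counts = count_kmers(["ACGACG", "TTACG"], unique, k=3)
print(unique, counts)
```

Holding all k-mers in Python sets only works for toy inputs; for real genomes a dedicated k-mer counter such as `jellyfish` is the practical choice.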
To run the full clustering procedure, run `snakemake` in the `cluster-phasing` directory.