Generating jaccard distances per kmer #284

Closed
JLC2141 opened this issue Oct 4, 2023 · 3 comments

JLC2141 commented Oct 4, 2023

PopPUNK version: 2.6.0

I am attempting to re-create the poppunk_sketch jaccard distance table as shown in this previous issue: #167 (comment)

However, I am unable to use poppunk_sketch in my current version of poppunk. My current workaround is as follows:

sketchlib sketch -l files.txt -o database -s 1000 -k 15,30,3 --cpus 40
sketchlib query jaccard database -o dists --cpus 40
poppunk_extract_distances.py --distances dists --output distances.tab

In the output from poppunk_extract_distances.py, the "Core" and "Accessory" columns appear to hold the Jaccard distances for the first two k-mer lengths of the kseq specified in the "sketchlib sketch" command.

Is there a simpler approach to output a table of jaccard distances per kmer?

JLC2141 commented Oct 4, 2023

Here's some additional information:
pp-sketchlib v2.1.1

Installation:
PopPUNK install
conda create --name poppunk
conda activate poppunk
python3 -m pip install poppunk

pp-sketchlib install
sudo apt install cmake gfortran libarmadillo-dev libeigen3-dev libopenblas-dev
pip3 install pp-sketchlib

johnlees commented Oct 5, 2023

Have you tried just omitting the output option (-o) of the query step, and redirecting what it prints to a file instead:

sketchlib sketch -l files.txt -o database -s 1000 -k 15,30,3 --cpus 40
sketchlib query jaccard database --cpus 40 > distances.tab
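
For readers unsure what a per-k-mer Jaccard table contains, here is a minimal Python sketch on toy data. It is not sketchlib's implementation: it uses exact k-mer sets rather than sketches, the sequences and function names are hypothetical, and it assumes that -k 15,30,3 means min,max,step with the maximum included. It reports the Jaccard index J alongside 1 - J; which of the two sketchlib prints is not stated in this thread.

```python
# Toy illustration only: exact k-mer sets, not sketchlib's sketch-based estimate.
# All names and sequences here are hypothetical.

def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard index: shared items / all items (1.0 if both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def jaccard_per_k(seq1, seq2, k_min, k_max, k_step):
    """Map each k in k_min..k_max (inclusive, by k_step) to the Jaccard index."""
    return {
        k: jaccard(kmers(seq1, k), kmers(seq2, k))
        for k in range(k_min, k_max + 1, k_step)
    }


if __name__ == "__main__":
    ref = "ACGTACGTTGCAAGCTTAGGCTAACGTTAGCCTAGGATCCGATT"  # made-up sequence
    qry = ref[:20] + "T" + ref[21:]  # same sequence with one substitution
    # Assumed reading of "-k 15,30,3": k = 15, 18, ..., 30
    for k, j in jaccard_per_k(ref, qry, 15, 30, 3).items():
        print(f"k={k}\tJ={j:.3f}\t1-J={1 - j:.3f}")
```

Each printed row corresponds to one k-mer length for one pair of samples.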

JLC2141 commented Oct 5, 2023

Thank you

johnlees closed this as completed Oct 5, 2023