Bug/question with --force-all-parents --clades all #35

Open
MarieLataretu opened this issue Dec 1, 2022 · 3 comments

@MarieLataretu

Hi there,

I was wondering why I got no output, so I tried the second example from here:
https://github.com/lenaschimmel/sc2rf#no-output--some-sequences-not-shown

So I added --clades all --force-all-parents to my call, but it seems the two can't be used together:

The number of allowed parents, the number of selected clades, and the --force-all-parents conflict so that the results must be empty.

Also, --clades all can't be used as the last argument before the input, because the input then isn't recognized:

Input sequences must be provided, except when rebuilding the examples. Use --help for more info. Program exits.

I'm not sure whether this is just a problem with my setup or input.


Would you suggest using -c all or -f? My full command is:

  python3 sc2rf.py --csvfile ../${name}_sc2rf.csv --parents 1-35 --breakpoints 1-2 \
                      --max-intermission-count 3 --max-intermission-length 1 \
                      --unique 1 --max-ambiguous 10000 --max-name-length 55 \
                      ### --clades all  --force-all-parents  \ ###
                      ../${fasta}

Best
Marie

@corneliusroemer

I'm sorry I can't help directly but maybe @ktmeaton can? She's the most knowledgeable person about sc2rf as far as I'm aware :)

@ktmeaton

ktmeaton commented Dec 1, 2022

Hi Marie,

Here's my understanding of the problematic parameters.

  • --clades all: consider all clades defined in mapping.csv as potential parents. As of bd2a400, there are 36 potential clades.
  • --parents 1-35: restrict the number of parents in the output to a minimum of 1 and a maximum of 35.

With these arguments and --force-all-parents, --parents 1-35 conflicts with --clades all, which includes 36 clades: 36 falls outside the allowed range of 1 to 35. My simple fix is to set the upper bound of --parents to an extremely high number (e.g. --parents 1-1000). The following command and example data should not generate the warning about conflicting arguments.

Example data of 6 recombinants in GenBank: alignment.fasta.gz (gunzip first)

python3 sc2rf.py alignment.fasta \
	--csvfile tutorial.csv \
	--breakpoints 1-2 \
	--max-intermission-count 3 \
	--max-intermission-length 1 \
	--unique 1 \
	--max-ambiguous 10000 \
	--max-name-length 55 \
	--clades all \
	--force-all-parents \
	--parents 1-1000
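
The conflict described above can be sketched as a simple range check. This is only an illustration of the reasoning, not sc2rf's actual code: the function names `parse_range` and `parents_conflict` are hypothetical, and the sketch assumes that with --force-all-parents the number of selected clades must fall inside the --parents range.

```python
def parse_range(text):
    """Parse a range such as "1-35" (or a single number such as "4") into (low, high)."""
    low, _, high = text.partition("-")
    return int(low), int(high or low)


def parents_conflict(parents, n_clades, force_all_parents):
    """Return True if the settings can only produce empty results.

    Assumption: with --force-all-parents, every selected clade counts as a
    parent, so the clade count itself has to lie within the --parents range.
    """
    low, high = parse_range(parents)
    return force_all_parents and not (low <= n_clades <= high)


# 36 clades (--clades all as of bd2a400) against the two ranges from this thread
print(parents_conflict("1-35", 36, True))
print(parents_conflict("1-1000", 36, True))
```

Under that assumption, the first call reports a conflict and the second does not, which matches the warning disappearing once the upper bound is raised.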

However, with these arguments, no recombination will be detected either, because BA.4 and BA.5 really complicate things. From my understanding, there are very few diagnostic mutations that are found exclusively in BA.2 and not in BA.4 or BA.5 (and few diagnostic mutations found in BA.5 but not in BA.2 or BA.4, etc.). In my experience, BA.2, BA.4, and BA.5 cannot all be included as potential parents at the same time; one of them has to be dropped. So the following debugging parameters should work for the example data:

python3 sc2rf.py alignment.fasta \
	--csvfile tutorial.csv \
	--breakpoints 0-10 \
	--max-intermission-count 3 \
	--max-intermission-length 1 \
	--unique 0 \
	--max-ambiguous 10000 \
	--max-name-length 55 \
	--force-all-parents \
	--parents 1-1000 \
	--clades BA.1 BA.2 BA.5 21J

@MarieLataretu

Thanks for your advice, @ktmeaton!
I'll check what fits best with our current usage.

The other problem was/is that the input, as a positional argument, isn't recognized after any argument that accepts a list.
That makes sense in hindsight; I just expected the README example to work ☺️
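
For anyone hitting the same thing, here is a minimal sketch of the behaviour with Python's argparse. It is not sc2rf's real parser: `build_parser` is a hypothetical stand-in, and it assumes that both the positional input and --clades are declared with `nargs="*"`.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the real parser; only the two relevant arguments.
    parser = argparse.ArgumentParser(prog="sketch")
    parser.add_argument("input", nargs="*")
    parser.add_argument("--clades", "-c", nargs="*", default=[])
    return parser


parser = build_parser()

# Input placed after the list-accepting option: the option consumes it.
last = parser.parse_args(["--clades", "all", "seqs.fasta"])
print("clades:", last.clades, "input:", last.input)

# Input placed first: both arguments are parsed as intended.
first = parser.parse_args(["seqs.fasta", "--clades", "all"])
print("clades:", first.clades, "input:", first.input)
```

In the first call the file name ends up in the clade list and the input stays empty, so putting the input before any list-accepting option (as in the commands above) avoids the problem.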
