-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
First trimmer script not completing #1
Comments
Hi Amanda, can you generate a self alignment of the canu contig selected by mitoVGP (or attach the fasta from canu). Usually this is very informative of repeats and other structures that may cause issues to mummer. We can increase the processivity of mummer with the option -z (e.g. 5000 could be a good value), if the problem is just finding the overlap within a nasty repeat. I close the other issue so we keep everything in one place :-) |
ps. a simple way to generate the self alignment is using online blast (paste the sequence twice) |
Ah interesting. Is it possible to add an error output that mummer failed before going on to create the next directories, etc? Here are self-alignments for both Thanks so much for your help resolving this! |
Following up here, increasing the sensitivity "Mummer sensitivity: -z 5000" didn't resolve the issue. The self-alignment links are no longer live on ncbi - what format would you like them in? Here's the canu fasta (renamed as .txt for uploading) instead in case that's easier. |
Hello! Another follow-up as I continue to tug at this. Line 172 of the trimmer script
is looking for those terminal ends, including BEGIN, but show-cords is not reporting this. There's no BEGIN to grep.
Does this imply that the sequence is sufficiently trimmed? Googling around some more it seems that I may have identified the issue as something you also encountered (mummer4/mummer#110). Were you able to resolve this? Is there a way to bypass this? Thank you! |
Hi Amanda! Blast still has the self-alignment function (need to check the option under the sequence box). When run on the first of your canu contigs (tig00000001) it produces this: This thus seems a perfectly legitimate/regular mitochondrial concatamer for what looks like Pectinophora gossypiella. Solving the concatamer is sometimes tricky as you noted, but the specific issue in mummer should be fixed in mitoVGP by now. I am a bit perplexed by your dotplot because based on blast dotplot it looks like the mitogenome is about 15kb while your dotplot shows 18k of unique sequence. Can you share tig00000001_polish2_10x1.fasta? Are you sure you have passed -z 5000? because your nucmer says -b 500 (so at least in this run you did not) Best |
Hi Giulio! Yes, sorry, I was playing directly in the command line to see if I could figure out what was happening. Whether I use -z 500 or -z 5000, the show-coords and trimming process results in that 18007 length fasta, attached here. Very circular looking blast self-alignment attached here too. It could be possible that we need to update programs or scripts from older versions - I could try that too. Thanks again for all of the help! |
Hi Amanda, yes you got this figured out. Mummer is a little sensitive, so it appears in this case is not able to trim the ends correctly, albeit the issue you pointed to should have been fixed by now. I wonder how the output of show coords look like (i.e. what is confusing the trimming script here). 2 quick options: trim manually and you are done (most of the hard work is done at this point anyway), or trim manually while leave 100 bp only for the next step and run the final steps one by one using the scripts included in mitoVGP. Note that since this is not a vertebrate the final orientation step wouldn't work anyway with default settings. All the best, Giulio |
Progress! At least we've determined the issue here. Here's the output of show-coords:
And this is the case whether I use mummer/4.0.0beta2 or mummer=3.23 as provided in mitoVGP_conda_env_pacbio.yml. Is it possibe that the issue isn't fixed? To confirm, if I manually trim, is it still important to polish the trimmed mitocontig with short reads again? Can you explain the rationale behind this? Thank you!! Amanda |
Interesting. It could be that my trick to fix mummer doesn't work in all cases. Basically mummer fails to find any overlap, even obvious ones, if the beginning of the two sequences being considered is exactly identical. Rather than trying to find the bug in mummer, my fix was introducing a small difference in one of the ends HOWEVER what it's probably happening here is that this worked for removing the first concatamers, but the internal ones are still perfectly identical and the fix doesn't work anymore. You could introduce a base change to one end in this partially trimmed sequence, I guess |
I found that I could circularize that contig with blast, using https://github.com/Ag100Pest/modified_mitoVGP/blob/master/circularizationCheck.py, modified from the mitofinder pipeline. I also added an exit statement to my mitoVGP script (https://github.com/Ag100Pest/modified_mitoVGP/blob/master/mitoVGP) to suggest that circularization strategy if the trimmer fails. |
Hello!
Thanks for building this.
I can't figure out why the first trimmer script is not completing. From what I can tell there's no error generated. It appears that mummer successfully runs twice, but the expected ${FNAME}_polish2_10x1_trim1.fasta isn't generated at the end for map10x2 to proceed. Any suggestions to resolve this?
Here's the relevant standard output:
I've attached a couple of other log files here - not sure what's most helpful. Thanks in advance for any ideas.
Amanda
The text was updated successfully, but these errors were encountered: