Phyluce_align_add_missing_data_designators #34

Closed
mateusf opened this issue May 26, 2015 · 7 comments
@mateusf

mateusf commented May 26, 2015

I'm getting the following error when trying to run phyluce_align_add_missing_data_designators:

Traceback (most recent call last):
  File "/home/camila/anaconda/bin/phyluce_align_add_missing_data_designators", line 237, in <module>
    main()
  File "/home/camila/anaconda/bin/phyluce_align_add_missing_data_designators", line 222, in main
    results = pool.map(add_designators, work)
  File "/home/camila/anaconda/lib/python2.7/multiprocessing/pool.py", line 272, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/home/camila/anaconda/lib/python2.7/multiprocessing/pool.py", line 560, in get
    raise self._value
ValueError: list.remove(x): x not in list

I've tried changing the number of cores (I have an Intel i7 3770K) from 1 to 8, and it didn't work.
What do you recommend?
Best,
Mateus

@brantfaircloth
Member

I'm not sure what the problem is. One thing to keep in mind is that this CPU does not have 8 cores. It has 4 - but that 4 core processor masquerades as having 8 cores (through hyperthreading). Regardless, I don't think that's your problem. Ensure the input directory contains correctly formatted alignments with correct extensions.

Also, as an aside, if you are concatenating data for raxml, you no longer need to add missing data designators before the next step.

@mateusf
Author

mateusf commented May 26, 2015

Thanks Brant,
Yeah, previously I was using the flag --cores 4, but I changed it a few times to check whether the problem was there. I'll double-check the input directories again.
Do you think the problem is related only to my data, and not to the system? I ask because I'm having some problems with git. The following error appears every time I run something:
"""
2015-05-26 15:26:49,730 - phyluce_align_format_nexus_files_for_raxml - INFO - Version: git usage: git [--version] [--help] [-C ] [-c name=value]
[--exec-path[=]] [--html-path] [--man-path] [--info-path]
[-p|--paginate|--no-pager] [--no-replace-objects] [--bare]
[--git-dir=] [--work-tree=] [--namespace=]
[]
....

'git help -a' and 'git help -g' lists available subcommands and some
concept guides. See 'git help ' or 'git help '
to read about a specific subcommand or concept.
"""
However, I was looking into this problem and it may be related to the proxy server here. I'm trying to fix this with the IT staff here.

Since this will be a first assessment of the results, I'll just concatenate the data for raxml now, and then I'll look at this again later.

Thanks again,

@brantfaircloth
Member

No problem. The git thing is something I have to fix. It's called at the beginning of logging to report the hash of the current commit, but it sometimes causes problems when (1) people don't have git installed or (2), as in your case, git is installed but is erroring out, although I don't know why. I want to keep the git part for my own purposes: because I often work from a git version of the code, this is the only way I can track which code ran which analysis.

I can't take a look at this right now, but I will try to square it away soon. In the meantime, if you discover a solution, let me know. I would be happy to implement it.
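
One possible way to keep the commit hash in the log without breaking when git is missing or erroring out is sketched below. This is not phyluce's actual code: the function names `get_git_hash` and `version_string`, the `repo_dir` parameter, and the fallback to a plain version string are all assumptions made for illustration.

```python
import subprocess


def get_git_hash(repo_dir="."):
    """Return the short commit hash for repo_dir, or None if it cannot be read.

    Hypothetical helper: it returns None when git is not installed, when
    repo_dir does not exist, or when git exits with an error.
    """
    try:
        out = subprocess.check_output(
            ["git", "rev-parse", "--short", "HEAD"],
            cwd=repo_dir,
            stderr=subprocess.DEVNULL,  # keep git's usage/error text out of the log
        )
    except (OSError, subprocess.CalledProcessError):
        # OSError: git binary or repo_dir not found.
        # CalledProcessError: git ran but returned a non-zero exit status.
        return None
    return out.decode("utf-8").strip() or None


def version_string(package_version, repo_dir="."):
    """Build the string to log, falling back to the package version alone."""
    git_hash = get_git_hash(repo_dir)
    if git_hash is None:
        return package_version
    return "{} (git {})".format(package_version, git_hash)
```

With something like this, the "Version:" log line would show only the package version when git fails, instead of git's usage message.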

@brantfaircloth brantfaircloth added this to the 1.6 milestone May 27, 2015
@brantfaircloth brantfaircloth self-assigned this May 27, 2015
@cathynewman

I ran into the same error, both on HPC (16 cores) and on a Mac with the default setting (1 core). After adding strategically placed print statements (print local_organisms, print seq.name, and print new_seq_name) within add_gaps_to_align, I figured out that the error was caused by my organism/sequence names. All of my names look like "Pserratus_BDT054". The code was stripping the "Pserratus_" from the alignment sequence names, so it couldn't find that new_seq_name (e.g. "BDT054") in the local_organisms list (which had the full original names) to delete it.

Using the --verbatim flag solved that for me, but it will still throw the same error unless all of your organism/sequence names are all-lowercase.

I'm still trying to get it to work, so it's possible my fiddling around is making things worse, but in short, the problem could be your sample names, if you have them set up similarly (i.e., with underscores).
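
The failure mode described above can be sketched in a few lines. This is only an illustration, not the code in add_gaps_to_align: `strip_prefix` is a hypothetical stand-in for whatever name shortening the script does, the `verbatim` parameter merely mimics the idea behind the --verbatim flag, and the lowercase issue is not modelled. The point is that `list.remove` raises a bare ValueError when the shortened name is absent, and that checking first gives a message that names the offending sequence.

```python
def strip_prefix(name):
    """Hypothetical shortening rule: drop everything up to the first underscore."""
    return name.split("_", 1)[-1]


def missing_taxa(all_organisms, alignment_names, verbatim=False):
    """Return the organisms that are absent from one alignment.

    Raises a ValueError naming the sequence when a (possibly shortened)
    alignment name is not found in the organism list.
    """
    remaining = list(all_organisms)
    for name in alignment_names:
        lookup = name if verbatim else strip_prefix(name)
        if lookup not in remaining:
            raise ValueError(
                "sequence {!r} was looked up as {!r}, which is not in {}".format(
                    name, lookup, remaining
                )
            )
        remaining.remove(lookup)
    return remaining


organisms = ["Pserratus_BDT054", "Pserratus_BDT101"]
alignment = ["Pserratus_BDT054"]

# Names used as-is match the organism list, so only the absent taxon is left.
print(missing_taxa(organisms, alignment, verbatim=True))

# Shortened names ("BDT054") do not match the full names, so the lookup fails.
try:
    missing_taxa(organisms, alignment)
except ValueError as err:
    print("lookup failed:", err)
```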

@brantfaircloth
Member

You also can largely skip this step - if building a concatenated alignment for raxml, you don’t need to add the missing data designators - it will concatenate and fill in correctly.

-b

@nicholasmason

Just found this thread while encountering the same issue. FWIW, you can also use Nexus.combine from BioPython to concatenate nexus files that have missing taxa for certain loci (http://www.biopython.org/wiki/Concatenate_nexus)
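
A minimal sketch of that approach, following the pattern on the linked wiki page, might look like the following. It assumes Biopython is installed; the helper names `find_nexus_files` and `concatenate_nexus`, the ".nex" extension, and the file paths are placeholders of mine, not anything from phyluce.

```python
import glob
import os


def find_nexus_files(directory, extension=".nex"):
    """Return the nexus files in a directory, sorted by name (extension is an assumption)."""
    return sorted(glob.glob(os.path.join(directory, "*" + extension)))


def concatenate_nexus(paths, out_path):
    """Combine several nexus alignments into one file with Nexus.combine."""
    # Imported here so the helper above still works without Biopython installed.
    from Bio.Nexus import Nexus

    # Nexus.combine expects a list of (name, Nexus instance) pairs.
    nexi = [(os.path.basename(path), Nexus.Nexus(path)) for path in paths]
    combined = Nexus.combine(nexi)
    combined.write_nexus_data(filename=out_path)
    return combined
```

Usage would be along the lines of `concatenate_nexus(find_nexus_files("alignments"), "combined.nex")`, where both paths are placeholders.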

@brantfaircloth
Member

This function is not even needed any longer - all of the code will auto-merge alignments having missing taxa.
