Fix phylomatic_local on Windows #17

Closed
sckott opened this issue Apr 13, 2016 · 51 comments · Fixed by #35

@sckott
Contributor

sckott commented Apr 13, 2016

from #13

@sckott sckott self-assigned this Apr 13, 2016
@sckott sckott modified the milestones: v0.2, v0.3 Apr 13, 2016
@aeltonbg

Ok!

waiting for a solution @sckott

Thanks again!

@johnroxton

johnroxton commented Dec 20, 2016

Hello @sckott!

I am trying to use phylomatic_local() and have installed msys, but the "Error: awk is missing, install it first" message still pops up. I wonder if I need to set an environment variable or something similar to let R know gawk is installed.

Sincerely

David

@sckott
Contributor Author

sckott commented Dec 20, 2016

@aeltonbg and @johnroxton

sorry for the delay on this, I think we finally have a solution. we started a new pkg https://github.com/ropensci/phylocomr that is an R client for Phylocom that includes Phylocom itself, and should work across operating systems.

to try the new phylomatic local setup, install this pkg like

devtools::install_github("ropensci/brranching@phylocom-pkg")

which will install phylocomr pkg from github

you need the devtools package for this

let me know if that works for you

@johnroxton

hi @sckott

I installed the package and it appears to work. However, it seems I need to download a megatree first to build a phylogeny from a taxa list, is that right?

@sckott
Contributor Author

sckott commented Dec 21, 2016

see ?phylomatic_trees - which is a list of trees you can choose from that come with the package.

Seems like we can allow user to pass in their own tree. i'll add that.

the storedtree parameter has info on what trees come with the package.

@johnroxton

but this is within the brranching pkg, not with phylocomr. using the ph_phylomatic() function from phylocomr, with phylo="R20120829" as an argument, the function crashes.

@sckott
Contributor Author

sckott commented Dec 21, 2016

sorry, i forgot to say use brranching::phylomatic_local2()

@sckott
Contributor Author

sckott commented Dec 21, 2016

which uses the ph_phylomatic function internally

@johnroxton

works! thank you!

@sckott
Contributor Author

sckott commented Dec 21, 2016

but note that the fxn name will likely change back to just phylomatic_local soon and before next CRAN version

@johnroxton

johnroxton commented Dec 21, 2016

hello @sckott
i tried a bit more and found that for some reason, the phylomatic_local2 function becomes so slow it doesn't return a tree for more than 220 species, while the web interface returns a result immediately for the same set. could be my machine, but, for <220 species, it doesn't take a second to calculate.

@sckott
Contributor Author

sckott commented Dec 21, 2016

@johnroxton can you include or attach here, or send to me, the taxa and tree you're using?

@sckott
Contributor Author

sckott commented Dec 21, 2016

keep in mind that phylomatic web interface is a different set of code than phylomatic in phylocom. the web interface is written in awk while the phylocom implementation is in C

and the input parameters are slightly different between them

@johnroxton

I used the R20120829 tree. Here is a link to a txt-file containing the species I tried
https://www.dropbox.com/s/ykbhog3d96vx8rr/plants_test.txt?dl=0

@sckott
Contributor Author

sckott commented Dec 21, 2016

I get

library(brranching)
taxa <- readLines("taxa.txt")
phylomatic_local2(taxa, storedtree = "R20120829")
#> preparing names...
#> processing with phylomatic...
#> NOTE: 2 taxa not matched:
#> acrocladium_sp.1
#> arthropteris_orientalis
#> 
#> Phylogenetic tree with 228 tips and 68 internal nodes.
#> 
#> Tip labels:
#> 	acanthaceae_sp.1, acanthopale_laxiflora, adhatoda_engleriana, asystasia_gangetica, asystasia_laticapsula, asystasia_mysorensis, ...
#> Node labels:
#> 	euphyllophyte, magnoliales_to_asterales, poales_to_asterales, , , , ...
#> 
#> Rooted; includes branch lengths.

@sckott
Contributor Author

sckott commented Dec 21, 2016

i'll test on a windows machine soon to see if it only happens there

@aeltonbg

aeltonbg commented Dec 21, 2016

Hello @sckott and @johnroxton ,

I am not working on the phylomatic project at the moment, but I am still interested in the results.

After upgrading to the new version and loading library(brranching), I tried to run my data through phylomatic_local2.

I tried with different data; my data set has more than 11000 species.

I got this error:

  • A data example:
    Cerrado <- read.delim("E:/Planilhas R/Taxa_Test.txt")
    class(Cerrado$V1)
    Cerrado$V1 <- as.character(Cerrado$V1)
    -Starting:
    phylomatic_local2(taxa = Cerrado,storedtree = "R20120829")
    preparing names...
    processing with phylomatic...
    Error in matrix(NA, sum(text == "(") + sum(text == ","), 2) :
    invalid 'nrow' value (too large or NA)

I tried with the following data.
This folder has 3 different data sets; all of them contain the same data, only the data format differs.
I tried all of the formats, and all of them gave the same problem!
Data set:
https://www.dropbox.com/sh/p22y7mm2x2eivhq/AADylm3iyAXLHAl9ra4uh2Tza?dl=0

Thanks!!!

@aeltonbg

Hey @sckott

I was thinking that it could be good for you to test this real data set on a Mac or Linux machine!

The data include all species in the Brazilian Savanna biome, including a gymnosperm as the ancestral group!

I don't know if the data are set up correctly for use in R, but I believe they are.

Thanks, best regards!

@sckott
Contributor Author

sckott commented Dec 21, 2016

thanks @aeltonbg I'll have a look

@johnroxton

Hi @sckott

in my case, there must be some error in how the function returns its values. Interestingly, when I stop the calculation because the system appears to be hanging, it still outputs (probably correctly) the names of the species not included.

@johnroxton

johnroxton commented Dec 22, 2016

@sckott

I tracked down the error, and in my case it lies within the phylomatic function. For some reason, the output (stored in the variable out) is truncated. It can contain 8095 characters at most. If I add too many species, the output is cut off somewhere in the middle of a species name, and the final ";" is missing. This forces read.newick into an endless loop.

@aeltonbg

Your error is caused by the wrong taxa input format. Try
Cerrado <- readLines("E:/Planilhas R/Taxa_Test.txt")

This works for me, or at least causes the function to hang because of the bug with the character maximum on Windows systems.
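
The difference between the two ways of reading the file can be checked in base R. This is only a minimal sketch: it assumes, based on the examples in this thread, that the taxa argument should be a plain character vector, and it uses a temporary file with two names borrowed from the tip labels printed earlier rather than the real Taxa_Test.txt.

```r
# Write two taxon names (taken from the tip labels shown earlier in this thread)
# to a temporary file, one name per line.
tmp <- tempfile(fileext = ".txt")
writeLines(c("acanthopale_laxiflora", "asystasia_gangetica"), tmp)

# read.delim() returns a data.frame, not a character vector of names.
as_df <- read.delim(tmp, header = FALSE)

# readLines() returns a plain character vector, one element per line.
as_vec <- readLines(tmp)

is.data.frame(as_df)   # TRUE
is.character(as_vec)   # TRUE
length(as_vec)         # 2
```

If the data are already in a data.frame, passing the single column (for example as_df$V1, converted with as.character() if it is a factor) should be equivalent to the readLines() result.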

@aeltonbg

@johnroxton

I don't think that is the cause! I tried to input the data as character and the error was the same!

See the code: I was converting the data to character exactly as suggested. I also used readLines, and the same error occurred!

Best regards,

Waiting for a solution!!!!

@johnroxton

@aeltonbg
what happens if you change
phylomatic_local2(taxa = Cerrado,storedtree = "R20120829")

to

phylomatic_local2(taxa = Cerrado$V1,storedtree = "R20120829")

@aeltonbg

@johnroxton

I tried it just now.

Same error, see below!!

class(Cerrado$V1)
[1] "factor"
Cerrado$V1 <- as.character(Cerrado$V1)
class(Cerrado$V1)
[1] "character"
library(brranching)
phylomatic_local2(taxa = Cerrado$V1,storedtree = "R20120829")
preparing names...
processing with phylomatic...
Error in matrix(NA, sum(text == "(") + sum(text == ","), 2) :
invalid 'nrow' value (too large or NA)

I believe it is the character limit! Although I tested with 61 species and got the same error!!

@johnroxton

If you use the last file you provided (taxa_Savanna.txt) and use

taxNames<-readLines("taxa_Savanna.txt")

what happens with
phylomatic_local2(taxa=taxNames[1:100],storedtree="R20120829")

phylomatic_local2(taxa=taxNames,storedtree="R20120829")

@sckott
Contributor Author

sckott commented Dec 22, 2016

the error is coming from phytools::read.newick

looking into it

@sckott
Contributor Author

sckott commented Dec 22, 2016

@aeltonbg reinstall and try again, your example now works for me

make sure to set the taxnames parameter according to whether you have just taxon names or the family/genus/species format

@sckott
Contributor Author

sckott commented Dec 22, 2016

@johnroxton

> I traced down the error and in my case it lies within the phylomatic function. For some reason, the output called out is truncated. It can contain 8095 characters at maximum. If I add too many species, the output just cuts somewhere in the middle of a species name, and the final ";" is missing. This forces read.newick into an eternal loop.

which phylomatic function is that? phylomatic_local2? and is this with your example?

@johnroxton

johnroxton commented Dec 22, 2016

@sckott
No, this is the C function, I assume. I figured that out by running phylomatic_local2 line by line. The function invokes the call

out <- phylocomr::ph_phylomatic(taxa = dat_, phylo = tree,
                                lowercase = lowercase, nodes = nodes)

to ph_phylomatic, which calls the phylomatic function:

out <- suppressWarnings(
  phylocomr:::phylomatic(paste0(c(
    paste0("-t ", taxa_file),
    paste0("-f ", phylo_file),
    if (tabular) "-y ",
    if (lowercase) "-l ",
    if (nodes) "-n "
  ), collapse = " "), stdout = TRUE)
)[1]

And this function returns a string of at most 8095 characters on my system. I googled this a bit and found some information on it, but you may know more about that.
This may be related:
http://r.789695.n4.nabble.com/character-to-numeric-conversion-td820947.html
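
A truncated tree string of this kind can be caught before it reaches a Newick parser. The following is a minimal base R sketch, assuming only that a complete Newick string ends with ";". The helper name check_newick_complete is hypothetical and is not part of brranching, phylocomr, or phytools; the example strings are made up from tip labels shown earlier in the thread.

```r
# Hypothetical helper: TRUE if a single string ends with the Newick
# terminator ";" (optionally followed by whitespace), FALSE otherwise.
check_newick_complete <- function(x) {
  stopifnot(is.character(x), length(x) == 1)
  grepl(";\\s*$", x)
}

# A complete toy tree and a copy cut off in the middle of a species name.
complete  <- "(acanthopale_laxiflora,asystasia_gangetica);"
truncated <- "(acanthopale_laxiflora,asystasia_gang"

check_newick_complete(complete)    # TRUE
check_newick_complete(truncated)   # FALSE
```

Stopping with an error when the check fails would at least replace the apparent hang with a message pointing at the truncated output.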

@sckott
Contributor Author

sckott commented Dec 22, 2016

@johnroxton thanks for that.

so the error is with the same data from this #17 (comment) ?

@aeltonbg

Hey @sckott I will reinstall and try again here!

To reinstall, should I use the command

devtools::install_github("ropensci/brranching@phylocom-pkg")

Will that work? Or do I need to download the package directly from GitHub?

Thanks again

@sckott
Contributor Author

sckott commented Dec 22, 2016

@aeltonbg yes, that's right. run that in R.

@aeltonbg

Since the answer is yes, @sckott, here we are again!

I tested, and the result is the same!

Here are all the command lines and files that I used:

First with plain species names, like Caryocar brasilienses ...
Names<-readLines("taxa.txt")

phylomatic_local2(taxa=Names,storedtree="R20120829")
preparing names...
processing with phylomatic...
Error in matrix(NA, sum(text == "(") + sum(text == ","), 2) :
invalid 'nrow' value (too large or NA)

Second with complete taxa names, like caryocaraceae/caryocaraceae/caryocar_brasilienses ...
Names<-readLines("taxa_Savanna.txt")

phylomatic_local2(taxa=Names,storedtree="R20120829")
preparing names...
processing with phylomatic...
Error in matrix(NA, sum(text == "(") + sum(text == ","), 2) :
invalid 'nrow' value (too large or NA)

The files are shared at this link!
https://www.dropbox.com/sh/p22y7mm2x2eivhq/AADylm3iyAXLHAl9ra4uh2Tza?dl=0

Thanks again

@sckott
Contributor Author

sckott commented Dec 22, 2016

@aeltonbg Did you remember to restart your R session after installing the new package version? What does packageVersion('brranching') give you?

@aeltonbg

Hi @sckott

See the results and version!

The error continues.

library(brranching)

Names<-readLines("taxa_Savanna.txt")
phylomatic_local2(taxa=Names,storedtree="R20120829")
preparing names...
processing with phylomatic...
Error in matrix(NA, sum(text == "(") + sum(text == ","), 2) :
invalid 'nrow' value (too large or NA)
packageVersion('brranching')
[1] ‘0.2.2.9150’

@sckott
Contributor Author

sckott commented Dec 23, 2016

Okay, I'll have to look at it on a windows machine. I'll try to get to that soon

@sckott
Contributor Author

sckott commented Jan 6, 2017

some progress - I can at least reproduce the error you get

@sckott
Contributor Author

sckott commented Jan 6, 2017

made some progress, but now I'm down to another error that again only happens on windows and that I can't get an error message for, so not sure what's going on yet. asking about it in phylocom/phylocom#25

@johnroxton

johnroxton commented Jan 6, 2017

@sckott

Thanks for coming back to that! Did you find a way to circumvent the truncation problem in Windows? I'd really like to test if I get the same problems you found (new thread 25)

@sckott
Contributor Author

sckott commented Jan 6, 2017

@johnroxton are you sure the string isn't longer? like if you write it to a file and look at it that way, e.g., writeLines(yourstring, "filepath.txt")

@johnroxton

@sckott yes. it's always 8095 characters long. However, I found the following warning messages after removing the suppressWarnings() wrapper from the call to the phylomatic C function:
Warning messages:
1: running command '"C:/Users/David/Documents/R/R-3.3.2/phylocomr/bin/i386/phylomatic" -t C:\Users\David\AppData\Local\Temp\Rtmp654JrC\taxa_11e07e8f75a -f C:\Users\David\AppData\Local\Temp\Rtmp654JrC\phylo_11e0a816a3d' had status 1
2: In run("phylomatic", args, stdout) :
call to phylomatic failed with status 1
Maybe this helps

@sckott
Contributor Author

sckott commented Jan 6, 2017

@johnroxton right, that's the same error i'm getting i think, opened an issue in the phylocom repo phylocom/phylocom#25 i think it's an issue in the C library, not in the phylocomr package

@johnroxton

johnroxton commented Jan 6, 2017

I was wondering if it has something to do with what the R help says about the system() command: "[...] Output lines of more than 8095 bytes will be split [...]" (accessed by typing ?system in R). It also gives some information about differences between Windows and Unix systems in this respect, but I don't understand that part.
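
If that splitting is what happens here, taking only the first element of the captured output would drop the rest of the tree. The sketch below only simulates the effect in base R with a made-up long line; it does not call phylocomr, and whether this is the actual cause of the truncation on Windows is an assumption. The taxon names are placeholders.

```r
# Build a single long Newick-like line from placeholder taxon names.
long_line <- paste0("(", paste(sprintf("taxon_%04d", 1:2000), collapse = ","), ");")

# Simulate captured output that has been split into pieces of at most
# 8095 characters, as ?system describes for long output lines.
n <- nchar(long_line)
starts <- seq(1, n, by = 8095)
pieces <- substring(long_line, starts, pmin(starts + 8094, n))

# Keeping only the first piece loses the tail, including the final ";".
first_only <- pieces[1]

# Pasting all pieces back together recovers the original line.
rejoined <- paste(pieces, collapse = "")

nchar(first_only)               # 8095
endsWith(first_only, ";")       # FALSE
identical(rejoined, long_line)  # TRUE
```

Under that assumption, collapsing the whole character vector instead of indexing it with [1] would be one thing to try in the wrapper code quoted earlier.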

@sckott
Contributor Author

sckott commented Jan 6, 2017

I don't understand it either! Hoping to sort it out soon

@sckott
Contributor Author

sckott commented Feb 2, 2017

still no response over there, pinged him again

@johnroxton

In the meantime I found a paper, doi:10.1093/jpe/rtv047, where they constructed an updated phylogeny for plants from the Zanne et al. 2014 paper and also provided an R script for tree creation. It worked for me, although it was extremely slow (it took 3 hours to run the script for 1300 spp.)

@sckott
Contributor Author

sckott commented Mar 6, 2017

@johnroxton Sorry about the slow response - still haven't sorted this out - the phylocom maintainer said he couldn't reproduce the problem phylocom/phylocom#25 (comment)

@johnroxton

@sckott Thank you anyway for your effort. It's a pity it doesn't run smoothly on Windows machines. For my part, I found I can use the script I mentioned to create a phylogenetic tree that works OK.

@aeltonbg

aeltonbg commented Mar 9, 2017

@sckott thanks for trying!

I will analyse my data using the same method as @johnroxton!

Regards.
