Installation Error #30

Closed
statspro1 opened this issue Nov 15, 2016 · 15 comments

Comments

@statspro1

install_github("juliasilge/tidytext")
Downloading GitHub repo juliasilge/tidytext@master
from URL https://api.github.com/repos/juliasilge/tidytext/zipball/master
Installing tidytext
--2016-11-15 12:43:46-- https://cran.rstudio.com/src/contrib/tokenizers_0.1.4.tar.gz
Resolving cran.rstudio.com... 52.84.129.209
Connecting to cran.rstudio.com|52.84.129.209|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 50453 (49K) [application/x-gzip]
Saving to: “/tmp/RtmpbgvQJi/tokenizers_0.1.4.tar.gz”

 0K .......... .......... .......... .......... ......... 100% 1.23M=0.04s

2016-11-15 12:43:46 (1.23 MB/s) - “/tmp/RtmpbgvQJi/tokenizers_0.1.4.tar.gz” saved [50453/50453]

Installing tokenizers
'/usr/lib64/R/bin/R' --no-site-file --no-environ --no-save --no-restore
--quiet CMD INSTALL '/tmp/RtmpbgvQJi/devtoolsb60b281702c/tokenizers'
--library='/home/R/x86_64-redhat-linux-gnu-library/3.2'
--install-tests

  * installing *source* package ‘tokenizers’ ...
    ** package ‘tokenizers’ successfully unpacked and MD5 sums checked
    ** libs
    g++ -m64 -std=c++0x -I/usr/include/R -I/usr/local/include -I"/home/R/x86_64-redhat-linux-gnu-library/3.2/Rcpp/include" -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -c RcppExports.cpp -o RcppExports.o
    g++ -m64 -std=c++0x -I/usr/include/R -I/usr/local/include -I"/home/R/x86_64-redhat-linux-gnu-library/3.2/Rcpp/include" -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -c shingle_ngrams.cpp -o shingle_ngrams.o
    shingle_ngrams.cpp: In function ‘Rcpp::CharacterVector generate_ngrams_internal(Rcpp::CharacterVector, uint32_t, uint32_t, std::tr1::unordered_set<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::tr1::hash<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, std::vector<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, std::string)’:
    shingle_ngrams.cpp:28: error: expected initializer before ‘:’ token
    shingle_ngrams.cpp:35: error: expected primary-expression before ‘ngram_out_len’
    shingle_ngrams.cpp:35: error: expected ‘)’ before ‘ngram_out_len’
    shingle_ngrams.cpp:35: error: ‘ngram_out_len’ was not declared in this scope
    shingle_ngrams.cpp:36: error: ‘ngram_out_len’ was not declared in this scope
    shingle_ngrams.cpp:44: error: ‘len’ was not declared in this scope
    shingle_ngrams.cpp: In function ‘Rcpp::ListOf<Rcpp::Vector<16, Rcpp::PreserveStorage> > generate_ngrams_batch(Rcpp::ListOf<const Rcpp::Vector<16, Rcpp::PreserveStorage> >, uint32_t, uint32_t, Rcpp::CharacterVector, Rcpp::String)’:
    shingle_ngrams.cpp:80: error: expected initializer before ‘:’ token
    shingle_ngrams.cpp:83: error: expected primary-expression before ‘for’
    shingle_ngrams.cpp:83: error: expected ‘;’ before ‘for’
    shingle_ngrams.cpp:83: error: expected primary-expression before ‘for’
    shingle_ngrams.cpp:83: error: expected ‘)’ before ‘for’
    make: *** [shingle_ngrams.o] Error 1
    ERROR: compilation failed for package ‘tokenizers’
  * removing ‘/home/R/x86_64-redhat-linux-gnu-library/3.2/tokenizers’
    Error: Command failed (1)

sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS release 6.8 (Final)

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] devtools_1.12.0

loaded via a namespace (and not attached):
[1] httr_1.2.1 R6_2.2.0 tools_3.2.3 withr_1.0.2 curl_2.2
[6] memoise_1.0.0 knitr_1.11 git2r_0.15.0 digest_0.6.10

Thanks!

@juliasilge
Owner

Hmmmm, looks like the tokenizers package (a dependency of tidytext) is not installing correctly for you.
https://github.com/ropensci/tokenizers
I imagine this is related to how the C++ code is being compiled (your C++ compiler?), but I will admit this is not my area of expertise. I am going to point you to the tokenizers repo and see if they can help.

@statspro1
Author

Thanks for the feedback. I already opened an issue on the package page and upgraded CentOS and GCC, but the compilation is still failing for that package.

cat /etc/centos-release
CentOS release 6.8 (Final)

gcc --version
gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-15)

I am wondering if other folks are having the same issue.
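
In case it helps narrow this down: a quick way to check whether the g++ that actually gets invoked supports the C++11 constructs tokenizers uses is to compile a tiny test file with the same flag that appears in the install log above (-std=c++0x). The file name and contents below are only an illustration, not code from tokenizers:

// cxx11_check.cpp -- a hypothetical test file, not part of tokenizers
// Compile it with the same standard flag the log shows, e.g.:
//   g++ -std=c++0x cxx11_check.cpp -o cxx11_check
#include <iostream>
#include <string>
#include <vector>

int main() {
    std::vector<std::string> words;
    words.push_back("hello");
    words.push_back("world");
    // Range-based for is a C++11 feature; a pre-4.6 GCC rejects it with
    // "expected initializer before ':' token", much like the errors above.
    for (const std::string& w : words) {
        std::cout << w << std::endl;
    }
    return 0;
}

If this small file fails with the same error, the compiler being picked up is too old for the package's C++11 code, regardless of what gcc --version reports elsewhere.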

@juliasilge
Owner

I haven't heard of anybody else running into this issue at this point, I'm afraid.

@mo58

mo58 commented Dec 19, 2016

Same error for me. I'm not sure if it is because of gcc version 4.8, but it probably is.
Any solutions?

@mo58

mo58 commented Dec 19, 2016

Tried it with gcc 4.4.7. Still not working.

@juliasilge
Owner

Have you tried getting some help over at the tokenizers package?
https://github.com/ropensci/tokenizers
The tokenizers package is a dependency of tidytext, and it does use compiled C++ code. Are you on Red Hat Linux like the original poster?

@mo58

mo58 commented Jan 3, 2017

Yep, I have already asked the question there. All versions (R, gcc, binutils, etc.) are up to date. Still searching for what is causing the error.

@anglax

anglax commented Feb 24, 2017

Hello, I'm getting the same issue, has anyone come up with a solution?

@wooopenr

I am having the same issue, with the errors below. I am using Linux 6.8 with gcc-c++ 4.4.

shingle_ngrams.cpp:28: error: expected initializer before ‘:’ token
shingle_ngrams.cpp:35: error: expected primary-expression before ‘ngram_out_len’
shingle_ngrams.cpp:35: error: expected ‘)’ before ‘ngram_out_len’
shingle_ngrams.cpp:35: error: ‘ngram_out_len’ was not declared in this scope
shingle_ngrams.cpp:36: error: ‘ngram_out_len’ was not declared in this scope

shingle_ngrams.cpp:44: error: ‘len’ was not declared in this scope

@juliasilge
Owner

Yep, those are errors from the tokenizers package. I see that you have asked over there on an issue in that repo, which is the right way to go. You might also show them the exact errors you are getting, like you did here.

@Ironholds

Just a note - I think this is caused by the dependency tokenizers has on C++11. It may be possible to switch the code to C++98, which would widen the range of compilers it can be built with. I've extended an offer to Lincoln to do that, since it's the kind of makework he can't really be enthusiastic about, but it would help clear a couple of open issues and expand the usability of tokenizers and its dependent packages.
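
To make that concrete: the "expected initializer before ':' token" errors in the log above are what older GCC releases emit for C++11 range-based for loops, and a C++98 port would replace those loops with explicit iterators. A rough sketch of the kind of change involved (illustrative only, not the actual tokenizers code):

#include <string>
#include <vector>

// C++11 form (rejected by GCC releases without range-based for support):
//   for (const std::string& token : tokens) out.push_back(token);

// C++98-compatible equivalent using an explicit iterator:
void copy_tokens(const std::vector<std::string>& tokens,
                 std::vector<std::string>& out) {
    for (std::vector<std::string>::const_iterator it = tokens.begin();
         it != tokens.end(); ++it) {
        out.push_back(*it);
    }
}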

@Ironholds

@statspro1 this should now be fixed; if you run:

devtools::install_github("ropensci/tokenizers")

before installing tidytext, you'll have a version which works swimmingly on CentOS with older GCC versions. It'll be in the next CRAN release, too, but in the meantime grabbing that development version is the solution, and this bug is officially fixed!

@juliasilge
Owner

@Ironholds Thank you SO MUCH for your work in getting this installation issue hammered out. ⭐⭐⭐

@Ironholds

No problem! Although now it looks like I'm clearing tokenizers' bug backlog too. The reward for a job well done, etc., etc. ;)

@github-actions

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

github-actions bot locked and limited conversation to collaborators Mar 26, 2022