ImportError: /lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found when loading Pandas #1467

Closed
TPDeramus opened this issue Aug 30, 2023 · 24 comments


@TPDeramus

Hi Experts.

I have seen this posted a few times but I haven't seen a clear answer, so forgive me if reposting (see: https://community.rstudio.com/t/reticulate-issues-in-rstudio-server/16555 & https://stackoverflow.com/questions/49875588/importerror-lib64-libstdc-so-6-version-cxxabi-1-3-9-not-found & #428)

I am trying to load a .pickle file with a pandas dataframe into R with reticulate.

However, whenever I try to import pandas, the following error gets thrown:

#Error

> pd <- import("pandas")
> Error in py_module_import(module, convert = convert) : ImportError: /lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by /Users/thomas/Python/mambaforge/lib/python3.10/site-packages/pandas/_libs/window/aggregations.cpython-310-x86_64-linux-gnu.so)

Which I assume stems from trying to call the "default" library path on the server where our RStudio-Server instance is initialized.

I have tried multiple methods to address this in both the rmarkdown in the Quarto document I'm using, and the .Rprofile that is called prior to initializing any part of the project as described below:

#In .Rprofile
Sys.setenv(PATH = paste("/Users/thomas/Python/mambaforge/bin", Sys.getenv("PATH"), sep = ":"))
Sys.setenv(RETICULATE_PYTHON = "/Users/thomas/Python/mambaforge/bin/python3")
Sys.setenv(LD_LIBRARY_PATH = "/Users/thomas/Python/mambaforge/lib/")
source("renv/activate.R")
#use_python("/Users/thomas/Python/mambaforge/bin/python3")
#In Markdown file
PATH <- Sys.getenv("PATH")
RETICULATE_PYTHON <- Sys.getenv("RETICULATE_PYTHON")
LD_LIBRARY_PATH <- Sys.getenv("LD_LIBRARY_PATH")
use_condaenv(condaenv = "base")
system("export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/Users/thomas/Python/mambaforge/lib/")
pd <- import("pandas")

However, the error seems to persist regardless of what path redirection I may call or what environment variables I may actively set.

Would anyone know how I might be able to address this error?

Any help would be greatly appreciated.

Below is the versioning information for the RStudio-Server session, R installation, py_config() output, and server OS.

RStudio Server

2022.07.0+548 "Spotted Wakerobin" Release (34ea3031089fa4e38738a9256d6fa6d70629c822, 2022-07-06) for CentOS 7
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36

R Version

> version
platform       x86_64-pc-linux-gnu         
arch           x86_64                      
os             linux-gnu                   
system         x86_64, linux-gnu           
status                                     
major          4                           
minor          2.1                         
year           2022                        
month          06                          
day            23                          
svn rev        82513                       
language       R                           
version.string R version 4.2.1 (2022-06-23)
nickname       Funny-Looking Kid

py_config() output:

python:         /Users/thomas/Python/mambaforge/bin/python3
libpython:      /Users/thomas/Python/mambaforge/lib/libpython3.10.so
pythonhome:     /Users/thomas/Python/mambaforge:/Users/thomas/Python/mambaforge
version:        3.10.10 | packaged by conda-forge | (main, Mar 24 2023, 20:08:06) [GCC 11.3.0]
numpy:          /Users/thomas/Python/mambaforge/lib/python3.10/site-packages/numpy
numpy_version:  1.25.1
pandas:         /Users/thomas/Python/mambaforge/lib/python3.10/site-packages/pandas

NOTE: Python version was forced by RETICULATE_PYTHON

OS Version:

NAME="Red Hat Enterprise Linux Server"
VERSION="7.9 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.9"
PRETTY_NAME="Red Hat Enterprise Linux"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.9:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.9
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.9"
@t-kalinowski
Member

This is fundamentally a binary incompatibility between conda-provided and not-conda-provided binaries. There are various workarounds for the very motivated (you're on the right path with LD_LIBRARY_PATH), but the best recommendation is to completely side-step the issue by switching from conda environments to python virtual environments.

virtualenv_create()
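
One way to confirm the mismatch is to list the `CXXABI_*` version tags that a given `libstdc++.so.6` actually provides, and compare the system copy with the conda copy. The following is a minimal Python sketch of that check (roughly what `strings libstdc++.so.6 | grep CXXABI` does). The function names are illustrative, and the conda library path mentioned afterwards is an assumption based on the paths in this thread.

```python
import re
from pathlib import Path
from typing import List

# Version tags are stored as plain ASCII strings inside the shared library.
_CXXABI = re.compile(rb"CXXABI_(\d+(?:\.\d+)*)")


def cxxabi_versions(blob: bytes) -> List[str]:
    """Return the numeric CXXABI versions found in raw library bytes, sorted."""
    found = {m.group(1).decode() for m in _CXXABI.finditer(blob)}
    return sorted(found, key=lambda v: tuple(int(p) for p in v.split(".")))


def supports(path: str, needed: str = "1.3.9") -> bool:
    """True if the library at `path` advertises the needed CXXABI version."""
    return needed in cxxabi_versions(Path(path).read_bytes())
```

If `supports("/lib64/libstdc++.so.6")` is `False` while the same call on the conda copy (assumed to be `/Users/thomas/Python/mambaforge/lib/libstdc++.so.6`) is `True`, that would be consistent with the ImportError above.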

@TPDeramus
Author

I'll give making a new one a shot, but I do need to pin the pandas version.

Might I ask what the argument would look like to, say, install pandas=2.0.3?

Or better yet, how to pass an environment.yml to the command to set all of that up?

@t-kalinowski
Member

Either of these should work:

virtualenv_create(packages = "pandas==2.0.3")
# or 
virtualenv_create(requirements = "path/to/requirements.txt") 

@TPDeramus
Author

Thanks much!

Will give it a shot.

@TPDeramus
Author

So this makes the environment but it doesn't install the packages.

Further, it puts the environment in the home directory, and I'd rather it be in the pwd (e.g. getwd() or my renv folder).

Could you show a command that's a little more specific as to how to do this? I'm not seeing anything jump out in the documentation.

requirements.txt is attached, but adapted from an environment.yml so the == placement may be wrong.
requirements.txt
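
On the `==` question: conda pins use a single `=` (optionally followed by a build string), while pip expects `==`. A minimal Python sketch of that conversion is below. The helper name `conda_to_pip` is hypothetical, and it only rewrites the pin syntax; it does not account for packages whose conda and PyPI names differ.

```python
import re

# name=version or name=version=build (conda style); anything else is left alone.
_CONDA_PIN = re.compile(r"^([A-Za-z0-9_.\-]+)=([^=<>!~]+)(?:=.*)?$")


def conda_to_pip(spec: str) -> str:
    """Rewrite a conda-style pin as a pip requirement; pass other specs through."""
    spec = spec.strip()
    match = _CONDA_PIN.match(spec)
    if match:
        # Drop any conda build string and use pip's exact-match operator.
        return f"{match.group(1)}=={match.group(2)}"
    return spec
```

Lines that are already valid for pip, such as `pandas==2.0.3` or `numpy>=1.23`, are returned unchanged.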

@t-kalinowski
Member

t-kalinowski commented Aug 30, 2023

library(reticulate)

# First, make sure python 3.10 is installed on the system.
if (is.null(virtualenv_starter("3.10")))
  install_python("3.10:latest")

# create the requirements.txt file for pip
cat(file = "requirements.txt", "
numpy==1.23.4
oauthlib==3.2.2
pandas==1.5.3
")

virtualenv_create("./venv", version = "3.10", force = TRUE,
                  requirements = "requirements.txt")

Alternatively, instead of creating a requirements file, you can pass the packages directly to the virtualenv_create() call, like this:

virtualenv_create(
  "./venv",
  version = "3.10",
  force = TRUE,
  packages = c(
    "numpy==1.23.4",
    "oauthlib==3.2.2",
    "pandas==1.5.3"
  )
)

(It's hard to tell whether the oauthlib pin is actually worth keeping. My guess is you may want to either unpin it (simply replace "oauthlib==3.2.2" with "oauthlib") or, if it's being pulled in as a dependency of another package, remove it completely and let pip pick the appropriate version.)

@TPDeramus
Author

TPDeramus commented Aug 30, 2023

I also managed to get it working with something like this:
virtualenv_create(version = "/Users/thomas/Python/mambaforge/bin/python3", virtualenv = paste0(getwd(),"/renv/python_reticulate"), requirements = "requirements.txt")

Thank you so much for your patience.

@TPDeramus
Author

TPDeramus commented Aug 30, 2023

Just to further complicate things, it appears that this still isn't pointing to the correct library path (probably because I used the existing mamba environment tools to create the reticulate environment).

It looks something like this in the .Rprofile:

source("renv/activate.R")
Sys.setenv(TMPDIR = "/Users/thomas/tmp")
library(reticulate)
venv_path <- paste0(getwd(), "/python_reticulate")
# Create the virtualenv only if it does not exist yet.
if (!file.exists(venv_path)) {
  virtualenv_create(venv_path, version = "/Users/thomas/Python/mambaforge/bin/python3", force = TRUE, requirements = "requirements.txt")
}
# Both branches of the original if/else did the same thing, so it is done once here.
#use_virtualenv(venv_path, required=TRUE)
Sys.setenv(RETICULATE_PYTHON = "/Users/thomas/Gits/project/python_reticulate/bin/python3")

However, regardless of whether I force the version using RETICULATE_PYTHON or call use_virtualenv(), I get the same truncated error message and the following mappings:

py_config()
python:         /Users/thomas/Gits/project/python_reticulate/bin/python3
libpython:      /Users/thomas/Python/mambaforge/lib/libpython3.10.so
pythonhome:     /Users/thomas/Gits/project/python_reticulate:/Users/thomas/Gits/project/python_reticulate
version:        3.10.10 | packaged by conda-forge | (main, Mar 24 2023, 20:08:06) [GCC 11.3.0]
numpy:          /Users/thomas/Gits/project/python_reticulate/lib/python3.10/site-packages/numpy
numpy_version:  1.23.4

NOTE: Python version was forced by RETICULATE_PYTHON
Warning messages:
1: In base::sprintf(fmt, ...) :
  one argument not used by format 'The request to `use_python("%s")` will be ignored because the'
2: The request to `use_python("/PHShome/tpd10/Python/mambaforge/bin/python")` will be ignored because the 

Am I going to need to do a fresh install of 3.10 for this to work?

@t-kalinowski
Member

t-kalinowski commented Aug 30, 2023

Can you please paste the full output you see after running the code I provided in my previous comment?


virtualenv_create(version = "/Users/thomas/Python/mambaforge/bin/python3"

You're creating the virtual env from a conda build of Python here; don't do that. It defeats the purpose of switching away from conda environments.
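
A virtual environment reuses the base interpreter it was created from, which is why `py_config()` above still reports a libpython under the mambaforge directory. A small Python sketch for checking this from inside the environment's interpreter follows; the function name is illustrative.

```python
import sys
import sysconfig


def interpreter_origin() -> dict:
    """Report where the running interpreter and its base installation live."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,            # the venv directory when inside a venv
        "base_prefix": sys.base_prefix,  # the installation the venv was created from
        "in_venv": sys.prefix != sys.base_prefix,
        "libdir": sysconfig.get_config_var("LIBDIR"),  # may be None on some builds
    }
```

If `base_prefix` points into a conda or mamba directory, the venv is still backed by the conda build and the original libstdc++ conflict can be expected to remain.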

Install python through one of these methods:

  1. call reticulate::install_python() (all platforms)
  2. Download installers from www.python.org (mac and windows)
  3. on Linux, download pre-built binaries from https://github.com/rstudio/python-builds, or use the system python at /usr/bin/python

Also, it seems that you're using two approaches to python installation management, those provided by reticulate and others provided by renv. They overlap but are not identical, and serve different needs. Based on the error message above, it seems renv is setting RETICULATE_PYTHON, overriding other actions. https://rstudio.github.io/renv/reference/use_python.html

My recommendation is to get it working without renv in the mix, and then introduce renv once you have everything working.

@TPDeramus
Author

The ones from your previous comments look like this:

> virtualenv_create(requirements = "requirements.txt") 
Using Python: /usr/bin/python3.6
Creating virtual environment '~/.virtualenvs/r-reticulate' ... 
+ /usr/bin/python3.6 -m venv /Users/thomas/.virtualenvs/r-reticulate
Done!
Installing packages: pip, wheel, setuptools
+ /Users/thomas/.virtualenvs/r-reticulate/bin/python -m pip install --upgrade --no-user pip wheel setuptools

Usage:   
  /Users/thomas/.virtualenvs/r-reticulate/bin/python -m pip install [options] <requirement specifier> [package-index-options] ...
  /Users/thomas/.virtualenvs/r-reticulate/bin/python -m pip install [options] -r <requirements file> [package-index-options] ...
  /Users/thomas/.virtualenvs/r-reticulate/bin/python -m pip install [options] [-e] <vcs project url> ...
  /Users/thomas/.virtualenvs/r-reticulate/bin/python -m pip install [options] [-e] <local project path> ...
  /Users/thomas/.virtualenvs/r-reticulate/bin/python -m pip install [options] <archive url/path> ...

no such option: --no-user
Error: Error installing package(s): 'pip', 'wheel', 'setuptools'
> library(reticulate)
> 
> # First, make sure python 3.10 is installed on the system.
> if (is.null(virtualenv_starter("3.10")))
+     install_python("3.10:latest")
trying URL 'https://github.com/pyenv/pyenv-installer/raw/master/bin/pyenv-installer'
Content type 'text/plain; charset=utf-8' length 2827 bytes
==================================================
downloaded 2827 bytes

Installing pyenv ...
sh: ./pyenv-installer: Permission denied
Error in pyenv_bootstrap_unix() : installation of pyenv failed
virtualenv_create(requirements = "requirements.txt") 
virtualenv: ~/.virtualenvs/r-reticulate
> virtualenv_create(
+     "./venv",
+     version = "3.10",
+     force = TRUE,
+     packages = c(
+         "numpy==1.23.4",
+         "oauthlib==3.2.2",
+         "pandas==1.5.3"
+     )
+ )
Error in stop_no_virtualenv_starter(version) : 
  Suitable Python installation for creating a venv not found.
Requested version constraint: 3.10
Please install Python with one of following methods:
- https://github.com/rstudio/python-builds/
- reticulate::install_python(version = '<version>')

The latest throws the following:

> reticulate::install_python()
trying URL 'https://github.com/pyenv/pyenv-installer/raw/master/bin/pyenv-installer'
Content type 'text/plain; charset=utf-8' length 2827 bytes
==================================================
downloaded 2827 bytes

Installing pyenv ...
sh: ./pyenv-installer: Permission denied
Error in pyenv_bootstrap_unix() : installation of pyenv failed

I don't have sudo permissions on this machine and the base Python installation is older. That is why I was using Conda/Mamba.

I can also confirm the renv setup and renv.lock aren't calling any python variables.

@t-kalinowski
Member

Installing pyenv ...
sh: ./pyenv-installer: Permission denied
Error in pyenv_bootstrap_unix() : installation of pyenv failed

This is a surprising error - you should not need sudo permissions for install_python() to succeed. I'll see if I can reproduce in docker.

@t-kalinowski
Member

I'm guessing that with this older version of RHEL, you/IT must be installing such a recent version of R manually, perhaps using https://github.com/rstudio/r-builds. While this issue is in flight, I suggest working on taking a similar approach with installing a more recent version of Python, perhaps using the sister repo: https://github.com/rstudio/python-builds/

@t-kalinowski
Member

no such option: --no-user
Error: Error installing package(s): 'pip', 'wheel', 'setuptools'

This error is due to the ancient version of pip that comes with the RHEL system. You can work around this by calling virtualenv_create() with packages = FALSE, and then manually upgrading pip in the virtualenv.

virtualenv_create(..., packages = FALSE)
system2(virtualenv_python("r-reticulate"), "-m pip install --upgrade pip wheel setuptools")
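
To see which pip a given interpreter carries before and after the upgrade, something like the following Python sketch can help. It only parses the output of `python -m pip --version`; the function names are illustrative, and no particular minimum pip version for `--no-user` is assumed here.

```python
import re
import subprocess
import sys
from typing import Tuple


def parse_pip_version(output: str) -> Tuple[int, ...]:
    """Extract the version tuple from text like 'pip 9.0.3 from ... (python 3.6)'."""
    match = re.match(r"pip (\d+(?:\.\d+)*)", output.strip())
    if match is None:
        raise ValueError(f"unrecognized pip version output: {output!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def pip_version(python: str = sys.executable) -> Tuple[int, ...]:
    """Ask the given interpreter which pip version it has."""
    result = subprocess.run(
        [python, "-m", "pip", "--version"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return parse_pip_version(result.stdout)
```

Passing the path returned by `virtualenv_python()` as `python` checks the pip inside the virtualenv rather than the system one.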

@t-kalinowski
Member

t-kalinowski commented Aug 30, 2023

I'm seeing a different error with install_python() on centos7, related to openssl. This seems like a pyenv bug, unrelated to the user permissions (maybe this? pyenv/pyenv#950) (I can run this as a regular user without sudo permissions).

[regularuser@19d370535cfc ~]$ /opt/R/4.2.1/bin/R -q -e 'reticulate::install_python("3.10:latest")'
> reticulate::install_python("3.10:latest")
+ /home/regularuser/.local/share/r-reticulate/pyenv/bin/pyenv update
+ /home/regularuser/.local/share/r-reticulate/pyenv/bin/pyenv install --skip-existing 3.10.13
Downloading Python-3.10.13.tar.xz...
-> https://www.python.org/ftp/python/3.10.13/Python-3.10.13.tar.xz
Installing Python-3.10.13...
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/regularuser/.pyenv/versions/3.10.13/lib/python3.10/ssl.py", line 99, in <module>
    import _ssl             # if we can't import it, let the error propagate
ModuleNotFoundError: No module named '_ssl'
ERROR: The Python ssl extension was not compiled. Missing the OpenSSL lib?

Please consult to the Wiki page to fix the problem.
https://github.com/pyenv/pyenv/wiki/Common-build-problems


BUILD FAILED (CentOS Linux 7 using python-build 20180424)

Inspect or clean up the working tree at /tmp/python-build.20230830192336.53400
Results logged to /tmp/python-build.20230830192336.53400.log

Last 10 log lines:
	LD_LIBRARY_PATH=/tmp/python-build.20230830192336.53400/Python-3.10.13:/opt/R/4.2.1/lib/R/lib:/usr/local/lib:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.372.b07-1.el7_9.x86_64/jre/lib/amd64/server ./python -E -m ensurepip \
		$ensurepip --root=/ ; \
fi
Looking in links: /tmp/tmpfu4pcsld
Processing /tmp/tmpfu4pcsld/setuptools-65.5.0-py3-none-any.whl
Processing /tmp/tmpfu4pcsld/pip-23.0.1-py3-none-any.whl
Installing collected packages: setuptools, pip
  WARNING: The scripts pip3 and pip3.10 are installed in '/home/regularuser/.pyenv/versions/3.10.13/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed pip-23.0.1 setuptools-65.5.0
Error: installation of Python 3.10.13 failed
Execution halted

@TPDeramus
Author

We've hit similar issues in the past related to make/g++ incompatibilities with the existing RHEL7 libraries (the Sys.setenv(TMPDIR = "/Users/thomas/tmp") call is actually a vestige of that).

Our workaround for this was to use the Posit Package Manager for the RStudio-Server sessions:
https://packagemanager.posit.co/client/#/

e.g. options(repos = c(REPO_NAME = "https://packagemanager.posit.co/cran/latest"))

So far, this is the only way we have been able to get reticulate to work with our setup, outside of trying to build R in a conda environment (which created its own problems).

There may be a way to get a pre-compiled build of Python 3.10 from the same repository.

Will comment back if I find anything related to that after I try a few more of your suggestions.

@TPDeramus
Author

So despite seemingly calling python 3.6, this initially showed some promise:

> virtualenv_create(paste0(getwd(),"/python_reticulate"), force = TRUE, packages = FALSE)
Using Python: /usr/bin/python3.6
Creating virtual environment '/Users/thomas/Gits/nlp-modeling/python_reticulate' ... 
+ /usr/bin/python3.6 -m venv /Users/thomas/Gits/nlp-modeling/python_reticulate
Done!
Virtual environment '/Users/thomas/Gits/nlp-modeling/python_reticulate' successfully created.
> system2(virtualenv_python("/Users/thomas/Gits/nlp-modeling/python_reticulate"), "-m pip install --upgrade pip wheel setuptools")
Cache entry deserialization failed, entry ignored
Collecting pip
  Downloading https://files.pythonhosted.org/packages/a4/6d/6463d49a933f547439d6b5b98b46af8742cc03ae83543e4d7688c2420f8b/pip-21.3.1-py3-none-any.whl (1.7MB)
Collecting wheel
  Cache entry deserialization failed, entry ignored
  Downloading https://files.pythonhosted.org/packages/27/d6/003e593296a85fd6ed616ed962795b2f87709c3eee2bca4f6d0fe55c6d00/wheel-0.37.1-py2.py3-none-any.whl
Cache entry deserialization failed, entry ignored
Collecting setuptools
  Downloading https://files.pythonhosted.org/packages/b0/3a/88b210db68e56854d0bcf4b38e165e03be377e13907746f825790f3df5bf/setuptools-59.6.0-py3-none-any.whl (952kB)
Installing collected packages: pip, wheel, setuptools
  Found existing installation: pip 9.0.3
    Uninstalling pip-9.0.3:
      Successfully uninstalled pip-9.0.3
  Found existing installation: setuptools 39.2.0
    Uninstalling setuptools-39.2.0:
      Successfully uninstalled setuptools-39.2.0
Successfully installed pip-21.3.1 setuptools-59.6.0 wheel-0.37.1
You are using pip version 21.3.1, however version 23.2.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

But it seems to be defaulting to the conda installation every time due to some parameter.

> use_virtualenv(paste0(getwd(),"/python_reticulate"), required=TRUE)
Warning messages:
1: In base::sprintf(fmt, ...) :
  one argument not used by format 'Previous request to `use_python("%s", required = TRUE)` will be ignored.'
2: Previous request to `use_python("/Users/thomas/Python/mambaforge/bin/python", required = TRUE)` will be ignored. 
3: In base::sprintf(fmt, ...) :
  one argument not used by format 'The request to `use_python("%s")` will be ignored because the'
4: The request to `use_python("/Users/thomas/Gits/nlp-modeling/python_reticulate/bin/python")` will be ignored because the
> py_config()
python:         /Users/thomas/Python/mambaforge/bin/python
libpython:      /Users/thomas/Python/mambaforge/lib/libpython3.10.so
pythonhome:     /Users/thomas/Python/mambaforge:/Users/thomas/Python/mambaforge
version:        3.10.10 | packaged by conda-forge | (main, Mar 24 2023, 20:08:06) [GCC 11.3.0]
numpy:          /Users/thomas/Python/mambaforge/lib/python3.10/site-packages/numpy
numpy_version:  1.25.1

NOTE: Python version was forced by RETICULATE_PYTHON
> Sys.getenv("RETICULATE_PYTHON")
[1] "/Users/thomas/Python/mambaforge/bin/python"

> system("which python")
/Users/thomas/Python/mambaforge/bin/python

You think it might help to shut off the base environment?

I'm honestly not sure what's passing the RETICULATE_PYTHON variable at this point, and I assume it's just the first instance of python it can find.

@TPDeramus
Author

Nope.

Still doing that even with the base option turned off:

> virtualenv_create(paste0(getwd(),"/python_reticulate"), force = TRUE, packages = FALSE)
Using Python: /usr/bin/python3.6
Creating virtual environment '/Users/thomas/Gits/nlp-modeling/python_reticulate' ... 
+ /usr/bin/python3.6 -m venv /Users/thomas/Gits/nlp-modeling/python_reticulate
Done!
Virtual environment '/Users/thomas/Gits/nlp-modeling/python_reticulate' successfully created.
Warning messages:
1: In base::sprintf(fmt, ...) :
  one argument not used by format 'The request to `use_python("%s")` will be ignored because the'
2: The request to `use_python("/Users/thomas/Python/mambaforge/bin/python")` will be ignored because the

Probably going to have to figure out what's mapping that automatically either way.

@t-kalinowski
Member

Some common locations where RETICULATE_PYTHON might be being set:

  • ~/.bash_profile
  • ~/.profile
  • ~/.bashrc
  • ~/.Rprofile
  • ~/.Renviron
  • ./.Rprofile
  • ./.Renviron
  • renv activation/startup routines
  • RStudio IDE configuration options
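
The file-based locations in this list can be searched mechanically. Below is a minimal Python sketch that does so; `find_mentions` and `CANDIDATES` are illustrative names, and the sketch cannot see settings made by renv startup routines or by the RStudio IDE options, which have to be checked by hand.

```python
from pathlib import Path
from typing import Iterable, List, Tuple

# The file-based locations from the list above.
CANDIDATES = [
    "~/.bash_profile", "~/.profile", "~/.bashrc",
    "~/.Rprofile", "~/.Renviron", "./.Rprofile", "./.Renviron",
]


def find_mentions(name: str, paths: Iterable[str]) -> List[Tuple[str, int, str]]:
    """Return (path, line number, line) for every line that mentions `name`."""
    hits = []
    for raw in paths:
        path = Path(raw).expanduser()
        if not path.is_file():
            continue  # skip files that do not exist on this machine
        text = path.read_text(errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if name in line:
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Calling `find_mentions("RETICULATE_PYTHON", CANDIDATES)` from the project directory lists every matching line, including commented-out ones.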

@TPDeramus
Author

TPDeramus commented Aug 30, 2023

It was that last one.

@TPDeramus
Author

So as an update.

I did the following to create a python virtualenv and install the relevant packages (even with the dated python version):

In the bash terminal:

/usr/bin/python3 -m venv /Users/thomas/Gits/Project/python_reticulate_venv
source /Users/thomas/Gits/Project/python_reticulate_venv/bin/activate
pip install numpy pandas

In the Quarto/R workflow:

setwd("/Users/thomas/Gits/Project")
use_python(paste0(getwd(),"/python_reticulate_venv/bin/python3"))
use_virtualenv(paste0(getwd(),"/python_reticulate_venv"), required=TRUE)
np <- import("numpy", convert=FALSE)
pd <- import("pandas")

But once I toss it into the workflow to try and read the .pickle it fails:

output <- pd$read_pickle(paste0(path_to_pickle,"/",file_name[index]))
Error in py_call_impl(callable, call_args$unnamed, call_args$named) :
AttributeError: Can't get attribute '_unpickle_block' on <module 'pandas._libs.internals' from '/Users/thomas/Gits/Project/python_reticulate_venv/lib64/python3.6/site-packages/pandas/_libs/internals.cpython-36m-x86_64-linux-gnu.so'>
Run `reticulate::py_last_error()` for details.
> reticulate::py_last_error()

── Python Exception Message ──────────────────────────────────────────────────────────────────────────────────────────────────────
Traceback (most recent call last):
  File "/Users/thomas/Gits/Project/python_reticulate_venv/lib64/python3.6/site-packages/pandas/io/pickle.py", line 182, in read_pickle
    return pickle.load(f)
AttributeError: Can't get attribute '_unpickle_block' on <module 'pandas._libs.internals' from '/Users/thomas/Gits/Project/python_reticulate_venv/lib64/python3.6/site-packages/pandas/_libs/internals.cpython-36m-x86_64-linux-gnu.so'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib64/python3.6/pickle.py", line 269, in _getattribute
    obj = getattr(obj, subpath)
AttributeError: module 'pandas._libs.internals' has no attribute '_unpickle_block'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/thomas/Gits/Project/python_reticulate_venv/lib64/python3.6/site-packages/pandas/io/pickle.py", line 187, in read_pickle
    return pc.load(f, encoding=None)
  File "/Users/thomas/Gits/Project/python_reticulate_venv/lib64/python3.6/site-packages/pandas/compat/pickle_compat.py", line 249, in load
    return up.load()
  File "/usr/lib64/python3.6/pickle.py", line 1050, in load
    dispatch[key[0]](self)
  File "/usr/lib64/python3.6/pickle.py", line 1347, in load_stack_global
    self.append(self.find_class(module, name))
  File "/Users/thomas/Gits/Project/python_reticulate_venv/lib64/python3.6/site-packages/pandas/compat/pickle_compat.py", line 189, in find_class
    return super().find_class(module, name)
  File "/usr/lib64/python3.6/pickle.py", line 1390, in find_class
    return _getattribute(sys.modules[module], name)[0]
  File "/usr/lib64/python3.6/pickle.py", line 272, in _getattribute
    .format(name, obj))
AttributeError: Can't get attribute '_unpickle_block' on <module 'pandas._libs.internals' from '/Users/thomas/Gits/Project/python_reticulate_venv/lib64/python3.6/site-packages/pandas/_libs/internals.cpython-36m-x86_64-linux-gnu.so'>

── R Traceback ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────
    ▆
 1. └─pd$read_pickle(paste0(path_to_pickle, "/", file_name[index]))
 2.   └─reticulate:::py_call_impl(callable, call_args$unnamed, call_args$named)

I think it has to do with pip refusing to let me install the pandas and numpy versions from requirements.txt (it says they are not available).

But I'm not sure what direction I can take it from here outside of asking my IT dept to install a new python version from the tarball.

@t-kalinowski
Member

It looks like the version of pandas you are trying to work with does not work with Python 3.6, and requires a more recent version of Python.
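
The traceback shows the pickle asking for `_unpickle_block` in `pandas._libs.internals`, which the pandas installed under Python 3.6 does not have. As a rough diagnostic, the sketch below lists the (module, name) pairs a pickle will try to import, using only the standard library and without loading the pickle. The function name is illustrative, the pairing for `STACK_GLOBAL` is a heuristic rather than a full pickle interpreter, and it assumes an uncompressed pickle file.

```python
import pickletools
from typing import Set, Tuple


def required_globals(data: bytes) -> Set[Tuple[str, str]]:
    """List (module, name) pairs referenced by a pickle, without unpickling it."""
    found = set()
    strings = []     # string values pushed so far, most recent last
    memo = {}        # memo index -> string value (None for anything else)
    prev_str = None  # string pushed by the immediately preceding opcode, if any
    for opcode, arg, _pos in pickletools.genops(data):
        name = opcode.name
        pushed = None
        if name in ("GLOBAL", "INST"):
            # Older protocols spell out "module name" directly.
            module, attr = arg.split(" ", 1)
            found.add((module, attr))
        elif name == "STACK_GLOBAL":
            # Protocol 4+: module and name are the two most recent strings.
            if len(strings) >= 2:
                found.add((strings[-2], strings[-1]))
        elif name == "MEMOIZE":
            memo[len(memo)] = prev_str
        elif name in ("PUT", "BINPUT", "LONG_BINPUT"):
            memo[arg] = prev_str
        elif name in ("GET", "BINGET", "LONG_BINGET"):
            pushed = memo.get(arg)  # a previously memoized string, if it was one
        elif isinstance(arg, str):
            pushed = arg
        if isinstance(pushed, str):
            strings.append(pushed)
        prev_str = pushed
    return found
```

Running this on the bytes of the `.pickle` file and checking each pair with `hasattr` against the installed pandas would show which attributes are missing in the older release.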

I know this is easier said than done in bureaucracies, but I would say: insisting on having reasonably up-to-date software is, well, a very reasonable request. Python versions 3.6 and 3.7 are both EOL. https://devguide.python.org/versions/.

You can probably download and compile python yourself manually (perhaps using an older version of pyenv).

You can also try to put a band-aid over the binary incompatibility with conda by setting LD_PRELOAD for the appropriate .so before starting R, or maybe, by calling dyn.load() in an R session.
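
For the `LD_PRELOAD` route, the variable has to be in the environment before the R process starts. A minimal Python sketch of building such an environment is below. The helper name is hypothetical, and the library path in the commented usage is an assumption (the conda copy of libstdc++ is expected, but not confirmed in this thread, to live there). An RStudio Server session is not launched this way, so this only illustrates the idea for a command-line R.

```python
import os
from typing import Dict, Optional


def with_preload(lib: str, env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with `lib` placed first in LD_PRELOAD."""
    new_env = dict(os.environ if env is None else env)
    existing = new_env.get("LD_PRELOAD", "")
    # LD_PRELOAD entries may be separated by colons or spaces; a colon is used here.
    new_env["LD_PRELOAD"] = f"{lib}:{existing}" if existing else lib
    return new_env


# Hypothetical usage (path is an assumption):
# import subprocess
# subprocess.run(
#     ["R", "-q", "-e", 'reticulate::import("pandas")'],
#     env=with_preload("/Users/thomas/Python/mambaforge/lib/libstdc++.so.6"),
# )
```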

@TPDeramus
Author

I tried a suggestion similar to that with the following:
https://pages.github.nceas.ucsb.edu/NCEAS/Computing/local_install_python_on_a_server.html

In this case, I used make in my local directory to compile 3.10.3 and am just trying to install the other dependencies with pip.

But that creates its own problems.

Primarily:

  • The system still defaults to the "base" installation even with PYTHONPATH set to the local installation
  • Aliasing doesn't seem to work for local installations (it doesn't know where pip is or where to install sometimes).
  • When run directly or aliased, it can't connect to download new packages:
python3 -m pip install pandas
WARNING: pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/pandas/
WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/pandas/
WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/pandas/
WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/pandas/
WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.")': /simple/pandas/
Could not fetch URL https://pypi.org/simple/pandas/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/pandas/ (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available.")) - skipping
ERROR: Could not find a version that satisfies the requirement pandas (from versions: none)
ERROR: No matching distribution found for pandas
WARNING: pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
Could not fetch URL https://pypi.org/simple/pip/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/pip/ (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available.")) - skipping
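
These warnings indicate that the locally compiled interpreter was built without the `ssl` module, the same symptom as the `_ssl` failure in the pyenv log earlier in the thread. Python 3.10 requires OpenSSL 1.1.1 or newer (PEP 644), so a build against an older system OpenSSL would be expected to end up this way. A small Python sketch for checking a freshly built interpreter before trying pip is below; the function name is illustrative.

```python
import importlib
from typing import Tuple


def ssl_status() -> Tuple[bool, str]:
    """Return (available, detail) for the ssl module of the running interpreter."""
    try:
        ssl = importlib.import_module("ssl")
    except ImportError as exc:
        # This is the situation pip is warning about above.
        return False, f"ssl unavailable: {exc}"
    return True, ssl.OPENSSL_VERSION
```

Running this with the newly compiled `python3` right after `make` shows whether the build picked up a usable OpenSSL, before any package installation is attempted.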

t-kalinowski added a commit that referenced this issue Sep 5, 2023
system pip version does not support `--no-user`
(as seen in #1467)
@TPDeramus
Author

Is this fix operating on the local python installation method and correctly calling the virtualenv?

I'm still working this out with IT.

@TPDeramus
Author

I believe I will have to work within python itself to get around this in the interim until this is resolved.

Feel free to close the issue for the moment.
