ERROR: 'utf-8' codec can't decode byte 0x85 in position 0: invalid start byte #336

Closed
anthony-tuininga opened this issue May 17, 2024 Discussed in #335 · 18 comments
Labels
bug Something isn't working patch available

Comments

@anthony-tuininga
Member

Discussed in #335

Originally posted by bnvader May 16, 2024
I am trying to use python-oracledb to connect to an Oracle Database that uses the ISO-8859-1 encoding.
I have a custom database type of type DB Table.
When I use thick client mode, I am able to directly use the DBObject.Attribute_Name to retrieve the attribute values from a row.
However this same step fails if I do not use thick mode.

I would like to have this working in thin mode as I want to eventually use this from a simple AWS Lambda function.

Has anyone been able to use thin mode to successfully decode data from a non-UTF-8 encoded Oracle DB?
If so what is the secret?

@anthony-tuininga anthony-tuininga added the bug Something isn't working label May 17, 2024
@anthony-tuininga anthony-tuininga self-assigned this May 17, 2024
@anthony-tuininga
Member Author

I am able to confirm that the issue is due to the presence of an xmltype attribute within a database object and will see what is needed to address this issue.
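For readers wondering where the message in the issue title comes from: it is the text of a standard Python UnicodeDecodeError. The sketch below is independent of python-oracledb and does not reproduce the driver's code path. It only shows that decoding a buffer whose first byte is 0x85 as UTF-8 produces this wording. The one-byte payload is a made-up stand-in.

```python
# Hypothetical payload: a single byte 0x85, standing in for data that is not
# valid UTF-8 text. This is not the data python-oracledb actually received.
payload = b"\x85"

try:
    payload.decode("utf-8")
    message = None
except UnicodeDecodeError as exc:
    # 0x85 is a UTF-8 continuation byte, so it cannot start a character.
    message = str(exc)

print(message)
```

This says nothing about why the decode was attempted, only where the wording of the error originates.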

@anthony-tuininga
Member Author

Good news! I was able to correct the issue. If you are able to build from source you can verify that it works for you, too.

@bnvader

bnvader commented May 18, 2024

Great. I have not been building or installing from source. I am using the Anaconda/Spyder IDE for this development and am not sure how to build from source with it. But let me check.

@bnvader

bnvader commented May 18, 2024

I was able to build from source and tried to replace the oracledb folder content with the new local oracledb folder content but must be missing something. Sorry, I'm new to python and may not fully understand the module loading part here.

It fails with error -
File ~\Documents\software\python\batch_extracts\scripts./txgov_batch_audit_db.py:12
import oracledb
File ~\AppData\Local\anaconda3\Lib\site-packages\oracledb\__init__.py:43
from . import base_impl, thick_impl, thin_impl
File ~\AppData\Local\anaconda3\Lib\site-packages\oracledb\base_impl.py:9
bootstrap()
File ~\AppData\Local\anaconda3\Lib\site-packages\oracledb\base_impl.py:7 in bootstrap
mod = importlib.util.module_from_spec(spec)
ImportError: DLL load failed while importing base_impl: The specified module could not be found.

@anthony-tuininga
Member Author

I don't know what games Anaconda plays with modules. In a regular installation, however, there should not be a base_impl.py but only a base_impl.pyd! It might be better to rename the original directory and copy the new content from the build directory to the same named directory. It is possible that not all of the files were replaced and are wreaking havoc! Another option that I use is to simply set the environment variable PYTHONPATH to point to the build directory. You can also run python setup.py build install and that should also work.
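When a copied-over build misbehaves like this, it can help to check which directory Python would import the package from and which compiled modules sit in it. The following is a minimal sketch using only the standard library. The helper name describe_package is hypothetical, and the package name oracledb is taken from the traceback above.

```python
import importlib.machinery
import importlib.util
from pathlib import Path


def describe_package(name):
    """Return the directory a package would be imported from and its compiled modules."""
    # find_spec() on a top-level name locates the package without importing it.
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        return None  # not installed, or a plain module rather than a package
    pkg_dir = Path(list(spec.submodule_search_locations)[0])
    # EXTENSION_SUFFIXES lists the suffixes this interpreter accepts for
    # compiled modules (for example ".pyd" on Windows, ".so" on Linux).
    suffixes = tuple(importlib.machinery.EXTENSION_SUFFIXES)
    compiled = sorted(p.name for p in pkg_dir.iterdir() if p.name.endswith(suffixes))
    return {"directory": str(pkg_dir), "compiled_modules": compiled}


if __name__ == "__main__":
    print(describe_package("oracledb"))
```

If the reported directory is not the one you expected, or the list of compiled modules is empty, the copy most likely mixed files from two different builds.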

@bnvader

bnvader commented May 18, 2024

I did just copy the oracledb folder in its entirety to the location where Anaconda keeps its site-packages. But it is failing to load.
I guess I will just wait for this to be available where I can install it using pip install from anaconda as before.

@anthony-tuininga
Member Author

It might be easier for you to do this, then, in the source directory:

python -m build

This will create a wheel in the dist subdirectory. You can then install that wheel in the same way you install other wheels.

If the build module is missing you can do this:

python -m pip install build

Let me know if that works better for you. We can update the build from source instructions to give that option.

@bnvader

bnvader commented May 18, 2024

Good news! Was able to install the package directly to conda from git.
I was able to validate the fix! Thanks much for fixing this so quickly.

@bnvader

bnvader commented May 18, 2024

When will this fix be available as a release?

@anthony-tuininga
Member Author

We will discuss internally whether it makes sense to create a patch release in the next week or two; otherwise, it would likely be in a couple of months. Do you need this sooner rather than later? :-)

@bnvader

bnvader commented May 20, 2024

A patch release would be great if possible, as it would probably ease our deployment process to AWS.
Meanwhile, could you please point me to the steps to install oracledb from git source into a user-defined directory path on Linux?
It looks like python -m pip install build will install to the Python lib path, which may not be accessible to the user.
Are the contents of the build/lib folder, after running the build command, sufficient to import and run the library?
Thanks.

@bnvader

bnvader commented May 20, 2024

Update:
We are facing an error trying to install the beta version from git source on AWS EC2 CloudShell.
building 'oracledb.base_impl' extension
gcc -Wno-unused-result -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -O2 -ftree-vectorize -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -D_GNU_SOURCE -fPIC -fwrapv -O2 -ftree-vectorize -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -D_GNU_SOURCE -fPIC -fwrapv -O2 -ftree-vectorize -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python3.9 -c src/oracledb/base_impl.c -o build/temp.linux-x86_64-cpython-39/src/oracledb/base_impl.o
src/oracledb/base_impl.c:49:10: fatal error: Python.h: No such file or directory
49 | #include "Python.h"
| ^~~~~~~~~~
compilation terminated.

Is there any possibility you could provide a Linux zip dist here for us to test it on Linux?

@anthony-tuininga
Member Author

> 49 | #include "Python.h"
> | ^~~~~~~~~~
> compilation terminated.

That error implies that you don't have the Python development package installed. You should be able to install that package fairly easily.
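A quick way to confirm this diagnosis before rebuilding is to ask the interpreter where it expects its C headers and whether Python.h is there. This is a minimal sketch using the standard sysconfig module. The helper name python_header_status is hypothetical. On RHEL-family systems the development package is commonly named python3-devel and on Debian-family systems python3-dev, but check what your distribution calls it.

```python
import sysconfig
from pathlib import Path


def python_header_status():
    """Return (include_dir, found) for this interpreter's Python.h."""
    # "include" is the directory passed to the compiler via -I when
    # building extension modules for this interpreter.
    include_dir = Path(sysconfig.get_paths()["include"])
    return include_dir, (include_dir / "Python.h").is_file()


if __name__ == "__main__":
    include_dir, found = python_header_status()
    print(f"include directory: {include_dir}")
    if found:
        print("Python.h found")
    else:
        print("Python.h missing: install the Python development package")
```

Run it with the same interpreter used for the build; in the log above that would be the Python 3.9 whose include directory is /usr/include/python3.9.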

> A patch release would be great if possible, as it would probably ease our deployment process to AWS.

We will plan for a patch release. Exact timing is yet to be determined but it should be sometime in the next week or two.

> Meanwhile, could you please point me to the steps to install oracledb from git source into a user-defined directory path on Linux?

My suggestion is to use the build module to create a wheel and then install the wheel the same way you would any other wheel. You can run setup.py build install as well. You can force the package to write to a user directory with the --user option but recent versions of Python already check for writability of the target directory and automatically invoke that option if needed.
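To see where a --user installation would end up for a given interpreter, and whether that location is active, the standard site module can be queried. This is a minimal sketch; the helper name user_install_info is hypothetical.

```python
import site
import sys


def user_install_info():
    """Report the per-user site-packages directory and whether it is in use."""
    user_site = site.getusersitepackages()
    return {
        "user_site": user_site,
        # ENABLE_USER_SITE is True, False, or None (disabled for security reasons).
        "user_site_enabled": bool(site.ENABLE_USER_SITE),
        # The directory is only added to sys.path if it exists at startup.
        "on_sys_path": user_site in sys.path,
    }


if __name__ == "__main__":
    for key, value in user_install_info().items():
        print(f"{key}: {value}")
```

If user_site_enabled is False (as can happen inside some virtual environments), a --user installation will not be picked up by that interpreter.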

> Is there any possibility you could provide a Linux zip dist here for us to test it on Linux?

For Linux a manylinux wheel is going to be the most likely option to succeed. A straight zip file may or may not work depending on the different versions of Linux that we may be running! Since you have been able to compile already and verify that it works for you, creating a wheel shouldn't be too much trouble using the build module. That's going to be the best option I think.
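Whether a wheel in the dist subdirectory is a manylinux wheel or a plain platform wheel can be read from its filename, since the wheel naming convention ends with Python, ABI and platform tags. Below is a minimal sketch of that check. The helper names and both example filenames are illustrative assumptions, not actual build output.

```python
def wheel_tags(filename):
    """Split a wheel filename into its name, version and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    stem = filename[: -len(".whl")]
    # The last three dash-separated fields are always the Python, ABI and
    # platform tags; an optional build tag may sit between the version and them.
    head, python_tag, abi_tag, platform_tag = stem.rsplit("-", 3)
    name, version = head.split("-")[:2]
    return {
        "name": name,
        "version": version,
        "python": python_tag,
        "abi": abi_tag,
        # A wheel may declare several platform tags joined by dots.
        "platforms": platform_tag.split("."),
    }


def is_manylinux(filename):
    """True if any platform tag of the wheel is a manylinux tag."""
    return any(tag.startswith("manylinux") for tag in wheel_tags(filename)["platforms"])


if __name__ == "__main__":
    # Hypothetical filenames for illustration only.
    for example in (
        "oracledb-2.2.1-cp311-cp311-linux_x86_64.whl",
        "oracledb-2.2.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    ):
        print(example, "->", is_manylinux(example))
```

A wheel tagged only linux_x86_64 is tied to the system it was built on, which is why it may fail on a different Linux distribution.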

@bnvader

bnvader commented May 20, 2024

I have a Windows laptop and hence have tested everything on Windows. The Lambda environment is RHEL Linux and unfortunately I do not have access to it.
Will I be able to build a manylinux wheel for Linux from my local Windows install?

@cjbj
Member

cjbj commented May 21, 2024

Until we release an update on PyPI, options for getting a Linux environment include using VirtualBox or a free 'Oracle Compute Instance' on https://www.oracle.com/cloud/free/

@bnvader

bnvader commented May 22, 2024

We are struggling a bit getting this build from source working in the AWS Lambda environment where we have to eventually run this.
The pip install option had worked fine previously but creating the site-package from git source has been a challenge for the devs.

At the moment it fails to import oracledb with the following error -
"errorMessage": "/lib64/libc.so.6: version `GLIBC_2.34' not found (required by /opt/python/lib/python3.11/site-packages/oracledb/thick_impl.cpython-311-x86_64-linux-gnu.so)",
"errorType": "ImportError",

Any familiarity with this?
Not sure if the issue is with the AWS Lambda shell or the way we are packaging the library.

@anthony-tuininga
Member Author

Generally you have to create a manylinux wheel or you have to make sure you build your wheel on an older platform than the one you are trying to distribute to! Building a manylinux wheel is fairly straightforward. I don't know what AWS uses, but assuming it uses the x86_64 platform, this is what I use for building wheels:

#! /usr/bin/bash
# Produces "manylinux" wheels using a container image built on CentOS 7. For
# additional information, see https://github.com/pypa/auditwheel.
#
#   Currently using manylinux2014 (based on CentOS 7):
#       podman pull quay.io/pypa/manylinux2014_x86_64
#
# This script should be run in the root directory of a clone of python-oracledb
# with a source distribution package already created and stored within the
# "dist" subdirectory. Once all of the wheels have been built they will be
# placed within the "dist" subdirectory as well.

# ensure that the dist subdirectory exists
if [ ! -d "dist" ]; then
    mkdir dist
fi

# generate script for building
SCRIPT_NAME=dist/linux_build_on_container.sh
cat > $SCRIPT_NAME << EOF
#! /bin/bash

cd /io

# build module for all supported Python versions
/opt/python/cp37-cp37m/bin/python3.7 -m build
/opt/python/cp38-cp38/bin/python3.8 -m build
/opt/python/cp39-cp39/bin/python3.9 -m build
/opt/python/cp310-cp310/bin/python3.10 -m build
/opt/python/cp311-cp311/bin/python3.11 -m build
/opt/python/cp312-cp312/bin/python3.12 -m build

# turn the base wheels into "manylinux" wheels
cd dist
auditwheel repair *.whl
rm -f oracledb-*x86_64.whl
mv -i wheelhouse/* .
rm -rf wheelhouse

exit

EOF
chmod +x $SCRIPT_NAME

# run script
sudo podman run -i -t -v `pwd`:/io quay.io/pypa/manylinux2014_x86_64 \
        /bin/bash -c /io/$SCRIPT_NAME
rm $SCRIPT_NAME

A few notes to help you if you want to use this approach:

  • you can use docker instead of podman (they should be interchangeable)
  • you can create the source package required by this script by running python setup.py sdist
  • you can remove all of the Python versions except the one you are interested in

Hope that helps! We have discussed creating a patch release and have tentatively scheduled that for early next week. Once it is out I will post here again.

@anthony-tuininga
Member Author

This was included in version 2.2.1 which was just released.
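To check whether an environment already has a release containing the fix, the installed version can be compared against 2.2.1. This is a minimal sketch using importlib.metadata. The helper names are hypothetical, and the simple numeric comparison ignores pre-release suffixes.

```python
import re
from importlib import metadata

FIXED_IN = (2, 2, 1)  # release named in the comment above


def version_tuple(text):
    """Turn '2.2.1' into (2, 2, 1), keeping only leading digits of each part."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def has_fix(installed_version):
    """True if the given version string is 2.2.1 or later."""
    return version_tuple(installed_version) >= FIXED_IN


if __name__ == "__main__":
    try:
        installed = metadata.version("oracledb")
    except metadata.PackageNotFoundError:
        print("oracledb is not installed in this environment")
    else:
        print(f"oracledb {installed}: fix included = {has_fix(installed)}")
```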

3 participants