WillAyd/nanopandas

To use the library, run pip install . from the project root.

If you are developing, install nanobind first, then build with CMake:

cmake -S . -B build
cmake --build build
cd build/src

You can then run the test suite from build/src with python -m pytest ../../tests

Usage:

>>> import nanopandas as nanopd
>>> arr = nanopd.StringArray(["foo", "bar", "baz", "baz", None])
>>> arr.size
5
>>> arr.nbytes
48
>>> arr.dtype
'large_string[nanoarrow]'
>>> arr.to_pylist()
['foo', 'bar', 'baz', 'baz', None]
>>> arr.unique().to_pylist()
['bar', 'baz', 'foo']
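
As a cross-check on the unique() result above, here is a plain-Python sketch that reproduces the same list without nanopandas. It assumes that nulls are dropped and the remaining values are sorted, which matches the output shown but is not confirmed as the library's documented behavior. The helper name reference_unique is hypothetical.

```python
# Plain-Python reference for the unique() output shown above.
# This is not nanopandas code; it only mirrors the printed result.
values = ["foo", "bar", "baz", "baz", None]


def reference_unique(items):
    """Hypothetical helper: distinct non-null strings, in sorted order."""
    return sorted({s for s in items if s is not None})


expected_unique = reference_unique(values)
print(expected_unique)
```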

String operations use utf8proc, so case conversion is Unicode-aware:

>>> import nanopandas as nanopd
>>> arr = nanopd.StringArray(["üàéµ"])
>>> arr.upper().to_pylist()
['ÜÀÉΜ']
>>> arr.capitalize().to_pylist()
['Üàéµ']
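
The last character of the upper() result is easy to misread: the micro sign (U+00B5) uppercases to the Greek capital letter mu (U+039C), which looks like a Latin M. The sketch below uses only the Python standard library (not nanopandas or utf8proc) to print the code points involved; Python's built-in str.upper() happens to apply the same Unicode mapping for this input.

```python
import unicodedata

# Same characters as the example above, written as escapes to avoid
# any ambiguity about composed vs. decomposed forms.
text = "\u00fc\u00e0\u00e9\u00b5"
upper = text.upper()

# Show each source character next to its uppercase counterpart.
for src, dst in zip(text, upper):
    print(
        f"U+{ord(src):04X} {unicodedata.name(src)}"
        f" -> U+{ord(dst):04X} {unicodedata.name(dst)}"
    )
```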

Developing with sanitizers is also possible. From the project root, configure and build with sanitizers enabled, then run the tests with the ASan runtime preloaded:

cmake -S . -B build -DCMAKE_BUILD_TYPE=Debug -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DUSE_SANITIZERS=ON
cmake --build build
cd build/src
ASAN_OPTIONS="detect_leaks=0" LD_PRELOAD="$(gcc -print-file-name=libasan.so)" python -m pytest -s ../../tests/