Support for pyheavydb instead of pyomniscidb (#33)
Co-authored-by: Pearu Peterson <pearu.peterson@gmail.com>
tupui and pearu committed Apr 15, 2022
1 parent eb08ae2 commit 4de2f97
Showing 21 changed files with 159 additions and 147 deletions.
50 changes: 50 additions & 0 deletions README.md
@@ -0,0 +1,50 @@
[![PyPi package link](https://img.shields.io/pypi/v/heavyai?style=for-the-badge)](https://pypi.org/project/heavyai/)
[![Conda package link](https://img.shields.io/conda/vn/conda-forge/heavyai?style=for-the-badge)](https://anaconda.org/conda-forge/heavyai)


heavyai
=======

This package enables using common Python data science toolkits with
[HeavyDB](http://heavy.ai).
It brings DataFrame support on CPU and GPU, as well as support for Apache Arrow.
See the [documentation](http://heavyai.readthedocs.io/en/latest/?badge=latest)
for more.

Quick Install (CPU)
-------------------

Packages are available on conda-forge and PyPI:

```bash
# using conda-forge
conda install -c conda-forge heavyai

# using pip
pip install heavyai
```

Quick Install (GPU)
-------------------

We recommend creating a fresh conda environment with Python 3.8 or 3.9 when
installing heavyai with GPU capabilities.

To install heavyai for GPU DataFrame support (conda-only):

```bash
conda create -n heavyai-gpu -c rapidsai -c nvidia -c conda-forge -c defaults python cudf cudatoolkit heavyai
```

Note that `pyheavydb` needs to be installed into the environment with pip
until it is available on conda-forge.

```bash
conda activate heavyai-gpu
pip install pyheavydb
```

Documentation
-------------

Further documentation for heavyai usage is available at: http://heavyai.readthedocs.io/
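
As a quick orientation for the new README (not part of the commit itself), here is a minimal sketch of the DataFrame workflow it describes. The connection parameters are the documentation defaults used elsewhere in this diff, and the `flights_2008_10k` table comes from the docs' examples; adjust both for your own server.

```python
# Minimal sketch: connect to a local HeavyDB instance and read a query
# result into a pandas DataFrame. Credentials and host are the
# documentation defaults and may differ on your deployment.
import pandas as pd
from heavyai import connect

con = connect(
    user="admin",
    password="HyperInteractive",
    host="localhost",
    dbname="heavyai",
)

# heavyai connections are DB API 2.0-compliant, so pandas.read_sql works.
df = pd.read_sql(
    "SELECT depdelay, arrdelay FROM flights_2008_10k LIMIT 100", con
)
print(df.head())
```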
39 changes: 0 additions & 39 deletions README.rst

This file was deleted.

3 changes: 0 additions & 3 deletions ci/Dockerfile

This file was deleted.

5 changes: 3 additions & 2 deletions ci/build-conda.sh
@@ -39,15 +39,16 @@ while [[ $# != 0 ]]; do
done

build_test_cpu() {
mamba env create -f environment.yml
mamba env create -f ci/environment.yml
conda activate heavyai-dev
pip install --no-deps -e .
pytest -sv tests/
}

build_test_gpu() {
mamba env create -f environment_gpu.yml
mamba env create -f ci/environment_gpu.yml
conda activate heavyai-gpu-dev
python -c "import cudf"
pip install --no-deps -e .
pytest -sv tests/
}
9 changes: 6 additions & 3 deletions environment.yml → ci/environment.yml
@@ -3,11 +3,11 @@ channels:
- conda-forge
- defaults
dependencies:
- pandas
- pyomniscidb
- python >=3.7.0
- pyarrow>=3.0.0
- thrift >=0.13
- sqlalchemy # 3.10 issue with one of its dependencies if installed with pip
- pandas
- python >=3.7.0
- geopandas
- shapely
- numpy
@@ -20,3 +20,6 @@ dependencies:
- pytest-mock
- sphinx
- sphinx_rtd_theme
- pip
- pip:
- pyheavydb
10 changes: 7 additions & 3 deletions environment_gpu.yml → ci/environment_gpu.yml
@@ -5,11 +5,12 @@ channels:
- conda-forge
- defaults
dependencies:
- cudf>=0.16
- cudatoolkit=11.0
- cudf
- cudatoolkit
- python >=3.7.0
- pyomniscidb
- pyarrow>=3.0.0=*cuda
- thrift >=0.13
- sqlalchemy # 3.10 issue with one of its dependencies if installed with pip
- pandas
- geopandas
- shapely
@@ -23,3 +24,6 @@ dependencies:
- pytest-mock
- sphinx
- sphinx_rtd_theme
- pip
- pip:
- pyheavydb
2 changes: 1 addition & 1 deletion docs/source/api.rst
@@ -9,5 +9,5 @@ API Reference
Exceptions
----------

.. automodule:: omnisci.exceptions
.. automodule:: heavyai.exceptions
:members: Error, InterfaceError, DatabaseError, OperationalError, IntegrityError, InternalError, ProgrammingError, NotSupportedError
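
For illustration (not part of the commit), a hedged sketch of catching the DB API 2.0 exception classes listed in the `:members:` line above. The table name `my_table` is hypothetical.

```python
# Sketch: handling heavyai's DB API 2.0 exceptions. The exception names
# come from the :members: list documented above; "my_table" is hypothetical.
from heavyai import connect
from heavyai.exceptions import DatabaseError, OperationalError

con = connect(user="admin", password="HyperInteractive",
              host="localhost", dbname="heavyai")

try:
    with con.cursor() as c:
        c.execute("SELECT COUNT(*) FROM my_table")
        print(c.fetchone())
except OperationalError as exc:
    # e.g. the server is unreachable or the query cannot be executed
    print(f"operational error: {exc}")
except DatabaseError as exc:
    # broader catch-all for other database-related failures
    print(f"database error: {exc}")
```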
36 changes: 16 additions & 20 deletions docs/source/contributing.rst
@@ -16,7 +16,7 @@ Development Environment Setup
-----------------------------

heavyai is written in plain Python 3 (i.e. no Cython), and as such, doesn't require any specialized development
environment outside of installing the dependencies. However, we do suggest creating a new conda development enviornment
environment outside of installing the dependencies. However, we do suggest creating a new conda development environment
with the provided conda `environment.yml` file to ensure that your changes work without relying on unspecified system-level
Python packages.

@@ -35,7 +35,7 @@ CPU Environment
# clone heavyai repo
git clone https://github.com/heavyai/heavyai.git && cd heavyai
conda env create -f ./environment.yml
conda env create -f ci/environment.yml
# ensure you have activated the environment
conda activate heavyai-dev
@@ -50,7 +50,7 @@ GPU Environment
.. code-block:: shell
# from the heavyai project root
conda env create -f environment_gpu.yml
conda env create -f ci/environment_gpu.yml
# ensure you have activated the environment
conda activate heavyai-gpu-dev
@@ -138,7 +138,7 @@ installation instructions.
Updating Apache Thrift Bindings
-------------------------------

When the upstream `mapd-core`_ project updates its Apache Thrift definition file, the bindings shipped with
When the upstream `HeavyDB`_ project updates its Apache Thrift definition file, the bindings shipped with
``heavyai`` need to be regenerated. Note that the `heavydb` repository must be cloned locally.

.. code-block:: shell
@@ -150,7 +150,7 @@ When the upstream `mapd-core`_ project updates its Apache Thrift definition file
cd ./heavydb
# Use Thrift to generate the Python bindings
thrift -gen py -r omnisci.thrift
thrift -gen py -r heavy.thrift
# Copy the generated bindings to the heavyai root
cp -r ./gen-py/heavydb/* ../heavyai/heavydb/
@@ -171,7 +171,7 @@ you need to install sphinx and sphinx-rtd-theme into your development environmen
pip install sphinx sphinx-rtd-theme
Once you have sphinx installed, to build the documentation switch to the ``heavyai/docs`` directory and run ``make html``. This will update the documentation
in the ``heavyai/docs/build/html`` directory. From that directory, running ``python -m http.server`` will allow you to preview the site on ``localhost:8000``
in the ``heavyai/docs/build/html`` directory. From that directory, ``index.html`` can be opened
in the browser. Run ``make html`` each time you save a file to see the file changes in the documentation.

--------------------------------
@@ -182,24 +182,21 @@ heavyai doesn't currently follow a rigid release schedule; rather, when enough f
version to be released, or a sufficiently serious bug/issue is fixed, we will release a new version. heavyai is distributed via `PyPI`_
and `conda-forge`_.

Prior to submitting to PyPI and/or conda-forge, create a new `release tag`_ on GitHub (with notes), then run ``git pull`` to bring this tag to your
local heavyai repository folder.

****
PyPI
****

To publish to PyPI, we use the `twine`_ package via the CLI. twine only allows for submitting to PyPI by registered users
(currently, internal Heavy.AI employees):
To publish to PyPI, we use `flit`_ in the CI. When a new tag is pushed, the
package is built and published to PyPI. Make sure the version in
`pyproject.toml` matches the tag.

.. code-block:: shell
Authorized users can also publish a new version locally:

conda install twine
python setup.py sdist
twine upload dist/*
.. code-block:: shell
Publishing a package to PyPI is near instantaneous after runnning ``twine upload dist/*``. Before running ``twine upload``, be sure
the ``dist`` directory only has the current version of the package you are intending to upload.
conda install flit
flit build
flit publish
***********
conda-forge
@@ -212,7 +209,7 @@ nothing that needs to be done to speed this up, just be patient.
When the conda-forge bot opens a PR on the heavyai-feedstock repo, one of the feedstock maintainers needs to validate the correctness
of the PR, check the accuracy of the package versions on the `meta.yaml`_ recipe file, and then merge once the CI tests pass.

.. _mapd-core: https://github.com/omnisci/mapd-core
.. _HeavyDB: https://github.com/heavyai/heavydb
.. _Docker: https://hub.docker.com/u/omnisci
.. _CPU image: https://hub.docker.com/r/omnisci/core-os-cpu
.. _HeavyDB Core GPU-enabled: https://hub.docker.com/r/omnisci/core-os-cuda
@@ -224,6 +221,5 @@ of the PR, check the accuracy of the package versions on the `meta.yaml`_ recipe
.. _pull requests: https://github.com/heavyai/heavyai/pulls
.. _PyPI: https://pypi.org/project/heavyai/
.. _conda-forge: https://github.com/conda-forge/heavyai-feedstock
.. _release tag: https://github.com/heavyai/heavyai/releases
.. _twine: https://pypi.org/project/twine/
.. _flit: https://pypi.org/project/flit/
.. _meta.yaml: https://github.com/conda-forge/heavyai-feedstock/blob/main/recipe/meta.yaml
9 changes: 2 additions & 7 deletions docs/source/index.rst
@@ -1,20 +1,15 @@
.. heavyai documentation master file, created by
sphinx-quickstart on Fri Jun 23 12:29:54 2017.
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
heavyai
=========

`heavyai` provides a python DB API 2.0-compliant `HeavyDB`_
interface (formerly MapD). In addition, it provides methods to get results in
interface (formerly OmniSci and MapD). In addition, it provides methods to get results in
the `Apache Arrow`_-based `cudf GPU DataFrame`_ format for efficient data interchange.

.. code-block:: python
>>> from heavyai import connect
>>> con = connect(user="admin", password="HyperInteractive", host="localhost",
... dbname="heavydb")
... dbname="heavyai")
>>> df = con.select_ipc_gpu("SELECT depdelay, arrdelay "
...                         "FROM flights_2008_10k "
...                         "LIMIT 100")
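
A complementary sketch for CPU-only setups, under an assumption: it relies on an Arrow-based `select_ipc` method as the CPU counterpart of the `select_ipc_gpu` call shown above (heavyai inherits this API from pymapd); verify the method exists in your installed version.

```python
# Hedged sketch: Arrow-backed CPU result fetch into a pandas DataFrame.
# select_ipc is assumed to be the CPU counterpart of select_ipc_gpu;
# check your heavyai version before relying on it.
from heavyai import connect

con = connect(user="admin", password="HyperInteractive",
              host="localhost", dbname="heavyai")

df = con.select_ipc(
    "SELECT depdelay, arrdelay FROM flights_2008_10k LIMIT 100"
)
print(df.dtypes)
```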
32 changes: 18 additions & 14 deletions docs/source/releasenotes.rst
@@ -3,9 +3,9 @@
Release Notes
=============

The release notes for pymapd are managed on the GitHub repository in the `Releases tab`_. Since pymapd
The release notes for heavyai are managed on the GitHub repository in the `Releases tab`_. Since heavyai
releases try to track new features in the main OmniSci Core project, it's highly recommended that you check
the Releases tab any time you install a new version of pymapd or upgrade OmniSci so that you understand any breaking
the Releases tab any time you install a new version of heavyai or upgrade OmniSci so that you understand any breaking
changes that may have been made during a new pymapd release.

Some notable breaking changes include:
@@ -17,6 +17,8 @@ Some notable breaking changes include:
======= ===============
Release Breaking Change
======= ===============
`1.0`_ Change dependency from `pyomniscidb` to `pyheavydb`
`0.30`_ New name `heavyai`
`0.17`_ Added preliminary support for Runtime User-Defined Functions
`0.15`_ Support for binary TLS Thrift connections
`0.14`_ Updated Thrift bindings to 4.8
@@ -37,15 +39,17 @@ Some notable breaking changes include:



.. _Releases tab: https://github.com/omnisci/pymapd/releases
.. _0.6: https://github.com/omnisci/pymapd/releases/tag/v0.6.0
.. _0.7: https://github.com/omnisci/pymapd/releases/tag/v0.7.0
.. _0.8: https://github.com/omnisci/pymapd/releases/tag/v0.8.0
.. _0.9: https://github.com/omnisci/pymapd/releases/tag/v0.9.0
.. _0.10: https://github.com/omnisci/pymapd/releases/tag/v0.10.0
.. _0.11: https://github.com/omnisci/pymapd/releases/tag/v0.11.0
.. _0.12: https://github.com/omnisci/pymapd/releases/tag/v0.12.0
.. _0.13: https://github.com/omnisci/pymapd/releases/tag/v0.13.0
.. _0.14: https://github.com/omnisci/pymapd/releases/tag/v0.14.0
.. _0.15: https://github.com/omnisci/pymapd/releases/tag/v0.15.0
.. _0.17: https://github.com/omnisci/pymapd/releases/tag/v0.17.0
.. _Releases tab: https://github.com/heavyai/heavyai/releases
.. _0.6: https://github.com/heavyai/heavyai/releases/tag/v0.6.0
.. _0.7: https://github.com/heavyai/heavyai/releases/tag/v0.7.0
.. _0.8: https://github.com/heavyai/heavyai/releases/tag/v0.8.0
.. _0.9: https://github.com/heavyai/heavyai/releases/tag/v0.9.0
.. _0.10: https://github.com/heavyai/heavyai/releases/tag/v0.10.0
.. _0.11: https://github.com/heavyai/heavyai/releases/tag/v0.11.0
.. _0.12: https://github.com/heavyai/heavyai/releases/tag/v0.12.0
.. _0.13: https://github.com/heavyai/heavyai/releases/tag/v0.13.0
.. _0.14: https://github.com/heavyai/heavyai/releases/tag/v0.14.0
.. _0.15: https://github.com/heavyai/heavyai/releases/tag/v0.15.0
.. _0.17: https://github.com/heavyai/heavyai/releases/tag/v0.17.0
.. _0.30: https://github.com/heavyai/heavyai/releases/tag/v0.30.0
.. _1.0: https://github.com/heavyai/heavyai/releases/tag/v1.0.0
10 changes: 5 additions & 5 deletions docs/source/usage.rst
@@ -74,16 +74,16 @@ To create a :class:`Connection` using the ``connect()`` method along with ``user
>>> from heavyai import connect
>>> con = connect(user="admin", password="HyperInteractive", host="localhost",
... dbname="heavydb")
... dbname="heavyai")
>>> con
Connection(mapd://admin:***@localhost:6274/heavydb?protocol=binary)
Connection(heavydb://admin:***@localhost:6274/heavyai?protocol=binary)
Alternatively, you can pass in a `SQLAlchemy`_-compliant connection string to
the ``connect()`` method:

.. code-block:: python
>>> uri = "mapd://admin:HyperInteractive@localhost:6274/heavydb?protocol=binary"
>>> uri = "heavydb://admin:HyperInteractive@localhost:6274/heavyai?protocol=binary"
>>> con = connect(uri=uri)
Connection(mapd://admin:***@localhost:6274/heavydb?protocol=binary)
@@ -171,7 +171,7 @@ install, ``pandas.read_sql()`` works everywhere):
>>> from heavyai import connect
>>> import pandas as pd
>>> con = connect(user="admin", password="HyperInteractive", host="localhost",
... dbname="heavydb")
... dbname="heavyai")
>>> df = pd.read_sql("SELECT depdelay, arrdelay FROM flights_2008_10k limit 100", con)
@@ -190,7 +190,7 @@ Or by using a context manager:

.. code-block:: python
>>> with con as c:
>>> with con.cursor() as c:
... print(c)
<heavyai.cursor.Cursor object at 0x1041f9630>
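
To round out the context-manager change above, a small sketch (not part of the commit) using the standard DB API 2.0 cursor calls that heavyai advertises supporting; the query and table come from the examples earlier in this diff.

```python
# Sketch: standard DB API 2.0 cursor usage with heavyai. Result row types
# depend on your schema; credentials/host are the documentation defaults.
from heavyai import connect

con = connect(user="admin", password="HyperInteractive",
              host="localhost", dbname="heavyai")

with con.cursor() as c:
    c.execute("SELECT depdelay, arrdelay FROM flights_2008_10k LIMIT 5")
    for row in c.fetchall():
        print(row)
```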
