Ibis: Python data analysis framework for Hadoop and SQL engines
Ibis is a toolbox that bridges the gap between local Python environments, remote storage, execution systems such as Hadoop components (HDFS, Impala, Hive, Spark), and SQL databases. Its goal is to simplify analytical workflows and make you more productive.
Install Ibis from PyPI with:
pip install ibis-framework
or from conda-forge with:
conda install ibis-framework -c conda-forge
Ibis currently provides tools for interacting with the following systems:
- Apache Impala
- Apache Kudu
- Hadoop Distributed File System (HDFS)
- MySQL (Experimental)
- Pandas DataFrames (Experimental)
- OmniSciDB (Experimental)
- Spark (Experimental)
Learn more about using the library at http://docs.ibis-project.org.
- OmniSciDB backend support is tested against a development release of the database, using the omnisci/core-os-cpu-dev Docker image. Check docker-compose.yml for the Docker image tag in use. Some features may not work on earlier releases.