Productivity-centric Python data analysis framework for SQL systems and the Hadoop platform. Co-founded by the creator of pandas.
Latest commit 051248f, Jan 12, 2017, maxmzkr committed with wesm:

Speed up compiling many nested subqueries

I ran into this after #901. When you have queries with many levels of subqueries, the compile time starts to increase drastically. Caching the graph traversal decreases the compile time.

This includes the mentioned PR #901. I can merge this into that PR if that makes more sense. I figured it might be helpful to split the bug fix from the improvement.

```python
#
import ibis


def add_filter_and_ana(t):
    t = t[t.columns + [ibis.null().name('filter')]]
    t = t[t['filter'].isnull()]
    t = t[['col']]
    t = t[t.columns + [t.count().name('analytic')]]
    t = t[t.columns]
    t = t[t['analytic'] < 0]
    t = t[['col']]
    return t


t = ibis.table((('col', 'int32'), ), name='t')

for _ in range(4):
    t = add_filter_and_ana(t)

ibis.impala.compile(t)
```

```
root@582f1eb:/ibis/ibis# git checkout fix-analytic-projection
Already on 'fix-analytic-projection'
Your branch is up-to-date with 'origin/fix-analytic-projection'.
root@582f1eb:/ibis/ibis# time python

real    0m10.935s
user    0m10.828s
sys     0m0.096s
root@582f1eb:/ibis/ibis# git checkout fast-visit
Switched to branch 'fast-visit'
root@582f1eb:/ibis/ibis# time python

real    0m0.535s
user    0m0.456s
sys     0m0.072s
```

Author: Max Mizikar <>

Closes #902 from maxmzkr/fast-visit and squashes the following commits:

acc01d0 [Max Mizikar] Make lines shorter than 80 characters
dba29c4 [Max Mizikar] Remove extra line
2816bca [Max Mizikar] Speed up many levels of queries
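
The commit message above attributes the speedup to caching the graph traversal. As a rough illustration of why that helps, and not the actual change made in this commit, here is a minimal sketch. It assumes a hypothetical `Node` class standing in for an expression node, and a hypothetical `build_nested` helper in which every level references the previous level twice, loosely mimicking a query that reuses the same subquery in more than one place. None of these names come from Ibis.

```python
class Node:
    """Hypothetical expression node (not an Ibis class)."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = tuple(children)


def count_visits_naive(node):
    """Walk the graph with no cache: shared subgraphs are walked again
    every time they are reached, so work doubles with each level."""
    return 1 + sum(count_visits_naive(child) for child in node.children)


def count_visits_cached(node, seen=None):
    """Walk the graph while remembering which nodes were already seen,
    so each distinct node is processed only once."""
    if seen is None:
        seen = set()
    if id(node) in seen:
        return 0  # already handled; skip the whole subgraph
    seen.add(id(node))
    return 1 + sum(count_visits_cached(child, seen) for child in node.children)


def build_nested(levels):
    """Hypothetical helper: each level refers to the previous level twice."""
    node = Node("base")
    for i in range(levels):
        node = Node("level%d" % i, children=(node, node))
    return node


root = build_nested(10)
print(count_visits_naive(root))   # 2047 visits without a cache
print(count_visits_cached(root))  # 11 visits, one per distinct node
```

With ten levels the uncached walk touches 2047 nodes while the cached walk touches 11, which is the same kind of gap as the drop from roughly 10.9s to 0.5s reported in the timings above, although the real compiler does different work per node.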
LICENSES Bug fixes and various conveniences in HDFS + DDL utilities Jun 11, 2015
conda-recipes DEV: Bump impyla requirements Apr 3, 2016
dev Add a TableExpr.rename method Aug 24, 2015
docs Documentation updates for 0.8 release May 19, 2016
ibis Speed up compiling many nested subqueries Jan 12, 2017
scripts Properly cast types when generating test data May 4, 2016
testing/udf DEV: Updates for Impala 2.5 + native build toolchain Mar 13, 2016
.coveragerc Add async query execution API, initial 0.5 release notes Sep 5, 2015
.gitattributes DEV: add versioneer for more effortless __version__ management. Close #… Oct 8, 2015
.gitignore ENH: Partition metadata management APIs and ALTER TABLE support for t… Nov 22, 2015
.landscape.yaml TST: Jenkins script must pass prospector (flake8) or fail Sep 4, 2015
LICENSE.txt Packaging, requirements, user testing, licensing Mar 11, 2015
DEV: add versioneer for more effortless __version__ management. Close #… Oct 8, 2015
Makefile SQLAlchemy translator backend and preliminary SQLite client Aug 31, 2015
Add a circle-ci build status badge Jul 9, 2016
circle.yml Draft of column lineage API Oct 16, 2016
requirements.txt Pin pytest to <= 2.9.2 (#885) Oct 3, 2016
setup.cfg DEV: add versioneer for more effortless __version__ management. Close #… Oct 8, 2015
BUG: add postgres submodules to installation targets May 19, 2016
DEV: add versioneer for more effortless __version__ management. Close #… Oct 8, 2015



Ibis: Python data analysis framework for Hadoop and SQL engines

Ibis is a toolbox to bridge the gap between local Python environments and remote storage and execution systems like Hadoop components (HDFS, Impala, Hive, Spark) and SQL databases (Postgres, etc.). Its goal is to simplify analytical workflows and make you more productive.

Install Ibis from PyPI with:

$ pip install ibis-framework

At this time, Ibis provides tools for interacting with the following systems:

Learn more about using the library in the project documentation, and read the project blog for news and updates.