Showing with 2,130 additions and 308 deletions.
  1. +15 −40 README.md
  2. +1 −1 conda-recipes/ibis-framework/build.sh
  3. +2 −2 conda-recipes/ibis-framework/meta.yaml
  4. +6 −5 conda-recipes/impyla/meta.yaml
  5. +8 −0 conda-recipes/sasl/bld.bat
  6. +9 −0 conda-recipes/sasl/build.sh
  7. +63 −0 conda-recipes/sasl/meta.yaml
  8. +42 −6 conda-recipes/thrift_sasl/meta.yaml
  9. +1 −1 conda-recipes/thriftpy/build.sh
  10. +8 −4 conda-recipes/thriftpy/meta.yaml
  11. +1 −1 docs/build-notebooks.py
  12. +21 −0 docs/source/api.rst
  13. +1 −1 docs/source/impala.rst
  14. +44 −31 docs/source/index.rst
  15. +26 −0 docs/source/release.rst
  16. +1 −36 ibis/__init__.py
  17. +4 −1 ibis/compat.py
  18. +36 −74 ibis/expr/analysis.py
  19. +1 −1 ibis/expr/api.py
  20. +1 −1 ibis/expr/datatypes.py
  21. +11 −3 ibis/expr/groupby.py
  22. +7 −6 ibis/expr/operations.py
  23. +8 −0 ibis/expr/tests/test_analysis.py
  24. +3 −3 ibis/expr/tests/test_sql_builtins.py
  25. +1 −1 ibis/expr/tests/test_string.py
  26. +15 −0 ibis/expr/tests/test_table.py
  27. +5 −0 ibis/expr/tests/test_value_exprs.py
  28. +25 −9 ibis/expr/types.py
  29. +20 −9 ibis/impala/compiler.py
  30. +14 −0 ibis/impala/tests/test_exprs.py
  31. +4 −2 ibis/impala/tests/test_udf.py
  32. +100 −22 ibis/sql/alchemy.py
  33. +9 −6 ibis/sql/compiler.py
  34. +50 −0 ibis/sql/postgres/api.py
  35. +109 −0 ibis/sql/postgres/client.py
  36. +527 −0 ibis/sql/postgres/compiler.py
  37. +81 −0 ibis/sql/postgres/tests/common.py
  38. +81 −0 ibis/sql/postgres/tests/test_client.py
  39. +586 −0 ibis/sql/postgres/tests/test_functions.py
  40. +85 −0 ibis/sql/tests/test_compiler.py
  41. +2 −2 ibis/sql/tests/test_sqlalchemy.py
  42. +1 −1 ibis/tests/conftest.py
  43. +2 −1 requirements.txt
  44. +93 −38 scripts/test_data_admin.py
55 changes: 15 additions & 40 deletions README.md
@@ -1,51 +1,26 @@
[![codecov.io](http://codecov.io/github/cloudera/ibis/coverage.svg?branch=master)](http://codecov.io/github/cloudera/ibis?branch=master)

Current release from Anaconda.org [![Anaconda-Server Badge](https://anaconda.org/conda-forge/ibis-framework/badges/version.svg)](https://anaconda.org/conda-forge/ibis-framework)

# Ibis: Python data analysis framework for Hadoop and SQL engines

Ibis is a toolbox to bridge the gap between local Python environments and
remote storage and execution systems like Hadoop components (HDFS, Impala,
Hive, Spark) and SQL databases (Postgres, etc.). Its goal is to simplify
analytical workflows and make you more productive.

Install Ibis from PyPI with:

$ pip install ibis-framework

Ibis is a Python data analysis library with a handful of related goals:

- Enable data analysts to transition analytics on SQL engines to Python code
instead of SQL code.
- Provide high level analytics APIs and workflow tools to accelerate
productivity.
- Provide high performance extensions for the Impala MPP query engine to enable
high performance Python code to operate in a scalable Hadoop-like environment
- Abstract away database-specific SQL differences
- Integrate with the Python data ecosystem using the above tools
At this time, Ibis provides tools for interacting with the following
systems:

At this time, Ibis supports the following SQL-based systems:

- Impala (on HDFS)
- [Apache Impala (incubating)](http://impala.io/)
- [Apache Kudu (incubating)](http://getkudu.io)
- Hadoop Distributed File System (HDFS)
- PostgreSQL (Experimental)
- SQLite

Ibis is being designed and led by the creator of pandas
(github.com/pydata/pandas) and is intended to have a familiar user interface
for folks used to small data on single machines in Python.

Architecturally, Ibis features:

- A pandas-like domain specific language (DSL) designed specifically for
analytics, aka **Ibis expressions**, that enable composable, reusable
analytics on structured data. If you can express something with a SQL SELECT
query, you can write it with Ibis.
- A translation system that targets multiple SQL systems
- Tools for wrapping user-defined functions in Impala and eventually other SQL
engines

SQL engine support near on the horizon:

- PostgreSQL
- Redshift
- Vertica
- Spark SQL
- Presto
- Hive
- MySQL / MariaDB

Read the project blog at http://blog.ibis-project.org.

Learn much more at http://ibis-project.org.
Learn more about using the library at http://docs.ibis-project.org and read the
project blog at http://ibis-project.org for news and updates.
2 changes: 1 addition & 1 deletion conda-recipes/ibis-framework/build.sh
@@ -2,7 +2,7 @@

$PYTHON setup.py install

$PYTHON -c "import ibis; print(ibis.__version__)" > __conda_version__.txt
$PYTHON -c "import ibis; print(ibis.__version__.replace('v', '').replace('+', '_'))" > __conda_version__.txt

# Add more build steps here, if they are necessary.
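
A note on the new version expression in build.sh: the following is a minimal Python sketch of what the two chained `replace` calls produce. The helper name `conda_version` and the sample version strings are hypothetical, chosen only for illustration.

```python
def conda_version(version):
    # Same expression as in build.sh: drop every 'v', then turn '+' into '_'.
    return version.replace('v', '').replace('+', '_')


# Hypothetical version strings, not taken from the repository.
print(conda_version('v0.7.0+15.g1a2b3c4'))  # 0.7.0_15.g1a2b3c4
print(conda_version('0.8.0.dev1'))          # 0.8.0.de1
```

Note that `replace('v', '')` removes every `v`, not only a leading one, so a version string containing `dev` would lose a character.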

4 changes: 2 additions & 2 deletions conda-recipes/ibis-framework/meta.yaml
@@ -12,7 +12,7 @@ requirements:
- pytest
- numpy >=1.7.0
- pandas >=0.12.0
- impyla >=0.12.0
- impyla >=0.13.6
- hdfs >=2.0.0
- sqlalchemy >=1.0.0
- six
@@ -23,7 +23,7 @@ requirements:
- pytest
- numpy >=1.7.0
- pandas >=0.12.0
- impyla >=0.12.0
- impyla >=0.13.6
- hdfs >=2.0.0
- sqlalchemy >=1.0.0
- six
11 changes: 6 additions & 5 deletions conda-recipes/impyla/meta.yaml
@@ -1,9 +1,10 @@
package:
name: impyla
version: "0.12.0"
version: "0.13.5"

source:
git_url: https://github.com/cloudera/impyla
git_rev: v0.13.5

build:
number: {{ environ.get('GIT_DESCRIBE_NUMBER', 0) }}
@@ -14,17 +15,17 @@ requirements:
- python
- setuptools
- six
- thrift_sasl
- bitarray
- thriftpy
- thrift
- thriftpy >=0.3.5

run:
- python
- setuptools
- six
- thrift_sasl
- bitarray
- thriftpy
- thriftpy >=0.3.5
- thrift

test:
imports:
8 changes: 8 additions & 0 deletions conda-recipes/sasl/bld.bat
@@ -0,0 +1,8 @@
"%PYTHON%" setup.py install
if errorlevel 1 exit 1

:: Add more build steps here, if they are necessary.

:: See
:: http://docs.continuum.io/conda/build.html
:: for a list of environment variables that are set during the build process.
9 changes: 9 additions & 0 deletions conda-recipes/sasl/build.sh
@@ -0,0 +1,9 @@
#!/bin/bash

$PYTHON setup.py install

# Add more build steps here, if they are necessary.

# See
# http://docs.continuum.io/conda/build.html
# for a list of environment variables that are set during the build process.
63 changes: 63 additions & 0 deletions conda-recipes/sasl/meta.yaml
@@ -0,0 +1,63 @@
package:
name: sasl
version: "0.2.1"

source:
fn: sasl-0.2.1.tar.gz
url: https://pypi.python.org/packages/source/s/sasl/sasl-0.2.1.tar.gz
md5: ca093d9a3d6f20b79b964a5e5add0202
# patches:
# List any patch files here
# - fix.patch

# build:
# noarch_python: True
# preserve_egg_dir: True
# entry_points:
# Put any entry points (scripts to be generated automatically) here. The
# syntax is module:function. For example
#
# - sasl = sasl:main
#
# Would create an entry point called sasl that calls sasl.main()


# If this is a new build for the same version, increment the build
# number. If you do not include this key, it defaults to 0.
# number: 1

requirements:
build:
- python
- setuptools
- six

run:
- python
- six

test:
# Python imports
imports:
- sasl

# commands:
# You can put test commands to be run here. Use this to test that the
# entry points work.


# You can also put a file called run_test.py in the recipe that will be run
# at test time.

# requires:
# Put any additional test requirements here. For example
# - nose

about:
home: http://github.com/toddlipcon/python-sasl
license: UNKNOWN
summary: 'Cyrus-SASL bindings for Python'

# See
# http://docs.continuum.io/conda/build.html for
# more information about meta.yaml
48 changes: 42 additions & 6 deletions conda-recipes/thrift_sasl/meta.yaml
@@ -1,29 +1,65 @@
package:
name: thrift_sasl
version: "0.1.0"
version: "0.2.0"

source:
fn: thrift_sasl-0.1.0.tar.gz
url: https://pypi.python.org/packages/source/t/thrift_sasl/thrift_sasl-0.1.0.tar.gz
md5: 0710ffa4ed33a657090a8305fd71ca1e
fn: thrift_sasl-0.2.0.tar.gz
url: https://pypi.python.org/packages/source/t/thrift_sasl/thrift_sasl-0.2.0.tar.gz
md5: 689cbd85d570b2494cdc338d7d261376
# patches:
# List any patch files here
# - fix.patch

# build:
# noarch_python: True
# preserve_egg_dir: True
# entry_points:
# Put any entry points (scripts to be generated automatically) here. The
# syntax is module:function. For example
#
# - thrift_sasl = thrift_sasl:main
#
# Would create an entry point called thrift_sasl that calls thrift_sasl.main()


# If this is a new build for the same version, increment the build
# number. If you do not include this key, it defaults to 0.
# number: 1

requirements:
build:
- python
- setuptools
- sasl >=0.2.1
- thrift
- thriftpy

run:
- python
- sasl >=0.2.1
- thrift
- thriftpy

test:
# Python imports
imports:
- thrift_sasl

# commands:
# You can put test commands to be run here. Use this to test that the
# entry points work.


# You can also put a file called run_test.py in the recipe that will be run
# at test time.

# requires:
# Put any additional test requirements here. For example
# - nose

about:
home: https://github.com/cloudera/thrift_sasl
license: Apache License, Version 2.0
summary: 'Thrift SASL Python module that implements SASL transports for Thrift (`TSaslClientTransport`).'

# See
# http://docs.continuum.io/conda/build.html for
# more information about meta.yaml
2 changes: 1 addition & 1 deletion conda-recipes/thriftpy/build.sh
@@ -1,5 +1,5 @@
#!/bin/bash

$PYTHON setup.py build_ext
$PYTHON setup.py install

# Add more build steps here, if they are necessary.
12 changes: 8 additions & 4 deletions conda-recipes/thriftpy/meta.yaml
@@ -1,17 +1,21 @@
package:
name: thriftpy
version: "0.3.2"
version: "0.3.5"

source:
fn: thriftpy-0.3.2.tar.gz
url: https://pypi.python.org/packages/source/t/thriftpy/thriftpy-0.3.2.tar.gz
md5: 7e882aac1d3999af3bb29a6b65ed810f
git_url: https://github.com/eleme/thriftpy
git_rev: v0.3.5

build:
number: {{ environ.get('GIT_DESCRIBE_NUMBER', 0) }}
string: py{{ environ.get('PY_VER').replace('.', '') }}_{{ environ.get('GIT_BUILD_STR', 'GIT_STUB') }}

requirements:
build:
- python
- setuptools
- ply >=3.4,<4.0
- cython

run:
- python
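
A note on the new `build: string:` template in this recipe: the sketch below mirrors the Jinja expression in plain Python to show how the pieces combine. The function name `build_string` and the sample environment values are hypothetical; the real values are supplied by conda-build at build time.

```python
def build_string(environ):
    # Mirrors: py{{ environ.get('PY_VER').replace('.', '') }}_{{ environ.get('GIT_BUILD_STR', 'GIT_STUB') }}
    py_ver = environ.get('PY_VER')  # no default, as in the recipe
    git_build = environ.get('GIT_BUILD_STR', 'GIT_STUB')
    return 'py{0}_{1}'.format(py_ver.replace('.', ''), git_build)


# Hypothetical environment values for illustration only.
print(build_string({'PY_VER': '2.7'}))
print(build_string({'PY_VER': '3.5', 'GIT_BUILD_STR': '0_g1a2b3c4'}))
```

Because `PY_VER` has no fallback in the template, this sketch fails with `AttributeError` when the variable is missing, while a missing `GIT_BUILD_STR` falls back to `GIT_STUB`.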
2 changes: 1 addition & 1 deletion docs/build-notebooks.py
@@ -5,7 +5,7 @@


def rstify_notebook(path, outpath):
cmd = ('ipython nbconvert --execute --to=rst {0} --output {1}'
cmd = ('jupyter nbconvert --execute --to=rst {0} --output {1}'
.format(path, outpath))

print cmd
21 changes: 21 additions & 0 deletions docs/source/api.rst
@@ -135,6 +135,26 @@ Executing expressions
ImpalaClient.execute
ImpalaClient.disable_codegen

.. _api.postgres:

PostgreSQL client
-----------------
.. currentmodule:: ibis.sql.postgres.api

The PostgreSQL client is accessible through the ``ibis.postgres`` namespace.

Use ``ibis.postgres.connect`` with a SQLAlchemy-compatible connection string to
create a client.

.. autosummary::
:toctree: generated/

connect
PostgreSQLClient.database
PostgreSQLClient.list_tables
PostgreSQLClient.list_databases
PostgreSQLClient.table
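
A note on the connection string mentioned in the new PostgreSQL section: the sketch below builds a SQLAlchemy-style URL using only the standard library. The helper name `postgres_url` and all credentials are hypothetical, and how `ibis.postgres.connect` accepts the string (positionally or by keyword) is an assumption not confirmed by this diff, so that call is left as a comment.

```python
from urllib.parse import quote_plus


def postgres_url(user, password, host, port, database):
    # Percent-encode user and password so characters such as '@' or ':'
    # do not break URL parsing.
    return 'postgresql://{0}:{1}@{2}:{3}/{4}'.format(
        quote_plus(user), quote_plus(password), host, port, database)


# Hypothetical credentials and database name.
url = postgres_url('ibis', 'p@ss', 'localhost', 5432, 'ibis_testing')
print(url)

# Assumed usage, per the documentation text above (signature not verified):
# client = ibis.postgres.connect(url)
# client.list_tables()
```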

.. _api.sqlite:

SQLite client
@@ -243,6 +263,7 @@ Table methods
TableExpr.get_column
TableExpr.get_columns
TableExpr.group_by
TableExpr.groupby
TableExpr.limit
TableExpr.mutate
TableExpr.projection
2 changes: 1 addition & 1 deletion docs/source/impala.rst
@@ -178,7 +178,7 @@ TABLE .. AS SELECT`` (CTAS) statement:
freqs.execute()
files = freqs.files()
files.path[0]
files
freqs.drop()