These release notes are for versions of ibis **1.0 and later**. Release
notes for pre-1.0 versions of ibis can be found at :doc:`release-pre-1.0`
* :support:`2678` Improvement of the backend API. The former `Client` subclasses have been replaced by a `Backend` class that must
subclass `ibis.backends.base.BaseBackend`. The `BaseBackend` class contains abstract methods for the minimum subset of methods that
backends must implement, and their signatures have been standardized across backends. The Ibis compiler has been refactored, and
backends don't need to implement all compiler classes anymore if the default works for them. Only a subclass of
`ibis.backends.base.sql.compiler.Compiler` is now required. Backends now need to register themselves as entry points.
* :support:`2905` Deprecate `exists_table(table)` in favor of `table in list_tables()`
* :bug:`2991` Fix data races in impala connection pool accounting
* :bug:`2985` Fix null literal compilation in the Clickhouse backend
* :bug:`2984` Fix order of limit and offset parameters in the Clickhouse backend
* :support:`2977` Remove handwritten type parser; parsing errors that were previously `IbisTypeError` are now `parsy.ParseError`. `parsy` is now a hard requirement.
* :support:`2962` Methods `current_database` and `list_databases` raise an exception for backends that do not support databases
* :bug:`2956` Rename the `equals` operation for geospatial datatypes to `geo_equals`
* :support:`2913` Method `set_database` has been deprecated, in favor of creating a new connection to a different database
* :feature:`2938` Serialization-deserialization of Node via pickle is now byte compatible between different processes
* :support:`2914` Removed `log` method of clients, in favor of `verbose_log` option
* :feature:`2916` Support joining on different columns in ClickHouse backend
* :feature:`2908` Support summarization of empty data in Pandas backend
* :support:`2883` Output of `Client.version` returned as a string, instead of a setuptools `Version`
* :feature:`2882` Unify implementation of fillna and isna in Pyspark backend
* :support:`2862` Deprecated `list_schemas` in SQLAlchemy backends in favor of `list_databases`
* :bug:`2829` Fix .drop(fields). The argument can now be either a list of strings or a string.
* :feature:`2873` Support binary operation with Timedelta in Pyspark backend
* :support:`2865` Deprecated `ibis.<backend>.verify()` in favor of capturing exception in `ibis.<backend>.compile()`
* :bug:`2845` Fix projection on differences and intersections for SQL backends
* :feature:`2839` Add `group_concat` operation for Clickhouse backend
* :bug:`2827` Backends are loaded in a lazy way, so third-party backends can import Ibis without circular imports
* :bug:`2830` Disable aggregation optimization due to N squared performance
* :bug:`2821` Fix `.cast()` to array outputting list instead of np.array in Pandas backend
* :bug:`2820` Fix aggregation with mixed reduction datatypes (array + scalar) on Dask backend
* :feature:`2808` Support comparison of ColumnExpr to timestamp literal
* :support:`2789` Simplification of data fetching. Backends don't need to implement `Query` anymore
* :feature:`2805` Make op schema a cached property
* :feature:`2613` :feature:`2778` Implement `.insert()` for SQLAlchemy backends
* :feature:`2792` Infer categorical and decimal Series to more specific Ibis types in Pandas backend
* :feature:`2790` Add `startswith` and `endswith` operations
* :feature:`2776` :feature:`2797` Allow more flexible return type for UDFs
* :feature:`2779` Implement Clip in the Pyspark backend
* :bug:`2770` Fix error when using reduction UDF that returns np.array in a grouped aggregation
* :feature:`2753` Use `ndarray` as array representation in Pandas backend
* :support:`2665` Move BigQuery backend to a `separate repository <https://github.com/ibis-project/ibis-bigquery>`_.
The backend will be released separately. Install it with `pip install ibis-bigquery` or `conda install ibis-bigquery`,
and then use it as before.
* :bug:`2712` Fix time context trimming error for multi column udfs in pandas backend
* :bug:`2710` Fix error during compilation of range_window in base_sql backends (:issue:`2608`)
* :feature:`2687` Support Spark filter with window operation
* :bug:`2696` Fix wrong row indexing in the result for 'window after filter' for timecontext adjustment
* :bug:`2702` Fix `aggregate` exploding the output of Reduction ops that return a list/ndarray
* :bug:`2693` Fix issues with context adjustment for filter with PySpark backend
* :support:`2689` Support SQLAlchemy 1.4 and require at least 1.3
* :support:`2680` Namespace time_col config, fix type check for trim_with_timecontext for pandas window execution
* :feature:`2646` Support context adjustment for udfs for pandas backend
* :feature:`2655` Add `auth_local_webserver`, `auth_external_data`, and
`auth_cache` parameters to BigQuery connect method. Set
`auth_local_webserver` to use a local server instead of copy-pasting an
authorization code. Set `auth_external_data` to true to request additional
scopes required to query Google Drive and Sheets. Set `auth_cache` to
`reauth` or `none` to force reauthentication.
* :bug:`2657` Add temporary struct col in pyspark backend to ensure that UDFs are executed only once
* :bug:`2588` Fix BigQuery connect bug that ignored project ID parameter
* :bug:`2636` Fix overwrite logic to account for DestructColumn inside mutate API
* :feature:`2641` Add `bit_and`, `bit_or`, and `bit_xor` integer column aggregates (BigQuery and MySQL backends)
* :feature:`2379` Backends are defined as entry points
* :bug:`2635` Fix fusion optimization bug that incorrectly changes operation order
* :feature:`2615` Add `ibis.array` for creating array expressions
* :feature:`2607` Implement Not operation in PySpark backend
* :feature:`2610` Added support for case/when in PySpark backend
* :bug:`2610` Fix an NPE issue with `substr` in PySpark backend
* :feature:`2603` Add support for np.array as literals for backends that already support lists as literals
* :bug:`2354` Fixes binary data type translation into BigQuery bytes data type
* :bug:`2577` Make StructValue picklable
* :support:`2505` Remove deprecated `ibis.HDFS`, `ibis.WebHDFS` and `ibis.hdfs_connect`
* :feature:`2514` Add Struct.from_dict
* :feature:`2310` Add hash and hashbytes support for BigQuery backend
* :feature:`2511` Support reduction UDF without groupby to return multiple columns for Pandas backend
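
The entry-point registration mentioned in the backend API entries above can be inspected from Python with the standard library. This is only a minimal sketch: the group name `ibis.backends` and the helper `discover_backends` are assumptions made for illustration and are not stated in these notes.

```python
from importlib.metadata import entry_points


def discover_backends(group="ibis.backends"):
    """Return a dict mapping entry-point names to entry points for `group`.

    The default group name is an assumption made for this sketch.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        # Newer Python versions expose a selectable collection.
        selected = eps.select(group=group)
    else:
        # Older Python versions return a plain dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep for ep in selected}


# Lists whatever is registered in the current environment (possibly nothing).
print(sorted(discover_backends()))
```

Calling `.load()` on one of the returned entry points would import the object it refers to; the sketch stops at discovery so that it runs whether or not any backend is installed.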
"Note that if you installed Ibis with `pip` instead of `conda`, you may need to install the SQLite backend separately with `pip install ibis-framework[sqlite]`.\n",
"Note that if you installed Ibis with `pip` instead of `conda`, you may need to install the SQLite backend separately with `pip install 'ibis-framework[sqlite]'`.\n",
"What's supported is pretty basic right now. We intend to support the full gamut of regular expression munging with a nice API, though in some cases some work will be required on Impala's backend to support everything. "
"What's supported is pretty basic right now. We intend to support the full gamut of regular expression munging with a nice API, though in some cases some work will be required on SQLite's backend to support everything. "
"Some backends support adding offsets, for example `independence.independence_date + ibis.interval(days=1)` or `ibis.now() - independence.independence_date`."
"The filtering examples we've shown to this point have been pretty simple, either comparisons between columns or fixed values, or set filter functions like `isin` and `notin`. \n",
"\n",
"Ibis supports a number of richer analytical filters that can involve one or more of:\n",
"\n",
"- Aggregates computed from the same or other tables\n",
"- Conditional aggregates (in SQL-speak these are similar to \"correlated subqueries\")\n",
"- \"Existence\" set filters (equivalent to the SQL `EXISTS` and `NOT EXISTS` keywords)"
"We could always compute an aggregate value from the table and use it in another expression, or we can use a data-derived aggregate directly in the filter. Take the average of a column, for example the average country size:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"countries.area_km2.mean()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can use this expression as a substitute for a scalar value in a filter, and the execution engine will combine everything into a single query rather than having to access the database multiple times. For example, we want to filter European countries larger than the average country size in the world. See how most countries in Europe are smaller than the world average:"
"Suppose that we wish to filter using an aggregate computed conditional on some other expressions holding true.\n",
"\n",
"For example, we want to filter European countries larger than the average country size, but this time compared with the average in Africa. African countries are smaller on average than the world average, so France gets into the list:"
"Some filtering involves checking for the existence of a particular value in a column of another table, or among the results of some value expression. This is common in many-to-many relationships, and can be performed in numerous different ways, but it's nice to be able to express it with a single concise statement and let Ibis compute it optimally.\n",
"\n",
"An example could be finding all countries that had **any** year with a higher GDP than 3 trillion US dollars:"
"Note how this is different than a join between `countries` and `gdp`, which would return one row per year. The method `.any()` is equivalent to filtering with a subquery."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Filtering in aggregations\n",
"\n",
"\n",
"Suppose that you want to compute an aggregation with a subset of the data for _only one_ of the metrics / aggregates in question, and the complete data set with the other aggregates. Most aggregation functions are thus equipped with a `where` argument. Let me show it to you in action:"
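
As a rough illustration of the idea, the sketch below uses the standard-library `sqlite3` module to compute one aggregate over the whole table and another over a subset in a single query. The table contents are made-up toy values, and the `CASE WHEN` form is only an assumption about the kind of SQL a filtered aggregate corresponds to, not necessarily what Ibis generates.

```python
import sqlite3

# Toy data: hypothetical names and values, for illustration only.
rows = [
    ("France", "EU", 67),
    ("Spain", "EU", 47),
    ("Kenya", "AF", 54),
    ("Chile", "SA", 19),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE countries (name TEXT, continent TEXT, population INTEGER)"
)
conn.executemany("INSERT INTO countries VALUES (?, ?, ?)", rows)

# One aggregate uses every row; the other only counts rows where the
# condition holds, because CASE yields NULL otherwise and SUM skips NULLs.
total, eu_total = conn.execute(
    """
    SELECT SUM(population),
           SUM(CASE WHEN continent = 'EU' THEN population END)
    FROM countries
    """
).fetchone()

print(total, eu_total)
conn.close()
```

Both metrics come back from one pass over the table, which is the point of filtering inside the aggregation rather than filtering the table first.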