FEAT: Use pandas rolling apply to implement rows_with_max_lookback
The offset interval specified in RowsWithMaxLookback should be used in the determination of each row's window, not just the first n rows. For the pandas backend, rolling apply can be used to implement this functionality. Author: Emily Reff <emily.reff@twosigma.com> Closes #1868 from emilyreff7/maxlookback and squashes the following commits: 42a0e89 [Emily Reff] Break up test and if/else logic for window combine 563e082 [Emily Reff] Add max_lookback to slots dc84822 [Emily Reff] Use pandas rolling apply to implement rows_with_max_lookback
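A minimal sketch of the idea, not the code from this PR: a time-based pandas rolling window bounds each row's lookback, and the applied function then keeps only the last few rows of that window. The helper name `sum_rows_with_max_lookback` and its parameters are hypothetical, and `rows` here counts the current row as well.

```python
import pandas as pd


def sum_rows_with_max_lookback(series, rows, max_lookback):
    """Sum at most `rows` trailing rows that fall inside `max_lookback`.

    Hypothetical helper for illustration; `series` needs a DatetimeIndex.
    """

    def last_rows_sum(window):
        # `window` already contains only rows inside the time offset,
        # so trimming to the last `rows` entries applies the row limit.
        return window.iloc[-rows:].sum()

    # raw=False hands each window to the function as a Series.
    return series.rolling(max_lookback).apply(last_rows_sum, raw=False)


index = pd.to_datetime(
    ['2019-01-01', '2019-01-02', '2019-01-03', '2019-01-10']
)
values = pd.Series([1.0, 2.0, 3.0, 4.0], index=index)
result = sum_rows_with_max_lookback(values, rows=2, max_lookback='3D')
```

The last row is a week after its neighbour, so its window contains only itself: the offset applies to every row, not just the first ones.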
BUG: Fixed issues with geo data
In this PR: - Update OmniSci CI to v4.7.0 - Fix polygon format on tests - Fix postgis data load to keep field called `id` Depends on ibis-project/testing-data#2 (merged) Author: Ivan Ogasawara <ivan.ogasawara@gmail.com> Closes #1872 from xmnlab/fix-geo-data-test and squashes the following commits: 15697e2 [Ivan Ogasawara] Fixed geospatial tests 3e5e504 [Ivan Ogasawara] Fixed issues with geo data
Author: amroid <amr@dt3.org> Closes #1807 from amroid/spark-client and squashes the following commits: 300a71c [amroid] Merge branch 'master' of https://github.com/ibis-project/ibis into spark-client 83a50e8 [amroid] Improve schema checks in spark test_client.py, xfail map<struct<...>, ...> schema conversion test, fix results parameter in _execute in spark client 87d6d47 [amroid] Incorporate feedback for spark-client, fix bug in translating pyspark structs to ibis 411bccf [amroid] Add java to ci dockerfile 2bdbf9d [amroid] Add pytest importorskip for pyspark import error 24c1a79 [amroid] Merge branch 'master' of https://github.com/ibis-project/ibis into spark-client 3f195c0 [amroid] Merge branch 'master' of https://github.com/ibis-project/ibis into spark-client e884ad8 [amroid] incorporate some changes from spark-tests branch 156f960 [amroid] incorporated suggestions for spark-client fc9a096 [amroid] added pyspark>=2.4.3 to requirements-3.x-dev.yml files 5699161 [amroid] fixed bug where SparkUnion subclassed from ImpalaUnion which did not exist dc3608b [amroid] fixed PR changes 66067d9 [amroid] added kwargs for SparkClient and SparkContext bbb7244 [amroid] added spark client, compiler, some unit tests
BUG: Fix the case where we do not have an index when using preceding with intervals
Small bug in the pandas backend when ordering by a particular key and not grouping. Author: Phillip Cloud <cpcloud@gmail.com> Closes #1876 from cpcloud/fix-window-group-order and squashes the following commits: 3313cd5 [Phillip Cloud] BUG: Fix the case where we do not have an index when using preceding with intervals
Author: amroid <amr@dt3.org> Closes #1830 from amroid/spark-tests and squashes the following commits: 7430d44 [amroid] Small changes from PR comments 54f948c [amroid] Tests now pass 6bae62c [amroid] Add changes from PR review 8cb80ae [amroid] Correct nullable behavior for spark to ibis type translation, fix tests for Spark backend ebd8f08 [amroid] Break up spark to ibis type conversion into separate dt.dtype registered functions 1083348 [amroid] Merge branch 'master' of https://github.com/ibis-project/ibis into spark-tests 8c398ca [amroid] incorporated suggestions for spark-client 87a2ece [amroid] changed imports to not fail CI c1b122f [amroid] added string concat 2aa6d7a [amroid] test_temporal now mostly works with Spark backend with some known issues, added floor option in convert_unit in util.py 45e80c6 [amroid] test_param now works with Spark backend 2ea6ac8 [amroid] test_numeric now works with Spark backend, changed impala compiler implementation of _number_literal_format a8ecb0e [amroid] test_column, test_generic now work with Spark backend (test_column required no changes) f934e40 [amroid] test_client now works with Spark backend, fixed mistake in test_sql in test_client.py c96cb58 [amroid] test_client now works with Spark backend, fixed mistake in test_sql in test_client.py 3b30726 [amroid] added pyspark>=2.4.3 to requirements-3.x-dev.yml files 0560ada [amroid] test_array now works with Spark backend, fixed Spark table creation and tests bc9fff7 [amroid] test_aggregation now works with Spark backend, changed base compiler rewrite for any, notany, all, notall to use max and min instead of sum 54422ef [amroid] test_string now works with Spark backend, added xfails and xpasses to regex tests in test_string 0d7bb67 [amroid] fixed SparkUnion bug d55c64a [amroid] Merge branch 'spark-client' into spark-tests dc3608b [amroid] fixed PR changes 78f8fc1 [amroid] added Spark subclass of Backend 66067d9 [amroid] added kwargs for SparkClient and SparkContext bbb7244 
[amroid] added spark client, compiler, some unit tests
BUG: Fix according to bug in pd.to_datetime when passing the unit flag
BUG: Make Nodes enforce the proper signature
Author: amroid <amr@dt3.org> Author: Phillip Cloud <cpcloud@gmail.com> Closes #1891 from amroid/node-signature-fix and squashes the following commits: bd9240a [amroid] Remove pandas version constraints 8d33a4d [Phillip Cloud] FIX: Constrain pandas on windows as well 2d21fd9 [amroid] Add pandas <0.25 to requirements 325f2f5 [amroid] Add test for too many and too few args given to operations e4a93bc [amroid] Remove use_spheroid argument in geospatial operations a16a431 [amroid] Fix TypeSignature validate method so that it uses inspect and has better argument checking with the signature (now errors if too many arguments are passed to a UDF)
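As a rough illustration of signature-based argument checking with `inspect` (this is not the ibis `TypeSignature` code; `validate_args` and `add` are hypothetical names), `inspect.Signature.bind` raises `TypeError` when too many or too few arguments are supplied:

```python
import inspect


def validate_args(func, *args, **kwargs):
    """Bind the arguments to func's signature; TypeError on a mismatch."""
    signature = inspect.signature(func)
    # bind() raises TypeError for missing or surplus arguments.
    bound = signature.bind(*args, **kwargs)
    bound.apply_defaults()
    return dict(bound.arguments)


def add(a, b=1):
    return a + b


def rejects(*args):
    """Return True if the arguments do not fit add's signature."""
    try:
        validate_args(add, *args)
    except TypeError:
        return True
    return False


ok = validate_args(add, 2)
too_many = rejects(1, 2, 3)
too_few = rejects()
```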
BUG: Fix various Spark backend issues
Author: amroid <amr@dt3.org> Closes #1888 from amroid/spark-small-fixes and squashes the following commits: 08f2135 [amroid] Small tweaks to spark test_api and moving imports 87f12fe [amroid] Remove pandas version constraints cdb9947 [amroid] Use importorskip in global spark client testing method in ibis/tests/all/conftest.py 1dc4b32 [amroid] Revert b15f13a because pandas version restriction <0.25 was added to requirements 9046f75 [amroid] Refactor spark testing client code into ibis/all/tests/conftest.py 7c56c66 [amroid] Add pandas <0.25 to requirements d624775 [amroid] Refactor spark testing client code into new spark_testing_client.py file in ibis module 0742e22 [amroid] Change global spark testing client fixture to not be a fixture, so importorskip does not skip all backends e4204a2 [amroid] Fix pandas test that started passing an xfailed test due to new pandas release 00a5910 [amroid] Add importorskip for pyspark for test_verify in test_api.py dd511a7 [amroid] Add tests for float translation and functions compile and verify from spark api 1c52748 [amroid] Add to_sql method in Spark compiler so that ibis.spark.compile actually works ab0ca40 [amroid] Added compile and verify methods to spark api 031da23 [amroid] Fix spark float and double literal formatting (replicates part of f029584)
FEAT: Add support for Postgres UDFs
The following functionality is included:

* ability to use user-defined functions (UDFs)
* ability to use UDFs that are already defined in the database (I just want to refer to them and use them)
* ability to wrap a python function, automatically define it as a PL/Python function in the database, and be able to use it with ibis objects
* changed Docker image to support testing of Postgres PL/Python UDF functionality in the ibis test suite
* Tests for UDF functionality

The current API looks like this. Comments and feedback are welcome.

```
import ibis.expr.datatypes as dt
from ibis.sql.postgres.udf.api import existing_udf, func_to_udf

# Refer to a UDF that is already defined in the database.
my_udf = existing_udf(
    'my_udf',
    input_types=[dt.String()],
    output_type=dt.Integer(),
    schema='my_schema'
)
expr = table[my_udf(table['str_col']).name('int_col_output')]


# Wrap a Python function and define it as a PL/Python function.
def mult_a_b(a, b):
    return a * b


mult_a_b_udf = func_to_udf(
    sqlalchemy_engine,
    mult_a_b,
    (dt.Int32(), dt.Int32()),
    dt.Int32(),
    schema='my_schema',
    overwrite=True
)
expr2 = table[
    mult_a_b_udf(table['int_col1'], table['int_col2']).name('int_col_output')
]
```
Author: Scott Hajek <shajek@pivotal.io>
Closes #1871 from scottcode/pg_udf and squashes the following commits:
5f671dd [Scott Hajek] Postgres UDF: make functional style a public-facing API
11a896c [Scott Hajek] Respect dependency of impala on postgres service
84a163a [Scott Hajek] Increase waiter timeout
48b2897 [Scott Hajek] Postgres UDF: changes to docstrings, errors, and test names
5c6cef7 [Scott Hajek] reformat Postgres UDF-related code with `black`
27949f5 [Scott Hajek] Postgres UDF: update postgres docker image
cdea976 [Scott Hajek] Postgres UDF: make `in_types` and `out_type` mandatory arguments
06350ec [Scott Hajek] Postgres UDF: remove UDF decorator API and change sql type translation
b422603 [Scott Hajek] Postgres UDF: remove `func_to_udf` from public-facing API
d906f83 [Scott Hajek] Postgres UDF: refactor of remove_decorators
415e211 [Scott Hajek] Postgres UDF: simplify UDF decorator and type mapping
e15b9e7 [Scott Hajek] Postgres UDF: use parameter name `replace` instead of `overwrite`
1a0a6d0 [Scott Hajek] Postgres UDF tests: simplify test schema creation logic
b95da85 [Scott Hajek] Postgres UDF: add func decorator API
9c0943e [Scott Hajek] Postgres UDF revisions from feedback
02c039d [Scott Hajek] Postgres UDF testing: fix to exclude pl/python from Windows tests
4de98da [Scott Hajek] Postgres UDF testing: mark udf test file with pytest.mark.postgresql
0f4c20d [Scott Hajek] Postgres UDF testing: create pl/python extension, generate random schema
1a97a9a [Scott Hajek] Postgres UDF: add docstrings explaining the test cases
a544971 [Scott Hajek] Change Postgres Docker image to one including PL/Python
92f64b4 [Scott Hajek] Postgres UDF: define a UDF in-database based on a Python function
732c257 [Scott Hajek] Postgres UDF: test for using a UDF that is already defined in-database
8c0ec6e [Scott Hajek] Postgres UDF: use a UDF that is already defined in-database
Closes #1889 Author: amroid <amr@dt3.org> Closes #1885 from amroid/spark-udf and squashes the following commits: d4a1a70 [amroid] Remove unnecessary noqa in test_udf 45cf515 [amroid] Add importorskip for py4j 7dd5541 [amroid] Refactor spark udf code so elementwise_pandas is now elementwise.pandas, some changes to test_udf, fix bug in spark compiler SparkContext class 3134790 [amroid] Merge branch 'master' of https://github.com/ibis-project/ibis into spark-udf e157759 [amroid] Put failing pyspark import after importorskip statement 3e0b114 [amroid] Move spark test_udf.py importorskip for pyspark to apply to whole file 4f8012c [amroid] Remove use_spheroid argument in geospatial operations d240959 [amroid] Remove pandas version constraints 92c3e66 [amroid] Merge branch 'master' of https://github.com/ibis-project/ibis into spark-udf 3336fbc [amroid] Add comments describing failure cases of Spark UDF tests 9be720c [amroid] Add pandas <0.25 to requirements bddfa6b [amroid] Add importorskip for pyspark in spark test_udf 2fd075a [amroid] Change wrapper method to __call__ in SparkUDF class 0194143 [amroid] Fixed tiny thing for linter 2f531cf [amroid] Add to_sql method in Spark compiler so that ibis.spark.compile actually works 068b426 [amroid] Added compile and verify methods to spark api f1dce5d [amroid] Merge branch 'master' of https://github.com/ibis-project/ibis into spark-udf f029584 [amroid] Add spark UDF tests, fix spark UDF output_type issue 7929c72 [amroid] Merge branch 'spark-tests' of https://github.com/amroid/ibis into spark-udf cc31f94 [amroid] Refactor Spark UDF code 0a01534 [amroid] Add pandas UDFs 54f948c [amroid] Tests now pass af9ceb3 [amroid] Fix TypeSignature validate method so that it uses inspect and has better argument checking with the signature (now errors if too many arguments are passed to a UDF) 5c5fc57 [amroid] Add ibis to spark dtype conversion for UDFs 6bae62c
[amroid] Add changes from PR review c8850cc [amroid] Correct nullable behavior for spark to ibis type translation, fix tests for Spark backend 8cb80ae [amroid] Correct nullable behavior for spark to ibis type translation, fix tests for Spark backend c3def6b [amroid] Break up spark to ibis type conversion into separate dt.dtype registered functions ebd8f08 [amroid] Break up spark to ibis type conversion into separate dt.dtype registered functions c2bf387 [amroid] Clean up basic udf functionality eb060ac [amroid] Basic elementwise UDFs 1083348 [amroid] Merge branch 'master' of https://github.com/ibis-project/ibis into spark-tests 8c398ca [amroid] incorporated suggestions for spark-client 87a2ece [amroid] changed imports to not fail CI c1b122f [amroid] added string concat 2aa6d7a [amroid] test_temporal now mostly works with Spark backend with some known issues, added floor option in convert_unit in util.py 45e80c6 [amroid] test_param now works with Spark backend 2ea6ac8 [amroid] test_numeric now works with Spark backend, changed impala compiler implementation of _number_literal_format a8ecb0e [amroid] test_column, test_generic now work with Spark backend (test_column required no changes) f934e40 [amroid] test_client now works with Spark backend, fixed mistake in test_sql in test_client.py c96cb58 [amroid] test_client now works with Spark backend, fixed mistake in test_sql in test_client.py 3b30726 [amroid] added pyspark>=2.4.3 to requirements-3.x-dev.yml files 0560ada [amroid] test_array now works with Spark backend, fixed Spark table creation and tests bc9fff7 [amroid] test_aggregation now works with Spark backend, changed base compiler rewrite for any, notany, all, notall to use max and min instead of sum 54422ef [amroid] test_string now works with Spark backend, added xfails and xpasses to regex tests in test_string 0d7bb67 [amroid] fixed SparkUnion bug d55c64a [amroid] Merge branch 'spark-client' into spark-tests dc3608b [amroid] fixed PR changes 78f8fc1 [amroid] 
added Spark subclass of Backend 66067d9 [amroid] added kwargs for SparkClient and SparkContext bbb7244 [amroid] added spark client, compiler, some unit tests
FEAT: Add geopandas as output for omniscidb
In this PR: - Add geopandas output for omniscidb - Add postgresql backend on geospatial tests - Add SetSRID to postgresql backend - Add geo spatial rules for literal on postgresql backend Author: Ivan Ogasawara <ivan.ogasawara@gmail.com> Closes #1858 from xmnlab/add-omniscidb-geopandas-output and squashes the following commits: 0088881 [Ivan Ogasawara] Fixed typo issues d273c52 [Ivan Ogasawara] add geopandas as output for omniscidb
FEAT: Add shapely geometries as input for literals
In this PR: - Add shapely geometries as input for literals - add tests for shapely geometries as input for literals Author: Ivan Ogasawara <ivan.ogasawara@gmail.com> Closes #1860 from xmnlab/add-shapely-input and squashes the following commits: 2126aaf [Ivan Ogasawara] Fixed shapely import issue abaf5cf [Ivan Ogasawara] apply changes from review 23cb9f1 [Ivan Ogasawara] Add shapely as geo spatial input data
BUG: sql method doesn't work when the query uses LIMIT clause
Currently, SQL expressions which contain a LIMIT statement trigger an exception from the Omnisci Validator, as 'LIMIT 1' is used in the get_schema_using_query routine. `Exception: Sort node not supported as input to another sort` This PR introduced a BUGFIX which utilizes the built in mapd sql_validate routine, and associated thrift type mappings to build the schema via direct reference, without the one line query. UPDATE: - added regex to remove `;` - added tests using raw sql queries - set option 'default_limit' to `None` to avoid default `LIMIT 10000` Author: Ivan Ogasawara <ivan.ogasawara@gmail.com> Author: Michael Eaton <mpeaton@amdgtechnologies.com> Closes #1903 from mpeaton/validator and squashes the following commits: 97e6d12 [Ivan Ogasawara] Add skipif for python < 3.6 dc0951c [Ivan Ogasawara] Fixed compile args 2a17341 [Ivan Ogasawara] split sql tests and add it to mapd.tests 1e49ea1 [Michael Eaton] quick fix
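The UPDATE notes mention a regex that removes `;` from raw SQL. One way such a cleanup could look, assuming only a trailing terminator and surrounding whitespace need to go (the helper name and the pattern are illustrative and not taken from the PR):

```python
import re

# Matches a single trailing semicolon plus any whitespace around it.
_TRAILING_SEMICOLON = re.compile(r'\s*;\s*$')


def strip_trailing_semicolon(query):
    """Return `query` without a trailing `;` (hypothetical helper)."""
    return _TRAILING_SEMICOLON.sub('', query)


cleaned = strip_trailing_semicolon('SELECT * FROM t LIMIT 1; ')
untouched = strip_trailing_semicolon('SELECT * FROM t LIMIT 1')
```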
Fix #1915 Author: Ivan Ogasawara <ivan.ogasawara@gmail.com> Closes #1917 from xmnlab/omniscidb-null-literal and squashes the following commits: 5d41852 [Ivan Ogasawara] skip test_coalesce for mapd. remove some skip test for mapd 9565987 [Ivan Ogasawara] Applied suggestion from review bbbfc34 [Ivan Ogasawara] Fixed Null Literal and add skip for some tests d646540 [Ivan Ogasawara] Added missing null literall op
ENH: DDL support for Spark backend
Author: amroid <amr@dt3.org> Closes #1908 from amroid/spark-ddl and squashes the following commits: d5940f6 [amroid] Add tests for _from_csv methods in Spark client 5a647e5 [amroid] Privatize the means of table production from csv f0a572d [amroid] Change some classes in Spark DDL to inherit from impala 37a4bff [amroid] Fix Spark tests in test_ddl_compilation that failed because location parameter was removed for create table 9efe482 [amroid] Remove location and path arguments to create external tables, cleanup according to PR feedback, fix small things with tests d30463d [amroid] Spark UDFs now get dropped immediately after query executes, remove list_functions from SparkClient, add DropFunction DDL 2efea62 [amroid] Small fixes, change default values for some DDl functions 9b5c3b4 [amroid] Refactor create_table_or_temp_view_from_csv, schema_from_csv, exists_table in SparkClient, move ibis-spark type translation into separate file ibis/spark/datatypes.py ff08327 [amroid] Rename table_or_temp_view_from_csv to create_table_or_temp_view_from_csv, small fix to qualified name in that method f7be498 [amroid] Add table_or_temp_view_from_csv to spark client, fix ibis.common.exceptions imports 11ab3fc [amroid] Fix import error c7f1e5d [amroid] Merge branch 'master' of https://github.com/ibis-project/ibis into spark-ddl 8f2ccc5 [amroid] Add schema_from_csv method to Spark client ff1add4 [amroid] Refactor Spark backend DDL implementation to subclass from Impala DDL 3a10ba3 [amroid] Clean up tests, add alter properties to Spark backend DDL f8caec9 [amroid] Add rename method to SparkTable a7c68ef [amroid] Add compute_stats function to SparkClient, remove ability to create global temp views from SparkClient.create_view d56a32c [amroid] Add truncate_table to SparkClient ddl 83c5278 [amroid] SparkClient create_table and insert can take pandas dataframe input 4646d2d [amroid] Override SparkClient.table method to deal with namespace issues for Spark temp views ec6287b [amroid] Add 
DDL for create table, create view, create database, drop table, drop database, insert
Closes #1936 Author: Phillip Cloud <cpcloud@gmail.com> Closes #1938 from cpcloud/fix-miniconda and squashes the following commits: b2206c3 [Phillip Cloud] Install ping on the doc box 1e42c6d [Phillip Cloud] Pin SQLAlchemy to get builds passing until a real fix can be made 62f109a [Phillip Cloud] Fix doc build 9a7c56c [Phillip Cloud] Install openjdk from conda-forge b0d1e78 [Phillip Cloud] Actually upgrade the JDK install e4d2a53 [Phillip Cloud] BUG: Upgrade to JDK11
BUG: Fix incorrect assumptions about attached SQLite databases
Author: Phillip Cloud <cpcloud@gmail.com> Closes #1937 from cpcloud/sqlite-sqlalchemy and squashes the following commits: f3cda71 [Phillip Cloud] Remove constraints on windows 22913f2 [Phillip Cloud] Don't use type hints for now a018895 [Phillip Cloud] Py35 lint a4ec057 [Phillip Cloud] Disposal is in fact needed c9e5d72 [Phillip Cloud] BUG: Fix incorrect assumptions about attached SQLite databases
This is a Pyspark backend for ibis. This is different from the spark backend where the ibis expr is compiled to SQL string. Instead, the pyspark backend compiles the ibis expr to pyspark.DataFrame exprs. Author: Li Jin <ice.xelloss@gmail.com> Author: Hyonjee Joo <5000208+hjoo@users.noreply.github.com> Closes #1913 from icexelloss/pyspark-backend-prototype and squashes the following commits: 213e371 [Li Jin] Add pyspark/__init__.py 8f1c35e [Li Jin] Address comments f173425 [Li Jin] Fix tests 0969b0a [Li Jin] Skip unimplemented tests 1f9409b [Li Jin] Change pyspark imports to optional 26b041c [Li Jin] Add importskip 108ccd8 [Li Jin] Add scope e00dc00 [Li Jin] Address PR comments 4764a4e [Li Jin] Add pyspark marker to setup.cfg 7cc2a9e [Li Jin] Remove dead code 72b45f8 [Li Jin] Fix rebase errors 9ad663f [Hyonjee Joo] implement pyspark numeric operations to pass all/test_numeric.py (#9) 675a89f [Li Jin] Implement compiler rules to pass all/test_aggregation.py 215c0d9 [Li Jin] Link existing tests with PySpark backend (#7) 88705fe [Li Jin] Implement basic join c4a2b79 [Hyonjee Joo] add pyspark compile rule for greatest, fix bug with selection (#4) fa4ad23 [Li Jin] Implement basic aggregation, group_by and window (#3) 54c2f2d [Li Jin] Initial commit of pyspark DataFrame backend (#1)
1. Allows for null geometries (it previously failed in unhelpful ways upon encountering those) 2. Add types for some missing geometry types, namely multilinestrings and multipoints. I still need to add some tests, but wanted to put this out for comment. Author: Ian Rose <ian.r.rose@gmail.com> Author: Ian Rose <ian.rose@lacity.org> Closes #1925 from ian-r-rose/postgis-enhancements and squashes the following commits: 2daa0ba [Ian Rose] Lint. 7c32bf4 [Ian Rose] Consolidate geospatial types to not repeat them as much. 6662213 [Ian Rose] Exercise APIs around multilinestring and multipoint in tests. ca56874 [Ian Rose] Add datatypes for multilinestring, multipoint. 9b16c35 [Ian Rose] Allow for null geometries when fetching from geospatial databases.
SUPP: Improve geospatial literals and smoke tests
Improves geo spatial literals and smoke tests: - changes `GeoMockConnection` to allow tests for `omniscid` to `postgresql` https://github.com/ibis-project/ibis/blob/master/ibis/expr/tests/mocks.py#L423 - fixes geo spatial literals used in https://github.com/ibis-project/ibis/blob/master/ibis/expr/tests/test_geospatial.py - improve tests in https://github.com/ibis-project/ibis/blob/master/ibis/expr/tests/test_geospatial.py - fixes black pre-commit hook configuration - allows literal without geo_types (geometry or geography) for `postgis` backend - fixes translation for geo spatial literals (ibis/common/geospatial.py) Resolves #1929 Resolves #1931 Resolves partially #1930 Author: Ivan Ogasawara <ivan.ogasawara@gmail.com> Closes #1928 from xmnlab/change-geo-mock-test-backend and squashes the following commits: 9aac60c [Ivan Ogasawara] Added an assert for test_geo_ops_smoke 7acb69f [Ivan Ogasawara] Fixed smoke test for geospatial operations
ENH: PySpark backend string and column ops (#1942)
* implement pyspark string ops and compiler distinct column op to pass all/test_string.py and all/test_column.py * fix string tests to xfail on backends * switch upper, lower, reverse, and length string functions to use Spark native functions * switch strip, lstrip, rstrip, capitalize string functions to use Spark native functions * remove unused pandas_udf import statement
Missing geospatial ops for OmniSciDB (#1958)
* Add omniscidb ops for intersects, disjoint, dwithin, and dfullywithin. * Add some tests for dwithin, dfullywithin, intersects, disjoint. * Back off of dfullywithin.
ENH: window operations for pyspark backend
Implemented window operations for PySpark backend to pass all tests in `ibis/tests/all/test_window.py`: - `WindowOp` - `Lag` - `Lead` - `MinRank` - `DenseRank` - `NTile` - `FirstValue` - `LastValue` - `RowNumber` - `CumulativeSum` - `CumulativeMean` - `CumulativeMin` - `CumulativeMax` Also enhanced select aggregation operations (e.g. `Any`, `NotAny`, `All`, `NotAll`, `Count`, `Max`, `Min`, `Mean`, `Sum`) to be interoperable with windows. Author: Hyonjee <hyonjee.joo@twosigma.com> Closes #1945 from hjoo/pyspark-window and squashes the following commits: 54a3405 [Hyonjee] change double quotes to single quotes in pyspark compiler.py 7cb2291 [Hyonjee] add helper methods for pyspark shift and cumulative window ops, refactor import line 5c82b19 [Hyonjee] remove extra test_window lead/lag tests that were added in previous commit but don't pass with non pyspark backends 7ac2dd8 [Hyonjee] skip unsupported window tests for OmniSciDB 419c309 [Hyonjee] fix test_window() in ibis/pyspark/tests/test_basic.py and remove xfail 7bd6585 [Hyonjee] window operations for pyspark backend
ENH: filter for PySpark backend
Introduces predicate handling in `ops.Selection` for `filter()` functionality. Also implemented a few missing comparison operations: - `And` - `Or` - `NotEquals` - `Less` - `LessEquals` Author: Hyonjee <hyonjee.joo@twosigma.com> Closes #1943 from hjoo/pyspark-filter and squashes the following commits: 42c2a3c [Hyonjee] parametrize pyspark test_filter 0480921 [Hyonjee] added ibis issue for missing sort_keys handling for pyspark selection op 28e6ad6 [Hyonjee] pyspark backend filter
Re-formatting all files using pre-commit hook
waiting for PyCQA/isort#1000 Author: Ivan Ogasawara <ivan.ogasawara@gmail.com> Closes #1963 from xmnlab/black-files-formatting and squashes the following commits: 9d1defb [Ivan Ogasawara] Re-formatting all files using pre-commit hook
ENH: Validate AsOfJoin tolerance and attempt interval unit conversion
Unlike other operations, not all of the arguments of AsOfJoin are automatically passed through the validation chain that occurs in superclass Annotable, since they are not passed into super(). The result is that whereas other operations with offset-like arguments (e.g. Lead/Lag) have their offsets' dtypes automatically inferred, the same does not occur for 'tolerance' in AsOfJoin. We can simply wire this validation in ourselves. Additionally, the default unit for Interval is 's' regardless of the unit of the value of origin. We can make a better attempt at inferring unit before immediately defaulting to seconds. Author: Emily Reff <emily.reff@twosigma.com> Closes #1952 from emilyreff7/hackday and squashes the following commits: 2d7ff71 [Emily Reff] Merge branch 'master' of https://github.com/ibis-project/ibis into hackday 40436e1 [Emily Reff] break up tests 134caf3 [Emily Reff] fix failing tests 91318be [Emily Reff] fix failing tests c524401 [Emily Reff] fix failing tests 97fa826 [Emily Reff] validate AsOfJoin tolerance and attempt interval unit conversion
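To make the unit-inference idea concrete, here is a sketch under stated assumptions rather than the logic from this PR: given a `datetime.timedelta`, pick the coarsest unit that represents it exactly instead of always using seconds. The unit codes and the function name `infer_interval` are assumptions for illustration.

```python
from datetime import timedelta

# Candidate units, coarsest first (unit codes are assumed).
_UNITS = [
    ('D', timedelta(days=1)),
    ('h', timedelta(hours=1)),
    ('m', timedelta(minutes=1)),
    ('s', timedelta(seconds=1)),
    ('ms', timedelta(milliseconds=1)),
    ('us', timedelta(microseconds=1)),
]


def infer_interval(delta):
    """Return (value, unit) using the coarsest unit that divides delta."""
    for unit, size in _UNITS:
        if delta % size == timedelta(0):
            return delta // size, unit
    # Unreachable: timedelta resolution is one microsecond.
    raise ValueError('cannot represent {!r}'.format(delta))


two_days = infer_interval(timedelta(days=2))
day_and_half = infer_interval(timedelta(hours=36))
sub_second = infer_interval(timedelta(milliseconds=1500))
```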
ENH: Implement join for PySpark backend
Implement join functionality for PySpark backend. This PR only enables tests/all/test_join.py Author: Li Jin <ice.xelloss@gmail.com> Closes #1967 from icexelloss/pyspark-backend-join and squashes the following commits: 15b7bf5 [Li Jin] Fix impala tests and fix csv test for python 3.5 782db34 [Li Jin] Clean up 54225c8 [Li Jin] Disable csv tests afbf374 [Li Jin] Only support Pandas and PySpark backend in test_join 75550cd [Li Jin] Clean up tests c8cefa1 [Li Jin] ENH: Implement join for PySpark backend