bug: minimum version of pyarrow is incorrect #8795

Closed
1 task done
tswast opened this issue Mar 27, 2024 · 4 comments · Fixed by #8977
Assignees
Labels
bug Incorrect behavior inside of ibis

Comments

@tswast
Collaborator

tswast commented Mar 27, 2024

What happened?

With pyarrow 3.0.0, importing ibis fails with the following error: AttributeError: module 'pyarrow' has no attribute 'ExtensionScalar'

Per https://issues.apache.org/jira/browse/ARROW-13541 ExtensionScalar was added in pyarrow 6.0.0.
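A minimal sketch of how such a gap could be detected up front, assuming the only thing being checked is whether a module exposes a given attribute. The `missing_attributes` helper and the two stand-in namespaces are hypothetical and used purely for illustration; they are not part of ibis or pyarrow.

```python
from types import SimpleNamespace


def missing_attributes(module, required):
    """Return the names in `required` that `module` does not expose."""
    return [name for name in required if not hasattr(module, name)]


# Stand-ins for two pyarrow releases (illustrative objects, not the real modules).
old_pyarrow = SimpleNamespace(__version__="3.0.0")
new_pyarrow = SimpleNamespace(__version__="6.0.0", ExtensionScalar=object)

print(missing_attributes(old_pyarrow, ["ExtensionScalar"]))
print(missing_attributes(new_pyarrow, ["ExtensionScalar"]))
```

Against a real installation, the same helper could be called with the imported `pyarrow` module in place of the stand-ins.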

What version of ibis are you using?

8.0.0

What backend(s) are you using, if any?

BigQuery

Relevant log output

def test_block_from_local(data):
        expected = pandas.DataFrame(data)
        mock_session = mock.create_autospec(spec=bigframes.Session)
    
        # hard-coded the returned dimension of the session for that each of the test case contains 3 rows.
        mock_session._execute.return_value = (iter([[3]]), None)
    
>       block = blocks.Block.from_local(pandas.DataFrame(data), mock_session)

tests/unit/core/test_blocks.py:85: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
bigframes/core/blocks.py:158: in from_local
    return cls(
bigframes/core/blocks.py:123: in __init__
    self._expr = self._normalize_expression(expr, self._index_columns)
bigframes/core/blocks.py:1202: in _normalize_expression
    col_id for col_id in expr.column_ids if col_id not in index_columns
bigframes/core/__init__.py:96: in column_ids
    return self.schema.names
../../.pyenv/versions/3.9.16/lib/python3.9/functools.py:993: in __get__
    val = self.func(instance)
bigframes/core/__init__.py:110: in schema
    return self._compiled_schema
../../.pyenv/versions/3.9.16/lib/python3.9/functools.py:993: in __get__
    val = self.func(instance)
bigframes/core/__init__.py:114: in _compiled_schema
    compiled = self._compile_unordered()
bigframes/core/__init__.py:148: in _compile_unordered
    return compiling.compile_unordered_ir(self.node)
bigframes/core/compile/compiler.py:37: in compile_unordered_ir
    return typing.cast(compiled.UnorderedIR, compile_node(node, False))
bigframes/core/compile/compiler.py:50: in compile_node
    return _compile_node(node, ordered)
../../.pyenv/versions/3.9.16/lib/python3.9/functools.py:888: in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
bigframes/core/compile/compiler.py:84: in compile_readlocal
    ordered_ir = compiled.OrderedIR.from_pandas(array_as_pd, node.schema)
bigframes/core/compile/compiled.py:647: in from_pandas
    keys_memtable = ibis.memtable(ibis_values, schema=ibis.schema(ibis_schema))
.nox/unit-3-9/lib/python3.9/site-packages/ibis/expr/api.py:436: in memtable
    return _memtable(data, name=name, schema=schema, columns=columns)
.nox/unit-3-9/lib/python3.9/site-packages/ibis/common/dispatch.py:88: in call
    return dispatch(type(arg))(arg, *args, **kwargs)
.nox/unit-3-9/lib/python3.9/site-packages/ibis/expr/api.py:469: in _memtable_from_dataframe
    from ibis.formats.pandas import PandasDataFrameProxy
.nox/unit-3-9/lib/python3.9/site-packages/ibis/formats/pandas.py:17: in <module>
    from ibis.formats.pyarrow import PyArrowData, PyArrowSchema, PyArrowType
.nox/unit-3-9/lib/python3.9/site-packages/ibis/formats/pyarrow.py:17: in <module>
    class JSONScalar(pa.ExtensionScalar):
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

name = 'ExtensionScalar'

    def __getattr__(name):
        if name in _deprecated:
            obj, new_name = _deprecated[name]
            _warnings.warn(_msg.format(name, new_name),
                           DeprecationWarning, stacklevel=2)
            return obj
        elif name in _serialization_deprecatd:
            _warnings.warn(_serialization_msg.format(name),
                           DeprecationWarning, stacklevel=2)
            return _serialization_deprecatd[name]
    
>       raise AttributeError(
            "module 'pyarrow' has no attribute '{0}'".format(name)
        )
E       AttributeError: module 'pyarrow' has no attribute 'ExtensionScalar'

.nox/unit-3-9/lib/python3.9/site-packages/pyarrow/__init__.py:255: AttributeError
_______________________________________________________ test_create_job_configs_labels_log_adaptor_call_method_under_length_limit ________________________________________________________

    def test_create_job_configs_labels_log_adaptor_call_method_under_length_limit():
        log_adapter.get_and_reset_api_methods()
        cur_labels = {
            "bigframes-api": "read_pandas",
            "source": "bigquery-dataframes-temp",
        }
>       df = bpd.DataFrame(
            {"col1": [1, 2], "col2": [3, 4]}, session=resources.create_bigquery_session()
        )

tests/unit/session/test_io_bigquery.py:68: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
bigframes/core/log_adapter.py:44: in wrapper
    return method(*args, **kwargs)
bigframes/dataframe.py:172: in __init__
    self._block = session.read_pandas(pd_dataframe)._get_block()
bigframes/session/__init__.py:994: in read_pandas
    return self._read_pandas(pandas_dataframe, "read_pandas")
bigframes/session/__init__.py:1007: in _read_pandas
    inline_df = self._read_pandas_inline(pandas_dataframe)
bigframes/session/__init__.py:1022: in _read_pandas_inline
    blocks.Block.from_local(pandas_dataframe, self)
bigframes/core/blocks.py:158: in from_local
    return cls(
bigframes/core/blocks.py:123: in __init__
    self._expr = self._normalize_expression(expr, self._index_columns)
bigframes/core/blocks.py:1202: in _normalize_expression
    col_id for col_id in expr.column_ids if col_id not in index_columns
bigframes/core/__init__.py:96: in column_ids
    return self.schema.names
../../.pyenv/versions/3.9.16/lib/python3.9/functools.py:993: in __get__
    val = self.func(instance)
bigframes/core/__init__.py:110: in schema
    return self._compiled_schema
../../.pyenv/versions/3.9.16/lib/python3.9/functools.py:993: in __get__
    val = self.func(instance)
bigframes/core/__init__.py:114: in _compiled_schema
    compiled = self._compile_unordered()
bigframes/core/__init__.py:148: in _compile_unordered
    return compiling.compile_unordered_ir(self.node)
bigframes/core/compile/compiler.py:37: in compile_unordered_ir
    return typing.cast(compiled.UnorderedIR, compile_node(node, False))
bigframes/core/compile/compiler.py:50: in compile_node
    return _compile_node(node, ordered)
../../.pyenv/versions/3.9.16/lib/python3.9/functools.py:888: in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
bigframes/core/compile/compiler.py:84: in compile_readlocal
    ordered_ir = compiled.OrderedIR.from_pandas(array_as_pd, node.schema)
bigframes/core/compile/compiled.py:647: in from_pandas
    keys_memtable = ibis.memtable(ibis_values, schema=ibis.schema(ibis_schema))
.nox/unit-3-9/lib/python3.9/site-packages/ibis/expr/api.py:436: in memtable
    return _memtable(data, name=name, schema=schema, columns=columns)
.nox/unit-3-9/lib/python3.9/site-packages/ibis/common/dispatch.py:88: in call
    return dispatch(type(arg))(arg, *args, **kwargs)
.nox/unit-3-9/lib/python3.9/site-packages/ibis/expr/api.py:469: in _memtable_from_dataframe
    from ibis.formats.pandas import PandasDataFrameProxy
.nox/unit-3-9/lib/python3.9/site-packages/ibis/formats/pandas.py:17: in <module>
    from ibis.formats.pyarrow import PyArrowData, PyArrowSchema, PyArrowType
.nox/unit-3-9/lib/python3.9/site-packages/ibis/formats/pyarrow.py:17: in <module>
    class JSONScalar(pa.ExtensionScalar):
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

name = 'ExtensionScalar'

    def __getattr__(name):
        if name in _deprecated:
            obj, new_name = _deprecated[name]
            _warnings.warn(_msg.format(name, new_name),
                           DeprecationWarning, stacklevel=2)
            return obj
        elif name in _serialization_deprecatd:
            _warnings.warn(_serialization_msg.format(name),
                           DeprecationWarning, stacklevel=2)
            return _serialization_deprecatd[name]
    
>       raise AttributeError(
            "module 'pyarrow' has no attribute '{0}'".format(name)
        )
E       AttributeError: module 'pyarrow' has no attribute 'ExtensionScalar'

.nox/unit-3-9/lib/python3.9/site-packages/pyarrow/__init__.py:255: AttributeError

@tswast tswast added the bug Incorrect behavior inside of ibis label Mar 27, 2024
@tswast
Collaborator Author

tswast commented Mar 27, 2024

Per https://github.com/ibis-project/ibis/blob/8.0.0/pyproject.toml#L46C1-L46C20 the minimum version is advertised as 2.0.
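The mismatch can be illustrated with a small comparison: the advertised bound (2.0) admits pyarrow 3.0.0, while the bound implied by `ExtensionScalar` (6.0.0, per the ARROW-13541 link above) does not. This is a rough sketch; `version_tuple` and `meets_minimum` are hypothetical helpers that only handle plain dotted-integer versions, and a real project would more likely rely on `packaging.version.Version`.

```python
def version_tuple(version):
    """Turn '10.0.1' into (10, 0, 1); only plain dotted integers are handled."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed, minimum):
    """Compare versions numerically rather than as strings."""
    return version_tuple(installed) >= version_tuple(minimum)


installed = "3.0.0"
print(meets_minimum(installed, "2.0"))    # bound advertised in pyproject.toml
print(meets_minimum(installed, "6.0.0"))  # bound implied by ExtensionScalar
```

Comparing tuples of integers avoids the string-comparison trap where "10.0.1" would sort before "6.0.0".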

@cpcloud
Member

cpcloud commented Mar 27, 2024

Yeah, we can and should bump this. What lower bound would you prefer here?

@cpcloud
Member

cpcloud commented Apr 2, 2024

@tswast Can you submit a PR to bump the lower bound?

@tswast
Collaborator Author

tswast commented Apr 2, 2024

Can do. Will also investigate whether we can add some tests against our lower-bound dependency versions via constraints files.
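One way the constraints-file idea might look, as a sketch only: derive exact pins from each requirement's `>=` bound so that CI installs the oldest supported versions. The `lower_bound_pins` helper is hypothetical, the regexes cover only simple specifiers, and the requirement strings in the example are illustrative rather than copied from ibis's pyproject.toml.

```python
import re


def lower_bound_pins(requirements):
    """Convert 'name>=X' requirements into 'name==X' pins; skip unbounded ones."""
    pins = []
    for req in requirements:
        name = re.match(r"\s*([A-Za-z0-9_.-]+)", req)
        bound = re.search(r">=\s*([0-9][0-9A-Za-z.]*)", req)
        if name and bound:
            pins.append(f"{name.group(1)}=={bound.group(1)}")
    return pins


# Illustrative requirement strings (the upper bound and "toolz" are made up).
requirements = ["pyarrow>=10.0.1", "pandas>=1.5.3,<3", "toolz"]
print("\n".join(lower_bound_pins(requirements)))
```

The resulting lines could be written to a constraints file and passed to pip with `-c`, so a test run exercises exactly the advertised minimum versions.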

@tswast tswast self-assigned this Apr 2, 2024
gforsyth added a commit that referenced this issue Apr 16, 2024
…version (#8977)

Bump the lower bound of pandas to 1.5.3 to reflect the actual lower bound that we
test against in CI.
The lower bound of pyarrow is now 10.0.1 (required by pandas, and in practice >7 is required for a lot of Ibis functionality).
Also bump the upper bound in the conda environment
YAMLs for recent fixes to support pandas 2.2.

Closes #8795.

---------

Co-authored-by: Gil Forsyth <gil@forsyth.dev>