missing requirement db_types #1188

Closed
dtomlinson91 opened this issue Mar 30, 2022 · 3 comments · Fixed by #1191
Labels: `api: bigquery` · `priority: p1` · `type: bug`

Comments

dtomlinson91 commented Mar 30, 2022

Possibly related to #1186

Version 3 added an import of `db_dtypes` to google/cloud/bigquery/table.py.

This is missing as a dependency in setup.py.

A workaround is to specify `db-dtypes` as an explicit dependency.
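
For instance, a downstream project can declare it directly until the packaging fix lands (a hypothetical snippet; the project name is a placeholder):

```python
# Hypothetical setup.py for a project hit by this bug: declare the
# transitive dependency explicitly so `import google.cloud.bigquery` works.
from setuptools import setup

setup(
    name="my-project",  # placeholder
    install_requires=[
        "google-cloud-bigquery>=3.0.0",
        "db-dtypes",  # workaround for the missing requirement
    ],
)
```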

Error:

```
from google.cloud import bigquery
  File "/usr/local/lib/python3.8/site-packages/google/cloud/bigquery/__init__.py", line 35, in <module>
    from google.cloud.bigquery.client import Client
  File "/usr/local/lib/python3.8/site-packages/google/cloud/bigquery/client.py", line 64, in <module>
    from google.cloud.bigquery import _job_helpers
  File "/usr/local/lib/python3.8/site-packages/google/cloud/bigquery/_job_helpers.py", line 24, in <module>
    from google.cloud.bigquery import job
  File "/usr/local/lib/python3.8/site-packages/google/cloud/bigquery/job/__init__.py", line 27, in <module>
    from google.cloud.bigquery.job.copy_ import CopyJob
  File "/usr/local/lib/python3.8/site-packages/google/cloud/bigquery/job/copy_.py", line 21, in <module>
    from google.cloud.bigquery.table import TableReference
  File "/usr/local/lib/python3.8/site-packages/google/cloud/bigquery/table.py", line 32, in <module>
    import db_dtypes  # type: ignore # noqa
ModuleNotFoundError: No module named 'db_dtypes'
```
product-auto-label bot added the `api: bigquery` label Mar 30, 2022
tswast (Contributor) commented Mar 30, 2022

Thanks for the report! Definitely something we should fix ASAP.

tswast self-assigned this Mar 30, 2022
tswast added the `type: bug` and `priority: p1` labels Mar 30, 2022
gcf-merge-on-green bot pushed a commit that referenced this issue Mar 30, 2022
…not (#1191)

`db-dtypes` is already present in the `pandas` "extras", but this PR ensures that if pandas is present and db-dtypes is not, a more understandable error message is raised.

```
google/cloud/bigquery/_pandas_helpers.py:991: ValueError
____________________________________ test_list_rows_nullable_scalars_extreme_dtypes[10] _____________________________________

    # Copyright 2019 Google LLC
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
    # you may not use this file except in compliance with the License.
    # You may obtain a copy of the License at
    #
    #     http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    
    """Shared helper functions for connecting BigQuery and pandas."""
    
    import concurrent.futures
    from datetime import datetime
    import functools
    from itertools import islice
    import logging
    import queue
    import warnings
    
    try:
        import pandas  # type: ignore
    
        pandas_import_exception = None
    except ImportError as exc:  # pragma: NO COVER
        pandas = None
        pandas_import_exception = exc
    else:
        import numpy
    
    try:
>       import db_dtypes  # type: ignore
E       ModuleNotFoundError: No module named 'db_dtypes'

google/cloud/bigquery/_pandas_helpers.py:36: ModuleNotFoundError

The above exception was the direct cause of the following exception:

bigquery_client = <google.cloud.bigquery.client.Client object at 0x11e2d3580>
scalars_extreme_table = 'swast-scratch.python_bigquery_tests_system_20220330160830_ffff89.scalars_extreme_jsonl0x3ffeb'
max_results = 10

    @pytest.mark.parametrize(
        ("max_results",),
        (
            (None,),
            (10,),
        ),  # Use BQ Storage API.  # Use REST API.
    )
    def test_list_rows_nullable_scalars_extreme_dtypes(
        bigquery_client, scalars_extreme_table, max_results
    ):
        # TODO(GH#836): Avoid INTERVAL columns until they are supported by the
        # BigQuery Storage API and pyarrow.
        schema = [
            bigquery.SchemaField("bool_col", enums.SqlTypeNames.BOOLEAN),
            bigquery.SchemaField("bignumeric_col", enums.SqlTypeNames.BIGNUMERIC),
            bigquery.SchemaField("bytes_col", enums.SqlTypeNames.BYTES),
            bigquery.SchemaField("date_col", enums.SqlTypeNames.DATE),
            bigquery.SchemaField("datetime_col", enums.SqlTypeNames.DATETIME),
            bigquery.SchemaField("float64_col", enums.SqlTypeNames.FLOAT64),
            bigquery.SchemaField("geography_col", enums.SqlTypeNames.GEOGRAPHY),
            bigquery.SchemaField("int64_col", enums.SqlTypeNames.INT64),
            bigquery.SchemaField("numeric_col", enums.SqlTypeNames.NUMERIC),
            bigquery.SchemaField("string_col", enums.SqlTypeNames.STRING),
            bigquery.SchemaField("time_col", enums.SqlTypeNames.TIME),
            bigquery.SchemaField("timestamp_col", enums.SqlTypeNames.TIMESTAMP),
        ]
    
        df = bigquery_client.list_rows(
            scalars_extreme_table,
            max_results=max_results,
            selected_fields=schema,
>       ).to_dataframe()

tests/system/test_pandas.py:1084: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
google/cloud/bigquery/table.py:1925: in to_dataframe
    _pandas_helpers.verify_pandas_imports()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    def verify_pandas_imports():
        if pandas is None:
            raise ValueError(_NO_PANDAS_ERROR) from pandas_import_exception
        if db_dtypes is None:
>           raise ValueError(_NO_DB_TYPES_ERROR) from db_dtypes_import_exception
E           ValueError: Please install the 'db-dtypes' package to use this function.

google/cloud/bigquery/_pandas_helpers.py:991: ValueError
```


Fixes #1188  🦕
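
The shape of the fix, as visible in the test output above, is a guarded import: attempt the optional import up front, remember why it failed, and raise an actionable error only when the optional feature is actually used. A minimal sketch of that pattern (simplified from `_pandas_helpers.py`; the real module guards pandas and numpy the same way):

```python
# Guarded optional import: the failure is recorded, not raised, at import time.
try:
    import db_dtypes  # type: ignore

    db_dtypes_import_exception = None
except ImportError as exc:
    db_dtypes = None
    db_dtypes_import_exception = exc

_NO_DB_TYPES_ERROR = "Please install the 'db-dtypes' package to use this function."


def verify_pandas_imports():
    # Called by to_dataframe() before any pandas work begins; chaining the
    # original ImportError preserves the real cause in the traceback.
    if db_dtypes is None:
        raise ValueError(_NO_DB_TYPES_ERROR) from db_dtypes_import_exception
```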
tswast (Contributor) commented Mar 30, 2022

Since pandas and db-dtypes are extras, we didn't add them to the requirements, but I did improve the error message when pandas is installed but the db-dtypes package is missing.
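
In practice, a missing `db-dtypes` now fails at the point of use with that actionable message instead of at import time. Roughly (a sketch; the table name is a placeholder):

```python
from google.cloud import bigquery  # succeeds even without db-dtypes installed

client = bigquery.Client()
rows = client.list_rows("my-project.my_dataset.my_table")  # placeholder table

# Without db-dtypes, this raises:
#   ValueError: Please install the 'db-dtypes' package to use this function.
df = rows.to_dataframe()
```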

waltaskew pushed a commit to waltaskew/python-bigquery that referenced this issue Jul 20, 2022
…not (googleapis#1191)
@mattwelke

No problem to report here. Just saying thanks for that error message. It's helping people out. :p

```
...
    raise ValueError(_NO_DB_TYPES_ERROR) from db_dtypes_import_exception
ValueError: Please install the 'db-dtypes' package to use this function.
```

abdelmegahedgoogle pushed a commit to abdelmegahedgoogle/python-bigquery that referenced this issue Apr 17, 2023
…not (googleapis#1191)