fix(deps): raise exception when pandas is installed but db-dtypes is not #1191

Merged Mar 30, 2022 (9 commits)

@tswast tswast commented Mar 30, 2022

db-dtypes is already included in the pandas "extras", but this PR ensures that a more understandable error message is raised when pandas is installed and db-dtypes is not.

google/cloud/bigquery/_pandas_helpers.py:991: ValueError
____________________________________ test_list_rows_nullable_scalars_extreme_dtypes[10] _____________________________________

    # Copyright 2019 Google LLC
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
    # you may not use this file except in compliance with the License.
    # You may obtain a copy of the License at
    #
    #     http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    
    """Shared helper functions for connecting BigQuery and pandas."""
    
    import concurrent.futures
    from datetime import datetime
    import functools
    from itertools import islice
    import logging
    import queue
    import warnings
    
    try:
        import pandas  # type: ignore
    
        pandas_import_exception = None
    except ImportError as exc:  # pragma: NO COVER
        pandas = None
        pandas_import_exception = exc
    else:
        import numpy
    
    try:
>       import db_dtypes  # type: ignore
E       ModuleNotFoundError: No module named 'db_dtypes'

google/cloud/bigquery/_pandas_helpers.py:36: ModuleNotFoundError

The above exception was the direct cause of the following exception:

bigquery_client = <google.cloud.bigquery.client.Client object at 0x11e2d3580>
scalars_extreme_table = 'swast-scratch.python_bigquery_tests_system_20220330160830_ffff89.scalars_extreme_jsonl0x3ffeb'
max_results = 10

    @pytest.mark.parametrize(
        ("max_results",),
        (
            (None,),
            (10,),
        ),  # Use BQ Storage API.  # Use REST API.
    )
    def test_list_rows_nullable_scalars_extreme_dtypes(
        bigquery_client, scalars_extreme_table, max_results
    ):
        # TODO(GH#836): Avoid INTERVAL columns until they are supported by the
        # BigQuery Storage API and pyarrow.
        schema = [
            bigquery.SchemaField("bool_col", enums.SqlTypeNames.BOOLEAN),
            bigquery.SchemaField("bignumeric_col", enums.SqlTypeNames.BIGNUMERIC),
            bigquery.SchemaField("bytes_col", enums.SqlTypeNames.BYTES),
            bigquery.SchemaField("date_col", enums.SqlTypeNames.DATE),
            bigquery.SchemaField("datetime_col", enums.SqlTypeNames.DATETIME),
            bigquery.SchemaField("float64_col", enums.SqlTypeNames.FLOAT64),
            bigquery.SchemaField("geography_col", enums.SqlTypeNames.GEOGRAPHY),
            bigquery.SchemaField("int64_col", enums.SqlTypeNames.INT64),
            bigquery.SchemaField("numeric_col", enums.SqlTypeNames.NUMERIC),
            bigquery.SchemaField("string_col", enums.SqlTypeNames.STRING),
            bigquery.SchemaField("time_col", enums.SqlTypeNames.TIME),
            bigquery.SchemaField("timestamp_col", enums.SqlTypeNames.TIMESTAMP),
        ]
    
        df = bigquery_client.list_rows(
            scalars_extreme_table,
            max_results=max_results,
            selected_fields=schema,
>       ).to_dataframe()

tests/system/test_pandas.py:1084: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
google/cloud/bigquery/table.py:1925: in to_dataframe
    _pandas_helpers.verify_pandas_imports()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    def verify_pandas_imports():
        if pandas is None:
            raise ValueError(_NO_PANDAS_ERROR) from pandas_import_exception
        if db_dtypes is None:
>           raise ValueError(_NO_DB_TYPES_ERROR) from db_dtypes_import_exception
E           ValueError: Please install the 'db-dtypes' package to use this function.

google/cloud/bigquery/_pandas_helpers.py:991: ValueError
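
The traceback above shows the general pattern: the import failure is captured when the module loads and is re-raised later, chained to a `ValueError`, only when a function that needs the dependency is called. A minimal, self-contained sketch of that pattern follows. It is not the library's actual code: the module name `some_optional_dep_that_is_missing`, the error text, and the helper names are hypothetical, and only the overall shape mirrors `verify_pandas_imports` as seen in the traceback.

```python
"""Sketch: defer an optional-import failure until the feature is used."""

try:
    # Hypothetical optional dependency, assumed NOT to be installed.
    import some_optional_dep_that_is_missing

    optional_import_exception = None
except ImportError as exc:
    some_optional_dep_that_is_missing = None
    # Keep the original exception so it can be chained later.
    optional_import_exception = exc

# Hypothetical message, modeled on the one in the traceback.
_NO_OPTIONAL_DEP_ERROR = (
    "Please install the 'some-optional-dep' package to use this function."
)


def verify_optional_imports():
    """Raise a readable error if the optional dependency failed to import."""
    if some_optional_dep_that_is_missing is None:
        raise ValueError(_NO_OPTIONAL_DEP_ERROR) from optional_import_exception


def feature_that_needs_dependency():
    """Stand-in for a function such as to_dataframe() that needs the extra."""
    verify_optional_imports()
    return "ok"
```

Because the `ValueError` is raised with `from`, the original `ImportError` stays available as `__cause__`, which is why pytest prints "The above exception was the direct cause of the following exception".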

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #1188 🦕
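
For callers who would rather check up front than catch the `ValueError`, one possible approach is to ask the import system whether each module can be found. This is only a sketch: the helper name `missing_modules` is hypothetical and is not part of google-cloud-bigquery.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Module names taken from the traceback in this PR.
missing = missing_modules(["pandas", "db_dtypes"])
if missing:
    print("Missing optional modules:", ", ".join(missing))
```

`importlib.util.find_spec` only locates the module without importing it, so this check does not catch a package that is installed but fails during import.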

@tswast tswast requested a review from a team March 30, 2022 16:12
@tswast tswast requested a review from a team as a code owner March 30, 2022 16:12
@tswast tswast requested a review from shollyman March 30, 2022 16:12
@product-auto-label product-auto-label bot added the api: bigquery label Mar 30, 2022
@parthea parthea added the kokoro:run and kokoro:force-run labels Mar 30, 2022
@parthea parthea added and removed the kokoro:run and kokoro:force-run labels Mar 30, 2022
@tswast tswast added the kokoro:force-run label and removed the kokoro:run and kokoro:force-run labels Mar 30, 2022
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run label Mar 30, 2022
@tswast tswast added the automerge label Mar 30, 2022
@gcf-merge-on-green gcf-merge-on-green bot merged commit 4333910 into main Mar 30, 2022
@gcf-merge-on-green gcf-merge-on-green bot deleted the issue1188-db-dtypes branch March 30, 2022 19:20
@gcf-merge-on-green gcf-merge-on-green bot removed the automerge label Mar 30, 2022
gcf-merge-on-green bot pushed a commit that referenced this pull request Mar 30, 2022
🤖 I have created a release *beep* *boop*
---


### [3.0.1](v3.0.0...v3.0.1) (2022-03-30)


### Bug Fixes

* **deps:** raise exception when pandas is installed but db-dtypes is not ([#1191](#1191)) ([4333910](4333910))
* **deps:** restore dependency on python-dateutil ([#1187](#1187)) ([212d7ec](212d7ec))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
waltaskew pushed a commit to waltaskew/python-bigquery that referenced this pull request Jul 20, 2022
…not (googleapis#1191)

waltaskew pushed a commit to waltaskew/python-bigquery that referenced this pull request Jul 20, 2022
abdelmegahedgoogle pushed a commit to abdelmegahedgoogle/python-bigquery that referenced this pull request Apr 17, 2023
…not (googleapis#1191)

abdelmegahedgoogle pushed a commit to abdelmegahedgoogle/python-bigquery that referenced this pull request Apr 17, 2023
Successfully merging this pull request may close these issues.

missing requirement db_types