Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: inserting non-finite floats with insert_rows() #728

Merged
merged 1 commit into from Jul 1, 2021

Conversation

@plamut
Copy link
Contributor

@plamut plamut commented Jun 29, 2021

Fixes #696.

This PR fixes inserting Nan and non-finite floats with the client.insert_rows() method.

PR checklist:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)
@plamut plamut requested review from as code owners Jun 29, 2021
@plamut plamut requested review from stephaniewang526 and removed request for Jun 29, 2021
@google-cla google-cla bot added the cla: yes label Jun 29, 2021
@plamut
Copy link
Contributor Author

@plamut plamut commented Jun 29, 2021

test_insert_rows_from_dataframe() fails, will investigate how the changes affected Nan-handling there.

Loading

@plamut
Copy link
Contributor Author

@plamut plamut commented Jun 30, 2021

I could not reproduce the failed test even after multiple trials, running it both in isolation and in the entire test suite. It seems weird why there would be no rows in the query results, NaN conversions didn't seem to play the role here.

        rows = list(
            bigquery_client.query(
                "SELECT * FROM `{}.{}.{}`".format(
                    table.project, table.dataset_id, table.table_id
                )
            )
        )

        sorted_rows = sorted(rows, key=operator.attrgetter("int_col"))
        row_tuples = [r.values() for r in sorted_rows]
        expected = [...]

>       assert len(row_tuples) == len(expected)
E       AssertionError: assert 0 == 6
E        +  where 0 = len([])
E        +  and   6 = len([(1.11, True, 'my string', 10), (2.22, False, 'another string', 20), (3.33, False, 'another string', 30), (4.44, True, 'another string', 40), (5.55, False, 'another string', 50), (6.66, True, None, 60)])

In any case, re-running the CI checks made them pass, and the same result was consistently reproduced locally.

Loading

@jimfulton
Copy link
Contributor

@jimfulton jimfulton commented Jun 30, 2021

I could not reproduce the failed test even after multiple trials, running it both in isolation and in the entire test suite. It seems weird why there would be no rows in the query results, NaN conversions didn't seem to play the role here.

        rows = list(
            bigquery_client.query(
                "SELECT * FROM `{}.{}.{}`".format(
                    table.project, table.dataset_id, table.table_id
                )
            )
        )

        sorted_rows = sorted(rows, key=operator.attrgetter("int_col"))
        row_tuples = [r.values() for r in sorted_rows]
        expected = [...]

>       assert len(row_tuples) == len(expected)
E       AssertionError: assert 0 == 6
E        +  where 0 = len([])
E        +  and   6 = len([(1.11, True, 'my string', 10), (2.22, False, 'another string', 20), (3.33, False, 'another string', 30), (4.44, True, 'another string', 40), (5.55, False, 'another string', 50), (6.66, True, None, 60)])

In any case, re-running the CI checks made them pass, and the same result was consistently reproduced locally.

That failed for me too yesterday. I'll create a flaky-test issue.

Eventual consistency?

Loading

@plamut
Copy link
Contributor Author

@plamut plamut commented Jun 30, 2021

@jimfulton Can't say, it's the first time I'm seeing this failure in 2+ years. Something on the backend might have changed related to when the streamed data is available. It should be as soon as after a few seconds, but even this might be too late in certain circumstances, just enough for the test to fail.

Loading

@tseaver
Copy link
Contributor

@tseaver tseaver commented Jun 30, 2021

@plamut, @jimfulton For the flaky test, the test_utils.retry.RetryResult wrapper might help. Something like:

from test_utils.retry import RetryResult

def _list_query(client, query):
    return list(client.query(query)

def _check_result(result, expected_length)
     return len(result) == expected_length

retry = RetryResult(functools.partial(_check_result, expected_length=len(expected))
retry(_list_query)(bigquery_client, query)

Loading

tswast
tswast approved these changes Jun 30, 2021
@plamut plamut merged commit d047419 into googleapis:master Jul 1, 2021
12 checks passed
Loading
@plamut plamut deleted the iss-696 branch Jul 1, 2021
gcf-merge-on-green bot pushed a commit that referenced this issue Jul 13, 2021
Supersedes #711.


## [2.21.0](https://www.github.com/googleapis/python-bigquery/compare/v2.20.0...v2.21.0) (2021-07-13)


### Features

* Add max_results parameter to some of the `QueryJob` methods. ([#698](https://www.github.com/googleapis/python-bigquery/issues/698)) ([2a9618f](https://www.github.com/googleapis/python-bigquery/commit/2a9618f4daaa4a014161e1a2f7376844eec9e8da))
* Add support for decimal target types. ([#735](https://www.github.com/googleapis/python-bigquery/issues/735)) ([7d2d3e9](https://www.github.com/googleapis/python-bigquery/commit/7d2d3e906a9eb161911a198fb925ad79de5df934))
* Add support for table snapshots. ([#740](https://www.github.com/googleapis/python-bigquery/issues/740)) ([ba86b2a](https://www.github.com/googleapis/python-bigquery/commit/ba86b2a6300ae5a9f3c803beeb42bda4c522e34c))
* Enable unsetting policy tags on schema fields. ([#703](https://www.github.com/googleapis/python-bigquery/issues/703)) ([18bb443](https://www.github.com/googleapis/python-bigquery/commit/18bb443c7acd0a75dcb57d9aebe38b2d734ff8c7))
* Make it easier to disable best-effort deduplication with streaming inserts. ([#734](https://www.github.com/googleapis/python-bigquery/issues/734)) ([1246da8](https://www.github.com/googleapis/python-bigquery/commit/1246da86b78b03ca1aa2c45ec71649e294cfb2f1))
* Support passing struct data to the DB API. ([#718](https://www.github.com/googleapis/python-bigquery/issues/718)) ([38b3ef9](https://www.github.com/googleapis/python-bigquery/commit/38b3ef96c3dedc139b84f0ff06885141ae7ce78c))


### Bug Fixes

* Inserting non-finite floats with `insert_rows()`. ([#728](https://www.github.com/googleapis/python-bigquery/issues/728)) ([d047419](https://www.github.com/googleapis/python-bigquery/commit/d047419879e807e123296da2eee89a5253050166))
* Use `pandas` function to check for `NaN`. ([#750](https://www.github.com/googleapis/python-bigquery/issues/750)) ([67bc5fb](https://www.github.com/googleapis/python-bigquery/commit/67bc5fbd306be7cdffd216f3791d4024acfa95b3))


### Documentation

* Add docs for all enums in module. ([#745](https://www.github.com/googleapis/python-bigquery/issues/745)) ([145944f](https://www.github.com/googleapis/python-bigquery/commit/145944f24fedc4d739687399a8309f9d51d43dfd))
* Omit mention of Python 2.7 in `CONTRIBUTING.rst`. ([#706](https://www.github.com/googleapis/python-bigquery/issues/706)) ([27d6839](https://www.github.com/googleapis/python-bigquery/commit/27d6839ee8a40909e4199cfa0da8b6b64705b2e9))
gcf-merge-on-green bot pushed a commit that referenced this issue Jul 14, 2021
🤖 I have created a release \*beep\* \*boop\*
---
## [2.21.0](https://www.github.com/googleapis/python-bigquery/compare/v2.20.0...v2.21.0) (2021-07-14)


### Features

* add always_use_jwt_access ([#714](https://www.github.com/googleapis/python-bigquery/issues/714)) ([92fbd4a](https://www.github.com/googleapis/python-bigquery/commit/92fbd4ade37e0be49dc278080ef73c83eafeea18))
* add max_results parameter to some of the QueryJob methods ([#698](https://www.github.com/googleapis/python-bigquery/issues/698)) ([2a9618f](https://www.github.com/googleapis/python-bigquery/commit/2a9618f4daaa4a014161e1a2f7376844eec9e8da))
* add support for decimal target types ([#735](https://www.github.com/googleapis/python-bigquery/issues/735)) ([7d2d3e9](https://www.github.com/googleapis/python-bigquery/commit/7d2d3e906a9eb161911a198fb925ad79de5df934))
* add support for table snapshots ([#740](https://www.github.com/googleapis/python-bigquery/issues/740)) ([ba86b2a](https://www.github.com/googleapis/python-bigquery/commit/ba86b2a6300ae5a9f3c803beeb42bda4c522e34c))
* enable unsetting policy tags on schema fields ([#703](https://www.github.com/googleapis/python-bigquery/issues/703)) ([18bb443](https://www.github.com/googleapis/python-bigquery/commit/18bb443c7acd0a75dcb57d9aebe38b2d734ff8c7))
* make it easier to disable best-effort deduplication with streaming inserts ([#734](https://www.github.com/googleapis/python-bigquery/issues/734)) ([1246da8](https://www.github.com/googleapis/python-bigquery/commit/1246da86b78b03ca1aa2c45ec71649e294cfb2f1))
* Support passing struct data to the DB API ([#718](https://www.github.com/googleapis/python-bigquery/issues/718)) ([38b3ef9](https://www.github.com/googleapis/python-bigquery/commit/38b3ef96c3dedc139b84f0ff06885141ae7ce78c))


### Bug Fixes

* inserting non-finite floats with insert_rows() ([#728](https://www.github.com/googleapis/python-bigquery/issues/728)) ([d047419](https://www.github.com/googleapis/python-bigquery/commit/d047419879e807e123296da2eee89a5253050166))
* use pandas function to check for NaN ([#750](https://www.github.com/googleapis/python-bigquery/issues/750)) ([67bc5fb](https://www.github.com/googleapis/python-bigquery/commit/67bc5fbd306be7cdffd216f3791d4024acfa95b3))


### Documentation

* add docs for all enums in module ([#745](https://www.github.com/googleapis/python-bigquery/issues/745)) ([145944f](https://www.github.com/googleapis/python-bigquery/commit/145944f24fedc4d739687399a8309f9d51d43dfd))
* omit mention of Python 2.7 in `CONTRIBUTING.rst` ([#706](https://www.github.com/googleapis/python-bigquery/issues/706)) ([27d6839](https://www.github.com/googleapis/python-bigquery/commit/27d6839ee8a40909e4199cfa0da8b6b64705b2e9))
---


This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

5 participants