Load data with user defined schema #150

Merged
merged 2 commits into googleapis:master from load_data_with_schema on Mar 14, 2018

Conversation

aktech
Contributor

@aktech aktech commented Mar 14, 2018

The to_gbq function fails when the user-defined schema differs slightly from the schema inferred from the DataFrame.

Traceback:

Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
  File "pandas_gbq/gbq.py", line 979, in to_gbq
    schema=table_schema)
  File "pandas_gbq/gbq.py", line 569, in load_data
    self.process_http_error(ex)
  File "pandas_gbq/gbq.py", line 450, in process_http_error
    raise GenericGBQException("Reason: {0}".format(ex))
pandas_gbq.gbq.GenericGBQException: Reason: 400 POST https://www.googleapis.com/upload/bigquery/v2/projects/aktech-labs/jobs?uploadType=resumable: Provided Schema does not match Table aktech-labs:pandas_test.gamma. Field A has changed type from FLOAT to STRING

The primary reason for this is that the load_data function ignores the schema argument.

Solution: The schema argument is now passed through to the load_chunks function inside the load_data function. An integration test covering this case has been added.
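
The bug pattern and the fix can be sketched in isolation. This is a minimal illustration, not the actual pandas_gbq code: the function bodies are hypothetical stand-ins, `infer_schema`, `load_data_before`, and `load_data_after` are made-up names, and a plain dict of columns stands in for the DataFrame so the sketch has no dependencies. It assumes every column has at least one value.

```python
# Hypothetical stand-ins; these are NOT the real pandas_gbq functions.

def infer_schema(columns):
    """Guess a BigQuery-style type for each column from its first value."""
    type_names = {bool: "BOOLEAN", int: "INTEGER", float: "FLOAT", str: "STRING"}
    return [
        {"name": name, "type": type_names.get(type(values[0]), "STRING")}
        for name, values in columns.items()
    ]


def load_chunks(columns, schema=None):
    """Return the schema that would accompany the load job."""
    if schema is None:
        # No schema supplied, so fall back to inference from the data.
        schema = infer_schema(columns)
    return schema


def load_data_before(columns, schema=None):
    # Bug pattern: `schema` is accepted but never forwarded,
    # so the inferred schema is always used.
    return load_chunks(columns)


def load_data_after(columns, schema=None):
    # Fix pattern: forward the caller's schema to the upload step.
    return load_chunks(columns, schema=schema)


columns = {"A": ["1.5", "2.5"]}                 # values stored as strings
user_schema = [{"name": "A", "type": "FLOAT"}]  # the table declares FLOAT

print(load_data_before(columns, schema=user_schema))  # inferred: STRING
print(load_data_after(columns, schema=user_schema))   # user-defined: FLOAT
```

In the "before" variant the load would be submitted with type STRING for field A even though the caller asked for FLOAT, which is the kind of mismatch reported in the traceback above.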

@codecov-io

codecov-io commented Mar 14, 2018

Codecov Report

Merging #150 into master will decrease coverage by 46.5%.
The diff coverage is 10%.


@@            Coverage Diff             @@
##           master    #150       +/-   ##
==========================================
- Coverage    75.8%   29.3%   -46.51%     
==========================================
  Files           8       8               
  Lines        1546    1556       +10     
==========================================
- Hits         1172     456      -716     
- Misses        374    1100      +726

Impacted Files                  Coverage Δ
pandas_gbq/gbq.py               20.56% <ø> (-56.02%) ⬇️
pandas_gbq/tests/test_gbq.py    22.93% <10%> (-61.67%) ⬇️
pandas_gbq/_load.py             62.5% <0%> (-35%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ed17886...c563fc3. Read the comment docs.

@max-sixty
Contributor

max-sixty commented Mar 14, 2018

Supersedes #131

@max-sixty
Contributor

This looks great. Any feedback @tswast ?

@aktech did you run the tests? Otherwise I will. (Travis integration tests don't work on PRs because of the auth keys)

@tswast more generally, do you think it's OK to merge these small PRs that look good, so the tests run on master, and then correct them in the 5% of cases where they fail? Or should we run the tests on an 'auth-enabled' Travis first?

@tswast
Collaborator

tswast commented Mar 14, 2018

LGTM.

@maxim-lian So long as whoever merges keeps an eye on the master build and fixes / reverts based on the outcome, I'm okay merging without requiring an auth-enabled Travis.

@max-sixty max-sixty merged commit 9a9a48e into googleapis:master Mar 14, 2018
@aktech
Contributor Author

aktech commented Mar 14, 2018

Thanks for the quick reply and merging.

> @aktech did you run the tests? Otherwise I will. (Travis integration tests don't work on PRs because of the auth keys)

Yes, I ran the tests locally as well as on Travis:
https://travis-ci.org/aktech/pandas-gbq/builds/353407949

@aktech aktech deleted the load_data_with_schema branch March 13, 2020 11:00