BigQuery: load_table_from_dataframe automatically generate schema for known dtypes #9044

Closed
tswast opened this issue Aug 16, 2019 · 0 comments · Fixed by #9049
Labels: api: bigquery, type: feature request

Comments

Contributor

tswast commented Aug 16, 2019

Is your feature request related to a problem? Please describe.

I want to upload a DataFrame without specifying a schema. My DataFrame has well-defined dtypes for all columns, so it should be safe to serialize it without worrying about cases such as ARRAY / STRUCT columns.

With #9042, I may get a warning in this case, though.

Describe the solution you'd like

I'd like BigQuery to automatically generate a schema based on the dtypes in the DataFrame. Ideally, this generated schema could be merged with the partial schema at #8140.

If the generated schema and the partial schema together cover every column in the DataFrame (meaning all the object columns were handled with an explicit schema), no warning message should be printed.
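
A minimal sketch of the idea described above, not the library's implementation. The mapping table, the `generate_schema` helper, and the use of plain `(name, type)` tuples in place of real schema field objects are all assumptions made for illustration. The dtype names could be obtained from a real DataFrame with something like `{name: str(dtype) for name, dtype in df.dtypes.items()}`.

```python
import warnings

# Hypothetical mapping from pandas dtype names to BigQuery type names.
# Illustrative only; the real client may support a different set.
DTYPE_TO_BQ_TYPE = {
    "bool": "BOOLEAN",
    "int64": "INTEGER",
    "float64": "FLOAT",
    "datetime64[ns]": "DATETIME",
    "datetime64[ns, UTC]": "TIMESTAMP",
}


def generate_schema(dtypes, partial_schema=()):
    """Merge a user-supplied partial schema with types derived from dtypes.

    dtypes: dict of column name -> dtype name (as a string).
    partial_schema: iterable of (column name, BigQuery type) pairs.
    Returns (schema, unresolved), where unresolved lists the columns
    that neither the partial schema nor the dtype mapping could type.
    """
    explicit = dict(partial_schema)
    schema = []
    unresolved = []
    for column, dtype_name in dtypes.items():
        # An explicit entry always wins over the generated one.
        bq_type = explicit.get(column) or DTYPE_TO_BQ_TYPE.get(dtype_name)
        if bq_type is None:
            unresolved.append(column)
        else:
            schema.append((column, bq_type))
    return schema, unresolved


dtypes = {"id": "int64", "score": "float64", "tags": "object"}
schema, unresolved = generate_schema(dtypes, partial_schema=[("tags", "STRING")])
if unresolved:
    # Warn only when some column is still untyped.
    warnings.warn("Could not determine a type for: {}".format(unresolved))
```

With the partial schema covering the only object column, `unresolved` is empty and no warning is issued, which matches the behavior requested here.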

Describe alternatives you've considered

  • Status quo: let pandas/pyarrow/fastparquet figure out what to do.
  • Introspect the actual objects in the object column to guess the correct type. (Slow!)
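
For comparison, the second alternative could look roughly like the sketch below. It is only an illustration of why introspection is slow: every value in the column may have to be visited. The `guess_object_column_type` helper and its type table are hypothetical names, not part of any library.

```python
import datetime
import decimal

# Hypothetical mapping from Python value types to BigQuery type names.
PYTHON_TYPE_TO_BQ_TYPE = {
    bool: "BOOLEAN",
    int: "INTEGER",
    float: "FLOAT",
    str: "STRING",
    bytes: "BYTES",
    datetime.date: "DATE",
    datetime.time: "TIME",
    decimal.Decimal: "NUMERIC",
}


def guess_object_column_type(values):
    """Guess one BigQuery type for an object column, or return None.

    Scans every value, so the cost grows with the number of rows.
    Returns None for empty, all-null, mixed, or unmapped columns.
    """
    seen = set()
    for value in values:
        # Skip missing values: None, or a float NaN (NaN != NaN).
        if value is None or (isinstance(value, float) and value != value):
            continue
        # Exact type lookup, so bool is not mistaken for int.
        seen.add(PYTHON_TYPE_TO_BQ_TYPE.get(type(value)))
    if len(seen) == 1:
        return seen.pop()
    return None
```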

/CC @plamut

tswast added the type: feature request and api: bigquery labels Aug 16, 2019
tswast self-assigned this Aug 16, 2019
tswast changed the title from "BigQuery: load_table_from_dataframe automatically generate schema for new columns" to "BigQuery: load_table_from_dataframe automatically generate schema for known dtypes" Aug 16, 2019