I noticed that in the API for versions > 1.0, the data_types module has disappeared, which means it's no longer possible to get the schema of a dataframe with data_types.extract_pyarrow_schema_from_pandas.
This function was quite handy. I normally call it to get the schema of a dataframe (often imported from CSV) before calling wr.glue.create_table.
wr.glue.create_table is now called wr.glue.create_csv_table, and it has a parameter columns_types, which is the schema of the dataframe (the format of the schema changed and is now a Dict[str, str]). But I can't generate this schema anymore, which means my custom crawlers no longer work.
Of course I could get this schema with pyarrow, but I was wondering whether removing this feature was intentional, and whether it will be back at some point. Ideally, a function that returns the schema in the format that columns_types expects would be great.
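To make the request concrete, here is a minimal sketch of the kind of helper meant above: it turns pandas dtype names into a Dict[str, str] of the shape columns_types takes. The function name columns_types_from_dtypes and the dtype-to-type mapping are assumptions written by hand for illustration, not the library's own API, and the mapping covers only a few common dtypes.

```python
from typing import Dict, Mapping

# Hand-written, illustrative mapping from pandas dtype names to
# Athena/Glue type names. This is an assumption for the sketch,
# not the mapping the library itself uses.
_DTYPE_TO_ATHENA: Dict[str, str] = {
    "int8": "tinyint",
    "int16": "smallint",
    "int32": "int",
    "int64": "bigint",
    "float32": "float",
    "float64": "double",
    "bool": "boolean",
    "object": "string",
    "datetime64[ns]": "timestamp",
}


def columns_types_from_dtypes(dtypes: Mapping[str, object]) -> Dict[str, str]:
    """Build a column-name -> type-name dict from a mapping of dtypes.

    Each dtype is converted with str(), so both dtype objects and
    plain dtype names such as "int64" are accepted.
    """
    columns_types: Dict[str, str] = {}
    for column, dtype in dtypes.items():
        name = str(dtype)
        if name not in _DTYPE_TO_ATHENA:
            # Fail loudly rather than guess a type for an unknown dtype.
            raise ValueError(f"No mapping for dtype {name!r} (column {column!r})")
        columns_types[column] = _DTYPE_TO_ATHENA[name]
    return columns_types
```

With a pandas dataframe df, the input would be dict(df.dtypes), and the result could then be passed as columns_types.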
Cheers
Ah, perfect. Sorry, I missed the function in the docs. The tutorial is exactly the workflow I use. The get_table_types function is also nice: it can be used to compare the schema from previous crawler runs with the schema of the new files being crawled, which is useful for detecting new incoming schemas.
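A rough sketch of that comparison, assuming both schemas are available as plain Dict[str, str] mappings of column name to type (one from the previous run, one from the new file). diff_schemas is a hypothetical helper name used here for illustration, not something the library provides.

```python
from typing import Dict, Tuple


def diff_schemas(
    previous: Dict[str, str], current: Dict[str, str]
) -> Dict[str, dict]:
    """Compare two column-name -> type-name schemas.

    Returns a dict with three entries:
      "added":   columns only in `current`
      "removed": columns only in `previous`
      "changed": columns in both whose type differs, as (old, new)
    """
    added = {c: t for c, t in current.items() if c not in previous}
    removed = {c: t for c, t in previous.items() if c not in current}
    changed: Dict[str, Tuple[str, str]] = {
        c: (previous[c], current[c])
        for c in previous
        if c in current and previous[c] != current[c]
    }
    return {"added": added, "removed": removed, "changed": changed}


def schema_has_changed(previous: Dict[str, str], current: Dict[str, str]) -> bool:
    """True if any column was added, removed, or changed type."""
    return any(diff_schemas(previous, current).values())
```

A crawler could call schema_has_changed before updating a table and only alert or re-create the table when it returns True.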