When a dataframe is pre-aggregated, our cardinality-based type detection often fails to detect the type correctly. For example, when the dataset is small (often the case when data is pre-aggregated), nominal fields get recognized as quantitative.
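import pandas as pd
import lux  # Lux is needed for the data_type attribute below

df = pd.read_csv("lux/data/car.csv")
df["Year"] = pd.to_datetime(df["Year"], format='%Y') # change the pandas dtype of the "Year" column to datetime
a = df.groupby("Cylinders").mean() # pre-aggregated dataframe with "Cylinders" as a named index
a.data_type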
As a related issue, we should also support type detection for named indexes. In this case, Cylinders is an index, so its data type is not being computed.
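One possible direction, sketched below with plain pandas, is to stop relying on cardinality for pre-aggregated frames and to include named index levels in the type map. This is only an illustration under stated assumptions: `infer_data_types` is a hypothetical helper (not an existing Lux function), the rule "a named index level is a group-by key, so treat it as nominal unless it is a datetime" is an assumption, and the small `cars` frame is made-up data standing in for the car dataset.

```python
import pandas as pd
from pandas.api import types as ptypes


def infer_data_types(df: pd.DataFrame) -> dict:
    """Hypothetical helper: map named index levels and columns to coarse types.

    Assumption: in a pre-aggregated frame, named index levels are group-by
    keys, so they are nominal unless they hold datetimes. Columns are typed
    from their dtype alone, without any cardinality threshold.
    """
    result = {}

    # Named index levels (e.g. "Cylinders" after a groupby) get a type too.
    for name in df.index.names:
        if name is None:
            continue  # unnamed default index carries no field to describe
        level = df.index.get_level_values(name)
        if ptypes.is_datetime64_any_dtype(level):
            result[name] = "temporal"
        else:
            result[name] = "nominal"

    # Regular columns: decide from dtype only.
    for col in df.columns:
        series = df[col]
        if ptypes.is_datetime64_any_dtype(series):
            result[col] = "temporal"
        elif ptypes.is_bool_dtype(series) or not ptypes.is_numeric_dtype(series):
            result[col] = "nominal"
        else:
            result[col] = "quantitative"
    return result


# Made-up stand-in for the car dataset.
cars = pd.DataFrame({
    "Cylinders": [4, 4, 6, 8, 8],
    "Horsepower": [90.0, 110.0, 150.0, 200.0, 220.0],
    "Weight": [2200, 2400, 3200, 4100, 4300],
})
agg = cars.groupby("Cylinders").mean()
types = infer_data_types(agg)
print(types)
```

With this rule, Cylinders is reported as nominal even though the aggregated frame has only three rows, and the aggregated measures stay quantitative. The trade-off is that a numeric index that really is a measure would be mislabeled, so an override would still be needed.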
dorisjlee changed the title from "Better data type detection for pre_aggregated dataframes" to "Better data type detection for pre_aggregated, indexed dataframes" on Aug 13, 2020