Better data type detection for pre_aggregated, indexed dataframes #61

dorisjlee · 2020-08-13T08:41:20Z

When a dataframe is pre-aggregated, our cardinality-based type detection often fails to detect the type correctly. For example, when the dataset is small (often the case when data is pre-aggregated), nominal fields get recognized as a quantitative type.

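import pandas as pd
import lux  # Lux extends pandas dataframes on import (provides data_type)

df = pd.read_csv("lux/data/car.csv")
df["Year"] = pd.to_datetime(df["Year"], format="%Y")  # convert the "Year" column to a datetime dtype
a = df.groupby("Cylinders").mean()

a.data_type  # inspect the types detected for the small, aggregated dataframe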

As a related issue, we should also support type detection for named indexes. In this example, Cylinders becomes the index after the groupby, so its data type is not computed.
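To make both failure modes concrete, here is a minimal sketch that does not depend on Lux. The `naive_type` heuristic, its `max_ratio` and `max_unique` thresholds, and the `detect_types` helper are hypothetical names made up for illustration; they are not Lux's actual rule or the fix that was eventually merged. The sketch assumes a numeric column is treated as nominal when it has few distinct values relative to the row count, and that a named index on an aggregated frame holds group keys.

```python
import pandas as pd


def naive_type(series, max_ratio=0.4, max_unique=20):
    """Hypothetical cardinality heuristic (an assumption, not Lux's rule)."""
    if pd.api.types.is_datetime64_any_dtype(series):
        return "temporal"
    if pd.api.types.is_numeric_dtype(series):
        n_unique = series.nunique()
        # Few distinct values relative to the row count suggests categories.
        if n_unique <= max_unique and n_unique / len(series) < max_ratio:
            return "nominal"
        return "quantitative"
    return "nominal"


def detect_types(df):
    """Type every column and also every named index level."""
    types = {}
    for name in df.index.names:
        if name is not None:
            # Assumption: a named index on an aggregated frame is a group key.
            types[name] = "nominal"
    for col in df.columns:
        types[col] = naive_type(df[col])
    return types


raw = pd.DataFrame({
    "Cylinders": [4, 4, 4, 4, 6, 6, 6, 8, 8, 8],
    "Horsepower": [88, 90, 95, 70, 110, 105, 100, 150, 165, 175],
})
agg = raw.groupby("Cylinders").mean()

# Raw data: 3 distinct values over 10 rows (ratio 0.3), so "nominal".
print(naive_type(raw["Cylinders"]))
# After aggregation every key appears once (ratio 1.0), so "quantitative".
print(naive_type(agg.reset_index()["Cylinders"]))
# Including the named index: {'Cylinders': 'nominal', 'Horsepower': 'quantitative'}
print(detect_types(agg))
```

The same column flips from nominal to quantitative only because aggregation shrinks the row count, which is why a ratio-style rule is unreliable on pre-aggregated data. Treating the named index as a group key is one possible way to sidestep that; other designs (for example, reusing the types detected on the parent dataframe) would work as well.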

dorisjlee · 2021-03-02T09:19:12Z

Closing this after #287 is merged in. Great work Kunal!

dorisjlee added the bug (Something isn't working) and easy (Easy to fix; Good issues for newcomers) labels Aug 13, 2020

dorisjlee changed the title ~~Better data type detection for pre_aggregated dataframes~~ Better data type detection for pre_aggregated, indexed dataframes Aug 13, 2020

dorisjlee mentioned this issue Sep 28, 2020

Lux Errors when set_index #49

Closed

dorisjlee assigned westernguy2 Sep 28, 2020

jinimukh added this to the S1: January 2021 milestone Jan 15, 2021

westernguy2 mentioned this issue Feb 10, 2021

LuxGroupby Implementation #260

Merged

dorisjlee modified the milestones: S1: January 2021, S2: February 2021 Feb 26, 2021

dorisjlee closed this as completed Mar 2, 2021
