[BUG] Format percentage in new category output #1766

shir22 · 2022-07-14T07:38:48Z

Describe the bug

See "percent of new category in sample"

To Reproduce
Use the following dataset:
https://github.com/AllonHammer/CPI_HRNN/blob/master/resources/cpi_us_dataset.csv
Dataset definition:
ds = Dataset(df, datetime_name='Date', label='Price', cat_features=['Category_id', 'Category', 'Indent', 'Parent', 'Parent_ID'])
Use the first 40000 samples as the train set and the rest as the test set.
Then run the relevant checks (or the train-test-validation suite).


kishore-s-15 · 2022-07-15T10:02:21Z

@shir22 Could you mention the steps to reproduce this issue?

TheSolY · 2022-07-17T15:21:52Z

@shir22 If I understood correctly, the issue is that the conditions summary shows "0.02%" but the additional outputs show "0.00" and they should show the same number. Which dataset did you use?

shir22 · 2022-07-17T16:04:12Z

Yes, indeed, @TheSolY. Specifically, for consistency, the additional outputs should use the same formatting function that is used for "More Info" in the Conditions Summary table.
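
To illustrate how the two displays can disagree, here is a minimal sketch. It assumes the underlying ratio is about 0.0002 (an illustrative value, not one taken from the dataset) and uses plain Python format specs, not the actual deepchecks formatting function:

```python
# Hypothetical ratio of samples that belong to a new category.
# The value is illustrative only; it is not computed from the dataset.
ratio = 0.0002

# Rounding the raw ratio to two decimals hides the value entirely.
as_rounded_ratio = f"{ratio:.2f}"

# Formatting the same ratio as a percentage keeps it visible.
as_percentage = f"{ratio:.2%}"

print(as_rounded_ratio)  # 0.00
print(as_percentage)     # 0.02%
```

If both the summary and the additional outputs went through one shared formatter, the same ratio could not be rendered in two different ways.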

@kishore-s-15 About the dataset and the steps to reproduce: I added the specific steps to the edited issue description.

shir22 · 2022-07-18T06:00:04Z

@kishore-s-15 would you like to be assigned to this issue?

kishore-s-15 · 2022-07-18T16:44:01Z

@shir22 @noamzbr Sure.

noamzbr · 2022-07-19T15:59:02Z

Granted, and much appreciated!

kishore-s-15 · 2022-07-19T17:00:36Z

@shir22 Could you provide the code to reproduce the above error?

import pandas as pd

from deepchecks.tabular.dataset import Dataset
from deepchecks.suites import train_test_validation

df = pd.read_csv("./cpi_us_dataset.csv")

train_df = df.iloc[:40000, :]
test_df = df.iloc[40000:, :]

train_ds = Dataset(train_df, datetime_name='Date', label='Price',
        cat_features=['Category_id', 'Category', 'Indent', 'Parent', 'Parent_ID'])

test_ds = Dataset(test_df, datetime_name='Date', label='Price',
        cat_features=['Category_id', 'Category', 'Indent', 'Parent', 'Parent_ID'])

suite = train_test_validation()
suite.run(train_ds, test_ds)

I used the above code but was not able to reproduce the issue.

shir22 · 2022-07-24T15:52:32Z

Can you share a screenshot of the Category Mismatch Test?
I just ran your code, and the output matches the original description: it shows 0.00.

kishore-s-15 · 2022-07-24T16:38:17Z

My bad, I ran the code as a script file instead of a notebook file. Got the same output now.

shir22 added the bug label Jul 14, 2022

shir22 added this to the Copernicus milestone Jul 14, 2022

github-actions bot added the needs triage label Jul 14, 2022

noamzbr removed the needs triage label Jul 14, 2022

noamzbr assigned TheSolY Jul 17, 2022

TheSolY removed their assignment Jul 17, 2022

noamzbr assigned TheSolY Jul 18, 2022

noamzbr assigned kishore-s-15 and unassigned TheSolY Jul 19, 2022

ItayGabbay unassigned kishore-s-15 Aug 1, 2022

Nadav-Barak assigned Nadav-Barak and noamzbr and unassigned Nadav-Barak Aug 1, 2022

noamzbr mentioned this issue Aug 1, 2022

Correctly count the ratio of new categories #1860

Merged

noamzbr closed this as completed in #1860 Aug 1, 2022
