This repository has been archived by the owner on Mar 12, 2024. It is now read-only.

NaN values don't show in categorical column #149

Closed
F1nnM opened this issue Jan 25, 2021 · 8 comments
Labels
bug Something isn't working

Comments

@F1nnM

F1nnM commented Jan 25, 2021

Hi,

I've come across a weird bug with the parallel plot:
One of my float columns (axes) contains NaN values.
When plotted with HiPlot, the "nan/inf/null" entry only sometimes appears on the axis; more precisely, it only appears if the column has at least 6 unique values other than NaN.
For example, if all entries in that column contain only the values [nan 3. 5. 7. 9. 15. 30. ] the "nan/inf/null" entry shows correctly.
If however I replace all 30's with 15's without changing anything else, the entry doesn't show up.
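
To check which side of that observed threshold a column falls on, a small helper can count the distinct non-NaN values. This is only a sketch: the lists are made-up stand-ins for the real column, and the limit of 6 is what was observed in this report, not a documented HiPlot constant.

```python
import math

def unique_non_nan(values):
    """Return the set of distinct values, ignoring NaN entries."""
    return {v for v in values if not (isinstance(v, float) and math.isnan(v))}

nan = float("nan")

# Illustrative column resembling the one described above.
with_30 = [nan, 3.0, 5.0, 7.0, 9.0, 15.0, 30.0, 15.0, nan]
# Same column with every 30 replaced by 15 (NaN entries are left untouched).
without_30 = [15.0 if v == 30.0 else v for v in with_30]

print(len(unique_non_nan(with_30)))     # 6
print(len(unique_non_nan(without_30)))  # 5
```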

Is that a bug, or am I overlooking something on my side?

@F1nnM F1nnM changed the title from "NaN values in float column only shown when 6 or more unique values" to "NaN values in float column only shown when 6 or more unique values in that column" Jan 25, 2021
@danthe3rd danthe3rd added the bug Something isn't working label Jan 25, 2021
@danthe3rd
Contributor

Hi @F1nnM - and thank you for reporting this bug!

It looks weird indeed. One possible explanation that comes to mind: when your column does not have many distinct values, it might no longer be treated as numerical, but as categorical (you can check or change that by right-clicking the column's name above the parallel plot). Changing it back to numerical could be a workaround. You can also force the column type programmatically.
Now there might be a bug with how we display NaN values for categorical columns.
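
A minimal sketch of forcing the type from Python, assuming the `parameters_definition` mapping and `hip.ValueType.NUMERIC` behave as described in the HiPlot documentation (check against your installed version). The rows are made up for illustration, and the column name `params_max_columns` is borrowed from later in this thread.

```python
import math

# Illustrative rows only: few distinct values plus some NaNs in one column.
rows = [
    {"params_max_columns": 3.0, "score": 0.71},
    {"params_max_columns": 5.0, "score": 0.74},
    {"params_max_columns": math.nan, "score": 0.69},
    {"params_max_columns": 15.0, "score": 0.80},
    {"params_max_columns": math.nan, "score": 0.66},
]

def build_experiment(rows, numeric_column="params_max_columns"):
    """Build a HiPlot experiment and mark one column as numerical."""
    # Third-party dependency, imported here so the data above stays reusable
    # without HiPlot installed.
    import hiplot as hip

    exp = hip.Experiment.from_iterable(rows)
    # Override the automatic type inference for this column.
    exp.parameters_definition[numeric_column].type = hip.ValueType.NUMERIC
    return exp
```

Calling `build_experiment(rows).display()` in a notebook should then show the column on a numerical axis rather than a categorical one.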

@F1nnM
Author

F1nnM commented Jan 25, 2021

Indeed, that seems to be the issue; it's considered categorical.
Thank you for that info and the workaround!

@F1nnM F1nnM changed the title from "NaN values in float column only shown when 6 or more unique values in that column" to "NaN values don't show in categorical column" Apr 4, 2021
@F1nnM
Author

F1nnM commented Apr 4, 2021

So the NaN handling seems weird in general. I tried to fix it, but the code is too complex for me to understand.
I might try again at some point if it hasn't been fixed by then.

What I found out so far:
In the standalone version (just running the hiplot command), NaN values are shown in categorical columns, but numerical columns containing NaNs appear completely empty.

[Image: params_max_columns as categorical, standalone version]

[Image: params_max_columns as numerical, standalone version]

In the Streamlit component, categorical columns don't show NaNs, only the other values. Numerical columns with NaNs, however, are handled correctly.

[Image: params_max_columns as categorical, Streamlit version]

[Image: params_max_columns as numerical, Streamlit version]

This time I can even supply the dataset I used.
Random-Forest-Example.csv.txt

For the Hiplot in Streamlit with this dataset you can also check here

@F1nnM
Author

F1nnM commented Apr 5, 2021

Half-asleep idea: maybe the default CSV handler already replaces the NaNs with the string "NaN"?
That would explain it.
If that's the case, maybe the change in my linked PR is actually correct.
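
For context, here is how plain Python behaves when reading such a file. This uses only the standard csv module and is not HiPlot's actual loader; the two-column data is made up. Cells come back as strings, so "NaN" stays a string until it is explicitly converted.

```python
import csv
import io
import math

# Made-up CSV content with one "NaN" cell and one empty cell.
raw = "uid,params_max_columns\n0,3.0\n1,NaN\n2,\n"
rows = list(csv.DictReader(io.StringIO(raw)))

# The csv module never converts types: every cell is a string.
print([r["params_max_columns"] for r in rows])  # ['3.0', 'NaN', '']

def to_float(cell):
    """Convert a CSV cell to float, mapping empty cells to NaN."""
    return float(cell) if cell.strip() else math.nan

values = [to_float(r["params_max_columns"]) for r in rows]
```

After the conversion, both the "NaN" cell and the empty cell are real float NaNs, which is what a numerical axis would expect.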

@danthe3rd
Contributor

Since the problem only happens with Streamlit, I'm wondering if the experimental _compress flag might cause it (https://facebookresearch.github.io/hiplot/tuto_streamlit.html#improving-performance-with-streamlit-caching-experimental).
Otherwise, the code should be exactly the same in both cases...
I'll try to have a look at it later this week.

@danthe3rd
Contributor

I've started this PR (still WIP) that should fix it, but I don't have time to test it extensively right now - hopefully next week.

@F1nnM
Author

F1nnM commented Apr 14, 2021

Ah, awesome!

@danthe3rd
Contributor

This should be fixed - I just merged #180.
I'm pushing version 0.1.25.


2 participants