Using ordered categorical columns with built-in charts causes an error #7776

Closed
3 of 4 tasks
LukasMasuch opened this issue Nov 29, 2023 · 1 comment · Fixed by #7771
Labels
feature:builtin-charts, priority:P3, status:confirmed (Bug has been confirmed by the Streamlit team), type:bug (Something isn't working)

Comments

@LukasMasuch
Collaborator

LukasMasuch commented Nov 29, 2023

Checklist

  • I have searched the existing issues for similar issues.
  • I added a very descriptive title to this issue.
  • I have provided sufficient information below to help reproduce this issue.

Summary

Using an ordered categorical column (pd.Categorical(..., ordered=True)) with the built-in charts raises a SchemaValidationError. The underlying issue is that infer_vegalite_type treats ordered categorical columns differently: instead of a plain type string, it returns a tuple of the type and the category order:

elif typ == "categorical" and data.cat.ordered:
    return ("ordinal", data.cat.categories.tolist())

To fix this, we need to split the tuple into type and sort within our built-in chart logic, similar to how Altair does it here:

https://github.com/altair-viz/altair/blob/e1bb266f91bd743c815fce9908d03d3bb1ad13fc/altair/utils/core.py#L603-L605
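
A minimal sketch of that split, assuming the inferred value is either a plain type string or an ("ordinal", categories) tuple as in the snippet above. The helper names split_type_and_sort and build_encoding are hypothetical and not part of Streamlit or Altair; the actual fix may be structured differently:

```python
from typing import Any, Dict, List, Optional, Tuple, Union

# Shape of the inferred value, per the snippet above: a plain type string,
# or an ("ordinal", categories) tuple for ordered categorical columns.
InferredType = Union[str, Tuple[str, List[Any]]]


def split_type_and_sort(inferred: InferredType) -> Tuple[str, Optional[List[Any]]]:
    """Hypothetical helper: separate the Vega-Lite type from the sort order."""
    if isinstance(inferred, tuple):
        vegalite_type, sort_order = inferred
        return vegalite_type, list(sort_order)
    # Plain string types carry no explicit sort order.
    return inferred, None


def build_encoding(field: str, inferred: InferredType) -> Dict[str, Any]:
    """Hypothetical helper: build an encoding dict where `type` is always a string."""
    vegalite_type, sort_order = split_type_and_sort(inferred)
    encoding: Dict[str, Any] = {"field": field, "type": vegalite_type}
    if sort_order is not None:
        # Only ordered categoricals get an explicit `sort` entry.
        encoding["sort"] = sort_order
    return encoding


# Values mirroring the reproducible example in this issue.
x_encoding = build_encoding("categorical", ("ordinal", ["c", "b", "a"]))
y_encoding = build_encoding("numbers", "quantitative")
```

With this split, `type` stays one of the valid strings and the category order moves into a separate `sort` entry, so the tuple never reaches schema validation.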

Reproducible Code Example

import pandas as pd
import streamlit as st

df = pd.DataFrame(
    {
        "categorical": pd.Series(
            pd.Categorical(
                ["b", "c", "a", "a"], categories=["c", "b", "a"], ordered=True
            )
        ),
        "numbers": pd.Series([1, 2, 3, 4]),
    }
)

st.scatter_chart(df, x="categorical", y="numbers")

Steps To Reproduce

Running the app will result in an error.

Expected Behavior

Show the chart with the x-axis using the categories ordered by "c", "b", "a".

Current Behavior

Exception:

SchemaValidationError: '['ordinal', ['c', 'b', 'a']]' is an invalid value for `type`. Valid values are: - One of ['quantitative', 'ordinal', 'temporal', 'nominal', 'geojson'] - Of type 'string'

Screenshot 2023-11-29 at 15 03 47

Is this a regression?

  • Yes, this used to work in a previous version.

Debug info

  • Streamlit version: 1.28.2
  • Python version: 3.10
  • Operating System: macOS
  • Browser: Chrome

Additional Information

No response

LukasMasuch added the type:bug and status:needs-triage labels Nov 29, 2023

If this issue affects you, please react with a 👍 (thumbs up emoji) to the initial post.

Your feedback helps us prioritize which bugs to investigate and address first.


LukasMasuch added the status:confirmed, priority:P3, and feature:builtin-charts labels and removed the status:needs-triage label Nov 29, 2023