Mark_area() color undefined with correct column name #3524

Closed
lborcard opened this issue Aug 6, 2024 · 4 comments

lborcard commented Aug 6, 2024

What happened?

I'm trying to color a density plot by group, but the chart shows the color as undefined instead of giving each group its own color.

_chart_loss2 = (
    alt.Chart(sub_loss_real)
    .transform_density("Percentage_loss", as_=["Percentage_loss", "density"])
    .mark_area()
    .encode(
        alt.X("Percentage_loss:Q", scale=alt.Scale(domain=[0, 1])),
        alt.Y("density:Q"),
        alt.Color("Percentage_loss_dim_2:N")
    )
)
_chart_loss2

Info about the dataframe:
<class 'pandas.core.frame.DataFrame'>
Index: 5000 entries, 2003 to 10552
Data columns (total 4 columns):
 #   Column                 Non-Null Count  Dtype  
---  ------                 --------------  -----  
 0   chain                  5000 non-null   int64  
 1   draw                   5000 non-null   int64  
 2   Percentage_loss_dim_2  5000 non-null   string 
 3   Percentage_loss        5000 non-null   float64
dtypes: float64(1), int64(2), string(1)
memory usage: 195.3 KB


What would you like to happen instead?

I would like the plot to show a different color for each group, and I would like to know what is wrong with my input, since I have tried every way I could think of to modify my column so that the chart separates the data by group.

Which version of Altair are you using?

5.3.0

lborcard added the bug label Aug 6, 2024
Contributor

dsmedia commented Aug 13, 2024

The examples here and here use the groupby parameter of transform_density. Does this sample code (modeled on yours, but with a sample dataset) generate the result you're looking for?

import altair as alt
import pandas as pd
import numpy as np

# Create a sample dataset
np.random.seed(42)  # for reproducibility
n = 1000
sub_loss_real = pd.DataFrame({
    'Percentage_loss': np.concatenate([
        np.random.beta(2, 5, n),  # distribution for group A
        np.random.beta(5, 2, n)   # distribution for group B
    ]),
    'Percentage_loss_dim_2': np.repeat(['Group A', 'Group B'], n)
})

# Create the chart
chart_loss = (
    alt.Chart(sub_loss_real)
    .transform_density(
        'Percentage_loss',
        groupby=['Percentage_loss_dim_2'],
        as_=['Percentage_loss', 'density']
    )
    .mark_area(opacity=0.5)
    .encode(
        x=alt.X('Percentage_loss:Q', scale=alt.Scale(domain=[0, 1])),
        y=alt.Y('density:Q'),
        color=alt.Color('Percentage_loss_dim_2:N', scale=alt.Scale(scheme='category10'))
    )
    .properties(
        width=600,
        height=400,
        title="Density Plot of Percentage Loss by Group"
    )
)

# Display the chart
chart_loss.show()

[Image: rendered chart, "Density Plot of Percentage Loss by Group"]
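
For context on why groupby matters here: as far as I understand the Vega-Lite density transform, it emits a new set of rows containing only the value and density fields, plus any groupby fields. Without groupby, the Percentage_loss_dim_2 column is therefore no longer present when the color encoding looks for it. The sketch below is a plain-Python illustration of that behavior, not Altair's implementation; density_rows, gaussian_kde, the five-point grid, and the fixed bandwidth are all hypothetical stand-ins.

```python
import math
from collections import defaultdict


def gaussian_kde(values, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each grid point."""
    norm = len(values) * bandwidth * math.sqrt(2 * math.pi)
    return [
        sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values) / norm
        for x in grid
    ]


def density_rows(rows, field, groupby=(), steps=5, bandwidth=0.1):
    """Toy stand-in for a density transform (hypothetical helper, not Altair API).

    Returns brand-new rows holding only `field`, "density", and the groupby keys.
    """
    # Bucket the input values by their groupby key (one bucket if groupby is empty).
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[g] for g in groupby)
        groups[key].append(row[field])

    grid = [i / (steps - 1) for i in range(steps)]  # evenly spaced points on [0, 1]
    out = []
    for key, values in groups.items():
        for x, d in zip(grid, gaussian_kde(values, grid, bandwidth)):
            new_row = dict(zip(groupby, key))  # only groupby fields survive
            new_row[field] = x
            new_row["density"] = d
            out.append(new_row)
    return out


rows = [
    {"Percentage_loss": 0.2, "Percentage_loss_dim_2": "Group A"},
    {"Percentage_loss": 0.3, "Percentage_loss_dim_2": "Group A"},
    {"Percentage_loss": 0.7, "Percentage_loss_dim_2": "Group B"},
    {"Percentage_loss": 0.8, "Percentage_loss_dim_2": "Group B"},
]

ungrouped = density_rows(rows, "Percentage_loss")
grouped = density_rows(rows, "Percentage_loss", groupby=["Percentage_loss_dim_2"])

print(sorted(ungrouped[0]))  # the group column is absent
print(sorted(grouped[0]))    # the group column is carried through
```

In the ungrouped result there is nothing left for a color encoding on Percentage_loss_dim_2 to read, which matches the undefined color in the original report; in the grouped result each density curve keeps its group label.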

@joelostblom
Contributor

Closing, as a solution has been posted above.

@lborcard
Author

Thanks for the help. I eventually found out that groupby was needed. IMO, using color should do the grouping by default.

@joelostblom
Contributor

We're working on including a dedicated density mark in Vega-Lite / Altair (vega/vega-lite#3442). This would support grouping via color directly, without the need to use the transform at all.
