Facet Cannot Be Correctly Sorted #8675

Open
PBI-David opened this issue Jan 28, 2023 · 5 comments

@PBI-David
Contributor

PBI-David commented Jan 28, 2023

I realised we don't have a VL example of the Vega calendar, so I tried to create one to submit as a PR. Everything works fine, except that years with incomplete months cannot be sorted. Observe that Jan, Feb, Mar and Apr for 2020 are all pushed to the end of the chart and listed under Sep, Oct, Nov and Dec. I have tried various documented sort operations, but this looks like a bug.

image

Editor link

@PBI-David
Contributor Author

Possibly related to #5937 as this looks like a sort problem when gaps are present.

@NickCrews

NickCrews commented Aug 28, 2023

I think I found the same thing. Using altair:

import altair as alt
from vega_datasets import data

source = data.cars()

chart = alt.Chart(source).mark_point().encode(
    x="Horsepower",
    y="Miles_per_Gallon",
    color="Origin",
).facet(
    row="Cylinders",
    column=alt.Column("Origin", sort=["USA", "Europe", "Japan"]),
)
chart.to_json()

This generates the following Vega-Lite spec, in which data points are placed in the incorrect column. If you remove the sort, the points are placed in the correct column.
Open the Chart in the Vega Editor

@NickCrews

A total hack of a workaround is to pad the values with Unicode "zero width space" characters, so that their natural sort order is what you want but they still display as normal:

import altair as alt
from vega_datasets import data

source = data.cars()

# Default order is alphabetical, so would be Europe, Japan, USA
origins_sorted = [
    "USA",
    "Europe",
    "Japan",
]


def make_sortable(values, sort_order):
    # Prefix each value with i zero width spaces, where i is its position in sort_order
    # From https://superuser.com/questions/1590069/are-there-special-characters-that-can-be-used-for-sorting-but-not-displaying
    m = {orig: (i * "\u200b") + orig for i, orig in enumerate(sort_order)}
    return values.replace(m)


source = source.assign(Origin2=make_sortable(source.Origin, origins_sorted))
# now Origin2 is "USA", "\u200bEurope", "\u200b\u200bJapan"

chart = (
    alt.Chart(source, width=100, height=100)
    .mark_point()
    .encode(
        x="Horsepower",
        y="Miles_per_Gallon",
        color=alt.Color(
            "Origin",
            # Optional: make the legend for the color consistent with the facets
            sort=origins_sorted,
        ),
    )
    .facet(
        row="Cylinders",
        column="Origin2",
    )
)
chart
image
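To see why the padding trick works, here is a minimal sketch using only the standard library. It assumes the labels start with ordinary characters below U+200B and that the renderer falls back to plain code-point ordering; `pad_for_sort` is a hypothetical helper name, not part of Altair or Vega-Lite.

```python
ZWSP = "\u200b"  # zero width space


def pad_for_sort(labels):
    """Map each label to a copy prefixed with i zero width spaces (i = desired position)."""
    return {label: ZWSP * i + label for i, label in enumerate(labels)}


desired = ["USA", "Europe", "Japan"]
padded = pad_for_sort(desired)

# Sorting the padded strings by code point reproduces the desired order,
# because a label with fewer pads hits an ordinary character first.
in_sorted_order = [s.replace(ZWSP, "") for s in sorted(padded.values())]
print(in_sorted_order)  # ['USA', 'Europe', 'Japan']
```

The padded strings still contain the invisible characters, so tooltips, filters, or joins that use the padded field would need to account for them.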

@apb-reports

Any news on this, Vega team?

Still a bug, as you can see here. The only solution I have found is to push fake data into the data set so that every row and column has data, but this shouldn't be necessary.

{
  "data": {"url": "data/cars.json"},
  "mark": "bar",
  "transform": [
    {
      "filter": "datum.Origin === 'Japan' || datum.Origin === 'Europe'"
    },
    {
      "filter": "datum.Horsepower >= 110"
    },
    {
      "joinaggregate": [{"op": "count", "field": "Name", "as": "CountOrigin"}],
      "groupby": ["Origin"]
    },
    {
      "calculate": "slice('000000' + format(datum.CountOrigin, '.0f'), -6) + '-' + datum.Origin",
      "as": "OriginSort"
    },
    {
      "joinaggregate": [
        {"op": "count", "field": "Name", "as": "CountCylinders"}
      ],
      "groupby": ["Cylinders"]
    },
    {
      "calculate": "slice('000000' + format(datum.CountCylinders, '.0f'), -6) + '-' + format(datum.Cylinders, '.0f')",
      "as": "CylindersSort"
    }
  ],
  "encoding": {
    "y": {"aggregate": "count", "field": "Name", "type": "quantitative"},
    "row": {
      "field": "Origin",
      "type": "nominal",
      "sort": {"field": "OriginSort", "order": "descending"}
    },
    "column": {
      "field": "Cylinders",
      "type": "quantitative",
      "sort": {"field": "CylindersSort", "order": "descending"}
    },
    "tooltip": [
      {"field": "Name"},
      {"field": "Origin"},
      {"field": "OriginSort"},
      {"field": "CylindersSort"}
    ],
    "color": {"field": "Horsepower", "type": "ordinal"}
  }
}
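For reference, the fake-data workaround mentioned above could be sketched roughly as follows, done in Python before the data reaches the chart. This is only an illustration under assumptions: `pad_missing_cells` and the `placeholder` flag are hypothetical names, and the sample records are made up to show the shape of the result.

```python
from itertools import product


def pad_missing_cells(rows, row_field, col_field):
    """Return rows plus one placeholder record per empty (row, column) facet cell."""
    row_values = sorted({r[row_field] for r in rows})
    col_values = sorted({r[col_field] for r in rows})
    present = {(r[row_field], r[col_field]) for r in rows}
    placeholders = [
        {row_field: rv, col_field: cv, "placeholder": True}
        for rv, cv in product(row_values, col_values)
        if (rv, cv) not in present
    ]
    return rows + placeholders


# Made-up sample: Europe has no 6-cylinder record, so that facet cell is empty.
cars = [
    {"Origin": "Japan", "Cylinders": 4, "Name": "a"},
    {"Origin": "Japan", "Cylinders": 6, "Name": "b"},
    {"Origin": "Europe", "Cylinders": 4, "Name": "c"},
]
padded = pad_missing_cells(cars, "Origin", "Cylinders")
```

The placeholder rows would still need to be kept out of any aggregates in the spec, otherwise they may show up as extra counts.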

@PBI-David
Contributor Author

I managed to fix the sorting on this by using a number and then a label expression. There is definitely still a bug here, although my workaround solves my use case. @domoritz, let me know if you want this in the examples.

Link

visualization
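The "number plus label expression" idea can be sketched as a small Python helper that builds the relevant Vega-Lite fragments. This is not the linked spec, only a guess at the general shape: `ranked_facet` and `facet_rank` are hypothetical names, and it assumes the header `labelExpr` can read the numeric facet value as `datum.value`.

```python
import json


def ranked_facet(field, ordered_labels, rank_field="facet_rank"):
    """Build a calculate transform and a facet channel that sort by a numeric rank."""
    # A JSON array literal is also a valid array literal in a Vega expression.
    labels_js = json.dumps(ordered_labels)
    calculate = {
        # Rank of each record = position of its label in the desired order.
        "calculate": f"indexof({labels_js}, datum[{json.dumps(field)}])",
        "as": rank_field,
    }
    channel = {
        "field": rank_field,
        "type": "ordinal",  # numeric ranks sort ascending by default
        # Map the rank back to the original label for display.
        "header": {"labelExpr": f"{labels_js}[datum.value]", "title": field},
    }
    return calculate, channel


calculate, channel = ranked_facet("Origin", ["USA", "Europe", "Japan"])
print(json.dumps({"transform": [calculate], "encoding": {"column": channel}}, indent=2))
```

Because the facet is then driven by a plain number, no custom `sort` is needed on the channel, which appears to be what sidesteps the problem.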
