Support pre-binned data #2912
Comments
Relevant to tracking binned data -- #2862. |
It looks like we can make the
Either |
For now one workaround is just use Just be aware that you a) don’t get empty bins and b) don’t get the nice axis labels
|
To make sure the scale includes missing bins, you can also use
Without manual scale |
Given pre-binned data, we may also want to support the case where the bins are non-uniform. |
There are two strategies: ask users to provide the start, end, and step, or ask users to give us the fields that are the bin boundaries and compute the steps from those. The latter allows non-uniform bins. |
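The two strategies can be compared with a small sketch. This is plain Python, not Vega-Lite code; the function names `boundaries_from_step` and `steps_from_boundaries` are hypothetical and only illustrate what each strategy would have to compute.

```python
from __future__ import annotations

from typing import Sequence


def boundaries_from_step(start: float, end: float, step: float) -> list[tuple[float, float]]:
    """Strategy 1 (hypothetical helper): the user gives start, end and step.

    Only uniform bins can be described this way.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    count = round((end - start) / step)
    return [(start + i * step, start + (i + 1) * step) for i in range(count)]


def steps_from_boundaries(starts: Sequence[float], ends: Sequence[float]) -> list[float]:
    """Strategy 2 (hypothetical helper): the user gives the boundary fields.

    The step is derived per bin, so non-uniform bins are allowed.
    """
    if len(starts) != len(ends):
        raise ValueError("starts and ends must have the same length")
    return [end - start for start, end in zip(starts, ends)]
```

With boundaries such as `[0, 10, 30]` and `[10, 30, 35]`, the second helper yields a different step per bin, which the first strategy cannot express.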
|
I think these are required properties to have in
The reason is that if the users have
|
I think we need to support two use cases without additional calculate Thinking more, if possible I would like to avoid adding a Alternatively, we could think about pre-binned data in two cases:
For the @sirahd Anyway, it would help move our conversation forward if you can summarize what binned field does differently for scales and axes of position channels as well as scales and legends of non-position channels. |
For position channels,
For non-position channels,
@kanitw Please feel free to add anything that I might've missed here! |
It seems like the only thing we need is a new property for letting the axis know the step size and this property should affect axis Basically, (Note that I cheat by using a signal down here; it's not really officially supported in VL)
With this new
|
@sirahd I corrected the comment above to include |
@sirahd It might be better to have tickStep affect values only (and note in the docs that both tickStep and values are affected by tickCount). The rationale is that we need tickStep to behave like the bin's step on the ticks, so that a bin can be extracted from the encoding while preserving the same behavior. |
@kanitw I think we still need to override |
Yeah, maybe you're right. In any case, we should make sure that extracting the bin to a transform can still produce 100% identical output. Thus, maybe we should merge any progress on this topic to a feature branch instead of |
After a long discussion, we decided that we'll add
|
It will actually affect offset too, but we can argue that's a part of how data get converted to visual values too (a scale = function from data domain to visual values). |
When we use a point mark, we only have one field. How would one create https://vega.github.io/editor/#/examples/vega-lite/circle_binned but with prebinned data? |
In this case we want to encode the point position to be bin_mid, but set the scale domain to combine bin_start and bin_end. Thus, I think we should extend scale.domain to support fields. I'm adding a new issue in #3818. |
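As a rough illustration of that idea, the sketch below derives a `bin_mid` value for each row and a domain that spans all `bin_start` and `bin_end` values. It is plain Python rather than a Vega-Lite spec; the helper name `midpoints_and_domain` is hypothetical, and the rows are assumed to be dictionaries with `bin_start` and `bin_end` keys as in the discussion.

```python
from __future__ import annotations


def midpoints_and_domain(rows: list[dict]) -> tuple[list[float], tuple[float, float]]:
    """Hypothetical helper: point positions and scale domain for pre-binned rows."""
    if not rows:
        raise ValueError("rows must not be empty")
    # Each point is placed at the middle of its bin.
    mids = [(row["bin_start"] + row["bin_end"]) / 2 for row in rows]
    # The domain combines both boundary fields, so the outermost bin edges
    # stay visible even though the marks sit at the midpoints.
    domain = (
        min(row["bin_start"] for row in rows),
        max(row["bin_end"] for row in rows),
    )
    return mids, domain
```

If the domain were taken from `bin_mid` alone, it would stop half a bin short of the data extent on each side, which is why the domain needs to draw on the boundary fields.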
@sirahd @jakevdp @jheer @arvind Please vote for your favorite! We are trying to decide what the syntax should be if you already have binned data and want to render it but still have nice axes and legends. The dataset already has In the specs below, replace
{
"data": {"url": "binned_data.json"},
"mark": "bar",
"encoding": {
"x": {
???,
"field": "bin_start",
"type": "quantitative"
},
"x2": {
"field": "bin_end",
"type": "quantitative"
},
"y": {
"aggregate": "count",
"type": "quantitative"
}
}
}
Replace with:
|
I thought more about this. I think both 2. and 3. require adding one more property, and thus are less discoverable. (People already know about
I think Thus my vote is |
- implements #2912 - I'll add examples after code change is approved
Fixed in #3937 |
From an earlier conversation with @domoritz, supporting pre-binned data would be useful for connecting with a database. I think this would be useful for @leibatt as well.
From the conversation, the tricky part is how to know the bin step.
I wonder if we should just let users input the step in such cases?
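To see why a user-supplied step helps, here is a minimal sketch, assuming the pre-binned data arrives as a mapping from bin start to count (for example from a database query that omits empty bins). The helper name `fill_empty_bins` and the input shape are assumptions for illustration, not part of Vega-Lite.

```python
from __future__ import annotations


def fill_empty_bins(
    counts: dict[float, int], start: float, end: float, step: float
) -> list[tuple[float, int]]:
    """Hypothetical helper: rebuild the full bin sequence from a known step.

    Bins missing from `counts` are reported with a count of zero.
    Note: exact dictionary lookup is used, so float bin starts that are not
    exactly representable would need rounding or a tolerance.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    bin_count = round((end - start) / step)
    return [
        (start + i * step, counts.get(start + i * step, 0))
        for i in range(bin_count)
    ]
```

Without the step, a gap between two bin starts could mean either one wide bin or several empty ones; with it, the empty bins can be restored.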