Add initial table flattening #4
Conversation
```go
// lookupKeyDelimiter is used to flatten a string array to a single key.
const lookupKeyDelimiter = "🔑🗝😡🗝🔑"
```
This may actually be a really poorly performing way of splitting this, but I wanted a delimiter with the least possible chance of colliding with row/column string values.
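To make the idea concrete, here's a minimal sketch of flattening a path of aggregation values into a single map key with that delimiter and recovering it again. The function names (`flattenKey`, `splitKey`) are illustrative, not taken from the PR:

```go
package main

import (
	"fmt"
	"strings"
)

// lookupKeyDelimiter mirrors the constant under review; the emoji sequence
// is vanishingly unlikely to appear in real row/column string values.
const lookupKeyDelimiter = "🔑🗝😡🗝🔑"

// flattenKey joins a path of aggregation values into a single map key.
func flattenKey(parts []string) string {
	return strings.Join(parts, lookupKeyDelimiter)
}

// splitKey recovers the original path from a flattened key.
func splitKey(key string) []string {
	return strings.Split(key, lookupKeyDelimiter)
}

func main() {
	key := flattenKey([]string{"Completed", "Engineering", "Team A"})
	fmt.Println(splitKey(key)[1]) // Engineering
}
```

`strings.Join`/`strings.Split` scan the whole key each time, which is where the "badly performant" worry comes from, but for result-set-sized data it's unlikely to matter.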
I like it! 😂
Looks good man. Plugged it in and had a play 👍
A couple of questions:

- Is there a specific reason why the deepest aggregation becomes the column titles? Lance would prefer the first aggregation to be the column titles, and the rest row titles. This probably makes more sense in conjunction with the point below. Playing around locally, I've switched them around and it doesn't seem to have an impact - but thought I'd check.
- We need the ability to group the rows by headers within the aggregation (see example screenshot below). In your opinion, should this be done on the backend, or should the FE handle grouping it as it needs? As an example, I've taken some real data and bunged it into a table 'as is'. Then I've also replicated it in the group format that's preferred (all with some hacky JS 😂).
With the examples you have, I don't think there's even a column aggregation, to be honest - it's just rows, right? An example with columns would be to add a time component, by month for example, and I'm pretty sure you'd have the time as the deepest aggregate on the query, but I may be wrong. Your example would be three different metrics for each cell, one for the sum of each of the three different totals. This, for example, would be a single-metric report with a column aggregated by month.
I think column vs row is basically irrelevant, as you can swap them, can't you? Trying to figure that one out in my head, I can't see a reason not to. Anyway, it's easy enough to go either way, but I figured you'd generally be bucketing on the x axis, then narrowing on the y. Does that make sense? As for the headers, I think that's best left to the front end - it's a view issue rather than a data issue. It will, however, mean we need to lock in a sorting process to ensure the grouping doesn't end up out of place.
Actually, I can see I'm wrong now - the status is the column and it's just a single metric. Ignore that part of my comment ;) So you're saying Lance wants to aggregate by "status" first, then by "Department", then by "Team" (or whatever those are)?
Yeah, bang on - in the example used, he would want to aggregate by status first. Here's an example request payload from the FE:

```json
{
  "query_type": "job",
  "aggregation": {
    "type": "string",
    "field": "job_status.name",
    "aggregation": {
      "type": "string",
      "field": "department.name",
      "aggregation": {
        "type": "string",
        "field": "team.name"
      }
    }
  },
  "metrics": [
    {
      "type": "count",
      "field": "job.id"
    },
    {
      "type": "sum",
      "field": "job.weekly_hours"
    }
  ]
}
```
Ok, so my gut reaction is that this is wrong and the initial implementation is correct. Row-first is pretty much everywhere in computer science, from matrix ordering to array ordering. It would also mean we're being little-endian over big-endian, which feels wrong. How strongly does Lance feel about it? Feel free to bring him into the convo here.
Cool. As discussed in Slack, let's leave this as is!
This seems to be working ok; however, it has highlighted the lack of sorting in our initial result set implementation. Being a map, it's unable to retain ordering. This probably isn't an issue, as what we could be doing now is working out how we want to sort and filter data ("sort by mean, 10 results only" sorta thing).

Not sure where that would live, as given a "table" you may want to sort on a total (i.e. the row). This is easy to do for max etc., however average wouldn't be possible without a valueCount, for example.

May need a think, but we can merge this in the meantime; it's just that the tests are flaky because the input is non-deterministic (maps don't preserve order).
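For the flaky tests specifically, the usual fix is to sort the flattened keys before emitting or asserting on rows. A minimal sketch (the `map[string]float64` result-set shape is a stand-in, not the PR's actual type):

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Go maps iterate in a randomized order by design, so any output built by
// ranging over the result map is non-deterministic. Sorting the flattened
// keys gives a stable ordering for tests (and a hook for future sort rules).
const lookupKeyDelimiter = "🔑🗝😡🗝🔑"

// orderedKeys returns the result set's flattened keys in sorted order.
func orderedKeys(results map[string]float64) []string {
	keys := make([]string, 0, len(results))
	for k := range results {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	results := map[string]float64{
		"Completed" + lookupKeyDelimiter + "Team B": 12,
		"Completed" + lookupKeyDelimiter + "Team A": 7,
	}
	// Deterministic output regardless of map iteration order.
	for _, k := range orderedKeys(results) {
		fmt.Println(strings.Join(splitParts(k), " / "), results[k])
	}
}

func splitParts(key string) []string {
	return strings.Split(key, lookupKeyDelimiter)
}
```

Lexicographic order is just a placeholder; once "sort by mean, top 10" style rules exist, the comparator in `sort` is the natural place to plug them in.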