Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected GROUP BY results #504

Closed
tuxtlequino opened this issue Sep 28, 2021 · 10 comments
Closed

Unexpected GROUP BY results #504

tuxtlequino opened this issue Sep 28, 2021 · 10 comments
Assignees
Labels

Comments

@tuxtlequino
Copy link

@tuxtlequino tuxtlequino commented Sep 28, 2021

I am not sure how I am supposed to used the GROUP BY

It seems that if I group something, It doesn't show any other value in the rows other than the field Dataview is grouping things by. The other rows just show a dash on them. I am not sure I understand that use case of the GROUP BY with the data it is returning. Maybe I am missing something or group by is still not fully implemented. If an example of group by as intended is provided with some sample datasource, it will help to clarify a lot.

Thanks in advance.

@tuxtlequino
Copy link
Author

@tuxtlequino tuxtlequino commented Sep 28, 2021

Okay, I figured this out yesterday. I will be using the following values from a csv files someone had provided in this forum. Just create a file and name it something like "Test.csv" and put this information in the file and add the file to your folder (remember that it will not show there unless you add the excellent obsidian plugin "CSV Editor".)

Name,Category,Priority
1,Skill,High
2,Skill,High
3,Adventure,Low
4,Skill,Medium
5,Skill,Medium
6,Exam,Medium
10,Exam,Medium
12,Health,Medium
2131,Family,High
10,Exam,Medium
12,Health,Medium
2131,Family,High
10,Exam,Medium
12,Health,Medium
2131,Family,High

Now that we have some information, we can just do a simple query.

TABLE WITHOUT ID name, category, priority 
FROM csv("Test.csv")
SORT name asc

Just remember that the actual spelling of the CSV file matters and if you named your file "Test.csv" you need to type "Test.csv" and not "test.csv."

I chose this file as an example, because your can group this be everyone of it's rows. Now, let's do a simple GROUP BY example with this data.

TABLE WITHOUT ID name, category, priority 
FROM csv("Test.csv")
group by name
sort name asc

The first thing that one see is that other than the row Dataview is grouping by, the other information is not being showed. Somewhere in the documentation you read the following,

Group all results on a field. Yields one row per unique field value, which has 2 properties: one corresponding to the field being grouped on, and a rows array field which contains all of the pages that matched.

But what in the world does that mean? I think that the easiest way to show it, will be by looking at the following query.

TABLE WITHOUT ID name, category, priority, rows
FROM csv("Test.csv")
group by name
sort name asc

If you see there, I added the rows column to the table. This column will contain the array of all the pages that matched. You will see that Dataview is also using "name" and "Name" to include the fields with or without the initial capitalization. I will add a couple of examples to show how refining things will make group by even more useful

TABLE WITHOUT ID rows.name, rows.category, rows.priority
FROM csv("Test.csv")
where priority="High"
group by name
sort name asc
TABLE WITHOUT ID rows.name, rows.category, rows.priority
FROM csv("Test.csv")
where name=10
group by name
sort name asc

Notice in the last query how the column name is automatically identified as a number. This is great, because this means that you can finally do something like the following.

TABLE WITHOUT ID 
name, rows.category as Category, rows.priority as Priority, sum(rows.name) as Sum
FROM csv("Test.csv")
group by name
sort name asc

If you go back to the initial query

TABLE WITHOUT ID name, category, priority, rows
FROM csv("Test.csv")
group by name
sort name asc

Notice how row sometimes had several array values. This means that you can add those values as long as they are recognized as numbers.

I hope that this will help someone.

@blacksmithgu
Copy link
Owner

@blacksmithgu blacksmithgu commented Sep 28, 2021

The Dataview GROUP BY operator is a little confusing (perhaps my documentation is insufficient?), mainly because it puts everything into a rows attribute for each unique key. I've considered reworking it, though I will likely add a different operator which achieves the more intuitive visual grouping that people are looking for.

@tuxtlequino
Copy link
Author

@tuxtlequino tuxtlequino commented Sep 28, 2021

Yes. I think that the most intuitive way to use GROUP BY would be for the arrays to be seamless and instead of renaming the fields to rows.field to keep the same field name.

In that way you don't have to change your query if you want to use GROUP BY. It will also eliminate a lot of confusion. Specially since people don't know what in the world does some of the documentation means if they are new to Obsidian or Dataview.

I think that if you want to help people, it will be good to include a set of data (like the csv I use) and assume the people will use that dataset in the examples. It is hard to visualize what a query is doing if you don't have the original dataset available.

Thank you for the response and the plugin. This is awesome. The only thing I was wondering was why you cannot use FLATTEN with the arrays produced by GROUP BY

@blacksmithgu
Copy link
Owner

@blacksmithgu blacksmithgu commented Sep 28, 2021

You can - it's just not super intuitive since you have to flatten rows, not the original field.

TABLE .... FROM ...
GROUP BY field
FLATTEN rows

@tuxtlequino
Copy link
Author

@tuxtlequino tuxtlequino commented Sep 28, 2021

You can - it's just not super intuitive since you have to flatten rows, not the original field.

TABLE .... FROM ...
GROUP BY field
FLATTEN rows

Ohh. I had tried that but it seemed to undo the GROUP BY. Not sure if a bug or simple user error.

@blacksmithgu
Copy link
Owner

@blacksmithgu blacksmithgu commented Sep 28, 2021

You can also flatten on a field inside the rows

TABLE ... FROM ...
GROUP BY field
FLATTEN rows.some-other-field

@tuxtlequino
Copy link
Author

@tuxtlequino tuxtlequino commented Sep 28, 2021

I couldn't get this FLATTEN to work. It is still undoing the work done by GROUP BY

@blacksmithgu
Copy link
Owner

@blacksmithgu blacksmithgu commented Sep 29, 2021

What's your desired outcome?

@tuxtlequino
Copy link
Author

@tuxtlequino tuxtlequino commented Oct 2, 2021

Thank you very much for taking the time in this forum. I think I just needed to create more data and more examples to figure things out. I think I figured things out.

Again, thank you for your help.

@blacksmithgu
Copy link
Owner

@blacksmithgu blacksmithgu commented Oct 2, 2021

No problem!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants