Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Confuse about Group By Labeling #21

Closed
zhanzecheng opened this issue Jan 10, 2019 · 2 comments
Closed

Confuse about Group By Labeling #21

zhanzecheng opened this issue Jan 10, 2019 · 2 comments

Comments

@zhanzecheng
Copy link

zhanzecheng commented Jan 10, 2019

Hi Tao,

Confuse about some group by label, show as below

1.Show the names of singers that have more than one song.
SELECT T1.Name FROM singer AS T1 JOIN song AS T2 ON T1.Singer_ID  =  T2.Singer_ID GROUP BY T1.Name HAVING COUNT(*)  >  1

2.What are the names of the singers that have more than one songs?
SELECT T1.Name FROM singer AS T1 JOIN song AS T2 ON T1.Singer_ID  =  T2.Singer_ID GROUP BY T1.Name HAVING COUNT(*)  >  1

3.find the name of employee who was awarded the most times in the evaluation.
SELECT t1.name FROM employee AS t1 JOIN evaluation AS t2 ON t1.Employee_ID  =  t2.Employee_ID GROUP BY t2.Employee_ID ORDER BY count(*) DESC LIMIT 1

4.Which employee received the most awards in evaluations? Give me the employee name.
SELECT t1.name FROM employee AS t1 JOIN evaluation AS t2 ON t1.Employee_ID  =  t2.Employee_ID GROUP BY t2.Employee_ID ORDER BY count(*) DESC LIMIT 1
​

Could you explain why 1 and 2 are group by name but 3 and 4 and group by primary key?

Thanks a lot,
Zecheng

@taoyds
Copy link
Owner

taoyds commented Feb 13, 2019

Hi Zecheng,

These questions are a little bit confusing if there are some singers with the exact same name. In most of the cases, we group the result by primary key. We require our annotators to select the column name which the SQL groups by. They might group by the name instead of the primary key if the question asking for the names.

Best,
Tao

@taoyds taoyds closed this as completed Feb 13, 2019
@GaoyanCheerup
Copy link

Hi, @taoyds , may I ask when will group the result by primary key, and when will group the result by the selected column name? From the training data, it seems hard to find pattern. That's maybe the reason why Group By clause (no having) has a low accuracy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants