Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[database] Make season_view and tvshow_view SQL ANSI92-compliant #15185

Merged
merged 1 commit into from Jan 4, 2019

Conversation

@wildcat2020
Copy link
Contributor

commented Jan 2, 2019

Description

For season_view: removed Group By clause as it is not needed and it errors out.
For tvshow_view: added full Group By Clause to make SQL ANSI 92 compliant.

Motivation and Context

This change is related to issue #15172
"myvideos113.season_view and tvshow_view not ANSI 92 compliant and therefore broken"

How Has This Been Tested?

The SELECT statements were tested via MySQL workbench, however I don't have the means to rebuild kodi from my branch and test so a final test of the view creation statements is needed. Below is description of final test to be made:

  • Build new kodi with this commit included.
  • Setup in advancesettings.xml video database to open to MySQL 5.7 or later
  • Open Kodi and let it create the kodi schema
  • Login via MySQL workbench to myvideos113 database as kodi user, and try to query from the below two views:
  1. season_view
  2. tvshow_view

Query should succeed for both views, returning zero rows.

Screenshots (if appropriate):

Types of change

  • Bug fix (non-breaking change which fixes an issue)
  • Clean up (non-breaking change which removes non-working, unmaintained functionality)
  • Improvement (non-breaking change which improves existing functionality)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that will cause existing functionality to change)
  • Cosmetic change (non-breaking change that doesn't touch code)
  • None of the above (please explain below)

Checklist:

  • My code follows the Code Guidelines of this project
  • My change requires a change to the documentation, either Doxygen or wiki
  • I have updated the documentation accordingly
  • I have read the Contributing document
  • I have added tests to cover my change
  • All new and existing tests passed
@pkerling

This comment has been minimized.

Copy link
Member

commented Jan 2, 2019

Thanks for the PR! I'm no SQL wiz so I hope I don't completely embarrass myself, but wouldn't the season_view also work without GROUP BY? Grouping by the id does not make sense IMO.

@wildcat2020

This comment has been minimized.

Copy link
Contributor Author

commented Jan 2, 2019

Correction;

season_view has the below aggregate functions:

count(DISTINCT episode.idEpisode) AS episodes,"
" count(files.playCount) AS playCount,"
" min(episode.c%02d) AS aired "

So it needs group by clause.
tvshow_view on the other hand, has no aggregate function in the SELECT clause and can do without group by clause, and therefore I have removed it.
However, note that by removing the group by, we are not making the statement 100% equivalent with what it was before, because group by operation includes PROJECTION and SORT operation which removes duplicates. I am not sure how the overall data schema design was done, but assuming this select statement does not return duplicates (I checked for duplicates coming from this view without the group by clause in myvideos113 database and I did not find any), it should be safe and faster (as it removes PROJECTION and hence SORT operation) to remove the group by clause. I hope this makes sense.

@wildcat2020

This comment has been minimized.

Copy link
Contributor Author

commented Jan 2, 2019

OK, I think I did it, I added a second commit. Please kinldy check. It should be ok now.

@wsnipex

This comment has been minimized.

Copy link
Member

commented Jan 2, 2019

jenkins build this please

@wildcat2020

This comment has been minimized.

Copy link
Contributor Author

commented Jan 2, 2019

Great, I saw it was built so I am downloading 64bit windows one to install and test....will let you know of the test result soon...

@wildcat2020

This comment has been minimized.

Copy link
Contributor Author

commented Jan 2, 2019

OK, tested it. season_view was not generated at all, and tvshow_view was generated but still complaining about missing group by. This is because also tvshowcounts is not correct as it is also missing GROUP BY clause.

@wildcat2020

This comment has been minimized.

Copy link
Contributor Author

commented Jan 2, 2019

still not ok...investigaiting...

@wildcat2020

This comment has been minimized.

Copy link
Contributor Author

commented Jan 2, 2019

OK, should be corect now after all those changes, but build fails with error:
$GIT_COMMIT (aca9256) doesn't match currently checked out revision (0308f2c)
Any help on how to proceed with build for testing would be appreciated.

@kib

This comment has been minimized.

Copy link
Member

commented Jan 2, 2019

jenkins build this please

wildcat2020 added a commit to wildcat2020/xbmc that referenced this pull request Jan 2, 2019
@wildcat2020 wildcat2020 force-pushed the wildcat2020:feature_myvideos113 branch from 2a90ad7 to 52aafce Jan 3, 2019
@pkerling

This comment has been minimized.

Copy link
Member

commented Jan 3, 2019

Thanks a lot for putting your time into this. Code is looking good now, but the commit message needs some work. It is best if you put a short concise and description of the fix into the title and not the issue number, because that is very opaque and does not help when scrolling through a list of commits. I'd suggest [database] Make season_view and tvshow_view SQL ANSI92-compliant for example. In the message body, you can describe more in-depth what changes were required to achieve the compatibility and finally say Fixes #15172. That magic line will cause GitHub to automatically close the issue once the pull request gets merged.

@pkerling

This comment has been minimized.

Copy link
Member

commented Jan 3, 2019

Also as @sopparus suggested it would be good to bump the DB version so Kodi recreates the views that people already might have. It is this line:

return 113;
- just change it to 114.

@wildcat2020 wildcat2020 force-pushed the wildcat2020:feature_myvideos113 branch from 52aafce to 8f9d072 Jan 3, 2019
@pkerling

This comment has been minimized.

Copy link
Member

commented Jan 3, 2019

..this is why I need to test this...I am not sure why whoever chose to write this file used variables to include in the prepare statement....this is a level of abstraction that I am not used to...

Mostly historic reasons as far as I'm aware. The whole database layer is pending a complete rewrite.

did it again and bumped the database version, please kindly check.

Version bump is fine. Can you put a message body into the commit message with more details and the "Fixes:" line as I requested please? Right now it seems to be empty.

@pkerling

This comment has been minimized.

Copy link
Member

commented Jan 3, 2019

Oh forgot to mention: SQL statement changes (including the column names) should be fine as far as I can tell.

@wildcat2020

This comment has been minimized.

Copy link
Contributor Author

commented Jan 3, 2019

Ok, thanks. I see I can use two times -m switch in git commit to add both title and message body, I'll try it out...

@wildcat2020 wildcat2020 force-pushed the wildcat2020:feature_myvideos113 branch from 8f9d072 to 18c7d6c Jan 3, 2019
@wildcat2020

This comment has been minimized.

Copy link
Contributor Author

commented Jan 3, 2019

Ok, done, please check.

@pkerling

This comment has been minimized.

Copy link
Member

commented Jan 3, 2019

Ok, thanks. I see I can use two times -m command in git commit to add both title and message body, I'll try it out...

You can also just leave out -m and the message on the command line and an editor should open where you can enter longer commit messages.

Last request (I think): You have to write fixes #15172 and not fixes issue #15172. If you include any other words between fixes and the number, the GitHub magic will not work unfortunately. Note that this is not that important, I'm just saying it as a learning experience for the future :-)

@wildcat2020

This comment has been minimized.

Copy link
Contributor Author

commented Jan 3, 2019

thanks, a lot. Now that I will do again the change, I think I should also fix one more thing....Although the current changes should make the views created valid, there is still something that is not a good coding practices used in tvshow_view and season_view: who ever wrote the statement, put in SELECT tvshow.* and SELECT seasons.*
If those base tables ever get a new column added, this will break the view again as the new column will automatically appear in the SELECT clause but not in the GROUP BY clause...
Would you like me to remove that as well and add normally the columns of the base tables?

@pkerling

This comment has been minimized.

Copy link
Member

commented Jan 3, 2019

Would you like me to remove that as well and add normally the columns of the base tables?

Yes, that sounds like a good idea.

@wildcat2020

This comment has been minimized.

Copy link
Contributor Author

commented Jan 3, 2019

ok, I'll do this in the next few hours (have to check the tables definitions from MySQL, and will also re-commit with the change in the message body again...have to take out the family now :-)

@wildcat2020 wildcat2020 force-pushed the wildcat2020:feature_myvideos113 branch from 18c7d6c to 4295579 Jan 3, 2019
@wildcat2020

This comment has been minimized.

Copy link
Contributor Author

commented Jan 3, 2019

ok did it, please kindly check

m_pDS->exec(tvshowcounts);

CLog::Log(LOGINFO, "create tvshow_view");
std::string tvshowview = PrepareSQL("CREATE VIEW tvshow_view AS SELECT "
" tvshow.*,"
" tvshow.idShow AS idShow,"
" tvshow.c00 AS c00, tvshow.c01 AS c01, tvshow.c02 AS c02, tvshow.c03 AS c03,"

This comment has been minimized.

Copy link
@pkerling

pkerling Jan 4, 2019

Member

Here also the column number constants should be used with PrepareSQL. I know it's tedious, but it's the only clean way with the strange design that we have. You can find the list of all constants and their values in VideoDatabase.h.

This comment has been minimized.

Copy link
@pkerling

pkerling Jan 4, 2019

Member

Actually... I think you can leave tvshow.* here. tvshow_view should behave mostly like tvshow, so including all columns is natural and we do not have a GROUP BY so there is no need to spell them out explicitly. Sorry for the confusion.

This comment has been minimized.

Copy link
@wildcat2020

wildcat2020 Jan 4, 2019

Author Contributor

Ok, changed back tvshow_view to SELECT tvshow.* as there is no way I can safely understand which labels to use for the tvshow.c%02d columns....
Possibly you guys need to create a wiki to document the logic of this videodatabase.h and which set of labels correspond to which tables...

This comment has been minimized.

Copy link
@MartijnKaijser

MartijnKaijser Jan 4, 2019

Member

You mean https://kodi.wiki/view/Databases/MyVideos
Database is gong to be redesigned anyway soon so if you are willing to help on that
https://github.com/xbmc/xbmc/tree/feature_odb

This comment has been minimized.

Copy link
@wildcat2020

wildcat2020 Jan 4, 2019

Author Contributor

Cool, many thanks. Let me know how I can help...

This comment has been minimized.

Copy link
@pkerling

pkerling Jan 4, 2019

Member

@lobermann Feel free to ask if you have something concrete you need help with. I'm not really up-to-date on the DB rework state.

removed group by clause from tvshow_view and added full group by clause to season_view as well as expanded select all from base table seasons to select every individual column in season_view: fixes #15172
@wildcat2020

This comment has been minimized.

Copy link
Contributor Author

commented Jan 5, 2019

From my point of view, this information is useless at TV show level.
If you would like to consider something like the below functionality (which would make sense from my point of view):

"Watch on: Netflix Vudu YouTube /data/...."

And be able to click on each source to start playing,

You would anyway make it at episode level, not at TV Show level...

@DaVukovic

This comment has been minimized.

Copy link
Member

commented Jan 5, 2019

Tbh, I mostly find this information pretty usefull. Haven't read the full story, but at least at the episode level and while at movies we should keep this info as that's the only section where you get the info about the file behind the item and if it's a mkv, an ISO or whatever. Pretty helpfull if you have different kind of media stored and also might be pretty helpful to debug problems if you know where the items comes from (smb, nfs, local)

@wildcat2020

This comment has been minimized.

Copy link
Contributor Author

commented Jan 5, 2019

I agree, this information is very useful at episode level if you can action on it ofcourse (I.e.choose from which source to play from) it could be also different quality files, like 1080 720 480...
But not at TV show level.

@MartijnKaijser

This comment has been minimized.

Copy link
Member

commented Jan 5, 2019

Ah the old feature (issue, cause this was also a pain for my artwork downloader addon to get it right) that you had your episodes scattered across several disks cause there wasn't enough space to put it all on one disk.

@phate89

This comment has been minimized.

Copy link
Contributor

commented Jan 5, 2019

@DaVukovic yes it is useful but it's potentially incorrect. You might have more than one path.. and that field only display one of the paths so you'll get a path where your file doesn't exist
@pkerling yes that is the path but actually in the info dialog we gather the full details of a video so we might actually espose all the paths in that field if we want with some change.
@wildcat2020 yes but current form isn't viable either. And we're finalizing leia so dropping the field isn't a viable solution. I can't think a solution to restore the same exact behavior changing only the query (at least not without a couple of subqueries).. we could still use a group by and take the min value (the first in alfabetical order) or edit the tv show table to have an external id to the "main" path and use that (not a great solution imo since it requires updates in several places when we drop the path)..

for the future the ideal solution could be drop the tv show "path" fields and expose a "paths" field with all the paths

@pkerling

This comment has been minimized.

Copy link
Member

commented Jan 5, 2019

Haven't read the full story, but at least at the episode level and while at movies we should keep this info

Sure. This is strictly about the TV show level.

we could still use a group by and take the min value (the first in alfabetical order)

That sounds like the most sensible thing to do for v18. Can you cook up a PR for that?

@wildcat2020

This comment has been minimized.

Copy link
Contributor Author

commented Jan 5, 2019

I could restore my branch and re-push also the change with minimum path?

@pkerling

This comment has been minimized.

Copy link
Member

commented Jan 5, 2019

@wildcat2020 If you want to try that, sure. But you will have to open a new PR in any case.

@wildcat2020

This comment has been minimized.

Copy link
Contributor Author

commented Jan 5, 2019

I will be lost...better somebody else do it....I am not even certain how to delete my branch at local level after I deleted it from within GIT web interface :-)

@pkerling

This comment has been minimized.

Copy link
Member

commented Jan 5, 2019

git branch -d [branch name]

@pkerling

This comment has been minimized.

Copy link
Member

commented Jan 5, 2019

You can also just push it again and it should re-appear :-) You can then open a new PR from there.

@wildcat2020

This comment has been minimized.

Copy link
Contributor Author

commented Jan 5, 2019

Ok, I deleted the local branch and create a new one.
However, now the file created is the one before this pull request.
How can I instruct my branch to take into account this pull request so that I get the most recent file?

@wildcat2020 wildcat2020 restored the wildcat2020:feature_myvideos113 branch Jan 5, 2019
@wildcat2020

This comment has been minimized.

Copy link
Contributor Author

commented Jan 5, 2019

Ok, no worries, deleted the new one and restored the old one. Now I will try and create new pull request for the new change...

@wildcat2020

This comment has been minimized.

Copy link
Contributor Author

commented Jan 5, 2019

But just realized....by introducing aggregate function to tvshow_view we now need full group by clause to be built...

@wildcat2020

This comment has been minimized.

Copy link
Contributor Author

commented Jan 5, 2019

anyway, created PR: [database] Set tvshow_view return minimum path #15200

@pkerling

This comment has been minimized.

Copy link
Member

commented Jan 6, 2019

But just realized....by introducing aggregate function to tvshow_view we now need full group by clause to be built...

Yes that cannot be avoided unfortunately.

@wildcat2020 wildcat2020 referenced this pull request Jan 6, 2019
3 of 13 tasks complete
wildcat2020 added a commit to wildcat2020/xbmc that referenced this pull request Jan 6, 2019
… it only returns one row per tvshow with correct strPath and idParentPath.

Added subquery in FROM clause in place of tvshowlinkpath table to return only the maximum strPath.
wildcat2020 added a commit to wildcat2020/xbmc that referenced this pull request Jan 6, 2019
… it only returns one row per tvshow with correct strPath and idParentPath.

Added subquery in FROM clause in place of tvshowlinkpath table to return only the maximum strPath.
wildcat2020 added a commit to wildcat2020/xbmc that referenced this pull request Jan 6, 2019
… it only returns one row per tvshow with correct strPath and idParentPath.

Added subquery in FROM clause in place of tvshowlinkpath table to return only the maximum strPath.
wildcat2020 added a commit to wildcat2020/xbmc that referenced this pull request Jan 6, 2019
… it only returns one row per tvshow with correct strPath and idParentPath.

Added subquery in FROM clause in place of tvshowlinkpath table to return only the maximum strPath.
wildcat2020 added a commit to wildcat2020/xbmc that referenced this pull request Jan 6, 2019
… it only returns one row per tvshow with correct strPath and idParentPath.

Added subquery in FROM clause in place of tvshowlinkpath table to return only the maximum strPath.
wildcat2020 added a commit to wildcat2020/xbmc that referenced this pull request Jan 6, 2019
… it only returns one row per tvshow with correct strPath and idParentPath.

Added subquery in FROM clause in place of tvshowlinkpath table to return only the maximum strPath.
@andbo467

This comment has been minimized.

Copy link

commented Jan 7, 2019

Great! Now my MySQL (MariaDB 10) database works again :)

@pkerling

This comment has been minimized.

Copy link
Member

commented Jan 7, 2019

That's interesting since MariaDB does not activate ONLY_FULL_GROUP_BY by default, so you should not have had this problem in the first place.

@andbo467

This comment has been minimized.

Copy link

commented Jan 7, 2019

Ok. Maybe something else broke my MyVideos113 then. I'm just happy that 114 works :)

fritsch added a commit to fritsch/xbmc that referenced this pull request Jan 12, 2019
…ake sure it only returns one row per tvshow with correct strPath and idParentPath."

This reverts commit 0fbd8c5.
MartijnKaijser added a commit to MartijnKaijser/xbmc that referenced this pull request Jan 22, 2019
… it only returns one row per tvshow with correct strPath and idParentPath.

Added subquery in FROM clause in place of tvshowlinkpath table to return only the maximum strPath.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.