VideoDatabase: add "dateAdded" field to file and path table #672

Merged
merged 5 commits into from Mar 29, 2012

Projects

None yet

9 participants

@Montellese
Member

The first commit of this PR adds an additional column named "dateAdded" to both the "file" and the "path" tables. The value of this field can then be used in e.g. recently added methods to get a real sorting by date added and not by some ID. The problem with the current approach is that on a manual refresh of a video, the old data set in the "file" table is removed and a new one added. Therefore the refreshed video appears at the top of the recently added list even though it has only been refreshed and not added. Adding the "dateAdded" column to the "file" table makes it possible to use this data for movies, episodes and musicvideos. To also support tvshows I also had to add the "dateAdded" column to the "path" table.

The second commit needs some discussion. What it basically does is not using the time the video/item is added to XBMC's database but instead it uses the time the video file has last been modified in the filesystem (for tvshows it uses the creation time because the modification time of a directory changes everytime one of the containing files/sub-directories changes). This is useful when building a database from scratch (after messing up an existing one or whatever) because it will keep the same order when sorted by dateAdded. I would like to get feedback on whether this should be handled this way all the time, whether it should be optional (default on or off?) or whether this doesn't make any sense at all. The idea for this came from a user (see http://forum.xbmc.org/showpost.php?p=1011112&postcount=1662).

The other commits make use of the newly added functionality by exposing it to JSON-RPC, using it in SSortFileItem::ByDateAdded to sort file items and exposing it to smartplaylists. Smartplaylist can benefit in two ways from this: The functionality of sorting by "Date Added" can be enhanced by using this new field instead of the idFile field and it can be used as a filter/rule as well.

I have tested this on win32 only but it should work on all platforms. Would be nice if someone could give it a try with MySQL as well because I can only test SQLite.

@CrystalP
Collaborator

This PR sounds like a good idea as Recently Added, Recent Movies/TV do unexpected things at the moment. Is dateAdded exported in XBMC generated .nfo files? And imported? Sorry really don't have much XBMC time lately and didn't read your code.

For the second commit, I personally prefer the first scan date of the file, as Recently Added won't work as expected for old files otherwise. BUT, using the file date for the first scan after scrapping the db file would make some sense. It's better than scan order I suppose.

@Montellese
Member

Good thing you ask about whether dateAdded is exported/imported to/from NFOs. I actually wanted to discuss this topic because I wasn't 100% sure if it should but I forgot to mention it. In the current code it is not exported/imported but IMO it would make sense to do so.

Concerning the second commit: Both approaches make sense depending on the use case. I can always make it optional (although this adds another GUI Setting). That's why I wanted to discuss this.

@jmarshallnz
Member

Perhaps we should add more than one date? e.g. create date (of file/directory) and added date (date added to db?)

That way any setting can be done without the user rescanning (which would make at least one of the dates useless)

@Montellese
Member

If we added both dates we could let the user decide which one to use for recently added and sorting by date added. Or we could leave the sorting by date added as it is right now and provide a new sorting method "by creation date" or something like that.

Another point is what we should do about files/paths that already are in the database and therefore don't have that/those date(s) set. Going through all the files/folders and CFile::Stat()ing them might take quite a while depending on the size of the user's library.

@Montellese
Member

I updated and rebased the commits. I removed the XBDateTime include from VideoDatabase.h and I added the code for CVideoInfoTag to serialize/deserialize it's m_dateAdded member to XML and CArchive.

So what's left to decide is what datetime to use when adding a new path/file to the database.

@AWilco
AWilco commented Feb 17, 2012

A little think piece.

If we use file modified time, it tells us when the file arrived on our system, or was last modified by the user (maybe updating metadata). Apart from metadata updating Content files are generally not modified.

I believe the "standard" use case is a user rips/copies/downloads a file into a content directory. A periodic XBMC scan, or a scan on next startup then picks it up. The scan time and file modified time are similar, and assuming a small number of files added in the time since the last scan, the order is roughly correct either way (but technically more correct using file modified time).

There is the case of the file is modified after we've added it to XBMC. Easy, don't modify the timestamp once it has been set.

I'm not sure I understand the "tvshows" directory bit, for multi-file items we should choose the most recent modified time.

The other major case is on a large scan, either on first run or after the user has dropped in a lot of files in one go. Current method, and by using scan time, results in a random list. (This is helped somewhat by the fact that scan are often sorted by name so later episdes come last). Using modified time in the case of a large transfer of files, or when the user has been through and modified file metadata or some such, is equally random). However in the more normal case that content has been added over time the file modified date gives the correct order.

Basically in a large scan the order is undefined for the simple reason you cannot define a sensible order. Using file modified date is I think always a preferential value, so I can't see any reason to offer an alternative.

Summary: Use Filed Modified time.

@jmarshallnz
Member

I see modified file time as useful in that XBMC could use it to know when to rescan an item for content change. eg if we read internal tags in mkv/mp4 for example. Alternatively, we could use another hash for this that's independent of the file date and instead dependent on file content, or we could use both (my preference is the latter). Thus, IMO having info on the file modification date in the file table makes sense to me.

The same applies to paths - it could be used as a hash of sorts to know when the content of the path has changed (in addition to some slower hash, such as the number of files/folders in the folder along with their modified times, size etc.)

Summary: Agreed - use file modified time for files at least. Tempted to use it for folders as well due to the above.

Whether or not we need a date added to the library, I think if we do that might be better in the movie/episode/show/musicvideo tables rather than the file table. In my opinion this isn't really needed at this point, and would be somewhat useless until we have proper updating of the tables on rescan.

Note that the file table isn't cleaned when a movie is removed due to content set to none then back to movies (i.e. refresh), so the original date set should remain.

@Montellese
Member

The reason I didn't put the dateAdded field into the movie/episode/show/musicvideo table is because it is easier to have it in one table than in four different ones and everytime we refresh a movie the old entry in one of those tables gets removed and a new entry added (which is really stupid IMO but legacy code nobody has updated so far). So that would require more extra logic to make sure the dateAdded value isn't changed when an entry is removed and a new one updated. This obviously isn't a big deal if we use the modification time of the media file but I didn't have that in mind at first and added that logic later on as an optional extension which needs to be discussed (which we are doing).

I don't agree with using modification time as dateAdded for folders because modification time of a folder changes everytime something in the folder itself changes (which is why I used creation time for this case). So in that case it's not really dateAdded anymore if the entry in the path table gets updated. I agree that it makes sense to store the modification time in the database for some other purpose (like detecting whether to rescan or not) but it shouldn't be used for dateAdded. If there is a new episode for The Simpsons I don't want adding it to the database (which involves rescanning the tvshow) change the dateAdded value for the whole tv show because (theoretically) I added that tv show several years ago to my library.

@jmarshallnz
Member

I suspect that users won't care that the date added on the tvshow folder is updated when a new episode is added. Indeed, I suspect many would actually want this (see the "on deck" service addon). IIRC recently added albums basically works off the songs, so this would be somewhat consistent if that were to be the defined action. I dunno - I don't really see the point of recently added tvshows particularly - shows with new episodes are more interesting. I do agree that technically the create date of the folder is likely to correspond better with date added to XBMC. Perhaps add both?

Agreed regarding dateAdded to the other tables - let's try not to add anything new there until they're dead :)

@Montellese
Member

I agree that recently added tvshows doesn't make any sense but that's not what I want to achieve. I just want a consistent way to determine when an item (video or tvshow) has been added to the database. I don't want to update that date when a new episode has been added because the field is called "dateAdded" and not "dateUpdated" or something like that. I'm fine with adding another field for something like that but I don't want to introcude a "dateAdded" field and then immediately abuse it for something else (e.g. dateUpdated).

So should I add another field "dateModified" or something like that for paths which is always updated to the modification time of the path when a path is rescanned?

@AWilco
AWilco commented Feb 20, 2012

I would say don't include a dateModified field. It's always available as a file property, not the kind of data we need to cache, and if we did cache it it may well not update fast enough for any application you'd want to use it for.

@Montellese
Member

Always available as a file property may not be true in all cases and even if it would doing a stat over a network takes quite a bit more time than looking into a database. But if there is no apparent use case that would/could make use of the modification time of video items there's no need to throw it into the database besides dateAdded.

@Montellese
Member

OK I've added a new commit which introduces a GUI setting in Settings -> Video -> Library which let's you choose whether or not to use the file's creation/modification time for dateAdded (disabled by default to more or less keep the current behaviour).

There are three things I'd like to discuss:

  1. Do you agree to make this optional and that it is disabled by default? I don't really care about the default value. Question is whether to try to stay as close to the current behaviour as possible (and taking the risk that a lot of users will overlook this new feature) or if we should force this feature on everyone by default.
  2. Can anyone come up with a better string/text for the setting?
  3. Should this setting also apply to sorting implementations like SSortItems and the GetRecentlyAddedFoo() calls to keep the current behaviour available as well?
@AWilco
AWilco commented Feb 26, 2012

I have to say, does this need to be in the GUI? Its a very technical setting that I don't think anyone outside of this discussion has even though about. I think it should be an advanced setting, if people come along and don't like the current behaviour they can change it, but otherwise the GUI will become full of settings that nobody understands, which will put people off.

Secondly, I think the default behaviour should be to use the file modified time. This would in almost all cases give better behaviour. I should think almost nobody notices how the recently added show list is selected, and anyway this will only affect the edge case where you add lot of shows at once. Therefore use the better method where we can.

SSortItems and GetRecentlyAdded should use the dateAdded field, and then their behaviour is dependent on how that is populated. Otherwise there isn't much point in the field surely?

@jmarshallnz
Member

I'd be tempted to just stick with the modification/create date and be done with it until someone complains.

@Montellese
Member

Well I can just keep the last commit around as a patch in case someone starts whining ;-) Most people probably won't even notice.

@master-lincoln

Please consider merging this! Reorganizing my files always screws up the recently added and all similar filters...
Modification date seems to be perfect for this.
IMHO this should also be added to the nfo...

@live4ever

Just some thoughts if this PR can factor in what happens when someone updates the quality of the video files within their library. I tend to upgrade the videos in my library as HD (Blu-rays) sources become available.

It's not a problem if I rename the files exactly as I have currently - but I like to keep the source in the filename and sometimes the container (extension) will change as well ie:

show.s01e01.dvd.mkv becomes show.s01e01.bluray.mkv
or
movie.(2001).dvd.avi becomes movie.(2001).bluray.iso

The TV library is especially dynamic where the captured shows (ie. 1080i .ts) are replaced with the BD (1080p) often before the next season starts (less than a year). So currently these get re-added as unwatched and the file dates will be newer as well.

@Montellese
Member

@live4ever: Your problem would require quite some extra code because it has to recognize that an existing file is missing but that a new and similar file is present. Currently detecting missing files and new files is completely seperated in XBMC so it's not an easy implementation. I suggest you create a new feature request ticket on Trac.

@jmarshallnz
Member

Just an additional thought while I remember. The current implementation changes the meaning of "recently added" from "recently appended to the movie table" (if we ignore the re-scan issue, "recently added to library") to "XBMC has seen the video and stored something for it", eg streamdetails. Except for tvshows, ofcourse, which stay as recently added to library.

I don't have a problem with this - in the future I'd like everything added to the library regardless - just noting it here for others reviewing.

@Montellese
Member

So is there anything that needs to be done for this to be ready to merge? I just rebased the whole PR.

@jmarshallnz
Member

Otherwise looks fine (complete with db version bump, so I can play more rebase games :p )

@Montellese
Member

OK I squashed and rebased the PR once again and it should be ready now. Can this goe into the current merge window or should I wait for the next one (May)?

@jmarshallnz
Member

Once we sort that last one, good to go for April.

@Montellese Montellese merged commit ea3835e into xbmc:master Mar 29, 2012
@jetskijoe

if you now add modification date to the database could that value be used for rescanning nfos so if the modified date is newer then the nfo needs rescanned?

@Montellese
Member

The problem with that is that reading the modification date takes time, especially if the media is not on a local harddisc so that's why we don't do that during scans as it would slow it down a lot.

@Monkeysweat

I'm running openelec on the Raspberry Pi which is built off the nightlies, it is doing the same thing my PC did when it was on the April release, some movies and tv shows stayed as the most recently added videos no matter how many videos I added,, I took my PC back to the latest stable release and it fixed itself, however the Pi has not current stable release is stuck with this issue,, is this going to be fixed or has it been reported or has it been fixed and the build of my Pi needs to be redone?

@Montellese
Member

@Monkeysweat: There was a report a while back about files with an "invalid" date far in the future because of which I added some validation checks which ignores dates that are newer than the date of scanning. I'm not sure if you experience a similar problem. It would be very helpful if you could check the mtime and ctime of those files that stay at the top of the recently added videos list and see if there's anything odd there.

@Monkeysweat

it seems to have been the issue and it does appear to be fixed, no issues since I updated to the June release,, been running it for at least 3 weeks now.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment