VideoInfoScanner slightly optimized #1191

Closed
wants to merge 1 commit into
from

Projects

None yet

4 participants

@dez-dk

Using assumptions about the user-specified settings the VideoInfoScanner has been optimized in an attempt to reduce roundtrips when scanning a movie collection which is mounted from a network location.

The DoScan() method has been refactored to use a CFileItemPtr instead of just a string. This let's us grab a lot more fileinfo at once instead of doing individual stat() calls on every subdir and file.

If the user has chosen "Movies are in separate folders" we assume:

  • They do not want to scan the root dir (eg. "/movies/") itself as a movie but they want to scan subdirs (eg. "/movies/Movie Title (year)"). This saves a Stack() and GetPathHash() call on a potentially large dir ("/movies/").

If the user has also selected not to recurse the dir we assume:

  • That it's okay to fasthash subdirs (eg. "/movies/Movie Title (year)") to determine if we've already scanned the movie. We use the date from the FileItem we've already grabbed to do fast hashing.

EDIT: added some examples.

dez-dk VideoInfoScanner slightly optimized
Using assumptions about the user-specified settings the VideoInfoScanner
has been optimized in an attempt to reduce roundtrips when scanning a
movie collection which is mounted from a network location.
8a117f9
@ghost

Haven't looked at the code but no stack will break eg two cd movies in cd x subfolders

@dez-dk

I guess you're right about the CD-subdirs.

I'm not in the know here but why are CD-subdirs scanned? I would think the movie is still the same and that it has the same information even if CD2 has changed.

EDIT:
I should probably point out that the focus of this PR is the RESCANNING of movies. Movies that have not yet been scanned are still scanned in the same way. CD-subdirs are still scanned initially. However, they are not RESCANNED if the user has enabled the "Movies are in separate folders" and disabled recursive scanning.

@jmarshallnz
Team Kodi member

You can't assume that the root dir contains no movies even if "movies in folders named after the movie" is enabled. Simple reason: VIDEO_TS folders are stacked down to the parent folder item, which is then converted to a file.

Further, this would make essentially bugger all difference to the work that needs to be done anyway.

@dez-dk

Can you point me towards the work that needs to be done? Any plans?

@ghost

work to be done as in the work the scanner has to perform.

@dez-dk

I disagree. With these changes I can rescan about 18 movies/second from a remote location. Before these changes I couldn't even rescan 0.5 movies/second (I stopped the scan before it finished because it simple took too long, so that's perhaps even a bit optimistic.)

The thing that makes all the difference is not the stack() call that is saved but the fact that fast hashing is used more extensively. I will revert the stack() change because it obviously makes no sense because of the DVD stacking you mentioned but I urge you to reconsider the rest of the PR.

EDIT: Please note these estimates are based on the rescanning of an entire movie directory and hence they include time spent rescanning movies that are not scannable. In a given second XBMC is rescanning 160 directories/second.

@ghost

sure, we haven't commented on the rest :)

@jmarshallnz
Team Kodi member

Do you have rarred files?

@dez-dk

No, the files are not rarred.

@jmarshallnz
Team Kodi member

So why is it so slow? Retrieving directories should be pretty quick - the music scan for example always does this, and I don't think we've had any complaints about that's rescan speed.

@dez-dk

I am using an HTTP source and stat() doesn't retrieve a modification time for HTTP directories. Nginx doesn't send a Last-Modified header because the "directory" is just a generated HTML page (IIRC).

EDIT: Which means XBMC has to do a full path hash on each directory.
(I need to stop abusing the edit feature)

@jmarshallnz
Team Kodi member

So the root cause is that the fasthash does not apply to your movie folders because Stat() is not implemented?

Do each of your movie folders doesn't have any subdirs in them? i.e. would they normally be fasthashed if they were on any filesystem other than http?

@dez-dk

It was what prompted me to take a look at the VideoInfoScanner.

I have only tested with single file directories. They also have subdirs that match exclude regexps (eg. a subs-dir) and as far as I can tell that means the dirs are fully path hashed.

I also have multi-CD movies but I haven't yet tested those. I would like to fasthash multi-CD movies as well. As I have previously stated I don't think the directory will suddenly contain a different movie because one of the CD-subdirs has changed.

@jmarshallnz
Team Kodi member

Multi-cd movies will be stacked down to a file, so they're not a problem. See the comment in CanFastHash for the other one - I wouldn't see a problem with excluding folders there. A simple modification there would fix your issue?

@dez-dk

Right you are :)

I must have stared myself blind at the problem when I was stepping through the code. Do you have any suggestions about the missing stat() feature for HTTP?

@jmarshallnz
Team Kodi member

Well, the first step is determining how it sets m_dateTime when retrieving a listing to begin with. If you can get that information you'll be fine. My guess is you can't :)

@dez-dk

In a previous PR I made I added some regexps to HTTPDirectory that made it possible to extract the date from a directory listing.

I believe I would have to retrieve the parent directory and use it to get the date information of the dir I'm looking for ... Seems like a lot of work though.

@Karlson2k
Team Kodi member

Personally I'm against this commits.
I don't like the idea of skipping everything in root folder - something can be temporary placed here.
And I don't like that idea of assumption that everyone is using name template like 'Title (Year)'. I'm using 'Original Title (edition) {Year}= Localized Title [Genres]' template. And XBMC works fine with my template after tuning AdvancedSettings.xml.

May be it's worth to think about different tool for sharing? NFS, SMB works pretty good with last modified time. Or in case that http is required, try other http server.
In any case: instead of implementing not-always-working workarounds, try to use right tools that are intended for your needs.

@dez-dk

@Karlson2k
The PR has already been debunked.
What difference does it make whether something is "temporarily placed" in the root? Care to clarify?
This PR changes nothing about the naming scheme mechanisms. The naming scheme were just illustrative examples. I do not use this naming scheme myself.
I need to use a protocol that works across WAN. NFS and SMB need to be tunneled through a VPN for that to work IIRC.

@classicspam

Did you try FTP?

@dez-dk

@classicspam
Yes, but it timed out a lot when listing large directories and sometimes it would fail to stream videos the first couple of tries.
HTTP was a much smoother experience so that was why I made the switch.

@jmarshallnz jmarshallnz closed this Aug 5, 2012
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment