Audio often doesn't match when only using parts of files via secondsToProcess and startAtSecond #191
Extra info: I rewrote my code to use ffmpeg to cut the time range I'm interested in out of the video into separate files, and I fingerprint/query off those. It finds the matches perfectly using the same timestamps that I was passing in to secondsToProcess and startAtSecond, so there's definitely an issue there. This is working for now, but I'd really rather not have to create these temporary video files just to fingerprint.
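For anyone wanting to try the same workaround, here is a minimal sketch of cutting a segment with ffmpeg from C#. The exact command used above was not posted, so the class name `SegmentCutter`, the choice of flags (`-ss`, `-t`, `-vn`) and the file names are assumptions for illustration. It assumes `ffmpeg` is on the PATH.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Globalization;

// Hypothetical helper, not part of SoundFingerprinting.
public static class SegmentCutter
{
    // Builds the ffmpeg argument list for extracting one audio segment.
    public static IReadOnlyList<string> BuildArguments(
        string input, string output, double startAtSecond, double secondsToProcess)
    {
        // Invariant culture so the decimal separator is always '.'.
        string start = startAtSecond.ToString("0.###", CultureInfo.InvariantCulture);
        string length = secondsToProcess.ToString("0.###", CultureInfo.InvariantCulture);
        return new[]
        {
            "-y",          // overwrite the output file if it exists
            "-ss", start,  // seek to the start of the segment (input option)
            "-t", length,  // read only this many seconds from the input
            "-i", input,
            "-vn",         // drop the video stream, keep audio only
            output
        };
    }

    // Runs ffmpeg and throws if it exits with a non-zero code.
    public static void Cut(
        string input, string output, double startAtSecond, double secondsToProcess)
    {
        var startInfo = new ProcessStartInfo("ffmpeg") { UseShellExecute = false };
        foreach (string arg in BuildArguments(input, output, startAtSecond, secondsToProcess))
        {
            startInfo.ArgumentList.Add(arg);
        }

        using var process = Process.Start(startInfo)
            ?? throw new InvalidOperationException("Could not start ffmpeg.");
        process.WaitForExit();
        if (process.ExitCode != 0)
        {
            throw new InvalidOperationException($"ffmpeg exited with code {process.ExitCode}.");
        }
    }
}
```

The temporary file written by `Cut` can then be passed to `.From(file)` without secondsToProcess or startAtSecond, which is the path reported to match correctly.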
I tried to reproduce it with a sample but I couldn't. The following example fingerprints the entire file, then queries only 15 seconds starting at the 10th second.

    public static class Issue191
    {
        public static async Task<AVQueryResult> Reproduce()
        {
            var modelService = new InMemoryModelService();
            var mediaService = new FFmpegAudioService();
            string file = "path to a 30 seconds file";

            // Fingerprint the entire file and store it as track "1".
            var hashes = await FingerprintCommandBuilder.Instance
                .BuildFingerprintCommand()
                .From(file)
                .UsingServices(mediaService)
                .Hash();
            var track = new TrackInfo("1", string.Empty, string.Empty);
            modelService.Insert(track, hashes);

            // Query with only 15 seconds of the same file, starting at second 10.
            var results = await QueryCommandBuilder.Instance
                .BuildQueryCommand()
                .From(file, secondsToProcess: 15, startAtSecond: 10)
                .UsingServices(modelService, mediaService)
                .Query();
            return results;
        }
    }

I get the result as expected with the following properties:

Same holds if I fingerprint just a portion of the file:

    var hashes = await FingerprintCommandBuilder.Instance
        .BuildFingerprintCommand()
        .From(file, secondsToProcess: 15, startAtSecond: 10)
        .UsingServices(mediaService)
        .Hash();

In this case, I get full matches, with
Can you provide a sample with expected and actual results?
Mmm... I'd have to hunt for some that wouldn't be a copyright infringement, since the ones I'm using are TV shows ripped from DVD. It's not every file either. Some match fine, but some are "problem files" which don't match at all. I have also seen some files which match only sometimes when using startAtSecond, but usually if a file doesn't match, it never matches. Most of my library uses MKV for the container. Audio codecs vary though, so I'll see if I can find a commonality in codec type. It may have something to do with the encoding: normally I get back 400+ items in the results when querying the whole file against the whole series (most matches are ~1 second though), but on the files where it won't find matches, it returns no results at all, not even short ones. Those same files work fine when passing the full file in. It only returns no results when using the startAtSecond option.
I see that there are logging outputs in your source, but I can't see a way to access them. How do I turn on logging?
Both command builders accept a logger factory in their constructor:

    var queryCommandBuilder = new QueryCommandBuilder(loggerFactory);

Once provided, it will start logging to the output configured for the project.
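As a rough sketch of where such a logger factory could come from: the snippet below creates one with `LoggerFactory.Create` from Microsoft.Extensions.Logging and passes it to the `QueryCommandBuilder` constructor mentioned above. The console provider (package Microsoft.Extensions.Logging.Console), the Debug level, the `SoundFingerprinting.Builder` namespace and the `LoggingSetup` class name are assumptions, not something confirmed in this thread.

```csharp
using Microsoft.Extensions.Logging;
using SoundFingerprinting.Builder; // assumed namespace of QueryCommandBuilder

// Hypothetical helper for illustration only.
public static class LoggingSetup
{
    public static QueryCommandBuilder CreateQueryBuilder()
    {
        // Log to the console at Debug level and above.
        ILoggerFactory loggerFactory = LoggerFactory.Create(builder =>
            builder.AddConsole().SetMinimumLevel(LogLevel.Debug));

        // Constructor taken from the reply above.
        return new QueryCommandBuilder(loggerFactory);
    }
}
```

The returned builder would then be used in place of `QueryCommandBuilder.Instance`, for example `LoggingSetup.CreateQueryBuilder().BuildQueryCommand()`.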
A similar bug has been described in #207 (not exactly the same scenario but a particular use case when both Audio | Video fingerprints are generated during query).
Describe the bug
While using:
.From("/path/to/file", 123.4, 123.4, MediaType.Audio)
There is a much lower match rate than just using:
.From("/path/to/file", MediaType.Audio)
This affects both BuildQueryCommand and BuildFingerprintCommand. Using partial times with just one or the other doesn't have as much of an effect, but using them with both at the same time reduces the matches to abysmal levels.
I've verified that the matches when using the non-partial version are within the segments that I'm restricting to (both in the track side and query side), but they just don't match when restricted to partial durations.
I'm using the InMemoryStorage, since I don't need to keep the fingerprints around for future use.
I'm not sure if it makes a difference, but my use case is picking out matching duplicates from longer streams. In this case that means finding intros and credits in TV shows: I'm trying to fingerprint the last few minutes of every show and then query each show against those fingerprints to find the timing of the credits, for instance.
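To make the "last few minutes" window concrete, here is a small sketch of how the two values could be derived from a file's duration. The helper `TailWindow.ForLast` is hypothetical and does not come from the library; how the duration is obtained is left out.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class TailWindow
{
    // Returns the window covering the last `tailSeconds` of a file that is
    // `durationSeconds` long. Files shorter than the tail are used in full.
    public static (double StartAtSecond, double SecondsToProcess) ForLast(
        double durationSeconds, double tailSeconds)
    {
        if (durationSeconds < 0 || tailSeconds < 0)
        {
            throw new ArgumentOutOfRangeException(
                nameof(tailSeconds), "Durations must be non-negative.");
        }

        double length = Math.Min(tailSeconds, durationSeconds);
        return (durationSeconds - length, length);
    }
}
```

For a 22-minute episode and a 3-minute tail, `TailWindow.ForLast(1320, 180)` gives a start of 1140 seconds and a length of 180 seconds, which would then be passed as `startAtSecond` and `secondsToProcess` to `.From(...)`.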
I wanted to take a look at solving this myself, but it seems that the Emy package isn't open source, and the ffmpeg is part of that, so I can't really trace it through.