Improved scene detection #2710

Merged
merged 1 commit into from
Jul 14, 2021

Conversation

master-of-zen
Collaborator

Goals and motivation

The current fast scene detection in rav1e is relatively slow compared to other scene detection methods, and it is prone to false positives and to missing scene changes where they actually occur.

This PR reworks the fast scene detection algorithm, making it faster and more accurate.

Achieved goals are:

  • Faster decision making (both fewer and more efficient computations)
  • More accurate scene detection, by adjusting the threshold based on previous frames
  • Frame downscaling for faster decisions
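
As a rough illustration of the frame downscaling idea, here is a minimal sketch that halves an 8-bit luma plane with a 2x2 box filter. The function name `downscale_by_2`, the flat `&[u8]` plane layout, and the choice of a box filter are all assumptions made for this example; the PR's actual downscaling code may differ.

```rust
/// Hypothetical helper (not taken from rav1e): halve the resolution of an
/// 8-bit luma plane by averaging each 2x2 block of pixels.
/// `src` is assumed to be row-major with no padding, and `width` and
/// `height` are assumed to be even.
fn downscale_by_2(src: &[u8], width: usize, height: usize) -> Vec<u8> {
    assert_eq!(src.len(), width * height);
    assert!(width % 2 == 0 && height % 2 == 0);

    let (dst_w, dst_h) = (width / 2, height / 2);
    let mut dst = Vec::with_capacity(dst_w * dst_h);

    for y in 0..dst_h {
        // The two source rows that feed one destination row.
        let top = &src[(2 * y) * width..][..width];
        let bottom = &src[(2 * y + 1) * width..][..width];
        for x in 0..dst_w {
            let sum = u32::from(top[2 * x])
                + u32::from(top[2 * x + 1])
                + u32::from(bottom[2 * x])
                + u32::from(bottom[2 * x + 1]);
            // Add 2 before dividing by 4 so the average rounds to nearest.
            dst.push(((sum + 2) / 4) as u8);
        }
    }
    dst
}
```

Comparing frames at a quarter of the pixel count makes every per-frame cost computation proportionally cheaper, which is the trade-off the last bullet refers to.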

Example of the adaptive threshold not cutting a high-motion segment that the default static-threshold method would cut:

?  [SC-Detect] Frame 519: T=25.0 P=9.5 No cut
?  [SC-Detect] Frame 520: T=25.0 P=10.2 No cut
?  [SC-Detect] Frame 521: T=25.0 P=17.2 No cut

?  [SC-Detect] P: 29.0 [0.0, 1.8, 9.5, 10.2, 17.2] Cut: false
?  [SC-Detect] Frame 522: T=25.0 P=29.0 No cut

?  [SC-Detect] P: 33.6 [1.8, 9.5, 10.2, 17.2, 29.0] Cut: false
?  [SC-Detect] Frame 523: T=25.0 P=33.6 No cut

?  [SC-Detect] P: 34.0 [9.5, 10.2, 17.2, 29.0, 33.6] Cut: false
?  [SC-Detect] Frame 524: T=25.0 P=34.0 No cut

?  [SC-Detect] P: 26.4 [10.2, 17.2, 29.0, 33.6, 34.0] Cut: false
?  [SC-Detect] Frame 525: T=25.0 P=26.4 No cut

?  [SC-Detect] Frame 526: T=25.0 P=22.8 No cut
?  [SC-Detect] Frame 527: T=25.0 P=15.2 No cut
?  [SC-Detect] Frame 528: T=25.0 P=0.4 No cut
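
To make the log above easier to follow, here is a minimal sketch of one possible adaptive rule. It is not the rule implemented in this PR: the `AdaptiveDetector` type, the five-score history, and the `ratio` parameter are assumptions chosen for illustration. A frame is only reported as a cut when its score passes the static threshold and also clearly exceeds the recent scores.

```rust
use std::collections::VecDeque;

/// Hypothetical adaptive detector, for illustration only.
struct AdaptiveDetector {
    /// Static threshold (the `T` value in the log).
    static_threshold: f64,
    /// Scores of the most recent frames (the bracketed list in the log).
    history: VecDeque<f64>,
    /// How many previous scores to remember.
    history_len: usize,
    /// Assumed factor by which a score must exceed the recent maximum.
    ratio: f64,
}

impl AdaptiveDetector {
    fn new(static_threshold: f64, history_len: usize, ratio: f64) -> Self {
        Self {
            static_threshold,
            history: VecDeque::with_capacity(history_len + 1),
            history_len,
            ratio,
        }
    }

    /// Decide whether `score` is a cut, then record it in the history.
    fn push(&mut self, score: f64) -> bool {
        let recent_max = self.history.iter().copied().fold(0.0_f64, f64::max);
        // A steadily rising score (high motion) keeps raising the bar,
        // so only an abrupt jump counts as a cut.
        let cut =
            score >= self.static_threshold && score >= self.ratio * recent_max;

        self.history.push_back(score);
        if self.history.len() > self.history_len {
            self.history.pop_front();
        }
        cut
    }
}
```

With `static_threshold = 25.0`, `history_len = 5`, and `ratio = 2.0`, feeding the scores from frames 519 to 528 produces no cut, whereas a purely static threshold of 25.0 would have cut at frame 522 (P=29.0).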

Speed comparison

2160p: [benchmark images not preserved: new, old]

720p: [benchmark images not preserved: new, old]

Collaborator

@vibhoothi vibhoothi left a comment

Hi,

Thanks for the patch. 57 commits are not easy to review; please squash the relevant commits so reviewers can go through them more easily.

Collaborator

@lu-zero lu-zero left a comment

Please squash and rebase on top of the current tree.

@shssoichiro
Collaborator

shssoichiro commented Mar 31, 2021

My primary concern is that this seems to remove the scene flash detection. This was used for both fast and normal (cost-based) detection and is based on what x264 does to try and avoid an extra keyframe for a very short (5-frame-or-less) scene, preferring to just put one keyframe after the flash. In the past, it also helped to alleviate some false positives with pans being detected as scenecuts using the old fast method--maybe that issue has been resolved with the new fast algorithm?
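
The flash handling described above can be sketched as a post-filter over candidate cut positions. This is a simplified illustration and is neither rav1e's nor x264's actual code; the function name `suppress_flashes` and the representation of candidates as a sorted list of frame numbers are assumptions.

```rust
/// Hypothetical post-filter, for illustration only.
/// `candidates` holds frame numbers flagged as possible scene cuts and is
/// assumed to be sorted in strictly increasing order.
/// A candidate is dropped when another candidate follows within
/// `max_flash_len` frames, so that a short "flash" scene gets a single
/// keyframe after it instead of one before and one after.
fn suppress_flashes(candidates: &[u64], max_flash_len: u64) -> Vec<u64> {
    candidates
        .iter()
        .enumerate()
        .filter(|&(i, &frame)| match candidates.get(i + 1) {
            // Keep this cut only if the next one is far enough away.
            Some(&next) => next - frame > max_flash_len,
            // The last candidate has nothing after it, so it is kept.
            None => true,
        })
        .map(|(_, &frame)| frame)
        .collect()
}
```

For example, with candidates at frames 100, 103, and 250 and a maximum flash length of 5, the cut at frame 100 is dropped and the keyframes land on frames 103 and 250.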

@master-of-zen
Collaborator Author

@shssoichiro
I overlooked flash detection in the new algorithm; improvements need to be made there.

At the moment, the new algorithm should handle pans just fine, as it uses a deque of scores to adjust the threshold, which should also handle high motion, fade-ins, and fade-outs.

Making the score deque bidirectional (comparing the score for the current frame to the scores before and after it) should fix the flash detection issue, and should give a free speed boost at all speed levels, as it reduces the amount of computation.
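
A minimal sketch of what a bidirectional check could look like, assuming the scores of neighbouring frames are available in a slice. The `Decision` enum, the `classify` function, and the specific rules are hypothetical and are not the logic that eventually landed in the PR.

```rust
/// Outcome of the hypothetical bidirectional check.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Decision {
    /// The score is below the threshold.
    NoCut,
    /// Isolated spike: no strong change nearby on either side.
    Cut,
    /// Strong change behind but none ahead: place the keyframe here,
    /// after the flash.
    CutAfterFlash,
    /// Another strong change follows soon: leave the keyframe to it.
    Deferred,
}

/// Classify frame `idx` by looking `window` frames backward and forward.
/// Panics if `idx` is out of bounds for `scores`.
fn classify(scores: &[f64], idx: usize, window: usize, threshold: f64) -> Decision {
    if scores[idx] < threshold {
        return Decision::NoCut;
    }
    let start = idx.saturating_sub(window);
    let end = (idx + 1 + window).min(scores.len());

    let back_over = scores[start..idx].iter().any(|&s| s >= threshold);
    let forward_over = scores[idx + 1..end].iter().any(|&s| s >= threshold);

    match (back_over, forward_over) {
        (_, true) => Decision::Deferred,
        (true, false) => Decision::CutAfterFlash,
        (false, false) => Decision::Cut,
    }
}
```

This sketch only covers the flash case. A sustained high-motion run would also end in `CutAfterFlash` here, so in practice it would have to be combined with an adaptive threshold.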

@lu-zero
Collaborator

lu-zero commented May 3, 2021

Can you please rebase this? :)

@master-of-zen
Collaborator Author

It's wip) it will crash)

@master-of-zen master-of-zen changed the title Improved fast scene detection Improved scene detection May 14, 2021
@master-of-zen master-of-zen force-pushed the scene-detection branch 3 times, most recently from 4d1b7de to fbff89b Compare May 25, 2021 07:54
@master-of-zen master-of-zen force-pushed the scene-detection branch 2 times, most recently from 809fea8 to 3984fa8 Compare May 26, 2021 15:16
Collaborator

@vibhoothi vibhoothi left a comment

I did not check the code in depth; I was looking at the general implementation. I have some suggestions before landing:

  • The documentation for various things is good, but I felt the line breaks are not consistent: sometimes they happen after 20/30 words, sometimes after 40/60. It would be nice to be more consistent.
  • The commit log needs a cleanup: split the commits into two or three and give a gist of the changes in the commit messages. See the latest PR from @shssoichiro, where he gave a good explanation of things in either the commit log or the PR.

Apart from that, it is generally in good shape. Thanks for the work.

@lu-zero it would be nice if you could take a close look at the Rust part of the APIs :)

)
}
// Initially fill score deque with forward frames
// ititiallization is different depending on frame set length
Collaborator

Suggested change
// ititiallization is different depending on frame set length
// Initialization is different depending on frame set length

Need to fix the typo. Maybe a rewording is also a good idea, like merging both sentences?
"Initially fill score deque with forward frames based on frame_set length"

Collaborator Author

Fixed typos

debug!("[SC-score-deque]{:.0?}", self.score_deque);
self.score_deque.clear();
} else {
// Keep score deque 5 + lookahead_size frames
Collaborator

I wonder how we arrived at 5 + lookahead_size; it would be nice to explain that in the commit message or here, whichever is better.

Looking further into the code, I can see the reasoning behind the 5 is mentioned there; it would be better to mention it here rather than later.

Collaborator

@shssoichiro shssoichiro May 26, 2021

IIRC 5 was chosen because it's the size of a frame pyramid. I think at one point there was at least one comment mentioning that, although I haven't touched this code in a while so my memory is fuzzy. +1 for making sure we still have a comment mentioning that.

Collaborator Author

Improved comment
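
For readers following this thread, a minimal sketch of keeping a score deque bounded at 5 + lookahead_size entries. The `VecDeque<f64>` type, the `BASE_HISTORY` constant, and the `push_score` helper are assumptions for illustration; the PR's actual container and field names may differ.

```rust
use std::collections::VecDeque;

/// Assumed base history length: the 5 discussed in this thread, which per
/// the comment above is believed to be the size of a frame pyramid.
const BASE_HISTORY: usize = 5;

/// Hypothetical helper: record the newest score at the front and drop the
/// oldest scores from the back so that at most
/// `BASE_HISTORY + lookahead_size` entries remain.
fn push_score(deque: &mut VecDeque<f64>, score: f64, lookahead_size: usize) {
    deque.push_front(score);
    deque.truncate(BASE_HISTORY + lookahead_size);
}
```

Naming the constant and documenting it at the point of use is one way to address the review request that the origin of the 5 be explained where it first appears.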

has_scenecut: delta >= threshold,
}
) -> ScenecutData {
let frame2_ref2 = Arc::clone(&frame2);
Collaborator

Is there a frame2_ref1 or a frame2_ref?
It may be slightly confusing to have a 2 at the end if there is no frame2_ref or frame2_ref1. We could simply keep ref itself.
IIRC it was there earlier; maybe @lu-zero could say whether it is a good idea to keep frame2_ref or not.

Collaborator Author

I don't know why it's done the way it is, I just refactored the code)

@master-of-zen master-of-zen force-pushed the scene-detection branch 2 times, most recently from aa10a8c to 3097ce2 Compare May 27, 2021 20:03
@coveralls
Collaborator

coveralls commented May 27, 2021

Coverage Status

Coverage decreased (-0.4%) to 83.521% when pulling ecfd502 on master-of-zen:scene-detection into bdee3b9 on xiph:master.

Collaborator

@vibhoothi vibhoothi left a comment

To add to the PR from the #daala IRC discussion:
On checking the BD-Rate [1], there seems to be a regression of 8-14% in BD-Rate at speed 10, and also a one-frame displacement in the new scene-change mechanism [2]. At the lower speed levels tested, the BD-Rate remains neutral, while encoding/decoding time generally improves.

The subsequent frames in the decoded output are offset by +1 throughout; this happens when there is a scene change.
Things to try out:

  • Examine the output of the new scene-change detection with different decoders, extract it frame by frame (maybe with ffmpeg), and look at the frames that show the issue.
  • I did not check the code properly to find the cause of the +1 frame offset. It happens for some clips that have a scene change, at some speed levels. We need to test at different levels and with some encoding features disabled, to see whether it is an offset caused by some other encoding function.

@master-of-zen master-of-zen force-pushed the scene-detection branch 2 times, most recently from 183e56a to 2745c52 Compare June 2, 2021 12:13
@master-of-zen
Collaborator Author

@vibhoothi I changed the fast scenecut threshold back to what it is on master. Can you rerun AWCY for speed 10?

@lu-zero
Collaborator

lu-zero commented Jul 9, 2021

Please rebase the whole thing.

@master-of-zen
Collaborator Author

master-of-zen commented Jul 14, 2021

AWCY vimeo 10s corpus

speed 6

AWCY


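| PSNR Y | PSNR Cb | PSNR Cr | CIEDE2000 | SSIM | MS-SSIM | PSNR-HVS Y | PSNR-HVS Cb | PSNR-HVS Cr | PSNR-HVS | VMAF | VMAF-NEG |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0.1524 | N/A | N/A | N/A | 0.0814 | 0.1332 | N/A | N/A | N/A | N/A | 0.1599 | 0.1187 |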

speed 9

AWCY


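| PSNR Y | PSNR Cb | PSNR Cr | CIEDE2000 | SSIM | MS-SSIM | PSNR-HVS Y | PSNR-HVS Cb | PSNR-HVS Cr | PSNR-HVS | VMAF | VMAF-NEG |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0.1763 | N/A | N/A | N/A | 0.1007 | 0.1433 | N/A | N/A | N/A | N/A | 0.2342 | 0.1804 |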

speed 10

AWCY

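| PSNR Y | PSNR Cb | PSNR Cr | CIEDE2000 | SSIM | MS-SSIM | PSNR-HVS Y | PSNR-HVS Cb | PSNR-HVS Cr | PSNR-HVS | VMAF | VMAF-NEG |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0.2841 | N/A | N/A | N/A | 0.3555 | 0.3468 | N/A | N/A | N/A | N/A | 0.2160 | 0.2319 |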

speed 10 with default scene detection and new algo

AWCY

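| PSNR Y | PSNR Cb | PSNR Cr | CIEDE2000 | SSIM | MS-SSIM | PSNR-HVS Y | PSNR-HVS Cb | PSNR-HVS Cr | PSNR-HVS | VMAF | VMAF-NEG |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| -0.6573 | N/A | N/A | N/A | -0.4079 | -0.6386 | N/A | N/A | N/A | N/A | -1.3326 | -1.3189 |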

Collaborator

@lu-zero lu-zero left a comment

It is fine for me if it is fine for @shssoichiro.

@shssoichiro
Collaborator

It is good for me as well

@shssoichiro shssoichiro dismissed vibhoothi’s stale review July 14, 2021 17:32

Requested changes were made; we got an updated AWCY run as requested via IRC

@shssoichiro shssoichiro merged commit 7970d35 into xiph:master Jul 14, 2021
@master-of-zen master-of-zen deleted the scene-detection branch July 15, 2021 00:11
tdaede pushed a commit to tdaede/rav1e that referenced this pull request Sep 1, 2021
6 participants