Make scene detection faster and more accurate #3

Witiko · 2020-08-11T16:00:32Z

This pull request aims to make video processing faster by increasing the accuracy and the speed of scene detection:

We now use the mean squared error (MSE) of pixel values, which is differentiable and places emphasis on large errors, rather than the non-differentiable mean absolute error (MAE). This change should improve accuracy.
The frame image pixels are now denoised and sampled before comparison. This change should improve speed and accuracy.
The error of pixel values is computed in the CIE L*a*b* color space, where the Euclidean distance corresponds to the perceptual color distance, instead of sRGB. This change should improve accuracy.
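
The three changes above can be illustrated with a minimal NumPy sketch. It is not the code from this branch: the function names (`srgb_to_lab`, `block_mean`, `frame_mse`), the block averaging used as a stand-in for denoising and sampling, and the downscaling factor are all assumptions made for illustration, so the values it produces are not on the same scale as the thresholds quoted in this description.

```python
import numpy as np

# sRGB (D65) to XYZ matrix and D65 reference white.
_RGB_TO_XYZ = np.array([
    [0.4124564, 0.3575761, 0.1804375],
    [0.2126729, 0.7151522, 0.0721750],
    [0.0193339, 0.1191920, 0.9503041],
])
_WHITE = np.array([0.95047, 1.0, 1.08883])


def srgb_to_lab(rgb):
    """Convert sRGB values in [0, 1] (shape ..., 3) to CIE L*a*b*."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # Undo the sRGB gamma encoding.
    linear = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    xyz = (linear @ _RGB_TO_XYZ.T) / _WHITE
    delta = 6.0 / 29.0
    f = np.where(xyz > delta ** 3, np.cbrt(xyz), xyz / (3 * delta ** 2) + 4.0 / 29.0)
    lightness = 116.0 * f[..., 1] - 16.0
    a = 500.0 * (f[..., 0] - f[..., 1])
    b = 200.0 * (f[..., 1] - f[..., 2])
    return np.stack([lightness, a, b], axis=-1)


def block_mean(image, factor):
    """Downscale an (H, W, C) image by averaging factor x factor blocks.

    Averaging suppresses pixel noise and reduces the number of samples.
    """
    height, width, channels = image.shape
    height -= height % factor
    width -= width % factor
    cropped = image[:height, :width]
    blocks = cropped.reshape(height // factor, factor, width // factor, factor, channels)
    return blocks.mean(axis=(1, 3))


def frame_mse(frame_a, frame_b, factor=8):
    """Mean squared error between two sRGB frames, measured in L*a*b*."""
    lab_a = srgb_to_lab(block_mean(frame_a, factor))
    lab_b = srgb_to_lab(block_mean(frame_b, factor))
    return float(np.mean((lab_a - lab_b) ** 2))


# Usage: identical frames give zero error, a dimmed frame gives a positive one.
white = np.ones((16, 16, 3))
print(frame_mse(white, white), frame_mse(white, 0.9 * white))
```

Because the error is squared, one large local change (such as a new slide) outweighs many small changes (such as sensor noise), which is the property the first item relies on.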

As a result of these changes, we were able to increase the threshold from 0.12 MAE to 0.22 MSE while retaining 100% accuracy on the training set. Just replacing MAE with MSE decreases the threshold from 0.12 MAE to 0.1 MSE, i.e. the increased threshold is due to the denoising and the perceptual color distance, which increase separability. For further speed improvement, the threshold could be increased to 0.25 MSE for ≥ 95% accuracy on the training set.
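
One possible reading of this trade-off, sketched below, is that a higher threshold flags fewer frames as scene changes (less work downstream), and the largest usable threshold is bounded by the smallest error observed at a true transition. The helper `largest_threshold`, the `>=` comparison, and the error values are hypothetical and only illustrate the selection rule; they are not the training data or the procedure used for this pull request.

```python
import math


def largest_threshold(transition_errors, min_detected=1.0):
    """Largest threshold that still flags the required share of transitions.

    A transition counts as flagged when its error is >= the threshold.
    """
    if not transition_errors:
        raise ValueError("need at least one transition error")
    ranked = sorted(transition_errors, reverse=True)
    # Number of transitions that must remain at or above the threshold.
    must_keep = max(1, math.ceil(min_detected * len(ranked)))
    return ranked[must_keep - 1]


# Made-up errors measured at four annotated transitions (illustrative only).
errors = [0.8, 0.5, 0.35, 0.3]
print(largest_threshold(errors))        # every transition is still flagged
print(largest_threshold(errors, 0.75))  # one transition may be missed
```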

@xbankov, when you are testing new models, can you please write new code on top of the speed-up-scene-detection branch to see if this helps the conversion speed? If so, I will merge this and close #2.

Change mean absolute error to MSE, downscale frame images, use CIE LAB

xbankov · 2020-08-12T07:04:53Z

Looks great! I will use the speed-up-scene-detection branch for testing the conversion speed

Closes #4

Witiko · 2020-08-13T21:38:28Z

Evaluation with the withheld IA067-D2-20191112.mp4 recording shows that the updated scene detector significantly improves speed (except with the annotated page detector, which does no computation) with no loss of accuracy; see the table below and the experimental notebook. The improvement is largest with vgg16, which took 2 hours 30 minutes to finish without scene detection and only 12 minutes with scene detection on my machine. This is a significant leap forward, although the low accuracies are worrying: we need to replace vgg16 with something better before the system is practically useful, see #5.

@xbankov, if you remove the line that says # fastai is too large to load and rerun the experimental notebook, you should get a larger table that also contains the measurements for your screen detector. With the updated scene detector, we should finally see some real-time performance. Feel free to push the notebook with the updated table to the speed-up-scene-detection branch. I am interested to see what the results will be.

xbankov · 2020-08-14T16:41:06Z

fastai looks "more" real-time with distance; however, the accuracies are a little bit worrying.
Why are page detectors better with the fastai screen detector than with annotated?

Witiko · 2020-08-14T17:53:12Z

> Why are page detectors better with a fastai screen detector than with annotated?

The recording was captured in 2019, but the last annotations for screen positions are from 2016. I would expect that the cameras have moved since then and the annotations are therefore imperfect.

vgg16: 57.89% | 63.16%

Switching the pages may cause overexposure and it takes a few seconds for the camera to readjust. The training dataset does not contain these moments. Since the scene detector only uses the screen detector during these moments, it may be that the screen detector is unable to detect the screens correctly.

annotated: 25.00% | 30.26%

If fastai with the annotated page detector received < 100% accuracy, then it could be because fastai does not detect any screens or detects too many screens. However, I am baffled as to why fastai with the annotated page detector received less than fastai with vgg16. If you could delete the file docs/notebooks/__main__/speed_and_accuracy_outputs/IA067-D2-20191112-annotated-fastai-distance.accuracy, apply the following patch and rerun the notebook, then we should be able to tell.

diff --git a/docs/notebooks/__main__/annotated.py b/docs/notebooks/__main__/annotated.py
index 0dd2652..5c21d08 100644
--- a/docs/notebooks/__main__/annotated.py
+++ b/docs/notebooks/__main__/annotated.py
@@ -141,10 +141,14 @@ def evaluate_event_detector(event_detector):
             if page_number is None:
                 if not detected_page_dict:
                     num_successes += 1
+                else:
+                    print(f'Frame {frame_number}: Expected no pages, but detected {detected_page_dict}')
             else:
                 detected_page_numbers = set(page.number for page in detected_page_dict.values())
                 if len(detected_page_dict) <= 2 and detected_page_numbers == set([page_number]):
                     num_successes += 1
+                else:
+                    print(f'Frame {frame_number}: Expected at most 2 screens with page {page_number}, but detected {len(detected_page_dict)} screens with pages {detected_page_numbers}')

         if isinstance(event, (ScreenAppearedEvent, ScreenChangedContentEvent)):
             detected_page_dict[event.screen_id] = event.page

The notebook is written so that only the table cell of fastai with the annotated page detector will be rerun.

xbankov · 2020-08-19T08:47:38Z

The output of the print statements looks like the following.
That means the screen detector does not work that well, am I right?

Frame 346: Expected at most 2 screens with page 1, but detected 0 screens with pages set()
Frame 346: Expected at most 2 screens with page 1, but detected 3 screens with pages {1}
Frame 346: Expected at most 2 screens with page 1, but detected 3 screens with pages {1}
Frame 346: Expected at most 2 screens with page 1, but detected 3 screens with pages {1}
Frame 346: Expected at most 2 screens with page 1, but detected 3 screens with pages {1}
Frame 1077: Expected at most 2 screens with page 2, but detected 2 screens with pages {1}
Frame 1077: Expected at most 2 screens with page 2, but detected 2 screens with pages {1}
Frame 1077: Expected at most 2 screens with page 2, but detected 2 screens with pages {1}
Frame 1077: Expected at most 2 screens with page 2, but detected 2 screens with pages {1}
...
Frame 1077: Expected at most 2 screens with page 2, but detected 2 screens with pages {1}
Frame 1077: Expected at most 2 screens with page 2, but detected 2 screens with pages {1}
Frame 1077: Expected at most 2 screens with page 2, but detected 2 screens with pages {1}
Frame 1077: Expected at most 2 screens with page 2, but detected 2 screens with pages {1, 2}
Frame 1077: Expected at most 2 screens with page 2, but detected 2 screens with pages {1, 2}
Frame 1077: Expected at most 2 screens with page 2, but detected 3 screens with pages {2}
Frame 1077: Expected at most 2 screens with page 2, but detected 3 screens with pages {2}
Frame 1077: Expected at most 2 screens with page 2, but detected 3 screens with pages {2}
...
Frame 1077: Expected at most 2 screens with page 2, but detected 3 screens with pages {2}

Frame 65925: Expected no pages, but detected {'screen-69': <PDFDocumentPage, page #65>, 'screen-130': <PDFDocumentPage, page #65>, 'screen-131': <PDFDocumentPage, page #65>}
Frame 65925: Expected no pages, but detected {'screen-69': <PDFDocumentPage, page #65>, 'screen-130': <PDFDocumentPage, page #65>, 'screen-131': <PDFDocumentPage, page #65>}

Witiko · 2020-08-19T12:07:14Z

@xbankov It is unexpected that the frame numbers should repeat. Is this the raw output you are getting? If so, this indicates errors in the evaluation code. I will need to troubleshoot this further, thank you for investigating.

What we can say for sure is that the screen detector seems to be detecting three screens instead of two in the first two cases. At the moment, this is not too troubling and we can relax the conditions for success to all screens showing the expected page, no matter the number of detected screens.
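
The relaxed condition can be sketched as a small predicate. This is a simplified stand-in for the check in `evaluate_event_detector`, not a patch against it: the function name `is_success`, the `max_screens` parameter, and the use of plain integers instead of page objects are assumptions made for illustration.

```python
def is_success(expected_page, detected_pages, max_screens=None):
    """Decide whether a frame counts as a success.

    detected_pages maps screen ids to detected page numbers.
    max_screens=2 mimics the current strict check, and
    max_screens=None ignores the number of detected screens.
    """
    if expected_page is None:
        # No page is shown, so no screen should report one.
        return not detected_pages
    if max_screens is not None and len(detected_pages) > max_screens:
        return False
    # Every detected screen must show the expected page.
    return set(detected_pages.values()) == {expected_page}


# Three screens that all show the expected page 1.
three_screens = {"screen-1": 1, "screen-2": 1, "screen-3": 1}
print(is_success(1, three_screens, max_screens=2))  # strict: False
print(is_success(1, three_screens))                 # relaxed: True
```

Frames where no screen is detected, or where any screen shows a different page, still count as failures under the relaxed rule.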

Witiko · 2020-08-19T20:18:04Z

I received a very different output and an accuracy of 75% (instead of 25%) after removing the file IA067-D2-20191112-annotated-fastai-distance.accuracy and rerunning the notebook:

Frame 1476: Expected at most 2 screens with page 3, but detected 3 screens with pages {3}
Frame 5203: Expected at most 2 screens with page 11, but detected 3 screens with pages {11}
Frame 9144: Expected at most 2 screens with page 17, but detected 3 screens with pages {17}
Frame 14923: Expected at most 2 screens with page 26, but detected 3 screens with pages {26}
Frame 15365: Expected at most 2 screens with page 27, but detected 3 screens with pages {27}
Frame 15659: Expected at most 2 screens with page 28, but detected 3 screens with pages {28}
Frame 15904: Expected at most 2 screens with page 29, but detected 3 screens with pages {29}
Frame 16904: Expected at most 2 screens with page 30, but detected 3 screens with pages {30}
Frame 19077: Expected at most 2 screens with page 33, but detected 3 screens with pages {33}
Frame 25177: Expected at most 2 screens with page 45, but detected 3 screens with pages {45}
Frame 25851: Expected at most 2 screens with page 46, but detected 3 screens with pages {46}
Frame 27114: Expected at most 2 screens with page 47, but detected 3 screens with pages {47}
Frame 27885: Expected at most 2 screens with page 48, but detected 3 screens with pages {48}
Frame 42657: Expected at most 2 screens with page 50, but detected 3 screens with pages {50}
Frame 43460: Expected at most 2 screens with page 51, but detected 3 screens with pages {51}
Frame 44916: Expected at most 2 screens with page 53, but detected 3 screens with pages {53}
Frame 50477: Expected at most 2 screens with page 57, but detected 3 screens with pages {57}
Frame 50572: Expected at most 2 screens with page 58, but detected 3 screens with pages {58}

It seems that the only issue is an extra detected screen:

Do you have any idea why such output would be produced, @xbankov?
I will rerun the accuracy part of the evaluation, ignoring errors in the number of screens, and report the results tomorrow.
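
To check mechanically whether every reported failure is only an extra screen, the diagnostic lines can be parsed and classified. The sketch below assumes the exact message format produced by the patch above; the names `LINE`, `classify`, and the category labels are made up for this example.

```python
import re
from collections import Counter

LINE = re.compile(
    r"Frame (?P<frame>\d+): Expected at most (?P<limit>\d+) screens "
    r"with page (?P<page>\d+), but detected (?P<count>\d+) screens "
    r"with pages (?P<pages>.*)"
)


def classify(line):
    """Label one diagnostic line by the kind of failure it reports."""
    match = LINE.fullmatch(line.strip())
    if match is None:
        return "unparsed"
    expected = int(match["page"])
    # Handles both "{1, 2}" and the empty "set()".
    detected = {int(number) for number in re.findall(r"\d+", match["pages"])}
    too_many = int(match["count"]) > int(match["limit"])
    if detected == {expected}:
        return "extra screen only" if too_many else "unexplained"
    return "wrong or missing page"


lines = [
    "Frame 1476: Expected at most 2 screens with page 3, but detected 3 screens with pages {3}",
    "Frame 5203: Expected at most 2 screens with page 11, but detected 3 screens with pages {11}",
    "Frame 1077: Expected at most 2 screens with page 2, but detected 2 screens with pages {1, 2}",
]
print(Counter(classify(line) for line in lines))
```

The first two sample lines are taken from the output in this comment and the third from the earlier output, which shows how a genuinely wrong page would be separated from an extra screen.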

Witiko · 2020-08-21T14:51:35Z

@xbankov Below are the updated accuracies:

It seems that using the scene detector leads to some loss of accuracy after all, although it is not clear whether it is the page detector or the screen detector that is responsible for this. We should be able to tell after I have fixed the faulty screen annotations and recomputed the accuracies.

Witiko · 2020-08-23T12:32:21Z

Here are the accuracies after I have fixed the screen annotations:

With imagehash and vgg16 page detectors, fastai achieves up to 10% worse accuracy compared to annotated, because it detects three screens instead of two, as discussed above. The siamese page detector is a great mystery, since it seems to benefit from fastai: this needs further investigation.

The distance scene detector decreases accuracy with both the fastai and the annotated screen detectors, but fastai is hit harder. My hypothesis is that the scene detector will often detect a scene transition when a crossfade between two presentation slides has not yet finished. Neither the fastai screen detector nor the page detectors have been trained with cross-faded images.
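
The hypothesis can be made concrete under simplifying assumptions: the crossfade is a linear blend, the error is computed in the same space in which the blend is linear (which is not exactly true for L*a*b*), and each frame is compared with the last frame before the fade. Under those assumptions the error at blend position t is t² times the error between the two slides, so the threshold is reached before the fade ends whenever the slides differ by more than `max_mse`. The helper names below are hypothetical.

```python
import numpy as np


def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))


def crossfade(a, b, t):
    """Linear blend between slide a (t = 0) and slide b (t = 1)."""
    return (1.0 - t) * a + t * b


def first_triggering_position(full_mse, max_mse):
    """Blend position at which the error against slide a reaches max_mse.

    Returns None when the two slides never differ by max_mse at all.
    """
    if full_mse <= 0.0 or full_mse < max_mse:
        return None
    return (max_mse / full_mse) ** 0.5


rng = np.random.default_rng(0)
slide_a = rng.random((32, 32, 3))
slide_b = rng.random((32, 32, 3))

full = mse(slide_a, slide_b)
halfway = mse(slide_a, crossfade(slide_a, slide_b, 0.5))
# Halfway through the fade, the error is a quarter of the full error.
assert np.isclose(halfway, 0.25 * full)

# With a threshold at a quarter of the full error, the detector would
# fire at the midpoint of the fade, on a blended frame.
print(first_triggering_position(full, 0.25 * full))
```

If this is what happens, the screen and page detectors receive a frame that mixes two slides, which matches the observation that neither was trained on cross-faded images.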

Witiko · 2020-08-23T16:54:28Z

> The siamese page detector is a great mystery, since it seems to benefit from fastai: this needs further investigation.

Below is the output of the annotated screen detector:

Frame 346: Expected at most 2 screens with page 1, but detected 0 screens
Frame 1077: Expected at most 2 screens with page 2, but detected 0 screens
Frame 1476: Expected at most 2 screens with page 3, but detected 0 screens
Frame 1623: Expected at most 2 screens with page 4, but detected 0 screens
Frame 2063: Expected at most 2 screens with page 5, but detected 0 screens
Frame 2642: Expected at most 2 screens with page 6, but detected 0 screens
Frame 3543: Expected at most 2 screens with page 7, but detected 0 screens
Frame 4202: Expected at most 2 screens with page 8, but detected 0 screens
Frame 4414: Successfully detected page 9
Frame 4828: Expected at most 2 screens with page 10, but detected 0 screens
Frame 5203: Expected at most 2 screens with page 11, but detected 0 screens
...

Below is the output of the fastai screen detector:

Frame 346: Successfully detected page 1
Frame 1077: Successfully detected page 2
Frame 1476: Expected at most 2 screens with page 3, but detected 0 screens
Frame 1623: Successfully detected page 4
Frame 2063: Successfully detected page 5
Frame 2642: Expected at most 2 screens with page 6, but detected 0 screens
Frame 3543: Expected at most 2 screens with page 7, but detected 0 screens
Frame 4202: Expected at most 2 screens with page 8, but detected 0 screens
Frame 4414: Successfully detected page 9
Frame 4828: Expected at most 2 screens with page 10, but detected 0 screens
Frame 5203: Expected at most 2 screens with page 11, but detected 0 screens
...

As you can see in the output of the fastai screen detector with the annotated page detector, the fastai screen detector with the siamese page detector fails where fastai fails (frames 1476, 5203, ...). This indicates that siamese benefits from different coordinates of the screens detected by fastai, not from fastai's failure to detect the correct number of screens.

Below, you can see the detected screens from frames 346, 1077, and 1623, where fastai+siamese succeeds (green), but annotated+siamese fails (red):

As you can see, the screens detected by fastai do not seem better than annotated, i.e. this seems to be just an idiosyncrasy of siamese. It could be that siamese cannot cope with the annotated screens that extend beyond the boundaries of a screen.
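
The comparison of the two outputs above can be reproduced with a set difference over the failing frame numbers. The sketch assumes that both excerpts come from runs with the siamese page detector and that every line not starting with "Successfully" is a failure; `failed_frames` is a hypothetical helper, and only the excerpted lines are used, not the full logs.

```python
import re


def failed_frames(log):
    """Return the frame numbers of all lines that do not report success."""
    failed = set()
    for line in log.strip().splitlines():
        match = re.match(r"Frame (\d+): (.*)", line.strip())
        if match and not match.group(2).startswith("Successfully"):
            failed.add(int(match.group(1)))
    return failed


annotated_log = """
Frame 346: Expected at most 2 screens with page 1, but detected 0 screens
Frame 1077: Expected at most 2 screens with page 2, but detected 0 screens
Frame 1476: Expected at most 2 screens with page 3, but detected 0 screens
Frame 1623: Expected at most 2 screens with page 4, but detected 0 screens
Frame 2063: Expected at most 2 screens with page 5, but detected 0 screens
Frame 2642: Expected at most 2 screens with page 6, but detected 0 screens
Frame 3543: Expected at most 2 screens with page 7, but detected 0 screens
Frame 4202: Expected at most 2 screens with page 8, but detected 0 screens
Frame 4414: Successfully detected page 9
Frame 4828: Expected at most 2 screens with page 10, but detected 0 screens
Frame 5203: Expected at most 2 screens with page 11, but detected 0 screens
"""

fastai_log = """
Frame 346: Successfully detected page 1
Frame 1077: Successfully detected page 2
Frame 1476: Expected at most 2 screens with page 3, but detected 0 screens
Frame 1623: Successfully detected page 4
Frame 2063: Successfully detected page 5
Frame 2642: Expected at most 2 screens with page 6, but detected 0 screens
Frame 3543: Expected at most 2 screens with page 7, but detected 0 screens
Frame 4202: Expected at most 2 screens with page 8, but detected 0 screens
Frame 4414: Successfully detected page 9
Frame 4828: Expected at most 2 screens with page 10, but detected 0 screens
Frame 5203: Expected at most 2 screens with page 11, but detected 0 screens
"""

# Frames where annotated fails but fastai succeeds.
only_annotated_fails = sorted(failed_frames(annotated_log) - failed_frames(fastai_log))
print(only_annotated_fails)  # [346, 1077, 1623, 2063]

# Frames where fastai fails but annotated succeeds (none in this excerpt).
print(sorted(failed_frames(fastai_log) - failed_frames(annotated_log)))  # []
```

In the excerpt, every frame on which fastai fails is also a failure for annotated, so the difference between the two runs comes only from frames that fastai gets right.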

Witiko changed the title from "Speed up scene detection" to "Make scene detection faster and more accurate" on Aug 11, 2020

Rename FrameImageDistanceSceneDetector to MeanSquaredErrorSceneDetector

a04d0a9

Change mean absolute error to MSE, downscale frame images, use CIE LAB

Witiko force-pushed the speed-up-scene-detection branch from e9d222e to a04d0a9 on August 11, 2020 16:19

Witiko added 3 commits August 13, 2020 23:11

Conservatively decrease MeanSquaredErrorSceneDetector's max_mse

32b204c

Add -q/--quiet argument to the CLI

36a0aad

Evaluate speed and accuracy for all combinations of detectors

eff545a

Closes #4

Fix code style in docs/notebooks/__main__/annotated.py

28d8d13

Witiko mentioned this pull request Aug 13, 2020

Improve accuracy of page detectors #5

Open

10 tasks

Including fast.ai on google cloud GPU for comparison

0d970b7

Reevaluate accuracy

a96a868

Witiko added 2 commits August 21, 2020 16:58

Annotate screens for video IA067-D2-20191112.mp4

1878186

Reevaluate accuracy

2b02e83

Witiko merged commit 918378d into master Aug 23, 2020

xbankov deleted the speed-up-scene-detection branch February 6, 2021 16:29
