
New frame selection model #195

Merged: 36 commits merged into master on Jul 15, 2022
Conversation

@ejm714 ejm714 commented Jul 1, 2022

Replaces the existing frame selection model (yolox-nano, image size 416, trained on 80k frames) with a new model (yolox-tiny, image size 640, trained on 800k frames).

Bonus fixes:

  • support the latest release of timm

Closes https://github.com/drivendataorg/pjmf-zamba/issues/88

@ejm714 ejm714 requested a review from pjbull July 1, 2022 02:08
@ejm714 ejm714 marked this pull request as draft July 1, 2022 21:35
@ejm714 ejm714 marked this pull request as ready for review July 1, 2022 22:38
@@ -448,7 +448,7 @@ def test_megadetector_lite_yolox_dog(tmp_path):
             "-vcodec",
             "libx264",
             "-crf",
-            "25",
+            "23",
@ejm714 ejm714 (Collaborator Author) commented:

Set CRF to the default value of 23 (otherwise the test fails due to lossy compression when creating the test video): https://trac.ffmpeg.org/wiki/Encode/H.264
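
For reference, a minimal sketch of what writing the test clip at the default CRF might look like (the helper name and paths are illustrative, not from this PR):

```python
# Illustrative sketch only: encode a test clip with libx264 at the default CRF of 23
# so the decoded frames stay close to the source frames used by the test assertions.
import subprocess

def write_test_video(input_pattern: str, output_path: str) -> None:
    subprocess.run(
        [
            "ffmpeg", "-y",
            "-i", input_pattern,      # e.g. "frame_%03d.png" (placeholder)
            "-vcodec", "libx264",
            "-crf", "23",             # libx264 default; higher CRF = lossier compression
            output_path,              # e.g. "dog.mp4" (placeholder)
        ],
        check=True,
    )
```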

codecov-commenter commented Jul 11, 2022

Codecov Report

Merging #195 (9e0d5cd) into master (a3b27ea) will increase coverage by 1.6%.
The diff coverage is 99.2%.

@@           Coverage Diff            @@
##           master    #195     +/-   ##
========================================
+ Coverage    85.3%   86.9%   +1.6%     
========================================
  Files          30      29      -1     
  Lines        1858    1901     +43     
========================================
+ Hits         1585    1653     +68     
+ Misses        273     248     -25     
| Impacted Files | Coverage Δ |
|----------------|------------|
| zamba/models/efficientnet_models.py | 100.0% <ø> (ø) |
| zamba/object_detection/yolox/yolox_model.py | 98.9% <98.9%> (ø) |
| zamba/data/video.py | 80.9% <100.0%> (ø) |
| zamba/object_detection/__init__.py | 100.0% <100.0%> (ø) |
| .../object_detection/yolox/megadetector_lite_yolox.py | 99.1% <100.0%> (+0.1%) ⬆️ |

ejm714 commented Jul 11, 2022

@pjbull this is ready for your review. The only failing test is due to Netlify. Note: I've used this code to successfully train and predict with the new frame selection method on the original set of 15k videos, but I'll open a separate PR with the new model weights once this is reviewed and merged.

We know from testing that this model is equivalently fast at the video level. However, since it uses 640 x 640 inputs for the MDLite model, the number of workers and the batch size need to be decreased for training and inference to avoid running out of GPU memory. This makes training and inference with this model slower, which is the biggest current drawback.
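
A back-of-the-envelope sketch of that memory tradeoff (illustrative numbers only, not zamba defaults or values from this PR):

```python
# Moving MDLite inputs from 416 x 416 to 640 x 640 increases the per-frame pixel
# count by roughly (640 / 416) ** 2, so the batch size (and dataloader worker
# count) has to drop to keep peak GPU memory roughly constant.
old_side, new_side = 416, 640
memory_scale = (new_side / old_side) ** 2      # ≈ 2.37x more pixels per frame

old_batch_size = 8                             # hypothetical previous setting
new_batch_size = max(1, int(old_batch_size / memory_scale))
print(new_batch_size)                          # -> 3: smaller batches at 640 x 640
```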

netlify bot commented Jul 14, 2022

Deploy Preview for silly-keller-664934 ready!

| Name | Link |
|------|------|
| 🔨 Latest commit | 9e0d5cd |
| 🔍 Latest deploy log | https://app.netlify.com/sites/silly-keller-664934/deploys/62d09f5add855e0008f7f54d |
| 😎 Deploy Preview | https://deploy-preview-195--silly-keller-664934.netlify.app |

ejm714 commented Jul 14, 2022

Ready for another look @pjbull. Addressed all your comments and, as a bonus, fixed all the object detection links in the docs, which had been broken.

@ejm714 ejm714 requested a review from pjbull July 14, 2022 20:57
@pjbull pjbull (Member) left a comment:

Two little things that aren't dealbreakers

MANIFEST.in — review comment (outdated, resolved)
resized_frames = []
resized_video = np.zeros(
    (video.shape[0], video.shape[3], self.config.image_height, self.config.image_width),
    dtype=np.float32,
@pjbull pjbull (Member) commented:

Are these floats at this point? May be worth double checking, since a lot of the time image data gets loaded in as uint8.

@ejm714 ejm714 commented Jul 14, 2022

The image is loaded as an int here: https://github.com/drivendataorg/zamba/blob/master/zamba/object_detection/yolox/megadetector_lite_yolox.py#L107

But the output of _preprocess is a float, which is what gets slotted in: https://github.com/drivendataorg/zamba/blob/master/zamba/object_detection/yolox/megadetector_lite_yolox.py#L115-L124

AFAIC this is the correct input for MDLite.
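
For readers skimming this thread, here is a minimal sketch of the dtype pattern being discussed (the `_preprocess` stand-in and its pad value are illustrative, not zamba's implementation): frames arrive as uint8, the per-frame preprocessing returns float32, and that is what fills the preallocated float32 array.

```python
# Illustrative sketch only: uint8 frames in, float32 frames out of preprocessing,
# slotted into a float32 array preallocated for the detector input size.
import numpy as np

def _preprocess(frame: np.ndarray, height: int, width: int) -> np.ndarray:
    """Pad a uint8 HWC frame to (height, width) and return a float32 CHW array."""
    padded = np.full((height, width, 3), 114, dtype=np.uint8)  # pad value is illustrative
    h, w = min(frame.shape[0], height), min(frame.shape[1], width)
    padded[:h, :w] = frame[:h, :w]
    return padded.transpose(2, 0, 1).astype(np.float32)

video = np.random.randint(0, 256, size=(8, 360, 640, 3), dtype=np.uint8)  # THWC uint8 frames
resized_video = np.zeros((video.shape[0], video.shape[3], 640, 640), dtype=np.float32)
for i, frame in enumerate(video):
    resized_video[i] = _preprocess(frame, 640, 640)  # float32 output fills float32 array
```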

@pjbull pjbull merged commit 57461a6 into master Jul 15, 2022
@pjbull pjbull deleted the new-frame-selection-model branch July 15, 2022 00:06