Add basic support for gifs/video files #12

AgentScrubbles · 2024-02-05T03:04:43Z

I think we could add some basic video/gif support by re-using the image processing and taking frames of each gif/video that pictrs stores.

I don't know Python well, but I do know ffmpeg and video processing very well. Using a library like ffmpeg-python, I think we could add support for videos/gifs. ffmpeg could then be easily bundled with the containers, and the feature could be toggled with an environment variable.

The process could go like:

For each of (first frame), (last frame), (every 5 seconds), (random sequence of frames):
  grab the frame using something like "ffmpeg -ss 00:00:00.01 -i {input}.(mp4|gif|whatever) -frames:v 1 {framenumber}.jpg"
  if that frame is detected as NSFW, reject the file

I think this would be a valid way to start checking videos as well. With frames sampled every n seconds combined with randomly grabbed frames, it would be extremely difficult to hide NSFW content in a video.
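A minimal sketch of the frame grab in Python, assuming ffmpeg is on the PATH and is called through `subprocess` rather than ffmpeg-python. The helper names `build_frame_command` and `grab_frame` are hypothetical and not part of the project:

```python
import subprocess
from pathlib import Path


def build_frame_command(input_path: str, timestamp_s: float, output_path: str) -> list[str]:
    """Build an ffmpeg command that writes one frame near timestamp_s to output_path."""
    return [
        "ffmpeg",
        "-y",                          # overwrite the output file if it exists
        "-ss", f"{timestamp_s:.3f}",   # seek to the timestamp (in seconds) before decoding
        "-i", input_path,
        "-frames:v", "1",              # emit a single video frame
        output_path,
    ]


def grab_frame(input_path: str, timestamp_s: float, output_path: str) -> Path:
    """Run ffmpeg and return the path of the extracted frame (hypothetical helper)."""
    subprocess.run(
        build_frame_command(input_path, timestamp_s, output_path),
        check=True,           # raise CalledProcessError if ffmpeg fails
        capture_output=True,  # keep ffmpeg's console output out of the logs
    )
    return Path(output_path)
```

Each extracted frame could then be handed to the existing image check, so the video path reuses the image pipeline unchanged.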


db0 · 2024-02-05T08:07:41Z

The problem is the false positives. The normal 6% rate is OK for a single image, but across all the frames of a video the hit rate will be almost 100%. A video has to be scanned using some very clever thresholds, possibly looping back to problem spots for a second pass.

AgentScrubbles · 2024-02-05T16:38:11Z

Could that 6% number be used to our advantage in this case? If we know we'll be checking multiple frames, we can write off a single positive as a false positive, but the chance of three independent frames all being false positives is 0.06^3 = 0.0216%, if I remember my combinatorics right. So if 2 or more frames flag, we can be pretty sure the video should be flagged?
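The 0.06^3 figure covers three specific frames; once more frames are sampled, the chance that at least k of them flag falsely is a binomial tail. A rough sketch, assuming each frame is an independent 6% false positive (probably optimistic, since neighbouring frames look alike); `prob_at_least` is a hypothetical helper name:

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): at least k of n frames flag falsely."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Three frames sampled, all three false positives: the 0.06^3 case.
three_of_three = prob_at_least(3, 3, 0.06)

# At least one false positive among 100 scanned frames: very close to 1,
# which is why a reject-on-first-hit rule does not work for long videos.
any_of_hundred = prob_at_least(1, 100, 0.06)
```

Evaluating `prob_at_least(k, n, 0.06)` for the planned sample size n is one way to pick how many flagged frames should be tolerated before rejecting.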

db0 · 2024-02-05T21:53:09Z

A video could have thousands or tens of thousands of frames, though. I am not saying it's not possible, but it requires some thought.

AgentScrubbles · 2024-02-05T22:48:51Z

That's why I think a solid sampling of them would be sufficient, if we can come up with a decent process. Thinking from an attacker's perspective, they would want to try to get around it. So something like this example algorithm:

For a video of length L seconds, pick the number of frames we want to analyze, N, and grab a frame every L/N seconds. Assuming we want 20 frames, N=20, so we are guaranteed 20 frames from the video at regular intervals.
- This would take on the attack vector of putting something at the beginning or end, and any fading in or out
In addition, grab another N frames from random spots in the video
- This would take on the attack vector of them learning where we take frames from, since they could not predict which frames will be analyzed
Also take the first and last frames, just as a double check.

If N is 20, this would be 42 total frames analyzed. We would cover the entirety of the video, and it would be incredibly difficult to hide anything in the intervals we didn't scan because of the random frames grabbed. Of course those dials could be tweaked over time, or there could be an environment variable controlling how finely the user wants videos analyzed.
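The sampling plan above could be sketched like this. The function name `sample_timestamps` is hypothetical, and placing the regular samples at the midpoint of each L/N interval is an assumption made here so that they do not coincide with the first and last frames:

```python
import random
from typing import Optional


def sample_timestamps(length_s: float, n: int,
                      rng: Optional[random.Random] = None) -> list[float]:
    """Return 2*n + 2 timestamps (seconds): n regular, n random, plus both ends."""
    rng = rng or random.Random()
    step = length_s / n
    # One sample per L/N interval, taken at the interval midpoint (assumption).
    regular = [(i + 0.5) * step for i in range(n)]
    # n extra samples at unpredictable positions anywhere in the video.
    randoms = [rng.uniform(0.0, length_s) for _ in range(n)]
    # Nominal first and last positions; a real decoder may need the final
    # timestamp nudged slightly earlier to land on an actual frame.
    return sorted([0.0, length_s] + regular + randoms)


timestamps = sample_timestamps(60.0, 20)  # 42 timestamps for a 60-second video
```

Passing `random.SystemRandom()` as `rng` would draw the random positions from the OS entropy source, which makes them harder for an uploader to predict than a seeded generator.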

If more than some percentage of the sampled frames fail, then consider the video failed. If 6% are false positives, then if 10-20% of the frames fail we can be reasonably sure that the video should be failed. (That threshold could also be an environment variable.)
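A minimal sketch of that decision rule, assuming the per-frame results arrive as a list of booleans. The name `video_fails` and the 0.2 default are illustrative, not existing project code:

```python
def video_fails(frame_flags: list[bool], max_flagged_ratio: float = 0.2) -> bool:
    """Fail the video when the share of flagged frames exceeds the threshold."""
    if not frame_flags:
        # Nothing was sampled, so there is nothing to reject on.
        return False
    flagged_ratio = sum(frame_flags) / len(frame_flags)
    return flagged_ratio > max_flagged_ratio


# 3 flagged out of 42 sampled frames is about 7%, below the 20% cut-off.
mostly_clean = [True] * 3 + [False] * 39
verdict = video_fails(mostly_clean)
```

Using a ratio rather than an absolute count keeps the rule meaningful when the number of sampled frames changes.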

db0 · 2024-02-06T00:14:48Z

Yes that's what I meant with more thought :)

Anyway, feel free to send a PR ;)

AgentScrubbles · 2024-02-17T20:22:37Z

Wanted to give a heads up: don't worry, I haven't forgotten about this. I'm not a Python dev, so I'm learning Python while fiddling with it. It looks like the library can detect if it's a gif and can let me grab frames using seek. I plan to read the requested number of frames to scan from an environment variable, then follow the quick algorithm described above.

I'm going to try to make that variable mean "for every gif, expect this many checks", so the higher the number, the more fine-grained the check will be. Then another variable, with a default value of 20% or so, will say "if the share of flagged frames is higher than this number, register as a positive and reject".
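Reading those two settings might look like the sketch below. The variable names `GIF_CHECKS` and `GIF_REJECT_THRESHOLD` are placeholders invented for illustration, and the default of 20 checks is an assumption borrowed from the N=20 example earlier in the thread:

```python
import os
from typing import Mapping, Tuple


def load_gif_settings(env: Mapping[str, str] = os.environ) -> Tuple[int, float]:
    """Return (checks per gif, reject threshold as a fraction between 0 and 1)."""
    # Both variable names are hypothetical placeholders, not existing settings.
    checks = int(env.get("GIF_CHECKS", "20"))
    threshold = float(env.get("GIF_REJECT_THRESHOLD", "0.2"))
    if checks < 1:
        raise ValueError("GIF_CHECKS must be at least 1")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("GIF_REJECT_THRESHOLD must be between 0 and 1")
    return checks, threshold


checks, threshold = load_gif_settings({"GIF_CHECKS": "30"})
```

Taking the environment as a parameter keeps the function easy to test without touching the real process environment.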

If you could confirm: I'm planning on adding this logic to check.py as a private method, so that if check_image determines the image is a gif, it passes it to a new check_gif method.
