Add basic support for gifs/video files #12

Open
AgentScrubbles opened this issue Feb 5, 2024 · 6 comments

Comments

@AgentScrubbles

AgentScrubbles commented Feb 5, 2024

I think we could add some basic video/gif support by re-using the image processing and taking frames of each gif/video that pictrs stores.

I don't know Python well, but I do know ffmpeg and video processing very well. Using a library like ffmpeg-python, I think we could add support for videos/gifs. ffmpeg could easily be bundled with the containers, and the feature could be toggled with an environment variable.

The process could go like:

For (first frame), (last frame), (every 5 seconds), (random sequence of frames):
  grab the frame using something like "ffmpeg -ss 00:00:00.01 -i {input}.(mp4|gif|whatever) -frames:v 1 {framenumber}.jpg"
  if that detects as nsfw, reject

I think this would be a valid way to start checking videos as well. With frames sampled every n seconds combined with randomly chosen frames, it would be extremely difficult to hide NSFW content in a video.

@db0
Owner

db0 commented Feb 5, 2024

The problem is the false positives. The normal 6% rate is OK for a single image, but across all the frames of a video it adds up to an almost 100% chance of a hit. A video would have to be scanned using some very clever thresholds, possibly looping back to problem spots for a second pass.
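
To put a rough number on that: assuming each frame is an independent check with false-positive rate p = 0.06 (an idealization, since neighbouring frames are usually correlated), the chance that a clean video produces at least one false positive among n scanned frames is:

```latex
P(\text{at least one false positive}) = 1 - (1 - p)^n, \qquad
1 - 0.94^{42} \approx 0.926, \qquad
1 - 0.94^{100} \approx 0.998
```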

@AgentScrubbles
Author

Could that 6% number be used to our advantage in this case? If we know we'll be checking multiple frames, then we can allow 1 positive as a false positive, but the chance of, say, 3 given frames all being false positives is 0.06^3 = 0.0216%, if I remember my combinatorics right (and assuming the frames are independent). So if 2 or more frames flag, can we be pretty sure the video should be flagged?
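
A quick way to check this kind of estimate is the binomial tail. This is a sketch, assuming frames are independent and each has the same 6% false-positive rate; the function name `prob_at_least` is made up for illustration.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float = 0.06) -> float:
    """P(at least k false positives among n independent frames)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # How likely is a clean video to collect k or more false flags in 42 frames?
    for k in (1, 2, 3, 5, 9):
        print(f"P(>= {k} false positives in 42 frames) = {prob_at_least(k, 42):.4f}")
```

For a fixed number of sampled frames, this gives a way to pick the smallest k whose tail probability is acceptably low.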

@db0
Owner

db0 commented Feb 5, 2024

A video could have thousands or tens of thousands of frames, though. I am not saying it's not possible, but it requires some thought.

@AgentScrubbles
Author

That's why I think a solid sampling of them would be sufficient, if we can come up with a decent process. Thinking from an attacker's perspective, they would want to try to get around it. So something like this example algorithm:

  • For the length of the video L in seconds, pick the number of frames we want to analyze, labeled N, and grab a frame every L/N seconds. Assume we want 20 frames to analyze, so N=20; we are then guaranteed 20 frames from the video at regular intervals.
    • This would cover the attack vector of them putting something at the beginning or end, including any fading in or out
  • In addition, grab another N frames from random spots in the video
    • This would cover the attack vector of them learning where we take frames from, since they could not predict which frames will be analyzed
  • Also take the first and last frames, just as a double check.

If N is 20, this would be 42 total frames analyzed. We would cover the entirety of the video, and it would be incredibly difficult to hide anything in the intervals we didn't scan because of the randomly grabbed frames. Of course those dials could be tweaked over time, or an environment variable could control how finely the user wants videos analyzed.
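
The sampling plan above can be sketched as a function that returns the timestamps to grab. This is only an illustration: `sample_timestamps` is a hypothetical name, and taking the regular frames from the middle of each of the N equal segments is one possible reading of "every L/N seconds".

```python
import random
from typing import List, Optional


def sample_timestamps(
    duration: float, n: int = 20, rng: Optional[random.Random] = None
) -> List[float]:
    """Return 2*n + 2 timestamps (in seconds) to scan in a video of `duration` seconds."""
    if rng is None:
        rng = random.Random()
    interval = duration / n
    # One frame from the middle of each of the n equal segments.
    evenly_spaced = [(i + 0.5) * interval for i in range(n)]
    # n more frames at unpredictable positions.
    random_spots = [rng.uniform(0.0, duration) for _ in range(n)]
    # First and last frames as a double check. A real extractor may need to
    # back off slightly from `duration` to land on an actual frame.
    return sorted(evenly_spaced + random_spots + [0.0, duration])
```

With the default N=20 this yields the 42 timestamps described above. Passing a seeded `random.Random` makes a run reproducible for testing, while production use would leave it unseeded so the positions stay unpredictable.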

If more than some percentage of the frames fail, then consider the video failed. If 6% are false positives, then it'd be reasonable to say that if 10-20% of the frames fail, we can be fairly sure the video should be failed. (That could also be an environment variable.)
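
A sketch of that decision rule, assuming the per-frame results are already available as booleans. The environment variable name `NSFW_VIDEO_FAIL_RATIO` and both function names are invented for this example; the 20% default is the upper end of the range suggested above.

```python
import os
from typing import Mapping, Sequence

DEFAULT_FAIL_RATIO = 0.2  # reject when more than 20% of scanned frames are flagged


def load_fail_ratio(env: Mapping[str, str] = os.environ) -> float:
    """Read the threshold from the (hypothetical) NSFW_VIDEO_FAIL_RATIO variable."""
    return float(env.get("NSFW_VIDEO_FAIL_RATIO", DEFAULT_FAIL_RATIO))


def should_reject(frame_flags: Sequence[bool], fail_ratio: float) -> bool:
    """True if the share of flagged frames is strictly above `fail_ratio`."""
    if not frame_flags:
        return False  # nothing was scanned, so nothing to reject on
    return sum(frame_flags) / len(frame_flags) > fail_ratio
```

With 42 scanned frames and the 20% default, 3 flagged frames (about 7%) would pass, while 9 flagged frames (about 21%) would be rejected.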

@db0
Owner

db0 commented Feb 6, 2024

Yes that's what I meant with more thought :)

Anyway, feel free to send a PR ;)

@AgentScrubbles
Author

AgentScrubbles commented Feb 17, 2024

Wanted to give a heads up: don't worry, I haven't forgotten about this. I'm not a Python dev, so I'm learning Python while fiddling with it. It looks like the library can detect if an image is a gif and can let me grab frames using seek. I plan to read the requested number of scanned frames from an environment variable, then follow the quick algorithm described above.

I'm going to try to make that variable mean "for every gif, expect this many checks", so the higher the number, the more fine-grained the check will be. Then another variable, with a default value of 20% or so, will say "if more than this share of frames register as positive, reject".

Could you confirm the approach? I'm planning on adding this logic to check.py as a private method: if check_image determines the image is a gif, it will pass it to a new check_gif method.
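
One possible shape for that method, as a rough sketch only. It assumes the library mentioned above is Pillow, whose image objects expose `n_frames`, `seek()` and `copy()`; the thread doesn't name the library, so that is an assumption. `is_nsfw` stands in for the existing per-image check and is hypothetical, as are the parameter names and defaults. Only evenly spaced frames are used here to keep it short.

```python
from typing import Any, Callable, List


def evenly_spaced_indices(n_frames: int, checks: int) -> List[int]:
    """Pick up to `checks` frame indices spread across the whole animation."""
    if checks <= 1:
        return [0]
    if n_frames <= checks:
        return list(range(n_frames))  # short gif: just check every frame
    step = (n_frames - 1) / (checks - 1)
    return sorted({round(i * step) for i in range(checks)})


def check_gif(
    image: Any,
    is_nsfw: Callable[[Any], bool],
    checks: int = 20,
    fail_ratio: float = 0.2,
) -> bool:
    """Return True if the gif should be rejected."""
    n_frames = getattr(image, "n_frames", 1)  # assumed Pillow-style attribute
    indices = evenly_spaced_indices(n_frames, checks)
    flagged = 0
    for index in indices:
        image.seek(index)            # move to the chosen frame
        if is_nsfw(image.copy()):    # copy so the check sees a standalone frame
            flagged += 1
    return flagged / len(indices) > fail_ratio
```

In this sketch `check_image` would call `check_gif` only when it detects an animated image, and `checks` and `fail_ratio` would come from the two environment variables described above.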


2 participants