Perceptual Image Hashing v1 #2

Closed
sambux1 opened this issue Feb 8, 2022 · 5 comments
Labels
v0.1 version 0.1

Comments

@sambux1
Collaborator

sambux1 commented Feb 8, 2022

The system currently only supports text data as a test. At a minimum, we need to be able to upload and hash images and videos.

I will edit this issue with more information and a specific course of action.

@ahmedh409
Owner

If we can find an image- or video-hashing algorithm that turns the media into a string, this is very easy to implement. The interface just needs an object representing the upload, with the media's file path as the input to that algorithm; the string it returns can then be used as the _s_data value we currently store in blocks.

It doesn't matter if the algorithm is slow or old; it just needs to work so we can get the system off the ground.
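
A minimal sketch of the interface described above, to make the shape concrete. The names `MediaUpload`, `block_data_for`, and `sha256_file` are hypothetical and not from the project; `sha256_file` is only a stand-in (a cryptographic hash of the raw bytes, not a perceptual hash) so the example runs without any image library.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path
from typing import Callable

# Any function that maps a media file path to a hash string.
HashFn = Callable[[Path], str]


@dataclass
class MediaUpload:
    """Hypothetical object representing what is being uploaded."""
    path: Path
    media_type: str  # e.g. "image" or "video"


def sha256_file(path: Path) -> str:
    """Placeholder hash: SHA-256 of the file bytes (NOT perceptual)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def block_data_for(upload: MediaUpload, hash_fn: HashFn = sha256_file) -> str:
    """Return the string that would be stored as a block's _s_data value."""
    return hash_fn(upload.path)
```

Because the hashing function is passed in, a perceptual image hash could later replace the placeholder without changing how blocks consume the string.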

@ahmedh409
Owner

Starting with image hashing. Video hashing is much more complex, so we'll build up to that. I'm looking into perceptual hashes, which map an image to the same value despite minor modifications (e.g. a slightly different resolution, slightly different colors, added text). I found an implementation of pHash, a well-known open-source algorithm, so I'll try implementing that this week.
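
As a rough illustration of how robustness to minor modifications is usually checked: perceptual hashes of near-duplicate images often differ in a few bits rather than matching exactly, so they are commonly compared by Hamming distance. The sketch below assumes hashes are equal-length hex strings; the function names and the `max_bits=5` threshold are illustrative assumptions, not values from pHash or this project.

```python
def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count the bits that differ between two equal-length hex hashes."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    # XOR leaves a 1 wherever the two hashes disagree.
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def looks_similar(hash_a: str, hash_b: str, max_bits: int = 5) -> bool:
    """Treat two hashes as a match if few bits differ (threshold is assumed)."""
    return hamming_distance(hash_a, hash_b) <= max_bits
```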

I'll also write up a guide on digital signal/image processing which would be useful for programming and image hashing. We'll need a "hashing guide" which covers general hashing algorithms, image hashing, and video hashing soon, too.

@ahmedh409 ahmedh409 changed the title Media Hashing Image Hashing Feb 9, 2022
@ahmedh409 ahmedh409 added the v0.1 version 0.1 label Feb 9, 2022
@ahmedh409 ahmedh409 changed the title Image Hashing Perceptual Image Hashing Feb 9, 2022
@ahmedh409
Owner

Here is a basic guide with some pseudocode to implement a perceptual hash. Not up to pHash's robustness though.

Here is a really robust perceptual hash developed by the person who created pHash (above) - we can worry about implementing this later since it'll take more time, but this should be strongly considered later.

@sambux1
Collaborator Author

sambux1 commented Feb 21, 2022

I implemented a rough version of perceptual image hashing. For now, it takes an image as input, shrinks it to 8x8, converts it to grayscale, extracts one bit from each pixel, and outputs the resulting 64 bits as a 16-digit hex string.

This is far from a complete version of image hashing, but it is a good enough first step for v0.1.
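
The steps described above can be sketched roughly as follows. This is not the project's code: the comment does not say how the bit is extracted from each pixel, so this sketch assumes the common "average hash" rule (bit is 1 when the pixel is brighter than the mean). It also assumes the image is already decoded into rows of `(R, G, B)` tuples, and uses block averaging for the shrink step; all function names are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0-255
Image = List[List[Pixel]]     # rows of pixels


def shrink(image: Image, size: int = 8) -> Image:
    """Downscale to size x size by averaging the pixels in each grid cell."""
    h, w = len(image), len(image[0])
    out = []
    for gy in range(size):
        y0 = gy * h // size
        y1 = max((gy + 1) * h // size, y0 + 1)  # at least one source row
        row = []
        for gx in range(size):
            x0 = gx * w // size
            x1 = max((gx + 1) * w // size, x0 + 1)  # at least one source column
            cell = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(tuple(sum(p[c] for p in cell) // len(cell) for c in range(3)))
        out.append(row)
    return out


def to_gray(pixel: Pixel) -> int:
    """Integer luma approximation (ITU-R BT.601 weights)."""
    r, g, b = pixel
    return (299 * r + 587 * g + 114 * b) // 1000


def average_hash(image: Image) -> str:
    """Return a 16-digit hex string: one bit per pixel of the 8x8 thumbnail."""
    gray = [to_gray(p) for row in shrink(image, 8) for p in row]
    mean = sum(gray) / len(gray)
    bits = 0
    for value in gray:
        # Assumed rule: 1 if brighter than the mean, else 0.
        bits = (bits << 1) | (1 if value > mean else 0)
    return format(bits, "016x")
```

With this rule, a perfectly uniform image hashes to all zeros, and small brightness changes that do not move pixels across the mean leave the hash unchanged.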

@sambux1
Collaborator Author

sambux1 commented Feb 21, 2022

I'm removing the v0.1 label and adding the v0.3 label. This is enough to work through v0.2.

Edit: I'm going to undo that relabel so we can see this as progress in the v0.1 category, and I'll open a new issue for a future version of image hashing.

@sambux1 sambux1 added v0.3 version 0.3 v0.1 version 0.1 and removed v0.1 version 0.1 v0.3 version 0.3 labels Feb 21, 2022
@sambux1 sambux1 closed this as completed Feb 21, 2022
@sambux1 sambux1 changed the title Perceptual Image Hashing Perceptual Image Hashing v1 Feb 21, 2022