imgidx is a simple image indexing tool for searching similar or the same images regardless of their format, size, and compression.
Basically,We can distinguish three cases of image search (sorted by increasing complexity):
- Search for exactly the same file. In this case, we simply can use file hashes to match them.
- Searching for the same image with different sizes/formats/compression/watermarks/etc.: this is the use case imgidx designed for.
- Searching for different pictures of the same or similar objects. E.g. searching for a photo of the same thing from another angle, face recognition, etc. This problem is usually solved by CNN or other Deep Learning technics, imgidx is too dumb for this task.
package main
import (
"github.com/alef-ru/imgidx"
)
idx, err := imgidx.NewCompositeIndex(8, 8)
// Handle the error
package main
import (
"github.com/alef-ru/imgidx"
"gorm.io/driver/sqlite"
)
idx, err = imgidx.NewPersistentCompositeIndex(8, 8, sqlite.Open("imgidx.db"))
Embedder is a component that represents an image as a vector of floats. You can develop your own embedder.
var img image.Image
// ... init img...
_, err := idx.AddImage(img, "image unique identifier", attributes)
// .. handle err ...
You can also add images by their URL or path to a local file
_, err = imgidx.AddImageUrl(idx, "https://example.com/image.jpeg", attributes)
_, err = imgidx.AddImageFile(idx, '/path/to/image.png', attributes)
attributes
is whatever additional information you want to associate with the image
Likewise, adding an image, there are three ways to find the nearest to given image in the index.
uri, attrs, dist, err = idx.Nearest(img)
uri, attrs, dist, err = imgidx.NearestByFile(idx, path)
uri, attrs, dist, err = imgidx.NearestByUrl(idx, url)
this functions return the same set of variables:
uri
: a unique image id: URL, the full path to file, or something custom, depending on how the image was addedattrs
: whatever you passed to AddImage*() method when adding the imagedist
: distance between the given and found image. The less the distance - the more similar the images
- JPEG
- PNG
- GIF
Here is a simple example of indexing and searching images with sample data.
For the full code of this example, please refer to the simple_demo
Let's say we have 20 images with pokemons and we want to find a few images against it.
This image (exactly the same file) was indexed, so Index.Nearest() finds it and returns the distance is 0.
This image was made of the previous one, but it has a different format, JPEG instead of PNG, and was severely compressed, so compression artifacts became visible.
Index.Nearest() finds the proper image, but the distance in this case 0.023425
This image was made of the original one, with a number of modifications:
- It was up upscaled and converted to JPEG
- It was squished a bit
- It has frames and some extra lines drawn on top
So, basically, this image is different from any of the indexed images. But Index.Nearest() can still find the original image. Distance, in this case, is 1.922400
If we try to remove the original image from the index and try to find it again, the index will return some other image. In our case this one:
Distance: 3.108842
In _examples/server
you can find a simple server that can be used to index and search images.
You can run it like this:
git clone https://github.com/alef-ru/imgidx.git
cd imgidx
HTTPS_HOSTNAME=imgidx.example.com AUTH_TOKEN=my_secret_token go run ./_examples/server/
HTTPS_HOSTNAME
is optional. Omit it if you don't want to use HTTPS. The app will listen on port 8080, in this case.
AUTH_TOKEN
is also optional, if you omit it, the server will not require token.