SOMHunter is an open-source video search and retrieval engine, available under the GPLv2 license. This version is slightly simplified from the prototype that was featured at the Video Browser Showdown 2020 in Daejeon, Korea (see https://videobrowsershowdown.org/).
- a very simple demonstration dataset is packed in the repository; an indexed V3C1 dataset will be available for download
- keyword search, similarity-based search
- several ways of browsing and re-scoring the results (including SOMs)
- VBS-compatible submission and logging API clients
SOMHunter is free software, licensed under GPLv2. This grants you the freedom to use the software for many research purposes and to publish the results. For exploring and referencing the original work, you may find some of the following articles helpful:
- Kratochvíl, M., Veselý, P., Mejzlík, F., & Lokoč, J. (2020, January). SOM-Hunter: Video Browsing with Relevance-to-SOM Feedback Loop. In International Conference on Multimedia Modeling (pp. 790-795). Springer, Cham.
- Mejzlík, F., Veselý, P., Kratochvíl, M., Souček, T., & Lokoč, J. (2020, June). SOMHunter for Lifelog Search. In Proceedings of the Third Annual Workshop on Lifelog Search Challenge (pp. 73-75).
- Kratochvíl, M., Mejzlík, F., Veselý, P., Souček, T., & Lokoč, J. (2020, October). SOMHunter: Lightweight Video Search System with SOM-Guided Relevance Feedback. In Proceedings of the 28th ACM International Conference on Multimedia (pp. 4481-4484).
- Veselý, P., Mejzlík, F., & Lokoč, J. (2021, June). SOMHunter V2 at Video Browser Showdown 2021. In International Conference on Multimedia Modeling (pp. 461-466). Springer, Cham.
You can get a working SOMHunter copy from Docker:
```
docker pull exaexa/somhunter:v0.1
docker run -ti --rm -p8080:8080 exaexa/somhunter:v0.1
```
After that, open your browser at http://localhost:8080 and use login `som` with password `hunter`.
## Installation from source
- a working installation of Node.js with some variant of package manager
- Python 3
- C++ compiler
- libcurl (see below for installation on various platforms)
After cloning this repository, change to the repository directory and run:

```
npm install
npm run start
```
If everything goes all right, you can start browsing at http://localhost:8080/. The site is password-protected by default; you can use the default login `som` with password `hunter`, or configure a different login.
## Getting the dependencies on UNIX systems
You should be able to install all dependencies from your distribution's package manager. On Debian-based systems (including Ubuntu and Mint), the following should work:

```
apt-get install build-essential libcurl4-openssl-dev nodejs yarnpkg
```
The build system uses `pkg-config` to find libCURL; if that fails, either install the libCURL pkg-config file manually, or customize the build configuration in `core/binding.gyp` to fit your setup.
Similarly named packages should be available on most other distributions.
## Getting the dependencies on Windows
The build system expects libCURL to reside in `c:\Program Files\curl\`. You may want to install it using vcpkg:

- download and install vcpkg
- install and export libCURL:

```
vcpkg install curl:x64-windows
vcpkg export --raw curl:x64-windows
```

- copy the directory with the exported libCURL to `c:\Program Files\curl\`
Alternatively, you can use any working development installation of libCURL by filling in the appropriate paths in `core/binding.gyp`.
We have tested SOMHunter on Windows and several different Linux distributions, which should cover a majority of target environments. Please report any errors you encounter using the GitHub issue tracker, so that we can fix them (and improve the portability of SOMHunter).
## Building the Docker image
The installation is described in `Dockerfile`; you should be able to get a working, correctly tagged (and possibly customized) image by running this in the repository directory:

```
docker build -t somhunter:$(git describe --always --tags --dirty=-$USER-$(date +%Y%m%d-%H%M%S)) .
```
The program is structured as follows:
- The frontend requests are routed in `app.js` to views and actions in `routes/somhunter.js`; display-specific routes are kept in a separate routes module.
- The views (for the browser) are rendered by the Node.js frontend.
- The Node.js "frontend" communicates with a C++ "backend" that handles the main data operations; the backend source code is in `core/`, which also exposes the main wrapper API.
- The backend implementation is present in `core/src/`, which contains the following modules:
  - `SomHunter` -- main data-holding structure with the C++ version of the wrapper API
  - `Submitter` -- VBS API client for submitting search results for the competition; also contains the logging functionality
  - `DatasetFrames` -- loading of the dataset description (frame IDs, shot IDs, video IDs, ...)
  - `DatasetFeatures` -- loading of the dataset feature matrix
  - `KeywordRanker` -- loading and application of W2VV keywords (see Li, X., Xu, C., Yang, G., Chen, Z., & Dong, J. (2019, October). W2VV++: Fully Deep Learning for Ad-hoc Video Search. In Proceedings of the 27th ACM International Conference on Multimedia (pp. 1786-1794).)
  - `RelevanceScores` -- maintenance of the per-frame scores and feedback-based re-ranking
  - `AsyncSom` -- SOM implementation; a background worker that computes the SOM
Additional minor utilities include:

- `config.h`, which contains various `#define`d constants, including file paths
- `log.h`, which defines relatively user-friendly logging with debug levels
- `distfs.h`, which defines fast SSE-accelerated computation of vector-vector operations (providing around a 4x speedup for almost all computation-heavy operations)
- `main.cpp`, which is not compiled in by default, but demonstrates how to run the SOMHunter core as a standalone C++ application
The repository contains a (very small) pre-extracted indexed dataset (see https://doi.org/10.1109/ICMEW.2015.7169765 for dataset details). That should be ready to use.
We can provide a larger pre-indexed dataset based on the V3C1 video collection, but we do not offer a direct download due to licensing issues; please contact us to get a download link. You will need to have the TRECVID data use agreement signed; see https://www-nlpir.nist.gov/projects/tv2019/data.html#licenses for details.
## Using custom video data
You may set up the locations of the dataset files in `config.h`. The thumbnails of the extracted video frames must be placed in the `public/thumbs/` directory, so that they are accessible from the browser. (You may want to use a symbolic link that points to thumbnails stored elsewhere, in order to save disk space and IO bandwidth.)
A description of how to extract the required data from a custom dataset is available in the `extractor/` directory, together with a separate README.