
Hysia Video to Online Platform [V1.0]

An intelligent multimodal-learning based system for video, product and ads analysis. You can build various downstream applications with the system, such as product recommendation and video retrieval. Several examples are provided.

V2 is currently under active development. You are welcome to create an issue or a pull request here. We will credit contributions in V2.


Table of Contents

  1. Highlights
  2. Showcase
  3. Download Data
  4. Installation
  5. Configuration
  6. Demo
  7. Some Useful Tools
  8. Todo List
  9. Credits
  10. Contribute to Hysia-V2O
  11. About Us


Highlights

  • Multimodal learning-based video analysis:
    • Scene / Object / Face detection and recognition
    • Multimodality data preprocessing
    • Result alignment and storage
  • Downstream applications:
    • Intelligent ads insertion
    • Content-product match
  • Visualized testbed
    • Visualize multimodality results
    • Can be installed separately


Showcase

1. Upload video and process it by selecting different models


2. Display video processing result




3. Search scene by image and text



4. Insert product advertisement and display insertion result



Download Data

Here is a summary of required data / packed libraries.

| File name | Description | File ID | Unzipped directory |
| --- | --- | --- | --- |
| hysia-decoder-lib-linux-x86-64.tar.gz | Hysia Decoder dependent lib | 1fi-MSLLsJ4ALeoIP4ZjUQv9DODc1Ha6O | hysia/core/HysiaDecode |
| weights.tar.gz | Pretrained model weights | 1O1-QT8HJRL1hHfkRqprIw24ahiEMkfrX | . |
| object-detection-data.tar.gz | Object detection data | 1an7KGVer6WC3Xt2yUTATCznVyoSZSlJG | third/object_detection |

Users without Google Drive access can download the files from Baidu Wangpan and unzip them into the corresponding directories (see Option 2).

Option 1: Auto-download

# Make sure this script is run from project root
bash scripts/
cd ..

Option 2: Step-by-step download

Note: curl can be used to download from Google Drive directly according to amit-chahar's Gist. File names and file IDs are available from the above table:

fileid=<file id>
filename=<file name>
curl -c ./cookie -s -L "${fileid}" > /dev/null
curl -Lb ./cookie "`awk '/download/ {print $NF}' ./cookie`&id=${fileid}" -o ${filename}
rm cookie

Please cd to the target folder (column Unzipped directory in the table above) before executing curl.

1. Download the Hysia Decoder dependent libraries and unzip them:

# decoder_path is the Unzipped directory from the table above
decoder_path=hysia/core/HysiaDecode
mv hysia-decoder-lib-linux-x86-64.tar.gz "${decoder_path}"
cd "${decoder_path}"
tar xvzf hysia-decoder-lib-linux-x86-64.tar.gz
rm -f hysia-decoder-lib-linux-x86-64.tar.gz
cd -

2. Download the pretrained model weights and unzip them:

tar xvzf weights.tar.gz
# and remove the weights zip
rm -f weights.tar.gz

3. Download object detection data in third-party library and unzip it:

mv object-detection-data.tar.gz third/object_detection
cd third/object_detection
tar xvzf object-detection-data.tar.gz
rm object-detection-data.tar.gz
cd -
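
The three steps above follow the same pattern: place the archive in its target directory, unpack it there, and delete the archive. Below is a minimal Python sketch of that pattern using only the standard library. The function name `unpack_and_remove` is hypothetical and not part of Hysia; the target directories are the ones listed in the table above.

```python
import tarfile
from pathlib import Path


def unpack_and_remove(archive, target_dir):
    """Extract a .tar.gz archive into target_dir, then delete the archive.

    Hypothetical helper that mirrors the mv / tar xvzf / rm steps above.
    Returns the list of member names found in the archive.
    """
    archive = Path(archive)
    target_dir = Path(target_dir)
    # Create the target directory if it does not exist yet
    target_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(target_dir)
    # Remove the archive once its contents are unpacked
    archive.unlink()
    return names
```

For example, running `unpack_and_remove("object-detection-data.tar.gz", "third/object_detection")` from the project root would correspond to step 3.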



Installation

Requirements:

  • Conda
  • Nvidia driver
  • CUDA = 9*
  • g++
  • zlib1g-dev

We recommend installing this V2O platform on a UNIX-like system. The scripts are tested on Ubuntu 16.04 x86-64 with CUDA 9.0 and cuDNN 7.
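
Before installing, it can help to check that the required command-line tools are on your PATH. The sketch below is not part of Hysia. The executable names (`conda`, `nvidia-smi`, `nvcc`, `g++`) are assumptions about how the requirements above usually appear on a system, and `missing_tools` is a hypothetical helper.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    # Assumed executable names: conda for Conda, nvidia-smi for the Nvidia
    # driver, nvcc for the CUDA toolkit, g++ for the compiler.
    # zlib1g-dev is a library package, so it is not checked here.
    missing = missing_tools(["conda", "nvidia-smi", "nvcc", "g++"])
    print("Missing:", ", ".join(missing) if missing else "none")
```

The `which` parameter exists only so the lookup can be replaced in tests; by default it uses `shutil.which`.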

Option 1: Auto-installation

Run the following script:

# Execute this script at project root
bash ./scripts/
cd ..

Option 2: Step-by-step installation

# First, make sure that Conda is set up correctly and that CUDA and
# cuDNN are installed on your system.

# Install Conda virtual environment
conda env create -f environment.yml

conda activate Hysia

export BASE_DIR=${PWD}

# Compile HysiaDecoder
cd "${BASE_DIR}"/hysia/core/HysiaDecode
make clean
# If nvidia driver is higher than 396, set NV_VERSION=<your nvidia major version>
make NV_VERSION=<your nvidia driver major version>

# Build mmdetect
# ROI align op
cd "${BASE_DIR}"/third/
cd mmdet/ops/roi_align
rm -rf build
python build_ext --inplace

# ROI pool op
cd ../roi_pool
rm -rf build
python build_ext --inplace

# NMS op
cd ../nms
make clean
make PYTHON=python

# Initialize Django
# This will prompt some input from you
cd "${BASE_DIR}"/server
python makemigrations restapi
python migrate
python loaddata dlmodels.json
python createsuperuser

python -m grpc_tools.protoc -I . --python_out=. --grpc_python_out=. protos/api2msl.proto

unset BASE_DIR
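
The HysiaDecoder step above asks for the major version of your Nvidia driver. The sketch below shows one way to derive it from a version string such as the one printed by `nvidia-smi --query-gpu=driver_version --format=csv,noheader`. Both function names are hypothetical, and falling back to a plain `make` for drivers at or below 396 is an assumption based on the comment in the steps above.

```python
def driver_major_version(version_string):
    """Return the major version of a driver version string such as '418.87.01'."""
    return int(version_string.strip().split(".")[0])


def make_command(version_string, threshold=396):
    """Build the make command for HysiaDecoder.

    Hypothetical helper: NV_VERSION is passed only when the driver's major
    version is higher than the threshold named in the installation steps.
    Assumption: a plain `make` is enough otherwise.
    """
    major = driver_major_version(version_string)
    if major > threshold:
        return ["make", f"NV_VERSION={major}"]
    return ["make"]
```

The returned list can be passed to `subprocess.run` from inside hysia/core/HysiaDecode.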

* Optional: Rebuild the frontend

You can omit this part as we have provided a pre-built frontend. If the frontend is updated, please run the following:

Option 1: auto-rebuild

cd server/react-build
bash ./

Option 2: Docker
See Run with Docker to build and install.

Option 3: Step-by-step rebuild

cd server/react-front

# Install dependencies
npm i
npm audit fix

# Build static files
npm run-script build

# fix js path
python build

# create a copy of build static files
mkdir -p tmp
cp -r build/* tmp/

# move static folder to static common
mv tmp/*html ../templates/
mv tmp/* ../static/
cp -rfl ../static/static/* ../static/
rm -r ../static/static/

# clear temp
rm -r tmp


Configuration

  • Decode hardware:
    Change the configuration here, at the last line.
    The value can be CPU or GPU:<number> (e.g. GPU:0).
  • ML model running hardware:
    Change the configuration of the model servers under this directory:
    # Custom request servicer
    class Api2MslServicer(api2msl_pb2_grpc.Api2MslServicer):
        def __init__(self):
            os.environ['CUDA_VISIBLE_DEVICES'] = '0'
    Possible values are your device IDs, for example 0 or 0,1.
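
The two value formats above (CPU or GPU:<number> for decoding, and a comma-separated list of device IDs for CUDA_VISIBLE_DEVICES) can be validated with a few lines of Python. This is only an illustrative sketch: `parse_decode_device` and `cuda_visible_devices` are hypothetical names and are not part of the Hysia code base.

```python
def parse_decode_device(value):
    """Parse 'CPU' or 'GPU:<number>' into a (kind, index) tuple.

    Hypothetical helper; raises ValueError for any other format.
    """
    value = value.strip()
    if value == "CPU":
        return ("CPU", None)
    kind, sep, index = value.partition(":")
    if kind == "GPU" and sep and index.isdecimal():
        return ("GPU", int(index))
    raise ValueError(f"expected 'CPU' or 'GPU:<number>', got {value!r}")


def cuda_visible_devices(device_ids):
    """Format device IDs for CUDA_VISIBLE_DEVICES, e.g. [0, 1] becomes '0,1'."""
    return ",".join(str(i) for i in device_ids)
```

For instance, `os.environ['CUDA_VISIBLE_DEVICES'] = cuda_visible_devices([0, 1])` would expose the first two GPUs to a model server.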


Demo

cd server

# Start model server

# Run Django
python runserver

Then open http://localhost:8000 in your browser and log in with username admin and password admin.

Some Useful Tools

  • Large dataset preprocessing
  • Video/audio decoding
  • Model profiling
  • Multimodality data testbed

Todo List

  • Improve models
  • Improve documents
  • CUDA 10 support
  • Docker support
  • Frontend separation
  • A minimal product database


Credits

Here is a list of models that we used in Hysia-V2O.

Models GitHub Repo License
Google Object detection
Scene Recognition
Audio Recognition
Image Retrieval
Face Detection
Face Recognition
Text Detection
Text Recognition

Contribute to Hysia-V2O

You are welcome to open a pull request. We will credit contributions in version 2.0.

About Us


Previous Contributors
