# Video search
This example walks through the code required to build a video search system. A VGG model extracts feature vectors from video frames, and Milvus stores and searches those vectors.
## Data

This example uses 10 animated GIFs to build an end-to-end solution that searches videos by image. Readers can substitute their own video files.

Download location: https://drive.google.com/file/d/1hS4ANTQx9xNr9AByiLVeA1rEnxycdxtZ/view?usp=sharing

## Requirements

| Python Packages | Docker Servers |
| --------------- | -------------- |
| pymilvus        | Milvus-1.0.0   |

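The code below also imports `cv2` (OpenCV), `keras`, `numpy`, and `diskcache`, so these need to be installed as well. We assume familiarity with libraries such as TensorFlow and Keras.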

## Up and Running


### 1. Start Milvus Server

```bash
$  docker run -d --name milvus_cpu_1.0.0 --network my-net --ip 10.0.0.2 \
-p 19530:19530 \
-p 19121:19121 \
-v /home/$USER/milvus/db:/var/lib/milvus/db \
-v /home/$USER/milvus/conf:/var/lib/milvus/conf \
-v /home/$USER/milvus/logs:/var/lib/milvus/logs \
-v /home/$USER/milvus/wal:/var/lib/milvus/wal \
milvusdb/milvus:1.0.0-cpu-d030521-1ea92e
```

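This demo uses Milvus 1.0. Refer to [Install Milvus](https://milvus.io/docs/v1.0.0/milvus_docker-cpu.md) for instructions on installing Milvus with Docker. The `--network my-net --ip 10.0.0.2` options assume that a user-defined Docker network named `my-net` already exists and that its subnet contains that address.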

## Code Overview
### Connecting to Servers

We start by connecting to the Milvus server. In this case the Docker container runs on localhost and listens on the default port.

In [38]:
#Connecting to Milvus

import milvus
milv = milvus.Milvus(host = '127.0.0.1', port = 19530)

### Building Collection and Setting Index

The next step is to create a collection. A collection in Milvus is similar to a table in a relational database and stores all the vectors. To create a collection, we must first choose a name, the dimension of the vectors to be stored, the index_file_size, and the metric_type. The dimension is 512 here to match the length of the VGG16 feature vectors extracted later. The index_file_size sets how large each data segment within the collection can grow. More information on this can be found in the Milvus documentation. The metric_type is the distance formula used to calculate similarity. In this example we use the Euclidean distance (L2).

In [39]:
#Creating collection

import time

collection_name = "test_collection"
milv.drop_collection(collection_name) 

collection_param = {
            'collection_name': collection_name,
            'dimension': 512,
            'index_file_size': 1024,  # optional
            'metric_type': milvus.MetricType.L2  # optional
            }

status, ok = milv.has_collection(collection_name)

if not ok:
    status = milv.create_collection(collection_param)
    print(status)

Status(code=0, message='Create collection successfully!')
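
To make the metric choice concrete, here is a minimal sketch of the Euclidean (L2) formula that the collection is configured with. The function name `l2_distance` and the sample vectors are illustrative assumptions, and the snippet only shows the formula, not how Milvus computes distances internally.

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical 2-dimensional vectors; the real collection stores 512 dimensions.
query = [0.0, 0.0]
candidates = {"near": [1.0, 0.0], "far": [3.0, 4.0]}

# A smaller distance means a more similar vector.
ranked = sorted(candidates, key=lambda name: l2_distance(query, candidates[name]))
print(ranked)
```

With this metric, the best match for a query is the stored vector with the smallest distance.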


After creating the collection we assign it an index type. This can be done before or after inserting the data. When done before, indexes are built as data comes in and fills the data segments. In this example we use IVF_SQ8, which requires the 'nlist' parameter. Each index type carries its own parameters. More information about these parameters can be found in the Milvus documentation.

In [40]:
#Indexing collection

index_param = {
    'nlist': 512
}

status = milv.create_index(collection_name, milvus.IndexType.IVF_SQ8, index_param)
status, index = milv.get_index_info(collection_name)
print(index)


(collection_name='test_collection', index_type=<IndexType: IVF_SQ8>, params={'nlist': 512})
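
The "SQ8" part of IVF_SQ8 refers to scalar quantization, which stores each float component as an 8-bit code to save space. The sketch below illustrates the general idea only. The helper names `sq8_encode` and `sq8_decode` and the fixed value range are assumptions made for this example, not Milvus internals.

```python
def sq8_encode(vector, lo, hi):
    """Map each float in [lo, hi] to an integer code in 0..255."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    scale = (hi - lo) / 255.0
    # Values outside [lo, hi] are clamped to the nearest valid code.
    return [min(255, max(0, round((x - lo) / scale))) for x in vector]

def sq8_decode(codes, lo, hi):
    """Reconstruct approximate floats from 8-bit codes."""
    scale = (hi - lo) / 255.0
    return [lo + c * scale for c in codes]

# Hypothetical vector with components in [0, 1].
vec = [0.0, 0.25, 0.5, 1.0]
codes = sq8_encode(vec, 0.0, 1.0)
approx = sq8_decode(codes, 0.0, 1.0)

# The reconstruction error per component is at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(vec, approx))
print(codes, max_error)
```

The trade-off is a small loss of precision in exchange for roughly a quarter of the storage of 32-bit floats.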


### Processing and Storing Videos

To store the videos in Milvus, we first need to extract frames from each video. Here we use OpenCV for this step.

In [47]:
import cv2
import os
import shutil
save_path = "/data1/lcl/test_video_bootcamp/frame_res/"      # Directory where extracted frames are saved
os.makedirs(save_path, exist_ok=True)
path = "/data1/lcl/test_video_bootcamp/examle-gif-10/10-gif/"    # Directory containing the raw videos

filelist = os.listdir(path)     
print(filelist)     
for item in filelist:  
    if item.endswith('.gif'):     # Match the extension of your own video files; the sample videos are GIFs
        print(item)
        try:
            src = os.path.join(path, item)
            vid_cap = cv2.VideoCapture(src)    
            success, image = vid_cap.read()
            count = 0
            while success:
                vid_cap.set(cv2.CAP_PROP_POS_MSEC, 1 * 1000 * count)   # Seek so that one frame is captured per second; change the multiplier to adjust the interval
                video_to_picture_path = os.path.join(save_path, item.split(".")[0])    # One folder per video, named after the video
                if not os.path.exists(video_to_picture_path):   # Create the folder that holds this video's frames
                    os.makedirs(video_to_picture_path)
                cv2.imwrite(video_to_picture_path + "/" + str(item.split(".")[0]) + "#" + str(count) + ".jpg", image)       # Save the frame as <video name>#<frame index>.jpg
                success, image = vid_cap.read()
                count += 1
            vid_cap.release()
            print('Total frames: ', count)     # Print the number of frames saved
        except Exception as exc:
            print("error:", exc)
            
ALL_PIC = "/data1/lcl/test_video_bootcamp/all_pic"
os.makedirs(ALL_PIC, exist_ok=True)

for root, dirs, files in os.walk(save_path):
    for file in files:
        src_file = os.path.join(root, file)
        shutil.copy(src_file, ALL_PIC)        
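
The frame files follow the pattern `<video name>#<frame index>.jpg`, and that naming is what later lets a search result be traced back to its video. As a small sketch, the hypothetical helper `frame_path` below builds such a path with `os.path.splitext`. Note that this differs from the `item.split(".")[0]` used above, which would truncate a file name that contains extra dots.

```python
import os

def frame_path(frame_root, video_file, index):
    """Return the path where frame `index` of `video_file` would be saved.

    Hypothetical helper: os.path.splitext removes only the final extension,
    so a name such as "my.clip.gif" keeps its full stem "my.clip".
    """
    stem = os.path.splitext(os.path.basename(video_file))[0]
    return os.path.join(frame_root, stem, f"{stem}#{index}.jpg")

# Illustrative file names, not files from the sample data set.
print(frame_path("frame_res", "cat.gif", 0))
print(frame_path("frame_res", "my.clip.gif", 3))
```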

Next, we use the VGG model to extract the vectors from the images we got earlier.

In [48]:
import sys
import os
from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input as preprocess_input_vgg
from keras.preprocessing import image
import numpy as np
from diskcache import Cache
from numpy import linalg as LA

class VGGNet:
    def __init__(self):
        self.input_shape = (224, 224, 3)
        self.weight = 'imagenet'
        self.pooling = 'max'
        self.model_vgg = VGG16(weights=self.weight,
                               input_shape=(self.input_shape[0], self.input_shape[1], self.input_shape[2]),
                               pooling=self.pooling,
                               include_top=False)
        self.model_vgg.predict(np.zeros((1, 224, 224, 3)))

    def vgg_extract_feat(self, img_path):
        img = image.load_img(img_path, target_size=(self.input_shape[0], self.input_shape[1]))
        img = image.img_to_array(img)
        img = np.expand_dims(img, axis=0)
        img = preprocess_input_vgg(img)
        feat = self.model_vgg.predict(img)
        norm_feat = feat[0] / LA.norm(feat[0])
        norm_feat = [i.item() for i in norm_feat]
        return norm_feat
    
def feature_extract(pic_path, model):
    default_cache_dir="./tmp"
    cache = Cache(default_cache_dir)
    feats = []
    names = []
    img_list = [os.path.join(pic_path, f) for f in os.listdir(pic_path) if f.endswith('.jpg')]
    for i, img_path in enumerate(img_list):
        norm_feat = model.vgg_extract_feat(img_path)
        img_name = os.path.split(img_path)[1]
        feats.append(norm_feat)
        names.append(img_name.encode())
        current = i+1
        total = len(img_list)
        cache['current'] = current
        cache['total'] = total
        print ("extracting feature from image No. %d , %d images in total" %(current, total))
    return feats, names


vectors, names = feature_extract(ALL_PIC, VGGNet())
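
`vgg_extract_feat` divides each feature vector by its L2 norm, so every stored vector has unit length. The sketch below uses plain Python and made-up 2-dimensional vectors (the helper name `l2_normalize` is an assumption) to show one consequence: for unit vectors, the squared Euclidean distance equals `2 - 2 * cosine similarity`, so ranking by L2 distance gives the same order as ranking by cosine similarity.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length, as vgg_extract_feat does with LA.norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

# Hypothetical feature vectors; real VGG16 features have 512 components.
a = l2_normalize([3.0, 4.0])
b = l2_normalize([4.0, 3.0])

cosine = sum(x * y for x, y in zip(a, b))
squared_l2 = sum((x - y) ** 2 for x, y in zip(a, b))

# For unit-length vectors the two quantities are linked by this identity.
print(squared_l2, 2 - 2 * cosine)
```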

Insert the image vectors into Milvus, then store each returned vector ID in the cache together with its image name.

In [43]:
from diskcache import Cache

default_cache_dir="./tmp"
cache = Cache(default_cache_dir)

status, ids = milv.insert(collection_name=collection_name, records=vectors)
print(status)

for vector_id, name in zip(ids, names):
    cache[vector_id] = name

Status(code=0, message='Add vectors successfully!')
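
The ID-to-name mapping relies on the returned IDs being in the same order as the inserted vectors, which is what the loop above assumes. The following sketch uses a plain dictionary as a stand-in for the disk cache and adds a length check. The function name `build_id_to_name` and the sample IDs are hypothetical.

```python
def build_id_to_name(ids, names):
    """Pair each vector ID with the frame name inserted at the same position."""
    if len(ids) != len(names):
        raise ValueError("expected one ID per inserted vector")
    return dict(zip(ids, names))

# Made-up IDs; names are stored as bytes, as in feature_extract above.
id_to_name = build_id_to_name([101, 102], [b"clip#0.jpg", b"clip#1.jpg"])
print(id_to_name[102].decode("utf-8"))
```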


### Searching

When searching with an image, we use the same VGG model to extract the feature vector of that image.

In [44]:
def search_feature_extract(pic_path, model):
    feats = []
    norm_feat = model.vgg_extract_feat(pic_path)
    feats.append(norm_feat)
    return feats

embeddings = search_feature_extract("/data1/lcl/test_video_bootcamp/frame_res/tumblr_lhns1x9P9d1qc3h7bo1_400/tumblr_lhns1x9P9d1qc3h7bo1_400#1.jpg",VGGNet())

Then we can use these embeddings in a search. The search requires a few arguments: the name of the collection, the query vectors, the number of nearest vectors to return (top_k), and the parameters for the index, in this case nprobe.

In [49]:
search_sub_param = {
        "nprobe": 16
    }

search_param = {
    'collection_name': collection_name,
    'query_records': embeddings,
    'top_k': 10,
    'params': search_sub_param,
    }

start = time.time()
status, results = milv.search(**search_param)
elapsed = time.time() - start   # Measure only the search call, not the print below
print(status)

print("Search took a total of: ", elapsed)
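
For an IVF index, `nlist` is the number of clusters the vectors are partitioned into, and `nprobe` is how many of those clusters are scanned per query. The toy sketch below illustrates that trade-off in plain Python. The hard-coded centroids, the 2-dimensional vectors, and the function names are assumptions for illustration, and this is not how Milvus implements its index.

```python
import math

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_ivf(vectors, centroids):
    """Assign each vector index to the bucket of its nearest centroid."""
    buckets = {c: [] for c in range(len(centroids))}
    for idx, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(vec, centroids[c]))
        buckets[nearest].append(idx)
    return buckets

def ivf_search(query, vectors, centroids, buckets, nprobe, top_k):
    """Scan only the nprobe buckets whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))[:nprobe]
    candidates = [idx for c in probe for idx in buckets[c]]
    scored = sorted((l2(query, vectors[idx]), idx) for idx in candidates)
    return [(idx, dist) for dist, idx in scored[:top_k]]

# Two clusters (nlist = 2) with centroids fixed by hand for this example.
centroids = [[0.0, 0.0], [10.0, 0.0]]
vectors = [[1.0, 0.0], [4.9, 0.0], [5.1, 0.0], [9.0, 0.0]]
buckets = build_ivf(vectors, centroids)

query = [4.8, 0.0]
fast = ivf_search(query, vectors, centroids, buckets, nprobe=1, top_k=2)
exact = ivf_search(query, vectors, centroids, buckets, nprobe=2, top_k=2)
print([idx for idx, _ in fast], [idx for idx, _ in exact])
```

With `nprobe=1` the search misses vector 2, which sits just across the cluster boundary. Raising `nprobe` recovers it at the cost of scanning more vectors.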

The result of this search contains the IDs and corresponding distances of the top_k closest vectors. We can look up these IDs in the cache to get the frame names, and from those the original videos.

In [50]:
def query_name_from_ids(vids):
    res = []
    cache = Cache(default_cache_dir)
    for i in vids:
        if i in cache:
            res.append(cache[i])
    return res


if status.OK():
    vids = [x.id for x in results[0]]
    res_name = [x.decode('utf-8') for x in query_name_from_ids(vids)]
    res_video_name = []
    for pic_name in res_name:
        video_name = pic_name[0:pic_name.rfind('#', 1)] + ".gif"
        res_video_name.append(video_name)
    print(res_video_name)
    
else:
    print("Search Failed.")
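
Because several frames of the same video can appear among the top_k hits, the printed list may contain repeated video names. One possible refinement, sketched below, keeps the best (smallest) distance per video and ranks the videos by it. The function `rank_videos` and the sample hits are hypothetical, and the sketch assumes each hit has already been turned into a `(frame name, distance)` pair.

```python
def rank_videos(frame_hits):
    """Collapse (frame_name, distance) hits into videos ordered by best distance."""
    best = {}
    for frame_name, distance in frame_hits:
        stem, sep, _ = frame_name.rpartition("#")
        if not sep:
            raise ValueError(f"unexpected frame name: {frame_name!r}")
        video = stem + ".gif"
        # Keep only the closest frame seen so far for each video.
        if video not in best or distance < best[video]:
            best[video] = distance
    return sorted(best.items(), key=lambda item: item[1])

# Made-up hits following the <video name>#<frame index>.jpg naming scheme.
hits = [("a#0.jpg", 0.0), ("b#3.jpg", 0.4), ("a#1.jpg", 0.2)]
print(rank_videos(hits))
```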

This is the basic way to conduct a video search.