# Problem Statement
    
    Build a content-based video recommender system that, given a query video, predicts the most similar video.

In [None]:
! tar --exclude='./CBVR' --exclude='./Videos' -zcvf .tgz .

## Algorithm :

        Input : query video , set of pre-existing videos , class of each pre-existing video
        ``Training`` :
            Step 1 : Extract features from the pre-existing videos
            Step 2 : Reduce the dimension of the features without losing much information about the video
            Step 3 : With the reduced features and the pre-existing video classes , train a classifier to detect the
            genre of the video
            Step 4 : Index the reduced features to be used for video search
            Step 5 : Save the classifier to be used for genre prediction
        ``Inference`` :
            Step 1 : Extract the features from the query video
            Step 2 : Reduce the dimension of the query video features using the same transformation that was
            fitted during training
            Step 3 : Get the genre of the query video
            Step 4 : Load all indexed features corresponding to the detected genre
            Step 5 : Find the closest video vector among the indexed video vectors
        Output : Map the closest video vector to its video id and return it as the recommendation
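
Training steps 1 and 2 can be sketched with plain NumPy. This is a minimal sketch under stated assumptions, not the code used to build the saved `scalar.pkl` and `pca.pkl` files: the feature matrix is random synthetic data standing in for C3D features, the helper names `fit_scaler` and `fit_pca` are hypothetical, and the 90% variance threshold is only a guess based on the name `pca_90`.

```python
import numpy as np

def fit_scaler(features):
    """Return per-column mean and std (zero std replaced by 1 to avoid division by zero)."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0
    return mean, std

def fit_pca(scaled, variance_to_keep=0.90):
    """Return the principal axes that together explain `variance_to_keep` of the variance."""
    # `scaled` is zero-mean, so the right singular vectors are the principal axes.
    _, singular_values, axes = np.linalg.svd(scaled, full_matrices=False)
    explained = singular_values ** 2
    cumulative_ratio = np.cumsum(explained) / explained.sum()
    n_components = int(np.searchsorted(cumulative_ratio, variance_to_keep) + 1)
    return axes[:n_components]

rng = np.random.default_rng(0)
# Hypothetical stand-in for extracted features: 200 clips x 64 dimensions
features = rng.normal(size=(200, 64)) @ rng.normal(size=(64, 64))

mean, std = fit_scaler(features)
scaled = (features - mean) / std      # the same mean/std must be reused at inference
axes = fit_pca(scaled)
reduced = scaled @ axes.T             # reduced features to index and to train the classifier on
```

At inference the query features would go through the same `mean`, `std` and `axes`, which is what Step 2 of the inference phase asks for.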

### Technical steps to use the content-based recommender system

    load c3d extractor
    load standard scaler
    load pca transformer
    load video classifier
    load indexed videos

    display query video
    read query video
    get the c3d features
    scale the features
    take out principal components
    get the class
    find similar video from the class

In [1]:
from keras.models import load_model
import pickle
import numpy as np
import all_in_one_utils as ao_util

def get_all_intermediates(directory):
    mlp_model = load_model(directory+"/mlp_model.pkl")
    with open(directory+"/scalar.pkl","rb") as scalar_:
        scaler = pickle.load( scalar_)
    ## pca
    with open(directory+"/pca.pkl","rb") as pca_:
        pca_90 = pickle.load( pca_)
    ## text labels
    with open(directory+"/text_labels.pkl","rb") as text_labels_:
        text_labels = pickle.load( text_labels_)
    ## features 
    with open(directory+"/pca_df.pkl","rb") as pca_df_:
        pca_df = pickle.load( pca_df_) 
    return scaler,pca_90,pca_df,mlp_model,text_labels


def get_video_class(video_path):
    query_c3d = ao_util.get_3d_feature(video_path)

    scaled_query = scaler.transform(query_c3d)

    query_video_embdedding = pca_90.transform(scaled_query)

    predicted_class_id = mlp_model.predict_classes(query_video_embdedding,128)

    predicted_class = list(map(lambda x : text_labels[x],predicted_class_id))
    
    predicted_class = max(predicted_class,key=predicted_class.count)
    
    print("The video belongs from {} category".format(predicted_class))

    return query_video_embdedding,predicted_class

def find_similar(video_id,query_video_embdedding):
    # Use the id passed by the caller; video_class and pca_df are still read from notebook globals.
    video_id = int(video_id)

    filtered_embedding = pca_df[(pca_df['label']==video_class) & 
                                (pca_df['video_id']!=video_id)]

    pixel_cols =[val for val in pca_df.columns if val.startswith('pixel')]

    indexed_embedding = filtered_embedding[pixel_cols]

    from scipy.spatial.distance import cdist
    distance = cdist(query_video_embdedding,indexed_embedding)


    n_least = 1
    sorted_idx = np.argsort(distance,axis=1)[:,:n_least]

    clip_to_id = np.apply_along_axis(lambda vid_id : filtered_embedding['video_id'].iloc[vid_id] , 1, sorted_idx)

    from scipy import stats
    repeated = stats.mode(clip_to_id)

    rec_id = np.asarray(repeated[0]).item()  # np.asscalar is deprecated and removed in recent NumPy
    print("recommended video is %d "%rec_id)
    return rec_id

Using TensorFlow backend.


[INFO] C3D model loaded ...


In [4]:
from IPython.display import HTML
def render(rec_id):
    return HTML("""
    <video width="320" height="240" controls>
      <source src="Videos/{}.mp4" type="video/mp4">
    </video>
    """.format(rec_id))

def get_id_from_url(video_url):
    return video_url.split("/")[-1].split(".")[0]

In [2]:
scaler,pca_90,pca_df,mlp_model,text_labels = get_all_intermediates("/home/deep-vision/.cbvr/persistent_files")

In [3]:
text_labels

Index(['FUNNY', 'WISHES', 'MOTIVATIONAL', 'DEVOTIONAL', 'SHAYARI', 'UGC',
       'SONGS'],
      dtype='object')

## Example 1

In [5]:
video_url = 'Videos/12099245.mp4'
render(get_id_from_url(video_url))

In [6]:
query_video_embdedding,video_class = get_video_class(video_url)

video stats  (2966, 400, 400, 3)
total temporal vectors  186
The video belongs from SHAYARI category


In [8]:
rec_id = find_similar(get_id_from_url(video_url),query_video_embdedding)

recommended video is 12119230 


In [9]:
render(rec_id)

##### Observations

Note that the model not only recommended a similar video but can also help detect duplicates.
The recommended video is the same content registered under a different video id.

## Example 2

In [10]:
video_url = 'Videos/12377756.mp4'
render(get_id_from_url(video_url))

In [11]:
query_video_embdedding,video_class = get_video_class(video_url)

video stats  (1831, 480, 264, 3)
total temporal vectors  115
The video belongs from WISHES category


In [12]:
rec_id = find_similar(get_id_from_url(video_url),query_video_embdedding)

recommended video is 12381719 


In [13]:
render(rec_id)

##### Observations

We were able to find a video whose content differs only slightly from the query (the text overlay is in a different language).

## Example 3

In [14]:

video_url = 'Videos/12478453.mp4'
render(get_id_from_url(video_url))

In [15]:
query_video_embdedding,video_class = get_video_class(video_url)

video stats  (1225, 480, 320, 3)
total temporal vectors  77
The video belongs from DEVOTIONAL category


In [16]:
rec_id = find_similar(get_id_from_url(video_url),query_video_embdedding)

recommended video is 12386204 


In [17]:
render(rec_id)

##### Observations

We were able to find a video of the same genre (devotional).

## Example 4

In [18]:
video_url = 'Videos/12471918.mp4'
render(get_id_from_url(video_url))

In [19]:
query_video_embdedding,video_class = get_video_class(video_url)

video stats  (8991, 360, 640, 3)
total temporal vectors  562
The video belongs from FUNNY category


In [20]:
rec_id = find_similar(get_id_from_url(video_url),query_video_embdedding)

recommended video is 12378333 


In [21]:
render(rec_id)

## Example 5

In [22]:
video_url = 'Videos/12129342.mp4'
render(get_id_from_url(video_url))

In [23]:
query_video_embdedding,video_class = get_video_class(video_url)

video stats  (8991, 360, 640, 3)
total temporal vectors  562
The video belongs from FUNNY category


In [24]:
rec_id = find_similar(get_id_from_url(video_url),query_video_embdedding)

recommended video is 12069887 


In [25]:
render(rec_id)

## Example 6

In [26]:
video_url = 'Videos/12124087.mp4'
render(get_id_from_url(video_url))

In [27]:
query_video_embdedding,video_class = get_video_class(video_url)

video stats  (3118, 480, 368, 3)
total temporal vectors  195
The video belongs from UGC category


In [28]:
rec_id = find_similar(get_id_from_url(video_url),query_video_embdedding)

recommended video is 12438292 


In [29]:
render(rec_id)

## How can we measure the quality of recommendations

## Algorithmically, semi-supervised

Currently, I have tried to attack the similarity problem with a strong classifier that predicts the class of the query video.
Once we have the class, we calculate the Euclidean distance between the query video's vectors and the indexed video vectors of that class.
This gives good results, as the examples above show.
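
The distance step can be illustrated on toy data. The sketch below follows the same idea as `find_similar` (nearest indexed clip per query clip, then a majority vote over video ids), but the function name `recommend`, the 2-D vectors and the video ids are made up for illustration.

```python
import numpy as np
from collections import Counter

def recommend(query_clips, indexed_clips, indexed_video_ids):
    """For each query clip find the nearest indexed clip, then vote on the video id."""
    # Pairwise Euclidean distances, shape (n_query_clips, n_indexed_clips)
    diff = query_clips[:, None, :] - indexed_clips[None, :, :]
    distance = np.sqrt((diff ** 2).sum(axis=2))
    nearest = distance.argmin(axis=1)
    # Each query clip votes for the video that owns its nearest indexed clip
    votes = Counter(indexed_video_ids[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy index: two clips from video 101 and two clips from video 202
indexed_clips = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
indexed_video_ids = [101, 101, 202, 202]

# Toy query with three clips, two of which lie close to video 202
query_clips = np.array([[4.9, 5.0], [5.2, 5.1], [0.0, 0.1]])
recommended = recommend(query_clips, indexed_clips, indexed_video_ids)
print(recommended)  # 202
```

Voting over clips makes the result less sensitive to a single clip that happens to resemble an unrelated video.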


We can also find better similar pairs by training a model to predict whether a given pair of videos belongs to the same action or to different actions.
We can use triplet loss to optimise it.
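
For reference, a common form of the triplet loss can be written in a few lines of NumPy. This is a sketch of the standard formulation with squared Euclidean distances and a margin of 1.0 chosen arbitrarily; it is not necessarily the exact loss used in the experiments.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Mean of max(d(a, p) - d(a, n) + margin, 0) with squared Euclidean distances."""
    pos_dist = np.sum((anchor - positive) ** 2, axis=1)
    neg_dist = np.sum((anchor - negative) ** 2, axis=1)
    return np.maximum(pos_dist - neg_dist + margin, 0.0).mean()

anchor = np.array([[0.0, 0.0]])
positive = np.array([[0.0, 1.0]])       # same class as the anchor
easy_negative = np.array([[3.0, 0.0]])  # different class, already far away
hard_negative = np.array([[1.0, 0.0]])  # different class, as close as the positive

easy_loss = triplet_loss(anchor, positive, easy_negative)  # 1 - 9 + 1 < 0, so the loss is 0
hard_loss = triplet_loss(anchor, positive, hard_negative)  # 1 - 1 + 1 = 1
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, so training effort concentrates on the hard triplets.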

I have already experimented with siamese loss as part of the assignments. I tried to learn better embeddings from the videos by training a small CNN with triplet loss. The visualisations of the learnt embeddings can be seen in [Notebook 2](triplet_loss_customised_loss.ipynb).

Once we have a better embedding generated by the triplet loss network, we can use that embedding to find similar videos.

## Unsupervised
### Cold-start attempts and baselines

Simple ranking functions (e.g. the ratio or difference) of the number of upvotes and downvotes a recommendation receives may serve as good baselines for the recommendation ranking problem.
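
A minimal sketch of the two baselines follows. The recommendation names and vote counts are hypothetical, and the ratio is written as a smoothed share of upvotes (an assumption made here to avoid division by zero when there are no downvotes).

```python
def vote_difference(upvotes, downvotes):
    """Net votes: favours recommendations that were shown and liked often."""
    return upvotes - downvotes

def vote_ratio(upvotes, downvotes):
    """Smoothed share of upvotes; the +1/+2 smoothing is an assumption of this sketch."""
    return (upvotes + 1) / (upvotes + downvotes + 2)

# Hypothetical feedback: recommendation -> (upvotes, downvotes)
feedback = {"rec_a": (40, 5), "rec_b": (3, 0), "rec_c": (30, 20)}

by_difference = sorted(feedback, key=lambda r: vote_difference(*feedback[r]), reverse=True)
by_ratio = sorted(feedback, key=lambda r: vote_ratio(*feedback[r]), reverse=True)

print(by_difference)  # ['rec_a', 'rec_c', 'rec_b']
print(by_ratio)       # ['rec_a', 'rec_b', 'rec_c']
```

The two functions disagree on `rec_b` and `rec_c`: the difference rewards volume, while the ratio rewards a high share of positive feedback even when few votes exist.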

## Supervised
We can build a ground-truth dataset of query video and recommendation pairs with their similarity scores. We can then train a model to predict the similarity score.
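
As a rough illustration of this idea, the sketch below fits a linear model on pair features with least squares. Everything in it is an assumption: the embeddings and similarity scores are synthetic, the helper `pair_features` is hypothetical, and a real system would likely use a richer model and human-labelled scores.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for labelled pairs: embeddings of the query and the candidate video
query_emb = rng.normal(size=(500, 8))
candidate_emb = rng.normal(size=(500, 8))
# Synthetic ground truth: similarity falls as the embeddings move apart
true_score = 1.0 - 0.1 * np.abs(query_emb - candidate_emb).sum(axis=1)

def pair_features(a, b):
    """Element-wise absolute difference of the two embeddings plus a bias column."""
    diff = np.abs(a - b)
    return np.hstack([diff, np.ones((len(diff), 1))])

# Fit score ~ features @ weights by ordinary least squares
X = pair_features(query_emb, candidate_emb)
weights, *_ = np.linalg.lstsq(X, true_score, rcond=None)

# Predict similarity for the first five pairs
predicted = pair_features(query_emb[:5], candidate_emb[:5]) @ weights
```

Because the synthetic scores are exactly linear in the features, the fit recovers them almost perfectly; real similarity labels would be noisier.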

