# **The neighbour-based recommendation for XXXXXXX**
---

Various approaches(defers by algorithms) are to be tested and based on the time-lag, one will be finalised.
- The main objective is to increase the user engagement from current average usage of **6 minutes**.
- As each video is of almost 15 secs, then we have to take care of first 24 videos.
- The user will be asked to give genre preference and upon these, recommendations shall occur.
- The first k videos will be clubbed using rules
     - most recent
     - most viewed
     - most liked
     - most commented
- For remaining **(24-k)** videos, the following algorithm are applied
    - k-Nearest Neighbour

In [85]:
class nearest_neighbour_recommender():
    
    #!pip install sklearn,pandas
    #in case you do not have the dependecies, uncomment above line and run.
    
    """
    ---------------------------------
    
    This class will create an object of recommendation algorithm.
    This object will be fitted to the data provided and then,
    the required recommendations will be provided as per the query by
    the object.
    
    Parameters
    ==================================================================
    
    data = The data on which recommendation algorithm will be trained.
    recs = Total recommendations required
    n_jobs = -1 (This is the total processors that can be used for training.)
    
    Procedure
    =================================================================
    
    1. call preprocessing function and pass the data path.
    2. fit function will train the algorithm on data
    3. For recommendations, call recommend and pass the data for which 
    recommendation is in need """        
        
    def __init__(self, recs_reqd = 10,n_jobs =-1):
        self.recs_reqd = 10
        self.n_jobs    = -1
    
    
    def preprocessing(self,data_path):
        
        """
        Parameters
        ===============================================================
        data_path = path of the data stored in csv format.
        
        Functions
        ===============================================================
        1.This function will read the data in .CSV format
        2.It will store the video_names from which recommendation
        will be offered.
        3.It will drop the tags,description and video_name 
        columns so that the remaining data is in numeric form
        for feeding the algorithm.
        4. The missing values are filled with MODE of the relevant columns.
        5. Genres are filtered and the most relevant ones are then one-hot encoded
        6. The original genre column will be dropped to maintain non-redundancy.
        
        """
        
        import pandas as pd
        data = pd.read_csv(data_path)
        self.video_names = data.video_name
        data = data.drop(['tags','description','video_name'],axis=1)
        data.fillna(data.mode().iloc[0], inplace=True)
        gen = [i for i in list(data['genre'].str.lower().replace('#|/|.\\','').str.get_dummies(sep=',').
                       rename(lambda x: '' + x, axis='columns')) if len(i) > 3 and len(i) < 30 ]
        gens = data['genre'].str.lower().replace('#|/|.\\','').str.get_dummies(sep=',').rename(lambda x: '' + x, axis='columns')[gen]
        self.data = pd.concat([data.drop(['genre'],axis=1),gens],axis=1)
        
    
    def fit(self):
        from sklearn.neighbors import NearestNeighbors
        #import pandas as pd
        recommendations_object = NearestNeighbors(n_neighbors=self.recs_reqd,n_jobs=self.n_jobs)
        recommendations_object.fit(self.data)
        self.rec_sys = recommendations_object
        print('training is over. Call recommend function for getting recommendations.')
    
    def recommend(self,data_for_recommend_path):
        """
        Parameters:
        =========================================================================
        data_for_recommend_path = data in .CSV for which recommendation is seeked
        
        Function:
        =========================================================================
        returns the indeices of videos in database recommendations
        """
        recommendations = self.rec_sys.kneighbors(self.preprocessing(data_for_recommend_path),return_distance=False)
        #the below returns the indices of recommended videos.
        return recommendations

In [71]:
data_path = 'data.csv'

In [72]:
recommender = nearest_neighbour_recommender()

In [73]:
recommender.preprocessing(data_path)

In [74]:
recommender.fit()

training is over. Call recommend function for getting recommendations.


In [86]:
recs=recommender.recommend('Tnatan.csv')
#passed the same file for testing.

In [87]:
recs

array([[2082,  132,  117, ...,   34,   96,  258],
       [ 103,  348,  113, ...,  192,  515,    8],
       [ 266,  165,   26, ...,  170,  365,  160],
       ...,
       [3741, 3799, 3785, ..., 3688, 3733, 3855],
       [3779, 3808, 3790, ..., 3689, 3783, 3664],
       [3806, 3676, 3666, ..., 3674, 3665, 3663]])

In [89]:
recs[0]
#recommendations for first video in the test file

array([2082,  132,  117,  355,  323, 2116,  180,   34,   96,  258])