TED Talks are a good source of learning and inspiration. These days almost every platform uses a recommendation system to provide a better user experience, and most applications collect data so that they can recommend content matching a user's interests. We can apply a similar idea to recommending TED Talks. So in this project, I will take you through a TED Talks recommendation system with Machine Learning using Python.

A TED Talks recommendation system has to be based purely on content rather than on user data. A user generally watches videos on YouTube and other applications to be entertained, but watches TED Talks for inspiration, so user data has no role to play here.

To recommend TED Talks to a user, we need to create a content-based recommendation system, where talks are recommended based on the content of the talk that the user watched earlier. To create such a system, we can use cosine similarity, which measures how closely two vectors point in the same direction regardless of their length.
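
As a minimal sketch of the idea, the snippet below computes cosine similarity by hand with NumPy. The three vectors are hypothetical word counts made up for illustration (they are not taken from the dataset), and `cosine_sim` is just a helper name used here.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical word counts over the vocabulary ["climate", "energy", "music"]
talk_a = [3, 1, 0]
talk_b = [6, 2, 0]   # same mix of words as talk_a, but a longer talk
talk_c = [0, 0, 4]   # shares no words with talk_a

sim_ab = cosine_sim(talk_a, talk_b)
sim_ac = cosine_sim(talk_a, talk_c)
print(round(sim_ab, 3), round(sim_ac, 3))
```

Because only the direction of the vectors matters, a long talk and a short talk that use the same mix of words score close to 1, while talks with no words in common score 0.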

### TED Talks Recommendation System with Machine Learning

The dataset that I will be using to create this recommendation system contains the transcripts of all the audio and video TED Talks uploaded at TED.com. Let’s start by importing the necessary Python libraries and the dataset:

In [1]:
import numpy as np
import pandas as pd
data = pd.read_csv(r"C:\Users\SHREE\Downloads\Python CODES\Ted Talks Recommendation System with Machine Learning\ted_talks.csv")
print(data.head())

                                          transcript  \
0  Good morning. How are you?(Laughter)It's been ...   
1  Thank you so much, Chris. And it's truly a gre...   
2  (Music: "The Sound of Silence," Simon & Garfun...   
3  If you're here today — and I'm very happy that...   
4  About 10 years ago, I took on the task to teac...   

                                                 url  
0  https://www.ted.com/talks/ken_robinson_says_sc...  
1  https://www.ted.com/talks/al_gore_on_averting_...  
2  https://www.ted.com/talks/david_pogue_says_sim...  
3  https://www.ted.com/talks/majora_carter_s_tale...  
4  https://www.ted.com/talks/hans_rosling_shows_t...  


The dataset contains the transcript of each TED Talk and the URL of that talk. To continue with this dataset, I will create a new column named title by taking the last part of each URL:

In [2]:
data["title"] = data["url"].map(lambda x:x.split("/")[-1])
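
To see what this split produces, here is a small sketch on two hypothetical URLs in the same shape as the `url` column. The trailing newline on the first URL is an assumption, included only to show why a `strip()` can be useful before splitting; the names `urls`, `slugs` and `readable` are illustrative.

```python
import pandas as pd

# Hypothetical URLs (not from the dataset); the trailing "\n" is an assumption
urls = pd.Series([
    "https://www.ted.com/talks/example_speaker_on_an_example_topic\n",
    "https://www.ted.com/talks/another_speaker_another_topic",
])

# Remove surrounding whitespace, then keep the part after the last "/"
slugs = urls.str.strip().str.split("/").str[-1]

# Optional: turn the slug into a readable title
readable = slugs.str.replace("_", " ").str.title()
print(readable.tolist())
```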

As I stated at the beginning, this recommender system has to be based purely on content rather than on user data. So here I will first convert the transcripts into TF-IDF vectors, once using unigrams and bigrams and once using unigrams only, and then use cosine similarity to measure the similarity between different TED Talks:

In [3]:
from sklearn.feature_extraction import text
ted_talks = data["transcript"].tolist()
# Unigrams and bigrams. The input parameter is left at its default ("content"),
# which expects the raw strings passed to fit_transform.
bi_tfidf = text.TfidfVectorizer(stop_words="english", ngram_range=(1, 2))
bi_matrix = bi_tfidf.fit_transform(ted_talks)

# Unigrams only
uni_tfidf = text.TfidfVectorizer(stop_words="english")
uni_matrix = uni_tfidf.fit_transform(ted_talks)

from sklearn.metrics.pairwise import cosine_similarity
bi_sim = cosine_similarity(bi_matrix)
uni_sim = cosine_similarity(uni_matrix)
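
The difference between the two vectorizers is easier to see on a tiny corpus. The sketch below uses three made-up sentences (not from the dataset) and illustrative variable names prefixed with `toy_`.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Three made-up "transcripts" used only for illustration
docs = [
    "climate change is a global crisis",
    "we can solve the climate crisis with clean energy",
    "music and laughter make a good morning",
]

toy_uni = TfidfVectorizer(stop_words="english")
toy_bi = TfidfVectorizer(stop_words="english", ngram_range=(1, 2))

toy_uni_matrix = toy_uni.fit_transform(docs)
toy_bi_matrix = toy_bi.fit_transform(docs)

# The bigram vocabulary holds every unigram plus pairs of adjacent words
print(len(toy_uni.vocabulary_), len(toy_bi.vocabulary_))

# Pairwise similarities: entry [i, j] compares document i with document j
toy_sim = cosine_similarity(toy_uni_matrix)
print(toy_sim.round(2))
```

The first two sentences share the words "climate" and "crisis", so their similarity is above zero, while the third sentence shares no non-stop words with the first and scores 0 against it.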

Now the last step is to create a Python function that recommends TED Talks based on their content. So let’s define a Python function and have a look at some recommendations:

In [4]:
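def recommend_ted_talks(x):
    # x is one row of a similarity matrix. argsort() is ascending, so the last
    # index is normally the talk itself; [-5:-1] keeps the four talks before it.
    return ".".join(data["title"].loc[x.argsort()[-5:-1]])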
    
data["ted_talks_uni"] = [recommend_ted_talks(x) for x in uni_sim]
data["ted_talks_bi"] = [recommend_ted_talks(x) for x in bi_sim]
print(data['ted_talks_uni'].str.replace("_", " ").str.upper().str.strip().str.split("\n")[1])

['RORY BREMNER S ONE MAN WORLD SUMMIT', '.ALICE BOWS LARKIN WE RE TOO LATE TO PREVENT CLIMATE CHANGE HERE S HOW WE ADAPT', '.TED HALSTEAD A CLIMATE SOLUTION WHERE ALL SIDES CAN WIN', '.AL GORE S NEW THINKING ON THE CLIMATE CRISIS']


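For the talk at index 1 (a talk by Al Gore), three of the four recommendations have climate in the title, which suggests the system is matching talks on their content. You can follow the same strategy while creating other content-based recommendation systems.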
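
If you would rather look up recommendations for one talk by its title than store them in new columns, one possible variation is sketched below. It runs on a tiny made-up dataset so that it is self-contained; `toy`, `recommend_by_title` and the parameter `n` are hypothetical names and not part of the project code above.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Tiny stand-in for the real dataset; titles and transcripts are made up
toy = pd.DataFrame({
    "title": ["climate_crisis", "clean_energy", "morning_music", "climate_adaptation"],
    "transcript": [
        "the climate crisis is warming the planet",
        "clean energy can slow the climate crisis",
        "music and laughter every morning",
        "cities must adapt as the climate keeps warming",
    ],
})

toy_matrix = TfidfVectorizer(stop_words="english").fit_transform(toy["transcript"])
toy_sim = cosine_similarity(toy_matrix)

def recommend_by_title(title, titles, sim, n=2):
    """Return the n titles most similar to `title`, most similar first."""
    idx = np.flatnonzero((titles == title).to_numpy())[0]  # position of the query talk
    order = np.argsort(sim[idx])[::-1]                     # positions by descending similarity
    order = [i for i in order if i != idx][:n]             # drop the query talk explicitly
    return titles.iloc[order].tolist()

result = recommend_by_title("climate_crisis", toy["title"], toy_sim)
print(result)
```

Dropping the query talk by position, instead of relying on it being last after sorting, keeps the function correct even if another talk happens to have a similarity of 1 with the query.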

## Summary

In this project, I showed you how to create a content-based recommender system for recommending TED Talks to a user. To find the similarity between different talks, I used TF-IDF vectors of the transcripts together with cosine similarity.