# Ted Talks Recommendation System with Machine Learning

Ted talks are a good source to learn and take inspiration from. These days every platform is having a recommendation system to provide a better user experience. Most of the applications collect data to recommend similar content according to the interests of the user. We can use the same strategy to recommend ted talks. So in this article, I will take you through the Ted Talks recommendation system with Machine Learning using Python.

In [1]:
import numpy as np
import pandas as pd
df= pd.read_csv("transcripts.csv")


In [2]:
df

Unnamed: 0,transcript,url
0,Good morning. How are you?(Laughter)It's been ...,https://www.ted.com/talks/ken_robinson_says_sc...
1,"Thank you so much, Chris. And it's truly a gre...",https://www.ted.com/talks/al_gore_on_averting_...
2,"(Music: ""The Sound of Silence,"" Simon & Garfun...",https://www.ted.com/talks/david_pogue_says_sim...
3,If you're here today — and I'm very happy that...,https://www.ted.com/talks/majora_carter_s_tale...
4,"About 10 years ago, I took on the task to teac...",https://www.ted.com/talks/hans_rosling_shows_t...
...,...,...
2462,"So, Ma was trying to explain something to me a...",https://www.ted.com/talks/duarte_geraldino_wha...
2463,This is a picture of a sunset on Mars taken by...,https://www.ted.com/talks/armando_azua_bustos_...
2464,"In my early days as a graduate student, I went...",https://www.ted.com/talks/radhika_nagpal_what_...
2465,I took a cell phone and accidentally made myse...,https://www.ted.com/talks/theo_e_j_wilson_a_bl...


In [3]:
df["transcript"][0]

'Good morning. How are you?(Laughter)It\'s been great, hasn\'t it? I\'ve been blown away by the whole thing. In fact, I\'m leaving.(Laughter)There have been three themes running through the conference which are relevant to what I want to talk about. One is the extraordinary evidence of human creativity in all of the presentations that we\'ve had and in all of the people here. Just the variety of it and the range of it. The second is that it\'s put us in a place where we have no idea what\'s going to happen, in terms of the future. No idea how this may play out.I have an interest in education. Actually, what I find is everybody has an interest in education. Don\'t you? I find this very interesting. If you\'re at a dinner party, and you say you work in education — Actually, you\'re not often at dinner parties, frankly.(Laughter)If you work in education, you\'re not asked.(Laughter)And you\'re never asked back, curiously. That\'s strange to me. But if you are, and you say to somebody, you

In [8]:
df['transcript'] = df['transcript'].str.lower()
df['transcript'] = df['transcript'].str.replace('[^\w\s]','')
df['transcript'] = df['transcript'].str.replace('\n','')
df['transcript'] = df['transcript'].str.replace('\d+','',regex=True)
df['transcript'] = df['transcript'].str.replace('\'','')
df['transcript'] = df['transcript'].str.replace('\r','')
df['transcript'] = df['transcript'].str.replace('\s+',' ')
df['transcript'] = df['transcript'].str.replace('https?://\S+|www\.\S+',' ')
df['transcript'] = df['transcript'].str.replace('<.*?>+',' ')
df['transcript'] = df['transcript'].str.replace('[%s]',' ')


In [5]:
df["title"] =df["url"].map(lambda x:x.split("/")[-1])

In [6]:
del df["url"]

In [9]:
df

Unnamed: 0,transcript,title
0,good morning. how are you?(laughter)its been g...,ken_robinson_says_schools_kill_creativity\n
1,"thank you so much, chris. and its truly a grea...",al_gore_on_averting_climate_crisis\n
2,"(music: ""the sound of silence,"" simon & garfun...",david_pogue_says_simplicity_sells\n
3,if youre here today — and im very happy that y...,majora_carter_s_tale_of_urban_renewal\n
4,"about years ago, i took on the task to teach ...",hans_rosling_shows_the_best_stats_you_ve_ever_...
...,...,...
2462,"so, ma was trying to explain something to me a...",duarte_geraldino_what_we_re_missing_in_the_deb...
2463,this is a picture of a sunset on mars taken by...,armando_azua_bustos_the_most_martian_place_on_...
2464,"in my early days as a graduate student, i went...",radhika_nagpal_what_intelligent_machines_can_l...
2465,i took a cell phone and accidentally made myse...,theo_e_j_wilson_a_black_man_goes_undercover_in...


In [10]:
from textblob import TextBlob
from nltk.stem import PorterStemmer
pr=PorterStemmer()

In [11]:
from sklearn.feature_extraction.text import CountVectorizer

In [12]:
def lemmafn(text):

    words=TextBlob(text).words

    return[pr.stem(word) for word in words]

In [13]:
!python -m textblob.download_corpora

[nltk_data] Downloading package brown to /root/nltk_data...
[nltk_data]   Unzipping corpora/brown.zip.
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.
[nltk_data] Downloading package conll2000 to /root/nltk_data...
[nltk_data]   Unzipping corpora/conll2000.zip.
[nltk_data] Downloading package movie_reviews to /root/nltk_data...
[nltk_data]   Unzipping corpora/movie_reviews.zip.
Finished.


In [14]:
ted_talks = df["transcript"].tolist()

In [15]:
vect = CountVectorizer(ngram_range=(1,2),max_features=10000,analyzer=lemmafn)

In [18]:
x=df['transcript']

In [19]:
x=vect.fit_transform(x)



In [20]:
from sklearn.metrics.pairwise import cosine_similarity

In [21]:
bi_sim = cosine_similarity(x)


In [22]:
def recommend_ted_talks(x):
    return ".".join(df["title"].loc[x.argsort()[-5:-1]])


df["ted_talks_bi"] = [recommend_ted_talks(x) for x in bi_sim]

In [24]:
df["ted_talks_bi"]=df['ted_talks_bi'].str.replace("_", " ").str.upper().str.strip().str.split("\n")

In [28]:
df.sample()


Unnamed: 0,transcript,title,ted_talks_bi
1229,"before march, , i was a photographic retoucher...",becci_manson_re_touching_lives_through_photos\n,[BEN SAUNDERS TO THE SOUTH POLE AND BACK THE H...


In [30]:
df['ted_talks_bi'][0]

['SIR KEN ROBINSON BRING ON THE REVOLUTION',
 '.NAIF AL MUTAWA SUPERHEROES INSPIRED BY ISLAM',
 '.DAN ARIELY ASKS ARE WE IN CONTROL OF OUR OWN DECISIONS',
 '.JIM YONG KIM DOESN T EVERYONE DESERVE A CHANCE AT A GOOD LIFE']