## Extracting User information from Account Link

The account links in the users data frame point to each user's profile page. From that page we can extract the following information.
- Movies watched.
- Number of movies watched.
- Links to the movies the user watched.

Each movie link in turn points to the movie's home page, where we can get the following information.
- Genre of the movie

### The idea here is to get the Genre of the movies that the users like.
### If a user is a fan of Animation movies then his ratings can be important to the business.

In [1]:
import urllib.request as url
from bs4 import BeautifulSoup as bs
import re
import requests
import warnings
warnings.filterwarnings("ignore")
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import time
import random
random.seed(123)

import spacy

In [2]:
data1 = pd.read_csv('users_df_3000.csv')

In [50]:
data1.shape

(3000, 5)

In [51]:
data1.dtypes

accountLink    object
displayName    object
realm          object
userId         object
primary_key     int64
dtype: object

In [52]:
data1.isna().sum()

accountLink    2383
displayName     142
realm             0
userId            0
primary_key       0
dtype: int64

### Working on each column

### accountLink

accountLink has 2383 null values. It is reasonable to impute them with 'noLink', since we don't have any link to the profile of those users.

In [53]:
data1['accountLink'] = data1['accountLink'].fillna('noLink')

In [54]:
data1.accountLink.value_counts()

noLink                2383
/user/id/978066947       1
/user/id/910055258       1
/user/id/884569305       1
/user/id/977398823       1
/user/id/909411483       1
/user/id/978182164       1
/user/id/919591862       1
/user/id/978183467       1
/user/id/978093010       1
/user/id/975094169       1
/user/id/978188507       1
/user/id/925961379       1
/user/id/893821630       1
/user/id/977533762       1
/user/id/260077583       1
/user/id/977810936       1
/user/id/978184643       1
/user/id/892995698       1
/user/id/978196913       1
/user/id/260124198       1
/user/id/978193204       1
/user/id/978139766       1
/user/id/840405713       1
/user/id/978201768       1
/user/id/832760612       1
/user/id/941394100       1
/user/id/978165767       1
/user/id/800423755       1
/user/id/852999764       1
                      ... 
/user/id/978198813       1
/user/id/944344630       1
/user/id/907527398       1
/user/id/978193426       1
/user/id/978179874       1
/user/id/953214840       1
/

There are 617 user profile links (3000 rows minus the 2383 'noLink' rows), so we can gather information about 617 users.
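The count of usable links can be checked directly rather than read off the `value_counts()` output. The sketch below uses a small hypothetical frame named `demo` in place of `users_df_3000.csv`, and assigns the result of `fillna` back to the column instead of using `inplace=True`.

```python
import pandas as pd

# Hypothetical three-row stand-in for the users data frame
demo = pd.DataFrame({"accountLink": ["/user/id/1", None, "/user/id/2"]})

# Replace missing links with the same placeholder used in this notebook
demo["accountLink"] = demo["accountLink"].fillna("noLink")

# Rows that still carry a real profile link
n_links = int((demo["accountLink"] != "noLink").sum())
print(n_links)  # 2
```

On the real data the same expression should agree with the number of rows whose realm is RT.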

### displayName

In [55]:
data1['displayName'] = data1['displayName'].fillna('unknown')

We keep the displayName as it is required by the business.

### realm

In [56]:
data1.realm.value_counts()

Fandango    2383
RT           617
Name: realm, dtype: int64

### userId

In [57]:
data1.userId.nunique()

3000

In [58]:
type(data1.accountLink[9])

str

## Extracting user information from account links

In [59]:
# This is the base URL that must be prepended to the sub-links in the accountLink column
main_link = 'https://www.rottentomatoes.com'

In [60]:
#Function to create links
def createlink(x):
    if x !='noLink':
        x = main_link + x + '/ratings' #https://www.rottentomatoes.com/user/id/958686346/ratings
    return(x)

In [61]:
data1['accountLink'] = data1['accountLink'].apply(lambda x: createlink(x))
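Because the cell above overwrites `accountLink` in place, running it a second time would prepend the base URL again. One possible guard is sketched below. The names `BASE_URL` and `build_ratings_url` are hypothetical and not part of the original notebook; `urljoin` comes from the standard library.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.rottentomatoes.com"  # same value as main_link

def build_ratings_url(sub_link, base=BASE_URL):
    """Return the full ratings-page URL, leaving placeholders and full URLs alone."""
    # 'noLink' rows and links that were already expanded are returned unchanged
    if sub_link == "noLink" or sub_link.startswith("http"):
        return sub_link
    # rstrip avoids a doubled slash if the sub-link ends with '/'
    return urljoin(base, sub_link.rstrip("/") + "/ratings")

once = build_ratings_url("/user/id/958686346")
twice = build_ratings_url(once)
print(once)  # https://www.rottentomatoes.com/user/id/958686346/ratings
```

With this guard, `once` and `twice` are identical, so re-running the cell is harmless.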

## Extracting movie links from user profile page

In [217]:
from tqdm import tqdm #to monitor the progress of apply

In [214]:
# Function to get movie links from user profile
def get_movie_class(x):
    temp1=[]
    if x !='noLink':
        ## read the webpage from url
        html = url.urlopen(x).read()
        time.sleep(2)
        #load the data in beautiful soup specific format
        soup = bs(html, "html.parser")
        temp = soup.findAll("a")
        for i in range(len(temp)):
            if temp[i].has_attr('class'):
                if (temp[i]['class'] == ['ratings__movie-title']):
                    temp1.append(temp[i])
    else:
        temp1.append('noLink')
    
    return(temp1)
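The loop over every anchor tag can also be expressed with BeautifulSoup's `find_all` and its `class_` filter. The snippet below is a sketch on a hypothetical HTML string shaped like the anchors this notebook collects, not on a live page. One difference to keep in mind: `class_` matches any tag whose class list contains the name, whereas the function above requires the class list to be exactly `['ratings__movie-title']`.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; only the first anchor has the class we care about
sample_html = """
<a class="ratings__movie-title" href="/m/the_lion_king_2019" title="The Lion King">The Lion King</a>
<a class="nav" href="/about">About</a>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# Keep only anchors carrying the ratings__movie-title class
links = soup.find_all("a", class_="ratings__movie-title")
titles = [a["title"] for a in links]
hrefs = [a["href"] for a in links]
print(titles, hrefs)
```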

In [218]:
tqdm.pandas()
data1['user_rated_movie_links'] = data1['accountLink'].progress_apply(lambda x: get_movie_class(x))

100%|████████████████████████████████████████████████████████████████████████████| 3000/3000 [4:22:36<00:00,  5.03s/it]
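A run of more than four hours can be lost to a single failed request, because `url.urlopen` raises on network errors. The sketch below shows one way to wrap the fetch so that a failure returns `None` instead of stopping the `progress_apply`. The helper `fetch_html`, its parameters, and the stand-in openers are all hypothetical; the demonstration runs offline.

```python
import io
import time
import urllib.error
import urllib.request

def fetch_html(link, opener=urllib.request.urlopen, retries=2, pause=2.0, timeout=30):
    """Return the page bytes, or None if every attempt fails."""
    for attempt in range(retries + 1):
        try:
            with opener(link, timeout=timeout) as response:
                return response.read()
        except (urllib.error.URLError, TimeoutError):
            if attempt < retries:
                time.sleep(pause)  # wait before the next attempt
    return None

# Offline demonstration with stand-in openers (no network access)
def ok_opener(link, timeout):
    return io.BytesIO(b"<html>ok</html>")

def failing_opener(link, timeout):
    raise urllib.error.URLError("simulated outage")

page = fetch_html("https://example.invalid/profile", opener=ok_opener)
missing = fetch_html("https://example.invalid/profile", opener=failing_opener, pause=0)
print(page, missing)
```

A caller could then treat a `None` page the same way as a 'noLink' row and retry those users later.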


## Deriving the number of movies rated by the user

In [249]:
data1['number_of_movies_rated'] = data1['user_rated_movie_links'].progress_apply(lambda x: 0 if x == ['noLink'] else len(x))

100%|██████████████████████████████████████████████████████████████████████████| 3000/3000 [00:00<00:00, 498945.72it/s]


In [250]:
data1['number_of_movies_rated'].value_counts()

0     2383
1      149
3       70
20      68
2       61
4       35
19      31
18      29
5       26
17      25
7       22
6       19
16      13
14      11
15      11
9       10
13      10
10       9
8        9
12       5
11       4
Name: number_of_movies_rated, dtype: int64

In [253]:
data1.sample(10)

Unnamed: 0,accountLink,displayName,realm,userId,primary_key,user_rated_movie_links,number_of_movies_rated
1571,noLink,Mark,Fandango,c8e256c0-dc26-44cc-a365-c338e07c4206,1571,[noLink],0
267,noLink,Joyce,Fandango,a55f3918-554d-429f-9d75-39549ecd0e89,267,[noLink],0
2278,noLink,Facebook U,Fandango,f9e447d9-68a1-4cea-a779-1537d5a511ae,2278,[noLink],0
2577,noLink,Erick V,Fandango,A69824E1-9669-4B05-A5E5-9B92D224515A,2577,[noLink],0
1453,noLink,Judy A,Fandango,23B2368D-E073-4AA7-BB0B-00632B00C558,1453,[noLink],0
624,noLink,Hyonyang,Fandango,8dc5415f-753d-4f63-a6fe-1827618842c0,624,[noLink],0
2734,noLink,Amanda A,Fandango,E27A5036-8CC1-4742-BC6B-B1179AE937E3,2734,[noLink],0
811,noLink,First L,Fandango,65FA9905-11B7-43C5-80B9-0E37C7200897,811,[noLink],0
2488,noLink,Steven C,Fandango,169307EA-BB48-4F23-A87C-055119B8B624,2488,[noLink],0
1409,noLink,Melanie L,Fandango,B8EDFD8F-B1D7-4855-BED0-A8346F7CA0E7,1409,[noLink],0


## Deriving movie titles from the scraped movie links

In [320]:
data1['movies'] = data1['user_rated_movie_links'].progress_apply(lambda x: [w['title'] for w in x if x != ['noLink']] )

100%|██████████████████████████████████████████████████████████████████████████| 3000/3000 [00:00<00:00, 214133.49it/s]


In [325]:
# Fill the movies column with 'no_movie_rated' for all the users we don't have account links for.
data1['movies'] = data1['movies'].progress_apply(lambda x: 'no_movie_rated' if x == [] else x)

100%|██████████████████████████████████████████████████████████████████████████| 3000/3000 [00:00<00:00, 428821.59it/s]


### Extracting the list of movies watched by the users who rated Lion King

In [282]:
# j is one user's list of movie links; i is a single link inside that list.
# The list comprehension below flattens the nested lists of all users who rated at least one movie
# (data1[data1['number_of_movies_rated'] != 0]['user_rated_movie_links']) into one flat array.
all_movie_links = np.array([i for j in data1[data1['number_of_movies_rated'] != 0]['user_rated_movie_links'] for i in j])

In [283]:
len(all_movie_links)

4928
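The same flattening step can be written with `itertools.chain.from_iterable`, which some readers find easier to follow than a double `for` inside a comprehension. This is a sketch on hypothetical per-user lists (`per_user`), not on the scraped tags.

```python
from itertools import chain

# Hypothetical per-user lists standing in for 'user_rated_movie_links'
per_user = [["a", "b"], ["c"], []]

# Concatenate all inner lists into one flat list, preserving order
flat = list(chain.from_iterable(per_user))
print(flat)       # ['a', 'b', 'c']
print(len(flat))  # 3
```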

In [315]:
all_movie_links[0]

<a class="ratings__movie-title" href="/m/the_lion_king_2019" target="_top" title="The Lion King">The Lion King</a>

In [303]:
def extract_elements(list1):
    # Collect the title and the absolute link of every anchor tag
    d = {'title':[],'link':[]}
    for tag in list1:
        d['title'].append(tag['title'])
        d['link'].append(main_link + tag['href'])

    movies_df = pd.DataFrame(d)
    return movies_df

In [988]:
movies_df = extract_elements(all_movie_links)

In [989]:
movies_df.drop_duplicates(inplace=True)

In [990]:
movies_df.dropna(inplace=True)

In [991]:
movies_df

Unnamed: 0,title,link
0,The Lion King,https://www.rottentomatoes.com/m/the_lion_king...
1,Captain Marvel,https://www.rottentomatoes.com/m/captain_marvel
3,Avengers: Endgame,https://www.rottentomatoes.com/m/avengers_endgame
4,Aquaman,https://www.rottentomatoes.com/m/aquaman_2018
5,Spider-Man: Into the Spider-Verse,https://www.rottentomatoes.com/m/spider_man_in...
6,Zootopia,https://www.rottentomatoes.com/m/zootopia
7,Fantastic Beasts: The Crimes of Grindelwald,https://www.rottentomatoes.com/m/fantastic_bea...
8,Logan,https://www.rottentomatoes.com/m/logan_2017
9,Mission: Impossible -- Fallout,https://www.rottentomatoes.com/m/mission_impos...
10,Norm of the North,https://www.rottentomatoes.com/m/norm_of_the_n...


In [992]:
def get_movie_genre(x):
    try:
        ## read the webpage from url
        html = url.urlopen(x).read()
        time.sleep(2)
        #load the data in beautiful soup specific format
        soup = bs(html, "html.parser")
        temp = soup.findAll("script")
        temp = list(temp[0])
        temp = re.split('genre',temp[0])
        temp = temp[-1].strip()
        temp = re.sub(r"[}\":\]\[]",'',temp)
        return(temp.split(','))
    except Exception:
        # Any fetch or parsing failure yields None instead of aborting the whole run
        return None

In [993]:
get_movie_genre(movies_df.link[0])

['Action & Adventure', 'Animation', 'Drama']
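The regular-expression approach above depends on `genre` being the last key in the first script tag. If that script tag holds a JSON object, which is an assumption here and not something this notebook confirms, the genres could be read with `json.loads` instead. The sketch below works on a hypothetical payload (`sample_script`), and `parse_genres` is an illustrative helper, not part of the original code.

```python
import json

# Hypothetical payload; the real content of the script tag is an assumption
sample_script = (
    '{"@type": "Movie", "name": "The Lion King", '
    '"genre": ["Action & Adventure", "Animation", "Drama"]}'
)

def parse_genres(script_text):
    """Return the genre list from a JSON payload, or None when absent or invalid."""
    try:
        payload = json.loads(script_text)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    genres = payload.get("genre")
    if isinstance(genres, str):
        return [genres]  # a single genre given as plain text
    return genres if isinstance(genres, list) else None

print(parse_genres(sample_script))
```

Reading the key by name would keep working if the site reordered the keys, at the cost of failing when the script is not valid JSON.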

In [994]:
movies_df['movie_genre'] = movies_df['link'].progress_apply(lambda x: get_movie_genre(x))





 92%|██████████████████████████████████████████████████████████████████████▏     | 1328/1439 [1:06:18<06:33,  3.54s/it]

In [995]:
movies_df['movie_genre']

0                  [Action & Adventure, Animation, Drama]
1         [Action & Adventure, Science Fiction & Fantasy]
3       [Action & Adventure, Drama, Science Fiction & ...
4         [Action & Adventure, Science Fiction & Fantasy]
5       [Action & Adventure, Animation, Kids & Family,...
6                 [Action & Adventure, Animation, Comedy]
7       [Action & Adventure, Kids & Family, Science Fi...
8       [Action & Adventure, Drama, Science Fiction & ...
9         [Action & Adventure, Drama, Mystery & Suspense]
10                [Action & Adventure, Animation, Comedy]
11         [Action & Adventure, Comedy, Sports & Fitness]
12      [Action & Adventure, Animation, Drama, Kids & ...
13                            [Action & Adventure, Drama]
14         [Action & Adventure, Animation, Kids & Family]
15      [Action & Adventure, Drama, Science Fiction & ...
16        [Action & Adventure, Science Fiction & Fantasy]
17        [Action & Adventure, Science Fiction & Fantasy]
18      [Actio

In [996]:
movies_df.shape

(1439, 3)

In [998]:
movies_df[movies_df['title'].duplicated()]

Unnamed: 0,title,link,movie_genre
12,The Lion King,https://www.rottentomatoes.com/m/the_lion_king,"[Action & Adventure, Animation, Drama, Kids & ..."
408,Godzilla,https://www.rottentomatoes.com/m/godzilla,"[Action & Adventure, Horror, Science Fiction &..."
462,Dumbo,https://www.rottentomatoes.com/m/dumbo_2019,"[Animation, Kids & Family, Science Fiction & F..."
586,Aladdin,https://www.rottentomatoes.com/m/1042582-aladdin,"[Action & Adventure, Animation, Comedy, Kids &..."
628,Child's Play,https://www.rottentomatoes.com/m/childs_play,"[Horror, Mystery & Suspense]"
810,Peter Rabbit,https://www.rottentomatoes.com/m/peter_rabbit_...,"[Action & Adventure, Animation, Comedy]"
1422,Halloween,https://www.rottentomatoes.com/m/1009113-hallo...,"[Horror, Mystery & Suspense]"
1798,Escape Room,https://www.rottentomatoes.com/m/escape_room_2017,[Horror]
2533,Death Note,https://www.rottentomatoes.com/m/death_note_2007,"[Action & Adventure, Animation, Anime & Manga,..."
2950,Murder on the Orient Express,https://www.rottentomatoes.com/m/murder_on_the...,[Mystery & Suspense]


In [999]:
duplicated_movie_ids = movies_df[movies_df['title'].duplicated()].index

In [1000]:
movies_df.drop(duplicated_movie_ids,inplace=True)

In [1001]:
movies_df.shape

(1423, 3)
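Dropping rows by duplicated title keeps the first link for each title and discards any other film that happens to share it, such as two releases of The Lion King. That makes the later title-based genre lookup unambiguous, but it is a choice worth seeing on a small example. The frame `demo` below is hypothetical.

```python
import pandas as pd

# Two different pages that share a title, plus one exact repeat (hypothetical rows)
demo = pd.DataFrame({
    "title": ["The Lion King", "The Lion King", "Zootopia", "Zootopia"],
    "link": ["/m/the_lion_king_2019", "/m/the_lion_king", "/m/zootopia", "/m/zootopia"],
})

by_title = demo.drop_duplicates(subset="title")  # keeps the first row per title
by_link = demo.drop_duplicates(subset="link")    # keeps one row per distinct page

print(len(by_title), len(by_link))  # 2 3
```

Deduplicating on `link` would preserve both releases, but the genre lookup would then need to match on link rather than on title.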

In [1004]:
bkp_movies_df = movies_df.copy()

In [1060]:
fav_genre1=[]
fav_genre2=[]
fav_genre3=[]

In [1061]:
def get_top_three_genre(a):
    if a != 'no_movie_rated':
        # One genre list (or None) for every rated movie that is present in movies_df
        temp = [list(movies_df.loc[movies_df.title == x,'movie_genre']) for x in a]
        temp = [x for xs in temp for x in xs]
        # Drop every movie whose genre could not be scraped, then flatten to single genres
        temp = [x for xs in temp if xs is not None for x in xs]
        # value_counts sorts by frequency, so the first keys are the most common genres
        temp = dict(pd.Series(temp).value_counts())
        keys = list(temp.keys())
        # Note: this assumes every user has at least three distinct genres
        fav_genre1.append(keys[0])
        fav_genre2.append(keys[1])
        fav_genre3.append(keys[2])
    else:
        fav_genre1.append('no_movie_rated')
        fav_genre2.append('no_movie_rated')
        fav_genre3.append('no_movie_rated')

In [1062]:
data1.movies.apply(lambda x: get_top_three_genre(x))

0       None
1       None
2       None
3       None
4       None
5       None
6       None
7       None
8       None
9       None
10      None
11      None
12      None
13      None
14      None
15      None
16      None
17      None
18      None
19      None
20      None
21      None
22      None
23      None
24      None
25      None
26      None
27      None
28      None
29      None
        ... 
2970    None
2971    None
2972    None
2973    None
2974    None
2975    None
2976    None
2977    None
2978    None
2979    None
2980    None
2981    None
2982    None
2983    None
2984    None
2985    None
2986    None
2987    None
2988    None
2989    None
2990    None
2991    None
2992    None
2993    None
2994    None
2995    None
2996    None
2997    None
2998    None
2999    None
Name: movies, Length: 3000, dtype: object

In [1063]:
print(len(fav_genre1))
print(len(fav_genre2))
print(len(fav_genre3))

3000
3000
3000
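The three module-level lists work, but they grow every time the `apply` cell is re-run. A variant that returns the result instead of appending to shared state is sketched below, using `collections.Counter`. The function `top_three_genres` and the choice to pad short results with 'no_movie_rated' are assumptions for illustration, not the notebook's method.

```python
from collections import Counter

def top_three_genres(genre_lists, filler="no_movie_rated"):
    """Return the three most frequent genres, padded with `filler` if fewer exist."""
    counts = Counter(
        genre
        for genres in genre_lists
        if genres is not None      # skip movies whose genre lookup failed
        for genre in genres
    )
    top = [name for name, _ in counts.most_common(3)]
    return tuple(top + [filler] * (3 - len(top)))

# Hypothetical genre lists for one user's rated movies
rated = [
    ["Action & Adventure", "Animation", "Drama"],
    None,
    ["Animation", "Drama"],
    ["Animation"],
]
print(top_three_genres(rated))  # ('Animation', 'Drama', 'Action & Adventure')
```

Because the function returns a tuple, its output could be expanded into three columns in one step, and a user with fewer than three genres no longer raises an `IndexError`.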


In [1064]:
pd.concat([pd.Series(fav_genre1), pd.Series(fav_genre2),pd.Series(fav_genre3)], axis=1)

Unnamed: 0,0,1,2
0,no_movie_rated,no_movie_rated,no_movie_rated
1,no_movie_rated,no_movie_rated,no_movie_rated
2,no_movie_rated,no_movie_rated,no_movie_rated
3,no_movie_rated,no_movie_rated,no_movie_rated
4,no_movie_rated,no_movie_rated,no_movie_rated
5,no_movie_rated,no_movie_rated,no_movie_rated
6,no_movie_rated,no_movie_rated,no_movie_rated
7,no_movie_rated,no_movie_rated,no_movie_rated
8,no_movie_rated,no_movie_rated,no_movie_rated
9,Action & Adventure,Animation,Science Fiction & Fantasy


In [1065]:
bkp_data1 = data1.copy()

In [1066]:
data1['fav_genre1'] = fav_genre1

In [1067]:
data1['fav_genre2'] = fav_genre2

In [1068]:
data1['fav_genre3'] = fav_genre3

In [1069]:
bkp_data1 = data1.copy()

In [1070]:
data1.sample()

Unnamed: 0,accountLink,displayName,realm,userId,primary_key,user_rated_movie_links,number_of_movies_rated,movies,fav_genre1,fav_genre2,fav_genre3
2492,https://www.rottentomatoes.com/user/id/9759011...,unknown,RT,975901100,2492,"[<a class=""ratings__movie-title"" href=""/m/the_...",19,"[The Lion King, Toy Story 4, Aladdin, Wonder P...",Kids & Family,Comedy,Animation


In [1071]:
data1.to_csv('users_df_3000_fav_genres.csv',index=False)
movies_df.to_csv('movies_df.csv',index=False)
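One thing to keep in mind when these files are loaded again: CSV has no list type, so a column such as `movie_genre` is written as text and comes back as a string. The sketch below shows the round trip on a hypothetical one-row frame and one possible way to restore the lists with `ast.literal_eval`. The converter `to_list` is an illustrative helper.

```python
import ast
import io

import pandas as pd

# Hypothetical one-row stand-in for movies_df
demo = pd.DataFrame({
    "title": ["The Lion King"],
    "movie_genre": [["Action & Adventure", "Animation", "Drama"]],
})

# Write to an in-memory buffer instead of a file on disk
buffer = io.StringIO()
demo.to_csv(buffer, index=False)

buffer.seek(0)
raw = pd.read_csv(buffer)
print(type(raw.loc[0, "movie_genre"]))  # <class 'str'>

def to_list(cell):
    # Empty cells (movies without a scraped genre) become None
    return ast.literal_eval(cell) if cell else None

buffer.seek(0)
restored = pd.read_csv(buffer, converters={"movie_genre": to_list})
print(restored.loc[0, "movie_genre"])
```

The same approach does not apply to `user_rated_movie_links`, whose cells are HTML tags rather than Python literals.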