## Introduction

Netflix is an American subscription-based service offering online streaming from a library of films and television series, including those produced in-house. Similarly, the Disney Plus service distributes films and television series produced by The Walt Disney Studios and Walt Disney Television, and it is one of Netflix's biggest competitors.

Due to Covid, usage of over-the-top (OTT) streaming platforms has increased tremendously, as cinemas were shut down and more people were staying home.

## Problem Statement

As more people shift towards OTT platforms, we would like to gain more insight into the similarities and differences between two of the most competitive online streaming platforms: Netflix and Disney Plus. This analysis would help users understand which platform is more suitable for them and hence which one to choose.

**This notebook scrapes Disney Plus posts from [Reddit](https://www.reddit.com/r/disneyplus/).**


### Contents

- [Scrape Sample Posts](#Scrape-Sample-Posts)
- [Scrape Sufficient Data](#Scrape-Sufficient-Data)
- [Export To CSV](#Export-To-CSV)

In [1]:
import pandas as pd
import numpy as np
import requests 
import random
import time

### Scrape Sample Posts

First, we will fetch sample posts from the Disney Plus subreddit to understand the JSON data retrieved.

In [2]:
url = 'https://www.reddit.com/r/DisneyPlus.json'

In [3]:
res = requests.get(url, headers={'User-agent': 'Arti 1.0 Inc'})

We will check the status of the request made to fetch data. 

In [4]:
res.status_code

200
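
The status check above can also be wrapped in a small helper so that a failed request stops the notebook early instead of passing a bad response on to later cells. The sketch below is only an illustration: `listing_from_response` and `FakeResponse` are hypothetical names that are not part of the notebook, and the helper only assumes an object exposing `status_code` and `json()`, as the `res` object above does.

```python
def listing_from_response(res):
    """Return the decoded JSON body, or raise if the request did not succeed."""
    if res.status_code != 200:
        raise RuntimeError(f"Request failed with status {res.status_code}")
    return res.json()


class FakeResponse:
    """Hypothetical stand-in for a response object, so the sketch runs offline."""

    def __init__(self, status_code, payload):
        self.status_code = status_code
        self._payload = payload

    def json(self):
        return self._payload


# A successful response yields its JSON payload.
ok_listing = listing_from_response(FakeResponse(200, {"kind": "Listing"}))

# A non-200 response raises instead of returning data.
try:
    listing_from_response(FakeResponse(429, {}))
    failed = False
except RuntimeError:
    failed = True
```

In the notebook itself the call would be `listing_from_response(res)`, using the `res` returned by `requests.get`.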

In [5]:
reddit_dict = res.json()

In [6]:
print(reddit_dict)

{'kind': 'Listing', 'data': {'modhash': '', 'dist': 26, 'children': [{'kind': 't3', 'data': {'approved_at_utc': None, 'subreddit': 'DisneyPlus', 'selftext': '[Join the Cordcutting &amp; Streaming TV Discord](https://discord.gg/rJxX2cw5Eg)\n\n**The Disney+ Discord Server recently expanded.** We now include channels for other streaming services, including Netflix, Hulu, and Paramount+. To gain access to these channels, you will use our #role-selector channel. Our expansion also invites a new name: **Cordcutting &amp; Streaming TV.**\n\nWe started this discord server in August of 2019, just before the Disney+ launch. Since that time, we have seen our server expand to nearly 2,000 individuals. We have had great conversations about the service, including troubleshooting and content like *The Mandalorian* and *WandaVision.* This Discord is a merger of our current server and those run by partner subreddits. We realize that most of us subscribe to multiple streaming services. A Discord dedicat

The output above shows that the request returns data in JSON format. <br>
We will first list the keys in this data and then examine the values linked to them, so that we can locate the exact Disney Plus post data we need.

In [7]:
reddit_dict.keys()

dict_keys(['kind', 'data'])

In [8]:
reddit_dict['kind']

'Listing'

In [9]:
reddit_dict['data']

{'modhash': '',
 'dist': 26,
 'children': [{'kind': 't3',
   'data': {'approved_at_utc': None,
    'subreddit': 'DisneyPlus',
    'selftext': '[Join the Cordcutting &amp; Streaming TV Discord](https://discord.gg/rJxX2cw5Eg)\n\n**The Disney+ Discord Server recently expanded.** We now include channels for other streaming services, including Netflix, Hulu, and Paramount+. To gain access to these channels, you will use our #role-selector channel. Our expansion also invites a new name: **Cordcutting &amp; Streaming TV.**\n\nWe started this discord server in August of 2019, just before the Disney+ launch. Since that time, we have seen our server expand to nearly 2,000 individuals. We have had great conversations about the service, including troubleshooting and content like *The Mandalorian* and *WandaVision.* This Discord is a merger of our current server and those run by partner subreddits. We realize that most of us subscribe to multiple streaming services. A Discord dedicated to each stre

The data key in the response again contains a dictionary. We will do the same as before: list its keys and examine the values associated with them.

In [10]:
reddit_dict['data'].keys()

dict_keys(['modhash', 'dist', 'children', 'after', 'before'])

In [11]:
reddit_dict['data']['children']

[{'kind': 't3',
  'data': {'approved_at_utc': None,
   'subreddit': 'DisneyPlus',
   'selftext': '[Join the Cordcutting &amp; Streaming TV Discord](https://discord.gg/rJxX2cw5Eg)\n\n**The Disney+ Discord Server recently expanded.** We now include channels for other streaming services, including Netflix, Hulu, and Paramount+. To gain access to these channels, you will use our #role-selector channel. Our expansion also invites a new name: **Cordcutting &amp; Streaming TV.**\n\nWe started this discord server in August of 2019, just before the Disney+ launch. Since that time, we have seen our server expand to nearly 2,000 individuals. We have had great conversations about the service, including troubleshooting and content like *The Mandalorian* and *WandaVision.* This Discord is a merger of our current server and those run by partner subreddits. We realize that most of us subscribe to multiple streaming services. A Discord dedicated to each streamer fragments the fandom. This expansion now

The children key in data contains all the post data that we will need for our analysis.<br>

A single request retrieves 26 posts from Reddit.

In [12]:
len(reddit_dict['data']['children'])

26
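
The nesting explored so far (a listing holding `data`, which holds `children`, each holding its own `data`) can be captured in one small function. This is a minimal sketch rather than part of the original notebook: `extract_posts` is a hypothetical name, the sample dictionary is a made-up miniature of the real response, and filtering on `kind == 't3'` assumes that posts are always marked `t3`, as they are in the output above.

```python
def extract_posts(listing):
    """Return the post dictionaries stored under data -> children -> data."""
    children = listing.get("data", {}).get("children", [])
    # Keep only entries whose kind is 't3' (assumed here to mark a post).
    return [child["data"] for child in children if child.get("kind") == "t3"]


# Hypothetical miniature listing with the same shape as the real response.
sample_listing = {
    "kind": "Listing",
    "data": {
        "dist": 2,
        "children": [
            {"kind": "t3", "data": {"name": "t3_aaa", "title": "First post"}},
            {"kind": "t3", "data": {"name": "t3_bbb", "title": "Second post"}},
        ],
        "after": "t3_bbb",
        "before": None,
    },
}

sample_posts = extract_posts(sample_listing)
titles = [post["title"] for post in sample_posts]
```

Applied to the real response, `extract_posts(reddit_dict)` should return one dictionary per post.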

In [13]:
reddit_dict['data']['children'][1]

{'kind': 't3',
 'data': {'approved_at_utc': None,
  'subreddit': 'DisneyPlus',
  'selftext': '',
  'author_fullname': 't2_8wpjm',
  'saved': False,
  'mod_reason_title': None,
  'gilded': 0,
  'clicked': False,
  'title': 'End credit of Raya &amp; the last dragon 😂',
  'link_flair_richtext': [{'a': ':WORLD:',
    'e': 'emoji',
    'u': 'https://emoji.redditmedia.com/5ovbb4a6c8t41_t5_r0hux/WORLD'},
   {'e': 'text', 't': ' Global'}],
  'subreddit_name_prefixed': 'r/DisneyPlus',
  'hidden': False,
  'pwls': 6,
  'link_flair_css_class': '',
  'downs': 0,
  'thumbnail_height': 59,
  'top_awarded_type': None,
  'hide_score': False,
  'name': 't3_m1oo4o',
  'quarantine': False,
  'link_flair_text_color': 'dark',
  'upvote_ratio': 0.98,
  'author_flair_background_color': None,
  'subreddit_type': 'public',
  'ups': 265,
  'total_awards_received': 1,
  'media_embed': {},
  'thumbnail_width': 140,
  'author_flair_template_id': None,
  'is_original_content': False,
  'user_reports': [],
  'secure

Each post entry contains two keys, kind and data. The data key holds the post information.

In [14]:
reddit_dict['data']['children'][1].keys()

dict_keys(['kind', 'data'])

In [15]:
reddit_dict['data']['children'][1]['kind']

't3'

In [16]:
reddit_dict['data']['children'][1]['data']

{'approved_at_utc': None,
 'subreddit': 'DisneyPlus',
 'selftext': '',
 'author_fullname': 't2_8wpjm',
 'saved': False,
 'mod_reason_title': None,
 'gilded': 0,
 'clicked': False,
 'title': 'End credit of Raya &amp; the last dragon 😂',
 'link_flair_richtext': [{'a': ':WORLD:',
   'e': 'emoji',
   'u': 'https://emoji.redditmedia.com/5ovbb4a6c8t41_t5_r0hux/WORLD'},
  {'e': 'text', 't': ' Global'}],
 'subreddit_name_prefixed': 'r/DisneyPlus',
 'hidden': False,
 'pwls': 6,
 'link_flair_css_class': '',
 'downs': 0,
 'thumbnail_height': 59,
 'top_awarded_type': None,
 'hide_score': False,
 'name': 't3_m1oo4o',
 'quarantine': False,
 'link_flair_text_color': 'dark',
 'upvote_ratio': 0.98,
 'author_flair_background_color': None,
 'subreddit_type': 'public',
 'ups': 265,
 'total_awards_received': 1,
 'media_embed': {},
 'thumbnail_width': 140,
 'author_flair_template_id': None,
 'is_original_content': False,
 'user_reports': [],
 'secure_media': None,
 'is_reddit_media_domain': True,
 'is_meta'

In [17]:
reddit_dict['data']['children'][1]['data']['subreddit']

'DisneyPlus'

The subreddit value above, DisneyPlus, is our target.

Below are the text fields that will mainly be used for our analysis.

In [18]:
reddit_dict['data']['children'][1]['data']['title']

'End credit of Raya &amp; the last dragon 😂'

In [19]:
reddit_dict['data']['children'][1]['data']['selftext']

''
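
The title above contains the HTML entity `&amp;` rather than a plain `&`, so the text fields may need unescaping before any word-level analysis. The sketch below shows one way to do that with the standard library's `html.unescape`; `post_text` is a hypothetical helper that is not part of the notebook, and joining title and body with a space is an assumption about how the text might later be used.

```python
import html


def post_text(post):
    """Combine the title and body of a post, converting HTML entities to characters."""
    title = html.unescape(post.get("title") or "")
    body = html.unescape(post.get("selftext") or "")
    # strip() removes the trailing space left when the body is empty.
    return f"{title} {body}".strip()


# Same title and empty selftext as the post inspected above.
sample_post = {"title": "End credit of Raya &amp; the last dragon 😂", "selftext": ""}
combined = post_text(sample_post)
```

Posts with an empty `selftext`, like this one, contribute only their title.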

Collect the information for all posts into a single dataframe.

In [20]:
posts = [p['data'] for p in reddit_dict['data']['children']]

In [21]:
pd.DataFrame(posts)

Unnamed: 0,approved_at_utc,subreddit,selftext,author_fullname,saved,mod_reason_title,gilded,clicked,title,link_flair_richtext,...,created_utc,num_crossposts,media,is_video,url_overridden_by_dest,is_gallery,media_metadata,gallery_data,crosspost_parent_list,crosspost_parent
0,,DisneyPlus,[Join the Cordcutting &amp; Streaming TV Disco...,t2_6l4z3,False,,0,False,Join the Cordcutting &amp; Streaming TV Discord,"[{'e': 'text', 't': 'Announcement'}]",...,1615180000.0,0,,False,,,,,,
1,,DisneyPlus,,t2_8wpjm,False,,0,False,End credit of Raya &amp; the last dragon 😂,"[{'a': ':WORLD:', 'e': 'emoji', 'u': 'https://...",...,1615348000.0,0,,False,https://i.redd.it/7aqr5wp7i4m61.jpg,,,,,
2,,DisneyPlus,,t2_x2ai7,False,,0,False,Disney+ Passes 100 Million Paid Subscribers,"[{'a': ':WORLD:', 'e': 'emoji', 'u': 'https://...",...,1615315000.0,1,,False,https://www.hollywoodreporter.com/news/disney-...,,,,,
3,,DisneyPlus,,t2_rgsvw,False,,0,False,Where We Are in the Riordanverse! | Rick Riord...,[],...,1615311000.0,0,,False,https://rickriordan.com/2021/03/where-we-are-i...,,,,,
4,,DisneyPlus,,t2_rgsvw,False,,0,False,‘WandaVision’ EP Jac Schaeffer On Who Didn’t S...,[],...,1615312000.0,0,,False,https://deadline.com/2021/03/wandavision-serie...,,,,,
5,,DisneyPlus,Title 🤷🏻‍♂️,t2_7amu67t,False,,0,False,Anyone else having trouble with D+ atm? Everyt...,"[{'a': ':UK:', 'e': 'emoji', 'u': 'https://emo...",...,1615335000.0,0,,False,,,,,,
6,,DisneyPlus,,t2_8h97d63v,False,,0,False,New Disney+ Hotstar VIP access packs announced...,"[{'a': ':IN:', 'e': 'emoji', 'u': 'https://emo...",...,1615346000.0,0,,False,https://onlytech.com/vi-launches-rs-401-601-80...,,,,,
7,,DisneyPlus,Only the dubbed version will be available to D...,t2_11gbyc,False,,0,False,The Falcon and The Winter Soldier will be dubb...,"[{'a': ':IN:', 'e': 'emoji', 'u': 'https://emo...",...,1615301000.0,0,,False,,,,,,
8,,DisneyPlus,,t2_9pypjnbj,False,,0,False,New posters for Falcon and the Winter Soldier!...,"[{'a': ':WORLD:', 'e': 'emoji', 'u': 'https://...",...,1615226000.0,0,,False,https://www.reddit.com/gallery/m0lgkr,True,"{'x5vwo59ieul61': {'status': 'valid', 'e': 'Im...","{'items': [{'media_id': 'x9s95wwheul61', 'id':...",,
9,,DisneyPlus,,t2_6pdrne4t,False,,0,False,"First D+ Original to release in Hindi, Tamil &...","[{'a': ':IN:', 'e': 'emoji', 'u': 'https://emo...",...,1615314000.0,0,,False,https://www.reddit.com/gallery/m1cnjz,True,"{'sv9f29oeo1m61': {'status': 'valid', 'e': 'Im...","{'items': [{'media_id': 't0eq017eo1m61', 'id':...",,


We have explored a single request and the data received from it. We need to scrape more to gather sufficient data for our analysis. <br>
To do that, we use the last post in the response to determine where the next batch of posts begins.<br> The reference to the last post is stored in the response dictionary, as shown below.

In [24]:
reddit_dict['data']['after']

't3_m0u9dx'

The URL formed below will be used to fetch the next batch of posts from Reddit.

In [25]:
url + '?after=' + reddit_dict['data']['after']

'https://www.reddit.com/r/DisneyPlus.json?after=t3_m0u9dx'
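
The same URL can be built with `urllib.parse.urlencode`, which also handles the case where there is no `after` value yet. This is a sketch under stated assumptions: `next_page_url` and `BASE_URL` are hypothetical names that do not appear in the notebook.

```python
from urllib.parse import urlencode

# Same address as the `url` variable defined earlier in the notebook.
BASE_URL = "https://www.reddit.com/r/DisneyPlus.json"


def next_page_url(base_url, after=None):
    """Return the URL for the first page, or for the page following `after`."""
    if after is None:
        return base_url
    return base_url + "?" + urlencode({"after": after})


first_url = next_page_url(BASE_URL)
second_url = next_page_url(BASE_URL, "t3_m0u9dx")
```

Here `second_url` matches the string produced by the concatenation above.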

### Scrape Sufficient Data

In [26]:
posts = []
user_agents = ['Arti Inc 1.0', 'AJ 1.0 Inc', 'Arti Inc 2.0', 'AJ 2.0 Inc', 'Arti Inc 3.0', 
               'AJ 3.0 Inc', 'Arti Inc 4.0', 'AJ 4.0 Inc','Arti Inc 5.0', 'AJ 5.0 Inc',]
after = None

for a in range(60):
    if after is None:
        current_url = url
    else:
        current_url = url + '?after=' + after
    print(current_url)
    res = requests.get(current_url, headers={'User-agent': random.choice(user_agents)})
    
    if res.status_code != 200:
        print('Status error', res.status_code)
        break
    
    current_dict = res.json()
    current_posts = [p['data'] for p in current_dict['data']['children']]
    posts.extend(current_posts)
    after = current_dict['data']['after']
    
    # generate a random sleep duration to look more 'natural'
    sleep_duration = random.randint(2,30)
    print(sleep_duration)
    time.sleep(sleep_duration)

https://www.reddit.com/r/DisneyPlus.json
19
https://www.reddit.com/r/DisneyPlus.json?after=t3_m0u9dx
30
https://www.reddit.com/r/DisneyPlus.json?after=t3_lzvzvi
3
https://www.reddit.com/r/DisneyPlus.json?after=t3_lyra30
25
https://www.reddit.com/r/DisneyPlus.json?after=t3_ly975r
18
https://www.reddit.com/r/DisneyPlus.json?after=t3_lxieai
4
https://www.reddit.com/r/DisneyPlus.json?after=t3_lvqlit
10
https://www.reddit.com/r/DisneyPlus.json?after=t3_lw7lzm
23
https://www.reddit.com/r/DisneyPlus.json?after=t3_lut9p1
28
https://www.reddit.com/r/DisneyPlus.json?after=t3_ltn4n5
14
https://www.reddit.com/r/DisneyPlus.json?after=t3_ltja7g
22
https://www.reddit.com/r/DisneyPlus.json?after=t3_lsbufm
27
https://www.reddit.com/r/DisneyPlus.json?after=t3_lsccjf
14
https://www.reddit.com/r/DisneyPlus.json?after=t3_lqtdg4
27
https://www.reddit.com/r/DisneyPlus.json?after=t3_lqod33
3
https://www.reddit.com/r/DisneyPlus.json?after=t3_lq24iu
20
https://www.reddit.com/r/DisneyPlus.json?after=t3_lpc3br
27
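
One property of the loop above is worth noting: if a response ever returns `after` as `None`, the next iteration falls back to the base URL and starts again from the first page, which can add duplicate posts. The sketch below shows a variant that stops in that case and skips posts it has already seen. It is an illustration only: `collect_posts`, `fake_fetch`, and `PAGES` are hypothetical, the fetch step is passed in as a function so the example runs without network access, and it assumes that `name` identifies a post uniquely.

```python
def collect_posts(fetch_page, max_pages=60):
    """Page through listings via fetch_page(after), skipping repeated posts."""
    posts = []
    seen = set()
    after = None
    for _ in range(max_pages):
        listing = fetch_page(after)
        for child in listing["data"]["children"]:
            post = child["data"]
            if post["name"] not in seen:
                seen.add(post["name"])
                posts.append(post)
        after = listing["data"]["after"]
        if after is None:
            # No further page is referenced, so stop instead of restarting.
            break
    return posts


# Hypothetical two-page listing; post t3_b appears on both pages.
PAGES = {
    None: {"data": {"children": [{"data": {"name": "t3_a"}},
                                 {"data": {"name": "t3_b"}}],
                    "after": "t3_b"}},
    "t3_b": {"data": {"children": [{"data": {"name": "t3_b"}},
                                   {"data": {"name": "t3_c"}}],
                      "after": None}},
}

calls = []


def fake_fetch(after):
    """Record each request and return the canned page for that `after` value."""
    calls.append(after)
    return PAGES[after]


collected = collect_posts(fake_fetch)
names = [post["name"] for post in collected]
```

In the notebook, `fetch_page` would wrap the `requests.get` call, the status check, and the random sleep used in the loop above.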

In [27]:
posts_df = pd.DataFrame(posts)

In [28]:
pd.set_option('display.max_columns', len(posts_df.columns))
posts_df

Unnamed: 0,approved_at_utc,subreddit,selftext,author_fullname,saved,mod_reason_title,gilded,clicked,title,link_flair_richtext,subreddit_name_prefixed,hidden,pwls,link_flair_css_class,downs,thumbnail_height,top_awarded_type,hide_score,name,quarantine,link_flair_text_color,upvote_ratio,author_flair_background_color,subreddit_type,ups,total_awards_received,media_embed,thumbnail_width,author_flair_template_id,is_original_content,user_reports,secure_media,is_reddit_media_domain,is_meta,category,secure_media_embed,link_flair_text,can_mod_post,score,approved_by,author_premium,thumbnail,edited,author_flair_css_class,author_flair_richtext,gildings,post_hint,content_categories,is_self,mod_note,created,link_flair_type,wls,removed_by_category,banned_by,author_flair_type,domain,allow_live_comments,selftext_html,likes,suggested_sort,banned_at_utc,view_count,archived,no_follow,is_crosspostable,pinned,over_18,preview,all_awardings,awarders,media_only,link_flair_template_id,can_gild,spoiler,locked,author_flair_text,treatment_tags,visited,removed_by,num_reports,distinguished,subreddit_id,mod_reason_by,removal_reason,link_flair_background_color,id,is_robot_indexable,report_reasons,author,discussion_type,num_comments,send_replies,whitelist_status,contest_mode,mod_reports,author_patreon_flair,author_flair_text_color,permalink,parent_whitelist_status,stickied,url,subreddit_subscribers,created_utc,num_crossposts,media,is_video,url_overridden_by_dest,is_gallery,media_metadata,gallery_data,crosspost_parent_list,crosspost_parent,collections,author_cakeday
0,,DisneyPlus,[Join the Cordcutting &amp; Streaming TV Disco...,t2_6l4z3,False,,0,False,Join the Cordcutting &amp; Streaming TV Discord,"[{'e': 'text', 't': 'Announcement'}]",r/DisneyPlus,False,6,,0,,,False,t3_m08afa,False,light,0.57,,public,2,0,{},,,False,[],,False,False,,{},Announcement,False,2,,True,self,False,,[],{},self,,True,,1.615208e+09,richtext,6,,,text,self.DisneyPlus,False,"&lt;!-- SC_OFF --&gt;&lt;div class=""md""&gt;&lt...",,,,,False,False,False,False,False,{'images': [{'source': {'url': 'https://extern...,[],[],False,54820300-5ddd-11e9-b2e3-0e5740f4b800,False,False,False,,[],False,,,,t5_r0hux,,,#ea0027,m08afa,True,,AutoModerator,,0,True,all_ads,False,[],False,,/r/DisneyPlus/comments/m08afa/join_the_cordcut...,all_ads,True,https://www.reddit.com/r/DisneyPlus/comments/m...,108474,1.615180e+09,0,,False,,,,,,,,
1,,DisneyPlus,,t2_8wpjm,False,,0,False,End credit of Raya &amp; the last dragon 😂,"[{'a': ':WORLD:', 'e': 'emoji', 'u': 'https://...",r/DisneyPlus,False,6,,0,59.0,,False,t3_m1oo4o,False,dark,0.98,,public,262,1,{},140.0,,False,[],,True,False,,{},:WORLD: Global,False,262,,False,https://a.thumbs.redditmedia.com/T6tJV3OTDerqG...,False,,[],{'gid_1': 1},image,,False,,1.615377e+09,richtext,6,,,text,i.redd.it,False,,,,,,False,False,False,False,False,{'images': [{'source': {'url': 'https://previe...,"[{'giver_coin_reward': None, 'subreddit_id': N...",[],False,8fba2d22-8013-11ea-945e-0ec44de85c5f,False,False,False,,[],False,,,,t5_r0hux,,,#edeff1,m1oo4o,True,,ngocburin,,6,True,all_ads,False,[],False,,/r/DisneyPlus/comments/m1oo4o/end_credit_of_ra...,all_ads,False,https://i.redd.it/7aqr5wp7i4m61.jpg,108474,1.615348e+09,0,,False,https://i.redd.it/7aqr5wp7i4m61.jpg,,,,,,,
2,,DisneyPlus,,t2_x2ai7,False,,0,False,Disney+ Passes 100 Million Paid Subscribers,"[{'a': ':WORLD:', 'e': 'emoji', 'u': 'https://...",r/DisneyPlus,False,6,,0,78.0,,False,t3_m1d4w1,False,dark,0.98,,public,904,1,{},140.0,,False,[],,False,False,,{},:WORLD: Global,False,904,,True,https://b.thumbs.redditmedia.com/GlJmUgbXPTC5o...,False,,[],{'gid_1': 1},link,,False,,1.615344e+09,richtext,6,,,text,hollywoodreporter.com,True,,,,,,False,False,False,False,False,{'images': [{'source': {'url': 'https://extern...,"[{'giver_coin_reward': None, 'subreddit_id': N...",[],False,8fba2d22-8013-11ea-945e-0ec44de85c5f,False,False,False,,[],False,,,,t5_r0hux,,,#edeff1,m1d4w1,True,,08830,,78,True,all_ads,False,[],False,,/r/DisneyPlus/comments/m1d4w1/disney_passes_10...,all_ads,False,https://www.hollywoodreporter.com/news/disney-...,108474,1.615315e+09,1,,False,https://www.hollywoodreporter.com/news/disney-...,,,,,,,
3,,DisneyPlus,,t2_rgsvw,False,,0,False,Where We Are in the Riordanverse! | Rick Riord...,[],r/DisneyPlus,False,6,,0,,,False,t3_m1bjtj,False,dark,0.98,#e3e3e3,public,161,0,{},,453da5b8-5d68-11e9-8bbd-0e5883b3ae1e,False,[],,False,False,,{},,False,161,,True,default,False,,"[{'a': ':Castle:', 'e': 'emoji', 'u': 'https:/...",{},,,False,,1.615340e+09,text,6,,,richtext,rickriordan.com,False,,,,,,False,False,False,False,False,,[],[],False,,False,False,False,:Castle: Senior Moderator,[],False,,,,t5_r0hux,,,,m1bjtj,True,,MattHall83,,10,True,all_ads,False,[],False,dark,/r/DisneyPlus/comments/m1bjtj/where_we_are_in_...,all_ads,False,https://rickriordan.com/2021/03/where-we-are-i...,108474,1.615311e+09,0,,False,https://rickriordan.com/2021/03/where-we-are-i...,,,,,,,
4,,DisneyPlus,,t2_rgsvw,False,,0,False,‘WandaVision’ EP Jac Schaeffer On Who Didn’t S...,[],r/DisneyPlus,False,6,,0,78.0,,False,t3_m1bvpu,False,dark,0.85,#e3e3e3,public,23,0,{},140.0,453da5b8-5d68-11e9-8bbd-0e5883b3ae1e,False,[],,False,False,,{},,False,23,,True,spoiler,False,,"[{'a': ':Castle:', 'e': 'emoji', 'u': 'https:/...",{},link,,False,,1.615341e+09,text,6,,,richtext,deadline.com,False,,,,,,False,False,False,False,False,{'images': [{'source': {'url': 'https://extern...,[],[],False,,False,True,False,:Castle: Senior Moderator,[],False,,,,t5_r0hux,,,,m1bvpu,True,,MattHall83,,5,True,all_ads,False,[],False,dark,/r/DisneyPlus/comments/m1bvpu/wandavision_ep_j...,all_ads,False,https://deadline.com/2021/03/wandavision-serie...,108474,1.615312e+09,0,,False,https://deadline.com/2021/03/wandavision-serie...,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1486,,DisneyPlus,"As of two days ago, any time I try to watch it...",t2_hu8hh,False,,0,False,Can't Watch Mandalorian,"[{'e': 'text', 't': 'Technical Support'}]",r/DisneyPlus,False,6,,0,,,False,t3_lx0d5t,False,dark,0.50,,public,0,0,{},,,False,[],,False,False,,{},Technical Support,False,0,,False,self,False,,[],{},,,True,,1.614824e+09,richtext,6,,,text,self.DisneyPlus,False,"&lt;!-- SC_OFF --&gt;&lt;div class=""md""&gt;&lt...",,,,,False,True,False,False,False,,[],[],False,80b5c958-4246-11eb-a7e5-0e6279c766ad,False,False,False,,[],False,,,,t5_r0hux,,,#ffb000,lx0d5t,True,,Bodom1994,,6,True,all_ads,False,[],False,,/r/DisneyPlus/comments/lx0d5t/cant_watch_manda...,all_ads,False,https://www.reddit.com/r/DisneyPlus/comments/l...,108483,1.614795e+09,0,,False,,,,,,,,
1487,,DisneyPlus,One of the benefits we found with Disney Plus ...,t2_3cujg,False,,0,False,Unpopular opinion? Star made the service worse...,[],r/DisneyPlus,False,6,,0,,,False,t3_lx8cnc,False,dark,0.36,,public,0,0,{},,,False,[],,False,False,,{},,False,0,,False,self,False,,[],{},,,True,,1.614846e+09,text,6,,,text,self.DisneyPlus,False,"&lt;!-- SC_OFF --&gt;&lt;div class=""md""&gt;&lt...",,,,,False,True,False,False,False,,[],[],False,,False,False,False,,[],False,,,,t5_r0hux,,,,lx8cnc,True,,ninj4,,12,True,all_ads,False,[],False,,/r/DisneyPlus/comments/lx8cnc/unpopular_opinio...,all_ads,False,https://www.reddit.com/r/DisneyPlus/comments/l...,108483,1.614817e+09,0,,False,,,,,,,,
1488,,DisneyPlus,,t2_rgsvw,False,,0,False,Disney+ Subscriptions Are Growing Beyond Expec...,[],r/DisneyPlus,False,6,,0,93.0,,False,t3_lw7m59,False,dark,0.95,#e3e3e3,public,70,0,{},140.0,453da5b8-5d68-11e9-8bbd-0e5883b3ae1e,False,[],,False,False,,{},,False,70,,True,https://b.thumbs.redditmedia.com/DV0Tbr8LzVjum...,False,,"[{'a': ':Castle:', 'e': 'emoji', 'u': 'https:/...",{},link,,False,,1.614735e+09,text,6,,,richtext,finance.yahoo.com,False,,,,,,False,False,False,False,False,{'images': [{'source': {'url': 'https://extern...,[],[],False,,False,False,False,:Castle: Senior Moderator,[],False,,,,t5_r0hux,,,,lw7m59,True,,MattHall83,,20,True,all_ads,False,[],False,dark,/r/DisneyPlus/comments/lw7m59/disney_subscript...,all_ads,False,https://finance.yahoo.com/news/disney-subscrip...,108483,1.614706e+09,0,,False,https://finance.yahoo.com/news/disney-subscrip...,,,,,,,
1489,,DisneyPlus,Device: Chromecast with Google TV + Panasoni...,t2_5pzl2cox,False,,0,False,Dolby Vision/Atmos not showing Dolby Vision/Atmos,"[{'e': 'text', 't': 'Technical Support'}]",r/DisneyPlus,False,6,,0,,,False,t3_lwt27y,False,dark,0.50,,public,0,0,{},,,False,[],,False,False,,{},Technical Support,False,0,,False,self,False,,[],{},,,True,,1.614804e+09,richtext,6,,,text,self.DisneyPlus,False,"&lt;!-- SC_OFF --&gt;&lt;div class=""md""&gt;&lt...",,,,,False,True,False,False,False,,[],[],False,80b5c958-4246-11eb-a7e5-0e6279c766ad,False,False,False,,[],False,,,,t5_r0hux,,,#ffb000,lwt27y,True,,juiceinmyears,,3,True,all_ads,False,[],False,,/r/DisneyPlus/comments/lwt27y/dolby_visionatmo...,all_ads,False,https://www.reddit.com/r/DisneyPlus/comments/l...,108483,1.614776e+09,0,,False,,,,,,,,


In [29]:
posts_df.columns

Index(['approved_at_utc', 'subreddit', 'selftext', 'author_fullname', 'saved',
       'mod_reason_title', 'gilded', 'clicked', 'title', 'link_flair_richtext',
       ...
       'media', 'is_video', 'url_overridden_by_dest', 'is_gallery',
       'media_metadata', 'gallery_data', 'crosspost_parent_list',
       'crosspost_parent', 'collections', 'author_cakeday'],
      dtype='object', length=115)

Each record holds all the post information along with its metadata. We will not use metadata fields such as clicked, saved, and hidden. <br>
We will use only the text fields of each post, which contain meaningful words.

In [30]:
posts_df['selftext'].eq('').sum()

615

In [31]:
posts_df.shape

(1491, 115)
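
Since only the text fields are needed, the scraped records could be reduced to those fields before further work. The following is a minimal sketch on plain dictionaries, not the notebook's own code: `keep_text_fields`, `count_empty`, and `TEXT_FIELDS` are hypothetical names, and the choice of `subreddit`, `title`, and `selftext` simply follows the fields inspected earlier.

```python
# Fields assumed to be sufficient for the text analysis.
TEXT_FIELDS = ("subreddit", "title", "selftext")


def keep_text_fields(posts, fields=TEXT_FIELDS):
    """Return copies of the posts containing only the requested fields."""
    return [{field: post.get(field, "") for field in fields} for post in posts]


def count_empty(posts, field="selftext"):
    """Count the posts whose given field is an empty string."""
    return sum(1 for post in posts if post.get(field, "") == "")


# Hypothetical records mixing text fields with metadata.
raw_posts = [
    {"subreddit": "DisneyPlus", "title": "Post one", "selftext": "", "saved": False},
    {"subreddit": "DisneyPlus", "title": "Post two", "selftext": "Body", "clicked": False},
]

slim_posts = keep_text_fields(raw_posts)
empty_bodies = count_empty(slim_posts)
```

The reduced list can be passed to `pd.DataFrame` in the same way as the full `posts` list.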

### Export To CSV

In [35]:
posts_df.to_csv('../Data/DisneyPlus.csv', index=False)