### Background

Netflix is one of the world's most popular streaming services. In 2022, Netflix lost over 1.2 million customers.

According to this [source](https://www.makeuseof.com/why-netflix-is-losing-subscribers/), Netflix is losing subscribers for the following reasons:
1. Netflix Pulled Its Service From Russia
2. Netflix Hiked Up Prices
3. Netflix Account Sharing Slows Down Growth
4. Netflix Keeps Losing Content From Media Companies
5. Quality of Content
6. There Are Too Many Streaming Services

This project takes a closer look at Netflix by comparing it with one of its biggest competitors, Disney Plus.

### Outside Research

**Reference:** [Disney Plus vs. Netflix: How Do They Compare?](https://www.cnet.com/tech/services-and-software/disney-plus-vs-netflix-how-to-choose-between-both-streaming-services/#:~:text=Disney%20Plus%20and%20Netflix%20differ,streams%20and%20has%20Netflix%20beat)

Obi-Wan Kenobi and Stranger Things 4 are hot tickets on their platforms, but here's how Disney Plus and Netflix stack up.

**Netflix**<br>
Netflix is credited with starting the streaming wars back in 2007. Since then, the industry leader has raised its subscription prices, but it has remained committed to original content while also offering a wide selection of licensed TV shows and films. Its user interface makes it simple to browse content by age, popularity, or category.

**Disney Plus** <br>
A playground for children, Star Wars fans, Marvel lovers and throwbacks<br>
With its selection of family-friendly films and television programs, Disney Plus takes the top spot. As of May 2022, the service had more than 137 million subscribers, and it is still growing. For anyone who likes all things Disney, Marvel, Pixar, and National Geographic, it's a great deal at $8 per month. Nearly ten decades' worth of entertaining and educational material is available to viewers.

**How do these services compare in pricing?** <br>
Disney Plus and Netflix differ greatly in cost <br>
Disney Plus costs \\$8 per month, while Netflix costs \\$10 to \\$20 per month depending on the subscription plan.

**Originals** <br>
Rather than launching anything entirely new, Disney Plus draws on established franchises and legacy properties. This strategy hurts Disney when viewers must wait for the streamer to release new content. Although successful series like Loki and The Mandalorian have great production values, they are part of broader cinematic universes. In addition, unlike Netflix, the service doesn't release many original films.

However, that doesn't mean there is nothing good to stream from Marvel, NatGeo, or any other brand in the Disney family. It's the only place to watch Obi-Wan Kenobi, Encanto, Turning Red, and Moon Knight.

Netflix's original content, on the other hand, is nothing to scoff at. In addition to the now-famous Stranger Things, the platform has brought a number of original TV shows and movies into popular culture, including Squid Game, Money Heist, and Tiger King. This does not mean that Netflix avoids adaptations, remakes, or reboots: its catalog includes series based on existing properties, such as Fuller House and Lucifer, and novel adaptations such as Bridgerton.

The catalog is also always changing. Lists of Netflix's newest original releases for each month and year are easy to find, and the service provides a steady flow of original content that customers can count on every week, whether it's a movie, a reality program, or a highly anticipated new season of a popular series.

**How do they compare in overall variety?** <br>
Disney Plus has some gaps for adults, while Netflix has a balance<br>

Disney's selection of children's films and television shows is hard to beat. Its catalog, which spans almost 100 years of content, was created with children's imaginations in mind. That said, every show and movie on Disney Plus upholds the tradition of family-friendly material. It's a haven for youngsters and a nostalgia home run for grownups.

There are a little over 1,000 pieces of content on Disney Plus, including a sizable collection of documentaries, TV episodes, animated stories, and blockbuster movies. Most of it comes from a Disney-owned studio or network, such as ABC, Pixar, or Marvel.

According to reports, Netflix currently offers over 3,000 films and 1,000 TV shows. It provides a wide selection of both original and licensed content for viewers of all ages, with options in every genre. In addition to its own original productions, you can watch content from a variety of sources, including the CW, Fox, Universal Pictures, Showtime, USA, and more.

Visit Netflix to watch anime, kid-friendly movies, rom-coms, horror movies, and just about anything else. Because its content spans ratings from G to NC-17, adults can enjoy all the bloodshed, dark humor, and romance they desire.

### Problem Statement

As a member of the Netflix data science team, I have been assigned to research general customer opinion of Netflix and one of its competitors, Disney+, on Reddit.

The goals of this project are: 
- to classify posts from the r/netflix and r/DisneyPlus subreddits
- to investigate the reasons behind Netflix's declining subscriber count 
- to suggest potential solutions to increase Netflix's subscriptions

### Import Libraries

In [28]:
import pandas as pd
import numpy as np
import requests
import time
import datetime as dt
import json

### Creating a function to collect data using Pushshift's API

In [None]:
# subreddit: subreddit to query, here r/netflix and r/DisneyPlus
# post_type: type of post to search for: submission
# loops: number of times to request posts
# size: number of posts per request (max 100 per pushshift api)
# skip: number of days to move the `after` window back on each request

In [33]:
#Ref: https://github.com/pushshift/api

def pushshift(subreddit, post_type, loops=1, size=100, skip=1):

    # data fields to keep for submissions
    subfields = ['subreddit', 'title', 'selftext']

    # instantiate list for posts data
    list_posts = []
    url_stem = "https://api.pushshift.io/reddit/search/{}/?subreddit={}&size={}".format(post_type, subreddit, size)

    for i in range(loops):
        # move the `after` window back by `skip` days per request
        # (1d, 2d, ... as in the logged URLs)
        url = '{}&after={}d'.format(url_stem, skip * (i + 1))

        # monitor status as loops run
        print(i, url)

        # get data from url and stop early on an HTTP error
        res = requests.get(url)
        res.raise_for_status()

        # add dictionaries for posts to list_posts
        list_posts.extend(res.json()['data'])

        # allow for break in between requests
        time.sleep(1)

    # turn list_posts (a list of dictionaries, one per post) into a dataframe,
    # keeping only the fields listed in subfields
    df_posts = pd.DataFrame(list_posts).reindex(columns=subfields)

    # record the post type, matching the post_type column in the saved CSV files
    df_posts['post_type'] = post_type

    # drop any duplicates
    df_posts = df_posts.drop_duplicates()

    return df_posts
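
To check which requests a call would issue without contacting the API, the URL construction can be isolated. The following is a minimal sketch: `build_urls` is a hypothetical helper that is not part of this notebook, and it assumes the day offset starts at `skip` for the first request, as the logged URLs in this notebook suggest.

```python
def build_urls(subreddit, post_type, loops=1, size=100, skip=1):
    """Return the Pushshift search URLs that `loops` requests would hit."""
    url_stem = "https://api.pushshift.io/reddit/search/{}/?subreddit={}&size={}".format(
        post_type, subreddit, size
    )
    # Assumption: offsets run skip, 2*skip, ... days, matching the logged URLs.
    return ["{}&after={}d".format(url_stem, skip * (i + 1)) for i in range(loops)]


urls = build_urls("netflix", "submission", loops=3)
for i, url in enumerate(urls):
    print(i, url)
```

Responses for successive `after` windows may overlap, which is presumably why the collected posts are de-duplicated before they are returned.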

In [80]:
# i_sub = pushshift('netflix', post_type='submission', loops=25, size=100, skip=1)
# print('shape', i_sub.shape)

0 https://api.pushshift.io/reddit/search/submission/?subreddit=netflix&size=100&after=1d
1 https://api.pushshift.io/reddit/search/submission/?subreddit=netflix&size=100&after=2d
2 https://api.pushshift.io/reddit/search/submission/?subreddit=netflix&size=100&after=3d
3 https://api.pushshift.io/reddit/search/submission/?subreddit=netflix&size=100&after=4d
4 https://api.pushshift.io/reddit/search/submission/?subreddit=netflix&size=100&after=5d
5 https://api.pushshift.io/reddit/search/submission/?subreddit=netflix&size=100&after=6d
6 https://api.pushshift.io/reddit/search/submission/?subreddit=netflix&size=100&after=7d
7 https://api.pushshift.io/reddit/search/submission/?subreddit=netflix&size=100&after=8d
8 https://api.pushshift.io/reddit/search/submission/?subreddit=netflix&size=100&after=9d
9 https://api.pushshift.io/reddit/search/submission/?subreddit=netflix&size=100&after=10d
10 https://api.pushshift.io/reddit/search/submission/?subreddit=netflix&size=100&after=11d
11 https://api.pus

In [101]:
# a_sub = pushshift('DisneyPlus', post_type='submission', loops=50, size=100, skip=1)
# print('shape', a_sub.shape)

0 https://api.pushshift.io/reddit/search/submission/?subreddit=DisneyPlus&size=100&after=1d
1 https://api.pushshift.io/reddit/search/submission/?subreddit=DisneyPlus&size=100&after=2d
2 https://api.pushshift.io/reddit/search/submission/?subreddit=DisneyPlus&size=100&after=3d
3 https://api.pushshift.io/reddit/search/submission/?subreddit=DisneyPlus&size=100&after=4d
4 https://api.pushshift.io/reddit/search/submission/?subreddit=DisneyPlus&size=100&after=5d
5 https://api.pushshift.io/reddit/search/submission/?subreddit=DisneyPlus&size=100&after=6d
6 https://api.pushshift.io/reddit/search/submission/?subreddit=DisneyPlus&size=100&after=7d
7 https://api.pushshift.io/reddit/search/submission/?subreddit=DisneyPlus&size=100&after=8d
8 https://api.pushshift.io/reddit/search/submission/?subreddit=DisneyPlus&size=100&after=9d
9 https://api.pushshift.io/reddit/search/submission/?subreddit=DisneyPlus&size=100&after=10d
10 https://api.pushshift.io/reddit/search/submission/?subreddit=DisneyPlus&size

In [102]:
# i_sub.to_csv('i_sub.csv')
# a_sub.to_csv('a_sub.csv')

In [103]:
i_df = pd.read_csv('./i_sub.csv')
a_df = pd.read_csv('./a_sub.csv')

In [104]:
i_df.shape, a_df.shape

((1337, 5), (1329, 5))

In [105]:
# drop posts whose body was replaced by the '[removed]' placeholder
i_df.drop(i_df[i_df['selftext']=='[removed]'].index, axis=0, inplace=True)
a_df.drop(a_df[a_df['selftext']=='[removed]'].index, axis=0, inplace=True)

In [106]:
i_df.shape, a_df.shape

((934, 5), (974, 5))
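
Filtering on the `'[removed]'` placeholder only discards rows whose body equals that exact string; rows with a missing body are kept. The sketch below illustrates this on a toy frame. The data are made up for illustration, and the boolean-mask form is an equivalent alternative to the `drop` call used in this notebook.

```python
import pandas as pd

# Toy data standing in for scraped posts (values are made up for illustration).
posts = pd.DataFrame({
    "title": ["Price hike again", "Show cancelled", "Link post"],
    "selftext": ["Thinking of cancelling.", "[removed]", None],
})

# Keep every row whose body is not the '[removed]' placeholder.
kept = posts[posts["selftext"] != "[removed]"]

# The row with a missing body survives this filter.
print(kept.shape)
print(kept["selftext"].isnull().sum())
```

This is why a separate check for null `selftext` values is still needed after this step.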

In [107]:
i_df.isnull().sum()

Unnamed: 0      0
selftext      513
subreddit       0
title           0
post_type       0
dtype: int64
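
The `Unnamed: 0` column in the output above is typical of a CSV file that was written with its index and read back without `index_col`. The sketch below reproduces the effect on a toy frame. The data are made up, and whether this notebook's files were written this way is an assumption based on the commented-out `to_csv` calls.

```python
import io

import pandas as pd

toy = pd.DataFrame({"title": ["a", "b"], "selftext": ["x", "y"]})

# Writing with the default index=True stores the row index as an unnamed first column.
buffer = io.StringIO()
toy.to_csv(buffer)

# Reading it back without index_col turns that column into 'Unnamed: 0'.
buffer.seek(0)
round_trip = pd.read_csv(buffer)
print(list(round_trip.columns))

# Writing with index=False avoids the extra column.
clean_buffer = io.StringIO()
toy.to_csv(clean_buffer, index=False)
clean_buffer.seek(0)
clean = pd.read_csv(clean_buffer)
print(list(clean.columns))
```

The extra column carries no information about the posts, so it can be ignored or dropped in later steps.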

In [108]:
a_df.isnull().sum()

Unnamed: 0      0
selftext      458
subreddit       0
title           0
post_type       0
dtype: int64

In [109]:
# drop posts with no body text (selftext is the only column with missing values)
i_df.dropna(axis=0, inplace=True)
a_df.dropna(axis=0, inplace=True)

In [110]:
i_df.shape, a_df.shape

((421, 5), (516, 5))
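
As a sanity check, the row counts reported at each cleaning step can be reconciled with simple arithmetic. The sketch below uses only numbers copied from the outputs in this notebook; the `counts` and `summary` names are introduced here for illustration.

```python
# Row counts copied from the shape and isnull outputs in this notebook.
counts = {
    "netflix": {"loaded": 1337, "after_removed": 934, "null_selftext": 513, "final": 421},
    "DisneyPlus": {"loaded": 1329, "after_removed": 974, "null_selftext": 458, "final": 516},
}

summary = {}
for name, c in counts.items():
    removed = c["loaded"] - c["after_removed"]
    # Dropping rows with a missing body should account for the whole second reduction.
    assert c["after_removed"] - c["null_selftext"] == c["final"]
    retained = c["final"] / c["loaded"]
    summary[name] = {"removed": removed, "retained": retained}
    print(f"{name}: {removed} removed, {c['null_selftext']} without text, "
          f"{c['final']} kept ({retained:.1%})")
```

The counts are consistent: roughly 31% of the r/netflix posts and 39% of the r/DisneyPlus posts remain after both cleaning steps.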