Bazham Khanatayev \
Data 512 HW 1 \
Data Acquisition Notebook

This notebook uses the Pageviews Wikipedia API to pull data about a set of movies provided in the thank_the_academy.AUG.2023.csv file. Three files are produced in this notebook in accordance with the following guidelines:

Monthly mobile access - The API separates mobile access types into two separate requests, you will need to sum these to make one count for all mobile pageviews. You should store the mobile access data in a file called: \
academy_monthly_mobile_startYYYYMM-endYYYYMM.json 

Monthly desktop access - Monthly desktop page traffic is based on one single request. You should store the desktop access data in a file called: \
academy_monthly_desktop_startYYYYMM-endYYYYMM.json 

Monthly cumulative - Monthly cumulative data is the sum of all mobile, and all desktop traffic per article. You should store the monthly cumulative data in a file called: \
academy_monthly_cumulative_startYYYYMM-endYYYYMM.json


This notebook uses the Wikipedia Pageviews API to grab a specific subset of Wikipedia Page Articles based on the provided csv file. That data is stored in three different JSON files.

# Setup

### The following code was provided by the intructor. It shows how to use the API and establishes some constants.

License: This code example was developed by Dr. David W. McDonald for use in DATA 512, a course in the UW MS Data Science degree program. This code is provided under the Creative Commons CC-BY license. Revision 1.2 - August 14, 2023

In [1]:
# These are standard python modules
import json, time, urllib.parse

# The 'requests' module is not a standard Python module. You will need to install this with pip/pip3 if you do not already have it
import requests

In [2]:
#########
#
#    CONSTANTS
#

# The REST API 'pageviews' URL - this is the common URL/endpoint for all 'pageviews' API requests
API_REQUEST_PAGEVIEWS_ENDPOINT = 'https://wikimedia.org/api/rest_v1/metrics/pageviews/'

# This is a parameterized string that specifies what kind of pageviews request we are going to make
# In this case it will be a 'per-article' based request. The string is a format string so that we can
# replace each parameter with an appropriate value before making the request
API_REQUEST_PER_ARTICLE_PARAMS = 'per-article/{project}/{access}/{agent}/{article}/{granularity}/{start}/{end}'

# The Pageviews API asks that we not exceed 100 requests per second, we add a small delay to each request
API_LATENCY_ASSUMED = 0.002       # Assuming roughly 2ms latency on the API and network
API_THROTTLE_WAIT = (1.0/100.0)-API_LATENCY_ASSUMED

# When making a request to the Wikimedia API they ask that you include your email address which will allow them
# to contact you if something happens - such as - your code exceeding rate limits - or some other error 
REQUEST_HEADERS = {
    'User-Agent': '<uwnetid@uw.edu>, University of Washington, MSDS DATA 512 - AUTUMN 2023',
}

# This is just a list of English Wikipedia article titles that we can use for example requests
ARTICLE_TITLES = [ 'Bison', 'Northern flicker', 'Red squirrel', 'Chinook salmon', 'Horseshoe bat' ]

# This template is used to map parameter values into the API_REQUST_PER_ARTICLE_PARAMS portion of an API request. The dictionary has a
# field/key for each of the required parameters. In the example, below, we only vary the article name, so the majority of the fields
# can stay constant for each request. Of course, these values *could* be changed if necessary.
ARTICLE_PAGEVIEWS_PARAMS_TEMPLATE = {
    "project":     "en.wikipedia.org",
    "access":      "desktop",      # this should be changed for the different access types
    "agent":       "user",
    "article":     "",             # this value will be set/changed before each request
    "granularity": "monthly",
    "start":       "2015010100",   # start and end dates need to be set
    "end":         "2023040100"    # this is likely the wrong end date
}

In [3]:
#########
#
#    PROCEDURES/FUNCTIONS
#

def request_pageviews_per_article(article_title = None, 
                                  endpoint_url = API_REQUEST_PAGEVIEWS_ENDPOINT, 
                                  endpoint_params = API_REQUEST_PER_ARTICLE_PARAMS, 
                                  request_template = ARTICLE_PAGEVIEWS_PARAMS_TEMPLATE,
                                  headers = REQUEST_HEADERS):

    # article title can be as a parameter to the call or in the request_template
    if article_title:
        request_template['article'] = article_title

    if not request_template['article']:
        raise Exception("Must supply an article title to make a pageviews request.")

    # Titles are supposed to have spaces replaced with "_" and be URL encoded
    article_title_encoded = urllib.parse.quote(request_template['article'].replace(' ','_'))
    request_template['article'] = article_title_encoded
    
    # now, create a request URL by combining the endpoint_url with the parameters for the request
    request_url = endpoint_url+endpoint_params.format(**request_template)
    
    # make the request
    try:
        # we'll wait first, to make sure we don't exceed the limit in the situation where an exception
        # occurs during the request processing - throttling is always a good practice with a free
        # data source like Wikipedia - or other community sources
        if API_THROTTLE_WAIT > 0.0:
            time.sleep(API_THROTTLE_WAIT)
        response = requests.get(request_url, headers=headers)
        json_response = response.json()
    except Exception as e:
        print(e)
        json_response = None
    return json_response

## The following establishes the constants and imports that I will need.

I am reusing some of the code provided in the sample.

In [4]:
# Standard libraries
import json
import time
import urllib.parse

# Third-party libraries
import requests
import pandas as pd

# Constants
API_REQUEST_PAGEVIEWS_ENDPOINT = 'https://wikimedia.org/api/rest_v1/metrics/pageviews/'
API_REQUEST_PER_ARTICLE_PARAMS = 'per-article/{project}/{access}/{agent}/{article}/{granularity}/{start}/{end}'
API_LATENCY_ASSUMED = 0.002       
API_THROTTLE_WAIT = (1.0/100.0)-API_LATENCY_ASSUMED

REQUEST_HEADERS = {
    'User-Agent': '<uwnetid@uw.edu>, University of Washington, MSDS DATA 512 - AUTUMN 2023',
}

ARTICLE_PAGEVIEWS_PARAMS_TEMPLATE = {
    "project":     "en.wikipedia.org",
    "access":      "desktop",
    "agent":       "user",
    "article":     "",
    "granularity": "monthly",
    "start":       "2015070100",
    "end":         "2023093000"
}


## Reading the CSV

In [5]:
# Read the CSV
movies_df = pd.read_csv("thank_the_academy.AUG.2023.csv")

# Extract the 'name' column which contains the movie titles
movie_titles = movies_df['name'].tolist()


Everything Everywhere All at Once
All Quiet on the Western Front (2022 film)
The Whale (2022 film)
Top Gun: Maverick
Black Panther: Wakanda Forever


## Mobile

We will start by pulling the data for mobile which includes mobile-app and mobile-web.

In [6]:
def get_combined_mobile_pageviews(article_title):
    """ Fetch combined mobile pageviews (mobile-web and mobile-app) for the given article. 
    
        This function sends two separate requests: one for mobile-web and another for mobile-app.
        After fetching the data, it combines the views from both sources into a single response.

        Parameters:
            article_title (str): The title of the article for which to fetch pageviews.

        Returns:
            dict: The combined pageviews data for mobile. 
                  If any of the requests doesn't contain 'items', it returns None.

        Notes:
            The function modifies the ARTICLE_PAGEVIEWS_PARAMS_TEMPLATE global variable.
            The combined result format will be the same as the individual mobile-web or mobile-app
            response, but with views aggregated and the 'access' field set to 'mobile'.
    
    """
    # Update global params template to fetch mobile-web pageviews
    ARTICLE_PAGEVIEWS_PARAMS_TEMPLATE["access"] = "mobile-web"
    mobile_web_views = request_pageviews_per_article(article_title)
    
    # Update global params template to fetch mobile-app pageviews
    ARTICLE_PAGEVIEWS_PARAMS_TEMPLATE["access"] = "mobile-app"
    mobile_app_views = request_pageviews_per_article(article_title)
    
    # Check if 'items' key exists in both responses
    if 'items' not in mobile_web_views or 'items' not in mobile_app_views:
        print(f"Missing data for article: {article_title}")
        return None

    # Combining pageviews and setting access to mobile
    for web_view, app_view in zip(mobile_web_views['items'], mobile_app_views['items']):
        web_view['views'] += app_view['views']
        web_view['access'] = 'mobile'  # Explicitly set access to mobile
    
    return mobile_web_views


In [None]:
def fetch_and_save_mobile_pageviews(movie_titles):
    """This function iterates over a list of movie titles, fetches the combined mobile pageviews 
    for each movie title using the `get_combined_mobile_pageviews` function, and then 
    saves the aggregated data to a JSON file. The filename is generated based on the 
    start and end dates defined in the ARTICLE_PAGEVIEWS_PARAMS_TEMPLATE global variable.

    Parameters:
        movie_titles (list of str): A list of movie titles for which to fetch and save pageviews.

    Returns:
        None: The function doesn't return anything, but it prints the progress and the filename where the data is saved.

    Notes:
        The function relies on the global variable ARTICLE_PAGEVIEWS_PARAMS_TEMPLATE to fetch 
        pageviews and to generate the filename. Ensure that this global variable is appropriately 
        set before calling the function. """
    
    # Dictionary to store the combined mobile pageviews for all movie title
    all_pageviews = {}
    
    # Iterate over the list of movie titles and fetch combined mobile pageviews for each
    for title in movie_titles:
        print(f"Fetching mobile pageviews for: {title}")
        combined_mobile_views = get_combined_mobile_pageviews(title)
        all_pageviews[title] = combined_mobile_views

    # Define the filename
    start_date = ARTICLE_PAGEVIEWS_PARAMS_TEMPLATE["start"][:-2]
    end_date = ARTICLE_PAGEVIEWS_PARAMS_TEMPLATE["end"][:-2]
    filename = f"academy_monthly_mobile_{start_date}-{end_date}.json"
    
    # Save the data to the JSON file
    with open(filename, 'w') as json_file:
        json.dump(all_pageviews, json_file, indent=4)

    print(f"Data saved to {filename}")

In [7]:
# Run the function
fetch_and_save_mobile_pageviews(movie_titles)

Fetching mobile pageviews for: Everything Everywhere All at Once
Fetching mobile pageviews for: All Quiet on the Western Front (2022 film)
Fetching mobile pageviews for: The Whale (2022 film)
Fetching mobile pageviews for: Top Gun: Maverick
Fetching mobile pageviews for: Black Panther: Wakanda Forever
Fetching mobile pageviews for: Avatar: The Way of Water
Fetching mobile pageviews for: Women Talking (film)
Fetching mobile pageviews for: Guillermo del Toro's Pinocchio
Fetching mobile pageviews for: Navalny (film)
Fetching mobile pageviews for: The Elephant Whisperers
Fetching mobile pageviews for: An Irish Goodbye
Fetching mobile pageviews for: The Boy, the Mole, the Fox and the Horse (film)
Fetching mobile pageviews for: RRR (film)
Fetching mobile pageviews for: CODA (2021 film)
Fetching mobile pageviews for: Dune (2021 film)
Fetching mobile pageviews for: The Eyes of Tammy Faye (2021 film)
Fetching mobile pageviews for: No Time to Die
Fetching mobile pageviews for: The Windshield Wip

Fetching mobile pageviews for: Paperman
Fetching mobile pageviews for: Brave (2012 film)
Fetching mobile pageviews for: Searching for Sugar Man
Fetching mobile pageviews for: Inocente
Fetching mobile pageviews for: Curfew (2012 film)
Fetching mobile pageviews for: The Artist (film)
Fetching mobile pageviews for: Hugo (film)
Fetching mobile pageviews for: The Iron Lady (film)
Fetching mobile pageviews for: The Descendants
Fetching mobile pageviews for: The Girl with the Dragon Tattoo (2011 film)
Fetching mobile pageviews for: Midnight in Paris
Fetching mobile pageviews for: The Help (film)
Fetching mobile pageviews for: A Separation
Fetching mobile pageviews for: The Fantastic Flying Books of Mr. Morris Lessmore
Fetching mobile pageviews for: The Shore (2011 film)
Fetching mobile pageviews for: Undefeated (2011 film)
Fetching mobile pageviews for: The Muppets (film)
Fetching mobile pageviews for: Saving Face (2012 film)
Fetching mobile pageviews for: Beginners
Fetching mobile pageviews 

Fetching mobile pageviews for: Monsters, Inc.
Fetching mobile pageviews for: Pearl Harbor (film)
Fetching mobile pageviews for: Iris (2001 film)
Fetching mobile pageviews for: Shrek
Fetching mobile pageviews for: Training Day
Fetching mobile pageviews for: Monster's Ball
Fetching mobile pageviews for: Thoth (film)
Fetching mobile pageviews for: For the Birds (film)
Fetching mobile pageviews for: No Man's Land (2001 film)
Fetching mobile pageviews for: Murder on a Sunday Morning
Fetching mobile pageviews for: The Accountant (2001 film)
Fetching mobile pageviews for: Gladiator (2000 film)
Fetching mobile pageviews for: Crouching Tiger, Hidden Dragon
Fetching mobile pageviews for: Traffic (2000 film)
Fetching mobile pageviews for: Erin Brockovich (film)
Fetching mobile pageviews for: Almost Famous
Fetching mobile pageviews for: Wonder Boys (film)
Fetching mobile pageviews for: How the Grinch Stole Christmas (2000 film)
Fetching mobile pageviews for: U-571 (film)
Fetching mobile pageviews 

Fetching mobile pageviews for: Cyrano de Bergerac (1990 film)
Fetching mobile pageviews for: American Dream (film)
Fetching mobile pageviews for: Journey of Hope (film)
Fetching mobile pageviews for: Days of Waiting: The Life & Art of Estelle Ishigo
Fetching mobile pageviews for: Creature Comforts
Fetching mobile pageviews for: The Lunch Date
Fetching mobile pageviews for: Misery (film)
Fetching mobile pageviews for: Total Recall (1990 film)
Fetching mobile pageviews for: Driving Miss Daisy
Fetching mobile pageviews for: Glory (1989 film)
Fetching mobile pageviews for: Born on the Fourth of July (film)
Fetching mobile pageviews for: My Left Foot
Fetching mobile pageviews for: The Little Mermaid (1989 film)
Fetching mobile pageviews for: Dead Poets Society
Fetching mobile pageviews for: The Abyss
Fetching mobile pageviews for: Indiana Jones and the Last Crusade
Fetching mobile pageviews for: Henry V (1989 film)
Fetching mobile pageviews for: The Johnstown Flood (1989 film)
Fetching mobi

Fetching mobile pageviews for: The Black Stallion (film)
Fetching mobile pageviews for: The Deer Hunter
Fetching mobile pageviews for: Coming Home (1978 film)
Fetching mobile pageviews for: Midnight Express (film)
Fetching mobile pageviews for: Heaven Can Wait (1978 film)
Fetching mobile pageviews for: Days of Heaven
Fetching mobile pageviews for: California Suite (film)
Fetching mobile pageviews for: The Buddy Holly Story
Fetching mobile pageviews for: Death on the Nile (1978 film)
Fetching mobile pageviews for: The Flight of the Gossamer Condor
Fetching mobile pageviews for: Get Out Your Handkerchiefs
Fetching mobile pageviews for: Scared Straight!
Fetching mobile pageviews for: Special Delivery (1978 film)
Fetching mobile pageviews for: Teenage Father
Fetching mobile pageviews for: Thank God It's Friday (film)
Fetching mobile pageviews for: Superman (1978 film)
Fetching mobile pageviews for: Annie Hall
Fetching mobile pageviews for: Star Wars (film)
Fetching mobile pageviews for: Ju

Fetching mobile pageviews for: Camelot (film)
Fetching mobile pageviews for: Bonnie and Clyde (film)
Fetching mobile pageviews for: Guess Who's Coming to Dinner
Fetching mobile pageviews for: Doctor Dolittle (1967 film)
Fetching mobile pageviews for: The Graduate
Fetching mobile pageviews for: Thoroughly Modern Millie
Fetching mobile pageviews for: Cool Hand Luke
Fetching mobile pageviews for: The Dirty Dozen
Fetching mobile pageviews for: A Place to Stand (film)
Fetching mobile pageviews for: The Anderson Platoon
Fetching mobile pageviews for: The Box (1967 film)
Fetching mobile pageviews for: Closely Watched Trains
Fetching mobile pageviews for: The Redwoods
Fetching mobile pageviews for: A Man for All Seasons (1966 film)
Fetching mobile pageviews for: Who's Afraid of Virginia Woolf? (film)
Fetching mobile pageviews for: Grand Prix (1966 film)
Fetching mobile pageviews for: Fantastic Voyage
Fetching mobile pageviews for: A Man and a Woman
Fetching mobile pageviews for: Born Free
Fetc

Fetching mobile pageviews for: Declaration of Independence (film)
Fetching mobile pageviews for: The Defiant Ones
Fetching mobile pageviews for: Der Fuehrer's Face
Fetching mobile pageviews for: Desert Victory
Fetching mobile pageviews for: Design for Death
Fetching mobile pageviews for: Designing Woman
Fetching mobile pageviews for: Destination Moon (film)
Fetching mobile pageviews for: The Diary of Anne Frank (1959 film)
Fetching mobile pageviews for: Disraeli (1929 film)
Fetching mobile pageviews for: The Divine Lady
Fetching mobile pageviews for: Divorce Italian Style
Fetching mobile pageviews for: The Divorcee
Fetching mobile pageviews for: Dodsworth (film)
Fetching mobile pageviews for: The Dot and the Line
Fetching mobile pageviews for: A Double Life (1947 film)
Fetching mobile pageviews for: The Dove (1927 film)
Fetching mobile pageviews for: Dr. Jekyll and Mr. Hyde (1931 film)
Fetching mobile pageviews for: Dumbo
Fetching mobile pageviews for: Dylan Thomas (film)
Fetching mobi

Fetching mobile pageviews for: The Living Desert
Fetching mobile pageviews for: The Longest Day (film)
Fetching mobile pageviews for: Lost Horizon (1937 film)
Fetching mobile pageviews for: The Lost Weekend (film)
Fetching mobile pageviews for: Love Is a Many-Splendored Thing (film)
Fetching mobile pageviews for: Love Me or Leave Me (film)
Fetching mobile pageviews for: Lust for Life (1956 film)
Fetching mobile pageviews for: Magoo's Puddle Jumper
Fetching mobile pageviews for: Main Street on the March!
Fetching mobile pageviews for: The Man Who Knew Too Much (1956 film)
Fetching mobile pageviews for: Manhattan Melodrama
Fetching mobile pageviews for: Marie-Louise (film)
Fetching mobile pageviews for: Marty (film)
Fetching mobile pageviews for: Mary Poppins (film)
Fetching mobile pageviews for: Men Against the Arctic
Fetching mobile pageviews for: The Merry Widow (1934 film)
Fetching mobile pageviews for: A Midsummer Night's Dream (1935 film)
Fetching mobile pageviews for: Mighty Joe Y

Fetching mobile pageviews for: Stagecoach (1939 film)
Fetching mobile pageviews for: Stairway to Light
Fetching mobile pageviews for: Stalag 17
Fetching mobile pageviews for: Star in the Night
Fetching mobile pageviews for: A Star Is Born (1937 film)
Fetching mobile pageviews for: State Fair (1945 film)
Fetching mobile pageviews for: The Story of Louis Pasteur
Fetching mobile pageviews for: The Stratton Story
Fetching mobile pageviews for: Street Angel (1928 film)
Fetching mobile pageviews for: A Streetcar Named Desire (1951 film)
Fetching mobile pageviews for: Strike Up the Band (film)
Fetching mobile pageviews for: Surogat
Fetching mobile pageviews for: Sundays and Cybele
Fetching mobile pageviews for: Sunrise: A Song of Two Humans
Fetching mobile pageviews for: Sunset Boulevard (film)
Fetching mobile pageviews for: Survival City
Fetching mobile pageviews for: Suspicion (1941 film)
Fetching mobile pageviews for: Sweet Bird of Youth (1962 film)
Fetching mobile pageviews for: Sweethear

## Desktop

The following code has essentially the same purpose as the mobile section. The main differences are the names of the output files and the parameters. The other difference is that the mobile functions have to update the global variable twice because the access field needs to change. For desktop that is not necessary since it is the default in the global variable. You can refer to the comments above and apply them to these two functions. I have made notes of changes. 

In [None]:
def get_desktop_pageviews(article_title):
    """Fetch desktop pageviews for the given article."""
    # only fetching for "desktop", unlike mobile which fetches for both "mobile-web" and "mobile-app"
    ARTICLE_PAGEVIEWS_PARAMS_TEMPLATE["access"] = "desktop"
    desktop_views = request_pageviews_per_article(article_title)
    
    # Check if 'items' key exists in the response. Only checking 'items' key for the single "desktop" response, while mobile checks for both "mobile-web" and "mobile-app"
    if 'items' not in desktop_views:
        print(f"Missing data for article: {article_title}")
        return None
    
    return desktop_views

In [None]:
def fetch_and_save_desktop_pageviews(movie_titles):
    
    all_pageviews = {}
    
    for title in movie_titles:
        print(f"Fetching desktop pageviews for: {title}")
        #Calling the desktop-specific fetch function. Mobile version calls the combined mobile fetch function.
        desktop_views = get_desktop_pageviews(title)
        all_pageviews[title] = desktop_views

    # Define the filename
    start_date = ARTICLE_PAGEVIEWS_PARAMS_TEMPLATE["start"][:-2]
    end_date = ARTICLE_PAGEVIEWS_PARAMS_TEMPLATE["end"][:-2]
    # Filename has "desktop" instead of "mobile" in its naming.
    filename = f"academy_monthly_desktop_{start_date}-{end_date}.json"
    
    # Save the data to the JSON file
    with open(filename, 'w') as json_file:
        json.dump(all_pageviews, json_file, indent=4)

    print(f"Data saved to {filename}")

In [8]:
# Run the function
fetch_and_save_desktop_pageviews(movie_titles)

Fetching desktop pageviews for: Everything Everywhere All at Once
Fetching desktop pageviews for: All Quiet on the Western Front (2022 film)
Fetching desktop pageviews for: The Whale (2022 film)
Fetching desktop pageviews for: Top Gun: Maverick
Fetching desktop pageviews for: Black Panther: Wakanda Forever
Fetching desktop pageviews for: Avatar: The Way of Water
Fetching desktop pageviews for: Women Talking (film)
Fetching desktop pageviews for: Guillermo del Toro's Pinocchio
Fetching desktop pageviews for: Navalny (film)
Fetching desktop pageviews for: The Elephant Whisperers
Fetching desktop pageviews for: An Irish Goodbye
Fetching desktop pageviews for: The Boy, the Mole, the Fox and the Horse (film)
Fetching desktop pageviews for: RRR (film)
Fetching desktop pageviews for: CODA (2021 film)
Fetching desktop pageviews for: Dune (2021 film)
Fetching desktop pageviews for: The Eyes of Tammy Faye (2021 film)
Fetching desktop pageviews for: No Time to Die
Fetching desktop pageviews for: 

Fetching desktop pageviews for: Zero Dark Thirty
Fetching desktop pageviews for: Amour (2012 film)
Fetching desktop pageviews for: Anna Karenina (2012 film)
Fetching desktop pageviews for: Paperman
Fetching desktop pageviews for: Brave (2012 film)
Fetching desktop pageviews for: Searching for Sugar Man
Fetching desktop pageviews for: Inocente
Fetching desktop pageviews for: Curfew (2012 film)
Fetching desktop pageviews for: The Artist (film)
Fetching desktop pageviews for: Hugo (film)
Fetching desktop pageviews for: The Iron Lady (film)
Fetching desktop pageviews for: The Descendants
Fetching desktop pageviews for: The Girl with the Dragon Tattoo (2011 film)
Fetching desktop pageviews for: Midnight in Paris
Fetching desktop pageviews for: The Help (film)
Fetching desktop pageviews for: A Separation
Fetching desktop pageviews for: The Fantastic Flying Books of Mr. Morris Lessmore
Fetching desktop pageviews for: The Shore (2011 film)
Fetching desktop pageviews for: Undefeated (2011 film)

Fetching desktop pageviews for: 8 Mile (film)
Fetching desktop pageviews for: A Beautiful Mind (film)
Fetching desktop pageviews for: The Lord of the Rings: The Fellowship of the Ring
Fetching desktop pageviews for: Moulin Rouge!
Fetching desktop pageviews for: Black Hawk Down (film)
Fetching desktop pageviews for: Gosford Park
Fetching desktop pageviews for: Monsters, Inc.
Fetching desktop pageviews for: Pearl Harbor (film)
Fetching desktop pageviews for: Iris (2001 film)
Fetching desktop pageviews for: Shrek
Fetching desktop pageviews for: Training Day
Fetching desktop pageviews for: Monster's Ball
Fetching desktop pageviews for: Thoth (film)
Fetching desktop pageviews for: For the Birds (film)
Fetching desktop pageviews for: No Man's Land (2001 film)
Fetching desktop pageviews for: Murder on a Sunday Morning
Fetching desktop pageviews for: The Accountant (2001 film)
Fetching desktop pageviews for: Gladiator (2000 film)
Fetching desktop pageviews for: Crouching Tiger, Hidden Dragon
F

Fetching desktop pageviews for: City Slickers
Fetching desktop pageviews for: Deadly Deception: General Electric, Nuclear Weapons and Our Environment
Fetching desktop pageviews for: Dances with Wolves
Fetching desktop pageviews for: Dick Tracy (1990 film)
Fetching desktop pageviews for: Ghost (1990 film)
Fetching desktop pageviews for: Goodfellas
Fetching desktop pageviews for: The Hunt for Red October (film)
Fetching desktop pageviews for: Reversal of Fortune
Fetching desktop pageviews for: Cyrano de Bergerac (1990 film)
Fetching desktop pageviews for: American Dream (film)
Fetching desktop pageviews for: Journey of Hope (film)
Fetching desktop pageviews for: Days of Waiting: The Life & Art of Estelle Ishigo
Fetching desktop pageviews for: Creature Comforts
Fetching desktop pageviews for: The Lunch Date
Fetching desktop pageviews for: Misery (film)
Fetching desktop pageviews for: Total Recall (1990 film)
Fetching desktop pageviews for: Driving Miss Daisy
Fetching desktop pageviews for

Fetching desktop pageviews for: Kramer vs. Kramer
Fetching desktop pageviews for: All That Jazz (film)
Fetching desktop pageviews for: Apocalypse Now
Fetching desktop pageviews for: Norma Rae
Fetching desktop pageviews for: Breaking Away
Fetching desktop pageviews for: Alien (film)
Fetching desktop pageviews for: Being There
Fetching desktop pageviews for: A Little Romance
Fetching desktop pageviews for: Best Boy (film)
Fetching desktop pageviews for: Board and Care
Fetching desktop pageviews for: Every Child (film)
Fetching desktop pageviews for: Paul Robeson: Tribute to an Artist
Fetching desktop pageviews for: The Tin Drum (film)
Fetching desktop pageviews for: The Black Stallion (film)
Fetching desktop pageviews for: The Deer Hunter
Fetching desktop pageviews for: Coming Home (1978 film)
Fetching desktop pageviews for: Midnight Express (film)
Fetching desktop pageviews for: Heaven Can Wait (1978 film)
Fetching desktop pageviews for: Days of Heaven
Fetching desktop pageviews for: Ca

Fetching desktop pageviews for: 2001: A Space Odyssey (film)
Fetching desktop pageviews for: Bullitt
Fetching desktop pageviews for: The Producers (1967 film)
Fetching desktop pageviews for: Rosemary's Baby (film)
Fetching desktop pageviews for: The Subject Was Roses (film)
Fetching desktop pageviews for: The Thomas Crown Affair (1968 film)
Fetching desktop pageviews for: War and Peace (film series)
Fetching desktop pageviews for: Charly
Fetching desktop pageviews for: Journey into Self (1968 film)
Fetching desktop pageviews for: Robert Kennedy Remembered
Fetching desktop pageviews for: Why Man Creates
Fetching desktop pageviews for: Winnie the Pooh and the Blustery Day
Fetching desktop pageviews for: Planet of the Apes (1968 film)
Fetching desktop pageviews for: In the Heat of the Night (film)
Fetching desktop pageviews for: Camelot (film)
Fetching desktop pageviews for: Bonnie and Clyde (film)
Fetching desktop pageviews for: Guess Who's Coming to Dinner
Fetching desktop pageviews for

Fetching desktop pageviews for: The Country Cousin
Fetching desktop pageviews for: The Country Girl (1954 film)
Fetching desktop pageviews for: Cover Girl (film)
Fetching desktop pageviews for: The Cowboy and the Lady (1938 film)
Fetching desktop pageviews for: Crash Dive
Fetching desktop pageviews for: Crashing the Water Barrier
Fetching desktop pageviews for: The Critic (1963 film)
Fetching desktop pageviews for: La Cucaracha (1934 film)
Fetching desktop pageviews for: Cyrano de Bergerac (1950 film)
Fetching desktop pageviews for: A Damsel in Distress (1937 film)
Fetching desktop pageviews for: Dangerous (1935 film)
Fetching desktop pageviews for: The Dark Angel (1935 film)
Fetching desktop pageviews for: The Dawn Patrol (1930 film)
Fetching desktop pageviews for: Day of the Painter
Fetching desktop pageviews for: Daybreak in Udi
Fetching desktop pageviews for: Days of Wine and Roses (film)
Fetching desktop pageviews for: December 7th: The Movie
Fetching desktop pageviews for: Declar

Fetching desktop pageviews for: Krakatoa (film)
Fetching desktop pageviews for: Kukan
Fetching desktop pageviews for: La Strada
Fetching desktop pageviews for: La Dolce Vita
Fetching desktop pageviews for: Lady Be Good (1941 film)
Fetching desktop pageviews for: The Last Command (1928 film)
Fetching desktop pageviews for: Laura (1944 film)
Fetching desktop pageviews for: The Lavender Hill Mob
Fetching desktop pageviews for: Lawrence of Arabia (film)
Fetching desktop pageviews for: Leave Her to Heaven
Fetching desktop pageviews for: Lend a Paw
Fetching desktop pageviews for: Les Girls
Fetching desktop pageviews for: A Letter to Three Wives
Fetching desktop pageviews for: The Life of Emile Zola
Fetching desktop pageviews for: Light in the Window
Fetching desktop pageviews for: Lili
Fetching desktop pageviews for: Lilies of the Field (1963 film)
Fetching desktop pageviews for: The Kidnappers
Fetching desktop pageviews for: The Little Orphan
Fetching desktop pageviews for: Little Women (19

Fetching desktop pageviews for: The Sin of Madelon Claudet
Fetching desktop pageviews for: Since You Went Away
Fetching desktop pageviews for: Skippy (film)
Fetching desktop pageviews for: Sky Above and Mud Beneath
Fetching desktop pageviews for: The Snake Pit
Fetching desktop pageviews for: Snow White and the Seven Dwarfs (1937 film)
Fetching desktop pageviews for: So Much for So Little
Fetching desktop pageviews for: So This Is Harris!
Fetching desktop pageviews for: The Solid Gold Cadillac
Fetching desktop pageviews for: Some Like It Hot
Fetching desktop pageviews for: Somebody Up There Likes Me (1956 film)
Fetching desktop pageviews for: The Song of Bernadette (film)
Fetching desktop pageviews for: Song of the South
Fetching desktop pageviews for: Song Without End
Fetching desktop pageviews for: Sons and Lovers (film)
Fetching desktop pageviews for: Sons of Liberty (film)
Fetching desktop pageviews for: South Pacific (1958 film)
Fetching desktop pageviews for: Spartacus (film)
Fetc

## Cumulative

The following functions grab the Monthly cumulative data (which is the sum of all mobile, and all desktop traffic per article). It is also structured very similarly to the two sections before.

In [None]:
def fetch_and_save_cumulative_pageviews(movie_titles):
    
    """
       Fetch cumulative pageviews for each movie title across different access types and save them to a JSON file.
    """
    
    # Dictionary to hold the cumulative pageviews for all movie titles
    all_pageviews = {}
    
    
    # Iterating through each movie title to fetch the pageviews
    for title in movie_titles:
        print(f"Fetching cumulative pageviews for: {title}")
        
        # Fetching Desktop Views
        ARTICLE_PAGEVIEWS_PARAMS_TEMPLATE['access'] = 'desktop'
        desktop_views = request_pageviews_per_article(title)

        # Fetching Mobile Views (combining mobile-web and mobile-app)
        ARTICLE_PAGEVIEWS_PARAMS_TEMPLATE['access'] = 'mobile-web'
        mobile_web_views = request_pageviews_per_article(title)
        ARTICLE_PAGEVIEWS_PARAMS_TEMPLATE['access'] = 'mobile-app'
        mobile_app_views = request_pageviews_per_article(title)

        # Ensuring data is available for all access types
        if 'items' not in desktop_views or 'items' not in mobile_web_views or 'items' not in mobile_app_views:
            print(f"Data missing for one or more access types for: {title}")
            continue

        # Combining Desktop and Mobile Views
        combined_views = []
        for d_view, mweb_view, mapp_view in zip(desktop_views['items'], mobile_web_views['items'], mobile_app_views['items']):
            # Assuming the views are ordered chronologically and correctly, so we just add their counts
            combined_views.append({
                'timestamp': d_view['timestamp'],
                'views': d_view['views'] + mweb_view['views'] + mapp_view['views']
            })
        # Storing the combined data in the all_pageviews dictionary
        all_pageviews[title] = combined_views

    # Saving the consolidated data to a JSON file
    start_date = "20150701"
    end_date = "20230930"
    with open(f'academy_monthly_cumulative_{start_date}-{end_date}.json', 'w') as outfile:
        json.dump(all_pageviews, outfile, indent=4)

    print(f"Data saved to academy_monthly_cumulative_{start_date}-{end_date}.json")

In [24]:
# Assuming you have the movie_titles list
fetch_and_save_cumulative_pageviews(movie_titles)

Fetching cumulative pageviews for: Everything Everywhere All at Once
Fetching cumulative pageviews for: All Quiet on the Western Front (2022 film)
Fetching cumulative pageviews for: The Whale (2022 film)
Fetching cumulative pageviews for: Top Gun: Maverick
Fetching cumulative pageviews for: Black Panther: Wakanda Forever
Fetching cumulative pageviews for: Avatar: The Way of Water
Fetching cumulative pageviews for: Women Talking (film)
Fetching cumulative pageviews for: Guillermo del Toro's Pinocchio
Fetching cumulative pageviews for: Navalny (film)
Fetching cumulative pageviews for: The Elephant Whisperers
Fetching cumulative pageviews for: An Irish Goodbye
Fetching cumulative pageviews for: The Boy, the Mole, the Fox and the Horse (film)
Fetching cumulative pageviews for: RRR (film)
Fetching cumulative pageviews for: CODA (2021 film)
Fetching cumulative pageviews for: Dune (2021 film)
Fetching cumulative pageviews for: The Eyes of Tammy Faye (2021 film)
Fetching cumulative pageviews f

Fetching cumulative pageviews for: The Great Beauty
Fetching cumulative pageviews for: 20 Feet from Stardom
Fetching cumulative pageviews for: Argo (2012 film)
Fetching cumulative pageviews for: Life of Pi (film)
Fetching cumulative pageviews for: Les Misérables (2012 film)
Fetching cumulative pageviews for: Lincoln (film)
Fetching cumulative pageviews for: Django Unchained
Fetching cumulative pageviews for: Skyfall
Fetching cumulative pageviews for: Silver Linings Playbook
Fetching cumulative pageviews for: Zero Dark Thirty
Fetching cumulative pageviews for: Amour (2012 film)
Fetching cumulative pageviews for: Anna Karenina (2012 film)
Fetching cumulative pageviews for: Paperman
Fetching cumulative pageviews for: Brave (2012 film)
Fetching cumulative pageviews for: Searching for Sugar Man
Fetching cumulative pageviews for: Inocente
Fetching cumulative pageviews for: Curfew (2012 film)
Fetching cumulative pageviews for: The Artist (film)
Fetching cumulative pageviews for: Hugo (film)
F

Fetching cumulative pageviews for: Harvie Krumpet
Fetching cumulative pageviews for: Chernobyl Heart
Fetching cumulative pageviews for: The Fog of War
Fetching cumulative pageviews for: Chicago (2002 film)
Fetching cumulative pageviews for: The Pianist (2002 film)
Fetching cumulative pageviews for: The Lord of the Rings: The Two Towers
Fetching cumulative pageviews for: Frida (film)
Fetching cumulative pageviews for: The Hours (film)
Fetching cumulative pageviews for: Road to Perdition
Fetching cumulative pageviews for: Adaptation (film)
Fetching cumulative pageviews for: Talk to Her
Fetching cumulative pageviews for: This Charming Man (film)
Fetching cumulative pageviews for: Spirited Away
Fetching cumulative pageviews for: Nowhere in Africa
Fetching cumulative pageviews for: The ChubbChubbs!
Fetching cumulative pageviews for: Twin Towers (film)
Fetching cumulative pageviews for: Bowling for Columbine
Fetching cumulative pageviews for: 8 Mile (film)
Fetching cumulative pageviews for: 

Fetching cumulative pageviews for: Bram Stoker's Dracula (1992 film)
Fetching cumulative pageviews for: Aladdin (1992 Disney film)
Fetching cumulative pageviews for: The Crying Game
Fetching cumulative pageviews for: Scent of a Woman (1992 film)
Fetching cumulative pageviews for: A River Runs Through It (film)
Fetching cumulative pageviews for: Indochine (film)
Fetching cumulative pageviews for: My Cousin Vinny
Fetching cumulative pageviews for: The Panama Deception
Fetching cumulative pageviews for: Educating Peter
Fetching cumulative pageviews for: The Last of the Mohicans (1992 film)
Fetching cumulative pageviews for: Death Becomes Her
Fetching cumulative pageviews for: Omnibus (film)
Fetching cumulative pageviews for: Mona Lisa Descending a Staircase
Fetching cumulative pageviews for: The Silence of the Lambs (film)
Fetching cumulative pageviews for: Terminator 2: Judgment Day
Fetching cumulative pageviews for: Bugsy
Fetching cumulative pageviews for: JFK (film)
Fetching cumulative

Fetching cumulative pageviews for: Victor/Victoria
Data missing for one or more access types for: Victor/Victoria
Fetching cumulative pageviews for: Sophie's Choice (film)
Fetching cumulative pageviews for: Missing (1982 film)
Fetching cumulative pageviews for: If You Love This Planet
Fetching cumulative pageviews for: Just Another Missing Kid
Fetching cumulative pageviews for: A Shocking Accident
Fetching cumulative pageviews for: Tango (1981 film)
Fetching cumulative pageviews for: Begin the Beguine (film)
Fetching cumulative pageviews for: Quest for Fire (film)
Fetching cumulative pageviews for: Chariots of Fire
Fetching cumulative pageviews for: Raiders of the Lost Ark
Fetching cumulative pageviews for: Reds (film)
Fetching cumulative pageviews for: On Golden Pond (1981 film)
Fetching cumulative pageviews for: Arthur (1981 film)
Fetching cumulative pageviews for: An American Werewolf in London
Fetching cumulative pageviews for: Close Harmony (1981 film)
Fetching cumulative pageview

Fetching cumulative pageviews for: Summer of '42
Fetching cumulative pageviews for: The Garden of the Finzi-Continis (film)
Fetching cumulative pageviews for: The Hospital
Fetching cumulative pageviews for: Klute
Fetching cumulative pageviews for: Shaft (1971 film)
Fetching cumulative pageviews for: The Crunch Bird
Fetching cumulative pageviews for: The Hellstrom Chronicle
Fetching cumulative pageviews for: Patton (film)
Fetching cumulative pageviews for: Ryan's Daughter
Fetching cumulative pageviews for: Airport (1970 film)
Fetching cumulative pageviews for: Love Story (1970 film)
Fetching cumulative pageviews for: MASH (film)
Fetching cumulative pageviews for: Tora! Tora! Tora!
Fetching cumulative pageviews for: Women in Love (film)
Fetching cumulative pageviews for: Lovers and Other Strangers
Fetching cumulative pageviews for: Woodstock (film)
Fetching cumulative pageviews for: Cromwell (film)
Fetching cumulative pageviews for: Investigation of a Citizen Above Suspicion
Fetching cum

Fetching cumulative pageviews for: Black Narcissus
Fetching cumulative pageviews for: Black Orpheus
Fetching cumulative pageviews for: The Black Swan (film)
Fetching cumulative pageviews for: Blithe Spirit (1945 film)
Fetching cumulative pageviews for: Blood and Sand (1941 film)
Fetching cumulative pageviews for: Blood on the Sun
Fetching cumulative pageviews for: Blossoms in the Dust
Fetching cumulative pageviews for: Body and Soul (1947 film)
Fetching cumulative pageviews for: Bored of Education
Fetching cumulative pageviews for: Born Yesterday (1950 film)
Fetching cumulative pageviews for: A Boy and His Dog (1946 film)
Fetching cumulative pageviews for: Boys Town (film)
Fetching cumulative pageviews for: The Brave One (1956 film)
Fetching cumulative pageviews for: Breakfast at Tiffany's (film)
Fetching cumulative pageviews for: The Sound Barrier
Fetching cumulative pageviews for: The Bridge of San Luis Rey (1929 film)
Fetching cumulative pageviews for: The Bridge on the River Kwai
F

Fetching cumulative pageviews for: Harvey (1950 film)
Fetching cumulative pageviews for: The Harvey Girls
Fetching cumulative pageviews for: Heavenly Music
Fetching cumulative pageviews for: The Heiress
Fetching cumulative pageviews for: Helen Keller in Her Story
Fetching cumulative pageviews for: Hello, Frisco, Hello
Fetching cumulative pageviews for: Henry V (1944 film)
Fetching cumulative pageviews for: Here Comes Mr. Jordan
Fetching cumulative pageviews for: Here Comes the Groom
Fetching cumulative pageviews for: The High and the Mighty (film)
Fetching cumulative pageviews for: High Noon
Fetching cumulative pageviews for: Hitler Lives
Fetching cumulative pageviews for: A Hole in the Head
Fetching cumulative pageviews for: The Hole (1962 film)
Fetching cumulative pageviews for: Holiday Inn (film)
Fetching cumulative pageviews for: The Horse with the Flying Tail
Fetching cumulative pageviews for: The House I Live In (1945 film)
Fetching cumulative pageviews for: The House on 92nd Str

Fetching cumulative pageviews for: The Paleface (1948 film)
Fetching cumulative pageviews for: Panic in the Streets (film)
Fetching cumulative pageviews for: Papa's Delicate Condition
Fetching cumulative pageviews for: The Patriot (1928 film)
Fetching cumulative pageviews for: Penny Wisdom
Fetching cumulative pageviews for: Phantom of the Opera (1943 film)
Fetching cumulative pageviews for: The Philadelphia Story (film)
Fetching cumulative pageviews for: Picnic (1955 film)
Fetching cumulative pageviews for: The Picture of Dorian Gray (1945 film)
Fetching cumulative pageviews for: Pillow Talk (film)
Fetching cumulative pageviews for: The Pink Phink
Fetching cumulative pageviews for: Pinocchio (1940 film)
Fetching cumulative pageviews for: A Place in the Sun (1951 film)
Fetching cumulative pageviews for: Plymouth Adventure
Fetching cumulative pageviews for: Pollyanna (1960 film)
Fetching cumulative pageviews for: Porgy and Bess (film)
Fetching cumulative pageviews for: Portrait of Jennie

Fetching cumulative pageviews for: Torture Money
Fetching cumulative pageviews for: Toward Independence
Fetching cumulative pageviews for: Transatlantic (1931 film)
Fetching cumulative pageviews for: The Treasure of the Sierra Madre (film)
Fetching cumulative pageviews for: A Tree Grows in Brooklyn (1945 film)
Fetching cumulative pageviews for: The True Glory
Fetching cumulative pageviews for: The True Story of the Civil War
Fetching cumulative pageviews for: Tweetie Pie
Fetching cumulative pageviews for: Twelve O'Clock High
Fetching cumulative pageviews for: Two Arabian Knights
Fetching cumulative pageviews for: The Two Mouseketeers
Fetching cumulative pageviews for: Two Women
Fetching cumulative pageviews for: The Ugly Duckling (1939 film)
Fetching cumulative pageviews for: Underworld (1927 film)
Fetching cumulative pageviews for: The V.I.P.s (film)
Fetching cumulative pageviews for: Perfect Strangers (1945 film)
Fetching cumulative pageviews for: Van Gogh (1948 film)
Fetching cumula

### Quick Sanity Check

In [28]:
import json

def print_file_length(filename):
    with open(filename, 'r') as f:
        data = json.load(f)
        print(f"Length of {filename}: {len(data)}")

# Define your file names
file1 = 'academy_monthly_mobile_20150701-20230930.json'
file2 = 'academy_monthly_desktop_202001-202303.json'
file3 = 'academy_monthly_cumulative_20150701-20230930.json'

# Print out the lengths
print_file_length(file1)
print_file_length(file2)
print_file_length(file3)


Length of academy_monthly_mobile_20150701-20230930.json: 1359
Length of academy_monthly_desktop_202001-202303.json: 1358
Length of academy_monthly_cumulative_20150701-20230930.json: 1358
