# Analog Horror Statistics Report

This report gathers viewing data to compare the relative size of selected analog horror series.

1 - All of these series can be found on YouTube.
2 - This report is specific to the October/November 2024 timeframe. This may change with the implementation of a web scraper.

## Series used in this study

1. The Mandela Catalogue
2. Monument Mythos
3. Greylock
4. Midwest Angelica
5. The Walton Files
6. Local 58
7. Marble Hornets
8. The Oldest View

#### These were chosen as they all meet the following criteria.
1. A sizeable following and overall popularity. These are the series most likely to be recognized by fans of the genre.
2. All are created by a specific group or individual, as opposed to a community-led project.
3. These are original, and did not stem from a preexisting IP or idea.
4. All consist of 5 or more episodes or installments.

#### Regarding Exclusions
It is important to note that certain series have been left off this list. As this report is compiled by a single individual, it was decided to focus on only 8 projects in order to prevent scope creep. This list should not be seen as a judgement of quality, although the author's personal interests did play a role in determining the list.

## Imports

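This segment was added because of issues seen with the "googleapiclient" import. The install cell below makes sure the required packages are present before the import cell runs.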

In [5]:
%pip install google-api-python-client google-auth-httplib2 google-auth-oauthlib


Collecting google-api-python-client
  Using cached google_api_python_client-2.151.0-py2.py3-none-any.whl.metadata (6.7 kB)
Collecting google-auth-httplib2
  Using cached google_auth_httplib2-0.2.0-py2.py3-none-any.whl.metadata (2.2 kB)
Collecting google-auth-oauthlib
  Using cached google_auth_oauthlib-1.2.1-py2.py3-none-any.whl.metadata (2.7 kB)
Collecting httplib2<1.dev0,>=0.19.0 (from google-api-python-client)
  Using cached httplib2-0.22.0-py3-none-any.whl.metadata (2.6 kB)
Collecting google-auth!=2.24.0,!=2.25.0,<3.0.0.dev0,>=1.32.0 (from google-api-python-client)
  Using cached google_auth-2.35.0-py2.py3-none-any.whl.metadata (4.7 kB)
Collecting google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0.dev0,>=1.31.5 (from google-api-python-client)
  Using cached google_api_core-2.22.0-py3-none-any.whl.metadata (2.9 kB)
Collecting uritemplate<5,>=3.0.1 (from google-api-python-client)
  Using cached uritemplate-4.1.1-py2.py3-none-any.whl.metadata (2.9 kB)
Collecting requests-oauthlib>=


[notice] A new release of pip is available: 24.2 -> 24.3.1
[notice] To update, run: python.exe -m pip install --upgrade pip


In [1]:

import varibles
import pandas as pd
import json
import googleapiclient.discovery
import sqlite3
import matplotlib.pyplot as plt
%load_ext sql


The next portion of code gathers the YouTube stats for each video listed in the file "YT_URLS.txt" and writes them to a single JSON file named "raw_YT_data.json", one API response per line.

In [None]:
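# Read the API key from the local "varibles" module (replace with your actual API key)
API_KEY = varibles.key
youtube = googleapiclient.discovery.build('youtube', 'v3', developerKey=API_KEY)

with open('c:/Users/carso/Desktop/MainRepo/Projects/YT_URLS.txt', 'r') as INPUT_LIST:
    for line in INPUT_LIST:
        # Skip blank lines and any line without an "=" (no video ID to extract)
        if '=' not in line:
            continue
        # The video ID is the 11 characters that follow the first "=" in a watch URL
        yt_id = line.split('=')[1][:11]
        request = youtube.videos().list(part='snippet,contentDetails,statistics', id=yt_id)
        response = request.execute()
        # Append one JSON response per line to the output file
        with open('./raw_YT_data.json', 'a') as fd:
            json.dump(response, fd)
            fd.write("\n")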
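The `split('=')` step above relies on every line being a standard watch URL with the video ID directly after the first `=`. As a hedged alternative, the sketch below parses the URL with the standard library instead. The helper name `extract_video_id` and the two example URLs are assumptions made for illustration; the actual contents of "YT_URLS.txt" are not shown in this report.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the video ID from a YouTube URL, or None if none is found.

    Hypothetical helper: handles watch URLs (?v=...) and youtu.be short links.
    """
    parsed = urlparse(url.strip())
    if parsed.netloc.lower().endswith("youtu.be"):
        # Short links carry the ID as the URL path
        return parsed.path.lstrip("/") or None
    # Standard watch URLs carry the ID in the "v" query parameter
    values = parse_qs(parsed.query).get("v")
    return values[0] if values else None


# Example URLs are assumed; the IDs are taken from the output shown later in this report
print(extract_video_id("https://www.youtube.com/watch?v=CJXtTWN4NQE&t=10s"))
print(extract_video_id("https://youtu.be/VwCe45AH-_8"))
```

A helper like this returns `None` for blank or unrecognized lines, so the calling loop can skip them rather than raise an `IndexError`.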

The cell below extracts the data from the JSON file and loads it into dataframes.

In [None]:
vid_data = []   # per-video rows
chan_data = []  # per-channel rows (one entry per video, so channels can repeat)
YT_data = []    # combined video + channel rows
yt_list = []    # parsed API responses, one per line of the JSON file

# Each line of the file holds one complete JSON response
with open("c:/Users/carso/Desktop/MainRepo/Projects/raw_YT_data.json") as f:
    for raw_line in f:
        videodata = json.loads(raw_line)
        yt_list.append(videodata)

# Pull the fields of interest out of every item in every response
for line in yt_list:
    for item in line["items"]:
        title = item["snippet"]["title"]
        date = item["snippet"]["publishedAt"]
        vidId = item["id"]
        chanTitle = item["snippet"]["channelTitle"]
        chanId = item["snippet"]["channelId"]
        views = item["statistics"]["viewCount"]
        comments = item["statistics"]["commentCount"]
        vid_str = (title,date,vidId,views,comments)
        chan_str = (chanTitle,chanId)
        yt_net_data = (title,date,vidId,views,comments,chanTitle,chanId)
        vid_data.append(vid_str)
        chan_data.append(chan_str)
        YT_data.append(yt_net_data)

YT_NetData = pd.DataFrame(YT_data, columns=["Title","Publication Date","Video ID","Views","Comments","Channel Name","Channel ID"])
Chan_Stats = pd.DataFrame(chan_data, columns=["Channel Name","Channel ID"])
Vid_Stats = pd.DataFrame(vid_data, columns=["Title","Publication Date","Video ID","Views","Comments"])

YT_NetData.head()

                                               Title      Publication Date  \
0  TAPE 002 - to the mountain             GREYLOC...  2023-03-20T01:54:13Z   
1  TAPE 003 - orientation protocols             G...  2023-03-31T03:19:00Z   
2  TAPE 004 - unexpected visitors             GRE...  2023-04-14T18:00:15Z   

      Video ID   Views Comments Channel Name                Channel ID  
0  CJXtTWN4NQE  384690      554     GREYLOCK  UCYK5vX7-rpRCPOWJrWuMerg  
1  VwCe45AH-_8  338361      508     GREYLOCK  UCYK5vX7-rpRCPOWJrWuMerg  
2  hSTTxni3f90  277160      499     GREYLOCK  UCYK5vX7-rpRCPOWJrWuMerg  
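The extraction loop above stores the view and comment counts exactly as they appear in the JSON, which means they are kept as strings, and it indexes `commentCount` directly. As a hedged sketch, the helper below converts the counts to integers, parses the timestamp, and tolerates a missing `commentCount` (which, to the best of my knowledge, the API omits when comments are disabled). The name `parse_item` and the sample record are assumptions invented for illustration; the timestamp format is assumed to match the `publishedAt` values shown above.

```python
import json
from datetime import datetime, timezone


def parse_item(item):
    """Flatten one API "items" entry into a dict with typed values (hypothetical helper)."""
    snippet = item["snippet"]
    stats = item.get("statistics", {})
    comments = stats.get("commentCount")  # may be absent for some videos
    published = datetime.strptime(snippet["publishedAt"], "%Y-%m-%dT%H:%M:%SZ")
    return {
        "Title": snippet["title"],
        "Publication Date": published.replace(tzinfo=timezone.utc),
        "Video ID": item["id"],
        "Views": int(stats.get("viewCount", 0)),
        "Comments": int(comments) if comments is not None else None,
    }


# Made-up record shaped like one line of raw_YT_data.json, trimmed to the fields used here
sample_line = json.dumps({
    "items": [{
        "id": "abcdefghijk",
        "snippet": {"title": "Example video", "publishedAt": "2023-03-20T01:54:13Z"},
        "statistics": {"viewCount": "1200"},
    }]
})

rows = [parse_item(item) for item in json.loads(sample_line)["items"]]
print(rows[0]["Views"], rows[0]["Comments"], rows[0]["Publication Date"].year)
```

A list of dicts like `rows` can be passed straight to `pd.DataFrame`, which would give numeric columns that sort and sum correctly.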

## Move the dataframe into a database

In [None]:
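# Open (or create) the SQLite database file
conn = sqlite3.connect('c:/Users/carso/Desktop/MainRepo/Projects/AnalogeH.db')

# Write each dataframe to its own table; by default to_sql raises an error if the table already exists
Vid_Stats.to_sql('Video_Data', conn)
Chan_Stats.to_sql('Channel_Data', conn)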

3
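Once the tables exist, they can be queried with plain SQL. The sketch below is a minimal illustration under stated assumptions: it uses an in-memory database as a stand-in for "AnalogeH.db", recreates only three columns of `Video_Data`, and fills them with the three rows shown in the dataframe output above. It assumes the counts were stored as text, since the dataframe holds them as strings.

```python
import sqlite3

# In-memory database standing in for AnalogeH.db (assumption for this sketch)
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Video_Data ("Video ID" TEXT, "Views" TEXT, "Comments" TEXT)')
conn.executemany(
    "INSERT INTO Video_Data VALUES (?, ?, ?)",
    [
        ("CJXtTWN4NQE", "384690", "554"),
        ("VwCe45AH-_8", "338361", "508"),
        ("hSTTxni3f90", "277160", "499"),
    ],
)

# Column names containing spaces need double quotes; CAST turns the text counts into integers
total_views, video_count = conn.execute(
    'SELECT SUM(CAST("Views" AS INTEGER)), COUNT("Video ID") FROM Video_Data'
).fetchone()
conn.close()

print(total_views, video_count)
```

The same query should work against the real database file by swapping `":memory:"` for its path, provided the table and column names match those written by `to_sql`.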