# Exploring Trends from Various APIs  
**Filename:** exploring_trends.ipynb  
**Path:** TAMIDS/Code/Scholars@TAMU Data/exploring_trends.ipynb  
**Created Date:** 04 April 2022, 15:17 

Playing around with multiple APIs to see what kind of data I can get.

In [19]:
from IPython.display import Markdown, display, HTML
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import json
import requests
from requests.exceptions import HTTPError
from tqdm import tqdm
from bs4 import BeautifulSoup
import re

pd.options.display.float_format = '{:,.3f}'.format
# Matplotlib 3.6 renamed the seaborn styles; on newer versions use 'seaborn-v0_8-darkgrid'.
plt.style.use('seaborn-darkgrid')

# General Markdown Formatting Functions

def printmd(string, level=1):
    header_level = '#'*level + ' '
    display(Markdown(header_level + string))

In [22]:
def get_api_dict(url: str, kind=None) -> dict:
    """Fetch `url` and return the decoded JSON body, or {} on any error."""
    # The Wikimedia API asks clients to identify themselves, so send
    # descriptive headers for Wikipedia requests only.
    headers = None
    if kind == 'Wikipedia':
        headers = {
            'User-Agent': 'My User Agent',
            'From': 'abibstopher@tamu.edu'
        }
    try:
        response = requests.get(url, headers=headers)
        response.raise_for_status()
        return response.json()
    except HTTPError as http_err:
        print(f'HTTP error occurred: {http_err}')
    except Exception as err:
        print(f'Other error occurred: {err}')
    return {}

## Twitter

People use this social media platform to give their thoughts on topics they have an interest in.

## Google

People use this search engine to find information and resources for things they are interested in.  


The `PyTrends` Python package will handle these calls.
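
A minimal sketch of what those calls might look like, assuming the `pytrends` package is installed and Google Trends is reachable. The helper name `fetch_interest`, the default timeframe, and the `client` parameter are hypothetical and only there for illustration; `TrendReq`, `build_payload`, and `interest_over_time` are the pytrends calls as I understand them. The client is injectable so the call sequence can be checked without network access.

```python
def fetch_interest(keywords, timeframe='today 3-m', client=None):
    """Return Google Trends interest-over-time data for up to five keywords.

    `client` is any object exposing pytrends' `build_payload` and
    `interest_over_time` methods; a real TrendReq is created when omitted.
    """
    keywords = list(keywords)
    if not 1 <= len(keywords) <= 5:
        # Google Trends compares at most five terms per request.
        raise ValueError('pass between one and five keywords')
    if client is None:
        # Imported lazily so the helper can be defined without pytrends installed.
        from pytrends.request import TrendReq
        client = TrendReq(hl='en-US', tz=360)
    client.build_payload(kw_list=keywords, timeframe=timeframe)
    return client.interest_over_time()
```

With a real client, `fetch_interest(['Ukraine', 'Vladimir Putin'])` should return a pandas DataFrame indexed by date, with one column per keyword plus an `isPartial` flag.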

## Wikipedia

People go to this free online encyclopedia to learn more about topics they are interested in.

In [23]:
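def get_wikipedia_article(title: str, base_url: str = 'https://en.wikipedia.org/wiki/') -> str:
    """Return the paragraph text of a Wikipedia article, or '' on error."""
    # Build the article URL from the title instead of relying on a global `url`.
    url = base_url + title
    try:
        response = requests.get(url)
        response.raise_for_status()
        soup = BeautifulSoup(response.content, 'html.parser')
    except HTTPError as http_err:
        print(f'HTTP error occurred: {http_err}')
    except Exception as err:
        print(f'Other error occurred: {err}')
    else:
        text = ''
        for paragraph in soup.find_all('p'):
            text += paragraph.text

        # Strip bracketed markers such as [1] or [citation needed], then newlines.
        text = re.sub(r'\[.*?\]+', '', text)
        text = text.replace('\n', '')
        return text
    return ''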

def get_top_wiki_views(access='all-access', date='2022/03/all-days'):
    # No trailing slash here; the separators are added when joining below.
    base_url = 'https://wikimedia.org/api/rest_v1/metrics/pageviews/top/en.wikipedia'
    url = base_url + '/' + access + '/' + date
    return get_api_dict(url=url, kind='Wikipedia')
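
The date segment of the URL is easy to mistype by hand. Here is a sketch of a builder that assembles the same endpoint path used in this notebook from numeric parts. The helper name `top_views_url` and its validation rules are assumptions for illustration; the access values are the four listed in the next cell.

```python
PAGEVIEWS_TOP = 'https://wikimedia.org/api/rest_v1/metrics/pageviews/top'
ACCESS_METHODS = ('desktop', 'mobile-app', 'mobile-web', 'all-access')

def top_views_url(year, month, day='all-days', access='all-access',
                  project='en.wikipedia'):
    """Build a top-pageviews URL; day='all-days' requests the whole month."""
    if access not in ACCESS_METHODS:
        raise ValueError(f'unknown access method: {access!r}')
    if not 1 <= int(month) <= 12:
        raise ValueError(f'month out of range: {month!r}')
    # Zero-pad numeric parts so 3 becomes '03', matching the API path format.
    day_part = day if day == 'all-days' else f'{int(day):02d}'
    parts = [PAGEVIEWS_TOP, project, access,
             f'{int(year):04d}', f'{int(month):02d}', day_part]
    return '/'.join(parts)

print(top_views_url(2022, 3))
```

The printed URL can be passed straight to `get_api_dict(url=..., kind='Wikipedia')`.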

In [4]:
access = ['desktop', 'mobile-app', 'mobile-web', 'all-access']
date = '2022/03/all-days'
url = f'https://wikimedia.org/api/rest_v1/metrics/pageviews/top/en.wikipedia/{access[3]}/{date}'
get_api_dict(url=url, kind='Wikipedia')

{'items': [{'project': 'en.wikipedia',
   'access': 'all-access',
   'year': '2022',
   'month': '03',
   'day': 'all-days',
   'articles': [{'article': 'Main_Page', 'views': 165792541, 'rank': 1},
    {'article': 'Special:Search', 'views': 44970965, 'rank': 2},
    {'article': '2022_Russian_invasion_of_Ukraine',
     'views': 14370145,
     'rank': 3},
    {'article': 'Vladimir_Putin', 'views': 10602177, 'rank': 4},
    {'article': 'The_Batman_(film)', 'views': 7021301, 'rank': 5},
    {'article': 'Ukraine', 'views': 5636716, 'rank': 6},
    {'article': 'Volodymyr_Zelenskyy', 'views': 4841723, 'rank': 7},
    {'article': 'The_Kashmir_Files', 'views': 4808795, 'rank': 8},
    {'article': 'Russo-Ukrainian_War', 'views': 4458573, 'rank': 9},
    {'article': 'Anna_Sorokin', 'views': 4180088, 'rank': 10},
    {'article': 'Deaths_in_2022', 'views': 3897339, 'rank': 11},
    {'article': 'RRR_(film)', 'views': 3809473, 'rank': 12},
    {'article': 'Taylor_Hawkins', 'views': 3700577, 'rank': 1
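
The response nests the ranking under `items[0]['articles']`, and the top entries are site pages (`Main_Page`, `Special:Search`) rather than topics. Below is a sketch of one way to flatten the response and drop those pages. `top_articles` and its skip lists are hypothetical, and the sample reuses the first rows of the output above.

```python
def top_articles(response, n=5, skip_prefixes=('Special:',),
                 skip_titles=('Main_Page',)):
    """Return up to n (title, views) pairs, most viewed first."""
    items = response.get('items') or []
    if not items:
        # get_api_dict returns {} on failure, so an empty result is expected here.
        return []
    rows = []
    for entry in items[0].get('articles', []):
        title = entry['article']
        if title in skip_titles or title.startswith(skip_prefixes):
            continue
        # Titles use underscores in place of spaces.
        rows.append((title.replace('_', ' '), entry['views']))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:n]

sample = {'items': [{'articles': [
    {'article': 'Main_Page', 'views': 165792541, 'rank': 1},
    {'article': 'Special:Search', 'views': 44970965, 'rank': 2},
    {'article': '2022_Russian_invasion_of_Ukraine', 'views': 14370145, 'rank': 3},
    {'article': 'Vladimir_Putin', 'views': 10602177, 'rank': 4},
]}]}
print(top_articles(sample, n=2))
```

Since pandas is already imported, `pd.DataFrame(top_articles(response), columns=['article', 'views'])` would turn the result into a table.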

In [20]:
wikipedia_base_url = 'https://en.wikipedia.org/wiki/'
article_title = 'NCAA_Division_I'

get_wikipedia_article(article_title)

'NCAA Division I (D-I) is the highest level of intercollegiate athletics sanctioned by the National Collegiate Athletic Association (NCAA) in the United States, which accepts players globally. D-I schools include the major collegiate athletic powers, with larger budgets, more elaborate facilities and more athletic scholarships than Divisions II and III as well as many smaller schools committed to the highest level of intercollegiate competition.This level was once called the University Division of the NCAA, in contrast to the lower-level College Division; these terms were replaced with numeric divisions in 1973. The University Division was renamed Division I, while the College Division was split in two; the College Division members that offered scholarships or wanted to compete against those who did became Division II, while those who did not want to offer scholarships became Division III.For college football only, D-I schools are further divided into the Football Bowl Subdivision (FBS
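
In the output above, sentences from adjacent paragraphs run together ("competition.This level") because the paragraph texts are concatenated without a separator and newlines are then removed. Here is a sketch of an alternative cleaning step, assuming the paragraph strings have already been extracted. `clean_paragraphs` is a hypothetical helper, and its bracket pattern is a slightly stricter variant of the regex used in the notebook.

```python
import re

def clean_paragraphs(paragraphs):
    """Strip bracketed markers and join paragraphs with single spaces."""
    cleaned = []
    for paragraph in paragraphs:
        # Remove markers such as [1] or [citation needed].
        text = re.sub(r'\[[^\]]*\]', '', paragraph)
        # Collapse newlines and repeated whitespace into single spaces.
        text = ' '.join(text.split())
        if text:
            cleaned.append(text)
    return ' '.join(cleaned)
```

Inside `get_wikipedia_article`, this could be called as `clean_paragraphs(p.text for p in soup.find_all('p'))` in place of the manual concatenation.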