<a id="toc"></a>

# <u>Table of Contents</u>

1.) [Setup](#setup)  
&nbsp;&nbsp;&nbsp;&nbsp; 1.1.) [Imports](#imports)   
&nbsp;&nbsp;&nbsp;&nbsp; 1.2.) [Helpers](#helpers)   
&nbsp;&nbsp;&nbsp;&nbsp; 1.3.) [Load data](#load)   
2.) [Datetime](#datetime)  
3.) [Speakers](#speakers)  
4.) [Transcript](#transcript)  
5.) [Save to CSV](#save)  

---
<a id="setup"></a>

# [^](#toc) <u>Setup</u>

<a id="imports"></a>

### [^](#toc) Standard imports

In [5]:
### Standard imports
import pandas as pd
import numpy as np
pd.options.display.max_columns = 50

### Regex and datetime
import re
import datetime

# Helps convert String representation of list into a list
import ast

### Removes warnings that occasionally show in imports
import warnings
warnings.filterwarnings('ignore')

### Visualization imports

In [6]:
### Matplotlib and seaborn
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
sns.set()

### Altair
import altair as alt
alt.renderers.enable('notebook')

### Plotly
from plotly.offline import init_notebook_mode, iplot
import plotly.graph_objs as go
import plotly.plotly as py  # plotly < 4 only; later releases moved this module to the chart_studio package
from plotly import tools
init_notebook_mode(connected=True)

# WordCloud
from wordcloud import WordCloud

# Folium
import folium

<a id="helpers"></a>

### [^](#toc) Helpers

In [7]:
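# Converts the string representation of a Python literal (list, set, ...) into the object itself
def string_literal(x):
    try:
        return ast.literal_eval(x)
    # Values that are not valid literals (plain text, NaN) are returned unchanged
    except (ValueError, SyntaxError, TypeError):
        return x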
    
# A shorthand way to plot most bar graphs
def pretty_bar(data, ax, xlabel=None, ylabel=None, title=None, int_text=False, x=None, y=None):
    
    if x is None:
        x = data.values
    if y is None:
        y = data.index
    
    # Plots the data (x and y passed by keyword; newer seaborn releases reject them positionally)
    fig = sns.barplot(x=x, y=y, ax=ax)
    
    # Places text for each value in data
    for i, v in enumerate(x):
        
        # Decides whether the text is shown as an integer or rounded to 3 decimals
        if int_text:
            ax.text(0, i, int(v), color='k', fontsize=14)
        else:
            ax.text(0, i, round(v, 3), color='k', fontsize=14)
     
    ### Labels plot (only the labels that were provided)
    if ylabel is not None:
        fig.set(ylabel=ylabel)
    if xlabel is not None:
        fig.set(xlabel=xlabel)
    if title is not None:
        fig.set(title=title)
    
### Used to style Python print statements
class color:
    BOLD = '\033[1m'
    UNDERLINE = '\033[4m'
    END = '\033[0m'
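
As a quick check of the helpers above, the sketch below runs `string_literal` on a few made-up strings and wraps a printed label in the `color` codes. The sample values are illustrative only and are not taken from the dataset, and the narrowed exception tuple is an assumption about which errors matter here.

```python
import ast

def string_literal(x):
    """Parse a string holding a Python literal; return the input unchanged on failure."""
    try:
        return ast.literal_eval(x)
    except (ValueError, SyntaxError, TypeError):
        return x

class color:
    BOLD = '\033[1m'
    UNDERLINE = '\033[4m'
    END = '\033[0m'

# Hypothetical sample values, shaped like the stringified columns in the CSV
as_list = string_literal("[['Judy Woodruff', ['Good evening.']]]")
as_set = string_literal("{'Judy Woodruff', 'Roberta Jacobson'}")
as_text = string_literal("In our news wrap Monday")  # not a literal, returned as-is

# ANSI codes render as bold text in terminals and notebook output
print(color.BOLD + "Parsed types:" + color.END,
      type(as_list).__name__, type(as_set).__name__, type(as_text).__name__)
```

Returning the input unchanged on failure is what lets the same helper be mapped over a column that mixes stringified literals with plain text.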

<a id="load"></a>

### [^](#toc) Load data

In [4]:
# pd.datetime was removed from recent pandas releases, so use the datetime module directly
dateparse = lambda x: datetime.datetime.strptime(x, "%Y-%m-%d %H:%M:%S+%f")

traffic = pd.read_csv("data/web_traffic.csv", parse_dates=['Date'], date_parser=dateparse)

df = pd.read_csv("data/PBS_full_unedited.csv")
for col in ["Transcript", "Story", "Speakers"]:
    df[col] = df[col].map(string_literal)
    
print("Shape of df:", df.shape)
df.head()

Shape of df: (17617, 7)


| | URL | Story | Date | Title | Transcript | Speakers | Number of Comments |
|---|---|---|---|---|---|---|---|
| 0 | https://www.pbs.org/newshour/show/news-wrap-tr... | In our news wrap Monday, President Trump's sea... | Jul 2, 2018 6:50 PM EDT | News Wrap: Trump interviews Supreme Court cand... | [[Judy Woodruff, [ President Trump’s search fo... | {President Donald Trump, Man (through translat... | 0.0 |
| 1 | https://www.pbs.org/newshour/show/elected-in-a... | Mexican president-elect Andrés Manuel López Ob... | Jul 2, 2018 6:45 PM EDT | Elected by a landslide, can Mexico’s López Obr... | [[Judy Woodruff, [ After two previous runs for... | {Alfonso Romo, Diana Mercado (through translat... | 0.0 |
| 2 | https://www.pbs.org/newshour/show/will-u-s-mex... | There are enormous expectations facing the new... | Jul 2, 2018 6:43 PM EDT | Will U.S.-Mexico policy tensions change under ... | [[Judy Woodruff, [ And now perspective from fo... | {Judy Woodruff, Roberta Jacobson} | 0.0 |
| 3 | https://www.pbs.org/newshour/show/yemens-spira... | One of the poorest countries in the Middle Eas... | Jul 2, 2018 6:40 PM EDT | Yemen’s spiraling hunger crisis is a man-made ... | [[Judy Woodruff, [ The “NewsHour” has reported... | {Naimi (through translator), Yahya Al-Habbari,... | 0.0 |
| 4 | https://www.pbs.org/newshour/show/livingwhileb... | A profusion of national incidents in which whi... | Jul 2, 2018 6:35 PM EDT | #LivingWhileBlack: How does racial bias lead t... | [[Judy Woodruff, [ A number of recent incident... | {Woman, Derrick Johnson, Judy Woodruff, Howard... | 0.0 |
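
The format string used for the traffic file can be checked on its own. This is a minimal sketch, assuming a timestamp shaped like `2018-07-02 18:50:00+00`; that sample value and the `parse_traffic_date` name are hypothetical, and the real file's values may differ. Note that `%f` reads the digits after the `+` as fractional seconds rather than as a UTC offset, so the result is a naive datetime.

```python
import datetime

# Same format string as the traffic loader above
FMT = "%Y-%m-%d %H:%M:%S+%f"

def parse_traffic_date(text):
    """Parse one timestamp string with the loader's format (hypothetical helper)."""
    return datetime.datetime.strptime(text, FMT)

# Hypothetical timestamp; not taken from data/web_traffic.csv
parsed = parse_traffic_date("2018-07-02 18:50:00+00")
print(parsed)         # the parsed datetime
print(parsed.tzinfo)  # None: no timezone information is attached
```

If the offset needs to be preserved, `%z` is the directive that parses it, but whether the file's offsets match what `%z` accepts would need to be checked against the actual data.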
