# Jeopardy Project 
## Project provided by Codecademy
### Goals: "You will work to write several functions that investigate a dataset of Jeopardy! questions and answers. Filter the dataset for topics that you’re interested in, compute the average difficulty of those questions, and train to become the next Jeopardy champion!"

In [1]:
import pandas as pd
import random

#Loading data
jeopardy = pd.read_csv('jeopardy.csv')

In [2]:
#Inspecting Data
print(jeopardy.head())

#Renaming columns so that they have proper variable names and making them all lowercase
jeopardy.columns = jeopardy.columns.str.lower()
jeopardy.rename(columns={'show number': 'show_number', ' air date': 'air_date'}, inplace = True)

#Removing the leading space in the column names
jeopardy.columns = jeopardy.columns.str.replace(' ', '')

print(jeopardy.head())

   Show Number    Air Date      Round                         Category  Value  \
0         4680  2004-12-31  Jeopardy!                          HISTORY   $200   
1         4680  2004-12-31  Jeopardy!  ESPN's TOP 10 ALL-TIME ATHLETES   $200   
2         4680  2004-12-31  Jeopardy!      EVERYBODY TALKS ABOUT IT...   $200   
3         4680  2004-12-31  Jeopardy!                 THE COMPANY LINE   $200   
4         4680  2004-12-31  Jeopardy!              EPITAPHS & TRIBUTES   $200   

                                            Question      Answer  
0  For the last 8 years of his life, Galileo was ...  Copernicus  
1  No. 2: 1912 Olympian; football star at Carlisl...  Jim Thorpe  
2  The city of Yuma in this state has a record av...     Arizona  
3  In 1963, live on "The Art Linkletter Show", th...  McDonald's  
4  Signer of the Dec. of Indep., framer of the Co...  John Adams  
   show_number    air_date      round                         category value  \
0         4680  2004-12-31  Jeo

In [3]:
#Function that filters the datset for questions that contains all the words in a list
def words_in_question(words):
    for word in words:
        questions = jeopardy[jeopardy['question'].str.contains(word)]
    return questions

# Testing words_in_question fucntion
print(words_in_question(['Puccini']))


        show_number    air_date             round                   category  \
3723           4398  2003-10-22         Jeopardy!                      OPERA   
4970           3003  1997-09-24  Double Jeopardy!            MUSICAL THEATRE   
12286          5332  2007-11-13  Double Jeopardy!                      OPERA   
14315          1429  1990-11-15  Double Jeopardy!                      OPERA   
15951          3660  2000-06-30         Jeopardy!            NEWS FLASH 1896   
28387          4714  2005-02-17  Double Jeopardy!                      OPERA   
42342          1605  1991-07-19  Double Jeopardy!                DOUBLE TALK   
43529          6131  2011-04-18         Jeopardy!       BACKSTAGE AT THE MET   
52465          1871  1992-10-26  Double Jeopardy!                 OLD MOVIES   
53669          5138  2007-01-03  Double Jeopardy!           OPERA CHARACTERS   
55270          5201  2007-04-02  Double Jeopardy!       AN OPERATIC CATEGORY   
55711          2929  1997-05-01  Double 

In [4]:
#Adding a column for question value as a float for computation
jeopardy['value_float'] = jeopardy['value'].str.replace('$', '', regex=True)
jeopardy['value_float'] = jeopardy['value_float'].str.replace(',', '', regex=True)
jeopardy['value_float'] = pd.to_numeric(jeopardy['value_float'], errors='coerce')

#Function to see the difficulty on different topics

def difficulty(words):
    topic = words_in_question(words)
    average = topic.value_float.mean()
    return average

#Testing difficulty function
difficulty('King')

764.9599636561875

In [5]:
# Function to get the count of all unique answers for questions containing certain words
def get_answer_counts(words):
    topic = words_in_question(words)
    return topic['answer'].value_counts()

# Testing get_answer_counts function
print(get_answer_counts(['King']))

Sweden                  19
Norway                  11
Scotland                11
Denmark                 10
Morocco                 10
                        ..
Ferris Bueller           1
The Three Musketeers     1
Indus                    1
Cobras                   1
"The Dark Tower"         1
Name: answer, Length: 1115, dtype: int64


## Quiz yourself with the dataset!

In [23]:
#Function to quiz yourself
def quiz_me(data):
    question_index = random.randint(0, len(data))
    guess = input(data.question[question_index] + ' ')
    if data.answer[question_index] in guess or guess in data.answer[question_index]:
        print(guess + " is correct!")
    else:
        print("Incorrect. " + data.answer[question_index] + " is the correct response.")
    return 

#Running the function
quiz_me(jeopardy)

She won an Oscar for her portrayal of waitress Alice Hyatt Burstyn
Burstyn is correct!
