# Apprentice Challenge

This challenge is a diagnostic of your current Python skills with pandas, matplotlib/seaborn, and NumPy. The results will help inform your selection into the Machine Learning Guild's Apprentice program.

## Challenge Background: A Magic Eight Ball & Randomness

![Shaking Magic Eight Ball](https://media.giphy.com/media/efahzan109oWdMRKnH/source.gif)

Do you remember these days? Holding a question in your mind and shaking the magic eight ball for an answer? 

From Mattel via Amazon: "The original Magic 8 Ball is the novelty toy that lets anyone seek advice about their future! All you have to do is simply 'ask the ball' any yes or no question, then wait for your answer to be revealed. Turn the toy upside-down and look inside the window on the bottom - this is where your secret message appears!"

Answers can be positive (e.g. 'It is certain'), negative (e.g. 'Don't count on it') or neutral (e.g. 'Ask again later').

In this data analysis programming challenge, you will be programmatically analyzing a Magic Eight Ball's fortune telling. This is the type of exploratory analysis typically performed before building machine learning models.

## Instructions

You need to know your way around `pandas` DataFrames and basic Python programming. You have **1 hour** to complete the challenge. We strongly discourage searching the internet for challenge answers.

General Notes:
* Read the first paragraph above to familiarize yourself with the topic.
* Feel free to poke around with the IPython notebook.
* To run a cell, press `CTRL+ENTER`.
* Complete each of the tasks listed below in the notebook.
* Provide your code for each task in the cells marked "-- YOUR CODE FOR TASK NUMBER --".
* Make sure to run the very last read-only code cell. 

**Please reach out to [Guild Mailbox](mailto:guildprogram@deloitte.com) with any questions.**

# Task 1: Generate Your Fortune!

**Instructions**

Ask our Python-based magic eight ball 20 questions. You can ask it anything. Save the questions and the fortunes in two separate lists, which we will use downstream. Use the eightball method `get_fortune` to generate the responses. The sample code below will help you get started.

```
import eightball

fortune = eightball.get_fortune("What is the answer to life, the universe, and everything?")
print("The Eight Ball Says: {}".format(fortune))

```
Once you reach the box/cell containing the Python code, click on it, press Ctrl + Enter, and notice what happens!

**Sample Questions**
* Is there ice cream on the moon?
* Will I make a lot of money and become a bizzilionaire?
* Am I going to get a pony?

**Expected Output**
* `questions` which is a `list` of 20 strings
* `fortunes` which is a `list` of 20 strings
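
As a minimal sketch of the expected output, the two lists can be built with a list comprehension. The challenge's `eightball` module is not available outside the notebook, so `get_fortune` and `OPTIONS` below are hypothetical stand-ins; in the notebook you would call `eightball.get_fortune` instead.

```python
import random

# Hypothetical stand-in for eightball.get_fortune (an assumption, not the real module).
OPTIONS = ["It is certain", "Don't count on it", "Ask again later"]


def get_fortune(question):
    """Return one of the stand-in fortunes for the given question."""
    return random.choice(OPTIONS)


# Build 20 distinct questions (placeholder wording; ask anything you like).
questions = ["Will wish number {} come true?".format(i) for i in range(1, 21)]

# Ask the eight ball each question and keep the answers in the same order.
fortunes = [get_fortune(q) for q in questions]

print(len(questions), len(fortunes))
```

Keeping both lists the same length and in the same order is what lets them be paired up as DataFrame columns later.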

In [1]:
# -- YOUR CODE FOR TASK 1 --

import random
import sys
ans = True
from datetime import datetime

# Start Time
start_time = datetime.now()
print(start_time)

# Import the eightball module

# Store your questions list as a variable "questions"
question=[]
print("Am I going to be a successful person?")

# Generate responses to your questions and store them as a variable "fortunes"
fortunes = random.randint(1, 8)  # cover all eight branches below

if question == "":
    sys.exit()
    
elif fortunes == 1:
    print ("Yes")
    
elif fortunes == 2:
    print ("No")
    
elif fortunes == 3:
    print ("Stop relying on Luck!")

elif fortunes == 4:
    print ("May be")
    
elif fortunes == 5:
    print ("God knows")
    
elif fortunes == 6:
    print ("Work hard, you will be successful")

elif fortunes == 7:
    print ("I dont know")
    
elif fortunes == 8:
    print ("Hell yea!!")

2019-03-21 10:09:39.428479
Am I going to be a successful person?
Stop relying on Luck!


In [2]:
print(type(questions))
print(type(fortunes))


NameError: name 'questions' is not defined

# Task 2: Create a DataFrame!

**Instructions**

Let's analyze your newly minted fortunes. Perhaps we can uncover the magic of the eightball. To start our analysis, put `questions` and `fortunes` in a pandas `DataFrame` called `questions_fortunes`. Your DataFrame should have two columns, one for each of the respective lists. What is the shape of your DataFrame?

**Output**

* `questions_fortunes` which is a `pandas.DataFrame` with two columns called `question` & `fortune`
* Shape of questions_fortunes stored in a variable `shape`
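
One way to get a two-column DataFrame is to pass a dictionary that maps column names to flat lists of strings. The sketch below uses two placeholder questions and fortunes rather than the full 20, purely for illustration.

```python
import pandas as pd

# Placeholder data; in the notebook these would be the lists from Task 1.
questions = ["Is there ice cream on the moon?", "Am I going to get a pony?"]
fortunes = ["Ask again later", "It is certain"]

# Each dictionary key becomes a column; each flat list becomes that column's values.
questions_fortunes = pd.DataFrame({"question": questions, "fortune": fortunes})

# .shape is a (rows, columns) tuple.
shape = questions_fortunes.shape
print(shape)  # (2, 2)
```

Note that the lists hold plain strings. Wrapping every string in its own list would store lists in the cells instead of text.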

In [18]:
# -- YOUR CODE FOR TASK 2 --

import pandas as pd

#Combine two lists into a dataframe with specified column names
questions = [['Will I be successful?'], 
['Am I any good?']]
Fortunes=[['Yes'], 
['No']]
#questions_fortunes=pd.DataFrame.from_records(questions_fortunes)
#Fortunes=pd.DataFrame.from_records(Fortunes)
# Define the shape of the dataframe
#shape = questions_fortunes.shape

print(shape)
questions_fortunes = pd.DataFrame(columns = ['question', 'fortunes']) 
questions_fortunes['question']=questions
questions_fortunes['fortunes']=Fortunes
#data.drop(['0'],axis=1)
questions_fortunes

(2, 1)


|   | question | fortunes |
|---|---|---|
| 0 | [Will I be successful?] | [Yes] |
| 1 | [Am I any good?] | [No] |


In [19]:
print(list(questions_fortunes))

['question', 'fortunes']


# Task 3: Getting Fortunes

**Instructions**

In the data sub-folder of the challenge folder, there is a dataset ("questions-fortunes.txt") with additional questions and magic eightball fortunes. Read that dataset into a pandas DataFrame called `temp` and combine with your `questions_fortunes` DataFrame. Be sure the index doesn't repeat. Call this new DataFrame `questions_fortunes_updt`.

**Output**

* A temp dataframe of additional questions/fortunes from "questions-fortunes.txt" datafile

* An updated pandas DataFrame `questions_fortunes_updt` with additional rows from the "questions-fortunes.txt" datafile.
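
A hedged sketch of the read-and-combine step follows. The layout of "questions-fortunes.txt" is not described here, so the comma delimiter and header row are assumptions, and an in-memory string stands in for the file so the example runs on its own.

```python
import io

import pandas as pd

# Stand-in for the data file; delimiter and header row are assumptions.
raw_text = (
    "question,fortune\n"
    "Can whales dance?,It is certain\n"
    "Can zebras paint?,Ask again later\n"
)

# In the notebook you would pass the file path instead of a StringIO object.
temp = pd.read_csv(io.StringIO(raw_text))

# Placeholder for the DataFrame built in Task 2.
questions_fortunes = pd.DataFrame(
    {"question": ["Am I going to get a pony?"], "fortune": ["Don't count on it"]}
)

# ignore_index=True renumbers the rows 0..n-1 so the index does not repeat.
questions_fortunes_updt = pd.concat([questions_fortunes, temp], ignore_index=True)
print(questions_fortunes_updt.index.tolist())  # [0, 1, 2]
```

Without `ignore_index=True`, each input keeps its own index, so row label 0 would appear twice.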


In [None]:
# -- YOUR CODE FOR TASK 3 --

# Create temp DataFrame from "questions-fortunes.txt"
temp = pd.re

# Combine with existing `questions_fortunes` DataFrame
questions_fortunes_updt = ...

print("Check your answers")

In [None]:
print(questions_fortunes_updt.shape[0])

# Task 4: Common Fortunes

**Instructions**

With something close to 1,700 questions and fortunes, we can study the patterns of fortune telling. 

***Part 1:*** 
* Create a DataFrame named `Fortune_Counts` that counts how many times each fortune appears in `questions_fortunes_updt`. The DataFrame should have two columns: `fortune` and `num_fortune`, and should be sorted by `num_fortune` in descending order. Print the entire result.
 
* We know that fortunes from the magic ball fall into one of three categories: Positive, Negative, and Neutral. Use a dictionary of lookup values to assign one of these categories to each question/fortune. Add this as a column `category` to the `Fortune_Counts` DataFrame. 

     * You can access the positive, negative, neutral lookup with the `eightball.fortune_category` property. To add a new column in your DataFrame using a dictionary try adapting this technique: 

        ```
        raw = [[0, "Pony"],
               [0, "Saddle"],
               [0, "Lasso"],
               [1, "Saddle"]
               ]

        prices = {'Pony': 9.99,
                  'Saddle': 4.95,
                  'Lasso': 3.25
                  }

        df = pd.DataFrame(raw, columns=["orderID", "item"])

        df['price'] = df['item'].map(prices)

        ```
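
Putting the counting and the dictionary lookup together might look like the sketch below. The sample rows are placeholders, and `fortune_category` is a hypothetical stand-in that assumes `eightball.fortune_category` is a dictionary from fortune text to category.

```python
import pandas as pd

# Placeholder data standing in for questions_fortunes_updt.
questions_fortunes_updt = pd.DataFrame({
    "question": ["q1", "q2", "q3", "q4", "q5", "q6"],
    "fortune": ["It is certain", "Ask again later", "It is certain",
                "Don't count on it", "It is certain", "Ask again later"],
})

# Hypothetical stand-in for eightball.fortune_category (assumed fortune -> category).
fortune_category = {
    "It is certain": "positive",
    "Don't count on it": "negative",
    "Ask again later": "neutral",
}

# Count rows per fortune, name the count column, and sort descending.
Fortune_Counts = (
    questions_fortunes_updt.groupby("fortune")
    .size()
    .reset_index(name="num_fortune")
    .sort_values("num_fortune", ascending=False)
    .reset_index(drop=True)
)

# Look up each fortune's category in the dictionary.
Fortune_Counts["category"] = Fortune_Counts["fortune"].map(fortune_category)
print(Fortune_Counts)
```

Any fortune missing from the dictionary would get `NaN` in `category`, which is a quick way to spot gaps in the lookup.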


***Part 2:*** 

Create a bar chart of `Fortune_Counts` from Part 1 using `seaborn.barplot`, with one bar per fortune, colored by category (positive / negative / neutral). You can install seaborn using `pip`, and you can read about the barplot API [here](https://seaborn.pydata.org/generated/seaborn.barplot.html). Put `num_fortune` on the x-axis and `fortune` on the y-axis.
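
The sketch below shows one possible call, assuming seaborn and matplotlib are installed. The small `Fortune_Counts` table and the color choices are placeholders for illustration only.

```python
import pandas as pd
import seaborn as sns

# Placeholder table shaped like the Part 1 result.
Fortune_Counts = pd.DataFrame({
    "fortune": ["It is certain", "Ask again later", "Don't count on it"],
    "num_fortune": [3, 2, 1],
    "category": ["positive", "neutral", "negative"],
})

# hue colors each bar by category; dodge=False keeps one full-width bar per fortune.
ax = sns.barplot(
    data=Fortune_Counts,
    x="num_fortune",
    y="fortune",
    hue="category",
    dodge=False,
    palette={"positive": "green", "neutral": "gray", "negative": "red"},
)
ax.set_title("Fortune counts by category")
```

Because `fortune` is text and `num_fortune` is numeric, seaborn draws horizontal bars, which keeps long fortune labels readable.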


**Output**

*Part 1:* A sorted DataFrame detailing the number of questions per fortune outcome, along with the mapped category

*Part 2:* A `seaborn.barplot` with a bar for each fortune colored by category (positive / negative / neutral).

In [None]:
# Task 4 Part 1
# -- YOUR CODE FOR TASK 4.1 --

# Create DataFrame 'Fortune_Counts' with fortune, category, and num_fortune
Fortune_Counts = ...

# Import categories from eightball

# Create column 'category' mapping the category to each question
Fortune_Counts['category'] = ...

print(Fortune_Counts)

print("Check your answers")

In [None]:
# Task 4 Part 2
# -- YOUR CODE FOR TASK 4.2 --

import seaborn as sns

# Create barplot using sns.barplot()
plt = sns.barplot(...)

print("Check your answers")

In [None]:
print("Ignore this cell")
print("Check your answers")

In [None]:
print("Keep up the good work!")
print("Pass this Cell")

# Task 5: Question your Questions

Magic Eightballs have twenty set responses. We can safely assume that our Python-based magic eight ball is not drawing its response from the great beyond. Understanding the patterns of your questions along with the fortunes provided may help interpret the algorithms behind the eight ball.

**Instructions**

***Part 1:*** How long are your questions? Do a quick character count on each of your questions. Include the new data in `questions_fortunes_updt` as a new column called `question_length`. For this task, we recommend using a pandas method and a built-in Python function in the same line.

What is the average question length?
What is the average question length by fortune? 
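
A minimal sketch of Part 1, using three placeholder rows in place of the full dataset: `Series.apply` is the pandas method and `len` is the built-in function.

```python
import pandas as pd

# Placeholder rows standing in for questions_fortunes_updt.
questions_fortunes_updt = pd.DataFrame({
    "question": ["Can whales dance?", "Can zebras paint?", "Am I going to get a pony?"],
    "fortune": ["It is certain", "It is certain", "Ask again later"],
})

# pandas method (apply) plus built-in function (len) in one line.
questions_fortunes_updt["question_length"] = questions_fortunes_updt["question"].apply(len)

# Overall average, then the average within each fortune.
Avg_Length = questions_fortunes_updt["question_length"].mean()
Avg_Length_Fortune = questions_fortunes_updt.groupby("fortune")["question_length"].mean()

print(questions_fortunes_updt["question_length"].tolist())  # [17, 17, 25]
print(Avg_Length_Fortune)
```

`Avg_Length` is a single number, while `Avg_Length_Fortune` is a Series indexed by fortune.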

***Part 2:*** It seems there may be a correlation between the input (question) length and the output (fortune). Use a pivot table to look at the number of questions for each question length (as the index) versus the fortune told (as the columns). What do you notice? Make sure all rows display. Fill null values with "-" for easier interpretability.
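
One possible shape for the pivot table is sketched below on placeholder rows. Casting to `object` before filling is a choice made here so that the "-" marker can sit alongside the numeric counts.

```python
import pandas as pd

# Show every row when a DataFrame is displayed.
pd.set_option("display.max_rows", None)

# Placeholder rows standing in for questions_fortunes_updt.
df = pd.DataFrame({
    "question": ["Can whales dance?", "Can zebras paint?", "Am I going to get a pony?"],
    "fortune": ["It is certain", "It is certain", "Ask again later"],
})
df["question_length"] = df["question"].apply(len)

# Count questions for every (length, fortune) pair; missing pairs become NaN.
pivot = pd.pivot_table(
    df,
    index="question_length",
    columns="fortune",
    values="question",
    aggfunc="count",
)

# Replace the NaN cells with "-" for readability.
pivot = pivot.astype(object).fillna("-")
print(pivot)
```

If each row of the result has exactly one non-"-" cell, every question length maps to a single fortune.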

**Output**

*Part 1:* 
* A new column in `questions_fortunes_updt` called `question_length`

* A variable called `Avg_Length` of the average length of all questions

* A variable called `Avg_Length_Fortune` of the average length of all questions by Fortune

*Part 2:*
* Pivot Table displaying all rows

In [None]:
# Task 5 Part 1
# -- YOUR CODE FOR TASK 5.1 --

# Create new column `question_length`

# Calculate average length
Avg_Length = ...

# Calculate average length by fortune
Avg_Length_Fortune  = ...

print("Check your answers")

In [None]:
# Task 5 Part 2
# -- YOUR CODE FOR TASK 5.2 --

# Display all rows 


# Create pivot table


print("Check your answers")

In [None]:
print("Do you excel at Pivot Tables?")

In [None]:
print("We're trying to find out.")

# Task 6: Telling Analysis
**Instructions**

Remember our assumption that the output is a function of the input. Based on the pivot table above, it appears that each question length is associated with only one fortune, which suggests that the fortune choice is a function of the input question's string length. Thus, we should expect that two different questions of the same length would produce the same result. The code below tests this hypothesis.

Our job is done then, right? We can know what the eightball is going to tell us for any question, so we can tell the future. This is because the algorithm for this magic eight ball uses the length of the question as the seed.

***Part 1:***
Use the list below to test this hypothesis.

```
same_length  = ["Can whales dance?",
                "Can zebras paint?"
                ]

```
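
The test itself can be a simple equality check, sketched below. Since the `eightball` module is not available here, `get_fortune` is a hypothetical stand-in that seeds a random generator with the question length, as the text above describes; the real implementation may differ.

```python
import random

# Hypothetical stand-in fortunes and get_fortune (assumptions, not the real module).
OPTIONS = ["It is certain", "Don't count on it", "Ask again later"]


def get_fortune(question):
    """Pick a fortune using the question length as the random seed."""
    rng = random.Random(len(question))
    return rng.choice(OPTIONS)


same_length = ["Can whales dance?",
               "Can zebras paint?"
               ]

# Ask both questions and check that only one distinct fortune came back.
told = [get_fortune(q) for q in same_length]
hypothesis_test = len(set(told)) == 1
print(hypothesis_test)  # True
```

Both questions are 17 characters long, so a length-seeded generator makes the same choice for each.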

***Part 2:***
Using what you now know about this magic eight ball, write your own function `get_new_fortune` that takes an input `question` and returns a fortune. The function should take at least two arguments: `question` and `options`, where `options` is the options list from the eightball module. Get creative and add a third parameter to your function which alters the fortune told!

Test your function by applying it to a question 'Will I make a lot of money and become a bizzilionaire?' to get a fortune. 
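
A hedged sketch of one such function follows. The `shift` parameter and the `options` list are hypothetical choices for illustration; in the notebook, `options` would come from the eightball module.

```python
import random


def get_new_fortune(question, options, shift=0):
    """Return a fortune chosen by seeding on question length plus an optional shift.

    `shift` is an illustrative third parameter: changing it changes the seed,
    which may change the fortune for the same question.
    """
    rng = random.Random(len(question) + shift)
    return rng.choice(options)


# Placeholder options; the real list would come from the eightball module.
options = ["It is certain", "Don't count on it", "Ask again later"]

question = "Will I make a lot of money and become a bizzilionaire?"
print(get_new_fortune(question, options))
print(get_new_fortune(question, options, shift=7))
```

The function is deterministic for a given question and `shift`, so repeated calls with the same arguments return the same fortune.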

***Output***
* `hypothesis_test` which is a boolean variable
* `get_new_fortune` which is a function with at least two inputs: `question` and `options`

In [None]:
# Task 6 Part 1
# -- YOUR CODE FOR TASK 6.1 --

# Same Length List
same_length  = ["Can whales dance?",
                "Can zebras paint?"
                ]

# Code to test if fortunes told are identical
hypothesis_test = ...

print(hypothesis_test)

print("Check your answers")

In [None]:
# Task 6 Part 2
# -- YOUR CODE FOR TASK 6.2 --

# Create your function `get_new_fortune`

# Test your function on the question: "Will I make a lot of money and become a bizzilionaire?"

print("Check your answers")

In [None]:
print("Was our hypothesis true?")


In [None]:
print("Can you tell your own fortune now?")


# Wrapping up!
Please save this notebook as "Last Name - First Name.ipynb". Make sure to save this file and submit it via the link you received in your Deloitte email. 

Happy coding!

## References
1. [Javascript Magic 8 Ball with Basic DOM Manipulation](https://medium.com/@kellylougheed/javascript-magic-8-ball-with-basic-dom-manipulation-1636b83c3c26)
2. [Mattel Games Magic 8 Ball](https://www.amazon.com/Mattel-Games-Magic-8-Ball/dp/B00001ZWV7) - Where to buy on Amazon.