# Shark Tank

_Shark Tank_ is a reality TV show. Contestants present their idea for a company to a panel of investors (a.k.a. "sharks"), who then decide whether or not to invest in that company.  The investors give a certain amount of money in exchange for a percentage stake in the company ("equity"). If you are not familiar with the show, you may want to watch part of an episode [here](http://abc.go.com/shows/shark-tank) to get a sense of how it works. You can also search for a clip on YouTube. 

The data set for this lab covers all contestants from the first 6 seasons of the show, including:
- the name and industry of the proposed company
- whether or not it was funded (i.e., the "Deal" column)
- which sharks chose to invest in the venture (N.B. There are 7 regular sharks, not including "Guest". Each shark has a column in the data set, labeled by their last name.)
- if funded, the amount of money the sharks put in and the percentage equity they got in return

To earn full credit on this lab, you should:
- use built-in `pandas` methods (like `.sum()` and `.max()`) instead of writing a `for` loop over a `DataFrame` or `Series`
- use the split-apply-combine pattern wherever possible

Of course, if you can't think of a vectorized solution, a `for` loop is still better than no solution at all!
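To illustrate the difference, here is a minimal sketch on a small made-up `DataFrame` (the column names and values are invented for illustration and do not come from `sharktank.csv`). It computes the same per-group totals twice: once with an explicit loop and once with split-apply-combine via `groupby`.

```python
import pandas as pd

# Toy data, invented for illustration only.
toy = pd.DataFrame({
    "group": ["a", "b", "a", "b"],
    "value": [1, 2, 3, 4],
})

# Loop version: works, but is slower and more verbose.
loop_totals = {}
for _, row in toy.iterrows():
    loop_totals[row["group"]] = loop_totals.get(row["group"], 0) + row["value"]

# Split-apply-combine version: split by group, apply sum, combine into a Series.
vectorized_totals = toy.groupby("group")["value"].sum()
```

Both versions give the same totals; the `groupby` version is the one this lab is looking for.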

In [12]:
import pandas as pd

## Question 0. Getting and Cleaning the Data

The data is stored in the CSV file `sharktank.csv`. Read the data into a `pandas` `DataFrame`.

In [13]:
# Load the raw data; missing entries are read in as NaN.
shark_file = pd.read_csv('sharktank.csv')
print(shark_file)

     Season  No. in series                          Company Deal  \
0       1.0            1.0                 Ava the Elephant  Yes   
1       1.0            1.0            Mr. Tod's Pie Factory  Yes   
2       1.0            1.0                          Wispots   No   
3       1.0            1.0      College Foxes Packing Boxes   No   
4       1.0            1.0                        Ionic Ear   No   
5       1.0            2.0                   A Perfect Pear  Yes   
6       1.0            2.0                   Classroom Jams  Yes   
7       1.0            2.0                         Lifebelt   No   
8       1.0            2.0                      Crooked Jaw   No   
9       1.0            2.0               Sticky Note Holder   No   
10      1.0            3.0                      Turbobaster  Yes   
11      1.0            3.0                 Stress Free Kids  Yes   
12      1.0            3.0             Kwyzta Chopstick Art   No   
13      1.0            3.0  50 State Capitals in

There is one column for each of the sharks. A 1 indicates that they chose to invest in that company, while a missing value indicates that they did not choose to invest in that company. Notice that these missing values show up as NaNs when we read in the data. Fill in these missing values with zeros. Other columns may also contain NaNs; be careful not to fill those columns with zeros, or you may end up with strange results down the line.

In [23]:
# Replace missing values (NaN) with 0 in the shark columns only,
# leaving NaNs in all other columns untouched.
# NOTE: only Corcoran and Cuban are handled here; the remaining
# shark columns need the same treatment.
shark_cols = ['Corcoran', 'Cuban']
shark_file[shark_cols] = shark_file[shark_cols].fillna(0)
print(shark_file['Corcoran'])

0      1.0
1      1.0
2      0.0
3      0.0
4      0.0
5      0.0
6      1.0
7      0.0
8      0.0
9      0.0
10     0.0
11     1.0
12     0.0
13     0.0
14     0.0
15     0.0
16     1.0
17     0.0
18     0.0
19     0.0
20     0.0
21     1.0
22     0.0
23     0.0
24     0.0
25     1.0
26     0.0
27     0.0
28     0.0
29     0.0
      ... 
465    0.0
466    0.0
467    0.0
468    0.0
469    0.0
470    0.0
471    0.0
472    0.0
473    0.0
474    0.0
475    1.0
476    0.0
477    0.0
478    0.0
479    0.0
480    0.0
481    0.0
482    0.0
483    0.0
484    0.0
485    0.0
486    0.0
487    1.0
488    0.0
489    0.0
490    0.0
491    0.0
492    0.0
493    0.0
494    0.0
Name: Corcoran, Length: 495, dtype: float64


Notice that Amount and Equity are currently being treated as categorical variables (`dtype: object`). Can you figure out why this is? Clean up these columns and cast them to numeric types (i.e., a `dtype` of `int` or `float`) because we'll need to perform mathematical operations on these columns.
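One common reason for an `object` dtype is that the numbers are stored as text with formatting characters. The sketch below assumes, purely for illustration, values such as `"$50,000"` and `"55%"`; inspect the actual contents of `Amount` and `Equity` before relying on it.

```python
import pandas as pd

# Toy data; the string formats here are an assumption, not taken from the CSV.
toy = pd.DataFrame({
    "Amount": ["$50,000", "$460,000", None],
    "Equity": ["55%", "50%", None],
})

# Strip the formatting characters, then convert to numbers.
# regex=False treats "$" as a literal character rather than a regex anchor.
amount = toy["Amount"].str.replace("$", "", regex=False).str.replace(",", "", regex=False)
equity = toy["Equity"].str.replace("%", "", regex=False)

# Missing values stay missing (NaN) after the conversion.
toy["Amount"] = pd.to_numeric(amount)
toy["Equity"] = pd.to_numeric(equity)
```

Because missing values are preserved as NaN, the resulting columns have a `float` dtype rather than `int`.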

In [None]:
# YOUR CODE HERE

## Question 1. Which Company was Worth the Most?

The valuation of a company is how much it is worth. If someone invests \$10,000 for a 40% equity stake in the company, then the company must be valued at \$25,000, since 40% of \$25,000 is \$10,000.
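The arithmetic above can be sketched on toy data as follows. The `DataFrame` here is invented for illustration, and it assumes `Amount` and `Equity` are already numeric, with `Equity` expressed as a percentage (40 means 40%).

```python
import pandas as pd

# Toy deals, invented for illustration; Equity is a percentage (40 means 40%).
toy = pd.DataFrame({
    "Company": ["A", "B"],
    "Amount": [10000, 50000],
    "Equity": [40, 10],
})

# valuation = amount / (equity / 100), rearranged as amount * 100 / equity
toy["Valuation"] = toy["Amount"] * 100 / toy["Equity"]

# Index label of the row with the largest valuation, then its company name.
most_valuable = toy.loc[toy["Valuation"].idxmax(), "Company"]
```

Company A reproduces the example from the text (\$10,000 for 40% gives \$25,000), while company B shows that a smaller equity stake can imply a much larger valuation.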

Calculate the valuation of each company that was funded. Which company was most valuable? Is it the same as the company that received the largest total investment from the sharks?

In [None]:
# YOUR CODE HERE

**YOUR EXPLANATION HERE**

## Question 2. Which Shark Invested the Most?

Calculate the total amount of money that each shark invested over the 6 seasons. Which shark invested the most total money over the 6 seasons?

_Hint:_ If $n$ sharks funded a given venture, then the amount that each shark invested is the total amount divided by $n$.
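One possible way to apply the hint is sketched below on made-up data. It assumes the shark columns hold 1 or 0 and that `Amount` is numeric; only two shark columns are shown, and the values are invented.

```python
import pandas as pd

# Toy data, invented for illustration; 1 means the shark invested, 0 means not.
toy = pd.DataFrame({
    "Amount": [100.0, 90.0, 50.0],
    "Corcoran": [1, 1, 0],
    "Cuban": [1, 0, 1],
})
shark_cols = ["Corcoran", "Cuban"]

# n = number of sharks in each deal; per-shark share = Amount / n.
n_sharks = toy[shark_cols].sum(axis=1)
share = toy["Amount"] / n_sharks

# Multiply each indicator column by the share (aligned row by row), then total.
totals = toy[shark_cols].mul(share, axis=0).sum()
```

In the real data, unfunded companies have no sharks, so $n = 0$ and the division is undefined for those rows; filtering to funded deals first avoids that.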

In [None]:
# ENTER CODE HERE.

**YOUR EXPLANATION HERE**

## Question 3. Do the Sharks Prefer Certain Industries?

Calculate the funding rate (the proportion of companies that were funded) for each industry. Make a visualization showing this information.
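A minimal sketch of the funding-rate calculation on toy data follows. The column name `Industry` and the industry labels are assumptions made for illustration; the "Yes"/"No" values mirror the `Deal` column shown earlier.

```python
import pandas as pd

# Toy data, invented for illustration; the industry names are made up.
toy = pd.DataFrame({
    "Industry": ["Food", "Food", "Toys", "Toys", "Toys", "Toys"],
    "Deal": ["Yes", "No", "Yes", "Yes", "Yes", "No"],
})

# The mean of a boolean column is the proportion of True values.
funded = toy["Deal"] == "Yes"
funding_rate = funded.groupby(toy["Industry"]).mean()
```

From here, something like `funding_rate.sort_values().plot.barh()` would draw a horizontal bar chart, provided `matplotlib` is installed.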

In [None]:
# ENTER CODE HERE.

**YOUR EXPLANATION HERE**

## Submission Instructions

Once you are finished, follow these steps:

1. Restart the kernel and re-run this notebook from beginning to end by going to `Kernel > Restart Kernel and Run All Cells`.

2. If this process stops halfway through, that means there was an error. Correct the error and repeat Step 1 until the notebook runs from beginning to end.

3. Double check that there is a number next to each code cell and that these numbers are in order.

Then, submit your lab as follows:

1. Go to `File > Export Notebook As > PDF`.

2. Double check that the entire notebook, from beginning to end, is in this PDF file. (If the notebook is cut off, try first exporting the notebook to HTML and printing to PDF.)

3. Upload the PDF and the notebook (`.ipynb`) to iLearn.