<a href="https://colab.research.google.com/github/grace0607/Shark-Tank-Startups/blob/main/Shark_Tank_Startups.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Will I be Disadvantaged as a Female Startup Founder? Here are Some Suggestions from Shark Tank US Data Over the Years


Ever had a brilliant business idea that got you all excited about founding a startup of your own? Thought about pitching your idea on Shark Tank or to potential investors? Particularly for female-identifying folks, did you ever feel that you were at a disadvantage compared to your male counterparts?

For those of you who haven't heard about [Shark Tank](https://abc.com/shows/shark-tank), it's a US TV series that brings startup founders together to pitch their businesses to a panel of five investors or "sharks", who decide whether to invest in their companies.


In [None]:
from IPython.display import IFrame
# Source: ABC
IFrame('https://www.yourbasin.com/wp-content/uploads/sites/78/2021/04/Shark-tank.jpg?w=876&h=493&crop=1', width=1000, height=500)

## My Questions
In my project, I am curious about whether female contestants (founders) end up settling for less favorable deals (investments) compared to male contestants. Apologies in advance for treating gender as binary: I am working with the two groups for the purpose of analytical simplicity!

So my questions are:


1.   Is there a difference in the final deal amount, deal equity, and deal valuation that male vs. female contestants end up winning on average?

2.   What about the original ask amount, original offered equity, and requested valuation before the final judging?
  *   If so, is that where the difference in final deal terms stems from?


*Quick blurb on VC investments*

What makes a VC deal favorable from the perspective of an entrepreneur? In general, the goal is to get the most funding for the smallest percentage of equity, because that implies a high valuation for your company. For example, if the entrepreneur is getting \$100,000 for 10% equity, then \$100,000 is 10% of the company's valuation, so the company is being valued at \$1 million (\$100,000 x 10). The lower the equity and the higher the funding amount, the higher the valuation. You can read more about it [here](https://technical.ly/startups/valuation-meaning-shark-tank/).
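The valuation arithmetic above can be sketched in a few lines of Python. The function name `implied_valuation` is my own (it is not a column or helper from the dataset); it simply divides the investment by the equity fraction.

```python
def implied_valuation(amount: float, equity_pct: float) -> float:
    """Company valuation implied by investing `amount` for `equity_pct` percent of equity."""
    if not 0 < equity_pct <= 100:
        raise ValueError("equity_pct must be greater than 0 and at most 100")
    return amount / (equity_pct / 100)


# The example from the text: $100,000 for 10% equity
print(round(implied_valuation(100_000, 10)))  # 1000000
# Same money, but more equity given up: the implied valuation drops
print(round(implied_valuation(100_000, 25)))  # 400000
```

The second call shows why equity matters as much as the dollar amount: the cheque is identical, but giving up 25% instead of 10% cuts the implied valuation from \$1 million to \$400,000.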

## My Hypotheses
1. I hypothesize that **there will be a difference in the final deal terms** (deal amount, deal equity, and deal valuation) between male and female contestants.
2. However, I hypothesize that **there will NOT be a difference in the original/requested deal terms** that contestants pitch for initially.
  *   Perhaps female contestants end up settling for less favorable deals because they negotiate less aggressively than their male counterparts.




## The Data
I use a [Shark Tank US dataset](https://www.kaggle.com/datasets/thirumani/shark-tank-us-dataset) that has data from season 1 to season 12. It contains information about the startup, industry, entrepreneur's name, gender, and city, as well as deal statistics such as original and final deal amount (\$), deal equity (%), and deal valuation (\$). It also records the amount invested by each "shark" (or judge), among many other fields. The dataset contains 52 columns and 1006 rows.

I ran into an encoding error while reading in this CSV file at first because the file is not valid 'utf-8', which is the encoding that `pd.read_csv` assumes by default. Reading it as 'ISO-8859-1' works.

In [None]:
import pandas as pd
import plotly.express as px

In [None]:
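encoding_rules = ['utf-8', 'ISO-8859-1', 'cp1252']

# Try each candidate encoding in turn and stop at the first one that decodes the file
for encoding in encoding_rules:
  try:
    sharktank = pd.read_csv("/content/Shark Tank US dataset.csv", encoding=encoding)
    print(f"File encoded under {encoding}")
    break
  except UnicodeDecodeError:
    # Catch only decoding failures so other problems (e.g. a missing file) still surface
    print(f"Error with {encoding} encoding_rule")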

Error with utf-8 encoding_rule
File encoded under ISO-8859-1


In [None]:
sharktank=pd.read_csv("/content/Shark Tank US dataset.csv", encoding='ISO-8859-1')
#sharktank.info()

In [None]:
sharktank.head()

Unnamed: 0,Season Number,Season Start,Season End,Episode Number,Pitch Number,Original Air Date,Startup Name,Industry,Business Description,Pitchers Gender,...,Guest Investment Equity,Barbara Corcoran Present,Mark Cuban Present,Lori Greiner Present,Robert Herjavec Present,Daymond John Present,Kevin O Leary Present,Kevin Harrington Present,Guest Name,Notes
0,1,09-Aug-09,05-Feb-10,1,1,09-Aug-09,AvaTheElephant,Health/Wellness,Ava The Elephant - Baby and Child Care,Female,...,,1.0,0.0,0.0,1.0,1.0,1.0,1.0,,
1,1,09-Aug-09,05-Feb-10,1,2,09-Aug-09,Mr.Tod'sPieFactory,Food and Beverage,Mr. Tod's Pie Factory - Specialty Food,Male,...,,1.0,0.0,0.0,1.0,1.0,1.0,1.0,,
2,1,09-Aug-09,05-Feb-10,1,3,09-Aug-09,Wispots,Business Services,Wispots - Consumer Services,Male,...,,1.0,0.0,0.0,1.0,1.0,1.0,1.0,,
3,1,09-Aug-09,05-Feb-10,1,4,09-Aug-09,CollegeFoxesPackingBoxes,Lifestyle/Home,College Foxes Packing Boxes - Consumer Services,Male,...,,1.0,0.0,0.0,1.0,1.0,1.0,1.0,,
4,1,09-Aug-09,05-Feb-10,1,5,09-Aug-09,IonicEar,Software/Tech,Ionic Ear - Novelties,Male,...,,1.0,0.0,0.0,1.0,1.0,1.0,1.0,,


## Data Cleanup
For the purpose of our analysis, I'm going to create a simplified dataframe that only includes the columns we care about.

In [None]:
columns= ["Startup Name", "Pitchers Gender", "Industry",
          "Original Ask Amount", "Original Offered Equity", "Valuation Requested", "Got Deal",
          "Total Deal Amount", "Total Deal Equity", "Deal Valuation"]
shark_tank=sharktank[columns]
shark_tank.head()

Unnamed: 0,Startup Name,Pitchers Gender,Industry,Original Ask Amount,Original Offered Equity,Valuation Requested,Got Deal,Total Deal Amount,Total Deal Equity,Deal Valuation
0,AvaTheElephant,Female,Health/Wellness,50000.0,15.0,333333.0,1,50000.0,55.0,90909.0
1,Mr.Tod'sPieFactory,Male,Food and Beverage,460000.0,10.0,4600000.0,1,460000.0,50.0,920000.0
2,Wispots,Male,Business Services,1200000.0,10.0,12000000.0,0,,,
3,CollegeFoxesPackingBoxes,Male,Lifestyle/Home,250000.0,25.0,1000000.0,0,,,
4,IonicEar,Male,Software/Tech,1000000.0,15.0,6666667.0,0,,,


Upon inspecting the gender breakdown of contestants in the first place, I am a bit surprised to see that there is an overwhelming difference in the number of male and female entrepreneurs that pitch. Nearly 60% of contestants are male while only 25% are female.

In [None]:
gender = shark_tank.groupby('Pitchers Gender').size().to_frame(name='Count')
gender

Unnamed: 0_level_0,Count
Pitchers Gender,Unnamed: 1_level_1
Female,247
Male,590
Mixed Team,168
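To double-check the percentages quoted above, here is a minimal sketch that turns the counts from this table into shares. I typed the counts in by hand so the snippet runs on its own; on the full dataframe, `shark_tank['Pitchers Gender'].value_counts(normalize=True)` should give the same proportions.

```python
import pandas as pd

# Counts copied from the table above
gender_counts = pd.Series({"Female": 247, "Male": 590, "Mixed Team": 168}, name="Count")

# Convert counts to percentage shares, rounded to one decimal place
gender_share = (gender_counts / gender_counts.sum() * 100).round(1)
print(gender_share)
```

This gives roughly 58.7% male, 24.6% female, and 16.7% mixed teams.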


In [None]:

# Count pitches per gender once and reuse the result for both values and labels
gender_counts = shark_tank['Pitchers Gender'].value_counts()
fig = px.pie(values=gender_counts.tolist(),
             names=gender_counts.index.tolist(),
             title='Breakdown of Contestant Gender',
             color_discrete_sequence=px.colors.qualitative.Pastel)

fig.show()


Some contestants end up not sealing a final deal if the judges' valuations or terms don't match their expectations. Let's see what the breakdown of male vs. female looks like there.

In [None]:
# Pitches that ended without a deal
left_deal = shark_tank[shark_tank['Got Deal'] == 0]
gender_left_deal = left_deal.groupby('Pitchers Gender').size().to_frame(name='Count')
gender_left_deal

Unnamed: 0_level_0,Count
Pitchers Gender,Unnamed: 1_level_1
Female,95
Male,275
Mixed Team,64


In [None]:
# Number of no-deal pitches per gender group
male_count = (left_deal['Pitchers Gender'] == 'Male').sum()
female_count = (left_deal['Pitchers Gender'] == 'Female').sum()
mixed_count = (left_deal['Pitchers Gender'] == 'Mixed Team').sum()
fig = px.pie(values=[male_count, female_count, mixed_count],
             names=['Male', 'Female', 'Mixed Team'],
             title='Contestants that Left the Deal',
             color_discrete_sequence=px.colors.qualitative.Pastel)

fig.show()

So it looks like male contestants tend to leave the deal when they aren't happy with the terms *slightly more* often than female contestants: they make up about 63% of the no-deal pitches versus about 59% of all pitches. However, the dropout breakdown doesn't look too different from the total gender breakdown. Therefore, I assume that dropping the rows for people that left the deal (they don't have a final deal amount because they left) won't skew the dataset disproportionately.
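The comparison I am making by eye between the two pie charts can also be written out as a small table. This is only a sketch built from the counts printed above (typed in by hand so it runs on its own), and the column names are my own labels.

```python
import pandas as pd

# Counts copied from the two tables above
summary = pd.DataFrame(
    {"all_pitches": [247, 590, 168], "no_deal": [95, 275, 64]},
    index=["Female", "Male", "Mixed Team"],
)

# Each group's share of all pitches vs. its share of the no-deal pitches
summary["share_of_pitches_%"] = (summary["all_pitches"] / summary["all_pitches"].sum() * 100).round(1)
summary["share_of_no_deal_%"] = (summary["no_deal"] / summary["no_deal"].sum() * 100).round(1)

# Within each group, the percentage of pitches that ended without a deal
summary["no_deal_rate_%"] = (summary["no_deal"] / summary["all_pitches"] * 100).round(1)
print(summary)
```

The last column is the more direct way to ask the question: about 46.6% of male pitches end without a deal, compared with about 38.5% of female pitches and 38.1% of mixed-team pitches.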

For analysis purposes, I'm going to drop all the rows where contestants end up leaving the deal. I'm also going to drop rows where the gender is "mixed" from now on.

In [None]:
shark_tank= shark_tank[shark_tank['Got Deal'] != 0]
shark_tank= shark_tank[shark_tank['Pitchers Gender'] != "Mixed Team"]
shark_tank

Unnamed: 0,Startup Name,Pitchers Gender,Industry,Original Ask Amount,Original Offered Equity,Valuation Requested,Got Deal,Total Deal Amount,Total Deal Equity,Deal Valuation
0,AvaTheElephant,Female,Health/Wellness,50000.0,15.0,333333.0,1,50000.0,55.0,90909.0
1,Mr.Tod'sPieFactory,Male,Food and Beverage,460000.0,10.0,4600000.0,1,460000.0,50.0,920000.0
5,APerfectPear,Female,Food and Beverage,500000.0,15.0,3333333.0,1,500000.0,50.0,1000000.0
6,ClassroomJams,Male,Children/Education,250000.0,10.0,2500000.0,1,250000.0,100.0,250000.0
10,TurboBaster,Female,Food and Beverage,35000.0,35.0,100000.0,1,35000.0,100.0,35000.0
...,...,...,...,...,...,...,...,...,...,...
991,TouchUpCup,Male,Business Services,150000.0,17.5,857143.0,1,200000.0,25.0,800000.0
995,KinApparel,Female,Fashion/Beauty,200000.0,10.0,2000000.0,1,200000.0,30.0,666667.0
1001,Pizza Pack,Male,Food and Beverage,100000.0,10.0,1000000.0,1,100000.0,13.0,769231.0
1003,Stealth Bros,Male,Lifestyle/Home,200000.0,15.0,1333333.0,1,200000.0,20.0,1000000.0


Now we can finally get to the fun part! Let's test our hypotheses.

## Hypothesis #1: There will be a difference in the final deal terms (deal amount, deal equity, and deal valuation) between male and female contestants.
I compare the deal amount, deal equity, and deal valuation that male and female contestants end up winning.
Let's look at Total (Final) Deal Amount first.

In [None]:
fig = px.histogram(shark_tank, x="Total Deal Amount", color="Pitchers Gender", nbins=40,
                   barmode="group", color_discrete_sequence=["pink", "blue"],
                   histnorm='percent')

# Zoom in on deals up to $2M with a tick every $200K.
# tickvals/ticktext only take effect with tickmode="array", so they are dropped here.
fig.update_xaxes(range=[0, 2000000], tickmode="linear", tick0=0, dtick=200000)

fig.update_layout(title="Total Deal Amount for Male and Female Founders",
                  xaxis_title="Total Deal Amount ($)",
                  yaxis_title="Percentage of Pitches Made")

fig.show()

As can be seen in the graph above, nearly 60% of female founders end up settling for an investment below 190K. For male founders that percentage is 46%. Male founders tend to get higher investments more frequently than female founders. The deal amount alone doesn't tell us everything, so we'll look at deal equity next, which might be more important because it concerns ownership.
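Before moving on: "share of founders below a cutoff" figures like the ones above can be computed directly instead of being read off the chart. Here is a minimal sketch using only the nine rows visible in the printed dataframe above, so the numbers it prints are illustrative and are not the full-dataset result. The 200K cutoff is an arbitrary choice of mine.

```python
import pandas as pd

# The nine rows visible in the printed dataframe above (not the full dataset)
sample = pd.DataFrame({
    "Pitchers Gender": ["Female", "Male", "Female", "Male", "Female",
                        "Male", "Female", "Male", "Male"],
    "Total Deal Amount": [50000, 460000, 500000, 250000, 35000,
                          200000, 200000, 100000, 200000],
})

threshold = 200_000  # hypothetical cutoff, chosen only for illustration

# True/False per deal, then the mean of the booleans per gender is the share below the cutoff
below = sample["Total Deal Amount"] < threshold
share_below = below.groupby(sample["Pitchers Gender"]).mean().mul(100).round(1)
print(share_below)
```

Swapping `sample` for the cleaned `shark_tank` dataframe should give the full-dataset version of the same comparison.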

In [None]:
fig = px.histogram(shark_tank, x="Total Deal Equity", color="Pitchers Gender", nbins=10,
                   barmode="group", color_discrete_sequence=["pink", "blue"],
                   histnorm='percent')

# Equity is a percentage, so show 0-100 with a tick every 10 points.
# tickvals/ticktext only take effect with tickmode="array", so they are dropped here.
fig.update_xaxes(range=[0, 101], tickmode="linear", tick0=0, dtick=10)

fig.update_layout(title="Total Deal Equity for Male and Female Founders",
                  xaxis_title="Total Deal Equity (%)",
                  yaxis_title="Percentage of Pitches Made")

fig.show()

Once again, male founders seem to win better equity deals. 19% of female founders give up somewhere between 40% and 60% of their startup's equity, whereas only 11% of male founders give up that much. 34% of male founders get deals for less than 20% of their startup's equity (ownership), while only 24% of female founders do.

Let's look at the final valuations for male and female founders next.


In [None]:

fig = px.histogram(shark_tank, x="Deal Valuation", color="Pitchers Gender", nbins=40,
                   barmode="group", color_discrete_sequence=["pink", "blue"],
                   histnorm='percent')
fig.update_layout(title="Deal Valuation for Male and Female Founders",
                  xaxis_title="Deal Valuation ($)",
                  yaxis_title="Percentage of Pitches")
# Show valuations up to $20M with a tick every $1M.
# tickvals/ticktext only take effect with tickmode="array", so they are dropped here.
fig.update_xaxes(range=[0, 20000000], tickmode="linear", tick0=0, dtick=1000000)

fig.show()


I'm not surprised by the final valuation results! Female founders tend to get lower valuations for their startups than male founders, which is what you would expect if you're getting a lower deal amount and a less favorable equity deal.
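My first question asks about differences "on average", so a per-gender summary is a natural companion to the three histograms. The sketch below uses the median (my choice, since deal sizes are skewed) and only the nine rows visible in the printed dataframe earlier, so treat the output as a demonstration of the method and not as the actual finding.

```python
import pandas as pd

# The nine rows visible in the printed dataframe earlier (not the full dataset)
sample = pd.DataFrame({
    "Pitchers Gender": ["Female", "Male", "Female", "Male", "Female",
                        "Male", "Female", "Male", "Male"],
    "Total Deal Amount": [50000, 460000, 500000, 250000, 35000,
                          200000, 200000, 100000, 200000],
    "Total Deal Equity": [55, 50, 50, 100, 100, 25, 30, 13, 20],
    "Deal Valuation": [90909, 920000, 1000000, 250000, 35000,
                       800000, 666667, 769231, 1000000],
})

deal_terms = ["Total Deal Amount", "Total Deal Equity", "Deal Valuation"]

# Median of each final deal term, split by gender
medians = sample.groupby("Pitchers Gender")[deal_terms].median()
print(medians)
```

Running the same `groupby` on the cleaned `shark_tank` dataframe would put a single number per gender next to each histogram.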

## Hypothesis #2: There will NOT be a difference in the original/requested deal terms that contestants pitch for initially.
Is this because female founders ask for less money in their initial pitch? I don't think so. Maybe there are other factors in between the pitch and the final bid like negotiation. Perhaps female founders are bullied around by the sharks. Let's check if my hypothesis could be true.
I'll begin by taking a look at the original ask amount.

In [None]:
fig = px.histogram(shark_tank, x="Original Ask Amount", color="Pitchers Gender", nbins=40,
                   barmode="group", color_discrete_sequence=px.colors.qualitative.Pastel,
                   histnorm='percent')

# Zoom in on asks up to $2M with a tick every $200K.
# tickvals/ticktext only take effect with tickmode="array", so they are dropped here.
fig.update_xaxes(range=[0, 2000000], tickmode="linear", tick0=0, dtick=200000)

fig.update_layout(title="Original Ask Amount from Male and Female Founders",
                  xaxis_title="Original Ask Amount ($)",
                  yaxis_title="Percentage of Pitches Made")

fig.show()

Hmm it looks like female founders ask for less investment in the first place! This might go against my hypothesis. Let's look at original offered equity next, which could be an even more important metric because it's concerned with ownership of the startup.

In [None]:
fig = px.histogram(shark_tank, x="Original Offered Equity", color="Pitchers Gender", nbins=10,
                   barmode="group", color_discrete_sequence=px.colors.qualitative.Pastel,
                   histnorm='percent')

# Equity is a percentage, so show 0-100 with a tick every 10 points.
# tickvals/ticktext only take effect with tickmode="array", so they are dropped here.
fig.update_xaxes(range=[0, 101], tickmode="linear", tick0=0, dtick=10)

fig.update_layout(title="Original Offered Equity from Male and Female Founders",
                  xaxis_title="Original Offered Equity (%)",
                  yaxis_title="Percentage of Pitches Made")

fig.show()

The results are interesting: both male and female founders are very reluctant to give up equity in the first place! Look at them offering 2.5-7.5% of equity. Most founders don't want to go beyond 22.5% at all. Which is understandable: these companies are your blood, sweat, and tears!!
Considering that we know what the final deal equity distribution looks like, this is kind of sad to see. Those judges are real sharks!!!

Anyways, the important thing is that we notice how once again, female founders are open to giving up more equity than male founders, even during the initial pitch.

In [None]:
fig = px.histogram(shark_tank, x="Valuation Requested", color="Pitchers Gender", nbins=40,
                   barmode="group", color_discrete_sequence=px.colors.qualitative.Pastel,
                   histnorm='percent')

fig.update_layout(title="Valuation Requested from Male and Female Founders",
                  xaxis_title="Valuation Requested ($)",
                  yaxis_title="Percentage of Pitches")
# Show valuations up to $20M with a tick every $1M.
# tickvals/ticktext only take effect with tickmode="array", so they are dropped here.
fig.update_xaxes(range=[0, 20000000], tickmode="linear", tick0=0, dtick=1000000)

fig.show()


So now it doesn't come as a surprise that the requested valuations from female founders tend to be lower than those from male founders. 48% of female contestants think that their startups should be valued at less than $1M, whereas only 31% of male contestants think the same.

Guess I was wrong about the 2nd hypothesis.

## Conclusion: Female founders ask for less and end up with less
To test my idea of female startup founders ending up with less favorable VC deals, I compared the deal amount, deal equity, and deal valuation that male and female contestants end up winning during the final bid. Here, I actually came to the conclusion that female founders ended up with less favorable deals compared to male founders.

My hypothesis was that this might be due to factors in between the initial pitch and the final bid, such as negotiation. To test this hypothesis, I examined the difference in original ask amount, original offered equity, and requested valuation between the two genders. If there had been no difference in the original pitch, my hypothesis could have held true. Contrary to my hypothesis, I found that the difference in final deal terms is already present in the original pitch. Female founders tend to ask for less money, are willing to give up more equity, and value their startups lower than male founders do.
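One way to look at the "in between" step more directly, which I did not do in the analysis above, is to measure how much of the requested valuation survives into the final deal. The sketch below is only an illustration of that idea: the `valuation_kept` column is my own invention, and the data is just the nine rows visible in the printed dataframe earlier, so the output says nothing reliable about the full dataset.

```python
import pandas as pd

# The nine rows visible in the printed dataframe earlier (not the full dataset)
sample = pd.DataFrame({
    "Pitchers Gender": ["Female", "Male", "Female", "Male", "Female",
                        "Male", "Female", "Male", "Male"],
    "Valuation Requested": [333333, 4600000, 3333333, 2500000, 100000,
                            857143, 2000000, 1000000, 1333333],
    "Deal Valuation": [90909, 920000, 1000000, 250000, 35000,
                       800000, 666667, 769231, 1000000],
})

# Fraction of the requested valuation that the final deal preserved (1.0 = no haircut)
sample["valuation_kept"] = sample["Deal Valuation"] / sample["Valuation Requested"]

# Median fraction kept, split by gender
kept_by_gender = sample.groupby("Pitchers Gender")["valuation_kept"].median().round(2)
print(kept_by_gender)
```

If the ratio looked similar for both groups on the full data, that would be consistent with the gap originating in the pitch; a clearly lower ratio for one group would point back towards the negotiation stage.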

Is this the reason why female-founded startups end up getting less favorable investments? Well, we need more conditions to be met in order to establish a causal relationship. However, the fact that female founders underpitch (or male founders overpitch) their startups can provide some food for thought.

## Potential Explanations

1. Perhaps this acts as a signal of confidence towards your product that the judges are very quick to pick up on. They are sharks after all and will take advantage of you if you give them the chance.

2. Women startup founders tend to be more conservative with their projections, [research](https://www.bcg.com/publications/2018/why-women-owned-startups-are-better-bet) finds.

3. Only 2 out of 6 sharks are women. The male judges might be suffering from "[affinity bias](https://www.weforum.org/agenda/2018/06/female-founded-startups-generate-more-revenue-and-do-it-with-less-funding)", the tendency to back the people and products they know.

### My conclusion: Ladies, it never hurts to ask! Sharks, you are missing out.
Some basic [research](https://www.forbes.com/sites/forbesbusinesscouncil/2022/07/14/why-vcs-should-invest-in-female-founded-companies-and-upgrade-the-venture-ecosystem/?sh=7f084ac15606) tells me that female founders have proven to be more capital efficient and performance driven than male founders. In fact, businesses founded by women have been found to earn more than twice as much per dollar invested as those founded by men. If this is true, investors are missing out if they don't invest in women. It also means that female founders can be more confident in their abilities and should ask for more than just what they need, but "[what is needed to flourish.](https://www.forbes.com/sites/forbesbusinesscouncil/2022/11/30/why-female-founders-still-arent-getting-the-big-number-investments-and-why-they-should/?sh=248e50bf2761)"


In [None]:
from IPython.display import IFrame
# Source: 1517 Fund
IFrame('https://miro.medium.com/v2/resize:fit:1400/format:webp/1*OQB0U-FJYyhQ5kx9B-sx-A.jpeg', width=800, height=500)