![image](https://github.com/mynameisfho/My-Data-Analyst-Portofolio/blob/main/Identifying%20Unicorn%20Companies/unicorn_companies.jpg)

## Unicorn Companies

A unicorn company is a privately held business valued at over $1 billion USD. This dataset covers unicorn companies worldwide as of November 2021. For each company it records the valuation, the date it was added to the list, the country of origin, the category, and select investors. Companies that have exited through an IPO or an acquisition are excluded. [Source](https://www.cbinsights.com/research-unicorn-companies)


In [15]:
import pandas as pd 
df = pd.read_csv('unicorn_companies.csv')
df

| | Company | Valuation ($B) | Date Added | Country | Category | Select Investors |
|---|---|---|---|---|---|---|
| 0 | Bytedance | $140.00 | 4/7/17 | China | Artificial intelligence | Sequoia Capital China, SIG Asia Investments, S... |
| 1 | SpaceX | $100.30 | 12/1/12 | United States | Other | Founders Fund, Draper Fisher Jurvetson, Rothen... |
| 2 | Stripe | $95.00 | 1/23/14 | United States | Fintech | Khosla Ventures, LowercaseCapital, capitalG |
| 3 | Klarna | $45.60 | 12/12/11 | Sweden | Fintech | Institutional Venture Partners, Sequoia Capita... |
| 4 | Canva | $40.00 | 1/8/18 | Australia | Internet software & services | Sequoia Capital China, Blackbird Ventures, Mat... |
| ... | ... | ... | ... | ... | ... | ... |
| 912 | Heyday | $1.00 | 11/16/21 | United States | E-commerce & direct-to-consumer | Khosla Ventures,General Catalyst, Victory Park... |
| 913 | PLACE | $1.00 | 11/17/21 | United States | Internet software & services | Goldman Sachs Asset Management, 3L |
| 914 | Stytch | $1.00 | 11/18/21 | United States | Cybersecurity | Index Ventures, Benchmark, Thrive Capital |
| 915 | Owkin | $1.00 | 11/18/21 | United States | Artificial Intelligence | Google Ventures, Cathay Innovation, NJF Capital |
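
In the printout, `Valuation ($B)` carries a `$` prefix and `Date Added` looks like month/day/two-digit-year text, so both columns appear to be stored as strings. The sketch below shows one way they could be converted to numeric and datetime types. It runs on three rows copied from the table above rather than on the real file, and the date format string is an assumption inferred from values such as `1/23/14`.

```python
import pandas as pd

# Three rows copied from the printout above; not the full dataset.
sample = pd.DataFrame({
    "Company": ["Bytedance", "SpaceX", "Stripe"],
    "Valuation ($B)": ["$140.00", "$100.30", "$95.00"],
    "Date Added": ["4/7/17", "12/1/12", "1/23/14"],
})

typed = sample.assign(**{
    # Strip the literal dollar sign, then parse the remainder as a float.
    "Valuation ($B)": sample["Valuation ($B)"]
        .str.replace("$", "", regex=False)
        .astype(float),
    # Assumed format: month/day/two-digit year.
    "Date Added": pd.to_datetime(sample["Date Added"], format="%m/%d/%y"),
})

print(typed.dtypes)
```

With typed columns, valuations can be summed or averaged and dates can be grouped by year, which the string versions do not allow.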


Let's look at all the distinct values in the `Category` column:

In [16]:
# Print out all categories
categories = df['Category'].unique()
categories

array(['Artificial intelligence', 'Other', 'Fintech',
       'Internet software & services',
       'Supply chain, logistics, & delivery',
       'Data management & analytics', 'Edtech',
       'E-commerce & direct-to-consumer', 'Hardware',
       'Auto & transportation', 'Health', 'Consumer & retail', 'Finttech',
       'Travel', 'Cybersecurity', 'Mobile & telecommunications',
       'Artificial Intelligence'], dtype=object)

Two categories appear twice: 'Finttech' is a typo of 'Fintech', and 'Artificial Intelligence' differs from 'Artificial intelligence' only in capitalization. Let's clean those up.

In [17]:
# Replace wrong text cells
df_clean = df.replace({'Category': {
    'Artificial Intelligence': 'Artificial intelligence',
    'Finttech': 'Fintech'
}})

# Display the cleaned DataFrame
df_clean

| | Company | Valuation ($B) | Date Added | Country | Category | Select Investors |
|---|---|---|---|---|---|---|
| 0 | Bytedance | $140.00 | 4/7/17 | China | Artificial intelligence | Sequoia Capital China, SIG Asia Investments, S... |
| 1 | SpaceX | $100.30 | 12/1/12 | United States | Other | Founders Fund, Draper Fisher Jurvetson, Rothen... |
| 2 | Stripe | $95.00 | 1/23/14 | United States | Fintech | Khosla Ventures, LowercaseCapital, capitalG |
| 3 | Klarna | $45.60 | 12/12/11 | Sweden | Fintech | Institutional Venture Partners, Sequoia Capita... |
| 4 | Canva | $40.00 | 1/8/18 | Australia | Internet software & services | Sequoia Capital China, Blackbird Ventures, Mat... |
| ... | ... | ... | ... | ... | ... | ... |
| 912 | Heyday | $1.00 | 11/16/21 | United States | E-commerce & direct-to-consumer | Khosla Ventures,General Catalyst, Victory Park... |
| 913 | PLACE | $1.00 | 11/17/21 | United States | Internet software & services | Goldman Sachs Asset Management, 3L |
| 914 | Stytch | $1.00 | 11/18/21 | United States | Cybersecurity | Index Ventures, Benchmark, Thrive Capital |
| 915 | Owkin | $1.00 | 11/18/21 | United States | Artificial intelligence | Google Ventures, Cathay Innovation, NJF Capital |
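
The two replacements above were spotted by eye. As a rough cross-check, labels that differ only in capitalization or surrounding whitespace can also be found programmatically. The sketch below uses a handful of labels copied from the `unique()` output rather than the real column, and the names `key`, `variants`, and `case_duplicates` are illustrative. It does not catch spelling errors such as 'Finttech', which still need a manual look or a fuzzy-matching step.

```python
import pandas as pd

# A subset of the category labels from the unique() output above.
categories = pd.Series([
    "Artificial intelligence", "Other", "Fintech", "Finttech",
    "Cybersecurity", "Artificial Intelligence",
])

# Build a normalized key: trimmed and lowercased.
key = categories.str.strip().str.lower()

# For each normalized key, collect the distinct original spellings.
variants = categories.groupby(key).unique()

# Keep only the keys that map to more than one spelling.
case_duplicates = variants[variants.map(len) > 1]
print(case_duplicates)
```

Each entry in `case_duplicates` is a group of spellings that probably should be merged into one label.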


With the categories cleaned up, let's see how many unicorns there are in each category.

In [18]:
category_counts = df_clean.groupby('Category', as_index=False).size().sort_values(by='size', ascending=False)
category_counts

| | Category | size |
|---|---|---|
| 7 | Fintech | 185 |
| 10 | Internet software & services | 164 |
| 5 | E-commerce & direct-to-consumer | 97 |
| 0 | Artificial intelligence | 72 |
| 9 | Health | 62 |
| 12 | Other | 51 |
| 13 | Supply chain, logistics, & delivery | 51 |
| 3 | Cybersecurity | 38 |
| 11 | Mobile & telecommunications | 36 |
| 4 | Data management & analytics | 35 |
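
For a quick look at the same counts, `Series.value_counts()` is a shorter alternative to the `groupby` chain, since it sorts in descending order by default. The sketch below compares the two on a tiny made-up frame (`df_sample` is illustrative, not the real dataset). The `groupby` version returns a DataFrame with a `size` column, which is convenient for plotting, while `value_counts()` returns a Series indexed by category.

```python
import pandas as pd

# Tiny illustrative frame, not the real dataset.
df_sample = pd.DataFrame({
    "Category": ["Fintech", "Fintech", "Health", "Fintech", "Health", "Other"],
})

# Same pattern as the cell above: count rows per category, largest first.
by_groupby = (
    df_sample.groupby("Category", as_index=False)
    .size()
    .sort_values(by="size", ascending=False)
)

# Shorter alternative: counts per distinct value, sorted descending by default.
by_value_counts = df_sample["Category"].value_counts()

print(by_groupby)
print(by_value_counts)
```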


In [19]:
import plotly.express as px

# Create a bar chart of the category counts
fig = px.bar(category_counts, x='Category', y='size', title='Number of Unicorn Companies by Category',
             labels={'size': 'Number of Companies'}, text='size')

# Show the bar chart
fig.show()