# Quiz 3 - Venture Capital Exit Analysis

## Load libraries and data

Import packages.

In [1]:
import pandas as pd
import numpy as np

### Dataset

There are two tables in our dataset.

- `df_companies` contains information on companies that completed a VC exit in the past year.
- `df_deals` contains the deal details for those VC exits.

In [2]:
df_companies = pd.read_csv('https://raw.githubusercontent.com/bdi475/datasets/737da234227d5ecec1369df1a1827cd40872c0eb/vc-exits/companies.csv')

df_companies.head()

Unnamed: 0,company_name,deal_id,industry,city,country,is_profitable,patents
0,300,300145_202112,Entertainment Software,New York,United States,False,0
1,ACROBiosystems,acr168_202110,Discovery Tools (Healthcare),Beijing,China,True,0
2,ASR Microelectronics,asr032_202201,Application Specific Semiconductors,Shanghai,China,False,6
3,Admix,adm196_202206,Business/Productivity Software,London,United Kingdom,False,0
4,Affera,aff071_202208,Therapeutic Devices,Newton,United States,False,6


In [3]:
df_deals = pd.read_csv('https://github.com/bdi475/datasets/raw/737da234227d5ecec1369df1a1827cd40872c0eb/vc-exits/deals.csv')

df_deals.head()

Unnamed: 0,deal_date,deal_id,deal_type,deal_amount
0,2021-09-29,vel174_202109,Reverse Merger,345
1,2021-09-29,war005_202109,IPO,3109
2,2021-09-30,ben137_202109,Reverse Merger,403
3,2021-09-30,bea219_202109,M&A,275
4,2021-09-30,avi218_202109,M&A,275


## Quiz Questions

### How many companies?

In [4]:
df_companies.shape[0]

250

### How many companies are profitable?

In [5]:
df_companies[df_companies['is_profitable']].shape[0]

38
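
Because `is_profitable` is a boolean column, the same count can also be obtained by summing it. The sketch below uses a small hypothetical table (`toy_companies` is an assumption, not the real dataset) to show the idea.

```python
import pandas as pd

# Hypothetical stand-in for df_companies; the rows are made up for illustration.
toy_companies = pd.DataFrame({
    'company_name': ['A', 'B', 'C', 'D'],
    'is_profitable': [True, False, True, False],
})

# True counts as 1 and False as 0, so the sum is the number of profitable rows.
n_profitable = int(toy_companies['is_profitable'].sum())
print(n_profitable)
```

On the real data, `df_companies['is_profitable'].sum()` should agree with the filtered row count above.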

### Does `df_companies` contain any row with a missing `country` value?

In [6]:
df_companies[df_companies['country'].isna()]

Unnamed: 0,company_name,deal_id,industry,city,country,is_profitable,patents
191,Sixense Enterprises,six232_202110,Other Hardware,,,False,0


**Answer**: True
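
If only the yes/no answer is needed, `isna().any()` returns it directly instead of displaying the matching rows. A minimal sketch on hypothetical data (`toy_companies` is an assumption, not the real dataset):

```python
import pandas as pd
import numpy as np

# Hypothetical stand-in for df_companies with one missing country.
toy_companies = pd.DataFrame({
    'company_name': ['A', 'B', 'C'],
    'country': ['India', np.nan, 'China'],
})

# any() is True if at least one value in the column is missing.
has_missing_country = bool(toy_companies['country'].isna().any())

# sum() counts how many values are missing.
n_missing = int(toy_companies['country'].isna().sum())

print(has_missing_country, n_missing)
```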

### Merge DataFrames (ungraded)

In [7]:
df = pd.merge(
    left=df_companies,
    right=df_deals,
    on='deal_id',
    how='inner'
)
print(df.shape) # should print (250, 10)

(250, 10)
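
Checking the shape confirms the row count, but it does not show whether any `deal_id` failed to match. One way to check this, sketched here on hypothetical tables (`toy_companies` and `toy_deals` are assumptions), is to use the `validate` and `indicator` parameters of `pd.merge`.

```python
import pandas as pd

# Hypothetical stand-ins for df_companies and df_deals.
toy_companies = pd.DataFrame({
    'company_name': ['A', 'B', 'C'],
    'deal_id': ['a_1', 'b_1', 'c_1'],
})
toy_deals = pd.DataFrame({
    'deal_id': ['a_1', 'b_1', 'd_1'],
    'deal_amount': [100, 200, 300],
})

# validate='one_to_one' raises an error if deal_id is duplicated on either side.
# indicator=True adds a '_merge' column: 'both', 'left_only', or 'right_only'.
checked = pd.merge(
    left=toy_companies,
    right=toy_deals,
    on='deal_id',
    how='outer',
    validate='one_to_one',
    indicator=True,
)

# Rows that did not find a partner in the other table.
unmatched = checked[checked['_merge'] != 'both']
print(unmatched)
```

An outer join is used here only so that unmatched rows stay visible. The inner join above drops them silently.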


In [8]:
df.head(3)

Unnamed: 0,company_name,deal_id,industry,city,country,is_profitable,patents,deal_date,deal_type,deal_amount
0,300,300145_202112,Entertainment Software,New York,United States,False,0,2021-12-16,M&A,397
1,ACROBiosystems,acr168_202110,Discovery Tools (Healthcare),Beijing,China,True,0,2021-10-18,IPO,348
2,ASR Microelectronics,asr032_202201,Application Specific Semiconductors,Shanghai,China,False,6,2022-01-14,IPO,1079


### Which company recorded the largest deal amount in India?

In [9]:
df[df['country'] == 'India'].sort_values('deal_amount', ascending=False).head(1)

Unnamed: 0,company_name,deal_id,industry,city,country,is_profitable,patents,deal_date,deal_type,deal_amount
162,Paytm,pay011_202111,Financial Software,Noida,India,True,0,2021-11-18,IPO,2452


**Answer**: Paytm
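
An alternative to sorting is `idxmax()`, which returns the index label of the largest value. A minimal sketch, assuming a hypothetical merged table named `toy`:

```python
import pandas as pd

# Hypothetical stand-in for the merged df; names and amounts are made up.
toy = pd.DataFrame({
    'company_name': ['A', 'B', 'C', 'D'],
    'country': ['India', 'India', 'China', 'India'],
    'deal_amount': [120, 950, 4000, 300],
})

india = toy[toy['country'] == 'India']

# idxmax() gives the index label of the row with the largest deal_amount.
top_row = india.loc[india['deal_amount'].idxmax()]
top_company = top_row['company_name']
print(top_company)
```

Note that `idxmax()` returns only the first row when several rows tie for the maximum.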

### Which company recorded the largest deal amount in 2021?

In [10]:
df['deal_date'] = pd.to_datetime(df['deal_date'])
df['year'] = df['deal_date'].dt.year

df[df['year'] == 2021].sort_values('deal_amount', ascending=False).head(1)

Unnamed: 0,company_name,deal_id,industry,city,country,is_profitable,patents,deal_date,deal_type,deal_amount,year
175,Rivian,riv001_202111,Automotive,Irvine,United States,False,114,2021-11-10,IPO,11934,2021


**Answer**: Rivian
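
The helper `year` column is optional. The filter can use `.dt.year` directly, and `nlargest` can replace the sort-then-head step. A sketch on hypothetical data (`toy` is an assumption):

```python
import pandas as pd

# Hypothetical stand-in for the merged df.
toy = pd.DataFrame({
    'company_name': ['A', 'B', 'C'],
    'deal_date': ['2021-11-10', '2022-03-04', '2021-09-29'],
    'deal_amount': [500, 9000, 700],
})

# Convert strings to datetimes so the .dt accessor is available.
toy['deal_date'] = pd.to_datetime(toy['deal_date'])

# Keep only 2021 deals, without adding a separate year column.
in_2021 = toy[toy['deal_date'].dt.year == 2021]

# nlargest(1, ...) returns the single row with the largest deal_amount.
top_2021 = in_2021.nlargest(1, 'deal_amount')
print(top_2021)
```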

### Which country has the third largest total deal amount?

In [11]:
df_by_country = df.groupby('country', as_index=False).agg({
    'deal_amount': 'sum'
})

df_by_country.sort_values('deal_amount', ascending=False).head(3)

Unnamed: 0,country,deal_amount
27,United States,109712
6,China,26421
26,United Kingdom,14268


**Answer**: United Kingdom
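
To pull out just the third-ranked country rather than reading it off the last displayed row, the sorted totals can be indexed by position. A minimal sketch with hypothetical totals (`toy` is an assumption):

```python
import pandas as pd

# Hypothetical stand-in for the merged df.
toy = pd.DataFrame({
    'country': ['United States', 'United States', 'China',
                'United Kingdom', 'India', 'United Kingdom'],
    'deal_amount': [500, 700, 900, 300, 100, 250],
})

# Total deal amount per country, as a Series indexed by country.
totals = toy.groupby('country')['deal_amount'].sum()

# Sort from largest to smallest, then take the label at position 2 (third place).
third_country = totals.sort_values(ascending=False).index[2]
print(third_country)
```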

### Which deal type has the lowest average deal amount?

In [12]:
df_by_deal_type = df.groupby('deal_type', as_index=False).agg({
    'deal_amount': 'mean'
})

df_by_deal_type.sort_values('deal_amount').head(1)

Unnamed: 0,deal_type,deal_amount
3,Reverse Merger,498.644444


**Answer**: Reverse Merger
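
When only the name of the deal type is needed, `idxmin()` on the grouped means returns it without sorting. A sketch on hypothetical data (`toy` is an assumption):

```python
import pandas as pd

# Hypothetical stand-in for the merged df.
toy = pd.DataFrame({
    'deal_type': ['IPO', 'IPO', 'M&A', 'M&A', 'Reverse Merger'],
    'deal_amount': [1000, 3000, 400, 600, 450],
})

# Average deal amount per deal type, as a Series indexed by deal_type.
means = toy.groupby('deal_type')['deal_amount'].mean()

# idxmin() returns the index label (deal type) with the smallest mean.
lowest_type = means.idxmin()
print(lowest_type, means[lowest_type])
```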

### Among the companies that are not profitable, which company has the largest number of patents?

In [13]:
df_no_profit = df[~df['is_profitable']]
df_no_profit.sort_values('patents', ascending=False).head(1)

Unnamed: 0,company_name,deal_id,industry,city,country,is_profitable,patents,deal_date,deal_type,deal_amount,year
150,Nuance Communications,nua000_202203,Business/Productivity Software,Burlington,United States,False,1113,2022-03-04,M&A,18800,2022


**Answer**: Nuance Communications