# 📍 **Introduction**
In the past couple of months, there have been several high-profile layoffs that occurred in tech companies in a wide range of industries, from media and entertainment to cybersecurity and fintechs, which raises questions about the stability of the industry. It's a very delicate situation and I feel sorry for those who were impacted, but I believe that it's important to analyze the data about these facts to be able to prepare myself and others for the future.

### 🎯 **Purpose and Objective**
The purpose of this project is to analyze a dataset regarding the layoff announcements from tech companies that happened in recent months. The main research objective is to:
- Understand the trends
- Try to find any patterns
- Analyze which industries were more affected
- Analyze the percentage of labor affected. 

This notebook will contain some analysis to comprehend the data, but the "full analysis" will be available in Power BI.

### 📑 **Methodology**
For this project, the following tools will be used:
- Python
- Pandas
- Power BI

### 📚 **Used Dataset**
The dataset that will be used in this project was obtained from Kaggle. I appreciate [Widya Salim](https://www.kaggle.com/salimwid) for making this data available to use. You can access it through this link: [Technology Company Layoffs (2022-2023)](https://www.kaggle.com/datasets/salimwid/technology-company-layoffs-20222023-data)

Text copied from the Kaggle website:
```
From Amazon, Microsoft, Google to Wayfair, the technology industry is currently shaken by massive layoffs since mid-2022.

This tabular dataset includes information on 450+ technology companies, including:
- Office locations affected by layoffs
- Current IPO status
- Reported layoff date
- % of the workforce impacted within each company, etc.
- Use this data to gain insights on technology industry trends and make informed decisions for your career or business.
- Be the first to know and hear about layoffs to make your next move. Don't miss out on this essential resource for staying informed in the fast-paced world of technology

Interesting Task Ideas:
- Visualizing current layoffs trends based on months
- Identifying which locations are most impacted
- Whether IPO status affects severity of layoffs
```

----

## ▶️ **Starting Code**
In this section, the objective is to:
- Import required libraries
- Define default variables
- Test if the dataset is working

In [71]:
# Importing libraries
import pandas as pd
import numpy as np

In [72]:
# Default variables
tech_layoffs = pd.read_csv('../data/tech_layoffs.csv', sep=',')

In [73]:
# Showing random rows to pre-analyze the data
tech_layoffs.sample(5)

Unnamed: 0,company,total_layoffs,impacted_workforce_percentage,reported_date,industry,headquarter_location,sources,status,additional_notes
112,Lattice,Unclear,15,1/12/2023,"Enterprise applications, HR",San Francisco,Lattice CEO Jack Altman,Private,
85,GoFundMe,94,12,10/26/2022,Crowdfunding,"Redwood City, CA",GoFundMe,Private,
356,Teleport,Unclear,Unclear,7/1/2022,Cloud Infrastructure,Oakland,Teleport,Private,
75,Salsify,90,11,11/16/2022,"Cloud computing, e-commerce",Boston,Boston Business Journal,Private,
462,Volta Charging,Unclear,54,10/21/2022,"Automotive, electric vehicles",San Francisco,Volta,Public,


----

## 💡 **Quick Analysis**
### **Objectives**
- Understand column types
- Prepare the dataset
- Discover how many unique companies are listed on this dataset ✅

In [74]:
tech_layoffs.dtypes

company                          object
total_layoffs                    object
impacted_workforce_percentage    object
reported_date                    object
industry                         object
headquarter_location             object
sources                          object
status                           object
additional_notes                 object
dtype: object

In [75]:
# Converting columns to their correct type
tech_layoffs['total_layoffs'] = pd.to_numeric(tech_layoffs['total_layoffs'], errors='coerce')
tech_layoffs['impacted_workforce_percentage'] = pd.to_numeric(tech_layoffs['impacted_workforce_percentage'], errors='coerce')
tech_layoffs['reported_date'] = pd.to_datetime(tech_layoffs['reported_date'], format='%d/%M/%Y')

In [76]:
# Total rows in the dataset
len(tech_layoffs)

489

In [77]:
# Count unique companies in the dataset
tech_layoffs[['company']].nunique()

company    477
dtype: int64

In [78]:
# 489 - 477 = 12 repeated rows = 2 or more layoffs from the same company registered
# Checking which companies repeat in the dataset
tech_layoffs[tech_layoffs['company'].duplicated() == True][['company', 'industry']]

Unnamed: 0,company,industry
69,Gemini,"Fintech, Crypto"
92,Thirty Madison,"health care, wellness"
185,Homeward,Proptech
188,DataRobot,"AI, enterprise software"
201,Socure,Identity verification
211,TruePill,"pharmaceutical, health care"
249,Argo AI,Transportation
300,On Deck,"Networking, business development"
342,Sundae,PropTech
451,Blend,"Fintech, proptech"


### 💡 **Analysis 1) Check which companies had more and fewer layoff**

In [79]:
tech_layoffs.sample(1)

Unnamed: 0,company,total_layoffs,impacted_workforce_percentage,reported_date,industry,headquarter_location,sources,status,additional_notes
30,Faire,84.0,7.0,2022-01-11 00:03:00,"E-commerce, retail",San Francisco,The Information,Private,


In [80]:
# Step 1: calculating the total employees for each company (that has "total_layoff" and "impacted_workforce_percentage" filled)
tech_layoffs['total_employees'] = ((tech_layoffs['total_layoffs'] / 100) / (tech_layoffs['impacted_workforce_percentage'] / 100)) * 100

# Step 2: rounding the "total_employees" value
tech_layoffs['total_employees'] = tech_layoffs['total_employees'].round(0)

tech_layoffs.sample(2)

Unnamed: 0,company,total_layoffs,impacted_workforce_percentage,reported_date,industry,headquarter_location,sources,status,additional_notes,total_employees
459,Sema4,750.0,39.0,2022-01-11 00:14:00,"AI, health care","Stamford, CT",MarketWatch,Public,,1923.0
454,Robinhood,1013.0,33.0,2022-01-08 00:02:00,Fintech,Menlo Park,Robinhood,Public,,3070.0


In [81]:
# Step 3: grouping by company name and removing rows with empty "total_layoffs" or "total_employees"
top_companies = tech_layoffs.groupby('company', as_index=False)[['total_layoffs', 'total_employees']].sum()

top_companies = top_companies[top_companies['total_layoffs'] > 0]
top_companies = top_companies[top_companies['total_employees'] > 0]

# Step 4: creating total layoff percentage (in case a company had layoff twice or more)
top_companies['total_layoff_percentage'] = ((top_companies['total_layoffs'] / top_companies['total_employees']) * 100).round(1)

top_companies.sample(3)

Unnamed: 0,company,total_layoffs,total_employees,total_layoff_percentage
167,Flyhomes,150.0,750.0,20.0
160,Fast,400.0,400.0,100.0
7,Addepar,20.0,667.0,3.0


In [82]:
# Step 5.1: categorizing the company size
company_size_conditions = [
    top_companies['total_employees'] < 100,      # <100 Employees => Small Company
    top_companies['total_employees'] < 500,      # >=100 & <500 Employees => Mid Company
    top_companies['total_employees'] < 2000,     # >=500 & <2000 Employees => Large Company
    top_companies['total_employees'] >= 2000     # >=2000 Employees => Big Tech
]

company_size_options = [
    'Small Company',
    'Mid Company',
    'Large Company',
    'Big Tech'
]

top_companies['company_size'] = np.select(company_size_conditions, company_size_options, 'Unidentified')

top_companies.sample(3)

Unnamed: 0,company,total_layoffs,total_employees,total_layoff_percentage,company_size
19,Amazon,18000.0,360000.0,5.0,Big Tech
154,Ethos Life,40.0,333.0,12.0,Mid Company
289,Ocavu,20.0,42.0,47.6,Small Company


In [84]:
# Step 5.2: merging the "company_size" column to the main dataframe
tech_layoffs = pd.merge(top_companies[['company', 'company_size']], tech_layoffs, how='left', on='company')

tech_layoffs.sample(3)

Unnamed: 0,company,company_size,total_layoffs,impacted_workforce_percentage,reported_date,industry,headquarter_location,sources,status,additional_notes,total_employees
85,Gemini,Large Company,68.0,7.0,2022-01-07 00:18:00,Crypto,New York,TechCrunch,Private,,971.0
30,Bolt,Large Company,250.0,33.0,2022-01-05 00:25:00,FinTech,San Francisco,Fortune,Private,,758.0
24,Bird,Large Company,138.0,23.0,2022-01-06 00:07:00,Transporation,Santa Monica,TechCrunch,Public,,600.0


In [36]:
# Step 6.1: sorting by top 10 most "total_layoffs"
top_companies.sort_values('total_layoffs', ascending=False).head(5)

Unnamed: 0,company,total_layoffs,total_employees,total_layoff_percentage
19,Amazon,18000.0,360000.0,5.0
262,Meta,11000.0,84615.0,13.0
48,Better.com,5000.0,10000.0,50.0
97,Cisco,4100.0,82000.0,5.0
430,Twitter,3740.0,5343.0,70.0


In [37]:
# Step 6.2: sorting by top 10 fewer "total_layoffs"
top_companies.sort_values('total_layoffs', ascending=True).head(5)

Unnamed: 0,company,total_layoffs,total_employees,total_layoff_percentage
461,Woven,5.0,33.0,15.2
383,Sourcepoint,7.0,54.0,13.0
133,Digital Currency Group,10.0,77.0,13.0
337,RealSelf,11.0,220.0,5.0
6,Abra,12.0,240.0,5.0


In [35]:
# Step 6.3: sorting by top 10 most "total_layoff_percentage" (important!)
top_companies.sort_values('total_layoff_percentage', ascending=False).head(10)

Unnamed: 0,company,total_layoffs,total_employees,total_layoff_percentage
329,Protocol Media,60.0,60.0,100.0
399,SummerBio,101.0,101.0,100.0
160,Fast,400.0,400.0,100.0
338,Reali,140.0,140.0,100.0
70,Butler Hospitality,1000.0,1000.0,100.0
451,WanderJaunt,85.0,85.0,100.0
433,Uniphore,76.0,100.0,76.0
311,Parler,60.0,80.0,75.0
430,Twitter,3740.0,5343.0,70.0
54,Bizzabo,220.0,367.0,59.9


In [38]:
# Step 6.4: sorting by top 10 fewer "total_layoff_percentage" (important!)
top_companies.sort_values('total_layoff_percentage', ascending=True).head(10)

Unnamed: 0,company,total_layoffs,total_employees,total_layoff_percentage
157,F5,100.0,10000.0,1.0
173,Freshworks,90.0,4500.0,2.0
73,C2FO,20.0,1000.0,2.0
23,Amperity,13.0,433.0,3.0
450,WalkMe,43.0,1433.0,3.0
7,Addepar,20.0,667.0,3.0
302,Outbrain,38.0,1267.0,3.0
119,Coursera,32.0,1067.0,3.0
20,Amdocs,700.0,23333.0,3.0
416,Thirty Madison,24.0,800.0,3.0


### 🧮 Result of Analysis 1

contar quantas vezes a empresa fez layoff
isso é negativo pra ela se fez mais de 1 vez

## Analysis 2) Rank which industry had most layoff 

## Analysis 3)  