#### **Name**: Prasad GVS
#### **SDS Profile Link**: https://community.superdatascience.com/u/d95a79a9

### **Project Brief:**
The technology sector is one of the most dynamic and rapidly evolving industries. This project aims to analyze the leading technology companies globally by leveraging a dataset containing their rankings, market capitalization, stock tickers, country of operation, and the sector and industry they belong to. The goal is to uncover insights into the market dynamics of top tech companies, identify trends in market capitalization across regions, and understand the competitive landscape within various sectors and industries. This analysis will provide a comprehensive overview of the global tech industry, highlighting major players and trends that could inform business decisions, investment strategies, and market positioning.

#### Project Objectives
**Analyze Market Capitalization Trends:** Understand the distribution of market cap among the top tech companies, identifying companies with the largest and smallest market caps, and examining trends in the dataset.

**Regional Insights:** Investigate how tech companies' market caps and rankings vary across different countries and regions, and identify which regions dominate the global tech landscape.

**Sector and Industry Breakdown:** Examine the various sectors and industries within the tech field, identifying key players and emerging competitors in each industry (software, hardware, telecommunications, etc.).

**Competitive Positioning:** Assess how companies rank against their peers within the same sector and industry, providing insights into competitive strengths and weaknesses.

**Provide Strategic Recommendations (optional):** Based on the analysis, offer data-driven recommendations on potential opportunities for growth, investment, or expansion within the tech sector.

#### Business Questions to Answer:
1. Which companies hold the largest market share in the global tech industry?
2. How does market capitalization vary across countries or regions?
3. What are the leading sectors and industries within the tech space?
4. Which companies have shown consistent growth in market capitalization over time?
5. What emerging tech companies are rising in rankings?

#### Project Deliverables
1. **Data Exploration Report:** Initial findings on the distribution of rankings, market caps, and company performance across countries, sectors, and industries.
2. **Regional Market Analysis:** Detailed analysis of how tech companies are distributed geographically, highlighting regions with the largest market caps and the most competitive landscapes.
3. **Sector and Industry Insights:** Breakdown of leading companies by sector and industry, identifying key players, trends, and emerging competitors.
4. **Competitive Analysis Report:** Comparative report showing company rankings within specific industries, analyzing how different companies perform against their peers.
5. **Strategic Recommendations:** Actionable insights and recommendations for businesses on areas of growth, investment, or strategic expansion.
6. **Dashboard (Optional):** Interactive visualization that allows users to explore market cap, ranking, and other key metrics across companies, regions, and sectors.

### **Import the libraries**

In [4]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

### **Import the dataset**

In [5]:
df = pd.read_csv('../../data/tech-companies.csv')

#### **Initial analysis of dataset**

In [6]:
df.head()

| | Ranking | Company | Market Cap | Stock | Country | Sector | Industry |
|---|---|---|---|---|---|---|---|
| 0 | 1 | Apple Inc. | $2.866 T | AAPL | United States | Technology | Consumer Electronics |
| 1 | 2 | Microsoft Corporation | $2.755 T | MSFT | United States | Technology | Software—Infrastructure |
| 2 | 3 | Nvidia Corporation | $1.186 T | NVDA | United States | Technology | Semiconductors |
| 3 | 4 | Broadcom Inc. | $495.95 B | AVGO | United States | Technology | Semiconductors |
| 4 | 5 | Taiwan Semiconductor Manufacturing Company Lim... | $487.64 B | 2330 | Taiwan | Technology | Semiconductors |


In [8]:
df.shape

(1000, 7)

In [9]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 7 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   Ranking     1000 non-null   int64 
 1   Company     1000 non-null   object
 2   Market Cap  1000 non-null   object
 3   Stock       1000 non-null   object
 4   Country     1000 non-null   object
 5   Sector      1000 non-null   object
 6   Industry    1000 non-null   object
dtypes: int64(1), object(6)
memory usage: 54.8+ KB


In [11]:
df.describe()

| | Ranking |
|---|---|
| count | 1000.0 |
| mean | 500.5 |
| std | 288.819436 |
| min | 1.0 |
| 25% | 250.75 |
| 50% | 500.5 |
| 75% | 750.25 |
| max | 1000.0 |


In [12]:
df.describe(include='object')

| | Company | Market Cap | Stock | Country | Sector | Industry |
|---|---|---|---|---|---|---|
| count | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 |
| unique | 1000 | 684 | 996 | 38 | 1 | 12 |
| top | Apple Inc. | $1.16 B | 2382 | United States | Technology | Software—Application |
| freq | 1 | 9 | 2 | 317 | 1000 | 198 |


### **EDA**

#### **Cleaning missing data**

In [10]:
df.isna().sum()

Ranking       0
Company       0
Market Cap    0
Stock         0
Country       0
Sector        0
Industry      0
dtype: int64

##### <font color='blue'>**There is no missing data in the dataset**</font>

#### **Cleaning duplicate data**

In [14]:
df.duplicated().value_counts()

False    1000
Name: count, dtype: int64

##### <font color='blue'>**No duplicate rows in the dataset**</font>
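
The check above only catches rows that are identical in every column. The `describe(include='object')` output reports 996 unique `Stock` values across 1000 rows, so some tickers appear more than once. A minimal sketch of how those rows could be listed is shown here. The stand-in frame, its company names, and the `XYZ` ticker are hypothetical placeholders; on the real data the same call would be applied to `df`.

```python
import pandas as pd

# Hypothetical stand-in rows: the company names and the "XYZ" ticker are
# placeholders, not values taken from the dataset.
sample = pd.DataFrame({
    "Company": ["Company A", "Company B", "Company C"],
    "Stock": ["2382", "2382", "XYZ"],
})

# keep=False marks every member of a repeated group, not only the later ones
repeated_tickers = sample[sample.duplicated(subset=["Stock"], keep=False)]
print(repeated_tickers.sort_values("Stock"))
```

A repeated ticker is not necessarily an error, since the same code can be listed on different exchanges, so these rows are worth inspecting before dropping anything.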

#### **Convert currency data**
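
`Market Cap` is stored as text such as `$2.866 T` or `$495.95 B` (dtype `object` in `df.info()`), so it cannot be sorted or aggregated numerically as is. The following is a minimal sketch of one way to convert it, assuming the values are US dollars and that the suffixes `T` and `B` mean trillion and billion. The `M` suffix, the helper name `parse_market_cap`, and the new column name `Market Cap (USD)` are assumptions introduced for illustration.

```python
import re

import pandas as pd

# "T" and "B" appear in the sample output; "M" is an assumed extra suffix.
SUFFIX_MULTIPLIERS = {"T": 1e12, "B": 1e9, "M": 1e6}
_PATTERN = re.compile(r"^\$?\s*([\d.,]+)\s*([TBM]?)$", re.IGNORECASE)


def parse_market_cap(value: str) -> float:
    """Convert a string such as '$2.866 T' to a float; NaN if it does not match."""
    match = _PATTERN.match(value.strip())
    if match is None:
        return float("nan")
    number = float(match.group(1).replace(",", ""))
    suffix = match.group(2).upper()
    # A missing suffix is treated as a plain dollar amount
    return number * SUFFIX_MULTIPLIERS.get(suffix, 1.0)


# Stand-in frame using strings copied from the outputs above
sample = pd.DataFrame({"Market Cap": ["$2.866 T", "$495.95 B", "$1.16 B"]})
sample["Market Cap (USD)"] = sample["Market Cap"].map(parse_market_cap)
print(sample)
```

On the full dataset the same `map` call would be applied to `df["Market Cap"]`. Counting the NaN values afterwards shows whether any strings use a format the pattern does not cover.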

#### **Encoding categorical data**
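
According to `describe(include='object')`, `Sector` has a single unique value, `Industry` has 12, and `Country` has 38. One possible approach, sketched here on a small stand-in frame built from the `df.head()` rows, is to drop constant columns, one-hot encode the low-cardinality `Industry` column, and use integer category codes for `Country`. The choice of encoding per column and the `Country Code` column name are assumptions, not a prescribed method.

```python
import pandas as pd

# Stand-in frame with values copied from the df.head() output
sample = pd.DataFrame({
    "Country": ["United States", "United States", "Taiwan"],
    "Sector": ["Technology", "Technology", "Technology"],
    "Industry": ["Consumer Electronics", "Semiconductors", "Semiconductors"],
})

# A column with one unique value carries no information for comparisons
constant_cols = [col for col in sample.columns if sample[col].nunique() == 1]
encoded = sample.drop(columns=constant_cols)

# Integer codes: categories are ordered alphabetically, so the numbers are
# labels only and imply no ranking between countries
encoded["Country Code"] = encoded["Country"].astype("category").cat.codes

# One-hot encode Industry into 0/1 indicator columns
encoded = pd.get_dummies(encoded, columns=["Industry"], prefix="Industry", dtype=int)
print(encoded)
```

For the grouping and plotting in this project the original string columns are usually enough; encoded columns matter mainly if the data is later fed to a model.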

#### **Outliers**
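
Market capitalization is strongly right-skewed in this dataset: the top rows are in the trillions while the most frequent value is `$1.16 B`. A minimal sketch of an interquartile-range (IQR) check follows, applied both to the raw values and to their base-10 logarithm. The helper name `iqr_outlier_mask`, the conventional multiplier `k=1.5`, and the use of a log scale are assumptions; the input values are the five `df.head()` market caps converted by hand to dollars.

```python
import numpy as np
import pandas as pd


def iqr_outlier_mask(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return a boolean Series that is True where a value lies outside
    [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


# Market caps in dollars, converted by hand from the df.head() rows
market_cap = pd.Series([2.866e12, 2.755e12, 1.186e12, 495.95e9, 487.64e9])

raw_mask = iqr_outlier_mask(market_cap)            # rule on the raw scale
log_mask = iqr_outlier_mask(np.log10(market_cap))  # rule on the log10 scale

print("Flagged on raw scale:", market_cap[raw_mask].tolist())
print("Flagged on log scale:", market_cap[log_mask].tolist())
```

On the full column the raw-scale rule will likely flag many of the largest companies, which are genuine observations rather than errors. Flagged rows are better treated as points to inspect, or handled with a log axis in plots, than removed.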