# Market Demand Analysis

1. Data processing

In [1]:
import pandas as pd
import plotly.express as px

In [2]:
data = pd.read_csv('/content/Workforce_Skills_Gap_Analysis.csv')

**Skill Demand Forecast:** Analyze the 'Skills Most in Demand' data to predict future workforce needs and guide the creation of future training initiatives.

In [5]:
# First, we need to make a copy of the data
df = data.copy()

# Let's filter for most recent years (2022 and 2023) to predict future trends
df = df[df['Year'].isin([2022, 2023])]

# Next, we calculate the total number of employees needing reskilling by skill and sort descending
df = df.groupby('Skills Most in Demand')['Employees Needing Reskilling (Thousands)'].sum().sort_values(ascending=False)

# Then we reset the index to return a DataFrame
df = df.reset_index()

df

|   | Skills Most in Demand | Employees Needing Reskilling (Thousands) |
|---|---|---|
| 0 | Sales and Customer Management | 1484 |
| 1 | Healthcare Expertise | 1375 |
| 2 | Data Analysis | 907 |
| 3 | Cloud Computing | 800 |
| 4 | Project Management | 652 |
| 5 | AI and Machine Learning | 444 |
| 6 | Digital Marketing | 428 |
| 7 | Renewable Energy Technology | 339 |
| 8 | Cybersecurity | 232 |
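
The cell above sums 2022 and 2023 together, which ranks skills by volume but does not show whether demand is rising or falling. One possible complement, sketched here with made-up numbers rather than dataset values, is to keep the two years apart and compare them. The `toy` frame and the `Change 2022-2023` column name are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical rows for illustration only; these are NOT values from the dataset
toy = pd.DataFrame({
    'Year': [2022, 2022, 2023, 2023],
    'Skills Most in Demand': ['Data Analysis', 'Cybersecurity',
                              'Data Analysis', 'Cybersecurity'],
    'Employees Needing Reskilling (Thousands)': [400, 150, 500, 120],
})

# One row per skill, one column per year
by_year = toy.pivot_table(index='Skills Most in Demand',
                          columns='Year',
                          values='Employees Needing Reskilling (Thousands)',
                          aggfunc='sum')

# Positive change: more employees needed reskilling in 2023 than in 2022
by_year['Change 2022-2023'] = by_year[2023] - by_year[2022]

print(by_year.sort_values('Change 2022-2023', ascending=False))
```

With the real data, the same pivot would run on `data` filtered to 2022 and 2023. Two years give only a direction, not a fitted forecast.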


In [6]:
fig = px.bar(df,
             x='Employees Needing Reskilling (Thousands)',
             y='Skills Most in Demand',
             color='Skills Most in Demand',
             title="Skill Demand Forecast",
             labels={'Employees Needing Reskilling (Thousands)': 'Employees (Thousands)',
                     'Skills Most in Demand': 'Skills'},
             template='plotly_dark')

fig.show()

The combined 2022 and 2023 totals point to the strongest reskilling demand in the following areas:

* **Sales and Customer Management** tops the list with 1,484 thousand employees needing reskilling, suggesting a critical need for sales and customer-handling skills.

* **Healthcare Expertise** is second with 1,375 thousand employees requiring reskilling, indicating a growing need in the healthcare industry.

* **Data Analysis** and **Cloud Computing** follow with 907 thousand and 800 thousand employees respectively, pointing to the ongoing digital transformation in businesses.

* **AI and Machine Learning** (444 thousand) and **Cybersecurity** (232 thousand) rank lower but remain significant. They are emerging technology fields where workforce needs may grow.
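
To put these counts in proportion, the totals from the output table can be converted to shares of the combined figure. This is a small sketch that copies the nine values by hand. In the notebook itself the same arithmetic could be applied to the grouped `df` directly.

```python
import pandas as pd

# Totals copied from the output table above (thousands of employees, 2022-2023)
demand = pd.Series({
    'Sales and Customer Management': 1484,
    'Healthcare Expertise': 1375,
    'Data Analysis': 907,
    'Cloud Computing': 800,
    'Project Management': 652,
    'AI and Machine Learning': 444,
    'Digital Marketing': 428,
    'Renewable Energy Technology': 339,
    'Cybersecurity': 232,
})

total = demand.sum()                       # all nine skills combined
share = (demand / total * 100).round(1)    # percentage of the combined total
top_two = demand.iloc[:2].sum() / total * 100

print(f"Total: {total} thousand")
print(share)
print(f"Top two skills: {top_two:.1f}% of the total")
```

The top two skills together account for about 43% of the 6,661 thousand employees counted, so the ranking is fairly concentrated at the top.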

In [7]:
df = data.copy()

# Convert 'Year' to datetime format
df['Year'] = pd.to_datetime(df['Year'], format='%Y')

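# Keep rows up to and including 2024. The year-only timestamps created above
# all fall on 1 January, so comparing against 2024-01-01 keeps the 2024 rows.
df = df[df['Year'] <= pd.Timestamp('2024')]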

# Calculate the total cost of training for each industry and country
df['Total Training Cost (USD)'] = df['Employees Needing Reskilling (Thousands)'] * df['Average Training Cost Per Employee (USD)'] * 1000

# Group by 'Country' and 'Industry' and calculate the sum of 'Employees Needing Reskilling (Thousands)'
# and 'Total Training Cost (USD)' and the average of 'Business Impact of Skill Shortage (USD Million)'
df = df.groupby(['Country', 'Industry']).agg({'Employees Needing Reskilling (Thousands)': 'sum',
                                              'Total Training Cost (USD)': 'sum',
                                              'Business Impact of Skill Shortage (USD Million)': 'mean'}).reset_index()

# Calculate the cost per employee needing reskilling
df['Cost Per Employee Needing Reskilling (USD)'] = df['Total Training Cost (USD)'] / (df['Employees Needing Reskilling (Thousands)'] * 1000)

# Sort by 'Total Training Cost (USD)' in descending order
df = df.sort_values('Total Training Cost (USD)', ascending=False)

df

|   | Country | Industry | Employees Needing Reskilling (Thousands) | Total Training Cost (USD) | Business Impact of Skill Shortage (USD Million) | Cost Per Employee Needing Reskilling (USD) |
|---|---|---|---|---|---|---|
| 24 | India | Construction | 578 | 2384929000 | 76.5 | 4126.17474 |
| 35 | UK | Retail | 483 | 2285556000 | 150.0 | 4732.0 |
| 5 | Brazil | Retail | 554 | 2237168000 | 102.0 | 4038.209386 |
| 6 | Brazil | Transportation | 444 | 2082360000 | 147.0 | 4690.0 |
| 23 | Germany | Healthcare | 847 | 2015510000 | 130.5 | 2379.586777 |
| 30 | Japan | Retail | 573 | 1949886000 | 116.25 | 3402.942408 |
| 13 | Canada | Technology | 643 | 1868472000 | 40.5 | 2905.866252 |
| 16 | China | Education | 475 | 1830175000 | 43.5 | 3853.0 |
| 33 | UK | Finance | 787 | 1742534000 | 8.25 | 2214.147395 |
| 34 | UK | Manufacturing | 466 | 1709288000 | 51.0 | 3668.0 |
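
One detail of the calculation above is worth spelling out: dividing the summed cost by the summed headcount gives an employee-weighted average, which can differ from a plain mean of the per-row `Average Training Cost Per Employee (USD)` values. The sketch below uses two made-up rows (not dataset values) for a single hypothetical country and industry pair to show the difference.

```python
import pandas as pd

# Hypothetical rows for one country/industry pair; NOT values from the dataset
toy = pd.DataFrame({
    'Employees Needing Reskilling (Thousands)': [100, 300],
    'Average Training Cost Per Employee (USD)': [2000, 4000],
})

# Row-level total cost, computed the same way as in the cell above
toy['Total Training Cost (USD)'] = (
    toy['Employees Needing Reskilling (Thousands)']
    * toy['Average Training Cost Per Employee (USD)']
    * 1000
)

# Ratio of sums: the quantity stored in 'Cost Per Employee Needing Reskilling (USD)'
weighted = (toy['Total Training Cost (USD)'].sum()
            / (toy['Employees Needing Reskilling (Thousands)'].sum() * 1000))

# Plain mean of the per-row averages, for comparison
simple = toy['Average Training Cost Per Employee (USD)'].mean()

print(weighted)  # 3500.0
print(simple)    # 3000.0
```

The weighted figure is higher here because the larger group of employees is also the more expensive one to train. When a group contains only one row, the two figures coincide.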


2. Data visualization

In [8]:
fig = px.bar(df, x='Total Training Cost (USD)', y='Country', color='Industry',
             title='Total Training Cost by Industry and Country',
             labels={'Total Training Cost (USD)': 'Total Training Cost (USD)',
                     'Country': 'Country'},
             hover_data=['Employees Needing Reskilling (Thousands)',
                         'Business Impact of Skill Shortage (USD Million)',
                         'Cost Per Employee Needing Reskilling (USD)'],
             color_discrete_sequence=px.colors.qualitative.Bold,
             template='plotly_dark')

fig.show()

3. Insights generation

The analysis highlights reskilling needs and the associated training costs across industries and countries.

* **The Retail sectors in the UK and Brazil** face significant reskilling needs, with 483 thousand and 554 thousand employees respectively, at a total training cost of more than $2.2 billion each.

* **Germany's Healthcare sector** needs to reskill 847 thousand employees, the largest headcount among the rows shown, at a total training cost just over $2 billion.

* **India's Construction sector** has the highest total training cost, about $2.4 billion for 578 thousand employees.

* **The Technology sector in the USA** needs to reskill 176 thousand employees at a cost of about $556 million, or roughly $3,160 per employee, which is mid-range next to the per-employee costs in the table above.
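
Which sector "leads" depends on the metric used. As a quick check, the sketch below re-enters the ten rows shown in the output table (the top 10 by total training cost) and finds the leader for each metric. Column names are shortened here for readability, and `top10` and `leaders` are illustrative names that do not appear in the notebook.

```python
import pandas as pd

# The ten rows shown in the output table above, re-entered by hand
top10 = pd.DataFrame({
    'Country': ['India', 'UK', 'Brazil', 'Brazil', 'Germany',
                'Japan', 'Canada', 'China', 'UK', 'UK'],
    'Industry': ['Construction', 'Retail', 'Retail', 'Transportation', 'Healthcare',
                 'Retail', 'Technology', 'Education', 'Finance', 'Manufacturing'],
    'Employees (Thousands)': [578, 483, 554, 444, 847, 573, 643, 475, 787, 466],
    'Total Cost (USD)': [2384929000, 2285556000, 2237168000, 2082360000, 2015510000,
                         1949886000, 1868472000, 1830175000, 1742534000, 1709288000],
    'Business Impact (USD Million)': [76.5, 150.0, 102.0, 147.0, 130.5,
                                      116.25, 40.5, 43.5, 8.25, 51.0],
})

# Same formula as the notebook: total cost divided by headcount
top10['Cost Per Employee (USD)'] = (
    top10['Total Cost (USD)'] / (top10['Employees (Thousands)'] * 1000)
)

metrics = ['Employees (Thousands)', 'Total Cost (USD)',
           'Business Impact (USD Million)', 'Cost Per Employee (USD)']

# For each metric, record the (country, industry) pair with the largest value
leaders = {}
for metric in metrics:
    row = top10.loc[top10[metric].idxmax()]
    leaders[metric] = (row['Country'], row['Industry'])
    print(f"{metric}: {row['Country']} {row['Industry']} ({float(row[metric]):,.2f})")
```

Among these ten rows, Germany's Healthcare sector leads on headcount, India's Construction sector on total cost, and UK Retail on both business impact and cost per employee. Rows outside the top 10 are not included, so these results apply only to the rows shown.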

These insights can guide resource allocation in training and development efforts.