**Cost-Benefit Analysis:** <br>
Determine the financial impact of skill shortages in the USA and compare it with the average training cost per employee to assess whether reskilling is a cost-effective strategy.

In [1]:
import pandas as pd
import plotly.express as px

In [2]:
data = pd.read_csv('/content/Workforce_Skills_Gap_Analysis.csv')

In [4]:
df = data.copy()
df_usa = df[df['Country'] == 'USA'].groupby('Year').agg(
    Financial_Impact_Skill_Shortage_USD_Million=('Business Impact of Skill Shortage (USD Million)', 'sum'),
    Average_Training_Cost_Per_Employee_USD=('Average Training Cost Per Employee (USD)', 'mean')
).reset_index()
df_usa

Unnamed: 0,Year,Financial_Impact_Skill_Shortage_USD_Million,Average_Training_Cost_Per_Employee_USD
0,2020,135.0,3862.0
1,2022,63.0,3692.5
2,2023,142.5,3160.0


**Missing data for USA, year 2021**

In [7]:
df = data.copy()
missing_2021 = df[(df['Year'] == 2021) & (df['Country']=='USA')]
missing_2021

Unnamed: 0,Country,Industry,Year,Critical Skill Shortage (%),Employees Needing Reskilling (Thousands),Skills Most in Demand,Average Training Cost Per Employee (USD),Business Impact of Skill Shortage (USD Million)
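
The empty result above confirms that the dataset has no USA rows for 2021. As a minimal sketch, a gap like this can also be detected without knowing the year in advance by aligning the yearly aggregates to the full range of years. The frame below is rebuilt from the `df_usa` output shown earlier so the snippet runs on its own; the names `df_usa_full` and `missing_years` are introduced here for illustration.

```python
import pandas as pd

# Yearly USA aggregates copied from the df_usa output shown earlier.
df_usa = pd.DataFrame({
    "Year": [2020, 2022, 2023],
    "Financial_Impact_Skill_Shortage_USD_Million": [135.0, 63.0, 142.5],
    "Average_Training_Cost_Per_Employee_USD": [3862.0, 3692.5, 3160.0],
})

# Build every year between the first and last observed year.
first_year = int(df_usa["Year"].min())
last_year = int(df_usa["Year"].max())
all_years = list(range(first_year, last_year + 1))

# Align the aggregates to the full range; absent years become all-NaN rows.
df_usa_full = df_usa.set_index("Year").reindex(all_years)

# Collect the years that have no data at all.
missing_years = df_usa_full[df_usa_full.isna().all(axis=1)].index.tolist()
print(missing_years)
```

Keeping the NaN row in `df_usa_full` makes the gap explicit in any later table or chart instead of silently skipping from 2020 to 2022.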


In [6]:
fig = px.bar(df_usa, x='Year', y=['Financial_Impact_Skill_Shortage_USD_Million', 'Average_Training_Cost_Per_Employee_USD'],
             barmode='group', template='plotly_dark',
             labels={'value':'USD (in Millions for Impact, in Dollars for Training Cost)',
                     'variable':'Metrics'})
fig.show()

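The chart compares the financial impact of skill shortages with the average training cost per employee in the USA for 2020, 2022, and 2023 (the dataset has no USA records for 2021). Because the impact is expressed in USD million and the training cost in USD, the bar heights of the two metrics are not directly comparable.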

In 2020, the financial impact of skill shortages totaled USD 135 million, with an average training cost of USD 3,862 per employee. In 2022, the impact fell to USD 63 million while the average training cost was USD 3,692.50 per employee; this may reflect mitigation from training investments, although the data alone cannot confirm a causal link. In 2023, the impact rose to USD 142.5 million even as the average training cost declined to USD 3,160.

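These fluctuations suggest a dynamic skill gap challenge and a need for continuous training investment. The per-employee training cost (a few thousand dollars) is small next to the aggregate losses (tens to hundreds of millions of dollars), which points toward reskilling being cost-effective. A firm conclusion, however, requires multiplying the per-employee cost by the number of employees needing reskilling so that both figures are expressed in the same units.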

Furthermore, training costs declined each year while the impact fluctuated, so lower spending per employee did not coincide with lower losses. This calls for adaptable, proactive training strategies that track evolving skill demands. In summary, investing in training remains important for bridging skill gaps and mitigating their economic impact.
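
To compare training cost and shortage impact in the same units, the per-employee cost can be scaled by the number of employees needing reskilling. The sketch below shows the arithmetic on two hypothetical rows; the industries and numbers are invented for illustration and are not taken from the dataset, and the helper `add_reskilling_cost` and its output column names are assumptions. Only the input column names match the CSV.

```python
import pandas as pd

# Hypothetical rows for illustration only; these are NOT values from the dataset.
sample = pd.DataFrame({
    "Industry": ["Industry A", "Industry B"],
    "Employees Needing Reskilling (Thousands)": [100, 50],
    "Average Training Cost Per Employee (USD)": [4000.0, 3000.0],
    "Business Impact of Skill Shortage (USD Million)": [60.0, 200.0],
})

def add_reskilling_cost(frame):
    """Add total training cost (USD million) and an impact-to-cost ratio."""
    out = frame.copy()
    # (thousands of employees) * (USD per employee) = thousands of USD;
    # dividing by 1000 converts thousands of USD to USD million.
    out["Total_Training_Cost_USD_Million"] = (
        out["Employees Needing Reskilling (Thousands)"]
        * out["Average Training Cost Per Employee (USD)"]
        / 1000
    )
    # A ratio above 1 means the shortage costs more than reskilling everyone.
    out["Impact_to_Cost_Ratio"] = (
        out["Business Impact of Skill Shortage (USD Million)"]
        / out["Total_Training_Cost_USD_Million"]
    )
    return out

result = add_reskilling_cost(sample)
print(result[["Industry", "Total_Training_Cost_USD_Million", "Impact_to_Cost_Ratio"]])
```

Applied to the real data, the same function could be run on `df[df['Country'] == 'USA']`. Whether reskilling pays for itself would then depend on the resulting ratios, not on the raw per-employee figure.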

**Workforce Development/Reskilling Strategy:**<br>Identify the industries in the USA with the highest critical skill shortage percentage to develop targeted training programs.

In [8]:
df_usa = df[df['Country'] == 'USA'].groupby(['Industry']).agg(
    Critical_Skill_Shortage_Percentage=('Critical Skill Shortage (%)', 'mean')
).reset_index().sort_values(by='Critical_Skill_Shortage_Percentage', ascending=False)
df_usa

Unnamed: 0,Industry,Critical_Skill_Shortage_Percentage
1,Manufacturing,23.0
2,Retail,23.0
3,Technology,22.0
0,Construction,18.0


In [9]:
fig = px.bar(df_usa, x='Industry', y='Critical_Skill_Shortage_Percentage',
             template='plotly_dark', title='Critical Skill Shortage Percentage by Industry in the USA')

fig.show()


The chart of mean critical skill shortage percentage by industry in the USA shows Manufacturing and Retail with the largest gaps, both at 23%. Technology follows closely at 22%, and Construction is lower but still notable at 18%. With only five percentage points separating the highest and lowest values, skill gaps affect all four industries, and targeted training programs are warranted in each, starting with Manufacturing and Retail.

**Skill Demand Forecast:** <br>Analyze the 'Skills Most in Demand' data for the USA to anticipate future workforce needs and guide the creation of future training initiatives.

In [10]:
df_usa_skills = df[df['Country'] == 'USA'].groupby(['Skills Most in Demand']).size().reset_index(name='Count')
df_usa_skills_sorted = df_usa_skills.sort_values(by='Count', ascending=False)
df_usa_skills_sorted

Unnamed: 0,Skills Most in Demand,Count
0,Data Analysis,3
1,Healthcare Expertise,1


In [11]:
fig = px.bar(df_usa_skills_sorted, x='Skills Most in Demand', y='Count',
             template='plotly_dark', title='Skills Most in Demand in the USA')

fig.show()


The chart of skills most in demand in the USA shows Data Analysis as the leading requirement, appearing in 3 of the 4 USA records. Healthcare Expertise appears in the remaining record, indicating a more specialized need. With only four records, these counts are indicative rather than conclusive, but they suggest prioritizing training programs in data analytics, followed by healthcare.

**Industry Analysis:** <br>Identify the industry with the most employees needing reskilling to prioritize the sectors that require immediate intervention.

In [17]:
df_usa_reskilling = df[df['Country'] == 'USA'].groupby(['Industry']).agg(
    Total_Employees_Needing_Reskilling=('Employees Needing Reskilling (Thousands)', 'sum')
).sort_values(by='Total_Employees_Needing_Reskilling', ascending=False).reset_index()
df_usa_reskilling

Unnamed: 0,Industry,Total_Employees_Needing_Reskilling
0,Construction,486
1,Manufacturing,335
2,Technology,176
3,Retail,142


In [18]:
fig = px.bar(df_usa_reskilling, x='Industry', y='Total_Employees_Needing_Reskilling',
             template='plotly_dark', title='Total Employees Needing Reskilling by Industry in the USA')
fig.show()
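
The table above puts Construction first with 486 thousand employees needing reskilling, followed by Manufacturing with 335 thousand. As a small sketch, the top industry and each industry's share of the total can be derived programmatically. The frame is rebuilt from the `df_usa_reskilling` output so the snippet runs on its own, and the `Share_Percent` column and `top_industry` variable are names introduced here for illustration.

```python
import pandas as pd

# Totals (in thousands) copied from the df_usa_reskilling output above.
df_usa_reskilling = pd.DataFrame({
    "Industry": ["Construction", "Manufacturing", "Technology", "Retail"],
    "Total_Employees_Needing_Reskilling": [486, 335, 176, 142],
})

col = "Total_Employees_Needing_Reskilling"

# Overall number of employees needing reskilling across industries (thousands).
total_thousands = int(df_usa_reskilling[col].sum())

# Each industry's share of that total, as a percentage rounded to one decimal.
df_usa_reskilling["Share_Percent"] = (
    df_usa_reskilling[col] / total_thousands * 100
).round(1)

# Industry with the largest reskilling need.
top_industry = df_usa_reskilling.loc[df_usa_reskilling[col].idxmax(), "Industry"]

print(top_industry, total_thousands)
print(df_usa_reskilling)
```

Ranking by share makes it easier to see how much of the overall reskilling effort each sector accounts for when deciding where to intervene first.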