# II. Needs Analysis: Teacher Professional Development

## Research Questions

1. What are the highest PD needs among U.S. middle school teachers?
2. How do PD needs differ by career stage (early-career vs. veteran)?
3. Do science teachers have different needs than other subject teachers?
4. Which needs are most urgent for early-career vs. veteran science teachers?

## Data & Methodology

**Source:** TALIS 2018 U.S. Lower Secondary Teacher Survey  
**Sample:** 1,799 middle school teachers (grades 7-9)  
**Analysis:** Descriptive statistics and chi-square tests

**PD Needs measured:** 14 survey items (TT2G26A-N)

**Scale:** 1=No need, 2=Low need, 3=Moderate need, 4=High need

## 1. Setup: Import Libraries and Load Data

In [1]:
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from scipy.stats import chi2_contingency
import pyreadstat
import warnings
warnings.filterwarnings('ignore')

# Load data
data_path = '../data/raw/BTGUSAT2.sav'
df, meta = pyreadstat.read_sav(data_path)

# Select variables for analysis
analysis_vars = [
    # Identifiers
    'IDTEACH', 'IDSCHOOL',
    
    # Teacher characteristics
    'TT2G05B',     # Years teaching experience
    'TT2G15C',     # Teaches Science (1=Yes, 0=No)
    
    # PD Barriers (TT2G27A-G)
    'TT2G27A', 'TT2G27B', 'TT2G27C', 'TT2G27D', 
    'TT2G27E', 'TT2G27F', 'TT2G27G',
    
    # PD Needs (TT2G26A-N)
    'TT2G26A', 'TT2G26B', 'TT2G26C', 'TT2G26D',
    'TT2G26E', 'TT2G26F', 'TT2G26G', 'TT2G26H',
    'TT2G26I', 'TT2G26J', 'TT2G26K', 'TT2G26L',
    'TT2G26M', 'TT2G26N'
]

df_analysis = df[analysis_vars].copy()
df_clean = df_analysis.dropna().copy()

# Create experience groups
def categorize_experience(years):
    if years <= 5:
        return 'Early-Career (0-5 years)'
    elif years <= 14:
        return 'Mid-Career (6-14 years)'
    else:
        return 'Veteran (15+ years)'

df_clean['experience_group'] = df_clean['TT2G05B'].apply(categorize_experience)

# Create science teacher indicator
df_clean['is_science_teacher'] = df_clean['TT2G15C'].apply(
    lambda x: 'Science Teacher' if x == 2 else 'Non-Science Teacher'
)

print(f"Loaded {len(df_clean):,} teachers with complete data")


Loaded 1,799 teachers with complete data


## Professional Development Needs Variables

TALIS survey measured teacher PD needs across 14 topic areas (TT2G26A-N):

**Topics:**
- Subject knowledge (A)
- Pedagogical competencies (B)
- Curriculum knowledge (C)
- Student assessment (D)
- ICT skills (E)
- Classroom management (F)
- School administration (G)
- Individualized learning (H)
- Special needs teaching (I)
- Multicultural teaching (J)
- Cross-curricular skills (K)
- Cross-occupational competencies (L)
- New technologies (M)
- Career guidance (N)

**Response scale:** 1=No need, 2=Low need, 3=Moderate need, 4=High need

In [4]:
# Define needs variables and labels
needs_vars = {
    'TT2G26A': 'Subject Knowledge',
    'TT2G26B': 'Pedagogical Competencies',
    'TT2G26C': 'Curriculum Knowledge',
    'TT2G26D': 'Student Assessment',
    'TT2G26E': 'ICT Skills',
    'TT2G26F': 'Classroom Management',
    'TT2G26G': 'School Administration',
    'TT2G26H': 'Individualized Learning',
    'TT2G26I': 'Special Needs Teaching',
    'TT2G26J': 'Multicultural Teaching',
    'TT2G26K': 'Cross-Curricular Skills',
    'TT2G26L': 'Cross-Occupational Skills',
    'TT2G26M': 'New Technologies',
    'TT2G26N': 'Career Guidance'
}

print(f"Analyzing {len(needs_vars)} PD need areas")

Analyzing 14 PD need areas


## 3. Overall PD Needs Profile

Which PD topic areas do teachers need most?

We'll calculate the percentage of teachers reporting moderate or high need (ratings of 3 or 4).

In [5]:
# Calculate percentage reporting moderate/high need (3 or 4)
needs_pct = {}

for var, label in needs_vars.items():
    # Count those who report moderate (3) or high (4) need
    has_need = df_clean[var].isin([3, 4]).sum()
    pct = (has_need / len(df_clean)) * 100
    needs_pct[label] = pct

# Convert to dataframe for visualization
needs_data = pd.DataFrame({
    'PD Topic': list(needs_pct.keys()),
    'Percentage': list(needs_pct.values())
}).sort_values('Percentage', ascending=True)

print("PD Needs Prevalence (% reporting moderate or high need):")
print("="*80)
for idx, row in needs_data.iterrows():
    print(f"{row['PD Topic']:35} {row['Percentage']:5.1f}%")

PD Needs Prevalence (% reporting moderate or high need):
Subject Knowledge                    13.3%
School Administration                18.0%
Curriculum Knowledge                 18.3%
Pedagogical Competencies             18.8%
Classroom Management                 19.5%
Career Guidance                      22.6%
Multicultural Teaching               24.8%
Student Assessment                   29.5%
Cross-Occupational Skills            31.6%
Cross-Curricular Skills              32.1%
Individualized Learning              33.1%
Special Needs Teaching               34.2%
ICT Skills                           40.5%
New Technologies                     53.3%


## Key Findings: Overall PD Needs

**Top needs:**
- New Technologies (53%)
- ICT Skills (41%)
- Special Needs Teaching (35%)

**Lowest needs:**
- Subject Knowledge (13%)
- School Administration (18%)
- Curriculum Knowledge (18%)

Technology-related needs far outpace traditional teaching skills.

In [6]:
# Create horizontal bar chart
fig = px.bar(
    needs_data,
    y='PD Topic',
    x='Percentage',
    orientation='h',
    title='Professional Development Needs: U.S. Middle School Teachers',
    labels={'Percentage': 'Percentage of Teachers (%)', 'PD Topic': ''},
    color='Percentage',
    color_continuous_scale='Blues',
    text='Percentage'
)

fig.update_traces(
    texttemplate='%{text:.1f}%',
    textposition='outside'
)

fig.update_layout(
    height=600,
    width=1000,
    showlegend=False,
    xaxis_range=[0, 60],
    font=dict(size=12),
    title_font_size=16,
    plot_bgcolor='white',
    xaxis=dict(gridcolor='lightgray', title_font_size=14),
    yaxis=dict(title_font_size=14)
)

fig.show()

print("\nKey Insight:")
print(f"Technology needs (ICT + New Tech) affect ~50% of teachers")
print(f"Traditional skills (Subject, Pedagogy, Curriculum) affect <20%")


Key Insight:
Technology needs (ICT + New Tech) affect ~50% of teachers
Traditional skills (Subject, Pedagogy, Curriculum) affect <20%


## 4. PD Needs by Teacher Experience

Do early-career and veteran teachers have different PD needs?

In [7]:
# Calculate needs by experience group
needs_by_exp = []

for exp_group in ['Early-Career (0-5 years)', 'Mid-Career (6-14 years)', 'Veteran (15+ years)']:
    subset = df_clean[df_clean['experience_group'] == exp_group]
    
    for var, label in needs_vars.items():
        has_need = subset[var].isin([3, 4]).sum()
        pct = (has_need / len(subset)) * 100
        
        needs_by_exp.append({
            'Experience': exp_group,
            'PD Topic': label,
            'Percentage': pct,
            'Count': len(subset)
        })

df_exp_needs = pd.DataFrame(needs_by_exp)

# Show comparison for top needs
print("PD Needs by Experience Level:")
print("="*80)
for topic in ['ICT Skills', 'New Technologies', 'Special Needs Teaching', 'Classroom Management']:
    print(f"\n{topic}:")
    subset = df_exp_needs[df_exp_needs['PD Topic'] == topic]
    for _, row in subset.iterrows():
        print(f"  {row['Experience']:30} {row['Percentage']:5.1f}%")

PD Needs by Experience Level:

ICT Skills:
  Early-Career (0-5 years)        37.2%
  Mid-Career (6-14 years)         36.9%
  Veteran (15+ years)             45.8%

New Technologies:
  Early-Career (0-5 years)        45.6%
  Mid-Career (6-14 years)         49.6%
  Veteran (15+ years)             61.0%

Special Needs Teaching:
  Early-Career (0-5 years)        50.9%
  Mid-Career (6-14 years)         33.9%
  Veteran (15+ years)             25.7%

Classroom Management:
  Early-Career (0-5 years)        37.2%
  Mid-Career (6-14 years)         16.9%
  Veteran (15+ years)             12.5%


In [16]:
# Focus on early-career vs veteran
df_exp_comparison = df_exp_needs[
    df_exp_needs['Experience'].isin(['Early-Career (0-5 years)', 'Veteran (15+ years)'])
]

# Sort by early-career percentages for better visual comparison
early_order = df_exp_comparison[
    df_exp_comparison['Experience'] == 'Early-Career (0-5 years)'
].sort_values('Percentage')['PD Topic'].tolist()

fig = px.bar(
    df_exp_comparison,
    x='PD Topic',
    y='Percentage',
    color='Experience',
    barmode='group',
    title='PD Needs: Early-Career vs. Veteran Teachers',
    labels={'Percentage': 'Percentage of Teachers (%)', 'PD Topic': 'PD Topic'},
    color_discrete_map={
        'Early-Career (0-5 years)': '#3498db',
        'Veteran (15+ years)': '#e74c3c'
    },
    text='Percentage',
    category_orders={'PD Topic': early_order}
)

fig.update_traces(texttemplate='%{text:.0f}%', textposition='outside')
fig.update_layout(
    height=600,
    width=1100,
    font=dict(size=10),
    title_font_size=16,
    plot_bgcolor='white',
    xaxis=dict(tickangle=-45, gridcolor='lightgray'),
    yaxis=dict(range=[0, 65], gridcolor='lightgray'),
    legend=dict(
        title='Teacher Experience',
        orientation='h',
        yanchor='bottom',
        y=1.02,
        xanchor='right',
        x=1
    )
)

fig.show()

# Calculate key differences
print("\nKey Differences by Experience (>5pp):")
print("="*80)

# Store differences for sorting
differences = []
for topic in needs_vars.values():
    early = df_exp_comparison[
        (df_exp_comparison['PD Topic'] == topic) & 
        (df_exp_comparison['Experience'] == 'Early-Career (0-5 years)')
    ]['Percentage'].values[0]
    
    veteran = df_exp_comparison[
        (df_exp_comparison['PD Topic'] == topic) & 
        (df_exp_comparison['Experience'] == 'Veteran (15+ years)')
    ]['Percentage'].values[0]
    
    diff = veteran - early
    
    if abs(diff) > 5:
        differences.append({
            'topic': topic,
            'diff': diff,
            'abs_diff': abs(diff)
        })

# Sort by absolute difference (descending)
differences.sort(key=lambda x: x['abs_diff'], reverse=True)

# Separate into two groups
lower_for_veterans = [d for d in differences if d['diff'] < 0]
higher_for_veterans = [d for d in differences if d['diff'] > 0]

# Print lower for veterans first
for item in lower_for_veterans:
    print(f"{item['topic']:35} {item['abs_diff']:5.1f}pp lower for veterans")

# Separator
if higher_for_veterans:
    print("-" * 80)

# Print higher for veterans last (in caps)
for item in higher_for_veterans:
    print(f"{item['topic']:35} {item['abs_diff']:5.1f}pp HIGHER for veterans")


Key Differences by Experience (>5pp):
Special Needs Teaching               25.2pp lower for veterans
Classroom Management                 24.7pp lower for veterans
Individualized Learning              22.0pp lower for veterans
School Administration                17.2pp lower for veterans
Cross-Curricular Skills              17.1pp lower for veterans
Pedagogical Competencies             15.0pp lower for veterans
Student Assessment                   14.6pp lower for veterans
Multicultural Teaching               13.2pp lower for veterans
Cross-Occupational Skills            11.5pp lower for veterans
Curriculum Knowledge                 10.9pp lower for veterans
Career Guidance                       9.5pp lower for veterans
Subject Knowledge                     6.1pp lower for veterans
--------------------------------------------------------------------------------
New Technologies                     15.3pp HIGHER for veterans
ICT Skills                            8.6pp HIGHER for veter

## Key Findings: Needs by Experience Level

**Veterans have higher needs for:**
- New Technologies (+15.4pp vs early-career)
- ICT Skills (+8.5pp)

**Early-career teachers have higher needs for:**
- Special Needs Teaching (+25.6pp)
- Classroom Management (+24.7pp)

Veterans struggle with technology evolution; early-career teachers need foundational classroom skills. This suggests career stage-specific PD programming.

## 5. Science vs. Non-Science Teacher Needs

Do science teachers have different PD needs than other subjects?

In [19]:
# Calculate needs by science teacher status
needs_by_science = []

for sci_status in ['Science Teacher', 'Non-Science Teacher']:
    subset = df_clean[df_clean['is_science_teacher'] == sci_status]
    
    for var, label in needs_vars.items():
        has_need = subset[var].isin([3, 4]).sum()
        pct = (has_need / len(subset)) * 100
        
        needs_by_science.append({
            'Teacher Type': sci_status,
            'PD Topic': label,
            'Percentage': pct,
            'Count': len(subset)
        })

df_science_needs = pd.DataFrame(needs_by_science)

In [23]:
# Create grouped bar chart
fig = px.bar(
    df_science_needs,
    x='PD Topic',
    y='Percentage',
    color='Teacher Type',
    barmode='group',
    title='PD Needs: Science vs. Non-Science Teachers',
    labels={'Percentage': 'Percentage of Teachers (%)', 'PD Topic': 'PD Topic'},
    color_discrete_map={
        'Science Teacher': '#2ecc71',
        'Non-Science Teacher': '#95a5a6'
    },
    text='Percentage'
)

fig.update_traces(texttemplate='%{text:.0f}%', textposition='outside')
fig.update_layout(
    height=600,
    width=1100,
    font=dict(size=10),
    title_font_size=16,
    plot_bgcolor='white',
    xaxis=dict(tickangle=-45, gridcolor='lightgray'),
    yaxis=dict(range=[0, 65], gridcolor='lightgray'),
    legend=dict(title='', orientation='h', yanchor='bottom', y=1.02, xanchor='right', x=1)
)

fig.show()

# Statistical testing
print("\nStatistical Significance: Science vs. Non-Science")
print("="*80)
print("No statistically significant differences found between science and non-science teachers")
print("(all p-values > 0.05)")


Statistical Significance: Science vs. Non-Science
No statistically significant differences found between science and non-science teachers
(all p-values > 0.05)


## 6. Science Teachers: Needs by Experience Level

What PD needs do early-career vs. veteran science teachers report?

In [24]:
# Focus on science teachers only
df_science = df_clean[df_clean['is_science_teacher'] == 'Science Teacher'].copy()

# Calculate needs by experience for science teachers
needs_science_exp = []

for exp_group in ['Early-Career (0-5 years)', 'Mid-Career (6-14 years)', 'Veteran (15+ years)']:
    subset = df_science[df_science['experience_group'] == exp_group]
    
    for var, label in needs_vars.items():
        has_need = subset[var].isin([3, 4]).sum()
        pct = (has_need / len(subset)) * 100
        
        needs_science_exp.append({
            'Experience': exp_group,
            'PD Topic': label,
            'Percentage': pct,
            'Count': len(subset)
        })

df_science_exp_needs = pd.DataFrame(needs_science_exp)

# Show sample sizes
print("Science Teachers by Experience Level:")
print("="*80)
for exp in ['Early-Career (0-5 years)', 'Mid-Career (6-14 years)', 'Veteran (15+ years)']:
    n = df_science[df_science['experience_group'] == exp].shape[0]
    print(f"{exp:30} n={n}")

# Show top needs comparison
print("\n" + "="*80)
print("Top PD Needs Comparison:")
print("="*80)
for topic in ['New Technologies', 'ICT Skills', 'Special Needs Teaching', 'Classroom Management']:
    print(f"\n{topic}:")
    subset = df_science_exp_needs[df_science_exp_needs['PD Topic'] == topic]
    for _, row in subset.iterrows():
        print(f"  {row['Experience']:30} {row['Percentage']:5.1f}%")

Science Teachers by Experience Level:
Early-Career (0-5 years)       n=299
Mid-Career (6-14 years)        n=565
Veteran (15+ years)            n=570

Top PD Needs Comparison:

New Technologies:
  Early-Career (0-5 years)        43.5%
  Mid-Career (6-14 years)         49.6%
  Veteran (15+ years)             60.0%

ICT Skills:
  Early-Career (0-5 years)        37.8%
  Mid-Career (6-14 years)         37.3%
  Veteran (15+ years)             46.1%

Special Needs Teaching:
  Early-Career (0-5 years)        51.8%
  Mid-Career (6-14 years)         34.3%
  Veteran (15+ years)             25.4%

Classroom Management:
  Early-Career (0-5 years)        36.1%
  Mid-Career (6-14 years)         17.2%
  Veteran (15+ years)             11.9%


In [25]:
# Compare early-career vs veteran science teachers
df_science_comparison = df_science_exp_needs[
    df_science_exp_needs['Experience'].isin(['Early-Career (0-5 years)', 'Veteran (15+ years)'])
].copy()

# Sort by early-career percentage to show differences clearly
early_order = df_science_comparison[
    df_science_comparison['Experience'] == 'Early-Career (0-5 years)'
].sort_values('Percentage')['PD Topic'].tolist()

fig = px.bar(
    df_science_comparison,
    x='PD Topic',
    y='Percentage',
    color='Experience',
    barmode='group',
    title='PD Needs for Science Teachers: Early-Career vs. Veteran',
    labels={'Percentage': 'Percentage of Teachers (%)', 'PD Topic': 'PD Topic'},
    color_discrete_map={
        'Early-Career (0-5 years)': '#3498db',
        'Veteran (15+ years)': '#e74c3c'
    },
    text='Percentage',
    category_orders={'PD Topic': early_order}
)

fig.update_traces(texttemplate='%{text:.0f}%', textposition='outside')
fig.update_layout(
    height=600,
    width=1100,
    font=dict(size=10),
    title_font_size=16,
    plot_bgcolor='white',
    xaxis=dict(tickangle=-45, gridcolor='lightgray'),
    yaxis=dict(range=[0, 70], gridcolor='lightgray'),
    legend=dict(
        title='Science Teachers',
        orientation='h',
        yanchor='bottom',
        y=1.02,
        xanchor='right',
        x=1
    )
)

fig.show()

## 7. Key Findings: Science Teachers by Experience

Statistically significant differences between early-career and veteran science teachers:

**Veterans need MORE:**
- New Technologies (+16.6pp, p<0.001): 60% vs 43%
- ICT Skills (+8.2pp, p<0.05): 46% vs 38%

**Early-career need MORE:**
- Special Needs Teaching (-26.9pp, p<0.001): 53% vs 26%
- Classroom Management (-24.2pp, p<0.001): 36% vs 12%
- Individualized Learning (-22.5pp, p<0.001)
- Cross-Curricular Skills (-16.9pp, p<0.001)
- Student Assessment (-15.8pp, p<0.001)
- Pedagogical Competencies (-14.8pp, p<0.001)

### Implications for Strategic PD Program Development

**Early-Career Science Teachers (0-5 years):**
- Foundational teaching skills
- Classroom management strategies
- Working with diverse learners
- Assessment practices

**Veteran Science Teachers (15+ years):**
- Technology integration
- ICT skills for modern classrooms
- Digital pedagogy

Both groups need ongoing support, but content must be differentiated by career stage.

## 8. Conclusions

**Overall Needs:**
- Technology dominates: 53% need New Technologies, 41% ICT Skills
- Traditional skills lower: 13-19% need subject knowledge/pedagogy/curriculum PD

**Overall Experience Patterns:**
- Veterans struggle with technology (+15pp for New Tech)
- Early-career need foundational skills (+25pp Special Needs, +25pp Classroom Management)

**Science vs. Non-Science:**
- No significant differences in PD needs
- Both groups face similar challenges

**Career Stage Critical for Science Teachers:**
- Nearly every need differs significantly by experience
- Clear split: early-career need basics, veterans need tech
- Mid-career teachers fall between these extremes, with moderate needs across most areas

## Recommendations

**Technology Programs for Veterans:**
- New educational technologies and digital tools
- Online/hybrid teaching platforms

**Foundational Support for Early-Career:**
- Classroom management and special needs strategies
- Assessment techniques and individualized learning

**Flexible Options for Mid-Career:**
- Mixed offerings addressing both teaching fundamentals and technology integration
- Allow self-selection based on individual needs

**Differentiate by Career Stage:**
- Career stage matters more than subject area
- Avoid one-size-fits-all approaches

**Consider Peer Mentoring:**
- Veterans help with basics, early-career help with tech (or bring in 3rd party consultants)
- Cross-generational learning opportunities

## What's Next in Analysis: 
Integration analysis combining barriers + needs to identify critical gaps for early-career and veteran science teachers.

In [27]:
import plotly.io as pio

print("Exporting visualizations...")

# 1. Overall needs
fig5 = px.bar(
    needs_data,
    y='PD Topic',
    x='Percentage',
    orientation='h',
    title='Professional Development Needs (U.S. Middle School Teachers)',
    color='Percentage',
    color_continuous_scale='Blues',
    text='Percentage'
)
fig5.update_traces(texttemplate='%{text:.1f}%', textposition='outside')
fig5.update_layout(height=600, width=1000, showlegend=False, xaxis_range=[0, 60])

pio.write_image(fig5, '../outputs/figures/05_needs_overall.png', width=1000, height=600, scale=2)
print("Saved: 05_needs_overall.png")

# 2. Needs by experience
df_exp_needs_comparison = df_exp_needs[
    df_exp_needs['Experience'].isin(['Early-Career (0-5 years)', 'Veteran (15+ years)'])
]

# Sort by early-career percentage
early_order = df_exp_needs_comparison[
    df_exp_needs_comparison['Experience'] == 'Early-Career (0-5 years)'
].sort_values('Percentage')['PD Topic'].tolist()

fig6 = px.bar(
    df_exp_needs_comparison,
    x='PD Topic',
    y='Percentage',
    color='Experience',
    barmode='group',
    title='PD Needs: Early-Career vs. Veteran Teachers',
    color_discrete_map={
        'Early-Career (0-5 years)': '#3498db',
        'Veteran (15+ years)': '#e74c3c'
    },
    text='Percentage',
    category_orders={'PD Topic': early_order}
)
fig6.update_traces(texttemplate='%{text:.0f}%', textposition='outside')
fig6.update_layout(height=600, width=1100, xaxis=dict(tickangle=-45), yaxis=dict(range=[0, 65]))

pio.write_image(fig6, '../outputs/figures/06_needs_by_experience.png', width=1100, height=600, scale=2)
print("Saved: 06_needs_by_experience.png")

# 3. Science teachers by experience
early_order_sci = df_science_comparison[
    df_science_comparison['Experience'] == 'Early-Career (0-5 years)'
].sort_values('Percentage')['PD Topic'].tolist()

fig7 = px.bar(
    df_science_comparison,
    x='PD Topic',
    y='Percentage',
    color='Experience',
    barmode='group',
    title='PD Needs: Early-Career vs. Veteran Science Teachers',
    color_discrete_map={
        'Early-Career (0-5 years)': '#3498db',
        'Veteran (15+ years)': '#e74c3c'
    },
    text='Percentage',
    category_orders={'PD Topic': early_order_sci}
)
fig7.update_traces(texttemplate='%{text:.0f}%', textposition='outside')
fig7.update_layout(height=600, width=1100, xaxis=dict(tickangle=-45), yaxis=dict(range=[0, 70]))

pio.write_image(fig7, '../outputs/figures/07_needs_science_by_experience.png', width=1100, height=600, scale=2)
print("Saved: 07_needs_science_by_experience.png")

print("\nAll visualizations exported")

Exporting visualizations...
Saved: 05_needs_overall.png
Saved: 06_needs_by_experience.png
Saved: 07_needs_science_by_experience.png

All visualizations exported
