# Methodology Draft: California Film Tax Credit Employment Effects

## Research Question and Hypotheses

**Primary Research Question**: Did California's film tax credit expansions in 2015 (AB 1839) and 2020 (AB 2021) generate sustained employment and wage gains in the motion picture industry, or do observed effects reflect statistical artifacts of firm reclassification?

**Hypotheses**:

**H1 (Employment Effect)**: California's 2015 tax credit expansion increased motion picture industry employment relative to control states that did not expand programs during this period.

**H2 (Wage Effect)**: The tax credit expansion increased average wages in California's motion picture industry, potentially through labor demand effects or skill composition changes.

**H3 (Migration Validation)**: If H1 is supported by QCEW data, we should observe corresponding increases in migration of film workers to California in ACS data. Divergence between QCEW employment gains and ACS migration patterns would suggest reclassification rather than genuine job creation.

**H4 (Political Timing)**: Tax credit expansions align with electoral cycles, suggesting political motivations may influence program design independent of economic evidence.

---

## Overview of Empirical Strategy

This research employs a **multi-method approach** to evaluate California's film tax credit expansions:

1. **Primary Analysis**: Difference-in-Differences (DiD) estimation using Bureau of Labor Statistics QCEW data (2009-2022)
2. **Robustness Check**: Synthetic Control Method following Rickman & Wang (2020)
3. **Validation Analysis**: American Community Survey migration data to distinguish genuine job creation from statistical reclassification
4. **Descriptive Supplement**: Political timing analysis examining expansion dates relative to electoral cycles

This multi-pronged approach addresses the fundamental challenge in film tax credit research: **distinguishing real economic effects from measurement artifacts**.

---


## Part 1: Data Sources and Variable Construction

### 1.1 Primary Employment Data: BLS QCEW

**Source**: Bureau of Labor Statistics Quarterly Census of Employment and Wages (QCEW)

**Industry Classification**: NAICS 512110 (Motion Picture and Video Production)
- This follows Thom (2018) and Rickman & Wang (2020) for comparability
- Includes production companies, studios, and independent producers
- Excludes exhibition (theaters) and distribution

**Time Period**: 2009 Q1 - 2022 Q4 (56 quarters)
- Pre-treatment period: 2009 Q1 - 2015 Q1 (25 quarters before AB 1839 took effect)
- Treatment period 1: 2015 Q2 - 2020 Q2 (21 quarters, AB 1839 expansion)
- Treatment period 2: 2020 Q3 - 2022 Q4 (10 quarters, AB 2021 expansion)

**Key Variables**:
- **Employment**: Average of monthly employment levels (month1_emplvl, month2_emplvl, month3_emplvl)
- **Wages**: Average weekly wage (avg_wkly_wage)
- **Establishments**: Number of reporting establishments (qtrly_estabs_count)

**Treatment and Control States**:
- **Treatment**: California (expanded tax credit in 2015 from $100M to $330M annually)
- **Control candidates**: States with stable film incentive programs 2015-2022
  - New York: Established program, no major expansions
  - Texas: Limited incentive program, stable over period
  - Florida: No major expansions during treatment period
  - Illinois: Stable program post-2015

**Data Quality Notes**:
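- QCEW covers more than 95% of U.S. jobs, but only those covered by unemployment insurance; self-employed workers and independent contractors, who are common in film production, are excluded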
- State-level reporting based on firm's reported location
- **Critical limitation**: Firms may reclassify employees across states for tax purposes without actual worker relocation (addressed by ACS validation)

### 1.2 Migration Validation Data: American Community Survey

**Source**: American Community Survey (ACS) 1-Year Estimates

**Purpose**: Distinguish genuine worker relocation from administrative reclassification

**Variables**:
- State-to-state migration flows by occupation
- Occupation codes: SOC 27-2012 (Producers/Directors), 27-4031 (Camera Operators), 27-4032 (Film Editors), and related motion picture occupations
- California in-migration counts by source state and occupation (2010-2022)

**Analysis Strategy**:
1. Compare pre-2015 vs post-2015 California in-migration rates for film occupations
2. Test whether QCEW employment gains correspond to actual worker migration
3. Identify source states (e.g., workers leaving Georgia/Louisiana for California?)
4. Placebo test: Compare to other professional occupation migration patterns

**Expected Patterns**:
- **Genuine job creation**: QCEW employment ↑ AND ACS in-migration ↑
- **Reclassification artifact**: QCEW employment ↑ BUT ACS in-migration unchanged

### 1.3 Political Timing Data

**Sources**:
- California Secretary of State: Gubernatorial election dates (2010, 2014, 2018, 2022)
- California Legislative Analyst's Office: Budget documents and fiscal analysis
- LexisNexis State Capital: Legislative records for AB 1839 (2014) and AB 2021 (2020)

**Variables**:
- Policy enactment dates
- Months to/from nearest election
- Electoral cycle phase (election year vs. off-year)

**Purpose**: Descriptive analysis examining whether expansion timing aligns with electoral incentives (following Owens & Rennhoff 2024)

### 1.4 Control Variables

**State-level economic controls**:
- Gross State Product growth rate (BEA)
- Unemployment rate (BLS)
- Population (Census Bureau)
- Related industry employment trends (broader arts/entertainment sector)

**Purpose**: Control for general economic conditions that might confound treatment effects

---


## Part 2: Primary Empirical Method - Difference-in-Differences

### 2.1 Baseline DiD Specification

Following Thom (2018), I estimate a two-way fixed effects model:

$$
Y_{st} = \alpha + \beta_1 (CA_s \times Post2015_{t}) + \beta_2 (CA_s \times Post2020_{t}) + \gamma_s + \delta_t + X_{st}\Gamma + \varepsilon_{st}
$$

Where:
- $Y_{st}$ = outcome variable (log employment or log average wage) for state $s$ in quarter $t$
- $CA_s$ = indicator equal to 1 if state $s$ is California
- $Post2015_t$ = indicator for quarters from 2015 Q2 onward (when AB 1839 took effect)
- $Post2020_t$ = indicator for quarters from 2020 Q3 onward (when AB 2021 took effect)
- $\gamma_s$ = state fixed effects (control for time-invariant state characteristics)
- $\delta_t$ = time fixed effects (control for common shocks affecting all states)
- $X_{st}$ = vector of time-varying controls (unemployment rate, GSP growth)
- $\varepsilon_{st}$ = error term (clustered at state level)

**Parameters of Interest**:
- $\beta_1$ = Average treatment effect of 2015 expansion on California relative to control states
- $\beta_2$ = Incremental effect of 2020 expansion (on top of 2015 effect)

**Interpretation**: 
- If $\beta_1 > 0$ (and statistically significant), California's employment/wages grew faster than control states after 2015 expansion
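- Using log outcomes: $\beta_1 \times 100$ approximates the percent change in the outcome; the exact figure is $100 \times (e^{\beta_1} - 1)$, and the two diverge as $|\beta_1|$ grows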
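The gap between the log-point approximation and the exact percent change can be checked directly. The sketch below uses hypothetical coefficient values, not estimates from this study, and the helper name `log_coef_to_percent` is introduced here for illustration.

```python
import math

def log_coef_to_percent(beta: float) -> float:
    """Exact percent change implied by a coefficient on a log outcome."""
    return (math.exp(beta) - 1.0) * 100.0

# Hypothetical coefficients, for illustration only
for beta in (0.02, 0.10, 0.30):
    approx = beta * 100.0
    exact = log_coef_to_percent(beta)
    print(f"beta={beta:.2f}: approx {approx:.1f}%, exact {exact:.2f}%")
```

For small coefficients the two figures are nearly identical; the exact conversion matters mainly when the estimated effect is large.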

### 2.2 Event Study Specification (Parallel Trends Test)

To assess the parallel trends assumption, I estimate an event study model:

$$
Y_{st} = \alpha + \sum_{k \neq -1} \beta_k (CA_s \times Quarter_{k,t}) + \gamma_s + \delta_t + X_{st}\Gamma + \varepsilon_{st}
$$

Where:
- $CA_s$ = indicator equal to 1 if state $s$ is California
- $Quarter_{k,t}$ = indicator equal to 1 when quarter $t$ is $k$ quarters from treatment (k = -12, -11, ..., -1, 0, 1, ..., 20)
- $k = -1$ is omitted (2015 Q1, the reference quarter immediately before treatment)
- $k = 0$ is 2015 Q2 (treatment begins)

**Parallel Trends Test**:
- **Null hypothesis**: $\beta_k = 0$ for all $k < 0$ (pre-treatment coefficients)
- If pre-treatment coefficients are statistically zero, this supports parallel trends
- If pre-treatment coefficients show trends, this suggests violation of identifying assumption

**Dynamic Treatment Effects**:
- Post-treatment coefficients ($\beta_k$ for $k \geq 0$) show how effects evolve over time
- Allows testing whether effects are immediate, gradual, or fade out
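A minimal sketch of how the event-time indicators could be built from `year` and `qtr`. The function names are hypothetical, and binning quarters outside the [-12, 20] window into the endpoint bins is an assumption made here; the draft does not say how out-of-window quarters are handled.

```python
TREATMENT = (2015, 2)    # k = 0, as defined in the text
K_MIN, K_MAX = -12, 20   # event window from the text

def event_time(year: int, qtr: int, treatment=TREATMENT) -> int:
    """Quarters elapsed since treatment (negative before treatment)."""
    return (year - treatment[0]) * 4 + (qtr - treatment[1])

def event_dummies(year: int, qtr: int, is_california: bool) -> dict:
    """Return {k: 0 or 1} for each k in the window except the omitted k = -1.

    Quarters outside the window are binned into the nearest endpoint
    (an assumption, not something the draft specifies).
    """
    k = max(K_MIN, min(K_MAX, event_time(year, qtr)))
    return {j: int(is_california and j == k)
            for j in range(K_MIN, K_MAX + 1) if j != -1}

# 2016 Q2 is four quarters after treatment
print(event_time(2016, 2))
print(event_dummies(2016, 2, True)[4])
```

Each dictionary maps to one row of interaction terms; control-state rows and the reference quarter are all zeros.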

### 2.3 Control State Selection Criteria

Control states must satisfy:

1. **No major policy changes 2015-2022**: States that didn't expand/contract film incentives during treatment period
2. **Parallel pre-trends**: Similar employment trajectory to California during 2009-2014
3. **Economic comparability**: Similar unemployment rates, economic growth patterns
4. **Industry presence**: Non-trivial motion picture employment (avoid states with <100 employees)

**Candidate Controls**:
- **New York**: Large production center, established tax credit program (stable post-2015)
- **Texas**: Major production hub, limited incentives, stable over period
- **Florida**: Moderate production activity, no major changes
- **Illinois**: Established program, no expansions during treatment

**Sensitivity Analysis**: Test robustness to different control group compositions
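One simple form of this sensitivity check is a leave-one-out exercise: re-compute the DiD estimate with each control state dropped in turn. The sketch below uses a plain 2x2 difference in means on hypothetical log-employment values; the state labels, numbers, and function names are placeholders, and the full analysis would use the regression specification instead.

```python
from statistics import mean

def did_2x2(treated, control):
    """(post - pre) for the treated unit minus (post - pre) for the control group."""
    return ((mean(treated["post"]) - mean(treated["pre"]))
            - (mean(control["post"]) - mean(control["pre"])))

def leave_one_out(treated, controls):
    """Re-estimate the 2x2 DiD with each control state dropped in turn."""
    estimates = {}
    for dropped in controls:
        kept = [s for s in controls if s != dropped]
        pooled = {
            period: [v for s in kept for v in controls[s][period]]
            for period in ("pre", "post")
        }
        estimates[dropped] = did_2x2(treated, pooled)
    return estimates

# Hypothetical log-employment values, for illustration only
california = {"pre": [10.0, 10.0], "post": [10.2, 10.2]}
controls = {
    "State A": {"pre": [8.0, 8.0], "post": [8.1, 8.1]},
    "State B": {"pre": [7.0, 7.0], "post": [7.0, 7.0]},
    "State C": {"pre": [9.0, 9.0], "post": [9.2, 9.2]},
}
for dropped, est in leave_one_out(california, controls).items():
    print(f"Dropping {dropped}: DiD = {est:+.3f}")
```

If the estimate changes sign or magnitude sharply when one state is removed, the result depends on that state rather than on the control group as a whole.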

### 2.4 Standard Errors and Inference

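**Clustering**: Standard errors clustered at state level
- Accounts for serial correlation within states over time
- With so few clusters, cluster-robust standard errors tend to be biased downward, so conventional t-tests over-reject and should not be treated as conservative

**Wild Cluster Bootstrap**: Given few clusters (5-10 states), use the wild cluster bootstrap for more reliable inference (Cameron et al. 2008). Because California is the only treated cluster, even bootstrap p-values should be read cautiously and cross-checked against the permutation inference in Part 4.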
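The bootstrap procedure can be sketched as follows, assuming a restricted (null-imposed) wild cluster bootstrap with Rademacher weights. With so few clusters, all 2^G sign patterns can be enumerated instead of sampled. The data are simulated, the function names are hypothetical, and this is an illustration of the technique, not the study's final inference code.

```python
import itertools
import numpy as np

def cluster_t(y, X, clusters, j):
    """t-statistic for coefficient j using a cluster-robust sandwich variance."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(clusters):
        score = X[clusters == g].T @ resid[clusters == g]
        meat += np.outer(score, score)
    V = XtX_inv @ meat @ XtX_inv
    return beta[j] / np.sqrt(V[j, j])

def wild_cluster_bootstrap_p(y, X, clusters, j):
    """Two-sided p-value for H0: beta_j = 0, with the null imposed."""
    t_obs = cluster_t(y, X, clusters, j)
    X_r = np.delete(X, j, axis=1)                 # restricted model drops column j
    beta_r, *_ = np.linalg.lstsq(X_r, y, rcond=None)
    fitted_r = X_r @ beta_r
    resid_r = y - fitted_r
    ids = np.unique(clusters)
    pos = np.searchsorted(ids, clusters)          # cluster position for each row
    t_stars = []
    for signs in itertools.product([-1.0, 1.0], repeat=len(ids)):
        y_star = fitted_r + resid_r * np.array(signs)[pos]
        t_stars.append(cluster_t(y_star, X, clusters, j))
    # Relative tolerance guards against floating-point ties
    return float(np.mean(np.abs(t_stars) >= abs(t_obs) * (1 - 1e-9)))

# Simulated panel: 6 states, 8 quarters, state 0 treated from quarter 4
rng = np.random.default_rng(0)
G, T = 6, 8
state = np.repeat(np.arange(G), T)
quarter = np.tile(np.arange(T), G)
treated = (state == 0).astype(float)
post = (quarter >= 4).astype(float)
X = np.column_stack([np.ones(G * T), treated, post, treated * post])
y = 0.5 * treated + 0.1 * post + 0.2 * treated * post + rng.normal(0, 0.1, G * T)

p_value = wild_cluster_bootstrap_p(y, X, state, j=3)
print(f"Wild cluster bootstrap p-value: {p_value:.3f}")
```

Enumeration makes the result deterministic, but it also shows the limit of the method: with G clusters the p-value can take only multiples of 1/2^G.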

---


## Part 3: Preliminary Regression Analysis (2013-2018 Data)

Below I replicate and extend the exploratory analysis with formal regression specifications.


In [None]:
# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
import warnings
warnings.filterwarnings('ignore')

# Set plotting style
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")
plt.rcParams['figure.figsize'] = (14, 8)
plt.rcParams['font.size'] = 11

print("Libraries loaded successfully")
print("Note: Full regression analysis requires statsmodels. Checking availability...")

try:
    import statsmodels.api as sm
    from statsmodels.formula.api import ols
    print("✓ Statsmodels available")
except ImportError:
    print("⚠ Statsmodels not available. Install with: pip install statsmodels")


In [None]:
# Load data from Exploration BLS directory
import os

# Navigate to Exploration BLS folder
data_dir = 'Exploration BLS'

data_files = [
    '2013.q1-q4 512110 Motion picture and video production.csv',
    '2014.q1-q4 512110 Motion picture and video production.csv', 
    '2015.q1-q4 512110 Motion picture and video production.csv',
    '2016.q1-q4 512110 NAICS 512110 Motion picture and video production.csv',
    '2017.q1-q4 512110 NAICS 512110 Motion picture and video production.csv',
    '2018.q1-q4 512110 NAICS 512110 Motion picture and video production.csv'
]

# Read and combine all data
all_data = []
for file in data_files:
    filepath = os.path.join(data_dir, file)
    df = pd.read_csv(filepath)
    all_data.append(df)

combined_data = pd.concat(all_data, ignore_index=True)
print(f"Loaded {len(combined_data)} records from {len(data_files)} files")
print(f"Years covered: {sorted(combined_data['year'].unique())}")
print(f"\nColumns available: {list(combined_data.columns[:20])}...")  # Show first 20 columns


In [None]:
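# Extract and prepare state-level panel data
# Focus on statewide aggregates for California, New York, Georgia, Texas
# Caution: Georgia changed its film incentives during this period (see Section 4.2),
# so it is shown for comparison and is not a clean control state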

states_of_interest = ['California', 'New York', 'Georgia', 'Texas']

state_panel = []

for state in states_of_interest:
    state_data = combined_data[
        (combined_data['area_title'] == f'{state} -- Statewide') & 
        (combined_data['own_title'] == 'Private')
    ].copy()
    
    # Calculate quarterly employment (average of three monthly values)
    state_data['employment'] = (
        state_data['month1_emplvl'] + 
        state_data['month2_emplvl'] + 
        state_data['month3_emplvl']
    ) / 3
    
    # Keep relevant variables
    state_data['state'] = state
    state_data['wage'] = state_data['avg_wkly_wage']
    state_data['time_period'] = state_data['year'].astype(str) + 'Q' + state_data['qtr'].astype(str)
    
    # Create time index
    state_data['quarter_index'] = (state_data['year'] - 2013) * 4 + state_data['qtr']
    
    state_panel.append(state_data)

panel_df = pd.concat(state_panel, ignore_index=True)
panel_df = panel_df.sort_values(['state', 'year', 'qtr']).reset_index(drop=True)

print("Panel dataset created:")
print(f"  States: {panel_df['state'].unique()}")
print(f"  Time periods: {panel_df['time_period'].nunique()} quarters ({panel_df['year'].min()}-{panel_df['year'].max()})")
print(f"  Total observations: {len(panel_df)}")
print(f"\nSample data:")
print(panel_df[['state', 'year', 'qtr', 'employment', 'wage']].head(12))


In [None]:
# Create treatment indicators
panel_df['california'] = (panel_df['state'] == 'California').astype(int)
panel_df['post2015'] = ((panel_df['year'] > 2015) | 
                         ((panel_df['year'] == 2015) & (panel_df['qtr'] >= 2))).astype(int)

# DiD interaction term
panel_df['ca_post2015'] = panel_df['california'] * panel_df['post2015']

# Create log variables for regression (employment)
panel_df['log_employment'] = np.log(panel_df['employment'])
panel_df['log_wage'] = np.log(panel_df['wage'])

print("Treatment variables created:")
print(f"  California observations: {panel_df['california'].sum()}")
print(f"  Post-2015 observations: {panel_df['post2015'].sum()}")
print(f"  Treated observations (CA x Post2015): {panel_df['ca_post2015'].sum()}")

# Verify treatment timing
print("\nTreatment timing check (California):")
ca_data = panel_df[panel_df['california'] == 1][['time_period', 'post2015']].head(12)
print(ca_data)


In [None]:
# Descriptive statistics by treatment status
print("="*80)
print("DESCRIPTIVE STATISTICS BY TREATMENT STATUS")
print("="*80)

# California pre vs post
ca_data = panel_df[panel_df['california'] == 1]
ca_pre = ca_data[ca_data['post2015'] == 0]
ca_post = ca_data[ca_data['post2015'] == 1]

print("\nCALIFORNIA:")
print(f"  Pre-treatment (2013Q1-2015Q1):")
print(f"    Mean employment: {ca_pre['employment'].mean():,.0f}")
print(f"    Mean wage: ${ca_pre['wage'].mean():.2f}/week")
print(f"  Post-2015 (2015Q2-2018):")
print(f"    Mean employment: {ca_post['employment'].mean():,.0f}")
print(f"    Mean wage: ${ca_post['wage'].mean():.2f}/week")
print(f"  Change:")
print(f"    Employment: {((ca_post['employment'].mean() / ca_pre['employment'].mean() - 1) * 100):+.1f}%")
print(f"    Wage: {((ca_post['wage'].mean() / ca_pre['wage'].mean() - 1) * 100):+.1f}%")

# Control states (using New York as primary control)
ny_data = panel_df[panel_df['state'] == 'New York']
ny_pre = ny_data[ny_data['post2015'] == 0]
ny_post = ny_data[ny_data['post2015'] == 1]

print("\nNEW YORK (Control):")
print(f"  Pre-treatment (2013Q1-2015Q1):")
print(f"    Mean employment: {ny_pre['employment'].mean():,.0f}")
print(f"    Mean wage: ${ny_pre['wage'].mean():.2f}/week")
print(f"  Post-2015 (2015Q2-2018):")
print(f"    Mean employment: {ny_post['employment'].mean():,.0f}")
print(f"    Mean wage: ${ny_post['wage'].mean():.2f}/week")
print(f"  Change:")
print(f"    Employment: {((ny_post['employment'].mean() / ny_pre['employment'].mean() - 1) * 100):+.1f}%")
print(f"    Wage: {((ny_post['wage'].mean() / ny_pre['wage'].mean() - 1) * 100):+.1f}%")

# Simple difference-in-differences calculation
ca_diff_pct = (ca_post['employment'].mean() / ca_pre['employment'].mean() - 1) * 100
ny_diff_pct = (ny_post['employment'].mean() / ny_pre['employment'].mean() - 1) * 100
did_pct = ca_diff_pct - ny_diff_pct

print("\n" + "="*80)
print("SIMPLE DiD ESTIMATE (Employment):")
print(f"  California change: {ca_diff_pct:+.1f}%")
print(f"  New York change: {ny_diff_pct:+.1f}%")
print(f"  DiD effect: {did_pct:+.1f} percentage points")
print("="*80)


In [None]:
# Visualization: Parallel trends assessment
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 6))

# Employment trends
for state in ['California', 'New York', 'Georgia', 'Texas']:
    state_data = panel_df[panel_df['state'] == state]
    ax1.plot(state_data['time_period'], state_data['employment'], 
             marker='o', label=state, linewidth=2.5, markersize=5)

# Add treatment line
# Quarter labels are plotted as categories at positions 0, 1, 2, ..., so the
# position of '2015Q2' in the ordered label list is its x-coordinate
quarter_labels = list(panel_df[panel_df['california'] == 1]['time_period'].unique())
treatment_idx = quarter_labels.index('2015Q2')
ax1.axvline(x=treatment_idx, color='red', linestyle='--', linewidth=2, 
            label='CA Tax Credit (2015Q2)', alpha=0.7)

ax1.set_title('Employment Trends: Parallel Trends Assessment', fontsize=14, fontweight='bold')
ax1.set_xlabel('Quarter', fontsize=12)
ax1.set_ylabel('Quarterly Employment', fontsize=12)
ax1.legend(loc='best')
ax1.grid(True, alpha=0.3)
ax1.tick_params(axis='x', rotation=45)

# Wage trends
for state in ['California', 'New York', 'Georgia', 'Texas']:
    state_data = panel_df[panel_df['state'] == state]
    ax2.plot(state_data['time_period'], state_data['wage'], 
             marker='s', label=state, linewidth=2.5, markersize=5)

ax2.axvline(x=treatment_idx, color='red', linestyle='--', linewidth=2, alpha=0.7)
ax2.set_title('Average Weekly Wage Trends', fontsize=14, fontweight='bold')
ax2.set_xlabel('Quarter', fontsize=12)
ax2.set_ylabel('Average Weekly Wage ($)', fontsize=12)
ax2.legend(loc='best')
ax2.grid(True, alpha=0.3)
ax2.tick_params(axis='x', rotation=45)

plt.tight_layout()
plt.savefig('parallel_trends_assessment.png', dpi=300, bbox_inches='tight')
plt.show()

print("Visual parallel trends check: Compare pre-2015Q2 trajectories")
print("If lines are roughly parallel before treatment, parallel trends assumption is plausible")


In [None]:
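# Regression Analysis: Two-Way Fixed Effects DiD
# Note: This requires statsmodels. If not available, shows manual calculation
# Standard errors below are default (non-clustered) OLS errors; the clustering
# and bootstrap procedures in Section 2.4 are not applied in this preliminary pass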

try:
    import statsmodels.api as sm
    from statsmodels.formula.api import ols
    
    print("="*80)
    print("REGRESSION ANALYSIS: DIFFERENCE-IN-DIFFERENCES")
    print("="*80)
    
    # Model 1: Simple DiD (no fixed effects) - for comparison
    print("\n[Model 1] Basic DiD (No Fixed Effects)")
    print("-" * 80)
    model1 = ols('log_employment ~ california + post2015 + ca_post2015', data=panel_df).fit()
    print(model1.summary().tables[1])
    print(f"\nInterpretation: ca_post2015 coefficient = {model1.params['ca_post2015']:.4f}")
    print(f"  This implies approximately {model1.params['ca_post2015']*100:.2f}% change in employment")
    print(f"  Standard error: {model1.bse['ca_post2015']:.4f}")
    print(f"  t-statistic: {model1.tvalues['ca_post2015']:.2f}")
    print(f"  p-value: {model1.pvalues['ca_post2015']:.4f}")
    
    # Model 2: DiD with time fixed effects
    print("\n\n[Model 2] DiD with Time Fixed Effects")
    print("-" * 80)
    model2 = ols('log_employment ~ california + ca_post2015 + C(time_period)', data=panel_df).fit()
    print(f"DiD coefficient (ca_post2015): {model2.params['ca_post2015']:.4f}")
    print(f"  Standard error: {model2.bse['ca_post2015']:.4f}")
    print(f"  t-statistic: {model2.tvalues['ca_post2015']:.2f}")
    print(f"  p-value: {model2.pvalues['ca_post2015']:.4f}")
    print(f"  Approximate % effect: {model2.params['ca_post2015']*100:.2f}%")
    print(f"  R-squared: {model2.rsquared:.4f}")
    
    # Model 3: Full two-way fixed effects (state + time)
    print("\n\n[Model 3] Two-Way Fixed Effects (State + Time FE)")
    print("-" * 80)
    model3 = ols('log_employment ~ ca_post2015 + C(state) + C(time_period)', data=panel_df).fit()
    print(f"DiD coefficient (ca_post2015): {model3.params['ca_post2015']:.4f}")
    print(f"  Standard error: {model3.bse['ca_post2015']:.4f}")
    print(f"  t-statistic: {model3.tvalues['ca_post2015']:.2f}")
    print(f"  p-value: {model3.pvalues['ca_post2015']:.4f}")
    print(f"  Approximate % effect: {model3.params['ca_post2015']*100:.2f}%")
    print(f"  R-squared: {model3.rsquared:.4f}")
    
    print("\n" + "="*80)
    print("INTERPRETATION:")
    print("="*80)
    print("Under the parallel trends assumption, the DiD coefficient estimates the effect")
    print("of California's 2015 tax credit expansion on employment, controlling for:")
    print("  - Time-invariant state differences (state fixed effects)")
    print("  - Common time trends affecting all states (time fixed effects)")
    print("\nPositive coefficient = CA employment grew faster than control states post-2015")
    print("Negative coefficient = CA employment grew slower than control states post-2015")
    
    # Store results for later
    did_coef = model3.params['ca_post2015']
    did_se = model3.bse['ca_post2015']
    did_pval = model3.pvalues['ca_post2015']
    
except ImportError:
    print("Statsmodels not available. Please install: pip install statsmodels")
    print("Manual DiD calculation shown above in descriptive statistics.")


## Part 4: Robustness Check - Synthetic Control Method

### 4.1 Rationale

Following Rickman & Wang (2020), I implement synthetic control as a robustness check on DiD estimates. Synthetic control addresses two concerns:

1. **Control group selection**: Rather than choosing control states arbitrarily, synthetic control constructs an optimal weighted combination of donor states that best matches California's pre-treatment characteristics

2. **Parallel trends**: Synthetic control explicitly matches pre-treatment trends, addressing the key DiD assumption

### 4.2 Method Overview

**Synthetic California Construction**:

1. **Donor pool**: States without major film incentive changes 2015-2022
   - Exclude: Georgia, Louisiana, New Mexico (major program changes)
   - Include: New York, Texas, Florida, Illinois, Pennsylvania, Massachusetts

2. **Matching variables** (2009 Q1 - 2015 Q1):
   - Quarterly employment in motion picture industry
   - Average wages
   - State unemployment rate
   - State GSP growth
   - Number of establishments

3. **Optimization**: Find weights $w_1, w_2, ..., w_J$ that minimize:

$$\sum_{t=1}^{T_0} (Y_{CA,t} - \sum_{j=1}^J w_j Y_{j,t})^2$$
   
Subject to $w_j \geq 0$ and $\sum_{j=1}^J w_j = 1$, where $T_0$ = number of pre-treatment periods and $j$ indexes donor states

4. **Treatment effect**: 

$$\hat{\tau}_t = Y_{CA,t} - \sum_{j=1}^J w_j Y_{j,t}$$
   
For $t > T_0$ (post-treatment period)
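The weight optimization and gap calculation above can be sketched with NumPy alone, assuming the standard synthetic control constraints (non-negative weights that sum to one) and matching on the outcome only. Projected gradient descent is used here as one simple solver; it is not necessarily the solver a dedicated package would use. The data are simulated with known weights so the fit can be checked, and all names are hypothetical.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u * k > css - 1)[0][-1]
    theta = (css[rho] - 1) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def fit_weights(y_pre, donors_pre, n_iter=5000):
    """Minimize ||y_pre - donors_pre @ w||^2 over the simplex."""
    n_donors = donors_pre.shape[1]
    w = np.full(n_donors, 1.0 / n_donors)
    # Step size 1/L, where L is the Lipschitz constant of the gradient
    step = 1.0 / (2.0 * np.linalg.norm(donors_pre, 2) ** 2)
    for _ in range(n_iter):
        grad = 2.0 * donors_pre.T @ (donors_pre @ w - y_pre)
        w = project_to_simplex(w - step * grad)
    return w

# Simulated data: 25 pre-treatment quarters, 8 post-treatment quarters, 3 donors
rng = np.random.default_rng(1)
true_w = np.array([0.6, 0.3, 0.1])
donors_pre = rng.normal(size=(25, 3))
donors_post = rng.normal(size=(8, 3))
y_pre = donors_pre @ true_w
y_post = donors_post @ true_w + 0.5   # built-in effect of 0.5 for checking

w_hat = fit_weights(y_pre, donors_pre)
gaps = y_post - donors_post @ w_hat   # tau_hat_t for each post-treatment quarter
print("Estimated weights:", np.round(w_hat, 3))
print("Mean post-treatment gap:", round(float(gaps.mean()), 3))
```

In the real application the donor matrix would hold observed employment series, and covariates such as unemployment and GSP growth would enter through a weighted predictor matrix, which this sketch omits.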

### 4.3 Statistical Inference

**Permutation Test (Placebo)**:
- Apply synthetic control to each donor state (pretend each was treated)
- Compare California's post-treatment gap to distribution of placebo gaps
- If California's gap is extreme relative to placebos, suggests real effect

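**RMSPE Ratio**:
- Root Mean Squared Prediction Error (RMSPE) measures the gap between California and its synthetic counterpart
- Test statistic: post-treatment RMSPE / pre-treatment RMSPE
- A large ratio relative to the placebo distribution indicates a treatment effect; a ratio near 1 means the post-treatment gap is no larger than pre-treatment fit error. Poor synthetic control fit shows up as a large pre-treatment RMSPE, not as a small ratio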
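A minimal sketch of the RMSPE ratio and the rank-based permutation p-value, using only the standard library. The gap series and placebo ratios are hypothetical, and the function names are introduced here for illustration.

```python
import math

def rmspe(gaps):
    """Root mean squared prediction error of a gap series."""
    return math.sqrt(sum(g * g for g in gaps) / len(gaps))

def rmspe_ratio(pre_gaps, post_gaps):
    """Post-treatment RMSPE divided by pre-treatment RMSPE."""
    return rmspe(post_gaps) / rmspe(pre_gaps)

def permutation_p_value(treated_ratio, placebo_ratios):
    """Share of all units (treated plus placebos) whose ratio is at least
    as large as the treated unit's ratio."""
    all_ratios = [treated_ratio] + list(placebo_ratios)
    return sum(r >= treated_ratio for r in all_ratios) / len(all_ratios)

# Hypothetical gaps (actual minus synthetic), for illustration only
ca_ratio = rmspe_ratio(pre_gaps=[0.01, -0.02, 0.01, 0.00],
                       post_gaps=[0.05, 0.06, 0.04])
placebos = [1.2, 0.8, 2.1, 1.5, 0.9, 1.1]   # six hypothetical donor-state ratios
print(f"California ratio: {ca_ratio:.2f}")
print(f"Permutation p-value: {permutation_p_value(ca_ratio, placebos):.3f}")
```

With six placebo units, the smallest p-value this calculation can return is 1/7, which is why the size of the donor pool limits the strength of the inference.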

### 4.4 Expected Implementation

```python
# Synthetic control implementation (requires a synthetic control package or custom code)
# Steps:
# 1. Construct donor pool (states without policy changes)
# 2. Optimize weights to match California pre-treatment (2009-2015Q1)
# 3. Generate synthetic California for post-treatment period
# 4. Calculate treatment effects (actual CA - synthetic CA)
# 5. Permutation inference (placebo tests on donor states)
# 6. Plot gaps and permutation distribution
```

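**Interpretation**:
- If synthetic control shows similar results to DiD → robust finding
- If results diverge → suggests sensitivity to method/control group selection
- Permutation p-values are bounded below by 1/(J+1): with the six donor states listed above, the smallest attainable value is 1/7 ≈ 0.14, so a 0.05 threshold cannot be met without a larger donor pool. California ranking first among all placebos is the strongest evidence this design can produce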

### 4.5 Comparison to DiD

| Dimension | DiD | Synthetic Control |
|-----------|-----|-------------------|
| Control group | Pre-specified | Data-driven weights |
| Parallel trends | Assumed | Explicitly matched |
| Inference | Regression SE | Permutation tests |
| Transparency | Less transparent | Highly transparent (weights shown) |
| Best for | Multiple treated units | Single treated unit |

**Convergence**: If both methods yield similar estimates, this strengthens causal interpretation

---


## Part 5: Migration Validation Using American Community Survey Data

### 5.1 The Reclassification Problem

**Critical Concern**: QCEW employment increases may reflect **statistical artifacts** rather than genuine job creation:

1. **Cross-state reclassification**: Production companies reclassify employees from other states to California for tax credit eligibility
   - Worker remains in Texas but is reported as California employee
   - QCEW shows CA employment ↑, but no actual worker relocation

2. **NAICS code gaming**: Firms reclassify workers from adjacent industries (advertising, digital media) into NAICS 512110
   - Statistical employment gain without real industry growth

3. **Timing manipulation**: Production schedules shifted to maximize credit, creating temporary spikes

**Why This Matters**: 
- If employment gains are reclassification, there's no real economic benefit to California
- Policy debate hinges on whether credits create jobs or just reallocate reporting
- Previous literature (Thom 2018, Rickman & Wang 2020) cannot distinguish these mechanisms

### 5.2 ACS Migration Data as Validation

**Key Insight**: American Community Survey tracks where people **actually live**, not where firms report them

**Data Structure**:
- ACS asks where each respondent lived 1 year ago, which identifies people who moved from another state
- State-to-state migration flows by occupation
- Motion picture occupations: SOC codes 27-2012, 27-4031, 27-4032, etc.

**Validation Logic**:

| QCEW Result | ACS Result | Interpretation |
|-------------|------------|----------------|
| Employment ↑ | In-migration ↑ | ✓ Genuine job creation |
| Employment ↑ | In-migration unchanged | ⚠ Likely reclassification |
| Employment unchanged | In-migration ↑ | ? Data quality issue |
| Employment ↑ | Out-migration ↑ | ⚠ Remote work / gaming |

### 5.3 Empirical Strategy

**Analysis 1: In-Migration Trends**

Compare California in-migration for film occupations before vs. after 2015:

$$InMigration_{CA,t} = \alpha + \beta \cdot Post2015_t + \gamma \cdot t + \varepsilon_t$$

Where:
- $InMigration_{CA,t}$ = number of film workers moving to California in year $t$
- $Post2015_t$ = indicator for years after 2015
- $t$ = linear time trend

**Hypothesis**: If QCEW shows employment gains, $\beta$ should be positive and significant
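The regression in Analysis 1 can be sketched with ordinary least squares in NumPy. The in-migration counts below are fabricated from known parameters so the fit can be verified; they are not ACS data. With only about 13 annual observations, the real estimate of $\beta$ will be imprecise.

```python
import numpy as np

years = np.arange(2010, 2023)
post2015 = (years > 2015).astype(float)    # "years after 2015", as in the text
trend = (years - years[0]).astype(float)   # linear time trend t

# Fabricated counts: intercept 1000, post-2015 shift 150, trend 20 per year
inmigration = 1000.0 + 150.0 * post2015 + 20.0 * trend

# Design matrix: constant, Post2015 indicator, linear trend
X = np.column_stack([np.ones_like(trend), post2015, trend])
coef, *_ = np.linalg.lstsq(X, inmigration, rcond=None)
alpha_hat, beta_hat, gamma_hat = coef

print(f"alpha = {alpha_hat:.1f}, beta = {beta_hat:.1f}, gamma = {gamma_hat:.1f}")
```

Including the trend term matters: without it, any steady growth in California in-migration would be absorbed into $\beta$ and mistaken for a post-2015 shift.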

**Analysis 2: Source State Patterns**

Identify which states lost film workers to California:

- If Georgia/Louisiana (high tax credit states) → zero-sum competition
- If non-competitor states → genuine industry attraction
- If no clear source states → suggests reclassification not relocation

**Analysis 3: Occupation-Specific Migration**

Examine migration patterns by occupation:

- **Directors/Producers** (SOC 27-2012): High-skill, likely genuine relocation
- **Camera operators** (SOC 27-4031): Specialized, follow productions
- **Editors** (SOC 27-4032): Can be remote, less clear signal
- **General production workers**: Large numbers, more subject to reclassification

If high-skill occupations show migration but general workers don't → suggests genuine production shift but potential worker-level gaming

**Placebo Test**: 
Compare film worker migration to other professional occupations (engineers, accountants):
- If film workers show unique post-2015 increase → consistent with policy effect
- If all professionals show similar trends → general California in-migration, not tax credit effect

### 5.4 Data Limitations

**ACS Constraints**:
1. **Sample size**: Small occupation counts may have high sampling error
2. **Annual data**: Quarterly QCEW vs. annual ACS creates timing mismatch
3. **Occupation coding**: Film workers may be misclassified in ACS
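4. **1-year lag**: "Lived in different state 1 year ago" creates temporal lag
5. **2020 gap**: The Census Bureau did not release standard 2020 ACS 1-year estimates because of pandemic-related data collection problems, so the 2010-2022 series has a break at the start of the second treatment period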

**Interpretation**:
- Absence of migration signal is informative (suggests reclassification)
- Presence of migration confirms genuine relocation but doesn't rule out additional reclassification
- Combined QCEW + ACS provides bounds on true employment effects

### 5.5 Expected Implementation

```python
# ACS migration analysis (pseudo-code)
# 1. Download ACS migration flows (Census API or IPUMS)
# 2. Extract California in-migration by occupation (SOC 27-XXXX)
# 3. Calculate pre-2015 vs post-2015 migration rates
# 4. Compare to QCEW employment growth
# 5. Test whether migration patterns match employment patterns
# 6. Placebo: compare to non-film professional occupations
```

**Novel Contribution**: To my knowledge, this validation approach is **new to the film tax credit literature**: I am not aware of a previous study that uses migration data to validate administrative employment findings.

---


## Part 6: Political Timing Analysis (Descriptive)

### 6.1 Motivation: Why Study Political Timing?

**Owens & Rennhoff (2024) Finding**: Film tax credits are driven by **political incentives** rather than economic evidence
- Legislators vote based on party, gubernatorial alignment, and electoral competitiveness
- Local industry presence and lobbying do NOT predict votes
- Programs persist despite weak economic returns

**Implication for This Study**: 
- Even if employment effects are weak/null, understanding political motivations explains program persistence
- If expansions align with election cycles, this suggests electoral signaling rather than evidence-based policy
- Provides context for interpreting economic findings

### 6.2 California Policy Timeline

**Key Events**:

| Date | Event | Gubernatorial Context |
|------|-------|----------------------|
| 2009 | Original CA Film Tax Credit established ($100M/year) | Schwarzenegger (R), term ending |
| 2010 | Election | Jerry Brown (D) elected |
| September 2014 | **AB 1839 signed** (more than tripling to $330M/year) | Brown (D), end of 2011-2015 term, two months before the November 2014 election |
| July 2015 | AB 1839 takes effect | Brown (D), first year of 2015-2019 term |
| 2018 | Election | Gavin Newsom (D) elected |
| August 2020 | **AB 2021 signed** (extending through 2025) | Newsom (D), mid-term |

### 6.3 Electoral Cycle Hypothesis

**Theory**: Politicians time visible policy actions around elections to maximize credit-claiming

**Predicted Patterns**:
1. **Pre-election expansions**: New programs or expansions in years before elections
   - Allows time for implementation and visible activity
   - Media coverage and ribbon-cutting opportunities

2. **Post-election extensions**: Renewals and extensions after elections (when electoral pressure is lower)
   - Less politically risky
   - Maintains existing coalitions

**California Pattern**:
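- AB 1839 (2014): Signed in **September 2014**, about two months before the November 2014 gubernatorial election, in which Brown's re-election was widely regarded as secure
- AB 2021 (2020): Signed in **August 2020**, mid-term of Newsom's first term (not a gubernatorial election year)

**Preliminary Assessment**: 
- Timing does NOT clearly follow a simple pre-election pattern: AB 1839 was signed in an election year but ahead of a race Brown was expected to win comfortably, and AB 2021 was signed between gubernatorial elections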
- May reflect other political factors:
  - Legislative session timing (budget cycle)
  - Industry lobbying pressure
  - "Runaway production" narrative (high-profile productions leaving CA)
  - Competitive pressure from other states (Georgia, Louisiana expansions)

### 6.4 Empirical Analysis Plan

**Data Collection**:
1. California gubernatorial election dates and results (2006-2022)
2. Film tax credit bill introduction, passage, and signing dates
3. Media coverage (LexisNexis): count of articles mentioning film tax credits before/after elections
4. Legislative roll-call votes on AB 1839 and AB 2021

**Descriptive Statistics**:
- Calculate months between policy enactment and nearest election
- Code whether expansions occur in election years vs. off-years
- Compare to national pattern (do other states show similar timing?)
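The months-to-election measure can be computed with the standard library. The election dates below are the November general election days for 2010-2022. The timeline gives only the signing month, so the first day of the month is used as a placeholder, and the function names are hypothetical.

```python
from datetime import date

# California gubernatorial general elections
ELECTIONS = [date(2010, 11, 2), date(2014, 11, 4),
             date(2018, 11, 6), date(2022, 11, 8)]

def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end (negative if end precedes start)."""
    return (end.year - start.year) * 12 + (end.month - start.month)

def nearest_election(enacted: date):
    """Return (election_date, months_to_election).

    Negative months mean the nearest election had already taken place.
    """
    nearest = min(ELECTIONS, key=lambda e: abs((e - enacted).days))
    return nearest, months_between(enacted, nearest)

# Only the signing month is given in the timeline; day 1 is a placeholder
signings = {"AB 1839": date(2014, 9, 1), "AB 2021": date(2020, 8, 1)}

for bill, signed in signings.items():
    election, months = nearest_election(signed)
    in_election_year = signed.year in {e.year for e in ELECTIONS}
    print(f"{bill}: nearest election {election}, {months:+d} months, "
          f"gubernatorial election year: {in_election_year}")
```

The sign convention separates pre-election signings (positive months) from post-election signings (negative months), which matches the two predicted patterns in Section 6.3.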

**Qualitative Evidence**:
- Review legislative records for stated justifications
  - Economic arguments (job creation, tax revenue)
  - Competitive arguments (other states' programs)
  - Industry pressure (testimonies, lobbying disclosure)
- Governor's signing statements
- Media framing (economic vs. political coverage)

### 6.5 Interpretation Framework

**Scenario 1: Strong Employment Effects + Electoral Timing**
→ Programs may be both economically effective AND politically motivated

**Scenario 2: Strong Employment Effects + No Electoral Timing**
→ Evidence-based policy (economic rationale drives decisions)

**Scenario 3: Weak/Null Employment Effects + Electoral Timing**
→ Political logic dominates (programs persist despite weak evidence)

**Scenario 4: Weak/Null Employment Effects + No Electoral Timing**
→ Inertia or institutional factors (once established, programs continue automatically)

**Expected Result**: Based on Owens & Rennhoff (2024), likely Scenario 3 or 4
- Political incentives explain persistence regardless of effectiveness
- Electoral timing may be less important than legislative logrolling, party discipline, and interest group pressure

### 6.6 Limitations of Political Analysis

**This is a descriptive supplement, not a causal analysis**:
- Cannot prove political motivations (observational data only)
- Small sample size (only 2 major California expansions)
- Political economy is complex (multiple factors interact)

**Purpose**: Provide context for interpreting employment findings
- If employment effects are null, political incentives offer a candidate explanation for why programs persist
- If employment effects are positive, the timing evidence helps indicate whether economic results or political incentives are the more plausible driver of expansion

---


## Part 7: Summary and Next Steps

### 7.1 Research Summary

**Research Question**: Did California's 2015 and 2020 film tax credit expansions increase motion picture employment and wages, or do observed effects reflect reclassification artifacts?

**Empirical Strategy**:
1. **Primary**: Difference-in-differences using QCEW data (2009-2022)
2. **Robustness**: Synthetic control method
3. **Validation**: ACS migration data to distinguish genuine job creation from reclassification
4. **Context**: Political timing descriptive analysis

**Current Progress** (based on 2013-2018 data):
- ✓ Panel dataset constructed
- ✓ DiD specifications estimated
- ✓ Parallel trends visualized
- ✓ Employment and wage effects quantified
- ✗ Full time series (2009-2022) needed
- ✗ Synthetic control not yet implemented
- ✗ ACS migration validation not yet conducted
- ✗ Political timing analysis not yet completed

### 7.2 Immediate Next Steps

**Week 1-2: Complete Data Collection**
1. Download QCEW data for 2009-2012 and 2019-2022
2. Expand panel dataset to full 56 quarters (2009Q1-2022Q4)
3. Collect state-level control variables (unemployment, GDP growth)
4. Document control state selection criteria and verify that control states made no film incentive policy changes during the study period
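
One way to track the data-collection gap is to build the full state-by-quarter skeleton and list the cells not yet observed. The sketch below uses only the standard library; the function names and the control-state codes `"S1"` and `"S2"` are hypothetical placeholders, since the control group is not named in this section.

```python
from itertools import product


def quarter_labels(start_year=2009, end_year=2022):
    """Quarter labels such as '2009Q1', covering every quarter in the range."""
    return [f"{y}Q{q}" for y in range(start_year, end_year + 1) for q in range(1, 5)]


def panel_skeleton(states, quarters):
    """Every (state, quarter) cell a balanced panel should contain."""
    return list(product(states, quarters))


def missing_cells(skeleton, observed):
    """Cells in the skeleton that are absent from the observed data."""
    observed = set(observed)
    return [cell for cell in skeleton if cell not in observed]


if __name__ == "__main__":
    states = ["CA", "S1", "S2"]  # placeholder control-state codes
    full = panel_skeleton(states, quarter_labels())
    # Current progress covers 2013-2018 only.
    have = panel_skeleton(states, quarter_labels(2013, 2018))
    gaps = missing_cells(full, have)
    print(len(quarter_labels()), "quarters;", len(gaps), "cells still to download")
```

With the 2013-2018 panel in hand, each state is missing 32 of the 56 quarters (2009-2012 and 2019-2022), which matches the download list in step 1.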

**Week 3-4: Extend Regression Analysis**
1. Re-estimate DiD models with full time series
2. Generate event study plots (quarterly treatment effects)
3. Test robustness to different control groups
4. Implement wild cluster bootstrap for inference
5. Run placebo tests (spatial, temporal, outcome)
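
The wild cluster bootstrap in step 4 can be sketched as follows, in the spirit of Cameron, Gelbach, & Miller (2008): impose the null on the coefficient of interest, flip restricted residuals by cluster with Rademacher weights, and compare bootstrap t-statistics to the observed one. Everything here is an illustrative assumption rather than the project's specification: the function names, the simulated panel, the plain two-by-two DiD regressors, and the omission of small-sample corrections.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def cluster_t(X, y, clusters, k):
    """t-statistic for coefficient k with a cluster-robust sandwich variance
    (no small-sample correction, for brevity)."""
    beta = ols(X, y)
    resid = y - X @ beta
    bread = np.linalg.inv(X.T @ X)
    meat = np.zeros_like(bread)
    for g in np.unique(clusters):
        score = X[clusters == g].T @ resid[clusters == g]
        meat += np.outer(score, score)
    se = np.sqrt((bread @ meat @ bread)[k, k])
    return beta[k] / se


def wild_cluster_bootstrap_p(X, y, clusters, k, reps=999, seed=0):
    """Two-sided bootstrap p-value for H0: beta_k = 0."""
    rng = np.random.default_rng(seed)
    t_obs = cluster_t(X, y, clusters, k)
    # Restricted model: drop column k to impose the null.
    X_r = np.delete(X, k, axis=1)
    fitted_r = X_r @ ols(X_r, y)
    resid_r = y - fitted_r
    groups, inverse = np.unique(clusters, return_inverse=True)
    exceed = 0
    for _ in range(reps):
        # One Rademacher draw per cluster, shared by all its observations.
        w = rng.choice([-1.0, 1.0], size=len(groups))
        y_star = fitted_r + w[inverse] * resid_r
        if abs(cluster_t(X, y_star, clusters, k)) >= abs(t_obs):
            exceed += 1
    return exceed / reps


def simulate_did(n_states=20, n_periods=10, effect=5.0, seed=1):
    """Hypothetical panel: half the states treated in the second half of the sample."""
    rng = np.random.default_rng(seed)
    state = np.repeat(np.arange(n_states), n_periods)
    period = np.tile(np.arange(n_periods), n_states)
    treated = (state < n_states // 2).astype(float)
    post = (period >= n_periods // 2).astype(float)
    y = effect * treated * post + rng.normal(size=state.size)
    X = np.column_stack([np.ones_like(y), treated, post, treated * post])
    return X, y, state


if __name__ == "__main__":
    X, y, state = simulate_did()
    print("bootstrap p-value:", wild_cluster_bootstrap_p(X, y, state, k=3))
```

The simulated panel treats half the states purely for illustration; the actual design has California as the only treated unit, so the real specification and its inference properties would need to be checked separately.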

**Week 5: Implement Synthetic Control**
1. Select donor pool (states without policy changes)
2. Optimize weights to match California pre-2015
3. Generate treatment effects and confidence intervals
4. Permutation tests for inference
5. Compare results to DiD estimates
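
Steps 2 and 4 can be illustrated with a small NumPy sketch: donor weights are constrained to be non-negative and sum to one, and inference ranks the treated unit's post/pre RMSPE ratio against placebo runs on every other unit. This is a simplified stand-in under stated assumptions: it matches on pre-period outcomes only (no covariates), uses projected gradient descent where a dedicated constrained solver would normally be preferred, and all function names are hypothetical.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u * k > css - 1)[0][-1]
    theta = (css[rho] - 1) / (rho + 1)
    return np.maximum(v - theta, 0.0)


def synth_weights(y_pre, donors_pre, iters=5000):
    """Minimise ||y_pre - donors_pre @ w||^2 over the simplex
    by projected gradient descent."""
    n_donors = donors_pre.shape[1]
    w = np.full(n_donors, 1.0 / n_donors)
    # Step size 1/L, where L = 2 * (largest singular value)^2.
    step = 1.0 / (2.0 * np.linalg.norm(donors_pre, 2) ** 2)
    for _ in range(iters):
        grad = 2.0 * donors_pre.T @ (donors_pre @ w - y_pre)
        w = project_to_simplex(w - step * grad)
    return w


def rmspe(gap):
    """Root mean squared prediction error of a gap series."""
    return float(np.sqrt(np.mean(np.square(gap))))


def placebo_p_value(outcomes, treated_col, n_pre):
    """Permutation p-value: share of units whose post/pre RMSPE ratio is at
    least as large as the treated unit's. `outcomes` is periods x units."""
    ratios = []
    for j in range(outcomes.shape[1]):
        y = outcomes[:, j]
        # Every other unit, including the truly treated one, serves as a donor.
        donors = np.delete(outcomes, j, axis=1)
        w = synth_weights(y[:n_pre], donors[:n_pre])
        gap = y - donors @ w
        ratios.append(rmspe(gap[n_pre:]) / rmspe(gap[:n_pre]))
    ratios = np.array(ratios)
    return float(np.mean(ratios >= ratios[treated_col]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    outcomes = rng.normal(size=(30, 6))   # simulated, standardised outcomes
    outcomes[20:, 0] += 5.0               # hypothetical post-period effect for unit 0
    print("placebo p-value:", placebo_p_value(outcomes, treated_col=0, n_pre=20))
```

Because the p-value is a rank among units, its smallest attainable value is one over the number of units, which is worth keeping in mind when choosing the donor pool in step 1.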

**Week 6: ACS Migration Analysis**
1. Download ACS state-to-state migration flows
2. Extract California in-migration by film occupations
3. Compare pre-2015 vs post-2015 trends
4. Test correspondence with QCEW employment changes
5. Placebo test with other professional occupations
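
The correspondence test in steps 3 and 4 follows the logic of H3: a QCEW employment gain without a matching rise in ACS in-migration points toward reclassification. A deliberately simple sketch of that decision rule is below. The function names, the one-percentage-point `tolerance`, and the raw pre/post mean comparison are all assumptions for illustration (a real test would adjust for trends and sampling error), and the numbers in the demo are placeholders, not data.

```python
from statistics import mean


def pct_change_pre_post(series, cutoff_year=2015):
    """Percent change in the mean of `series` ({year: value}) between
    years before the cutoff and years from the cutoff onward."""
    pre = [v for yr, v in series.items() if yr < cutoff_year]
    post = [v for yr, v in series.items() if yr >= cutoff_year]
    return 100.0 * (mean(post) - mean(pre)) / mean(pre)


def classify_correspondence(qcew_change, acs_change, tolerance=1.0):
    """Compare the two percent changes; changes at or below `tolerance`
    (percentage points, an assumed threshold) are treated as no gain."""
    qcew_up = qcew_change > tolerance
    acs_up = acs_change > tolerance
    if qcew_up and acs_up:
        return "consistent with genuine job creation"
    if qcew_up:
        return "divergence: possible reclassification"
    return "no QCEW employment gain to validate"


if __name__ == "__main__":
    # Placeholder values only, chosen to exercise the divergence branch.
    qcew_employment = {2013: 100.0, 2014: 100.0, 2015: 110.0, 2016: 110.0}
    acs_in_migration = {2013: 50.0, 2014: 50.0, 2015: 50.0, 2016: 50.0}
    q = pct_change_pre_post(qcew_employment)
    a = pct_change_pre_post(acs_in_migration)
    print(classify_correspondence(q, a))
```

Running the same two functions on the placebo occupations in step 5 gives a baseline for how much in-migration moved for workers the credit should not have affected.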

**Week 7: Political Timing**
1. Document timeline of California expansions relative to elections
2. Review legislative records for stated justifications
3. Compare to national patterns of film credit expansions
4. Synthesize descriptive evidence

**Week 8: Integration and Writing**
1. Synthesize findings across methods
2. Draft methodology section (using this notebook as foundation)
3. Create results tables and figures
4. Interpret in context of Thom (2018) and broader literature
5. Discuss policy implications

### 7.3 Expected Contributions

**Regardless of results**, this research makes four contributions:

1. **Temporal Extension**: To our knowledge, the first study to analyze the post-2013 period, capturing the streaming era and larger programs

2. **Methodological Innovation**: To our knowledge, the first to validate administrative employment data (QCEW) against migration data (ACS)

3. **Robustness**: Combining DiD and synthetic control supports more credible causal inference than either method alone

4. **Political Context**: Integrates economic and political economy perspectives

**Key Insight**: By combining QCEW, ACS, and political analysis, this study can distinguish:
- Genuine economic effects from statistical artifacts
- Evidence-based policy from politically-motivated programs
- Industry gains from broader economic development

This addresses the fundamental puzzle in film tax credit research: **Why do these programs persist despite weak empirical support?**

---


## References

- **Bradbury, J. C. (2020)**. Do state movie production incentives promote economic development? *Journal of Regional Science*, 60(5), 882-903.

- **Cameron, A. C., Gelbach, J. B., & Miller, D. L. (2008)**. Bootstrap-based improvements for inference with clustered errors. *The Review of Economics and Statistics*, 90(3), 414-427.

- **Owens, M. F., & Rennhoff, A. D. (2024)**. Political behavior and voting for tax incentives: Evidence from film tax credits. *Public Choice*, 198(1-2), 111-134.

- **Rickman, D. S., & Wang, H. (2020)**. The economics of state film incentives. *Contemporary Economic Policy*, 38(3), 483-499.

- **Thom, M. (2018)**. Lights, camera, but no action? Tax and economic development lessons from state motion picture incentive programs. *American Review of Public Administration*, 48(1), 33-51.

---

*This methodology draft provides the analytical framework for evaluating California's film tax credit expansions. The full paper will include complete results from all analyses outlined above.*
