# Performance-Based vs Traditional Contracting Outcomes

This notebook addresses point 5 in `docs/ML/analisi ml da fare.md` by comparing security (NAICS 561612) contracts that are performance-based with traditional solicitations.

## Analysis goals
- Contrast value, duration, competition, and modification patterns for performance-based vs traditional awards.
- Highlight agencies and contract structures where performance-based acquisition is concentrated.
- Estimate effect sizes (Cohen's d), run propensity score matching, and quantify the treatment effect on current total value.
- Fit a regression with interaction terms (`performance_based × type_of_contract_pricing` and `performance_based × extent_competed`) to test whether the premium survives controls.

In [1]:
import sys
from pathlib import Path

import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import textwrap
from IPython.display import Markdown, display

project_root = None
for candidate in [Path.cwd(), *Path.cwd().parents]:
    if (candidate / 'scripts').exists():
        project_root = candidate
        break
if project_root is None:
    raise RuntimeError("Could not locate the project root containing the 'scripts' package.")
if str(project_root) not in sys.path:
    sys.path.insert(0, str(project_root))

from scripts.performance_outcomes import (
    prepare_performance_outcomes_dataset,
    summarize_core_metrics,
    compute_agency_performance_share,
    compute_pricing_mix,
    compute_cohens_d,
    run_value_regression,
    propensity_score_match,
)

px.defaults.template = 'plotly_white'
px.defaults.color_discrete_sequence = px.colors.qualitative.Set2

In [2]:
contracts_df = prepare_performance_outcomes_dataset()
print(f"Rows: {contracts_df.shape[0]:,} | Columns: {contracts_df.shape[1]}")
contracts_df.head()

Rows: 48,241 | Columns: 35


| | solicitation_procedures | federal_action_obligation | base_and_exercised_options_value | base_and_all_options_value | current_total_value_of_award | potential_total_value_of_award | total_outlayed_amount_for_overall_award | period_of_performance_start_date | period_of_performance_current_end_date | period_of_performance_potential_end_date | ... | annualized_base_all | annualized_current_total | annualized_potential_total | is_performance_based | award_key | max_modification_number | action_records | duration_years | log_current_value | log_base_all_options_value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | SUBJECT TO MULTIPLE AWARD FAIR OPPORTUNITY | -0.01 | -0.01 | -0.01 | 7188244.82 | 7188244.82 | | 2018-09-01 | 2019-09-30 | 2020-03-31 00:00:00 | ... | -0.00927 | 6663722.0 | 6663722.0 | False | 005::05GA0A18F0050 | 0.0 | 8 | 1.078713 | 15.787958 | 0.0 |
| 1 | SIMPLIFIED ACQUISITION | 10941.6 | 10941.6 | 10941.6 | 10941.6 | 10941.6 | | 2018-08-17 | 2018-09-21 | 2018-09-21 00:00:00 | ... | 114183.411429 | 114183.4 | 114183.4 | False | 005::05GA0A18K0068 | 0.0 | 1 | 0.095825 | 9.300419 | 9.300419 |
| 2 | SIMPLIFIED ACQUISITION | 10941.6 | 10941.6 | 10941.6 | 10941.6 | 10941.6 | | 2018-09-24 | 2018-09-24 | 2018-09-24 00:00:00 | ... | | | | False | 005::05GA0A18K0089 | 0.0 | 1 | | 9.300419 | 9.300419 |
| 3 | SIMPLIFIED ACQUISITION | 8220.83 | 8220.83 | 8220.83 | 8220.83 | 8220.83 | | 2018-11-05 | 2018-11-16 | 2018-11-16 00:00:00 | ... | 272968.923409 | 272968.9 | 272968.9 | True | 005::05GA0A19K0013 | 0.0 | 1 | 0.030116 | 9.014548 | 9.014548 |
| 4 | SIMPLIFIED ACQUISITION | 10941.6 | 10941.6 | 10941.6 | 10941.6 | 10941.6 | | 2018-12-06 | 2018-12-31 | 2018-12-31 00:00:00 | ... | 159856.776 | 159856.8 | 159856.8 | False | 005::05GA0A19K0024 | 0.0 | 1 | 0.068446 | 9.300419 | 9.300419 |


### Data quality snapshot
The dataset keeps only the latest action per award (based on modification number + action date) to avoid double counting.
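The deduplication rule can be sketched in pandas as follows. This is a minimal illustration, not the code inside `prepare_performance_outcomes_dataset` (which is not shown here); the column names `modification_number` and `action_date` and the toy rows are assumptions, while `award_key` and `current_total_value_of_award` appear in the dataset preview.

```python
import pandas as pd


def keep_latest_action(actions: pd.DataFrame) -> pd.DataFrame:
    """Keep one row per award: the action with the highest modification
    number, using the action date to break ties."""
    ordered = actions.sort_values(
        ["award_key", "modification_number", "action_date"]
    )
    # After sorting, the last row of each award_key group is the latest action.
    return ordered.drop_duplicates(subset="award_key", keep="last").reset_index(drop=True)


# Toy data (hypothetical): award A has two actions, award B has one.
actions = pd.DataFrame(
    {
        "award_key": ["A", "A", "B"],
        "modification_number": [0, 1, 0],
        "action_date": pd.to_datetime(["2018-01-01", "2018-06-01", "2018-03-01"]),
        "current_total_value_of_award": [100.0, 150.0, 80.0],
    }
)
latest = keep_latest_action(actions)
```

Summing `current_total_value_of_award` over `actions` would count award A twice; summing over `latest` counts each award once.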

In [3]:
numeric_cols = [
    'current_total_value_of_award',
    'duration_years',
    'max_modification_number',
    'number_of_offers_received',
]
contracts_df[numeric_cols].describe(percentiles=[0.25, 0.5, 0.75]).T

| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| current_total_value_of_award | 28434.0 | 2143250.0 | 20995650.0 | 0.0 | 2934.7025 | 17500.0 | 225806.93 | 1619219000.0 |
| duration_years | 31659.0 | 0.9952134 | 0.9850698 | 0.002738 | 0.50924 | 0.996578 | 0.996578 | 14.16838 |
| max_modification_number | 48241.0 | 32.80363 | 4574.396 | 0.0 | 0.0 | 0.0 | 0.0 | 980178.0 |
| number_of_offers_received | 42101.0 | 9.887105 | 76.43477 | 0.0 | 1.0 | 1.0 | 3.0 | 999.0 |


In [4]:
summary_df = summarize_core_metrics(contracts_df)
summary_df['group'] = summary_df['is_performance_based'].map({True: 'Performance-based', False: 'Traditional'})

summary_display = (
    summary_df[[
        'group',
        'awards',
        'award_share',
        'median_current_value',
        'median_duration_years',
        'mean_modifications',
        'median_offers',
    ]]
    .assign(
        awards=lambda d: d['awards'].astype(int),
        award_share_pct=lambda d: d['award_share'] * 100,
        median_award_value_m=lambda d: d['median_current_value'] / 1e6,
    )
    .rename(
        columns={
            'median_duration_years': 'Median duration (years)',
            'mean_modifications': 'Average modifications',
            'median_offers': 'Median offers',
        }
    )
)[[
    'group',
    'awards',
    'award_share_pct',
    'median_award_value_m',
    'Median duration (years)',
    'Average modifications',
    'Median offers',
]]

summary_display = summary_display.rename(
    columns={
        'group': 'Contract flavor',
        'awards': 'Awards',
        'award_share_pct': 'Award share (%)',
        'median_award_value_m': 'Median award value ($M)',
    }
)

summary_display.round({
    'Award share (%)': 1,
    'Median award value ($M)': 2,
    'Median duration (years)': 2,
    'Average modifications': 2,
    'Median offers': 1,
})

| | Contract flavor | Awards | Award share (%) | Median award value ($M) | Median duration (years) | Average modifications | Median offers |
|---|---|---|---|---|---|---|---|
| 0 | Traditional | 39917 | 82.7 | 0.02 | 1.0 | 10.19 | 1.0 |
| 1 | Performance-based | 8324 | 17.3 | 0.09 | 1.0 | 141.26 | 1.0 |


## Comparative contract outcomes

In [5]:
value_summary = summary_df.assign(
    group=summary_df['group'],
    median_value_m=summary_df['median_current_value'] / 1e6,
)[['group', 'median_value_m', 'median_duration_years', 'mean_modifications']]

plot_data = value_summary.melt(
    id_vars='group',
    value_vars=['median_value_m', 'median_duration_years', 'mean_modifications'],
    var_name='metric',
    value_name='value',
)

metric_labels = {
    'median_value_m': 'Median current award value ($M)',
    'median_duration_years': 'Median duration (years)',
    'mean_modifications': 'Average modification count',
}
plot_data['metric_label'] = plot_data['metric'].map(metric_labels)

fig = px.bar(
    plot_data,
    x='metric_label',
    y='value',
    color='group',
    barmode='group',
    text_auto='.2f',
    title='Core outcome metrics: performance-based vs traditional awards',
    labels={'group': '', 'metric_label': '', 'value': 'Value'},
)
fig.update_layout(legend_title='')
fig.show()

In [6]:
value_distrib = contracts_df.assign(
    group=lambda d: np.where(d['is_performance_based'], 'Performance-based', 'Traditional'),
    value_m=lambda d: d['current_total_value_of_award'].clip(lower=1) / 1e6,
)

fig = px.box(
    value_distrib,
    x='group',
    y='value_m',
    color='group',
    points=False,
    title='Distribution of current total award values (log scale)',
    labels={'group': '', 'value_m': 'Current total value ($M)'},
)
fig.update_layout(showlegend=False)
fig.update_yaxes(type='log')
fig.show()

In [7]:
duration_df = contracts_df.dropna(subset=['duration_years']).assign(
    group=lambda d: np.where(d['is_performance_based'], 'Performance-based', 'Traditional')
)
fig = px.histogram(
    duration_df,
    x='duration_years',
    color='group',
    nbins=35,
    opacity=0.65,
    barmode='overlay',
    histnorm='percent',
    title='Contract duration distribution',
    labels={'duration_years': 'Performance duration (years)', 'percent': 'Share of awards (%)', 'group': ''},
)
fig.show()

## Agency adoption patterns

In [8]:
agency_share_df = compute_agency_performance_share(contracts_df, min_awards=200, top_n=12)
agency_plot_df = agency_share_df.sort_values('performance_share')
agency_plot_df['performance_pct'] = agency_plot_df['performance_share'] * 100

fig = px.bar(
    agency_plot_df,
    x='performance_pct',
    y='awarding_agency_name',
    orientation='h',
    color='performance_pct',
    color_continuous_scale=px.colors.sequential.Darkmint,
    text_auto='.1f',
    title='Agencies leaning heavily on performance-based contracts',
    labels={'performance_pct': 'Performance-based share (%)', 'awarding_agency_name': ''},
)
fig.update_layout(coloraxis_colorbar_title='Share (%)')
fig.show()

## Contract pricing mix

In [9]:
pricing_mix_df = compute_pricing_mix(contracts_df)
pricing_plot_df = pricing_mix_df.copy()
pricing_plot_df['group'] = pricing_plot_df['is_performance_based'].map({True: 'Performance-based', False: 'Traditional'})
pricing_plot_df['share_pct'] = pricing_plot_df['share_within_pricing'] * 100

# Wrap long pricing method names for readability
def wrap_label(text, width=30):
    return textwrap.fill(text, width=width)

pricing_plot_df['pricing_label'] = pricing_plot_df['type_of_contract_pricing'].apply(wrap_label)

fig = px.bar(
    pricing_plot_df,
    y='pricing_label',
    x='share_pct',
    color='group',
    barmode='stack',
    text_auto='.1f',
    orientation='h',
    title='Share of performance-based awards within each pricing method',
    labels={'pricing_label': '', 'share_pct': 'Share within method (%)', 'group': ''},
)
fig.update_layout(height=600, yaxis_autorange="reversed", margin=dict(l=250))
fig.show()

## Effect sizes and propensity score matching
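For orientation, Cohen's d is the difference in group means divided by the pooled standard deviation. The sketch below uses that standard definition. The internals of `compute_cohens_d` are not shown in this notebook, so treating it as equivalent is an assumption, and the function name `cohens_d` and the sample values are illustrative only.

```python
import numpy as np


def cohens_d(treated, control) -> float:
    """Standardized mean difference using the pooled sample standard deviation."""
    treated = np.asarray(treated, dtype=float)
    control = np.asarray(control, dtype=float)
    # Drop missing values so NaNs do not propagate into the means.
    treated = treated[~np.isnan(treated)]
    control = control[~np.isnan(control)]

    n1, n2 = len(treated), len(control)
    pooled_var = (
        (n1 - 1) * treated.var(ddof=1) + (n2 - 1) * control.var(ddof=1)
    ) / (n1 + n2 - 2)
    return float((treated.mean() - control.mean()) / np.sqrt(pooled_var))


# Illustrative values: means 4 and 3, both sample variances equal to 4.
d_example = cohens_d([2, 4, 6], [1, 3, 5])
```

By the usual rule of thumb, values near 0.2 are read as small effects and values near 0.5 as medium. Because d relies on means and standard deviations, it is sensitive to heavy-tailed variables such as award value.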

In [10]:
value_effect = compute_cohens_d(contracts_df, 'current_total_value_of_award')
duration_effect = compute_cohens_d(contracts_df, 'duration_years')

psm_result = propensity_score_match(contracts_df, outcome_col='current_total_value_of_award')

effects_df = pd.DataFrame(
    {
        'metric': [
            "Cohen's d — current total value",
            "Cohen's d — duration (years)",
            'Propensity ATT (USD)',
            'Matched treated mean (USD)',
            'Matched control mean (USD)',
            'Matched coverage',
        ],
        'value': [
            value_effect,
            duration_effect,
            psm_result.att,
            psm_result.treated_mean,
            psm_result.control_mean,
            psm_result.coverage_ratio,
        ],
    }
)
effects_df

| | metric | value |
|---|---|---|
| 0 | Cohen's d — current total value | 0.280243 |
| 1 | Cohen's d — duration (years) | 0.1865274 |
| 2 | Propensity ATT (USD) | 2315889.0 |
| 3 | Matched treated mean (USD) | 5752147.0 |
| 4 | Matched control mean (USD) | 3436258.0 |
| 5 | Matched coverage | 0.3050072 |
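
To make the ATT and coverage figures easier to interpret, here is a minimal sketch of one common matching scheme: 1-to-1 nearest-neighbour matching with replacement on a precomputed propensity score, with a caliper. It is not the implementation of `propensity_score_match`, whose internals are not shown. The caliper value, the function name `nearest_neighbor_att`, the toy data, and the definition of coverage as the share of treated units that find a match are all assumptions.

```python
import numpy as np


def nearest_neighbor_att(scores, treated, outcome, caliper=0.05):
    """Return (ATT, coverage) from nearest-neighbour matching on a propensity score."""
    scores = np.asarray(scores, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    outcome = np.asarray(outcome, dtype=float)

    t_idx = np.flatnonzero(treated)
    c_idx = np.flatnonzero(~treated)
    if len(t_idx) == 0 or len(c_idx) == 0:
        return float("nan"), 0.0

    # Absolute score distance between every treated and every control unit.
    dist = np.abs(scores[t_idx][:, None] - scores[c_idx][None, :])
    nearest = dist.argmin(axis=1)
    # Discard treated units whose closest control lies outside the caliper.
    within = dist[np.arange(len(t_idx)), nearest] <= caliper

    diffs = outcome[t_idx[within]] - outcome[c_idx[nearest[within]]]
    att = float(diffs.mean()) if len(diffs) else float("nan")
    coverage = float(within.sum() / len(t_idx))
    return att, coverage


# Toy data: the third treated unit (score 0.5) has no control within the caliper.
att, coverage = nearest_neighbor_att(
    scores=[0.20, 0.80, 0.50, 0.21, 0.79, 0.10],
    treated=[1, 1, 1, 0, 0, 0],
    outcome=[10.0, 20.0, 30.0, 7.0, 15.0, 1.0],
)
```

Under this definition, a low coverage value means the ATT describes only the treated awards that have a comparable control, not the full treated group.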


In [11]:
if not psm_result.matches.empty:
    psm_plot_df = psm_result.matches.copy()
    psm_plot_df['difference_m'] = psm_plot_df['difference'] / 1e6
    diff_fig = px.histogram(
        psm_plot_df,
        x='difference_m',
        nbins=50,
        title='Distribution of matched value deltas (performance − traditional)',
        labels={'difference_m': 'Current total value delta ($M)'},
        color_discrete_sequence=['#6a51a3'],
    )
    diff_fig.update_layout(bargap=0.05)
    diff_fig.show()
else:
    print('Propensity score matching did not return any pairs.')

## Regression with interaction effects
Controls: agency fixed effects, pricing type, extent competed, duration, competition, and base value.
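
As a reminder of how an interaction term enters the model, the sketch below fits ordinary least squares with NumPy on simulated, noiseless data. It does not reproduce `run_value_regression`, whose specification is not shown; the variable names and coefficient values are made up for illustration.

```python
import numpy as np

# Simulated indicators covering all four treatment/competition combinations.
performance_based = np.tile([0.0, 0.0, 1.0, 1.0], 50)
competed = np.tile([0.0, 1.0, 0.0, 1.0], 50)

# Hypothetical data-generating process for log award value (no noise).
log_value = (
    9.0
    + 0.5 * performance_based
    + 0.3 * competed
    + 0.2 * performance_based * competed
)

# Design matrix: intercept, two main effects, and their product.
X = np.column_stack(
    [
        np.ones_like(performance_based),
        performance_based,
        competed,
        performance_based * competed,
    ]
)
coefs, *_ = np.linalg.lstsq(X, log_value, rcond=None)
intercept, b_pb, b_comp, b_interaction = coefs
```

With an interaction in the model, `b_pb` is the performance-based effect only when `competed` is 0. The effect for competed awards is `b_pb + b_interaction`, so the main-effect coefficient should not be read in isolation.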

In [12]:
value_model = run_value_regression(contracts_df)

coef_table = pd.DataFrame(
    {
        'coef': value_model.params,
        'std_err': value_model.bse,
        'p_value': value_model.pvalues,
    }
)
focus_rows = (
    coef_table.loc[coef_table.index.str.contains('is_performance_based')]
    .sort_values('p_value')
)
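# A bare expression renders only when it is the last statement in a cell, so display it explicitly.
display(focus_rows.head(10))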
print(f"Adjusted R^2: {value_model.rsquared_adj:.3f}")

Adjusted R^2: 0.316



divide by zero encountered in divide



In [13]:
pb_summary = value_summary.set_index('group')
pb_median = pb_summary.loc['Performance-based', 'median_value_m']
trad_median = pb_summary.loc['Traditional', 'median_value_m']
pb_duration = pb_summary.loc['Performance-based', 'median_duration_years']
trad_duration = pb_summary.loc['Traditional', 'median_duration_years']
pb_mods = pb_summary.loc['Performance-based', 'mean_modifications']
trad_mods = pb_summary.loc['Traditional', 'mean_modifications']

top_agencies = agency_share_df.sort_values('performance_share', ascending=False).head(3)
top_agency_lines = ', '.join(
    f"{row.awarding_agency_name} ({row.performance_share * 100:.0f}%)"
    for _, row in top_agencies.iterrows()
)
att_m = psm_result.att / 1e6
coverage_pct = psm_result.coverage_ratio * 100

takeaways = [
    "**Key takeaways**",
    "",
    "- Performance-based awards show a median value of ${:.2f}M vs ${:.2f}M for traditional contracts; median duration is essentially identical ({:.2f} vs {:.2f} years).".format(
        pb_median, trad_median, pb_duration, trad_duration
    ),
    "- The average modification number is far higher for performance-based awards ({:.2f} vs {:.2f}), but the mean is driven by extreme `max_modification_number` values (maximum 980,178), so it should be read with caution; the duration histogram shows both models clustering below five years.".format(
        pb_mods, trad_mods
    ),
    "- Agencies leaning most on performance-based deals: {}.".format(top_agency_lines),
    "- Propensity score matching indicates an average uplift of ${:.2f}M per award with {:.1f}% coverage; Cohen's d on value ({:.2f}) is small but points in the same direction, and the duration effect is smaller still ({:.2f}).".format(
        att_m, coverage_pct, value_effect, duration_effect
    ),
    "- Regression results show the performance-based indicator (and several interaction terms) remains significant even after controlling for pricing method, competition, duration, base value, and agency fixed effects.",
]

display(Markdown("\n".join(takeaways)))


**Key takeaways**

- Performance-based awards show a median value of $0.09M vs $0.02M for traditional contracts; median duration is essentially identical (1.00 vs 1.00 years).
- The average modification number is far higher for performance-based awards (141.26 vs 10.19), but the mean is driven by extreme `max_modification_number` values (maximum 980,178), so it should be read with caution; the duration histogram shows both models clustering below five years.
- Agencies leaning most on performance-based deals: National Aeronautics and Space Administration (83%), General Services Administration (41%), Department of Agriculture (36%).
- Propensity score matching indicates an average uplift of $2.32M per award with 30.5% coverage; Cohen's d on value (0.28) is small but points in the same direction, and the duration effect is smaller still (0.19).
- Regression results show the performance-based indicator (and several interaction terms) remains significant even after controlling for pricing method, competition, duration, base value, and agency fixed effects.