## Rossi Dataset on Recidivism 

Rossi, P.H., R.A. Berk, and K.J. Lenihan (1980). Money, Work, and Crime: Some Experimental Results. New York: Academic Press.

John Fox, Marilia Sa Carvalho (2012).

This data set is originally from Rossi et al. (1980) and is used as an example in Allison (1995). The data pertain to 432 convicts who were released from Maryland state prisons in the 1970s and followed for one year after release. Half of the released convicts were randomly assigned to an experimental treatment in which they received financial aid; the other half received no aid. See the dataset documentation:

https://vincentarelbundock.github.io/Rdatasets/doc/carData/Rossi.html

The documented data frame has 432 observations on 62 variables, including:

 - `week` week of first arrest after release or censoring; all censored observations are censored at 52 weeks.

 - `arrest` 1 if arrested, 0 if not arrested.

 - `fin` financial aid: no or yes (coded 0 and 1 in the lifelines copy loaded below).

 - `age` in years at time of release.

 - `wexp` full-time work experience before incarceration: no or yes (coded 0 and 1 in the lifelines copy loaded below).
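
The `week` and `arrest` columns together encode right-censoring: a row with `arrest` equal to 0 means no arrest was observed by the end of follow-up. A minimal sketch of that coding, using made-up toy rows rather than the real data (the names `records` and `FOLLOW_UP_WEEKS` are illustrative only):

```python
# Toy (week, arrest) pairs; the values are made up for illustration
# and are not taken from the Rossi data.
records = [(20, 1), (17, 1), (52, 0), (52, 0), (37, 1), (52, 0)]

FOLLOW_UP_WEEKS = 52  # end of the one-year follow-up described above

# Rows with arrest == 0 are right-censored: no arrest was observed.
censored = [(week, arrest) for week, arrest in records if arrest == 0]
# Rows with arrest == 1 carry an observed event time.
events = [(week, arrest) for week, arrest in records if arrest == 1]

# Under the coding described above, every censored row sits at week 52.
all_censored_at_end = all(week == FOLLOW_UP_WEEKS for week, _ in censored)

print(f"events: {len(events)}, censored: {len(censored)}")
print(f"all censored at week {FOLLOW_UP_WEEKS}: {all_censored_at_end}")
```

The same check could be run on the real columns once the data frame is loaded.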


In [None]:
import lifelines.datasets as datasets
import pandas as pd
from lifelines import KaplanMeierFitter
import matplotlib.pyplot as plt

# Load the Rossi dataset
rossi = datasets.load_rossi()

# Display the first few rows
print(rossi.head())

In [None]:
T = rossi["week"]     # Time to event (weeks)
E = rossi["arrest"]   # Event indicator (1 if arrested, 0 if not)

In [None]:
# Initialize Kaplan-Meier fitter
kmf = KaplanMeierFitter()

# Fit the model
kmf.fit(T, event_observed=E)
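
For intuition about what a Kaplan-Meier fit estimates, the product-limit formula can be written out by hand. The following is a minimal standard-library sketch on made-up toy data, not the lifelines implementation; the function name `kaplan_meier` and the toy values are assumptions for illustration.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimate)."""
    # Number of observed events (arrests) at each time.
    deaths = Counter(t for t, e in zip(durations, events) if e == 1)
    # Everyone leaves the risk set at their own time, event or censored.
    exits = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Toy data (made up): weeks to arrest, with 0 marking a censored row.
toy_weeks = [5, 8, 8, 12, 52, 52]
toy_arrest = [1, 1, 1, 1, 0, 0]

curve = kaplan_meier(toy_weeks, toy_arrest)
for t, s in curve:
    print(f"week {t:>2}: S(t) = {s:.3f}")
```

Because the toy rows are censored only at the end of follow-up, the last value of the curve equals the fraction never arrested (2 of 6).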


In [None]:
# Plot the Kaplan-Meier survival curve
plt.figure(figsize=(8,6))
kmf.plot_survival_function()
plt.title("Kaplan-Meier Survival Curve for Recidivism (Rossi Dataset)")
plt.xlabel("Weeks After Release")
plt.ylabel("Survival Probability")
plt.grid(True)
plt.show()
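
The shaded band that `plot_survival_function` draws by default is a confidence interval around the estimate. One common way to quantify that uncertainty is Greenwood's formula for the standard error. The sketch below applies the plain formula to made-up toy data; lifelines may use a transformed variant, so these numbers need not match its band exactly. The function name `km_with_greenwood` is hypothetical.

```python
import math
from collections import Counter


def km_with_greenwood(durations, events):
    """Return [(t, S(t), se)] using the plain Greenwood variance formula."""
    deaths = Counter(t for t, e in zip(durations, events) if e == 1)
    exits = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    var_sum = 0.0  # running sum of d / (n * (n - d)) over event times
    rows = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            survival *= 1.0 - d / at_risk
            if at_risk > d:  # skip the term when the curve drops to zero
                var_sum += d / (at_risk * (at_risk - d))
            se = survival * math.sqrt(var_sum)
            rows.append((t, survival, se))
        at_risk -= exits[t]
    return rows


# Toy data (made up): weeks to arrest, with 0 marking a censored row.
toy_weeks = [5, 8, 8, 12, 52, 52]
toy_arrest = [1, 1, 1, 1, 0, 0]

rows = km_with_greenwood(toy_weeks, toy_arrest)
for t, s, se in rows:
    print(f"week {t:>2}: S(t) = {s:.3f}, se = {se:.3f}")
```

When no row is censored before time t, as here, the Greenwood standard error reduces to the familiar binomial one, sqrt(S(1 - S) / n).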

In [None]:
# Define groups by work experience (wexp)
wexp_0 = rossi[rossi["wexp"] == 0]  # No work experience
wexp_1 = rossi[rossi["wexp"] == 1]  # Has work experience

# Kaplan-Meier Fitters for each group
kmf_0 = KaplanMeierFitter()
kmf_1 = KaplanMeierFitter()

# Fit survival models
kmf_0.fit(durations=wexp_0["week"], event_observed=wexp_0["arrest"], label="No Work Experience")
kmf_1.fit(durations=wexp_1["week"], event_observed=wexp_1["arrest"], label="Has Work Experience")

# Plot survival curves
fig, ax = plt.subplots(figsize=(8, 6))
kmf_0.plot_survival_function(ax=ax)
kmf_1.plot_survival_function(ax=ax)

# Labels and title
plt.title("Kaplan-Meier Survival Curve by Work Experience (wexp)")
plt.xlabel("Weeks After Release")
plt.ylabel("Survival Probability")
plt.legend()
plt.grid(True)
plt.show()

In [None]:
# Define groups by financial aid (fin)
fin_0 = rossi[rossi["fin"] == 0]  # No financial aid
fin_1 = rossi[rossi["fin"] == 1]  # Received financial aid

# Kaplan-Meier Fitters for each group
kmf_0 = KaplanMeierFitter()
kmf_1 = KaplanMeierFitter()

# Fit survival models
kmf_0.fit(durations=fin_0["week"], event_observed=fin_0["arrest"], label="No Financial Aid")
kmf_1.fit(durations=fin_1["week"], event_observed=fin_1["arrest"], label="Received Financial Aid")

# Plot survival curves
fig, ax = plt.subplots(figsize=(8, 6))
kmf_0.plot_survival_function(ax=ax)
kmf_1.plot_survival_function(ax=ax)

# Labels and title
plt.title("Kaplan-Meier Survival Curve by Financial Aid (fin)")
plt.xlabel("Weeks After Release")
plt.ylabel("Survival Probability")
plt.legend()
plt.grid(True)
plt.show()
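
Comparing two curves by eye does not say whether the gap between them is larger than chance would produce. A log-rank test is the usual companion to a grouped Kaplan-Meier plot. The sketch below is a hand-rolled two-group version on made-up toy data, using only the standard library; the name `logrank_sketch` and the toy values are assumptions for illustration.

```python
import math
from collections import Counter


def logrank_sketch(durations_a, events_a, durations_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    deaths_a = Counter(t for t, e in zip(durations_a, events_a) if e == 1)
    deaths_b = Counter(t for t, e in zip(durations_b, events_b) if e == 1)
    exits_a, exits_b = Counter(durations_a), Counter(durations_b)
    n_a, n_b = len(durations_a), len(durations_b)

    observed_minus_expected = 0.0
    variance = 0.0
    for t in sorted(set(exits_a) | set(exits_b)):
        d_a, d_b = deaths_a.get(t, 0), deaths_b.get(t, 0)
        d, n = d_a + d_b, n_a + n_b
        if d > 0 and n > 1:
            # Expected events in group A if both groups share one hazard.
            observed_minus_expected += d_a - d * n_a / n
            # Hypergeometric variance of the group A event count.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
        n_a -= exits_a.get(t, 0)
        n_b -= exits_b.get(t, 0)

    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Toy groups (made up): group A is arrested earlier than group B.
weeks_a, arrest_a = [2, 4, 6], [1, 1, 1]
weeks_b, arrest_b = [8, 10, 12], [1, 1, 1]

chi2, p_value = logrank_sketch(weeks_a, arrest_a, weeks_b, arrest_b)
print(f"chi-square = {chi2:.3f}, p-value = {p_value:.4f}")
```

On the real data, `logrank_test` from `lifelines.statistics` would be the natural choice, called with the `week` and `arrest` columns of `fin_0` and `fin_1`.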