## Project 1 - Predicting Life Insurance Policy Lapse

Project Idea: Build a model to predict whether a life insurance policy will lapse within a given period based on customer demographics and policy characteristics.

Steps:

1. Collect data on life insurance policies, including customer demographics, policy details, and lapse status.
2. Preprocess the data, handle missing values, and create features.
3. Train a classification model to predict policy lapse.
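
The missing-value part of the preprocessing step can be sketched with a scikit-learn `Pipeline`. This is a minimal sketch, not a recommended treatment: the records below are hypothetical (the `np.nan` entries are invented to stand in for gaps in collected data), and median imputation is just one simple strategy among several.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline

# Hypothetical policy records; np.nan marks values missing at collection time
raw = pd.DataFrame({
    'age': [25, 35, np.nan, 55, 65, 30],
    'income': [30000, np.nan, 70000, 90000, 110000, 45000],
    'policy_term': [10, 15, 20, 25, 30, 10],
    'premium': [200, 300, 400, np.nan, 600, 250],
    'policy_lapse': [0, 1, 0, 1, 0, 1],
})
X_raw = raw.drop(columns='policy_lapse')
y_raw = raw['policy_lapse']

# The imputer learns per-column medians in fit() and reuses them in predict()
lapse_pipeline = Pipeline([
    ('impute', SimpleImputer(strategy='median')),
    ('model', RandomForestClassifier(n_estimators=50, random_state=0)),
])
lapse_pipeline.fit(X_raw, y_raw)

# Medians learned for age, income, policy_term, premium
print(lapse_pipeline.named_steps['impute'].statistics_)
```

Keeping the imputer inside the pipeline means the fill values are learned only from the rows passed to `fit`, so the same object can later be applied to held-out data without leaking information from it.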

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report

# Sample data
data = {
    'age': [25, 35, 45, 55, 65],
    'income': [30000, 50000, 70000, 90000, 110000],
    'policy_term': [10, 15, 20, 25, 30],
    'premium': [200, 300, 400, 500, 600],
    'policy_lapse': [0, 1, 0, 1, 0]
}
df = pd.DataFrame(data)

# Preprocess data
X = df[['age', 'income', 'policy_term', 'premium']]
y = df['policy_lapse']

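# Split first so the scaler never sees the test rows
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize features: fit on the training split only, then apply to both
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)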

# Train model
model = RandomForestClassifier(random_state=42)  # fixed seed for reproducible results
model.fit(X_train, y_train)

# Predict and evaluate
y_pred = model.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, y_pred)}")
print(classification_report(y_test, y_pred))
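
With five rows and `test_size=0.2`, the test split holds a single policy, so the accuracy printed above is either 0 or 1. On a realistically sized dataset, stratified k-fold cross-validation gives a steadier estimate. The sketch below assumes a synthetic dataset from `make_classification` as a stand-in for real policy data; the row count and fold count are arbitrary choices for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for a real policy dataset (200 rows, 4 features)
X_syn, y_syn = make_classification(
    n_samples=200, n_features=4, n_informative=3, n_redundant=1,
    random_state=42,
)

# Stratified folds keep the lapse / no-lapse ratio similar in every fold
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(
    RandomForestClassifier(random_state=42), X_syn, y_syn,
    cv=cv, scoring='accuracy',
)
print(f"Fold accuracies: {scores}")
print(f"Mean accuracy: {scores.mean():.3f}")
```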



## Project 2 - Estimating Life Insurance Premiums

Project Idea: Develop a regression model to estimate the premium of a life insurance policy based on customer demographics and policy details.

Steps:

1. Collect data on life insurance policies, including customer demographics and premium amounts.
2. Preprocess the data, handle missing values, and create features.
3. Train a regression model to predict premiums.
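
The "create features" step might look like the sketch below. The derived columns (`age_at_maturity`, `log_income`, `age_band`) and the bin edges are assumptions chosen for illustration, not actuarial conventions; the base columns reuse the toy values from this project.

```python
import numpy as np
import pandas as pd

applicants = pd.DataFrame({
    'age': [25, 35, 45, 55, 65],
    'income': [30000, 50000, 70000, 90000, 110000],
    'policy_term': [10, 15, 20, 25, 30],
})

features = applicants.copy()
# Age of the policyholder when the term ends
features['age_at_maturity'] = features['age'] + features['policy_term']
# Log-transform income to compress its range
features['log_income'] = np.log(features['income'])
# Coarse age bands; intervals are right-inclusive, e.g. (0, 30]
features['age_band'] = pd.cut(
    features['age'], bins=[0, 30, 50, 120],
    labels=['young', 'middle', 'senior'],
)
print(features)
```

A categorical column such as `age_band` would still need encoding before it can be passed to `LinearRegression`.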

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Sample data
data = {
    'age': [25, 35, 45, 55, 65],
    'income': [30000, 50000, 70000, 90000, 110000],
    'policy_term': [10, 15, 20, 25, 30],
    'premium': [200, 300, 400, 500, 600]
}
df = pd.DataFrame(data)

# Preprocess data
X = df[['age', 'income', 'policy_term']]
y = df['premium']

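# Split first so the scaler never sees the test rows; two test rows are
# the minimum for R^2 to be defined
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=42)

# Standardize features: fit on the training split only, then apply to both
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)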

# Train model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict and evaluate
y_pred = model.predict(X_test)
print(f"Mean Squared Error: {mean_squared_error(y_test, y_pred)}")
print(f"R^2 Score: {r2_score(y_test, y_pred)}")
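
Mean squared error is expressed in squared premium units, which is hard to read against the premiums themselves. Root mean squared error and mean absolute error are in the same units as the target. A small worked sketch follows, using hypothetical actual and predicted premiums (the numbers are invented for the arithmetic, not model output).

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Hypothetical actual and predicted premiums, in the same currency units
actual = np.array([300.0, 500.0, 400.0])
predicted = np.array([310.0, 480.0, 400.0])

# Errors are -10, 20, 0, so MSE = (100 + 400 + 0) / 3
mse = mean_squared_error(actual, predicted)
rmse = np.sqrt(mse)                             # back in premium units
mae = mean_absolute_error(actual, predicted)    # (10 + 20 + 0) / 3

print(f"MSE:  {mse:.2f}")
print(f"RMSE: {rmse:.2f}")
print(f"MAE:  {mae:.2f}")
```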


## Project 3 - Predicting Life Expectancy for Underwriting

Project Idea: Create a model to predict applicants' life expectancy for underwriting purposes, using demographic and health data.

Steps:

1. Collect data on applicants, including demographic and health information.
2. Preprocess the data, handle missing values, and create features.
3. Train a regression model to predict life expectancy.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error, r2_score

# Sample data
data = {
    'age': [30, 40, 50, 60, 70],
    'bmi': [22, 28, 26, 30, 24],
    'smoker': [0, 1, 1, 0, 0],
    'exercise_frequency': [3, 1, 2, 4, 3],
    'life_expectancy': [80, 75, 70, 85, 82]
}
df = pd.DataFrame(data)

# Preprocess data
X = df[['age', 'bmi', 'smoker', 'exercise_frequency']]
y = df['life_expectancy']

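# Split first so the scaler never sees the test rows; two test rows are
# the minimum for R^2 to be defined
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=42)

# Standardize features: fit on the training split only, then apply to both
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)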

# Train model
model = GradientBoostingRegressor(random_state=42)  # fixed seed for reproducible results
model.fit(X_train, y_train)

# Predict and evaluate
y_pred = model.predict(X_test)
print(f"Mean Squared Error: {mean_squared_error(y_test, y_pred)}")
print(f"R^2 Score: {r2_score(y_test, y_pred)}")
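
For underwriting it usually matters which inputs drive a prediction, not only how accurate it is. A fitted `GradientBoostingRegressor` exposes impurity-based `feature_importances_`, which the sketch below ranks. It refits on all five toy rows from the example above, so the ranking is illustrative only and says nothing about real mortality drivers.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# Toy data from the example above
health = pd.DataFrame({
    'age': [30, 40, 50, 60, 70],
    'bmi': [22, 28, 26, 30, 24],
    'smoker': [0, 1, 1, 0, 0],
    'exercise_frequency': [3, 1, 2, 4, 3],
    'life_expectancy': [80, 75, 70, 85, 82],
})
feature_names = ['age', 'bmi', 'smoker', 'exercise_frequency']

gbr = GradientBoostingRegressor(random_state=42)
gbr.fit(health[feature_names], health['life_expectancy'])

# Importances are normalized to sum to 1; higher means more split gain
importances = pd.Series(
    gbr.feature_importances_, index=feature_names
).sort_values(ascending=False)
print(importances)
```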


## Project 4 - Customer Retention Prediction

Project Idea: Build a model to predict the likelihood of customers renewing their life insurance policies.

Steps:

1. Collect data on customers and their policy renewal history.
2. Preprocess the data, handle missing values, and create features.
3. Train a classification model to predict customer retention.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, classification_report

# Sample data
data = {
    'age': [25, 35, 45, 55, 65],
    'income': [30000, 50000, 70000, 90000, 110000],
    'policy_term': [10, 15, 20, 25, 30],
    'premium': [200, 300, 400, 500, 600],
    'renewed': [1, 0, 1, 1, 0]
}
df = pd.DataFrame(data)

# Preprocess data
X = df[['age', 'income', 'policy_term', 'premium']]
y = df['renewed']

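# Split first so the scaler never sees the test rows
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize features: fit on the training split only, then apply to both
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)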

# Train model
model = GradientBoostingClassifier(random_state=42)  # fixed seed for reproducible results
model.fit(X_train, y_train)

# Predict and evaluate
y_pred = model.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, y_pred)}")
print(classification_report(y_test, y_pred))
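
The project idea asks for the likelihood of renewal, whereas `predict` returns only hard 0/1 labels. `predict_proba` returns class probabilities instead. The sketch below trains on the toy data from the example above and scores two hypothetical customers whose values are invented for illustration; tree-based models do not need scaled inputs, so the scaler is omitted here.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

# Toy data from the example above
customers = pd.DataFrame({
    'age': [25, 35, 45, 55, 65],
    'income': [30000, 50000, 70000, 90000, 110000],
    'policy_term': [10, 15, 20, 25, 30],
    'premium': [200, 300, 400, 500, 600],
    'renewed': [1, 0, 1, 1, 0],
})
feature_cols = ['age', 'income', 'policy_term', 'premium']

clf = GradientBoostingClassifier(random_state=42)
clf.fit(customers[feature_cols], customers['renewed'])

# Hypothetical customers to score (values invented for illustration)
new_customers = pd.DataFrame({
    'age': [40, 60],
    'income': [60000, 100000],
    'policy_term': [20, 25],
    'premium': [350, 550],
})

# predict_proba columns follow clf.classes_, so look up the column for class 1
renewed_col = list(clf.classes_).index(1)
renewal_prob = clf.predict_proba(new_customers[feature_cols])[:, renewed_col]
print(renewal_prob)
```

Probabilities from a model trained on so few rows are not calibrated, so they are best read as a ranking signal here.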
