# Cost of Living Index Prediction

This notebook fits a linear regression model that predicts the Cost of Living Index from the other indices in the dataset.

## Load Dataset
The dataset contains cost-of-living indices by country. Load it and preview the first few rows to examine its structure.

In [None]:
import pandas as pd

# Load the dataset
file_path = '/mnt/data/Cost_of_Living_Index_by_Country_2024.csv'
cost_of_living_data = pd.read_csv(file_path)

# Display the first few rows of the dataset
cost_of_living_data.head()
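
Beyond previewing rows, it can help to confirm that the columns used later are present and free of missing values. The following is a minimal sketch under stated assumptions: `summarize_columns` is a hypothetical helper (not part of pandas), and the small frame is made-up data standing in for the real CSV.

```python
import pandas as pd

def summarize_columns(df, required):
    """Return the required columns that are absent, plus missing-value counts
    for the required columns that are present."""
    missing_columns = [col for col in required if col not in df.columns]
    present = [col for col in required if col in df.columns]
    null_counts = df[present].isna().sum().to_dict()
    return missing_columns, null_counts

# Toy stand-in for cost_of_living_data (values invented for illustration only)
toy = pd.DataFrame({
    "Country": ["A", "B", "C"],
    "Cost of Living Index": [70.0, 55.5, None],
    "Rent Index": [40.0, 22.1, 18.3],
})

missing_columns, null_counts = summarize_columns(
    toy, ["Cost of Living Index", "Rent Index", "Groceries Index"]
)
print(missing_columns)  # required columns not found in the frame
print(null_counts)      # missing values per required column that exists
```

On the real data, the same call could be made with `cost_of_living_data` and the column names used in the next section.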

## Data Preprocessing

Select the feature columns and the target variable, then split the data into training (80%) and testing (20%) sets.

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

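# Select features and target
# Note: 'Cost of Living Plus Rent Index' appears to combine the target with rent.
# If so, keeping it as a feature leaks target information and can inflate R².
feature_columns = [
    'Rent Index',
    'Cost of Living Plus Rent Index',
    'Groceries Index',
    'Restaurant Price Index',
    'Local Purchasing Power Index',
]
X = cost_of_living_data[feature_columns]
y = cost_of_living_data['Cost of Living Index']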

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
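
To make the effect of `test_size=0.2` and `random_state=42` concrete, here is a small sketch on made-up arrays (the `*_demo` names and values are illustrative, not the notebook's data). It shows the resulting row counts and that a fixed seed reproduces the same split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Ten made-up rows with two features each (illustrative only)
X_demo = np.arange(20).reshape(10, 2)
y_demo = np.arange(10)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# With 10 rows and test_size=0.2, 8 rows go to training and 2 to testing
print(X_tr.shape, X_te.shape)

# Repeating the call with the same random_state yields the same test rows
_, X_te_again, _, _ = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)
same_split = np.array_equal(X_te, X_te_again)
print(same_split)
```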

## Model Training

Create an ordinary least squares linear regression model and fit it to the training set.

In [None]:
# Create a linear regression model
model = LinearRegression()

# Train the model
model.fit(X_train, y_train)
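
After fitting, the learned weights are available as `coef_` and `intercept_`. The sketch below uses synthetic, noise-free data generated from a known rule (an assumption made purely for illustration) to show that `LinearRegression` recovers the weights it was generated with.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: y = 2*x1 + 3*x2 + 1, with no noise (illustrative only)
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 100, size=(50, 2))
y_demo = 2.0 * X_demo[:, 0] + 3.0 * X_demo[:, 1] + 1.0

demo_model = LinearRegression().fit(X_demo, y_demo)

# The fitted coefficients and intercept should be approximately 2, 3 and 1
print(demo_model.coef_, demo_model.intercept_)
```

In the notebook itself, pairing the feature column names with `model.coef_` (for example via `zip(X_train.columns, model.coef_)`) would show how much each index contributes to the prediction.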

## Predictions and Evaluation

Make predictions on the test set and evaluate the model performance.

In [None]:
# Make predictions
y_pred = model.predict(X_test)

# Calculate performance metrics
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

mse, r2
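
For reference, both metrics can be computed by hand. The sketch below uses a hypothetical helper, `mse_and_r2`, and three made-up value pairs; it follows the standard definitions (MSE as the mean of squared residuals, R² as one minus the ratio of residual to total sum of squares).

```python
import numpy as np

def mse_and_r2(y_true, y_hat):
    """Compute mean squared error and R² from their standard definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    residual_ss = np.sum((y_true - y_hat) ** 2)       # sum of squared errors
    total_ss = np.sum((y_true - y_true.mean()) ** 2)  # spread around the mean
    return residual_ss / len(y_true), 1.0 - residual_ss / total_ss

# Made-up values: squared errors are 1, 0 and 4, so MSE = 5/3;
# total sum of squares is 8, so R² = 1 - 5/8 = 0.375
demo_mse, demo_r2 = mse_and_r2([3, 5, 7], [2, 5, 9])
print(demo_mse, demo_r2)
```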

## Results

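The cell below reports both metrics. Mean Squared Error (MSE) is the average squared difference between predicted and actual index values, so lower is better. The R-squared (R²) score is the proportion of variance in the target that the model explains, with 1 indicating a perfect fit.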

In [None]:
# Report the metrics with labels instead of a bare tuple
print(f"MSE: {mse:.4f}, R²: {r2:.4f}")