# EazyML Counterfactual Template

## Define Imports

In [None]:
!pip install --upgrade eazyml-counterfactual
!pip install gdown python-dotenv

In [None]:
import os
import pandas as pd
import eazyml as ez
from eazyml_counterfactual import (
    ez_cf_inference,
    ez_init,
)
import gdown

from dotenv import load_dotenv
load_dotenv()

## 1. Initialize EazyML

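The `ez_init` function takes the access key used for authentication. This notebook reads the key from the `EAZYML_ACCESS_KEY` environment variable, which `load_dotenv()` populates from a local `.env` file if one exists. If the variable is not set, `os.getenv` returns `None` and EazyML defaults to a trial license.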

In [None]:
ez_init(os.getenv('EAZYML_ACCESS_KEY'))
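
Because a missing key silently falls back to the trial license, it can help to check the environment before initializing. The following is a minimal sketch using only the standard library; `access_key_status` is a hypothetical helper, not part of EazyML, and it never prints the key itself.

```python
import os


def access_key_status(env=None):
    """Report whether EAZYML_ACCESS_KEY is set, without revealing its value.

    Hypothetical helper for illustration; not part of the EazyML API.
    """
    env = os.environ if env is None else env
    key = env.get("EAZYML_ACCESS_KEY")
    if key:
        return "access key found"
    # An unset or empty variable means os.getenv would hand ez_init no usable key
    return "access key not set; expect trial-license behavior"


print(access_key_status())
```

Running this cell before `ez_init` makes it obvious which licensing mode the rest of the notebook will use.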

## 2. Define Dataset Files and Outcome Variable

In [None]:
gdown.download_folder(id='1gWvCFW2cHqthUsPUQ0feOG4P41rpQwJC')
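
The next cell assumes the download produced a local `data` folder containing both Excel files. A quick existence check, sketched below with the standard library only, surfaces a failed or partial download before `pd.read_excel` raises a less obvious error. `missing_files` and `EXPECTED_FILES` are illustrative names, not part of gdown or EazyML.

```python
from pathlib import Path

# File names taken from the paths used in this notebook
EXPECTED_FILES = [
    "Mobile Price Ternary - Train Data.xlsx",
    "Mobile Price Ternary - Test Data.xlsx",
]


def missing_files(folder, names=EXPECTED_FILES):
    """Return the expected file names that are not present in `folder`."""
    folder = Path(folder)
    return [name for name in names if not (folder / name).is_file()]


missing = missing_files("data")
if missing:
    print("Missing files:", missing)
else:
    print("All dataset files found.")
```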

In [None]:
# Define file paths for the training and test datasets and specify the outcome (target) column
train_file = os.path.join('data', "Mobile Price Ternary - Train Data.xlsx")
test_file = os.path.join('data', "Mobile Price Ternary - Test Data.xlsx")
outcome = "price_range"

# Loading the training dataset and the test dataset
train_df = pd.read_excel(train_file)
test_df = pd.read_excel(test_file)
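
Model building needs the outcome column to be present in the training data, so a small sanity check after loading can save a failed API call. The sketch below works on plain lists of column names; `missing_columns` is a hypothetical helper, and the short column lists are stand-ins for `train_df.columns` and `test_df.columns`.

```python
def missing_columns(columns, required):
    """Return the required column names absent from `columns`, in order."""
    present = set(columns)
    return [name for name in required if name not in present]


# Hypothetical column lists standing in for train_df.columns / test_df.columns
train_columns = ["battery_power", "ram", "price_range"]
test_columns = ["battery_power", "ram"]

print(missing_columns(train_columns, ["price_range"]))  # → []
print(missing_columns(test_columns, ["price_range"]))   # → ['price_range']
```

In the notebook itself, the equivalent call would be `missing_columns(train_df.columns, [outcome])`. Whether the test file must also contain the outcome depends on how it is used, so treat a missing outcome there as a prompt to check rather than an error.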

## 3. Dataset Information

The dataset used in this notebook is the **Mobile Price Classification Dataset**, which describes mobile phones by their technical specifications. Features such as battery power, camera resolution, memory, and screen dimensions are used to classify each phone into a price range.

You can find more details and download the dataset from Kaggle using the following link:

[Kaggle Mobile Price Classification Dataset](https://www.kaggle.com/datasets/iabhishekofficial/mobile-price-classification)

### Columns in the Dataset:
- **battery_power**: The battery power of the mobile phone (in mAh).
- **blue**: Whether the mobile has Bluetooth connectivity (1 = Yes, 0 = No).
- **clock_speed**: The clock speed of the mobile’s processor (in GHz).
- **dual_sim**: Whether the mobile supports dual SIM (1 = Yes, 0 = No).
- **fc**: Front camera quality (in megapixels).
- **four_g**: Whether the mobile supports 4G connectivity (1 = Yes, 0 = No).
- **int_memory**: Internal memory of the mobile (in GB).
- **m_dep**: Mobile depth (in cm).
- **mobile_wt**: Weight of the mobile (in grams).
- **n_cores**: Number of processor cores in the mobile.
- **pc**: Primary camera quality (in megapixels).
- **px_height**: Pixel resolution height (in pixels).
- **px_width**: Pixel resolution width (in pixels).
- **ram**: Random access memory of the mobile (in MB).
- **sc_h**: Screen height of the mobile (in cm).
- **sc_w**: Screen width of the mobile (in cm).
- **talk_time**: Maximum talk time (in hours).
- **three_g**: Whether the mobile supports 3G connectivity (1 = Yes, 0 = No).
- **touch_screen**: Whether the mobile has a touch screen (1 = Yes, 0 = No).
- **wifi**: Whether the mobile supports Wi-Fi connectivity (1 = Yes, 0 = No).
- **price_range**: The price range of the mobile (target variable, with 4 possible classes: 0, 1, 2, 3).
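
The column list above can be captured in code, which is convenient when choosing features later in the notebook. The grouping below into binary flags and numeric features is an editorial reading of the descriptions (the "1 = Yes, 0 = No" columns versus everything else), not something the dataset or EazyML defines.

```python
# Columns described above as "1 = Yes, 0 = No"
BINARY_FEATURES = ["blue", "dual_sim", "four_g", "three_g", "touch_screen", "wifi"]

# Remaining feature columns, all described with a numeric unit or count
NUMERIC_FEATURES = [
    "battery_power", "clock_speed", "fc", "int_memory", "m_dep", "mobile_wt",
    "n_cores", "pc", "px_height", "px_width", "ram", "sc_h", "sc_w", "talk_time",
]

TARGET = "price_range"

all_columns = BINARY_FEATURES + NUMERIC_FEATURES + [TARGET]

# Every listed column appears exactly once: 6 binary + 14 numeric + 1 target
assert len(all_columns) == len(set(all_columns)) == 21
print(len(BINARY_FEATURES), "binary,", len(NUMERIC_FEATURES), "numeric, 1 target")
# → 6 binary, 14 numeric, 1 target
```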

### 3.1 Display the Dataset

Below is a preview of the dataset:

In [None]:
# Display the first few rows of the training DataFrame for inspection
ez.ez_display_df(train_df.head())

## 4. EazyML Modeling

### 4.1 Build a Model Using the EazyML Modeling API

In [None]:
# Define model parameters
model_options = {
    "model_type": "predictive",
}

# Build predictive model using EazyML API
build_model_response = ez.ez_build_model(train_df, outcome=outcome, options=model_options)

### 4.2 Feature Importance

In [None]:
ez.ez_display_df(build_model_response['global_importance'])

### 4.3 Model Performance

In [None]:
ez.ez_display_df(build_model_response['model_performance'])

### 4.4 Predict Using the Trained EazyML Model

In [None]:
# Extract model information from the response dictionary
model_info = build_model_response["model_info"]

# Read the test data from the Excel file into a pandas DataFrame
test_data = pd.read_excel(test_file)

# Make predictions using the model, requesting confidence scores and class probabilities
predicted_resp = ez.ez_predict(test_data, model_info, options={"confidence_score": True, "class_probability": True})

# Check if the prediction was successful
if predicted_resp['success']:
    print("Prediction successful")  
    predicted_df = predicted_resp['pred_df']  # Extract the predicted DataFrame
    ez.ez_display_df(predicted_df.head())  # Display the first few rows of the predicted DataFrame
else:
    print("Prediction failed")  
    print(predicted_resp['message'])  
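
Note that if the prediction fails, `predicted_df` is never defined, and the counterfactual cells further down will raise a `NameError`. One way to fail early with a clear message is to wrap the check in a helper. This is a minimal sketch: `unwrap_response` is a hypothetical function, and it assumes only the `success`, `message`, and `pred_df` keys that the cell above already uses.

```python
def unwrap_response(response, key):
    """Return response[key] if the call succeeded, else raise with the message."""
    if not response.get("success"):
        raise RuntimeError(response.get("message", "EazyML call failed"))
    return response[key]


# Hypothetical response dictionaries mirroring the keys used above
ok = {"success": True, "pred_df": "predictions"}
failed = {"success": False, "message": "invalid model_info"}

print(unwrap_response(ok, "pred_df"))
try:
    unwrap_response(failed, "pred_df")
except RuntimeError as err:
    print("Prediction failed:", err)
```

With this helper, `predicted_df = unwrap_response(predicted_resp, "pred_df")` either yields the DataFrame or stops the notebook at the point of failure.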

## 5. EazyML Counterfactual Inference

### 5.1 Define Counterfactual Inference Configuration

In [None]:
# Define the selected features for prediction
selected_features = ['sc_w', 'n_cores', 'mobile_wt', 'talk_time', 'ram', 'px_width', 'px_height', 
                     'battery_power', 'pc', 'fc', 'm_dep', 'int_memory', 'sc_h']

# Invariants are features held fixed; every other selected feature is a variant (modifiable)
invariants = []
variants = [feature for feature in selected_features if feature not in invariants]

# Define configurable parameters for counterfactual inference
cf_options = {   
    "variants": variants,  
    "outcome_ordinality": "1",  # Desired outcome 
    "train_data": train_file  
}
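
In the configuration above, `invariants` is empty, so every selected feature may change. To illustrate how the split behaves when some features are held fixed, here is a small sketch in plain Python. The `split_variants` helper and the choice of screen dimensions as invariants are hypothetical, for illustration only.

```python
selected_features = ['sc_w', 'n_cores', 'mobile_wt', 'talk_time', 'ram', 'px_width',
                     'px_height', 'battery_power', 'pc', 'fc', 'm_dep', 'int_memory', 'sc_h']


def split_variants(selected, invariants):
    """Return the selected features that are not held fixed, preserving order."""
    unknown = [f for f in invariants if f not in selected]
    if unknown:
        # A typo in an invariant name would otherwise be silently ignored
        raise ValueError(f"Invariants not in selected features: {unknown}")
    return [f for f in selected if f not in invariants]


# Hypothetical choice: hold the screen dimensions fixed
variants = split_variants(selected_features, invariants=["sc_h", "sc_w"])
print(len(variants))  # → 11
```

The extra validation matters because the list comprehension used in the cell above accepts misspelled invariant names without complaint, leaving the intended feature modifiable.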

### 5.2 Perform Counterfactual Inference

In [None]:
# Specify the index of the test record for counterfactual inference
test_index_no = 0  
test_data = predicted_df.loc[[test_index_no]]  

# Perform Inference 
result, optimal_transition_df = ez_cf_inference(
    test_data=test_data,  
    outcome=outcome,  
    selected_features=selected_features,  
    model_info=model_info,
    options=cf_options  
)

### 5.3 Display Results

In [None]:
# Summarizes whether an optimal transition was found and the improvement in outcome probability.
ez.ez_display_json(result)

In [None]:
# Details the feature changes needed to achieve the optimal outcome.
ez.ez_display_df(optimal_transition_df)