<a href="https://colab.research.google.com/github/Leslyndizeye/PHARMA-Drug-Availability-Prediction/blob/main/PHARMA-Drug-Availability-Prediction.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Optimization Techniques in Machine Learning - nearbyPHARMA Implementation

## 1. Project Overview

**nearbyPHARMA**: Drug Availability Prediction System

Mission: Reduce medication stockouts in Rwandan pharmacies using ML

Dataset: Pharmacy sales data with 8 drug categories (M01AB-N05C, R03, R06)
Features: ['datum','M01AB','M01AE','N02BA','N02BE','N05B','N05C','R03','R06',
           'Year','Month','Hour','Weekday Name']
Target: Binary classification (In-Stock=1, Out-of-Stock=0)



# Case Study and Implementation




In [1]:
#imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import os
import warnings
warnings.filterwarnings('ignore')

# Machine Learning imports
from sklearn.model_selection import train_test_split, cross_val_score, StratifiedKFold
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score, recall_score, f1_score,
                           confusion_matrix, roc_auc_score, classification_report)
import joblib

# Deep Learning imports
import tensorflow as tf
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization
from tensorflow.keras.optimizers import Adam, RMSprop, SGD
from tensorflow.keras.regularizers import l1, l2, l1_l2
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

# The Dataset

## Problem Statement

In Rwandan pharmacies, frequent stockouts of essential medications cause:
- 30% of patients to experience treatment delays (Rwanda Ministry of Health, 2023)
- Wasted time searching across multiple pharmacies
- Reactive inventory management leading to poor healthcare outcomes

Our goal: Predict drug availability (in-stock/out-of-stock) to optimize pharmacy inventory planning.

## Dataset Description

**Source**: Pharmacy daily sales records (`pharmasalesdaily.csv`) from [Pharma Sales Data on Kaggle](https://www.kaggle.com/datasets/milanzdravkovic/pharma-sales-data?utm_source=chatgpt.com)  
*Contains daily sales data for 8 drug categories from 2014-2016*  
**Time Period**: January 2014 - December 2016 (3 years)
**Coverage**: 8 essential drug categories (ATC codes):
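- M01AB: Anti-inflammatory products (acetic acid derivatives)
- M01AE: Anti-inflammatory products (propionic acid derivatives, e.g. ibuprofen)
- N02BA: Analgesics (salicylic acid derivatives, e.g. aspirin)
- N02BE: Analgesics (anilides, e.g. paracetamol)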
- N05B: Anxiolytics
- N05C: Hypnotics/sedatives
- R03: Anti-asthmatics
- R06: Antihistamines

**Key Features**:
1. Temporal Features:
   - Month (1-12)
   - Hour (nominally 0-23; the daily sample rows show larger aggregated values such as 248 and 276)
   - Weekday (Monday-Sunday)

2. Drug-Specific Metrics:
   - Daily sales volume per drug
   - 7-day rolling averages (engineered feature)

3. Contextual Features:
   - Weekend indicator (is_weekend)
   - Seasonal markers (quarter)
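
The engineered features listed above could be derived with pandas roughly as follows. This is a minimal sketch, not the notebook's actual code: the helper name `add_time_features`, the `_roll7` column suffix, and the `%m/%d/%Y` date format (inferred from values such as `1/2/2014`) are assumptions.

```python
import pandas as pd


def add_time_features(df, drug_columns, window=7):
    """Add is_weekend, quarter, and rolling-mean columns (hypothetical helper)."""
    out = df.copy()
    # Assumed date format, e.g. "1/2/2014" = 2 January 2014
    out["datum"] = pd.to_datetime(out["datum"], format="%m/%d/%Y")
    out = out.sort_values("datum").reset_index(drop=True)

    # Saturday (5) and Sunday (6) count as weekend
    out["is_weekend"] = (out["datum"].dt.dayofweek >= 5).astype(int)
    out["quarter"] = out["datum"].dt.quarter

    # Rolling mean over the last `window` rows; with one row per day this is
    # a 7-day average. min_periods=1 avoids NaN at the start of the series.
    for col in drug_columns:
        out[f"{col}_roll{window}"] = out[col].rolling(window, min_periods=1).mean()
    return out


# Tiny illustrative frame (one drug column only)
demo = pd.DataFrame({
    "datum": ["1/2/2014", "1/3/2014", "1/4/2014", "1/5/2014"],
    "M01AB": [0.0, 8.0, 2.0, 0.0],
})
features = add_time_features(demo, ["M01AB"])
print(features[["datum", "is_weekend", "quarter", "M01AB_roll7"]])
```

If the real data has missing days, a time-based window (a datetime index with `rolling("7D")`) may be a better fit than a row-based one.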

**Target Variable**:
- availability: Binary (1=In-Stock, 0=Out-of-Stock)
  *Derived from sales data: 0 sales → likely out-of-stock*
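
One possible reading of the rule above, sketched in pandas: a day is labelled out-of-stock only when total sales across all eight categories are zero. The source states only "0 sales → likely out-of-stock", so the row-level aggregation, the helper name `add_availability`, and the example values are assumptions; a per-drug label would be an equally plausible interpretation.

```python
import pandas as pd

DRUG_COLUMNS = ["M01AB", "M01AE", "N02BA", "N02BE", "N05B", "N05C", "R03", "R06"]


def add_availability(df, drug_columns=DRUG_COLUMNS):
    """Label each row 1 (in-stock) or 0 (out-of-stock); hypothetical helper."""
    out = df.copy()
    total_sales = out[drug_columns].sum(axis=1)
    # Assumption: zero sales in every category on a day means out-of-stock
    out["availability"] = (total_sales > 0).astype(int)
    return out


# Made-up rows for illustration: one day with some sales, one with none
rows = pd.DataFrame(
    [[0.0, 3.67, 1.0, 2.0, 0.0, 0.0, 1.0, 0.0],
     [0.0] * 8],
    columns=DRUG_COLUMNS,
)
labeled = add_availability(rows)
print(labeled["availability"].tolist())
```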

# Display Sample Data with Styling
from IPython.display import display, HTML

sample_data = """
<style>
.pharma-table {
    font-family: Arial, sans-serif;
    border-collapse: collapse;
    width: 100%;
    margin: 20px 0;
    box-shadow: 0 2px 3px rgba(0,0,0,0.1);
}
.pharma-table th {
    background-color: #4CAF50;
    color: white;
    text-align: left;
    padding: 12px;
}
.pharma-table td {
    padding: 10px;
    border-bottom: 1px solid #ddd;
}
.pharma-table tr:nth-child(even) {
    background-color: #f2f2f2;
}
.pharma-table tr:hover {
    background-color: #e6f7e6;
}
</style>

<table class="pharma-table">
  <tr>
    <th>datum</th>
    <th>M01AB</th>
    <th>M01AE</th>
    <th>...</th>
    <th>Hour</th>
    <th>Weekday Name</th>
    <th>Target</th>
  </tr>
  <tr>
    <td>1/2/2014</td>
    <td>0.0</td>
    <td>3.67</td>
    <td>...</td>
    <td>248</td>
    <td>Thursday</td>
    <td style="color:green;font-weight:bold">1 (In-Stock)</td>
  </tr>
  <tr>
    <td>1/3/2014</td>
    <td>8.0</td>
    <td>4.00</td>
    <td>...</td>
    <td>276</td>
    <td>Friday</td>
    <td style="color:green;font-weight:bold">1 (In-Stock)</td>
  </tr>
  <tr>
    <td>1/4/2014</td>
    <td>2.0</td>
    <td>1.00</td>
    <td>...</td>
    <td>276</td>
    <td>Saturday</td>
    <td style="color:green;font-weight:bold">1 (In-Stock)</td>
  </tr>
  <tr>
    <td>1/5/2014</td>
    <td>0.0</td>
    <td>0.00</td>
    <td>...</td>
    <td>276</td>
    <td>Sunday</td>
    <td style="color:red;font-weight:bold">0 (Out-of-Stock)</td>
  </tr>
</table>
"""

# Render the HTML string as a styled table in the notebook output
display(HTML(sample_data))

**Key Characteristics**:
- 1,095 daily records (3 years)
- 13 original features + 4 engineered features
- Class imbalance: ~65% in-stock, 35% out-of-stock
- Missing values handled via forward-fill
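
The forward-fill step and the class imbalance noted above can be illustrated on toy data. This is a sketch under assumptions: the values are synthetic, and the stratified split is one common way to preserve the roughly 65/35 ratio in both partitions, not necessarily what this project used.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Forward-fill: each missing value takes the last observed value above it
sales = pd.DataFrame({"M01AB": [2.0, np.nan, 3.0, np.nan]})
filled = sales.ffill()

# Synthetic labels with the stated ~65% in-stock / 35% out-of-stock balance
y = np.array([1] * 65 + [0] * 35)
X = np.arange(len(y)).reshape(-1, 1)  # placeholder feature matrix

# stratify=y keeps the class ratio the same in the train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
print(filled["M01AB"].tolist())
print(y_train.mean(), y_test.mean())
```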


## Business Relevance

This data enables:
1. Pattern recognition in drug demand cycles
2. Prediction of stockouts before they occur  
3. Evidence-based inventory decisions tailored to Rwanda's pharmacy needs



# SECTION 1: Model Architecture



```
TODO: Insert an image with the model architecture here. Replace the image below.
```
> <img src="https://miro.medium.com/v2/resize:fit:640/format:webp/1*v1ohAG82xmU6WGsG2hoE8g.png" alt="?" style="width:25px"/>




# Task: Define a function that creates models with and without the specified optimization techniques


In [None]:
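from tensorflow.keras.optimizers import Adam, RMSprop
from tensorflow.keras.regularizers import l1, l2, l1_l2
from tensorflow.keras.callbacks import EarlyStopping


def define_model(optimization: str, regularization_datatype, early_stopping: bool, dropout: float, learning_rate: float):
  model = None  # TODO: create the model, e.g. Sequential()
  model.add(None)  # TODO: first layer
  # TODO: Add more layers as per architecture
  model.add(None)  # Last layer
  model.compile(optimizer=None)  # TODO: build the optimizer from `optimization` and `learning_rate`
  model.fit(None)  # TODO: pass the training data, plus EarlyStopping when `early_stopping` is True
  return model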
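
One way the template above might be filled in is sketched below. Everything specific here is an assumption, since the architecture image is still a TODO: the layer sizes (32 and 16), the binary cross-entropy loss, the helper names `build_model` and `make_callbacks`, and the feature count of 12 are placeholders. The sketch separates model construction from training so it can be inspected without data.

```python
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import Adam, RMSprop, SGD
from tensorflow.keras.regularizers import l2
from tensorflow.keras.callbacks import EarlyStopping

# Map optimizer names (case-insensitive) to Keras optimizer classes
OPTIMIZERS = {"adam": Adam, "rmsprop": RMSprop, "sgd": SGD}


def build_model(n_features, optimization="adam", regularizer=None,
                dropout=0.0, learning_rate=0.001):
    """Build and compile a small binary classifier (placeholder architecture)."""
    model = Sequential([
        Input(shape=(n_features,)),
        Dense(32, activation="relu", kernel_regularizer=regularizer),
        Dropout(dropout),  # a rate of 0.0 leaves activations unchanged
        Dense(16, activation="relu", kernel_regularizer=regularizer),
        Dense(1, activation="sigmoid"),  # probability of in-stock
    ])
    optimizer = OPTIMIZERS[optimization.lower()](learning_rate=learning_rate)
    model.compile(optimizer=optimizer, loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


def make_callbacks(early_stopping, patience=5):
    """Return the callback list to pass to model.fit()."""
    if not early_stopping:
        return []
    return [EarlyStopping(monitor="val_loss", patience=patience,
                          restore_best_weights=True)]


baseline = build_model(n_features=12)  # no regularization, no dropout
tuned = build_model(n_features=12, optimization="RMSprop",
                    regularizer=l2(0.01), dropout=0.3, learning_rate=0.0005)
```

Training would then look like `tuned.fit(X_train, y_train, validation_split=0.2, epochs=50, callbacks=make_callbacks(True))`, where `X_train` and `y_train` stand for the prepared data.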

# Task: Print out the Final Model Accuracy and plot the Loss curve

In [None]:
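def loss_curve_plot(model):
  # Assumes model.fit() has already run with validation data,
  # so Keras has attached the training history to the model
  history = model.history.history
  loss = history['loss']
  val_loss = history['val_loss']
  epochs = range(1, len(loss) + 1)
  plt.plot(epochs, loss, 'bo', label='Training loss')
  plt.plot(epochs, val_loss, 'r', label='Validation loss')
  plt.title('Training and Validation Loss')
  plt.xlabel('Epochs')
  plt.ylabel('Loss')
  plt.legend()
  plt.show()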
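
Since the task also asks for the final accuracy, a variant that works directly on a history dictionary may be handy. The sketch below assumes the dictionary has the keys Keras records when a model is compiled with `metrics=["accuracy"]` and trained with validation data; the function name `summarize_history` and the numbers are made up for illustration.

```python
import matplotlib.pyplot as plt


def summarize_history(history):
    """Print final accuracy and plot loss curves from a Keras-style history dict."""
    loss, val_loss = history["loss"], history["val_loss"]
    epochs = range(1, len(loss) + 1)

    final_acc = history["accuracy"][-1]
    final_val_acc = history["val_accuracy"][-1]
    print(f"Final training accuracy: {final_acc:.3f}")
    print(f"Final validation accuracy: {final_val_acc:.3f}")

    fig, ax = plt.subplots()
    ax.plot(epochs, loss, "bo", label="Training loss")
    ax.plot(epochs, val_loss, "r", label="Validation loss")
    ax.set_title("Training and Validation Loss")
    ax.set_xlabel("Epochs")
    ax.set_ylabel("Loss")
    ax.legend()
    return final_val_acc, ax


# Made-up numbers, only to exercise the function
fake_history = {
    "loss": [0.69, 0.58, 0.51],
    "val_loss": [0.70, 0.62, 0.57],
    "accuracy": [0.61, 0.72, 0.80],
    "val_accuracy": [0.60, 0.70, 0.78],
}
final_val_acc, ax = summarize_history(fake_history)
```

With a trained model, the dictionary would typically come from `history = model.fit(...)` followed by `summarize_history(history.history)`. Outside a notebook, call `plt.show()` afterwards to display the figure.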

# SECTION 2: Optimization and Regularization Combinations
At this point you should create models that combine various optimization techniques.
As before, plot the loss curve and print the accuracy and loss with verbose output.

In [None]:
#TODO:
model_2 = define_model('Adam', None)
loss_curve_plot(model_2)
#print out confusion matrix and error analysis metrics after the cell

In [None]:
#TODO:
model_3 = define_model('RMSprop', None)
loss_curve_plot(model_3)
#print out confusion matrix and error analysis metrics after the cell

In [None]:
#TODO:
model_4 = define_model(None)
loss_curve_plot(model_4)
#print out confusion matrix and error analysis metrics after the cell

# Task: Make Predictions using the best saved model


Create a confusion matrix and compute the F1 score for both models. Ensure the cell outputs are visible.
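
A minimal sketch of that evaluation step with scikit-learn follows. It assumes the model outputs in-stock probabilities that are thresholded at 0.5; the helper name `evaluate_binary` and the toy arrays are illustrative, not results from this project.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, f1_score


def evaluate_binary(y_true, y_prob, threshold=0.5):
    """Return (confusion matrix, F1) for probabilities thresholded into 0/1 labels."""
    y_pred = (np.asarray(y_prob).ravel() >= threshold).astype(int)
    # labels=[0, 1] fixes the layout: rows = true class, columns = predicted class
    cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
    return cm, f1_score(y_true, y_pred)


# Toy labels and predicted probabilities
y_true = np.array([1, 0, 1, 1, 0, 1])
y_prob = np.array([0.9, 0.2, 0.4, 0.8, 0.6, 0.7])

cm, f1 = evaluate_binary(y_true, y_prob)
print(cm)
print(f"F1 score: {f1:.2f}")
```

With a trained Keras model, `y_prob` would typically be `model.predict(X_test)`; calling the helper once per model gives the side-by-side comparison the task asks for.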

Finally, make predictions using the best model. By the time you reach this cell, you may realise that you needed to save the model at some point so that you can load it later.

In [None]:
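def make_predictions(model_path, X):

    # Load the model
    model = load_model(model_path)
    # Make predictions (assumes a sigmoid output, i.e. one probability per row)
    probabilities = model.predict(X)
    # Convert probabilities to binary labels (0 or 1) using a 0.5 threshold
    predictions = (probabilities > 0.5).astype(int)

    return predictions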

# Modify the code appropriately

In [None]:
model_path = None
make_predictions(None)

Congratulations!!
