📊 Operational Intelligence Dashboard

A manufacturing analytics dashboard for chemical process optimization. Built with Streamlit to help you understand what's driving yield, catch outliers, and make smarter operational decisions.

Author: Hridesh Singh Chauhan
Purpose: Portfolio project showcasing data science and operational analytics for manufacturing processes.

🚀 What It Does

This dashboard helps you analyze manufacturing process data and answer questions like:

Which runs are outliers and why?
What process parameters drive yield?
Where should we focus improvement efforts?
What happens if we change temperature by 5°C?

Key Features

Outlier Detection

Finds runs with low yield using Isolation Forest
Shows you what makes outliers different
Configurable threshold (default: any run below 70% yield)

Predictive Modeling

Predicts yield based on process parameters
Uses Random Forest with cross-validation
Shows which parameters matter most

Strategic Insights

Cost-yield tradeoff analysis (is higher yield worth the energy cost?)
Improvement prioritization (where to focus first)
What-if scenarios (test changes before implementing)

Time Series Analysis

See how parameters change over time
Spot trends and patterns
Identify when outliers occurred

Realistic Energy Modeling

Energy prices vary by time of day and season
Ambient temperature affects cooling needs
More realistic cost calculations

📋 Setup

You'll need Python 3.8+. Install the dependencies:

cd "Operational Intelligence Dashboard"
pip install -r requirements.txt

Then run it:

streamlit run dashboard.py

🎯 Getting Started

Once the app loads, you'll see:

KPIs at the top - Average yield, downtime, energy costs, and outlier rate
Sidebar controls - Adjust simulation settings and outlier thresholds
Analysis tabs - Dive into different views of your data

First Steps

Check out the KPIs to get a quick sense of overall performance
Scroll down to see time series of your process parameters
Click on "Strategic Insights" to see improvement opportunities
Try the "What-If Simulator" to test different scenarios

📊 What You'll Find

Key Performance Indicators

Four main metrics at the top:

Average Yield - How it compares to the top 10% of runs
Downtime Percentage - Whether it's getting better or worse over time
Total Energy Cost - Shown alongside the average cost per run
Outliers Detected - Percentage of runs flagged as problematic
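
As a rough illustration of how these four KPIs could be derived from per-run records, here is a minimal sketch. The field names (`yield_pct`, `downtime_pct`, `energy_cost`, `is_outlier`) and the `compute_kpis` helper are hypothetical and are not taken from dashboard.py.

```python
from statistics import mean

def compute_kpis(runs):
    """Summarise a list of run dicts into headline KPIs (hypothetical schema)."""
    yields = sorted((r["yield_pct"] for r in runs), reverse=True)
    top_n = max(1, len(yields) // 10)  # top 10% of runs, at least one run
    return {
        "avg_yield": mean(yields),
        "top_decile_yield": mean(yields[:top_n]),
        "downtime_pct": mean(r["downtime_pct"] for r in runs),
        "total_energy_cost": sum(r["energy_cost"] for r in runs),
        "avg_energy_cost": mean(r["energy_cost"] for r in runs),
        "outlier_pct": 100.0 * sum(1 for r in runs if r["is_outlier"]) / len(runs),
    }

# Toy records for illustration only.
sample_runs = [
    {"yield_pct": y, "downtime_pct": 5.0, "energy_cost": 100.0, "is_outlier": y < 70}
    for y in (60, 65, 75, 80, 85, 90, 92, 94, 96, 98)
]
kpis = compute_kpis(sample_runs)
```

Comparing `avg_yield` with `top_decile_yield` gives the "how far are we from our best runs" view described above.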

Time Series Analysis

Interactive charts showing how each parameter changes over time. You can:

Switch which parameter is on the X-axis
See outlier markers on yield plots
View all your process parameters

Process Relationships

Scatter plots to explore how parameters relate to each other. Pick any two parameters and see:

How they correlate
Where outliers cluster
Relationships that might not be obvious

Strategic Insights

This section turns the analysis into recommendations, across three tabs:

Cost-Yield Tradeoffs: See the relationship between yield and energy costs. Are you getting good bang for your buck? Which runs are the most efficient?
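
One simple way to think about "most efficient" is yield per unit of energy cost. The sketch below assumes that definition; the `rank_by_efficiency` helper and the field names are hypothetical, and the dashboard may use a different measure.

```python
def rank_by_efficiency(runs, top_n=3):
    """Return the runs with the highest yield per unit of energy cost."""
    scored = [
        {**r, "yield_per_cost": r["yield_pct"] / r["energy_cost"]}
        for r in runs
        if r["energy_cost"] > 0  # skip runs without a usable cost
    ]
    return sorted(scored, key=lambda r: r["yield_per_cost"], reverse=True)[:top_n]

# Toy runs for illustration only.
toy_runs = [
    {"run_id": "A", "yield_pct": 90.0, "energy_cost": 150.0},
    {"run_id": "B", "yield_pct": 85.0, "energy_cost": 100.0},
    {"run_id": "C", "yield_pct": 70.0, "energy_cost": 140.0},
]
most_efficient = rank_by_efficiency(toy_runs, top_n=2)
```

In this toy data, run B has a lower yield than run A but ranks first because it costs much less to produce.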

Improvement Prioritization: Answers "Where should I focus first?" Shows you:

Which parameters will give the biggest yield boost
How hard it would be to improve each one
Top 3 priorities with specific recommendations

What-If Simulator: Test changes before making them. Adjust sliders for process parameters and see:

Predicted yield change
Estimated energy cost impact
Whether it's worth it

Outlier Detection Table

A detailed table of all outlier runs with:

All their parameter values
What makes them outliers
Sortable and filterable columns

🔧 How It Works

The project is organized into a few main files:

Operational Intelligence Dashboard/
├── dashboard.py            # The main Streamlit app
├── data_preprocessing.py   # Generates synthetic data and calculates metrics
├── model.py                # ML models for prediction and outlier detection
├── external_data.py        # Simulates energy prices and ambient temperature
└── requirements.txt        # What you need to install

dashboard.py - The main interface. Handles all the visualizations and user interactions.

data_preprocessing.py - Creates realistic synthetic data with proper relationships between parameters and yield. Also calculates business metrics.

model.py - Contains the machine learning models:

Random Forest for yield prediction
Isolation Forest for outlier detection
Feature importance calculations
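
To make the yield-prediction piece concrete, here is a minimal scikit-learn sketch of a Random Forest with cross-validation and feature importances. It is not the code in model.py: the `fit_yield_model` helper, the feature names, the hyperparameters, and the toy data are all assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Assumed column names for the four process parameters.
FEATURES = ["temperature", "pressure", "catalyst_concentration", "flow_rate"]

def fit_yield_model(X, y, n_splits=5, seed=0):
    """Cross-validate a Random Forest, then fit it on all data."""
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    cv_r2 = cross_val_score(model, X, y, cv=n_splits, scoring="r2")
    model.fit(X, y)
    importance = dict(zip(FEATURES, model.feature_importances_))
    return model, cv_r2, importance

# Toy data in which temperature dominates yield by construction.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(300, len(FEATURES)))
y = 60 + 30 * X[:, 0] + 5 * X[:, 2] + rng.normal(0, 1, size=300)

model, cv_r2, importance = fit_yield_model(X, y)
top_feature = max(importance, key=importance.get)
```

The per-fold R² scores in `cv_r2` indicate how stable the fit is, and `importance` is the kind of ranking behind "which parameters matter most".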

external_data.py - Simulates external factors like energy prices (higher during peak hours, seasonal variation) and ambient temperature.

💡 Usage Tips

Running a Basic Analysis

Launch the app
Play with the sidebar settings (try changing the number of samples)
Adjust the outlier threshold if needed
Scroll through the different sections

Finding Improvement Opportunities

Go to "Strategic Insights"
Check "Improvement Prioritization" - this shows you exactly where to focus
Look at the top 3 priorities and their recommendations
Use the "What-If Simulator" to test those changes

Testing Scenarios

Open "What-If Simulator"
The sliders start at average values (this is your baseline)
Move a slider (e.g., increase temperature)
Click "Run Scenario Analysis"
See predicted yield, energy cost, and whether it's a good idea

The simulator will tell you if your scenario matches the baseline (useful for verifying it's working correctly).
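
The baseline comparison can be sketched roughly as follows. The `run_scenario` helper and the `toy_predict` stand-in model are hypothetical; in the app the prediction would come from the trained yield model instead.

```python
from statistics import mean

def baseline_from(runs, features):
    """Baseline scenario: the average value of each parameter."""
    return {f: mean(r[f] for r in runs) for f in features}

def run_scenario(predict_yield, baseline, changes, tol=1e-9):
    """Compare a modified scenario against the baseline prediction."""
    scenario = {**baseline, **changes}
    base_pred = predict_yield(baseline)
    new_pred = predict_yield(scenario)
    return {
        "scenario": scenario,
        "baseline_yield": base_pred,
        "predicted_yield": new_pred,
        "yield_change": new_pred - base_pred,
        # True when no slider has moved away from its average value.
        "matches_baseline": all(abs(scenario[f] - baseline[f]) <= tol for f in baseline),
    }

def toy_predict(params):
    """Stand-in model (an assumption): yield rises 0.2 points per degree C."""
    return 50.0 + 0.2 * params["temperature"]

runs = [{"temperature": 145.0, "pressure": 4.0}, {"temperature": 155.0, "pressure": 6.0}]
baseline = baseline_from(runs, ["temperature", "pressure"])
unchanged = run_scenario(toy_predict, baseline, {})
warmer = run_scenario(toy_predict, baseline, {"temperature": baseline["temperature"] + 5})
```

With the sliders untouched, `unchanged["matches_baseline"]` is true and the yield change is zero, which is the sanity check described above. The `warmer` scenario answers the "what if we raise temperature by 5°C" question for the stand-in model.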

📊 About the Data

The app generates synthetic data that mimics real manufacturing processes. It includes:

Process Parameters:

Temperature, Pressure, Catalyst Concentration, Flow Rate

External Factors:

Energy prices (varies by time of day and season)
Ambient temperature (affects cooling needs)

Calculated:

Energy consumption (based on process and ambient conditions)
Energy cost (consumption × time-varying price)
Yield (realistic relationships with process parameters)

The data generation models things like:

Optimal temperature ranges
Diminishing returns at high parameter values
Interactions between parameters
Realistic yield distributions
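
A minimal sketch of what such a yield function might look like is shown below. Every constant here (the 180°C optimum, the curve widths, the coefficients) is an invented placeholder, and `simulate_yield` is a hypothetical name; the actual relationships live in data_preprocessing.py.

```python
import math
import random

def simulate_yield(temperature, pressure, catalyst, flow_rate, rng=None):
    """Toy yield model in percent; all constants are illustrative assumptions."""
    # Optimal temperature range: bell-shaped response around an assumed optimum.
    temp_term = math.exp(-((temperature - 180.0) / 25.0) ** 2)
    # Diminishing returns: each extra unit of catalyst helps less than the last.
    catalyst_term = 1.0 - math.exp(-catalyst / 2.0)
    # Interaction: pressure matters more when temperature is near its optimum.
    interaction = 2.0 * temp_term * (pressure - 5.0)
    # Mild penalty for drifting away from an assumed nominal flow rate.
    flow_penalty = 0.05 * abs(flow_rate - 100.0)
    noise = rng.gauss(0.0, 1.5) if rng is not None else 0.0
    value = 40.0 + 30.0 * temp_term + 20.0 * catalyst_term + interaction - flow_penalty + noise
    return min(100.0, max(0.0, value))  # keep yield within 0-100%

# Generate a small noisy sample with a seeded generator for reproducibility.
rng = random.Random(42)
sample = [
    simulate_yield(rng.uniform(140, 220), rng.uniform(3, 8),
                   rng.uniform(0.5, 5), rng.uniform(80, 120), rng)
    for _ in range(200)
]
```

Calling `simulate_yield` without `rng` returns the noise-free response, which is convenient for checking the shape of each relationship in isolation.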

🎓 Under the Hood

Machine Learning:

Isolation Forest finds outliers without needing labeled data
Random Forest predicts yield and shows feature importance
Cross-validation ensures the metrics are reliable

Energy Modeling:

Energy prices are higher during peak hours (9 AM - 5 PM)
Summer months see price increases
Ambient temperature affects how much cooling you need
More realistic than assuming fixed prices
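
The pricing rules above can be sketched as a small function. The 9 AM - 5 PM peak window comes from this README; the multipliers, the June-August definition of summer, the cooling coefficient, and all function names are assumptions for illustration rather than the values in external_data.py.

```python
from datetime import datetime

def energy_price(ts, base_price=0.10, peak_multiplier=1.5, summer_multiplier=1.2):
    """Price per kWh at timestamp ts (multipliers are assumed values)."""
    price = base_price
    if 9 <= ts.hour < 17:        # peak hours, 9 AM to 5 PM
        price *= peak_multiplier
    if ts.month in (6, 7, 8):    # assumed summer months
        price *= summer_multiplier
    return price

def energy_consumption(process_kwh, ambient_temp_c,
                       cooling_kwh_per_deg=0.5, reference_temp_c=20.0):
    """Process energy plus extra cooling when ambient is above a reference."""
    return process_kwh + cooling_kwh_per_deg * max(0.0, ambient_temp_c - reference_temp_c)

def energy_cost(process_kwh, ambient_temp_c, ts):
    """Cost = consumption x time-varying price."""
    return energy_consumption(process_kwh, ambient_temp_c) * energy_price(ts)

night_winter = energy_cost(100.0, 10.0, datetime(2025, 1, 15, 3))
peak_summer = energy_cost(100.0, 30.0, datetime(2025, 7, 15, 14))
```

The same 100 kWh process run costs noticeably more on a hot summer afternoon than on a winter night, which is why costs in the dashboard are not uniform across runs.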

Strategic Analysis:

Looks at actual data to find improvement opportunities
Considers both impact (how much yield gain) and feasibility (how big is the gap)
Prioritizes based on what's achievable, not just what's important
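
One possible way to combine impact and feasibility into a ranking is sketched below. The scoring rule (importance x gap x (1 - gap), where the gap is the distance between the average run and the top 10% of runs) is an assumption chosen for illustration, not the formula the dashboard uses, and `prioritize` is a hypothetical helper.

```python
from statistics import mean

def prioritize(runs, importance, top_fraction=0.10, top_n=3):
    """Rank parameters by an assumed impact-times-feasibility score."""
    ranked = sorted(runs, key=lambda r: r["yield_pct"], reverse=True)
    best = ranked[: max(1, int(len(ranked) * top_fraction))]
    rows = []
    for name, weight in importance.items():
        values = [r[name] for r in runs]
        span = max(values) - min(values)
        if span == 0:
            continue  # a constant parameter offers nothing to adjust
        current = mean(values)
        target = mean(r[name] for r in best)
        gap = abs(target - current) / span  # 0 = already there, 1 = full range away
        rows.append({
            "parameter": name,
            "current": current,
            "target": target,
            "gap": gap,
            # Impact grows with the gap; feasibility shrinks with it.
            "score": weight * gap * (1.0 - gap),
        })
    return sorted(rows, key=lambda r: r["score"], reverse=True)[:top_n]

# Toy data: yield rises with temperature; pressure alternates between 5 and 6.
runs = [
    {"yield_pct": 60 + 4 * i, "temperature": 150 + 5 * i, "pressure": 5 + (i % 2)}
    for i in range(10)
]
priorities = prioritize(runs, {"temperature": 0.7, "pressure": 0.3})
```

Each row carries the current average and the target taken from the best runs, which is enough to phrase a recommendation such as "move temperature from 172.5 toward 195".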

⚙️ Configuration Options

Simulation Settings (in the sidebar):

Number of samples - How many runs to generate (default: 1000)
Process parameter ranges - Adjust temperature, pressure, etc.
Energy unit price - Base price for energy calculations

Outlier Detection (in the sidebar):

Outlier Yield % - Runs below this yield threshold are considered outliers (default: 70%)
For synthetic data, the contamination rate is calculated automatically from this threshold
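
A hedged sketch of how a yield threshold could be turned into a contamination rate for Isolation Forest follows. It assumes the contamination is simply the share of runs below the threshold, clipped to the range scikit-learn accepts; the helper names and toy data are hypothetical, and model.py may do this differently.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def contamination_from_threshold(yields, threshold_pct=70.0):
    """Share of runs below the yield threshold, clipped to (0, 0.5]."""
    share = float(np.mean(np.asarray(yields) < threshold_pct))
    # IsolationForest only accepts a contamination value in (0, 0.5].
    return min(max(share, 1e-3), 0.5)

def flag_outliers(X, yields, threshold_pct=70.0, seed=0):
    """Fit an Isolation Forest and return a boolean outlier mask."""
    contamination = contamination_from_threshold(yields, threshold_pct)
    forest = IsolationForest(contamination=contamination, random_state=seed)
    labels = forest.fit_predict(X)  # -1 marks outliers, 1 marks inliers
    return labels == -1, contamination

# Toy data: four process parameters plus yield as the feature matrix.
rng = np.random.default_rng(0)
params = rng.normal(0.0, 1.0, size=(500, 4))
yields = rng.normal(80.0, 8.0, size=500)
X = np.column_stack([params, yields])

mask, contamination = flag_outliers(X, yields, threshold_pct=70.0)
```

Note that Isolation Forest flags runs that are unusual across all supplied features, so the flagged set has roughly the size implied by the threshold but is not guaranteed to be exactly the runs below it.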

🐛 Troubleshooting

"No suitable feature columns found"

Make sure the data generation completed. Check the sidebar settings.

"Yield column not found"

The preprocessing should create this automatically. Try refreshing or check the logs.

Model predictions seem off

Check that you have enough samples (try increasing in sidebar)
The model is trained on the synthetic data, so predictions should match those patterns

Energy costs don't make sense

Energy costs depend on consumption and time-varying prices
Check that external data integration completed (you should see energy_price and ambient_temperature columns)
The costs vary by time of day and season, so they won't all be the same

What-if simulator shows different values than baseline

The sliders start at average values, so if you haven't moved them, it should show baseline
Make sure you click "Run Scenario Analysis" to see results

📝 Quick Notes

The app uses synthetic data generation - no CSV upload needed
Outlier detection is based on a yield threshold (you can change it)
Energy costs use realistic time-varying prices
Strategic insights are calculated from your actual data
The what-if simulator starts at average values so you can compare

Version: 1.0

Last Updated: Nov 11th, 2025
