GhostRuins/Process-Optimization

📊 Operational Intelligence Dashboard

A manufacturing analytics dashboard for chemical process optimization. Built with Streamlit to help you understand what's driving yield, catch outliers, and make smarter operational decisions.

Author: Hridesh Singh Chauhan
Purpose: Portfolio project showcasing data science and operational analytics for manufacturing processes.

🚀 What It Does

This dashboard helps you analyze manufacturing process data and answer questions like:

  • Which runs are outliers and why?
  • What process parameters drive yield?
  • Where should we focus improvement efforts?
  • What happens if we change temperature by 5°C?

Key Features

Outlier Detection

  • Flags low-yield runs using Isolation Forest
  • Shows which parameter values set outliers apart from normal runs
  • Configurable yield threshold (default: runs below 70% yield count as outliers)

Predictive Modeling

  • Predicts yield based on process parameters
  • Uses Random Forest with cross-validation
  • Shows which parameters matter most

Strategic Insights

  • Cost-yield tradeoff analysis (is higher yield worth the energy cost?)
  • Improvement prioritization (where to focus first)
  • What-if scenarios (test changes before implementing)

Time Series Analysis

  • See how parameters change over time
  • Spot trends and patterns
  • Identify when outliers occurred

Realistic Energy Modeling

  • Energy prices vary by time of day and season
  • Ambient temperature affects cooling needs
  • More realistic cost calculations

📋 Setup

You'll need Python 3.8+. Install the dependencies:

cd "Operational Intelligence Dashboard"
pip install -r requirements.txt

Then run it:

streamlit run dashboard.py

🎯 Getting Started

Once the app loads, you'll see:

  1. KPIs at the top - Average yield, downtime, energy costs, and outlier rate
  2. Sidebar controls - Adjust simulation settings and outlier thresholds
  3. Analysis tabs - Dive into different views of your data

First Steps

  1. Check out the KPIs to get a quick sense of overall performance
  2. Scroll down to see time series of your process parameters
  3. Click on "Strategic Insights" to see improvement opportunities
  4. Try the "What-If Simulator" to test different scenarios

📊 What You'll Find

Key Performance Indicators

Four main metrics at the top:

  • Average Yield - How it compares to the top 10% of runs
  • Downtime Percentage - Whether it's getting better or worse over time
  • Total Energy Cost - Shown with the average cost per run
  • Outliers Detected - Percentage of runs flagged as problematic
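The four KPIs above could be computed along these lines. This is a minimal sketch, not the dashboard's actual code: the column names (`yield`, `downtime_pct`, `energy_cost`), the function name `compute_kpis`, and the first-half versus second-half downtime trend are all assumptions made for illustration.

```python
import pandas as pd

def compute_kpis(df, outlier_threshold=70.0):
    """Summarize a runs table into KPI values (hypothetical column names)."""
    # Mean yield of the runs at or above the 90th percentile of yield.
    cutoff = df["yield"].quantile(0.90)
    top10_avg = df.loc[df["yield"] >= cutoff, "yield"].mean()

    # Assumed trend measure: later half minus earlier half (negative = improving).
    half = len(df) // 2
    trend = df["downtime_pct"].iloc[half:].mean() - df["downtime_pct"].iloc[:half].mean()

    return {
        "avg_yield": df["yield"].mean(),
        "top10_avg_yield": top10_avg,
        "downtime_pct": df["downtime_pct"].mean(),
        "downtime_trend": trend,
        "total_energy_cost": df["energy_cost"].sum(),
        "avg_energy_cost_per_run": df["energy_cost"].mean(),
        "outlier_rate_pct": 100 * (df["yield"] < outlier_threshold).mean(),
    }

# Tiny illustrative table of four runs, in time order.
runs = pd.DataFrame({
    "yield": [60.0, 80.0, 90.0, 100.0],
    "downtime_pct": [4.0, 4.0, 2.0, 2.0],
    "energy_cost": [10.0, 20.0, 30.0, 40.0],
})
kpis = compute_kpis(runs)
```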

Time Series Analysis

Interactive charts showing how each parameter changes over time. You can:

  • Switch which parameter is on the X-axis
  • See outlier markers on yield plots
  • View all your process parameters

Process Relationships

Scatter plots to explore how parameters relate to each other. Pick any two parameters and see:

  • How they correlate
  • Where outliers cluster
  • Relationships that might not be obvious

Strategic Insights

This section turns the analysis into decisions. It has three tabs:

Cost-Yield Tradeoffs - See the relationship between yield and energy costs. Are you getting good bang for your buck? Which runs are the most efficient?

Improvement Prioritization - Answers "Where should I focus first?" and shows you:

  • Which parameters will give the biggest yield boost
  • How hard it would be to improve each one
  • Top 3 priorities with specific recommendations

What-If Simulator - Test changes before making them. Adjust sliders for process parameters and see:

  • Predicted yield change
  • Estimated energy cost impact
  • Whether it's worth it

Outlier Detection Table

A detailed table of all outlier runs with:

  • All their parameter values
  • What makes them outliers
  • Sortable, filterable columns

🔧 How It Works

The project is organized into a few main files:

Operational Intelligence Dashboard/
├── dashboard.py           # The main Streamlit app
├── data_preprocessing.py  # Generates synthetic data and calculates metrics
├── model.py              # ML models for prediction and outlier detection
├── external_data.py      # Simulates energy prices and ambient temperature
└── requirements.txt      # What you need to install

dashboard.py - The main interface. Handles all the visualizations and user interactions.

data_preprocessing.py - Creates realistic synthetic data with proper relationships between parameters and yield. Also calculates business metrics.

model.py - Contains the machine learning models:

  • Random Forest for yield prediction
  • Isolation Forest for outlier detection
  • Feature importance calculations

external_data.py - Simulates external factors like energy prices (higher during peak hours, seasonal variation) and ambient temperature.

💡 Usage Tips

Running a Basic Analysis

  1. Launch the app
  2. Play with the sidebar settings (try changing the number of samples)
  3. Adjust the outlier threshold if needed
  4. Scroll through the different sections

Finding Improvement Opportunities

  1. Go to "Strategic Insights"
  2. Check "Improvement Prioritization" - this shows you exactly where to focus
  3. Look at the top 3 priorities and their recommendations
  4. Use the "What-If Simulator" to test those changes

Testing Scenarios

  1. Open "What-If Simulator"
  2. The sliders start at average values (this is your baseline)
  3. Move a slider (e.g., increase temperature)
  4. Click "Run Scenario Analysis"
  5. See predicted yield, energy cost, and whether it's a good idea

The simulator will tell you if your scenario matches the baseline (useful for verifying it's working correctly).
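The scenario flow above can be sketched as a small function: start from a baseline of average parameter values, apply the slider changes, and compare the two predictions. Everything here is illustrative. `run_scenario` and `toy_predict` are hypothetical names, and the toy yield formula stands in for the trained Random Forest.

```python
def run_scenario(predict, baseline, changes, tol=1e-9):
    """Compare a modified scenario against the baseline prediction."""
    scenario = {**baseline, **changes}
    # The scenario "matches baseline" when no parameter has moved.
    is_baseline = all(abs(scenario[k] - baseline[k]) <= tol for k in baseline)
    baseline_yield = predict(baseline)
    predicted_yield = predict(scenario)
    return {
        "scenario": scenario,
        "is_baseline": is_baseline,
        "baseline_yield": baseline_yield,
        "predicted_yield": predicted_yield,
        "yield_change": predicted_yield - baseline_yield,
    }

def toy_predict(params):
    """Stand-in model (assumption): yield peaks at 180 degrees C."""
    return 90 - 0.02 * (params["temperature"] - 180) ** 2 + 0.5 * params["catalyst"]

baseline = {"temperature": 175.0, "pressure": 5.0, "catalyst": 2.0}

# Sliders untouched: the result should report the baseline itself.
unchanged = run_scenario(toy_predict, baseline, {})

# "What happens if we change temperature by 5 degrees C?"
result = run_scenario(toy_predict, baseline, {"temperature": baseline["temperature"] + 5})
```

Passing the model in as a `predict` callable keeps the comparison logic independent of which estimator produces the numbers.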

📊 About the Data

The app generates synthetic data that mimics real manufacturing processes. It includes:

Process Parameters:

  • Temperature, Pressure, Catalyst Concentration, Flow Rate

External Factors:

  • Energy prices (vary by time of day and season)
  • Ambient temperature (affects cooling needs)

Calculated:

  • Energy consumption (based on process and ambient conditions)
  • Energy cost (consumption × time-varying price)
  • Yield (realistic relationships with process parameters)

The data generation models things like:

  • Optimal temperature ranges
  • Diminishing returns at high parameter values
  • Interactions between parameters
  • Realistic yield distributions
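One way to get these effects in a synthetic yield function is sketched below. The coefficients, the 180 °C optimum, and the name `simulate_yield` are assumptions chosen for illustration, not values taken from data_preprocessing.py.

```python
import numpy as np

def simulate_yield(temperature, pressure, catalyst, rng, noise_sd=2.0):
    """Toy yield model (all coefficients are illustrative assumptions)."""
    # Optimal temperature range: quadratic penalty away from 180.
    temp_term = -0.02 * (temperature - 180.0) ** 2
    # Diminishing returns: each extra unit of catalyst helps less.
    catalyst_term = 8.0 * (1.0 - np.exp(-catalyst / 1.5))
    # Interaction: the effect of pressure depends on temperature.
    interaction = 0.01 * (temperature - 180.0) * (pressure - 5.0)
    noise = rng.normal(0.0, noise_sd, size=np.shape(temperature))
    # Keep yields inside a physically meaningful 0-100% range.
    return np.clip(75.0 + temp_term + catalyst_term + interaction + noise, 0.0, 100.0)

rng = np.random.default_rng(0)
n = 1000
temperature = rng.uniform(150, 210, n)
pressure = rng.uniform(1, 10, n)
catalyst = rng.uniform(0.5, 5, n)
yields = simulate_yield(temperature, pressure, catalyst, rng)
```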

🎓 Under the Hood

Machine Learning:

  • Isolation Forest finds outliers without needing labeled data
  • Random Forest predicts yield and shows feature importance
  • Cross-validation scores the model on held-out folds, so the reported metrics are less likely to reflect overfitting
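A minimal sketch of the yield-prediction side with scikit-learn follows. The data, the feature names, and the hyperparameters are assumptions for illustration; model.py may be organized differently.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

FEATURES = ["temperature", "pressure", "catalyst", "flow_rate"]

rng = np.random.default_rng(42)
n = 500
X = np.column_stack([
    rng.uniform(150, 210, n),  # temperature
    rng.uniform(1, 10, n),     # pressure
    rng.uniform(0.5, 5, n),    # catalyst concentration
    rng.uniform(10, 100, n),   # flow rate
])
# Toy target: yield depends mostly on temperature, a little on catalyst.
y = 90 - 0.02 * (X[:, 0] - 180) ** 2 + 0.5 * X[:, 2] + rng.normal(0, 1, n)

model = RandomForestRegressor(n_estimators=100, random_state=42)

# 5-fold cross-validated R^2: each score comes from data the model did not train on.
cv_r2 = cross_val_score(model, X, y, cv=5, scoring="r2")

# Fit on all data to read off impurity-based feature importances.
model.fit(X, y)
importances = dict(zip(FEATURES, model.feature_importances_))
print(f"CV R^2: {cv_r2.mean():.3f} +/- {cv_r2.std():.3f}")
```

Impurity-based importances sum to 1 and rank features by how much they reduce prediction error inside the trees. They describe the fitted model, not cause and effect in the process.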

Energy Modeling:

  • Energy prices are higher during peak hours (9 AM - 5 PM)
  • Summer months see price increases
  • Ambient temperature affects how much cooling you need
  • More realistic than assuming fixed prices
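The pricing rules above can be sketched as a small function. The base price, both multipliers, and the choice of June through August as summer are illustrative assumptions. Only the 9 AM - 5 PM peak window comes from this README.

```python
from datetime import datetime

def energy_price(ts, base_price=0.12, peak_multiplier=1.5, summer_multiplier=1.2):
    """Price per unit of energy at timestamp ts (multipliers are assumptions)."""
    price = base_price
    if 9 <= ts.hour < 17:       # peak hours: 9 AM up to 5 PM
        price *= peak_multiplier
    if ts.month in (6, 7, 8):   # assumed summer months
        price *= summer_multiplier
    return price

def energy_cost(consumption, ts, **price_kwargs):
    """Cost of one run: consumption times the time-varying price."""
    return consumption * energy_price(ts, **price_kwargs)

winter_night = energy_price(datetime(2025, 1, 15, 3))    # base price only
winter_peak = energy_price(datetime(2025, 1, 15, 12))    # peak multiplier applied
summer_peak = energy_price(datetime(2025, 7, 15, 12))    # both multipliers applied
run_cost = energy_cost(100.0, datetime(2025, 7, 15, 12))
```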

Strategic Analysis:

  • Looks at the generated run data to find improvement opportunities
  • Considers both impact (how much yield gain) and feasibility (how big is the gap)
  • Prioritizes based on what's achievable, not just what's important
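As a rough illustration of weighing impact against feasibility, the sketch below ranks parameters with a made-up score. The formula (absolute correlation with yield divided by one plus the standardized gap to the top 10% of runs), the function name `prioritize`, and the column names are all assumptions. The dashboard's actual scoring is not documented here and may differ.

```python
import numpy as np
import pandas as pd

def prioritize(df, params, target="yield", top_frac=0.10):
    """Rank parameters by a hypothetical impact/feasibility score."""
    top = df[df[target] >= df[target].quantile(1 - top_frac)]
    rows = []
    for p in params:
        # Impact proxy: strength of the linear relationship with yield.
        impact = abs(df[p].corr(df[target]))
        # Gap: how far the average run sits from the best runs, in std units.
        gap = abs(top[p].mean() - df[p].mean()) / df[p].std()
        rows.append({"parameter": p, "impact": impact, "gap": gap,
                     "score": impact / (1.0 + gap)})
    ranked = pd.DataFrame(rows).sort_values("score", ascending=False)
    return ranked.reset_index(drop=True)

rng = np.random.default_rng(7)
n = 1000
data = pd.DataFrame({
    "catalyst": rng.uniform(0.5, 5, n),
    "flow_rate": rng.uniform(10, 100, n),
})
# Toy yield driven by catalyst only; flow_rate is pure noise here.
data["yield"] = 60 + 5 * data["catalyst"] + rng.normal(0, 1, n)

ranking = prioritize(data, ["catalyst", "flow_rate"])
print(ranking)
```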

⚙️ Configuration Options

Simulation Settings (in the sidebar):

  • Number of samples - How many runs to generate (default: 1000)
  • Process parameter ranges - Adjust temperature, pressure, etc.
  • Energy unit price - Base price for energy calculations

Outlier Detection (in the sidebar):

  • Outlier Yield % - Below this threshold, runs are considered outliers (default: 70%)
  • For synthetic data, the contamination rate for Isolation Forest is calculated automatically from this threshold
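One plausible way to link the yield threshold to Isolation Forest is to use the fraction of runs below the threshold as the `contamination` parameter. The sketch below assumes that approach. The helper name `contamination_from_threshold` and the toy data are hypothetical.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def contamination_from_threshold(yields, threshold=70.0):
    """Fraction of runs below the yield threshold, kept in a valid range."""
    frac = float(np.mean(np.asarray(yields) < threshold))
    # scikit-learn requires a float contamination in (0, 0.5].
    return min(max(frac, 0.001), 0.5)

rng = np.random.default_rng(0)
yields = rng.normal(80, 8, size=1000)         # toy yield values in percent
temperature = rng.normal(180, 10, size=1000)  # toy process parameter
features = np.column_stack([yields, temperature])

contamination = contamination_from_threshold(yields, threshold=70.0)
detector = IsolationForest(contamination=contamination, random_state=0)
labels = detector.fit_predict(features)       # -1 = outlier, 1 = inlier

print(f"contamination={contamination:.3f}, flagged={np.mean(labels == -1):.3f}")
```

With this setup the share of flagged runs tracks the share of runs below the threshold. The flagged runs are the ones the forest finds easiest to isolate, which need not be exactly the lowest-yield runs.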

🐛 Troubleshooting

"No suitable feature columns found"

  • Make sure the data generation completed. Check the sidebar settings.

"Yield column not found"

  • The preprocessing should create this automatically. Try refreshing the page or checking the logs.

Model predictions seem off

  • Check that you have enough samples (try increasing the number of samples in the sidebar)
  • The model is trained on the synthetic data, so predictions should match those patterns

Energy costs don't make sense

  • Energy costs depend on consumption and time-varying prices
  • Check that external data integration completed (you should see energy_price and ambient_temperature columns)
  • The costs vary by time of day and season, so they won't all be the same
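A quick way to run the column check above on a pandas DataFrame is sketched here. The helper name `missing_external_columns` and the sample table are hypothetical. The two column names are the ones mentioned above.

```python
import pandas as pd

REQUIRED_EXTERNAL_COLUMNS = ("energy_price", "ambient_temperature")

def missing_external_columns(df):
    """Return the external-data columns that are absent from df."""
    return [c for c in REQUIRED_EXTERNAL_COLUMNS if c not in df.columns]

# Example table where the ambient temperature column was never added.
runs = pd.DataFrame({"yield": [82.1, 67.4], "energy_price": [0.12, 0.18]})

missing = missing_external_columns(runs)
if missing:
    print(f"External data integration looks incomplete; missing: {missing}")
```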

What-if simulator shows different values than baseline

  • The sliders start at average values, so if you haven't moved them, it should show baseline
  • Make sure you click "Run Scenario Analysis" to see results

📝 Quick Notes

  • The app uses synthetic data generation - no CSV upload needed
  • Outlier detection is driven by a yield threshold that you can change in the sidebar
  • Energy costs use realistic time-varying prices
  • Strategic insights are calculated from the generated dataset
  • The what-if simulator starts at average values so you can compare

Version: 1.0

Last Updated: Nov 11th, 2025

About

End-to-end data analytics platform to identify performance inefficiencies and optimize yield across simulated manufacturing operations.
