Introduction

This project builds a classification model for energy consumption load type, using data collected from a smart small-scale steel plant in South Korea.
I took the data from the UCI Machine Learning Repository.

Methodology

In this work, we used the dataset from the following research paper:
@sathishkumar2023steel

The dataset is also provided in this repository as Steel_industry_data.csv and can be downloaded with:

wget 'https://raw.githubusercontent.com/Hokfu/Energy-Consumption-Model/main/Steel_industry_data.csv'
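Once the file is downloaded, a quick first check is how many rows fall into each load type. The sketch below uses only the standard library and runs on a small inline sample, not on the real file. The column names (Usage_kWh, NSM, Load_Type) and the label values are assumptions about the CSV layout, and the numbers are made up for illustration.

```python
import csv
import io
from collections import Counter

# Tiny inline sample standing in for Steel_industry_data.csv.
# Column names, label values, and numbers are assumptions for illustration.
SAMPLE_CSV = """date,Usage_kWh,NSM,Load_Type
01/01/2018 00:15,3.17,900,Light_Load
01/01/2018 09:15,52.3,33300,Medium_Load
01/01/2018 10:00,98.1,36000,Maximum_Load
01/01/2018 23:45,3.42,85500,Light_Load
"""


def load_rows(handle):
    """Read CSV rows from an open text file into a list of dicts."""
    return list(csv.DictReader(handle))


def count_classes(rows, target="Load_Type"):
    """Count how many rows belong to each target class."""
    return Counter(row[target] for row in rows)


rows = load_rows(io.StringIO(SAMPLE_CSV))
class_counts = count_classes(rows)
print(class_counts)

# For the real file, something like:
# with open("Steel_industry_data.csv", newline="") as f:
#     rows = load_rows(f)
```

A strongly imbalanced class count would be worth knowing before comparing models on accuracy alone.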

Problem Description

Apart from market competition, a steel company faces several challenges: increased energy costs, downtime, inefficient resource allocation, maintenance, and regulatory compliance.

Problem: If the company does not know which conditions lead to high energy consumption and which lead to low or medium energy loads, these challenges become serious problems.

Opportunity: Conversely, if the company can predict the energy consumption of a process in advance, it can address the challenges above and gain a market advantage.

EDA

First, I looked at the relationship between the numerical features and the target, which is the energy load type.
The violin plot of these relationships shows clearly that NSM has the strongest effect on load type. A violin plot or box plot shows the distribution of a numerical feature; in this case, I checked the distribution of each numerical feature for each load type.

The effect of NSM is even more obvious in the feature importances obtained when training the random forest model.
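The per-load-type distributions that a violin or box plot draws can also be checked numerically. The following is a minimal sketch using only the standard library: it groups one numerical feature by load type and reports the minimum, median, and maximum of each group. The (NSM, load type) pairs and the label names are hypothetical values chosen for illustration, not rows from the dataset.

```python
import statistics
from collections import defaultdict

# Hypothetical (NSM, load type) pairs; not taken from the real dataset.
records = [
    (900, "Light_Load"), (3600, "Light_Load"), (82800, "Light_Load"),
    (30600, "Medium_Load"), (34200, "Medium_Load"), (61200, "Medium_Load"),
    (36000, "Maximum_Load"), (43200, "Maximum_Load"), (54000, "Maximum_Load"),
]


def summarize_by_class(pairs):
    """Group values by label and compute min, median, and max per group."""
    groups = defaultdict(list)
    for value, label in pairs:
        groups[label].append(value)
    return {
        label: {
            "min": min(values),
            "median": statistics.median(values),
            "max": max(values),
        }
        for label, values in groups.items()
    }


summary = summarize_by_class(records)
for label, stats in summary.items():
    print(label, stats)
```

If the medians and ranges of a feature differ strongly between load types, that feature is likely to be useful for the classifier.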

Model Training

I trained two models: logistic regression and random forest. Overall, the random forest model performed better, so I chose it as the final model.

Parameter Tuning

The maximum depth and the minimum number of samples per leaf are tuned in a loop to find the best values.
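The tuning loop can be sketched as below. This is a generic illustration, not the code in train.py: it searches both parameters jointly, whereas the project may have tuned them one at a time. The function toy_score is a hypothetical placeholder; in practice it would train a random forest with the given maximum depth and minimum samples per leaf and return a validation score. The candidate values are also assumptions.

```python
from itertools import product


def grid_search(score_fn, max_depths, min_samples_leaves):
    """Try every parameter combination and keep the highest-scoring one."""
    best_params, best_score = None, float("-inf")
    for depth, leaf in product(max_depths, min_samples_leaves):
        score = score_fn(depth, leaf)
        if score > best_score:
            best_params, best_score = (depth, leaf), score
    return best_params, best_score


def toy_score(max_depth, min_samples_leaf):
    """Hypothetical stand-in for 'train a model and return validation score'.

    It is constructed to peak at max_depth=10 and min_samples_leaf=3.
    """
    return 1.0 - 0.01 * abs(max_depth - 10) - 0.02 * abs(min_samples_leaf - 3)


best_params, best_score = grid_search(
    toy_score,
    max_depths=[5, 10, 15, 20],
    min_samples_leaves=[1, 3, 5, 10],
)
print(best_params, best_score)
```

Passing the scoring function as an argument keeps the loop independent of the model, so the same search works for any pair of parameters.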

Dependency and Environment Management

For the notebook and model training (train.py)
Use conda or any other environment manager. To create a conda environment:

conda create -n 'environment-name' python=3.9.18

Activate the conda environment:

conda activate 'environment-name'

Then install the requirements:

pip install -r requirements.txt

For model prediction
Use pipenv to install the dependencies:

pipenv install numpy scikit-learn==1.3.0 gunicorn flask

Containerization

To build the container image:

docker build -t <container_name> .

To run the container:

docker run -it --rm -p 9696:9696 <container_name>

Then, in another terminal, run predict_test.py to check the model.
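A check of this kind typically sends one sample to the running service and prints the response. The sketch below shows the idea with the standard library only and is not the contents of predict_test.py. The /predict endpoint path, the JSON request and response format, and the feature names and values in sample are all assumptions; only the port 9696 comes from the docker run command above.

```python
import json
import urllib.request

# Assumed endpoint; only the port 9696 comes from the docker run command.
URL = "http://localhost:9696/predict"

# Hypothetical feature values for one observation.
sample = {"NSM": 36000, "Usage_kWh": 98.1, "WeekStatus": "Weekday"}


def build_request(url, payload):
    """Build a JSON POST request without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10):
    """Send the request and decode the JSON response (needs a running service)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_request(URL, sample)
print(request.get_method(), request.full_url)

# With the container running, the prediction could be fetched with:
# print(send(request))
```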

Deployment

On Render, create an account, create a new web service, and deploy the container.
