Sensor-Fault-Detection

Designed and implemented a robust big data pipeline to store real-time streaming sensor readings in MongoDB using Confluent Kafka. Leveraged this data to build an end-to-end AI-powered application that helps reduce the unnecessary costs incurred while repairing heavy-duty trucks.

Problem Statement:

The Air Pressure System (APS) is a critical component of a heavy-duty vehicle that uses compressed air to force a piston to provide pressure to the brake pads, slowing the vehicle down. The benefits of using an APS instead of a hydraulic system are the easy availability and long-term sustainability of natural air.

This is a binary classification problem in which the positive class indicates that the failure was caused by a specific component of the APS, while the negative class indicates that the failure was caused by something else.

Solution Proposed:

In this project, the system in focus is the Air Pressure System (APS), which generates pressurized air that is utilized in various functions of a truck, such as braking and gear changes. The dataset's positive class corresponds to component failures for a specific component of the APS, while the negative class corresponds to trucks whose failures involve components not related to the APS.

The business goal is to reduce the cost incurred by unnecessary repairs, so false predictions must be minimized.
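In the widely used APS failure dataset, an unnecessary check (false positive) is far cheaper than a missed failure (false negative), commonly encoded as costs of 10 versus 500 units. Assuming that cost matrix (adjust the constants if this project uses different values), the metric to minimize can be sketched as:

# Sketch of a cost-based evaluation metric; the 10/500 costs follow the
# common APS-failure challenge convention and are an assumption here.
from sklearn.metrics import confusion_matrix

COST_FP = 10    # unnecessary repair check on a healthy APS
COST_FN = 500   # missed APS failure leading to a breakdown

def total_cost(y_true, y_pred):
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return COST_FP * fp + COST_FN * fn

# Example: 3 unnecessary repairs and 1 missed failure -> 3*10 + 1*500 = 530
print(total_cost([0, 0, 0, 1, 1], [1, 1, 1, 0, 1]))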

Tech Stack Used:

  1. Python
  2. FastAPI
  3. Machine learning algorithms
  4. Docker
  5. MongoDB

Infrastructure Required:

  1. AWS S3
  2. AWS EC2
  3. AWS ECR
  4. GitHub Actions
  5. Terraform

How to run?

Before running the project, make sure MongoDB is installed on your local system (along with MongoDB Compass), since MongoDB is used for data storage. You also need an AWS account to access services such as S3, ECR, and EC2.
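To confirm MongoDB is reachable before starting, a quick sanity check can be run from Python (a minimal sketch, assuming pymongo is installed; replace the connection string with your own MONGODB_URL):

# Optional MongoDB connectivity check; the local connection string is a placeholder.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
print(client.server_info()["version"])   # raises an exception if MongoDB is unreachable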

Data Collections:

Source Code:

https://github.com/seemanshu-shukla/sensor-fault-detection-big-data-pipeline

[Data collection pipeline diagram]
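The big data pipeline linked above streams sensor readings from Confluent Kafka into MongoDB. A minimal sketch of that consume-and-store loop is shown here; the topic, database, and collection names are illustrative and not taken from the linked repository.

# Illustrative Kafka-to-MongoDB sink; topic, database, and collection names are assumptions.
import json

from confluent_kafka import Consumer
from pymongo import MongoClient

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "sensor-sink",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["sensor-readings"])  # hypothetical topic name

collection = MongoClient("mongodb://localhost:27017")["sensor_db"]["aps_readings"]

try:
    while True:
        msg = consumer.poll(1.0)                        # wait up to 1 second for a message
        if msg is None or msg.error():
            continue
        collection.insert_one(json.loads(msg.value()))  # persist one sensor record
finally:
    consumer.close()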

Project Architecture:

[Project architecture diagram]

Deployment Architecture:

[Deployment architecture diagram]

Set Up the Project Locally:

Step 1: Clone the repository

git clone https://github.com/seemanshu-shukla/sensor-fault-detection.git

Step 2: Create and activate a conda environment inside the cloned repository

conda create -n sensor python=3.7.6 -y
conda activate sensor

Step 3: Install the requirements

pip install -r requirements.txt

Step 4: Create a .env file and set the following secrets

AWS_ACCESS_KEY_ID=<AWS_ACCESS_KEY_ID>

AWS_SECRET_ACCESS_KEY=<AWS_SECRET_ACCESS_KEY>

AWS_DEFAULT_REGION=<AWS_DEFAULT_REGION>

MONGODB_URL="<Your MongoDB connection key>"
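To confirm that the secrets are picked up, they can be loaded and inspected with python-dotenv (an assumption; the project may read the environment differently):

# Quick check that the .env secrets are readable; python-dotenv is an assumption here.
import os

from dotenv import load_dotenv

load_dotenv()                             # reads the .env file in the current directory
print(bool(os.getenv("MONGODB_URL")))     # True if the MongoDB URL is set
print(os.getenv("AWS_DEFAULT_REGION"))    # e.g. the region you configured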

Step 5: Run the application server

python main.py

Step 6: Train the model via the training route

http://localhost:8080/train

Step 7: Get predictions via the prediction route

http://localhost:8080/predict
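Once the server is up, the two routes can also be exercised from code. The snippet below is only an illustration; the actual HTTP methods and payloads depend on the routes defined in main.py.

# Hypothetical client calls; adjust the HTTP method/payload to match main.py.
import requests

train_resp = requests.get("http://localhost:8080/train")    # triggers model training
print(train_resp.status_code)

pred_resp = requests.get("http://localhost:8080/predict")   # triggers batch prediction
print(pred_resp.status_code)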

Running the project locally using Docker:

  1. Check that the Dockerfile is available in the project directory

  2. Build the Docker image

docker build -t <IMAGE_NAME> --build-arg AWS_ACCESS_KEY_ID=<AWS_ACCESS_KEY_ID> --build-arg AWS_SECRET_ACCESS_KEY=<AWS_SECRET_ACCESS_KEY> --build-arg AWS_DEFAULT_REGION=<AWS_DEFAULT_REGION> --build-arg MONGODB_URL=<MONGODB_URL> .

  3. Run the Docker image

docker run -d -p 8080:8080 <IMAGE_NAME>
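Note that build arguments end up baked into the image layers. If the application reads these values from the environment at runtime (an assumption about how the code is configured), the same secrets can instead be supplied when starting the container:

docker run -d -p 8080:8080 --env-file .env <IMAGE_NAME>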
