Sensor Data Classification

1. Understanding the Dataset

The dataset consists of time-series sensor data, where each row holds the sensor readings recorded at a given timestamp. The goal is to classify each signal as "Pass" or "Fail" based on its sensor values.

Dataset Structure

Time: Timestamp of the recorded signal.
Feature Columns (0-589): Sensor readings capturing various measurements.
Pass/Fail: Target variable indicating whether the signal was classified as a pass (1) or fail (-1).
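
As a rough illustration of this layout, the sketch below separates the sensor columns from the timestamp and the target. The helper name split_features_target and the tiny synthetic frame are assumptions made for the example (three sensors instead of 590); the column names and label encoding follow the list above.

```python
import numpy as np
import pandas as pd


def split_features_target(df, time_col="Time", target_col="Pass/Fail"):
    """Return (X, y): sensor columns only, and the Pass/Fail labels."""
    X = df.drop(columns=[time_col, target_col])
    y = df[target_col]
    return X, y


# Synthetic stand-in with the same column layout as described above.
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(size=(6, 3)), columns=["0", "1", "2"])
demo.insert(0, "Time", pd.date_range("2008-01-01", periods=6, freq="D"))
demo["Pass/Fail"] = [1, 1, -1, 1, -1, 1]

X, y = split_features_target(demo)
print(X.shape)            # (6, 3)
print(y.value_counts())   # how many rows carry each label
```

With the real data, the frame would presumably come from pd.read_csv("signal-data.csv") instead of the synthetic demo.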

2. Data Preprocessing

To prepare the dataset for analysis and model training, the following steps are taken:

A. Handling Missing Values

Check for missing (NaN) values in the sensor readings.
Strategy: Impute missing values with the column mean or median, or remove rows that are too sparse.
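
One possible way to implement this strategy is sketched below with scikit-learn's SimpleImputer. The function name clean_missing and the 50% sparsity cutoff are assumptions chosen for illustration, not values taken from the project.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer


def clean_missing(X, max_row_nan_frac=0.5, strategy="median"):
    """Drop rows that are mostly empty, then impute the remaining NaNs."""
    # Keep rows whose fraction of missing sensors is at most the cutoff.
    keep = X.isna().mean(axis=1) <= max_row_nan_frac
    X_kept = X.loc[keep]
    # Columns with no observed value cannot be imputed, so drop them first.
    X_kept = X_kept.dropna(axis=1, how="all")
    imputer = SimpleImputer(strategy=strategy)
    filled = imputer.fit_transform(X_kept)
    return pd.DataFrame(filled, columns=X_kept.columns, index=X_kept.index)


X = pd.DataFrame({
    "0": [1.0, np.nan, 3.0, np.nan],
    "1": [10.0, 20.0, np.nan, np.nan],
    "2": [5.0, 5.0, 5.0, np.nan],
})
X_filled = clean_missing(X)
print(X_filled)  # last row dropped; NaNs replaced by column medians
```

If rows are dropped, the labels need the same filter, for example y.loc[X_filled.index]. To avoid leaking test-set statistics, the imputer would normally be fitted on the training split only.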

B. Feature Scaling

Sensor readings have different ranges, so standardization (StandardScaler) or min-max scaling (MinMaxScaler) is applied to bring them onto a comparable scale.
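
The two scaling options can be compared on a toy array, as in the sketch below. The numbers are made up for the example; fitting each scaler on the training rows only and reusing it for the test rows is a common precaution assumed here, not something the project states.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Two "sensors" with very different ranges.
X_train = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
X_test = np.array([[2.0, 400.0]])

# Standardization: zero mean and unit variance per column.
std = StandardScaler().fit(X_train)
X_train_std = std.transform(X_train)
X_test_std = std.transform(X_test)

# Min-max scaling: training values mapped to [0, 1] per column.
mm = MinMaxScaler().fit(X_train)
X_train_mm = mm.transform(X_train)
X_test_mm = mm.transform(X_test)

print(X_train_std.mean(axis=0))  # approximately [0, 0]
print(X_test_mm)                 # [[0.5, 1.5]]: test values may fall outside [0, 1]
```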

C. Train-Test Split

The dataset is split into training (80%) and testing (20%) subsets using train_test_split().
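
A minimal sketch of the 80/20 split follows. The synthetic data, the stratify=y argument (which keeps the Pass/Fail ratio the same in both subsets), and the fixed random_state are assumptions added for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data: 100 rows, 5 sensors, 10 rows labelled -1.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 5))
y = np.array([1] * 90 + [-1] * 10)

X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,     # 20% of the rows go to the test subset
    stratify=y,        # preserve the label proportions in both subsets
    random_state=42,   # reproducible shuffle
)

print(X_train.shape, X_test.shape)  # (80, 5) (20, 5)
```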

3. Exploratory Data Analysis (EDA)

A. Visualizing Sensor Distributions

Histograms: Show how individual sensor readings are distributed.
Boxplots: Identify outliers in the sensor data.
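
The plots above could be produced along the lines of this sketch, which uses matplotlib directly. The helper names plot_sensor and iqr_outlier_count are hypothetical; the 1.5 x IQR rule in the second helper mirrors the default whisker length of a matplotlib boxplot.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; not needed in a notebook
import matplotlib.pyplot as plt
import pandas as pd


def plot_sensor(df, column):
    """Draw a histogram and a boxplot for one sensor column."""
    values = df[column].dropna().to_numpy()
    fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(8, 3))
    ax_hist.hist(values, bins=30)
    ax_hist.set_title(f"Sensor {column}: histogram")
    ax_box.boxplot(values)
    ax_box.set_title(f"Sensor {column}: boxplot")
    fig.tight_layout()
    return fig


def iqr_outlier_count(series):
    """Count values beyond 1.5 * IQR from the quartiles."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    mask = (series < q1 - 1.5 * iqr) | (series > q3 + 1.5 * iqr)
    return int(mask.sum())


demo = pd.DataFrame({"0": [1, 2, 3, 4, 5, 6, 7, 8, 9, 100.0]})
fig = plot_sensor(demo, "0")
n_outliers = iqr_outlier_count(demo["0"])
print(n_outliers)  # 1 (the value 100.0)
```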

B. Correlation Analysis

Heatmap (Seaborn): Identify strongly correlated sensor features.
Feature Reduction: Remove redundant sensors that provide duplicate information.
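
One way to act on the correlation matrix is sketched below: for every pair of sensors whose absolute correlation exceeds a threshold, the later column is dropped. The function name drop_correlated and the 0.95 threshold are assumptions for illustration.

```python
import numpy as np
import pandas as pd


def drop_correlated(X, threshold=0.95):
    """Drop the later column of each pair with |correlation| > threshold."""
    corr = X.corr().abs()
    # Keep only the strict upper triangle so each pair is checked once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return X.drop(columns=to_drop), to_drop


# Synthetic sensors: "1" is almost a scaled copy of "0"; "2" is independent.
rng = np.random.default_rng(1)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = pd.DataFrame({"0": a, "1": 2 * a + 0.01 * rng.normal(size=200), "2": b})

X_reduced, dropped = drop_correlated(X)
print(dropped)  # ['1']
```

The same matrix, X.corr(), can be passed to seaborn.heatmap to render the heatmap mentioned above.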

4. Feature Selection & Engineering

Select the most important sensor readings that impact the Pass/Fail classification.
Techniques such as Principal Component Analysis (PCA) for dimensionality reduction, or model-based feature importance (Random Forest/Gradient Boosting) for ranking sensors, may be used.
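
Both ideas can be sketched on toy data, as below. The synthetic data (where only sensor 3 drives the label), the 95% explained-variance target, and the hyperparameters are assumptions made for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

# Toy data: 10 sensors, label depends only on sensor 3.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
y = np.where(X[:, 3] > 0, 1, -1)

# PCA: keep as many components as needed to explain 95% of the variance.
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=0.95)
X_pca = pca.fit_transform(X_scaled)
print(X_pca.shape)

# Feature importance: rank sensors by a Random Forest's impurity-based scores.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
ranking = np.argsort(forest.feature_importances_)[::-1]
top_k = ranking[:3]
print(top_k)  # sensor 3 is expected to rank first on this toy data
```

PCA produces new combined features, while importance ranking keeps the original sensor columns, which makes it easier to point at specific sensors afterwards.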

5. Model Training & Evaluation

Multiple classification models are trained and evaluated:

A. Models Used

Logistic Regression (Baseline)
Decision Tree Classifier
Random Forest Classifier
Gradient Boosting Classifier
Naïve Bayes Classifier
Support Vector Machine (SVM)
Stacking Classifier (Combining multiple models)
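
The model list above could be wired together roughly as in this sketch. All hyperparameters, the choice of GaussianNB as the Naive Bayes variant, the Logistic Regression meta-learner in the stack, and the synthetic data from make_classification are assumptions; the project's actual configuration may differ.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the sensor data.
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scale-sensitive models are wrapped in a pipeline with a scaler.
base_models = {
    "logreg": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "tree": DecisionTreeClassifier(random_state=0),
    "forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gboost": GradientBoostingClassifier(random_state=0),
    "nb": GaussianNB(),
    "svm": make_pipeline(StandardScaler(), SVC(random_state=0)),
}

# The stack feeds cross-validated base-model outputs to a meta-learner.
stack = StackingClassifier(
    estimators=list(base_models.items()),
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

scores = {}
for name, model in {**base_models, "stacking": stack}.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # test-set accuracy

for name, acc in scores.items():
    print(f"{name:10s} {acc:.3f}")
```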

B. Performance Metrics

Accuracy Score
Precision, Recall, F1-score
Confusion Matrix for classification performance.
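
These metrics can be computed with sklearn.metrics, as in the sketch below. The label vectors are made up; treating -1 (Fail, per the encoding given in the dataset description) as the class of interest for precision and recall is an assumption of the example.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Hand-made labels: 1 = Pass, -1 = Fail.
y_true = [1, 1, 1, 1, 1, 1, -1, -1, -1, -1]
y_pred = [1, 1, 1, 1, 1, -1, -1, -1, 1, 1]

acc = accuracy_score(y_true, y_pred)
# Rows are true labels, columns are predictions, both ordered [-1, 1].
cm = confusion_matrix(y_true, y_pred, labels=[-1, 1])
precision = precision_score(y_true, y_pred, pos_label=-1)
recall = recall_score(y_true, y_pred, pos_label=-1)
f1 = f1_score(y_true, y_pred, pos_label=-1)

print(acc)        # 0.7
print(cm)         # [[2 2]
                  #  [1 5]]
print(precision)  # 2/3: of 3 predicted failures, 2 were real
print(recall)     # 0.5: of 4 real failures, 2 were caught
```

In this toy case accuracy is 70% while only half of the failures are detected, which is why the per-class metrics are reported alongside accuracy.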

6. Insights & Business Impact

What sensor readings indicate a failure?
- Certain sensor values may be critical in identifying defective signals.
How accurate is the failure prediction?
- If models achieve high accuracy (>85%) on the held-out test set, they can be considered for deployment in real-world signal monitoring.
How can this help businesses?
- Automated detection of failing signals reduces manual inspections and improves production efficiency.

