# Rule-Based Sensor Data Validation

### 🎯 Objective:
Apply domain-specific rules to detect invalid sensor readings using simple thresholds and physics-informed logic.

---

### 📌 Rules Defined:

| Feature        | Validation Rule            | Justification                          |
|----------------|----------------------------|----------------------------------------|
| Pressure_In    | 2.0 bar < value < 6.0 bar  | Based on compressor operating range    |
| Temperature_In | -20 °C < value < 100 °C    | Operating bounds for inlet air         |
| Flow_Rate      | value > 0                  | Flow must be positive                  |
| Efficiency     | 0.75 < value < 0.95        | Mechanical efficiency expected range   |
| Vibration      | value < 5 mm/s             | High values indicate mechanical fault  |


### Function for Applying Rules

In [None]:
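import pandas as pd

# Return a boolean Series that is True where a row satisfies every rule in the table
def apply_validation_rules(df):
    valid = pd.Series(True, index=df.index)

    # The table states strict inequalities, so the bounds themselves are excluded
    # (inclusive="neither" requires pandas >= 1.3). NaN readings fail every check.
    valid &= df["Pressure_In"].between(2.0, 6.0, inclusive="neither")
    valid &= df["Temperature_In"].between(-20, 100, inclusive="neither")
    valid &= df["Flow_Rate"] > 0
    valid &= df["Efficiency"].between(0.75, 0.95, inclusive="neither")
    valid &= df["Vibration"] < 5

    return valid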
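
Because the rules table states strict inequalities, it is worth checking how `Series.between` treats values that sit exactly on a limit. The sketch below uses made-up pressure readings (not taken from the dataset) to compare the default inclusive behaviour with `inclusive="neither"`, which assumes pandas 1.3 or later.

```python
import pandas as pd

# Hypothetical readings on and around the 2.0-6.0 bar limits, plus a missing value
pressure = pd.Series([2.0, 2.1, 6.0, 5.9, float("nan")])

# Default behaviour: both bounds count as valid
inclusive_mask = pressure.between(2.0, 6.0)

# Strict behaviour: readings equal to a bound are rejected
strict_mask = pressure.between(2.0, 6.0, inclusive="neither")

print(inclusive_mask.tolist())  # [True, True, True, True, False]
print(strict_mask.tolist())     # [False, True, False, True, False]
```

In both cases the missing reading is flagged as invalid, since comparisons against NaN evaluate to False.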

In [None]:
# Apply rules to real data (df is assumed to be loaded in an earlier cell)
valid_mask = apply_validation_rules(df)
invalid_rows = df[~valid_mask]

print(f"Number of invalid rows detected: {len(invalid_rows)} of {len(df)}")
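
A single combined mask says that a row is invalid but not which rule it broke. One possible way to get that breakdown is to keep one boolean column per rule, as in the sketch below. The helper name `rule_failures` and the three-row `sample` frame are hypothetical and only serve as illustration; the thresholds are the ones from the rules table, applied as strict inequalities.

```python
import pandas as pd

def rule_failures(df):
    """Return a boolean DataFrame with one column per rule, True where the rule is violated."""
    passes = pd.DataFrame({
        "Pressure_In": df["Pressure_In"].between(2.0, 6.0, inclusive="neither"),
        "Temperature_In": df["Temperature_In"].between(-20, 100, inclusive="neither"),
        "Flow_Rate": df["Flow_Rate"] > 0,
        "Efficiency": df["Efficiency"].between(0.75, 0.95, inclusive="neither"),
        "Vibration": df["Vibration"] < 5,
    })
    return ~passes

# Made-up readings: row 0 is clean, rows 1 and 2 each break two rules
sample = pd.DataFrame({
    "Pressure_In": [4.0, 7.2, 3.5],
    "Temperature_In": [25.0, 30.0, 120.0],
    "Flow_Rate": [12.0, 10.0, -1.0],
    "Efficiency": [0.85, 0.80, 0.90],
    "Vibration": [1.2, 6.5, 2.0],
})

failures = rule_failures(sample)
print(failures.sum())              # violations per rule
print(failures.any(axis=1).sum())  # rows with at least one violation
```

Summing each column gives the number of violations per feature, which helps distinguish a single faulty sensor from a broader data quality problem.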

### Visualize

In [None]:
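# Visualize valid vs. invalid rows across all feature pairs
import matplotlib.pyplot as plt
import seaborn as sns

# Work on a copy so the original DataFrame keeps its columns unchanged
df_copy = df.copy()
df_copy["Validity"] = valid_mask.map({True: "Valid", False: "Invalid"})

sns.pairplot(df_copy, hue="Validity", diag_kind="kde", corner=True)
plt.suptitle("Validation Result: Feature Distribution", y=1.02)  # lift title above the grid
plt.show()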


### further changes...