# Chisa Workflow Demo: Legacy Sensor Modernization
In this notebook, we simulate a real-world Data Engineering workflow. 

**The Scenario:** We have received a dataset from an old chemical plant. The sensors report in legacy units (Fahrenheit, PSI, and Gallons per Minute). Our Machine Learning models and physical simulations strictly require standard SI units (Celsius, Pascals, and Cubic Meters per Second).

Instead of relying on magic numbers or messy conversions, we will use **Chisa** to:
1. Define a custom physical dimension dynamically.
2. Normalize the Pandas DataFrame safely.
3. Validate physics using dimensional algebra.
4. Visualize the clean data.

In [None]:
# Setup the environment
try:
    import chisa
    print(f"Chisa version {chisa.__version__} is ready to go!")
except ImportError:
    print("Installing Chisa...")
    %pip install chisa
    import chisa
    print("Chisa installed successfully!")

In [None]:
# Step 1: Import the ecosystem
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Import Chisa components
from chisa import convert, axiom, BaseUnit
from chisa.units.time import Second, Hour
from chisa.units.volume import CubicMeter, USGallon

## Step 2: Extending the Domain (Custom Dimensions)
Chisa does not ship a "Volumetric Flow Rate" dimension out of the box. But because Chisa is a framework, we can define one ourselves in a few lines.

Flow Rate is simply **Volume divided by Time** ($m^3 / s$). Let's define the physics using `@axiom.derive` so Chisa understands how to calculate it automatically.
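
As a sanity check on the definitions that follow, the expected GPM to m³/s factor can be worked out by hand. The sketch below is plain Python and does not touch Chisa; the constant names are illustrative. It relies only on the exact definition of the US liquid gallon (3.785411784 litres).

```python
# Hand-derived conversion factor, independent of Chisa.
# 1 US liquid gallon is defined as exactly 3.785411784 litres.
US_GALLON_M3 = 3.785411784e-3   # cubic meters per US gallon
SECONDS_PER_MINUTE = 60.0

# Flow rate = volume / time, so 1 GPM = (1 gallon) / (60 s).
GPM_TO_M3S = US_GALLON_M3 / SECONDS_PER_MINUTE

print(f"1 GPM    = {GPM_TO_M3S:.6e} m³/s")
print(f"1500 GPM = {1500 * GPM_TO_M3S:.5f} m³/s")
```

Any unit definition for gallons per minute should reproduce this factor, whichever library performs the conversion.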

In [None]:
@axiom.bound(min_val=0, msg="Flow rate cannot be negative in this pipeline!")
class FlowRateUnit(BaseUnit):
    dimension = "flow_rate"

# The SI Base Unit: Cubic Meters per Second
@axiom.derive(mul=[CubicMeter], div=[Second])
class CubicMeterPerSecond(FlowRateUnit):
    symbol = "m³/s"
    aliases = ["m3/s", "cms"]

# The Legacy Unit: US Gallons per Minute (the 60.0 divisor turns "per second" into "per minute")
@axiom.derive(mul=[USGallon], div=[60.0, Second])
class GallonPerMinute(FlowRateUnit):
    symbol = "GPM"
    aliases = ["gpm", "gallons per minute"]

print("Custom Domain 'Flow Rate' successfully injected into the Chisa Engine!")

## Step 3: Data Ingestion (Pandas)
Let's load the dirty, legacy data from our sensors. In a real scenario, this would be `pd.read_csv('sensors.csv')`.
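
For the real-file case, a minimal loading sketch might look like the following. It uses an in-memory string as a stand-in for `sensors.csv`, and the helper name `load_sensor_csv` and the `EXPECTED_COLUMNS` list are hypothetical, chosen to mirror the simulated columns in this notebook.

```python
import io

import pandas as pd

# Stand-in for a file on disk; the column names mirror the simulated data.
csv_text = """time_minutes,reactor_temp_f,valve_pressure_psi,pump_flow_gpm
0,212.0,14.7,0.0
60,250.5,50.0,500.0
120,300.2,120.5,1500.0
"""

EXPECTED_COLUMNS = [
    "time_minutes",
    "reactor_temp_f",
    "valve_pressure_psi",
    "pump_flow_gpm",
]


def load_sensor_csv(source):
    """Read a sensor CSV and fail early on missing columns or bad values."""
    df = pd.read_csv(source)
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    # Coerce every column to numeric; unparseable cells raise immediately.
    df = df[EXPECTED_COLUMNS].apply(pd.to_numeric)
    # Flow rates below zero are physically meaningless in this pipeline.
    if (df["pump_flow_gpm"] < 0).any():
        raise ValueError("Negative flow rate found in pump_flow_gpm")
    return df


loaded_df = load_sensor_csv(io.StringIO(csv_text))
print(loaded_df.dtypes)
```

Passing a file path instead of the `io.StringIO` object works the same way, since `pd.read_csv` accepts both.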

In [None]:
# Simulating five hourly sensor readings (t = 0, 60, 120, 180, 240 minutes)
timestamps = np.arange(0, 300, 60)  # One reading every 60 minutes

dirty_df = pd.DataFrame({
    'time_minutes': timestamps,
    'reactor_temp_f': [212.0, 250.5, 300.2, 315.0, 310.0],
    'valve_pressure_psi': [14.7, 50.0, 120.5, 125.0, 118.0],
    'pump_flow_gpm': [0.0, 500.0, 1500.0, 1450.0, 1480.0]
})

print("--- Legacy Sensor Data ---")
print(dirty_df)

## Step 4: High-Performance Normalization
We will use Chisa's **Fluent API** to clean the data. By chaining `.use(format='raw')`, we ask Chisa to process each column as a whole NumPy array and return a standard `float64` array, which can be assigned directly to a Pandas column without creating a bottleneck.

In [None]:
clean_df = pd.DataFrame({'time_minutes': dirty_df['time_minutes']})

# Normalizing Temperature
clean_df['temp_c'] = convert(dirty_df['reactor_temp_f'].to_numpy(), 'F').to('C').use(format='raw').resolve()

# Normalizing Pressure
clean_df['pressure_pa'] = convert(dirty_df['valve_pressure_psi'].to_numpy(), 'psi').to('Pa').use(format='raw').resolve()

# Normalizing our Custom Flow Rate!
clean_df['flow_m3s'] = convert(dirty_df['pump_flow_gpm'].to_numpy(), 'GPM').to('m³/s').use(format='raw').resolve()

print("--- Normalized SI Data (Ready for ML) ---")
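# Round per column: two decimals would flatten the small m³/s values to 0.03 and 0.09
print(clean_df.round({'temp_c': 2, 'pressure_pa': 0, 'flow_m3s': 4}))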
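
One way to gain confidence in the normalized columns is to recompute them with textbook conversion factors and compare. The sketch below is independent of Chisa and uses only NumPy; the constant and variable names are illustrative.

```python
import numpy as np

# Standard conversion factors (independent of Chisa).
PSI_TO_PA = 6894.757293168           # pascals per psi
GPM_TO_M3S = 3.785411784e-3 / 60.0   # m³/s per US gallon per minute

# Same legacy readings as in the simulated DataFrame.
temp_f = np.array([212.0, 250.5, 300.2, 315.0, 310.0])
pressure_psi = np.array([14.7, 50.0, 120.5, 125.0, 118.0])
flow_gpm = np.array([0.0, 500.0, 1500.0, 1450.0, 1480.0])

# Vectorized conversions: each line operates on the whole array at once.
expected_temp_c = (temp_f - 32.0) * 5.0 / 9.0
expected_pressure_pa = pressure_psi * PSI_TO_PA
expected_flow_m3s = flow_gpm * GPM_TO_M3S

print(expected_temp_c.round(2))
print(expected_pressure_pa.round(0))
print(expected_flow_m3s.round(4))
```

With `clean_df` in scope, a comparison such as `np.allclose(clean_df['temp_c'], expected_temp_c)` would flag any disagreement between the two routes.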

## Step 5: Validating Physics (Explicit OOP)
Machine Learning models don't know physics, but Chisa does. Suppose we want the total volume of chemicals that would be pumped if the peak rate (recorded at minute 120, row index 2) were sustained for 2 hours.

If we multiply Flow Rate by Time, we should get Volume.
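
To make the idea concrete, here is a toy illustration of dimensional algebra in plain Python. It is not how Chisa is implemented; the `Quantity` class and its fields are hypothetical and track only length and time exponents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """Toy quantity: a value in SI units plus (length, time) exponents."""
    value: float
    length_exp: int
    time_exp: int

    def __mul__(self, other):
        if not isinstance(other, Quantity):
            return NotImplemented
        # Multiplying quantities multiplies values and adds exponents.
        return Quantity(
            self.value * other.value,
            self.length_exp + other.length_exp,
            self.time_exp + other.time_exp,
        )


VOLUME_DIMS = (3, 0)  # m^3 * s^0

flow = Quantity(0.09, length_exp=3, time_exp=-1)        # m^3 / s
duration = Quantity(7200.0, length_exp=0, time_exp=1)   # s

result = flow * duration

# The time exponents cancel (-1 + 1 = 0), leaving a pure volume.
if (result.length_exp, result.time_exp) != VOLUME_DIMS:
    raise TypeError("Flow rate times time did not yield a volume")
print(result)
```

The same bookkeeping rejects nonsense such as treating flow rate times pressure as a volume, because the exponents would not match `VOLUME_DIMS`.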

In [None]:
# Get the peak flow rate as a physical object
peak_flow_rate = CubicMeterPerSecond(clean_df.loc[2, 'flow_m3s'])
duration = Hour(2.0)

# Chisa safely multiplies them, checking dimensional boundaries
# Flow Rate (m^3/s) * Time (s) = Volume (m^3)
total_volume = peak_flow_rate * duration.to(Second)

print(f"Peak Flow Rate : {peak_flow_rate.format(prec=4, tag=True)}")
print(f"Duration       : {duration.format(tag=True)}")
print("-" * 40)
print(f"Total Chemical Volume Pumped: {total_volume.to(CubicMeter).format(delim=True, prec=2, tag=True)}")
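
As an independent cross-check of the reported volume, the same figure can be computed with plain arithmetic from the legacy reading. This sketch does not use Chisa, and the variable names are illustrative.

```python
US_GALLON_M3 = 3.785411784e-3   # exact definition of the US liquid gallon in m³

peak_flow_gpm = 1500.0          # legacy reading at minute 120
duration_s = 2.0 * 3600.0       # 2 hours expressed in seconds

# Convert GPM to m³/s, then integrate a constant rate over the duration.
peak_flow_m3s = peak_flow_gpm * US_GALLON_M3 / 60.0
expected_volume_m3 = peak_flow_m3s * duration_s

print(f"Expected volume: {expected_volume_m3:,.2f} m³")
```

This works out to roughly 681.37 m³ (180,000 US gallons), so a result far from that value would point to a unit definition problem.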

## Step 6: Visualization
Finally, we plot our normalized, mathematically verified data using Matplotlib.

In [None]:
fig, ax1 = plt.subplots(figsize=(10, 5))

# Plot Temperature
ax1.plot(clean_df['time_minutes'], clean_df['temp_c'], color='tab:red', marker='o', label='Temp (°C)')
ax1.set_xlabel('Time (Minutes)')
ax1.set_ylabel('Temperature (°C)', color='tab:red')
ax1.tick_params(axis='y', labelcolor='tab:red')
ax1.grid(True, linestyle='--', alpha=0.5)

# Plot Flow Rate on a secondary axis
ax2 = ax1.twinx()
ax2.plot(clean_df['time_minutes'], clean_df['flow_m3s'], color='tab:blue', marker='s', label='Flow Rate (m³/s)')
ax2.set_ylabel('Flow Rate (m³/s)', color='tab:blue')
ax2.tick_params(axis='y', labelcolor='tab:blue')

plt.title('Reactor Startup Sequence (Normalized SI Units)')
fig.tight_layout()
plt.show()