# Feature engineering

In [12]:
import pandas as pd
import numpy as np

In [13]:
# Load the sensor data, parse the timestamps, and index the rows by time
df = pd.read_csv("synthetic_compost_data.csv")
df['time_stamp'] = pd.to_datetime(df['time_stamp'])
df.set_index('time_stamp', inplace=True)

### Cooling Duration Estimation Logic

In this project, we estimate the cooling duration required to bring compost temperatures back to a safe range using air pumps. The system is divided into two compartments, each with its own temperature sensors and air pump.

#### Why Estimate Cooling Time?

We do not measure how long each compartment takes to cool. Instead, we **simulate a realistic cooling time** based on:
- The **current maximum temperature** in the compartment
- The assumption that **higher temperatures require longer cooling**
- Known composting behavior from literature and lab-scale trials

#### Cooling Time Strategy

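We use the function `estimate_cooling_duration(temp)` to return a simulated cooling duration in minutes. The function is **piecewise**: each temperature band maps to a fixed base duration, and temperatures at or below 65°C need no cooling.

The base values (5, 10, 15, 20 minutes) are assigned per band as follows. Each band includes its upper bound, matching the `<=` comparisons in the code: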

- **65–68°C → 5 minutes**  
  - Slightly above threshold (65°C)  
  - One or two quick air pulses likely sufficient  
  - Minimal microbial risk if corrected promptly

- **68–72°C → 10 minutes**  
  - Moderately high temperature  
  - Likely requires sustained airflow to bring temperature below safe levels  
  - Microbial activity may still be viable, but risk increases with time

- **72–75°C → 15 minutes**  
  - Borderline hazardous  
  - Thermophilic microbes start to die  
  - Requires aggressive cooling for multiple minutes

- **> 75°C → 20 minutes**  
  - Dangerously high  
  - Rapid thermal damage to microbial population  
  - Extended air pump activation assumed necessary

These values are based on experimental studies and compost engineering practices (e.g., Haug 1993, Cornell Waste Management Institute), where active aeration systems have been observed to reduce compost temperatures by **1–2°C per minute** under optimal airflow.
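
As a rough cross-check of the base values against the quoted cooling rate, the sketch below computes how long it would take to fall from the upper edge of each band to the 65°C threshold. It assumes linear cooling at a constant rate, which is a simplification introduced here for illustration; `minutes_to_safe` and the constants are hypothetical names, not part of the project code.

```python
# Back-of-envelope check, assuming (hypothetically) linear cooling at a constant rate.
SAFE_TEMP_C = 65.0
COOLING_RATES_C_PER_MIN = (1.0, 2.0)  # the 1–2°C per minute range quoted in the text


def minutes_to_safe(start_temp_c, rate_c_per_min, safe_temp_c=SAFE_TEMP_C):
    """Minutes needed to fall from start_temp_c to safe_temp_c at a constant rate."""
    return max(start_temp_c - safe_temp_c, 0.0) / rate_c_per_min


for start in (68.0, 72.0, 75.0):
    slow = minutes_to_safe(start, COOLING_RATES_C_PER_MIN[0])
    fast = minutes_to_safe(start, COOLING_RATES_C_PER_MIN[1])
    print(f"{start:.0f}°C: {fast:.1f} to {slow:.1f} min")
# 68°C: 1.5 to 3.0 min
# 72°C: 3.5 to 7.0 min
# 75°C: 5.0 to 10.0 min
```

Under this idealized assumption, the base values of 5, 10, and 15 minutes sit at or above the computed figures, which leaves some margin for less-than-optimal airflow.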

#### Adding Random Variation

To make the model more realistic, we introduce **random noise** using `np.random.uniform()`:
- This simulates unpredictability in heat transfer (e.g., due to moisture, clumping, compost density, airflow path)
- Each temperature band has its own noise range: ±1 minute for 65–68°C, ±2 minutes for 68–72°C, −3 to +2 minutes for 72–75°C, and ±3 minutes above 75°C
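
Because the noise is drawn without a seed, each run of the notebook produces different durations. One possible way to make the draws repeatable is NumPy's `Generator` API, sketched below. The seed value, the `noise_bounds` table, and the `noisy_duration` helper are illustrative assumptions; the bounds themselves are copied from `estimate_cooling_duration`.

```python
import numpy as np

# (low, high) noise bounds in minutes, keyed by base duration.
# Values copied from estimate_cooling_duration; the dict itself is hypothetical.
noise_bounds = {5: (-1, 1), 10: (-2, 2), 15: (-3, 2), 20: (-3, 3)}


def noisy_duration(base_minutes, rng):
    """Add uniform noise to a base duration using the supplied generator."""
    low, high = noise_bounds[base_minutes]
    return base_minutes + rng.uniform(low, high)


# The seed value is arbitrary; any fixed integer gives repeatable draws.
rng = np.random.default_rng(seed=42)
sample = [noisy_duration(base, rng) for base in (5, 10, 15, 20)]

# A second generator with the same seed reproduces the same sequence.
rng_again = np.random.default_rng(seed=42)
sample_again = [noisy_duration(base, rng_again) for base in (5, 10, 15, 20)]
```

Passing the generator in as an argument, rather than relying on the global `np.random` state, keeps the simulated durations independent of any other random calls in the notebook.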

#### Final Output

The function returns a **floating-point number** representing the simulated cooling duration (in minutes), or `NaN` if no cooling is needed (maximum temperature at or below 65°C).


In [14]:
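# Define logic-based cooling duration estimation
def estimate_cooling_duration(temp):
    """Return a simulated cooling duration in minutes, or NaN if none is needed."""
    # A missing reading fails every comparison below and would otherwise
    # fall through to the hottest band, so return NaN for it as well.
    if pd.isna(temp) or temp <= 65:
        return np.nan
    elif temp <= 68:
        return 5 + np.random.uniform(-1, 1)    # 65–68°C: 4 to 6 min
    elif temp <= 72:
        return 10 + np.random.uniform(-2, 2)   # 68–72°C: 8 to 12 min
    elif temp <= 75:
        return 15 + np.random.uniform(-3, 2)   # 72–75°C: 12 to 17 min
    else:
        return 20 + np.random.uniform(-3, 3)   # above 75°C: 17 to 23 min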
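
For larger datasets, the same banding could be applied to a whole column at once. The following is a minimal sketch using `np.select`, not the notebook's method: the function name `estimate_cooling_duration_vectorized` and the optional `rng` parameter are assumptions for illustration, and missing temperatures are mapped to `NaN` here by choice.

```python
import numpy as np
import pandas as pd


def estimate_cooling_duration_vectorized(temps, rng=None):
    """Vectorized sketch of the piecewise banding (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    temps = pd.Series(temps, dtype="float64")
    t = temps.to_numpy()
    n = len(t)

    # np.select takes the first condition that is True for each element,
    # so the bands are listed from coolest to hottest.
    conditions = [t <= 65, t <= 68, t <= 72, t <= 75, t > 75]
    choices = [
        np.full(n, np.nan),           # no cooling needed
        5 + rng.uniform(-1, 1, n),    # 65–68°C
        10 + rng.uniform(-2, 2, n),   # 68–72°C
        15 + rng.uniform(-3, 2, n),   # 72–75°C
        20 + rng.uniform(-3, 3, n),   # above 75°C
    ]
    # NaN temperatures match no condition and take the default (NaN).
    durations = np.select(conditions, choices, default=np.nan)
    return pd.Series(durations, index=temps.index)


example = estimate_cooling_duration_vectorized(
    [60.0, 66.0, 70.0, 74.0, 80.0, float("nan")],
    rng=np.random.default_rng(0),
)
```

Noise is drawn for every band and every row, then discarded where the band does not apply. That costs a few extra random draws but avoids a Python-level loop.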

In [15]:
# Compute max temp per compartment
df['max_temp_compartment_A'] = df[['temperature_active1', 'temperature_active2']].max(axis=1)
df['max_temp_compartment_B'] = df[['temperature_active3', 'temperature_active4']].max(axis=1)
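
A row-wise `max(axis=1)` skips missing values by default, so a compartment with one failed sensor still gets a maximum from the remaining sensor, and the result is `NaN` only when both readings are missing. The toy frame below illustrates this; its values are made up for the example.

```python
import numpy as np
import pandas as pd

# Made-up readings: both sensors present, one missing, both missing.
demo = pd.DataFrame({
    "temperature_active1": [66.0, np.nan, np.nan],
    "temperature_active2": [64.0, 70.0, np.nan],
})

demo["max_temp_compartment_A"] = demo[
    ["temperature_active1", "temperature_active2"]
].max(axis=1)

print(demo["max_temp_compartment_A"].tolist())
# [66.0, 70.0, nan]
```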

In [16]:
# Estimate cooling durations
df['cooling_duration_A'] = df['max_temp_compartment_A'].apply(estimate_cooling_duration)
df['cooling_duration_B'] = df['max_temp_compartment_B'].apply(estimate_cooling_duration)
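
A quick sanity check can confirm that every estimated duration falls inside the range its temperature band allows. The sketch below is one way to do that; `BANDS` and `duration_in_band` are hypothetical names, the ranges are derived from the base values and noise bounds in `estimate_cooling_duration`, and treating a missing temperature as "duration should be `NaN`" is an assumption.

```python
import numpy as np
import pandas as pd

# (inclusive upper temperature bound, min duration, max duration) per band,
# derived from the base values and noise bounds in estimate_cooling_duration.
BANDS = [(68, 4, 6), (72, 8, 12), (75, 12, 17), (np.inf, 17, 23)]


def duration_in_band(temp, duration):
    """Return True if duration is consistent with the band that temp falls in."""
    if pd.isna(temp) or temp <= 65:
        return bool(pd.isna(duration))  # no cooling expected
    for upper, low, high in BANDS:
        if temp <= upper:
            return bool(low <= duration <= high)
    return False


# Three (max temperature, duration) pairs of the kind the notebook produces.
check = pd.DataFrame({
    "max_temp": [30.745071, 65.011595, 68.313295],
    "duration": [np.nan, 5.981294, 10.247386],
})
all_ok = check.apply(
    lambda row: duration_in_band(row["max_temp"], row["duration"]), axis=1
).all()
```

On the full dataset, the same `apply` call could be pointed at `max_temp_compartment_A` and `cooling_duration_A` (and likewise for compartment B).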

In [22]:
# Save full DataFrame including all original columns and computed values
df.reset_index().to_csv("cooling_durations_full_output.csv", index=False)

In [23]:
df.head(10)

| time_stamp | moisture_active1 | moisture_active2 | oxygen | lid | co2 | device_id | temperature_active1 | temperature_active2 | temperature_active3 | temperature_active4 | temperature_curing1 | temperature_curing2 | moisture_curing1 | moisture_curing2 | max_temp_compartment_A | max_temp_compartment_B | cooling_duration_A | cooling_duration_B |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2024-01-01 | 53.414959 | 54.445332 | 14.585502 | 1 | 5.155804 | dev1 | 30.745071 | 29.792604 | 30.971533 | 32.284545 | NaN | NaN | NaN | NaN | 30.745071 | 32.284545 | NaN | NaN |
| 2024-01-02 | 54.431544 | 55.309214 | 14.804655 | 1 | 5.425923 | dev1 | 33.870992 | 33.871017 | 36.591041 | 35.373374 | NaN | NaN | NaN | NaN | 33.871017 | 36.591041 | NaN | NaN |
| 2024-01-03 | 55.332008 | 55.207526 | 15.543138 | 1 | 4.899575 | dev1 | 37.740233 | 39.258285 | 37.749318 | 37.74585 | NaN | NaN | NaN | NaN | 39.258285 | 37.749318 | NaN | NaN |
| 2024-01-04 | 55.714561 | 56.200347 | 15.559422 | 1 | 5.018819 | dev1 | 43.02961 | 39.796746 | 40.07929 | 41.823235 | NaN | NaN | NaN | NaN | 43.02961 | 41.823235 | NaN | NaN |
| 2024-01-05 | 51.243475 | 51.566163 | 15.328532 | 1 | 5.07145 | dev1 | 45.369642 | 47.36026 | 45.526853 | 44.770433 | NaN | NaN | NaN | NaN | 47.36026 | 45.526853 | NaN | NaN |
| 2024-01-06 | 52.636082 | 53.320398 | 15.482393 | 1 | 4.710142 | dev1 | 53.309584 | 50.772447 | 51.212403 | 48.973989 | NaN | NaN | NaN | NaN | 53.309584 | 51.212403 | NaN | NaN |
| 2024-01-07 | 52.016211 | 51.533749 | 16.147307 | 1 | 4.863838 | dev1 | 54.516759 | 55.499717 | 53.606843 | 55.89688 | NaN | NaN | NaN | NaN | 55.499717 | 55.89688 | NaN | NaN |
| 2024-01-08 | 51.32046 | 51.663486 | 15.297435 | 1 | 4.76649 | dev1 | 58.654598 | 59.118015 | 58.652996 | 62.333973 | NaN | NaN | NaN | NaN | 59.118015 | 62.333973 | NaN | NaN |
| 2024-01-09 | 55.819044 | 56.348256 | 15.951515 | 1 | 4.758116 | dev1 | 63.757532 | 62.191211 | 65.011595 | 61.946512 | NaN | NaN | NaN | NaN | 63.757532 | 65.011595 | NaN | 5.981294 |
| 2024-01-10 | 52.759116 | 51.879746 | 15.661616 | 1 | 4.463012 | dev1 | 68.313295 | 65.060495 | 66.007721 | 68.295292 | NaN | NaN | NaN | NaN | 68.313295 | 68.295292 | 10.247386 | 9.744911 |
