CREATE A SYNTHETIC DATASET

1. Install the dependencies if they are not already installed

In [20]:
%pip install --upgrade pip
%pip install pandas numpy

Note: you may need to restart the kernel to use updated packages.


2. Import all the required libraries.

In [21]:
import pandas as pd
import numpy as np
print('pandas', pd.__version__)
print('numpy', np.__version__)

pandas 2.3.3
numpy 2.4.0


3. Seed the random number generator so the results are reproducible

In [22]:
np.random.seed(42)
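
As a minimal sketch of what the seed does: reseeding with the same value reproduces the same draws. The variable names `first`, `second`, and `rng_draws` are illustrative only. The `np.random.default_rng` generator shown at the end is the interface NumPy's documentation recommends for new code; this notebook does not use it, and its draws differ from those of the legacy `np.random.seed` stream.

```python
import numpy as np

# Same seed, same draws: the legacy global generator is deterministic.
np.random.seed(42)
first = np.random.uniform(15, 45, 3)
np.random.seed(42)
second = np.random.uniform(15, 45, 3)
same_draws = np.array_equal(first, second)
print(same_draws)

# Alternative (not used in this notebook): a local Generator object
# that keeps its own state instead of relying on global state.
rng = np.random.default_rng(42)
rng_draws = rng.uniform(15, 45, 3)

# Reseed so later cells start from the same state as the original notebook.
np.random.seed(42)
```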

4. Set the sample size and sample the input features

In [23]:
n_samples = 500
ambient_temp = np.random.uniform(15, 45, n_samples)  # °C
power_load = np.random.uniform(5, 50, n_samples)     # W
cooling_type = np.random.choice([0, 1, 2], n_samples)  # 0=Passive, 1=Fan, 2=High-conductivity
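
A quick sanity check on the sampled inputs might look like the sketch below. It repeats the sampling so that it runs on its own; the `counts` variable and the printed labels are illustrative additions, not part of the original notebook.

```python
import numpy as np

np.random.seed(42)
n_samples = 500
ambient_temp = np.random.uniform(15, 45, n_samples)   # °C
power_load = np.random.uniform(5, 50, n_samples)      # W
cooling_type = np.random.choice([0, 1, 2], n_samples)

# uniform(low, high) samples from the half-open interval [low, high),
# so the minimum can reach 15 but the maximum stays below 45.
print(ambient_temp.min(), ambient_temp.max())
print(power_load.min(), power_load.max())

# Count how many samples fall into each cooling category.
counts = np.bincount(cooling_type, minlength=3)
print(dict(zip(['Passive', 'Fan', 'High-conductivity'], counts.tolist())))
```

With equal selection probabilities, each category should hold roughly a third of the samples.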

5. Generate the target with a simplified linear thermal resistance model

$$
\begin{aligned}
T_{\text{max}} &= T_{\text{ambient}} + 0.5 \cdot P_{\text{load}} - 5 \cdot C_{\text{eff}} + \epsilon \\
\text{where:} \quad
&T_{\text{max}} = \text{Maximum temperature of the control unit (°C)}, \\
&T_{\text{ambient}} = \text{Ambient temperature (°C)}, \\
&P_{\text{load}} = \text{Power load (W) applied to the control unit}, \\
&C_{\text{eff}} = \text{Cooling efficiency index (0 = Passive, 1 = Fan, 2 = High-Conductivity)}, \\
&\epsilon \sim \mathcal{N}(0,\, 1.5^2) = \text{Random noise (°C), normally distributed with mean 0 and standard deviation 1.5}
\end{aligned}
$$



In [24]:
max_temp = ambient_temp + power_load * 0.5 - cooling_type * 5 + np.random.normal(0, 1.5, n_samples)
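
To make the model concrete, here is a small worked example with the noise term left out. The helper `noise_free_max_temp` is a hypothetical name introduced for illustration only; it applies the same coefficients as the cell above.

```python
def noise_free_max_temp(ambient_temp, power_load, cooling_type):
    """Deterministic part of the model: T_ambient + 0.5 * P_load - 5 * C_eff."""
    return ambient_temp + 0.5 * power_load - 5 * cooling_type

# 25 °C ambient and a 20 W load, under each cooling option.
passive = noise_free_max_temp(25.0, 20.0, 0)   # 25 + 10 - 0
fan = noise_free_max_temp(25.0, 20.0, 1)       # 25 + 10 - 5
high_cond = noise_free_max_temp(25.0, 20.0, 2) # 25 + 10 - 10
print(passive, fan, high_cond)  # 35.0 30.0 25.0
```

Each step up in the cooling index lowers the predicted maximum temperature by 5 °C, and each additional watt of load raises it by 0.5 °C.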

6. Build the DataFrame

In [25]:
df = pd.DataFrame({
    'AmbientTemp': ambient_temp,
    'PowerLoad': power_load,
    'CoolingType': cooling_type,
    'MaxTemp': max_temp
})
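
One way to check that the DataFrame encodes the intended relationship is to fit ordinary least squares and see whether the coefficients come out near 1, 0.5, and -5. This is a sketch, not a step from the original notebook: it rebuilds the data so that it runs standalone, and the names `X`, `coef`, `b_ambient`, `b_power`, and `b_cooling` are illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the dataset exactly as in the cells above.
np.random.seed(42)
n_samples = 500
ambient_temp = np.random.uniform(15, 45, n_samples)
power_load = np.random.uniform(5, 50, n_samples)
cooling_type = np.random.choice([0, 1, 2], n_samples)
max_temp = (ambient_temp + power_load * 0.5 - cooling_type * 5
            + np.random.normal(0, 1.5, n_samples))
df = pd.DataFrame({
    'AmbientTemp': ambient_temp,
    'PowerLoad': power_load,
    'CoolingType': cooling_type,
    'MaxTemp': max_temp,
})

# Design matrix: an intercept column followed by the three features.
features = df[['AmbientTemp', 'PowerLoad', 'CoolingType']].to_numpy(dtype=float)
X = np.column_stack([np.ones(len(df)), features])

# Least-squares fit; the estimates should sit close to 0, 1, 0.5 and -5.
coef, *_ = np.linalg.lstsq(X, df['MaxTemp'].to_numpy(), rcond=None)
intercept, b_ambient, b_power, b_cooling = coef
print(intercept, b_ambient, b_power, b_cooling)
```

The estimates will not match the true coefficients exactly because of the added noise, but with 500 samples they should be close.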

7. Save the DataFrame to a CSV file

In [26]:
df.to_csv("thermal_dataset.csv", index=False)
print("Synthetic dataset saved as thermal_dataset.csv")

Synthetic dataset saved as thermal_dataset.csv
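
To confirm the file round-trips cleanly, it can be read back and compared with the original. The sketch below uses a small hypothetical three-row frame with the same columns and writes to a temporary directory, so it does not overwrite the real `thermal_dataset.csv`.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical stand-in with the same columns as the real dataset.
df = pd.DataFrame({
    'AmbientTemp': [20.5, 31.2, 44.9],
    'PowerLoad': [10.0, 25.5, 49.0],
    'CoolingType': [0, 1, 2],
    'MaxTemp': [25.1, 38.7, 59.6],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "thermal_dataset.csv")
    df.to_csv(path, index=False)   # index=False keeps the row index out of the file
    loaded = pd.read_csv(path)

# Column names and order should survive; values should match up to float tolerance.
columns_match = list(loaded.columns) == list(df.columns)
values_match = np.allclose(loaded.to_numpy(), df.to_numpy())
print(columns_match, values_match)
```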
