# PART 1 : Vehicle Data Generation and Real-Life Test Scenario Simulation

# Use Case Scenario : 

A simulation of scenario-specific vehicle tests (including Cold Start, High Speed, etc.) on electric and hybrid vehicles. The goal of the tests is to generate large volumes of performance data. This project replaces the manual Excel logs used to identify performance issues and safety concerns, a process that can take hours. It aims to improve insight quality and make data handling more efficient through automation.

Since readily available datasets for vehicle tests were difficult to find, the project generates realistic data that approximates its real-life counterpart. The question we ask is:
"How can we analyse vehicle test data to efficiently detect performance issues such as overheating and torque dropouts during different driving scenarios?"

# Objective : 

The automobile industry has evolved towards manufacturing EVs and hybrid vehicles. For this test, we will not be considering combustion-based vehicles. We will be testing 5 vehicles under four different test scenarios : 

1. Cold Start
2. Hill Climb
3. Endurance Run
4. High Speed
    
The dataset contains the following columns : 

1. timestamp :          Date and time of each sample, recorded at 1-second intervals
2. vehicle_id :         Vehicle ID, one of V001, V002, V003, V004, V005
3. propulsion_type :    Propulsion type, EV or HYBRID
4. test_scenario :      Endurance Run, High Speed, Hill Climb or Cold Start
5. vehicle_speed_mph:   Vehicle speed each second, in miles per hour
6. acceleration_mphps:  Change in speed per second (mph/s); positive when accelerating, negative when braking
7. engine_rpm :         Engine or motor speed in revolutions per minute
8. torque :             Rotational force delivered by the powertrain, in newton-metres (Nm)
9. coolant_temp:        Coolant temperature; the coolant regulates temperature to prevent overheating and freezing
10. ambient_temp:       Temperature outside the vehicle
11. power_kw:           Power output in kilowatts, derived from torque and engine_rpm
12. phase:              Vehicle's driving state: Idle, Accelerate, Brake or Cruise
13. soc:                State of Charge of the battery, from 0 to 100 (recorded for EVs only)
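
The `power_kw` column is derived in the simulation code later in this part as `torque * engine_rpm / 9550`. As a brief worked check, assuming torque is in newton-metres and speed is in rpm, the constant 9550 is a rounded form of 60000 / (2π). The helper name `power_kw_from` is hypothetical and only used for illustration; the input values are taken from the first row of the sample output shown further below.

```python
import math

# Exact conversion: P[kW] = T[N·m] * (2*pi * rpm / 60) / 1000
EXACT_CONSTANT = 60000 / (2 * math.pi)   # about 9549.3


def power_kw_from(torque_nm, rpm, constant=9550):
    """Mechanical power in kW, using the rounded constant from the notebook."""
    return torque_nm * rpm / constant


# First sample row: torque 119.44, engine_rpm 500.00
print(round(power_kw_from(119.44, 500.0), 2))   # 6.25
print(round(EXACT_CONSTANT, 1))                 # 9549.3
```

The result matches the `power_kw` value of 6.25 in that row, which suggests the torque column is best read as Nm.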

Now we can look at the test scenarios in detail below.

In [1]:
# First, we import numpy and pandas for general data manipulation
# random is used to add realism to the generated vehicle test dataset
# datetime is used to keep track of the test timeline

import numpy as np
import pandas as pd
import random
from datetime import datetime

The code below simulates a vehicle dataset and can be customised as required. 

In [2]:
# Simulate a test run for a given scenario, sampled once per second
def simulate_test_run(vehicle_id, propulsion_type, test_scenario, duration_minutes):
    n_rows = duration_minutes * 60
    timestamps = pd.date_range(start=datetime.now(), periods=n_rows, freq='1s')

    def simulate_speed():
        if test_scenario == 'Cold Start':
            idle_duration = min(random.randint(240, 480), int(n_rows * 0.7))
            drive_duration = n_rows - idle_duration
            idle_block = np.zeros(idle_duration)

            if drive_duration <= 0:
                return np.concatenate([idle_block, np.zeros(n_rows - idle_duration)])

            accel = np.linspace(0, 20, int(drive_duration * 0.25))
            cruise = np.random.normal(25, 2, drive_duration - len(accel))
            cruise = np.clip(cruise, 15, 35)
            drive_block = np.concatenate([accel, cruise])
            return np.concatenate([idle_block, drive_block])

        elif test_scenario == 'Hill Climb':
            base_speed = 30
            ripple = np.sin(np.linspace(0, 4 * np.pi, n_rows)) * 0.8
            return np.clip(base_speed + ripple + np.random.normal(0, 1, n_rows), 12, 50)

        else:
            road_segments = {
                'residential': (20, 30),
                'main_road': (30, 40),
                'carriageway': (50, 60),
                'dual_carriageway': (60, 70),
                'motorway': (70, 80),
                'endurance': (75, 100),
                'high_speed': (80, 110),
            }

            if test_scenario == 'Endurance Run':
                segments = ['main_road', 'motorway', 'endurance', 'motorway']
            elif test_scenario == 'High Speed':
                segments = ['dual_carriageway', 'motorway', 'high_speed', 'motorway']
            else:
                segments = ['residential', 'main_road', 'carriageway', 'motorway']

            segment_lengths = np.random.multinomial(n_rows, [1/len(segments)] * len(segments))
            speed_series = []

            for i, (seg, seg_len) in enumerate(zip(segments, segment_lengths)):
                min_mph, max_mph = road_segments[seg]
                base = np.linspace(min_mph, max_mph, seg_len)
                noise = np.random.normal(0, 0.8, seg_len)
                ripple = np.sin(np.linspace(0, 2 * np.pi, seg_len)) * 1.2
                pattern = base + noise + ripple

                # Smooth transitions between segments (4–10s)
                if i > 0 and len(speed_series) > 0:
                    prev_end = speed_series[-1]
                    ramp_len = min(10, seg_len // 4)
                    ramp = np.linspace(prev_end, pattern[0], ramp_len)
                    pattern = np.concatenate([ramp, pattern[ramp_len:]])

                speed_series.extend(np.clip(pattern, 0, 120))

            return np.array(speed_series[:n_rows])

    def simulate_rpm(speed):
        ratio = {'Cold Start': 30, 'Endurance Run': 35, 'Hill Climb': 40, 'High Speed': 45}.get(test_scenario, 32)
        return np.clip(speed * ratio + np.random.normal(0, 80, len(speed)), 500, 7000)

    def simulate_torque():
        peak = {'Cold Start': 90, 'Endurance Run': 200, 'Hill Climb': 250, 'High Speed': 230}.get(test_scenario, 180)
        return np.abs(np.random.normal(peak, 30, n_rows))

    def simulate_coolant_temp():
        # Base thermal profile
        start = np.random.uniform(20, 40)
        max_t = np.random.uniform(85, 95)
        rise = np.linspace(start, max_t, n_rows)
        ripple = np.sin(np.linspace(0, 8 * np.pi, n_rows)) * 1.5
        temp_raw = rise + ripple

        ptype = (propulsion_type or "").strip().upper()

        # Soft targets & hard caps --> for anomaly detection
        if ptype == 'EV':
            soft_target, hard_cap, rare_prob = 57.5, 58.0, 0.001  # ~0.1% can touch 58
        elif ptype in {'HEV', 'PHEV', 'HYBRID'}:
            soft_target, hard_cap, rare_prob = 69.5, 70.0, 0.001  # ~0.1% can touch 70
        else:
            return np.minimum(temp_raw, 95.0)  # ICE/other unchanged

        temp = temp_raw.copy()
        exceed = temp_raw > soft_target
        if np.any(exceed):
            rnd = np.random.rand(n_rows)
            rare_mask = exceed & (rnd < rare_prob)
            normal_mask = exceed & (~rare_mask)

            # Allowing up to the hard cap
            temp[rare_mask] = np.minimum(temp_raw[rare_mask], hard_cap)

            # Most points: compress to just below soft target
            overshoot = temp_raw[normal_mask] - soft_target
            jitter = np.random.uniform(0.05, 0.35, size=overshoot.shape)  # guarantees < soft target
            drop = np.clip(0.1 + 0.2 * overshoot, 0.05, 0.35)
            temp[normal_mask] = soft_target - np.maximum(jitter, drop)

        # Final safety cap
        return np.minimum(temp, hard_cap)

    def simulate_phase(speed):
        phase = []
        for i in range(len(speed)):
            if speed[i] < 3:
                phase.append('Idle')
            elif i > 0 and speed[i] > speed[i - 1] + 0.5:
                phase.append('Accelerate')
            elif i > 0 and speed[i] < speed[i - 1] - 0.5:
                phase.append('Brake')
            else:
                phase.append('Cruise')
        return phase

    def simulate_soc():
        soc = np.linspace(100, 80, n_rows) + np.random.normal(0, 0.3, n_rows)
        return np.clip(soc, 0, 100)

    def simulate_ambient_temp():
        ambient = np.random.uniform(10, 35)
        drift = np.cumsum(np.random.normal(0, 0.01, n_rows))
        return np.clip(ambient + drift, -10, 50)

    # Signals
    vehicle_speed = simulate_speed()
    engine_rpm = simulate_rpm(vehicle_speed)
    torque = simulate_torque()
    coolant_temp = simulate_coolant_temp()
    ambient_temp = simulate_ambient_temp()
    acceleration = np.diff(vehicle_speed, prepend=vehicle_speed[0]).round(2)
    power_kw = ((torque * engine_rpm) / 9550).round(2)
    phase = simulate_phase(vehicle_speed)

    # DataFrame 
    df = pd.DataFrame({
        'timestamp': timestamps,
        'vehicle_id': vehicle_id,
        'propulsion_type': propulsion_type,
        'test_scenario': test_scenario,
        'vehicle_speed_mph': vehicle_speed.round(2),
        'acceleration_mphps': acceleration,
        'engine_rpm': engine_rpm.round(2),
        'torque': torque.round(2),
        'coolant_temp': coolant_temp.round(2),
        'ambient_temp': ambient_temp,
        'power_kw': power_kw,
        'phase': phase
    })

    if propulsion_type and (propulsion_type.strip().upper() == 'EV'):
        df['soc'] = simulate_soc().round(2)

    return df
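
The `simulate_phase` rule in the function above labels each second by comparing its speed with the previous sample: below 3 mph is Idle, a rise of more than 0.5 mph is Accelerate, a fall of more than 0.5 mph is Brake, and anything else is Cruise. The sketch below restates that rule as a standalone function so it can be checked by hand; `classify_phases` and its parameter names are hypothetical. The speeds are those of rows 895 to 898 in the sample output shown further below.

```python
def classify_phases(speeds, idle_below=3.0, delta=0.5):
    """Label each sample using the same thresholds as simulate_phase."""
    phases = []
    for i, speed in enumerate(speeds):
        if speed < idle_below:
            phases.append("Idle")
        elif i > 0 and speed > speeds[i - 1] + delta:
            phases.append("Accelerate")
        elif i > 0 and speed < speeds[i - 1] - delta:
            phases.append("Brake")
        else:
            phases.append("Cruise")
    return phases


# Speeds from sample rows 895 to 898
speeds = [26.20, 26.83, 26.97, 24.94]
# The first element has no predecessor here, so only the rest is compared.
print(classify_phases(speeds)[1:])   # ['Accelerate', 'Cruise', 'Brake']
```

Note that the first sample of any run has no predecessor, so under this rule it can only be Idle or Cruise.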


Below, we run a single example test to check the simulated results.

In [3]:
df = simulate_test_run(
    vehicle_id='V001',                # Any ID from V001 to V005
    propulsion_type='EV',             # or 'HYBRID'
    test_scenario='Cold Start',       # or 'Endurance Run', 'Hill Climb', 'High Speed'
    duration_minutes=15               # Any duration between 1 and 90 minutes
)

In [4]:
df

| | timestamp | vehicle_id | propulsion_type | test_scenario | vehicle_speed_mph | acceleration_mphps | engine_rpm | torque | coolant_temp | ambient_temp | power_kw | phase | soc |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2025-08-11 21:02:06.618927 | V001 | EV | Cold Start | 0.00 | 0.00 | 500.00 | 119.44 | 33.07 | 16.940485 | 6.25 | Idle | 100.00 |
| 1 | 2025-08-11 21:02:07.618927 | V001 | EV | Cold Start | 0.00 | 0.00 | 500.00 | 118.11 | 33.18 | 16.935149 | 6.18 | Idle | 100.00 |
| 2 | 2025-08-11 21:02:08.618927 | V001 | EV | Cold Start | 0.00 | 0.00 | 500.00 | 116.98 | 33.28 | 16.944962 | 6.12 | Idle | 99.94 |
| 3 | 2025-08-11 21:02:09.618927 | V001 | EV | Cold Start | 0.00 | 0.00 | 500.00 | 74.66 | 33.38 | 16.936187 | 3.91 | Idle | 99.82 |
| 4 | 2025-08-11 21:02:10.618927 | V001 | EV | Cold Start | 0.00 | 0.00 | 500.00 | 39.86 | 33.48 | 16.933412 | 2.09 | Idle | 99.83 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 895 | 2025-08-11 21:17:01.618927 | V001 | EV | Cold Start | 26.20 | 3.84 | 778.30 | 129.78 | 57.15 | 16.712864 | 10.58 | Accelerate | 79.94 |
| 896 | 2025-08-11 21:17:02.618927 | V001 | EV | Cold Start | 26.83 | 0.63 | 867.62 | 45.94 | 57.15 | 16.718828 | 4.17 | Accelerate | 80.38 |
| 897 | 2025-08-11 21:17:03.618927 | V001 | EV | Cold Start | 26.97 | 0.14 | 704.43 | 126.02 | 57.15 | 16.729154 | 9.30 | Cruise | 79.98 |
| 898 | 2025-08-11 21:17:04.618927 | V001 | EV | Cold Start | 24.94 | -2.02 | 633.38 | 115.75 | 57.15 | 16.739296 | 7.68 | Brake | 80.51 |
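
The repeated `coolant_temp` value of 57.15 at the end of the sample is consistent with the compression step in `simulate_coolant_temp`. For an EV the soft target is 57.5, and any raw temperature above it is pulled down by `max(jitter, drop)`, where `drop` is clipped to at most 0.35. Once the overshoot reaches 1.25 or more, the drop saturates and the result is 57.5 - 0.35 = 57.15. The scalar sketch below illustrates this; `compress_to_soft_target` is a hypothetical name, the jitter is passed in explicitly to keep the result deterministic, and the rare 0.1% branch that allows values up to the hard cap is left out.

```python
def compress_to_soft_target(temp_raw, soft_target, jitter):
    """Scalar version of the 'normal_mask' branch in simulate_coolant_temp."""
    if temp_raw <= soft_target:
        return temp_raw                      # at or below the target: unchanged
    overshoot = temp_raw - soft_target
    # Same as np.clip(0.1 + 0.2 * overshoot, 0.05, 0.35)
    drop = min(max(0.1 + 0.2 * overshoot, 0.05), 0.35)
    return soft_target - max(jitter, drop)


# EV soft target is 57.5. A large overshoot saturates the drop at 0.35.
print(round(compress_to_soft_target(90.0, 57.5, jitter=0.05), 2))   # 57.15
# A small overshoot gives a smaller drop.
print(round(compress_to_soft_target(57.6, 57.5, jitter=0.05), 2))   # 57.38
```

Because the raw profile keeps rising towards 85 to 95, most late samples in a long EV run end up pinned at this one value, which is worth keeping in mind when the data is later used for overheating detection.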


In [5]:
import os

# Make sure the output folder exists
os.makedirs("vehicle_logs", exist_ok=True)

vehicle_ids = [f"V00{i+1}" for i in range(5)]
test_scenarios = ['Cold Start', 'Endurance Run', 'Hill Climb', 'High Speed']

# Assign a propulsion type to each vehicle
vehicle_propulsions = {
    'V001': 'EV',
    'V002': 'Hybrid',
    'V003': 'EV',
    'V004': 'Hybrid',
    'V005': 'EV',
}

for vehicle_id in vehicle_ids:
    propulsion = vehicle_propulsions[vehicle_id]
    for scenario in test_scenarios:
        duration = random.randint(30, 90)  # Test duration in minutes

        df = simulate_test_run(vehicle_id, propulsion, scenario, duration)

        scenario_clean = scenario.replace(' ', '_')
        filename = f"vehicle_logs/{vehicle_id}_{scenario_clean}.csv"
        df.to_csv(filename, index=False)

        print(f"✅ Saved: {filename} | Duration: {duration} min | Type: {propulsion}")



✅ Saved: vehicle_logs/V001_Cold_Start.csv | Duration: 77 min | Type: EV
✅ Saved: vehicle_logs/V001_Endurance_Run.csv | Duration: 32 min | Type: EV
✅ Saved: vehicle_logs/V001_Hill_Climb.csv | Duration: 68 min | Type: EV
✅ Saved: vehicle_logs/V001_High_Speed.csv | Duration: 31 min | Type: EV
✅ Saved: vehicle_logs/V002_Cold_Start.csv | Duration: 85 min | Type: Hybrid
✅ Saved: vehicle_logs/V002_Endurance_Run.csv | Duration: 34 min | Type: Hybrid
✅ Saved: vehicle_logs/V002_Hill_Climb.csv | Duration: 64 min | Type: Hybrid
✅ Saved: vehicle_logs/V002_High_Speed.csv | Duration: 69 min | Type: Hybrid
✅ Saved: vehicle_logs/V003_Cold_Start.csv | Duration: 70 min | Type: EV
✅ Saved: vehicle_logs/V003_Endurance_Run.csv | Duration: 40 min | Type: EV
✅ Saved: vehicle_logs/V003_Hill_Climb.csv | Duration: 77 min | Type: EV
✅ Saved: vehicle_logs/V003_High_Speed.csv | Duration: 57 min | Type: EV
✅ Saved: vehicle_logs/V004_Cold_Start.csv | Duration: 88 min | Type: Hybrid
✅ Saved: vehicle_logs/V004_Enduranc
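
The file names in the output above follow the pattern `{vehicle_id}_{scenario_clean}.csv`, so the vehicle ID and scenario can be recovered from a path when the logs are read back. A minimal sketch, assuming that naming pattern and vehicle IDs that contain no underscore; `parse_log_name` is a hypothetical helper, not part of the notebook.

```python
from pathlib import Path


def parse_log_name(path):
    """Split 'vehicle_logs/V001_Cold_Start.csv' into ('V001', 'Cold Start')."""
    stem = Path(path).stem                      # e.g. 'V001_Cold_Start'
    # Split on the first underscore only: ID before it, scenario after it
    vehicle_id, _, scenario_clean = stem.partition("_")
    # Undo scenario.replace(' ', '_') from the export loop
    return vehicle_id, scenario_clean.replace("_", " ")


print(parse_log_name("vehicle_logs/V001_Cold_Start.csv"))   # ('V001', 'Cold Start')
print(parse_log_name("vehicle_logs/V003_High_Speed.csv"))   # ('V003', 'High Speed')
```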

The data has been generated and stored as log files. The next part will be rather short but important.