# ParkingEnv 交互式演示（中文说明）

这个笔记本帮助你在 Jupyter 中配置并启动 `main.py` 或助力调参器，快速体验停车环境。

- 先设置下一个代码单元中的运行参数（轮次数、最大步数、模式、动画速度）。
- 修改参数后重新运行相关单元即可让 CLI 使用新的值。
- `manual` 模式下，Qt 窗口使用方向键：`↑/↓` 控制油门，`←/→` 控制方向，`Esc` 退出。
- 每个代码单元都会说明是否写入 JSON 配置，按需调整文件路径后执行即可。


# ParkingEnv Interactive Demo (English Notes)

This notebook shows how to configure and launch `main.py` or the assist-model tuner straight from Jupyter.

- Set the run parameters in the next code cell (episode count, max steps, mode, animation speed).
- Rerun the relevant cells after changing a value so the CLI picks up the new settings.
- In `manual` mode the Qt window listens to the arrow keys: `↑/↓` throttle, `←/→` steering, `Esc` to exit.
- Each code cell tells you whether it writes a JSON config; adjust the file paths first if needed, then execute it.


## 运行参数设置（中文说明）

下方代码块定义 CLI 演示所需的基础变量。

- `episodes` 控制重复运行次数，通常 1 次即可快速检查界面。
- `max_steps` 设定每轮的最大仿真步数，避免策略长时间卡住。
- `sleep_scale` 调整动画节奏，值越大播放越慢，便于观察。
- `mode` 选择 `manual` 或 `random`，也可以在下方保留备用模式以便切换。
- 如需更多自定义项，可在运行后直接修改变量并重新执行。


## Run Parameter Setup (English Notes)

The next code cell declares the baseline variables used by the CLI demo.

- `episodes` controls how many times the scenario repeats; a single run is usually enough to verify the UI.
- `max_steps` caps the number of simulation steps per episode so random policies cannot run forever.
- `sleep_scale` slows down the animation when set above `1.0`, which helps when you want to inspect behaviour frame by frame.
- `mode` toggles between `manual` and `random`; keep an alternate assignment commented out for quick switching.
- Feel free to add more parameters or tweak these values, then rerun the cell to update the downstream commands.


In [663]:
from pathlib import Path

episodes = 1
max_steps = 4000
sleep_scale = 0.5    # Larger values slow the animation

mode = "manual"  # "manual" or "random"
# mode = "random"

## 随机生成训练配置（中文说明）

下方命令会调用 `generate_training_config.py`，只随机化车位和障碍布局，并将结果写入 `generated_configs/train_001.json`。

- 执行前请先在外部终端激活 `parking-rl` Conda 环境，并确保依赖已经安装。
- 如果不想覆盖现有文件，可修改 `--out` 路径或暂时注释这一行。
- 想直接复用旧配置时可以跳过此单元格。


## Random Training Config Generator (English Notes)

The next command runs `generate_training_config.py`, which randomizes only the parking-slot and obstacle layout and writes the result to `generated_configs/train_001.json`.

- Launch Jupyter from a terminal where the `parking-rl` Conda environment is already active and its dependencies are installed, because the `!python` command runs in the shell environment the notebook inherited.
- Change the `--out` path, or comment out the command, if you prefer not to overwrite an existing file.
- Skip this cell whenever you want to reuse a configuration that is already on disk.


In [664]:
# !pip install -r requirements.txt
# !conda activate parking-rl
!python generate_training_config.py --out generated_configs/train_001.json

Wrote config to generated_configs/train_001.json


## 指定配置文件路径（中文说明）

这个单元负责控制是否写入 Notebook 生成的覆盖文件，以及 CLI 将加载哪份 JSON。

- 将 `write_notebook_config` 设为 `True` 时，会把当前 `custom_config` 写回磁盘；默认 `False` 以保护现有文件。
- 推荐默认指向 `generated_configs/notebook_override.json`，也可以改为训练配置或其他路径。
- 调整 `config_path` 后请重新运行依赖该变量的后续单元。


## Choose the Config File Path (English Notes)

The next cell controls whether the notebook writes the override JSON and which file the CLI loads.

- Set `write_notebook_config` to `True` when you want to persist `custom_config`; leave it `False` to avoid overwriting existing files.
- The recommended default is `generated_configs/notebook_override.json`, but you can switch to the training config or any custom path.
- After changing `config_path`, rerun the downstream cells that depend on it.


In [665]:
write_notebook_config = False  # True writes custom_config to config_path; False keeps existing file contents
# config_path = Path("generated_configs/notebook_override.json") # use default config for notebook
config_path = Path("generated_configs/train_001.json") # use training config for notebook

## 载入默认配置副本（中文说明）

以下代码会深拷贝 `build_config()` 返回的默认配置，并在目标 JSON 存在时读取它。

- 如果 `config_path` 指向的文件存在，会优先加载其中的参数（保留 `__notes` 等附加信息）。
- 若文件不存在，就使用默认配置并存入 `custom_config`，方便在 Notebook 中即时修改。
- 可以利用末尾的示例查看或编辑任意字段，调整后配合写回逻辑即可保存。


## Load a Working Copy of the Config (English Notes)

The next cell loads the target JSON when it exists; otherwise it deep-copies the defaults returned by `build_config()`.

- When `config_path` already exists, its contents are loaded as they are, including any extra fields such as `__notes`; the defaults are not consulted in that case.
- If the file is missing, the code falls back to the default dictionary and keeps it in `custom_config` for quick edits.
- Use the examples at the end of the cell to inspect or update individual fields, then combine with the write-back logic to persist your changes.
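
Because the cell below takes the JSON file wholesale when it exists, any key missing from that file is simply absent from `custom_config`. If you would rather overlay the file on top of the defaults, a recursive merge is one option. The following is a minimal sketch: `merge_config` is a hypothetical helper that is not part of the project, and the two sample dictionaries merely stand in for `build_config()` and the loaded JSON.

```python
from copy import deepcopy


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Return a new dict: `defaults` with `overrides` applied recursively."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are sections: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars, lists, and new keys replace the default outright.
            merged[key] = deepcopy(value)
    return merged


# Stand-in values for illustration only.
defaults = {"dt": 0.1, "vehicle": {"length": 4.0, "width": 2.0}}
overrides = {"vehicle": {"length": 4.5}, "__notes": "edited in notebook"}
merged = merge_config(defaults, overrides)
```

With this approach, extra fields such as `__notes` survive the merge, and the original `defaults` dictionary is left unmodified.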


In [666]:
from copy import deepcopy
import json
from main import build_config

# Load existing notebook override (keeps __notes) when available; fall back to defaults otherwise.
if config_path.exists():
    with config_path.open("r", encoding="utf-8") as fh:
        custom_config = json.load(fh)
else:
    custom_config = deepcopy(build_config())

# You can modify custom_config here if desired, e.g.:
custom_config # shows all config parameters, including defaults
# custom_config["vehicle"]["length"] # show specific parameter
# custom_config["vehicle"]["length"] = 4.5 # modify specific parameter
# custom_config["vehicle"]["length"] # verify change

{'dt': 0.1,
 'max_steps': 4000,
 'field_size': 60.0,
 'ray_max_range': 12.0,
 'observation_noise': {'enabled': True, 'std': 0.005},
 'ray_angles': [-135.0, -90.0, -60.0, -30.0, 30.0, 60.0, 90.0, 135.0, 0.0],
 'vehicle': {'length': 4.0,
  'width': 2.0,
  'wheel_base': 2.5,
  'max_speed': 3.0,
  'max_reverse_speed': -2.0,
  'max_steering_angle': 60.0,
  'max_steering_rate': 30.0,
  'steering_damping': 52.5,
  'steering_rate_damping': 8.75,
  'steering_assist_deadband': 0.03,
  'velocity_damping': 2.45,
  'velocity_deadband': 0.03,
  'enable_steering_assist': True,
  'manual_forward_accel': 1.5,
  'manual_reverse_accel': 2.0,
  'manual_steering_accel': 10.0},
 'spawn_region': [-0.91, 15.03, -0.76, 15.02],
 'parking_slot': {'length': 5.9,
  'width': 2.35,
  'offset_x_range': [-21.4, -14.51],
  'offset_y_range': [-18.46, -4.78],
  'orientation_range': [-10.85, 10.85]},
 'static_obstacles': {'count': 4,
  'size_range': [0.95, 2.42],
  'min_distance': 1.67,
  'seed': 2824},
 'dynamic_obstacle

## 状态 / 动作 / 奖励 总览（中文说明）

**状态空间**
- 全局位置 `(x, y)`：车辆中心在世界坐标系中的位置，内部已按 `±field_size/2` 归一化；实际训练时可再做特征缩放。
- `cos(yaw)` 与 `sin(yaw)`：车辆朝向的三角编码，避免角度取模问题，数值范围 [-1, 1]。
- 纵向速度 `velocity`：受 `max_speed` 与 `max_reverse_speed` 限制，典型区间约 [-2.0, 3.0] m/s。
- 转向角 `steering_angle`：裁剪在 `±max_steering_angle` 内；上文打印的配置中该值为 60°，约 ±1.05 rad。
- 转向角速度 `steering_rate`：裁剪在 `±max_steering_rate` 内；上文打印的配置中该值为 30°/s，约 ±0.52 rad/s。
- 车位坐标误差 `(slot_dx, slot_dy)`：车辆在车位坐标系下的平移偏差，采用场地大小归一化。
- `cos(Δyaw)` 与 `sin(Δyaw)`：车身与车位朝向差的三角编码，范围 [-1, 1]。
- 激光测距 `ray_1 … ray_N`：共 `N = len(ray_angles)` 束射线，取值 [0, 1]，0 表示立即碰撞，1 表示达到 `ray_max_range`。
- 除非使用 `raw=True`，观测默认叠加标准差为 0.005 的高斯噪声，可通过 `ParkingEnv(config={"observation_noise": {"enabled": False}})` 或 `env.unwrapped.set_observation_noise(...)` 关闭/调整。

**动作空间（Box(2))**
- `a_longitudinal`：纵向加速度指令，范围 [-3.0, 2.0] m/s²，正值加速、负值制动或倒车。
- `a_steering`：转向角加速度指令，范围 [-1.5, 1.5] rad/s²；在 `step()` 中会裁剪到上限以保持稳定。

**奖励构成**
- `distance`：`-distance_scale * ||slot_xy||`，鼓励尽快贴近车位中心。
- `heading`：`-heading_scale * |Δyaw|`，惩罚朝向误差。
- `velocity`：`-velocity_penalty * |v|`，在靠近车位时鼓励降速。
- `smoothness`：`-smoothness * steering_rate^2`，约束方向盘抖动，与辅助模型阻尼相关。
- `step`：每步扣除 `step_cost`，促使策略提高效率。
- `collision`：碰撞时一次扣除 `collision`（默认 -120）。
- `success`：满足位置、朝向与速度阈值时给予 `success` 奖励（默认 +140）。
- 以上权重均来源于 `config['reward']`，可在 JSON 中调节以改变训练侧重点。


## State / Action / Reward Overview (English Notes)

**State space**
- Global position `(x, y)`: vehicle centre in world coordinates, internally normalized by `±field_size/2`; you may apply additional feature scaling for training.
- `cos(yaw)` and `sin(yaw)`: sine/cosine encoding of the vehicle heading to avoid wrap-around issues, each within [-1, 1].
- Longitudinal velocity `velocity`: bounded by `max_speed` and `max_reverse_speed`, typically around [-2.0, 3.0] m/s.
- Steering angle `steering_angle`: clipped to `±max_steering_angle`; the config printed earlier sets this to 60°, i.e. roughly ±1.05 rad.
- Steering rate `steering_rate`: clipped to `±max_steering_rate`; the config printed earlier sets this to 30°/s, i.e. roughly ±0.52 rad/s.
- Slot-frame error `(slot_dx, slot_dy)`: translation of the vehicle relative to the slot frame, normalized by the field size.
- `cos(Δyaw)` and `sin(Δyaw)`: heading difference between the vehicle and the slot, encoded in [-1, 1].
- Range readings `ray_1 … ray_N`: `N = len(ray_angles)` lidar beams scaled to [0, 1]; 0 means immediate collision, 1 reaches `ray_max_range`.
- Unless you request `raw=True`, Gaussian noise with a standard deviation of 0.005 is added to the observation; disable or tune it via `ParkingEnv(config={"observation_noise": {"enabled": False}})` or `env.unwrapped.set_observation_noise(...)`.
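
The list above amounts to 11 scalar entries followed by `N` range readings. When debugging a policy it can help to label a flat observation by name. The sketch below is one way to do that; `split_observation` and the field names are hypothetical, and the ordering is assumed from the list above, so check it against the environment before relying on it.

```python
from typing import Dict, Sequence

# Field order assumed from the state list above; verify against the environment.
SCALAR_FIELDS = (
    "x", "y", "cos_yaw", "sin_yaw", "velocity",
    "steering_angle", "steering_rate",
    "slot_dx", "slot_dy", "cos_dyaw", "sin_dyaw",
)


def split_observation(obs: Sequence[float], n_rays: int) -> Dict[str, object]:
    """Label a flat observation: 11 scalar fields followed by `n_rays` ranges."""
    expected = len(SCALAR_FIELDS) + n_rays
    if len(obs) != expected:
        raise ValueError(f"expected {expected} values, got {len(obs)}")
    fields: Dict[str, object] = dict(zip(SCALAR_FIELDS, obs))
    fields["rays"] = list(obs[len(SCALAR_FIELDS):])
    return fields


# Dummy observation with 9 rays, matching the 9 entries of `ray_angles`
# in the config printed earlier.
dummy_obs = [float(i) for i in range(20)]
fields = split_observation(dummy_obs, n_rays=9)
```

The length check catches a mismatch between the config's `ray_angles` and the vector a policy was trained on.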

**Action space (Box(2))**
- `a_longitudinal`: longitudinal acceleration command in [-3.0, 2.0] m/s²; positive accelerates forward, negative brakes or reverses.
- `a_steering`: steering angular acceleration command in [-1.5, 1.5] rad/s²; the environment clips it during `step()` for stability.
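
If a policy produces unbounded outputs, clamping them to the documented ranges before calling `step()` keeps the commands valid. This is a small sketch using the bounds quoted above; `clip_action` is a hypothetical helper, and the environment's own clipping inside `step()` may behave differently.

```python
from typing import Tuple

# Bounds copied from the action-space list above.
LON_ACCEL_BOUNDS = (-3.0, 2.0)    # m/s^2
STEER_ACCEL_BOUNDS = (-1.5, 1.5)  # rad/s^2


def clip_action(lon_accel: float, steer_accel: float) -> Tuple[float, float]:
    """Clamp each command to its documented range."""
    lon = min(max(lon_accel, LON_ACCEL_BOUNDS[0]), LON_ACCEL_BOUNDS[1])
    steer = min(max(steer_accel, STEER_ACCEL_BOUNDS[0]), STEER_ACCEL_BOUNDS[1])
    return lon, steer
```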

**Reward components**
- `distance`: `-distance_scale * ||slot_xy||`, encouraging the agent to reach the slot centre quickly.
- `heading`: `-heading_scale * |Δyaw|`, penalising heading misalignment.
- `velocity`: `-velocity_penalty * |v|`, encouraging the car to slow down near the slot.
- `smoothness`: `-smoothness * steering_rate^2`, discouraging steering oscillations and linking to the assist-model damping.
- `step`: subtracts `step_cost` every timestep to promote efficiency.
- `collision`: applies `collision` once on impact (default -120).
- `success`: grants `success` when position, heading, and velocity thresholds are met (default +140).
- All weights originate from `config['reward']`, so adjust the JSON to shift training priorities.
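
To make the composition concrete, the terms above can be summed as in the sketch below. This is an illustration under assumptions, not the environment's implementation: `shaped_reward` is a hypothetical function, the weight keys are taken from the names in the list, only the `collision` and `success` values come from the text, and the velocity penalty is applied unconditionally even though the environment may only apply it near the slot.

```python
import math
from typing import Dict


def shaped_reward(
    slot_dx: float,
    slot_dy: float,
    dyaw: float,
    velocity: float,
    steering_rate: float,
    weights: Dict[str, float],
    collided: bool = False,
    succeeded: bool = False,
) -> float:
    """Sum the per-step reward terms described in the list above."""
    reward = 0.0
    reward -= weights["distance_scale"] * math.hypot(slot_dx, slot_dy)
    reward -= weights["heading_scale"] * abs(dyaw)
    reward -= weights["velocity_penalty"] * abs(velocity)
    reward -= weights["smoothness"] * steering_rate ** 2
    reward -= weights["step_cost"]
    if collided:
        reward += weights["collision"]  # negative value, e.g. -120
    if succeeded:
        reward += weights["success"]    # positive value, e.g. +140
    return reward


# Placeholder weights; only `collision` and `success` come from the text.
example_weights = {
    "distance_scale": 1.0,
    "heading_scale": 1.0,
    "velocity_penalty": 1.0,
    "smoothness": 1.0,
    "step_cost": 1.0,
    "collision": -120.0,
    "success": 140.0,
}
```

Writing the terms out this way makes it easier to see which weight dominates at a given distance when you tune `config['reward']`.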


## 助力模型说明（中文说明）

- 当方向盘角加速度指令满足 `|α_cmd| ≤ deadband` 时（即松开方向盘），系统触发回正模型。
- 回正加速度 `α_assist = -(K_p · φ + K_d · φ̇)`（增益为正时方向与偏转相反），并在积分阶段与驾驶员指令叠加更新方向盘角速度与角度。
- 油门松开时，若纵向加速度指令满足 `|a_cmd| ≤ velocity_deadband`，则阻尼项 `a_assist = -K_v · v` 让车辆速度指数衰减。
- 下方代码提供调参器的初始角度、角速度、仿真步数以及是否实时写回 JSON 的选项，便于快速尝试不同阻尼参数。


## Assist-Model Notes (English Explanation)

- When the steering angular acceleration command satisfies `|α_cmd| ≤ deadband` (driver releases the wheel), the assist model recenters it.
- With positive gains the assist acceleration opposes the deflection, `α_assist = -(K_p · φ + K_d · φ̇)`, and it combines with the command during integration to update steering rate and angle.
- When the longitudinal acceleration command satisfies `|a_cmd| ≤ velocity_deadband` (throttle released), the damping term `a_assist = -K_v · v` makes the velocity decay exponentially.
- The next code cell sets the tuner defaults: initial steering angle/rate, simulation steps, and whether live edits sync back into the JSON file.
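
The behaviour described above can be previewed without opening the tuner. The sketch below is a standalone approximation under several assumptions: positive gains with a restoring sign, semi-implicit Euler integration, no clipping of steering angle or rate, and gain values and `dt` copied from the config printed earlier. The function names are hypothetical and the project's integrator may differ.

```python
import math
from typing import List


def steering_assist(angle: float, rate: float, cmd: float,
                    kp: float, kd: float, deadband: float) -> float:
    """Restoring acceleration, active only while the command is inside the deadband."""
    if abs(cmd) > deadband:
        return 0.0
    return -(kp * angle + kd * rate)


def simulate_release(angle0_deg: float, rate0_deg: float, steps: int,
                     dt: float = 0.1, kp: float = 52.5, kd: float = 8.75,
                     deadband: float = 0.03) -> List[float]:
    """Integrate the steering state with the wheel released (command = 0)."""
    angle = math.radians(angle0_deg)
    rate = math.radians(rate0_deg)
    history = [angle]
    for _ in range(steps):
        accel = steering_assist(angle, rate, 0.0, kp, kd, deadband)
        rate += accel * dt   # semi-implicit Euler: update the rate first
        angle += rate * dt   # then the angle, using the updated rate
        history.append(angle)
    return history


def simulate_coast(v0: float, steps: int, dt: float = 0.1, kv: float = 2.45) -> float:
    """Let the speed decay under a_assist = -kv * v with the throttle released."""
    v = v0
    for _ in range(steps):
        v += -kv * v * dt
    return v


angles = simulate_release(angle0_deg=20.0, rate0_deg=0.0, steps=100)
final_speed = simulate_coast(v0=3.0, steps=50)
```

Plotting `angles` against the step index gives a rough picture of how quickly the wheel recenters for a given pair of gains.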


In [667]:
# Assist-model tuner launch options
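tuner_angle0 = 20.0   # Initial steering angle (degrees)
tuner_rate0 = 0.0    # Initial steering rate (degrees/second)
tuner_steps = 100     # Number of simulation steps
tuner_sync_updates = True  # True writes edits back to the JSON config in real time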

## 启动助力调参器（中文说明）

下面的命令会调用 `assist_model_tuner.py`，在 Qt 窗口中展示方向盘与油门阻尼的仿真结果。你可以拖动滑块来调整阻尼参数，关闭窗口后自动保存。

- 使用 `conda run -n parking-rl` 以 notebook 外部环境启动脚本，确保 Matplotlib 后端配置为 `QtAgg`。
- 当 `tuner_sync_updates = True` 时，命令会附加 `--sync`；只要 `config_path` 不为 `None`，就会通过 `--config` 传入该路径，而仅在 `write_notebook_config = True` 时才会先把 `custom_config` 写入 JSON。
- 需要额外参数（例如时间步长）时，可在列表中继续追加命令行选项。


## Launch the Assist Tuner (English Notes)

The next command runs `assist_model_tuner.py` and opens a Qt window that visualises the steering/throttle damping response. Drag the sliders to adjust the damping parameters; they are saved automatically when you close the window.

- It uses `conda run -n parking-rl` so that the script runs with the dependencies of that environment, and the cell sets `MPLBACKEND=QtAgg` to select the Matplotlib backend.
- When `tuner_sync_updates = True`, the command gains `--sync`, which (per the launch-options cell) writes edits back to the JSON in real time. `custom_config` is written to `config_path` only when `write_notebook_config = True`; the path is passed via `--config` whenever `config_path` is not `None`.
- Add further command-line options to `tuner_cmd` if you need extra knobs such as timestep or logging verbosity.


In [668]:
from pathlib import Path
import json
import os
import subprocess

project_root = Path(".").resolve()
tuner_cmd = [
    "conda", "run", "-n", "parking-rl", "python", "assist_model_tuner.py",
    "--angle0", str(tuner_angle0),
    "--rate0", str(tuner_rate0),
    "--steps", str(tuner_steps),
]

if tuner_sync_updates:
    tuner_cmd.append("--sync")

if config_path is not None:
    if write_notebook_config:
        config_path.parent.mkdir(parents=True, exist_ok=True)
        with config_path.open("w", encoding="utf-8") as fh:
            json.dump(custom_config, fh, indent=2, ensure_ascii=False)
            fh.write("\n")
    tuner_cmd.extend(["--config", str(config_path.resolve())])

print("Launching assist model tuner:", " ".join(tuner_cmd))
env_vars = os.environ.copy()
env_vars["MPLBACKEND"] = "QtAgg"
subprocess.run(tuner_cmd, cwd=project_root, env=env_vars, check=False)

Launching assist model tuner: conda run -n parking-rl python assist_model_tuner.py --angle0 20.0 --rate0 0.0 --steps 100 --sync --config /home/ansatz/ME5418/parking_project/generated_configs/train_001.json


CompletedProcess(args=['conda', 'run', '-n', 'parking-rl', 'python', 'assist_model_tuner.py', '--angle0', '20.0', '--rate0', '0.0', '--steps', '100', '--sync', '--config', '/home/ansatz/ME5418/parking_project/generated_configs/train_001.json'], returncode=0)

## 查看调参结果（中文说明）

调参器运行后，可以执行下方代码读取当前配置文件，并打印关键阻尼参数以供核对。

- 若 `write_notebook_config` 为 True，则会从刚刚写入的 JSON 读取数据；否则直接读取现有文件。
- 输出的字段包括方向盘比例/阻尼、死区以及油门阻尼，便于确认是否符合预期。
- 想检查更多字段时，可继续访问 `custom_config` 中的其他键值。


## Inspect Tuner Outputs (English Notes)

After the tuner window closes, the next cell reloads the config file and prints the key damping parameters for verification.

- When `write_notebook_config` is True, it reads the JSON that was just written; otherwise it loads the existing file on disk.
- The printed values cover the steering proportional/derivative gains, deadband, and throttle damping so you can confirm the latest edits.
- Feel free to probe additional entries inside `custom_config` if you need more diagnostics.


In [669]:
# The tuner renders in an external Qt window; run the previous cell to launch it.
with config_path.open("r", encoding="utf-8") as fh:
    custom_config = json.load(fh)
vehicle_section = custom_config.get("vehicle", {})
print("steering_damping Kp:", vehicle_section.get("steering_damping"))  # steering proportional gain
print("steering_rate_damping Kd:", vehicle_section.get("steering_rate_damping"))  # steering rate damping
print("steering_deadband:", vehicle_section.get("steering_assist_deadband"))  # steering recentering deadband
print("velocity_damping Kv:", vehicle_section.get("velocity_damping"))  # throttle-release damping
print("velocity_deadband:", vehicle_section.get("velocity_deadband"))  # throttle-release deadband

steering_damping Kp: 52.5
steering_rate_damping Kd: 8.75
steering_deadband: 0.03
velocity_damping Kv: 2.45
velocity_deadband: 0.03


## 启动 CLI 演示（中文说明）

接下来的代码将生成命令并运行 `main.py`，在 Qt 窗口中展示手动或随机策略。

- 单元通过 `conda run -n parking-rl` 启动脚本，因此无需手动激活环境，但需确认该环境已存在且依赖和 Qt GUI 后端可用。
- 当 `write_notebook_config = True` 时，会先把当前 `custom_config` 保存到 `config_path`；无论是否写入，该路径都会通过 `--config` 传给 CLI。
- 想修改睡眠倍速、轮次数等参数时，返回上方变量单元重新运行即可生效。


## Launch the CLI Demo (English Notes)

The next cell assembles the command for `main.py` and opens a Qt window to run either the manual or random policy.

- The cell launches the script through `conda run -n parking-rl`, so manual activation is not required, but the `parking-rl` environment must exist with the dependencies and a Qt GUI backend installed.
- When `write_notebook_config` is `True`, the current `custom_config` is saved to `config_path` first; the path is passed to the CLI via `--config` either way.
- To adjust sleep scaling, episode count, or any other parameter, go back to the earlier variable cell, tweak the values, and rerun it before launching.
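
The tuner cell and the CLI cell repeat the same "optionally write, then pass `--config`" logic. If you extend the notebook, that step could be factored into one function. The sketch below is a suggestion only; `config_args` is a hypothetical helper that does not exist in the project.

```python
import json
from pathlib import Path
from typing import List, Optional


def config_args(config_path: Optional[Path], config: dict, write: bool) -> List[str]:
    """Optionally write `config` to `config_path`, then return the `--config` arguments."""
    if config_path is None:
        return []
    if write:
        config_path.parent.mkdir(parents=True, exist_ok=True)
        with config_path.open("w", encoding="utf-8") as fh:
            json.dump(config, fh, indent=2, ensure_ascii=False)
            fh.write("\n")
    return ["--config", str(config_path.resolve())]
```

Either launch cell could then call `cmd.extend(config_args(config_path, custom_config, write_notebook_config))` in place of the nested `if` block.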


In [670]:
from pathlib import Path
import json
import os
import subprocess

project_root = Path(".").resolve()
cmd = [
    "conda", "run", "-n", "parking-rl", "python", "main.py",
    "--mode", mode,
    "--episodes", str(episodes),
    "--max-steps", str(max_steps),
]

# Pass --sleep-scale only when it differs from 0.5 (assumed here to be the CLI default).
if sleep_scale != 0.5:
    cmd.extend(["--sleep-scale", str(sleep_scale)])

if config_path is not None:
    if write_notebook_config:
        config_path.parent.mkdir(parents=True, exist_ok=True)
        with config_path.open("w", encoding="utf-8") as fh:
            json.dump(custom_config, fh, indent=2, ensure_ascii=False)
            fh.write("\n")
    cmd.extend(["--config", str(config_path.resolve())])

print("Launching CLI demo:", " ".join(cmd))
env_vars = os.environ.copy()
env_vars["MPLBACKEND"] = "QtAgg"
subprocess.run(cmd, cwd=project_root, env=env_vars, check=False)

Launching CLI demo: conda run -n parking-rl python main.py --mode manual --episodes 1 --max-steps 4000 --config /home/ansatz/ME5418/parking_project/generated_configs/train_001.json
Episode 1 Step 1 Reward -52.958 Termination running Distance 35.08 Heading 15.2 deg
Episode 1 Step 2 Reward -52.957 Termination running Distance 35.08 Heading 15.2 deg
Episode 1 Step 3 Reward -52.959 Termination running Distance 35.08 Heading 15.2 deg
Episode 1 Step 4 Reward -52.972 Termination running Distance 35.08 Heading 15.2 deg
Episode 1 Step 5 Reward -52.975 Termination running Distance 35.09 Heading 15.2 deg
Episode 1 Step 6 Reward -52.975 Termination running Distance 35.09 Heading 15.2 deg
Episode 1 Step 7 Reward -52.974 Termination running Distance 35.09 Heading 15.2 deg
Episode 1 Step 8 Reward -52.977 Termination running Distance 35.09 Heading 15.2 deg
Episode 1 Step 9 Reward -52.977 Termination running Distance 35.09 Heading 15.2 deg
Episode 1 Step 10 Reward -52.967 Termination running Distance 3

CompletedProcess(args=['conda', 'run', '-n', 'parking-rl', 'python', 'main.py', '--mode', 'manual', '--episodes', '1', '--max-steps', '4000', '--config', '/home/ansatz/ME5418/parking_project/generated_configs/train_001.json'], returncode=0)

## 训练接入说明（中文说明）

- 在训练脚本中可以 `env = gym.make("ParkingEnv-v0", config=custom_cfg)`，或通过 `from parking_gym import ParkingEnv` 自行实例化；省略 `config` 时会采用默认参数（包含 9 束雷达）。
- 观测向量长度为 `11 + len(ray_angles)`，可通过 `env.observation_space.shape[0]` 或一次 `obs, _ = env.reset()` 验证；顺序与上文状态列表一致，末尾为 0~1 归一化的激光距离。
- `env.step(action)` 需要形如 `np.array([lon_accel, steer_accel], dtype=np.float32)` 的二维动作，可参考下方示例。
- 若需关闭观测噪声以便评估，可调用 `env.unwrapped.set_observation_noise(enabled=False)`；重新开启或调整标准差时通过 `std` 参数指定。

```python
import gymnasium as gym
import numpy as np
from parking_gym import ParkingEnv

env = ParkingEnv()  # 或 gym.make("ParkingEnv-v0")
obs, info = env.reset()
done = False
while not done:
    action = np.array([0.0, 0.0], dtype=np.float32)  # 替换为策略输出
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
```


## Training Integration Notes (English Explanation)

- In your training script call `env = gym.make("ParkingEnv-v0", config=custom_cfg)` or import `ParkingEnv` directly; omitting `config` uses the synced defaults (with nine lidar beams).
- The observation vector length is `11 + len(ray_angles)`; confirm with `env.observation_space.shape[0]` or by running `obs, _ = env.reset()`. The order matches the earlier state list, with the lidar distances occupying the tail.
- `env.step(action)` expects a two-element action vector such as `np.array([lon_accel, steer_accel], dtype=np.float32)`; see the snippet below.
- Disable observation noise during evaluation via `env.unwrapped.set_observation_noise(enabled=False)`, or adjust the standard deviation by passing a `std` value.

```python
import gymnasium as gym
import numpy as np
from parking_gym import ParkingEnv

env = ParkingEnv()  # or gym.make("ParkingEnv-v0")
obs, info = env.reset()
done = False
while not done:
    action = np.array([0.0, 0.0], dtype=np.float32)  # replace with policy output
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
```
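
For evaluation runs it is often convenient to wrap the loop above in a function that also caps the step count and accumulates the return. The sketch below assumes only the Gymnasium-style `reset()`/`step()` signatures used in the snippet above; `run_episode` and `policy` are hypothetical names.

```python
from typing import Callable, Tuple


def run_episode(env, policy: Callable, max_steps: int = 4000) -> Tuple[float, int]:
    """Roll out one episode and return (total reward, number of steps taken)."""
    obs, info = env.reset()
    total_reward, steps = 0.0, 0
    for _ in range(max_steps):
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total_reward += float(reward)
        steps += 1
        if terminated or truncated:
            break
    return total_reward, steps
```

Here `policy` is any callable that maps an observation to an action, for example `lambda obs: np.array([0.0, 0.0], dtype=np.float32)`.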
