While using EPDMS for closed-loop evaluation, we encountered behavior in the human_penalty_filter mechanism that we are unsure about, and would like to clarify whether it is the intended design or a potential bug.
Specifically, when a scenario triggers the human_penalty_filter (e.g., the human driver also fails the lane_keeping metric with a score of 0.0), our AI agent's reported lane_keeping score is exempted: it is overwritten from 0.0 to 1.0.
However, the final total score still appears to be calculated from the original, pre-exemption value of 0.0, which leads to an apparent contradiction in the final output report.
The relevant code is as follows:
for column in human_pdm_result.columns:
    # Aggregate columns are skipped, so they keep their pre-exemption values.
    if column in [
        "multiplicative_metrics_prod",
        "weighted_metrics",
        "weighted_metrics_array",
    ]:
        continue
    # If the human driver scored 0 on this metric, exempt the agent by
    # overwriting its value with 1.
    if human_pdm_result[column].iloc[0] == 0:
        pdm_result.at[0, column] = 1
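To illustrate the behavior we observe, here is a minimal, self-contained sketch of the issue (the column names, weights, and two-metric setup are our own simplifications, not taken from the EPDMS source): if the total score is computed before the exemption loop runs, overwriting the per-metric column afterwards cannot change it.

```python
import numpy as np
import pandas as pd

# Hypothetical weights for two metrics (assumption, not the real EPDMS weights).
weights = np.array([1.0, 1.0])  # e.g. ego_progress, lane_keeping

# Agent initially fails lane_keeping (0.0); the score is computed first.
pdm_result = pd.DataFrame([{"ego_progress": 1.0, "lane_keeping": 0.0}])
metrics = pdm_result[["ego_progress", "lane_keeping"]].iloc[0].to_numpy()
pdm_result["score"] = np.average(metrics, weights=weights)  # 0.5, pre-exemption

# The human driver also fails lane_keeping, triggering the exemption.
human_pdm_result = pd.DataFrame([{"ego_progress": 1.0, "lane_keeping": 0.0}])

# Exemption loop as in the snippet above: only per-metric columns are overwritten.
for column in human_pdm_result.columns:
    if human_pdm_result[column].iloc[0] == 0:
        pdm_result.at[0, column] = 1

# lane_keeping is now 1.0, but score still reflects the pre-exemption 0.0.
print(pdm_result[["lane_keeping", "score"]].iloc[0].tolist())  # [1.0, 0.5]
```

This matches what we see: the per-metric column is updated, but the aggregate columns (which the loop explicitly skips) are never recomputed.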
The final PDM score is calculated from the weighted_metrics array. We get:
token,valid,no_at_fault_collisions,drivable_area_compliance,driving_direction_compliance,traffic_light_compliance,ego_progress,time_to_collision_within_bound,lane_keeping,history_comfort,two_frame_extended_comfort,score
eeae24d38eb15e0a,True,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.875
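The contradiction can be checked directly from the row above: every per-metric value reported is 1.0, and any product or weighted average of values that are all 1.0 must itself be 1.0, regardless of the actual weights. A quick sanity check (the uniform weights below are placeholders, not the real EPDMS weights):

```python
import numpy as np

# The nine metric columns from the reported row, all exempted to 1.0.
reported = np.array([1.0] * 9)
weights = np.ones_like(reported)  # placeholder weights; any weights give the same result

# A weighted average of all-1.0 values is exactly 1.0, not the reported 0.875.
print(np.average(reported, weights=weights))
```

Since no combination of these post-exemption values can produce 0.875, the total score must have been aggregated from the pre-exemption values, which is the behavior we would like to confirm as intended or not.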