
Heedou/PASFramework


PAS_Framework

Contributors

| Name | Affiliation | Email |
| --- | --- | --- |
| Sangyub Lee | Intelligence and Informatics Lab, Korea University, Seoul, South Korea & Korean National Police Agency | yubii2@korea.ac.kr |
| Heedou Kim | Data Mining and Information Systems Lab, Korea University, Seoul, South Korea & Korean National Police Agency | heedou123@korea.ac.kr |
| Hyuncheol Kim* | Intelligence and Informatics Lab, Korea University, Seoul, South Korea | harrykim@korea.ac.kr |
  • *: Corresponding Author

PAS: Police Action Scenario Framework

Fig. 1: PAS Construction with Example

PAS (Police Action Scenario) is a dedicated framework for evaluating Large Language Models (LLMs) in real-world policing contexts.

Modern policing requires nuanced judgment and situational awareness, so standard benchmarks alone are not sufficient. PAS introduces a scenario-based, multi-stage evaluation method designed specifically for policing tasks.

🔍 Evaluation Pipeline

PAS defines LLM evaluation as a five-stage process:

  • S: Police Action Scenarios
    Situation-driven tasks reflecting real-world policing needs.

  • R: Reference Responses
    Expert-crafted gold answers created with input from law enforcement professionals.

  • G: Response Generation
    LLM-generated outputs based on the given scenarios.

  • M: Core Evaluation Metrics
    Task-relevant metrics and evaluation methodologies tailored for public safety applications.

  • P: Policing LLM Performance Evaluation
    Final assessment of the LLM’s effectiveness, accuracy, and fitness for deployment in policing.

Formally expressed as:
E_police = f(S, R, G, M, P)
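
The five stages above can be read as a small evaluation loop. The sketch below is only an illustration of that reading, not the PAS implementation: the names `Scenario`, `token_f1`, and `evaluate` are hypothetical, the token-overlap score is a stand-in for whatever metrics M the framework actually uses, and averaging per metric is an assumed form of f.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List


@dataclass
class Scenario:
    """One police action scenario (S) paired with its expert reference (R)."""
    prompt: str
    reference: str


def token_f1(response: str, reference: str) -> float:
    """Placeholder metric (assumption): F1 over unique lowercase tokens."""
    resp = set(response.lower().split())
    ref = set(reference.lower().split())
    overlap = len(resp & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(resp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def evaluate(
    scenarios: List[Scenario],
    generate: Callable[[str], str],
    metrics: Dict[str, Callable[[str, str], float]],
) -> Dict[str, float]:
    """Run G on every scenario, score with M against R, and average into P."""
    if not scenarios:
        raise ValueError("at least one scenario is required")
    scores: Dict[str, List[float]] = {name: [] for name in metrics}
    for scenario in scenarios:
        response = generate(scenario.prompt)  # G: response generation
        for name, metric in metrics.items():  # M: score against R
            scores[name].append(metric(response, scenario.reference))
    # P: one aggregate score per metric (mean is an assumed choice of f)
    return {name: mean(values) for name, values in scores.items()}


# Usage with a made-up scenario and a stub standing in for an LLM call.
scenarios = [
    Scenario(
        prompt="A caller reports a stolen bicycle. What should the officer do first?",
        reference="record the report and collect the owner details",
    )
]


def stub_model(prompt: str) -> str:
    return "collect the owner details and record the report"


result = evaluate(scenarios, stub_model, {"token_f1": token_f1})
print(result)
```

Passing `generate` and `metrics` in as callables keeps the loop independent of any particular model or scoring method, so a real LLM client or an expert-rating procedure could be swapped in without changing `evaluate`.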


PAS fills the gap in evaluating LLMs for law enforcement by combining structured scenarios, expert-crafted reference responses, and targeted metrics.
