# COGS 188 - Project Proposal

# Project Description

You have the choice of doing either (1) an "AI solves a problem" style project or (2) running a Special Topics class on a topic of your choice.  If you want to do (2), you should fill out the _other_ proposal for that. This is the proposal description for (1).

You will design and execute a machine learning project. There are a few constraints on the nature of the allowed project. 
- The problem addressed will not be a "toy problem" or "common training students problem" like 8-Queens or a small Traveling Salesman Problem or similar
- If it's the kind of problem (e.g., RL) that interacts with a simulator or live task, then the problem will have a reasonably complex action space. For instance, a Wumpus World kind of thing with a 9x9 grid is definitely too small.  A simulated mountain car with a less complex 2-d road and simplified dynamics seems like a fairly low achievement level.  A more complex 3-d mountain car simulation with large extent and realistic dynamics, sure sounds great!
- If it's the kind of problem that uses a dataset, then the dataset will have >1k observations and >5 variables. I'd prefer more like >10k observations and >10 variables. A general rule is that if you have >100x more observations than variables, your solution will likely generalize a lot better. The goal of training a supervised machine learning model is to learn the underlying pattern in a dataset in order to generalize well to unseen data, so choosing a large dataset is very important.
- The project must include some elements we talked about in the course
- The project will include a model selection and/or feature selection component where you will be looking for the best setup to maximize the performance of your AI system. Generally RL tasks may require a huge amount of training, so extensive grid search is unlikely to be possible. However, exploring a few reasonable hyper-parameters may still be possible. 
- You will evaluate the performance of your AI system using more than one appropriate metric
- You will be writing a report describing and discussing these accomplishments


Feel free to delete this description section when you hand in your proposal.

# Names

- Katie Chung
- Jiawei Gao
- Grace Ortiz
- Hsiang-An Pao

# Abstract - Jiawei
This section should be short and clearly stated. It should be a single paragraph <200 words.  It should summarize: 
- what your goal/problem is
- what the data used represents and how they are measured
- what you will be doing with the data
- how performance/success will be measured

# Background - Jiawei

Fill in the background and discuss the kind of prior work that has gone on in this research area here. **Use inline citation** to specify which references support which statements.  You can do that through HTML footnotes (demonstrated here). I used to recommend Markdown footnotes (Google is your friend) because they are simpler, but recently I have had some problems with them working for me, whereas HTML ones always work so far. So use the method that works for you, but do use inline citations.

Here is an example of inline citation. After government genocide in the 20th century, real birds were replaced with surveillance drones designed to look just like birds<a name="lorenz"></a>[<sup>[1]</sup>](#lorenznote). Use a minimum of 3 to 5 citations, but we prefer more <a name="admonish"></a>[<sup>[2]</sup>](#admonishnote). You need enough citations to fully explain and back up important facts. 

Remember you are trying to explain why someone would want to answer your question or why your hypothesis is in the form that you've stated. 

# Problem Statement - Katie

*Clearly describe the problem that you are solving. Avoid ambiguous words. The problem described should be well defined and should have at least one ML-relevant potential solution. Additionally, describe the problem thoroughly such that it is clear that the problem is quantifiable (the problem can be expressed in mathematical or logical terms), measurable (the problem can be measured by some metric and clearly observed), and replicable (the problem can be reproduced and occurs more than once).*

Autonomous driving systems must make real-time decisions in dynamic environments while balancing safety, efficiency, and adherence to traffic rules. Traditional rule-based approaches struggle to generalize across diverse driving conditions, which makes reinforcement learning (RL) a promising alternative for self-driving applications. However, training RL models to drive safely and effectively remains a significant challenge due to the need for reliable evaluation metrics and the complexity of real-world driving scenarios.

In this project, we aim to develop and compare reinforcement learning models for self-driving car simulations using Webots. Our goal is to identify the model that achieves the highest overall performance across multiple key metrics, including:

- Safety: Minimizing collisions with obstacles, pedestrians, and other vehicles
- Lane Adherence: Ensuring the vehicle stays within lane boundaries and follows lane discipline
- Traffic Rule Compliance: Obeying traffic lights, stop signs, and yielding rules
- Efficiency: Reaching the intended destination within a reasonable time frame while maintaining safe driving behavior

The problem is quantifiable and measurable: the vehicle's actions can be evaluated using numerical metrics such as collision count, lane deviation, and time to destination, and each episode in the Webots simulation can be analyzed for performance using well-defined criteria. The problem is also replicable, as the experiment can be conducted multiple times with different RL models and configurations to assess their effectiveness under various driving conditions.

Through this project, we seek to determine which RL algorithm and model architecture yield the best trade-off between safety, rule adherence, and efficiency, contributing to the broader field of autonomous vehicle research.

# Data - Grace

*You should have a strong idea of what dataset(s) will be used to accomplish this project.* 

*If you know what (some) of the data you will use, please give the following information for each dataset:*
*- link/reference to obtain it*
*- description of the size of the dataset (# of variables, # of observations)*
*- what an observation consists of*
*- what some critical variables are, how they are represented*
*- any special handling, transformations, cleaning, etc will be needed*

*If you don't yet know what your dataset(s) will be, you should describe what you desire in terms of the above bullets.*

As this is a reinforcement learning project, the agent will generate its own data through its interaction with the Webots environment and will not use a pre-existing static dataset. Each observation from the agent will consist of various sensor readings <a name="webots_sensors"></a>[<sup>[2]</sup>](#webots_sensors_note):
- **RGB image frames**: 1D byte array 
- **Point cloud distance data (LiDAR)**: 1D float array
- **Proximity sensor data**: Float
- **GPS coordinates**: 3D float array
- **Acceleration**: 3D float array
- **Angular velocity**: 3D float array
- **Cardinal direction**: 3D float array
- **Wheel rotation**: Float (in radians) 
- Control commands <a name="webots_carlib"></a>[<sup>[3]</sup>](#webots_carlib_note)
    - **Steering**: Float (in radians)
    - **Throttle**: Float ([0, max_speed])
    - **Braking**: Float ([0, 1])
- **Time step**: Float

To ensure the vehicle's safety, lane adherence, and traffic rule compliance, the most critical variables are:
- RGB image frames for detecting pedestrians, other vehicles, traffic signs/signals, and lanes
- Point cloud distance data for determining following distance and preventing collisions
- Proximity sensor data for accurate close range obstacle and collision detection

To ensure the vehicle's efficiency, the most critical variables are time step and GPS coordinates, which are used to minimize drive time and verify that the vehicle reaches the correct final destination. 

Webots by default runs at 32 ms per time step, meaning approximately 31 observations will be recorded per second of simulation time (1000 / 32 ≈ 31.25). Observations will be stored as NumPy arrays for use in reinforcement learning training. To ensure consistency and prevent features with large numeric ranges from dominating, the sensor data will be preprocessed. All sensor readings will be normalized to a common scale to improve training stability and convergence speed. In addition, RGB images that are returned as 1D arrays will be reshaped into 3D arrays (height x width x channels) and pixel values will be normalized. GPS coordinates will be converted from absolute to relative positioning to simplify state representation. Lastly, null values returned by LiDAR sensors will be replaced with the maximum range of the sensor. 
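The preprocessing steps above could look roughly like the following NumPy sketch. The function names are hypothetical, the default of 4 channels per pixel is an assumption about the camera buffer layout that should be checked against the simulator documentation, and treating NaN or infinite LiDAR readings as "null" is likewise an assumption.

```python
import numpy as np


def reshape_image(raw: bytes, height: int, width: int, channels: int = 4) -> np.ndarray:
    """Reshape a flat byte buffer to (height, width, channels) and scale to [0, 1].

    channels=4 is an assumed buffer layout, not a confirmed one.
    """
    frame = np.frombuffer(raw, dtype=np.uint8).reshape(height, width, channels)
    return frame.astype(np.float32) / 255.0


def clean_lidar(ranges, max_range: float) -> np.ndarray:
    """Replace missing returns (assumed to be NaN or inf) with max_range, then scale to [0, 1]."""
    ranges = np.asarray(ranges, dtype=np.float32)
    filled = np.where(np.isfinite(ranges), ranges, max_range)
    return np.clip(filled, 0.0, max_range) / max_range


def relative_position(gps_xyz, origin_xyz) -> np.ndarray:
    """Express an absolute GPS reading relative to a reference point (e.g., the start)."""
    return np.asarray(gps_xyz, dtype=np.float32) - np.asarray(origin_xyz, dtype=np.float32)
```

Dividing by fixed constants (255 for pixels, the sensor's maximum range for LiDAR) keeps the scaling identical across episodes, which a per-batch min-max normalization would not.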

# Proposed Solution - Hsiang-An

The proposed solution combines **Reinforcement Learning (RL)** and **Convolutional Neural Networks (CNNs)** to develop a self-driving car system in a simulated environment created using **Webots**. The system learns to navigate autonomously by processing visual inputs from a front-facing camera and optimizing driving policies through trial and error.

## **Model**
- **Webots** provides a realistic simulation environment with customizable tracks and driving scenarios. The car is equipped with a front-facing camera to capture visual input, simulating real-world driving conditions.
- A **CNN** processes raw image data to extract meaningful features (e.g., lane markings, obstacles, traffic signs).
- Implemented using **PyTorch**, the CNN serves as the perception module, transforming visual inputs into a state space for the RL agent.
- The RL agent may use **Proximal Policy Optimization (PPO)** to learn an optimal driving policy.
- The state space consists of CNN-extracted features, and the action space includes controls like steering, throttle, and brake.
- A reward function gives positive rewards for staying within lanes and maintaining safe speeds, and negative rewards for collisions, going off-road, or violating traffic rules.
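As an illustration of the reward design described in the last bullet, here is a minimal per-step sketch. All field names, thresholds, and reward magnitudes are placeholder assumptions chosen for readability, not tuned values from this project.

```python
from dataclasses import dataclass


@dataclass
class StepInfo:
    """Per-step signals; every field name here is hypothetical."""
    lane_offset_m: float          # lateral distance from the lane center
    speed_mps: float              # current forward speed
    collided: bool = False
    off_road: bool = False
    rule_violation: bool = False


def step_reward(info: StepInfo,
                half_lane_width_m: float = 1.75,   # assumed lane geometry
                speed_limit_mps: float = 13.9) -> float:  # about 50 km/h, assumed
    # Large penalties for the failure cases named in the proposal.
    if info.collided:
        return -100.0
    if info.off_road:
        return -50.0
    reward = 0.0
    if info.rule_violation:
        reward -= 10.0
    # Lane keeping: +1 at the lane center, falling linearly to 0 at the lane edge.
    reward += max(0.0, 1.0 - abs(info.lane_offset_m) / half_lane_width_m)
    # Safe speed: small bonus while moving at or below the limit.
    if 0.0 < info.speed_mps <= speed_limit_mps:
        reward += 0.5
    return reward
```

The relative sizes of the penalties and bonuses strongly shape the learned behavior, so they are natural candidates for the hyper-parameter exploration mentioned in the project description.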

## **Training Pipeline**
1. The car collects image data from the Webots environment.
2. The CNN processes the images and extracts features.
3. The RL agent selects actions based on the features and receives rewards.
4. The agent updates its policy iteratively using collected experiences.
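
If PPO is the algorithm chosen for step 4, the policy update maximizes PPO's clipped surrogate objective. The notation below follows the standard PPO formulation rather than anything specific to this project:

$$
L^{\text{CLIP}}(\theta) = \hat{\mathbb{E}}_t\left[\min\left(r_t(\theta)\,\hat{A}_t,\ \operatorname{clip}\left(r_t(\theta),\,1-\epsilon,\,1+\epsilon\right)\hat{A}_t\right)\right],
\qquad
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\text{old}}}(a_t \mid s_t)}
$$

Where:
- $\pi_\theta(a_t \mid s_t)$ is the probability the current policy assigns to action $a_t$ in state $s_t$
- $\hat{A}_t$ is an estimate of the advantage at time step $t$
- $\epsilon$ is the clipping range, a hyper-parameter that limits how far a single update can move the policy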

## **Testing and Evaluation**
- The trained model is tested on unseen tracks or scenarios to evaluate generalization.
- Evaluation metrics are described in the Evaluation Metrics section below.
- A **rule-based controller** serves as the **benchmark**. It follows predefined rules (e.g., stay in the center of the lane, stop at obstacles) without learning capabilities.
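
A benchmark of this kind could be as simple as the sketch below, which implements only the two example rules named above. The function name, gains, thresholds, and the sign convention (positive offset means the car is right of center, negative steering turns left) are all assumptions for illustration.

```python
def rule_based_control(lane_offset_m: float,
                       obstacle_distance_m: float,
                       cruise_throttle: float = 0.5,
                       stop_distance_m: float = 5.0,
                       steering_gain: float = 0.3,
                       max_steering_rad: float = 0.5):
    """Return (steering, throttle, brake) from two fixed rules, with no learning."""
    # Rule 1: stop at obstacles.
    if obstacle_distance_m <= stop_distance_m:
        return 0.0, 0.0, 1.0
    # Rule 2: steer back toward the lane center (proportional control).
    steering = -steering_gain * lane_offset_m
    steering = max(-max_steering_rad, min(max_steering_rad, steering))
    return steering, cruise_throttle, 0.0
```

Because the controller takes the same inputs the metrics are computed from, it can be scored on exactly the same episodes as the RL agent.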

---

## Why This Solution Works

- CNNs excel at processing image data and have been successfully used in autonomous driving tasks like lane detection and object recognition. By extracting meaningful features from raw images, the CNN enables the RL agent to interpret complex visual inputs effectively.
- RL allows the agent to learn optimal policies through trial and error, making it well-suited for dynamic and unpredictable driving scenarios. The reward-based learning process ensures the agent improves over time by maximizing safe and efficient driving behaviors.
- Webots provides a realistic and customizable environment for training and testing, enabling the simulation of diverse driving scenarios. Because Webots controllers can be written in Python, the simulator can be driven from the same code as the PyTorch model, which helps keep the implementation reproducible.

---


# Evaluation Metrics - Katie

*Propose at least one evaluation metric that can be used to quantify the performance of both the benchmark model and the solution model. The evaluation metric(s) you propose should be appropriate given the context of the data, the problem statement, and the intended solution. Describe how the evaluation metric(s) are derived and provide an example of their mathematical representations (if applicable). Complex evaluation metrics should be clearly defined and quantifiable (can be expressed in mathematical or logical terms).*

In this project, we evaluate the performance of both the **benchmark model** and the **solution model** using key safety and efficiency metrics. These metrics ensure that the self-driving agent follows safe driving behavior while effectively navigating to its destination.

#### **1. Collision Rate (Safety)**
**Definition:** Measures the frequency of collisions per episode.  
- Lower values indicate better performance in avoiding obstacles and other vehicles.  
- Derived from the number of collisions detected during the simulation.  

**Mathematical Representation:**  
$$
\text{Collision Rate} = \frac{\text{Total Collisions}}{\text{Total Episodes}}
$$
Where:  
- **Total Collisions** is the number of times the agent collides with an object
- **Total Episodes** is the number of completed simulation runs

**Example Interpretation:**  
- If the agent crashes 10 times in 50 episodes, the collision rate is 0.2 (or 20%)  
- A safer model should minimize this rate
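
A minimal sketch of this computation, assuming collisions are logged as one count per episode (the function name is hypothetical). Note that if an episode can record more than one collision, the rate can exceed 1, so it is better read as "collisions per episode" than as a percentage.

```python
def events_per_episode(event_counts):
    """Mean number of events (e.g., collisions) per completed episode."""
    if not event_counts:
        raise ValueError("need at least one completed episode")
    return sum(event_counts) / len(event_counts)


# 10 collisions spread over 50 episodes, as in the example above.
collision_rate = events_per_episode([1] * 10 + [0] * 40)
```

The same helper applies unchanged to any other per-episode event count.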

#### **2. Lane Adherence (Safety & Rule Compliance)**
**Definition:** Measures how well the vehicle stays within lane boundaries  
- Calculated as the deviation from the center of the assigned lane over time

**Mathematical Representation:**  
$$
\text{Lane Deviation} = \frac{1}{T} \sum_{t=1}^{T} |d_t|
$$
Where:  
- $d_t$ is the lateral distance from the lane center at time $t$
- $T$ is the total number of time steps in an episode

**Example Interpretation:**  
- A **higher deviation** means the vehicle frequently strays out of its lane
- The goal is to **minimize lane deviation** for better lane-keeping performance
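
The formula translates directly into code. This sketch assumes the lateral offsets $d_t$ are available as a list of signed distances in meters, one per time step; how they are obtained from the simulator is left open.

```python
def mean_lane_deviation(lateral_offsets):
    """Mean absolute lateral distance from the lane center over one episode."""
    if not lateral_offsets:
        raise ValueError("episode has no time steps")
    return sum(abs(d) for d in lateral_offsets) / len(lateral_offsets)


# Offsets to the left (negative) and right (positive) count equally.
deviation = mean_lane_deviation([0.5, -0.5, 1.0, -1.0])
```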

#### **3. Traffic Rule Compliance (Safety & Legal Adherence)**
**Definition:** Tracks the number of violations related to red lights, stop signs, and illegal lane changes
- Lower values indicate better adherence to traffic laws

**Mathematical Representation:**  
$$
\text{Violation Rate} = \frac{\text{Total Violations}}{\text{Total Episodes}}
$$

**Example Interpretation:**  
- If a model runs 100 episodes and violates traffic rules 15 times, the violation rate is 0.15 (or 15%)
- A safer model will have a near-zero violation rate

#### **4. Time to Destination (Efficiency)**
**Definition:** Measures how quickly the agent reaches the goal, reported as the average speed (distance traveled divided by time taken) over a successful episode
- A balance is needed: the car should not drive recklessly fast but also should not drive too slowly

**Mathematical Representation:**  
$$
\text{Time Efficiency} = \frac{\text{Total Distance Traveled}}{\text{Total Time Taken}}
$$

**Example Interpretation:**  
- If an agent takes 100 seconds to reach a destination 500 m away, its time efficiency score is 5 m/s
- A good model should balance speed while following safety rules
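
A short sketch of the calculation, including the conversion from simulation steps to seconds using the 32 ms default time step mentioned in the Data section. The function names are hypothetical.

```python
def time_efficiency(distance_m: float, time_s: float) -> float:
    """Average speed in m/s over an episode."""
    if time_s <= 0:
        raise ValueError("time_s must be positive")
    return distance_m / time_s


def time_efficiency_from_steps(distance_m: float, n_steps: int, timestep_ms: int = 32) -> float:
    """Same metric, with elapsed time derived from the number of simulation steps."""
    return time_efficiency(distance_m, n_steps * timestep_ms / 1000.0)


# 3125 steps at 32 ms each is 100 s of simulation time.
score = time_efficiency_from_steps(500.0, 3125)
```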


# Ethics & Privacy - Hsiang-An

Ethical concerns in this project are complex, with safety being a primary consideration. Self-driving vehicles rely on ML models to interpret their surroundings and make real-time decisions, but these models may encounter unpredictable scenarios that lead to accidents. Unlike human drivers, AI systems lack personal accountability, making it difficult to determine who is responsible when failures occur. Questions of liability (manufacturer, software engineers, or the vehicle owner) will become even more complicated if our model contributes to such issues. Another ethical dilemma arises in unavoidable accident scenarios, where the system may need to choose between different harmful outcomes. Should the vehicle prioritize the safety of its passengers over pedestrians or other drivers? This challenge highlights the difficulty engineers face in programming and defining ethical guidelines.

Bias in machine learning models is another important consideration, as training data may not always represent the full diversity of real-world driving conditions. If the dataset lacks a wide range of pedestrian appearances or road environments, the AI may struggle to make fair and accurate decisions. Without comprehensive testing across diverse environments, the system could unintentionally discriminate against certain groups, leading to unsafe or unfair outcomes.

Privacy is also a key concern, as self-driving vehicles collect vast amounts of data, including location, passenger behavior, and even personal and vehicle information. We would use only anonymized data, so that the data needed to improve AI performance does not put anyone at risk of a privacy leak.


# Team Expectations 


* Meet once a week via Zoom, more as needed closer to the end of the quarter
* Respond in group chat within 12 hours
* Conflicts are resolved by majority vote; any conflict should be brought up within the group before seeking TA assistance 
* Work should be divided evenly to the best of the group's ability
* Be aware of deadlines; each member's portion of the work should be completed at least a couple of hours before the deadline to allow for revision 


# Project Timeline Proposal - Hsiang-An


| Meeting Date  | Meeting Time| Completed Before Meeting  | Discuss at Meeting |
|---|---|---|---|
| 2/12  |  5 PM |  Brainstorm topics/questions and split parts for research (all)  | Determine best form of communication; Discuss and decide on final project topic; discuss hypothesis; begin background research | 
| 2/14  |  12 PM |  Finalize Project Proposal (all), Search for datasets (Jiawei) | Complete background, Discuss datasets and metrics, finalize questions, Turn in proposal | 
| 2/21  | 5 PM  | Import and Wrangle Data (Hsiang-An), EDA (Grace)  | Discuss and finalize datasets and metrics for EDA, Review if work is divided in meaningful manner   |
| 2/28  |  5 PM  | Finalize data, Continue on EDA (Grace), Programming start for RL (Katie) | Review if EDA and data wrangling is completed, Discuss possible algorithm, validations, model selection |
| 3/5  | 6 PM  | Finalize EDA, continue programming (Katie), Start Analysis (Jiawei, Hsiang-An) | Review project code, Analyze algorithms and model performance, Split work based on what’s lacking |
| 3/12  | 12 PM  | Complete Analysis; Start results/conclusion/discussion (Grace, Katie) | Discuss and complete project, Plan for extra meeting if needed |
| 3/16  | 5 PM  | Complete and Edit project (all)| Discuss and review report |
| 3/19  | Before 11:59 PM  | Finalize project (all) | Turn in Final Project  |

# Footnotes
<a name="lorenznote"></a>1.[^](#lorenz): Lorenz, T. (9 Dec 2021) Birds Aren’t Real, or Are They? Inside a Gen Z Conspiracy Theory. *The New York Times*. https://www.nytimes.com/2021/12/09/technology/birds-arent-real-gen-z-misinformation.html<br> 
<a name="webots_sensors_note"></a>2.[^](#webots_sensors): Cyberbotics API Reference: doc. https://cyberbotics.com/doc/reference/nodes-and-api-functions?tab-language=python  
<a name="webots_carlib_note"></a>3.[^](#webots_carlib): Cyberbotics Car & Driver Library Reference: doc. https://cyberbotics.com/doc/automobile/car-and-driver-libraries 

