# **Integrative Project - Milestone 5. Final Model**
## **Tecnológico de Monterrey**
------------------------------------------------------------------
### Profa. Dra. Grettel Barceló Alonso

### Prof. Dr. Luis Eduardo Falcón Morales

### Profa. Verónica Sandra Guzmán de Valle
------------------------------------------------------------------
### Marcela Alejandra Rosales Jiménez - A01032022
### José Antonio Mendoza Castro - A01794067

## **1. Environment setup**

### Library installation

In [1]:
!pip install -q transformers

### Libraries

In [2]:
import pandas as pd
import spacy
from transformers import pipeline

In [3]:
pd.set_option('display.max_colwidth', None)
pd.set_option('display.max_rows', None)

## **2. Data loading**

In [4]:
dataset_path = "scripts/wts_json_data"
df = pd.read_csv(f"{dataset_path}/captions_dataset.csv")

## **3. Functions**

In [5]:
def extract_keywords(text):
  """
  Extract the nouns, verbs, and adjectives from a text.

  Relies on the global spaCy pipeline `nlp`, which is loaded in section 4.
  """
  doc = nlp(text)
  # Keep the surface form of every token tagged as a noun, verb, or adjective.
  return [token.text for token in doc if token.pos_ in ('NOUN', 'VERB', 'ADJ')]

In [21]:
def classify_text(text):
  """
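  Classify a text against a fixed list of candidate labels and return the top-ranked label.
  Relies on the global `classifier` pipeline, which is created in section 5.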
  """
  candidate_labels_tmp = [
      'accident',
      'caution',
      'collision',
      'crash',
      'danger',
      'emergency',
      'hazard',
      'injury',
      'prevention',
      'risk',
      'safety',
      'warning'
  ]
  result = classifier(text, candidate_labels_tmp)
  return result['labels'][0]

## **4. Data exploration**

We display a sample of the data, which has 8 columns. The `id` column identifies the texts; each `id` groups 5 rows, one per stage. The `video_name` column is the name of the video the text belongs to. The `fps` field is the frame rate in frames per second. The `label` column is the stage of the video; each video has 5 stages: prerecognition, recognition, judgement, action, and avoidance. The `caption_pedestrian` and `caption_vehicle` columns give the text from the pedestrian's and the vehicle's perspective, respectively. Finally, `start_time` and `end_time` are the seconds at which the stage starts and ends.
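
To relate a stage to the video itself, `start_time`, `end_time`, and `fps` can be combined into frame indices. The following is a minimal sketch, assuming the times are seconds measured from the start of the video and that frame 0 sits at time 0; the helper name `stage_frame_range` is ours and is not part of the dataset tooling.

```python
def stage_frame_range(start_time: float, end_time: float, fps: float) -> tuple:
    """Convert a stage's start/end times (in seconds) to the nearest frame indices.

    Assumption: times are measured from the start of the video and frame 0 is at t = 0.
    """
    start_frame = round(start_time * fps)
    end_frame = round(end_time * fps)
    return start_frame, end_frame


# Prerecognition stage of video3187.mp4 (row 20 of the sample), recorded at 29.97 fps.
print(stage_frame_range(20.954, 21.021, 29.97))  # (628, 630)
```

On the rows shown in this section the times fall almost exactly on frame boundaries, which suggests that the annotations were made frame by frame.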

In [7]:
df[20:40]

Unnamed: 0,id,video_name,fps,label,caption_pedestrian,caption_vehicle,start_time,end_time
20,9fa4da9c-4b60-48d9-a159-53547b02aedf,video3187.mp4,29.97,prerecognition,"The pedestrian, a middle-aged male in his 50s, stood in an urban setting on a bright weekday morning. He was wearing a blue T-shirt and slacks, his height measuring around 170 cm. With glasses perched on his nose and a black hat atop his head, he appeared to be closely watching his surroundings. Unaware of the vehicle directly in front of him, the pedestrian's body was positioned perpendicular to the vehicle and to the right. His line of sight was fixed straight ahead, aligned with the direction of travel. The pedestrian seemed to be in no rush, moving slowly as he attempted to cross the road. The conditions were optimal, with the road surface dry and the weather clear. Despite the light traffic volume on the main road, there was only one lane available for vehicles traveling in one direction. Sidewalks were present on both sides, adding to the pedestrian's sense of safety and ease as he made his way across.","The vehicle was moving at a constant speed when it suddenly came to a stop. Positioned on the right side of a pedestrian, the vehicle was in close proximity to them. The pedestrian was clearly visible within the vehicle's field of view. The vehicle's speed was at 0km/h, indicating a complete halt. In terms of the environment conditions, the pedestrian was a male in his 50s, standing at a height of 170 cm. He was wearing glasses, a black hat, a blue T-shirt, and blue slacks. The event took place in an urban area on a weekday with clear weather and bright brightness. The road surface was dry and level, consisting of asphalt. The traffic volume was light as the vehicle was traveling on a main road with a single one-way lane and sidewalks on both sides.",20.954,21.021
21,9fa4da9c-4b60-48d9-a159-53547b02aedf,video3187.mp4,29.97,recognition,"The pedestrian is a middle-aged man in his 50s with a height of 170 cm. He is wearing a black hat, a blue T-shirt as an upper body clothing, and blue slacks as lower body clothing. He is also wearing glasses. The event takes place in an urban area on a clear weekday. The weather is bright and the road conditions are dry. The road surface is made of asphalt and there is only one lane on the one-way main road. There is a sidewalk on both sides. The pedestrian is standing directly in front of a vehicle while facing perpendicular to it and to the right. He is closely watching the vehicle and is almost aware of its presence. His line of sight is in front, aligning with the vehicle's direction of travel. He is slowly crossing the road and appears to be cautious. The traffic volume is light and the road is level.","The vehicle is currently in a state of constant speed and is positioned on the right side of a pedestrian. They are situated in close proximity to each other, with the vehicle being able to see the pedestrian within its field of view. At this moment, the vehicle has come to a complete stop, with its speed at 0 km/h. As for the environmental conditions, the pedestrian is a male in his 50s, standing at a height of 170 cm. The pedestrian is wearing glasses and a black hat, along with a blue t-shirt and blue slacks. The event takes place in an urban area on a weekday. The weather is clear, and the brightness is bright. The road conditions are optimal, with the asphalt road surface being dry and level. This particular location is classified as a main road with a single one-way lane and sidewalks on both sides. The traffic volume is light, allowing for a relatively calm atmosphere.",21.054,21.121
22,9fa4da9c-4b60-48d9-a159-53547b02aedf,video3187.mp4,29.97,judgement,"The pedestrian, a man in his 50s wearing a blue T-shirt and slacks, with a black hat and glasses, stands directly in front of the vehicle on the urban main road on a clear and bright weekday. His body is positioned perpendicular to the vehicle and to the right. He is close to the vehicle, and his line of sight is in front, in the direction of travel. The pedestrian closely watches his surroundings, almost noticing the presence of the vehicle. He is aware of the vehicle and slowly crosses the road. The road surface is dry and level, with light traffic volume. There is one-way traffic on the road, with a single lane and sidewalks on both sides. The environment presents a calm and steady atmosphere as the pedestrian cautiously proceeds with his crossing, taking into account the surroundings and his own visual state. Overall, the situation remains under control and harmonious, with no immediate danger or rush.","The vehicle is currently stationary, maintaining a constant speed, with a close relative distance to a pedestrian on its right side. The vehicle's field of view allows it to see the pedestrian clearly. Analyzing the surroundings, it can be observed that the weather is clear, the brightness is bright, and the road surface conditions are dry. The vehicle is situated in an urban area, specifically on a main road with a single lane in the direction it is traveling. There are sidewalks on both sides of the road. The environment conditions indicate that the pedestrian is a male in his 50s, approximately 170 cm tall, wearing glasses and a black hat. His upper body is adorned with a blue T-shirt, and he is wearing blue slacks for his lower body. This event occurs on a weekday with light traffic volume on the road. The vehicle is placed on level asphalt.",21.154,21.288
23,9fa4da9c-4b60-48d9-a159-53547b02aedf,video3187.mp4,29.97,action,"The pedestrian stands perpendicular to the vehicle and to the right. Positioned directly in front of the vehicle, the pedestrian is relatively close to it. Their line of sight is in front, aligned with the direction of travel. Closely watching, the pedestrian's visual status indicates their awareness of the surroundings. Moving slowly, the pedestrian is making their way across. In terms of the environment condition, the pedestrian is a male in his 50s, with a height of 170 cm. He wears glasses and a black hat, and his upper body is adorned with a blue t-shirt. His lower body is clad in blue slacks. Taking place in an urban setting on a clear and bright weekday, the road surface is dry and level, made of asphalt. The traffic volume is light, and it is a one-way, one-lane road with sidewalks on both sides. The event unfolds on a main road.","The vehicle is currently experiencing constant speed as it moves along. Positioned on the right side of the pedestrian, it is close in relative distance. From its field of view, the pedestrian is visible to the vehicle. The vehicle is in a stopped state with a speed of 0km/h. In the surrounding environment, a male in his 50s is present. He has a height of 170 cm and wears glasses. Additionally, he sports a black hat, a blue T-shirt for his upper body, and blue slacks for his lower body. The setting of this event takes place in an urban area, specifically on a weekday. The weather is clear with bright lighting conditions. The road surface is dry and level, made of asphalt. It is a main road with one-way traffic and a single lane. Sidewalks are available on both sides. The traffic volume is light in this scenario.",21.321,23.357
24,9fa4da9c-4b60-48d9-a159-53547b02aedf,video3187.mp4,29.97,avoidance,"The pedestrian, a middle-aged man in his 50s, was standing directly in front of the vehicle on a clear weekday. He was relatively close to the vehicle, with his body perpendicular and to the right. Closely watching, he had a line of sight in front, following the direction of the vehicle's travel. Although he had almost noticed the vehicle's presence, he was slowly crossing the road. The man's appearance included glasses and he was dressed in a blue T-shirt and slacks, with a black hat. The urban environment offered bright lighting, with dry and level asphalt road conditions. The main road had light traffic with only one lane, and sidewalks were present on both sides. These details painted a clear picture of the pedestrian's situation, providing important information on his relative position to the vehicle and overall surroundings.","The vehicle is currently experiencing acceleration and is positioned on the right side of a pedestrian. The vehicle is at a close distance from the pedestrian and has a clear field of view where it can see the pedestrian. The vehicle is about to start moving and is moving at a speed of 5km/h. The environment conditions show that the pedestrian is a male in his 50s, with a height of 170 cm. He is wearing glasses and a black hat, along with a blue T-shirt and blue slacks. The event is taking place in an urban setting, specifically on a weekday. The weather is clear and bright, with dry road surface conditions on a level asphalt road. The traffic volume is light and the road is classified as a main road with one-way traffic and one lane. There are sidewalks present on both sides of the road.",23.39,24.892
25,99934c50-d5d3-4c44-a9e3-9450fe8e1ceb,video1248.mp4,30.0,prerecognition,"The pedestrian, a woman in her 40s, stands still on a clear, bright weekday morning. She is wearing a purplish red T-shirt and a gray skirt, her height measuring approximately 160 cm. Oblivious to the vehicle nearby, she closely watches her intended crossing destination. The pedestrian's body is oriented perpendicularly to the left side of the vehicle while positioning herself diagonally to the right and in front of it. In this urban environment, on a level, dry asphalt road with usual traffic volume, the main road consists of a one-way lane with sidewalks on both sides. The pedestrian seems unaware of the vehicle's presence, as she remains engrossed in her surroundings. Despite the busy traffic and the potential danger it poses, she stands still, perhaps waiting for the opportune moment to move across the road.","A vehicle is decelerating while travelling diagonally to the left in front of a pedestrian. The vehicle is near the pedestrian and has a clear field of view with the pedestrian visible. It is moving straight ahead at a speed of 10km/h. The vehicle is in an urban environment and it is a weekday. The weather is clear and the brightness is bright. The road surface conditions are dry and level, with the road surface being asphalt. The traffic volume is normal and the road is classified as a main road with one way and one lane. There are sidewalks present on both sides of the road. As for the environment condition, the pedestrian is a female in her 40s with a height of 160 cm. She is wearing a purplish red T-shirt and a gray skirt.",19.633,19.967
26,99934c50-d5d3-4c44-a9e3-9450fe8e1ceb,video1248.mp4,30.0,recognition,"A female pedestrian in her 40s with a height of 160 cm was standing still on a main road. She was wearing a purplish red T-shirt and a gray skirt. The pedestrian's body was perpendicular to the vehicle and positioned diagonally to the right, in front of the vehicle. She had a close line of sight to her crossing destination and was closely watching her surroundings. Although she was almost noticed, she was aware of the vehicle. The pedestrian was in an urban area on a weekday, with clear weather and bright lighting. The road surface conditions were dry and level, with asphalt as the road surface type. The traffic volume was usual on this one-way, one-lane road, which had sidewalks on both sides.","The vehicle is currently decelerating and is positioned diagonally to the left in front of the pedestrian. The vehicle is near to the pedestrian and has a clear view of them. It appears that the vehicle is about to stop. Additionally, the vehicle is traveling at a speed of 5km/h. As for the environmental conditions, the pedestrian is a female in her 40s, standing at a height of 160 cm. She is wearing a purplish red T-shirt and a gray skirt. The event takes place in an urban setting on a clear and bright weekday. The road surface is dry and level, with asphalt as the type of road surface. The traffic volume is usual, and the road is classified as a main road with one-way traffic flow and one lane. Sidewalks are available on both sides of the road.",20.0,22.267
27,99934c50-d5d3-4c44-a9e3-9450fe8e1ceb,video1248.mp4,30.0,judgement,"The pedestrian, a female in her 40s, stood still diagonally to the right in front of the vehicle on a clear and bright weekday. She was wearing a purplish red T-shirt and a gray skirt, standing on the dry asphalt road surface. The road was a main road with only one way and one lane, and there were sidewalks on both sides. The pedestrian's body was oriented diagonally to the left, opposite to the direction of the vehicle. She closely watched the road surface, indicating her visual focus. Although she had almost noticed the vehicle, she was unaware of its presence. The relative distance between the pedestrian and the vehicle was close. The overall environment conditions were urban, and the traffic volume was usual. The pedestrian's line of sight was unobstructed, and the road surface was level. These details provide a clear understanding of the situation surrounding the pedestrian and the environment they were in at that particular time.","The vehicle is positioned diagonally to the left in front of the pedestrian. It is close to the pedestrian and the pedestrian is visible within the vehicle's field of view. The vehicle is currently stopped, with a speed of 0 km/h. The environment conditions surrounding the event include a female pedestrian in her 40s, standing at a height of 160 cm. She is wearing a purplish red T-shirt and a gray skirt. The event is taking place in an urban area on a weekday, with clear weather and bright brightness. The road surface conditions are dry and level, with the road classified as a main road consisting of one-way traffic with only one lane. Both sides of the road have sidewalks.",22.3,22.367
28,99934c50-d5d3-4c44-a9e3-9450fe8e1ceb,video1248.mp4,30.0,action,"The pedestrian, a woman in her 40s, stood diagonally to the right and in front of the vehicle. She closely watched the vehicle, almost noticed its presence. Her body was oriented diagonally to the left, opposite to the direction of the vehicle. With her line of sight fixed on her crossing destination, she appeared ready to cross. Moving slowly, she stood in front of the vehicle, seemingly aware of its presence. The woman's clothing consisted of a purplish red T-shirt on her upper body and a gray skirt on her lower body. The urban environment was characterized by a clear and bright weather on a dry asphalt road, with usual traffic volume. The road was a main road with only one lane in the direction the pedestrian was facing. Sidewalks were present on both sides.","The vehicle in question is currently undergoing acceleration. It is positioned diagonally to the left in front of a pedestrian and is relatively close to them. The vehicle has a clear field of view of the pedestrian and is currently at a stop, with a speed of 0km/h. In terms of the environment condition, the pedestrian is a female in her 40s, standing at approximately 160 cm in height. She is wearing a purplish red T-shirt on her upper body and a gray skirt on her lower body. The event takes place in an urban setting on a weekday, with clear and bright weather conditions. The road surface is dry and level, consisting of asphalt. The traffic volume is usual, and the road is classified as a main road with a single one-way lane. There are sidewalks available on both sides of the road.",22.4,24.167
29,99934c50-d5d3-4c44-a9e3-9450fe8e1ceb,video1248.mp4,30.0,avoidance,"The pedestrian stood perpendicular to the vehicle and to the left. They were positioned directly in front of the vehicle, at a close relative distance. Their line of sight was focused on their crossing destination, and they closely watched the surroundings. Moving slowly, they were heading across the road. The pedestrian, an adult female in her 40s, stood at a height of 160 cm. She was wearing a purplish red T-shirt, and a gray skirt. The event took place in an urban area on a clear weekday with bright lighting conditions. The road surface was dry and level, made of asphalt. It was a main road with one-way traffic and one lane, with sidewalks available on both sides. The traffic volume was normal, and the surroundings were typically suburban. This information provides a comprehensive description of the pedestrian's orientation, appearance, location, and the environmental conditions in which the event occurred.","The vehicle is currently in a state of constant speed, indicating that it is maintaining a steady pace without any acceleration or deceleration. It is positioned on the left side of a pedestrian, implying that the vehicle is located to the left of the individual as viewed from the vehicle's perspective. The relative distance between the vehicle and the pedestrian is close, indicating a short distance between them. From the vehicle's field of view, the pedestrian is visible and can be seen by the driver. The vehicle itself is currently in a stopped position, suggesting that it is not in motion. Its speed is recorded as 0 km/h, indicating complete immobility. As for the environment conditions, the pedestrian is a female in her 40s, who stands at a height of 160 cm. She is wearing a purplish red T-shirt on her upper body and a gray skirt on her lower body. The event takes place in an urban area, specifically on a weekday, with a clear and bright weather condition. 
The road surface is dry and level, composed of asphalt. The traffic volume on the road is usual, and the road itself is classified as a main road with a single one-way lane and sidewalks available on both sides.",24.2,27.967


We count words to identify terms that could be relevant for a classification.

In [8]:
nlp = spacy.load('en_core_web_sm')

In [9]:
#TODO: Write a function that iterates over each label and applies the extraction to each text column
#text_columns = ['caption_pedestrian', 'caption_vehicle']

In [10]:
#labels = ['prerecognition', 'recognition', 'judgement', 'action', 'avoidance']

In [11]:
list_kw_action_pedestrian = df[df['label'] == 'action']['caption_pedestrian'].apply(extract_keywords).tolist()

In [12]:
dict_act = {}

for l in list_kw_action_pedestrian:
  for w in l:
    if w in dict_act:
      dict_act[w] += 1
    else:
      dict_act[w] = 1

In [13]:
df_act = pd.DataFrame.from_dict(dict_act, orient='index', columns=['count'])

In [14]:
df_act = df_act.sort_values(by='count', ascending=False).reset_index().rename(columns={'index': 'word'})

In [15]:
df_act.head()

Unnamed: 0,word,count
0,vehicle,12684
1,road,12169
2,pedestrian,11491
3,front,4745
4,traffic,4489
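
The count above covers only the pedestrian captions of the `action` stage. The pending generalization to every label and both caption columns could look like the sketch below. The function names `count_keywords` and `count_by_label_and_column` are hypothetical, and the rows are assumed to be dictionaries such as those returned by `df.to_dict("records")`. The extractor is passed in as a parameter; in the notebook it would be `extract_keywords`, while the demo uses `str.split` as a stand-in so that it runs without spaCy.

```python
from collections import Counter


def count_keywords(texts, extract):
    """Count how often each keyword appears across a collection of texts."""
    counts = Counter()
    for text in texts:
        counts.update(extract(text))
    return counts


def count_by_label_and_column(rows, labels, text_columns, extract):
    """Return a {(label, column): Counter} mapping for rows given as dictionaries."""
    result = {}
    for label in labels:
        for column in text_columns:
            texts = [row[column] for row in rows if row["label"] == label]
            result[(label, column)] = count_keywords(texts, extract)
    return result


# Toy rows standing in for df.to_dict("records"); str.split stands in for extract_keywords.
rows = [
    {"label": "action", "caption_pedestrian": "pedestrian crosses road", "caption_vehicle": "vehicle stops"},
    {"label": "action", "caption_pedestrian": "pedestrian waits", "caption_vehicle": "vehicle moves"},
]
counts = count_by_label_and_column(
    rows, ["action"], ["caption_pedestrian", "caption_vehicle"], str.split
)
print(counts[("action", "caption_pedestrian")].most_common(1))  # [('pedestrian', 2)]
```

Using `Counter` replaces the manual dictionary bookkeeping, and `most_common()` returns the words already sorted by frequency.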


Here we consider common English words related to road safety. As the result shows, seven of the twelve appear in the pedestrian captions of the `action` stage.

In [16]:
df_act[df_act['word'].isin(['accident', 'caution', 'collision', 'crash', 'danger', 'emergency', 'hazard', 'injury', 'prevention', 'risk', 'safety', 'warning'])]

Unnamed: 0,word,count
218,caution,120
272,safety,72
398,danger,29
437,risk,24
940,collision,4
1080,accident,3
1255,hazard,2
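
To make explicit which candidate words never occur, the counts above can be compared with the full list. This is a small sketch that copies the counts from the output of the previous cell. Note that `extract_keywords` stores the surface form of each token without lemmatizing, so inflected or derived forms such as "risks" or "dangerous" are not counted under these words.

```python
candidate_labels = [
    'accident', 'caution', 'collision', 'crash', 'danger', 'emergency',
    'hazard', 'injury', 'prevention', 'risk', 'safety', 'warning',
]

# Counts copied from the output of the previous cell (pedestrian captions, 'action' stage).
observed_counts = {
    'caution': 120, 'safety': 72, 'danger': 29, 'risk': 24,
    'collision': 4, 'accident': 3, 'hazard': 2,
}

# Candidate words that do not appear at all in the counted keywords.
missing = [label for label in candidate_labels if label not in observed_counts]
print(missing)  # ['crash', 'emergency', 'injury', 'prevention', 'warning']
```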


## **5. Data modeling**

In [24]:
import torch
torch.cuda.is_available()

True

In [28]:
if torch.cuda.is_available():
    classifier = pipeline('zero-shot-classification', model = 'facebook/bart-large-mnli', device="cuda")
else:
    classifier = pipeline('zero-shot-classification', model = 'facebook/bart-large-mnli')

We use the words identified in the previous section as candidate labels to classify the texts.

### Sample

In [29]:
# We consider only the texts with the label 'action'
df_sample = df[df['label'] == 'action'].sample(frac=0.1, random_state=42)

In [30]:
df_sample['class'] = df_sample['caption_pedestrian'].apply(classify_text)

In [31]:
# Average execution time per text: 4.6 seconds on an A100 GPU
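
Because the classifier is called once per text, one option worth testing is to pass the texts to the pipeline in batches and to keep the score of the top label as well. The sketch below assumes that the `transformers` zero-shot pipeline accepts a list of sequences together with `candidate_labels` and `batch_size`, and that it returns the labels sorted by descending score. The helper name `classify_texts` is hypothetical, and we have not measured whether batching reduces the time per text.

```python
def classify_texts(texts, labels, classifier, batch_size=8):
    """Classify several texts at once and return (top_label, top_score) for each one.

    `classifier` is expected to behave like the zero-shot pipeline created above.
    """
    results = classifier(list(texts), candidate_labels=labels, batch_size=batch_size)
    # A single input may come back as one dict instead of a list of dicts.
    if isinstance(results, dict):
        results = [results]
    # Labels and scores are assumed to be sorted from most to least likely.
    return [(result['labels'][0], result['scores'][0]) for result in results]
```

In the notebook this could be called as `classify_texts(df_sample['caption_pedestrian'], labels, classifier)`, where `labels` is the list defined inside `classify_text`. Keeping the score makes it possible to tell confident predictions apart from near ties between labels.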

## **6. Results**

In [32]:
df_sample.groupby('class').count()

Unnamed: 0_level_0,id,video_name,fps,label,caption_pedestrian,caption_vehicle,start_time,end_time
class,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
caution,49,49,49,49,49,49,49,49
hazard,1,1,1,1,1,1,1,1
risk,266,266,266,266,266,266,266,266
safety,24,24,24,24,24,24,24,24
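
The same result is easier to read as shares of the sample. The sketch below copies the counts from the table above; in the notebook, `df_sample['class'].value_counts(normalize=True)` would give the equivalent proportions directly.

```python
# Counts per predicted class, copied from the table above.
class_counts = {'caution': 49, 'hazard': 1, 'risk': 266, 'safety': 24}

total = sum(class_counts.values())
shares = {label: count / total for label, count in class_counts.items()}

# Print the classes from most to least frequent.
for label, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{label}: {share:.1%}")
```

Of the 340 classified texts, `risk` accounts for roughly 78%, and only 4 of the 12 candidate labels are ever chosen as the top label.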


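## **7. Prompt evaluation**

Having checked for words that indicate a road accident or road conflict, we experimented with prompts for an LLM such as ChatGPT. The prompt below, written in Spanish, tells the model that it will receive the description of an event that occurred on the street, asks it to answer "sí" or "no" on whether the text describes a road accident, and asks it to explain its reasoning in at most two sentences.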

```
a continuación te voy a entregar la descripción de un evento ocurrido en la calle, identifica con un "sí" o "no" si describe un accidente vial, también en máximo dos oraciones tu proceso de pensamiento.
```


### 7.1 ChatGPT

https://chatgpt.com/

![image.png](attachment:7fecc8e6-a4e2-4faf-b7a7-05e82e0237e5.png)

![image.png](attachment:6c1b4ca8-71ae-42f6-89ba-46fc8b606946.png)

### 7.2 Replicate with Llama 3

https://replicate.com/blog/run-llama-3-with-an-api

![image.png](attachment:77759c06-4928-47e9-8826-0e38f9c2b8c1.png)

![image.png](attachment:292afe39-55db-4b4b-84a1-d9f1a80b362a.png)

## **Conclusions**

Given the progress of the previous weeks and the difficulty of getting clustering techniques such as K-means and DBSCAN to converge into distinct clusters, we decided to explore generative Large Language Models (LLMs) to identify whether a text describes a road accident.

The first thing we did in this milestone was to extract the relevant data from the JSON files and generate a CSV file; this process lives in the `scripts/` directory. We then explored the data for keywords that could indicate whether the dataset contains enough information to identify an accident, and defined several English keywords such as accident, caution, and collision, among others. Finally, we tried several prompts and WTS text examples in the chat services of models such as GPT-4 and Llama 3. These tests suggest that classifying a text as a road accident is feasible.


The next step for our project is to use the API of one of these models to obtain these answers programmatically.
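
One way to prepare for that step without committing to a provider is to separate building the prompt and parsing the reply from the API call itself. The sketch below reuses the prompt from section 7 verbatim. The names `build_prompt`, `parse_answer`, and `classify_accident` are hypothetical, and `ask_llm` stands for any function that sends a prompt to the chosen model and returns its reply as text; no real client library is assumed.

```python
from typing import Callable, Optional

# Prompt from section 7, kept verbatim in Spanish.
PROMPT = (
    'a continuación te voy a entregar la descripción de un evento ocurrido en la calle, '
    'identifica con un "sí" o "no" si describe un accidente vial, '
    'también en máximo dos oraciones tu proceso de pensamiento.'
)


def build_prompt(description: str) -> str:
    """Attach an event description to the fixed instruction."""
    return f"{PROMPT}\n\n{description}"


def parse_answer(response: str) -> Optional[bool]:
    """Map a reply to True (accident), False (no accident), or None (unclear)."""
    words = response.strip().lower().split()
    if not words:
        return None
    # Drop quotes and punctuation around the first word, e.g. '"Sí",' becomes 'sí'.
    first_word = words[0].strip('".,:;!')
    if first_word in ("sí", "si", "yes"):
        return True
    if first_word == "no":
        return False
    return None


def classify_accident(description: str, ask_llm: Callable[[str], str]) -> Optional[bool]:
    """Send the prompt through `ask_llm` (a placeholder for the real API call) and parse the reply."""
    return parse_answer(ask_llm(build_prompt(description)))
```

Returning `None` for replies that do not start with a clear "sí" or "no" keeps ambiguous answers visible, so they can be reviewed by hand rather than silently counted as one class.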

## **Repository**

https://github.com/marcelarosalesj/mna_proyecto_integrador_equipo13