# Bimonthly Project: Information Retrieval System Based on Reuters-21578

**Prof. Iván Carrera**

**May 27, 2024**

## 1. Introduction

The goal of this project is to design, build, program, and deploy an Information Retrieval System (IRS) using the Reuters-21578 corpus. The project is divided into several phases, described below.

## 2. Project Phases

### 2.1. Data Acquisition

**Objective:** Obtain and prepare the Reuters-21578 corpus.

**Tasks:**

- Download the Reuters-21578 corpus.
- Decompress and organize the files.
- Document the data acquisition process.
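
The decompress-and-organize step could be sketched as follows. This is a minimal sketch, not the course's prescribed procedure: the helper name `extraer_corpus` and the archive name in the usage note are hypothetical, and the only layout taken from this notebook is the `reuters/training/` folder and the `reuters/stopwords` file read in later cells.

```python
import zipfile
from collections import Counter
from pathlib import Path

def extraer_corpus(ruta_zip, destino):
    """Extract a zip archive into `destino` and count files per top-level folder.

    Hypothetical helper: the returned counts are a quick sanity check
    that can be pasted into the acquisition notes.
    """
    destino = Path(destino)
    destino.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(ruta_zip) as zf:
        zf.extractall(destino)

    conteo = Counter()
    for ruta in destino.rglob('*'):
        if ruta.is_file():
            relativa = ruta.relative_to(destino)
            # Files sitting directly in the root are grouped under '.'
            grupo = relativa.parts[0] if len(relativa.parts) > 1 else '.'
            conteo[grupo] += 1
    return dict(conteo)
```

A call such as `extraer_corpus('reuters.zip', 'reuters')` (archive name assumed) would return a dictionary of file counts per folder, which can be compared against the document count printed when the training files are loaded.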

### 2.2. Preprocessing

**Objective:** Clean and prepare the data for analysis.





In [1]:
import os
import re
import pandas as pd
from nltk.stem import SnowballStemmer

**Tasks:**

- Extract the relevant content from the documents.


In [2]:
# Directory containing the training files
directorio = 'reuters/training/'

# Read the contents of each file into a list
documentos = []
for archivo in os.listdir(directorio):
    ruta_archivo = os.path.join(directorio, archivo)
    with open(ruta_archivo, 'r', encoding='latin-1') as f:
        documentos.append(f.read())

print(f'Loaded {len(documentos)} documents.')


Loaded 7769 documents.


- Clean the data: remove unwanted characters, normalize the text, etc.


In [3]:
def limpiar_texto(texto):
    # Remove unwanted characters (keep only letters and whitespace)
    texto_limpio = re.sub(r'[^a-zA-Z\s]', '', texto)
    # Normalize to lowercase
    texto_limpio = texto_limpio.lower()
    # Collapse runs of whitespace into a single space
    texto_limpio = re.sub(r'\s+', ' ', texto_limpio).strip()
    return texto_limpio


In [4]:
documentos_limpios = [limpiar_texto(doc) for doc in documentos]
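
One side effect of deleting non-letters outright is that words joined by punctuation get fused. Tokens such as `junejuli` and `augsept` in the processed output further down suggest this happens in the corpus. The sketch below contrasts the notebook's logic with a hypothetical variant, `limpiar_texto_con_espacios`, that replaces each run of non-letters with a space. The variant and the sample sentence are illustrative assumptions, not part of the original pipeline.

```python
import re

def limpiar_texto(texto):
    # Same logic as the notebook cell, condensed: non-letters are deleted
    texto_limpio = re.sub(r'[^a-zA-Z\s]', '', texto).lower()
    return re.sub(r'\s+', ' ', texto_limpio).strip()

def limpiar_texto_con_espacios(texto):
    # Hypothetical variant: each run of non-letters becomes a single space
    return re.sub(r'[^a-zA-Z]+', ' ', texto).lower().strip()

muestra = 'June/July shipments at 2.5 mln dlrs'
print(limpiar_texto(muestra))               # junejuly shipments at mln dlrs
print(limpiar_texto_con_espacios(muestra))  # june july shipments at mln dlrs
```

Whether the variant is preferable depends on the vocabulary wanted downstream: it keeps `june` and `july` as separate terms, but it also splits contractions and hyphenated names.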

- Tokenization: split the text into words or tokens.


In [5]:
# Split into words
def separar(doc):
    palabras = doc.split()
    return palabras

In [6]:
documentos_tokenizados_split = [separar(doc) for doc in documentos_limpios]

In [7]:
documentos_tokenizados_split[0]

['bahia',
 'cocoa',
 'review',
 'showers',
 'continued',
 'throughout',
 'the',
 'week',
 'in',
 'the',
 'bahia',
 'cocoa',
 'zone',
 'alleviating',
 'the',
 'drought',
 'since',
 'early',
 'january',
 'and',
 'improving',
 'prospects',
 'for',
 'the',
 'coming',
 'temporao',
 'although',
 'normal',
 'humidity',
 'levels',
 'have',
 'not',
 'been',
 'restored',
 'comissaria',
 'smith',
 'said',
 'in',
 'its',
 'weekly',
 'review',
 'the',
 'dry',
 'period',
 'means',
 'the',
 'temporao',
 'will',
 'be',
 'late',
 'this',
 'year',
 'arrivals',
 'for',
 'the',
 'week',
 'ended',
 'february',
 'were',
 'bags',
 'of',
 'kilos',
 'making',
 'a',
 'cumulative',
 'total',
 'for',
 'the',
 'season',
 'of',
 'mln',
 'against',
 'at',
 'the',
 'same',
 'stage',
 'last',
 'year',
 'again',
 'it',
 'seems',
 'that',
 'cocoa',
 'delivered',
 'earlier',
 'on',
 'consignment',
 'was',
 'included',
 'in',
 'the',
 'arrivals',
 'figures',
 'comissaria',
 'smith',
 'said',
 'there',
 'is',
 'still',
 's

- Remove stop words and apply stemming or lemmatization.
- Document each preprocessing step.

In [8]:
# Load the stop words from the file
ruta_stop_words = 'reuters/stopwords'
with open(ruta_stop_words, 'r', encoding='latin-1') as f:
    stop_words = set(f.read().split())

In [9]:
# Use the Snowball stemmer (another stemmer can be swapped in if preferred)
stemmer = SnowballStemmer('english')

def procesar_tokens(tokens):
    # Remove stop words
    tokens_filtrados = [token for token in tokens if token not in stop_words]
    # Apply stemming
    tokens_stemmizados = [stemmer.stem(token) for token in tokens_filtrados]
    return tokens_stemmizados

In [10]:
documentos_procesados = [procesar_tokens(doc) for doc in documentos_tokenizados_split]

In [11]:
documentos_procesados

[['bahia',
  'cocoa',
  'review',
  'shower',
  'continu',
  'week',
  'bahia',
  'cocoa',
  'zone',
  'allevi',
  'drought',
  'earli',
  'januari',
  'improv',
  'prospect',
  'come',
  'temporao',
  'normal',
  'humid',
  'level',
  'restor',
  'comissaria',
  'smith',
  'week',
  'review',
  'dri',
  'period',
  'mean',
  'temporao',
  'late',
  'year',
  'arriv',
  'week',
  'end',
  'februari',
  'bag',
  'kilo',
  'make',
  'cumul',
  'total',
  'season',
  'mln',
  'stage',
  'year',
  'cocoa',
  'deliv',
  'earlier',
  'consign',
  'includ',
  'arriv',
  'figur',
  'comissaria',
  'smith',
  'doubt',
  'crop',
  'cocoa',
  'harvest',
  'practic',
  'end',
  'total',
  'bahia',
  'crop',
  'estim',
  'mln',
  'bag',
  'sale',
  'stand',
  'mln',
  'hundr',
  'thousand',
  'bag',
  'hand',
  'farmer',
  'middlemen',
  'export',
  'processor',
  'doubt',
  'cocoa',
  'fit',
  'export',
  'shipper',
  'experienc',
  'dificulti',
  'obtain',
  'bahia',
  'superior',
  'certif',
  '

### 2.3. Vector Space Representation of the Data

**Objective:** Convert the texts into a form that algorithms can process.

**Tasks:**

- Use techniques such as Bag of Words (BoW) and TF-IDF to vectorize the text.




In [12]:
# Join the processed tokens back into a single string per document
documentos_procesados_texto = [' '.join(doc) for doc in documentos_procesados]


In [13]:
documentos_procesados_texto

['bahia cocoa review shower continu week bahia cocoa zone allevi drought earli januari improv prospect come temporao normal humid level restor comissaria smith week review dri period mean temporao late year arriv week end februari bag kilo make cumul total season mln stage year cocoa deliv earlier consign includ arriv figur comissaria smith doubt crop cocoa harvest practic end total bahia crop estim mln bag sale stand mln hundr thousand bag hand farmer middlemen export processor doubt cocoa fit export shipper experienc dificulti obtain bahia superior certif view lower qualiti recent week farmer sold good part cocoa held consign comissaria smith spot bean price rose cruzado arroba kilo bean shipper reluct offer nearbi shipment limit sale book march shipment dlrs tonn port name crop sale light open port junejuli dlrs dlrs york juli augsept dlrs tonn fob routin sale butter made marchapril sold dlrs aprilmay butter time york junejuli dlrs augsept dlrs time york sept octdec dlrs time york d

In [14]:
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

vectorizer_bow = CountVectorizer()
vectorizer_tfidf = TfidfVectorizer()

In [15]:
X_bow = vectorizer_bow.fit_transform(documentos_procesados_texto)

print(f'BoW shape: {X_bow.shape}')


BoW shape: (7769, 21411)


In [16]:
# Dense copy for inspection only: toarray() materializes all 7769 x 21411 cells
df_bow = pd.DataFrame(X_bow.toarray(), columns=vectorizer_bow.get_feature_names_out())

In [17]:
df_bow

Unnamed: 0,aa,aaa,aachen,aaminus,aancor,aap,aaplus,aar,aarnoud,aaron,...,zorinski,zseven,zuccherifici,zuckerman,zulia,zurich,zurichbas,zuyuan,zverev,zzzz
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
7764,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
7765,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
7766,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
7767,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [18]:
# TF-IDF

X_tfidf = vectorizer_tfidf.fit_transform(documentos_procesados_texto)
print(f'TF-IDF shape: {X_tfidf.shape}')

TF-IDF shape: (7769, 21411)
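
To make the weights concrete, the sketch below computes TF-IDF by hand on three toy documents. It assumes the weighting that, to my understanding, corresponds to scikit-learn's documented defaults for `TfidfVectorizer`: raw term counts, a smoothed idf of `ln((1 + n) / (1 + df)) + 1`, and L2 normalization per document. The function name `tfidf_manual` and the toy documents are hypothetical.

```python
import math
from collections import Counter

def tfidf_manual(documentos):
    """documentos: list of token lists. Returns (idf, vectors).

    Each vector is a dict term -> weight with unit L2 norm.
    """
    n = len(documentos)
    # Document frequency: number of documents containing each term
    df = Counter()
    for tokens in documentos:
        df.update(set(tokens))
    # Smoothed idf (assumed to match scikit-learn's default settings)
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in df}

    vectores = []
    for tokens in documentos:
        tf = Counter(tokens)
        pesos = {t: tf[t] * idf[t] for t in tf}
        norma = math.sqrt(sum(w * w for w in pesos.values()))
        vectores.append({t: w / norma for t, w in pesos.items()} if norma else {})
    return idf, vectores

docs = [['cocoa', 'bahia', 'cocoa'], ['cocoa', 'export'], ['wheat', 'export']]
idf, vectores = tfidf_manual(docs)
print(idf['bahia'] > idf['cocoa'])  # True: 'bahia' occurs in fewer documents
```

The rarer term receives the larger idf, while the repeated term still dominates the first document's vector because its count is doubled before normalization.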


In [19]:
# Convert to a DataFrame for inspection (toarray() builds a dense copy)
df_tfidf = pd.DataFrame(X_tfidf.toarray(), columns=vectorizer_tfidf.get_feature_names_out())
df_tfidf


Unnamed: 0,aa,aaa,aachen,aaminus,aancor,aap,aaplus,aar,aarnoud,aaron,...,zorinski,zseven,zuccherifici,zuckerman,zulia,zurich,zurichbas,zuyuan,zverev,zzzz
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
7764,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7765,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7766,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7767,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


- Evaluate the different vectorization techniques.
- Document the methods and results obtained.

In [20]:
%pip install matplotlib


Note: you may need to restart the kernel to use updated packages.


In [21]:
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
import matplotlib.pyplot as plt

# Number of clusters for K-means (adjust as needed)
num_clusters = 5

# Clustering with BoW
kmeans_bow = KMeans(n_clusters=num_clusters, random_state=42)
kmeans_bow.fit(X_bow)
labels_bow = kmeans_bow.labels_
silhouette_avg_bow = silhouette_score(X_bow, labels_bow)
print(f'Average silhouette for BoW: {silhouette_avg_bow}')

Average silhouette for BoW: -0.009643857906619473
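
A silhouette close to zero, or slightly negative as here, generally indicates clusters that overlap rather than separate cleanly. The toy sketch below shows how the score is built from two quantities per point: `a`, the mean distance to the rest of its own cluster, and `b`, the smallest mean distance to any other cluster, combined as `(b - a) / max(a, b)`. It works on 1-D points with absolute difference as the distance, which is a simplifying assumption. The function name and the data are hypothetical.

```python
def silueta_promedio(puntos, etiquetas):
    """Mean silhouette of 1-D points; needs at least two distinct labels."""
    clusters = {}
    for p, e in zip(puntos, etiquetas):
        clusters.setdefault(e, []).append(p)

    valores = []
    for p, e in zip(puntos, etiquetas):
        propios = clusters[e]
        if len(propios) == 1:
            valores.append(0.0)  # usual convention for a singleton cluster
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(abs(p - q) for q in propios) / (len(propios) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(abs(p - q) for q in otros) / len(otros)
            for otra, otros in clusters.items() if otra != e
        )
        mayor = max(a, b)
        valores.append((b - a) / mayor if mayor > 0 else 0.0)
    return sum(valores) / len(valores)

puntos = [0.0, 1.0, 10.0, 11.0]
print(silueta_promedio(puntos, [0, 0, 1, 1]))  # about 0.90: well separated
print(silueta_promedio(puntos, [0, 1, 0, 1]))  # about -0.45: points in the wrong cluster
```

Repeating the K-means cell with `X_tfidf` in place of `X_bow` would give a second score to set against the BoW value, which is one way to approach the comparison the task list asks for.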


### 2.4. Indexing

**Objective:** Create an index that enables efficient searches.

**Tasks:**

- Build an inverted index that maps terms to documents.
- Implement and optimize data structures for the index.
- Document the index construction process.



In [47]:
def construir_indice_invertido(documentos):
    indice_invertido = {}  # Plain dictionary: term -> set of document IDs
    for doc_id, doc in enumerate(documentos):
        for palabra in doc:
            if palabra not in indice_invertido:
                indice_invertido[palabra] = set()  # Start an empty set for a new term
            indice_invertido[palabra].add(doc_id)  # Record that the term occurs in doc_id
    return indice_invertido

indice_invertido = construir_indice_invertido(documentos_procesados)

# Show some terms and their document sets
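
One possible refinement of the index structure, offered as a sketch rather than as the required design, is to store the term frequency alongside each document ID. The set-based index above records only presence. Keeping counts would allow a ranking function to score documents straight from the postings. The function name `construir_indice_con_frecuencias` and the toy documents are hypothetical.

```python
from collections import Counter, defaultdict

def construir_indice_con_frecuencias(documentos):
    """Map each term to {doc_id: number of occurrences in that document}."""
    indice = defaultdict(dict)
    for doc_id, tokens in enumerate(documentos):
        # Counter collapses repeated tokens, so each posting is written once
        for termino, frecuencia in Counter(tokens).items():
            indice[termino][doc_id] = frecuencia
    return dict(indice)

docs = [['cocoa', 'bahia', 'cocoa'], ['cocoa', 'export']]
print(construir_indice_con_frecuencias(docs))
# {'cocoa': {0: 2, 1: 1}, 'bahia': {0: 1}, 'export': {1: 1}}
```

The set of documents containing a term is still available as the keys of its posting dictionary, so the boolean lookup used by the search function keeps working with `indice[termino].keys()`.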


In [41]:
indice_invertido

{'bahia': {0, 941, 1240, 2047},
 'cocoa': {0,
  9,
  82,
  262,
  288,
  300,
  310,
  318,
  319,
  364,
  366,
  389,
  394,
  474,
  489,
  780,
  862,
  941,
  944,
  1166,
  1190,
  1518,
  1528,
  1723,
  1755,
  2016,
  2047,
  2237,
  2573,
  2947,
  3088,
  3389,
  3408,
  3457,
  4013,
  4226,
  4286,
  4643,
  4661,
  4708,
  4805,
  4881,
  4959,
  5132,
  5281,
  5444,
  5448,
  5505,
  5900,
  6036,
  6078,
  6691,
  7027,
  7093,
  7107,
  7432,
  7489,
  7702,
  7736},
 'review': {0,
  25,
  115,
  271,
  332,
  430,
  502,
  505,
  506,
  593,
  627,
  686,
  757,
  770,
  780,
  813,
  819,
  826,
  830,
  854,
  892,
  957,
  1105,
  1218,
  1278,
  1308,
  1362,
  1576,
  1728,
  1778,
  1887,
  1890,
  1999,
  2018,
  2047,
  2141,
  2177,
  2199,
  2299,
  2401,
  2493,
  2527,
  2549,
  2683,
  2704,
  2759,
  2795,
  2847,
  2848,
  2852,
  2862,
  2912,
  2920,
  2935,
  2939,
  3099,
  3108,
  3112,
  3120,
  3214,
  3263,
  3337,
  3368,
  3370,
  3446,
  347

In [42]:
def buscar(consulta, indice_invertido):
    # Apply the same cleaning, tokenization, and stemming used for the documents
    consulta_procesada = procesar_tokens(separar(limpiar_texto(consulta)))
    # Set of IDs of the candidate documents
    documentos_relevantes = set()
    # Iterate over each term of the query
    for palabra in consulta_procesada:
        # Look the term up in the inverted index
        if palabra in indice_invertido:
            # Add the IDs of the documents that contain the term
            documentos_relevantes.update(indice_invertido[palabra])
    # TODO: compute the similarity matrix
    return documentos_relevantes
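
Because `buscar` takes the union of the postings, a document matches if it contains any one query term, so the candidate set can grow quickly for multi-word queries. A stricter alternative is conjunctive (AND) retrieval, sketched below under the assumption that the query terms have already been cleaned and stemmed. The function name `buscar_todas` and the toy index are hypothetical.

```python
def buscar_todas(terminos, indice_invertido):
    """Return the IDs of documents that contain every term (boolean AND)."""
    postings = [indice_invertido.get(t, set()) for t in terminos]
    if not postings:
        return set()
    # Start from the shortest posting set so intermediate results stay small
    postings.sort(key=len)
    resultado = set(postings[0])  # copy, so the index itself is not modified
    for p in postings[1:]:
        resultado &= p
    return resultado

indice = {'cocoa': {0, 9, 82}, 'bahia': {0, 941}, 'export': {9, 82, 941}}
print(buscar_todas(['cocoa', 'bahia'], indice))   # {0}
print(buscar_todas(['cocoa', 'wheat'], indice))   # set()
```

A common compromise is to try the AND query first and fall back to the union when it returns too few documents.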

In [43]:
consulta_procesada = procesar_tokens(separar(limpiar_texto('AUSTRALIAN UNIONS LAUNCH NEW SOUTH WALES STRIKES')))
consulta_vector = vectorizer_bow.transform([' '.join(consulta_procesada)])
df_consulta = pd.DataFrame(consulta_vector.toarray(), columns=vectorizer_bow.get_feature_names_out())
df_consulta

Unnamed: 0,aa,aaa,aachen,aaminus,aancor,aap,aaplus,aar,aarnoud,aaron,...,zorinski,zseven,zuccherifici,zuckerman,zulia,zurich,zurichbas,zuyuan,zverev,zzzz
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [44]:
documentos_encontrados=buscar('AUSTRALIAN UNIONS LAUNCH NEW SOUTH WALES STRIKES',indice_invertido)
bow_2 = X_bow[list(documentos_encontrados)]
df_bow_2 = pd.DataFrame(bow_2.toarray(), columns=vectorizer_bow.get_feature_names_out())
df_bow_2

Unnamed: 0,aa,aaa,aachen,aaminus,aancor,aap,aaplus,aar,aarnoud,aaron,...,zorinski,zseven,zuccherifici,zuckerman,zulia,zurich,zurichbas,zuyuan,zverev,zzzz
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
660,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
661,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
662,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
663,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [45]:
documentos_encontrados

{23,
 37,
 49,
 70,
 71,
 72,
 83,
 107,
 110,
 119,
 131,
 146,
 160,
 167,
 169,
 172,
 179,
 181,
 190,
 196,
 206,
 214,
 219,
 231,
 234,
 237,
 238,
 240,
 264,
 292,
 323,
 325,
 332,
 357,
 363,
 370,
 372,
 411,
 412,
 428,
 438,
 440,
 441,
 449,
 452,
 469,
 475,
 488,
 495,
 497,
 498,
 586,
 602,
 611,
 630,
 647,
 650,
 675,
 679,
 681,
 717,
 750,
 770,
 775,
 779,
 780,
 790,
 791,
 794,
 826,
 849,
 854,
 869,
 888,
 902,
 907,
 937,
 960,
 962,
 968,
 973,
 974,
 984,
 1002,
 1007,
 1070,
 1098,
 1131,
 1140,
 1157,
 1168,
 1184,
 1188,
 1205,
 1221,
 1222,
 1224,
 1300,
 1305,
 1314,
 1336,
 1362,
 1365,
 1374,
 1384,
 1392,
 1405,
 1412,
 1416,
 1444,
 1469,
 1473,
 1495,
 1498,
 1513,
 1536,
 1554,
 1573,
 1578,
 1591,
 1592,
 1632,
 1651,
 1656,
 1658,
 1701,
 1703,
 1709,
 1717,
 1721,
 1725,
 1727,
 1732,
 1737,
 1738,
 1762,
 1764,
 1825,
 1827,
 1842,
 1844,
 1855,
 1872,
 1886,
 1887,
 1909,
 1927,
 1995,
 2002,
 2011,
 2018,
 2023,
 2025,
 2027,
 2028,
 2054

In [46]:
from sklearn.metrics.pairwise import cosine_similarity

# Compute cosine similarity between the query and each retrieved document
similitud_coseno = cosine_similarity(consulta_vector, bow_2).flatten()
# Row i of bow_2 corresponds to the i-th ID yielded by iterating documentos_encontrados
similitud_coseno_id = [(doc_id, similitud_coseno[i]) for i, doc_id in enumerate(documentos_encontrados)]
similitud_coseno_id.sort(key=lambda x: x[1], reverse=True)
# Show the results sorted by similarity
print(f"Documents sorted by similarity: {similitud_coseno_id}")

Documents sorted by similarity: [(2343, 0.4319342127906801), (4879, 0.31722063428725766), (5820, 0.29704426289300234), (5866, 0.29704426289300234), (4876, 0.2791452631195413), (4319, 0.2672612419124244), (4378, 0.2672612419124244), (4392, 0.26261286571944514), (6835, 0.25964539344474935), (1725, 0.25269934785963455), (3627, 0.25000000000000006), (2963, 0.23717082451262847), (5937, 0.23570226039551587), (5804, 0.2326918908652601), (5049, 0.22965760608731273), (7753, 0.22732056912722476), (7444, 0.2245365597551247), (4390, 0.22140372138502384), (3266, 0.22011272658140596), (2207, 0.21516574145596765), (5798, 0.21408720964441885), (3274, 0.21241399516204162), (3234, 0.21182963643408087), (1591, 0.20768001929732066), (5688, 0.20739033894608508), (4849, 0.20655911179772893), (2157, 0.2062394778460764), (7342, 0.20573779994945587), (4652, 0.20341905108624314), (6749, 0.20272121351984582), (1188, 0.19933664825552866), (370, 0.19867985355975662), (4023, 0.19867985355975662), (2824, 0.19810

### 2.5. Search Engine Design

**Objective:** Implement the search functionality.

**Tasks:**

- Develop the logic to process user queries.
- Implement similarity algorithms such as cosine or Jaccard similarity.
- Develop a ranking algorithm to order the results.
- Document the architecture and the algorithms used.
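
Cosine similarity is already used in the indexing cells above. Jaccard similarity, the other measure named in the tasks, could be sketched as follows. It compares the sets of distinct tokens and ignores how often each term occurs: `|A ∩ B| / |A ∪ B|`. The function names, the tie-breaking rule, and the toy documents are assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two token collections, as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rankear(consulta_tokens, documentos, k=10):
    """Return up to k (doc_id, score) pairs, highest score first.

    Documents with a score of zero are dropped; ties are broken by doc_id.
    """
    puntajes = [(doc_id, jaccard(consulta_tokens, tokens))
                for doc_id, tokens in enumerate(documentos)]
    puntajes = [(d, s) for d, s in puntajes if s > 0]
    puntajes.sort(key=lambda par: (-par[1], par[0]))
    return puntajes[:k]

docs = [['cocoa', 'bahia'], ['wheat', 'export'], ['cocoa', 'export', 'price']]
for doc_id, score in rankear(['cocoa', 'export'], docs):
    print(doc_id, round(score, 3))
# 2 0.667
# 0 0.333
# 1 0.333
```

In practice the scoring loop would run only over the candidates returned by the inverted index rather than over the whole collection.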

### 2.6. System Evaluation

**Objective:** Measure the effectiveness of the system.

**Tasks:**

- Define a set of evaluation metrics (precision, recall, F1-score).
- Run tests using the test set of the corpus.
- Compare the performance of different system configurations.
- Document the results and analysis.
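
For a single query, the three metrics can be computed from two sets of document IDs: those the system retrieved and those judged relevant. The sketch below assumes such relevance judgments are available. Where they come from (for example, the topic labels of the corpus) is a design decision this document leaves open. The function name and the sample IDs are hypothetical.

```python
def precision_recall_f1(recuperados, relevantes):
    """Return (precision, recall, F1) for one query."""
    recuperados, relevantes = set(recuperados), set(relevantes)
    aciertos = len(recuperados & relevantes)
    # Precision: share of retrieved documents that are relevant
    precision = aciertos / len(recuperados) if recuperados else 0.0
    # Recall: share of relevant documents that were retrieved
    recall = aciertos / len(relevantes) if relevantes else 0.0
    # F1: harmonic mean of precision and recall
    suma = precision + recall
    f1 = 2 * precision * recall / suma if suma else 0.0
    return precision, recall, f1

p, r, f1 = precision_recall_f1({1, 2, 3, 4}, {2, 4, 5})
print(round(p, 3), round(r, 3), round(f1, 3))  # 0.5 0.667 0.571
```

Averaging these values over a set of test queries gives one figure per system configuration, which makes the comparison in the third task straightforward.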

### 2.7. Web User Interface

**Objective:** Create an interface for interacting with the system.

**Tasks:**

- Design a web interface where users can enter queries.
- Display the search results clearly and in order.
- Implement additional features such as filters and display options.
- Document the design and features of the interface.
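
On the server side, the interface needs some way to turn a submitted query into ranked results that the front end can render. The sketch below shows one framework-agnostic option using only the standard library. The parameter name `q`, the JSON field names, and the function `responder_consulta` are all assumptions. `buscar_fn` stands for whatever search function the project ends up exposing, taken here to return a list of `(doc_id, score)` pairs.

```python
import json
from urllib.parse import parse_qs

def responder_consulta(query_string, buscar_fn, limite=10):
    """Build an (HTTP status, JSON body) pair for a URL query string."""
    parametros = parse_qs(query_string)
    consulta = parametros.get('q', [''])[0].strip()
    if not consulta:
        return 400, json.dumps({'error': 'missing q parameter'})
    # Keep only the top results so the page stays readable
    resultados = buscar_fn(consulta)[:limite]
    cuerpo = {
        'query': consulta,
        'results': [{'doc_id': d, 'score': s} for d, s in resultados],
    }
    return 200, json.dumps(cuerpo)

def buscar_de_prueba(consulta):
    # Stand-in for the real search function, with made-up scores
    return [(3, 0.9), (1, 0.5), (7, 0.1)]

print(responder_consulta('q=cocoa+export', buscar_de_prueba, limite=2))
```

Keeping this logic in a plain function makes it easy to test without starting a server, and it can later be wrapped by whichever web framework or JavaScript front end the team chooses.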

## 3. Final Deliverables

- **Complete Documentation:** Covering the processes, decisions made, and results of each phase.
- **Source Code:** Organized and well commented.
- **Evaluation Report:** Detailed analysis of the system evaluation.
- **System Demonstration:** Working presentation of the system through the web interface.

## 4. Technical Requirements

- **Programming Languages:** Python (preprocessing and modeling), JavaScript (for the web interface).

## 5. Project Grading

- **Functionality:** (35%) Effectiveness and efficiency of information retrieval.
- **Documentation:** (35%) Clarity of the documentation for each phase.
- **Innovation and Creativity:** (15%) In the implementation of techniques and the user interface.
- **Final Presentation:** (15%) Quality and clarity of the system demonstration.