# Natural Language Processing for Query Classification

This notebook uses zero-shot classification with the `facebook/bart-large-mnli` model to assign each user query to one of a fixed set of categories.

In [4]:
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="facebook/bart-large-mnli",
)

Device set to use cpu


In [5]:
labels = ['clima', 'dolar', 'uf', 'noticias', 'saludo']
result = classifier('Que clima hace hoy en Santiago?', candidate_labels=labels)
print(result['labels'][0])
result2 = classifier('Que hay de nuevo en Santiago?', candidate_labels=labels)
print(result2['labels'][0])
result3 = classifier('El dolar subio hoy', candidate_labels=labels)
print(result3['labels'][0])
result4 = classifier('hola que sabes?', candidate_labels=labels)
print(result4)

clima
noticias
dolar
{'sequence': 'hola que sabes?', 'labels': ['clima', 'noticias', 'uf', 'saludo', 'dolar'], 'scores': [0.30700528621673584, 0.23928457498550415, 0.22170601785182953, 0.1908983588218689, 0.04110582917928696]}
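
The dictionary printed for the last query shows how the pipeline reports its result: `labels` is ordered from highest to lowest score, and `scores` holds the matching values. The sketch below reads that structure without calling the model. The dictionary is copied from the output above, and `ranked_labels` is a hypothetical helper name, not part of `transformers`.

```python
# Result copied from the notebook output for 'hola que sabes?'.
result = {
    "sequence": "hola que sabes?",
    "labels": ["clima", "noticias", "uf", "saludo", "dolar"],
    "scores": [0.30700528621673584, 0.23928457498550415, 0.22170601785182953,
               0.1908983588218689, 0.04110582917928696],
}


def ranked_labels(result):
    """Pair each label with its score, preserving the pipeline's ordering."""
    return list(zip(result["labels"], result["scores"]))


ranking = ranked_labels(result)
for label, score in ranking:
    print(f"{label:<10}{score:.3f}")

# Gap between the best and second-best label; a small gap suggests an ambiguous query.
margin = ranking[0][1] - ranking[1][1]
print(f"margin: {margin:.3f}")
```

For this greeting, the expected label `saludo` ranks fourth at about 0.19, and the top label `clima` scores only about 0.31, so taking `labels[0]` at face value would misclassify the query.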


## Query Classifier Function

A reusable function that classifies a user query and falls back to `other` when the top score is below a confidence threshold.

In [10]:
def classify_query(query, threshold=0.4):
    """
    Classifies user query into one of several categories using zero-shot classification.
    
    Args:
        query (str): The user's input query
        threshold (float): Confidence threshold to accept classification
        
    Returns:
        str: The category of the query (clima, dolar, uf, noticias, saludo, other)
    """
    categories = ['clima', 'dolar', 'uf', 'noticias', 'saludo']
    result = classifier(query, candidate_labels=categories)
    
    # Get the highest scoring category and its score
    top_category = result['labels'][0]
    top_score = result['scores'][0]
    
    print(f"Classified '{query}' as {top_category} with confidence {top_score:.2f}")
    
    if top_score >= threshold:
        return top_category
    else:
        return "other"  # Default category if confidence is too low
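
Because `classify_query` calls the global `classifier`, checking the threshold logic means loading the model. One way to exercise the fallback in isolation is to pass the classifier in as an argument and substitute a stub. The sketch below is an assumption about how that could look, not the notebook's own code: `classify_with` and `fake_classifier` are hypothetical names, and the stub returns rounded versions of the scores printed earlier for 'hola que sabes?'.

```python
CATEGORIES = ["clima", "dolar", "uf", "noticias", "saludo"]


def classify_with(classifier_fn, query, threshold=0.4, fallback="other"):
    """Apply the same decision rule as classify_query, with the classifier injected."""
    result = classifier_fn(query, candidate_labels=CATEGORIES)
    top_category, top_score = result["labels"][0], result["scores"][0]
    return top_category if top_score >= threshold else fallback


def fake_classifier(query, candidate_labels):
    """Hypothetical stub that mimics the pipeline's output shape with fixed scores."""
    return {
        "sequence": query,
        "labels": ["clima", "noticias", "uf", "saludo", "dolar"],
        "scores": [0.307, 0.239, 0.222, 0.191, 0.041],
    }


# Top score 0.307 is below the default threshold of 0.4, so the fallback is used.
print(classify_with(fake_classifier, "hola que sabes?"))
# Lowering the threshold to 0.3 accepts the top label instead.
print(classify_with(fake_classifier, "hola que sabes?", threshold=0.3))
```

With the real pipeline, the same function could be called as `classify_with(classifier, query)`, since the pipeline accepts `candidate_labels` as a keyword argument.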

In [11]:
# Test examples
test_queries = [
    "¿Qué temperatura hace en Madrid?",
    "¿Cuál es el valor del dólar hoy?",
    "Dame las últimas noticias",
    "Hola, buenos días",
    "¿Cuánto vale la UF actualmente?",
    "¿Cuál es la capital de Francia?"  # Should be classified as 'other'
]

for query in test_queries:
    category = classify_query(query)
    print(f"Final category: {category}\n")

Classified '¿Qué temperatura hace en Madrid?' as clima with confidence 0.48
Final category: clima

Classified '¿Cuál es el valor del dólar hoy?' as dolar with confidence 0.89
Final category: dolar

Classified 'Dame las últimas noticias' as noticias with confidence 0.87
Final category: noticias

Classified 'Hola, buenos días' as saludo with confidence 0.59
Final category: saludo

Classified '¿Cuánto vale la UF actualmente?' as uf with confidence 0.72
Final category: uf

Classified '¿Cuál es la capital de Francia?' as uf with confidence 0.40
Final category: other
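
The result for '¿Cuál es la capital de Francia?' can look contradictory: the log shows confidence 0.40, the threshold is 0.4, and the comparison is `>=`, yet the final category is `other`. The likely explanation is that the log rounds to two decimals while the comparison uses the unrounded score. The value below is a hypothetical score chosen to illustrate this; the notebook does not print the exact number.

```python
threshold = 0.4
score = 0.3996  # hypothetical unrounded score, just below the threshold

print(f"{score:.2f}")      # formatted with two decimals, as in the log line
print(score >= threshold)  # the comparison uses the unrounded value
```

Printing more decimals in the log (for example `{top_score:.4f}`) would make borderline cases like this one easier to read.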


## Query Router Function

A function that maps each query's category to the type of agent that should handle it.

In [9]:
def route_query(query):
    """
    Routes a user query to the appropriate agent based on classification.
    
    Args:
        query (str): The user's input query
        
    Returns:
        str: The agent type to handle this query
    """
    # classify_query is defined above; in a separate notebook it would be imported
    category = classify_query(query)
    
    # Map categories to agent types
    routing_map = {
        "clima": "weather",
        "dolar": "financial",
        "uf": "financial",
        "noticias": "notice",
        "saludo": "general",
        "other": "general"
    }
    
    return routing_map.get(category, "general")
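
The agent type returned by `route_query` is only a string; something still has to invoke the matching agent. A minimal sketch of that last step follows, assuming each agent is a plain callable. The handler functions and the `dispatch` helper are hypothetical placeholders, since the notebook does not define the agents themselves.

```python
# Hypothetical stand-ins for the real agents, which the notebook does not define.
def weather_agent(query):
    return f"[weather] {query}"


def financial_agent(query):
    return f"[financial] {query}"


def notice_agent(query):
    return f"[notice] {query}"


def general_agent(query):
    return f"[general] {query}"


# Keys match the agent types produced by routing_map in route_query.
AGENTS = {
    "weather": weather_agent,
    "financial": financial_agent,
    "notice": notice_agent,
    "general": general_agent,
}


def dispatch(agent_type, query):
    """Call the handler for agent_type, falling back to the general agent."""
    handler = AGENTS.get(agent_type, general_agent)
    return handler(query)


print(dispatch("financial", "El dolar subio hoy"))
print(dispatch("unknown", "hola"))
```

In the notebook's terms, the call would be `dispatch(route_query(query), query)`. Falling back to the general agent for unknown types mirrors the default used by `routing_map.get(category, "general")`.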