<a href="https://colab.research.google.com/github/pedrohfsouza/projeto-PLN/blob/main/PLN_PROJETO_PR%C3%81TICO.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Natural Language Processing [2024-Q2]**
Prof. Alexandre Donizeti Alves

### **PRACTICAL PROJECT** [LangChain + Large Language Models + API]


The **PRACTICAL PROJECT** must be completed in **Google Colab** using an account linked to your Gmail. The link to your notebook (stored on Google Drive), the link to a GitHub repository, and the link to a video showing the project running and detailing the main results of the activity must be submitted through the following form:

> https://forms.gle/D4gLqP1iGgyn2hbH8


**IMPORTANT**: The submission must be made by 11:59 p.m. on **September 8 (Sunday)**, BY ONLY ONE MEMBER OF THE TEAM. Please remember to grant UNRESTRICTED ACCESS to the professor of the NLP course.

### **TEAM**

---

**PLEASE FILL IN THE MEMBERS OF YOUR TEAM:**


**Member 01:**

`Pedro Henrique Faria de Souza RA: 11201921248`

**Member 02:**

`Cristhyan Ferreira Braga de Almeida RA: 11201812016`

**Member 03:**

`André Yuji Sakuma RA: 11201920463`

### **LARGE LANGUAGE MODEL (LLM)**

---

Each team must select a Large Language Model (LLM). Each model may be chosen by at most 5 teams.

Please provide the details of the selected LLM:

**LLM**: gpt-4o-mini

**Link to the official documentation**: https://platform.openai.com/docs/models/gpt-4o-mini



### **API**
---

Please provide the details of the selected API:

**API**: Google Maps

**Official site**: https://www.google.com/maps

**Link to the official documentation**: https://developers.google.com/maps/documentation/javascript/overview?hl=pt-br

**IMPORTANT**: each **API** may be used by at most 4 teams.

### **DESCRIPTION**
---

Implement a `notebook` in `Google Colab` that uses the **`LangChain`** framework (mandatory) and an **LLM**, applying at least TWO NLP techniques. The techniques can be applied to any corpus obtained from an **API** or from a web page.

The selected **LLM** and **API** must be registered in the following spreadsheet:

> https://docs.google.com/spreadsheets/d/1iIUZcwnywO7RuF6VEJ8Rx9NDT1cwteyvsnkhYr0NWtU/edit?usp=sharing

The following NLP techniques may be used:

*   Grammar Correction
*   Text Classification
*   Sentiment Analysis
*   Emotion Detection
*   Keyword Extraction
*   Text Translation
*   Text Summarization
*   Text Similarity
*   Named Entity Recognition
*   Question Answering Systems

**IMPORTANT:** Using your UFABC e-mail is mandatory.


### **EVALUATION CRITERIA**
---


The following points will be considered as evaluation criteria:

* Use of the **`LangChain`** framework.

* Choice and use of an **LLM**.

* Choice and use of an **API**.

* Video (5 to 10 minutes).

* Creativity in using the **`LangChain`** framework together with the **LLM** and the **API**.


**IMPORTANT**: all code in the notebook must be executed. Code that has not been executed will not be considered.

### **IMPLEMENTATION**
---

In [None]:
# please insert your code from here on...




In [None]:
# Step 1: Set up the environment
!pip install gradio googlemaps folium langchain
!pip install -qU langchain-openai



In [None]:
# Step 2: Configure the API keys and the LLM

from langchain_openai import ChatOpenAI
from getpass import getpass

# Read the API keys without echoing them: the first prompt is for the OpenAI key, the second for the Google Maps key
open_ai_api_key = getpass()
google_maps_api_key = getpass()

# Initialize the LangChain chat model
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0, openai_api_key=open_ai_api_key)

··········
··········
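
The two `getpass()` prompts above are unlabeled, so it is easy to paste the keys in the wrong order. The following is a minimal sketch of one alternative, assuming the keys may also be exported as environment variables. The helper `read_secret` and the variable names used with it are illustrative and not part of the original notebook.

```python
import os
from getpass import getpass
from typing import Callable


def read_secret(env_var: str, label: str, prompt: Callable[[str], str] = getpass) -> str:
    """Return the secret from the environment, or ask for it without echoing."""
    value = os.environ.get(env_var, "").strip()
    if not value:
        # Fall back to an interactive prompt that names the key being requested.
        value = prompt(f"{label}: ").strip()
    if not value:
        raise ValueError(f"No value provided for {label}.")
    return value
```

Under these assumptions, a call such as `read_secret("OPENAI_API_KEY", "OpenAI API key")` could replace the first `getpass()` call, and a second call with a Google Maps label could replace the other.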


In [None]:
# Step 3: Define the functions that fetch data

from typing import List, Dict
from langchain.agents import tool
import googlemaps

# Get the coordinates of a city
def get_coordinates(google_maps_api_key, city_name):
    gmaps = googlemaps.Client(key=google_maps_api_key)
    geocode_result = gmaps.geocode(city_name)
    if not geocode_result:
        raise ValueError(f"City name '{city_name}' not found.")
    # Extract the latitude and longitude from the first match
    location = geocode_result[0]['geometry']['location']
    return (location['lat'], location['lng'])

# Find restaurants of a given kind within `radius` meters of the city's coordinates
def get_restaurants_by_kind(google_maps_api_key, city_name, kind, radius=5000):
    lat, lng = get_coordinates(google_maps_api_key, city_name)
    gmaps = googlemaps.Client(key=google_maps_api_key)
    # Search for nearby restaurants using the coordinates and the radius
    places_result = gmaps.places_nearby(location=(lat, lng), radius=radius, type='restaurant', keyword=kind)
    return places_result['results']
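
`get_restaurants_by_kind` reads only the first page of the Nearby Search response. When more results exist, the response carries a `next_page_token` that can be sent back to request the next page. The sketch below shows one way to collect several pages. `collect_pages` and its `fetch_page` argument are hypothetical names; `fetch_page` stands in for a call such as `gmaps.places_nearby(..., page_token=token)`.

```python
import time
from typing import Callable, Dict, List, Optional


def collect_pages(
    fetch_page: Callable[[Optional[str]], Dict],
    max_pages: int = 3,
    delay_seconds: float = 2.0,
) -> List[Dict]:
    """Gather 'results' from successive pages of a Places-style response.

    fetch_page(token) is a hypothetical callable: it receives None for the
    first page and the previous 'next_page_token' for later pages.
    """
    results: List[Dict] = []
    token: Optional[str] = None
    for page in range(max_pages):
        if page > 0:
            # A freshly issued page token needs a short delay before it is valid.
            time.sleep(delay_seconds)
        response = fetch_page(token)
        results.extend(response.get("results", []))
        token = response.get("next_page_token")
        if not token:
            break
    return results
```

Passing the request as a callable keeps the paging logic independent of the client, so it can be exercised with canned responses before any API quota is spent.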

In [None]:
@tool
def find_restaurants_by_kind(city_name: str, kind: str) -> List[Dict]:
    """
    Find restaurants of the specified kind near the specified city.

    Args:
        city_name (str): The city in which to search for restaurants.
        kind (str): The kind of restaurant to search for.

    Returns:
        List[Dict]: A list of dictionaries containing restaurant information.
    """

    # Call get_restaurants_by_kind with the Google Maps key defined in Step 2
    restaurants = get_restaurants_by_kind(google_maps_api_key, city_name, kind)

    # Return a list of dictionaries with the details of each restaurant found
    return [
        {
            "name": restaurant["name"],
            # Some results may lack a 'vicinity' field, so fall back to an empty string
            "address": restaurant.get("vicinity", ""),
            "lat": restaurant["geometry"]["location"]["lat"],
            "lng": restaurant["geometry"]["location"]["lng"]
        }
        for restaurant in restaurants
    ]


In [None]:
# 4.2 Bind the tools to the LLM

tools = [find_restaurants_by_kind]

# Bind the tools to the LLM so that it can request tool calls
llm_with_tools = llm.bind_tools(tools)

In [None]:
# 4.3 Create the prompt

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder


MEMORY_KEY = "chat_history"
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
             """You are a travel assistant specializing in helping users find restaurants of a specified kind and plan activities
            at their travel destinations. You have access to a tool called find_restaurants_by_kind to help you find restaurants by kind.

            When a user asks for a kind of restaurant in a specific city, use the find_restaurants_by_kind tool to fetch and return a list
            of restaurants of that kind in that city.

            If the user asks for an itinerary, generate a detailed itinerary for the given number of days, including visits to restaurants
            of that kind and popular landmarks.

            Only use valid city names and trip durations. If the user provides invalid input, respond with an error message asking
            for the necessary details.

            A valid request should contain the following:
            - A city name
            - A trip duration that is reasonable for exploring restaurants and activities
            - Some other details, like the user's interests and/or specific requirements

            Any request that contains potentially harmful activities is not valid, regardless of what
            other details are provided.

            If the request is not valid, set plan_is_valid = 'no' and use your expertise to update the request to make it valid,
            keeping your revised request shorter than 100 words.

            If the request seems reasonable, then set plan_is_valid = 'yes' and
            don't revise the request.""",

        ),
        MessagesPlaceholder(variable_name=MEMORY_KEY),
        ("user", "{input}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)


# Initialize the chat history

from langchain_core.messages import AIMessage, HumanMessage

chat_history = []
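
`chat_history` starts as an empty list, and the cells in this notebook never append to it, so the agent has no memory of earlier requests unless the caller supplies one. Below is a minimal sketch of how turns could be accumulated, using the same `role`/`content` dictionaries that `generate_itinerary` builds later in the notebook. The helper `append_turn` and its `max_turns` limit are illustrative assumptions.

```python
from typing import Dict, List


def append_turn(
    history: List[Dict[str, str]],
    user_text: str,
    assistant_text: str,
    max_turns: int = 5,
) -> List[Dict[str, str]]:
    """Return a new history with one more exchange, trimmed to max_turns."""
    if max_turns < 1:
        raise ValueError("max_turns must be at least 1.")
    updated = history + [
        {"role": "user", "content": user_text},
        {"role": "assistant", "content": assistant_text},
    ]
    # Each turn is two messages, so keep only the most recent 2 * max_turns.
    return updated[-2 * max_turns:]
```

Returning a new list instead of mutating the argument keeps each request's history independent, which matters once several users share the same interface.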

In [None]:
# 4.4 Create the agent

# Import the helpers that format and parse messages for OpenAI tool calling
from langchain.agents import AgentExecutor
from langchain.agents.format_scratchpad.openai_tools import (
    format_to_openai_tool_messages,
)
from langchain.agents.output_parsers.openai_tools import OpenAIToolsAgentOutputParser

# Configure the agent
agent = (
    {
        "input": lambda x: x["input"],
        # Format the intermediate tool calls and results produced so far
        "agent_scratchpad": lambda x: format_to_openai_tool_messages(
            x["intermediate_steps"]
        ),
        # Retrieve the chat history
        "chat_history": lambda x: x["chat_history"],
    }
    | prompt
    | llm_with_tools
    | OpenAIToolsAgentOutputParser()
)

# Wrap the agent and its tools in an executor; verbose mode prints each step during execution
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
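
Conceptually, the executor alternates between the model and its tools: it calls the model, runs any tool the model requests, records the result in `intermediate_steps`, and repeats until the model produces a final answer. The plain-Python sketch below illustrates that loop with a stand-in model. It is a simplification for illustration, not LangChain's implementation, and `run_agent`, `fake_model`, and the decision dictionaries are hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Step = Tuple[Dict[str, Any], Any]  # (tool request, tool result)


def run_agent(
    model: Callable[[str, List[Step]], Dict[str, Any]],
    tools: Dict[str, Callable[..., Any]],
    user_input: str,
    max_iterations: int = 5,
) -> str:
    """Alternate between the model and its tools until a final answer appears."""
    intermediate_steps: List[Step] = []
    for _ in range(max_iterations):
        decision = model(user_input, intermediate_steps)
        if "final" in decision:
            return decision["final"]
        # The model asked for a tool: run it and record the observation.
        result = tools[decision["tool"]](**decision["args"])
        intermediate_steps.append((decision, result))
    return "Stopped: iteration limit reached."


def fake_model(user_input: str, steps: List[Step]) -> Dict[str, Any]:
    """Stand-in for the LLM: request the tool once, then answer."""
    if not steps:
        return {"tool": "find", "args": {"city_name": "Madrid", "kind": "Brasileira"}}
    names = ", ".join(place["name"] for place in steps[-1][1])
    return {"final": f"Found: {names}"}


fake_tools = {
    "find": lambda city_name, kind: [{"name": "Sabor Brasil"}, {"name": "Tapioquería"}],
}
answer = run_agent(fake_model, fake_tools, "Brazilian restaurants in Madrid")
```

The iteration limit plays the role of a safety net: a model that keeps requesting tools without ever answering cannot loop forever.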

In [None]:
result_itinerary = agent_executor.invoke({"input": "Make an itinerary for 5 days so I can visit most of these restaurants and also some landmarks in the city. Answer in Portuguese", "chat_history": chat_history})



> Entering new AgentExecutor chain...
Por favor, forneça o nome da cidade e o tipo de restaurante que você gostaria de incluir no seu itinerário de 5 dias. Além disso, se você tiver interesses específicos ou requisitos, sinta-se à vontade para mencioná-los!

> Finished chain.


In [None]:
# Step 5: Create the map with Folium
from typing import Dict, List

import folium
from folium.plugins import MarkerCluster

# Build a Folium map with one marker per location and return it as HTML
def create_map(locations: List[Dict], draw_route: bool = False):
    if not locations:
        return "No locations to display on the map."
    # Center the map on the first location
    map_center = [locations[0]['lat'], locations[0]['lng']]
    folium_map = folium.Map(location=map_center, zoom_start=13)
    marker_cluster = MarkerCluster().add_to(folium_map)

    for location in locations:
        folium.Marker(
            location=[location['lat'], location['lng']],
            popup=location['name']
        ).add_to(marker_cluster)

    # Draw a route through the locations, in list order, if requested
    if draw_route and len(locations) > 1:
        route = [(location['lat'], location['lng']) for location in locations]
        folium.PolyLine(route, color="blue", weight=2.5, opacity=1).add_to(folium_map)

    return folium_map._repr_html_()
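
With `draw_route=True`, `create_map` connects the locations in list order, which is the order returned by the API and not necessarily a sensible visiting order. One possible refinement is to reorder the stops greedily by distance before drawing. The sketch below uses the haversine formula and a nearest-neighbor heuristic; both function names are illustrative, and the heuristic does not guarantee the shortest route.

```python
import math
from typing import Dict, List


def haversine_km(a: Dict, b: Dict) -> float:
    """Great-circle distance in kilometers between two {'lat', 'lng'} points."""
    lat1, lng1, lat2, lng2 = map(math.radians, (a["lat"], a["lng"], b["lat"], b["lng"]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def order_by_nearest(locations: List[Dict]) -> List[Dict]:
    """Start at the first location, then always move to the closest unvisited one."""
    if not locations:
        return []
    remaining = list(locations[1:])
    route = [locations[0]]
    while remaining:
        nearest = min(remaining, key=lambda loc: haversine_km(route[-1], loc))
        remaining.remove(nearest)
        route.append(nearest)
    return route
```

Under these assumptions, `create_map(order_by_nearest(locations), draw_route=True)` would draw the reordered route without any change to `create_map` itself.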

In [None]:
# Step 6: Integrate LangChain into the application

# Note: open_ai_api_key is accepted for interface symmetry but is not used here; the LLM was configured in Step 2
def find_restaurants(city_name: str, restaurant_kind: str, google_maps_api_key: str, open_ai_api_key: str):
    # Build the request asking the agent for restaurants in the specified city
    input_text = f"Find {restaurant_kind} restaurants in {city_name}."

    # Invoke the agent executor with the request and an empty chat history
    result = agent_executor.invoke({"input": input_text, "chat_history": []})

    # Extract the agent's answer from the result
    output = result['output']

    # The map data is fetched directly from the API rather than parsed from the agent's answer
    restaurants = get_restaurants_by_kind(google_maps_api_key, city_name, restaurant_kind)

    # Build the list of locations with the details needed for mapping
    locations = [{'name': r['name'], 'lat': r['geometry']['location']['lat'], 'lng': r['geometry']['location']['lng']} for r in restaurants]

    # Create an HTML map from the locations
    map_html = create_map(locations)

    # Return the agent's answer and the generated HTML map
    return output, map_html

def generate_itinerary(city_name: str, duration: int, kind: str, google_maps_api_key: str, open_ai_api_key: str):
    # The slider may deliver a float; the value is used as a slice bound below, so make it an int
    duration = int(duration)

    # Build the request asking the agent for an itinerary of the given length in the city
    input_text = f"Make a {duration}-day itinerary in {city_name}, including {kind} restaurants and points of interest."

    # Seed the chat history with a restaurant question and the agent's answer to it
    chat_history = [
        {"role": "user", "content": f"Which {kind} establishments can I visit in {city_name}?"},
        {"role": "assistant", "content": find_restaurants(city_name, kind, google_maps_api_key, open_ai_api_key)[0]}
    ]

    # Invoke the agent executor with the request and the seeded chat history
    result = agent_executor.invoke({"input": input_text, "chat_history": chat_history})

    # Extract the agent's answer from the result
    output = result['output']

    # The map data is fetched directly from the API rather than parsed from the agent's answer
    restaurants = get_restaurants_by_kind(google_maps_api_key, city_name, kind)

    # Build the list of itinerary locations, limited to one restaurant per day
    locations = [{'name': r['name'], 'lat': r['geometry']['location']['lat'], 'lng': r['geometry']['location']['lng']} for r in restaurants[:duration]]

    # Create an HTML map with the route drawn through the locations
    map_html = create_map(locations, draw_route=True)

    # Return the agent's answer and the generated HTML map
    return output, map_html
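
Any exception raised inside these functions, such as the `ValueError` from `get_coordinates` for an unknown city, propagates to the caller. A minimal sketch of one way to turn such failures into a readable message at the UI boundary is shown below. The decorator `with_fallback` and the stand-in `demo_lookup` are hypothetical and only illustrate the pattern for callbacks that return a `(text, html)` pair.

```python
import functools
from typing import Callable, Tuple


def with_fallback(fn: Callable[..., Tuple[str, str]]) -> Callable[..., Tuple[str, str]]:
    """Wrap a (text, html) callback so that failures become a readable message."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs) -> Tuple[str, str]:
        try:
            return fn(*args, **kwargs)
        except Exception as exc:  # broad on purpose: this is the UI boundary
            return f"Request failed: {exc}", ""
    return wrapper


@with_fallback
def demo_lookup(city_name: str) -> Tuple[str, str]:
    """Stand-in for find_restaurants; rejects an empty city name."""
    if not city_name.strip():
        raise ValueError("City name is required.")
    return f"Results for {city_name}", "<div>map</div>"
```

Applied to `find_restaurants` and `generate_itinerary`, the same decorator would return the error text in the first output and an empty map in the second, so the interface stays responsive when a lookup fails.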

In [None]:
# Step 7: Build the Gradio interface
import gradio as gr

# Define the Gradio components for the first tab
city_input_tab1 = gr.Textbox(label="City", placeholder="Enter the name of the city")
kind_tab1 = gr.Textbox(label="Restaurant kind", placeholder="Enter a kind of restaurant")
restaurants_output = gr.Textbox(label="Restaurants")
restaurants_map = gr.HTML(label="Map")

# Define the Gradio components for the second tab
city_input_tab2 = gr.Textbox(label="City", placeholder="Enter the name of the city")
kind_tab2 = gr.Textbox(label="Restaurant kind", placeholder="Enter a kind of restaurant")
days_input = gr.Slider(label="Number of days", minimum=1, maximum=10, step=1, value=5)
itinerary_output = gr.Textbox(label="Itinerary")
itinerary_map = gr.HTML(label="Map")

# Create one interface per tab; the API keys come from Step 2
tab1 = gr.Interface(
    fn=lambda city_name, kind: find_restaurants(city_name, kind, google_maps_api_key, open_ai_api_key),
    inputs=[city_input_tab1, kind_tab1],
    outputs=[restaurants_output, restaurants_map],
    live=False
)

tab2 = gr.Interface(
    fn=lambda city_name, duration, kind: generate_itinerary(city_name, duration, kind, google_maps_api_key, open_ai_api_key),
    inputs=[city_input_tab2, days_input, kind_tab2],
    outputs=[itinerary_output, itinerary_map]
)

# Combine both interfaces into a tabbed application
iface = gr.TabbedInterface([tab1, tab2], ["Find restaurants", "Generate itinerary"])

iface.launch(debug=True)

Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().
Running on public URL: https://3e2f26bc0dd2eb8e7b.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)



> Entering new AgentExecutor chain...

Invoking: `find_restaurants_by_kind` with `{'city_name': 'São Paulo', 'kind': 'Mineira'}`


[{'name': 'Restaurante O Mineiro SP', 'address': 'R. Matias Aires, 74 - Consolação, São Paulo', 'lat': -23.5557986, 'lng': -46.65710929999999}, {'name': 'O Mineiro Prime', 'address': 'R. Antônio Carlos, 282 - Consolação, São Paulo', 'lat': -23.5567341, 'lng': -46.657581}, {'name': 'Gastronomia Mineira', 'address': 'R. Treze de Maio, 37 - Bela Vista, São Paulo', 'lat': -23.5533504, 'lng': -46.6468108}, {'name': 'Maria Mineira - Restaurante e Pizzaria', 'address': 'Rua Loefgren, 1031 - Vila Clementino, São Paulo', 'lat': -23.5997251, 'lng': -46.6355003}, {'name': 'Coração Mineiro', 'address': 'R. Frei Caneca, 569 - Consolação, São Paulo', 'lat': -23.5540742, 'lng': -46.6524994}, {'name': 'Restaurante Graça Mineira', 'address': 'Rua Machado Bitencourt, 75 - Vila Mariana, São Paulo', 'lat': -23.5979004, 'lng': -46.6381988

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/gradio/queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "/usr/local/lib/python3.10/dist-packages/gradio/route_utils.py", line 321, in call_process_api
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1935, in process_api
    result = await self.call_function(
  File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1520, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "/usr/local/lib/python3.10/dist-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 8



> Entering new AgentExecutor chain...

Invoking: `find_restaurants_by_kind` with `{'city_name': 'Madri', 'kind': 'Brasileira'}`



Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/gradio/queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "/usr/local/lib/python3.10/dist-packages/gradio/route_utils.py", line 321, in call_process_api
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1935, in process_api
    result = await self.call_function(
  File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1520, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "/usr/local/lib/python3.10/dist-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 8



> Entering new AgentExecutor chain...

Invoking: `find_restaurants_by_kind` with `{'city_name': 'Madrid', 'kind': 'Brasileira'}`


[{'name': 'Sabor Brasil', 'address': 'C. de Edgar Neville, 8, Madrid', 'lat': 40.4473899, 'lng': -3.6968504}, {'name': 'Rodizio brasileño', 'address': 'C. de Linneo, 18, Madrid', 'lat': 40.4121805, 'lng': -3.7206708}, {'name': 'Tapioquería', 'address': 'C. de la Fe, 1, Madrid', 'lat': 40.4091149, 'lng': -3.7008115}, {'name': 'Sabor Gaucho', 'address': 'C. de Orense, 83, Madrid', 'lat': 40.4600287, 'lng': -3.6944643}, {'name': 'Restaurante Brasayleña Plaza Mayor', 'address': 'Pl. del Comandante las Morenas, 3, Madrid', 'lat': 40.4162426, 'lng': -3.7088282}, {'name': 'Le Blend - Gastrobar y Cocina Fusión', 'address': 'P.º de la Castellana, 129, Local D, Madrid', 'lat': 40.458346, 'lng': -3.691439099999999}, {'name': 'Comida Brasileña Esfihate Madrid', 'address': 'Disponible parking subterráneo del Hipercor, C. de Castro