# Green Greta App

In this notebook, I carry out the steps to create a recycling application that integrates two different models, on the one hand an image classification model (previously created in image_classification_model.ipynb) and a chatbot using a LLM+RAG.

In [2]:
import gradio as gr

import os
import sys
sys.path.append('../..')

import panel as pn  # GUI
pn.extension()

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file
HUGGINGFACEHUB_API_TOKEN = os.environ['HUGGINGFACEHUB_API_TOKEN']
import pandas as pd

#langchain
from langchain.text_splitter import RecursiveCharacterTextSplitter, CharacterTextSplitter
from langchain.embeddings import HuggingFaceInferenceAPIEmbeddings
from langchain_community.llms import HuggingFaceHub
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA
from langchain.prompts import ChatPromptTemplate
from langchain.schema import StrOutputParser
from langchain.schema.runnable import Runnable
from langchain.schema.runnable.config import RunnableConfig
from pydantic import BaseModel
from langchain.chains import ConversationalRetrievalChain, LLMChain
from langchain.vectorstores import Chroma
from langchain.memory import ConversationBufferMemory
from langchain.prompts.prompt import PromptTemplate
from langchain.prompts.chat import ChatPromptTemplate, SystemMessagePromptTemplate
from langchain.prompts import SystemMessagePromptTemplate, HumanMessagePromptTemplate, ChatPromptTemplate, MessagesPlaceholder
from langchain_community.document_loaders import WebBaseLoader

from datasets import load_dataset
from pydantic.v1 import BaseModel, Field
import shutil
import numpy as np
from typing import List

from huggingface_hub import from_pretrained_keras

import tensorflow as tf
from tensorflow import keras
import torch
from PIL import Image

In [2]:
# Cell 1: Image Classification Model
model1 = from_pretrained_keras("ALVHB95/finalsupermodelofthedestiny")

Fetching 8 files:   0%|          | 0/8 [00:00<?, ?it/s]

In [3]:
# Define class labels
class_labels = ['cardboard', 'compost', 'glass', 'metal', 'paper', 'plastic', 'trash']

# Function to predict image label and score
def predict_image(input):
    # Resize the image to the size expected by the model and convert to numpy array
    image_array = tf.keras.preprocessing.image.img_to_array(input.resize((244, 224)))  # Cambiar el orden de las dimensiones
    # Normalize the image
    image_array = tf.keras.applications.efficientnet.preprocess_input(image_array)
    # Expand the dimensions to create a batch
    image_array = tf.expand_dims(image_array, 0)
    # Predict using the model
    predictions = model1.predict(image_array)
    category_scores = {}
    for i, class_label in enumerate(class_labels):
        category_scores[class_label] = predictions[0][i].item()
    
    return category_scores

Input Image Path: C:\Users\alvar\OneDrive - CEPS\Documents\aplications\FCE.jpg
Prediction: {'compost': 0.566018}


In [4]:
image_gradio_app = gr.Interface(
    fn=predict_image,
    inputs=gr.Image(label="Image", sources=['upload', 'webcam'], type="pil"),
    outputs=[gr.Label(label="Result")],
    title=custom_title,
    theme=theme
)

  inputs=gr.Image(label="Select hot dog candidate", sources=['upload', 'webcam'], type="pil"),


In [4]:
# Cell 2: Chatbot Model
loader = WebBaseLoader(["https://www.epa.gov/recycle/frequent-questions-recycling", "https://www.whitehorsedc.gov.uk/vale-of-white-horse-district-council/recycling-rubbish-and-waste/lets-get-real-about-recycling/", "https://www.teimas.com/blog/13-preguntas-y-respuestas-sobre-la-ley-de-residuos-07-2022", "https://www.molok.com/es/blog/gestion-de-residuos-solidos-urbanos-rsu-10-dudas-comunes", "https://espanol.epa.gov/espanol/el-reciclaje#valelapena","https://espanol.epa.gov/espanol/preguntas-frecuentes-sobre-reciclado-de-plastico-y-elaboracion-de-abono-vegetal","https://espanol.epa.gov/espanol/consejo-del-dia-como-reciclo-mis","https://espanol.epa.gov/espanol/recursos-para-reciclar-dispositivos-electronicos","https://www.epa.gov/recycle/electronics-donation-and-recycling","https://reducereutilizarecicla.org/que-es-el-reciclaje/","https://reducereutilizarecicla.org/contenedores-de-reciclaje/", "https://reducereutilizarecicla.org/contenedores-de-reciclaje/contenedor-amarillo/","https://reducereutilizarecicla.org/contenedores-de-reciclaje/contenedor-azul/","https://reducereutilizarecicla.org/contenedores-de-reciclaje/contenedor-verde/","https://reducereutilizarecicla.org/contenedores-de-reciclaje/contenedor-marron-organico/","https://reducereutilizarecicla.org/contenedores-de-reciclaje/contenedor-gris-restos/","https://reducereutilizarecicla.org/contenedores-de-reciclaje/punto-limpio/", "https://reducereutilizarecicla.org/donde-tirar-auriculares/","https://reducereutilizarecicla.org/donde-tirar-sartenes/","https://reducereutilizarecicla.org/donde-tirar-aceite-usado/","https://reducereutilizarecicla.org/como-se-reciclan-los-envases-tipo-brik/","https://reducereutilizarecicla.org/los-envases-del-verano/", "https://reducereutilizarecicla.org/donde-tirar-radiografias/","https://reducereutilizarecicla.org/envases-ecologicos/","https://reducereutilizarecicla.org/donde-tirar-los-restos-de-pintura/","https://reducereutilizarecicla.org/valorizacion-de-residuos/","https://reducereutilizarecicla.org/como-reciclar-pilas/","https://reducereutilizarecicla.org/como-reciclar-capsulas-de-cafe/","https://reducereutilizarecicla.org/reciclando-cd/", "https://reducereutilizarecicla.org/donde-tirar-neumaticos/","https://reducereutilizarecicla.org/como-reciclar-una-canasta-de-mimbre/","https://reducereutilizarecicla.org/como-funciona-el-contenedor-amarillo/", "https://reducereutilizarecicla.org/donde-se-tiran-los-vapers/","https://reducereutilizarecicla.org/cuanto-tarda-una-bolsa-biodegradable-en-degradarse/", "https://reducereutilizarecicla.org/donde-se-reciclan-los-juguetes/","https://reducereutilizarecicla.org/objetos-que-se-pueden-reutilizar/","https://reducereutilizarecicla.org/la-parafina-se-puede-reutilizar/","https://reducereutilizarecicla.org/planta-de-reciclaje-de-papel/", "https://reducereutilizarecicla.org/como-saber-si-un-envase-es-reciclable/", "https://reducereutilizarecicla.org/reutilizar-vasos-de-vela/", "https://reducereutilizarecicla.org/bolsas-frio-calor/", "https://reducereutilizarecicla.org/reciclar-y-reutilizar-materiales-de-construccion/", "https://reducereutilizarecicla.org/que-es-exactamente-el-pet/", "https://reducereutilizarecicla.org/tipos-de-reciclaje/", "https://reducereutilizarecicla.org/que-hacer-con-palets-reciclados/", "https://reducereutilizarecicla.org/vertederos-controlados/", "https://reducereutilizarecicla.org/donde-tirar-escombros/","https://reducereutilizarecicla.org/como-reciclar-los-residuos-de-ps-poliestireno/","https://reducereutilizarecicla.org/tirar-la-basura-sin-bolsas/","https://reducereutilizarecicla.org/tirar-el-palo-de-la-fregona/","https://reducereutilizarecicla.org/la-mejor-manera-de-reciclar-una-pala-de-padel/", "https://reducereutilizarecicla.org/sabes-donde-tirar-las-llantas-viejas-de-un-coche/","https://reducereutilizarecicla.org/sabes-donde-tirar-el-arbol-de-navidad/","https://reducereutilizarecicla.org/clavos-tornillos-herramientas-donde-tirar-hierro/","https://reducereutilizarecicla.org/donde-tirar-un-secador-de-pelo-contenedor-o-punto-limpio/","https://reducereutilizarecicla.org/donde-tirar-electrodomesticos/","https://reducereutilizarecicla.org/donde-puedo-tirar-ramas-de-arboles/", "https://reducereutilizarecicla.org/donde-tirar-escombros/","https://reducereutilizarecicla.org/donde-se-tira-el-muerdago-quemado/","https://reducereutilizarecicla.org/sandalias-caucho-reciclado-neumaticos/","https://reducereutilizarecicla.org/ideas-para-reciclar-aspas-de-ventilador-de-techo/","https://reducereutilizarecicla.org/reciclar-sacos-dormir/","https://reducereutilizarecicla.org/reciclar-sillas-playa/","https://reducereutilizarecicla.org/donde-tirar-antipolillas/","https://reducereutilizarecicla.org/que-hacer-con-los-juguetes-viejos/","https://reducereutilizarecicla.org/como-utilizar-las-mascarillas-y-el-gel-hidroalcoholico-en-la-playa/","https://reducereutilizarecicla.org/ideas-para-reciclar-un-ventilador-de-pie/","https://reducereutilizarecicla.org/donde-tirar-gasoil/","https://reducereutilizarecicla.org/donde-puedo-tirar-basura-electronica/","https://reducereutilizarecicla.org/donde-tirar-agujas/", "https://reducereutilizarecicla.org/donde-tirar-residuos-peligrosos/", "https://reducereutilizarecicla.org/donde-tirar-los-cables/", "https://reducereutilizarecicla.org/donde-tirar-bicicletas/", "https://reducereutilizarecicla.org/donde-tirar-maletas/", "https://reducereutilizarecicla.org/como-reciclar-una-pantalla/", "https://reducereutilizarecicla.org/metales-reciclables/","https://reducereutilizarecicla.org/donde-tirar-caja-de-helado/", "https://reducereutilizarecicla.org/como-reciclar-perchas-de-plastico/","https://reducereutilizarecicla.org/donde-tirar-un-jarron-de-ceramica/","https://reducereutilizarecicla.org/donde-tirar-sanitarios/", "https://reducereutilizarecicla.org/reciclar-bombonas-de-camping-gas/", "https://reducereutilizarecicla.org/donde-tirar-aceite-usado-de-motor/", "https://reducereutilizarecicla.org/como-reciclar-rotuladores-subrayadores-y-boligrafos/", "https://reducereutilizarecicla.org/donde-tirar-un-ordenador/", "https://reducereutilizarecicla.org/donde-tirar-un-termometro-de-mercurio/", "https://reducereutilizarecicla.org/tirar-nevera-vieja/","https://reducereutilizarecicla.org/que-cosas-pueden-ser-recicladas/","https://reducereutilizarecicla.org/donde-tirar-los-pintaunas/","https://reducereutilizarecicla.org/donde-tirar-bombona-de-helio/", "https://reducereutilizarecicla.org/donde-tirar-alfombras/", "https://reducereutilizarecicla.org/donde-tirar-impresoras-y-sus-cartuchos-o-toner/", "https://reducereutilizarecicla.org/donde-tirar-aguarras/","https://reducereutilizarecicla.org/donde-tirar-discos-duros/","https://reducereutilizarecicla.org/donde-tirar-azulejos/","https://reducereutilizarecicla.org/donde-tirar-diapositivas/","https://reducereutilizarecicla.org/donde-tirar-jeringuillas-usadas/","https://reducereutilizarecicla.org/donde-tirar-cintas-vhs/","https://reducereutilizarecicla.org/donde-tirar-gomaespuma/", "https://reducereutilizarecicla.org/donde-tirar-los-botes-de-pintura/", "https://reducereutilizarecicla.org/donde-se-recicla-la-madera/", "https://reducereutilizarecicla.org/donde-tirar-discos-de-vinilo/", "https://reducereutilizarecicla.org/donde-tirar-imanes/", "https://reducereutilizarecicla.org/donde-tirar-fluorescentes/", "https://reducereutilizarecicla.org/donde-tirar-un-microondas/", "https://reducereutilizarecicla.org/reciclar-toallas/", "https://reducereutilizarecicla.org/reciclar-vaqueros/","https://reducereutilizarecicla.org/como-se-recicla-la-tela/", "https://reducereutilizarecicla.org/contenedor-rojo-ropa/", "https://reducereutilizarecicla.org/reciclar-chanclas/","https://reducereutilizarecicla.org/reciclar-banadores/","https://reducereutilizarecicla.org/asi-funciona-el-reciclaje-de-cremalleras/","https://reducereutilizarecicla.org/donde-tirar-zapatos/","https://reducereutilizarecicla.org/como-reciclar-una-camisa/","https://reducereutilizarecicla.org/donde-tirar-un-mantel-de-tela-sucio/","https://reducereutilizarecicla.org/contenedores-de-ropa/","https://reducereutilizarecicla.org/que-cosas-pueden-ser-recicladas/","https://reducereutilizarecicla.org/los-textiles-se-vuelven-ecologicos/","https://reducereutilizarecicla.org/donde-tirar-ropa-vieja/","https://espanol.epa.gov/espanol/terminos-0-9","https://espanol.epa.gov/espanol/terminos","https://espanol.epa.gov/espanol/terminos-b","https://espanol.epa.gov/espanol/terminos-c","https://espanol.epa.gov/espanol/terminos-d","https://espanol.epa.gov/espanol/terminos-e","https://espanol.epa.gov/espanol/terminos-f","https://espanol.epa.gov/espanol/terminos-g","https://espanol.epa.gov/espanol/terminos-h","https://espanol.epa.gov/espanol/terminos-i","https://espanol.epa.gov/espanol/terminos-j","https://espanol.epa.gov/espanol/terminos-l","https://espanol.epa.gov/espanol/terminos-m","https://espanol.epa.gov/espanol/terminos-n","https://espanol.epa.gov/espanol/terminos-o","https://espanol.epa.gov/espanol/terminos-p","https://espanol.epa.gov/espanol/terminos-q","https://espanol.epa.gov/espanol/terminos-r","https://espanol.epa.gov/espanol/terminos-s","https://espanol.epa.gov/espanol/terminos-t","https://espanol.epa.gov/espanol/terminos-u","https://espanol.epa.gov/espanol/terminos-v","https://espanol.epa.gov/espanol/terminos-w-x-y-z#W", "https://espanol.epa.gov/espanol/la-importancia-de-la-educacion-ambiental","https://espanol.epa.gov/espanol/consejo-del-dia-que-puede-hacer-para-reciclar-por-estacion","https://espanol.epa.gov/espanol/lo-que-puede-hacer-usted-acerca-de-la-contaminacion-por-basura-0","https://espanol.epa.gov/espanol/diez-maneras-de-eliminar-el-embalaje-y-las-envolturas-de-su-vida","https://blog.cerdanyaecoresort.com/sostenibilidad-ambiental-que-es-y-como-mejorarla-en-casa/","https://www.imh.eus/es/imh/comunicacion/docu-libre/reduccion-residuos","https://archivo-es.greenpeace.org/espana/Global/espana/report/contaminacion/Guia%20Transform-accion%20residuos.pdf","https://elpais.com/escaparate/2023-01-10/11-productos-sostenibles-de-uso-cotidiano-para-reducir-el-consumo-de-plastico-y-generar-menos-residuos.html", "https://elpais.com/escaparate/2023-01-10/11-productos-sostenibles-de-uso-cotidiano-para-reducir-el-consumo-de-plastico-y-generar-menos-residuos.html#", "https://www.iberdrola.com/sostenibilidad/productos-ecologicos", "https://www.retema.es/agenda", "https://www.iberdrola.com/sostenibilidad/reciclaje-para-ninos","https://www.miteco.gob.es/es/ceneam/recursos/pag-web/programas-planes/voluntariado-ong-internacionales.html", "https://www.iberdrola.com/compromiso-social/voluntariado-corporativo", "https://reducereutilizarecicla.org/reciclos-ecoembes/","https://reducereutilizarecicla.org/goma-eva-es-reciclable/", "https://reducereutilizarecicla.org/se-puede-generar-energia-en-los-vertederos/", "https://reducereutilizarecicla.org/tirar-la-basura-sin-bolsas/","https://reducereutilizarecicla.org/por-que-es-importante-saber-como-reutilizar-camisetas/","https://reducereutilizarecicla.org/reutilizar-pantalones/","https://reducereutilizarecicla.org/poliester-reciclado/","https://reducereutilizarecicla.org/ropa-con-materiales-reciclados/", "https://reducereutilizarecicla.org/contenedores-caseros-para-reciclar/","https://www.miteco.gob.es/es/calidad-y-evaluacion-ambiental/temas/prevencion-y-gestion-residuos/flujos/domesticos/gestion/sistema-recogida/puntos-limpios.html","https://punto-limpio.info/","https://reducereutilizarecicla.org/ecopuntos-que-son-y-donde-estan/"])

data=loader.load()
# split documents
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=200,add_start_index=True)
splits = text_splitter.split_documents(data)

In [5]:
import hashlib
import json
from langchain_core.documents import Document
import uuid

def stable_hash(doc: Document) -> str:
    """
    Stable hash document based on its metadata.
    """
    # Construct a unique string based on document attributes
    unique_string = f"{doc.metadata.get('title', '')}_{doc.metadata.get('author', '')}_{str(uuid.uuid4())}"
    
    return hashlib.sha1(unique_string.encode()).hexdigest()
    
    return hashlib.sha1(json.dumps(modified_metadata, sort_keys=True).encode()).hexdigest()


split_ids = list(map(stable_hash, splits))

In [6]:
# define embedding
embeddings = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-l6-v2')

In [7]:
# create vector database from data
persist_directory = 'docs/chroma/'

# Remove old database files if any
shutil.rmtree(persist_directory, ignore_errors=True)
vectordb = Chroma.from_documents(
    documents=splits,
    embedding=embeddings,
    ids=split_ids,
    persist_directory=persist_directory
)

In [8]:
# define retriever
retriever = vectordb.as_retriever(search_kwargs={"k": 2}, search_type="mmr")

In [9]:
from langchain.output_parsers import PydanticOutputParser

class FinalAnswer(BaseModel):
    question: str = Field(description="the original question")
    answer: str = Field(description="the extracted answer")

# Assuming you have a parser for the FinalAnswer class
parser = PydanticOutputParser(pydantic_object=FinalAnswer)

In [10]:
template = """
Your name is AngryGreta and you are a recycling chatbot with the objective to anwer questions from user in English or Spanish /
Use the following pieces of context to answer the question /
If the question is English answer in English /
If the question is Spanish answer in Spanish /
Do not mention the word context when you answer a question, use the word database instead /
Answer the question fully and provide as much relevant detail as possible. Do not cut your response short
Context: {context}
User: {question}
{format_instructions}
"""

# Create the chat prompt templates
sys_prompt = SystemMessagePromptTemplate.from_template(template)
qa_prompt = ChatPromptTemplate(
    messages=[
        sys_prompt,
        HumanMessagePromptTemplate.from_template("{question}")],
    partial_variables={"format_instructions": parser.get_format_instructions()}
   
)
llm = HuggingFaceHub(
    repo_id="mistralai/Mixtral-8x7B-Instruct-v0.1",
    task="text-generation",
    model_kwargs={
        "max_new_tokens": 2000,
        "top_k": 50,
        "temperature": 0.1,
        "repetition_penalty": 1.3,
        "early_stopping":"never"
    },
)

In [11]:
qa_chain = ConversationalRetrievalChain.from_llm(
    llm = llm,
    memory = ConversationBufferMemory(llm=llm, memory_key="chat_history", input_key='question', output_key='output'),
    retriever = retriever,
    verbose = True,
    combine_docs_chain_kwargs={'prompt': qa_prompt},
    get_chat_history = lambda h : h,
    rephrase_question = False,
    output_key = 'output',
)

In [12]:
def chat_interface(question):
    result = qa_chain.invoke({'question': question})
    output_string = result['output']

    # Find the index of the last occurrence of "answer": in the string
    answer_index = output_string.rfind('"answer":')

    # Extract the substring starting from the "answer": index
    answer_part = output_string[answer_index + len('"answer":'):].strip()

    # Find the next occurrence of a double quote to get the start of the answer value
    quote_index = answer_part.find('"')

    # Extract the answer value between double quotes
    answer_value = answer_part[quote_index + 1:answer_part.find('"', quote_index + 1)]
    
    return answer_value

In [13]:


def format_data(docs: List[Document]) -> str:
    return "\n\n".join(
        f"Content: {doc.page_content}\nSource: {doc.metadata['source']}" for dat in data
    )

In [14]:
import pandas as pd

response = vectordb.get(include=["metadatas", "documents", "embeddings"])
df = pd.DataFrame(
    {
        "id": response["ids"],
        "source": [metadata.get("source") for metadata in response["metadatas"]],
        "page": [metadata.get("page", -1) for metadata in response["metadatas"]],
        "document": response["documents"],
        "embedding": response["embeddings"],
    }
)
df["contains_answer"] = df["document"].apply(lambda x: "Eichler" in x)
df["contains_answer"].to_numpy().nonzero()

(array([], dtype=int64),)

In [15]:
question='¿Cómo se reciclan las baterías de los aparatos eléctricos?'
answer=chat_interface(question)



[1m> Entering new StuffDocumentsChain chain...[0m


[1m> Entering new LLMChain chain...[0m
Prompt after formatting:
[32;1m[1;3mSystem: 
Your name is AngryGreta and you are a recycling chatbot with the objective to anwer questions from user in English or Spanish /
Use the following pieces of context to answer the question /
If the question is English answer in English /
If the question is Spanish answer in Spanish /
Do not mention the word context when you answer a question, use the word database instead /
Answer the question fully and provide as much relevant detail as possible. Do not cut your response short
Context: se trata de un objeto que funciona gracias a pilas, batería o enchufado a corriente eléctrica, será un AEE. También entran en este grupo los aparatos necesarios para generar, transmitir y medir tales corrientes y campos.Si, además, provienen de hogares o comercios, industrias, instituciones que, por su naturaleza y cantidad, son similares a los procedentes de hoga

In [16]:
question_row = pd.DataFrame(
    {
        "id": "question",
        "question": question,
        "embedding": [embeddings.embed_query(question)],
    }
)
answer_row = pd.DataFrame(
    {
        "id": "answer",
        "answer": answer,
        "embedding": [embeddings.embed_query(answer)],
    }
)
df = pd.concat([question_row, answer_row, df])

In [18]:
question_embedding = [embeddings.embed_query(question)]
df["dist"] = df.apply(
    lambda row: np.linalg.norm(
        np.array(row["embedding"]) - question_embedding
    ),
    axis=1,
)

In [20]:
from renumics import spotlight

spotlight.show(df)

VBox(children=(Label(value='Spotlight running on http://127.0.0.1:57973/'), HBox(children=(Button(description=…

In [24]:
chatbot_gradio_app = gr.ChatInterface(
    fn=chat_interface,
    title=custom_title
)

In [None]:
# Combine both interfaces into a single app
gr.TabbedInterface(
    [image_gradio_app, chatbot_gradio_app],
    tab_names=["image","chatbot"]
).launch(share=True, server_port=8887)