<a href="https://colab.research.google.com/github/huckles-learning-lab/rasa-demo/blob/main/rhelb_chatbot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


---


# RHELB - A Rasa-based bot with Haystack-Elasticsearch integration and Leaf


---



This script builds a chatbot that answers user queries with Rasa, Haystack, Elasticsearch, and Leaf. Rasa handles language understanding and dialogue, Haystack retrieves candidate answers from documents indexed in Elasticsearch, and Leaf is used to pick the answer that is sent back to the user.

The following folder structure is created:

    domain.yml
    config.yml
    data/nlu.yml
    data/rules.yml
    actions/actions.py



Install Rasa, then start the Haystack REST API and a demo DocumentStore via Docker.
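
Once the containers are started, it can help to confirm that Elasticsearch is reachable before the later cells try to connect. The following is a minimal sketch using only the standard library; the helper name `wait_for_http` and the address `http://localhost:9200` (the usual Elasticsearch default) are assumptions and should be adjusted to the actual Docker port mapping.

```python
import time
import urllib.error
import urllib.request

# Assumed default Elasticsearch address; adjust to your Docker port mapping.
ES_URL = "http://localhost:9200"


def wait_for_http(url: str, retries: int = 10, delay: float = 1.0) -> bool:
    """Return True once `url` answers with any HTTP response, else False."""
    for _ in range(retries):
        try:
            with urllib.request.urlopen(url, timeout=2):
                return True
        except urllib.error.HTTPError:
            # The server answered with an error status (e.g. 401), so it is up.
            return True
        except (urllib.error.URLError, OSError):
            # Nothing is listening yet; wait and try again.
            time.sleep(delay)
    return False
```

Calling `wait_for_http(ES_URL)` in a notebook cell returns `False` if nothing answers within the configured number of retries.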




---


# Setting up your environment

## 1. Python Environment Setup
Check if your Python environment is already configured:

In [None]:
!python3 --version
!pip3 --version

Fetch the relevant packages using apt:

In [None]:
!apt update
!apt install -y python3-dev python3-pip

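## 2. Virtual Environment Setup
Create a new virtual environment by choosing a Python interpreter and making a ./venv directory to hold it:

In [None]:
!apt install -y python3-venv
!python3 -m venv ./venv

And activate the virtual environment. In a notebook, each `!` line runs in its own subshell, so the activation does not carry over to later cells; commands that need the environment have to be chained on the same line:

In [None]:
!. ./venv/bin/activate && python3 --version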
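
Because shell activation does not persist between notebook cells, an alternative is to call the environment's interpreter by its path. The sketch below uses the standard-library `venv` and `subprocess` modules; the helper names `create_venv` and `run_in_venv` are hypothetical, and `with_pip=False` is chosen only so the example does not depend on `ensurepip` being installed.

```python
import os
import subprocess
import venv
from pathlib import Path


def create_venv(path: str) -> Path:
    """Create a virtual environment and return the path to its interpreter."""
    # with_pip=False keeps the sketch independent of ensurepip being available.
    venv.create(path, with_pip=False)
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    exe = "python.exe" if os.name == "nt" else "python"
    return Path(path) / bin_dir / exe


def run_in_venv(python: Path, *args: str) -> str:
    """Run the environment's interpreter directly; no activation is needed."""
    result = subprocess.run(
        [str(python), *args], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

For example, `run_in_venv(create_venv("./venv"), "--version")` reports the version of the interpreter inside the environment.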



---

# Installing Rasa Open Source

First make sure your pip version is up to date:

In [None]:
!pip3 install -U pip

To install Rasa Open Source:

In [None]:
!pip3 install rasa

### Congratulations! You have successfully installed Rasa Open Source!

You can now create a new project with:

In [None]:
# --no-prompt skips the interactive questions, which a notebook cell cannot answer
!rasa init --no-prompt

It creates the following files:

    .
    ├── actions
    │   ├── __init__.py
    │   └── actions.py
    ├── config.yml
    ├── credentials.yml
    ├── data
    │   ├── nlu.yml
    │   ├── rules.yml
    │   └── stories.yml
    ├── domain.yml
    ├── endpoints.yml
    ├── models
    │   └── <timestamp>.tar.gz
    └── tests
        └── test_stories.yml
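
A quick way to confirm that the project contains the files this notebook relies on is to check for them explicitly. The sketch below is only illustrative: the list repeats the folder structure named at the top of this notebook, and the helper name `missing_project_files` is hypothetical.

```python
from pathlib import Path
from typing import List

# Files this notebook expects, as listed in the introduction.
EXPECTED_FILES = [
    "domain.yml",
    "config.yml",
    "data/nlu.yml",
    "data/rules.yml",
    "actions/actions.py",
]


def missing_project_files(root: str = ".") -> List[str]:
    """Return the expected project files that do not exist under `root`."""
    base = Path(root)
    return [rel for rel in EXPECTED_FILES if not (base / rel).is_file()]
```

An empty list from `missing_project_files(".")` means all expected files are in place.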



---


# Farm-Haystack & Elasticsearch



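In [None]:
# A separate virtual environment is not used here: activation would not persist
# across `!` lines, and the later cells import these packages from the notebook kernel.
!pip install --upgrade pip
!pip install elasticsearch
!pip install 'farm-haystack[all]'  # or 'farm-haystack[all-gpu]' for the GPU-enabled dependencies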



---


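# Leaf-Framework

In [None]:
# Installed into the notebook kernel's environment for the same reason as above.
# Assumption: the installed `leaf` package provides the leaf.* modules imported
# in the cells below; verify this after installation.
!pip install leaf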



---


# Main Block

In [None]:
# Only needed if the project was initialised in a ./rasa subdirectory.
# `%cd` changes the kernel's working directory; `!cd` would only affect its own subshell.
%cd rasa

# Note: Trainer, Interpreter and Agent.train belong to the Rasa 2.x Python API
# and were removed in Rasa 3.0, so this cell assumes a 2.x installation.
from rasa.core import config as core_config
from rasa.core.agent import Agent
from rasa.nlu import config as nlu_config
from rasa.nlu.model import Interpreter, Trainer
from rasa.shared.nlu.training_data.loading import load_data

# Load the training data in Rasa NLU format
training_data = load_data("data/nlu.yml")

# Train the NLU model
trainer = Trainer(nlu_config.load("config.yml"))
trainer.train(training_data)
model_directory = trainer.persist("models/nlu", fixed_model_name="current")

# Train the Core model (Agent.load_data is a coroutine, hence the top-level await)
agent = Agent("domain.yml", policies=core_config.load("config.yml"))
data = await agent.load_data("data/stories.yml")
agent.train(data)
agent.persist("models/dialogue")

# Initialise the interpreter
interpreter = Interpreter.load(model_directory)

In [None]:
# Import paths follow the Haystack 1.x layout installed by `farm-haystack`.
from haystack.document_stores import ElasticsearchDocumentStore
from haystack.nodes import BM25Retriever, TextConverter

# Connect to Elasticsearch and clear the index
document_store = ElasticsearchDocumentStore(host="localhost", username="", password="", index="document")
document_store.delete_documents()

# Convert the text data and load it into the index
converter = TextConverter(remove_numeric_tables=True)
documents = converter.convert(file_path="data/text_data.txt", meta=None)
document_store.write_documents(documents, index="document")

# Create the retriever on top of the document store
# (BM25Retriever is the Haystack 1.x name for the former ElasticsearchRetriever)
retriever = BM25Retriever(document_store=document_store)
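
Elasticsearch ranks documents with BM25 by default, which is what the sparse retriever above relies on. To make that ranking less of a black box, here is a minimal pure-Python sketch of the textbook BM25 formula. It is not Elasticsearch's or Haystack's implementation: whitespace tokenisation is a simplifying assumption, the function name `bm25_scores` is hypothetical, and `k1=1.2`, `b=0.75` mirror the commonly used defaults.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: str, docs: List[str], k1: float = 1.2, b: float = 0.75) -> List[float]:
    """Score each document against the query with a textbook BM25 formula."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs if n_docs else 0.0
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue  # term absent from this document
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalisation: longer-than-average documents are penalised.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Documents that contain none of the query terms score zero, and among documents with the same term counts the shorter one scores higher.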


In [None]:
# Assumption: `leaf.api` and `leaf.environment.vectorized` are taken as given;
# check that the installed `leaf` package actually exposes them.
from leaf.api import Agent, Environment
from leaf.environment.vectorized import Vectorized
from rasa.core.agent import Agent as RasaAgent

# Build the Leaf model for the chatbot, reusing `retriever` and `interpreter`
# from the previous cells
environment = Environment(Vectorized(environment_size=32))
rasa_agent = RasaAgent.load("models/dialogue", interpreter=interpreter)
leaf_agent = Agent(environment, [retriever, rasa_agent])

# Train the Leaf model
leaf_agent.train(epochs=10)

In [None]:
import json
from typing import Any, Dict, List, Text

from haystack.document_stores import ElasticsearchDocumentStore
from haystack.nodes import BM25Retriever, FARMReader
from haystack.pipelines import ExtractiveQAPipeline
from rasa_sdk import Action, Tracker
from rasa_sdk.executor import CollectingDispatcher

# Assumption: the leaf.query and leaf.model modules are taken as given.
from leaf.model import LeafModel
from leaf.query import LeafQuery


class SearchAction(Action):
    def __init__(self) -> None:
        # Build the Haystack pipeline once instead of on every request.
        # Finder was removed in Haystack 1.0; ExtractiveQAPipeline replaces it.
        document_store = ElasticsearchDocumentStore(host="localhost", username="", password="", index="faq")
        retriever = BM25Retriever(document_store=document_store)
        reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2", use_gpu=False)
        self.pipeline = ExtractiveQAPipeline(reader, retriever)
        self.leaf_model = LeafModel(model_path="models/leaf_model")

    def name(self) -> Text:
        return "action_search"

    def run(self, dispatcher: CollectingDispatcher,
            tracker: Tracker,
            domain: Dict[Text, Any]) -> List[Dict[Text, Any]]:

        # Extract the user's message from the tracker object
        user_message = tracker.latest_message.get("text")

        # Use the Haystack pipeline to get candidate answers
        prediction = self.pipeline.run(
            query=user_message,
            params={"Retriever": {"top_k": 3}, "Reader": {"top_k": 3}},
        )

        # Create a Leaf Query object to get the best answer
        leaf_query = LeafQuery(
            question=user_message,
            context={"answers": [answer.to_dict() for answer in prediction["answers"]]},
        )
        leaf_result = self.leaf_model.predict(leaf_query)

        # Convert the result to a dictionary
        result_dict = json.loads(leaf_result.to_json())

        # Get the best answer, guarding against an empty result
        answers = result_dict.get("answers", [])
        answer = answers[0]["answer"] if answers else "Sorry, I could not find an answer."

        # Send the answer back to the user
        dispatcher.utter_message(text=answer)

        return []
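
The action takes the first entry of the answer list, which fails when no answer is returned and ignores confidence scores. A minimal sketch of a more defensive selection step is shown below. It assumes each answer is a dictionary with `answer` and `score` keys; the helper name `pick_best_answer`, the `min_score` threshold, and the fallback wording are all hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical fallback wording; adapt it to the bot's tone.
FALLBACK_TEXT = "Sorry, I could not find an answer to that."


def pick_best_answer(answers: List[Dict[str, Any]], min_score: float = 0.0) -> str:
    """Return the highest-scoring non-empty answer, or a fallback message."""
    candidates = [
        a for a in answers
        if a.get("answer") and a.get("score", 0.0) >= min_score
    ]
    if not candidates:
        return FALLBACK_TEXT
    best = max(candidates, key=lambda a: a.get("score", 0.0))
    return best["answer"]
```

Inside `run`, the result could then be sent with `dispatcher.utter_message(text=pick_best_answer(answers))`, where `answers` is the list of answer dictionaries.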



---


# Configuring Leaf and training the model

In [None]:
# Import the required packages
# Assumption: these leaf.* modules are taken as given; check that the
# installed `leaf` package actually exposes them.
from leaf.modules import Module
from leaf.data import Dataset
from leaf.optimizers import Adam
from leaf.losses import CrossEntropyLoss
from leaf.models import NeuralNetwork
from leaf.callbacks import AccuracyCallback
from leaf.metrics import accuracy

# Elasticsearch connection settings (assumed values: local instance,
# default port, and the "document" index written earlier)
es_host = "localhost"
es_port = 9200
es_index = "document"

# Define the model parameters
num_classes = 2
input_shape = (28, 28, 1)

# Define the training parameters
batch_size = 32
epochs = 10
learning_rate = 0.001

# Create the dataset
dataset = Dataset.from_elasticsearch(
    es_host=es_host,
    es_port=es_port,
    es_index=es_index,
    query={"query": {"match_all": {}}},
    fields=["text"],
    target_field="label",
    num_samples=1000,
)

# Create the module
module = Module(
    NeuralNetwork(num_classes=num_classes, input_shape=input_shape),
    loss=CrossEntropyLoss(),
    optimizer=Adam(learning_rate),
)

# Create the callback
callback = AccuracyCallback(metric_fn=accuracy, target=1.0)

# Train the model
module.fit(
    dataset=dataset,
    batch_size=batch_size,
    epochs=epochs,
    callbacks=[callback],
)
