# Let's chat with a friend

Demo chat with Leolani. Leolani uses face recognition and gender/age
estimation to establish your identity. If you are new, she will add you to her friends.

To use the face functions, you need to install Docker.

In [15]:
#! pip install matplotlib
#! pip install pandas
#! pip install seaborn
#! pip install torchvision

Collecting torchvision
  Downloading torchvision-0.11.1-cp37-cp37m-macosx_10_9_x86_64.whl (1.2 MB)
     |████████████████████████████████| 1.2 MB 3.8 MB/s            
Collecting torch==1.10.0
  Downloading torch-1.10.0-cp37-none-macosx_10_9_x86_64.whl (147.1 MB)
     |████████████████████████████████| 147.1 MB 7.9 MB/s             
Installing collected packages: torch, torchvision
  Attempting uninstall: torch
    Found existing installation: torch 1.9.0
    Uninstalling torch-1.9.0:
      Successfully uninstalled torch-1.9.0
Successfully installed torch-1.10.0 torchvision-0.11.1


In [1]:
import emissor as em
from emissor.persistence import ScenarioStorage
from emissor.representation.annotation import AnnotationType, Token, NER
from emissor.representation.container import Index
from emissor.representation.scenario import Modality, ImageSignal, TextSignal, Mention, Annotation, Scenario
from cltl import brain
from cltl.brain.utils.helper_functions import brain_response_to_json

#Others
import uuid
from datetime import datetime
import cv2



### Import the chatbot utility functions

In [16]:
import sys
import os

src_path = os.path.abspath(os.path.join('..'))
if src_path not in sys.path:
    sys.path.append(src_path)

#### The next utils are needed for the interaction and creating triples and capsules
import chatbots.util.driver_util as d_util
import chatbots.util.capsule_util as c_util
import chatbots.util.face_util as f_util
import chatbots.intentions.talk as talk
import chatbots.intentions.get_to_know_you as friend

In [17]:
### Link your camera
camera = cv2.VideoCapture(0)

## Standard initialisation of a scenario

In [18]:
from random import getrandbits
import requests
##### Setting the location
place_id = getrandbits(8)
location = requests.get("https://ipinfo.io").json()

##### Setting the agents
AGENT = "Leolani2"
HUMAN_NAME = "Stranger"
HUMAN_ID = "stranger"

### The name of your scenario
scenario_id = datetime.today().strftime("%Y-%m-%d-%H:%M:%S")

### Specify the path to an existing data folder where your scenario is created and saved as a subfolder
scenario_path = os.path.abspath(os.path.join('../../data'))
if scenario_path not in sys.path:
    sys.path.append(scenario_path)

if not os.path.exists(scenario_path) :
    os.mkdir(scenario_path)
    print("Created a data folder for storing the scenarios", scenario_path)

### Create the scenario folder, the json files and a scenarioStorage and scenario in memory
scenarioStorage = d_util.create_scenario(scenario_path, scenario_id)
scenario = scenarioStorage.create_scenario(scenario_id, datetime.now().microsecond, datetime.now().microsecond, AGENT)

Directory  /Users/piek/PycharmProjects/cltl-chatbots/data/2021-11-15-09:46:49  Created 
Directory  /Users/piek/PycharmProjects/cltl-chatbots/data/2021-11-15-09:46:49/image  Created 
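The cell above passes `datetime.now().microsecond` as the start and end of the scenario. That attribute holds only the sub-second part of the current time (0 to 999999), so values taken in different seconds are not ordered. If a full timestamp is wanted, a minimal sketch could look like the one below. The helper `timestamp_ms` is hypothetical, and whether EMISSOR expects milliseconds since the epoch is an assumption to check against the EMISSOR documentation.

```python
import time
from datetime import datetime


def timestamp_ms() -> int:
    """Return the current time as milliseconds since the Unix epoch (hypothetical helper)."""
    return int(time.time() * 1000)


now = datetime.now()
# .microsecond is only the fractional-second component of the time
sub_second = now.microsecond
# A full timestamp keeps growing from one second to the next
start = timestamp_ms()
```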


## Define the location of the face embedding information for her friends

The faces of friends are stored in a folder as embeddings. Every friend is identified through a name, gender, and age property detected by the software. The name and the system time are used to create a unique identifier. For now, we save this identifier in the file name of the embedding file. In a future version, we will create a JSON structure with the metadata on identities.
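As an illustration of such an identifier, the sketch below joins a name and a timestamp and recovers the name again. The `_t_` separator is assumed from the `split("_t_")` call used later in this notebook. The helper names are hypothetical, and the exact format written by `get_to_know_person` may differ.

```python
import time

SEPARATOR = "_t_"  # assumption: taken from the split("_t_") call later in the notebook


def make_friend_id(name: str, timestamp: int) -> str:
    """Join a name and a timestamp into one identifier (hypothetical helper)."""
    return f"{name}{SEPARATOR}{timestamp}"


def name_from_friend_id(friend_id: str) -> str:
    """Recover the name part of an identifier built by make_friend_id."""
    return friend_id.split(SEPARATOR)[0]


friend_id = make_friend_id("Carl", int(time.time()))
print(friend_id, "->", name_from_friend_id(friend_id))
```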

In [19]:
### Specify the path to an existing folder with the embeddings of your friends
friends_path = os.path.abspath(os.path.join('../../friend_embeddings'))
if friends_path not in sys.path:
    sys.path.append(friends_path)

print("The paths with the friends:", friends_path)

### Define the folder where the images are saved
imagefolder = scenario_path + "/" + scenario_id + "/" + "image"

The paths with the friends: /Users/piek/PycharmProjects/cltl-chatbots/friend_embeddings


### Loading the docker containers for face detection and face property detection

You only need to load the Docker containers once. The first time you load them, the images are downloaded from Docker Hub. This may take a few minutes, depending on the speed of your internet connection. The images are cached in your local Docker installation.

Once the images are in your local Docker installation, they load almost instantaneously. Once the containers are running, you do not need to start them again and you can skip the next commands.

In [23]:
#container_fdr = f_util.start_docker_container("tae898/face-detection-recognition", 10002)
#container_ag = f_util.start_docker_container("tae898/age-gender", 10003)
#container_yolo = f_util.start_docker_container("tae898/yolov5", 10004)

If there is a problem starting the containers, you may need to kill them and start them again. Use the following command to kill them, then rerun the previous commands. Note that if they are already running you should not restart them: starting them again gives an error that the port is occupied.

In [21]:
#!docker kill $(docker ps -q)

404c4820ea1d
28309175abc8
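To see whether the containers are already running before starting them, you can check whether their ports accept connections. The sketch below uses only the standard library. The helper `port_in_use` is hypothetical and not part of `face_util`, and it assumes the containers listen on localhost at the ports used above.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0


# Ports used for the three containers in this notebook
for port in (10002, 10003, 10004):
    state = "occupied" if port_in_use(port) else "free"
    print(f"Port {port} is {state}")
```

An occupied port only shows that some process is listening there. It does not prove that the process is the expected container.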


## We are now set to make a new friend

The functions in *intentions/get_to_know_you.py* are needed to get the properties and visual information for identifying a new friend.

The visual information is based on the camera images of the user, from which we extract an averaged embedding.
These embeddings are stored in the *friend_embeddings* folder.

By comparing an image with the stored embeddings, the system decides whether a person is a *stranger*.
In case the user is a *stranger*, the system will try to get to know him/her.

If you delete someone's embeddings from the *friend_embeddings* folder, this person will become a *stranger* again.
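The idea of averaging embeddings and comparing a new face against the stored ones can be sketched as follows. This is an illustration only: cosine similarity, the threshold of 0.5, and all function names are assumptions, and the comparison actually performed by the face recognition container may differ.

```python
import math
from typing import Dict, List, Optional


def average_embedding(embeddings: List[List[float]]) -> List[float]:
    """Average several embeddings of the same face, dimension by dimension."""
    count = len(embeddings)
    return [sum(values) / count for values in zip(*embeddings)]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(embedding: List[float], friends: Dict[str, List[float]],
             threshold: float = 0.5) -> Optional[str]:
    """Return the best matching friend, or None when the face is a stranger."""
    best_name, best_score = None, threshold
    for name, stored in friends.items():
        score = cosine_similarity(embedding, stored)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Toy two-dimensional embeddings; real face embeddings have many more dimensions
friends = {"carl": [1.0, 0.0], "lea": [0.0, 1.0]}
print(identify([0.9, 0.1], friends))
print(identify([-1.0, -1.0], friends))
```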

In [None]:
# First signals to get started
success, frame = camera.read()
imagepath = ""
if success:
    current_time = str(datetime.now().microsecond)
    imagepath = imagefolder + "/" + current_time + ".png"
    cv2.imwrite(imagepath, frame)
    (
        genders,
        ages,
        face_bboxes,
        faces_detected,
        det_scores,
        embeddings,
        yolo_results
    ) = f_util.do_stuff_with_image(friends_path, imagepath)

    # Initial prompt by the system from which we create a TextSignal and store it

    # Here we assume that only one face is in the image
    # TODO: deal with multiple people.
    for k, (gender, age, face_bbox, uuid_name, faceprob, embedding) in enumerate(
        zip(genders, ages, face_bboxes, faces_detected, det_scores, embeddings)
    ):
        age = round(age["mean"])
        gender = "male" if gender["m"] > 0.5 else "female"
        face_bbox = [int(num) for num in face_bbox.tolist()]

    assert k == 0

    if uuid_name["name"] is None:
        ### This is a stranger
        ### We create the agent response and store it as a text signal
        
        HUMAN_ID, HUMAN_NAME, textSignal = friend.get_to_know_person(scenario, AGENT, gender, age, uuid_name, embedding, friends_path)
        HUMAN_ID = HUMAN_NAME  ### Hack because we cannot force the namespace through capsules, name and identity are the same till this is fixed

        ### The system responds to the processing of the new name input and stores it as a textsignal
        response = f"Nice to meet you, {HUMAN_NAME}"
        print(f"{AGENT}: {response}\n")
        textSignal = d_util.create_text_signal(scenario, response)
        scenario.append_signal(textSignal)

    else:
        ### We know this person
        HUMAN_ID= uuid_name['name']
        HUMAN_NAME = HUMAN_ID  ### Hack because we cannot force the namespace through capsules, name and identity are the same till this is fixed 
        # HUMAN_NAME = HUMAN_ID.split("_t_")[0]

        response = f"Hi {HUMAN_NAME}. Nice to see you again. How are you today?"
        print(f"{AGENT}: {response}\n")
        textSignal = d_util.create_text_signal(scenario, response)
        scenario.append_signal(textSignal)
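
The cell above assumes that exactly one face is in the image. If no face is detected, the loop body never runs, `k` is never assigned, and `assert k == 0` raises a `NameError`. With two or more faces the assertion fails. One way to handle both cases is sketched below. The helper `pick_single_face` is hypothetical, and choosing the face with the highest detection score is just one possible policy, not the notebook's method.

```python
from typing import Optional, Sequence, Tuple


def pick_single_face(genders: Sequence, ages: Sequence, bboxes: Sequence,
                     names: Sequence, scores: Sequence,
                     embeddings: Sequence) -> Optional[Tuple]:
    """Return the detection with the highest score, or None if there is no face."""
    faces = list(zip(genders, ages, bboxes, names, scores, embeddings))
    if not faces:
        return None
    # Index 4 of each tuple is the detection score
    return max(faces, key=lambda face: face[4])


# Toy values standing in for the detector output
face = pick_single_face(
    [{"m": 0.8}, {"m": 0.2}],
    [{"mean": 31.4}, {"mean": 27.9}],
    [[0, 0, 10, 10], [5, 5, 20, 20]],
    [{"name": None}, {"name": "carl"}],
    [0.71, 0.93],
    [[0.1, 0.2], [0.3, 0.4]],
)
if face is None:
    print("No face detected, skipping identification")
else:
    gender, age, bbox, uuid_name, score, embedding = face
    print(uuid_name["name"], score)
```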


## Have a conversation with a friend

Below is a simple chat scenario in which we can say anything to our identified friend and store images and conversation in the EMISSOR scenario.

In [None]:
### First prompt

#response = "How are you doing today, "+HUMAN_NAME
print(f"{AGENT}: {response}\n")
#textSignal = d_util.create_text_signal(scenario, response)
#scenario.append_signal(textSignal)

utterance = input("\n")
print(f"{HUMAN_NAME}: {utterance}\n")

while not (utterance.lower() == "stop" or utterance.lower() == "bye"):
    textSignal = d_util.create_text_signal(scenario, utterance)
    scenario.append_signal(textSignal)

    # @TODO: also annotate the textSignal
    # Apply some processing to the textSignal and add annotations
        
        
    ## We capture the image again
    if success:
        imageSignal = d_util.create_image_signal(scenario, imagepath)
        container_id = str(uuid.uuid4())

        #### Properties are now stored as annotations
        #### We do not store these properties again to the BRAIN
        for gender, age, face_bbox, name, faceprob in zip(
            genders, ages, face_bboxes, faces_detected, det_scores
        ):

            age = round(age["mean"])
            gender = "male" if gender["m"] > 0.5 else "female"
            face_bbox = [int(num) for num in face_bbox.tolist()]
        
        f_util.add_face_annotation(imageSignal, container_id, "front_camera", container_id, current_time,
                                   face_bbox, HUMAN_ID, HUMAN_NAME, age, gender, faceprob)
 
        scenario.append_signal(imageSignal)


    # Create the response from the system and store this as a new signal
    # We could use the thoughts to respond
    # @TODO generate a response from the thoughts

    response = "So what do you want to talk about, " + HUMAN_NAME + "?"
    print(f"{AGENT}: {response}\n")
    # Store the system response (not the user utterance) as the new signal
    textSignal = d_util.create_text_signal(scenario, response)
    scenario.append_signal(textSignal)
          
    # Getting the next input signals
    utterance = input("\n")

    success, frame = camera.read()
    if success:
        current_time = str(datetime.now().microsecond)
        imagepath = imagefolder + "/" + current_time + ".png"
        cv2.imwrite(imagepath, frame)
        (
            genders,
            ages,
            face_bboxes,
            faces_detected,
            det_scores,
            embeddings,
            yolo_results
        ) = f_util.do_stuff_with_image(friends_path, imagepath)
        
        
        # Here we assume that only one face is in the image
        # TODO: deal with multiple people.
        for k, (gender, age, face_bbox, uuid_name, faceprob, embedding) in enumerate(
            zip(genders, ages, face_bboxes, faces_detected, det_scores, embeddings)
        ):
            age = round(age["mean"])
            gender = "male" if gender["m"] > 0.5 else "female"
            face_bbox = [int(num) for num in face_bbox.tolist()]

        assert k == 0

        if uuid_name["name"] is None:
            ### This is a stranger
            ### We create the agent response and store it as a text signal
            
            ### The system responds to the user switch
            response = f"Goodbye, {HUMAN_NAME}. And who are you?"
            print(f"{AGENT}: {response}\n")
            textSignal = d_util.create_text_signal(scenario, response)
            scenario.append_signal(textSignal)
            
            ### Establish a new name and id
            HUMAN_ID, HUMAN_NAME, textSignal = friend.get_to_know_person(scenario, AGENT, gender, age, uuid_name, embedding, friends_path)

            ### The system responds to the processing of the new name input and stores it as a textsignal
            response = f"Nice to meet you, {HUMAN_NAME}"
            print(f"{AGENT}: {response}\n")
            textSignal = d_util.create_text_signal(scenario, response)
            scenario.append_signal(textSignal)

        else:
            ### We know this person, but it is a different person than the one we were talking to
            if not HUMAN_ID == uuid_name['name']:
                
                ### The system responds to the user switch
                response = f"Goodbye, {HUMAN_NAME}. And who are you?"
                print(f"{AGENT}: {response}\n")
                textSignal = d_util.create_text_signal(scenario, response)
                scenario.append_signal(textSignal)
                
                ### set the name and id for this other friend
                HUMAN_ID =  uuid_name['name']
                HUMAN_NAME = HUMAN_ID.split("_t_")[0]
                response = f"Hi {HUMAN_NAME}. Nice to see you too."
                print(f"{AGENT}: {response}\n")
                textSignal = d_util.create_text_signal(scenario, response)
                scenario.append_signal(textSignal)
            
            

### Set the end time of the scenario, save it and stop the containers

After we stop the interaction, we set the end time and save the scenario as EMISSOR data.

In [None]:
#scenario.scenario.end = datetime.now().microsecond
scenarioStorage.save_scenario(scenario)

In [None]:
### Stopping the docker containers
### This is only needed if you started them in this notebook

f_util.kill_container(container_fdr)
f_util.kill_container(container_ag)
f_util.kill_container(container_yolo)

In [None]:
#### Stop the camera when we are done
camera.release()

## End of notebook