# Let's chat with a friend

Demo chat with Leolani. Leolani uses face recognition and gender/age
estimation to establish your identity. If you are new, she will add you to her friends.

To use the face functions, you need to install Docker.
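
A quick way to confirm that the Docker command-line client is installed before running the rest of the notebook is sketched below. This is a minimal check, not part of the repository: `docker_available` is a hypothetical helper, and it only verifies that a `docker` executable is on the PATH, not that the Docker daemon is running.

```python
import shutil


def docker_available():
    """Return True if a `docker` executable is found on the PATH."""
    return shutil.which("docker") is not None


if not docker_available():
    print("Docker was not found on the PATH; the face functions will not work.")
```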

In [14]:
import logging

logger = logging.getLogger()
logger.disabled = True

In [15]:
import emissor as em
from emissor.persistence import ScenarioStorage
from emissor.representation.annotation import AnnotationType, Token, NER
from emissor.representation.container import Index
from emissor.representation.scenario import Modality, ImageSignal, TextSignal, Mention, Annotation, Scenario
from cltl.brain.utils.helper_functions import brain_response_to_json

#Others
import uuid
import time
from datetime import datetime
import cv2

In [16]:
import sys
import os

# @TODO can we move the notebooks one level up instead?
src_path = os.path.abspath(os.path.join('../'))
if src_path not in sys.path:
    sys.path.append(src_path)

#### The next utils are needed for the interaction and creating triples and capsules
import chatbots.util.driver_util as d_util
import chatbots.util.face_util as f_util
import chatbots.util.text_util as t_util
import chatbots.intentions.get_to_know_you as friend
import chatbots.intentions.talk as talk

ModuleNotFoundError: No module named 'src'

In [17]:
### Link your camera
camera = cv2.VideoCapture(0)

## Standard initialisation of a scenario

Setup file paths and scenario context information.

In [18]:
import os
import requests
##### Setting the location
place_id = str(uuid.uuid4())
location = None
try:
    location = requests.get("https://ipinfo.io").json()
except (requests.RequestException, ValueError):
    # Network failure or a response that is not valid JSON
    print("failed to get the IP location")

##### Setting the agents
AGENT = "Leolani2"
human_name = "Stranger"
human_id = "stranger"

### The name of your scenario
scenario_id = datetime.today().strftime("%Y-%m-%d-%H:%M:%S")

### Specify the path to an existing data folder where your scenario is created and saved as a subfolder
# Find the repository root dir
parent, dir_name = (d_util.__file__, "_")
while dir_name and dir_name != "src":
    parent, dir_name = os.path.split(parent)
root_dir = parent
scenario_path = os.path.abspath(os.path.join(root_dir, 'data'))

if not os.path.exists(scenario_path):
    os.mkdir(scenario_path)
    print("Created a data folder for storing the scenarios", scenario_path)

### Create the scenario folder, the json files and a scenarioStorage and scenario in memory
scenarioStorage = d_util.create_scenario(scenario_path, scenario_id)
scenario_ctrl = scenarioStorage.create_scenario(scenario_id, int(time.time() * 1e3), None, AGENT)

Directories for 2022-03-24-21:46:15 created in /home/tk/repos/cltl-chatbots/data
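
The scenario name above contains colons, which works on Linux and macOS but is not a valid folder name on Windows. If the notebook needs to run there, a variant of the timestamp format without colons could be used. The sketch below is only an illustration; `safe_scenario_id` is a hypothetical helper and the underscore separator is an arbitrary choice.

```python
from datetime import datetime


def safe_scenario_id(moment=None):
    """Format a timestamp without ':' so it is a valid folder name on Windows too."""
    if moment is None:
        moment = datetime.today()
    return moment.strftime("%Y-%m-%d-%H_%M_%S")


print(safe_scenario_id(datetime(2022, 3, 24, 21, 46, 15)))  # 2022-03-24-21_46_15
```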


## Define the location of the face embedding information for her friends

The faces of friends are stored in a folder as embeddings. Every friend is identified through a name, gender and age property detected by the software. The name and the system time are combined to create a unique identifier, which is currently saved in the file name of the embedding file. In a future version, we will create a JSON structure with the metadata on identities.
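
The naming scheme can be sketched as follows. This is a minimal illustration that assumes the `_t_` separator used by the parsing helpers later in this notebook; `make_friend_id` and `split_friend_id` are hypothetical names, not functions from the repository, and the timestamp is an example value.

```python
import time


def make_friend_id(name, timestamp_ms=None):
    """Combine a name and a millisecond timestamp into one identifier."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1e3)
    return f"{name}_t_{timestamp_ms}"


def split_friend_id(friend_id):
    """Recover the name and the timestamp from an identifier built above."""
    # rpartition splits on the last "_t_", so names may contain underscores
    name, _, timestamp = friend_id.rpartition("_t_")
    return name, int(timestamp)


friend_id = make_friend_id("Vu_Amsterdam", 1648154775000)
print(friend_id)
print(split_friend_id(friend_id))
```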

In [19]:
### Specify the path to an existing folder with the embeddings of your friends
friends_path = os.path.abspath(os.path.join(root_dir, 'friend_embeddings'))
if friends_path not in sys.path:
    sys.path.append(friends_path)

print("The paths with the friends:", friends_path)

The paths with the friends: /home/tk/repos/cltl-chatbots/friend_embeddings


### Loading the docker containers for face detection and face property detection

You only need to load the containers once. The first time you load a container, its image is downloaded from Docker Hub. This may take a few minutes, depending on the speed of your internet connection. The images are cached in your local Docker installation.

Once the images are in your local Docker installation, they load almost instantly. Once the containers are started, you do not need to start them again and you can skip the next commands.

In [20]:
### This is only needed if you start the docker containers from this notebook

container_fdr = f_util.start_docker_container(
    "tae898/face-detection-recognition", 10002
)
container_ag = f_util.start_docker_container("tae898/age-gender", 10003)
container_yolo = f_util.start_docker_container("tae898/yolov5", 10004)
container_room = f_util.start_docker_container("tae898/room-classification", 10005)
container_erc = f_util.start_docker_container("tae898/emoberta-large", 10006)


If there is a problem starting the containers, you may need to kill them and start them again. Use the following command to kill them, then rerun the previous cell. Note that if they are already running, you should not start them again: doing so gives an error that the port is occupied.

In [21]:
# !docker kill $(docker ps -q)
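
To find out whether the containers are already running before starting or killing anything, you can test whether their ports accept connections. The sketch below uses only the standard library; `port_in_use` is a hypothetical helper, and the port numbers are the ones passed to `start_docker_container` earlier in this notebook. A port that is in use only tells you that something is listening there, not that it is the expected container.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0


# Ports used for the containers in this notebook
for port in (10002, 10003, 10004, 10005, 10006):
    state = "in use" if port_in_use(port) else "free"
    print(f"port {port}: {state}")
```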

## We are now set to make a new friend

The functions in *intentions/get_to_know_you.py* are needed to get the properties and visual information for identifying a new friend.

The visual information is based on the camera images of the user, from which we extract an averaged embedding.
These embeddings are stored in the *friend_embeddings* folder.

By comparing an image with the stored embeddings, the system decides whether a person is a *stranger*.
If the user is a *stranger*, the system will try to get to know them.

If you delete someone's embeddings from the *friend_embeddings* folder, this person will become a *stranger* again.
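
The idea of averaging embeddings and comparing a new face against the stored ones can be illustrated with a small, self-contained sketch. It is not the method of the face recognition container, whose internals are not shown here: cosine similarity, the threshold of 0.8, the three-dimensional toy vectors, and the function names are all assumptions made for illustration.

```python
import math


def average_embedding(embeddings):
    """Element-wise mean of several equal-length embedding vectors."""
    count = len(embeddings)
    return [sum(values) / count for values in zip(*embeddings)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(embedding, friends, threshold=0.8):
    """Return the best-matching friend name, or None for a stranger."""
    best_name, best_score = None, threshold
    for name, stored in friends.items():
        score = cosine_similarity(embedding, stored)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# One friend, stored as the average of two toy embeddings
friends = {"Vu_Amsterdam": average_embedding([[0.9, 0.1, 0.0], [1.0, 0.0, 0.1]])}
print(identify([0.95, 0.05, 0.05], friends))  # close to the stored vector
print(identify([0.0, 0.1, 1.0], friends))     # far from it, so a stranger
```

Deleting an entry from `friends` in this sketch has the same effect as deleting an embedding file: the next comparison no longer finds a match and the person is treated as a stranger.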

In [22]:
def parse_age(face_info):
    return round(face_info.age["mean"])
def parse_gender(face_info):
    return "male" if face_info.gender["m"] > 0.5 else "female"
def parse_bbox(face_info):
    return [int(num) for num in face_info.bbox.tolist()]
def parse_id(face_info):
    return face_info.face_id['name'] if 'name' in face_info.face_id else f"Stranger_t_{int(time.time() * 1e3)}"
def parse_name(face_info):
    face_id = parse_id(face_info)
    return face_id.split("_t_")[0] if face_id else "Stranger"

# First signals to get started
faces = []
while len(faces) != 1:
    success, frame = camera.read()
    if not success:
        raise ValueError("Failed to take a picture")
        
    image_time = int(time.time() * 1e3)
    imagepath = d_util.absolute_path(scenarioStorage, scenario_id, Modality.IMAGE, f"{image_time}.png")
    cv2.imwrite(imagepath, frame)
    
    faces = f_util.detect_faces(friends_path, imagepath)
    
    image_bbox = (0, 0, frame.shape[1], frame.shape[0])
    imageSignal = d_util.create_image_signal(scenario_ctrl, f"{image_time}.png", image_bbox, image_time)
    mentions = [f_util.create_face_mention(imageSignal, "front_camera", image_time,
                                           parse_bbox(face), parse_id(face), parse_name(face),
                                           parse_age(face), parse_gender(face), face.det_score)
                for face in faces]
    imageSignal.mentions.extend(mentions)
    scenario_ctrl.append_signal(imageSignal)

    if not faces:
        response = "Hi, anyone there? I can't see you.."
        time.sleep(3)
    elif len(faces) > 1:
        response = "Hi there! Apologies, but I will only talk to one of you at a time.."
        time.sleep(3)
    else:
        face = faces[0]
        if parse_id(face) is None:
            ### This is a stranger, we process the new face
            human_id, human_name, _ = friend.get_to_know_person(scenario_ctrl, AGENT, parse_gender(face),
                                                                parse_age(face), face.face_id, face.embedding,
                                                                friends_path)
            
            # human_id = human_name  ### Hack because we cannot force the namespace through capsules, name and identity are the same till this is fixed


            ### Add the new information to the signal
            mention = f_util.create_face_mention(imageSignal, "front_camera", image_time,
                                                 parse_bbox(face), human_id, human_name,
                                                 parse_age(face), parse_gender(face), face.det_score)
            imageSignal.mentions.append(mention)

            response = f"So you what do you want to talk about {human_name}?"
        else:
            ### We know this person
            human_id = parse_id(face)
            human_name = parse_name(face)
            response = f"Hi {parse_name(face)}. Nice to see you again. How are you today?"

    print(f"{AGENT}: {response}\n")

    # Store signals, annotated with the inferred Person information
    textSignal = d_util.create_text_signal(scenario_ctrl, response)
    scenario_ctrl.append_signal(textSignal)
    
scenarioStorage.save_scenario(scenario_ctrl)

url http://127.0.0.1:10004
Leolani2: Hi Vu_Amsterdam. Nice to see you again. How are you today?



## Have a conversation with a friend

Below is a simple chat scenario in which we can say anything to our identified friend and store images and conversation in the EMISSOR scenario.

In [23]:
stopped = False
while not stopped:
    utterance = input("\n")
    utterance_timestamp = int(time.time() * 1e3)
    if not utterance:
        continue

    
    # @TODO: also annotate the textSignal
    # Apply some processing to the textSignal and add annotations
    success, frame = camera.read()
    if not success:
        raise ValueError("Failed to take a picture")
        
    image_time = int(time.time() * 1e3)
    imagepath = d_util.absolute_path(scenarioStorage, scenario_id, Modality.IMAGE, f"{image_time}.png")
    cv2.imwrite(imagepath, frame)
    
    faces = f_util.detect_faces(friends_path, imagepath)
    
    image_bbox = (0, 0, frame.shape[1], frame.shape[0])
    imageSignal = d_util.create_image_signal(scenario_ctrl, f"{image_time}.png", image_bbox, image_time)
    mentions = [f_util.create_face_mention(imageSignal, "front_camera", image_time,
                                           parse_bbox(face), parse_id(face), parse_name(face),
                                           parse_age(face), parse_gender(face), face.det_score)
                for face in faces]
    imageSignal.mentions.extend(mentions)

    greeting = ""
    if faces and human_id not in [parse_id(face) for face in faces]:
        response = f"Good bye {human_name}!"
        print(f"{AGENT}: {response}\n")
        textSignal = d_util.create_text_signal(scenario_ctrl, response)
        scenario_ctrl.append_signal(textSignal)

        if len(faces) > 1:
            greeting = "Apologies, but I will only talk to one person at a time.."
        else:
            face = faces[0]
            if parse_id(face) is None:
                ### This is a stranger, we process the new face
                human_id, human_name, _ = friend.get_to_know_person(scenario_ctrl, AGENT, parse_gender(face),
                                                                    parse_age(face), face.face_id, face.embedding,
                                                                    friends_path)
                ### Add the new information to the signal
                mention = f_util.create_face_mention(imageSignal, "front_camera", image_time,
                                                     parse_bbox(face), human_id, human_name,
                                                     parse_age(face), parse_gender(face), face.det_score)
                imageSignal.mentions.append(mention)
    
                greeting = f"Nice to meet you, {human_name}!"
            else:
                human_id = parse_id(face)
                human_name = parse_name(face)
                greeting = f"Hi {parse_name(face)}. Nice to see you again. How are you today?"
    else:
        ### If no face is detected, assume it's still the same person talking
        pass
    
    emotion = t_util.recognize_emotion(utterance)
    emotion = max(emotion, key=emotion.get)
    print(f"{human_name}: ({emotion}) {utterance}\n")
    utteranceSignal = d_util.create_text_signal(scenario_ctrl, utterance, utterance_timestamp)

    if utterance.lower() == "stop" or utterance.lower() == "bye":
        response = f"Good bye {human_name}!"
        stopped = True
    else:
        # If there is no greeting, create a response from the system and store this as a new signal
        # We could use the thoughts to respond
        # @TODO generate a response from the thoughts
        response = f"{greeting} So you what do you want to talk about {human_name}?"

    print(f"{AGENT}: {response}\n")
    responseSignal = d_util.create_text_signal(scenario_ctrl, response)

    # Store signals, annotated with the inferred Person information
    scenario_ctrl.append_signal(utteranceSignal)
    scenario_ctrl.append_signal(responseSignal)
    scenario_ctrl.append_signal(imageSignal)
    
    scenarioStorage.save_scenario(scenario_ctrl)


url http://127.0.0.1:10004
Vu_Amsterdam: (joy) hello leolani!

Leolani2:  So you what do you want to talk about Vu_Amsterdam?

url http://127.0.0.1:10004
Vu_Amsterdam: (neutral) about my future plans

Leolani2:  So you what do you want to talk about Vu_Amsterdam?

url http://127.0.0.1:10004
Vu_Amsterdam: (neutral) I wanna invest in cryptos.

Leolani2:  So you what do you want to talk about Vu_Amsterdam?

url http://127.0.0.1:10004
Vu_Amsterdam: (joy) Will cryptos bring me money? hahaha

Leolani2:  So you what do you want to talk about Vu_Amsterdam?

url http://127.0.0.1:10004
Vu_Amsterdam: (neutral) Is shiba coin better than bitcoins?

Leolani2:  So you what do you want to talk about Vu_Amsterdam?

url http://127.0.0.1:10004
Vu_Amsterdam: (neutral) you don't know either?

Leolani2:  So you what do you want to talk about Vu_Amsterdam?

url http://127.0.0.1:10004
Vu_Amsterdam: (neutral) okay boring.

Leolani2:  So you what do you want to talk about Vu_Amsterdam?

url http://127.0.0.1:100
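
The emotion label printed in parentheses before each utterance is selected with `max(emotion, key=emotion.get)`. The sketch below shows how that expression works, assuming that `t_util.recognize_emotion` returns a dictionary that maps emotion labels to scores (as the code above implies). The scores and the helper name `top_label` are made up for illustration.

```python
def top_label(scores):
    """Return the key with the highest value in a label -> score dictionary."""
    # max() iterates over the keys and ranks them by scores.get(key)
    return max(scores, key=scores.get)


# Hypothetical scores, standing in for the output of t_util.recognize_emotion
emotion_scores = {"neutral": 0.12, "joy": 0.81, "surprise": 0.05, "anger": 0.02}
print(top_label(emotion_scores))  # joy
```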

### Set the end time of the scenario, save it and stop the containers

After we stopped the interaction, we set the end time and save the scenario as EMISSOR data.

In [24]:
scenario_ctrl.scenario.ruler.end = int(time.time() * 1e3)
scenarioStorage.save_scenario(scenario_ctrl)

In [25]:
## Stopping the docker containers
## This is only needed if you started them in this notebook
f_util.kill_container(container_fdr)
f_util.kill_container(container_ag)
f_util.kill_container(container_yolo)
f_util.kill_container(container_room)
f_util.kill_container(container_erc)

In [None]:
#### Stop the camera when we are done
camera.release()

## End of notebook