# Let's chat with a friend

Demo chat with Leolani. Leolani uses face recognition and gender/age
estimation.

Don't forget to install emissor by `pip install .` at the root of this repo.
Install the requirements `pip install -r requirements.txt`
you might also have to run `python -m spacy download en`

Occasionally you have to kill the docker containers if you force close the chat.
`docker kill $(docker ps -q)`

In [1]:
import emissor as em
import uuid
from datetime import datetime
from emissor.persistence import ScenarioStorage
from emissor.representation.annotation import AnnotationType, Token, NER
from emissor.representation.container import Index
from emissor.representation.scenario import (
    Modality,
    ImageSignal,
    TextSignal,
    Mention,
    Annotation,
    Scenario,
)
import cv2

In [2]:
import sys
import os
src_path = os.path.abspath(os.path.join('..'))
if src_path not in sys.path:
    sys.path.append(src_path)

#### The next utils are needed for the interaction and creating triples and capsules
import util.driver_util as d_util
import util.capsule_util as c_util
import util.face_util as f_util
import intentions.get_to_know_you as friend
import intentions.listen as listen
import intentions.answer as answer

[nltk_data] Downloading package punkt to /Users/piek/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


In [3]:
### Link your camera
camera = cv2.VideoCapture(0)

## Standard initialisation of a scenario

In [6]:
from random import getrandbits
import requests
##### Setting the location
place_id = getrandbits(8)
location = requests.get("https://ipinfo.io").json()

##### Setting the agents
agent = "Leolani2"
human = "Stranger"

### The name of your scenario
scenario_id = datetime.today().strftime("%Y-%m-%d-%H:%M:%S")

### Specify the path to an existing data folder where your scenario is created and saved as a subfolder
scenario_path = "../../data"

### Define the folder where the images are saved
imagefolder = scenario_path + "/" + scenario_id + "/" + "image"


### Create the scenario folder, the json files and a scenarioStorage and scenario in memory
scenarioStorage = d_util.create_scenario(scenario_path, scenario_id)
scenario = scenarioStorage.create_scenario(scenario_id, datetime.now().microsecond, datetime.now().microsecond, agent)

Directory  ../../data/2021-11-01-12:21:39  Created 
Directory  ../../data/2021-11-01-12:21:39/image  Created 


### Loading the docker containers for face detection and face property detection

You only need to load the dockers once. The first time you load the docker, the images will be donwloaded from the DockerHub. This may take a few minutes depending on the speed of the internet connection. The images are cached in your local Docker installation.

One the images are in your local Docker, they are loaded instantaniously. Once the docker is started you do not need to start it again and you can skip the next commands.

In [7]:
#container_fdr = f_util.start_docker_container("tae898/face-detection-recognition:v0.1", 10002)
#container_ag = f_util.start_docker_container("tae898/age-gender:v0.2", 10003)

If there is a problem starting the dockers, you may need to kill them and start them again. Use the following command to kill and rerun the previous command. Note that if there are running already you should not restart. Starting it again gives an error that the port is occupied.

In [8]:
#!docker kill $(docker ps -q)

## We are now set to make a new friend

The functions in *intentions/get_to_know_you.py* are needed to get the properties and visual information for identifying a new friend.

The visual information is based on the camera images of the uses from which we extract an averaged embedding.
These embeddings are store in the *friend_embeddings* folder. 

By comparing an image with the stored embeddings, the system decides whether a person is a *stranger*.
In case the user is a *stranger*, the system will try to get to know him/her.

If you delete someone's embeddings from the *friend_embeddings* folder. This person will become a *stranger* again.

In [9]:
# First signals to get started
success, frame = camera.read()
imagepath = ""
if success:
    current_time = str(datetime.now().microsecond)
    imagepath = imagefolder + "/" + current_time + ".png"
    cv2.imwrite(imagepath, frame)
    (
        genders,
        ages,
        bboxes,
        faces_detected,
        det_scores,embeddings,
    ) = f_util.do_stuff_with_image(imagepath)

    # Initial prompt by the system from which we create a TextSignal and store it

    # Here we assume that only one face is in the image
    # TODO: deal with multiple people.
    for k, (gender, age, bbox, uuid_name, faceprob, embedding) in enumerate(
        zip(genders, ages, bboxes, faces_detected, det_scores, embeddings)
    ):
        age = round(age["mean"])
        gender = "male" if gender["m"] > 0.5 else "female"
        bbox = [int(num) for num in bbox.tolist()]

    assert k == 0

    if uuid_name["name"] is None:
        ### This is a stranger
        ### We create the agent response and store it as a text signal
        
        human_id, human, textSignal = friend.get_to_know_person(scenario, agent, gender, age, uuid_name, embedding)


        ### The system responds to the processing of the new name input and stores it as a textsignal
        print(agent + f": Nice to meet you, {human}")
        response = f": Nice to meet you, {human}"
        textSignal = d_util.create_text_signal(scenario, response)
        scenario.append_signal(textSignal)

    else:
        ### We know this person
        human_id= uuid_name['name']
        human = human_id.split("_t_")[0]
        response = f"Hi {human}. Nice to see you again :)"
        print(f"{agent}: {response}")
        textSignal = d_util.create_text_signal(scenario, response)
        scenario.append_signal(textSignal)


2021-11-01 12:21:44.314 INFO face_util - load_binary_image: ../../data/2021-11-01-12:21:39/image/273436.png image loaded!
2021-11-01 12:21:45.053 INFO face_util - run_face_api: got <Response [200]> from server!...


TypeError: scalar() argument 1 must be numpy.dtype, not _IDProxy

In [10]:
### First prompt
response = "How are you doing "+human
textSignal = d_util.create_text_signal(scenario, response)
scenario.append_signal(textSignal)

print(agent + ": " + response)

utterance = input("\n")
print(human + ": " + utterance)

while not (utterance.lower() == "stop" or utterance.lower() == "bye"):
    textSignal = d_util.create_text_signal(scenario, utterance)
    scenario.append_signal(textSignal)

    # @TODO: also annotate the textSignal
    # Apply some processing to the textSignal and add annotations
        
        
    ## We capture the image again
    if success:
        imageSignal = d_util.create_image_signal(scenario, imagepath)
        container_id = str(uuid.uuid4())

        #### Properties are now stored as annotations
        #### We do not store these proeprties again to the BRAIN
        for gender, age, bbox, name, faceprob in zip(
            genders, ages, bboxes, faces_detected, det_scores
        ):

            age = round(age["mean"])
            gender = "male" if gender["m"] > 0.5 else "female"
            bbox = [int(num) for num in bbox.tolist()]
        
        f_util.add_face_annotation(imageSignal,
                                       "front_camera",
                                        str(uuid.uuid4(), 
                                        current_time,
                                        bbox,
                                        human_id,
                                        human_name,
                                        age, 
                                        gender, 
                                        faceprob)
        scenario.append_signal(imageSignal)


    # Create the response from the system and store this as a new signal
    # We could use the throughts to respond
    # @TODO generate a response from the thoughts

    utterance = "So you what do you want to talk about " + human + "\n"
    response = utterance[::-1]
    print(agent + ": " + utterance)
    textSignal = d_util.create_text_signal(scenario, utterance)
    scenario.append_signal(textSignal)

    # Getting the next input signals
    utterance = input("\n")

    success, frame = camera.read()
    if success:
        current_time = str(datetime.now().microsecond)
        imagepath = imagefolder + "/" + current_time + ".png"
        cv2.imwrite(imagepath, frame)
        (
            genders,
            ages,
            bboxes,
            faces_detected,
            det_scores,
            embeddings,
        ) = f_util.do_stuff_with_image(imagepath)



SyntaxError: invalid syntax (<ipython-input-10-3d2621a03323>, line 44)

### Set the end time of the scenario, save it and stop the containers

After we stopped the interaction, we set the end time and save the scenario as EMISSOR data.

In [5]:
#scenario.scenario.end = datetime.now().microsecond
scenarioStorage.save_scenario(scenario)

In [6]:
### Stopping the docker containers
### This is only needed of you started them in this notebook

#f_util.kill_container(container_fdr)
#f_util.kill_container(container_ag)

In [8]:
#### Stop the camera when we are done
camera.release()

## End of notebook