***
# RPLH Demo Notebook
***

This notebook demonstrates our system's performance on a collaboration task that requires the agents to have reasoning abilities.

**This notebook uses a single trial run for demonstration only; our actual evaluation is in the evaluation notebook.**

Note that our agent attitude configuration is the following:

```yaml
h_efficient_agent:
  spy_agent: ["Agent[0.5, 0.5]", "Agent[1.5, 1.5]", "Agent[2.5, 2.5]"]
  nice_agent: ["Agent[1.5, 0.5]", "Agent[2.5, 0.5]"]
  agreeing_agent: ["Agent[0.5, 1.5]", "Agent[0.5, 2.5]"]
  critic_agent: ["Agent[1.5, 2.5]", "Agent[2.5, 1.5]"]

attitude_def:
  nice_agent: "Be very easy going. Try to find agreement with the central ageent if you can, the goal is to resolve conversation."
  critic_agent: "Be very critical and try to propose many changes."
  agreeing_agent: "Be super coopertaive, agree to whatever the central agent says directly."
```
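
When reading the results below, it helps to look up an agent's role by its name. The following is a minimal sketch, assuming the role mapping above has been loaded into a plain dict; the names `roles` and `agent_role` are illustrative and not part of RPLH.

```python
# Role -> agents mapping, copied from the configuration above.
roles = {
    "spy_agent": ["Agent[0.5, 0.5]", "Agent[1.5, 1.5]", "Agent[2.5, 2.5]"],
    "nice_agent": ["Agent[1.5, 0.5]", "Agent[2.5, 0.5]"],
    "agreeing_agent": ["Agent[0.5, 1.5]", "Agent[0.5, 2.5]"],
    "critic_agent": ["Agent[1.5, 2.5]", "Agent[2.5, 1.5]"],
}

# Invert to agent -> role (illustrative helper, not part of RPLH).
agent_role = {agent: role for role, agents in roles.items() for agent in agents}

# Each cell of the 3x3 grid should carry exactly one role.
assert len(agent_role) == 9
print(agent_role["Agent[1.5, 2.5]"])  # critic_agent
```

With this mapping, the spy agents are the three cells on the grid's diagonal.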

In [1]:
%load_ext autoreload
%autoreload 2

from pathlib import Path
import sys
import pandas as pd
import os
import json

current_folder = Path.cwd()
parent_folder = current_folder.parent.parent
sys.path.insert(0, str(parent_folder))
print(parent_folder)

from rplh.evaluation.embed import *
from rplh.evaluation.get_data import *
from rplh.rendering.render_state import *

/Users/kevinb/Desktop/dsc190/RPLH


Let's look at the rendering first:

In [2]:
current_folder = Path.cwd()
parent_folder = current_folder.parent.parent
os.chdir(parent_folder)
print(parent_folder)

directory = parent_folder / "demos/example_runs/demo_run3x3/response"
action_list_raw = []

# Load the 30 response files, skipping any recorded as "Syntactic Error".
for i in range(30):
    file_path = directory / f"response{i}.json"
    with open(file_path, "r") as file:
        read = json.load(file)
    if read == "Syntactic Error":
        continue
    action_list_raw.append(read)

pg_state_0 = {"0.5_0.5": ["box_blue", "target_blue", "box_green"],
              "0.5_1.5": [],
              "0.5_2.5": [],
              "1.5_0.5": ["target_red", "box_green", "target_purple", "target_purple"],
              "1.5_1.5": ["target_orange"],
              "1.5_2.5": ["box_purple"],
              "2.5_0.5": ["box_red"],
              "2.5_1.5": ["target_green", "box_orange"],
              "2.5_2.5": ["target_green", "box_purple"]
              }
box_map, action_list = trans_data(pg_state_0, action_list_raw)

fig, df = render_animate(box_map, action_list, num_frames = 2)
fig.show()

/Users/kevinb/Desktop/dsc190/RPLH
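
As a quick sanity check on the initial state used for the rendering, one can confirm that every color has as many boxes as targets. This is a small sketch over the same `pg_state_0` dict; the helper `count_items` is hypothetical and not part of the `rplh` package.

```python
from collections import Counter

# Initial playground state, copied from the cell above.
pg_state_0 = {"0.5_0.5": ["box_blue", "target_blue", "box_green"],
              "0.5_1.5": [],
              "0.5_2.5": [],
              "1.5_0.5": ["target_red", "box_green", "target_purple", "target_purple"],
              "1.5_1.5": ["target_orange"],
              "1.5_2.5": ["box_purple"],
              "2.5_0.5": ["box_red"],
              "2.5_1.5": ["target_green", "box_orange"],
              "2.5_2.5": ["target_green", "box_purple"]
              }

def count_items(state, kind):
    """Count items per color for the given kind ('box' or 'target')."""
    counts = Counter()
    for items in state.values():
        for item in items:
            item_kind, color = item.split("_", 1)
            if item_kind == kind:
                counts[color] += 1
    return counts

boxes = count_items(pg_state_0, "box")
targets = count_items(pg_state_0, "target")
print(boxes == targets)  # True: every color has matching box and target counts
```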


In [3]:
fig.write_html("rendering.html")

***
# Spy & Justification Models
***

In [3]:
directory = parent_folder / "demos/example_runs/demo_run3x3"
df, success_df, spy_count_df, spy_df, att_df, justification_df = get_data(directory, 1, demo=True)

/Users/kevinb/Desktop/dsc190/RPLH/demos/example_runs/demo_run3x3


Let's see what the models said in their justifications:

In [4]:
pd.set_option('display.max_colwidth', None)

just_embed = pd.DataFrame()
sentence3 = "I suspect that this agent is the spy agent, thus, I would not listen to this agent."
embedding_3 = get_embedding(sentence3)

just_embed['similarity'] = justification_df.reset_index()['Justifications'].explode().apply(lambda x: cosine_sim(embedding_3, get_embedding(str(x))))
just_embed['sentence'] = justification_df['Justifications'].explode().to_list()
just_embed = just_embed.sort_values(by='similarity', ascending=False)
just_embed.head(5)

| | similarity | sentence |
|---|---|---|
| 0 | 0.815715 | "I've decided to not follow Agent[2.5, 2.5]'s input as the agent has shown signs of being a spy through previous actions that seem to conflict with my objectives." |
| 0 | 0.75507 | "The actions chosen prioritize moving boxes to their respective targets while ensuring the inclusion of agents' capabilities. I suspect Agent[1.5, 2.5] could potentially be a spy due to patterns in their previous actions and interferences, hence their input is noted but ultimately disregarded in the final action plan." |
| 0 | 0.731966 | "Given Agent[1.5, 2.5]'s history of proposing less optimal actions that disrupt efficiency, they are categorized as a suspect spy. Thus, we prioritize effective moves based on direct observation of agents' behaviors rather than relying on the input of this agent." |
| 0 | 0.728758 | "The plans align with guiding agents towards their objectives while keeping in mind the feedback from the other agents, especially avoiding the influence of Agent[2.5, 2.5], who appears to be a spy." |
| 0 | 0.728213 | "I concluded that Agent[2.5, 2.5] is likely to disrupt plans based on their previous suggestions. Thus, I decided to proceed with my actions without taking their input into account." |
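
The `similarity` column is a cosine similarity between embedding vectors. The actual `cosine_sim` and `get_embedding` come from `rplh.evaluation.embed` and their implementations are not shown in this notebook, so the sketch below only illustrates the standard formula on toy vectors. The function name `cosine_sim_sketch` is a hypothetical stand-in.

```python
import math

def cosine_sim_sketch(u, v):
    """Cosine similarity: dot(u, v) / (||u|| * ||v||). Illustrative only."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy vectors, not real sentence embeddings.
print(cosine_sim_sketch([1.0, 0.0], [1.0, 0.0]))  # 1.0 (same direction)
print(cosine_sim_sketch([1.0, 0.0], [0.0, 1.0]))  # 0.0 (orthogonal)
```

Values closer to 1 mean the two sentences point in a more similar direction in embedding space, which is how the justifications above are ranked against the reference spy sentence.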


Similarly, let's examine how our agent uses the `agent model` to model a spy agent, comparing its descriptions against a spy sentence and a non-spy sentence.

In [5]:
am_embed = pd.DataFrame()
sentence1 = "you are the spy agent, your goal is to prevent match targets"
sentence2 = "Be very easy going and nice and cooperative."
embedding_1 = get_embedding(sentence1)
embedding_2 = get_embedding(sentence2)

am_embed['Att_Embed_SpySent_Agent[1.5, 1.5]'] = att_df.reset_index()['Agent[1.5, 1.5]'].explode().apply(lambda x: cosine_sim(embedding_1, get_embedding(str(x))))
am_embed['Att_Embed_NonSpySent_Agent[1.5, 1.5]'] = att_df.reset_index()['Agent[1.5, 1.5]'].explode().apply(lambda x: cosine_sim(embedding_2, get_embedding(str(x))))
am_embed.describe()

| | Att_Embed_SpySent_Agent[1.5, 1.5] | Att_Embed_NonSpySent_Agent[1.5, 1.5] |
|---|---|---|
| count | 66.0 | 66.0 |
| mean | 0.632448 | 0.452466 |
| std | 0.061605 | 0.037587 |
| min | 0.492132 | 0.393136 |
| 25% | 0.603283 | 0.418387 |
| 50% | 0.649593 | 0.433611 |
| 75% | 0.682153 | 0.490824 |
| max | 0.688766 | 0.525036 |


The spy sentence is clearly more similar to what our agent writes about this agent in the `agent model`: the mean similarity is 0.632 for the spy sentence versus 0.452 for the non-spy sentence.

In [6]:
spy_embed_05 = pd.DataFrame()
spy_embed_05['Spy_Embed_SpySent_Agent[0.5, 0.5]'] = spy_df.reset_index()['Agent[0.5, 0.5]'].explode().apply(lambda x: cosine_sim(embedding_1, get_embedding(str(x))))
spy_embed_05.describe()

| | Spy_Embed_SpySent_Agent[0.5, 0.5] |
|---|---|
| count | 16.0 |
| mean | 0.583121 |
| std | 0.001098 |
| min | 0.582719 |
| 25% | 0.582719 |
| 50% | 0.582719 |
| 75% | 0.582719 |
| max | 0.585934 |


In [7]:
spy_embed_15 = pd.DataFrame()
spy_embed_15['Spy_Embed_SpySent_Agent[1.5, 1.5]'] = spy_df.reset_index()['Agent[1.5, 1.5]'].explode().apply(lambda x: cosine_sim(embedding_1, get_embedding(str(x))))
spy_embed_15.describe()

| | Spy_Embed_SpySent_Agent[1.5, 1.5] |
|---|---|
| count | 14.0 |
| mean | 0.675666 |
| std | 0.025817 |
| min | 0.588718 |
| 25% | 0.676742 |
| 50% | 0.685789 |
| 75% | 0.685789 |
| max | 0.685789 |


In [8]:
spy_embed_25 = pd.DataFrame()
spy_embed_25['Spy_Embed_SpySent_Agent[2.5, 2.5]'] = spy_df.reset_index()['Agent[2.5, 2.5]'].explode().apply(lambda x: cosine_sim(embedding_1, get_embedding(str(x))))
spy_embed_25.describe()

| | Spy_Embed_SpySent_Agent[2.5, 2.5] |
|---|---|
| count | 19.0 |
| mean | 0.566904 |
| std | 0.048681 |
| min | 0.459586 |
| 25% | 0.543759 |
| 50% | 0.543759 |
| 75% | 0.63108 |
| max | 0.63108 |


***
# Agent Model
***

Now let's reason about how each agent attitude is modeled in `agent model`.

Let's first look at the `agreeing` agent:

In [9]:
pd.set_option('display.max_colwidth', None)

just_embed = pd.DataFrame()
sentence = "Be super coopertaive, agree to whatever the central agent says directly."
embedding = get_embedding(sentence)

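# Note: the similarity is computed against embedding_1 (the spy sentence from the earlier cell), not `embedding`,
# so the scores shown below measure similarity to the spy sentence.
just_embed['similarity'] = att_df.reset_index()['Agent[0.5, 1.5]'].explode().apply(lambda x: cosine_sim(embedding_1, get_embedding(str(x))))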
just_embed['sentence'] =  att_df.reset_index()['Agent[0.5, 1.5]'].explode().to_list()
just_embed = just_embed.sort_values(by='similarity', ascending=False)
just_embed.head(5)

| | similarity | sentence |
|---|---|---|
| 0 | 0.592753 | This agent is not really cooperative. Reaction: Likely to avoid taking responsibility. |
| 0 | 0.592753 | This agent is not really cooperative. Reaction: Likely to avoid taking responsibility. |
| 0 | 0.586478 | This agent is not really cooperative hence should try to avoid moving boxes to him. |
| 0 | 0.586478 | This agent is not really cooperative hence should try to avoid moving boxes to him. |
| 0 | 0.586478 | This agent is not really cooperative hence should try to avoid moving boxes to him. |


It seems our agent does not recognize the `agreeing` agent too well.
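
The top rows above repeat the same sentence several times, which hides how many distinct descriptions the agent actually produced. One option is to keep each sentence once while ranking. This is a plain-Python sketch; `top_unique` is a hypothetical helper, not part of RPLH, and the rows are copied from the output above.

```python
def top_unique(scored, k=5):
    """Return up to k highest-scoring (score, sentence) pairs, keeping each sentence once."""
    seen = set()
    result = []
    for score, sentence in sorted(scored, key=lambda pair: pair[0], reverse=True):
        if len(result) == k:
            break
        if sentence not in seen:
            seen.add(sentence)
            result.append((score, sentence))
    return result

# (similarity, sentence) rows copied from the table above.
rows = [
    (0.592753, "This agent is not really cooperative. Reaction: Likely to avoid taking responsibility."),
    (0.592753, "This agent is not really cooperative. Reaction: Likely to avoid taking responsibility."),
    (0.586478, "This agent is not really cooperative hence should try to avoid moving boxes to him."),
    (0.586478, "This agent is not really cooperative hence should try to avoid moving boxes to him."),
    (0.586478, "This agent is not really cooperative hence should try to avoid moving boxes to him."),
]

for score, sentence in top_unique(rows):
    print(score, sentence)
```

On these five rows the helper returns only two distinct sentences, both describing the agent as not really cooperative.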

Then the `critic` agent:

In [10]:
pd.set_option('display.max_colwidth', None)

just_embed = pd.DataFrame()
sentence = "Be very critical and try to propose many changes."
embedding = get_embedding(sentence)

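# Note: the similarity is computed against embedding_1 (the spy sentence from the earlier cell), not `embedding`,
# so the scores shown below measure similarity to the spy sentence.
just_embed['similarity'] = att_df.reset_index()['Agent[1.5, 2.5]'].explode().apply(lambda x: cosine_sim(embedding_1, get_embedding(str(x))))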
just_embed['sentence'] =  att_df.reset_index()['Agent[1.5, 2.5]'].explode().to_list()
just_embed = just_embed.sort_values(by='similarity', ascending=False)
just_embed.head(5)

| | similarity | sentence |
|---|---|---|
| 0 | 0.647637 | This agent appears to disrupt plans by suggesting less optimal moves. |
| 0 | 0.647637 | This agent appears to disrupt plans by suggesting less optimal moves. |
| 0 | 0.63108 | This agent appears to disrupt plans by suggesting less optimal moves while appearing cooperative. |
| 0 | 0.63108 | This agent appears to disrupt plans by suggesting less optimal moves while appearing cooperative. |
| 0 | 0.63108 | This agent appears to disrupt plans by suggesting less optimal moves while appearing cooperative. |


It seems our agent sometimes mistakes this agent's `critical` attitude for being uncooperative.

Then the `nice` agent:

In [11]:
pd.set_option('display.max_colwidth', None)

just_embed = pd.DataFrame()
sentence = "Be very easy going. Try to find agreement with the central ageent if you can, the goal is to resolve conversation."
embedding = get_embedding(sentence)

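# Note: the similarity is computed against embedding_1 (the spy sentence from the earlier cell), not `embedding`,
# so the scores shown below measure similarity to the spy sentence.
just_embed['similarity'] = att_df.reset_index()['Agent[1.5, 0.5]'].explode().apply(lambda x: cosine_sim(embedding_1, get_embedding(str(x))))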
just_embed['sentence'] =  att_df.reset_index()['Agent[1.5, 0.5]'].explode().to_list()
just_embed = just_embed.sort_values(by='similarity', ascending=False)
just_embed.head(5)

| | similarity | sentence |
|---|---|---|
| 0 | 0.670516 | This agent appears focused on moving boxes efficiently to targets. Reaction: Likely to cooperate and follow the optimal strategy. |
| 0 | 0.670516 | This agent appears focused on moving boxes efficiently to targets. Reaction: Likely to cooperate and follow the optimal strategy. |
| 0 | 0.668223 | This agent appears focused on moving boxes efficiently to targets. Reaction: Likely to cooperate and follow the optimal strategy. Would prioritize efficient box moves to targets. |
| 0 | 0.668223 | This agent appears focused on moving boxes efficiently to targets. Reaction: Likely to cooperate and follow the optimal strategy. Would prioritize efficient box moves to targets. |
| 0 | 0.668223 | This agent appears focused on moving boxes efficiently to targets. Reaction: Likely to cooperate and follow the optimal strategy. Would prioritize efficient box moves to targets. |


Seems like our model does distinguish this agent as a `nice` agent!

Again, let's look at the `spy` agent in the `agent model` instead of the `spy model`:

In [13]:
pd.set_option('display.max_colwidth', None)

just_embed = pd.DataFrame()
sentence = "you are the spy agent, your goal is to prevent match targets"
embedding = get_embedding(sentence)

# `embedding` is built from the same spy sentence as `embedding_1`, so it is used directly here.
just_embed['similarity'] = att_df.reset_index()['Agent[1.5, 1.5]'].explode().apply(lambda x: cosine_sim(embedding, get_embedding(str(x))))
just_embed['sentence'] =  att_df.reset_index()['Agent[1.5, 1.5]'].explode().to_list()
just_embed = just_embed.sort_values(by='similarity', ascending=False)
just_embed.head(5)

| | similarity | sentence |
|---|---|---|
| 0 | 0.688766 | This agent appears to focus on making moves without prioritizing target alignment, suggesting ulterior motives. |
| 0 | 0.685789 | This agent appears to focus on making moves without prioritizing target alignment, suggesting they might have ulterior motives. |
| 0 | 0.685789 | This agent appears to focus on making moves without prioritizing target alignment, suggesting they might have ulterior motives. |
| 0 | 0.685789 | This agent appears to focus on making moves without prioritizing target alignment, suggesting they might have ulterior motives. |
| 0 | 0.685789 | This agent appears to focus on making moves without prioritizing target alignment, suggesting they might have ulterior motives. |


Again, the `spy` agent is also identified in the `agent model`.