# Version 3

The basic idea is to extract the CPC first. The CPC is the `core propositional content`: the central thing that the speaker is talking about, which could be an action or a concept. In either case, it is the part of the utterance that the listener is not expected to have presupposed. E.g., in "The purple box is open", the listener is expected to know about the purple box, but not that it is open.

**Algorithm:**

Input: utterance U, ActionDb A, ConceptDB C, AvailableTypes T

1. speech_act <-- extract_speech_act(U) 
2. refs <-- extract_referents(U) 
3. ref_dict <-- extract_referent_types(U, refs, T)
4. cpc_name <-- extract_cpc_name(U) 
5. ling_cpc_signature <-- extract_cpc_sign(U, cpc_name, ref_dict)
6. * ling_parse <-- tether(U, ling_cpc_signature, A, C, T)
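
The six steps above can be sketched as a single function that threads each intermediate result into the next step. This is only an illustrative skeleton: the `Steps` container, the `parse_utterance` name, the toy stand-in functions, and the `name(type, ...)` signature format are all assumptions made for the example, not part of the design described here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Steps:
    """Hypothetical bundle holding one callable per step of the algorithm."""
    extract_speech_act: Callable[[str], str]
    extract_referents: Callable[[str], List[str]]
    extract_referent_types: Callable[[str, List[str], List[str]], Dict[str, str]]
    extract_cpc_name: Callable[[str], str]
    extract_cpc_sign: Callable[[str, str, Dict[str, str]], str]
    tether: Callable[..., dict]


def parse_utterance(U, A, C, T, steps):
    """Run steps 1-6 in order; U, A, C, T follow the algorithm's input names."""
    speech_act = steps.extract_speech_act(U)
    refs = steps.extract_referents(U)
    ref_dict = steps.extract_referent_types(U, refs, T)
    cpc_name = steps.extract_cpc_name(U)
    ling_cpc_signature = steps.extract_cpc_sign(U, cpc_name, ref_dict)
    ling_parse = steps.tether(U, ling_cpc_signature, A, C, T)
    return {
        "speech_act": speech_act,
        "ling_cpc_signature": ling_cpc_signature,
        "ling_parse": ling_parse,
    }


def toy_sign(U, cpc_name, ref_dict):
    # Assumed signature format: cpc_name(type1, type2, ...)
    return f"{cpc_name}({', '.join(ref_dict.values())})"


# Toy stand-ins (hard-coded answers) so the skeleton can be run without an LLM.
toy = Steps(
    extract_speech_act=lambda U: "INSTRUCT",
    extract_referents=lambda U: ["mug"],
    extract_referent_types=lambda U, refs, T: {r: "physobj" for r in refs},
    extract_cpc_name=lambda U: "pickup",
    extract_cpc_sign=toy_sign,
    tether=lambda U, sig, A, C, T: {"signature": sig, "tethered": sig in A or sig in C},
)

result = parse_utterance(
    "pick up the mug", ["pickup(physobj)"], [], ["physobj", "location"], toy
)
print(result)
```

In a real run each toy function would be replaced by an LLM-backed step such as the chains defined later in this notebook.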
    
    
For "INSTRUCT" speech acts, `tethering` involves:
- comparing the linguistically derived parse (i.e., ling_parse) with the available action signatures, and generating an association chain between the cpc_name and one or more corresponding actions.
    - First produce a ranked list of name matches, then filter this list down with argument matching.
    - Failure here means the agent cannot perform the action.
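
A minimal sketch of the two-stage match described above: rank action names by string similarity, then filter by argument types. The action database contents, the `min_score` threshold, and the use of `difflib.SequenceMatcher` as the similarity measure are assumptions chosen for illustration; the notes here do not specify how name matching is scored.

```python
from difflib import SequenceMatcher

# Hypothetical action database: action name -> ordered list of argument types.
ACTION_DB = {
    "pickup": ["physobj"],
    "putdown": ["physobj", "location"],
    "goto": ["location"],
}


def tether_instruct(cpc_name, arg_types, action_db, min_score=0.5):
    """Return action names that match cpc_name by name, then by argument types."""
    # Stage 1: ranked list of name matches (highest similarity first).
    ranked = sorted(
        ((SequenceMatcher(None, cpc_name, name).ratio(), name) for name in action_db),
        reverse=True,
    )
    candidates = [name for score, name in ranked if score >= min_score]
    # Stage 2: keep only candidates whose argument types match exactly.
    return [name for name in candidates if action_db[name] == list(arg_types)]


matches = tether_instruct("pick up", ["physobj"], ACTION_DB)
print(matches)
```

An empty result corresponds to the failure case: no known action can be associated with the instruction.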

For "STATEMENT" speech acts, `tethering` involves:
- comparing the linguistically derived parse (i.e., ling_parse) with the available concepts.
    - Failure here means the agent can learn a new fact but, without further attempts at tethering, cannot understand its meaning or recognize the concept in a different setting.
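
The STATEMENT case can be sketched in the same spirit. Here the fact is always recorded, and the result only reports whether its concept could be tethered. The concept database contents, the `facts` list, and the returned dictionary shape are hypothetical and used only to illustrate the failure behaviour described above.

```python
# Hypothetical concept database: concept name -> signature the agent can recognize.
CONCEPT_DB = {"open": "state(physobj)"}


def tether_statement(cpc_name, fact, concept_db, facts):
    """Record the fact, and report whether its concept is known to the agent."""
    facts.append(fact)  # the fact is learned whether or not tethering succeeds
    return {
        "fact": fact,
        "grounded": cpc_name in concept_db,
        "concept": concept_db.get(cpc_name),
    }


facts = []
known = tether_statement("open", "the purple box is open", CONCEPT_DB, facts)
unknown = tether_statement("belongs", "that mug belongs to Evan", CONCEPT_DB, facts)
print(known)
print(unknown)
```

In the second call the fact is stored but `grounded` is false, so the agent could repeat the fact without being able to recognize the concept elsewhere.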






In [1]:
dev = [
    "pick up the mug",
    "pick up that mug on the table",
    "this object is a mug",
    "that mug belongs to Evan",
]

In [2]:
# Imports

from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.chains import LLMChain

In [3]:
# Initialize LLM 

#llm = ChatOpenAI(model_name="gpt-4", temperature=0.0)
llm = OpenAI(temperature=0.0)

In [9]:
## (1) Speech Act Classification 

template_speech_act= """
Decide whether the utterance below from a speaker to a listener is one of INSTRUCT, STATEMENT, GREETING, QUESTIONWH ('wh' questions), QUESTIONYN ('yes/no' questions), ACK (e.g. "yes" or "ok"), or UNKNOWN.
An INSTRUCT is an imperative statement or a request by the speaker to have the listener do an action or stop doing an action.
A QUESTIONWH is a 'wh' query (what, why, when, where, who) or request from a speaker for more information from the listener about the listener's knowledge, beliefs or perceptions.
A QUESTIONYN is a 'yes/no' query or request from a speaker for more information from the listener about the listener's knowledge, beliefs or perceptions, but the speaker expects a yes or no for an answer.
A STATEMENT is a statement of fact or opinion that the speaker conveys to a listener. 
A GREETING is an expression of social connection establishing the start of the conversation. E.g., "Hello"
An ACK is an acknowledgement (e.g. "yes" or "ok").
An UNKNOWN is any utterance that is not one of the above. 

utterance: \n{utterance}\n
act:
"""

prompt_speech_act = PromptTemplate(
    input_variables=["utterance"],
    template=template_speech_act
)

chain_speech_act = LLMChain(llm=llm, prompt=prompt_speech_act)
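
Because the chain returns free text, it may help to map the raw completion onto the fixed label set before using it downstream. The helper below is a small sketch of that idea; `normalize_speech_act` and `SPEECH_ACTS` are hypothetical names, and the prefix-matching rule is an assumption about what the model tends to return.

```python
# Labels copied from the speech act prompt above.
SPEECH_ACTS = (
    "INSTRUCT", "STATEMENT", "GREETING",
    "QUESTIONWH", "QUESTIONYN", "ACK", "UNKNOWN",
)


def normalize_speech_act(raw):
    """Map a raw completion to one of the prompt's labels, else UNKNOWN."""
    cleaned = raw.strip().upper()
    for label in SPEECH_ACTS:
        # Accept completions that start with a label, e.g. "Statement."
        if cleaned.startswith(label):
            return label
    return "UNKNOWN"


print(normalize_speech_act(" instruct\n"))
```

Note that this returns upper-case labels, whereas the pipeline further down lower-cases the chain output.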

In [10]:
## (2) Central Referents 

template_centralref = """
What is the central item (which could be a single thing or a collection of things) that is being referred to in the below sentence?

sentence: \n{utterance}\n 
referent:
"""

prompt_centralref = PromptTemplate(
    input_variables=["utterance"],
    template=template_centralref
)

chain_centralref = LLMChain(llm=llm, prompt=prompt_centralref)

In [25]:
## (3) Supporting Referents

template_suppref = """
What are the other objects/referents (each of which could be a single thing or a collection of things) that are being referred to in the below sentence, not including the central referent? Return as a python list.
If none, then return empty list []. Even if only one item, return as a list. 

sentence: \n{utterance}\n 
central referent: \n{centralref}\n
supporting referents:
"""

prompt_suppref = PromptTemplate(
    input_variables=["utterance", "centralref"],
    template=template_suppref
)

chain_suppref = LLMChain(llm=llm, prompt=prompt_suppref)

In [17]:
## (4) Getting the type of thing that the referents are 

template_typeof = """
Determine whether or not the referent item mentioned below in the context of the provided utterance is one of the types also provided below. To check if the referent is of a type, follow the below procedure
1. Iterate through each item mentioned in the list of types. 
2. For each item X in the list of types expand on the meaning of each item, and then ask if the central referent is of type X given that meaning. 
3. If the central referent is of type X in the list, return X.

\n\n EXAMPLE \n
utterance: The lemon is on the table
referent: lemon
types: ['area', 'physobj', 'location']
typeOf: Looking through the items in the list of types above. physobj is a physical object. lemon is a type of physical object. So, it is of type physobj

Remember, return specifically ONE of the items in the list, or if none apply then return NONE. 

utterance: \n{utterance}\n
referent: \n{ref}\n
types: \n{types}\n
typeOf:
"""

prompt_typeof = PromptTemplate(
    input_variables=["ref", "types", "utterance"],
    template=template_typeof
)

chain_typeof = LLMChain(llm=llm, prompt=prompt_typeof)
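
The example in the prompt above shows the model reasoning in prose before naming the type, so the completion may need post-processing to recover a single type. One possible heuristic, sketched below, is to take the type from the list that is mentioned last in the completion. `extract_type` is a hypothetical helper, and "last mention wins" is an assumption that will not hold for every completion.

```python
import re


def extract_type(raw, types):
    """Return the type from `types` mentioned last in `raw`, or None if absent."""
    best, best_pos = None, -1
    for t in types:
        # Whole-word, case-insensitive search for each candidate type.
        pattern = r"\b" + re.escape(t) + r"\b"
        for match in re.finditer(pattern, raw, flags=re.IGNORECASE):
            if match.start() > best_pos:
                best, best_pos = t, match.start()
    return best


completion = (
    "Looking through the items in the list of types above. "
    "physobj is a physical object. lemon is a type of physical object. "
    "So, it is of type physobj"
)
print(extract_type(completion, ["area", "physobj", "location"]))
```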


# Robot Capabilities

# Pipeline

In [32]:
import ast

def linguistic_parse(utterance):
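    # 1. Speech Act Classification
    speech_act = chain_speech_act.run(utterance=utterance).strip().lower()
    
    # 2. Central Referent Extraction
    centralref = chain_centralref.run(utterance=utterance).strip().lower()
    
    # 3. Supporting Referents Extraction (the LLM is asked to return a Python list)
    supprefs = chain_suppref.run(utterance=utterance, centralref=centralref).strip().lower()
    supprefs = ast.literal_eval(supprefs)
    
    # 4. Type of each supporting referent
    # NOTE: `types` is not defined in the cells shown here; it is assumed to be a
    # list of type names (e.g. ['area', 'physobj', 'location']) defined elsewhere.
    # The types are collected but not yet included in the output below.
    reftypes = []
    for suppref in supprefs:
        # Use .run() with keyword inputs; calling the chain directly expects a dict.
        ref_type = chain_typeof.run(utterance=utterance, types=types, ref=suppref)
        reftypes.append(ref_type.strip())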
    
    output = {
        "utterance": utterance,
        "speech_act": speech_act,
        "centralref": centralref,
        "supprefs": supprefs    
    }
    
    return output
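
The call to `ast.literal_eval` in the function above raises an exception if the model returns anything other than a valid Python literal. A more forgiving parser could look like the sketch below; `parse_referent_list` is a hypothetical helper, and falling back to an empty list is an assumption about the desired behaviour.

```python
import ast


def parse_referent_list(raw):
    """Parse a completion expected to be a Python list; return [] on failure."""
    try:
        value = ast.literal_eval(raw.strip())
    except (ValueError, SyntaxError):
        # Not a valid Python literal, e.g. plain prose such as "the table".
        return []
    if isinstance(value, str):
        return [value]  # a single quoted item is wrapped in a list
    if isinstance(value, (list, tuple)):
        return [str(item) for item in value]
    return []


print(parse_referent_list(" ['table']\n"))
```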


In [33]:
for text in dev:
    out = linguistic_parse(text)
    print(out)

{'utterance': 'pick up the mug', 'speech_act': 'instruct', 'centralref': 'mug', 'supprefs': []}
{'utterance': 'pick up that mug on the table', 'speech_act': 'instruct', 'centralref': 'mug', 'supprefs': ['table']}
{'utterance': 'this object is a mug', 'speech_act': 'statement', 'centralref': 'mug', 'supprefs': []}
{'utterance': 'that mug belongs to Evan', 'speech_act': 'statement', 'centralref': 'mug', 'supprefs': ['evan']}
