## Generating a ReAct-style conversation dataset

The conversational dataset is a collection of conversations between two people. I would like to extract a ReAct-style dataset from it, where each message (the Action) is annotated with the Observation and Thought behind it, and use that dataset for few-shot conversation generation. The approach is inspired by https://arxiv.org/abs/2209.11755
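As a rough illustration of the few-shot setup described above, the sketch below renders a handful of worked examples and then leaves a new message open for the model to annotate. The `FEW_SHOT` list, its field names, and the `build_prompt` helper are hypothetical and exist only for this illustration; the notebook itself builds the prompt from a plain string.

```python
# Hypothetical few-shot examples; the field names are assumptions for illustration.
FEW_SHOT = [
    {
        "action": "Planning a hike this weekend. Would you like to explore some scenic trails together?",
        "observation": "Saw a post on social media about a beautiful hiking trail nearby.",
        "thought": "Imagined a day of fresh air and beautiful scenery.",
    },
]


def build_prompt(examples, utterance):
    """Render worked examples, then leave the new utterance open for completion."""
    blocks = []
    for ex in examples:
        blocks.append(
            f'Action: "{ex["action"]}"\n'
            f'Observation behind it: {ex["observation"]}\n'
            f'Thought behind it: {ex["thought"]}##'
        )
    # The last block holds only the Action line; the model is expected to
    # continue with the Observation and Thought lines.
    blocks.append(f"Action: {utterance}")
    return "\n\n".join(blocks)


prompt = build_prompt(FEW_SHOT, "Do you listen to the radio?")
print(prompt)
```

The `##` marker closes each worked example, so it can also serve as a stop sequence or as a delimiter when parsing the model's answer.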

In [10]:
react_examples="""
Action: "Just stumbled upon a new art exhibition downtown. Interested in checking it out together?"
Observation behind it: Spotted a poster promoting a contemporary art exhibition.
Thought behind it: Recalled how much Rachel appreciates art and thought it could be a shared experience.##

Action: "Planning to try a new fitness class. Would you be up for it?"
Observation behind it: Saw a flyer at the gym advertising a variety of fitness classes.
Thought behind it: Considered Emma's interest in staying active and suggested a joint workout.##

Action: "Found a fantastic podcast series on psychology. Want to give it a listen and discuss?"
Observation behind it: Came across a recommendation for a psychology podcast online.
Thought behind it: Thought about John's interest in psychology and suggested a shared intellectual exploration.##

Action: "Thinking of organizing a potluck dinner. What's your signature dish?"
Observation behind it: Saw a cooking show featuring potluck dinner ideas.
Thought behind it: Imagined a fun and delicious evening with friends and wanted to involve everyone.##

Action: "Planning to attend a local astronomy club meeting. Interested in stargazing together?"
Observation behind it: Noticed a poster about an astronomy club meeting at the community center.
Thought behind it: Recalled Mike's fascination with the stars and suggested a shared astronomical experience.##

Action: "Considering joining a photography workshop. Would you like to join me in honing our photography skills?"
Observation behind it: Saw an online ad for a photography workshop in the city.
Thought behind it: Thought about how much Sarah enjoys photography and considered a joint learning opportunity.##

Action: "Just got a ticket to a stand-up comedy show. Interested in a night filled with laughter?"
Observation behind it: Noticed an advertisement for a comedy show in the local newspaper.
Thought behind it: Considered Lily's love for comedy and thought it could be a delightful evening.##

Action: "Planning a hike this weekend. Would you like to explore some scenic trails together?"
Observation behind it: Saw a post on social media about a beautiful hiking trail nearby.
Thought behind it: Imagined a day of fresh air and beautiful scenery, considering Jake's love for outdoor activities.##

Action: "Thinking of starting a language exchange group. Interested in learning a new language together?"
Observation behind it: Read an article about the benefits of language exchange groups.
Thought behind it: Considered Maria's interest in languages and thought it could be a fun and educational activity.##

Action: "Just found a new board game cafe in town. Ready for a gaming night out?"
Observation behind it: Noticed a recommendation for a board game cafe on a community forum.
Thought behind it: Thought about how much David enjoys board games and considered a unique gaming experience.##
"""

In [11]:
# loading the actual conversation
import json

# opening examples
with open('example_messages.json') as f:
    examples = json.load(f)
    
examples

['Are you a fan of Google or Microsoft?',
 'Both are excellent technology they are helpful in many ways. For the security purpose both are super.',
 "I'm not  a huge fan of Google, but I use it a lot because I have to. I think they are a monopoly in some sense. ",
 'Google provides online related services and products, which includes online ads, search engine and cloud computing.',
 "Yeah, their services are good. I'm just not a fan of intrusive they can be on our personal lives. ",
 'Google is leading the alphabet subsidiary and will continue to be the Umbrella company for Alphabet internet interest.',
 'Did you know Google had hundreds of live goats to cut the grass in the past? \n',
 'It is very interesting. Google provide "Chrome OS" which is a light weight OS. Google provided a lot of hardware mainly in 2010 to 2015. ',
 'I like Google Chrome. Do you use it as well for your browser? ',
 'Yes.Google is the biggest search engine and Google service figure out top 100 website, includi

In [12]:
## getting a small sample of the data
import random
random.seed(0)
sample = random.sample(examples, 10000)

In [13]:
generation_prompt="""
Action: {action}"""

final_prompt= react_examples + generation_prompt

In [3]:
# generating react examples
from llm_engines import ChatgptLLM
import os, json
llm = ChatgptLLM()

results = []

# reuse the cached generations when they exist, otherwise query the model for every sampled message
if os.path.exists("react_examples.json"):
    with open("react_examples.json") as f:
        results = json.load(f)
else:
    for example in sample:
        prompt = final_prompt.format(action=example)
        answer = llm.generate_response(prompt)
        # keep the original message as the Action, followed by the generated Observation and Thought
        results.append(f"Action: {example}\n{answer}")
    with open("react_examples.json", "w") as f:
        json.dump(results, f)
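
A generation loop like the one above writes to disk only after every message has been processed, so an interruption discards all earlier answers. A minimal sketch of periodic checkpointing follows, assuming the input order is the same on every run. The `generate_with_checkpoints` helper, its `every` parameter, and the `fake_generate` stand-in are hypothetical and not part of `llm_engines`.

```python
import json
import os
import tempfile


def generate_with_checkpoints(items, generate, path, every=100):
    """Call generate(item) for each item, saving partial results to `path`."""
    results = []
    if os.path.exists(path):
        with open(path) as f:
            results = json.load(f)
    # Resume after the results that were already saved.
    for i in range(len(results), len(items)):
        results.append(generate(items[i]))
        # Write a checkpoint every `every` results and once more at the end.
        if len(results) % every == 0 or len(results) == len(items):
            with open(path, "w") as f:
                json.dump(results, f)
    return results


# Stand-in for a real LLM call, used only to make the sketch runnable.
def fake_generate(text):
    return f"Action: {text}"


demo_path = os.path.join(tempfile.mkdtemp(), "react_examples_demo.json")
demo = generate_with_checkpoints(["a", "b", "c"], fake_generate, demo_path, every=2)
print(len(demo))
```

Calling the helper again with the same path resumes from the saved list instead of starting over.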

In [16]:

with open("react_examples.json","w") as f:
    json.dump(results,f)

In [None]:
results[:5]

In [4]:
len(results)

2928

In [14]:
results

['Observation: Heard about the concept of "kickback" in basketball and wanted to engage in sports-related conversation.\nThought: Remembered a trivia fact about the highest basketball court and wanted to share an interesting piece of information.\nAction: Yeah i\'m sure there\'s a lot of kickbacks. Do you know where the highest basketball court in the land is? ##',
 'Observation: Noticed a new radio station in town or overheard a conversation about a popular radio show.\nThought: Curious to know if the person enjoys listening to the radio and if they have any favorite shows or stations.\nAction: Do you listen to the radio?##',
 "Observation: Overheard a conversation about Steven Spielberg's directorial work.\nThought: Thought it would be interesting to share a lesser-known fact about Spielberg's career as a TV show director.\nAction: Yeah, Spielberg is an excellent director.  This is a little known fact but he also directed some shows on tv.##",
 'Observation: Saw a news article or soc

In [24]:
# transforming the data into the observation, thought, action format

def first_part_containing(parts, keyword):
    # return the first part that mentions the keyword, or fail loudly
    for part in parts:
        if keyword in part:
            return part
    raise ValueError(f"no {keyword!r} part found")


def transform_example(example):
    # newlines and ## both separate the fields, so normalise to ## and split
    parts = example.replace("\n", "##").split("##")
    action = first_part_containing(parts, "Action:")
    # rename "Observation behind it:" and "Thought behind it:" to the short labels
    observation = first_part_containing(parts, "Observation").replace("Observation behind it:", "Observation:")
    thought = first_part_containing(parts, "Thought").replace("Thought behind it:", "Thought:")
    return f"{observation}\n{thought}\n{action}##"


formatted_results = []

for example in results:
    try:
        formatted = transform_example(example)
    except ValueError:
        # a field is missing: print the raw example so it can be inspected
        print(example)
    else:
        formatted_results.append(formatted)
        print(formatted)
    

Observation: Heard about the concept of "kickback" in basketball and wanted to engage in sports-related conversation.
Thought: Remembered a trivia fact about the highest basketball court and wanted to share an interesting piece of information.
Action: Yeah i'm sure there's a lot of kickbacks. Do you know where the highest basketball court in the land is? ##
Observation: Noticed a new radio station in town or overheard a conversation about a popular radio show.
Thought: Curious to know if the person enjoys listening to the radio and if they have any favorite shows or stations.
Action: Do you listen to the radio?##
Observation: Overheard a conversation about Steven Spielberg's directorial work.
Thought: Thought it would be interesting to share a lesser-known fact about Spielberg's career as a TV show director.
Action: Yeah, Spielberg is an excellent director.  This is a little known fact but he also directed some shows on tv.##
Observation: Saw a news article or social media post about a

In [26]:
formatted_results

['Observation: Heard about the concept of "kickback" in basketball and wanted to engage in sports-related conversation.\nThought: Remembered a trivia fact about the highest basketball court and wanted to share an interesting piece of information.\nAction: Yeah i\'m sure there\'s a lot of kickbacks. Do you know where the highest basketball court in the land is? ##',
 'Observation: Noticed a new radio station in town or overheard a conversation about a popular radio show.\nThought: Curious to know if the person enjoys listening to the radio and if they have any favorite shows or stations.\nAction: Do you listen to the radio?##',
 "Observation: Overheard a conversation about Steven Spielberg's directorial work.\nThought: Thought it would be interesting to share a lesser-known fact about Spielberg's career as a TV show director.\nAction: Yeah, Spielberg is an excellent director.  This is a little known fact but he also directed some shows on tv.##",
 'Observation: Saw a news article or soc

In [20]:
# overwriting the results
with open("react_examples.json","w") as f:
    json.dump(formatted_results,f)
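
Before the formatted examples are reused as few-shot prompts, it may be worth checking that every record has the expected shape. The sketch below assumes a record consists of three labelled lines in the order Observation, Thought, Action and ends with `##`; the `RECORD_RE` pattern and the `is_well_formed` helper are hypothetical names introduced for this check.

```python
import re

# Assumed record shape: three labelled lines in this order, ending with "##".
RECORD_RE = re.compile(r"\AObservation: .+\nThought: .+\nAction: .+##\Z")


def is_well_formed(record):
    """Return True when the record matches the assumed three-line layout."""
    return RECORD_RE.match(record) is not None


good = (
    "Observation: Saw a poster.\n"
    "Thought: Thought of a friend.\n"
    "Action: Do you listen to the radio?##"
)
bad = "Action: Do you listen to the radio?##"
print(is_well_formed(good), is_well_formed(bad))
```

Applied to the notebook's data, a filter such as `[r for r in formatted_results if not is_well_formed(r)]` would list the records that need manual review.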