# Chain of Thought prompting and GPT-4 prompting

In this notebook, I try to improve the data augmentation results of GPT-3.5 with chain-of-thought (CoT) prompting and compare them to GPT-4.

In [14]:
import pandas as pd
import openai
import sys
import os
import re

sys.path.append(os.path.dirname(os.getcwd()))
import config
openai.api_key = config.OPENAI_API_KEY

Read the data and draw two samples from each level.

In [6]:
egp = pd.read_csv('../dat/egponline.csv')
egp_samples = egp.groupby('Level', group_keys=False).apply(lambda x: x.sample(2))
egp_samples['Example'] = egp_samples['Example'].str.replace(r"\(.*\)", "", regex=True).str.strip()

In [7]:
egp_samples.head(20)

Unnamed: 0,#,SuperCategory,SubCategory,Level,Lexical Range,guideword,Can-do statement,Example
923,924,PRONOUNS,subject/ object,A1,,FORM: (OBJECT) WITH PREPOSITION,"Can use the object pronouns 'me', 'you', 'him'...",I really like to learn new words. It's very im...
924,925,PRONOUNS,subject/ object,A1,3.0,FORM: (SUBJECT) 'IT' FOR FIRST PERSON,Can use the pronoun 'it' before 'be' to refer ...,"Hello Mrs Bishop. It's Clarisse \n\nDear Cris,..."
334,335,DETERMINERS,quantity,A2,2.0,FORM: WITH SINGULAR NOUNS,Can use an increasing range of quantifying det...,The new art class starts next Monday and each ...
242,243,CLAUSES,subordinated,A2,,FORM/USE: PURPOSE,Can use a non-finite subordinate clause with '...,To get to my house you'll have to take the 5th...
103,104,ADVERBS,adverbs as modifiers,B1,2.0,USE: STANCE,Can use an increasing range of adverbs ('compl...,I am completely sure. \n\nIt's obviously much ...
413,414,FUTURE,present continuous for future use,B1,2.0,USE: FUTURE ARRANGEMENTS,Can use the present continuous with an increas...,I’m attending my grandmother’s funeral tomorro...
503,504,MODALITY,may,B2,2.0,FORM: WITH ADVERBS,Can use 'may' with an increasing range of adve...,"When you're reading books, you may even find w..."
370,371,FUTURE,future perfect continuous,B2,,FORM: AFFIRMATIVE,Can use the affirmative form with 'will'.,This summer I will have been working for three...
454,455,MODALITY,can,C1,,USE: EMPHASIS,Can use expressions with 'can' or 'can’t' to g...,"So, as you can see, there are no serious conse..."
521,522,MODALITY,might,C1,,FORM: QUESTIONS,Can use the question form.,Could it be possible that your company sent me...


Let's start with normal prompting as a baseline.

In [10]:
NUM_EXAMPLES = 20

In [11]:
def get_prompt(construction):
    return f'Create {NUM_EXAMPLES} more examples for the grammatical construction on CEFR level {construction["Level"]} in the category "{construction["SuperCategory"]}: {construction["SubCategory"]}" with guideword "{construction["guideword"]}" and the rule: "{construction["Can-do statement"]}"\n\nExamples:\n\n{construction["Example"]}\n\nOutput format:\n1. [EXAMPLE 1]\n2. [EXAMPLE 2]'

egp_samples['prompt'] = egp_samples.apply(get_prompt, axis=1)

In [13]:
print(egp_samples['prompt'].iloc[5])

Create 20 more examples for the grammatical construction on CEFR level B1 in the category "FUTURE: present continuous for future use" with guideword "USE: FUTURE ARRANGEMENTS" and the rule: "Can use the present continuous with an increasing range of verbs to talk about future arrangements."

Examples:

I’m attending my grandmother’s funeral tomorrow. 

We’re expecting a child very soon. 

The movie is starting at 8 o’clock.

Output format:
1. [EXAMPLE 1]
2. [EXAMPLE 2]


In [15]:
def get_examples(construction):
    print(construction['prompt'])
    messages = [
        {"role": "system", "content": "You are an English as a foreign language teacher who is knowledgable about grammar."},
        {"role": "user", "content": construction['prompt']}
    ]
    response = openai.ChatCompletion.create(model=config.OPENAI_MODEL, messages=messages )
    msg_content = response.choices[0].message.content
    print(f'{msg_content}\n\n')
    lines = msg_content.split('\n')
    positive_examples = [re.sub(r'^\d+\.\s*', '', line).strip() for line in lines]

    messages.append(response.choices[0].message)

    # negative examples
    messages.append({"role": "user", "content": "Rewrite each example with the same content but without using the rule."})
    print(messages)
    
    response = openai.ChatCompletion.create(model=config.OPENAI_MODEL, messages=messages)
    msg_content = response.choices[0].message.content
    print(f'{msg_content}\n\n')
    lines = msg_content.split('\n')
    negative_examples = [re.sub(r'^\d+\.\s*', '', line).strip() for line in lines]
    return positive_examples, negative_examples

egp_samples[['augmented_examples', 'augmented_negative_examples']] = egp_samples.apply(get_examples, axis=1, result_type='expand')

Create 20 more examples for the grammatical construction on CEFR level A1 in the category "PRONOUNS: subject/ object" with guideword "FORM: (OBJECT) WITH PREPOSITION " and the rule: "Can use the object pronouns 'me', 'you', 'him', 'her', 'it', 'us' and 'them' in the object position after prepositions. "

Examples:

I really like to learn new words. It's very important for me. 

Sometimes I go with her. 

Can you bring some music so we can listen to it. 

I spend my free time with them.

Output format:
1. [EXAMPLE 1]
2. [EXAMPLE 2]
1. She's always there for me.
2. Take this gift to him.
3. They're waiting for you.
4. I'm going to the party with them.
5. The book is for her.
6. This is a special place for us.
7. The park is near me.
8. Can you pass the salt to him?
9. She looks after them.
10. This present is for her.
11. We're going to the beach with them.
12. The movie is about me.
13. Can you give this to him?
14. She takes care of them.
15. This job is for her.
16. I'm going to the m

In [22]:
egp_samples[['augmented_examples', 'augmented_negative_examples']].iloc[9,0]

['Might it not be better to schedule the meeting for tomorrow instead of today?',
 'Could it be that you left your keys in the office, or might you have dropped them on your way home?',
 'Might it be worth trying a different approach to solve this problem?',
 'Could it be that I misplaced my wallet, or might someone have taken it?',
 'Might it not be helpful to consult with a financial advisor before making a major investment?',
 'Could it be possible that we missed an important detail, or might there be something we overlooked?',
 'Might it not be a good idea to check the weather forecast before planning an outdoor event?',
 'Could it be that you misinterpreted what she said, or might she have been misunderstood?',
 'Might it be better to wait for a few more days and see if the situation improves?',
 'Could it be possible that the package got lost in transit, or might there be a delay in delivery?',
 'Might it not be wise to save some money for emergencies instead of spending it all?'

## CoT prompting

Now we want the LLM to first explain the given examples, then rewrite them as negative examples, and only then do the same job as before. This should give the model more room to reflect on what the rule actually defines.

In [42]:
def get_CoT_examples(construction):
    # reflection part
    reflection_prompt = f"""
Learn to analyse the grammatical construction on CEFR level {construction["Level"]} in the category "{construction["SuperCategory"]}: {construction["SubCategory"]}" with guideword "{construction["guideword"]}" and the rule: "{construction["Can-do statement"]}"

Examples:
{construction["Example"]}

Task:
Explain how each example uses the rule.
""" 
    
    messages = [
        {"role": "system", "content": "You are an English as a foreign language teacher who is knowledgable about grammar."},
        {"role": "user", "content": reflection_prompt}
    ]
    print(messages)
    response = openai.ChatCompletion.create(model=config.OPENAI_MODEL, messages=messages)
    messages.append(response.choices[0].message)

    messages.append({"role": "user", "content": "Now rewrite each example avoiding this rule but maintaining the content."})
    print(messages)
    response = openai.ChatCompletion.create(model=config.OPENAI_MODEL, messages=messages)
    messages.append(response.choices[0].message)

    # positive examples
    messages.append({"role": "user", "content": f"Now write {NUM_EXAMPLES} more examples using the rule. Output format:\n1. [EXAMPLE 1]\n2. [EXAMPLE 2]"})
    print(messages)
    response = openai.ChatCompletion.create(model=config.OPENAI_MODEL, messages=messages)
    messages.append(response.choices[0].message)
    msg_content = response.choices[0].message.content
    print(f'{msg_content}\n\n')
    lines = msg_content.split('\n')
    positive_examples = [re.sub(r'^\d+\.\s*', '', line).strip() for line in lines]

    # negative examples
    messages.append({"role": "user", "content": "Rewrite each example with the same content but without using the rule. Output format:\n1. [EXAMPLE 1]\n2. [EXAMPLE 2]"})    
    print(messages)
    response = openai.ChatCompletion.create(model=config.OPENAI_MODEL, messages=messages)
    msg_content = response.choices[0].message.content
    print(f'{msg_content}\n\n')
    lines = msg_content.split('\n')
    negative_examples = [re.sub(r'^\d+\.\s*', '', line).strip() for line in lines]
    return positive_examples, negative_examples
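
The function above repeats the same append, call, append pattern for each turn. A minimal sketch of that loop in isolation follows. It assumes a hypothetical `ask` callable that takes the message history and returns the reply text; in this notebook it would wrap `openai.ChatCompletion.create(...)` and return `response.choices[0].message.content`. `run_turns` and `fake_ask` are illustrative names, not part of the notebook.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def run_turns(system: str, user_turns: List[str],
              ask: Callable[[List[Message]], str]) -> List[str]:
    """Send each user turn in order, feeding every reply back into the history."""
    messages: List[Message] = [{"role": "system", "content": system}]
    replies: List[str] = []
    for turn in user_turns:
        messages.append({"role": "user", "content": turn})
        reply = ask(messages)  # hypothetical wrapper around the chat API call
        messages.append({"role": "assistant", "content": reply})
        replies.append(reply)
    return replies


# Stand-in for the real API call so the sketch runs offline.
def fake_ask(messages: List[Message]) -> str:
    user_count = sum(m["role"] == "user" for m in messages)
    return f"reply to turn {user_count}"


replies = run_turns(
    "You are a teacher.",
    ["Explain the examples.", "Rewrite them.", "Write more examples."],
    fake_ask,
)
print(replies)
```

With the real API behind `ask`, the last two entries of the returned list would correspond to the positive and negative examples.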

Test the function:

In [41]:
construction = egp_samples.iloc[1]
get_CoT_examples(construction)

[{'role': 'system', 'content': 'You are an English as a foreign language teacher who is knowledgable about grammar.'}, {'role': 'user', 'content': '\nLearn to analyse the grammatical construction on CEFR level A1 in the category "PRONOUNS: subject/ object" with guideword "FORM: (SUBJECT) \'IT\' FOR FIRST PERSON" and the rule: "Can use the pronoun \'it\' before \'be\' to refer to a first person speaker or writer."\n\nExamples:\nHello Mrs Bishop. It\'s Clarisse \n\nDear Cris, it\'s me, Paarth.\n\nTask:\nExplain how each example uses the rule.\n'}]
[{'role': 'system', 'content': 'You are an English as a foreign language teacher who is knowledgable about grammar.'}, {'role': 'user', 'content': '\nLearn to analyse the grammatical construction on CEFR level A1 in the category "PRONOUNS: subject/ object" with guideword "FORM: (SUBJECT) \'IT\' FOR FIRST PERSON" and the rule: "Can use the pronoun \'it\' before \'be\' to refer to a first person speaker or writer."\n\nExamples:\nHello Mrs Bishop.

(["It's Sarah, your new neighbor.",
  "It's me, John from the party last night.",
  "It's Alex. Can you pass me the salt?",
  "It's time for dinner. It's Dad cooking tonight.",
  "It's my turn to do the dishes. It's Jane's turn tomorrow.",
  "It's so cold today. It's me freezing in this weather.",
  "It's me, your best friend since childhood.",
  "It's your favorite singer on stage, Taylor Swift.",
  "It's me, the one you were talking about.",
  "It's Alex's birthday. It's him turning 30 today.",
  "It's her, the woman I was telling you about.",
  "It's your turn to score the goal. It's you being the star player.",
  "It's my wedding anniversary. It's me celebrating ten years of marriage.",
  "It's our team winning the match. It's us bringing home the trophy.",
  "It's your favorite actor in the movie, Tom Hanks.",
  "It's your favorite band singing their new song.",
  "It's me on the phone. It's me calling to check in.",
  "It's her, the one who made the cake for the party.",
  "It's 

Now run it for all samples.

In [44]:
egp_samples[['augmented_examples_CoT', 'augmented_negative_examples_CoT']] = egp_samples.apply(get_CoT_examples, axis=1, result_type='expand')

[{'role': 'system', 'content': 'You are an English as a foreign language teacher who is knowledgable about grammar.'}, {'role': 'user', 'content': '\nLearn to analyse the grammatical construction on CEFR level A1 in the category "PRONOUNS: subject/ object" with guideword "FORM: (OBJECT) WITH PREPOSITION " and the rule: "Can use the object pronouns \'me\', \'you\', \'him\', \'her\', \'it\', \'us\' and \'them\' in the object position after prepositions. "\n\nExamples:\nI really like to learn new words. It\'s very important for me. \n\nSometimes I go with her. \n\nCan you bring some music so we can listen to it. \n\nI spend my free time with them.\n\nTask:\nExplain how each example uses the rule.\n'}]
[{'role': 'system', 'content': 'You are an English as a foreign language teacher who is knowledgable about grammar.'}, {'role': 'user', 'content': '\nLearn to analyse the grammatical construction on CEFR level A1 in the category "PRONOUNS: subject/ object" with guideword "FORM: (OBJECT) WITH

In [57]:
egp_samples[['augmented_examples_CoT', 'augmented_negative_examples_CoT']].iloc[11,1]

['Certainly! Here are the revised versions of the examples without using the rule:',
 '',
 'I believe that education is the key to success.',
 'Personally, I have always been fascinated by history.',
 'In my case, I prefer spending my weekends outdoors.',
 'Personally, I find it important to prioritize self-care.',
 'In my experience, I am committed to lifelong learning.',
 "In my opinion, it's crucial to maintain a healthy work-life balance.",
 'Personally, I enjoy trying out new recipes in the kitchen.',
 'In my view, I have a deep appreciation for art and creativity.',
 'From my perspective, I value honesty and open communication.',
 'For me, I feel a sense of fulfillment when helping others.',
 'In my view, I am passionate about protecting the environment.',
 'Personally, I find solace in practicing mindfulness and meditation.',
 'It has always been a personal interest of mine to explore science and technology.',
 'In my belief, I think that positive thinking can be powerful.',
 'P
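
The output above shows that the model's preamble and the blank line after it end up in the list, because every line of the reply is kept. A minimal sketch of a stricter parser follows, using the same numbered-line filter as the GPT-4 cells further down. `parse_numbered_list` is a hypothetical helper name, and the `reply` string is an illustrative reconstruction, not a recorded response.

```python
import re


def parse_numbered_list(msg_content: str) -> list:
    """Return only the lines that start with '<number>.', without the numbering."""
    items = []
    for line in msg_content.split("\n"):
        line = line.strip()
        if re.match(r"^\d+\.", line):
            items.append(re.sub(r"^\d+\.\s*", "", line).strip())
    return items


# Illustrative reply with a preamble and a blank line before the list.
reply = (
    "Certainly! Here are the revised versions:\n"
    "\n"
    "1. I believe that education is the key to success.\n"
    "2. Personally, I have always been fascinated by history."
)
print(parse_numbered_list(reply))
```

Lines without a leading number are dropped, so a reply that ignores the requested output format yields an empty list, which is easy to detect and retry.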

In [58]:
egp_samples.to_json('../dat/egpaugmented_cot.json')

Now let's manually prompt GPT-4 for the 12 constructions.

In [62]:
egp_samples['augmented_examples_GPT4'] = None
egp_samples['augmented_negative_examples_GPT4'] = None

Have you ever dreamed of being a human for loop? Now's your chance!

In [101]:
i=11

print(egp_samples.iloc[i]['prompt'])

Create 20 more examples for the grammatical construction on CEFR level C2 in the category "PRONOUNS: reflexive" with guideword "USE: DISCOURSE MARKERS" and the rule: "Can use 'as for myself' as a discourse marker to introduce or focus on a personal opinion. "

Examples:

As for myself, as you can clearly see from my account of this musical event, music still means an awful lot to me, and most probably allways will. 

 As for myself I like taking responsibilities. 

 As for myself, I grew up during the "internet era", the years when it was spreading to average peoples homes and therefore I can still remember life without it.

Output format:
1. [EXAMPLE 1]
2. [EXAMPLE 2]


Paste the responses here:

In [105]:
msg_content = """1. As for myself, I believe that the advancements in artificial intelligence will revolutionize our approach to problem-solving in various fields.

2. As for myself, having traveled extensively, I can confidently say that cultural immersion is the best way to understand and appreciate the diversity of our world.

3. As for myself, I find that reading historical fiction not only entertains but also provides a unique perspective on the events that shaped our history.

4. As for myself, I've always been fascinated by the complexities of quantum mechanics and its implications for our understanding of the universe.

5. As for myself, working in a fast-paced environment has always energized me and fueled my creativity.

6. As for myself, I see volunteering not just as a duty, but as a privilege to contribute positively to society.

7. As for myself, I argue that sustainable living should be a priority for everyone, given the current environmental challenges.

8. As for myself, I consider mastering multiple languages an invaluable skill in our increasingly interconnected world.

9. As for myself, I've found that maintaining a healthy work-life balance is crucial for long-term productivity and well-being.

10. As for myself, I believe that true leadership is about empowering others and fostering a collaborative spirit.

11. As for myself, I've always held the opinion that classical music possesses an unparalleled depth and complexity.

12. As for myself, I see the digital revolution not only as a technological advancement but as a catalyst for societal transformation.

13. As for myself, I've always been an advocate for lifelong learning and the pursuit of knowledge.

14. As for myself, I firmly believe that empathy and understanding are key to resolving conflicts and building harmonious relationships.

15. As for myself, I view the challenges of the 21st century as opportunities for innovation and progress.

16. As for myself, I've always been intrigued by the mysteries of space exploration and the potential discoveries beyond our planet.

17. As for myself, I regard the preservation of traditional arts and crafts as essential for maintaining our cultural heritage.

18. As for myself, I think that investing in renewable energy sources is not only environmentally responsible but also economically prudent.

19. As for myself, I've always been passionate about animal welfare and the ethical treatment of all living beings.

20. As for myself, I've come to appreciate the subtle art of culinary creativity and its capacity to bring people together."""
lines = msg_content.split('\n')
positive_examples = [re.sub(r'^\d+\.\s*', '', line).strip() for line in lines if re.match(r'^\d+\.', line)]

# Follow-up prompt sent to GPT-4: "Rewrite each example with the same content but without using the rule."
msg_content = """1. From my perspective, the advancements in artificial intelligence will revolutionize our approach to problem-solving in various fields.

2. Having traveled extensively, I can confidently say that cultural immersion is the best way to understand and appreciate the diversity of our world.

3. I find that reading historical fiction not only entertains but also provides a unique perspective on the events that shaped our history.

4. I've always been fascinated by the complexities of quantum mechanics and its implications for our understanding of the universe.

5. Working in a fast-paced environment has always energized me and fueled my creativity.

6. I see volunteering not just as a duty, but as a privilege to contribute positively to society.

7. I argue that sustainable living should be a priority for everyone, given the current environmental challenges.

8. I consider mastering multiple languages an invaluable skill in our increasingly interconnected world.

9. I've found that maintaining a healthy work-life balance is crucial for long-term productivity and well-being.

10. I believe that true leadership is about empowering others and fostering a collaborative spirit.

11. I've always held the opinion that classical music possesses an unparalleled depth and complexity.

12. I see the digital revolution not only as a technological advancement but as a catalyst for societal transformation.

13. I've always been an advocate for lifelong learning and the pursuit of knowledge.

14. I firmly believe that empathy and understanding are key to resolving conflicts and building harmonious relationships.

15. I view the challenges of the 21st century as opportunities for innovation and progress.

16. I've always been intrigued by the mysteries of space exploration and the potential discoveries beyond our planet.

17. I regard the preservation of traditional arts and crafts as essential for maintaining our cultural heritage.

18. I think that investing in renewable energy sources is not only environmentally responsible but also economically prudent.

19. I've always been passionate about animal welfare and the ethical treatment of all living beings.

20. I've come to appreciate the subtle art of culinary creativity and its capacity to bring people together."""
lines = msg_content.split('\n')
negative_examples = [re.sub(r'^\d+\.\s*', '', line).strip() for line in lines if re.match(r'^\d+\.', line)]

# Assign via .at with the row label; chained indexing (column first, then .iloc)
# triggers pandas' SettingWithCopyWarning and may write to a temporary copy.
row_label = egp_samples.index[i]
egp_samples.at[row_label, 'augmented_examples_GPT4'] = positive_examples
egp_samples.at[row_label, 'augmented_negative_examples_GPT4'] = negative_examples


In [107]:
egp_samples['augmented_negative_examples_GPT4'].iloc[i]

['From my perspective, the advancements in artificial intelligence will revolutionize our approach to problem-solving in various fields.',
 'Having traveled extensively, I can confidently say that cultural immersion is the best way to understand and appreciate the diversity of our world.',
 'I find that reading historical fiction not only entertains but also provides a unique perspective on the events that shaped our history.',
 "I've always been fascinated by the complexities of quantum mechanics and its implications for our understanding of the universe.",
 'Working in a fast-paced environment has always energized me and fueled my creativity.',
 'I see volunteering not just as a duty, but as a privilege to contribute positively to society.',
 'I argue that sustainable living should be a priority for everyone, given the current environmental challenges.',
 'I consider mastering multiple languages an invaluable skill in our increasingly interconnected world.',
 "I've found that maintai

In [108]:
egp_samples.to_json('../dat/egpaugmented_cot.json')
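
For constructions that hinge on a fixed phrase, such as 'as for myself' above, a rough sanity check is to measure how often the phrase appears in the positive and negative lists. The sketch below assumes a hypothetical helper `marker_rate` and uses a few of the GPT-4 examples from above as input. It is only a surface check and says nothing about constructions that have no fixed wording.

```python
def marker_rate(examples, marker):
    """Fraction of examples that contain the marker, ignoring case."""
    if not examples:
        return 0.0
    hits = sum(marker.lower() in example.lower() for example in examples)
    return hits / len(examples)


positives = [
    "As for myself, I believe that true leadership is about empowering others and fostering a collaborative spirit.",
    "As for myself, working in a fast-paced environment has always energized me and fueled my creativity.",
]
negatives = [
    "I believe that true leadership is about empowering others and fostering a collaborative spirit.",
    "Working in a fast-paced environment has always energized me and fueled my creativity.",
]

print(marker_rate(positives, "as for myself"))
print(marker_rate(negatives, "as for myself"))
```

A rate near 1 for the positives and near 0 for the negatives suggests the model followed the rule in both directions.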