# Guidelines for Prompting
This program is built following the guidelines of the Prompt Engineer Course by DeepLearning.AI

## Setup
#### Load the API key and relevant Python libaries.

In [4]:
import openai
import os

openai.api_key  = 'your_OPENAI_API-keyword'


We will use OpenAI's `gpt-3.5-turbo` model and the [chat completions endpoint](https://platform.openai.com/docs/guides/chat). 



In [10]:
def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0, # this is the degree of randomness of the model's output
    )
    return response.choices[0].message["content"] #there is better call?

**Note:** This notebook use OpenAI library version `0.27.0`. 

In order to use the OpenAI library version `1.0.0`, here is the code that you would use instead for the `get_completion` function:

```python
client = openai.OpenAI()

def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0
    )
    return response.choices[0].message.content
```

## Prompting Principles
- **Principle 1: Write clear and specific instructions**
- **Principle 2: Give the model time to “think”**

### Tactics

#### Tactic 1: Use delimiters to clearly indicate distinct parts of the input
- Delimiters can be anything like: ```, """, < >, `<tag> </tag>`, `:`

In [6]:
import csv
eng_messages = []
with open('engage_messages_2024.csv', mode='r') as file:
    # Create a CSV reader object
    csv_reader = csv.reader(file)
    eng_messages = [item for sublist in csv_reader for item in sublist]


In [7]:
batch_size = 50
batches = [eng_messages[i:i + batch_size] for i in range(0, len(eng_messages), batch_size)]

In [47]:
#I have added more contextual information about what it means to have a bad message and it understands. 

responses = {} 
import ast

def qualitative ():
    for batch in batches[:2]: # when they are a lot of messages, it may rushed to a conclusion
        # without thinking. consider analyzing a sert of batches and then gather the info extracted 
        # for each set
        prompt = f"""

        In the variable delimited by triple backticks, you find a list \
        of messages. These are messages from educators explaining how space inspires them \
        in their educator role.\
        I will give you a list of tasks that you have to perform in the order I am writing them to you\
        You can see that all task are delimited by single quotes\

        Your task in the following order is:\
        '0)' print messages that express bad feelings towared ESA conference \
        '1)' collect and print 2 main topics in all messages \
        '2)' collect and print 2 predominant emotions in all messages \
        '3)' print a summary of all the messages in 250 words \
        '4)' print an example of one literal real message that you find informative of the summary \
         Format your response as a JSON object with "bad_messages", "Topics" 
, "Emotions", "Summary" and "Message" as the keys           

        ```{batch}```

        """

        response = get_completion(prompt)
        #print('batch=', batch, response, "\n\n") uncomment for printing the messges
        print(response, "\n\n")
        response = ast.literal_eval(response)
        for key, value in response.items():
            if key in responses:
                responses[key] += value  # Add the values of the matching keys
            else:
                responses[key] = value  # Add new key-value pairs if not already in dict
        
        
    return responses
 

In [48]:
responses = qualitative()

{
    "bad_messages": [],
    "Topics": [
        "STEM education",
        "Space exploration"
    ],
    "Emotions": [
        "Fascination",
        "Inspiration"
    ],
    "Summary": "Educators find space to be a captivating and inspiring topic that engages students in STEM education. They express fascination with the vastness of the universe, the potential for space exploration, and the impact of space technology on Earth. Space serves as a motivator for students to learn, expand their perspectives, and consider future careers in STEM fields. Educators aim to transmit their passion for space to students, encouraging curiosity, critical thinking, and problem-solving skills. They see space as a gateway to innovation, technological advancements, and addressing global challenges.",
    "Message": "Space inspires me because it connects me with some of my students. At this moment I have two students with autisme and giftedness, they enjoy staying in the class during the breaks and lear

#### Tactic 2: Ask for a structured output
- JSON, HTML

##### I am outputting a Python dictionary, then saving it as a JSON file for reading/writing purposes

In [50]:
responses
# for saving the dictionary, I can use a JSON or directly save the dictionary in a python module, so to say 
#(save a file that contains the dict)
# will use a json
import json
with open('space_engagement_online.json', 'w') as json_file:
    json.dump(responses, json_file)

#to load the JSON file through another python file
#with opoen (space_engagement_online.json, 'r') as json_file:
# loaded_dict = json.load(json_file)

#### Tactic 3: Ask the model to check whether conditions are satisfied

Add something like this inside your prompt

'If the messages does not contain bad emotions, output a list with the only element
\"No bad emotions.\"'

## ATTENTION! LLMs have great difficulties when handling with negations. It may output that the messages contain bad emotions when these appear negated as it is in the case of : *I have experienced no bad emotions*

### This is a common thing to look at when dealing with EHR (Elctronic Health Records).

#### Here, I provide some references from the research literature:
-

#### Tactic 4: "Few-shot" prompting

In [None]:
prompt = f"""
Your task is to answer in a consistent style.

<child>: Teach me about patience.

<grandparent>: The river that carves the deepest \ 
valley flows from a modest spring; the \ 
grandest symphony originates from a single note; \ 
the most intricate tapestry begins with a solitary thread.

<child>: Teach me about resilience.
"""
response = get_completion(prompt)
print(response)

### Principle 2: Give the model time to “think” 

#### Tactic 1: Specify the steps required to complete a task

#### Ask for output in a specified format

#### Tactic 2: Instruct the model to work out its own solution before rushing to a conclusion
##### We can fix this by instructing the model to work out its own solution first and then compare to the solution provided

## Hallucinations

### How can we track and solve hallucinations?

First, what is an hallucination?

```python
f"""
<user>: what is an hallucination of a LLM?

<ChatGPT> said:

In the context of Large Language Models (LLMs) like GPT, a hallucination refers to the phenomenon where the model generates information that is factually incorrect, nonsensical, or completely made up, even though the output may appear plausible or convincing.

"""
```

## How to reference the use of LLMs? [1]

When prompted "what is an hallucination of a LLM" the ChatGPT-generated text indicated that 

in the context of Large Language Models (LLMs) like GPT, a hallucination refers to the phenomenon where the model generates information that is factually incorrect, nonsensical, or completely made up, even though the output may appear plausible or convincing.[*]

[*]OpenAI (2024), ChatGPT (GPT-4)[Large Language Model]. Response to query made by ALba Morquecho-Delgado. 22/09/2024. https://chat.openai.com/chat

[1] Hosseini, M., Resnik, D. B., & Holmes, K. (2023). The ethics of disclosing the use of artificial intelligence tools in writing scholarly manuscripts. Research Ethics, 19(4), 449-465. https://doi.org/10.1177/17470161231180449

##### A note about the backslash
- In the course, we are using a backslash `\` to make the text fit on the screen without inserting newline '\n' characters.
- GPT-3 isn't really affected whether you insert newline characters or not.  But when working with LLMs in general, you may consider whether newline characters in your prompt may affect the model's performance.