<a href="https://colab.research.google.com/github/koleshjr/PROMPT-ENGINEERING/blob/main/Learning_about_openai_gpt3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## PROMPT ENGINEERING USING GPT-3


### Proof of concept
* Built gradually over time
* Starting example: each fact has the fields {'category', 'key', 'value'}

###### Steps
* Extract facts from natural language and add them to a database
* Search the database using keywords

###### Prompt Engineering
* A prompt feeds both instructions and content to the model
* It requires an appropriate language-processing architecture to handle prompt composition and, later, interpretation of the model output
* It also requires gradual refinement of the prompts: carefully examine the results and iteratively improve them

###### General architecture
* Pre-processing -> prompt -> GPT-3 -> raw_output -> postprocessing -> suitable data structure -> application usage
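
The pipeline above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual implementation: `preprocess`, `postprocess`, `run_pipeline`, and `fake_model` are hypothetical names, and `fake_model` is a canned stand-in for the real GPT-3 call.

```python
from typing import Callable, List, Tuple

Fact = Tuple[str, str, str]  # (category, key, value)


def preprocess(utterance: str) -> str:
    """Wrap the user's utterance in an instruction prompt."""
    return (
        "Extract facts as tuples with the format (Category, Key, Value).\n"
        f'Input: "{utterance.strip()}"\n'
        "Output:"
    )


def postprocess(raw_output: str) -> List[Fact]:
    """Turn one-tuple-per-line text into a list of 3-tuples.

    Naive on purpose: it splits on commas, so values that themselves
    contain commas would be rejected.
    """
    facts = []
    for line in raw_output.splitlines():
        parts = [p.strip().strip('"') for p in line.strip().strip("()").split(",")]
        if len(parts) == 3:
            facts.append((parts[0], parts[1], parts[2]))
    return facts


def run_pipeline(utterance: str, model: Callable[[str], str]) -> List[Fact]:
    """Pre-processing -> prompt -> model -> raw output -> postprocessing."""
    prompt = preprocess(utterance)
    raw_output = model(prompt)
    return postprocess(raw_output)


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for the GPT-3 call; returns a canned reply."""
    return '("Family", "mom\'s phone number", "555-555-5555")'


facts = run_pipeline("Mom's phone number is 555-555-5555", fake_model)
print(facts)
```

Passing the model in as a callable keeps the pre- and postprocessing steps testable without network access.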

###### Iterative prompt development
* Write a prompt, test it with the available data (or just a couple of manual inputs), examine the mistakes, fix the prompt to account for those mistakes, and repeat

* If you don't get some interesting results in your first iterations, perhaps you should consider using another technique rather than GPT-3

### Notice the iteration life cycle for prompt engineering in the examples below

#### Example 3
* Extraction prompt 3, given user input x

* Extract pieces of personal information, like phone numbers, email 
addresses, names, trivia, reminders, etc., as tuples with the following 
format: (Category, Key, Value)

* Example input: "Mom's phone number is 555-555-5555"
Example output: ("Family", "mom's phone number", "555-555-5555")



#### Example 5:
* Extraction prompt 4, given user input x and valid categories

* Extract pieces of personal information, like phone numbers, email 
addresses, names, trivia, reminders, etc., as tuples with the following 
format: (Category, Key, Value)
Assume everything mentioned refers to the same thing. Constraints:
  - Allowed Categories: {', '.join(categories)}


* Example input: "Mom's phone number is 555-555-5555"
Example output: ("Family", "mom's phone number", "555-555-5555")

* Example input: "Need to do: lab work, ultrasound, buy aspirin"
Example output: 
("Health", "to do", "lab work")
("Health", "to do", "ultrasound")
("Health", "buy", "aspirin") 
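
As a rough sketch, the template above could be filled in with a small helper. The function name `build_extraction_prompt`, the sample category list, and the trailing `Input:` / `Output:` lines are assumptions for illustration; only the instruction text and the `', '.join(categories)` placeholder come from the prompt itself.

```python
from typing import Sequence


def build_extraction_prompt(user_input: str, categories: Sequence[str]) -> str:
    """Fill the extraction template with the allowed categories and the input."""
    return (
        "Extract pieces of personal information, like phone numbers, email "
        "addresses, names, trivia, reminders, etc., as tuples with the following "
        "format: (Category, Key, Value)\n"
        "Assume everything mentioned refers to the same thing. Constraints:\n"
        f"  - Allowed Categories: {', '.join(categories)}\n\n"
        'Example input: "Mom\'s phone number is 555-555-5555"\n'
        'Example output: ("Family", "mom\'s phone number", "555-555-5555")\n\n'
        # Assumed layout: the real user input goes last, followed by a cue
        # for the model to start its answer.
        f'Input: "{user_input}"\n'
        "Output:"
    )


prompt = build_extraction_prompt(
    "Need to do: lab work, ultrasound, buy aspirin",
    ["Family", "Health", "Work"],  # hypothetical category list
)
print(prompt)
```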

#### Example 6:
* Extraction prompt 5, given user input x and valid categories

* Extract pieces of personal information, like phone numbers, email 
addresses, names, trivia, reminders, etc., as tuples with the following 
format: (Category, Type, People, Key, Value)
Assume everything mentioned refers to the same thing. Constraints:
  - Allowed Categories: {', '.join(categories)}
  - Allowed Types: "List", "Email", "Phone", "Address", "Document", 
    "Pendency", "Price", "Reminder", "Note", "Doubt", "Wish", "Other"
  - People contain the name or description of the people or organizations 
    concerned, or is empty if no person or organization is mentioned.
  
* Example input: "Mom's phone number is 555-555-5555"
Example output: ("Family", "Phone", "mom", "mom's number", "555-555-5555")

* Example input: "email of the building administration = adm@example.com"
Example output: ("Work", "Email", "building administration", 
                 "email", "adm@example.com")

* Example input: "Need to do: lab work, ultrasound, buy aspirin"
Example output: 
("Health", "List", "", "to do", "lab work")
("Health", "List", "", "to do", "ultrasound")
("Shopping", "List", "", "aspirin", "buy")  
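
One possible way to turn output in this five-field format into structured records is sketched below. It assumes the model emits exactly one tuple per line, written as a valid Python tuple literal; the `Fact` dataclass and `parse_facts` are hypothetical names, not part of the original notebook.

```python
import ast
from dataclasses import dataclass
from typing import List


@dataclass
class Fact:
    category: str
    type: str
    people: str
    key: str
    value: str


def parse_facts(raw_output: str) -> List[Fact]:
    """Parse lines such as ("Health", "List", "", "to do", "lab work")."""
    facts = []
    for line in raw_output.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            # literal_eval only evaluates literals, so it is safer than eval()
            parsed = ast.literal_eval(line)
        except (ValueError, SyntaxError):
            continue  # skip lines that are not valid literals
        if (
            isinstance(parsed, tuple)
            and len(parsed) == 5
            and all(isinstance(item, str) for item in parsed)
        ):
            facts.append(Fact(*parsed))
    return facts


raw = '''("Health", "List", "", "to do", "lab work")
("Health", "List", "", "to do", "ultrasound")
not a tuple
("Shopping", "List", "", "aspirin", "buy")'''
facts = parse_facts(raw)
print(facts)
```

Lines that do not parse are silently skipped here; an application might prefer to surface them to the user instead.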


###### NB: We can get more accurate results by lowering the temperature parameter from 0.5 to 0.1


### Notice the iteration life cycle for search below
* The procedure looks for any of the keywords in the user's query; if a matching fact exists in the database, it is returned

* GPT-3 extracts the main terms from a query
* The main terms are then augmented with related ones

#### Example 1: Query terms extraction prompt, given user query

* Extract the main entities (one per line, without bullets) in the following 
sentence: "{query}"

* Input: What is the cost of a flu shot?
* Output: flu shot and cost
* Input: What is the cost of a flu shot?
* Output: Pediatrician and Questions
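
A minimal sketch of how this prompt could be built and its reply parsed is shown below. The helper names are hypothetical, and the parser assumes the model follows the "one per line" instruction, while tolerating stray bullets and blank lines.

```python
from typing import List

QUERY_TERMS_TEMPLATE = (
    "Extract the main entities (one per line, without bullets) in the following "
    'sentence: "{query}"'
)


def build_query_terms_prompt(query: str) -> str:
    """Insert the user's query into the extraction template."""
    return QUERY_TERMS_TEMPLATE.format(query=query)


def parse_terms(raw_output: str) -> List[str]:
    """Split the reply into lowercase terms, one per non-empty line."""
    terms = []
    for line in raw_output.splitlines():
        # Drop leading bullet characters in case the model adds them anyway
        term = line.strip().lstrip("-*• ").strip().lower()
        if term and term not in terms:
            terms.append(term)
    return terms


prompt = build_query_terms_prompt("What is the cost of a flu shot?")
terms = parse_terms("flu shot\ncost\n")
print(prompt)
print(terms)
```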

#### Example 2: Data augmentation prompt, given a term

* List some synonyms for the following term: "{term}"
Synonyms (one synonym per line):

* E.g., flu shot: Vaccination, Injection, Immunization, Jab
* E.g., Pediatrician: Child Doctor, Pediatrician, Infant Specialist, Kids Physician
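
The augmentation step might look something like the sketch below. `build_synonyms_prompt`, `augment_terms`, and `fake_model` are hypothetical names, and `fake_model` simply replays the flu shot example above in place of a real GPT-3 call.

```python
from typing import Callable, Dict, List, Sequence


def build_synonyms_prompt(term: str) -> str:
    """Insert a term into the data augmentation template."""
    return (
        f'List some synonyms for the following term: "{term}"\n'
        "Synonyms (one synonym per line):"
    )


def augment_terms(
    terms: Sequence[str], model: Callable[[str], str]
) -> Dict[str, List[str]]:
    """Map each term to itself plus the synonyms returned by the model."""
    augmented = {}
    for term in terms:
        raw_output = model(build_synonyms_prompt(term))
        synonyms = [s.strip().lower() for s in raw_output.splitlines() if s.strip()]
        # Keep the original term first and drop duplicates, preserving order
        augmented[term] = list(dict.fromkeys([term.lower()] + synonyms))
    return augmented


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for GPT-3 with one canned answer."""
    if '"flu shot"' in prompt:
        return "Vaccination\nInjection\nImmunization\nJab"
    return ""


augmented = augment_terms(["flu shot"], fake_model)
print(augmented)
```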

##### Work Flow
* Get a user input, extract its main terms, augment them, and then search a dataframe for them. 
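
The whole search workflow can be sketched end to end as below. To stay dependency-free, a list of dictionaries stands in for the dataframe; the function names, the stub extractor and augmenter, and the sample rows (including the made-up price) are all assumptions for illustration.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, str]


def search_facts(rows: Iterable[Row], terms: Iterable[str]) -> List[Row]:
    """Return rows in which any term appears as a case-insensitive substring."""
    needles = [t.lower() for t in terms if t.strip()]
    hits = []
    for row in rows:
        haystack = " ".join(row.values()).lower()
        if any(needle in haystack for needle in needles):
            hits.append(row)
    return hits


def answer_query(
    query: str,
    rows: Iterable[Row],
    extract_terms: Callable[[str], List[str]],
    augment: Callable[[str], List[str]],
) -> List[Row]:
    """Extract the main terms, augment them, then search the stored facts."""
    terms = extract_terms(query)
    expanded = [t for term in terms for t in augment(term)]
    return search_facts(rows, expanded)


# Made-up sample data and stubs standing in for the two GPT-3 calls
facts = [
    {"category": "Health", "key": "vaccination price", "value": "25 USD"},
    {"category": "Family", "key": "mom's phone number", "value": "555-555-5555"},
]


def stub_extract(query: str) -> List[str]:
    return ["flu shot", "cost"]


def stub_augment(term: str) -> List[str]:
    related = {"flu shot": ["flu shot", "vaccination"], "cost": ["cost", "price"]}
    return related.get(term, [term])


results = answer_query("What is the cost of a flu shot?", facts, stub_extract, stub_augment)
print(results)
```

Note that the query never mentions "vaccination" or "price"; the stored fact is found only because of the augmented terms.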



### Application Development
* The Engine: A Python library that encapsulates the main functions required to make the application work. Its classes and methods abstract the calls to GPT-3 so that users of the library don't need to be aware of them at all

* The User Interface: This exposes the underlying engine to the user, taking commands and presenting results

###### The main jobs of the engine are to
* Allow the user to search a database with keywords.
  * user types search query -> extract important terms -> augment terms -> query()
* Allow the user to insert new facts into the database using NL utterances.
  * user types NL facts -> extract facts -> looks good to the user? -> commit if yes; cancel and retry if no
* Optionally, allow the user to manually check how GPT-3 interpreted the NL utterances, and either commit or cancel the data insertion.
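
The review-before-commit workflow could be sketched as a small class that stages extracted facts until the user decides. `FactEngine` and its method names are hypothetical and are not the library's actual API; `stub_extract` replaces the GPT-3 extraction step.

```python
from typing import Callable, List, Tuple

Fact = Tuple[str, str, str]  # (category, key, value)


class FactEngine:
    """Hypothetical engine: stages extracted facts until the user confirms."""

    def __init__(self, extract: Callable[[str], List[Fact]]) -> None:
        self._extract = extract
        self.facts: List[Fact] = []      # committed facts
        self._pending: List[Fact] = []   # facts awaiting user review

    def propose(self, utterance: str) -> List[Fact]:
        """Extract facts and stage them; nothing is stored yet."""
        self._pending = self._extract(utterance)
        return list(self._pending)

    def commit(self) -> int:
        """Store the staged facts and return how many were added."""
        count = len(self._pending)
        self.facts.extend(self._pending)
        self._pending = []
        return count

    def cancel(self) -> None:
        """Discard the staged facts so the user can retry."""
        self._pending = []


def stub_extract(utterance: str) -> List[Fact]:
    return [("Family", "mom's phone number", "555-555-5555")]


engine = FactEngine(stub_extract)
engine.propose("Mom's phone number is 555-555-5555")
engine.cancel()                      # user rejects the first attempt
cancelled_count = engine.commit()    # nothing staged, so nothing is stored
engine.propose("Mom's phone number is 555-555-5555")
committed_count = engine.commit()    # user accepts the second attempt
print(engine.facts)
```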


##### Proper practices to follow when building the engine
* Facts are stored in a CSV file and are manipulated by the application as a Pandas dataframe. This is merely to keep things simple; in a production version, a database engine could be better (SQLite might be a good choice).
* The engine contains methods to support a workflow that allows the user to examine the fact extraction before accepting it (BraindumpEngine class).
* A preprocessor provides the mechanisms to build prompts for GPT-3 (BraindumpPreprocessor class).
* A postprocessor provides the mechanisms to interpret GPT-3’s outputs (BraindumpPostprocessor class).
* Unit tests should be in place, particularly to check the quality of the information extraction.


#### The preprocessor
* Transforms the user utterances into appropriate prompts, which involves:
* Inserting a user NL utterance (e.g., the facts we are allowing the users to type).
* Inserting dynamic user choices (e.g., specific categories chosen by the user).
* Inserting table or document schemas of specific files into the prompt text, so that the prompt can leverage them.

#### The postprocessor
* Is responsible for parsing GPT-3’s response and performing any necessary additional computation, including:
* Breaking lines, trimming text, and normalizing capitalization, among other simple text transformations.
* Instantiating appropriate data structures that can be more easily manipulated later by the system.
* Checking for invalid elements (e.g., an invalid category).
* Checking for offensive or illegal content. Remember that LLMs have been trained with vast amounts of rather arbitrary Internet content, which means they also carry information that users can find objectionable. Some platforms (such as the Azure OpenAI Service offering) already support filtering those at the API call level.
* If a program is being generated (e.g., as can be easily done with the Codex version of GPT-3), applying appropriate program repair transformations as needed.
* Checking for dangerous outputs. For example, if you are generating programs, it might be a good idea to ensure they cannot harm a user’s computer.
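
A minimal sketch of the simpler postprocessing checks (trimming, case normalization, and category validation) is given below. The function name `clean_facts` and the choice to map unknown categories to a fallback `"Other"` category, rather than dropping the fact, are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

Fact = Tuple[str, str, str]  # (category, key, value)


def clean_facts(
    facts: Iterable[Fact],
    allowed_categories: Iterable[str],
    fallback: str = "Other",
) -> List[Fact]:
    """Trim fields, normalize case, and replace invalid categories."""
    # Lookup from lowercase name to the canonical spelling
    allowed = {c.lower(): c for c in allowed_categories}
    cleaned = []
    for category, key, value in facts:
        category = allowed.get(category.strip().lower(), fallback)
        key = key.strip().lower()
        value = value.strip()
        if key and value:  # drop facts with an empty key or value
            cleaned.append((category, key, value))
    return cleaned


raw_facts = [
    (" health ", "To Do", " lab work "),
    ("Sports", "team", "x"),   # category not in the allowed list
    ("Health", "", "y"),       # empty key
]
cleaned = clean_facts(raw_facts, ["Health", "Family"])
print(cleaned)
```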

#### Testing
* Just as with any other software, LLM-based tools should also have automated tests. In addition to the general software engineering aspects regarding testing, we can add the following specific considerations:

* The underlying model is expected to evolve over time. Though this typically means that the model gets better, it might also mean that prompts that worked well before stop working as well. Hence, it’s important to include tests to also guard against this expected behavioral change. In fact, as I write these lines, I’m wondering and fearing what the rumored upcoming GPT-4 model will do to my tutorial!
* Different model outputs might be equally suitable as ground truth. In Braindump, for instance, it matters little if the model writes “buy” instead of “purchase” regarding a shopping list item. Hence, special matching operators can be used when writing tests, to make tests more resilient to such innocuous perturbations.
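
One way such a matching operator might be written is sketched below. The equivalence groups and the names `canonical` and `facts_match` are hypothetical; a real test suite would need its own list of acceptable variations.

```python
from typing import List, Sequence, Set

# Hypothetical groups of words treated as interchangeable in tests
EQUIVALENT_TERMS: List[Set[str]] = [{"buy", "purchase"}]


def canonical(word: str) -> str:
    """Lowercase a word and map it to one representative of its group."""
    word = word.strip().lower()
    for group in EQUIVALENT_TERMS:
        if word in group:
            return min(group)  # alphabetically first member as representative
    return word


def facts_match(expected: Sequence[str], actual: Sequence[str]) -> bool:
    """Compare two facts field by field, ignoring case and known synonyms."""
    return len(expected) == len(actual) and all(
        canonical(e) == canonical(a) for e, a in zip(expected, actual)
    )


same = facts_match(("Shopping", "aspirin", "buy"), ("shopping", "Aspirin", "Purchase"))
different = facts_match(("Shopping", "aspirin", "buy"), ("Shopping", "aspirin", "sell"))
print(same, different)
```

A test can then assert `facts_match(expected, actual)` instead of strict equality, so that a model update that swaps "buy" for "purchase" does not cause a failure.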



### Question: 
* Question: So how can we be sure that the result really is what the user asked? And what do we do when the result fails, as it eventually most certainly will?
* Answer: For each output, make sure it is easy for the user to inspect it, and that there’s a way to undo it and try again if the user finds it necessary.
