<a href="https://colab.research.google.com/github/Dillon-Sim/Loan_Cluster_Analysis/blob/main/TA_Group1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# __Text Analytics - Group 1 - Prompt Engineering__




## Table of Contents  
    
1. [Introduction](#1_Intro)
2. [Effective Prompt Engineering](#2_EPE)  
3. [Effective Prompt Engineering - Example](#3_Ex)  
4. [References](#4_Ref)  


## Introduction <a class="anchor" id="1_Intro"></a>

Prompt engineering is a machine learning paradigm that covers the interaction between humans and large language models (LLMs) to generate desired content. A prompt is a set of instructions given to an LLM that customises, enhances, and refines its capabilities (Liu et al., 2023). The fundamental goal of prompt engineering is to improve the performance and usability of LLMs (Learn-Prompting, 2023; Shin et al., 2020), which are trained on massive amounts of text data and can generate human-like text and complete other language-related tasks with high accuracy (Kasneci et al., 2023). Through prompting, users refine their inputs, often by adding task descriptions that steer the model, until they obtain a usable output.

In particular, a prompt can shape an LLM conversation by setting out initial rules and guidelines, which affect the subsequent interactions with the LLM and the output it generates. Providing context for a conversation is also crucial, as it guides the LLM on which data should be prioritised and how the outcome should be presented (White et al., 2023). Unlike tuning the weights of the model's parameters, fine-tuning the model on specific data, or designing new architectures and training methods optimised for specific tasks or domains, prompt engineering leaves the model itself unchanged and works only through its input. Additionally, prompts can adapt to the conversation and suggest further prompts to gather more data or produce related artefacts, making the programming process more efficient. Well-designed prompts can therefore offer value beyond plain text or code production (White et al., 2023).

Prompt engineering is likely to become increasingly important, since it has the potential to change the way we interact with language models. By crafting effective prompts, we can use the full potential of language models to perform complex tasks and generate high-quality outputs that meet specific needs (Ekin, 2023). Moreover, as more people rely on language models for various applications, proper prompt engineering is needed to help ensure that language models generate accurate, relevant, and unbiased outputs.


## Effective Prompt Engineering <a class="anchor" id="2_EPE"></a>

In general, there are several ways to compose effective queries for AI language models (Shieh, 2023; OpenAI, 2023):

1. Use a separator or placeholder to distinguish the instruction from the context:

>For example: "#" to mark the instruction; 
"{text input here}" as a placeholder for the actual context.

2. Use specific, descriptive and domain-specific prompts:

>Use prompts that specify what to achieve, rather than generic prompts, to get more relevant and accurate responses. For instance, "Add more detail about my experience as a Risk Analyst in the banking industry to my cover letter." rather than "Modify my cover letter." 

3. Use multiple prompts:  

>Instead of relying on a single prompt, use multiple prompts that cover different aspects of the task, helping the model generate more diverse and relevant responses.

4. Provide examples:  

>Prompt engineering normally starts with zero-shot prompting, in which the model is expected to produce a suitable output without being given any examples. To improve the quality of the generated content, few-shot prompting supplies a handful of worked examples in the prompt to show the LLM what the output should look like (Wei et al., 2021; Short and Short, 2023). 
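
The separator and few-shot ideas from the list above can be sketched in a few lines of Python. This is only an illustration: the `###` separator, the `build_prompt` helper, and the sentiment examples are assumptions made for this sketch and are not part of the notebook's own code.

```python
# Hypothetical helper: joins an instruction, optional examples and the context
# with a visible separator so the model can tell the parts apart.
SEPARATOR = "###"  # assumed separator; any distinctive marker would do


def build_prompt(instruction, context, examples=()):
    """Return one prompt string; with no examples this is a zero-shot prompt."""
    parts = [f"{SEPARATOR} Instruction {SEPARATOR}", instruction.strip()]
    for example_input, example_output in examples:
        parts += [
            f"{SEPARATOR} Example {SEPARATOR}",
            f"Input: {example_input}",
            f"Output: {example_output}",
        ]
    parts += [f"{SEPARATOR} Context {SEPARATOR}", context.strip()]
    return "\n".join(parts)


instruction = "Classify the sentiment of the review as positive or negative."
review = "The onboarding was smooth and quick."

# Zero-shot: instruction and context only
zero_shot = build_prompt(instruction, review)

# Few-shot: the same prompt with two made-up worked examples added
few_shot = build_prompt(
    instruction,
    review,
    examples=[
        ("I loved the support team.", "positive"),
        ("The app keeps crashing.", "negative"),
    ],
)
print(few_shot)
```

Keeping the examples in a separate argument makes it easy to compare zero-shot and few-shot outputs for the same instruction.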


Despite the different ways of phrasing prompts, research has demonstrated that, with carefully refined prompts, ChatGPT can generate the desired output (Short and Short, 2023). In addition, Lo (2023) proposed the CLEAR Framework for writing prompts that produce AI-generated content effectively: 

- Concise: be precise and avoid superfluous information 

- Logical: structure the prompt coherently so that it follows a logical flow 

- Explicit: state the output specifications, including the format, content, or scope 

- Adaptive: try different prompt formulations and temperature settings depending on the quality of the output 

- Reflective: evaluate the output and refine future prompts using the insights from the evaluation 

The output of prompt engineering is also affected by LLM settings such as the temperature and top-p parameters (Lo, 2023), both of which control the randomness of the responses. A higher temperature leads to more diverse and creative content, whereas a lower temperature produces more focused and deterministic output. Top-p sets a cumulative probability cut-off: at each step the model samples only from the smallest set of most likely tokens whose combined probability reaches the threshold, which reduces the tendency to produce unexpected content. 
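
To make the effect of these two settings concrete, the sketch below applies temperature scaling and a top-p cut-off to a made-up set of token scores. It illustrates the general sampling idea only and is not OpenAI's implementation; the function names and the example scores are assumptions.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, then renormalise the kept probabilities."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for four candidate tokens

cool = softmax_with_temperature(logits, 0.5)  # concentrated on the top token
warm = softmax_with_temperature(logits, 1.5)  # flatter, more varied
nucleus = top_p_filter(softmax_with_temperature(logits, 1.0), 0.8)

print([round(p, 3) for p in cool])
print([round(p, 3) for p in warm])
print(sorted(nucleus))  # indices of the tokens that survive the cut-off
```

With these scores, the lower temperature puts most of the probability on the first token, and the top-p cut-off of 0.8 leaves only the two most likely tokens available for sampling.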

 
 

## Utilising effective prompt engineering to generate customised cover letters  <a class="anchor" id="3_Ex"></a>

Using OpenAI's GPT API and effective prompt engineering, the code below generates customised cover letters for a user, tailored to their experience and the job they are applying for. After a cover letter is generated, the code asks the model to rate its quality and to give the user feedback for further improvement. 


In [3]:
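# Install the OpenAI client. The code below uses the legacy `openai.ChatCompletion`
# interface, which is only available in versions before 1.0 (0.27.6 was installed here).
!pip install "openai<1.0"


#Import packages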
import os
import openai
import getpass

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting openai
  Downloading openai-0.27.6-py3-none-any.whl (71 kB)
Collecting aiohttp (from openai)
  Downloading aiohttp-3.8.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.0 MB)
Collecting multidict<7.0,>=4.5 (from aiohttp->openai)
  Downloading multidict-6.0.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (114 kB)
Collecting async-timeout<5.0,>=4.0.0a3 (from aiohttp->openai)
  Downloading async_timeout-4.0.2-py3-none-any.whl (5.8 kB)
Collecting yarl<2.0,>=1.0 (from aiohttp->openai)
  Downloadin

The user is prompted to enter the required information before proceeding. 

In [4]:
#User is prompted to enter the job description
job_desc = input("Enter the job description here:").strip()

#Keep asking until something is entered, rather than retrying only once
while not job_desc:
  print("\nNothing was entered, please try again.")
  job_desc = input("Enter the job description here:").strip()

Enter the job description here:At Contentful, data plays a key role in everything that we do - understanding our users, discovering and sizing opportunities, and measuring the success of our initiatives. Our Product Analysts are at the centre of these efforts, both on a strategic and tactical level. We're looking for someone with business instincts, creativity, analytics mastery, and experience working with product development teams; someone who is passionate about using these skills to build a data culture which drives business impact.  Our team operates on an embedded model, as such you’ll be part of a cross-functional team of Product Managers, Designers and Engineers, while reporting to the Director of Data Analytics. The ideal teammate has the skills to unearth insights from large volumes of data, and to communicate those insights in a way that will empower our senior business and product leaders to make better-informed decisions.  What to expect?  •	You will lead analytics for one

In [5]:
#User is prompted to enter their CV
CV_doc = input("Enter your CV here:").strip()

#Keep asking until something is entered, rather than retrying only once
while not CV_doc:
  print("Nothing was entered, please try again.")
  CV_doc = input("Enter your CV here:").strip()

Enter your CV here:Professional Profile: Data Scientist with extensive experience in developing and deploying machine learning models to drive business decisions and achieve organizational goals. Skilled in analyzing large datasets using SQL, Python, and R to identify trends and patterns. Proficient in creating interactive dashboards and reports using Tableau and other data visualization tools to communicate insights to stakeholders. Passionate about utilizing data-driven insights to improve customer experience and drive innovation. Skills: •	Programming Languages: Python, R, SQL •	Data Science Tools: TensorFlow, Keras, Scikit-learn, Pandas, Numpy, Matplotlib, Seaborn, Tableau •	Database Systems: MySQL, PostgreSQL, MongoDB •	Operating Systems: Linux, Windows •	Other Skills: Machine Learning, Deep Learning, Natural Language Processing, Computer Vision, Statistical Analysis, Data Cleaning, Data Wrangling, Data Exploration, Cloud Computing, Agile Methodologies Experience: ABC Corporation,

In [6]:
#User is prompted to enter the title of the job
role_title = input("Enter the job title here:").strip()

#Keep asking until something is entered, rather than retrying only once
while not role_title:
  print("Nothing was entered, please try again.")
  role_title = input("Enter the job title here:").strip()

Enter the job title here:Lead Data Scientist


In [7]:
#User is prompted to enter the name of the company
company = input("Enter the company name here:").strip()

#Keep asking until something is entered, rather than retrying only once
while not company:
  print("Nothing was entered, please try again.")
  company = input("Enter the company name here:").strip()

Enter the company name here:Contentful


In [8]:
#User is prompted to enter their OpenAI API key; getpass hides the key while it is typed
print("Enter your OpenAI API Key:")
openai.api_key = getpass.getpass()

Enter your OpenAI API Key:
··········


The prompt below illustrates some best practices in prompt engineering. For instance:
-  __Prompt:__ _"incorporate relevant phrases or keywords from the job description, only if the user has experience in those areas based on their CV"_ 
- __Explanation:__ directs the model to include only those phrases or keywords that correspond to the user's experience as described in their CV. This helps ensure that the user's skills, qualifications, and experience are accurately reflected in the cover letter and are in line with the job requirements.  

-  __Prompt:__ _"emphasise relevant achievements in order to create an impressive letter"_
-  __Explanation:__ directs the model to highlight the user's most notable accomplishments in the cover letter, presenting them in a way that appeals to the organisation. 

Overall, the model should demonstrate the user's skills and experience that are most relevant to the targeted job and produce a well-flowing, grammatically correct cover letter. 

In [9]:
# Define instructions for GPT to generate a tailored cover letter

tailor_cl_instructions = [
    {"role": "system", "content": 
         '''
         You will act as an expert in cover letter writing, business and analytics. 
         Your job is to help the user get an interview by generating a personalised cover letter. 
         You must always follow all the instructions the user provides. 
         The user will provide the job description, his/her CV, the position title and company name. 
         
      '''}   ,
    {"role": "user", "content": 
                         
     f''' 
     Job description: <{job_desc}>.
     The position title is <{role_title}>. 
     The company is <{company}>.
     CV: <{CV_doc}>
        
         Your instructions are below:
             (1) Generate a personalised cover letter for the specific job posting provided by the user.
             (2) Incorporate relevant phrases or keywords from the job description, only if the user has experience in those areas based on their CV.
             (3) Avoid listing skills in any paragraph or sentence. 
             (4) Emphasise skills and experience that the employer is seeking, instead of merely listing them. Use specific examples to demonstrate proficiency.
             (5) Do not fabricate any information or experience for the user. Only include information that is accurate and supported by the CV.
             (6) Ensure that the letter is grammatically correct and flows well. 
             (7) Avoid using overly literal language, such as "mid-level experience." Instead, use more natural phrasing.
             (8) Demonstrate skills and experience without explicitly mentioning that they align with the employer's requirements.
             (9) Emphasise relevant achievements in order to create an impressive letter.
             (10) Avoid directly quoting the job description.
             (11) Keep the letter between 250 and 400 words.
             (12) Select the most relevant experience from the CV to highlight in the letter. Focus on recent experience before older experience.
             (13) Do not use the phrases "job description", "job posting" or anything similar in the output of the cover letter. 
             (14) Follow Cover Letter writing best practices in terms of structuring paragraphs.
             (15) Do not say my experience with "tool1 or tool2" , for example never say "Power BI or Tableau", just pick the one the user has experience with or the one desired in the job description.
            
        
        Follow the instructions to generate a draft of the cover letter. Then, evaluate your draft, and make an improved version while still following the instructions.
        Display the draft and improved version, clearly indicating which is which. Lastly, explain how you have met each instruction above in the final version.
            '''}              
         
      ]
                              

The following parameters are used:
- The temperature is set to 0.5 to regulate the "creativity" of the chatbot model's answers. Higher temperatures give more varied and unexpected responses, while lower temperatures give more conservative and predictable ones.
- The frequency penalty is set to 0.5 to prevent the chatbot model from repeating itself too often. It lowers the likelihood of a token in proportion to how many times that token has already appeared, so a higher value pushes the model towards more varied wording.
- The presence penalty is set to 0.4 to encourage the model to move on to new points rather than returning to ones it has already made. It applies a one-off penalty to any token that has already appeared at least once, regardless of how often (OpenAI, 2022).
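
As a rough illustration of how the two penalties act, the sketch below adjusts a made-up set of token scores using the rule described in OpenAI's API documentation: each score is reduced by the token's count multiplied by the frequency penalty, plus the presence penalty if the token has appeared at all. The `apply_penalties` helper, the tokens, and the scores are assumptions for this example, not part of the API.

```python
from collections import Counter


def apply_penalties(logits, generated_tokens, frequency_penalty, presence_penalty):
    """Lower the score of tokens that were already generated.

    adjusted = logit - count * frequency_penalty - (count > 0) * presence_penalty
    """
    counts = Counter(generated_tokens)
    adjusted = {}
    for token, logit in logits.items():
        count = counts[token]
        seen = 1.0 if count > 0 else 0.0
        adjusted[token] = logit - count * frequency_penalty - seen * presence_penalty
    return adjusted


# Made-up scores for three candidate next tokens
logits = {"data": 2.0, "team": 1.5, "insight": 1.0}
# Tokens the model has produced so far: "data" twice, "team" once
generated = ["data", "data", "team"]

adjusted = apply_penalties(
    logits, generated, frequency_penalty=0.5, presence_penalty=0.4
)
for token, score in adjusted.items():
    print(token, round(score, 2))
```

In this example the unused token "insight" keeps its score and ends up ranked above both repeated tokens, which is the behaviour the two penalties are meant to encourage.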

In [10]:
# Use OpenAI's GPT 3.5  to generate a tailored cover letter based on the provided instructions

tailored_cover_letter_model = openai.ChatCompletion.create(model="gpt-3.5-turbo", 
                                               messages=tailor_cl_instructions,
                                               temperature = 0.5,
                                               frequency_penalty = 0.5,
                                               presence_penalty = 0.4,
                                               n=1)

# Retrieve the generated tailored cover letter
tailored_cover_letter = tailored_cover_letter_model.choices[0].message.content



Now that the customised cover letter has been generated and stored as a variable, it is displayed to the user.

In [12]:
# Print the tailored cover letter
print(f"Cover Letter: \n{tailored_cover_letter}")


Cover Letter: 
Draft Cover Letter:

Dear Hiring Manager,

I am writing to express my interest in the Lead Data Scientist position at Contentful. With over 3 years of experience as a dedicated product analyst, I am confident that I have the skills and expertise necessary to excel in this role.

At ABC Corporation, I developed and deployed machine learning models to forecast sales and improve customer retention. Using SQL, Python, and R, I analyzed large datasets to identify trends and patterns. Additionally, I created interactive dashboards and reports using Tableau to communicate insights to stakeholders.

I understand the commercial/financial side of a SaaS company in addition to behavioral product/user data, which allows me to recommend fit-for-purpose solutions. My strong quantitative skills include experience with statistical methods, predictive modeling, and experiment design.

As a Lead Data Scientist at Contentful, I will lead analytics for one of your product lines while being 

In the section below, the generated cover letter is assessed by having GPT act as an Applicant Tracking System (ATS). The model provides a rating between 0 and 100 and gives feedback on how the cover letter can be improved. 

In [13]:
# This code is used to generate a rating and feedback for a cover letter based on how well it matches a job description
# Define a list of messages for the chatbot to generate a rating and feedback on the cover letter

rate_cover_letter = [
        {"role": "system", "content": 
         '''
        Pretend you are an ATS. You are going to give a rating between 0-100, for a cover letter and how well it matches with a job description. 
        A score of 100 means that the cover letter is perfect for ATS purposes. 0 means it will not make it through the ATS successfully.
        '''},
        {"role": "user", "content": 
         f''' Here is the job description: <{job_desc}>.
         Here is the cover letter: <{tailored_cover_letter}>.
         Generate the score. Explain what could be better. '''}

    
    ]



In the following section, the cover letter is critically evaluated. A temperature of 0.0 is used to limit creativity and randomness in the response, so that the same cover letter receives the same, or very nearly the same, score each time. 

In [14]:
# Use the OpenAI API to generate a rating and feedback on the cover letter

ATS_model = openai.ChatCompletion.create(model="gpt-3.5-turbo", 
                                                messages=rate_cover_letter,
                                                temperature = 0.0, #temperature set to 0 here to make the score as repeatable as possible for the same cover letter
                                                frequency_penalty = 0.5,
                                                presence_penalty = 0.5,
                                                n=1)
# Get the rating and feedback generated by the model
rating_and_changes = ATS_model.choices[0].message.content


In [18]:
# Print the rating and feedback generated by the chatbot model
print(f"ATS:\n{rating_and_changes}")  

ATS:
Score: 85

The cover letter is well-written and tailored to the job description. It highlights relevant skills and experience, and provides specific examples of achievements. The language used is clear and concise, and there are no grammatical errors.

However, there are a few areas for improvement. Firstly, the cover letter could have addressed the job description more directly by mentioning some of the specific responsibilities listed in the posting. Secondly, while the candidate's experience with SQL is mentioned, their practical experience with data modelling using dbt is not highlighted. Finally, it would have been beneficial to mention any experience with Looker or Redash as these are modern BI tools that are specifically mentioned in the job description.

Overall, this cover letter has a good chance of making it through an ATS successfully but could be improved by addressing some of the specific requirements listed in the job description more directly.
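
Because the rating arrives as free text, a program that wants to use the number (for example, to decide whether to regenerate the letter) has to extract it first. The sketch below shows one possible way to do that; the `extract_score` helper and its regular expression are assumptions for illustration and rely on the response containing the word "Score" followed by a number, as in the output above.

```python
import re


def extract_score(feedback):
    """Return the first 0-100 integer that follows the word 'score', or None."""
    match = re.search(r"score\D{0,10}(\d{1,3})", feedback, flags=re.IGNORECASE)
    if match is None:
        return None
    score = int(match.group(1))
    # Reject numbers outside the range requested in the system prompt
    return score if 0 <= score <= 100 else None


sample_feedback = (
    "Score: 85\n\n"
    "The cover letter is well-written and tailored to the job description."
)
print(extract_score(sample_feedback))  # prints 85
```

Returning `None` when no valid score is found lets the calling code decide whether to ask the model again instead of silently using a wrong number.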


Besides the programme above, there are other ways to use prompt engineering to generate a cover letter. For example, a cover letter template and a CV can be provided as the starting point, with prompts formulated so that the AI-generated cover letter follows the structure of the template and the content of the CV while still being tailored to a specific role. 
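
A minimal sketch of this template-based alternative is given below. It only builds the list of chat messages; the `build_template_messages` helper, the wording of the prompts, and the example template are all hypothetical and were not part of the experiment described above.

```python
def build_template_messages(template, cv_text, role_title, company):
    """Return chat messages asking the model to fill a fixed template from a CV."""
    system_prompt = (
        "You will act as an expert in cover letter writing. "
        "Follow the structure of the template provided by the user, "
        "and only use information that is supported by the CV."
    )
    user_prompt = (
        f"Template: <{template}>\n"
        f"CV: <{cv_text}>\n"
        f"The position title is <{role_title}>.\n"
        f"The company is <{company}>.\n"
        "Rewrite each paragraph of the template for this role, keeping the "
        "paragraph order and replacing every [placeholder]."
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


# Made-up template with bracketed placeholders for the model to fill in
example_template = (
    "Dear Hiring Manager,\n"
    "[Opening: role and motivation]\n"
    "[Body: most relevant recent experience]\n"
    "[Closing: call to action]"
)

messages = build_template_messages(
    example_template,
    "Data Scientist with Python and SQL experience.",
    "Lead Data Scientist",
    "Contentful",
)
print(messages[1]["content"])
```

The resulting list has the same shape as `tailor_cl_instructions`, so it could be passed as the `messages` argument of the same chat completion call.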


## References <a class="anchor" id="4_Ref"></a>

Ekin, S. 2023. Prompt Engineering For ChatGPT: A Quick Guide To Techniques, Tips, And Best Practices. 

Kasneci, E., Sessler, K., Küchemann, S., Bannert, M., Dementieva, D., Fischer, F., Gasser, U., Groh,
G., Günnemann, S., Hüllermeier, E., Krusche, S., Kutyniok, G., Michaeli, T., Nerdel, C., Pfeffer, J., Poquet, O., Sailer, M., Schmidt, A., Seidel, T., Stadler, M., Weller, J., Kuhn, J. & Kasneci, G. 2023. ChatGPT for good? On opportunities and challenges of large language models for education. Learning and Individual Differences, 103, 102274. 

Learn-Prompting. 2023. Learn Prompting [Online]. Learn Prompting org. Available:
https://learnprompting.org/docs/intro/ [Accessed 12/05 2023]. 

Liu, P., Yuan, W., Fu, J., Jiang, Z., Hayashi, H. & Neubig, G. 2023. Pre-train, Prompt, and Predict:
A Systematic Survey of Prompting Methods in Natural Language Processing. ACM Comput. Surv., 55, Article 195. 

Lo, L. S. 2023. The CLEAR path: A framework for enhancing information literacy through prompt
engineering. The Journal of Academic Librarianship, 49, 102720. 

OpenAI. 2023. Best practices for prompt engineering with OpenAI API [Online]. Available: https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-openai-api [Accessed 12/05 2023].

Shieh, J. 2023. Best practices for prompt engineering with OpenAI API [Online]. OpenAI API. Available: https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-openai-api [Accessed 09/05 2023]. 

Shin, T., Razeghi, Y., Logan IV, R. L., Wallace, E. & Singh, S. 2020. Autoprompt: Eliciting knowledge from language models with automatically generated prompts. arXiv preprint arXiv:2010.15980. 

Short, C. E. & Short, J. C. 2023. The artificially intelligent entrepreneur: ChatGPT, prompt engineering, and entrepreneurial rhetoric creation. Journal of Business Venturing Insights, 19, e00388. 

Wei, J., Bosma, M., Zhao, V. Y., Guu, K., Yu, A. W., Lester, B., Du, N., Dai, A. M. & Le, Q. V. 2021. Finetuned language models are zero-shot learners. arXiv preprint arXiv:2109.01652. 

White, J., Fu, Q., Hays, S., Sandborn, M., Olea, C., Gilbert, H., Elnashar, A., Spencer-Smith, J. & Schmidt, D. C. 2023. A prompt pattern catalog to enhance prompt engineering with ChatGPT. arXiv preprint arXiv:2302.11382. 


