# Fine-Tuning ChatGPT on Domain-Specific Content 
This program demonstrates how to fine-tune and use OpenAI's ChatGPT language model to answer questions in specific domain areas.  

The sample content used is drawn from the 2023 investment outlook summaries posted on the websites of 
Morgan Stanley [here](https://www.morganstanley.com/ideas/global-investment-strategy-outlook-2023), 
JPMorgan [here](https://www.jpmorgan.com/insights/research/market-outlook) and 
Goldman Sachs [here](https://www.goldmansachs.com/insights/pages/gs-research/macro-outlook-2023-this-cycle-is-different/report.pdf).  

The base ChatGPT model (GPT-3) is first fine-tuned on the sample content.  The new model is able to answer questions on the new content -- but only vaguely, at a very high level. 

Next, selected content is appended to each prompt as context before it is fed to the fine-tuned model. Specifically, an interface asks the user for a question about the banks' investment outlooks. The program compares the user's query with the domain content to identify the most relevant sections of text, then answers the question by combining the fine-tuned model's underlying capabilities with the specific context supplied in the prompt.

For a detailed discussion, see ["Leveraging ChatGPT for
Business and Organizational Purposes"](https://github.com/robjm16/domain_specific_ChatGPT/blob/main/DOMAIN_SPECIFIC_CHATGPT.md).

## 1. Install Libraries

In [None]:
! pip install openai 
! pip install transformers 
! pip install gradio
! pip install PyPDF2
! pip install python-docx
! pip install pandas

## 2. Imports

In [None]:
import docx
import pandas as pd
import numpy as np
import json 
import openai
import gradio as gr
import pickle
import ast
import os
from transformers import GPT2TokenizerFast
from sklearn.model_selection import train_test_split # only if using validation file 

## 3. Variables

In [None]:
USE_INTERFACE = True  # Change to False if you want to run the code without the Gradio interface, and instead see a single pre-supplied question 
filepath = 'investment_outlook_2023.docx' # Path to document containing domain content.  
# emb_filepath = 'PATH HERE'  # Path to document containing saved content embeddings, if applicable 
COMPLETIONS_MODEL = "text-davinci-003"  
api_key = 'YOUR OPENAI KEY HERE'
os.environ['API_KEY'] = api_key                                              
openai.api_key = os.environ["API_KEY"]
MODEL_NAME = "curie"
DOC_EMBEDDINGS_MODEL = f"text-search-{MODEL_NAME}-doc-001"
QUERY_EMBEDDINGS_MODEL = f"text-search-{MODEL_NAME}-query-001"
MAX_SECTION_LEN = 1100  # Token budget for retrieved context. The API limits total tokens (the prompt, including the question and domain-specific content, plus the answer) to 2048 tokens, or about 1500 words.
SEPARATOR = "\n* "  # Newline, asterisk and space; placed before each context section in the prompt.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
separator_len = len(tokenizer.tokenize(SEPARATOR))
COMPLETIONS_API_PARAMS = {
    # We use temperature of 0.0 because it gives the most predictable, factual answer.
    "temperature": 0.0,
    "max_tokens": 300,
    "model": COMPLETIONS_MODEL,  
    "stop":[".###"]
}
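
The token budget implied by these settings can be checked with simple arithmetic. The sketch below is illustrative only: `CONTEXT_LIMIT` repeats the 2048-token figure quoted in the comment above, while `header_tokens` and `question_tokens` are rough guesses that do not appear in the notebook.

```python
# Values taken from the Variables cell above.
MAX_SECTION_LEN = 1100    # tokens reserved for retrieved context
MAX_ANSWER_TOKENS = 300   # "max_tokens" in COMPLETIONS_API_PARAMS

# Assumptions for illustration only (not part of the notebook).
CONTEXT_LIMIT = 2048      # total-token limit quoted in the comment above
header_tokens = 50        # rough guess for the instruction header
question_tokens = 30      # rough guess for a short user question


def remaining_budget(limit, *parts):
    """Return how many tokens are left after subtracting each part."""
    return limit - sum(parts)


spare = remaining_budget(
    CONTEXT_LIMIT, MAX_SECTION_LEN, MAX_ANSWER_TOKENS, header_tokens, question_tokens
)
print(f"Spare tokens: {spare}")
```

If the spare figure goes negative after changing `MAX_SECTION_LEN` or `max_tokens`, the request would likely exceed the model's limit.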


## 4. Functions

In [None]:
def load_text(filepath):
  """
  Loads a Microsoft Word document and returns a DataFrame containing the text of each paragraph in the document.

  Input:
    filepath (str): the filepath to the Microsoft Word document.
    
  Returns:
    df (pandas.DataFrame): a DataFrame containing the 'content' column with the text of each paragraph in the document.
  """
  # Open the Word document
  doc = docx.Document(filepath)

  # Create an empty pandas DataFrame
  df = pd.DataFrame()

  # Iterate through the paragraphs in the document and add each to the df
  for i, p in enumerate(doc.paragraphs):

      # Add the paragraph text [and index to the DataFrame]    
      df.loc[i, 'content'] = p.text
      # df.loc[i, 'paragraph_index'] = i

  # Delete empty paragraphs
  df['content'] = df['content'].replace('', np.nan)
  df = df.dropna(axis=0, subset=['content']).reset_index(drop=True)

  return df
    
def count_tokens(row):
    """count the number of tokens in a string"""
    return len(tokenizer.encode(row))

def truncate_text(df):
    """
    Truncates the text in the 'content' column of the input DataFrame if the number of tokens 
    in the text exceeds a specified maximum number. It will set the truncated text and the 
    number of tokens in the 'content' and 'tokens' columns, respectively.

    Input:
    df (pandas.DataFrame): a DataFrame containing the 'content' column

    Returns:
    df (pandas.DataFrame): the input DataFrame with modified 'content' and 'tokens' columns.

    """
    for i in range(len(df)):
        if df['tokens'][i] > 590:
            text = df['content'][i]
            tokens = tokenizer.encode(text)
            truncated_tokens = tokens[:590]
            truncated_text = tokenizer.decode(truncated_tokens)
            df.at[i, 'content'] = truncated_text
            df.at[i, 'tokens'] = len(truncated_tokens)
    return df

 
def get_embedding(text, model): 
    """
    Generates an embedding for the given text using the specified OpenAI model.
    
    Args:
        text (str): The text for which to generate an embedding.
        model (str): The name of the OpenAI model to use for generating the embedding.
    
    Returns:
        list[float]: The embedding for the given text.
    """
    result = openai.Embedding.create(
      model=model,
      input=[text]
    )
    return result["data"][0]["embedding"]

def get_doc_embedding(text):
    """
    Generates an embedding for the given text using the OpenAI document embeddings model.
    
    Args:
        text (str): The text for which to generate an embedding.
    
    Returns:
        list[float]: The embedding for the given text.
    """
    return get_embedding(text, DOC_EMBEDDINGS_MODEL)

def get_query_embedding(text):
    """
    Generates an embedding for the given text using the OpenAI query embeddings model.
    
    Args:
        text (str): The text for which to generate an embedding.
    
    Returns:
        list[float]: The embedding for the given text.
    """
    return get_embedding(text, QUERY_EMBEDDINGS_MODEL)

def compute_doc_embeddings(df):
    """
    Generate embeddings for each row in a Pandas DataFrame using the OpenAI document embeddings model.
    
    Args:
        df (pandas.DataFrame): The DataFrame for which to generate embeddings.
    
    Returns:
        dict: A dictionary that maps each row index to the embedding vector for that row.
    """
    return {
        idx: get_doc_embedding(r.content.replace("\n", " ")) for idx, r in df.iterrows()  # r is one row of df
    }

def load_embeddings(fname): 
    """
    Load document embeddings and their keys from a CSV file.  Only if embeddings are pre-loaded.
    
    Args:
        fname (str): The path to the CSV file. The file must have exactly these named columns: 
            "title", "heading", "0", "1", ... up to the length of the embedding vectors.
    
    Returns:
        dict: A dictionary that maps tuples of the form (title, heading) to their embedding vectors.
    """
    
    df = pd.read_csv(fname, header=0)
    max_dim = max([int(c) for c in df.columns if c != "title" and c != "heading"])
    return {
           (r.title, r.heading): [r[str(i)] for i in range(max_dim + 1)] for _, r in df.iterrows()
    }

def vector_similarity(x, y):
    """
    Calculate the similarity between two vectors using dot product.
    
    Args:
        x (iterable): The first vector.
        y (iterable): The second vector.
    
    Returns:
        float: The dot product of the two vectors.
    """
    return np.dot(np.array(x), np.array(y))

def order_document_sections_by_query_similarity(query, contexts):
    """
    Find the query embedding for the given query, and compare it against all of the pre-calculated document embeddings
    to find the most relevant sections. 
    
    Args:
        query (str): The query for which to find relevant document sections.
        contexts (dict): A dictionary mapping row indices to their document embeddings.
      
    Returns:
        list: A list of tuples, each containing the similarity score and index of a document section, sorted in descending
        order of relevance.
    """
    query_embedding = get_query_embedding(query)
    print("GETTING DOC SIMILARIITIES.........")  # FOR TESTING PURPOSES
    document_similarities = sorted([(vector_similarity(query_embedding, doc_embedding), doc_index) \
                                    for doc_index, doc_embedding in contexts.items()], \
                                    reverse=True)
    print("FINISHED DOC SIMILARITIES..............")  # FOR TESTING PURPOSES
    
    return document_similarities
    
def construct_prompt(question, context_embeddings, df):
    """
    Construct a prompt for answering a question using the most relevant document sections.
    
    Args:
      question (str): The question to answer.
      context_embeddings (dict): A dictionary mapping row indices to their document embeddings.
      df (pandas.DataFrame): A DataFrame containing the document sections.
    
    Returns:
      str: The prompt, including the question and the relevant context.
    """
    most_relevant_document_sections = order_document_sections_by_query_similarity(question, context_embeddings)
    
    chosen_sections = []
    chosen_sections_len = 0
    chosen_sections_indexes = []
     
    for _, section_index in most_relevant_document_sections:
        # Add contexts until we run out of space.        
        document_section = df.loc[section_index]
        
        chosen_sections_len += document_section.tokens + separator_len  # Uses the "tokens" column
        if chosen_sections_len > MAX_SECTION_LEN:
            break
            
        chosen_sections.append(SEPARATOR + document_section.content.replace("\n", " "))  # Uses the "content" column
        chosen_sections_indexes.append(str(section_index))
            
    # Useful diagnostic information  -- FOR TESTING PURPOSES
    print(f"Selected {len(chosen_sections)} document sections:")
    print("\n".join(chosen_sections_indexes))
    
    header = """Given the following context, answer the question as truthfully as possible, and if the answer is not contained within the context below, say "Sorry, I don't know."\n\nContext:\n"""

    full_prompt = header + "".join(chosen_sections) + "\n\nQuestion: " + question + "\n\n###\n\n"

    print(full_prompt) # FOR TESTING PURPOSES

    return full_prompt
    

def answer_query_with_context(
    query,
    df,
    document_embeddings,
    show_prompt: bool = False):
    """
    Answer a query using relevant context from a DataFrame.
    
    Args:
        query (str): The query to answer.
        df (pandas.DataFrame): A DataFrame containing the document sections.
        document_embeddings (dict): A dictionary mapping row indices to their document embeddings.
        show_prompt (bool, optional): If `True`, print the prompt before generating a response.
    
    Returns:
        str: The generated response to the query.
    """
    prompt = construct_prompt(
        query,
        document_embeddings,
        df
    )

    if show_prompt:
        print(prompt)

    response = openai.Completion.create(
                prompt=prompt,
                **COMPLETIONS_API_PARAMS
            )

    return response["choices"][0]["text"].strip(" \n")

def get_questions(context):
    """
    Generate questions about a passage of text using the OpenAI API.

    Args:
        context (str): The text to write questions about.

    Returns:
        str: The generated questions, or an empty string if the API call fails.

    The function uses the "text-davinci-001" engine. The prompt asks the model
    to write questions based on the supplied text and ends with "1." so that
    the model continues a numbered list. temperature, frequency_penalty and
    presence_penalty are 0, top_p is 1, max_tokens is 257, and generation
    stops at the first blank line.
    """
    try:
        response = openai.Completion.create(
            engine="text-davinci-001",
            prompt=f"Write questions based on the text below\n\nText: {context}\n\nQuestions:\n1.",
            temperature=0,
            max_tokens=257,
            top_p=1,
            frequency_penalty=0,
            presence_penalty=0,
            stop=["\n\n"]
        )
        # print(response)  # FOR TESTING PURPOSES
        # print(response['choices'][0]['text'])  # FOR TESTING PURPOSES
        return response['choices'][0]['text']
    except Exception:
        # Any API failure yields an empty string so that DataFrame.apply can continue
        return ""

def get_answers(row):
    """
    Generate answers to the questions in one row of the DataFrame using the OpenAI API.

    Args:
        row (pandas.Series): A row with 'context' and 'questions' fields.

    Returns:
        str: The generated answers, or an empty string if the API call fails.

    The function uses the "text-davinci-001" engine. The prompt combines the
    context and the questions and ends with "1." so that the model continues a
    numbered list. temperature, frequency_penalty and presence_penalty are 0,
    top_p is 1 and max_tokens is 257. Exceptions are printed before returning.
    """
    try:
        response = openai.Completion.create(
            engine="text-davinci-001",
            prompt=f"Write questions based on the text below\n\nText: {row.context}\n\nQuestions:\n{row.questions}\n\nAnswers:\n1.",
            temperature=0,
            max_tokens=257,
            top_p=1,
            frequency_penalty=0,
            presence_penalty=0
        )
        return response['choices'][0]['text']
    except Exception as e:
        print (e)
        return ""
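
The retrieval step performed by `order_document_sections_by_query_similarity` and `construct_prompt` can be illustrated without calling the API. The following is a minimal sketch under stated assumptions: the three-dimensional vectors, token counts and the helper names `rank_sections` and `pack_sections` are hypothetical stand-ins, not part of the notebook.

```python
def dot(x, y):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def rank_sections(query_vec, section_vecs):
    """Return (score, index) pairs sorted from most to least similar."""
    return sorted(
        ((dot(query_vec, vec), idx) for idx, vec in section_vecs.items()),
        reverse=True,
    )


def pack_sections(ranked, token_counts, budget, separator_len=3):
    """Keep top-ranked sections until adding the next one would exceed the budget."""
    chosen, used = [], 0
    for _, idx in ranked:
        used += token_counts[idx] + separator_len
        if used > budget:
            break  # same behavior as the notebook: stop at the first overflow
        chosen.append(idx)
    return chosen


# Toy data (assumption): real embeddings have thousands of dimensions.
section_vecs = {0: [1.0, 0.0, 0.0], 1: [0.6, 0.8, 0.0], 2: [0.0, 0.0, 1.0]}
token_counts = {0: 40, 1: 50, 2: 30}
query_vec = [0.8, 0.6, 0.0]

ranked = rank_sections(query_vec, section_vecs)
chosen = pack_sections(ranked, token_counts, budget=100)
print(chosen)
```

Because the loop breaks at the first section that does not fit, a shorter but lower-ranked section further down the list is never considered. That keeps the logic simple at the cost of occasionally leaving part of the budget unused.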



## 5. Fine-Tune the Model

### 5A. Prepare the Master Dataframe
Load the content, count tokens, generate fine-tuning questions and truncate long content as needed 



In [None]:
# Load the text into dataframe 
df = load_text(filepath)
# print(df.head()) # For Testing Purposes.................

# Count the tokens; the counts are used to size sections and to truncate long text
df['tokens'] = df['content'].apply(count_tokens)

# print(df.head(10))   # For Testing Purposes................. 
# print(df['content'][3])   # For Testing Purposes.................

df['context'] = df['content']  # Leaving content column as is, for future reference 
df['questions']= df['context'].apply(get_questions) # Generates questions for each section
df['questions'] = "1." + df.questions  # Adds number to first question
# print(df[['questions']].values[0][0]) # For Testing Purposes.................

# Call the truncate_text function on the dataframe  
df = df.copy()    
df = truncate_text(df)
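
The truncation applied by `truncate_text` amounts to encoding the text, keeping the first N tokens and decoding them again. Here is a small sketch of that idea, assuming a whitespace tokenizer as a stand-in for the GPT-2 tokenizer used in the notebook; the helper name `truncate_to_max_tokens` is hypothetical.

```python
def truncate_to_max_tokens(text, max_tokens, encode, decode):
    """Return (possibly truncated text, token count) under the token cap."""
    tokens = encode(text)
    if len(tokens) <= max_tokens:
        return text, len(tokens)
    kept = tokens[:max_tokens]
    return decode(kept), len(kept)


# Stand-in tokenizer (assumption): whitespace splitting instead of GPT-2 BPE.
encode = str.split
decode = " ".join

text, n = truncate_to_max_tokens("one two three four five", 3, encode, decode)
print(text, n)
```

With the real tokenizer, `encode` and `decode` would be `tokenizer.encode` and `tokenizer.decode`, and a cut may fall in the middle of a word or sentence.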

Add answers (to the questions), paragraph numbers and embeddings of the content to the dataframe.  
NOTE:  May need to wait >1 min to run next cell in order to stay under free-of-charge usage limits.  
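
Rather than waiting manually, one option is to retry failed calls with an increasing delay. The sketch below is a generic helper and not part of the notebook: `call_with_retry` and `flaky` are hypothetical names, and in practice the `except` clause would likely be narrowed to the API's rate-limit error.

```python
import time


def call_with_retry(fn, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**attempt seconds, then retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


# Demo with a hypothetical function that fails twice, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("rate limited")
    return "ok"


delays = []
result = call_with_retry(flaky, sleep=delays.append)  # record delays instead of sleeping
print(result, delays)
```

Passing `sleep` as a parameter keeps the demo instant; the default `time.sleep` would actually pause between attempts.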


In [None]:
df['answers']= df.apply(get_answers, axis=1)
df['answers'] = "1." + df.answers  # Adds first number to answers
df = df.dropna().reset_index().drop('index',axis=1)
print(df[['answers']].values[0][0]) 
# df.tail()  # For testing purposes.....................
# Add paragraph number column for optional use later in generating adversarial context/answer
for i, row in enumerate(df.iterrows()):
    df.loc[i, "paragraph_number"] = i
# Add embeddings of "context" column text to df, to allow repeated use 
document_embeddings = compute_doc_embeddings(df)
df = df.assign(embeddings=df.index.map(document_embeddings))
df.head(3) # For testing purposes ...............................................

1. The main difference between the market conditions in 2022 and 2023 is that the market will be in a more stable environment with slower growth and lower inflation.
2. The outlook for income investing in 2023 is positive, with good opportunities for investors to find income-producing assets.
3. The main risks to the market outlook in 2023 are potential reversals of the current market trends.


| | content | tokens | context | questions | answers | paragraph_number | embeddings |
|---|---|---|---|---|---|---|---|
| 0 | Morgan Stanley says: In an environment of slo... | 143 | Morgan Stanley says: In an environment of slo... | 1. What is the main difference between the mar... | 1. The main difference between the market cond... | 0.0 | [0.028803126886487007, -0.007771830074489117, ... |
| 1 | Morgan Stanley says: Bonds—the biggest losers... | 124 | Morgan Stanley says: Bonds—the biggest losers... | 1. What are the global macro trends that Morga... | 1. Morgan Stanley believes that global macro t... | 1.0 | [0.010308968834578991, -0.00835722591727972, -... |
| 2 | Morgan Stanley says: Other key takeaways from ... | 142 | Morgan Stanley says: Other key takeaways from ... | 1. What is the main reason for the predicted d... | 1. The main reason for the predicted decline i... | 2.0 | [0.013922976329922676, -0.004230298567563295, ... |


In [None]:
# Save df
df.to_csv('invest_outlook_2023.csv', index=False)
df = pd.read_csv('invest_outlook_2023.csv')
# df.head(3) # For testing purposes ...............................................

### 5B. Fine-Tuning

Create new dataframe with one question per row, create prompts, further process prompts/completions per OpenAI instructions

In [None]:
# Create a new dataframe with one question and answer per row
new_rows = []
for i, row in df.iterrows():
    questions = row['questions'].split("\n")
    answers = row['answers'].split("\n")
    # zip stops at the shorter list, so unmatched questions or answers are dropped
    for question, answer in zip(questions, answers):
        new_rows.append({'paragraph_number': row['paragraph_number'],
                         'content': row['content'], 'tokens': row['tokens'],
                         'context': row['context'], 'embeddings': row['embeddings'],
                         'questions': question, 'answers': answer})
# DataFrame.append was removed in pandas 2.0, so build the frame from a list of rows
expanded_df = pd.DataFrame(new_rows, columns=df.columns)

# Label questions "original" to distinguish from optional adversarial examples, if added 
expanded_df["label"] = "original"
expanded_df.rename(columns={'answers': 'completion'}, inplace=True)

# Remove question/completion numbers
expanded_df['questions'] = expanded_df['questions'].str[2:].str.strip()
expanded_df['completion'] = expanded_df['completion'].str[2:].str.strip()

# Create prompts
expanded_df["prompt"] = expanded_df.apply(lambda row: f"Context: {row['context']}\nQuestion: {row['questions']} ", axis=1) 

# Add whitespace to start of completion and unique identifier to end, per OpenAI
expanded_df['completion'] = expanded_df['completion'].str.ljust(1)
expanded_df['prompt'] = expanded_df['prompt'] + "\n\n###\n\n"
expanded_df['completion'] = ' ' + expanded_df['completion'] + "###"

expanded_df.head(2) # For testing only .................................

| | content | tokens | context | questions | completion | paragraph_number | embeddings | label | prompt |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Morgan Stanley says: In an environment of slo... | 143 | Morgan Stanley says: In an environment of slo... | What is the main difference between the market... | The main difference between the market condit... | 0.0 | [0.028803126886487007, -0.007771830074489117, ... | original | Context: Morgan Stanley says: In an environme... |
| 1 | Morgan Stanley says: In an environment of slo... | 143 | Morgan Stanley says: In an environment of slo... | What is the outlook for income investing in 2023? | The outlook for income investing in 2023 is p... | 0.0 | [0.028803126886487007, -0.007771830074489117, ... | original | Context: Morgan Stanley says: In an environme... |
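
The cell above removes numbering by dropping the first two characters of each line, which works for single-digit numbers only. A regex-based variant is sketched below as one possible alternative; `split_numbered` and `pair_questions_answers` are hypothetical helpers and the sample strings are toy data.

```python
import re


def split_numbered(text):
    """Split a numbered list (one item per line) into items without the numbering."""
    items = []
    for line in text.split("\n"):
        line = line.strip()
        if line:
            # Remove a leading "<digits>." prefix and any spaces after it
            items.append(re.sub(r"^\d+\.\s*", "", line))
    return items


def pair_questions_answers(questions, answers):
    """Pair questions and answers by position; extras on either side are dropped."""
    return list(zip(split_numbered(questions), split_numbered(answers)))


# Toy input (assumption): note the two-digit item number on the last line.
questions = "1. What is the outlook for bonds?\n2. What about oil?\n10. And gold?"
answers = "1. Positive.\n2. Brent near $90.\n10. About $1,860."
pairs = pair_questions_answers(questions, answers)
for question, answer in pairs:
    print(question, "->", answer)
```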


Extract the columns needed for fine-tuning, write a JSONL file (one JSON record per line) and fine-tune via the API.

In [None]:
# Extract columns needed for fine tuning  
temp_df=expanded_df[["prompt","completion"]].copy()

# Writes prompts/completions to jsonl (single lines)
df_train_list = temp_df.to_dict(orient='records')
with open('df_train3.jsonl', 'w', encoding='utf-8') as outfile:
    for row in df_train_list:
        outfile.write(json.dumps(row, ensure_ascii=False))
        outfile.write('\n')
        # print(row)  # For testing purposes............................

df_train_list[0] # For testing purposes.......................................

# # If using validation file
# # Split the expanded dataframe into training and testing sets
# train_df, test_df = train_test_split(temp_df, test_size=0.2)

# train_df = train_df.reset_index(drop=True)
# test_df = test_df.reset_index(drop=True)

# Create (optional) adversarial questions; train dataset only

# for i in range(1, len(train_df), 15):
#     if i < len(train_df):
#         # Randomly select a question from another paragraph
#         adversarial_question = train_df[train_df['paragraph_number'] != train_df.loc[i, 'paragraph_number']].sample(1)
#         # Replace the question with the adversarial value
#         train_df.loc[i, 'completion'] = adversarial_question['completion'].item()
#         train_df.loc[i, 'label'] = "adversarial question"
#     else:
#         break

# for i in range(7, len(train_df), 15):
#     if i < len(train_df):
#         # Randomly select a context from another paragraph
#         adversarial_context = train_df[train_df['paragraph_number'] != train_df.loc[i, 'paragraph_number']].sample(1)
#         # Replace the context with the adversarial value
#         train_df.loc[i, 'context'] = adversarial_context['context'].item()
#         train_df.loc[i, 'label'] = "adversarial context"
#     else:
#         break

# If separate train and test datasets 
# df_train_list = train_df.to_dict(orient='records')
# with open('df_train2.jsonl', 'w', encoding='utf-8') as outfile:
#     for row in df_train_list:
#         outfile.write(json.dumps(row, ensure_ascii=False))
#         outfile.write('\n')
#         print(row)  # For testing purposes

# df_test_list = test_df.to_dict(orient='records')
# with open('df_test2.jsonl', 'w', encoding='utf-8') as outfile:
#     for row in df_test_list:
#         outfile.write(json.dumps(row, ensure_ascii=False))
#         outfile.write('\n')
#         print(row)  # For testing purposes

{'prompt': 'Context: Morgan Stanley says:  In an environment of slow growth, lower inflation and new monetary policies, expect 2023 to have upside for bonds, defensive stocks and emerging markets. Investors may find themselves a bit whiplashed in 2023 as inflation and some of this year’s other dominant market trends fully reverse themselves, according to the 2023 Strategy Outlook from Morgan Stanley Research.  “For markets, this presents a\xa0very\xa0different backdrop than 2022, which was marked by resilient growth, high inflation and hawkish policy,” says Andrew Sheets, Chief Cross-Asset Strategist for Morgan Stanley Research. “Overall, 2023 will be a good year for income investing.” \nQuestion: What is the main difference between the market conditions in 2022 and 2023? \n\n###\n\n',
 'completion': ' The main difference between the market conditions in 2022 and 2023 is that the market will be in a more stable environment with slower growth and lower inflation.###'}
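
Before uploading, it can help to confirm that every record follows the conventions shown in the sample above: the prompt ends with the `\n\n###\n\n` separator, and the completion starts with a space and ends with `###`. The checker below is an illustrative sketch; `check_record` and `check_jsonl` are hypothetical helpers, not OpenAI tooling.

```python
import json

PROMPT_SUFFIX = "\n\n###\n\n"   # separator appended to every prompt in the notebook
COMPLETION_SUFFIX = "###"       # marker appended to every completion


def check_record(record):
    """Return a list of formatting problems found in one prompt/completion pair."""
    problems = []
    if not record["prompt"].endswith(PROMPT_SUFFIX):
        problems.append("prompt missing separator")
    if not record["completion"].startswith(" "):
        problems.append("completion missing leading space")
    if not record["completion"].endswith(COMPLETION_SUFFIX):
        problems.append("completion missing stop sequence")
    return problems


def check_jsonl(lines):
    """Parse JSONL lines and map line number to problems, for bad records only."""
    report = {}
    for number, line in enumerate(lines, start=1):
        problems = check_record(json.loads(line))
        if problems:
            report[number] = problems
    return report


# Toy records (assumption): one well-formed, one malformed.
good = json.dumps({"prompt": "Context: ...\nQuestion: ... " + PROMPT_SUFFIX,
                   "completion": " An answer.###"})
bad = json.dumps({"prompt": "Question: ...", "completion": "An answer."})
report = check_jsonl([good, bad])
print(report)
```

To check the real file, the lines could be read with `open('df_train3.jsonl', encoding='utf-8')` and passed to `check_jsonl`.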

In [None]:
# Prepare the data in the JSONL file for fine-tuning
!openai tools fine_tunes.prepare_data -f df_train3.jsonl -q 

In [None]:
# Create the fine tuning 
!openai api fine_tunes.create -t "df_train3.jsonl" 

Load the fine-tuned model and test on a few questions

In [None]:
# Test
question = ['Why is inflation so high?', 'What is the outlook for oil?', 'What does JPMorgan think about 2023?', 'What is the view on emerging markets?']
ft_model = 'ada:ft-openai-2021-07-30-12-26-20'  # Example fine-tune name only; the OpenAI API returns the actual model name after fine-tuning
# To test on the test dataset instead, pass df_test_list[0]['prompt'] as the prompt (it already ends with the separator)
result = openai.Completion.create(model=ft_model, prompt=question[1] + '\n\n###\n\n', max_tokens=120, temperature=0, stop=[".###"])
print(result['choices'][0]['text'])

 The outlook for oil is positive


### 5C. Fine-Tuned Model with Context Added to Prompts  
After testing, it is clear that fine-tuning on domain-specific content enables the new model to answer questions on that new knowledge base, but the answers are less robust than when specific context is injected into the prompt.  

Here, specific context is added to the prompts of the fine tuned model. Embeddings of the question and the knowledge base are used to extract the contexts that most directly fit the question, and those contexts are added to the prompt.   
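
The notebook scores each section with a plain dot product (`vector_similarity`). For unit-length vectors the dot product equals cosine similarity, so the ranking is the same either way; whether a given embedding model returns unit-length vectors is an assumption worth checking. The sketch below uses toy two-dimensional vectors to show the relationship.

```python
import math


def dot(x, y):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def normalize(x):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(x, x))
    return [a / norm for a in x]


def cosine(x, y):
    """Cosine similarity: dot product divided by the product of the lengths."""
    return dot(x, y) / (math.sqrt(dot(x, x)) * math.sqrt(dot(y, y)))


a, b = [3.0, 4.0], [4.0, 3.0]
print(dot(a, b))                          # raw dot product grows with vector length
print(cosine(a, b))                       # length-independent similarity
print(dot(normalize(a), normalize(b)))    # matches the cosine up to rounding
```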

If picking up a previously fine-tuned model, change the model name in the completion parameters at the top.

In [None]:
df=df.reset_index()
df_excerpt = df[['content', 'tokens']].copy()
df['embeddings'] = df['embeddings'].apply(lambda x: [float(i) for i in ast.literal_eval(x)])  # Parse the string read from CSV back into a list of floats
# Create dictionary of embeddings, keyed by row index of df
doc_embeddings = df.set_index('index').to_dict()['embeddings']
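
The `ast.literal_eval` step is needed because saving to CSV stores each embedding list as text. The round trip can be reproduced with the standard library alone; the sketch below uses a hypothetical two-row table with three-dimensional vectors in place of the real DataFrame.

```python
import ast
import csv
import io

# Hypothetical two-row table with a short embedding per row.
rows = [
    {"index": 0, "embeddings": [0.1, -0.2, 0.3]},
    {"index": 1, "embeddings": [0.4, 0.5, -0.6]},
]

# Write to CSV: each list is stored as its string representation.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["index", "embeddings"])
writer.writeheader()
writer.writerows(rows)

# Read back: every field is now a string and must be parsed.
buffer.seek(0)
doc_embeddings = {}
for row in csv.DictReader(buffer):
    vector = [float(v) for v in ast.literal_eval(row["embeddings"])]
    doc_embeddings[int(row["index"])] = vector

print(doc_embeddings[1])
```

`ast.literal_eval` only evaluates Python literals, which makes it a safer choice than `eval` for parsing data read from a file.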

Launch the Q/A interface.  
NOTE: Additional information may be printed for validation purposes

In [None]:
if USE_INTERFACE:
    demo = gr.Interface(
    fn=lambda query: answer_query_with_context(query, df_excerpt, doc_embeddings),
    inputs=gr.Textbox(lines=2,  label="Query", placeholder="Type Question Here..."),
    outputs=gr.Textbox(lines=2, label="Answer"),
    description="Example of a domain-specific chatbot, using ChatGPT with supplemental content and fine-tuning.<br>\
                  Here, the content relates to the investment outlook for 2023, according to Morgan Stanley, JPMorgan and Goldman Sachs.<br>\
                  Sample queries: What is Goldman's outlook for inflation? What about the bond market? What does JPMorgan think about 2023?<br>\
                  NOTE: High-level demo only. Supplemental content used here limited to about 30 paragraphs, due to limits on free-of-charge usage of ChatGPT.<br>\
                  Far more robust domain-specific responses are possible.",
    title="Fine-Tuned Domain-Specific Chatbot",)
    # Launch the interface
    demo.launch(debug=True) # To show errors in colab notebook, set debug=True in launch()
else:
    query = "What is Goldman's outlook for inflation?"
    # doc_embeddings is the dictionary built in the previous cell
    answer = answer_query_with_context(query, df_excerpt, doc_embeddings)
    print(answer)

Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().
Note: opening Chrome Inspector may crash demo inside Colab notebooks.

To create a public link, set `share=True` in `launch()`.



GETTING DOC SIMILARIITIES.........
FINISHED DOC SIMILARITIES..............
Selected 4 document sections:
20
19
2
9
Given the following context, answer the question as truthfully as possible, and if the answer is not contained within the context below, say "Sorry, I don't know."

Context:

* JPMorgan says:  Commodity price forecasts 2023. Commodity price forecasts for 2023, with Brent averaging $90 per barrel, WTI averaging $83 and gold averaging $1,860 in the fourth quarter of 2023. There are strong reasons to expect a relatively robust 1.3 million barrels per day (mbd) of oil demand growth next year, despite expectations for the global economy to expand at a sub-par 1.5% pace in 2023. There is still substantial room for a cyclical rebound, driven by a continued normalization of demand for mobility fuels like gasoline, diesel and jet fuel to pre-COVID levels.“Our forecast of a $90 Brent in 2023 centers on the view that the OPEC+ alliance (Organization of the Petroleum Exporting Countri