# Baseline RAG example

This is a simple example of a baseline RAG application whose purpose is to answer questions about the fantasy series [Malazan Universe](https://malazan.fandom.com/wiki/Malazan_Wiki) created by Steven Erikson and Ian C. Esslemont.

First the example will show each step of a baseline RAG pipeline including **Indexing**, **Retrieval** and **Generation**. This is done in order to show the architecture without the abstraction provided by frameworks like LlamaIndex and LangChain.
Then a more "normal" example will be shown using LlamaIndex.

As a vector database, we will use [ChromaDB](https://docs.trychroma.com/), but this can easily be exchanged with other databases.

In this example, we will use the following technologies:

- OpenAI API
- ChromaDB
- LlamaIndex


### Setup libraries and environment


In [3]:
# %pip install chromadb llama-index-vector-stores-chroma



In [1]:
import os

import chromadb
import chromadb.utils.embedding_functions as embedding_functions
from chromadb import Settings
from IPython.display import Markdown, display
from llama_index.core import PromptTemplate, SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter
from openai import OpenAI, AzureOpenAI

import importlib
import util

# After editing util/helpers.py, uncomment the next line to reload it without restarting the kernel
# importlib.reload(util.helpers)
from util.helpers import create_and_save_md_files, get_malazan_pages, get_theoffice_pages

### Environment variables

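For this example you need an OpenAI API key. Go to [your API keys](https://platform.openai.com/api-keys) in the OpenAI console to generate one. If you use Azure OpenAI, as the client code below does, the key comes from your Azure OpenAI resource instead.

Then add the following to a `.env` file in the root of the project. The `AzureOpenAI` client below also reads `AZURE_OPENAI_ENDPOINT` from the environment.

```
OPENAI_API_KEY=<YOUR_KEY_HERE>
AZURE_OPENAI_ENDPOINT=<YOUR_ENDPOINT_HERE>
```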
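A missing variable tends to surface later as a confusing client error, so it can help to check the environment up front. The sketch below is one possible check; the helper name `missing_env_vars` and the `REQUIRED_VARS` tuple are illustrative and not part of the original notebook.

```python
import os

# Names read by the notebook's setup cell (assumed list, adjust to your setup).
REQUIRED_VARS = ("OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT")


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Report anything that is missing before creating any API client.
missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Run this after `load_dotenv()` so that values from the `.env` file are already loaded.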


In [2]:
from dotenv import load_dotenv

load_dotenv(override=True)
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
OPENAI_API_VERSION = os.getenv("OPENAI_API_VERSION")

openai_client = AzureOpenAI(
    api_key=OPENAI_API_KEY,  
    api_version="2024-05-01-preview", # https://learn.microsoft.com/en-us/azure/ai-services/openai/reference?WT.mc_id=AZ-MVP-5004796
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT")
)

In [3]:

#openai_client = AzureOpenAI(api_key=OPENAI_API_KEY)
#openai_ef = embedding_functions.OpenAIEmbeddingFunction(
#    api_key=OPENAI_API_KEY,
#    model_name="text-embedding-3-small"
#)

openai_ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key=OPENAI_API_KEY,
    model_name="text-embedding-ada-002",
    api_type="azure", 
    api_version="2024-05-01-preview"
)

chroma_client = chromadb.PersistentClient(
    path="./data/baseline-rag-pdf-docs/chromadb", settings=Settings(allow_reset=True))

## Indexing

In this step, we will index the documents in our vector database. This will allow us to retrieve the most relevant documents when we ask a question.

We will use ChromaDB as our vector database and `text-embedding-ada-002` from OpenAI as our embedding model.


#### Fetch and process saved documents

First we need to load the documents we saved earlier.

Then we will process the documents in order to add them to our vector database.
The `SimpleDirectoryReader` loads each file in `./data/docs` as one or more documents (for example, one per section of a markdown file).
Then each document is split into smaller chunks of text and each chunk is embedded using the OpenAI API.
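To make the idea of overlapping chunks concrete, here is a deliberately simplified sketch that splits on whitespace-separated words. It is not how `SentenceSplitter` works internally (that splitter counts tokens and tries to respect sentence boundaries); the function name `chunk_words` and its parameters are illustrative only.

```python
def chunk_words(text, chunk_size=8, overlap=2):
    """Split text into windows of `chunk_size` words that share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample = "one two three four five six seven eight nine ten"
for piece in chunk_words(sample, chunk_size=4, overlap=1):
    print(piece)
```

Each printed chunk starts with the last word of the previous one, which is the role `chunk_overlap` plays in the cell below: text cut at a chunk boundary still appears with some surrounding context.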


In [4]:
documents = SimpleDirectoryReader('./data/docs').load_data()
text_splitter = SentenceSplitter(chunk_size=512, chunk_overlap=20)

document_data = []

for document in documents:
    chunks = text_splitter.split_text(document.text)
    for idx, chunk in enumerate(chunks):
        embedding = openai_client.embeddings.create(
            input=chunk, model="text-embedding-ada-002")
        document_data.append({
            "id": f"{document.id_}-{idx}",
            "text": chunk,
            "metadata": document.metadata,
            "embedding": embedding.data[0].embedding
        })
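The loop above makes one embeddings request per chunk. The OpenAI embeddings endpoint also accepts a list of inputs, so the requests could be batched. The sketch below shows only the batching logic, with a stand-in `fake_embed_batch` function so that it runs offline; `batched`, `embed_in_batches` and the batch size are assumptions, not part of the original notebook.

```python
from typing import Callable, Iterator, List, Sequence


def batched(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def embed_in_batches(
    chunks: Sequence[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    batch_size: int = 64,
) -> List[List[float]]:
    """Embed all chunks, calling `embed_batch` once per batch and keeping order."""
    vectors: List[List[float]] = []
    for batch in batched(chunks, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors


def fake_embed_batch(texts: List[str]) -> List[List[float]]:
    """Offline stand-in for the API call: a one-dimensional 'embedding' per text."""
    return [[float(len(text))] for text in texts]


print(embed_in_batches(["a", "bb", "ccc"], fake_embed_batch, batch_size=2))

# With the notebook's client, embed_batch could look like this (untested assumption):
# lambda texts: [item.embedding for item in openai_client.embeddings.create(
#     input=texts, model="text-embedding-ada-002").data]
```

Fewer, larger requests usually reduce per-request overhead, but the batch size has to stay within the limits of the API and the deployment being used.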

In [5]:
print("embeddings dim", len(document_data[0]['embedding']))
print(document_data[0])

embeddings dim 1536
{'id': '8dbefb37-643c-4ee5-a4ae-57a430027364-0', 'text': '15\nMOD EL RISK\nModels are the wonder and, on occasion, the curse of the modern financial\nworld. They are used throughout the financial and corporate world for any\nnumber of purposes, and especially to put a number against the value, or the\nrisk, of investments and financial positions. They have become central to\nmany key corporate activities, including market, credit, and asset/liability\nrisk management.\nUnfortunately , models can be wrong in the sense of containing some\ninternal error , and they can also be misapplied, fed the wrong input\ninformation, and their results misinterpreted. As our dependence on models\nto understand a complex world has grown, model risks have grown too,\nincluding within risk management. In this chapter , we explain the\nimportance of model risk, using the example of market risk. In particular ,\nwe examine:\n•   How widespread the problem is\n•   Model error\n•   Implem

#### Add documents to ChromaDB


In [6]:
documents = [doc["text"] for doc in document_data]
embeddings = [doc["embedding"] for doc in document_data]
metadatas = [doc["metadata"] for doc in document_data]
ids = [doc["id"] for doc in document_data]

In [7]:
# Wipe the whole database so the index is rebuilt from scratch (permitted by allow_reset=True)
chroma_client.reset()
collection = chroma_client.get_or_create_collection(
    name="vtdat", metadata={"hnsw:space": "cosine"}, embedding_function=openai_ef)

In [8]:
collection.add(
    embeddings=embeddings,
    documents=documents,
    metadatas=metadatas,
    ids=ids)

In [9]:
import chromadb
from chromadb.config import Settings

# Initialize the PersistentClient again with the same path
chroma_client_load = chromadb.PersistentClient(
    path="./data/baseline-rag-pdf-docs/chromadb",
    settings=Settings(allow_reset=True)
)

# Get the existing collection by name
collection_load = chroma_client_load.get_collection(name="vtdat", embedding_function=openai_ef)

## Retrieval

In this step, we will retrieve the most relevant documents to a given question. We will use the vector database to retrieve the most similar documents to the question.

To do this, the collection's embedding function embeds the question with `text-embedding-ada-002` (**the same model used to index the documents**), and the vector database returns the most similar documents.

We will retrieve the top 5 documents based on the _cosine similarity_ between the question and the documents. Other metrics, such as squared L2 or inner product, can be used as well.

Change `cosine` to `l2` or `ip` when creating the collection above to try these out.
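The three metrics can be sketched as a brute-force search in plain Python. The distance definitions follow Chroma's documentation as I understand it (cosine and inner product are reported as `1 - similarity`, and `l2` is the squared Euclidean distance). Chroma itself uses an approximate HNSW index rather than a full scan, so this shows only what is being ranked, not how Chroma does it. All names below are illustrative.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity: 0 for the same direction, 1 for orthogonal vectors."""
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def squared_l2_distance(a, b):
    """Sum of squared coordinate differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def inner_product_distance(a, b):
    """1 - inner product; equals the cosine distance for unit-length vectors."""
    return 1.0 - dot(a, b)


def top_k(query, vectors, k, distance):
    """Return the indices of the k vectors closest to `query`."""
    ranked = sorted(range(len(vectors)), key=lambda i: distance(query, vectors[i]))
    return ranked[:k]


query_vec = [1.0, 0.0]
doc_vecs = [[0.0, 1.0], [0.6, 0.8], [1.0, 0.0]]  # all unit length

for metric in (cosine_distance, squared_l2_distance, inner_product_distance):
    print(metric.__name__, top_k(query_vec, doc_vecs, k=2, distance=metric))
```

For unit-length vectors the three metrics produce the same ranking, so the choice matters mainly when embeddings are not normalized.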


In [14]:
queries = [
    # "What do P, L, A, C and E stand for in John Ziman's PLACE acronym?",
    "explain some of the risk measures",
]

query = queries[-1]

In [10]:
collection_load

<chromadb.api.models.Collection.Collection at 0x7f2a83665e10>

In [16]:
result = collection_load.query(query_texts=[query], n_results=5)
context = result["documents"][0]
# Join the retrieved chunks with a visible separator and render them as markdown
formatted_text = "\n\n------------\n\n".join(context)
display(Markdown(formatted_text))

1, Risk Rating 3, and so on), they can help us make more rational in-class
comparative decisions. More ambitiously, if we can assign absolute
numbers to some risk factor (a 0.02 percent chance of default versus a 0.002
percent chance of default), then we can weigh one decision against another
with some precision.
If we can put an absolute cost or price on a risk (ideally using data from
markets where risks are traded or from some internal “cost of risk”
calculation based on economic capital), then we can make truly rational
economic decisions about assuming, managing, and transferring risks. At
this point, risk management decisions become fungible with many other
kinds of management decision in the running of an enterprise.
While assigning numbers to risk is incredibly useful for risk
management and risk transfer, it’s also potentially dangerous. Only some
kinds of numbers are truly comparable, but all kinds of numbers tempt us to
make comparisons. For example, using the face value or “notional amount”
of a bond to indicate the risk of a bond is a flawed approach. As we explain
in Chapter 7, a million-dollar position in a par value 10-year Treasury bond
does not represent at all the same amount of risk as a million-dollar position
in a 4-year par value Treasury bond.
Introducing sophisticated models to describe risk is one way to defuse
this problem, but this has its own dangers. Professionals in the financial
markets invented the VaR framework as a way of measuring and comparing
risk across many different markets. But the VaR measure works well as a
risk measure only for markets operating under normal conditions and only
over a short period, such as one trading day (Chapter 7). Potentially, it’s a
very poor and misleading measure of risk in abnormal markets, over longer
time periods, or for illiquid portfolios.
VaR, like all risk measures, depends on a robust control environment for
its integrity. In recent rogue-trading cases, hundreds of millions of dollars of
losses have been suffered by trading desks that had orders not to assume
VaR exposures of more than a few million dollars. The reason for the
discrepancy is often that the trading desks have found some way of
circumventing trading controls and suppressing risk measures.

------------

7
MEASURING MARKET RISK
Value-at-Risk, Expected Shortfall, and Similar Metrics
The measurement of market risk has evolved from simple naïve indicators,
such as the face value or “notional” amount of an individual security,
through more complex measures of price sensitivities such as the basis
point value or duration approach of a bond (Chapter 6) and various specific
measures of risk for derivatives (“the Greeks”), to relatively sophisticated
risk measures such as the latest value-at-risk (VaR) methodology for whole
portfolios of securities, and new risk metrics such as stress VaR, expected
shortfall, and scenario analysis. In this chapter we’ll chart this evolutionary
trajectory and spend some time examining the principles that lie behind
VaR and associated techniques to make clear the strengths and weaknesses
of the approaches in nonmathematical language.
The limitations of VaR as a risk metric have been understood for years,
but they played a significant role in obfuscating the risks run by the banking
industry in the build up to the 2007–2009 global financial crisis (GFC). The
result has been a series of attempts by regulators and the industry to both
improve VaR analysis and to reduce the financial industry’s reliance on VaR
numbers. The Fundamental Review of the Trading Book (FRTB; see
Chapter 3) addresses the shortcomings of the previous measurement
methodologies to determine market risk capital requirements. In this
chapter we look at “expected shortfall” approach that attempts to look
beyond the VaR number to summarize the risk in the tail of any loss
distribution, and we discuss how VaR fits with the many other risk
methodologies that make up a best practice approach to risk measurement,
including stress testing and scenario analysis—approaches we deal with in
depth in Chapter 16.
Crouhy, Michel, et al. The Essentials of Risk Management, Third Edition, McGraw-Hill Education, 2023. ProQuest Ebook Central, http://ebookcentral.proquest.com/lib/asb/detail.action?docID=30610510.
Created from asb on 2024-06-27 13:09:24.
Copyright © 2023. McGraw-Hill Education. All rights reserved.

------------

The VaR Controversy: A Quick Primer
Since the late 1990s, VaR has become the standard way to measure and
report market risk, and the methodology has also been extended to credit
risk (Chapter 11). VaR is a useful risk measure during normal market
conditions and offers a powerful way of assessing the overall market risk of
trading positions over a short horizon, such as one day or a two-week (i.e.,
10 trading days) period for regulatory purposes. In effect, the methodology
allows us to capture in a single number the multiple components of market
risk, such as curve risk, basis risk, and volatility risk.
However, each time there is turmoil in the world’s markets, VaR’s
limitations and other sophisticated market risk measures are revealed. The
reason is simple: VaR models are based on the assumption that key
parameters such as volatilities and correlations are stationary—that is, that
they do not change in value during the period in which the risk is measured.
This assumption is often proven wrong during extreme market conditions,
making VaR an unreliable measure of risk at exactly the moment that robust
risk analytics are most required.
Exceptional market shocks—such as the crisis in the world markets in
1998 that capsized the giant U.S. hedge fund Long-Term Capital
Management (LTCM), or the GFC of 2007–2009 that led to several bank
failures (such as Lehman Brothers in September 2008), and the 2020
COVID-19 pandemic—are usually accompanied by a drying up of market
liquidity and substantial trading losses.1 The risk these events pose can be
captured only by means of supplemental methodologies, so each crisis
reemphasizes the importance of using multiple risk-measurement tools,
including stress tests and scenario analyses, and of achieving the right blend
of quantitative rigor and qualitative assessment. Using a wide range of risk
measures helps because each approach has particular limitations and
strengths.
Just as VaR cannot easily capture the impact of disruptions in liquidity,
prices, volatility, and correlations, it also struggles to capture strong
non-linearities in risk of the kind seen in complex structured products—for
example, subprime CDOs. Using different types of risk analysis is essential.

------------

interest rate risk by changing the product mix and pricing in their business
strategy, in a way that is consistent with customer needs.
We can see from our discussion so far that ALM involves answering
three critical risk-related questions:
•   How much risk do you want to take? Answering this question is a
function of the risk appetite of the firm.
•   How much risk do we have now? Answering this question means
developing tools to measure the risks of the firm’s assets and
liabilities.
•   How do we move from where we are now to where we want to be?
Answering this question involves the execution of cost-effective risk
management strategies of the kind we outlined earlier.
In the next sections we describe some tools used by financial
institutions to measure their balance-sheet sensitivity to interest rate
changes. The first tools we’ll look at are simple approaches; they provide
partial, not very accurate, though useful answers to complicated questions.
Gap Analysis
Gap analysis is the approach used by most banks to measure interest rate
risk in their balance sheets. The gap is defined as the difference between the
amounts of rate-sensitive assets and rate-sensitive liabilities maturing or
repricing within a specific time period. In other words,
Gap = rate-sensitive assets (RSA) − rate-sensitive liabilities (RSL)
and the gap is measured for a specific time bucket.
A firm is said to have a positive gap within a specific time period when
its rate-sensitive assets exceed its rate-sensitive liabilities—that is, “assets
reprice before liabilities,” in the professional jargon. It describes the case in
which an institution’s short-term assets are funded by long-term liabilities.
An increase (decrease) in interest rates leads to an increase (decrease) in
NII.

------------

modeling—varies considerably depending on both the problem and
the sophistication of the approach. For example, in the recent past,
bank risk analysts might have analyzed the risk of an interest rate
position in terms of the effect of a single risk factor—for example, the
yield to maturity of government bonds, assuming that the yields for all
maturities are perfectly correlated. But this one-factor model approach
ignored the risk that the dynamic of the term structure of interest rates
is driven by more factors, for example, the forward rates. Now, leading
banks analyze their interest rate exposures using at least two or three
factors.
Further, the risk manager must also measure the influence of the
risk factors on each other, the statistical measure of which is the
“covariance.” Disentangling the effects of multiple risk factors and
quantifying the influence of each is a fairly complicated undertaking,
especially when covariance alters over time (i.e., is stochastic, in the
modeler’s terminology). There is often a distinct difference in the
behavior and relationship of risk factors during normal business
conditions and during stressful situations such as financial crises.
Under ordinary market conditions, the behavior of risk factors is
relatively less difficult to predict because it does not change
significantly in the short and medium term: future behavior can be
extrapolated, to some extent, from past performance. However, during
stressful conditions, the behavior of risk factors becomes far more
unpredictable, and past behavior may offer little help in predicting
future behavior. It’s at this point that statistically measurable risk
threatens to turn into the kind of unmeasurable uncertainty that we
discuss in Box 1-2.
The distribution can also be related to the institution’s stated “risk
appetite” for its various activities.

## Generation

In this step, we will generate an answer to the question using the retrieved documents as context. We will use the OpenAI API to generate the answer.
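Before calling the model, the retrieved chunks have to be merged into a single prompt. The sketch below does this with plain string formatting and adds a rough size limit so the prompt cannot grow without bound. The function name `build_prompt`, the `max_chars` budget and the simplified template wording are assumptions for illustration; the notebook itself uses LlamaIndex's `PromptTemplate` and has no such limit.

```python
def build_prompt(query, chunks, max_chars=6000, separator="\n\n"):
    """Join chunks (most relevant first) into a prompt, stopping at max_chars of context."""
    kept = []
    used = 0
    for chunk in chunks:
        cost = len(chunk) + (len(separator) if kept else 0)
        if used + cost > max_chars:
            break  # remaining chunks are less relevant, so drop them
        kept.append(chunk)
        used += cost
    context = separator.join(kept)
    return (
        "Context information is below.\n"
        "---------------------\n"
        f"{context}\n"
        "---------------------\n"
        "Given the context information and not prior knowledge, answer the query.\n"
        f"Query: {query}\n"
        "Answer: "
    )


print(build_prompt("What is gap analysis?", ["first chunk", "second chunk"], max_chars=15))
```

A character count is only a crude proxy for the model's token limit; a tokenizer-based budget would be more precise.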


In [17]:
prompt = PromptTemplate(
    """You are a helpful assistant that answers questions about the course material from "Financial Risk Management" using provided context.
    Context information is below.
    ---------------------
    {context}
    ---------------------
    Given the context information and not prior knowledge, answer the query. Always provide an answer in the English language.
    Query: {query}
    Answer: 
    """,
)


message = prompt.format(query=query, context="\n\n".join(context))
display(Markdown(f"{message}"))

You are a helpful assistant that answers questions about the course material from "Financial Risk Management" using provided context.
    Context information is below.
    ---------------------
    [the five retrieved passages shown in the Retrieval output above, joined by blank lines]
    ---------------------
    Given the context information and not prior knowledge, answer the query. Always provide an answer in the English language.
    Query: explain some of the risk measures
    Answer: 
    

In [18]:
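# Baseline without retrieval: only the raw question is sent, so the model answers from prior knowledge
stream = openai_client.chat.completions.create(
    messages=[{"role": "user", "content": query}],
    model="gpt4",
    stream=True)

output = ""
for chunk in stream:
    if chunk.choices:  # some streamed chunks carry an empty choices list
        output += chunk.choices[0].delta.content or ""
    display(Markdown(output), clear=True)  # re-render the answer accumulated so far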

1. Value at Risk (VaR): This is a risk metric that quantifies the level of financial risk within a firm, portfolio, or position over a specific time frame. It provides an estimate of the maximum loss that a portfolio of financial instruments is projected to endure over a selected period of time, given a set confidence level.

2. Conditional Value at Risk (CVaR): This risk measure estimates the expected losses that could occur beyond the VaR cutoff point. In other words, it looks at the severe end of losses in the distribution of possible outcomes. It provides a clearer picture of the potential loss in extreme circumstances.

3. Sharpe Ratio: Named after Nobel Laureate William F. Sharpe, this financial metric measures the performance of an investment compared to a risk-free asset, after adjusting for its risk. It is the measure of risk-adjusted return and is widely used by financial analysts and investors.

4. Treynor Ratio: Similar to the Sharpe ratio, the Treynor ratio compares returns earned in excess of that which could have been earned on a riskless investment per each unit of market risk in a portfolio.

5. Standard Deviation: This statistical measurement is used to measure the dispersion of a set of data from its average. In finance, standard deviation is applied to the annual rate of return of an investment, with a higher standard deviation projecting a significant degree of volatility.

6. Beta: Beta measures the volatility of an individual security/portfolio, in comparison to the broader market. A beta of 1 indicates that the investment's price will move with the market, while a beta lower/higher than 1 indicates that the investment will be less/more volatile than the market.

7. Sortino Ratio: This statistical tool differentiates harmful volatility from total overall volatility by taking the standard deviation of negative asset returns, and it helps investors to assess risk-adjusted return of their investment.

8. R-Squared: R-squared is a statistical measure that represents the percentage of a portfolio or security's movements that can be explained by movements in the benchmark index. It shows the level of an investment's risk associated with a specific market index. 

9. Drawdown: It is a measure of the decline from a historical peak in some variable, especially in portfolio value or investment returns. 

10. Stress Testing: This is a risk-oriented financial analysis that gauges how extreme market conditions might impact an investment or a portfolio. It includes various methods that evaluate the resilience of a portfolio in adverse situations.

In [24]:
stream = openai_client.chat.completions.create(
    messages=[{"role": "user", "content": message}],
    model="gpt4",
    stream=True)

output = ""
for chunk in stream:
    if chunk.choices:  # Check if the list is not empty
        output += chunk.choices[0].delta.content or ""
    display(Markdown(f"{output}"), clear=True)

Risk refers to the potential for an unpredictable or unexpected event or action that would negatively impact the individual or entity involved. In a financial context, risk is associated with changes in market prices, rates, or any event that can reduce the value of a security or a portfolio. It involves the unpredictability of losses and costs. An important distinction exists between risk and uncertainty, with risk referring to variability that can be quantified in terms of probabilities, while uncertainty involves variability that cannot be quantified at all.
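
The streaming loop above is repeated for every question in this notebook. A minimal sketch of how it could be factored into a helper follows. `collect_stream` and `fake_chunk` are hypothetical names, and the fake chunks only mimic the `choices[0].delta.content` shape that the loop above reads, so the sketch runs without an API call.

```python
from types import SimpleNamespace
from typing import Callable, Iterable, Optional


def collect_stream(chunks: Iterable, on_update: Optional[Callable[[str], None]] = None) -> str:
    """Concatenate the text deltas of a streamed chat completion.

    `on_update` (a hypothetical hook) receives the text accumulated so far
    after each chunk, e.g. to re-render it in the notebook.
    """
    output = ""
    for chunk in chunks:
        if chunk.choices:  # some chunks carry an empty `choices` list
            output += chunk.choices[0].delta.content or ""
        if on_update is not None:
            on_update(output)
    return output


def fake_chunk(text, has_choices=True):
    """Build a stand-in object with the same attribute shape as a stream chunk."""
    if not has_choices:
        return SimpleNamespace(choices=[])
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])


demo_stream = [
    fake_chunk(None, has_choices=False),  # chunk without choices
    fake_chunk("Risk is "),
    fake_chunk(None),                     # chunk whose delta has no content
    fake_chunk("uncertainty."),
]
demo_updates = []
demo_text = collect_stream(demo_stream, on_update=demo_updates.append)
print(demo_text)
```

In the notebook, the call would look like `collect_stream(stream, on_update=lambda text: display(Markdown(text), clear=True))`.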

## SlideGPT

In [9]:
queries = [
    #"Hvad står P, L, A, C og E for i PLACE-akronymet af John Ziman"
    #"Redegør for hovedtrækkene i sagen om COMPAS-algoritmen som den beskrives i Angwin et al. (2016)."
    #"Hvad er verifikation og falsifikation?"
    #"Redegør for hovedtrækkene i sagen om COMPAS-algoritmen som den beskrives i Angwin et al. (2016)."
    "2. COMPAS-algoritmen inddrager ikke direkte etnicitet, men den inddrager faktorer, der kan være korreleret med etnicitet (Angwin et al., 2016, 3–4). Har man som modelkonstruktør et ansvar for at undlade at inddrage faktorer, der kan korreleres med etnicitet eller andre følsomme forhold? Kan man det?"
]

query = queries[-1]

# Retrieve the 10 chunks closest to the query
result = collection.query(query_texts=[query], n_results=10)
context = result["documents"][0]

# Display the retrieved chunks, separated by horizontal rules
formatted_text = "\n\n------------\n\n".join(context)
display(Markdown(formatted_text))

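# Plain question-answering prompt. It is overwritten by the slide prompt defined next, so it has no effect here.
prompt = PromptTemplate(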
    """You are a helpful assistant that answers questions about the course material from "Philosophy of Computer Science (VtDat)" using provided context.
    Context information is below.
    ---------------------
    {context}
    ---------------------
    Given the context information and not prior knowledge, answer the query. Always provide an answer in the Danish language.
    Query: {query}
    Answer: 
    """,
)

prompt = PromptTemplate(
    """You are a helpful assistant that answers questions about the course material from "Philosophy of Computer Science (VtDat)" using provided context.
    Specifically, you will provide your answer such that it is generated as PowerPoint slide(s) content with a MAXIMUM of 3 PowerPoint slides. If helpful, please provide the answer using bullet points.
    Context information is below.
    ---------------------
    {context}
    ---------------------
    Given the context information and not prior knowledge, answer the query. Always provide an answer in the Danish language.
    Query: {query}
    Answer: 
    """,
)

message = prompt.format(query=query, context="\n\n".join(context))
display(Markdown(f"{message}"))

stream = openai_client.chat.completions.create(
    messages=[{"role": "user", "content": message}],
    model="gpt-4",
    stream=True)

output = ""
for chunk in stream:
    if chunk.choices:  # Check if the list is not empty
        output += chunk.choices[0].delta.content or ""
    display(Markdown(f"{output}"), clear=True)

Slide 1:
Title: Compas-algoritmen og etnicitet: 
- Modellens konstruktør har et ansvar for at undgå bias i algoritmer, hvilket inkluderer at undgå faktorer, der er stærkt korrelerede med følsomme forhold som etnicitet.
- Dette kan dog være en udfordring, da nogle faktorer, selvom de ikke direkte indbefatter etnicitet, stadig kan være tæt forbundet med det.

Slide 2:
Title: Udfordringer ved fjernelse af etnicitetskorrelerede faktorer:
- Fjerne sådanne faktorer kan reducere algoritmens effektivitet: dens evne til præcis forudsigelse kan falde.
- At ignorere disse faktorer kan også gøre algoritmen mindre pålidelig for forskellige etniske grupper.

Slide 3:
Title: Løsninger og Ansvar:
- Transparens omkring, hvilke data der bruges til at træne algoritmen og hvordan de anvendes, er afgørende.
- Det er vigtigt at overveje den samlede kontekst, hvori algoritmen skal bruges.
- En fair og retfærdig brug af algoritmer kræver en balance mellem nøjagtighed og etiske hensyn.

In [10]:
queries = [
    #"Hvad står P, L, A, C og E for i PLACE-akronymet af John Ziman"
    #"Redegør for hovedtrækkene i sagen om COMPAS-algoritmen som den beskrives i Angwin et al. (2016)."
    #"Hvad er verifikation og falsifikation?"
    #"Redegør for hovedtrækkene i sagen om COMPAS-algoritmen som den beskrives i Angwin et al. (2016)."
    "2. COMPAS-algoritmen inddrager ikke direkte etnicitet, men den inddrager faktorer, der kan være korreleret med etnicitet (Angwin et al., 2016, 3–4). Har man som modelkonstruktør et ansvar for at undlade at inddrage faktorer, der kan korreleres med etnicitet eller andre følsomme forhold? Kan man det? Hvilke etiske overvejelser bør man have. Inddrag gerne synspunkter fra nytte og pligtetik, og Nagels synsteseetik."
]

query = queries[-1]

# Retrieve the 10 chunks closest to the query
result = collection.query(query_texts=[query], n_results=10)
context = result["documents"][0]

# Display the retrieved chunks, separated by horizontal rules
formatted_text = "\n\n------------\n\n".join(context)
display(Markdown(formatted_text))

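# Plain question-answering prompt. It is overwritten by the slide prompt defined next, so it has no effect here.
prompt = PromptTemplate(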
    """You are a helpful assistant that answers questions about the course material from "Philosophy of Computer Science (VtDat)" using provided context.
    Context information is below.
    ---------------------
    {context}
    ---------------------
    Given the context information and not prior knowledge, answer the query. Always provide an answer in the Danish language.
    Query: {query}
    Answer: 
    """,
)

prompt = PromptTemplate(
    """You are a helpful assistant that answers questions about the course material from "Philosophy of Computer Science (VtDat)" using provided context.
    Specifically, you will provide your answer such that it is generated as PowerPoint slide(s) content with a MAXIMUM of 1 PowerPoint slide. If helpful, please provide the answer using bullet points.
    Context information is below.
    ---------------------
    {context}
    ---------------------
    Given the context information and not prior knowledge, answer the query. Always provide an answer in the Danish language.
    Query: {query}
    Answer: 
    """,
)

message = prompt.format(query=query, context="\n\n".join(context))
display(Markdown(f"{message}"))

stream = openai_client.chat.completions.create(
    messages=[{"role": "user", "content": message}],
    model="gpt-4",
    stream=True)

output = ""
for chunk in stream:
    if chunk.choices:  # Check if the list is not empty
        output += chunk.choices[0].delta.content or ""
    display(Markdown(f"{output}"), clear=True)

- Etnicitet inddrages indirekte i COMPAS-algoritmen gennem variabler, der korrelerer med etnicitet.
- Som modelkonstruktør har man ansvar for de faktorer, der inddrages i en algoritme.
- Følsomme faktorer som etnicitet kræver særlig overvejelse. Mens nogle mener, at det at ignorere sådanne faktorer kan være mere retfærdigt, kan det også reducere algoritmens nøjagtighed.
- Fra en nytteetisk perspektiv, ønsker man at gøre den største nytte. Hvis en algoritme bedre kan forudsige udfald ved at inddrage etniciteten, kan det argumenteres at det vil være mest gavnligt at inkludere det.
- En pligtetisk tilgang ville lægge vægt på fairness og lige behandling, hvilket kan argumentere imod at inkludere etnicitet.
- Nagels synsteseetik ville opfordre til at harmonisere disse to perspektiver og finde en balance.
- I sidste ende er etiske overvejelser afgørende ved design af algoritmer for at sikre, at de ikke forstærker eksisterende skævheder eller unfairness.
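
The SlideGPT cells differ only in the query and in the slide limit written into the prompt. A sketch of how that limit could become a parameter follows. It uses plain `str.format` rather than LlamaIndex's `PromptTemplate` so that it runs on its own. `SLIDE_PROMPT` and `build_slide_prompt` are hypothetical names, and the sample query and chunks are made up for illustration.

```python
from typing import List

# Hypothetical template modelled on the slide prompt used in this notebook
SLIDE_PROMPT = """You are a helpful assistant that answers questions about the course material from "Philosophy of Computer Science (VtDat)" using provided context.
Specifically, you will provide your answer as PowerPoint slide content with a MAXIMUM of {max_slides} PowerPoint slide(s). If helpful, please provide the answer using bullet points.
Context information is below.
---------------------
{context}
---------------------
Given the context information and not prior knowledge, answer the query. Always provide an answer in the Danish language.
Query: {query}
Answer:
"""


def build_slide_prompt(query: str, chunks: List[str], max_slides: int = 3) -> str:
    """Fill the slide prompt with the retrieved chunks and the slide limit."""
    if max_slides < 1:
        raise ValueError("max_slides must be at least 1")
    return SLIDE_PROMPT.format(
        max_slides=max_slides,
        context="\n\n".join(chunks),
        query=query,
    )


demo_message = build_slide_prompt(
    query="Hvad er algoritmisk bias?",
    chunks=["First retrieved chunk.", "Second retrieved chunk."],
    max_slides=1,
)
print(demo_message)
```

With such a helper, each SlideGPT cell would reduce to a retrieval call, one `build_slide_prompt(...)` call, and the streaming loop.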

In [24]:
queries = [
    #"Hvad står P, L, A, C og E for i PLACE-akronymet af John Ziman"
    #"Redegør for hovedtrækkene i sagen om COMPAS-algoritmen som den beskrives i Angwin et al. (2016)."
    #"Hvad er verifikation og falsifikation?"
    #"Redegør for hovedtrækkene i sagen om COMPAS-algoritmen som den beskrives i Angwin et al. (2016)."
    #"2. COMPAS-algoritmen inddrager ikke direkte etnicitet, men den inddrager faktorer, der kan være korreleret med etnicitet (Angwin et al., 2016, 3–4). Har man som modelkonstruktør et ansvar for at undlade at inddrage faktorer, der kan korreleres med etnicitet eller andre følsomme forhold? Kan man det? Hvilke etiske overvejelser bør man have. Inddrag synspunkter fra nytte og pligtetik (kategoriske og praktiske imperativ), og Nagels synsteseetik. Hvorfor har modelkonstruktøren et etisk ansvar?"
    #"Forklar begreberne group fairness og individuel fairness fra Friedler et al. (2021) og forklar hvorfor en algoritme (normalt) ikke kan udvise gruppe- og individuel fairness samtidig."
    "Analyser hvordan disse fairness-begreber (group fairness og individuel fairness) er på spil i COMPAS-casen"
]

query = queries[-1]

result = collection.query(query_texts=[query], n_results=10)
context = result["documents"][0]
#display(Markdown(f"------------\n\n{"\n\n------------\n\n".join(context)}"))

formatted_text = "\n\n------------\n\n".join(context)

# Display the formatted markdown
display(Markdown(f"{formatted_text}"))



prompt = PromptTemplate(
    """You are a helpful assistant that answers questions about the course material from "Philosophy of Computer Science (VtDat)" using provided context.
    Specifically, you will provide your answer such that it is generated as PowerPoint slide(s) content with 3 PowerPoint slides. If helpful, please provide the answer using bullet points.
    Context information is below.
    ---------------------
    {context}
    ---------------------
    Given the context information and not prior knowledge, answer the query. Always provide an answer in the Danish language.
    Query: {query}
    Answer: 
    """,
)

message = prompt.format(query=query, context="\n\n".join(context))
display(Markdown(f"{message}"))

stream = openai_client.chat.completions.create(
    messages=[{"role": "user", "content": message}],
    model="gpt-4",
    stream=True)

output = ""
for chunk in stream:
    if chunk.choices:  # Check if the list is not empty
        output += chunk.choices[0].delta.content or ""
    display(Markdown(f"{output}"), clear=True)

Slide 1:
Title: Analyse af Fairness i COMPAS-casen
- Introduktion til COMPAS
   - COMPAS-algoritmen bruges i USA's retssystem til bedømmelse af kriminelles risiko for recidivisme.
   - Et review af denne algoritme viste en systematisk skævhed i dens fejl. 
   - Personer af farvet hud, der ikke begår ny kriminalitet, har langt højere chance for fejlagtigt at få en høj risikoscore end hvide. Omvendt gælder det for hvide for fejlagtigt at få en lav score.
   
Slide 2:
Title: Gruppe Fairness og COMPAS
- Gruppe Fairness involverer at behandle grupper af mennesker ligeligt. 
- I COMPAS-casen:
   - Det forekommer at der er en skævhed mod farvede personer. 
   - Mulighed for gruppe-diskrimination, hvor folk med samme etniske baggrund behandles anderledes end andre grupper.
   
Slide 3:
Title: Individuel Fairness og COMPAS
- Individuel fairness handler om at behandle hver enkelt individ retfærdigt.
- I COMPAS-casen:
   - COMPAS-algoritmen kan potentielt behandle individuelle farvede personer uretfærdigt baseret på deres hudfarve. 
   - Potentialet for diskrimination på individuelt niveau selv hvis gruppefairness er opfyldt.
- Afsluttende tanker: Vi skal være omhyggelige med de data, vi bruger til at træne algoritmer, og være opmærksomme på potentialerne for skjulte forudsætninger og diskriminationer.

In [23]:
display(Markdown(f"{message}"))

You are a helpful assistant that answers questions about the course material from "Philosophy of Computer Science (VtDat)" using provided context.
    Specifically, you will provide your answer such that it is generated as PowerPoint slide(s) content with 3 PowerPoint slides. If helpful, please provide the answer using bullet points.
    Context information is below.
    ---------------------
    F or det første er de etiske argumenter
for at bruge prædiktive algoritmer typisk utilitaristiske; den grundlæggende påstand er, at
en algoritme som COMP AS på den eller anden måde øger den samlede nytte i samfundet og
det er på den baggrund algoritmen skal vurderes (der kan selvfølgelig også være ikke-etiske
begrundelser som kommercielle interesser eller andet). 620
F or andet skal være klar over, at prædiktive algoritmer altid diskriminerer, i den forstand
at din fremtidige adfærd estimeres på baggrund af den fortidige adfærd af den gruppe, som
du anses for at tilhøre. Hvis man fx vil estimere hvor længe en 25 årig, ikke-rygende, køben-
havnsk studerende lever, vil man se på, hvor længe folk i en referenceklasse af personer, der
minder om vores 25-årige studerende i gennemsnit har levet. Det er ganske simpelt sådan, 625
statistiske forudsigelser fungerer (i det mindste i den frekventistisk tolkning af sandsynlig-
hed, som er mest naturlig i denne sammenhæng). Spørgsmålet er altså ikke om algoritmerne
må diskriminere, men i højere grad hvilke faktorer, vi vil tillade, at der indgår i reference-
klassen. Hvilke træk ved dig, kan man det etisk set forsvare at diskriminere på baggrund
af, når du skal have en livsforsikring, ansøger om et job eller søger ind på en uddannelse? 630
F or at svare på det spørgsmål bliver vi for det tredje nødt til at inddrage begrebet ret-
færdighed.

også et socialt gode, idet den grundlæggende autonomi og integritet, beskyttelse af privat-
livet muliggør, ifølge Johnson er en afgørende forudsætning for at have et demokrati. Et
demokrati kan kun fungere, hvis der er plads til at tænke anderledes og mulighed for at gøre 560
og afprøve nye ting, som de siddende magthaverne måske ikke billiger. Uden en privatsfære,
hvor man kan tage selvstændig stilling til og indgå med andre borgere i en kritisk dialog om
magthavernes beslutninger kan man ikke have et demokrati. Hvis det argument er korrekt,
er spørgsmålet om privatliv vs. sikkerhed ikke blot et spørgsmål om individuelle rettigheder
mod almenvellets velfærd, men også et spørgsmål om forskellige former for generel velfærd. 565
Sat på spidsen kan man sige, at valgte også står mellem den sikkerhed man kan få med
totalitær overvågning og de fordele, et oplyst, inddragende demokratisk styre giver. Man
kan i længden ikke have begge dele.
10.7 Bias og algoritmisk transparens
Brugen af algoritmer giver også en anden etisk konflikt, nemlig en konflikt mellem effek- 570
tivitet og retfærdighed. Det har i århundrede været helt almindeligt at bruge datadrevne
sandsynlighedsberegninger til at støtte beslutninger, der går ud på at vurdere en form for
risiko. Hvis man fx tegner en ulykkesforsikring, vil forsikringsselskabet bruge deres viden
om, hvor hyppigt folk, der på relevante træk ligner dig, kommer til skade, til at fastsætte
forsikringspræmien.

En
algoritme som COMP AS, der har en større tilbøjelighed til at holde farvede i fængsel end
ikke-farvede, vil bidrage til fastholde denne strukturelle ulighed. En datadreven algoritme
afspejler virkeligheden, som den er, men hvis virkeligheden er racistisk, vil algoritmen også
blive det, og hvis algoritmen bliver brugt uden forbehold, kan den bidrage til at fastholde 685
historisk betingede diskriminerende strukturer. Der er derimod ikke noget, der tyder på, at
mænds kortere levealder er et udtryk for strukturel diskrimination. Så med andre ord er
det vigtigt, at se på den kontekst, en algoritme skal indgå i, når man overvejer, om den er
retfærdig.
Pandoras black box 690
I en traditionel statistisk model vil man have nogenlunde styr på de variable, modellen
inddrager. Det er dog værd at bemærke, at tingene selv med traditionelle algoritmer ik-
ke altid er så klare; COMP AS-algoritmen inddrager således ikke direkte etnicitet som en
variabel. Den inddrager til gengæld flere variable, der er tæt korrelerede med etnicitet, og
dermed indgår etnicitet indirekte i algoritmen. Det gør naturligvis tingene mere besværlige, 695
16Henrik Kragh Sørensen og Mikkel Willum Johansen (apr. 2022). „Invitation til de datalogiske fags
videnskabsteori“ . Lærebog til brug for undervisning ved Institut for Naturfagenes Didaktik, Københavns
Universitet. Under udarbejdelse.
Kapitel 10, version 151 (2021-06-05).

Algoritmen til forudsigelse af frafald inddrog således direkte elevernes etnicitet
som en faktor i beregningen, men hvad nu hvis algoritmen havde en bias så den systematisk
gav elever med en etnicitet en højre risikovurdering end elever med en anden?
Spørgsmålet om diskrimination i algoritmer er navnligt blevet diskuteret i forbindelse 595
med den såkaldte COMP AS-algoritme, der bruges i det amerikanske retsvæsen til vurdering
af kriminelles tilbagefaldsrisiko. En gennemgang af algoritmen viste imidlertid en systema-
tisk skævhed i de fejl, algoritmen lavede. Således havde farvede, der ikke senere begik ny
kriminalitet, en langt højere chance for fejlagtigt at få en høj risikoscore end hvide, og hvide
havde omvendt en højere chance for fejlagtigt at få en lav score end farvede (Angwin m.fl., 600
2016 ).
Hvis man imidlertid tager udgangspunkt i de to kategorier høj- og lav-risiko, og under-
14Henrik Kragh Sørensen og Mikkel Willum Johansen (apr. 2022). „Invitation til de datalogiske fags
videnskabsteori“ . Lærebog til brug for undervisning ved Institut for Naturfagenes Didaktik, Københavns
Universitet. Under udarbejdelse.
Kapitel 10, version 151 (2021-06-05).

Algoritmisk bias er et nært forbun-
det fænomen. Som vi så ovenfor havde Flu Trends svært ved at skel-ne in ﬂ uenza- og vintersæsonen, for-
di de to fænomener forekom samti-digt i det oprindelige træningssæt. Det var et ret uskyldigt eksempel, men hvad nu hvis de to fænomener, der var blevet sammenblandet, var noget mere følsomt som etnicitet og kriminalitet? I navnlig det ameri-kanske retssystem støtter man sig i stigende grad til statistisk trænede algoritmer, når man skal afgøre, hvorvidt fanger skal prøveløslades. Fangerne skal udfylde et spørge-skema, og på baggrund af, hvordan det er gået fanger, der tidligere har udfyldt samme skema, kan man udregne en score for, hvor sandsyn-ligt det er, at en person vil begå ny kriminalitet. Metoden er imidlertid blevet beskyldt for konsekvent at give sorte en højre kriminalitets-score end folk af andre hudfarver. Man må ikke dømme folk alene ud fra deres hudfarve – det er racisme – men noget tyder altså på, at al-goritmen i dette tilfælde til dels har lært at gætte, om folk er sorte eller ej, præcis som Flu Trends til dels havde lært at gætte på, om vinteren nærmede sig. 
Vi risikerer naturligvis altid at 
blive mødt af fordomme, men når fordommene bygges ind i model-ler, der fremstår som neutrale og objektive, kan de være sværere at adressere. Når man bruger big data er der derfor ikke bare gode metodologiske grunde til at være opmærksom på sammenblandin-gen mellem over ﬂ adisk relaterede 
fænomener. Der kan også være gode etiske grunde til det.

søger, hvor god algoritmen er til at komme med korrekte forudsigelser, er der paradoksalt
nok ingen forskel på farvede og hvide; blandt de, der blev bedømt som højrisiko, var pro-
centdelen af farvede, der var blevet fejlplaceret, nogenlunde lige så stor som procentdelen 605
af hvide, og tilsvarende for de, der blev vurderet som højrisiko. Det betyder med andre
ord, at risikoscoren er nogenlunde lige pålidelig, uanset hvilken etnicitet den person, den
bliver lavet for, har. Y dermere har det vist sig, at man ikke kan have lighed for begge mål
samtidig. Enten vil algoritmen som nu være lige pålidelig i sine forudsigelser uanset etnici-
tet, men til gengæld have en systematisk skævhed i forhold til enkeltpersoners risiko for at 610
blive fejlplaceret, eller også kan man gøre risikoen for at blive fejlplaceret lige stor uanset
etnicitet, hvilket til gengæld betyder, at algoritmen vil være mindre pålidelig for den ene
etniske gruppe end for den anden (for en stringent matematiske gennemgang, se Kleinberg,
Mullainathan og Raghavan, 2017 ). Det er selvfølgelig en interessant etisk udfordring!
Når man skal diskutere hvorvidt brugen af prædiktive algoritmer kan forsvares etisk er 615
der er en række ting, det kan være godt at få på plads.

Det er ret ukontroversielt. Det er straks mere kontroversielt, at man 575
bl.a. i det amerikanske retsvæsen siden starten af den 20. århundrede har brugt lignende
beregninger til at vurdere kriminelles risiko for at begå ny kriminalitet – og at den type
beregninger er blevet brugt i vurderingen strafudmåling, prøveløsladelser mv. (Carlson, A.
(2017). The Need for T ransparency in the Age of Predictive Sentencing Algorithms. Iowa
Law Review, 103(1), 303–329.). I de sidste årtier har lettere adgang til store datamængder 580
og fremkomsten af diverse machine learning teknikker gjort det lettere at udvikle den type
systemer og diverse former for prædiktive algoritmer bruges i dag på en lang række områder
fra reklamer til sortering af jobansøgninger.
Det er dog ikke helt uproblematisk at bruge prædiktive algoritmer. Som vi så tidligere, er
der en række epistemiske problemer forbundet med machine learning. Der er typisk en meget 585
direkte sammenhæng mellem epistemiske og etiske overvejelser; hvis de etiske overvejelser
inddrager utilititaristiske komponenter er det i høj grad væsentligt at forstå, hvor godt en
given algoritme i praksis virker. Ville den etiske vurdering af algoritmen til forudsigelse af
frafald i gymnasiet fx falde anderledes ud, hvis systemet kun havde en træfsikkerhed på
67%? Der er imidlertid også et anden og vanskeligere etisk aspekt specielt hvis man bruger 590
algoritmiske forudsigelser til at træffe afgørelser, der har betydning for enkeltindividers
muligheder.

at man på den måde kan komme til at bruge variable, man ikke direkte har indbygget i sin
algoritme.
Det problem bliver imidlertid kraftigt forstærket, hvis man træner en algoritme med
machine learning. Her har vi styr på de data, algoritmen er trænet med, og vi kan opstille
forskellige mål for, hvor godt den virker, men resten er en black box; vi aner reelt ikke 700
hvorfor den virker, og som vi berørte ovenfor, kan machine learning algoritmer med lethed
diskriminere langs variable, der er af en helt anden type den de data, vi har trænet algo-
ritmen med. Hvis vi vil sikre, at de algoritmer, vi udvikler, ikke diskriminerer på en etisk
kritisabel måde, bliver vi nødt til at gøre dem transparente. Det er imidlertid ikke altid let,
at gøre en blackbox transparent! 705
10.8 Systemisk etik
Efter at have set på de normative teorier vil vi nu vende tilbage til den deskriptive etik og
beskrive et interessant og i høj grad overset aspekt af etikken, nemlig det tilsyneladende
empiriske faktum, at vores etiske beslutninger kan påvirkes af selv små ændringer i vores
miljø. Som et enkelt eksempel testede John M. Darley (1938–2018) og Daniel Batson 710
teologistuderende i et eksperimentelt paradigme, hvor en forsøgsperson (typisk en stude-
rende) bliver bedt om at forberede et kort oplæg om den barmhjertige samaritaner, som
er en lignelse fra Det Nye T estamente, der i korte træk går ud på, at man ubetinget skal
hjælpe andre i nød, også selv om det er en vildt fremmet af et andet folk (Darley og Bat-
son, 1973 ).

Og hvad gør man så? Er
det mest retfærdige at kønsdiskriminere i dette tilfælde (og lade de fornuftige mænd betale 665
prisen) eller er det mest retfærdigt at undlade at kønsdiskriminere (og lade kvinderne betale
prisen)?
Det bringer os tilbage til COMP AS-algoritmen. I den population, hvor algoritmen er
blevet testet, har farvede fanger reelt en større risiko for at begå ny kriminalitet. Når
algoritmen diskriminerer på baggrund af etnicitet er det derfor blot et udtryk for, at den 670
har identificeret en informationsbærende dimension i datasættet. Og lige som ovenfor vil vi
have valget imellem at forhindre algoritmen i at bruge informationen, hvorved den vil blive
mindre effektiv i den forstand at dens overordnede fejlrate vil stige, eller at tillade den at
bruge informationen, hvorved den vil diskriminere enkeltindivider, således at farvede, der
ikke vil begå ny kriminalitet har en meget større risiko for at få en høj kriminalitetsscore 675
end ikke-farvede. Hvilket af de to scenarier forekommer dig mest retfærdigt, når du står
bag uvidenhedens slør?
Der er dog også væsentlige forskelle på de to cases. Specielt kan man argumentere for, at
den gennensnitligt høje kriminalitetsrate blandt farvede til dels skyldes strukturelle forhold
i det amerikanske samfund; pga. en generel diskrimination mod farvede har de sværere ved 680
at få uddannelser og job end ikke-farvede, og derfor vil de hyppigere ende i kriminalitet.

Begrebet retfærdighed befinder sig et sted imellem etik og politisk filosofi, idet
det angår mere generelle egenskaber samfund, som fordeling af goder, adgang til politisk
magt og det juridiske straffesystem. Præcis som det er tilfældet for etik findes der mange
forskellige retninger indenfor politisk filosofi, og mange forskellige ideer om, hvad retfærdig- 635
hed er. Det vil vi på ingen måde gå i dybden med her. Vi vil nøjes med at præsentere en
enkelt og meget operationel forestilling om retfærdighed, til dels med udgangspunkt i den
amerikanske filosof John Ra wls (1921–2002). Ra wls hævder (Rawls, 1972 ), at hvis sam-
fundet skal være retfærdigt, skal vi tage alle grundlæggende beslutninger om det indretning
og institutioner, mens vi befinder os bag et uvidenhedens slør , der skjuler, hvilken position 640
vi selv vil få i samfundet. Så man skal altså forestille sig, at man ikke ved, hvilket køn man
har, at man ikke ved, om man er rig eller fattig, at man ikke kender sin etnicitet mv., og
ud fra den position kan man beslutte, hvordan samfundet skal indrettes. I dette tilfælde:
Hvilke former for diskrimination vil vi tillade i prædiktive algoritmer? Det er let at forestille
sig (men det kan selvfølgelig diskuteres) at man i den situation vil vælge et samfund der 645
er uden kønsdiskrimination og ude diskrimination mode fx seksuelle og etniske minoriteter.
Så simpelt er det.
Og så alligevel ikke, for der kan være principielt forskellige former for diskrimination.
    ---------------------
    Given the context information and not prior knowledge, answer the query. Always provide an answer in the Danish language.
    Query: Forklar begreberne group fairness og individuel fairness fra Friedler et al. (2021) og forklar hvorfor en algoritme (normalt) ikke kan udvise gruppe- og individuel fairness samtidig.
    Answer: 
    

## Normal example using LlamaIndex

In this example, LlamaIndex handles the indexing and retrieval steps that were written out by hand above. This shows how compactly the same pipeline can be expressed with a framework.


In [None]:

#%pip install llama-index-embeddings-azure-openai
#%pip install llama-index-llms-azure-openai

In [15]:
import chromadb
from chromadb import Settings
from llama_index.llms.openai import OpenAI
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.ingestion import IngestionPipeline
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.vector_stores.chroma import ChromaVectorStore

from llama_index.llms.azure_openai import AzureOpenAI
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding

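# ChromaDB vector store, persisted on disk
chroma_client = chromadb.PersistentClient(
    path="./data/baseline-rag-pdf-docs/chromadb", settings=Settings(allow_reset=True))
# reset() deletes every collection in this store, so the index is rebuilt from scratch on each run
chroma_client.reset()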
collection = chroma_client.get_or_create_collection(
    name="VtDat", metadata={"hnsw:space": "cosine"})
vector_store = ChromaVectorStore(chroma_collection=collection)


llm = AzureOpenAI(
    model="gpt-4",
    deployment_name="gpt4",
    api_key=os.getenv("OPENAI_API_KEY"),
    # api_version=os.getenv("OPENAI_API_VERSION"),
    # API versions: https://learn.microsoft.com/en-us/azure/ai-services/openai/reference
    api_version="2024-05-01-preview",
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
)

# Azure OpenAI requires your own deployments for both the embedding model and the chat completion model
embed_model = AzureOpenAIEmbedding(
    model="text-embedding-ada-002",
    deployment_name="text-embedding-ada-002",
    api_key=os.getenv("OPENAI_API_KEY"),
    # api_version=os.getenv("OPENAI_API_VERSION"),
    # API versions: https://learn.microsoft.com/en-us/azure/ai-services/openai/reference
    api_version="2024-05-01-preview",
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
)

# Define the ingestion pipeline to add documents to vector store
pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(chunk_size=512, chunk_overlap=20),
        embed_model,
    ],
    vector_store=vector_store,
)

# Create an index on top of the vector store, using the same embedding model
index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store, embed_model=embed_model)
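
To make the `chunk_size` and `chunk_overlap` arguments of the ingestion pipeline more concrete, here is a deliberately simplified sketch of overlapping chunking. It splits on whitespace-separated words, whereas `SentenceSplitter` counts tokens and tries to keep sentences intact, so the two will not produce identical chunks. `chunk_words` is a hypothetical helper.

```python
from typing import List


def chunk_words(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split `text` into chunks of at most `chunk_size` words, where each chunk
    repeats the last `chunk_overlap` words of the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the window already reached the end of the text
    return chunks


demo_word_chunks = chunk_words("a b c d e f g h i j", chunk_size=4, chunk_overlap=1)
print(demo_word_chunks)
```

The overlap means that text near a chunk boundary appears in both neighbouring chunks, so a passage that straddles the boundary is less likely to be cut off from its context.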

In [16]:
embed_model

AzureOpenAIEmbedding(model_name='text-embedding-ada-002', embed_batch_size=10, callback_manager=<llama_index.core.callbacks.base.CallbackManager object at 0x7f2a83526dd0>, num_workers=None, additional_kwargs={}, api_key='<redacted>', api_base='https://api.openai.com/v1', api_version='2024-05-01-preview', max_retries=10, timeout=60.0, default_headers=None, reuse_client=True, dimensions=None, azure_endpoint='https://rag-test1.openai.azure.com/', azure_deployment='text-embedding-ada-002', azure_ad_token_provider=None, use_azure_ad=False)

In [17]:
# Fetch documents
documents = SimpleDirectoryReader('./data/docs').load_data()

# Run pipeline
pipeline.run(documents=documents)

print("Indexing complete")

Indexing complete


In [18]:
index.as_retriever()

<llama_index.core.indices.vector_store.retrievers.retriever.VectorIndexRetriever at 0x7f2a7a867710>
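
Conceptually, the retriever returned above embeds the query and returns the stored chunks whose embeddings are closest to it. The sketch below illustrates that idea with brute-force cosine similarity over toy 2-dimensional vectors. It is not how ChromaDB is implemented (the collection above is configured with an HNSW index, which is an approximate search structure). `top_k_by_cosine` and the example vectors are assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def top_k_by_cosine(
    query_vec: Sequence[float],
    doc_vecs: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the `k` document ids most similar to the query, best first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings; real embedding vectors have many more dimensions
toy_vectors = {
    "bias": [1.0, 0.0],
    "privacy": [0.0, 1.0],
    "fairness": [1.0, 1.0],
}
demo_hits = top_k_by_cosine([2.0, 0.1], toy_vectors, k=2)
print([doc_id for doc_id, _ in demo_hits])
```

Because cosine similarity ignores vector length, the query `[2.0, 0.1]` matches `"bias"` almost perfectly even though the two vectors differ in magnitude.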

#### Create base QueryEngine from LlamaIndex


In [19]:
import os

from dotenv import load_dotenv
from IPython.display import Markdown, display
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.prompts.default_prompt_selectors import \
    DEFAULT_TREE_SUMMARIZE_PROMPT_SEL
from llama_index.core.query_engine import (RouterQueryEngine,
                                           TransformQueryEngine)
from llama_index.core.response_synthesizers import TreeSummarize
from llama_index.core.selectors import LLMMultiSelector
from llama_index.core.tools import QueryEngineTool
from llama_index.llms.openai import OpenAI
# from llama_index.postprocessor.cohere_rerank import CohereRerank

from util.helpers import create_and_save_wiki_md_files, get_wiki_pages
from util.query_engines import VerboseHyDEQueryTransform, WeatherQueryEngine

In [30]:
from util.query_engines import VerboseHyDEQueryTransform

hyde = VerboseHyDEQueryTransform(include_original=True, verbose=True)

In [31]:
hyde

<util.query_engines.VerboseHyDEQueryTransform at 0x7f2a607e8a90>

In [22]:
query_engine = index.as_query_engine(llm=llm, verbose=True)

In [32]:
transformed_query_engine = TransformQueryEngine(
    query_engine=index.as_query_engine(llm=llm, verbose=True),
    query_transform=hyde,
)

In [33]:
transformed_query_engine

<llama_index.core.query_engine.transform_query_engine.TransformQueryEngine at 0x7f2a7a94fc10>
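
HyDE (Hypothetical Document Embeddings) changes what gets embedded at query time: the LLM first writes a hypothetical answer, and that passage is embedded instead of the bare question. With `include_original=True` the original query is kept alongside it. The sketch below shows only that data flow. `SimpleQueryBundle`, `hyde_transform`, and `fake_llm` are hypothetical stand-ins, not the `VerboseHyDEQueryTransform` implementation from `util.query_engines`.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SimpleQueryBundle:
    """Toy stand-in for a query bundle: the query plus the strings to embed."""
    query_str: str
    embedding_strs: list[str] = field(default_factory=list)


def hyde_transform(query: str, generate: Callable[[str], str],
                   include_original: bool = True) -> SimpleQueryBundle:
    """Ask the LLM for a hypothetical answer and mark it as the text to embed."""
    hypothetical_doc = generate(f"Write a passage that answers the question: {query}")
    embedding_strs = [hypothetical_doc]
    if include_original:
        embedding_strs.append(query)
    return SimpleQueryBundle(query_str=query, embedding_strs=embedding_strs)


def fake_llm(prompt: str) -> str:
    # Placeholder for a real completion call; always returns a canned passage
    return "Operational risk is the risk of loss from failed internal processes."


bundle = hyde_transform("what is operational risk?", fake_llm)
print(bundle.embedding_strs)
```

The intuition is that a generated passage tends to resemble the indexed chunks more closely than a short question does, so its embedding can land nearer to the relevant chunks.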

#### Or alternatively, create a CustomQueryEngine


In [57]:
from llama_index.core import PromptTemplate, QueryBundle, get_response_synthesizer
from llama_index.core.query_engine import CustomQueryEngine
from llama_index.core.response_synthesizers import BaseSynthesizer
from llama_index.core.retrievers import BaseRetriever


class MyRetriever(BaseRetriever):
    """Wrap another retriever and return at most `max_documents` nodes."""

    def __init__(self, base_retriever: BaseRetriever, max_documents: int = 10):
        super().__init__()
        self._base_retriever = base_retriever
        self._max_documents = max_documents

    def _retrieve(self, query_bundle: QueryBundle):
        # BaseRetriever.retrieve() calls this hook; delegate to the wrapped
        # retriever and keep only the first `max_documents` results
        nodes = self._base_retriever.retrieve(query_bundle)
        return nodes[:self._max_documents]


qa_prompt = PromptTemplate(
    """You are a helpful assistant that answers questions about the course material from "Philosophy of Computer Science (VtDat)" using provided context.
    Context information is below.
    ---------------------
    {context_str}
    ---------------------
    Given the context information and not prior knowledge, answer the query. Always provide an answer in the Danish language.
    Query: {query_str}
    Answer: 
    """,
)


class RAGQueryEngine(CustomQueryEngine):
    """RAG String Query Engine."""

    retriever: BaseRetriever
    response_synthesizer: BaseSynthesizer
    llm: OpenAI
    qa_prompt: PromptTemplate

    def custom_query(self, query_str: str):
        nodes = self.retriever.retrieve(query_str)
        context_str = "\n\n".join([n.node.get_content() for n in nodes])
        # Format the prompt once and reuse it for logging and for the completion
        prompt = self.qa_prompt.format(
            context_str=context_str, query_str=query_str)
        print("Prompt:\n\n", prompt)
        response = self.llm.complete(prompt)

        return str(response)


synthesizer = get_response_synthesizer(response_mode="compact")
query_engine = RAGQueryEngine(
    # Wrap the index retriever so that the result limit is applied
    retriever=MyRetriever(index.as_retriever(), max_documents=5),
    response_synthesizer=synthesizer,
    llm=llm,
    qa_prompt=qa_prompt,
)
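
Stripped of the framework classes, the `custom_query` method above performs three steps: take the top retrieved chunks, join them into one context string, and fill the prompt template. A minimal sketch with plain strings follows. `PROMPT_TEMPLATE`, `build_prompt`, and the sample chunks are hypothetical and only illustrate the flow.

```python
PROMPT_TEMPLATE = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)


def build_prompt(chunks: list[str], query_str: str, max_documents: int = 5) -> str:
    """Join the first `max_documents` chunks and fill the prompt template."""
    context_str = "\n\n".join(chunks[:max_documents])
    return PROMPT_TEMPLATE.format(context_str=context_str, query_str=query_str)


# Hypothetical retrieved chunks, already sorted by relevance
retrieved = ["chunk one", "chunk two", "chunk three"]
prompt = build_prompt(retrieved, "What is in the chunks?", max_documents=2)
print(prompt)
```

The resulting string is what would be passed to the LLM's completion call, so printing it is a cheap way to check which context the model actually sees.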

In [45]:
queries = [
    #"What are some of the made-up characters impersonated by Michael Scott? Give a oneliner description for each character.",
    #"Who is the character that is known for his 'That's what she said' jokes in The Office?",
    #"Which character loves cats?",
    #"A character has a heart attack in an episode of the show. Who is it?",
    #"What character lives proudly on a farm?",
    #"What happens to Kevin's Famous Chili when he brings it to the office?",
    #"Why did Michael Scott play the Savannah murder game?",
    #"How many marriages are in the show?",
    # English: "What do P, L, A, C and E stand for in John Ziman's PLACE acronym?"
    "Hvad dækker P, L, A, C og E over i PLACE-akronymet af John Ziman over?"
]

query = queries[-1]

response = query_engine.query(query)
display(Markdown(f"{response}"))

InvalidCollectionException: Collection e3406e00-da5e-41a3-b052-410e1e72d3a9 does not exist.

## Simplest RAG implementation using LlamaIndex


In [36]:
query = "what is operational risk?"

In [26]:
llm = AzureOpenAI(
    model="gpt-4",
    deployment_name="gpt4",
    api_key=os.getenv("OPENAI_API_KEY"),  
    api_version=os.getenv("OPENAI_API_VERSION"), # https://learn.microsoft.com/en-us/azure/ai-services/openai/reference?WT.mc_id=AZ-MVP-5004796
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT")
)

# You need to deploy your own embedding model as well as your own chat completion model
embed_model = AzureOpenAIEmbedding(
    model="text-embedding-ada-002",
    deployment_name="text-embedding-ada-002",
    api_key=os.getenv("OPENAI_API_KEY"),  
    api_version=os.getenv("OPENAI_API_VERSION"), # https://learn.microsoft.com/en-us/azure/ai-services/openai/reference?WT.mc_id=AZ-MVP-5004796
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT")
)

# Fetch documents
# documents = SimpleDirectoryReader('./data/baseline-rag-pdf-docs').load_data()
documents = SimpleDirectoryReader('./data/docs').load_data()

from llama_index.core import Settings

Settings.llm = llm
Settings.embed_model = embed_model

# build VectorStoreIndex that takes care of chunking documents
# and encoding chunks to embeddings for future retrieval
index = VectorStoreIndex.from_documents(documents=documents)

# The QueryEngine class is equipped with the generator
# and facilitates the retrieval and generation steps
query_engine = index.as_query_engine()
# query_engine = transformed_query_engine

# Use your Default RAG
response = query_engine.query(query)
display(Markdown(f"{response}"))

Risk refers to the potential for unexpected events or changes that could lead to negative outcomes. It is not synonymous with the size of a cost or loss. In everyday life, risk can be understood as the difference between a predictable or expected loss (a cost that has been budgeted for) and an unexpected cost (a catastrophic loss that is significantly beyond normal daily losses). The real risk lies in the variability of these costs, such as when they suddenly rise unexpectedly or when an unforeseen cost appears and disrupts planned expenditures.
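
Conceptually, the retrieval step of a vector index ranks the stored chunk embeddings by their similarity to the query embedding and returns the best matches. The sketch below illustrates that idea under loud assumptions: `embed` is a toy bag-of-words function standing in for a real embedding model such as `text-embedding-ada-002`, and the hypothetical `top_k` helper stands in for the vector store lookup.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real model returns dense vectors)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(chunk)), chunk) for chunk in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


# Hypothetical chunks standing in for the indexed documents
chunks = [
    "risk is the variability of unexpected costs",
    "the splitter cuts documents into chunks",
    "operational risk comes from failed processes",
]
results = top_k("what is operational risk", chunks, k=2)
for score, chunk in results:
    print(f"{score:.2f}  {chunk}")
```

The chunks returned here would then be placed into the prompt as context for the generation step.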

In [34]:
response = transformed_query_engine.query(query)
display(Markdown(f"{response}"))

Generating hypothetical doc


<b>[VerboseHyDEQueryTransform]</b> Generated hypothetical document: Risk refers to the potential for loss or damage when a certain action or inaction is taken. It is a concept that denotes the probability of certain hazards causing harm. Risks are typically identified within a Risk Assessment, and can be anything that could potentially interfere with the achievement of objectives. The concept of risk is often categorized into different types such as financial risk, operational risk, strategic risk, and hazard risk. It is important to note that risk is inherent in all actions and decisions, as every action or decision carries some degree of uncertainty and potential for negative outcomes. Risk can be managed and mitigated, but not completely eliminated.

-------------------



Risk refers to the potential for unexpected events or changes that could lead to negative outcomes. It is not synonymous with the size of a cost or a loss. Some costs in daily life are large but are not considered a risk because they are predictable and are already accounted for in plans. The real risk lies in the possibility that these costs will suddenly increase unexpectedly, or that an unforeseen cost will emerge, affecting the resources set aside for expected expenses. The risk lies in the variability of these costs.

In [37]:
query_bundle = hyde(query)
hyde_doc = query_bundle.embedding_strs[0]

Generating hypothetical doc


<b>[VerboseHyDEQueryTransform]</b> Generated hypothetical document: Operational risk is a type of risk that a company faces in its daily business operations. It is the risk of loss resulting from inadequate or failed internal processes, people and systems, or from external events. This can include a wide range of potential risks and losses, from minor incidents like accidental data entry errors, to major events like fraud or a natural disaster. Operational risk can also include legal risks, such as the risk of loss due to failure to comply with laws, regulations, or contractual obligations. It does not include strategic or reputational risk. Managing operational risk effectively is crucial for businesses to maintain profitability and reputation.

-------------------



In [68]:
query_bundle

QueryBundle(query_str='what is risk', image_path=None, custom_embedding_strs=['Risk refers to the potential for loss or damage when a particular action or inaction is taken. It is essentially the level of uncertainty one is willing to take in order to achieve a particular outcome. Risk can be associated with various aspects of life, including financial investments, business decisions, health issues, and even personal choices. It is often measured in terms of the likelihood of an adverse event occurring and the potential severity of the consequences. Risk can be managed and mitigated, but not completely eliminated. It is an inherent part of decision-making and can often lead to either rewards or losses.', 'what is risk'], embedding=None)

In [62]:
query_engine

<llama_index.core.query_engine.retriever_query_engine.RetrieverQueryEngine at 0x7fd2d6506fd0>

In [61]:
transformed_query_engine

<llama_index.core.query_engine.transform_query_engine.TransformQueryEngine at 0x7fd2f2779ed0>

In [37]:

from ironpdf import *

# Load existing PDF document
pdf = PdfDocument.FromFile("C:/Users/jach/Documents/JacobsDocs/KU/VtDat/Tekster/Uge3/Kuhn_objektivitet.pdf")

# Extract text from specific page in the document
page_text = pdf.ExtractTextFromPage(1)


LicensingException: IronPDF must be licensed for development.
Please receive a free trial key instantly to your email by visiting:
https://ironpdf.com/licensing/?utm_source=product
   at IronPdf.License.kpbbda()
   at IronPdf.PdfDocument.axjqme(IEnumerable`1 ralkye, TextExtractionOrder ralkyf)
   at IronPdf.PdfDocument.ExtractTextFromPages(IEnumerable`1 PageIndices, TextExtractionOrder Order)
   at IronPdf.PdfDocument.ExtractTextFromPage(Int32 PageIndex, TextExtractionOrder Order)
   at IronPdf.PdfDocument.ExtractTextFromPage(Int32 PageIndex)
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
   at System.Reflection.MethodBaseInvoker.InvokeDirectByRefWithFewArgs(Object obj, Span`1 copyOfArgs, BindingFlags invokeAttr)