# Chat with PDF - test, evaluation and experimentation

# Develop a flow

For background on developing a DAG flow, see the prompt flow quick start:
https://microsoft.github.io/promptflow/how-to-guides/develop-a-dag-flow/quick-start.html





## Environment variables
```
AZURE_SUBSCRIPTION_ID=<subscription ID>
AZURE_RESOURCE_GROUP=<resource group name>
AZUREAI_PROJECT_NAME=<Azure AI Studio project name>
AZURE_OPENAI_CONNECTION_NAME=<Azure OpenAI connection name>

AZURE_OPENAI_ENDPOINT=<Azure OpenAI endpoint URL>
AZURE_OPENAI_CHAT_DEPLOYMENT=<Azure OpenAI chat deployment name>
AZURE_OPENAI_API_VERSION=<Azure OpenAI API version>
AZURE_OPENAI_API_KEY=<Azure OpenAI key>

RESOURCE_GROUP=<resource group name, same value as AZURE_RESOURCE_GROUP>
SUBSCRIPTION_ID=<subscription ID, same value as AZURE_SUBSCRIPTION_ID>
AZUREML_WORKSPACE_NAME=<Azure ML workspace name>
TENANTID=<service principal tenant ID>
AZURE_CLIENT_ID=<service principal client ID>
AZURE_TENANT_ID=<service principal tenant ID, same value as TENANTID>
AZURE_CLIENT_SECRET=<service principal secret>

OPENAI_API_TYPE=azure
OPENAI_API_BASE=https://open-ai-olonok.openai.azure.com/
OPENAI_API_KEY=""
OPENAI_API_VERSION=2024-02-15-preview
EMBEDDING_MODEL_DEPLOYMENT_NAME=text-embedding-ada-002
CHAT_MODEL_DEPLOYMENT_NAME=gpt-4
PROMPT_TOKEN_LIMIT=3000
MAX_COMPLETION_TOKENS=1024
CHUNK_SIZE=256
CHUNK_OVERLAP=64
VERBOSE=False


```
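Before running the notebook, it can help to confirm that the values the connection cell reads are actually set. The sketch below is illustrative only: the helper name `missing_vars` and the choice of which variables count as required are assumptions, not part of the sample.

```python
import os

# Names taken from the list above; treating exactly these as required is an assumption.
REQUIRED_VARS = [
    "OPENAI_API_BASE",
    "OPENAI_API_KEY",
    "OPENAI_API_VERSION",
    "EMBEDDING_MODEL_DEPLOYMENT_NAME",
    "CHAT_MODEL_DEPLOYMENT_NAME",
]


def missing_vars(env, required=REQUIRED_VARS):
    """Return the required names that are absent or blank in the mapping `env`."""
    return [name for name in required if not (env.get(name) or "").strip()]


# Check the current process environment and report anything that is unset.
missing = missing_vars(os.environ)
if missing:
    print("Missing or empty variables:", ", ".join(missing))
else:
    print("All required variables are set.")
```

The helper accepts any mapping, so the dictionary returned by `dotenv_values("../.env")` can be passed in place of `os.environ`. Note that an empty value such as `OPENAI_API_KEY=""` is reported as missing.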

### Requirements

```

python-dotenv
bs4
azure-identity
azure-search-documents==11.4.0
promptflow-tracing==1.11.0
promptflow-evals==0.3.0
jinja2
aiohttp
azure-ai-ml==1.16.0
promptflow[azure]==1.11.0
promptflow-tools==1.4.0
promptflow-rag==0.1.0


# The following dependencies are required for provisioning

# openai SDK
openai==1.13.3

# azure dependencies
azure-core==1.30.1
azure-mgmt-authorization==4.0.0
azure-mgmt-resource==23.0.1
azure-mgmt-search==9.1.0
azure-mgmt-cognitiveservices==13.5.0

# utilities
omegaconf-argparse==1.0.1
omegaconf==2.3.0
pydantic>=2.6

```

In [None]:
%pip install -r requirements.txt

## 1. Create connections
In prompt flow, a connection manages the settings that control how your application talks to external services, such as Azure OpenAI.

In [6]:
from dotenv import load_dotenv, dotenv_values

load_dotenv("../.env")

True

In [8]:
config = dotenv_values("../.env")

In [28]:
import promptflow
from promptflow.client import PFClient
pf = PFClient()

# List all the available connections
for c in pf.connections.list():
    print(c.name + " (" + c.type + ")")

open_ai_connection (AzureOpenAI)


The chat_with_pdf flow requires a connection named "open_ai_connection".

In [29]:
# Create the connection if it does not exist yet
from promptflow.entities import AzureOpenAIConnection, OpenAIConnection

conn_name = "open_ai_connection"
try:
    conn = pf.connections.get(name=conn_name)
    print("using existing connection")
except Exception:
    # The lookup failed (typically because the connection does not exist yet), so create it.
    # Follow https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=web-portal to create an Azure OpenAI resource.
    connection = AzureOpenAIConnection(
        name=conn_name,
        api_key=config.get("OPENAI_API_KEY"),
        api_base=config.get("OPENAI_API_BASE"),
        api_type="azure",
        api_version=config.get("OPENAI_API_VERSION"),
    )

    # Use this instead if you have an existing OpenAI account
    # connection = OpenAIConnection(
    #     name=conn_name,
    #     api_key="<user-input>",
    # )
    conn = pf.connections.create_or_update(connection)
    print("successfully created connection")

print(conn)

using existing connection
auth_mode: key
name: open_ai_connection
module: promptflow.connections
created_date: '2024-11-16T10:59:01.985404'
last_modified_date: '2024-11-16T10:59:01.985404'
type: azure_open_ai
api_key: '******'
api_base: https://open-ai-olonok.openai.azure.com/
api_type: azure
api_version: 2024-02-15-preview



## 2. Test the flow

**Note**: this sample uses pre-downloaded PDFs and a prebuilt FAISS index to speed up execution.
Remove the `.pdfs` and `.index` folders to start a fresh run.
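A minimal sketch of clearing those folders from Python follows. The folder paths are taken from the comments in the next cell. The helper name `clear_caches` is hypothetical, and the paths are assumed to be relative to the directory the notebook runs in.

```python
import shutil
from pathlib import Path

# Paths from the comments in the next cell; their location relative to the
# notebook's working directory is an assumption.
CACHE_DIRS = [Path("./chat_with_pdf/.pdfs"), Path("./chat_with_pdf/.index")]


def clear_caches(dirs=CACHE_DIRS):
    """Delete each cache folder that exists and return the folders removed."""
    removed = []
    for folder in dirs:
        if folder.is_dir():
            shutil.rmtree(folder)
            removed.append(folder)
    return removed
```

Calling `clear_caches()` before the test cell should force the flow to download the PDF and rebuild the index, so expect that run to take longer.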

In [30]:
# ./chat_with_pdf/.pdfs/ stores predownloaded PDFs
# ./chat_with_pdf/.index/ stores prebuilt index files

output = pf.flows.test(
    "./flow.dag.yaml",
    inputs={
        "chat_history": [],
        "pdf_url": "https://arxiv.org/pdf/1810.04805.pdf",
        "question": "what are  Unsupervised Feature-based Approaches?",
    },
)


Prompt flow service has started...
2024-11-16 19:24:55 +0000   34368 execution.flow     INFO     Start executing nodes in thread pool mode.
2024-11-16 19:24:55 +0000   34368 execution.flow     INFO     Start to run 6 nodes with concurrency level 16.
2024-11-16 19:24:55 +0000   34368 execution.flow     INFO     Executing node setup_env. node run id: bdaa42f8-0244-4011-8baa-36b328e92891_setup_env_0
2024-11-16 19:24:55 +0000   34368 execution.flow     INFO     Node setup_env completes.
2024-11-16 19:24:55 +0000   34368 execution.flow     INFO     Executing node download_tool. node run id: bdaa42f8-0244-4011-8baa-36b328e92891_download_tool_0
2024-11-16 19:24:55 +0000   34368 execution.flow     INFO     Executing node rewrite_question_tool. node run id: bdaa42f8-0244-4011-8baa-36b328e92891_rewrite_question_tool_0
2024-11-16 19:24:55 +0000   34368 execution.flow     INFO     [download_tool in line 0 (index starts from 0)] stdout> Pdf already exists in D:\repos2\rag-data-openai-python-promptf



2024-11-16 19:24:56 +0000   34368 execution.flow     INFO     [rewrite_question_tool in line 0 (index starts from 0)] stdout> Rewritten question: What are unsupervised feature-based approaches?




2024-11-16 19:24:56 +0000   34368 execution.flow     INFO     Node rewrite_question_tool completes.
2024-11-16 19:24:56 +0000   34368 execution.flow     INFO     Executing node find_context_tool. node run id: bdaa42f8-0244-4011-8baa-36b328e92891_find_context_tool_0
2024-11-16 19:24:57 +0000   34368 execution.flow     INFO     Node find_context_tool completes.
2024-11-16 19:24:57 +0000   34368 execution.flow     INFO     Executing node qna_tool. node run id: bdaa42f8-0244-4011-8baa-36b328e92891_qna_tool_0
2024-11-16 19:25:01 +0000   34368 execution.flow     INFO     Node qna_tool completes.


In [31]:
print(output)

{'answer': "Les approches basées sur les caractéristiques non supervisées impliquent l'utilisation de paramètres de plongement de mots pré-entraînés à partir de textes non étiquetés. Ces représentations sont ensuite ajustées pour une tâche en aval spécifique. Les avantages de ces approches incluent la nécessité de paramètres moins nombreux à apprendre à partir de zéro.", 'context': ['ngio. 2010.\nWord representations: A simple and general method\nfor semi-supervised learning. In Proceedings of the\n48th Annual Meeting of the Association for Compu-\ntational Linguistics , ACL ’10, pages 384–394.\nAshish Vaswani, Noam Shazeer, Niki Parmar, Jakob\nUszkoreit, Llion Jones, Aidan N Gomez, Lukasz\nKaiser, and Illia Polosukhin. 2017. Attention is all\nyou need. In Advances in Neural Information Pro-\ncessing Systems , pages 6000–6010.\nPascal Vincent, Hugo Larochelle, Yoshua Bengio, and\nPierre-Antoine Manzagol. 2008. Extracting and\ncomposing robust features with denoising autoen-\ncoders. In

## 3. Run the flow with a data file

In [32]:
flow_path = "."
data_path = "./data/bert-paper-qna-3-line.jsonl"

config_2k_context = {
    "EMBEDDING_MODEL_DEPLOYMENT_NAME": "text-embedding-ada-002",
    "CHAT_MODEL_DEPLOYMENT_NAME": "gpt-4",  # change this to the name of your deployment if you're using Azure OpenAI
    "PROMPT_TOKEN_LIMIT": 2000,
    "MAX_COMPLETION_TOKENS": 256,
    "VERBOSE": True,
    "CHUNK_SIZE": 1024,
    "CHUNK_OVERLAP": 64,
}

column_mapping = {
    "question": "${data.question}",
    "pdf_url": "${data.pdf_url}",
    "chat_history": "${data.chat_history}",
    "config": config_2k_context,
}
run_2k_context = pf.run(flow=flow_path, data=data_path, column_mapping=column_mapping)
pf.stream(run_2k_context)

print(run_2k_context)

[2024-11-16 19:30:02 +0000][promptflow._sdk._orchestrator.run_submitter][INFO] - Submitting run chat_with_pdf_variant_0_20241116_193001_832639, log path: C:\Users\User\.promptflow\.runs\chat_with_pdf_variant_0_20241116_193001_832639\logs.txt


Prompt flow service has started...
You can view the traces in local from http://127.0.0.1:23333/v1.0/ui/traces/?#run=chat_with_pdf_variant_0_20241116_193001_832639
2024-11-16 19:30:02 +0000   34368 execution.bulk     INFO     Current thread is not main thread, skip signal handler registration in BatchEngine.
2024-11-16 19:30:02 +0000   34368 execution.bulk     INFO     Current system's available memory is 37806.98828125MB, memory consumption of current process is 307.45703125MB, estimated available worker count is 37806.98828125/307.45703125 = 122
2024-11-16 19:30:02 +0000   34368 execution.bulk     INFO     Set process count to 3 by taking the minimum value among the factors of {'default_worker_count': 4, 'row_count': 3, 'estimated_worker_count_based_on_memory_usage': 122}.
2024-11-16 19:30:10 +0000   34368 execution.bulk     INFO     Process name(SpawnProcess-24)-Process id(36792)-Line number(0) start execution.
2024-11-16 19:30:10 +0000   34368 execution.bulk     INFO     Process na

In [16]:
pf.get_details(run_2k_context)

| | inputs.question | inputs.pdf_url | inputs.chat_history | inputs.config | inputs.line_number | outputs.answer | outputs.context |
|---|---|---|---|---|---|---|---|
| 0 | What is the main difference between BERT and p... | https://arxiv.org/pdf/1810.04805.pdf | [] | {'EMBEDDING_MODEL_DEPLOYMENT_NAME': 'text-embe... | 0 | The main difference between BERT and earlier l... | [BERT: Pre-training of Deep Bidirectional Tran... |
| 1 | What is the size of the vocabulary used by BERT? | https://arxiv.org/pdf/1810.04805.pdf | [] | {'EMBEDDING_MODEL_DEPLOYMENT_NAME': 'text-embe... | 1 | I don't know. | [E (L=12, H=768, A=12, Total Param-\neters=110... |
| 2 | 论文写作中论文引言有什么注意事项？ | https://grs.pku.edu.cn/docs/2018-03/2018030108... | [] | {'EMBEDDING_MODEL_DEPLOYMENT_NAME': 'text-embe... | 2 | 在论文写作中，论文引言的要点应该包括：简洁地提出研究问题或研究主题，阐明研究的重要性和必要性... | [须言之成理，论据可靠 ，严格遵循本学科国际通行的学术规范。 内容包括：第 一\n章引言（或... |


## 4. Evaluate the "groundedness"
The `eval-groundedness` flow uses a ChatGPT/GPT-4 model to grade how well each answer generated by the chat-with-pdf flow is supported by the retrieved context.

In [19]:
eval_groundedness_flow_path = "../evaluation/eval-groundedness/"
eval_groundedness_2k_context = pf.run(
    flow=eval_groundedness_flow_path,
    run=run_2k_context,
    column_mapping={
        "question": "${run.inputs.question}",
        "answer": "${run.outputs.answer}",
        "context": "${run.outputs.context}",
    },
    display_name="eval_groundedness_2k_context",
)
pf.stream(eval_groundedness_2k_context)

print(eval_groundedness_2k_context)

Prompt flow service has started...
You can view the traces in local from http://127.0.0.1:23333/v1.0/ui/traces/?#run=eval_groundedness_variant_0_20241116_120943_262113


[2024-11-16 12:09:43 +0000][promptflow._sdk._orchestrator.run_submitter][INFO] - Submitting run eval_groundedness_variant_0_20241116_120943_262113, log path: C:\Users\User\.promptflow\.runs\eval_groundedness_variant_0_20241116_120943_262113\logs.txt


2024-11-16 12:09:44 +0000   34368 execution.bulk     INFO     Current thread is not main thread, skip signal handler registration in BatchEngine.
2024-11-16 12:09:44 +0000   34368 execution.bulk     INFO     Current system's available memory is 39408.26171875MB, memory consumption of current process is 290.78125MB, estimated available worker count is 39408.26171875/290.78125 = 135
2024-11-16 12:09:44 +0000   34368 execution.bulk     INFO     Set process count to 3 by taking the minimum value among the factors of {'default_worker_count': 4, 'row_count': 3, 'estimated_worker_count_based_on_memory_usage': 135}.
2024-11-16 12:09:52 +0000   34368 execution.bulk     INFO     Process name(SpawnProcess-6)-Process id(35100)-Line number(0) start execution.
2024-11-16 12:09:52 +0000   34368 execution.bulk     INFO     Process name(SpawnProcess-8)-Process id(17748)-Line number(1) start execution.
2024-11-16 12:09:52 +0000   34368 execution.bulk     INFO     Process name(SpawnProcess-7)-Process id(

In [20]:
pf.get_details(eval_groundedness_2k_context)

| | inputs.question | inputs.answer | inputs.context | inputs.line_number | outputs.groundedness |
|---|---|---|---|---|---|
| 0 | What is the main difference between BERT and p... | The main difference between BERT and earlier l... | ['BERT: Pre-training of Deep Bidirectional Tra... | 0 | 10 |
| 1 | What is the size of the vocabulary used by BERT? | I don't know. | ['E (L=12, H=768, A=12, Total Param-\neters=11... | 1 | 1 |
| 2 | 论文写作中论文引言有什么注意事项？ | 在论文写作中，论文引言的要点应该包括：简洁地提出研究问题或研究主题，阐明研究的重要性和必要性... | ['须言之成理，论据可靠 ，严格遵循本学科国际通行的学术规范。 内容包括：第 一\n章引言（... | 2 | 3 |


In [21]:
pf.get_metrics(eval_groundedness_2k_context)

{'groundedness': 4.666666666666667}
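The reported metric matches the arithmetic mean of the three per-row scores in the details table above. The check below is a sketch under that assumption. It does not call prompt flow, and the way the evaluation flow actually aggregates scores is not shown in this walkthrough.

```python
from statistics import mean

# Per-row groundedness scores copied from the details table above.
scores_2k = [10, 1, 3]

# Assumption: the run-level metric is the arithmetic mean of the row scores.
aggregate = mean(scores_2k)
print(round(aggregate, 3))  # 4.667
```

With only three rows, a single low score (here the "I don't know." answer, graded 1) pulls the average down sharply.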

In [22]:
pf.visualize(eval_groundedness_2k_context)



The HTML file is generated at 'C:\\Users\\User\\AppData\\Local\\Temp\\pf-visualize-detail-vdjdgjdd.html'.
Trying to view the result in a web browser...
Successfully visualized from the web browser.


A web page like the following opens. It shows how each row was graded and how the evaluation run executed:
![pf-visualize-screenshot](./media/chat-with-pdf/pf-visualize-screenshot.png)

## 5. Try a different configuration and evaluate again - experimentation

NOTE: this example uses only 3 lines of test data and LLM output is non-deterministic, so your metrics may differ from the ones shown here, and the two configurations may even produce exactly the same metrics.

In [23]:
config_3k_context = {
    "EMBEDDING_MODEL_DEPLOYMENT_NAME": "text-embedding-ada-002",
    "CHAT_MODEL_DEPLOYMENT_NAME": "gpt-4",  # change this to the name of your deployment if you're using Azure OpenAI
    "PROMPT_TOKEN_LIMIT": 3000,
    "MAX_COMPLETION_TOKENS": 256,
    "VERBOSE": True,
    "CHUNK_SIZE": 1024,
    "CHUNK_OVERLAP": 64,
}

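# Reuse the mapping from the first run, but point "config" at the 3k settings.
column_mapping_3k = {**column_mapping, "config": config_3k_context}
run_3k_context = pf.run(flow=flow_path, data=data_path, column_mapping=column_mapping_3k)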
pf.stream(run_3k_context)

print(run_3k_context)

[2024-11-16 12:13:00 +0000][promptflow._sdk._orchestrator.run_submitter][INFO] - Submitting run chat_with_pdf_variant_0_20241116_121300_021422, log path: C:\Users\User\.promptflow\.runs\chat_with_pdf_variant_0_20241116_121300_021422\logs.txt


Prompt flow service has started...
You can view the traces in local from http://127.0.0.1:23333/v1.0/ui/traces/?#run=chat_with_pdf_variant_0_20241116_121300_021422
2024-11-16 12:13:00 +0000   34368 execution.bulk     INFO     Current thread is not main thread, skip signal handler registration in BatchEngine.
2024-11-16 12:13:00 +0000   34368 execution.bulk     INFO     Current system's available memory is 39546.0MB, memory consumption of current process is 301.6015625MB, estimated available worker count is 39546.0/301.6015625 = 131
2024-11-16 12:13:00 +0000   34368 execution.bulk     INFO     Set process count to 3 by taking the minimum value among the factors of {'default_worker_count': 4, 'row_count': 3, 'estimated_worker_count_based_on_memory_usage': 131}.
2024-11-16 12:13:08 +0000   34368 execution.bulk     INFO     Process name(SpawnProcess-12)-Process id(10896)-Line number(0) start execution.
2024-11-16 12:13:08 +0000   34368 execution.bulk     INFO     Process name(SpawnProcess-

In [24]:
eval_groundedness_3k_context = pf.run(
    flow=eval_groundedness_flow_path,
    run=run_3k_context,
    column_mapping={
        "question": "${run.inputs.question}",
        "answer": "${run.outputs.answer}",
        "context": "${run.outputs.context}",
    },
    display_name="eval_groundedness_3k_context",
)
pf.stream(eval_groundedness_3k_context)

print(eval_groundedness_3k_context)

[2024-11-16 12:14:27 +0000][promptflow._sdk._orchestrator.run_submitter][INFO] - Submitting run eval_groundedness_variant_0_20241116_121427_369215, log path: C:\Users\User\.promptflow\.runs\eval_groundedness_variant_0_20241116_121427_369215\logs.txt


Prompt flow service has started...
You can view the traces in local from http://127.0.0.1:23333/v1.0/ui/traces/?#run=eval_groundedness_variant_0_20241116_121427_369215
2024-11-16 12:14:28 +0000   34368 execution.bulk     INFO     Current thread is not main thread, skip signal handler registration in BatchEngine.
2024-11-16 12:14:28 +0000   34368 execution.bulk     INFO     Current system's available memory is 39504.9296875MB, memory consumption of current process is 303.93359375MB, estimated available worker count is 39504.9296875/303.93359375 = 129
2024-11-16 12:14:28 +0000   34368 execution.bulk     INFO     Set process count to 3 by taking the minimum value among the factors of {'default_worker_count': 4, 'row_count': 3, 'estimated_worker_count_based_on_memory_usage': 129}.
2024-11-16 12:14:36 +0000   34368 execution.bulk     INFO     Process name(SpawnProcess-18)-Process id(37144)-Line number(0) start execution.
2024-11-16 12:14:36 +0000   34368 execution.bulk     INFO     Process 

In [25]:
pf.get_details(eval_groundedness_3k_context)

| | inputs.question | inputs.answer | inputs.context | inputs.line_number | outputs.groundedness |
|---|---|---|---|---|---|
| 0 | What is the main difference between BERT and p... | BERT se distingue de los modelos anteriores de... | ['BERT: Pre-training of Deep Bidirectional Tra... | 0 | 10 |
| 1 | What is the size of the vocabulary used by BERT? | I don't know. | ['E (L=12, H=768, A=12, Total Param-\neters=11... | 1 | 1 |
| 2 | 论文写作中论文引言有什么注意事项？ | 在撰写论文时，引言部分应严格遵循本学科国际通行的学术规范，内容需要有逻辑性。标题格式应采用“... | ['须言之成理，论据可靠 ，严格遵循本学科国际通行的学术规范。 内容包括：第 一\n章引言（... | 2 | 9 |


In [26]:
pf.visualize([eval_groundedness_2k_context, eval_groundedness_3k_context])

The HTML file is generated at 'C:\\Users\\User\\AppData\\Local\\Temp\\pf-visualize-detail-pocru2yp.html'.
Trying to view the result in a web browser...
Successfully visualized from the web browser.
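
Alongside the visualization, the two evaluation runs can be compared numerically. The sketch below uses the groundedness scores copied from the two details tables in this walkthrough. The `compare` helper is hypothetical, and using the arithmetic mean as the run-level summary is an assumption.

```python
from statistics import mean

# Scores copied from the two details tables in this walkthrough.
scores = {
    "eval_groundedness_2k_context": [10, 1, 3],
    "eval_groundedness_3k_context": [10, 1, 9],
}


def compare(scores_by_run):
    """Return per-row deltas (second run minus first) and the mean score per run."""
    (name_a, a), (name_b, b) = scores_by_run.items()
    deltas = [y - x for x, y in zip(a, b)]
    means = {name_a: mean(a), name_b: mean(b)}
    return deltas, means


deltas, means = compare(scores)
for row, delta in enumerate(deltas):
    print(f"row {row}: {delta:+d}")
for name, value in means.items():
    print(f"{name}: {value:.2f}")
```

Only one of the three rows changes between the runs. With so little data and non-deterministic model output, a gap like this is weak evidence that one configuration is better.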
