# Q+A without context

This notebook helps you build a `CSV` input file for Azure OpenAI evaluations that require `question` and `answer` pairs. It is not designed to generate large test sets, but it should work for spreadsheets of a few hundred rows (give or take).

### Prerequisites

#### Input file

A `csv` file with a single column whose header is `question`. Save the input file in the `input_files` folder.
 
| question    | 
| -------- | 
| What color is green?  |
| Where am I?           |
| Do dogs like cats?    | 
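
If the questions start out as a Python list rather than a spreadsheet, the input file can also be written from code. The following is a minimal sketch, not part of the notebook itself: `write_questions_csv` is a hypothetical helper name, and the only assumption it makes about the format is the single `question` header described above.

```python
import csv
from pathlib import Path


def write_questions_csv(path, questions):
    """Write questions to a one-column CSV with a `question` header."""
    path = Path(path)
    # Create the target folder (e.g. input_files/) if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["question"])
        writer.writeheader()
        for question in questions:
            writer.writerow({"question": question})
    return path
```

For example, `write_questions_csv("input_files/my-questions.csv", ["What color is green?", "Where am I?"])` would produce a file with the layout shown in the table above (the filename here is only an illustration).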



#### Environment Variables

First, create a `.env` file in the root of this directory containing values for `DC_CHAT_URL` and `DC_API_TOKEN`, as shown in the `env.example` file.

- To obtain the `DC_API_TOKEN`:
   - Log into the Digital Collections website (either staging or production will work).
   - Then visit the corresponding staging or production API route and copy the token into the `.env` file:
     - Production: `https://api.dc.library.northwestern.edu/api/v2/auth/token`
     - Staging: `https://dcapi.rdc-staging.library.northwestern.edu/api/v2/auth/token`
   - Note that these tokens are valid for 1 day by default, so you'll need to repeat these steps once yours expires.
- `DC_CHAT_URL`: Decide whether you want to hit the production or staging endpoint and use one of these values:
  - Staging: `https://pimtkveo5ev4ld3ihe4qytadxe0jvcuz.lambda-url.us-east-1.on.aws`
  - Production: `https://hdtl6p2qzfxszvbhdb7dyunuxe0dgexo.lambda-url.us-east-1.on.aws`
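
The notebook reads both values with `os.getenv`, which only sees variables already present in the process environment. Whether a `.env` file is picked up automatically depends on how the notebook is launched. The sketch below is one way to load it explicitly using only the standard library. `load_env_file` and `missing_settings` are hypothetical helper names, and the parser assumes simple `KEY=VALUE` lines. The `python-dotenv` package's `load_dotenv()` is a more complete alternative.

```python
import os


def load_env_file(path=".env"):
    """Copy KEY=VALUE lines from a .env-style file into os.environ."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            # Skip blank lines, comments, and anything without a '='.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            # Drop surrounding whitespace and optional quotes around the value.
            os.environ[key.strip()] = value.strip().strip("'\"")


def missing_settings(names=("DC_CHAT_URL", "DC_API_TOKEN")):
    """Return the names that are unset or empty in the environment."""
    return [name for name in names if not os.getenv(name)]
```

Calling `load_env_file()` before the cell that reads the two variables, and then confirming that `missing_settings()` returns an empty list, should catch a missing or misnamed entry before any requests are sent.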

## Output

A CSV file containing `question` and `answer` columns:

| question              | answer         |
| --------              | -------        |
| What color is green?  | Green.         |
| Where am I?           | Here.          |
| Do dogs like cats?    | No.            | 
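
When a request fails, the notebook writes `--ERROR--` in the `answer` column instead of stopping, so it can be worth scanning the finished file before using it for evaluation. The sketch below is one way to do that. `summarize_output` is a hypothetical helper name, and it assumes the output file is UTF-8 encoded.

```python
import csv


def summarize_output(path, error_marker="--ERROR--"):
    """Return the row count and the questions whose answer is the error marker."""
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    failed = [row["question"] for row in rows if row["answer"] == error_marker]
    return len(rows), failed
```

The returned list of failed questions could then be saved as a new input file and run again, for instance after refreshing an expired token.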


In [None]:
# install required packages

%pip install requests

In [3]:
# setup imports
import os
import random
import json, requests
import csv
from requests.exceptions import HTTPError
from datetime import datetime

# read settings from the process environment
# (values from `.env` must already be loaded into it)
DC_CHAT_URL = os.getenv('DC_CHAT_URL')
DC_API_TOKEN = os.getenv('DC_API_TOKEN')

In [4]:
# put your input file inside the `input_files` folder
# put your input filename here
input_filename = '5-questions.csv'

In [5]:
# function that fetches the chat responses

def get_answer(question):
    url = DC_CHAT_URL
    header = {'Content-Type': 'application/json'}
    
    body = {
        'message': 'chat',
        'auth': DC_API_TOKEN,
        'ref': 'DEV-TEAM-TEST-' + str(random.random()),
        "question": question
    }
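    print("Asking question: " + question)

    try:
        response = requests.post(url, json.dumps(body), headers=header)
        # raise HTTPError for 4xx/5xx responses (e.g. an expired token)
        response.raise_for_status()
        return response.json()['answer']
    except HTTPError as err:
        print(f"HTTP error occurred: {err}")
        return "--ERROR--"
    except Exception as err:
        print(f"Other error occurred: {err}")
        return "--ERROR--"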

In [None]:
# Run cell to generate file

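input_base_path = 'input_files/'
output_base_path = 'output_files/'
# build the timestamp separately: nesting double quotes inside an f-string
# is a syntax error before Python 3.12
timestamp = datetime.now().strftime('%Y%m%d%H%M%S')
output_filename = os.path.join(output_base_path, f"{os.path.splitext(input_filename)[0]}-{timestamp}.csv")

# make sure the output folder exists before writing to it
os.makedirs(output_base_path, exist_ok=True)

# open both files in one `with` so they are closed even if a request fails
with open(os.path.join(input_base_path, input_filename), "r", newline='', encoding='utf_8_sig') as infile, \
     open(output_filename, "w", newline='', encoding='utf-8') as outfile:
    reader = csv.DictReader(infile)
    fieldnames = ['question', 'answer']
    writer = csv.DictWriter(outfile, fieldnames=fieldnames)
    writer.writeheader()
    for row in reader:
        question = row['question']
        # escape newlines so each answer stays on a single line
        answer = get_answer(question).replace("\n", "\\n")
        # write only the two expected columns, ignoring any extras in the input
        writer.writerow({'question': question, 'answer': answer})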

print(f"\n\nSuccess. File written to: {output_filename}")