# Test prompt versions for issue #97

Notebook for testing prompt versions in the AI Assistant to address issue #97. 

This notebook provides an initial case study of how to generate a testing dataset targeting a particular issue. It compares two prompt versions using an LLM, and calculates an overall success score for each prompt on the dataset.

In [2]:
import os
from dotenv import load_dotenv
import anthropic

load_dotenv()
ANTHROPIC_API_KEY = os.getenv("ANTHROPIC_API_KEY")

client = anthropic.Anthropic(api_key=ANTHROPIC_API_KEY)

In [93]:
# Prompt for generating non-OpenFn-related questions

user_prompt = """
Generate one general short question a user might ask an AI assistant about each of the platforms listed below to understand what they do.

Platforms:

asana
azure-storage
beyonic
bigquery
cartodb
cht
collections
commcare
common
dhis2
dynamics
facebook
fhir
fhi
fhir-ndr-et
godata
googlehealthcare
googlesheets
hive
http
khanacademy
kobotoolbox
magpi
mailchimp
mailgun
maximo
medicmobile
mogli
mojatax
mongodb
msgraph
mssql
mysql
nexmo
ocl
odk
openfn
openhim
openimis
openlmis
openmrs
openspp
postgresql
primero
progres
rapidpro
redis
resourcemap
salesforce
satusehat
sftp
smpp
surveycto
telerivet
template
testing
twilio
vtiger
zoho

Output your answer with new lines without numbers or bullet points.

<example>
What is Salesforce?
What does Khan Academy do?
What is the difference between fhir and fhir-ndr-et?
"""

In [94]:
# Generate questions

message = client.messages.create(
    model="claude-3-5-sonnet-20241022", # TODO change to cheaper model
    max_tokens=1000,
    temperature=0,
    system="You are an AI programming assistant. Follow the user's requirements carefully and to the letter.",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": user_prompt
                }
            ]
        }
    ]
)
print(message.content[0].text[:50])

INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
How can Asana help me manage my team's projects?
W


In [95]:
# Format generated questions as a list

generated_questions = [q for q in message.content[0].text.split("\n") if q]

In [None]:
# Import tenacity to implement waiting after failed api calls

from tenacity import (
    retry,
    stop_after_attempt,
    wait_random_exponential,
)  # for exponential backoff

In [121]:
from apollo.services.job_chat.job_chat import generate
from apollo.services.job_chat.prompt import build_prompt
from apollo.services.util import DictObj

# @retry(wait=wait_random_exponential(min=1, max=90), stop=stop_after_attempt(20))

@retry(wait=wait_random_exponential(multiplier=1, max=60), stop=stop_after_attempt(20))
def generate_apollo_format(content, history="", adaptor="@openfn/language-googlesheets@3.0.5", model="claude-3-5-sonnet-20240620"):
  """Use Apollo prompt formatting templates to generate a Claude answer."""

  context = DictObj({
      "expression": "// write your job code here",
      "adaptor": adaptor
    })
  print(content)
  system_message, prompt = build_prompt(content, history, context)
  # print(system_message)
  # print(prompt)

  message = client.beta.prompt_caching.messages.create(
      max_tokens=1024, messages=prompt, model=model, system=system_message
  )
  return message.content[0].text

In [64]:
generate_apollo_format("What is Azure Storage and how does it work?")

[{'type': 'text', 'text': "\nYou are a software engineer helping a non-expert user write a job for OpenFn,\nthe world's leading digital public good for workflow automation.\n\nDO NOT answer questions unrelated to OpenFn,\njavascript programming, and workflow automation.\n\nYour responses short be short, accurate and friendly unless otherwise instructed.\n\nDo not thank the user or be obsequious.\n\nAddress the user directly.\n\nAdditional context is attached.\n"}, {'type': 'text', 'text': "<job_writing_guide>\nAn OpenFn Job is written in a DSL which is very similar to Javascript.\n\nJob code does not use import statements or async/await.\n\nJob code must only contain function calls at the top level.\n\nEach job is associated with an adaptor, which provides functions for the job.\nAll jobs have the fn() and each() function, which are very important.\n\nDO NOT use the `alterState()` function. Use `fn()` instead.\n\nThe adaptor API may be attached.\n\nThe functions provided by an adaptor 

"I apologize, but I don't have information about Azure Storage or how it works. I'm an AI assistant focused specifically on helping write OpenFn jobs and automations. I can't provide details about other cloud services or technologies outside of OpenFn. Is there something I can help you with related to writing OpenFn jobs or using the Google Sheets adaptor?"

In [78]:
# New edited system prompt

system_prompt_role_v2 ="""
You are a software engineer helping a non-expert user write a job for OpenFn,
the world's leading digital public good for workflow automation.

Where reasonable, assume questions are related to workflow automation, 
professional platforms or programming. You may provide general information around these topics, 
e.g. general programming assistance unrelated to job writing.
If a question is entirely irrelevant, do not answer it.

Your responses short be short, accurate and friendly unless otherwise instructed.

Do not thank the user or be obsequious.

Address the user directly.

Additional context is attached.
"""

# formatted like in apollo
system_prompt_v2 = [{'type': 'text', 'text': system_prompt_role_v2}, {'type': 'text', 'text': "<job_writing_guide>\nAn OpenFn Job is written in a DSL which is very similar to Javascript.\n\nJob code does not use import statements or async/await.\n\nJob code must only contain function calls at the top level.\n\nEach job is associated with an adaptor, which provides functions for the job.\nAll jobs have the fn() and each() function, which are very important.\n\nDO NOT use the `alterState()` function. Use `fn()` instead.\n\nThe adaptor API may be attached.\n\nThe functions provided by an adaptor are called Operations. \n\nAn Operation is a factory function which returns a function that takes state and returns state, like this:\n```\nconst myOperation = (arg) => (state) => { /* do something with arg and state */ return state; }\n```\n<examples>\n<example>\nHere's how we issue a GET request with the http adaptor:\n```\nget('/patients');\n```\nThe first argument to get is the path to request from (the configuration will tell\nthe adaptor what base url to use). In this case we're passing a static string,\nbut we can also pass a value from state:\n```\nget(state => state.endpoint);\n```\n</example>\n<example>\nExample job code with the HTTP adaptor:\n```\nget('/patients');\nfn(state => {\n  const patients = state.data.map(p => {\n    return { ...p, enrolled: true }\n  });\n\n  return { ...state, data: { patients } };\n})\npost('/patients', dataValue('patients'));\n</example>\n<example>\n```\nExample job code with the Salesforce adaptor:\n```\neach(\n  '$.form.participants[*]',\n  upsert('Person__c', 'Participant_PID__c', state => ({\n    Participant_PID__c: state.pid,\n    First_Name__c: state.participant_first_name,\n    Surname__c: state.participant_surname,\n  }))\n);\n```\n</example>\n<example>\nExample job code with the ODK adaptor:\n```\ncreate(\n  'ODK_Submission__c',\n  fields(\n    field('Site_School_ID_Number__c', dataValue('school')),\n    field('Date_Completed__c', dataValue('date')),\n    field('comments__c', dataValue('comments')),\n    field('ODK_Key__c', dataValue('*meta-instance-id*'))\n  )\n);\n```\n</example>\n<examples>\n</job_writing_guide>"}, {'type': 'text', 'text': '.', 'cache_control': {'type': 'ephemeral'}}, {'type': 'text', 'text': '<adaptor>The user is using the OpenFn @openfn/language-googlesheets@3.0.5 adaptor. Use functions provided by its API.Typescript definitions for doc @openfn/language-common/**\n * Execute a sequence of operations.\n * Main outer API for executing expressions.\n * @public\n * @example\n *  execute(\n *    create(\'foo\'),\n *    delete(\'bar\')\n *  )\n * @private\n * @param {Operations} operations - Operations to be performed.\n * @returns {Promise}\n */\nexport function execute(...operations: Operations): Promise<any>;\n/**\n * alias for "fn()"\n * @function\n * @param {Function} func is the function\n * @returns {Operation}\n */\nexport function alterState(func: Function): Operation;\n/**\n * Creates a custom step (or operation) for more flexible job writing.\n * @public\n * @function\n * @example\n * fn(state => {\n *   // do some things to state\n *   return state;\n * });\n * @param {Function} func is the function\n * @returns {Operation}\n */\nexport function fn(func: Function): Operation;\n/**\n * A custom operation that will only execute the function if the condition returns true\n * @public\n * @function\n * @example\n * fnIf((state) => state?.data?.name, get("https://example.com"));\n * @param {Boolean} condition - The condition that returns true\n * @param {Operation} operation - The operation needed to be executed.\n * @returns {Operation}\n */\nexport function fnIf(condition: boolean, operation: Operation): Operation;\n/**\n * Picks out a single value from a JSON object.\n * If a JSONPath returns more than one value for the reference, the first\n * item will be returned.\n * @public\n * @function\n * @example\n * jsonValue({ a:1 }, \'a\')\n * @param {object} obj - A valid JSON object.\n * @param {String} path - JSONPath referencing a point in given JSON object.\n * @returns {Operation}\n */\nexport function jsonValue(obj: object, path: string): Operation;\n/**\n * Picks out a single value from source data.\n * If a JSONPath returns more than one value for the reference, the first\n * item will be returned.\n * @public\n * @function\n * @example\n * sourceValue(\'$.key\')\n * @param {String} path - JSONPath referencing a point in `state`.\n * @returns {Operation}\n */\nexport function sourceValue(path: string): Operation;\n/**\n * Picks out a value from source data.\n * Will return whatever JSONPath returns, which will always be an array.\n * If you need a single value use `sourceValue` instead.\n * @public\n * @function\n * @example\n * source(\'$.key\')\n * @param {String} path - JSONPath referencing a point in `state`.\n * @returns {Array.<String|Object>}\n */\nexport function source(path: string): Array<string | any>;\n/**\n * Ensures a path points at the data.\n * @public\n * @function\n * @example\n * dataPath(\'key\')\n * @param {string} path - JSONPath referencing a point in `data`.\n * @returns {string}\n */\nexport function dataPath(path: string): string;\n/**\n * Picks out a single value from the source data object—usually `state.data`.\n * If a JSONPath returns more than one value for the reference, the first\n * item will be returned.\n * @public\n * @function\n * @example\n * dataValue(\'key\')\n * @param {String} path - JSONPath referencing a point in `data`.\n * @returns {Operation}\n */\nexport function dataValue(path: string): Operation;\n/**\n * Ensures a path points at references.\n * @public\n * @function\n * @example\n * referencePath(\'key\')\n * @param {string} path - JSONPath referencing a point in `references`.\n * @returns {string}\n */\nexport function referencePath(path: string): string;\n/**\n * Picks out the last reference value from source data.\n * @public\n * @function\n * @example\n * lastReferenceValue(\'key\')\n * @param {String} path - JSONPath referencing a point in `references`.\n * @returns {Operation}\n */\nexport function lastReferenceValue(path: string): Operation;\n/**\n * Simple switcher allowing other expressions to use either a JSONPath or\n * object literals as a data source.\n * - JSONPath referencing a point in `state`\n * - Object Literal of the data itself.\n * - Function to be called with state.\n * @public\n * @function\n * @example\n * asData(\'$.key\'| key | callback)\n * @param {String|object|function} data\n * @param {object} state - The current state.\n * @returns {array}\n */\nexport function asData(data: string | object | Function, state: object): any[];\n/**\n * Iterates over an array of items and invokes an operation upon each one, where the state\n * object is _scoped_ so that state.data is the item under iteration.\n * The rest of the state object is untouched and can be referenced as usual.\n * You can pass an array directly, or use lazy state or a JSONPath string to\n * reference a slice of state.\n * @public\n * @function\n * @example <caption>Using lazy state ($) to iterate over items in state.data and pass each into an "insert" operation</caption>\n * each(\n *   $.data,\n *   // Inside the callback operation, `$.data` is scoped to the item under iteration\n *   insert("patient", {\n *     patient_name: $.data.properties.case_name,\n *     patient_id: $.data.case_id,\n *   })\n * );\n * @example <caption>Iterate over items in state.data and pass each one into an "insert" operation</caption>\n * each(\n *   $.data,\n *   insert("patient", (state) => ({\n *     patient_id: state.data.case_id,\n *     ...state.data\n *   }))\n * );\n * @example <caption>Using JSON path to iterate over items in state.data and pass each one into an "insert" operation</caption>\n * each(\n *   "$.data[*]",\n *   insert("patient", (state) => ({\n *     patient_name: state.data.properties.case_name,\n *     patient_id: state.data.case_id,\n *   }))\n * );\n * @param {DataSource} dataSource - JSONPath referencing a point in `state`.\n * @param {Operation} operation - The operation needed to be repeated.\n * @returns {Operation}\n */\nexport function each(dataSource: DataSource, operation: Operation): Operation;\n/**\n * Combines two operations into one\n * @public\n * @function\n * @example\n * combine(\n *   create(\'foo\'),\n *   delete(\'bar\')\n * )\n * @param {Operations} operations - Operations to be performed.\n * @returns {Operation}\n */\nexport function combine(...operations: Operations): Operation;\n/**\n * Adds data from a target object\n * @public\n * @function\n * @example\n * join(\'$.key\',\'$.data\',\'newKey\')\n * @param {String} targetPath - Target path\n * @param {String} sourcePath - Source path\n * @param {String} targetKey - Target Key\n * @returns {Operation}\n */\nexport function join(targetPath: string, sourcePath: string, targetKey: string): Operation;\n/**\n * Recursively resolves objects that have resolvable values (functions).\n * @public\n * @function\n * @param {object} value - data\n * @param {Function} [skipFilter] - a function which returns true if a value should be skipped\n * @returns {Operation}\n */\nexport function expandReferences(value: object, skipFilter?: Function): Operation;\n/**\n * Returns a key, value pair in an array.\n * @public\n * @function\n * @example\n * field(\'destination_field_name__c\', \'value\')\n * @param {string} key - Name of the field\n * @param {Value} value - The value itself or a sourceable operation.\n * @returns {Field}\n */\nexport function field(key: string, value: Value): Field;\n/**\n * Zips key value pairs into an object.\n * @public\n * @function\n * @example\n *  fields(list_of_fields)\n * @param {Fields} fields - a list of fields\n * @returns {Object}\n */\nexport function fields(...fields: Fields): any;\n/**\n * Merges fields into each item in an array.\n * @public\n * @example\n * merge(\n *   "$.books[*]",\n *   fields(\n *     field( "publisher", sourceValue("$.publisher") )\n *   )\n * )\n * @function\n * @public\n * @param {DataSource} dataSource\n * @param {Object} fields - Group of fields to merge in.\n * @returns {DataSource}\n */\nexport function merge(dataSource: DataSource, fields: any): DataSource;\n/**\n * Groups an array of objects by a specified key path.\n * @public\n * @example\n * const users = [\n *   { name: \'Alice\', age: 25, city: \'New York\' },\n *   { name: \'Bob\', age: 30, city: \'San Francisco\' },\n *   { name: \'Charlie\', age: 25, city: \'New York\' },\n *   { name: \'David\', age: 30, city: \'San Francisco\' }\n * ];\n * group(users, \'city\');\n * // state is { data: { \'New York\': [/Alice, Charlie/], \'San Francisco\': [ /Bob, David / ] }\n * @function\n * @public\n * @param {Object[]} arrayOfObjects - The array of objects to be grouped.\n * @param {string} keyPath - The key path to group by.\n * @param {function} callback - (Optional) Callback function\n * @returns {Operation}\n */\nexport function group(arrayOfObjects: any[], keyPath: string, callback?: Function): Operation;\n/**\n * Returns the index of the current array being iterated.\n * To be used with `each` as a data source.\n * @public\n * @function\n * @example\n * index()\n * @returns {DataSource}\n */\nexport function index(): DataSource;\n/**\n * Turns an array into a string, separated by X.\n * @public\n * @function\n * @example\n * field("destination_string__c", function(state) {\n *   return arrayToString(dataValue("path_of_array")(state), \', \')\n * })\n * @param {array} arr - Array of toString\'able primatives.\n * @param {string} separator - Separator string.\n * @returns {string}\n */\nexport function arrayToString(arr: any[], separator: string): string;\n/**\n * Ensures primitive data types are wrapped in an array.\n * Does not affect array objects.\n * @public\n * @function\n * @example\n * each(function(state) {\n *   return toArray( dataValue("path_of_array")(state) )\n * }, ...)\n * @param {any} arg - Data required to be in an array\n * @returns {array}\n */\nexport function toArray(arg: any): any[];\n/**\n * Prepares next state\n * @public\n * @function\n * @example\n * composeNextState(state, response)\n * @param {State} state - state\n * @param {Object} response - Response to be added\n * @returns {State}\n */\nexport function composeNextState(state: State, response: any): State;\n/**\n * Substitutes underscores for spaces and proper-cases a string\n * @public\n * @function\n * @example\n * field("destination_string__c", humanProper(state.data.path_to_string))\n * @param {string} str - String that needs converting\n * @returns {string}\n */\nexport function humanProper(str: string): string;\n/**\n * Splits an object into two objects based on a list of keys.\n * The first object contains the keys that are not in the list,\n * and the second contains the keys that are.\n * @public\n * @function\n * @param {Object} obj - The object to split.\n * @param {string[]} keys - List of keys to split on.\n * @returns {Object[]} - Tuple of objects, first object contains keys not in list, second contains keys that are.\n */\nexport function splitKeys(obj: any, keys: string[]): any[];\n/**\n * Replaces emojis in a string.\n * @public\n * @function\n * @example\n * scrubEmojis(\'Dove🕊️⭐ 29\')\n * @param {string} text - String that needs to be cleaned\n * @param {string} replacementChars - Characters that replace the emojis\n * @returns {string}\n */\nexport function scrubEmojis(text: string, replacementChars: string): string;\n/**\n * Chunks an array into an array of arrays, each with no more than a certain size.\n * @public\n * @function\n * @example\n * chunk([1,2,3,4,5], 2)\n * @param {Object} array - Array to be chunked\n * @param {Integer} chunkSize - The maxiumum size of each chunks\n * @returns {Object}\n */\nexport function chunk(array: any, chunkSize: Integer): any;\n/**\n * Takes a CSV file string or stream and parsing options as input, and returns a promise that\n * resolves to the parsed CSV data as an array of objects.\n * Options for `parsingOptions` include:\n * - `delimiter` {string/Buffer/[string/Buffer]} - Defines the character(s) used to delineate the fields inside a record. Default: `\',\'`\n * - `quote` {string/Buffer/[string/Buffer]} - Defines the characters used to surround a field. Default: `\'"\'`\n * - `escape` {Buffer/string/null/boolean} - Set the escape character as one character/byte only. Default: `"`\n * - `columns` {boolean / array / function} - Generates record in the form of object literals. Default: `true`\n * - `bom` {boolean} - Strips the {@link https://en.wikipedia.org/wiki/Byte_order_mark byte order mark (BOM)} from the input string or buffer. Default: `true`\n * - `trim` {boolean} - Ignore whitespace characters immediately around the `delimiter`. Default: `true`\n * - `ltrim` {boolean} - Ignore whitespace characters from the left side of a CSV field. Default: `true`\n * - `rtrim` {boolean} - Ignore whitespace characters from the right side of a CSV field. Default: `true`\n * - `chunkSize` {number} - The size of each chunk of CSV data. Default: `Infinity`\n * - `skip_empty_lines` {boolean} - Ignore empty lines in the CSV file. Default: `true`\n * @public\n * @function\n * @param {String | Stream} csvData - A CSV string or a readable stream\n * @param {Object} [parsingOptions] - Optional. Parsing options for converting CSV to JSON.\n * @param {function} [callback] - (Optional) callback function. If used it will be called state and an array of rows.\n * @returns {Operation} The function returns a Promise that resolves to the result of parsing a CSV `stringOrStream`.\n */\nexport function parseCsv(csvData: string | Stream, parsingOptions?: any, callback?: Function): Operation;\n/**\n * Validate against a JSON schema. Any erors are written to an array at `state.validationErrors`.\n * Schema can be passed directly, loaded as a JSON path from state, or loaded from a URL\n * Data can be passed directly or loaded as a JSON path from state.\n * By default, schema is loaded from `state.schema` and data from `state.data`.\n * @pubic\n * @function\n * @param {string|object} schema - The schema, path or URL to validate against\n * @param {string|object} data - The data or path to validate\n * @example <caption>Validate `state.data` with `state.schema`</caption>\n * validate()\n * @example <caption>Validate form data at `state.form` with a schema from a URL</caption>\n * validate("https://www.example.com/schema/record", "form")\n * @example <caption>Validate the each item in `state.records` with a schema from a URL</caption>\n * each("records[*]", validate("https://www.example.com/schema/record"))\n * @returns {Operation}\n */\nexport function validate(schema?: string | object, data?: string | object): Operation;\n/**\n * Sets a cursor property on state.\n * Supports natural language dates like `now`, `today`, `yesterday`, `n hours ago`, `n days ago`, and `start`,\n * which will be converted relative to the environment (ie, the Lightning or CLI locale). Custom timezones\n * are not yet supported.\n * You can provide a formatter to customise the final cursor value, which is useful for normalising\n * different inputs. The custom formatter runs after natural language date conversion.\n * See the usage guide at {@link https://docs.openfn.org/documentation/jobs/job-writing-guide#using-cursors}\n * @public\n * @function\n * @example <caption>Use a cursor from state if present, or else use the default value</caption>\n * cursor($.cursor, { defaultValue: \'today\' })\n * @example <caption>Use a pagination cursor</caption>\n * cursor(22)\n * @param {any} value - the cursor value. Usually an ISO date, natural language date, or page number\n * @param {object} options - options to control the cursor.\n * @param {string} options.key - set the cursor key. Will persist through the whole run.\n * @param {any} options.defaultValue - the value to use if value is falsy\n * @param {Function} options.format - custom formatter for the final cursor value\n * @returns {Operation}\n */\nexport function cursor(value: any, options?: {\n    key: string;\n    defaultValue: any;\n    format: Function;\n}): Operation;\n/**\n * Scopes an array of data based on a JSONPath.\n * Useful when the source data has `n` items you would like to map to\n * an operation.\n * The operation will receive a slice of the data based of each item\n * of the JSONPath provided.\n * @public\n * @function\n * @example\n * map("$.[*]",\n *   create("SObject",\n *     field("FirstName", sourceValue("$.firstName"))\n *   )\n * )\n * @param {string} path - JSONPath referencing a point in `state.data`.\n * @param {function} operation - The operation needed to be repeated.\n * @param {State} state - Runtime state.\n * @returns {State}\n */\nexport const map: any;\n\n\n/**\n * Scopes an array of data based on a JSONPath.\n * Useful when the source data has `n` items you would like to map to\n * an operation.\n * The operation will receive a slice of the data based of each item\n * of the JSONPath provided.\n *\n * It also ensures the results of an operation make their way back into\n * the state\'s references.\n * @public\n * @example\n *  each("$.[*]",\n *    create("SObject",\n *    field("FirstName", sourceValue("$.firstName")))\n *  )\n * @function\n * @param {DataSource} dataSource - JSONPath referencing a point in `state`.\n * @param {Operation} operation - The operation needed to be repeated.\n * @returns {Operation}\n */\nexport function each(dataSource: DataSource, operation: Operation): Operation;\n\n\nexport { parse, format } from "date-fns";\n\n\n/**\n * Builder function to create request options. Returns an object with helpers to\n * easily add commonly used options. The return object is chainable so you can set\n * as many options as you want.\n * Pass an object to set your own options.\n * @param {CommonRequestOptions} options - options to pass to the request\n * @returns {OptionsHelpers}\n * @function\n * @public\n * @example <caption>Get with a query an oath token</caption>\n * get($.data.url, http.options({ query: $.query }).oath($.configuration.access_token)\n */\nexport function options(opts?: {}): any;\n/**\n * Make a GET request.\n * @public\n * @function\n * @example <caption>Request a resource</caption>\n * http.get(\'https://jsonplaceholder.typicode.com/todos\')\n * @example <caption>Request a resource with basic auth</caption>\n * http.get(\n *  \'https://jsonplaceholder.typicode.com/todos\',\n *  http.options().basic(\'user\', \'pass\')\n * )\n * @example <caption>Request a resource with oauth</caption>\n * http.get(\n *  \'https://jsonplaceholder.typicode.com/todos\',\n *  http.options().oauth($.configuration.access_token)\n * )\n * @param {string} url - URL to access\n * @param {CommonRequestOptions} options - Request options\n * @state {CommonHttpState}\n * @returns {Operation}\n */\nexport function get(url: string, options: CommonRequestOptions): Operation;\n/**\n * Make a POST request.\n * @public\n * @function\n * @example <caption>Post a JSON object (setting the content-type header)</caption>\n *  http.post(\n *    \'https://jsonplaceholder.typicode.com/todos\',\n *    $.data,\n *    options().json(),\n *  })\n * @param {string} url - URL to access\n * @param {CommonRequestOptions} options - Request options\n * @state {CommonHttpState}\n * @returns {Operation}\n */\nexport function post(path: any, data: any, options: CommonRequestOptions): Operation;\nexport { req as request };\n/**\n * Helper functions provided by `http.options`.\n */\nexport type OptionsHelpers = any;\n/**\n * Options provided to the HTTP request\n */\nexport type CommonRequestOptions = {\n    /**\n     * - Map of errorCodes -> error messages, ie, `{ 404: \'Resource not found;\' }`. Pass `false` to suppress errors.\n     */\n    errors: object;\n    /**\n     * - Pass a JSON object to be serialised into a multipart HTML form (as FormData) in the body.\n     */\n    form: object;\n    /**\n     * - An object of query parameters to be encoded into the URL.\n     */\n    query: object;\n    /**\n     * - An object of headers to append to the request.\n     */\n    headers: object;\n    /**\n     * - Parse the response body as json, text or stream. By default will use the response headers.\n     */\n    parseAs: string;\n    /**\n     * - Request timeout in ms. Default: 300 seconds.\n     */\n    timeout: number;\n    /**\n     * - TLS/SSL authentication options. See https://nodejs.org/api/tls.html#tlscreatesecurecontextoptions\n     */\n    tls: object;\n};\n/**\n * State object\n */\nexport type CommonHttpState = any;\n/**\n * Options provided to the HTTP request\n * @typedef {Object} CommonRequestOptions\n * @property {object} errors - Map of errorCodes -> error messages, ie, `{ 404: \'Resource not found;\' }`. Pass `false` to suppress errors.\n * @property {object} form - Pass a JSON object to be serialised into a multipart HTML form (as FormData) in the body.\n * @property {object} query - An object of query parameters to be encoded into the URL.\n * @property {object} headers - An object of headers to append to the request.\n * @property {string} parseAs - Parse the response body as json, text or stream. By default will use the response headers.\n * @property {number} timeout - Request timeout in ms. Default: 300 seconds.\n * @property {object} tls - TLS/SSL authentication options. See https://nodejs.org/api/tls.html#tlscreatesecurecontextoptions\n */\n/**\n * State object\n * @typedef {Object} CommonHttpState\n * @private\n * @property data - the parsed response body\n * @property response - the response from the HTTP server, including headers, statusCode, body, etc\n * @property references - an array of all previous data objects used in the Job\n **/\n/**\n * Make a HTTP request.\n * @public\n * @function\n * @example\n * http.request(\n *   \'GET\',\n *   \'https://jsonplaceholder.typicode.com/todos\'\n * )\n * @name request\n * @param {string} method - The HTTP method to use.\n * @param {string} url - URL to resource.\n * @param {CommonRequestOptions} options - Request options\n * @state {CommonHttpState}\n * @returns {Operation}\n */\ndeclare function req(method: string, url: string, options: CommonRequestOptions): Operation;\n\n\nimport * as Adaptor from \'./Adaptor\';\nexport default Adaptor;\nexport * from \'./Adaptor\';\nexport * as beta from \'./beta\';\nexport * as http from \'./http\';\nexport * as dateFns from \'./dateFns\';\nimport * as metadata from \'./metadata\';\nexport { metadata };\n\n\ndeclare type Entity = {\n    name: string;\n    type: string;\n    label?: string;\n    datatype?: string;\n    desc?: string;\n    children?: Entity[] | Record<string, Entity>;\n    meta?: Record<string, any>;\n    addChild: (e: Entity, name?: string) => void;\n};\ndeclare type DataType = \'string\' | \'boolean\' | \'date\';\n\n\nexport function encode(data: string): string;\nexport function decode(base64Data: string): string;\nexport function uuid(): string;\n\n\n/**\n * `request` is a helper function that sends HTTP requests and returns the response\n * body, headers, and status code.\n * Use the error map to provide custom error messages or get hold of the response in case of errors.\n * @param method - The HTTP method to use for the request (e.g., "GET", "POST", "PUT", "DELETE", etc.).\n * @param fullUrlOrPath - The full or partial URL for the request.\n * @param [options] - The `options` parameter is an object that contains additional configuration\n * options for the request.\n * @returns an object with the following properties:\n * - method: the request method\n * - url: the request url\n * - code: the status code of the response\n * - headers: the headers of the response\n * - body: the body of the response\n * - message: the status text of the response\n * - duration: the response time\n */\nexport function request(method: any, fullUrlOrPath: any, options?: {}): Promise<{\n    url: string;\n    method: any;\n    statusCode: any;\n    statusMessage: string;\n    headers: any;\n    body: any;\n    duration: number;\n}>;\nexport function makeBasicAuthHeader(username: any, password: any): {\n    Authorization: string;\n};\nexport function logResponse(response: any): any;\nexport function enableMockClient(baseUrl: any): import("undici/types/mock-interceptor").Interceptable;\nexport const ERROR_ABSOLUTE_URL: "Absolute URLs not suppored";\nexport function assertRelativeUrl(path: any): void;\nexport const ERROR_URL_MISMATCH: "Target origin does not match baseUrl origin";\nexport function parseUrl(pathOrUrl: string, baseUrl: any): {\n    url: string;\n    baseUrl: string;\n    path: string;\n    query: any;\n};\nexport function get(url: any, options: any): Promise<{\n    url: string;\n    method: any;\n    statusCode: any;\n    statusMessage: string;\n    headers: any;\n    body: any;\n    duration: number;\n}>;\nexport function post(url: any, body: any, options: any): Promise<{\n    url: string;\n    method: any;\n    statusCode: any;\n    statusMessage: string;\n    headers: any;\n    body: any;\n    duration: number;\n}>;\nexport function put(url: any, body: any, options: any): Promise<{\n    url: string;\n    method: any;\n    statusCode: any;\n    statusMessage: string;\n    headers: any;\n    body: any;\n    duration: number;\n}>;\nexport function del(url: any, body: any, options: any): Promise<{\n    url: string;\n    method: any;\n    statusCode: any;\n    statusMessage: string;\n    headers: any;\n    body: any;\n    duration: number;\n}>;\n\n\nexport * from "./http";\nexport * from "./helpers";\nexport * from "./references";\nimport parseDate from "./parse-date";\nimport throwError from "./throw-error";\nexport { parseDate, throwError };\n\n\ndeclare function _default(d: any, startDate: any): any;\nexport default _default;\n\n\nexport function expandReferences(state: any, ...args: any[]): any[];\nexport function normalizeOauthConfig(configuration: any): any;\n\n\ndeclare function _default(code: any, { description, fix, ...extras }?: {\n    description: any;\n    fix: any;\n}): never;\nexport default _default;\nTypescript definitions for doc @openfn/language-googlesheets/**\n * Execute a sequence of oper.\n * Wraps `language-common/execute`, and prepends initial state for http.\n * @example\n * execute(\n *   create(\'foo\'),\n *   delete(\'bar\')\n * )(state)\n * @private\n * @param {Operations} operations - Operations to be performed.\n * @returns {Operation}\n */\nexport function execute(...operations: Operations): Operation;\n/**\n * Add an array of rows to the spreadsheet.\n * https://developers.google.com/sheets/api/samples/writing#append_values\n * @public\n * @example\n * appendValues({\n *   spreadsheetId: \'1O-a4_RgPF_p8W3I6b5M9wobA3-CBW8hLClZfUik5sos\',\n *   range: \'Sheet1!A1:E1\',\n *   values: [\n *     [\'From expression\', \'$15\', \'2\', \'3/15/2016\'],\n *     [\'Really now!\', \'$100\', \'1\', \'3/20/2016\'],\n *   ],\n * })\n * @function\n * @param {Object} params - Data object to add to the spreadsheet.\n * @param {string} [params.spreadsheetId] The spreadsheet ID.\n * @param {string} [params.range] The range of values to update.\n * @param {array} [params.values] A 2d array of values to update.\n * @param {function} callback - (Optional) Callback function\n * @returns {Operation}\n */\nexport function appendValues(params: {\n    spreadsheetId?: string;\n    range?: string;\n    values?: any[];\n}, callback?: Function): Operation;\n/**\n * Batch update values in a Spreadsheet.\n * @example\n * batchUpdateValues({\n *   spreadsheetId: \'1O-a4_RgPF_p8W3I6b5M9wobA3-CBW8hLClZfUik5sos\',\n *   range: \'Sheet1!A1:E1\',\n *   values: [\n *     [\'From expression\', \'$15\', \'2\', \'3/15/2016\'],\n *     [\'Really now!\', \'$100\', \'1\', \'3/20/2016\'],\n *   ],\n * })\n * @function\n * @public\n * @param {Object} params - Data object to add to the spreadsheet.\n * @param {string} [params.spreadsheetId] The spreadsheet ID.\n * @param {string} [params.range] The range of values to update.\n * @param {string} [params.valueInputOption] (Optional) Value update options. Defaults to \'USER_ENTERED\'\n * @param {array} [params.values] A 2d array of values to update.\n * @param {function} callback - (Optional) callback function\n * @returns {Operation} spreadsheet information\n */\nexport function batchUpdateValues(params: {\n    spreadsheetId?: string;\n    range?: string;\n    valueInputOption?: string;\n    values?: any[];\n}, callback?: Function): Operation;\n/**\n * Gets cell values from a Spreadsheet.\n * @public\n * @example\n * getValues(\'1O-a4_RgPF_p8W3I6b5M9wobA3-CBW8hLClZfUik5sos\',\'Sheet1!A1:E1\')\n * @function\n * @param {string} spreadsheetId The spreadsheet ID.\n * @param {string} range The sheet range.\n * @param {function} callback - (Optional) callback function\n * @returns {Operation} spreadsheet information\n */\nexport function getValues(spreadsheetId: string, range: string, callback?: Function): Operation;\nexport { alterState, combine, cursor, dataPath, dataValue, each, field, fields, fn, fnIf, http, lastReferenceValue, merge, sourceValue } from "@openfn/language-common";\n\n\nexport default Adaptor;\nexport * from "./Adaptor";\nimport * as Adaptor from "./Adaptor";\n</adaptor>'}, {'type': 'text', 'text': '.', 'cache_control': {'type': 'ephemeral'}}, {'type': 'text', 'text': '<user_code>// write your job code here</user_code>'}]

In [122]:
# Generate answers with the new system prompt
# @retry(wait=wait_random_exponential(min=1, max=90), stop=stop_after_attempt(20))

@retry(wait=wait_random_exponential(multiplier=1, max=60), stop=stop_after_attempt(20))
def generate_new_format(question, new_system_prompt, model="claude-3-5-sonnet-20240620"):
  """Use the apollo format to call claude with a custom system prompt."""
  prompt = [{'role': 'user', 'content': question}]
  print(question)
  message = client.beta.prompt_caching.messages.create(
      max_tokens=1024, messages=prompt, model=model, system=new_system_prompt
  )
  return message.content[0].text

In [123]:
# Get answers to generated questions
qa_pairs = []
for question in generated_questions[:30]:
    answer = generate_apollo_format(question)
    qa_pairs.append((question, answer, "v1"))
for question in generated_questions[:30]: # two loops to make use of prompt caching
    new_answer = generate_new_format(question, system_prompt_v2)
    qa_pairs.append((question, new_answer, "v2"))

How can Asana help me manage my team's projects?
INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages?beta=prompt_caching "HTTP/1.1 200 OK"
What is Azure Storage and how does it store my data?
INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages?beta=prompt_caching "HTTP/1.1 200 OK"
How does Beyonic handle mobile money payments?
INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages?beta=prompt_caching "HTTP/1.1 200 OK"
What can I do with BigQuery's data analytics?
INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages?beta=prompt_caching "HTTP/1.1 200 OK"
What kind of maps can I create with CartoDB?
INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages?beta=prompt_caching "HTTP/1.1 200 OK"
What is CHT and how does it help health workers?
INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages?beta=prompt_caching "HTTP/1.1 200 OK"
What are Collections in programming?
INFO:httpx:HTTP Request: POST https://

In [34]:
# Template for evaluating answers to questions

evaluate_answer_template = """
For the given Question and Answer pair, evaluate whether the answer at least briefly answers the question (True), or if it entirely fails to answer the question (False).

<example>
Q: "What is Salesforce?"
A: "I apologize, but I can't provide information about Salesforce or other topics unrelated to OpenFn workflow automation and job writing. I'm focused specifically on helping with OpenFn jobs and related technologies. Is there something I can help you with regarding writing OpenFn jobs or using the Google Sheets adaptor?"
Success: False

Now, following Q/A pair:
Q: "What is Asana used for?"
A: "Asana is a popular project management and collaboration tool, not directly related to OpenFn or Google Sheets integration. However, I can provide some general information about Asana:\n\nAsana is used for:\n\n1. Task management: Creating, assigning, and tracking tasks within teams\n2. Project planning: Organizing work into projects with timelines and milestones\n3. Team collaboration: Sharing files, comments, and updates on work items\n4. Workflow management: Creating custom workflows for different processes\n5. Goal tracking: Setting and monitoring team and company objectives\n\nWhile Asana is useful for many organizations, it's not directly related to the OpenFn job you're working on, which appears to be focused on Google Sheets integration. If you have any questions about Google Sheets or the OpenFn job you're writing, I'd be happy to help with those."
Success: True

Now, following Q/A pair, by ONLY outputting "True" or "False":
Q: {question}
A: {answer}
Success: 
"""

In [124]:
# Evaluate QA pairs

results = []

for q, a, version in qa_pairs:
    evaluate_answer_template_filled = evaluate_answer_template.format(question=q, answer=a)

    message = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1000, #TODO limit to 1
        temperature=0,
        system="You are an AI programming assistant. Follow the user's requirements carefully and to the letter.",
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": evaluate_answer_template_filled
                    }
                ]
            }
        ]
    )
    result = message.content[0].text
    results.append([q, a, version, result])


INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.anthropic.co

In [125]:
import pandas as pd

results_df = pd.DataFrame(results, columns=["question", "answer", "system_prompt_version", "result"])
results_df['result'] = results_df['result'].map({'True': True, 'False': False})
v1_mean = results_df[results_df['system_prompt_version'] == 'v1']['result'].mean()
v2_mean = results_df[results_df['system_prompt_version'] == 'v2']['result'].mean()

print(f"V1 success rate: {v1_mean}, V2 success rate: {v2_mean}")
results_df.to_csv("data/202411_claude_system_prompt_issue_97.csv", index=False)
results_df

V1 success rate: 0.2, V2 success rate: 0.6


Unnamed: 0,question,answer,system_prompt_version,result
0,How can Asana help me manage my team's projects?,"I apologize, but I don't have specific informa...",v1,False
1,What is Azure Storage and how does it store my...,Azure Storage is Microsoft's cloud storage sol...,v1,True
2,How does Beyonic handle mobile money payments?,"I apologize, but I don't have any information ...",v1,False
3,What can I do with BigQuery's data analytics?,"I apologize, but I can't provide information a...",v1,False
4,What kind of maps can I create with CartoDB?,"I apologize, but I don't have specific informa...",v1,False
5,What is CHT and how does it help health workers?,"I apologize, but I don't have specific informa...",v1,False
6,What are Collections in programming?,"I apologize, but I'm not able to provide gener...",v1,False
7,How does CommCare help with mobile data collec...,"I apologize, but I don't have specific informa...",v1,False
8,What is meant by Common in programming contexts?,"In programming contexts, ""Common"" often refers...",v1,True
9,How does DHIS2 handle health information?,"I apologize, but I don't have specific informa...",v1,False
