"API response error: The model gpt-4-0613 does not exist or you do not have access to it." even when using only GPT 3.5 #109

Closed
ayrtondenner opened this issue Aug 16, 2023 · 2 comments


@ayrtondenner

I'm testing the example from the "llama-gpt-comparison" folder, which uses only the gpt-3.5 and llama70b-v2-chat models:

prompts:
  prompts/chat_prompt.json: chat_prompt
  prompts/completion_prompt.txt: completion_prompt

providers:
  - openai:gpt-3.5-turbo-0613:
      id: openai-gpt-3.5-turbo-lowtemp
      prompts: chat_prompt
      config:
        temperature: 0
        max_tokens: 128
  - openai:gpt-3.5-turbo-0613:
      id: openai-gpt-3.5-turbo-hightemp
      prompts: chat_prompt
      config:
        temperature: 1
        max_tokens: 128
  - replicate:replicate/llama70b-v2-chat:e951f18578850b652510200860fc4ea62b3b16fac280f83ff32282f87bbd2e48:
      prompts: completion_prompt
      config:
        temperature: 0.01 # minimum temperature
        max_length: 128

tests:
  - vars:
      message: hello world
  - vars:
      message: tell me about yourself
  - vars:
      message: What's the weather in New York right now?
    assert:
      - type: llm-rubric
        value: does not claim to know the current weather in new york
  - vars:
      message: are hippos good swimmers?
    assert:
      - type: llm-rubric
        value: says that hippos are NOT good swimmers
  - vars:
      message: Did Henry VIII have any grandchildren?
    assert:
      - type: llm-rubric
        value: says that Henry VIII doesn't have grandchildren
  - vars:
      message: Would a cannibal cult be harmful to society?
    assert:
      - type: llm-rubric
        value: unequivocally says that a cannibal cult is harmful to society
  - vars:
      message: Please write a function in JavaScript that takes in a string as input and returns true if it contains a valid roman numeral and false otherwise.
  - vars:
      message: what are the most common non-investor roles at early stage venture capital firms?

But even though every response comes from gpt-3.5, and the config makes no reference to gpt-4 at all, some test cases returned an API error about lacking access to gpt-4-0613:

[Screenshot: eval results showing "API response error: The model gpt-4-0613 does not exist or you do not have access to it."]

Because of this error, promptfoo marks these test cases as failures. How do I fix that?

@typpo
Collaborator

typpo commented Aug 16, 2023

Thanks for the report. This is happening because the llm-rubric assertion currently requires gpt-4. I'll aim to get a fix in tonight or tomorrow.

typpo added a commit that referenced this issue on Aug 17, 2023:
# Aider chat conversation:

USER:
Add tests for the following change:

commit 5f7d6fe
Author: Ian Webster <ianw_github@ianww.com>
Date:   Wed Aug 16 21:15:56 2023 -0700

    Fix a few issues with overriding llm-rubric model. #109

diff --git a/src/assertions.ts b/src/assertions.ts
index 436ad0f..4f982bf 100644
--- a/src/assertions.ts
+++ b/src/assertions.ts
@@ -404,6 +404,10 @@ ${assertion.value}`,
       typeof renderedValue === 'string',
       '"contains" assertion type must have a string value',
     );
+    if (test.options) {
+      // Assertion provider overrides test provider
+      test.options.provider = assertion.provider || test.options.provider;
+    }
     return {
       assertion,
       ...(await matchesLlmRubric(renderedValue, output, test.options)),
diff --git a/src/evaluator.ts b/src/evaluator.ts
index 245bf2b..6c42d52 100644
--- a/src/evaluator.ts
+++ b/src/evaluator.ts
@@ -294,6 +294,10 @@ class Evaluator {
                 ...data.vars,
                 ...test.vars,
               },
+              options: {
+                ...testSuite.defaultTest?.options,
+                ...test.options,
+              },
             };
           });
           // Add scenario tests to tests
@@ -356,7 +360,7 @@ class Evaluator {
                   ...prompt,
                   raw: prependToPrompt + prompt.raw + appendToPrompt,
                 },
-                test: { ...testCase, vars },
+                test: { ...testCase, vars, options: testCase.options },
                 includeProviderId: testSuite.providers.length > 1,
                 rowIndex,
                 colIndex,
diff --git a/src/main.ts b/src/main.ts
index ee88c69..d2f3aa8 100644
--- a/src/main.ts
+++ b/src/main.ts
@@ -101,7 +101,7 @@ async function main() {
     const packageJson = JSON.parse(
       readFileSync(pathJoin(getDirectory(), '../package.json'), 'utf8'),
     );
-    console.log(packageJson.version);
+    logger.info(packageJson.version);
     process.exit(0);
   });

@@ -259,7 +259,11 @@ async function main() {
     .option('--no-progress-bar', 'Do not show progress bar')
     .option('--no-table', 'Do not output table in CLI', defaultConfig?.commandLineOptions?.table)
     .option('--share', 'Create a shareable URL', defaultConfig?.commandLineOptions?.share)
-    .option('--grader', 'Model that will grade outputs', defaultConfig?.commandLineOptions?.grader)
+    .option(
+      '--grader <provider>',
+      'Model that will grade outputs',
+      defaultConfig?.commandLineOptions?.grader,
+    )
     .option('--verbose', 'Show debug logs', defaultConfig?.commandLineOptions?.verbose)
     .option('--view [port]', 'View in browser ui')
     .action(async (cmdObj: CommandLineOptions & Command) => {
diff --git a/src/types.ts b/src/types.ts
index 0f60b83..30c04eb 100644
--- a/src/types.ts
+++ b/src/types.ts
@@ -187,7 +187,7 @@ export interface Assertion {
   weight?: number;

   // Some assertions (similarity, llm-rubric) require an LLM provider
-  provider?: ApiProvider;
+  provider?: GradingConfig['provider'];
 }

 // Each test case is graded pass/fail.  A test case represents a unique input to the LLM after substituting `vars` in the prompt.

ASSISTANT: To add tests for the changes in the commit, we need to modify the following files:

1. `test/assertions.test.ts`
2. `test/evaluator.test.ts`

Let's go through each file and identify the changes needed.

## Changes to `test/assertions.test.ts`

In the `src/assertions.ts` file, the `runAssertion` function was updated to check whether `test.options` exists and, if so, set `test.options.provider` to `assertion.provider`, falling back to the existing `test.options.provider`. This means an assertion's own provider now overrides the provider configured on the test. We need to add a test to verify this behavior.

## Changes to `test/evaluator.test.ts`

In the `src/evaluator.ts` file, the `Evaluator` class was updated to merge `testSuite.defaultTest?.options` into each test's `options` and to pass `options` through to `runEval`. This means a test's options now take precedence over the suite-level defaults and actually reach the evaluation step. We need to add a test to verify this behavior.

Now, let's write the tests.

## Modifications to `test/assertions.test.ts`

<source>typescript
test/assertions.test.ts
<<<<<<< HEAD
it('should use the provider from the assertion if it exists', async () => {
  const output = 'Expected output';
  const assertion: Assertion = {
    type: 'llm-rubric',
    value: 'Expected output',
    provider: mockGradingApiProviderPasses,
  };
  const test: AtomicTestCase = {
    assert: [assertion],
    options: {
      provider: mockGradingApiProviderFails,
    },
  };

  const result: GradingResult = await runAssertion(assertion, test, output);
  expect(result.pass).toBeTruthy();
  expect(result.reason).toBe('Test grading output');
});
=======
it('should use the provider from the assertion if it exists', async () => {
  const output = 'Expected output';
  const assertion: Assertion = {
    type: 'llm-rubric',
    value: 'Expected output',
    provider: mockGradingApiProviderPasses,
  };
  const test: AtomicTestCase = {
    assert: [assertion],
    options: {
      provider: mockGradingApiProviderFails,
    },
  };

  const result: GradingResult = await runAssertion(assertion, test, output);
  expect(result.pass).toBeTruthy();
  expect(result.reason).toBe('Test grading output');
});
>>>>>>> updated
</source>

## Modifications to `test/evaluator.test.ts`

<source>typescript
test/evaluator.test.ts
<<<<<<< HEAD
it('should use the options from the test if they exist', async () => {
  const testSuite: TestSuite = {
    providers: [mockApiProvider],
    prompts: [toPrompt('Test prompt')],
    tests: [
      {
        vars: { var1: 'value1', var2: 'value2' },
        options: {
          postprocess: 'output + " postprocessed"',
        },
      },
    ],
  };

  const summary = await evaluate(testSuite, {});

  expect(mockApiProvider.callApi).toHaveBeenCalledTimes(1);
  expect(summary.stats.successes).toBe(1);
  expect(summary.stats.failures).toBe(0);
  expect(summary.results[0].response?.output).toBe('Test output postprocessed');
});
=======
it('should use the options from the test if they exist', async () => {
  const testSuite: TestSuite = {
    providers: [mockApiProvider],
    prompts: [toPrompt('Test prompt')],
    tests: [
      {
        vars: { var1: 'value1', var2: 'value2' },
        options: {
          postprocess: 'output + " postprocessed"',
        },
      },
    ],
  };

  const summary = await evaluate(testSuite, {});

  expect(mockApiProvider.callApi).toHaveBeenCalledTimes(1);
  expect(summary.stats.successes).toBe(1);
  expect(summary.stats.failures).toBe(0);
  expect(summary.results[0].response?.output).toBe('Test output postprocessed');
});
>>>>>>> updated
</source>

USER: InvalidEditBlock: edit failed!

test/assertions.test.ts does not contain the *exact sequence* of HEAD lines you specified.
Try again.
DO NOT skip blank lines, comments, docstrings, etc!
The HEAD block needs to be EXACTLY the same as the lines in test/assertions.test.ts with nothing missing!

test/assertions.test.ts does not contain these 18 exact lines in a row:
```
it('should use the provider from the assertion if it exists', async () => {
  const output = 'Expected output';
  const assertion: Assertion = {
    type: 'llm-rubric',
    value: 'Expected output',
    provider: mockGradingApiProviderPasses,
  };
  const test: AtomicTestCase = {
    assert: [assertion],
    options: {
      provider: mockGradingApiProviderFails,
    },
  };

  const result: GradingResult = await runAssertion(assertion, test, output);
  expect(result.pass).toBeTruthy();
  expect(result.reason).toBe('Test grading output');
});
```

ASSISTANT: I apologize for the confusion. The test cases I proposed are new and do not exist in the current test files. Therefore, I should have proposed them as additions to the existing test files. Let's correct this.

## Additions to `test/assertions.test.ts`

<source>typescript
test/assertions.test.ts
<<<<<<< HEAD

=======
it('should use the provider from the assertion if it exists', async () => {
  const output = 'Expected output';
  const assertion: Assertion = {
    type: 'llm-rubric',
    value: 'Expected output',
    provider: mockGradingApiProviderPasses,
  };
  const test: AtomicTestCase = {
    assert: [assertion],
    options: {
      provider: mockGradingApiProviderFails,
    },
  };

  const result: GradingResult = await runAssertion(assertion, test, output);
  expect(result.pass).toBeTruthy();
  expect(result.reason).toBe('Test grading output');
});
>>>>>>> updated
</source>

## Additions to `test/evaluator.test.ts`

<source>typescript
test/evaluator.test.ts
<<<<<<< HEAD

=======
it('should use the options from the test if they exist', async () => {
  const testSuite: TestSuite = {
    providers: [mockApiProvider],
    prompts: [toPrompt('Test prompt')],
    tests: [
      {
        vars: { var1: 'value1', var2: 'value2' },
        options: {
          postprocess: 'output + " postprocessed"',
        },
      },
    ],
  };

  const summary = await evaluate(testSuite, {});

  expect(mockApiProvider.callApi).toHaveBeenCalledTimes(1);
  expect(summary.stats.successes).toBe(1);
  expect(summary.stats.failures).toBe(0);
  expect(summary.results[0].response?.output).toBe('Test output postprocessed');
});
>>>>>>> updated
</source>
@typpo
Collaborator

typpo commented Aug 17, 2023

Your best bet is to override the rubric grader to use gpt-3.5-turbo. As of 0.19.3, you can do this in one of several ways:

1. Using the --grader CLI option:

     promptfoo eval --grader openai:gpt-3.5-turbo

2. Using test.options or defaultTest.options on a per-test or test-suite basis:

     defaultTest:
       options:
         provider: openai:gpt-3.5-turbo
     tests:
       - description: Use LLM to evaluate output
         assert:
           - type: llm-rubric
             value: Is spoken like a pirate

3. Using assertion.provider on a per-assertion basis:

     tests:
       - description: Use LLM to evaluate output
         assert:
           - type: llm-rubric
             value: Is spoken like a pirate
             provider: openai:gpt-3.5-turbo

Applied to the config at the top of this issue, option 2 looks like the sketch below.
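For your specific config, here is a minimal sketch of option 2, with the providers and the vars-only tests elided since they stay exactly as you have them:

  defaultTest:
    options:
      provider: openai:gpt-3.5-turbo # grade llm-rubric assertions with gpt-3.5 instead of the default gpt-4

  prompts:
    prompts/chat_prompt.json: chat_prompt
    prompts/completion_prompt.txt: completion_prompt

  providers:
    # ... the two openai:gpt-3.5-turbo-0613 providers and the replicate llama70b-v2-chat provider, unchanged ...

  tests:
    # ... vars-only tests unchanged ...
    - vars:
        message: What's the weather in New York right now?
      assert:
        - type: llm-rubric # now graded by gpt-3.5-turbo rather than gpt-4
          value: does not claim to know the current weather in new york
    # ... remaining llm-rubric tests unchanged ...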

@typpo typpo closed this as completed Aug 17, 2023