Add new GPT-4o models and update tiktoken to 0.7.0 #901

KennyDizi · 2024-05-14T01:26:47Z

Type

enhancement

Description

Added 'gpt-4o' and 'gpt-4o-2024-05-13' models to the list of available models
Updated tiktoken to version 0.7.0

Changes walkthrough

Relevant files

Enhancement

__init__.py `Add new GPT-4o models` pr_agent/algo/__init__.py Added 'gpt-4o' and 'gpt-4o-2024-05-13' models to the list of available models	+2/-0
requirements.txt `Update tiktoken to 0.7.0` requirements.txt Updated tiktoken to version 0.7.0	+1/-1

codiumai-pr-agent-pro · 2024-05-14T01:27:09Z

PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.

PR Description updated to latest commit (36ad893)

codiumai-pr-agent-pro · 2024-05-14T01:27:19Z

PR Review 🔍

⏱️ Estimated effort to review [1-5]	2, because the changes are straightforward and limited to a few lines in two files. The complexity is low, involving only the addition of model configurations and a package version update.
🏅 Score	85
🧪 Relevant tests	No
⚡ Possible issues	Possible Configuration Error: Ensure that the token limits set for the new models ('gpt-4o' and 'gpt-4o-2024-05-13') are correct and consistent with the intended use cases and system capabilities.
🔒 Security concerns	No

codiumai-pr-agent-pro · 2024-05-14T01:27:44Z

PR Code Suggestions ✨

Category: Maintainability
Suggestion: Add a comment to explain the significance of the model version date
Consider adding a comment to explain the significance of the specific date '2024-05-13' in the model name 'gpt-4o-2024-05-13'. This will help maintainers understand the context or reason for this specific versioning.
pr_agent/algo/__init__.py [15]
-'gpt-4o-2024-05-13': 128000, # 128K, but may be limited by config.max_model_tokens
+'gpt-4o-2024-05-13': 128000, # 128K, specific release date for targeted improvements or features
Suggestion importance[1-10]: 6
Why: The suggestion to add a comment explaining the significance of the date in the model name is valid and improves code maintainability. However, it's not a critical issue, hence the moderate score.
Score: 6
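The comment in the diff above says the 128K window "may be limited by config.max_model_tokens". A minimal sketch of how such a cap could be applied follows. The names `MAX_TOKENS` and `effective_token_limit` are assumptions for illustration and are not confirmed to match the pr_agent code; only the dated entry's value is taken from the diff.

```python
from typing import Optional

# Hypothetical table of context-window sizes per model.
MAX_TOKENS = {
    "gpt-4o": 128000,             # assumed to match the dated snapshot
    "gpt-4o-2024-05-13": 128000,  # 128K, value shown in the PR diff
}


def effective_token_limit(model: str, max_model_tokens: Optional[int] = None) -> int:
    """Return the model's window, capped by an optional configured ceiling."""
    if model not in MAX_TOKENS:
        raise KeyError(f"Unknown model: {model}")
    limit = MAX_TOKENS[model]
    # A missing or non-positive ceiling is treated as "no cap" in this sketch.
    if max_model_tokens is not None and max_model_tokens > 0:
        limit = min(limit, max_model_tokens)
    return limit
```

Under these assumptions, a configured ceiling of 32000 would win over the 128K entry, while a ceiling above 128000 would leave the model limit unchanged.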

barnett-yuxiang

Would it be advisable to update the docs at the same time?

KennyDizi · 2024-05-14T03:20:46Z

Could you please point me to where I can update the docs for these new models? I can't find it.

barnett-yuxiang · 2024-05-14T04:10:15Z

My understanding is that updating here is enough, but let's still hear @mrT23's advice. 🤝

KennyDizi · 2024-05-14T04:12:03Z

cool!

mrT23 · 2024-05-14T06:25:20Z

Thanks @KennyDizi

Currently, it's just an option, but it's definitely a strong candidate to become the default model in the near future.
(I am benchmarking it on AlphaCodium to get a sense of how good it really is on code. It will be finished soon.)

KennyDizi · 2024-05-14T06:34:14Z

Thanks, @mrT23. The gpt-4o model has set a new standard; we're looking forward to your benchmark.

vKongv · 2024-05-14T10:47:46Z

I tried to use gpt-4o as my model and encountered this error. I am running it in GitHub Actions.

My github action yml file:

jobs:
  pr_agent_job:
    runs-on: ubuntu-latest
    if: |
      (github.event_name == 'pull_request') ||
      (github.event_name == 'issue_comment' && 
      github.event.issue.pull_request && 
      contains(github.event.comment.body, '/review'))
    permissions:
      issues: write
      pull-requests: write
      contents: write
    name: Run pr agent on pull requests and respond to '/review' comments
    steps:
      - name: PR Agent action step
        id: pragent
        uses: Codium-ai/pr-agent@main
        env:
          CONFIG.MODEL: gpt-4o-2024-05-13
          PR_REVIEWER.INLINE_CODE_COMMENTS: true
          PR_REVIEWER.PERSISTENT_COMMENT: false
          PR_REVIEWER.NUM_CODE_SUGGESTIONS: 8
          OPENAI_KEY: ${{ secrets.OPENAI_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

{"text": "Applying repo settings\n", "record": {"elapsed": {"repr": "0:00:01.098014", "seconds": 1.098014}, "exception": null, "extra": {}, "file": {"name": "github_action_runner.py", "path": "/app/pr_agent/servers/github_action_runner.py"}, "function": "run_action", "level": {"icon": "ℹ️", "name": "INFO", "no": 20}, "line": 75, "message": "Applying repo settings", "module": "github_action_runner", "name": "__main__", "process": {"id": 6, "name": "MainProcess"}, "thread": {"id": 139708756859776, "name": "MainThread"}, "time": {"repr": "2024-05-14 10:44:11.476957+00:00", "timestamp": 1715683451.476957}}}
{"text": "Failed to clip tokens: 'Could not automatically map gpt-4o-2024-05-13 to a tokeniser. Please use `tiktoken.get_encoding` to explicitly get the tokeniser you expect.'\n", "record": {"elapsed": {"repr": "0:00:03.222232", "seconds": 3.222232}, "exception": null, "extra": {"command": "review"}, "file": {"name": "utils.py", "path": "/app/pr_agent/algo/utils.py"}, "function": "clip_tokens", "level": {"icon": "⚠️", "name": "WARNING", "no": 30}, "line": 582, "message": "Failed to clip tokens: 'Could not automatically map gpt-4o-2024-05-13 to a tokeniser. Please use `tiktoken.get_encoding` to explicitly get the tokeniser you expect.'", "module": "utils", "name": "pr_agent.algo.utils", "process": {"id": 6, "name": "MainProcess"}, "thread": {"id": 139708756859776, "name": "MainThread"}, "time": {"repr": "2024-05-14 10:44:13.601175+00:00", "timestamp": 1715683453.601175}}}
{"text": "Failed to clip tokens: 'Could not automatically map gpt-4o-2024-05-13 to a tokeniser. Please use `tiktoken.get_encoding` to explicitly get the tokeniser you expect.'\n", "record": {"elapsed": {"repr": "0:00:03.511451", "seconds": 3.511451}, "exception": null, "extra": {"command": "review"}, "file": {"name": "utils.py", "path": "/app/pr_agent/algo/utils.py"}, "function": "clip_tokens", "level": {"icon": "⚠️", "name": "WARNING", "no": 30}, "line": 582, "message": "Failed to clip tokens: 'Could not automatically map gpt-4o-2024-05-13 to a tokeniser. Please use `tiktoken.get_encoding` to explicitly get the tokeniser you expect.'", "module": "utils", "name": "pr_agent.algo.utils", "process": {"id": 6, "name": "MainProcess"}, "thread": {"id": 139708756859776, "name": "MainThread"}, "time": {"repr": "2024-05-14 10:44:13.890394+00:00", "timestamp": 1715683453.890394}}}
Traceback (most recent call last):
  File "/app/pr_agent/servers/github_action_runner.py", line 148, in <module>
    asyncio.run(run_action())
  File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/app/pr_agent/servers/github_action_runner.py", line 141, in run_action
    await PRAgent().handle_request(url, body,
  File "/app/pr_agent/agent/pr_agent.py", line 93, in handle_request
    await command2class[action](pr_url, ai_handler=self.ai_handler, args=args).run()
  File "/app/pr_agent/tools/pr_reviewer.py", line 75, in __init__
    self.token_handler = TokenHandler(
  File "/app/pr_agent/algo/token_handler.py", line 47, in __init__
    self.encoder = TokenEncoder.get_token_encoder()
  File "/app/pr_agent/algo/token_handler.py", line 19, in get_token_encoder
    cls._encoder_instance = encoding_for_model(cls._model) if "gpt" in cls._model else get_encoding(
  File "/usr/local/lib/python3.10/site-packages/tiktoken/model.py", line 97, in encoding_for_model
    return get_encoding(encoding_name_for_model(model_name))
  File "/usr/local/lib/python3.10/site-packages/tiktoken/model.py", line 84, in encoding_name_for_model
    raise KeyError(
KeyError: 'Could not automatically map gpt-4o-2024-05-13 to a tokeniser. Please use `tiktoken.get_encoding` to explicitly get the tokeniser you expect.'

Am I missing anything?

Edit: I also tried using gpt-4o for CONFIG.MODEL, but I still get the same error.
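The traceback shows `encoding_for_model` raising `KeyError` for a model name the installed tiktoken does not recognise, and the message itself points to `tiktoken.get_encoding` as the explicit alternative. A minimal sketch of that fallback pattern follows. The helper `resolve_encoder` and its parameters are hypothetical, not part of pr_agent or tiktoken, and the lookups are passed in as arguments so the sketch runs without tiktoken installed. Falling back to `cl100k_base` is an assumption; token counts from a fallback encoding would only be approximate for a model it was not built for.

```python
from typing import Any, Callable


def resolve_encoder(
    model: str,
    by_model: Callable[[str], Any],
    by_name: Callable[[str], Any],
    fallback_encoding: str = "cl100k_base",
) -> Any:
    """Look up an encoder by model name; on KeyError use a named encoding."""
    try:
        return by_model(model)
    except KeyError:
        # The installed library has no mapping for this model name.
        return by_name(fallback_encoding)


# With tiktoken installed, the call would look like this:
#   import tiktoken
#   enc = resolve_encoder("gpt-4o-2024-05-13",
#                         tiktoken.encoding_for_model,
#                         tiktoken.get_encoding)
```

This only avoids the crash; upgrading tiktoken so the model name maps to its own encoding is still the more accurate fix.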

KennyDizi · 2024-05-14T10:57:48Z

You have to install tiktoken v0.7.0 to use the new models.
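One way to confirm which tiktoken version an environment actually has is to read the installed package metadata. The sketch below uses only the standard library; the helper names `parse_version`, `version_at_least`, and `installed_version` are made up for illustration, and the simple numeric parsing ignores pre-release suffixes.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Extract the leading numeric components, e.g. '0.7.0' -> (0, 7, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"Not a version string: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def version_at_least(installed: str, required: str) -> bool:
    """Compare two versions numerically, padding the shorter one with zeros."""
    a, b = parse_version(installed), parse_version(required)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("tiktoken")
    if found is None or not version_at_least(found, "0.7.0"):
        print(f"tiktoken {found} found; 0.7.0 or newer is needed for gpt-4o")
```

In a Docker-based setup such as the GitHub Action above, the check would have to run inside the image, since that is where the older tiktoken was installed.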

mrT23 · 2024-05-14T11:13:33Z

@vKongv
I only just updated the Docker images. Try again.

vKongv · 2024-05-14T17:07:59Z

works now

mrT23 · 2024-05-14T17:34:15Z

@KennyDizi
p.s.
https://twitter.com/talrid23/status/1790342268555079906

The model does look strong (and fast).
A couple more days, and I think I will make it the default one.

barnett-yuxiang · 2024-05-15T07:01:56Z

@mrT23

The speed is 2x faster, and the price is only 50% of the original.

mrT23 · 2024-05-15T07:21:56Z

@barnett-yuxiang I understand, and will probably use it as the default model in the near future.

But as a general strategy, I don't like to switch models the first day they come out:

- Usually there are availability issues, which are resolved along the way.
- I want to validate the quality, both from my own inspection and from general internet feedback.

Ping me in a week or so if the base model has not been switched :-)

barnett-yuxiang · 2024-05-15T14:50:01Z

Oh, @mrT23, I understand your point very well, and if it were me, I would make the same decision. We think alike.

KennyDizi added 2 commits May 14, 2024 08:21

Update tiktoken to 0.7.0

9dd2520

Add gpt-4o models

36ad893

codiumai-pr-agent-pro bot added the Review effort [1-5]: 2 label May 14, 2024

KennyDizi changed the title ~~Add gpt-4o models~~ Add new GPT-4o models and update tiktoken to 0.7.0 May 14, 2024

barnett-yuxiang reviewed May 14, 2024

barnett-yuxiang approved these changes May 14, 2024

mrT23 approved these changes May 14, 2024

mrT23 merged commit b4458ff into Codium-ai:main May 14, 2024
1 check passed

mrT23 mentioned this pull request May 14, 2024

Update tiktoken to use gpt-4o #899

Closed
