BFCL April 9th Release (Dataset Bug Fix) #338

Merged: 11 commits into ShishirPatil:main on Apr 11, 2024

@HuanzhiMao HuanzhiMao commented Apr 11, 2024

This PR is for the BFCL April 9th release:

  1. Bug fix in the evaluation dataset. This involves modifying both prompts and function docs.
  2. Bug fix for possible answers.

The detailed breakdown is attached below. If you spot any issue with our evaluation dataset and/or possible answers, please feel free to raise an issue!

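| Test Category     | Prompt/Func Doc Correction Count | Possible Answer Correction Count |
|-------------------|----------------------------------|----------------------------------|
| Simple            | 3                                | 16                               |
| Parallel          | 1                                | 16                               |
| Multiple          | 1                                | 11                               |
| Parallel Multiple | 10                               | 43                               |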
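
For a quick sense of scale, the per-category counts above can be tallied. This is only a minimal sketch: the `corrections` dict and the `total_corrections` helper are illustrative names, not part of the BFCL codebase, and the numbers are copied by hand from the table.

```python
# Correction counts copied from the table in this PR description.
# Each value is (prompt/func doc corrections, possible answer corrections).
corrections = {
    "Simple": (3, 16),
    "Parallel": (1, 16),
    "Multiple": (1, 11),
    "Parallel Multiple": (10, 43),
}


def total_corrections(counts):
    """Sum both correction columns across all test categories."""
    prompt_total = sum(prompt for prompt, _ in counts.values())
    answer_total = sum(answer for _, answer in counts.values())
    return prompt_total, answer_total


prompt_total, answer_total = total_corrections(corrections)
print(f"Prompt/func doc corrections: {prompt_total}")  # 15
print(f"Possible answer corrections: {answer_total}")  # 86
```

Most of the fixes are to possible answers, and the Parallel Multiple category accounts for the largest share in both columns.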

This PR **DOES** change the leaderboard score. We will update the leaderboard website shortly, in PR #341.

Co-authored-by: Charlie Cheng-Jie Ji <charliechengjieji@berkeley.edu>
Co-authored-by: Fanjia Yan <fanjiayan@berkeley.edu>

@HuanzhiMao HuanzhiMao changed the title from "BFCL April 9th Release" to "[WIP] BFCL April 9th Release" Apr 11, 2024
@HuanzhiMao HuanzhiMao marked this pull request as ready for review April 11, 2024 09:02
@HuanzhiMao HuanzhiMao changed the title from "[WIP] BFCL April 9th Release" to "BFCL April 9th Release" Apr 11, 2024
@CharlieJCJ CharlieJCJ left a comment

LGTM

@ShishirPatil ShishirPatil merged commit 1dd0576 into ShishirPatil:main Apr 11, 2024
ShishirPatil pushed a commit that referenced this pull request Apr 11, 2024
This PR updates the leaderboard data, as mentioned in #338. As a result,
some values/scores have changed.
Note that the model `glaiveai/glaive-function-calling-v1` is excluded from
this leaderboard update because of a problem with the model's tokenizer:
we cannot generate that model's output data. To avoid confusion, we have
left it out of this update.
@HuanzhiMao HuanzhiMao deleted the BFCL-v2 branch April 11, 2024 23:48
@HuanzhiMao HuanzhiMao changed the title from "BFCL April 9th Release" to "BFCL April 9th Release (Dataset Bug Fix)" May 7, 2024
devanshamin pushed a commit to devanshamin/gorilla that referenced this pull request Jul 9, 2024