Add JinaChat to the leaderboards #117

Merged: 3 commits, Aug 9, 2023

Conversation

jupyterjazz
Contributor

JinaChat Evaluation

This PR adds the JinaChat evaluation on the AlpacaEval dataset, using both the GPT-4 and Claude evaluators.

Instructions to evaluate the model

Set the JinaChat API key as an environment variable:

export JINA_CHAT_API_KEY=<key>

and run:

alpaca_eval evaluate_from_model --model_configs 'jina-chat' --annotators_config <annotator_name>
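
For anyone scripting these two steps, here is a minimal Python sketch that checks the key is set before assembling the command shown above. The helper name `build_eval_command` is hypothetical and not part of alpaca_eval; the annotator name is left as a parameter because the PR does not fix one.

```python
import os
import shlex


def build_eval_command(annotators_config, model_config="jina-chat"):
    """Return the alpaca_eval command from the PR description as an argv list.

    Hypothetical helper: it only checks that JINA_CHAT_API_KEY is set and
    assembles the arguments; it does not run anything itself.
    """
    if not os.environ.get("JINA_CHAT_API_KEY"):
        raise RuntimeError(
            "JINA_CHAT_API_KEY is not set; export it before running the evaluation."
        )
    return [
        "alpaca_eval",
        "evaluate_from_model",
        "--model_configs",
        model_config,
        "--annotators_config",
        annotators_config,
    ]


def format_command(argv):
    """Render the argv list as a shell-safe string for logging or copy-paste."""
    return shlex.join(argv)
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, which avoids shell quoting issues around the config names.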

Signed-off-by: jupyterjazz <saba.sturua@jina.ai>
Signed-off-by: jupyterjazz <saba.sturua@jina.ai>
logging.info(f"Completed {n_examples} examples in {t}.")

# refer to https://chat.jina.ai/billing
price = 0
Collaborator

YannDubs commented Aug 9, 2023

The price is not 0. Please either:

  • use [np.nan] * len(completions) to say that the price is not given
    or
  • (better) use the estimated price which seems to be approximately [0 if len(c) < 100 else 0.08 for c in completions]
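
The two options above could look roughly like this. This is a sketch only: the function names are hypothetical, the 100-character cutoff and the 0.08 figure are the reviewer's approximation, and `math.nan` is used as a stdlib stand-in for `np.nan` (both are the same float NaN value).

```python
import math


def unknown_prices(completions):
    """Option 1: one NaN per completion, meaning 'price not given'."""
    return [math.nan] * len(completions)


def estimated_prices(completions, min_chars=100, price=0.08):
    """Option 2: the reviewer's rough estimate.

    Completions shorter than `min_chars` characters are treated as free;
    anything longer is charged a flat `price`. Both defaults are taken
    from the review comment and are approximate.
    """
    return [0 if len(c) < min_chars else price for c in completions]
```

Returning one value per completion keeps the price list aligned with the completions list, which is what both list expressions in the review comment do.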

Contributor Author

Following the standard package, I calculate the price as follows:

if the message has more than 300 tokens, the price is 0.08;
otherwise it is 0.
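
A minimal sketch of that rule, for reference. The comment does not say how tokens are counted, so `count_tokens` below is a crude whitespace-split stand-in and an assumption; the function names are hypothetical and this is not the code that was merged.

```python
def count_tokens(message):
    """Crude stand-in tokenizer (assumption): counts whitespace-separated words."""
    return len(message.split())


def message_price(message, token_threshold=300, flat_price=0.08):
    """Flat price for messages longer than the threshold, otherwise free.

    The 300-token threshold and the 0.08 price come from the comment above.
    """
    return flat_price if count_tokens(message) > token_threshold else 0.0


def completion_prices(completions):
    """One price per completion, in the same order as the input."""
    return [message_price(c) for c in completions]
```

Note that a message with exactly 300 tokens is still free under this reading, since the rule says "more than 300".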

Collaborator

YannDubs commented Aug 9, 2023

Thanks for the contribution, JinaChat seems like a cool project/product!

Please make the few small changes above and I'll merge.

Signed-off-by: jupyterjazz <saba.sturua@jina.ai>
@jupyterjazz
Contributor Author

@YannDubs suggestions applied! Can you take another look? Thanks!

Collaborator

YannDubs commented Aug 9, 2023

LGTM, although I can't test it since I don't have access to the API key.
Merging, thanks!

@YannDubs YannDubs merged commit eda4a40 into tatsu-lab:main Aug 9, 2023
1 check passed
3 participants