Add text comparison metrics to Ben's PR #632

Merged
23 commits merged on Jun 21, 2024

Conversation

ntlind
Collaborator

@ntlind ntlind commented Jun 18, 2024

Improvements

  • Add ROUGE and BLEU metrics to evaluate_text_generation
  • Make a few minor suggested changes to the PR
  • Change back to python-slim from python-alpine, since the latter doesn't seem to work with Hugging Face's dependencies (e.g., pyarrow). The rationale is listed below. Alternatively, we could stick with alpine and try manually installing pyarrow's dependencies.
    • We've repeatedly run into dependency issues with python3.10-alpine; first it was scipy, now it's pyarrow. Switching back means shorter build times and fewer dependency limitations.
    • We originally switched to alpine because John said that "ubuntu/debian has historically been pretty bad at patching security issues." This doesn't seem like an overriding concern, and it matters even less in a future where we're decoupling from Chariot.
    • python3.10-slim is a "Docker Official" image; it seems secure enough for our use.
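
For reviewers less familiar with the two metrics, here is a rough pure-Python sketch of what ROUGE-N and BLEU measure. This is not the code added to evaluate_text_generation in this PR; the helper names (`ngrams`, `rouge_n`, `bleu`) are hypothetical, tokenization is plain whitespace splitting, and there is no stemming or smoothing, all of which real metric libraries handle more carefully.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def rouge_n(prediction, reference, n=1):
    """Illustrative ROUGE-N F1: n-gram overlap between prediction and reference."""
    pred = Counter(ngrams(prediction.split(), n))
    ref = Counter(ngrams(reference.split(), n))
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((pred & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def bleu(prediction, reference, max_n=2):
    """Illustrative single-reference BLEU with uniform weights and no smoothing."""
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    if not pred_tokens:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        pred = Counter(ngrams(pred_tokens, n))
        ref = Counter(ngrams(ref_tokens, n))
        total = sum(pred.values())
        # Clipped counts: each predicted n-gram is credited at most as often
        # as it appears in the reference.
        clipped = sum((pred & ref).values())
        if total == 0 or clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty discourages predictions shorter than the reference.
    if len(pred_tokens) >= len(ref_tokens):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref_tokens) / len(pred_tokens))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)
```

ROUGE leans toward recall (how much of the reference is covered), while BLEU is precision-based with a penalty for overly short predictions, so reporting both gives a more balanced view of a generated answer.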

@ntlind ntlind self-assigned this Jun 21, 2024
@ntlind ntlind marked this pull request as ready for review June 21, 2024 07:12
@ntlind ntlind merged commit de9be2e into add_llm_guided_metrics Jun 21, 2024
10 of 11 checks passed
@ntlind ntlind deleted the add_text_comparison branch June 21, 2024 20:25
2 participants