Add reflection pattern to agbench lint #6051

Open
wants to merge 20 commits into
base: main
gagb
Collaborator

@gagb gagb commented Mar 21, 2025

This pull request makes four main changes to the agbench package: it adds a new dependency, extends the Document class, enhances the code_document method, and introduces a new prompt for qualitative coding.

Dependency Updates:

  • Added the tiktoken library to the dependencies in pyproject.toml.

Enhancements to Document Class:

  • Added a new lines field to the Document class to store the document as a list of strings.
  • Updated the load_log_file function to populate the lines field in the Document class.
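
As a rough illustration of the Document change, here is a minimal sketch. Only the `lines` field and the `load_log_file` name come from the description above; the dataclass layout and the `name` and `text` fields are assumptions, and the actual agbench code may differ.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List


@dataclass
class Document:
    # `name` and `text` are assumed fields; only `lines` is named in this PR.
    name: str
    text: str
    lines: List[str] = field(default_factory=list)


def load_log_file(path: str) -> Document:
    """Read a log file and keep both the raw text and its individual lines."""
    raw = Path(path).read_text(encoding="utf-8")
    # splitlines() drops the line terminators, so each entry is one log line.
    return Document(name=Path(path).name, text=raw, lines=raw.splitlines())
```

Keeping the lines alongside the raw text makes it possible to split a long log on line boundaries later without re-parsing it.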

Improvements to Qualitative Coding:

  • Introduced a new MAIN_PROMPT for generating error codes in oai_coder.py.
  • Implemented the count_tokens function to count tokens using the tiktoken library.
  • Enhanced the code_document method to handle long documents, generate feedback, and update codes based on feedback. [1] [2] [3]
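
The flow described in these bullets (count tokens, split long documents, draft codes, collect feedback, revise) could look roughly like the sketch below. This is not the code in oai_coder.py: `chunk_lines`, `code_with_reflection`, the `llm` callable, the prompt strings, and the `cl100k_base` encoding are all assumptions made for illustration. Only `count_tokens` and the use of tiktoken are named in the description.

```python
from typing import Callable, List


def count_tokens(text: str, encoding_name: str = "cl100k_base") -> int:
    """Count tokens with tiktoken, or approximate by words if it is unavailable."""
    try:
        import tiktoken  # third-party; added to pyproject.toml in this PR

        return len(tiktoken.get_encoding(encoding_name).encode(text))
    except Exception:
        # Fallback for this sketch only (tiktoken missing or encoding not loadable).
        return len(text.split())


def chunk_lines(
    lines: List[str],
    max_tokens: int,
    counter: Callable[[str], int] = count_tokens,
) -> List[List[str]]:
    """Greedily pack consecutive lines into chunks of roughly max_tokens.

    A single line that exceeds the budget still gets a chunk of its own.
    """
    chunks: List[List[str]] = []
    current: List[str] = []
    used = 0
    for line in lines:
        cost = counter(line)
        # Start a new chunk when adding this line would exceed the budget.
        if current and used + cost > max_tokens:
            chunks.append(current)
            current, used = [], 0
        current.append(line)
        used += cost
    if current:
        chunks.append(current)
    return chunks


def code_with_reflection(
    chunk: str, llm: Callable[[str], str], rounds: int = 1
) -> str:
    """Draft codes, ask for feedback on them, then revise using that feedback."""
    codes = llm(f"Generate error codes for this log:\n{chunk}")
    for _ in range(rounds):
        feedback = llm(f"Critique these codes:\n{codes}\n\nLog:\n{chunk}")
        codes = llm(
            f"Revise the codes using this feedback:\n{feedback}\n\nCodes:\n{codes}"
        )
    return codes
```

Passing the model call in as a plain callable keeps the sketch independent of any particular client library and makes the reflection loop easy to exercise with a stub.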

These changes aim to improve the accuracy and efficiency of the qualitative coding process in the agbench package.

@gagb gagb requested a review from afourney March 21, 2025 02:23
@gagb
Collaborator Author

gagb commented Mar 21, 2025

@changliu2 and @ShipraJain01 fyi

@afourney
Member

Generally looks good, but it's failing the CI.

@gagb
Collaborator Author

gagb commented Mar 25, 2025

> Generally looks good, but it's failing the CI.

Working with @jackgerrits to resolve it. Not sure what's causing all the uv errors.

@gagb
Collaborator Author

gagb commented Mar 25, 2025

More features to add based on Chang's feedback:

  • Can you add a feature to summarize the codes into counts, like 3 counts for a red code "xyz" and 2 counts for a red code "abc", as in the M1 paper?
  • The LLM non-determinism is an issue. Would it be a good idea to set temperature to 0 and top_p to 1, or even fix the seed?
  • I repeated the run twice; the red categories were quite inconsistent.
  • Allow switching the model.
  • Batch summarization.
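
A minimal sketch of the first two requests, under stated assumptions: `summarize_codes` and `DETERMINISTIC_SETTINGS` are hypothetical names, the seed value is arbitrary, and a code is counted at most once per document, which is one possible reading of "counts".

```python
from collections import Counter
from typing import Dict, Iterable, List

# Assumed request settings, to be passed through to whichever chat-completion
# client is in use. The seed value 42 is arbitrary.
DETERMINISTIC_SETTINGS: Dict[str, float] = {"temperature": 0, "top_p": 1, "seed": 42}


def summarize_codes(codes_per_document: Iterable[List[str]]) -> Counter:
    """Count the number of documents in which each code appears."""
    counts: Counter = Counter()
    for codes in codes_per_document:
        # set() ensures a code repeated within one document is counted once.
        counts.update(set(codes))
    return counts
```

With three documents coded as `["xyz", "abc"]`, `["xyz"]`, and `["xyz", "abc", "abc"]`, this yields 3 for "xyz" and 2 for "abc". Fixed sampling settings may reduce run-to-run variation but are not guaranteed to remove it.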

@gagb
Collaborator Author

gagb commented Mar 26, 2025

@jackgerrits, I changed my uv to 0.5.18, but I am still stuck in the uv.lock rabbit hole. Help!

2 participants