Information Retrieval - Challenge b #4261

PortlandKyGuy · 2023-05-17T04:03:14Z

Background

This is information retrieval challenge b. It retrieves information that is consistent over time. It also requires enough detail that a couple of LLM requests are often needed.

Relates to Issue #3837

Changes

Added a new agent specific to this challenge (get_nobel_prize_agent) in tests/integration/agent_factory.py

Added the new information retrieval test, which is more challenging than challenge 'a', but not much more. The file is tests/integration/challenges/information_retrieval/test_information_retrieval_challenge_b.py

Documentation

The test's methods are documented and use the same structure as challenge a.

Test Plan

This is a challenge and is intended to be solved over time. Currently it is marked as skip.

PR Quality Checklist

My pull request is atomic and focuses on a single change.
I have thoroughly tested my changes with multiple different prompts.
I have considered potential risks and mitigations for my changes.
I have documented my changes clearly and comprehensively.
I have not snuck in any "extra" small tweaks or changes.

vercel · 2023-05-17T04:03:18Z

Deployment failed with the following error:

Resource is limited - try again in 2 hours (more than 100, code: "api-deployments-free-per-day").

codecov · 2023-05-17T04:06:30Z

Codecov Report

Patch and project coverage have no change.

Comparison is base (ee9f10a) 67.77% compared to head (d8811a4) 67.77%.

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #4261   +/-   ##
=======================================
  Coverage   67.77%   67.77%           
=======================================
  Files          72       72           
  Lines        3516     3516           
  Branches      560      560           
=======================================
  Hits         2383     2383           
  Misses        948      948           
  Partials      185      185


tests/integration/agent_factory.py

waynehamadi · 2023-05-17T13:45:53Z

@PortlandKyGuy I left some comments. Here is my take: we can keep your test, but specify that the information retrieval challenges also apply to using the LLM.

Here is my suggestion:

add the web command to the challenge
change CYCLE_COUNT to 2
Normally this challenge should be done in 2 cycles (1 to write to file and 1 for task_complete), because it has to use the LLM to get the answer.
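
To make the two-cycle budget concrete, here is a rough sketch of a bounded loop. It is not the project's run_interaction_loop: FakeAgent, its script field, and run_for_cycles are hypothetical names made up for illustration, and the agent simply replays a scripted list of commands.

```python
from dataclasses import dataclass, field
from typing import List

CYCLE_COUNT = 2  # the budget suggested in this review


@dataclass
class FakeAgent:
    """Hypothetical stand-in for the real agent; it replays scripted commands."""

    script: List[str]
    executed: List[str] = field(default_factory=list)

    def step(self) -> str:
        # Return the next scripted command, or "noop" once the script runs out.
        index = len(self.executed)
        command = self.script[index] if index < len(self.script) else "noop"
        self.executed.append(command)
        return command


def run_for_cycles(agent: FakeAgent, cycle_count: int) -> bool:
    """Run at most cycle_count steps and report whether task_complete was reached."""
    for _ in range(cycle_count):
        if agent.step() == "task_complete":
            return True
    return False


# The expected path: one cycle to write the file, one to finish.
agent = FakeAgent(script=["write_to_file", "task_complete"])
finished = run_for_cycles(agent, CYCLE_COUNT)

# An agent that browses first needs a third cycle and does not finish in time.
slow_agent = FakeAgent(script=["browse_website", "write_to_file", "task_complete"])
slow_finished = run_for_cycles(slow_agent, CYCLE_COUNT)
```

With this budget, an agent that answers from the LLM finishes, while one that spends a cycle on the web first runs out of cycles.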

This is consistent with the obtain knowledge function we plan to build: it first asks the LLM and then, if the LLM is not confident, asks the web.
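
A minimal sketch of that fallback order, assuming the LLM call can return some confidence score. The function does not exist yet, so every name here (obtain_knowledge, ask_llm, ask_web, min_confidence, Answer) is a placeholder, and the 0.8 threshold is arbitrary.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Answer:
    text: str
    source: str  # "llm" or "web"


def obtain_knowledge(
    question: str,
    ask_llm: Callable[[str], Tuple[str, float]],
    ask_web: Callable[[str], str],
    min_confidence: float = 0.8,  # assumed threshold, not a project setting
) -> Answer:
    """Ask the LLM first; fall back to the web only when it is not confident."""
    text, confidence = ask_llm(question)
    if confidence >= min_confidence:
        return Answer(text=text, source="llm")
    return Answer(text=ask_web(question), source="web")


# Stub backends so the sketch runs without any network or model access.
def confident_llm(question: str) -> Tuple[str, float]:
    return ("answer from llm", 0.95)


def unsure_llm(question: str) -> Tuple[str, float]:
    return ("guess from llm", 0.30)


def web_search(question: str) -> str:
    return "answer from web"


from_llm = obtain_knowledge("example question", confident_llm, web_search)
from_web = obtain_knowledge("example question", unsure_llm, web_search)
```

Passing the two backends in as callables keeps the ordering logic testable on its own; how confidence would actually be measured is left open.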

PortlandKyGuy · 2023-05-17T19:44:37Z

@merwanehamadi, it looks like there were also changes to run_interaction_loop that were not picked up. I will work on your suggestions and on fixing run_interaction_loop for this test as well.

vercel · 2023-05-17T19:59:07Z

Name	Status	Updated (UTC)
docs	✅ Ready	May 29, 2023 0:22am

waynehamadi · 2023-05-17T23:40:12Z

Thank you! ❤️
@PortlandKyGuy

tests/integration/challenges/information_retrieval/test_information_retrieval_challenge_b.py

vercel · 2023-05-26T14:18:07Z

Deployment failed with the following error:

Resource is limited - try again in 10 minutes (more than 100, code: "api-deployments-free-per-day").

waynehamadi · 2023-05-26T14:58:05Z

@PortlandKyGuy thanks! Could you fix the linter?
Also, we need this in the docs; you can copy how it's been done with docs/challenges/information_retrieval/challenge_a.md

Don't forget to update mkdocs.yml as well, please.

waynehamadi · 2023-05-29T02:02:00Z

@PortlandKyGuy thanks a lot! It's here now; there are some things to add:
#4456

PortlandKyGuy added 5 commits May 16, 2023 20:46

test: add information retrieval challenge b

3d57985

test: get information retrieval challenge be working.

716c6de

chore: clean up comments and imports.

8ac2913

chore: fix incorrect import

9f49ced

chore: clean up imports.

3fd6575

github-actions bot added the size/l label May 17, 2023

waynehamadi suggested changes May 17, 2023

tests/integration/agent_factory.py (two resolved comments)

gravelBridge added the challenge label May 17, 2023

fix: add web_selenium cmd. resolve missing loop cycle

43025b0

chore: remove commented code and unused imports.

6608080

Merge branch 'Significant-Gravitas:master' into master

fca5f64

vercel bot temporarily deployed to Preview May 20, 2023 20:42 Inactive

waynehamadi suggested changes May 24, 2023

tests/integration/challenges/information_retrieval/test_information_retrieval_challenge_b.py (outdated, resolved)

Merge branch 'Significant-Gravitas:master' into master

c42b8a5

vercel bot temporarily deployed to Preview May 26, 2023 13:33 Inactive

PortlandKyGuy added 3 commits May 26, 2023 06:46

fix (4261): use 2 cycles instead of 3

1cc8cb4

chore: fix mypy formatting

2ca53ce

chore: try 2 for mypy formatting

82eeea9

PortlandKyGuy and others added 3 commits May 27, 2023 21:11

Merge branch 'Significant-Gravitas:master' into master

d049d67

chore: resolve flake8 issues

c3b6da7

chore: add docs

b1623c5

vercel bot temporarily deployed to Preview May 28, 2023 04:27 Inactive

PortlandKyGuy and others added 4 commits May 27, 2023 21:39

chore: resolve linting flake8

9025171

Merge branch 'Significant-Gravitas:master' into master

9e25310

chore: correct formatting to black

67c95bb

Update challenge_b.md

d8811a4

vercel bot temporarily deployed to Preview May 29, 2023 00:22 Inactive

waynehamadi mentioned this pull request May 29, 2023

Information retrieval challenge #4456

Merged

6 tasks

waynehamadi closed this May 29, 2023

waynehamadi mentioned this pull request May 31, 2023

Help us build challenges! #3835

Closed
