Rememberer: Large Language Models Are Semi-Parametric Reinforcement Learning Agents

Code repository for the RLEM (Reinforcement Learning with Experience Memory) agent, Rememberer. The corresponding paper is available on arXiv and has been accepted to NeurIPS 2023.

Launch Test

launchw.sh is the launcher for the WebShop experiments. The corresponding main program is webshop.py. The WebShop environment must be set up before launching the experiment.

launch.sh is the launcher for the WikiHow experiments. The corresponding main program is wikihow.py. The Mobile-Env environment must be set up before launching the program. WikiHow task set v1.2 is used. Additionally, the VhIoWrapper wrapper requires a tokenizer, which can be downloaded from Hugging Face. The bert-base-uncased tokenizer works.

To launch a test with static exemplars, add the --static option in the script.

To train a Rememberer agent, add the --train option in the script. When launching training, you may want to shrink the test set so that the program does not run a complete evaluation every epoch.

The exemplars and prompt templates are stored under prompts and the initial history memories are stored under history-pools.

The OpenAI API key is configured through openaiconfig.yaml.

About Training Set

In the paper, two training sets are used for the WebShop experiments:

S0: [500, 510)
S1: [510, 520)

Both training sets lie entirely outside the test sets used by ReAct and by this paper. To enable them, pass --trainseta 0 --trainsetb 10 or --trainseta 10 --trainsetb 20, respectively. Other training sets can be tried as well.
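
The relation between the two options and the index ranges can be sketched as follows. This is only an illustration: the helper training_indices and the constant WEBSHOP_TRAIN_BASE are hypothetical names, and the base index 500 is inferred from the ranges listed above, not taken from webshop.py.

```python
# Hypothetical helper, not part of the repository.
# Assumption: --trainseta/--trainsetb are offsets from task index 500,
# inferred from S0 = [500, 510) and S1 = [510, 520).
WEBSHOP_TRAIN_BASE = 500


def training_indices(trainseta: int, trainsetb: int,
                     base: int = WEBSHOP_TRAIN_BASE) -> range:
    """Map the two offsets to a half-open range of task indices."""
    if not 0 <= trainseta < trainsetb:
        raise ValueError("expected 0 <= trainseta < trainsetb")
    return range(base + trainseta, base + trainsetb)


s0 = training_indices(0, 10)   # would correspond to S0: [500, 510)
s1 = training_indices(10, 20)  # would correspond to S1: [510, 520)
print(s0.start, s0.stop)
print(s1.start, s1.stop)
```

Under this assumption, any other pair of offsets selects a different contiguous block of tasks in the same way.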

The training sets for the WikiHow experiments are selected from the complement of the micro canonical set within the canonical set of WikiHow. They are:

S0:
- add_a_contact_on_whatsapp-8
- avoid_misgendering-0
- become_a_grandmaster-7
- become_a_hooters_girl-8
- become_a_pro_footballP28soccerP29_manager-7
- become_a_specialist_physician-4
- be_cool_in_high_school_P28boysP29-0
- care_for_florida_white_rabbits-4
- fix_wet_suede_shoes-6
- get_zorua_in_pokPC3PA9mon_white-6
S1:
- be_free-0
- build_a_robot_car-8
- change_an_excel_sheet_from_read_only-4
- choose_a_swiss_army_knife-8
- color_streak_a_ponytail-0
- come_up_with_a_movie_idea-4
- contact_avast_customer_support-7
- drink_mezcal-7
- identify_hickory_nuts-6
- wear_a_dress_to_school-6

The selection only keeps the task categories balanced and applies no other filtering.

Customized Codes for WebShop

As stated in the paper and its supplementary material, the text_rich observation format of WebShop is further simplified, following the approach of ReAct. In addition, two typos in closing tags in the HTML templates are corrected. The customized code is provided at zdy023/WebShop.

Citation

@article{DanyangZhang2023_Rememberer,
  author       = {Danyang Zhang and
                  Lu Chen and
                  Situo Zhang and
                  Hongshen Xu and
                  Zihan Zhao and
                  Kai Yu},
  title        = {Large Language Model Is Semi-Parametric Reinforcement Learning Agent},
  journal      = {CoRR},
  volume       = {abs/2306.07929},
  year         = {2023},
  url          = {https://doi.org/10.48550/arXiv.2306.07929},
  doi          = {10.48550/arXiv.2306.07929},
  eprinttype    = {arXiv},
  eprint       = {2306.07929},
}

