Question about webshopEnv #14
Comments
Could you please tell me how to access the URL used in https://github.com/ysymyth/ReAct/blob/master/WebShop.ipynb (http://3.83.245.205:3000)? Thank you.
The live site is down (see princeton-nlp/WebShop#20). I installed WebShop from source and ran a local server for my experiments.
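For anyone following the same route, here is a minimal sketch of pointing the notebook at a local server instead of the dead address. The host and port (`127.0.0.1:3000`) and the `<base>/<session>` URL layout are assumptions about a default local WebShop setup, and `session_url` is a hypothetical helper, not something from the notebook; adjust them to match how your server is actually launched.

```python
# Assumption: a WebShop server built from source is listening locally on port 3000.
WEBSHOP_URL = "http://127.0.0.1:3000"


def session_url(base_url: str, session: str) -> str:
    """Build the landing-page URL for one session (hypothetical helper).

    Assumes the server exposes each session at <base_url>/<session>.
    """
    return f"{base_url.rstrip('/')}/{session}"


if __name__ == "__main__":
    # Session ids in the style mentioned in this thread.
    print(session_url(WEBSHOP_URL, "fixed_1"))
```

Replacing only the base URL keeps the rest of the environment code unchanged, which makes it easier to compare results with runs against the original site.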
@topwasu If your script works, would you mind sharing it or creating a pull request? Thanks!
@ysymyth You mean for the modified The modification:
I also removed the code that hides the 4th-10th items on the page in
Yes, but for the GPT-3 result.
@topwasu Could you share the prompt for ChatGPT?
Hi! I'm replicating the ReAct results on WebShop, and I have several questions about webshopEnv in the Jupyter notebook:
Is this also what you used in the paper?
assert False
when the Next or Prev button is clicked. Is this also intentional?
Also, I have obtained ReAct results on WebShop with session IDs fixed_{1-500}, which I believe is the same setup as in the paper, using this environment (unmodified) but with different LLMs (not PaLM-540B):
gpt-3.5-turbo
Act - Score: 64.99 Success Rate: 34.0
ReAct - Score: 59.9 Success Rate: 30.0
code-davinci-002
Act - Score: 64.99 Success Rate: 34.0
ReAct - Score: 65.60 Success Rate: 38.8
Is this to be expected? I'm wondering if you have any thoughts on this. After some research, I found people saying that chain-of-thought might not be as effective for models that were trained with RLHF, like ChatGPT. But I don't have a good explanation for why I'm not seeing the performance boost from Act to ReAct with Codex (code-davinci-002).
Thank you in advance! Love the simplicity of your work and I'm trying to come up with new ideas based off of this paper :)
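For readers comparing against the numbers above, here is a minimal sketch of how the two reported metrics could be aggregated from per-episode rewards. It assumes each episode returns a reward in [0, 1], that Score is the mean reward scaled to 0-100, and that Success Rate is the percentage of episodes with a reward of exactly 1; `summarize` is a hypothetical helper and may not match the exact evaluation code used for the results in this thread.

```python
from typing import Sequence, Tuple


def summarize(rewards: Sequence[float]) -> Tuple[float, float]:
    """Return (score, success_rate), both on a 0-100 scale.

    Assumption: score is the mean episode reward times 100, and an
    episode counts as a success only when its reward equals 1.
    """
    if not rewards:
        raise ValueError("need at least one episode reward")
    n = len(rewards)
    score = 100.0 * sum(rewards) / n
    success_rate = 100.0 * sum(1 for r in rewards if r == 1) / n
    return score, success_rate


if __name__ == "__main__":
    # Toy rewards for four episodes, not real results.
    score, success_rate = summarize([1.0, 0.5, 0.0, 1.0])
    print(f"Score: {score:.2f} Success Rate: {success_rate:.1f}")
```

Because partial rewards count toward Score but not toward Success Rate, the two numbers can move in different directions between Act and ReAct runs.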