Question about webshopEnv #14

topwasu · 2023-07-14T18:00:45Z

Hi! I'm replicating the ReAct results on WebShop, and I have several questions about webshopEnv in the Jupyter notebook.

It seems like you set the environment to output only the top 3 products (instead of the full 10):

if prod_cnt >= 3:
    processed_t = ''

Is this also what you used in the paper?

There's also an assert False that fires when the Next or Prev button is clicked. Is this also intentional?

Also, I have ReAct results on WebShop with session ids fixed_{1-500}, which I believe is the same setup as the paper, using this environment (unmodified) but with different LLMs (not PaLM-540B):

gpt-3.5-turbo
Act - Score: 64.99 Success Rate: 34.0
ReAct - Score: 59.9 Success Rate: 30.0

code-davinci-002
Act - Score: 64.99 Success Rate: 34.0
ReAct - Score: 65.60 Success Rate: 38.8

Is this to be expected? I'm wondering if you have any thoughts on this. After some research, I found people saying that chain-of-thought might not be as effective for models trained with RLHF, like ChatGPT. But I don't have a good explanation for why I'm not seeing much of a performance boost from Act to ReAct with Codex (code-davinci-002).
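For reference when comparing the numbers above, here is a minimal sketch of how the two metrics can be computed from per-episode rewards, assuming the WebShop definitions: score is 100 times the mean reward, and success rate is the percentage of episodes whose reward is exactly 1. `summarize_rewards` is a hypothetical helper name, not something from the notebook.

```python
from typing import Sequence, Tuple


def summarize_rewards(rewards: Sequence[float]) -> Tuple[float, float]:
    """Return (score, success_rate) as percentages.

    Assumption: each reward lies in [0, 1], and an episode counts as a
    success only when its reward equals 1.
    """
    if not rewards:
        raise ValueError("need at least one episode reward")
    n = len(rewards)
    score = 100.0 * sum(rewards) / n
    success_rate = 100.0 * sum(1 for r in rewards if r == 1) / n
    return score, success_rate


# Four episodes: two full successes, one partial match, one failure.
print(summarize_rewards([1.0, 0.5, 0.0, 1.0]))  # (62.5, 50.0)
```

Because partial matches raise the score but not the success rate, the two numbers can move in different directions between Act and ReAct.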

Thank you in advance! Love the simplicity of your work and I'm trying to come up with new ideas based off of this paper :)

leoozy · 2023-07-19T05:34:54Z

Could you please tell me how to access the url in the https://github.com/ysymyth/ReAct/blob/master/WebShop.ipynb: http://3.83.245.205:3000? Thank you

topwasu · 2023-07-19T16:27:29Z

> Could you please tell me how to access the url in the https://github.com/ysymyth/ReAct/blob/master/WebShop.ipynb: http://3.83.245.205:3000? Thank you

The live site is down (see princeton-nlp/WebShop#20). I installed WebShop from source and ran a local server for my experiments.

ysymyth · 2023-07-20T18:20:32Z

@topwasu if your script works, mind sharing or creating a pull request? thanks!

topwasu · 2023-07-21T01:01:56Z

@ysymyth You mean the modified webshopEnv and webshop_text? Yes, I can share the changes below and can also create a pull request if you want. However, I don't think a pull request makes sense if this is what you used in your paper: it looks more like an intentional modification to WebShop for the paper than a bug. So, going back to my original question: is this Jupyter notebook the one you used for the experimental results in the paper?

The modification:
I simply removed the two assert False statements under the if conditions for the 'Next >' and '< Prev' buttons in webshopEnv:

elif button == 'Next >':
  assert False # ad hoc page limitation # <-- I commented this out
  assert self.sessions[session]['page_type'] == 'search'
  self.sessions[session]['page_num'] += 1
elif button == '< Prev':
  assert self.sessions[session]['page_type'] in ['search', 'item_sub', 'item']
  if self.sessions[session]['page_type'] == 'search':
    assert False # <-- I commented this out
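An alternative to commenting out the asserts would be to put the page limitation behind a flag, so the paper setup and the full-paging setup can both be run from the same code. The class below is only a sketch under stated assumptions: `PagingSession` and `allow_paging` are hypothetical names, it models a single session rather than the `self.sessions` dict, and the behavior of '< Prev' on a search page (decrementing `page_num`, never below 1) is an assumption, since that part is not shown above.

```python
class PagingSession:
    """Hypothetical, simplified stand-in for one webshopEnv session."""

    def __init__(self, allow_paging: bool = False):
        # allow_paging=False mimics the notebook's ad hoc page limitation.
        self.allow_paging = allow_paging
        self.page_type = 'search'
        self.page_num = 1

    def click(self, button: str) -> None:
        if button == 'Next >':
            self._check_paging()
            assert self.page_type == 'search'
            self.page_num += 1
        elif button == '< Prev':
            assert self.page_type in ['search', 'item_sub', 'item']
            if self.page_type == 'search':
                self._check_paging()
                # Assumption: step back one results page, never below page 1.
                self.page_num = max(1, self.page_num - 1)
            # Handling of 'item_sub' and 'item' pages is omitted in this sketch.

    def _check_paging(self) -> None:
        if not self.allow_paging:
            raise RuntimeError('paging is disabled (ad hoc page limitation)')


session = PagingSession(allow_paging=True)
session.click('Next >')
print(session.page_num)  # 2
```

Raising an explicit error instead of `assert False` also keeps the limitation in place when Python runs with assertions disabled (`-O`).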

I also removed the code in webshop_text that hides the 4th through 10th items on the page:

elif t.parent.get('class') == ["product-link"]: # product asins
  processed_t = f'\n[{t}] '
  if prod_cnt >= 3: # <-- I commented this out
    processed_t = '' # <-- I commented this out
  prod_cnt += 1
  asins.append(str(t))
  just_prod = 0
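The same change could also be expressed as a parameter rather than deleted code. The sketch below is hypothetical: `render_product_links` and `max_products` do not exist in the notebook, and the function only mimics the `prod_cnt` logic on a plain list of ASIN strings instead of parsed HTML. With `max_products=3` it reproduces the notebook's behavior, and with `None` it shows every product.

```python
from typing import List, Optional, Tuple


def render_product_links(
    asins_on_page: List[str], max_products: Optional[int] = 3
) -> Tuple[str, List[str]]:
    """Hypothetical stand-in for the product-link branch of webshop_text.

    Returns the observation text and the list of all ASINs on the page.
    As in the snippet above, hidden products are still appended to the
    ASIN list; only their text is dropped from the observation.
    """
    observation = ''
    asins: List[str] = []
    prod_cnt = 0
    for asin in asins_on_page:
        processed_t = f'\n[{asin}] '
        # Hide the link once the cap is reached; None disables the cap.
        if max_products is not None and prod_cnt >= max_products:
            processed_t = ''
        prod_cnt += 1
        asins.append(asin)
        observation += processed_t
    return observation, asins


page = [f'B{i:03d}' for i in range(10)]
obs, asins = render_product_links(page)
print(obs.count('['), len(asins))  # 3 10
```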

ysymyth · 2023-07-23T02:50:13Z

> is this Jupyter notebook the one you used for the experimental results in the paper?

yes, but for the GPT-3 result.

SuhongMoon · 2024-01-03T23:21:43Z

@topwasu Could you share the prompt you used for ChatGPT?

ysymyth closed this as completed Jul 23, 2023
