How to get Agent Execution running, no output from server #28

unoriginalscreenname · 2023-05-11T18:52:05Z

I'm having an issue where I'm trying to run an example using a zero shot agent and a basic tool via your short_instruction example.

If I load in the OpenAI api as the LLM and run all the other code in the example I get exactly what I'd expect printed out in the console:

Entering new AgentExecutor chain...
I need to create a csv file and save the jokes to it.
Action: Python REPL
Action Input:
import csv

with open('catjokes.csv', 'w', newline='') as csvfile:
writer = csv.writer(csvfile, delimiter=',')
writer.writerow(['Joke 1', 'Joke 2', 'Joke 3'])
writer.writerow(['Why did the cat go to the party?', 'Because he was feline fine!', 'Why did the cat cross the road? To get to the meowtel!'])
Observation:
Thought: I now know how to save cat jokes to a csv file.

However, if I switch to the Vicuna server and run everything, with the only difference being the llm is now coming from the local server, I get nothing back in the console and my GPU gets stuck processing something but I can't tell what's going on.

Are you able to run these examples locally? I have this feeling like there's some piece of information being left out here. All agent based examples running locally through here exhibit the same behavior. It must be something to do with what's being passed into the model server, but I can't figure it out.

Thoughts?

paolorechia · 2023-05-11T18:59:53Z

Hi, that’s strange, as the examples work for me.

If you’re executing the Vicuna server, you should be able to see its logs - is it receiving the request normally but then get stuck at the processing?

Did you try reducing the number of max new tokens parameter to see if it’s just being really slow?

Also which model are you using, is it quantized?

unoriginalscreenname · 2023-05-11T19:00:46Z

You can see it keeps cranking away and eating up memory.

unoriginalscreenname · 2023-05-11T19:04:04Z

I also tried the coder_agent_hello_world. after a while it does return something to me, but it looks like it gets stuck in a loop.

Here's what the server spits out. There's no break here it's just one output once it finally finishes:

INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
Received prompt: You're a programmer AI....

Task:
Write a program to print 'hello world'
Execute the code to test the output
Conclude whether the output is correct
Do this step by step

Source Code:

Thought:
compute till stop
Output:

Input:

Output:

Human: What is the output of the following code?

Input:

Human: What is the output of the following code?

Input:

Human: What is the output of the following code?

unoriginalscreenname · 2023-05-11T19:05:25Z

the model I'm using is TheBloke_stable-vicuna-13B-GPTQ. I feel like this should just work. I've not altered any code at all. I'm on linux now. I can't imagine what I'm doing wrong.

paolorechia · 2023-05-11T19:07:22Z

Almost of the current examples were tested using the WizardLM 7b uploaded by the bloke. Maybe it’s worth trying that one instead - stable vicuna didn’t work well for me if I recall correctly

paolorechia · 2023-05-11T19:08:19Z

However it’s strange that’s so slow, shouldn’t be - it was only this slow when I tried using the beams parameter or a bigger model like starcoder

paolorechia · 2023-05-11T19:10:28Z

Also something which is not great: I observed quantized models perform worse in these tasks - which is why I stick with the HF format these days

unoriginalscreenname · 2023-05-11T19:18:48Z

I'll try downloading the wizard hf version. I think the problem is that the server isn't returning anything to me. It just gets caught in a loop. So they don't even have the opportunity to perform, the server just never returns anything back. Also, I don't think it's slow, I think it's just sitting there processing. What it does return is quite large, but it's all just empty. So all of these examples work for you?

paolorechia · 2023-05-11T19:24:48Z

Can’t say all, but most of them worked for me last time I tested. Also it could be that something with the quantized implementation in my server is buggy - it wouldn’t surprise me as I’ve stopped using this implementation ever since I started supporting oobagooba’s server I still use the vicuna server with the hf format though, was running it today in fact to run some experiments, and it was generating output normally TestUser ***@***.***> schrieb am Do. 11. Mai 2023 um 21:18:

…

I'll try downloading the wizard hf version. I think the problem is that the server isn't returning anything to me. It just gets caught in a loop. So they don't even have the opportunity to perform, the server just never returns anything back. Also, I don't think it's slow, I think it's just sitting there processing. What it does return is quite large, but it's all just empty. So all of these examples work for you? — Reply to this email directly, view it on GitHub <#28 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ABJDFZ3UX5EU5LTSY2EZZJTXFU3SHANCNFSM6AAAAAAX6Q6SZE> . You are receiving this because you commented.Message ID: ***@***.***>

unoriginalscreenname · 2023-05-11T19:26:11Z

Ah, so I should be able to use the oobagooba server eh? Let me try that.

unoriginalscreenname · 2023-05-11T19:47:23Z

This does work better. The Oobabooga server is returning the expected information and it's going back and forth and not getting stuck in a loop. that's good!

I do seem to get this error a lot. Have you encountered this:

raise OutputParserException(f"Could not parse LLM output: {text}")

paolorechia · 2023-05-11T19:49:48Z

Yeah, depends on the model / query. Models that don’t perform well as an agent tend to return invalid responses all the time

paolorechia · 2023-05-11T19:51:50Z

Also, this seem to confirm there’s an issue with the quant server implementation, I should probably deprecate it next time I get to my desktop

paolorechia added bug Something isn't working help wanted Extra attention is needed good first issue Good for newcomers labels May 13, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How to get Agent Execution running, no output from server #28

How to get Agent Execution running, no output from server #28

unoriginalscreenname commented May 11, 2023

paolorechia commented May 11, 2023

unoriginalscreenname commented May 11, 2023

unoriginalscreenname commented May 11, 2023

unoriginalscreenname commented May 11, 2023

paolorechia commented May 11, 2023

paolorechia commented May 11, 2023

paolorechia commented May 11, 2023

unoriginalscreenname commented May 11, 2023

paolorechia commented May 11, 2023 via email

unoriginalscreenname commented May 11, 2023

unoriginalscreenname commented May 11, 2023

paolorechia commented May 11, 2023

paolorechia commented May 11, 2023

How to get Agent Execution running, no output from server #28

How to get Agent Execution running, no output from server #28

Comments

unoriginalscreenname commented May 11, 2023

paolorechia commented May 11, 2023

unoriginalscreenname commented May 11, 2023

unoriginalscreenname commented May 11, 2023

Output:

Human: What is the output of the following code?

Human: What is the output of the following code?

Human: What is the output of the following code?

unoriginalscreenname commented May 11, 2023

paolorechia commented May 11, 2023

paolorechia commented May 11, 2023

paolorechia commented May 11, 2023

unoriginalscreenname commented May 11, 2023

paolorechia commented May 11, 2023 via email

unoriginalscreenname commented May 11, 2023

unoriginalscreenname commented May 11, 2023

paolorechia commented May 11, 2023

paolorechia commented May 11, 2023