This repository has been archived by the owner on Sep 30, 2023. It is now read-only.

Not getting any predictions - but posting to turbopilot.......... #7

Closed
voarsh2 opened this issue Apr 13, 2023 · 6 comments

Comments


voarsh2 commented Apr 13, 2023

Running the latest code for turbopilot. Running the extension (fauxpilot) v1.1.4. I've tried earlier versions too...

It runs:
image

I see POST requests:
image

I see the calls in VSC:
image

But I don't get any predictions like the ones shown in the project's .gif... 🗡️
Am I missing something?

My settings:
image

image

@ravenscroftj
Owner

Thanks for your ticket. Looking at the prediction logs you screenshotted there, it's taking about 2 minutes to generate a response on your system. I'm not 100% sure but it's possible you're hitting a timeout issue from within vscode. Are you seeing any timeout logs from fauxpilot? What system are you running on and how many threads are you using?
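To make the timeout hypothesis concrete, here is a minimal sketch of how a client might give up on a slow request. It is not taken from the fauxpilot extension, whose actual timeout handling isn't shown in this thread; `withTimeout` and the 30 second limit are hypothetical names and values chosen for illustration.

```typescript
// Hypothetical helper, not code from the fauxpilot extension.
// Rejects if `work` has not settled within `timeoutMs` milliseconds.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`no response after ${timeoutMs} ms`)),
      timeoutMs,
    );
    work.then(
      (value) => {
        clearTimeout(timer); // answered in time, cancel the timeout
        resolve(value);
      },
      (err) => {
        clearTimeout(timer); // failed for another reason, pass it on
        reject(err);
      },
    );
  });
}

// Usage sketch: a response that takes about 2 minutes would be dropped
// by a client that only waits 30 seconds (an assumed limit).
// Note that the underlying request is not cancelled, only ignored.
// const completion = await withTimeout(requestCompletion(prompt), 30000);
```

If a client behaves like this and swallows the rejection, the request would still look "done" on the server side while no suggestion ever appears in the editor.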

Author

voarsh2 commented Apr 13, 2023

> What system are you running on and how many threads are you using?

The memory usage was around 4 GB. The CPU is reported as 12 x AMD Ryzen 5 3600X 6-Core Processor (1 Socket), with AVX2 support.

At the time I believe I was using 6 threads. I had also been using the 2B model.

> I'm not 100% sure but it's possible you're hitting a timeout issue from within vscode. Are you seeing any timeout logs from fauxpilot?

The developer tools show the extension making the requests to the server and show that each request completes, but no prediction is returned. There's no log there about a timeout that I can see. :/

I tried with 2 threads and saw a prediction (once) - trying to reproduce... but it seems very hit and miss (it was also nonsensical and pressing TAB didn't accept the prediction) O.o
Might need to open an issue on the fauxpilot vsc plugin repo.

image
image

image

One thing I notice: on K8s I have a readiness health check for startup, and while I'm requesting predictions it often fails, so K8s stops routing traffic to the pod. My CPU isn't pegged at 100%, so I suspect the web server is getting hung.

@ravenscroftj
Owner

OK, great, thank you for your report. Is it possible that k8s is trying to kill and restart the pod while a prediction is happening and the health endpoint is timing out? I imagine k8s would be a pretty common deployment case, so I definitely want to support a liveness check. I think the web server can only handle one request at a time at the moment - I will see if I can turn on Crow's multithreaded support.
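The interaction between a one-request-at-a-time server and a health probe can be illustrated with a toy model. This is only a sketch of the reasoning above, not turbopilot's or Crow's actual code; `ProbeScenario` and `probeAnsweredInTime` are hypothetical names, and the model assumes that answering the probe itself takes no time once a worker is free.

```typescript
// Toy model of a health probe arriving while predictions are running.
// Assumption: each worker handles exactly one request at a time and
// there are never more in-flight predictions than workers.
interface ProbeScenario {
  workers: number;        // server threads able to handle requests
  busyUntilMs: number[];  // finish time of each in-flight prediction
  probeArrivalMs: number; // when the health probe arrives
  probeTimeoutMs: number; // how long the prober waits for an answer
}

function probeAnsweredInTime(s: ProbeScenario): boolean {
  // Any worker not occupied by a prediction can answer immediately.
  const freeWorkers = s.workers - s.busyUntilMs.length;
  if (freeWorkers > 0) return true;
  // Otherwise the probe waits for the earliest prediction to finish.
  const earliestFreeMs = Math.min(...s.busyUntilMs);
  const waitMs = Math.max(0, earliestFreeMs - s.probeArrivalMs);
  return waitMs <= s.probeTimeoutMs;
}

// A prediction taking about 2 minutes, probed 10 s in with a 1 s timeout.
const singleThreaded = probeAnsweredInTime({
  workers: 1, busyUntilMs: [120000], probeArrivalMs: 10000, probeTimeoutMs: 1000,
}); // false: the probe is stuck behind the prediction

const multiThreaded = probeAnsweredInTime({
  workers: 2, busyUntilMs: [120000], probeArrivalMs: 10000, probeTimeoutMs: 1000,
}); // true: a second worker answers the probe
```

Under these assumptions the CPU does not need to be saturated for the probe to fail; the probe simply never gets a turn until the prediction finishes.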

I also noticed that the fauxpilot plugin can be a little bit fussy about retaining focus - I think if you start a prediction, switch to another window and switch back, it will cancel the suggestion popup. I might dive into the plugin itself over the weekend to see if I can make any improvements myself - from a UX point of view even a notification of progress would be welcome, right?

Owner

ravenscroftj commented Apr 15, 2023

I've now enabled multi-threaded serving in v0.0.4 so hopefully you won't find that the healthcheck endpoint becomes unresponsive while the system is working.

I've also forked the fauxpilot plugin for VSCode. The fork shows a status wheel while it is thinking, which helps with the UX a bit. I've provided a custom build of the plugin that can be downloaded and installed via the install from VSIX file option for now (hopefully the upstream maintainer will incorporate my PR).

image

I've also noticed that the fauxpilot plugin is a bit fussy about maintaining focus, so if you click out of the window while it's thinking it seems to silently throw the suggestions away.


whatvn commented Apr 17, 2023

Are you thinking about changing the way this extension works so that, instead of generating code automatically, it responds to a command? For example, if I want to generate code, I press "CMD-K". This would cut out a lot of unnecessary requests to the server. What do you think?

@ravenscroftj
Owner

@whatvn I think that'd be great - I'm looking into the feasibility at the moment. Within VSCode we're limited by what their API lets us do. The API Documentation states that "Providers are asked for completions either explicitly by a user gesture or implicitly when typing." so in theory it seems possible, but I am looking at how to make it work with a user gesture, or at least make it wait a little longer after you stop typing before sending a request.
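The two options mentioned here (explicit gesture only, or a longer pause after typing) can be sketched as a single decision function. This is a hedged illustration, not the plugin's code: `TriggerKind` is a local stand-in for however the editor reports explicit versus implicit triggers, and `RequestPolicy`, `explicitOnly` and `idleDelayMs` are hypothetical settings invented for the example.

```typescript
// Local stand-in for the editor's "explicit gesture" vs "implicit while
// typing" distinction; not imported from the VS Code API.
type TriggerKind = "invoke" | "automatic";

// Hypothetical settings, not options of the real extension.
interface RequestPolicy {
  explicitOnly: boolean; // only request when the user asks for it
  idleDelayMs: number;   // required pause after the last keystroke
}

function shouldRequestCompletion(
  trigger: TriggerKind,
  msSinceLastKeystroke: number,
  policy: RequestPolicy,
): boolean {
  // An explicit gesture (for example a keybinding) always goes through.
  if (trigger === "invoke") return true;
  // In explicit-only mode, typing alone never contacts the server.
  if (policy.explicitOnly) return false;
  // Otherwise wait until the user has paused for long enough.
  return msSinceLastKeystroke >= policy.idleDelayMs;
}

// Usage sketch: with a 1.5 s idle delay, a request 200 ms after a
// keystroke is skipped, while one after a 2 s pause is sent.
const policy: RequestPolicy = { explicitOnly: false, idleDelayMs: 1500 };
const whileTyping = shouldRequestCompletion("automatic", 200, policy);
const afterPause = shouldRequestCompletion("automatic", 2000, policy);
```

Either mode would reduce the number of requests that reach a server for which each prediction is expensive.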
