RuntimeError when setting up self hosted model + langchain integration #9
Hi! Thanks for raising this. It looks like the GPU type you're specifying is "A10", which is not a valid GPU type. Can you try "A100:1"? To see all the GPU types available, you can run `sky show-gpus -a`.
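For example, requesting the cluster with a valid accelerator string would look something like this (a sketch based on the snippet in the issue body; the cluster name here is arbitrary):

```python
import runhouse as rh

# "A100:1" means one A100 GPU; the accelerator string must match an entry
# in SkyPilot's catalog of available GPU types.
gpu = rh.cluster(name="rh-a100", instance_type="A100:1").save()
```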
Cc @concretevitamin - it looks to me like the accelerator validation for Lambda is not catching this properly?
Hey, thanks for the report. This bug showed up when the Lambda console contained existing instances in addition to the new instances launched with SkyPilot. This has been fixed on the SkyPilot main branch.
Ya, I spun up an A10 shortly after I wrote the above and realized it works and just wasn't in the catalogue 😄. Excellent, glad to hear it's fixed. @dcavadia I can help you get set up on the SkyPilot main branch if that's helpful, or you can use an existing Lambda instance you have up if you'd prefer to do that instead.
@dongreenberg It's a quirk on our end:
Based on my intuition to run the show-gpus command to see whether a particular variant exists in the catalogue, my bias would be to either show all GPUs by default or print a warning that the default output only lists common GPUs, and that you can run -a to see the full list. Maybe a middle ground would be that if I just run ...
Hi! I'm glad you guys found the issue, thanks a lot. I just set up the SkyPilot main branch with `pip install git+https://github.com/skypilot-org/skypilot` and that solved the problem I had before. It now sets up the instance on Lambda and I can launch it, but I get a new error while running the function; it looks like an InactiveRpcError. Any idea on this?
Great! Glad this worked. It's because of your working directory (referenced in the reqs by `pip:./`). If for some reason you're getting an error about gRPC not finding methods, the gRPC server on your instance went down from this error. You can restart it by running `gpu.restart_grpc_server()`.
Great! But no, it shouldn't take nearly that long to download a small model like gpt2. One way to see what's happening on the server is to call the RPC with stream_logs=True (though that's not integrated into langchain in a user-facing way). Can you halt that and try running the following:
If that doesn't work, there's a way to inspect the server logs directly that I can point you to. Thank you for bearing with us!
Yes, it would be great if you could point me to where to get the server logs directly. Thanks!
If you ssh into the cluster (you can just run `ssh rh-a10`, since SkyPilot adds the cluster to your SSH config) ...
That would indeed cause the thread to hang. It's confusing why Ray would be halting that when the resources are clearly available in `ray status`. Could you try running `gpu.restart_grpc_server(restart_ray=True)`?
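In other words, from the notebook, roughly (reusing the cluster name and instance type from the snippet in the issue body):

```python
import runhouse as rh

# Reconnect to the same cluster and restart both the Runhouse gRPC server
# and the Ray instance running on it.
gpu = rh.cluster(name="rh-a10", instance_type="A10:1")
gpu.restart_grpc_server(restart_ray=True)
```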
That did something; now at least the message is sent, but it's still giving some errors.

```
INFO | 2023-02-27 19:58:17,693 | Running _generate_text via gRPC
ERROR | 2023-02-27 19:58:18,564 | Internal Python error in the inspect module.
TypeError: 'str' object is not callable

During handling of the above exception, another exception occurred:

AttributeError: 'TypeError' object has no attribute 'render_traceback'

During handling of the above exception, another exception occurred:

AssertionError
```
Ok great - your local llm object is still using the pipeline reference string stored in the previous Ray KV store that we killed. You should be able to fix this by rerunning the cells that create the `llm` object.
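i.e., re-running something like the cell below (same as in the issue snippet) so the pipeline gets re-registered on the fresh Ray object store; this assumes the `gpu` cluster handle is still in scope:

```python
from langchain.llms import SelfHostedHuggingFaceLLM

# Re-creating the LLM re-sends the gpt2 pipeline to the cluster, so the local
# reference no longer points at the old (killed) Ray KV store.
llm = SelfHostedHuggingFaceLLM(
    model_id="gpt2", hardware=gpu, model_reqs=["pip:./", "transformers", "torch"]
)
```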
Oh I see. I reran the cells, but I'm back to the hang at `Running _generate_text via gRPC`, now with different log info.
Ok, that's a new one - notebooks are funny, I think something is sticking in memory. Would it be possible to restart the notebook kernel, run from the top, and run ...
Yes, it's been funny. I even tried with a new instance. This is the code so far:
I'm trying to run this in my virtual machine instead, but I'm still setting that up while investigating this issue within the notebook.
Hm, that code makes sense; it won't run?
It just gets hung at `Running _generate_text via gRPC`... Is it normal to have no resources show up when going to the https://api.run.house/ dashboard interface?
If you've logged into Runhouse (i.e. you've run `runhouse login`) ... Would you mind just confirming on the server whether the RPC hanging is the Ray resource insufficiency again? If so, I'll raise it with the Ray team, because it looks like a bug.
Oh I see. And yes, the resources problem doesn't seem to appear anymore. These are the logs from the server:
And this is the Ray status:
Thanks for your patience and sorry for the delay. I filed the issue above with Ray. While filing it, I noticed that your traceback has both Python 3.8 (miniconda) and Python 3.10 (user) and is probably calling different Ray versions through different layers. Do you know why that would be?
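One quick way to check which interpreter and Ray build each layer is using (run this both in the notebook and on the instance; just a diagnostic sketch):

```python
import sys

import ray

# Print the interpreter path/version and the Ray version for this environment,
# to spot mismatches between the miniconda (3.8) and user (3.10) installs.
print(sys.executable, sys.version_info[:3])
print("ray", ray.__version__, "from", ray.__file__)
```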
Interesting, I didn't notice that. I'm not sure why that would happen, but I'll dig into it right now. On the other hand, can you confirm that you used the same libraries/requirements in your Lambda instance as I did?
Thanks |
I'm having this bug when trying to set up a model on Lambda Cloud, running SelfHostedHuggingFaceLLM() after the rh.cluster() function.
```python
from langchain.llms import SelfHostedPipeline, SelfHostedHuggingFaceLLM
from langchain import PromptTemplate, LLMChain
import runhouse as rh

# Provision a Lambda Cloud cluster with one A10 GPU via SkyPilot and save the handle.
gpu = rh.cluster(name="rh-a10", instance_type="A10:1").save()

template = """Question: {question}
Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])

# Send a gpt2 pipeline to the cluster; model_reqs are installed remotely.
llm = SelfHostedHuggingFaceLLM(model_id="gpt2", hardware=gpu, model_reqs=["pip:./", "transformers", "torch"])
```
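The chain itself is then run roughly like this, which is what triggers the `_generate_text` call over gRPC mentioned in the comments above (reconstructed from the imports in the snippet; the question is just a placeholder):

```python
# Build the chain from the prompt + remote LLM and run a sample question.
llm_chain = LLMChain(prompt=prompt, llm=llm)
question = "What is the capital of France?"
print(llm_chain.run(question))
```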
I made sure with `sky check` that the Lambda credentials are set, but the error I get in the log is this, which I haven't been able to solve.
If I can get any help solving this, I would appreciate it.