Error when creating GPU-4G-7B-CN; hardware is an RTX 2060 6G #9

Closed
alphaDeng opened this issue May 24, 2023 · 8 comments

@alphaDeng

The model file is RWKV-4-Raven-7B-v11-Eng49%-Chn49%-Jpn1%-Other1%-20230430-ctx8192.pth
The full error output is:
INFO: Started server process [17688]
INFO: Waiting for application startup.
torch found: D:\py-ai\RWVK-Runner\py310\Lib\site-packages\torch\lib
torch set
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: 127.0.0.1:49573 - "GET / HTTP/1.1" 200 OK
INFO: 127.0.0.1:49573 - "GET / HTTP/1.1" 200 OK
INFO: 127.0.0.1:49573 - "GET /status HTTP/1.1" 200 OK
INFO: 127.0.0.1:49573 - "OPTIONS /update-config HTTP/1.1" 200 OK
INFO: 127.0.0.1:49573 - "OPTIONS /switch-model HTTP/1.1" 200 OK
max_tokens=4100 temperature=1.2 top_p=0.5 presence_penalty=0.4 frequency_penalty=0.4
INFO: 127.0.0.1:49573 - "POST /update-config HTTP/1.1" 200 OK
RWKV_JIT_ON 1 RWKV_CUDA_ON 0 RESCALE_LAYER 6

Loading models/RWKV-4-Raven-7B-v11-Eng49%-Chn49%-Jpn1%-Other1%-20230430-ctx8192.pth ...
Strategy: (total 32+1=33 layers)

  • cuda [float16, uint8], store 8 layers, stream 24 layers
  • cpu [float32, float32], store 1 layers
    0-cuda-float16-uint8 1-cuda-float16-uint8 2-cuda-float16-uint8 3-cuda-float16-uint8 4-cuda-float16-uint8 5-cuda-float16-uint8 6-cuda-float16-uint8 7-cuda-float16-uint8 8-cuda-float16-uint8-stream 9-cuda-float16-uint8-stream 10-cuda-float16-uint8-stream 11-cuda-float16-uint8-stream 12-cuda-float16-uint8-stream 13-cuda-float16-uint8-stream 14-cuda-float16-uint8-stream 15-cuda-float16-uint8-stream 16-cuda-float16-uint8-stream 17-cuda-float16-uint8-stream 18-cuda-float16-uint8-stream 19-cuda-float16-uint8-stream 20-cuda-float16-uint8-stream 21-cuda-float16-uint8-stream 22-cuda-float16-uint8-stream 23-cuda-float16-uint8-stream 24-cuda-float16-uint8-stream 25-cuda-float16-uint8-stream 26-cuda-float16-uint8-stream 27-cuda-float16-uint8-stream 28-cuda-float16-uint8-stream 29-cuda-float16-uint8-stream 30-cuda-float16-uint8-stream 31-cuda-float16-uint8-stream 32-cpu-float32-float32
emb.weight f16 cpu 50277 4096
1 validation error for RWKV
root
  Torch not compiled with CUDA enabled (type=assertion_error)
INFO: 127.0.0.1:49573 - "POST /switch-model HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
  File "D:\py-ai\RWVK-Runner\backend-python\routes\config.py", line 36, in switch_model
    RWKV(
  File "pydantic\main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for RWKV
root
  Torch not compiled with CUDA enabled (type=assertion_error)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 428, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\uvicorn\middleware\proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
  File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\fastapi\applications.py", line 276, in __call__
    await super().__call__(scope, receive, send)
  File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\starlette\applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\starlette\middleware\errors.py", line 184, in __call__
    raise exc
  File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\starlette\middleware\errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\starlette\middleware\cors.py", line 92, in __call__
    await self.simple_response(scope, receive, send, request_headers=headers)
  File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\starlette\middleware\cors.py", line 147, in simple_response
    await self.app(scope, receive, send)
  File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\starlette\middleware\exceptions.py", line 79, in __call__
    raise exc
  File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\starlette\middleware\exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\fastapi\middleware\asyncexitstack.py", line 21, in __call__
    raise e
  File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\fastapi\middleware\asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\starlette\routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\starlette\routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\starlette\routing.py", line 66, in app
    response = await func(request)
  File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\fastapi\routing.py", line 237, in app
    raw_response = await run_endpoint_function(
  File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\fastapi\routing.py", line 165, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\starlette\concurrency.py", line 41, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
  File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\anyio\to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "D:\py-ai\RWVK-Runner\py310\Lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "D:\py-ai\RWVK-Runner\backend-python\routes\config.py", line 45, in switch_model
    raise HTTPException(status.HTTP_500_INTERNAL_SERVER_ERROR, "failed to load")
AttributeError: 'function' object has no attribute 'HTTP_500_INTERNAL_SERVER_ERROR'
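Aside from the CUDA problem, the final AttributeError is a separate, smaller bug: a name clash on `status`. The log above shows a GET /status route, and if a function named `status` is visible in config.py's namespace, it shadows fastapi's `status` module in the error-handling path. A minimal sketch that reproduces the same traceback (route and handler names here are illustrative assumptions, not quoted from the repo):

```python
from fastapi import FastAPI, HTTPException, status

app = FastAPI()

@app.get("/status")
def status():  # hypothetical: defining this rebinds `status`, shadowing the imported module
    return {"ok": True}

@app.post("/switch-model")
def switch_model():
    # `status` now resolves to the route handler (a plain function), so this line
    # raises: AttributeError: 'function' object has no attribute
    # 'HTTP_500_INTERNAL_SERVER_ERROR', matching the traceback above.
    raise HTTPException(status.HTTP_500_INTERNAL_SERVER_ERROR, "failed to load")
```

Renaming the handler, or passing the literal 500, would sidestep the clash.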

@josStorer (Owner)

Delete the site-packages\torch directory and run again; you will be prompted to install dependencies. Install the pinned version of PyTorch: it needs to be 1.13.
Also, here is the recommended configuration for a 2060 6G:
[screenshot: recommended configuration for RTX 2060 6G]

@josStorer (Owner)

Make sure py310\Lib\site-packages\torch-1.13.1+cu117.dist-info exists; it must not be a 2.x version.
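A quick sanity check that the environment matches this requirement (the pip command shown is the standard PyTorch cu117 wheel index, included as an assumption rather than quoted from this thread):

```python
# Install the pinned build into the Runner's py310 environment first, e.g.:
#   pip install torch==1.13.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
import torch

print(torch.__version__)          # expect "1.13.1+cu117", not a 2.x build
print(torch.cuda.is_available())  # must be True; False reproduces the
                                  # "Torch not compiled with CUDA enabled" error above
```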

@alphaDeng (Author)

After installing the new PyTorch (torch==1.13.1+cu117), the model can run, but with the settings above it runs out of VRAM.
After lowering the number of layers loaded into VRAM to 11, the model runs, but the warning below appears and the chat replies come out garbled.

Warning output:
INFO: 127.0.0.1:52740 - "POST /switch-model HTTP/1.1" 200 OK
INFO: 127.0.0.1:52804 - "OPTIONS /chat/completions HTTP/1.1" 200 OK
INFO: 127.0.0.1:52804 - "POST /chat/completions HTTP/1.1" 200 OK
D:\py-ai\RWVK-Runner\py310\Lib\site-packages\rwkv\model.py:672: UserWarning: operator () profile_node %194 : int = prim::profile_ivalue(%192)
does not have profile information (Triggered internally at ..\torch\csrc\jit\codegen\cuda\graph_fuser.cpp:109.)
x, state[i5+0], state[i5+1], state[i5+2], state[i5+3] = ATT(

Examples of the garbled output:
[screenshot 1]
[screenshot 2]

It works normally when switched to cpu. Could this be a CUDA version issue?
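For context on the "layers loaded into VRAM" setting: RWKV-Runner's backend passes a strategy string to the rwkv package, where a lower stored-layer count shifts more layers to streaming. A minimal sketch of loading the same model with 11 stored layers (strategy syntax per the rwkv/ChatRWKV docs; treat the exact mapping from the GUI setting to this string as an assumption):

```python
from rwkv.model import RWKV  # pip install rwkv

# "cuda fp16i8 *11+": keep the first 11 layers int8-quantized in VRAM and
# stream the remaining layers in on demand, trading speed for memory. This
# mirrors the "store N layers, stream M layers" lines in the startup log.
model = RWKV(
    # model path as in the log, minus the .pth suffix (per the rwkv package README)
    model="models/RWKV-4-Raven-7B-v11-Eng49%-Chn49%-Jpn1%-Other1%-20230430-ctx8192",
    strategy="cuda fp16i8 *11+",
)
```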

@josStorer (Owner)

It shouldn't be. Someone in my group also uses a 2060 with exactly the configuration above and it runs fine. Quite strange.

@alphaDeng (Author)

By the way, are the 2060 users in the group on CUDA 11.4 or newer? Mine seems to be 11.1, and nvidia-smi currently shows a CUDA version cap of 11.4. I'm not sure whether I can upgrade straight to CUDA 11.7...

@josStorer (Owner)

You don't need to install CUDA separately; just updating the GPU driver is enough.
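Background on why a driver update is enough (general PyTorch packaging behavior, not something stated in this thread): the +cu117 wheel bundles its own CUDA runtime, so no system toolkit is required, and the "CUDA Version" printed by nvidia-smi is just the newest runtime the installed driver can support. A quick check:

```python
import torch

# Runtime shipped inside the wheel; independent of any system CUDA toolkit.
print("bundled CUDA runtime:", torch.version.cuda)

# False here usually means the driver is older than the bundled runtime needs,
# which a driver update alone fixes, as suggested above.
print("usable:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
```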

@alphaDeng (Author)

OK. I installed CUDA 11.7 with the CUDA toolkit installer, and that package updated the driver along the way. After installing, everything runs normally, but the warning is still there.

@josStorer (Owner)

That warning doesn't matter; as long as it works normally, it's fine.
