
[Bug]: Serving deployment fails, ModuleNotFoundError: No module named 'predict' #10265


Open · 1 task done
yiyayieryo opened this issue Mar 26, 2025 · 5 comments
Labels: bug (Something isn't working)

@yiyayieryo

Software Environment

Following the "LLM serving deployment quick start" tutorial, using the V100-compatible image ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddlenlp:llm-serving-cuda118-cudnn8-v2.1

Tutorial link: https://paddlenlp.readthedocs.io/zh/latest/llm/server/docs/general_model_inference.html

- paddlepaddle:3.0.0.dev20250310
- paddlepaddle-gpu: 3.0.0.dev20250310
- paddlenlp: 3.0.0b4.post20250306

root@default-paddlenlp-inference-service-0:/opt/source/PaddleNLP/llm# pip list | grep paddle
paddle2onnx              2.0.0
paddlefsl                1.1.0
paddlenlp                3.0.0b4.post20250306
paddlenlp-ops            0.0.0
paddlepaddle-gpu         3.0.0.dev20250310

Duplicate Check

  • I have searched the existing issues

Error Description

1. Serving deployment fails
Running the serving deployment script from https://github.com/PaddlePaddle/PaddleNLP/blob/develop/llm/README.md raises the following error:

root@default-paddlenlp-inference-service-0:/opt/source/PaddleNLP/llm# python3  ./predict/flask_server.py     --model_name_or_path Qwen/Qwen2.5-0.5B-Instruct     --port 8010     --flask_port 8011     --dtype "float16"
Traceback (most recent call last):
  File "/opt/source/PaddleNLP/llm/./predict/flask_server.py", line 27, in <module>
    from predict.predictor import (
ModuleNotFoundError: No module named 'predict'

2. Installing predict fails
I then tried pip install predict, which also fails; repeated attempts to debug it were unsuccessful.
The error output is very long; an excerpt:
Building wheels for collected packages: PyWavelets, scikit-image
  Building wheel for PyWavelets (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [462 lines of output]
      /tmp/pip-install-d9pw2pk4/pywavelets_9a235184ba354152aac4da1d1296e917/setup.py:69: DeprecationWarning: the imp module is deprecated in favour of importlib and slated for removal in Python 3.12; see the module's documentation for alternative uses
        import imp
      running bdist_wheel
      running build


      pywt/_extensions/_pywt.c: In function ‘__Pyx__ExceptionSwap’:
      pywt/_extensions/_pywt.c:36973:24: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_type’; did you mean ‘curexc_type’?
      36973 |     tmp_type = tstate->exc_type;
            |                        ^~~~~~~~
            |                        curexc_type
      pywt/_extensions/_pywt.c:36974:25: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_value’; did you mean ‘curexc_value’?
      36974 |     tmp_value = tstate->exc_value;
            |                         ^~~~~~~~~
            |                         curexc_value
      pywt/_extensions/_pywt.c:36975:22: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
      36975 |     tmp_tb = tstate->exc_traceback;
            |                      ^~~~~~~~~~~~~
            |                      curexc_traceback
      pywt/_extensions/_pywt.c:36976:13: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_type’; did you mean ‘curexc_type’?
      36976 |     tstate->exc_type = *type;
            |             ^~~~~~~~
            |             curexc_type
      pywt/_extensions/_pywt.c:36977:13: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_value’; did you mean ‘curexc_value’?
      36977 |     tstate->exc_value = *value;
            |             ^~~~~~~~~
            |             curexc_value
      pywt/_extensions/_pywt.c:36978:13: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
      36978 |     tstate->exc_traceback = *tb;
            |             ^~~~~~~~~~~~~
            |             curexc_traceback
      error: command '/usr/bin/x86_64-linux-gnu-gcc' failed with exit code 1
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for PyWavelets
  Running setup.py clean for PyWavelets
  Building wheel for scikit-image (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [1014 lines of output]
      /tmp/pip-install-d9pw2pk4/scikit-image_c1c49b74876e489b94e4d7418a3de355/setup.py:167: DeprecationWarning:

        `numpy.distutils` is deprecated since NumPy 1.23.0, as a result
        of the deprecation of `distutils` itself. It will be removed for
        Python >= 3.12. For older Python versions it will remain present.
        It is recommended to use `setuptools < 60.0` for those Python versions.
        For more details, see:
          https://numpy.org/devdocs/reference/distutils_status_migration.html



      In file included from /usr/include/python3.10/unicodeobject.h:1046,
                       from /usr/include/python3.10/Python.h:83,
                       from skimage/_shared/transform.c:17:
      /usr/include/python3.10/cpython/unicodeobject.h:551:42: note: declared here
        551 | Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject*) PyUnicode_FromUnicode(
            |                                          ^~~~~~~~~~~~~~~~~~~~~
      error: Command "x86_64-linux-gnu-gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/local/lib/python3.10/dist-packages/numpy/_core/include -I/usr/local/lib/python3.10/dist-packages/numpy/_core/include -Ibuild/src.linux-x86_64-3.10/numpy/distutils/include -I/usr/include/python3.10 -c skimage/_shared/transform.c -o build/temp.linux-x86_64-3.10/skimage/_shared/transform.o -MMD -MF build/temp.linux-x86_64-3.10/skimage/_shared/transform.o.d -fopenmp -msse -msse2 -msse3" failed with exit status 1
      INFO:
      ########### EXT COMPILER OPTIMIZATION ###########
      INFO: Platform      :
        Architecture: x64
        Compiler    : gcc

      CPU baseline  :
        Requested   : 'min'
        Enabled     : SSE SSE2 SSE3
        Flags       : -msse -msse2 -msse3
        Extra checks: none

      CPU dispatch  :
        Requested   : 'max -xop -fma4'
        Enabled     : SSSE3 SSE41 POPCNT SSE42 AVX F16C FMA3 AVX2 AVX512F AVX512CD AVX512_KNL AVX512_KNM AVX512_SKX AVX512_CLX AVX512_CNL AVX512_ICL
        Generated   : none
      INFO: CCompilerOpt.cache_flush[864] : write cache to path -> /tmp/pip-install-d9pw2pk4/scikit-image_c1c49b74876e489b94e4d7418a3de355/build/temp.linux-x86_64-3.10/ccompiler_opt_cache_ext.py
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for scikit-image
  Running setup.py clean for scikit-image
Failed to build PyWavelets scikit-image
ERROR: Failed to build installable wheels for some pyproject.toml based projects (PyWavelets, scikit-image)

Steps to Reproduce & Code

python3 ./predict/flask_server.py --model_name_or_path Qwen/Qwen2.5-0.5B-Instruct --port 8010 --flask_port 8011 --dtype "float16"
Traceback (most recent call last):
  File "/opt/source/PaddleNLP/llm/./predict/flask_server.py", line 27, in <module>
    from predict.predictor import (
ModuleNotFoundError: No module named 'predict'

yiyayieryo added the bug (Something isn't working) label Mar 26, 2025
@DrownFish19 (Collaborator)

DrownFish19 commented Mar 26, 2025

For now, you can run export PYTHONPATH=.:$PYTHONPATH in the image's /opt/source/PaddleNLP/llm directory to add the current directory to PYTHONPATH, then rerun the deployment command.

The root cause is most likely that the image sets PYTHONPATH explicitly during the build, so the current directory is missing from it. Relevant code:

ENV PYTHONPATH="/opt/source/PaddleNLP/llm/server/server:/opt/source/PaddleNLP"
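The fix works because Python resolves `from predict.predictor import …` against `sys.path`, which is seeded from `PYTHONPATH`. A self-contained sketch that reproduces both the failure and the fix (it builds a throwaway package named `predict`, not the real PaddleNLP one):

```python
import os
import subprocess
import sys
import tempfile

# Build a throwaway project mirroring the PaddleNLP/llm layout:
# a `predict/` package containing `predictor.py`.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "predict")
os.makedirs(pkg)
open(os.path.join(pkg, "__init__.py"), "w").close()
with open(os.path.join(pkg, "predictor.py"), "w") as f:
    f.write("ANSWER = 42\n")

code = "from predict.predictor import ANSWER; print(ANSWER)"

# Without the project root on PYTHONPATH, the import fails, as in the issue.
env = dict(os.environ)
env.pop("PYTHONPATH", None)
r1 = subprocess.run([sys.executable, "-c", code], env=env,
                    capture_output=True, text=True, cwd="/")

# With the project root prepended (the `export PYTHONPATH=.` fix), it succeeds.
env["PYTHONPATH"] = root
r2 = subprocess.run([sys.executable, "-c", code], env=env,
                    capture_output=True, text=True, cwd="/")

print(r1.returncode, r2.stdout.strip())  # -> 1 42
```

The same mechanism explains why running the script from inside /opt/source/PaddleNLP/llm is not enough on its own: the image already exports a PYTHONPATH that omits the current directory.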

@yiyayieryo (Author)

Thanks for the guidance!
With that workaround, the ModuleNotFoundError: No module named 'predict' error no longer occurs,
but startup then fails with a different error: ModuleNotFoundError: No module named 'gradio'.
Installing it fixed that: pip install gradio.

By the way, is gradio required for serving deployment?
If I only want to call the service via URL, without the web UI, how can I start it?
I tried adjusting the startup flags --flask_port and --port several times, but every attempt failed.

Without --flask_port:

root@default-paddlenlp-inference-service-0:/opt/source/PaddleNLP/llm# python3 ./predict/flask_server.py --model_name_or_path Qwen/Qwen2.5-0.5B-Instruct --port 8010 --dtype "float16"
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0327 06:06:48.670446   274 header_generator.cc:52] Unable to open file : /paddle/paddle/cinn/runtime/cuda/float16.h
/usr/local/lib/python3.10/dist-packages/paddle/utils/cpp_extension/extension_utils.py:711: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)
/usr/local/lib/python3.10/dist-packages/_distutils_hack/__init__.py:31: UserWarning: Setuptools is replacing distutils. Support for replacing an already imported distutils is deprecated. In the future, this condition will fail. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
  warnings.warn(
[2025-03-27 06:06:52,221] [    INFO] - The `unk_token` parameter needs to be defined: we use `eos_token` by default.
[2025-03-27 06:06:52,499] [    INFO] - Loading configuration file /root/.paddlenlp/models/Qwen/Qwen2.5-0.5B-Instruct/config.json
[2025-03-27 06:06:52,500] [    INFO] - We are using <class 'paddlenlp.transformers.qwen2.modeling.Qwen2ForCausalLM'> to load 'Qwen/Qwen2.5-0.5B-Instruct'.
[2025-03-27 06:06:52,500] [    INFO] - Loading configuration file /root/.paddlenlp/models/Qwen/Qwen2.5-0.5B-Instruct/config.json
[2025-03-27 06:06:52,505] [    INFO] - Loading weights file from cache at /root/.paddlenlp/models/Qwen/Qwen2.5-0.5B-Instruct/model.safetensors
[2025-03-27 06:06:53,069] [    INFO] - Loaded weights file from disk, setting weights to model.
W0327 06:06:53.075387   274 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 12.2, Runtime API Version: 11.8
W0327 06:06:53.077374   274 gpu_resources.cc:164] device: 0, cuDNN Version: 8.9.
[2025-03-27 06:06:57,675] [    INFO] - All model checkpoint weights were used when initializing Qwen2ForCausalLM.

[2025-03-27 06:06:57,675] [ WARNING] - Some weights of Qwen2ForCausalLM were not initialized from the model checkpoint at Qwen/Qwen2.5-0.5B-Instruct and are newly initialized: ['lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[2025-03-27 06:06:57,678] [    INFO] - Loading configuration file /root/.paddlenlp/models/Qwen/Qwen2.5-0.5B-Instruct/generation_config.json
[2025-03-27 06:06:57,703] [    INFO] - Loading configuration file /root/.paddlenlp/models/Qwen/Qwen2.5-0.5B-Instruct/generation_config.json
Traceback (most recent call last):
  File "/opt/source/PaddleNLP/llm/./predict/flask_server.py", line 312, in <module>
    server = PredictorServer(
  File "/opt/source/PaddleNLP/llm/./predict/flask_server.py", line 73, in __init__
    self.args.flask_port + port_interval * predictor.tensor_parallel_rank,
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'
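This `TypeError` arises because argparse leaves `flask_port` as `None` when the flag is omitted, and `None + int` is unsupported. A minimal sketch of the failure mode plus a fail-fast guard (`resolve_flask_port` and its `port_interval` default are hypothetical helpers, not the real flask_server.py code):

```python
import argparse

def resolve_flask_port(flask_port, tensor_parallel_rank, port_interval=100):
    """Mimics the failing expression
    `self.args.flask_port + port_interval * predictor.tensor_parallel_rank`,
    but fails fast with a clear message when the flag was omitted."""
    if flask_port is None:
        raise SystemExit("--flask_port is required to start the Flask service")
    return flask_port + port_interval * tensor_parallel_rank

parser = argparse.ArgumentParser()
parser.add_argument("--flask_port", type=int, default=None)
args = parser.parse_args([])   # simulate omitting --flask_port
print(args.flask_port)         # None -> the addition in flask_server.py fails

print(resolve_flask_port(8011, tensor_parallel_rank=0))  # -> 8011
```

In short: as currently written, the script requires both flags, and omitting `--flask_port` crashes before the server starts.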

Without --port:

root@default-paddlenlp-inference-service-0:/opt/source/PaddleNLP/llm# python3 ./predict/flask_server.py --model_name_or_path Qwen/Qwen2.5-0.5B-Instruct --flask_port 8011 --dtype "float16"
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0327 06:07:54.037953   331 header_generator.cc:52] Unable to open file : /paddle/paddle/cinn/runtime/cuda/float16.h
/usr/local/lib/python3.10/dist-packages/paddle/utils/cpp_extension/extension_utils.py:711: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)
/usr/local/lib/python3.10/dist-packages/_distutils_hack/__init__.py:31: UserWarning: Setuptools is replacing distutils. Support for replacing an already imported distutils is deprecated. In the future, this condition will fail. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
  warnings.warn(
[2025-03-27 06:07:57,378] [    INFO] - The `unk_token` parameter needs to be defined: we use `eos_token` by default.
[2025-03-27 06:07:57,680] [    INFO] - Loading configuration file /root/.paddlenlp/models/Qwen/Qwen2.5-0.5B-Instruct/config.json
[2025-03-27 06:07:57,681] [    INFO] - We are using <class 'paddlenlp.transformers.qwen2.modeling.Qwen2ForCausalLM'> to load 'Qwen/Qwen2.5-0.5B-Instruct'.
[2025-03-27 06:07:57,682] [    INFO] - Loading configuration file /root/.paddlenlp/models/Qwen/Qwen2.5-0.5B-Instruct/config.json
[2025-03-27 06:07:57,687] [    INFO] - Loading weights file from cache at /root/.paddlenlp/models/Qwen/Qwen2.5-0.5B-Instruct/model.safetensors
[2025-03-27 06:07:58,266] [    INFO] - Loaded weights file from disk, setting weights to model.
W0327 06:07:58.270812   331 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 12.2, Runtime API Version: 11.8
W0327 06:07:58.271869   331 gpu_resources.cc:164] device: 0, cuDNN Version: 8.9.
[2025-03-27 06:08:02,885] [    INFO] - All model checkpoint weights were used when initializing Qwen2ForCausalLM.

[2025-03-27 06:08:02,886] [ WARNING] - Some weights of Qwen2ForCausalLM were not initialized from the model checkpoint at Qwen/Qwen2.5-0.5B-Instruct and are newly initialized: ['lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[2025-03-27 06:08:02,890] [    INFO] - Loading configuration file /root/.paddlenlp/models/Qwen/Qwen2.5-0.5B-Instruct/generation_config.json
[2025-03-27 06:08:02,908] [    INFO] - Loading configuration file /root/.paddlenlp/models/Qwen/Qwen2.5-0.5B-Instruct/generation_config.json
 * Serving Flask app 'flask_server'
 * Debug mode: off
[2025-03-27 06:08:04,886] [    INFO] _internal.py:97 - WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:8011
 * Running on http://100.108.188.72:8011
[2025-03-27 06:08:04,886] [    INFO] _internal.py:97 - Press CTRL+C to quit
/opt/source/PaddleNLP/llm/predict/gradio_ui.py:271: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style dictionaries with 'role' and 'content' keys.
  context_chatbot = gr.Chatbot(
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/source/PaddleNLP/llm/predict/gradio_ui.py", line 342, in main
    launch(args, default_params)
  File "/opt/source/PaddleNLP/llm/predict/gradio_ui.py", line 338, in launch
    block.queue().launch(server_name="0.0.0.0", server_port=args.port, debug=True)
  File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 2638, in launch
    ) = http_server.start_server(
  File "/usr/local/lib/python3.10/dist-packages/gradio/http_server.py", line 156, in start_server
    raise OSError(
OSError: Cannot find empty port in range: 8011-8011. You can specify a different port by setting the GRADIO_SERVER_PORT environment variable or passing the `server_port` parameter to `launch()`.
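This `OSError` means the gradio UI tried to bind a port the Flask service had already taken: with `--port` omitted, both ends apparently fall back to the same port (8011 here), so they collide. A small stdlib check for whether a port is actually free (a sketch; the real scripts do not include this):

```python
import socket

def port_is_free(port, host="127.0.0.1"):
    """Try to bind the port; if bind fails, something already holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
            return True
        except OSError:
            return False

# Occupy a port the way the running Flask server occupies its port...
holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
holder.bind(("127.0.0.1", 0))          # OS picks a free port
taken_port = holder.getsockname()[1]

print(port_is_free(taken_port))        # False: gradio would fail here
holder.close()
print(port_is_free(taken_port))        # True once released
```

Passing distinct values for `--port` and `--flask_port` (as in the tutorial command) avoids the clash.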

Quoting the workaround above: run export PYTHONPATH=.:$PYTHONPATH in the image's /opt/source/PaddleNLP/llm directory to add the current directory to PYTHONPATH, then rerun the deployment command. The image sets PYTHONPATH explicitly during the build, so the current directory is missing from it; see:

PaddleNLP/llm/server/dockerfiles/Dockerfile_serving_cuda124_cudnn9, line 30 (commit 712495c):

ENV PYTHONPATH="/opt/source/PaddleNLP/llm/server/server:/opt/source/PaddleNLP"

@DrownFish19 (Collaborator)

> By the way, is gradio required for serving deployment?
> If I only want to call the service via URL, without the web UI, how can I start it?

The Flask service and the gradio UI currently live in the same file; you need to edit the file yourself and remove the gradio-related code, after which the service can still be used.
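To illustrate what a UI-free, URL-only service can look like once the gradio code is stripped, here is a stdlib-only sketch of a JSON-over-HTTP endpoint. The payload key `context` and the echo response are hypothetical; the real flask_server.py routes and predictor API differ:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class ChatHandler(BaseHTTPRequestHandler):
    """Accepts a JSON POST and returns a JSON reply (echo stands in
    for the real predictor call)."""
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        data = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps({"result": "echo: " + data.get("context", "")}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), ChatHandler)  # port 0: OS picks a free one
threading.Thread(target=server.serve_forever, daemon=True).start()

req = urllib.request.Request(
    f"http://127.0.0.1:{server.server_port}",
    data=json.dumps({"context": "hi"}).encode(),
    headers={"Content-Type": "application/json"})
print(json.loads(urllib.request.urlopen(req).read())["result"])  # -> echo: hi
server.shutdown()
```

The point is only the shape: once the UI process is gone, a plain HTTP client (curl, requests, urllib) is all that is needed to call the service.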

@Chinesejunzai

export PYTHONPATH=.:$PYTHONPATH

Hi, this solved the same problem for me as well, but then a new problem appeared. The error is as follows:

@Chinesejunzai

You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[2025-04-16 14:33:25,638] [ INFO] - Loading configuration file /home/algo/.paddlenlp/models/Qwen/Qwen2.5-0.5B-Instruct/generation_config.json
[2025-04-16 14:33:25,654] [ INFO] - Loading configuration file /home/algo/.paddlenlp/models/Qwen/Qwen2.5-0.5B-Instruct/generation_config.json
Traceback (most recent call last):
  File "/mnt/hdd/houxiaojun/workspace/project/PaddleNLP_dev_20250407/llm/./predict/flask_server.py", line 317, in <module>
    server.start_ui_service(server_args, asdict(predictor.config))
  File "/mnt/hdd/houxiaojun/workspace/project/PaddleNLP_dev_20250407/llm/./predict/flask_server.py", line 287, in start_ui_service
    from gradio_ui import main
  File "/mnt/hdd/houxiaojun/workspace/project/PaddleNLP_dev_20250407/llm/predict/gradio_ui.py", line 23, in <module>
    import gradio as gr
  File "/home/algo/miniconda3/envs/paddle_3/lib/python3.10/site-packages/gradio/__init__.py", line 3, in <module>
    import gradio.components as components
  File "/home/algo/miniconda3/envs/paddle_3/lib/python3.10/site-packages/gradio/components.py", line 31, in <module>
    from gradio import media_data, processing_utils, utils
  File "/home/algo/miniconda3/envs/paddle_3/lib/python3.10/site-packages/gradio/processing_utils.py", line 20, in <module>
    from gradio import encryptor, utils
  File "/home/algo/miniconda3/envs/paddle_3/lib/python3.10/site-packages/gradio/utils.py", line 399, in <module>
    class AsyncRequest:
  File "/home/algo/miniconda3/envs/paddle_3/lib/python3.10/site-packages/gradio/utils.py", line 419, in AsyncRequest
    client = httpx.AsyncClient()
  File "/home/algo/miniconda3/envs/paddle_3/lib/python3.10/site-packages/httpx/_client.py", line 1397, in __init__
    self._transport = self._init_transport(
  File "/home/algo/miniconda3/envs/paddle_3/lib/python3.10/site-packages/httpx/_client.py", line 1445, in _init_transport
    return AsyncHTTPTransport(
  File "/home/algo/miniconda3/envs/paddle_3/lib/python3.10/site-packages/httpx/_transports/default.py", line 275, in __init__
    self._pool = httpcore.AsyncConnectionPool(
TypeError: AsyncConnectionPool.__init__() got an unexpected keyword argument 'socket_options'
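This last traceback looks like a version mismatch rather than a PaddleNLP bug: httpx passes a `socket_options` keyword that the installed httpcore is too old to accept, so aligning the gradio/httpx/httpcore versions (e.g. reinstalling gradio so pip resolves a consistent set) is the usual direction for a fix. As an aside, you can probe whether a callable accepts a keyword before triggering such a `TypeError` (a sketch; `old_pool`/`new_pool` are stand-ins, not the real httpcore API):

```python
import inspect

def accepts_kwarg(func, name):
    """True if `func` can be called with keyword argument `name`."""
    try:
        params = inspect.signature(func).parameters
    except (TypeError, ValueError):
        return False
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return True  # **kwargs swallows any keyword
    return name in params

def old_pool(connections=10):                        # older-style API
    pass

def new_pool(connections=10, socket_options=None):   # newer-style API
    pass

print(accepts_kwarg(old_pool, "socket_options"))   # False -> the TypeError path
print(accepts_kwarg(new_pool, "socket_options"))   # True
```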
