taking long time to give response (around 2 min) #1896

@mbbutt

Hello,

I am running on the following machine:

CPU: 12th Gen Intel(R) Core(TM) i7-12700
RAM: 32GB, speed: 4400MT/s
GPU: NVIDIA RTX A2000 12GB

The model is:
llama-2-7b-chat.Q6_K.gguf

It takes around 2 minutes to start giving a response.
Is that reasonable, or should it be faster?

The .bat command I use to start the bot:

"C:\Users\Public\pyenv-win\pyenv-win\bin\.h2o\Scripts\python.exe"^
 "generate.py"^
 --share=False ^
 --auth=[('jon','password')] ^
 --auth_access=closed ^
 --gradio_offline_level=1 ^
 --base_model="llama" ^
 --prompt_type=llama2 ^
 --model_path_llama=C:\Users\Public\git\h2ogpt\llama-2-7b-chat.Q6_K.gguf^
 --score_model=None ^
 --langchain_mode="LLLM" ^
 --user_path=user_path ^
 --load_4bit=True ^
 --llamacpp_dict="{'n_gpu_layers':5}"

While idle:
GPU memory: 7GB in use (unchanged while a query runs)
RAM: 24.4GB in use (unchanged while a query runs)
CPU utilization: 2% to 3%

While a query runs, CPU utilization rises to nearly 100%,
but GPU utilization stays at 1% to 2%.

It takes around 2 minutes to start giving a response.

It seems the GPU is not being used at all.
Could you please tell me what I am doing wrong here?
I would like to get faster responses.
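
As a rough sanity check on the command above, here is a minimal sketch of how much of the model `n_gpu_layers: 5` would place on the GPU. It assumes Llama-2-7B has 32 transformer layers and that llama-cpp-python treats `-1` as "offload all layers"; the helper name `offload_fraction` is hypothetical, and parsing the option string with `ast.literal_eval` is only for illustration, not necessarily how generate.py reads `--llamacpp_dict`.

```python
import ast

# Assumption: Llama-2-7B has 32 transformer layers. llama.cpp may count
# extra non-repeating layers, so treat the result as approximate.
LLAMA2_7B_LAYERS = 32


def offload_fraction(llamacpp_dict: str, total_layers: int = LLAMA2_7B_LAYERS) -> float:
    """Return the approximate share of model layers offloaded to the GPU."""
    opts = ast.literal_eval(llamacpp_dict)  # e.g. "{'n_gpu_layers':5}"
    n_gpu = opts.get("n_gpu_layers", 0)
    if n_gpu < 0:
        # Assumption: -1 means "offload every layer" in llama-cpp-python.
        return 1.0
    return min(n_gpu, total_layers) / total_layers


if __name__ == "__main__":
    share = offload_fraction("{'n_gpu_layers':5}")
    print(f"about {share:.0%} of layers on the GPU")
```

Under these assumptions, most layers would still run on the CPU with this setting, which would be consistent with the high CPU load seen during a query. It does not explain the 7GB of GPU memory already in use while idle.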

The CUDA version is:

C:\Windows\System32>nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:41:10_Pacific_Daylight_Time_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

Below is my pip list:

Package                                  Version
---------------------------------------- ---------------
absl-py                                  2.1.0
accelerate                               0.32.1
aiofiles                                 23.2.1
aiohappyeyeballs                         2.4.3
aiohttp                                  3.10.9
aiosignal                                1.3.1
altair                                   5.4.1
annotated-types                          0.7.0
anthropic                                0.8.1
antlr4-python3-runtime                   4.9.3
anyio                                    4.6.0
appdirs                                  1.4.4
APScheduler                              3.10.4
argcomplete                              3.5.1
arxiv                                    1.4.8
asgiref                                  3.8.1
async-timeout                            4.0.3
attributedict                            0.3.0
attrs                                    24.2.0
audioread                                3.0.1
Authlib                                  1.3.1
auto-gptq                                0.6.0
autoawq                                  0.1.8+cu118
autoawq_kernels                          0.0.3+cu118
babel                                    2.16.0
backoff                                  2.2.1
backports.tarfile                        1.2.0
bcrypt                                   4.2.0
beautifulsoup4                           4.12.3
bioc                                     2.1
bitsandbytes                             0.41.1
blessings                                1.7
boto3                                    1.35.35
botocore                                 1.35.35
Brotli                                   1.1.0
bs4                                      0.0.2
build                                    1.2.2.post1
cachetools                               5.5.0
certifi                                  2024.8.30
cffi                                     1.17.1
chardet                                  5.2.0
charset-normalizer                       3.3.2
chroma-bullet                            2.2.0
chroma-hnswlib                           0.7.3
chroma-migrate                           0.0.7
chromadb                                 0.4.23
chromamigdb                              0.3.26
click                                    8.1.7
clickhouse-connect                       0.6.6
codecov                                  2.1.13
colorama                                 0.4.6
coloredlogs                              15.0.1
colour-runner                            0.1.1
contourpy                                1.3.0
coverage                                 7.6.1
cryptography                             43.0.1
cssselect2                               0.7.0
cutlet                                   0.3.0
cycler                                   0.12.1
dacite                                   1.7.0
dataclasses-json                         0.6.7
DataProperty                             1.0.1
datasets                                 2.16.1
dateparser                               1.1.8
decorator                                5.1.1
deepdiff                                 8.0.1
defusedxml                               0.7.1
Deprecated                               1.2.14
diffusers                                0.24.0
dill                                     0.3.7
diskcache                                5.6.3
distlib                                  0.3.8
distro                                   1.9.0
dnspython                                2.7.0
docopt                                   0.6.2
docutils                                 0.20.1
duckdb                                   0.7.1
duckduckgo_search                        6.3.0
durationpy                               0.9
effdet                                   0.4.1
einops                                   0.8.0
emoji                                    2.14.0
et-xmlfile                               1.1.0
eval_type_backport                       0.2.0
evaluate                                 0.4.0
exceptiongroup                           1.2.2
execnet                                  2.1.1
exllama                                  0.0.18+cu118
fastapi                                  0.115.0
feedparser                               6.0.11
ffmpeg                                   1.4
ffmpy                                    0.4.0
fiftyone                                 1.0.0
fiftyone-brain                           0.17.0
fiftyone_db                              1.1.6
filelock                                 3.16.1
filetype                                 1.2.0
fire                                     0.5.0
flatbuffers                              24.3.25
fonttools                                4.54.1
frozenlist                               1.4.1
fsspec                                   2023.10.0
ftfy                                     6.2.3
fugashi                                  1.3.2
future                                   1.0.0
g2pkk                                    0.1.2
gekko                                    1.2.1
glob2                                    0.7
google-ai-generativelanguage             0.4.0
google-api-core                          2.20.0
google-auth                              2.35.0
google-generativeai                      0.3.2
google_search_results                    2.4.2
googleapis-common-protos                 1.65.0
gpt4all                                  1.0.5
gradio                                   3.50.2
gradio_client                            0.6.1
gradio_pdf                               0.0.15
gradio_tools                             0.0.9
graphql-core                             3.2.4
greenlet                                 3.0.3
grpcio                                   1.66.2
grpcio-health-checking                   1.62.3
grpcio-status                            1.62.3
grpcio-tools                             1.62.3
gruut                                    2.2.3
gruut-ipa                                0.13.0
gruut-lang-de                            2.0.1
gruut-lang-en                            2.0.1
gruut-lang-es                            2.0.1
gruut_lang_fr                            2.0.2
h11                                      0.14.0
h2                                       4.1.0
h5py                                     3.12.1
hf_transfer                              0.1.8
hnswlib                                  0.8.0
hnswmiglib                               0.7.0
hpack                                    4.0.0
html2text                                2024.2.26
html5lib                                 1.1
httpcore                                 1.0.6
httptools                                0.6.1
httpx                                    0.27.0
huggingface-hub                          0.25.1
humanfriendly                            10.0
humanize                                 4.11.0
Hypercorn                                0.17.3
hyperframe                               6.0.1
idna                                     3.10
imageio                                  2.35.1
importlib_metadata                       8.4.0
importlib_resources                      6.4.5
imutils                                  0.5.4
inflate64                                1.0.0
iniconfig                                2.0.0
inspecta                                 0.1.3
InstructorEmbedding                      1.0.1
intervaltree                             3.1.0
iopath                                   0.1.10
jaconv                                   0.4.0
jamo                                     0.4.1
jaraco.context                           6.0.1
jieba                                    0.42.1
Jinja2                                   3.1.4
jiter                                    0.6.1
jmespath                                 1.0.1
joblib                                   1.4.2
jsonlines                                1.2.0
jsonpatch                                1.33
jsonpath-python                          1.0.6
jsonpointer                              3.0.0
jsonschema                               4.23.0
jsonschema-specifications                2024.10.1
kaleido                                  0.2.1
kiwisolver                               1.4.7
kubernetes                               31.0.0
langchain                                0.0.354
langchain-community                      0.0.8
langchain-core                           0.1.6
langchain-experimental                   0.0.47
langchain-google-genai                   0.0.6
langchain-mistralai                      0.0.2
langdetect                               1.0.9
langid                                   1.1.6
langsmith                                0.0.77
layoutparser                             0.3.4
lazy_loader                              0.4
librosa                                  0.10.1
llama_cpp_python                         0.2.26+cpuavx2
llama_cpp_python_cuda                    0.2.26+cu121avx
llvmlite                                 0.43.0
lm-dataformat                            0.0.20
lm_eval                                  0.4.4
loralib                                  0.1.2
lxml                                     5.3.0
lz4                                      4.3.3
Markdown                                 3.7
markdown-it-py                           3.0.0
MarkupSafe                               2.1.5
marshmallow                              3.22.0
matplotlib                               3.9.2
mbstrdecoder                             1.1.3
mdurl                                    0.1.2
mistralai                                0.0.8
mmh3                                     5.0.1
mojimoji                                 0.0.13
mongoengine                              0.24.2
monotonic                                1.6
more-itertools                           10.5.0
motor                                    3.5.3
mplcursors                               0.5.3
mpmath                                   1.3.0
msg-parser                               1.2.0
msgpack                                  1.1.0
multidict                                6.1.0
multiprocess                             0.70.15
multivolumefile                          0.2.3
mutagen                                  1.47.0
mypy-extensions                          1.0.0
narwhals                                 1.9.1
nest-asyncio                             1.6.0
networkx                                 2.8.8
nltk                                     3.9.1
num2words                                0.5.13
numba                                    0.60.0
numexpr                                  2.10.1
numpy                                    1.23.4
oauthlib                                 3.2.2
olefile                                  0.47
omegaconf                                2.3.0
onnx                                     1.17.0
onnxruntime                              1.15.1
onnxruntime-gpu                          1.15.0
openai                                   1.51.2
opencv-python                            4.10.0.84
opencv-python-headless                   4.10.0.84
openpyxl                                 3.1.5
opentelemetry-api                        1.27.0
opentelemetry-exporter-otlp-proto-common 1.27.0
opentelemetry-exporter-otlp-proto-grpc   1.27.0
opentelemetry-instrumentation            0.48b0
opentelemetry-instrumentation-asgi       0.48b0
opentelemetry-instrumentation-fastapi    0.48b0
opentelemetry-proto                      1.27.0
opentelemetry-sdk                        1.27.0
opentelemetry-semantic-conventions       0.48b0
opentelemetry-util-http                  0.48b0
openvino                                 2022.3.0
optimum                                  1.16.1
orderly-set                              5.2.2
orjson                                   3.10.7
outcome                                  1.3.0.post0
overrides                                7.7.0
packaging                                24.1
pandas                                   2.0.2
pathvalidate                             3.2.1
pdf2image                                1.17.0
pdfminer.six                             20221105
pdfplumber                               0.10.4
peft                                     0.13.1
pikepdf                                  9.3.0
pillow                                   10.4.0
pillow_heif                              0.18.0
pip                                      23.0.1
pip-licenses                             5.0.0
platformdirs                             4.3.6
playwright                               1.47.0
plotly                                   5.24.1
pluggy                                   1.5.0
pooch                                    1.8.2
portalocker                              2.10.1
posthog                                  3.7.0
pprintpp                                 0.4.0
prettytable                              3.11.0
primp                                    0.6.3
priority                                 2.0.0
propcache                                0.2.0
proto-plus                               1.24.0
protobuf                                 4.25.5
psutil                                   6.0.0
pulsar-client                            3.5.0
py7zr                                    0.22.0
pyarrow                                  17.0.0
pyarrow-hotfix                           0.6
pyasn1                                   0.6.1
pyasn1_modules                           0.4.1
pybcj                                    1.0.2
pybind11                                 2.13.6
pyclipper                                1.3.0.post5
pycocotools                              2.0.8
pycparser                                2.22
pycryptodomex                            3.21.0
pydantic                                 2.9.2
pydantic_core                            2.23.4
pydantic-settings                        2.1.0
pydash                                   8.0.3
pydub                                    0.25.1
pydyf                                    0.11.0
pyee                                     12.0.0
Pygments                                 2.18.0
pymongo                                  4.8.0
PyMuPDF                                  1.24.11
pynvml                                   11.5.3
pypandoc                                 1.14
pypandoc_binary                          1.14
pyparsing                                3.1.4
pypdf                                    5.0.1
pypdfium2                                4.30.0
pyphen                                   0.16.0
PyPika                                   0.48.9
pyppmd                                   1.1.0
pyproject-api                            1.8.0
pyproject_hooks                          1.2.0
pyreadline3                              3.5.4
PySocks                                  1.7.1
pytablewriter                            1.2.0
pytesseract                              0.3.13
pytest                                   8.3.3
pytest-xdist                             3.6.1
python-crfsuite                          0.9.11
python-dateutil                          2.8.2
python-doctr                             0.5.4a0
python-docx                              1.1.2
python-dotenv                            1.0.1
python-iso639                            2024.4.27
python-magic                             0.4.27
python-magic-bin                         0.4.14
python-multipart                         0.0.12
python-pptx                              0.6.23
pytube                                   15.0.0
pytz                                     2024.2
pywin32                                  307
PyYAML                                   6.0.2
pyzstd                                   0.16.1
RapidFuzz                                3.10.0
rarfile                                  4.2
referencing                              0.35.1
regex                                    2024.9.11
replicate                                0.20.0
requests                                 2.32.3
requests-file                            2.1.0
requests-oauthlib                        2.0.0
requests-toolbelt                        1.0.0
responses                                0.18.0
retrying                                 1.3.4
rich                                     13.9.2
rootpath                                 0.1.1
rouge                                    1.0.1
rouge_score                              0.1.2
rpds-py                                  0.20.0
rsa                                      4.9
ruff                                     0.6.9
s3transfer                               0.10.2
sacrebleu                                2.3.1
safetensors                              0.4.5
scikit-image                             0.24.0
scikit-learn                             1.2.2
scipy                                    1.13.1
selenium                                 4.25.0
semantic-version                         2.10.0
semanticscholar                          0.8.4
sentence-transformers                    2.2.2
sentencepiece                            0.1.99
setuptools                               65.5.0
sgmllib3k                                1.0.0
Shapely                                  1.8.5.post1
shellingham                              1.5.4
six                                      1.16.0
sniffio                                  1.3.1
sortedcontainers                         2.4.0
soundfile                                0.12.1
soupsieve                                2.6
soxr                                     0.5.0.post1
SQLAlchemy                               2.0.35
sqlitedict                               2.1.0
sse-starlette                            0.10.3
sseclient-py                             1.8.0
starlette                                0.38.6
strawberry-graphql                       0.246.0
sympy                                    1.13.3
tabledata                                1.3.3
tabulate                                 0.9.0
taskgroup                                0.0.0a4
tcolorpy                                 0.1.6
tenacity                                 8.5.0
termcolor                                2.5.0
text-generation                          0.7.0
textstat                                 0.7.4
texttable                                1.7.0
threadpoolctl                            3.5.0
tifffile                                 2024.9.20
tiktoken                                 0.8.0
timm                                     1.0.9
tinycss2                                 1.3.0
tokenizers                               0.19.1
toml                                     0.10.2
tomli                                    2.0.2
tomlkit                                  0.12.0
torch                                    2.1.2+cu118
torchvision                              0.16.2+cu118
tox                                      4.21.2
tqdm                                     4.66.5
tqdm-multiprocess                        0.0.11
transformers                             4.40.2
trio                                     0.26.2
trio-websocket                           0.11.1
typepy                                   1.3.2
typer                                    0.12.5
typing_extensions                        4.12.2
typing-inspect                           0.9.0
tzdata                                   2024.2
tzlocal                                  5.2
ujson                                    5.10.0
Unidecode                                1.3.8
universal-analytics-python3              1.1.1
unstructured                             0.12.5
unstructured-client                      0.26.0
unstructured-inference                   0.7.23
unstructured.pytesseract                 0.3.13
urllib3                                  2.2.3
uvicorn                                  0.31.0
validators                               0.34.0
virtualenv                               20.26.6
voxel51-eta                              0.13.0
watchfiles                               0.24.0
wavio                                    0.0.8
wcwidth                                  0.2.13
weasyprint                               62.3
weaviate-client                          4.8.1
webencodings                             0.5.1
websocket-client                         1.8.0
websockets                               11.0.3
wikipedia                                1.4.0
wolframalpha                             5.1.3
word2number                              1.1
wrapt                                    1.16.0
wsproto                                  1.2.0
xlrd                                     2.0.1
XlsxWriter                               3.2.0
xmltodict                                0.13.0
xxhash                                   3.5.0
yarl                                     1.14.0
yt-dlp                                   2023.10.13
zipp                                     3.20.2
zopfli                                   0.2.3
zstandard                                0.23.0
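
To make the CUDA-related entries in the list above easier to compare with the nvcc output, here is a small sketch that pulls the CUDA tag out of each version string. The four lines in `PIP_LIST` are copied from the list above; the helper name `cuda_tags` is hypothetical. Whether a difference between the tags matters for GPU offload is not established here.

```python
import re

# Excerpt copied from the pip list above.
PIP_LIST = """\
llama_cpp_python                         0.2.26+cpuavx2
llama_cpp_python_cuda                    0.2.26+cu121avx
torch                                    2.1.2+cu118
torchvision                              0.16.2+cu118
"""


def cuda_tags(pip_list: str) -> dict:
    """Map each package name to the CUDA tag in its local version, or None."""
    tags = {}
    for line in pip_list.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip headers, separators, and blank lines
        name, version = parts
        # Look for "cu" followed by digits after the "+" local-version marker.
        match = re.search(r"\+.*?(cu\d+)", version)
        tags[name] = match.group(1) if match else None
    return tags


if __name__ == "__main__":
    for name, tag in cuda_tags(PIP_LIST).items():
        print(f"{name}: {tag or 'no CUDA tag'}")
```

In this excerpt the llama_cpp_python_cuda build is tagged cu121, while torch and torchvision are tagged cu118 and nvcc reports release 11.8.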
