bug: Model downloaded from Hugging Face can only run on CPU because it doesn't retrieve the correct GGUF model's metadata #3708

Closed
1 of 3 tasks
milen-prg opened this issue Sep 20, 2024 · 12 comments · Fixed by #3725
Assignees
Labels
needs info Not enough info, more logs/data required type: bug Something isn't working

Comments

@milen-prg

Jan version

Jan v0.5.4

Describe the Bug

Models installed via the app hub still run on the GPU as before, but in this version user-installed models use only the CPU. In the older version, user-installed models also used the GPU.
Each new version breaks the use of user-installed models 😭🤬

Steps to Reproduce

  1. Install models manually from Hugging Face and from the Jan hub.
  2. Run one, then the other.
  3. In this version, the user-installed model does not use the GPU.

Screenshots / Logs

No response

What is your OS?

  • MacOS
  • Windows
  • Linux
@milen-prg milen-prg added the type: bug Something isn't working label Sep 20, 2024
@imtuyethan imtuyethan changed the title bug: [DESCRIPTION] User installed models run on CPU bug: User installed models run on CPU Sep 21, 2024
@imtuyethan imtuyethan self-assigned this Sep 21, 2024
@imtuyethan imtuyethan added the needs info Not enough info, more logs/data required label Sep 21, 2024
@josepgl

josepgl commented Sep 21, 2024

In my case, on Linux, when I try to use the GPU I see this in the logs:

ERROR Could not load engine: Could not load library "/home/jose/.config/Jan/data/extensions/@janhq/inference-cortex-extension/dist/bin/linux-cuda-12-0/engines/cortex.llamacpp/libengine.so"
libcudart.so.12: cannot open shared object file: No such file or directory - server.cc:299

@milen-prg Which previous version did it work on for you?
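
The `libcudart.so.12: cannot open shared object file` line in the log above means the dynamic loader could not find the CUDA 12 runtime library that `libengine.so` depends on. A minimal sketch for checking this outside of Jan, using only Python's standard `ctypes` module; the helper name `can_load_shared_library` is hypothetical and not part of Jan:

```python
import ctypes


def can_load_shared_library(name: str) -> bool:
    """Return True if the dynamic loader can find and open `name`."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the library (or one of its dependencies) cannot be opened.
        return False
    return True


if __name__ == "__main__":
    # Library name taken from the error message in the log above.
    lib = "libcudart.so.12"
    status = "found" if can_load_shared_library(lib) else "NOT found"
    print(f"{lib}: {status}")
```

If this reports that the library is not found, the CUDA 12 runtime is either not installed or not on the loader's search path, which would match the error above.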

@milen-prg
Author

@josepgl, it worked on v0.5.3 (there was another problem, the models had disappeared, but I was easily helped with that here).
On Windows, where can I see the logs?

@josepgl

josepgl commented Sep 21, 2024

@milen-prg the data path is shown in Settings > Advanced Settings > Jan Data Folder. On Linux I see a logs folder there.

@milen-prg
Author

There is a logs folder, but it is empty.

@imtuyethan
Contributor

There is a logs folder, but it is empty.

We only store your logs for 24 hours.

@imtuyethan
Contributor

imtuyethan commented Sep 23, 2024

This is a known issue. Downloading Hugging Face models from the Search Box in the model hub is pretty broken for now: it doesn't retrieve the correct GGUF model metadata. We've filed an issue on this and are working on a fix in the engine: #3558

In the meantime, please add an ngl setting to the settings section to enable GPU acceleration. It worked fine before because previous versions hardcoded an ngl setting, which is hacky and not correct for all models.
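
The workaround above can be sketched as a small script. In llama.cpp, `ngl` is the number of model layers to offload to the GPU. The sketch below assumes the model's `model.json` is a JSON object with a top-level `settings` object, which is an assumption about the file layout and not something confirmed in this thread; the helper name `ensure_ngl` and the value `33` in the usage note are hypothetical.

```python
import json
from pathlib import Path


def ensure_ngl(model_json: Path, ngl: int) -> bool:
    """Add an `ngl` entry to the `settings` object if it is missing.

    Returns True if the file was modified, False if `ngl` was already set.
    Assumes (not confirmed by the thread) that model.json has a top-level
    "settings" object; one is created if absent.
    """
    data = json.loads(model_json.read_text(encoding="utf-8"))
    settings = data.setdefault("settings", {})
    if "ngl" in settings:
        # Respect an existing value rather than overwriting it.
        return False
    settings["ngl"] = ngl
    model_json.write_text(json.dumps(data, indent=2) + "\n", encoding="utf-8")
    return True
```

A call such as `ensure_ngl(Path("path/to/model.json"), ngl=33)` would add the setting only when it is absent. The right value depends on the model's layer count and the available GPU memory, which is why a single hardcoded value is not correct for all models.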

@imtuyethan imtuyethan assigned louis-jan and unassigned imtuyethan Sep 23, 2024
@imtuyethan imtuyethan changed the title bug: User installed models run on CPU bug: Model downloaded from Hugging Face can only run on CPU because it doesn't retrieve the correct GGUF model's metadata Sep 23, 2024
@imtuyethan imtuyethan added this to the v0.5.5 milestone Sep 23, 2024
@milen-prg
Author

The "bug: Model downloaded from Hugging Face can only run on CPU"
is not solved in:
v0.5.5

@louis-jan
Contributor

louis-jan commented Oct 2, 2024

The "bug: Model downloaded from Hugging Face can only run on CPU"

is not solved in:

v0.5.5

Hi @milen-prg, the fix works only with newly downloaded models. If you have HF models that were downloaded earlier and do not have ngl settings, please redownload them.

Also, could you please share your scenario and screenshots? Thanks.
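
To see which previously downloaded models lack an ngl setting and therefore need to be redownloaded, something like the following sketch could help. It assumes each model has a `model.json` somewhere under a models directory and that `ngl` lives in a top-level `settings` object; both are assumptions about the layout, and the helper name `models_missing_ngl` is hypothetical.

```python
import json
from pathlib import Path


def models_missing_ngl(models_dir: Path) -> list[Path]:
    """Return the model.json files under `models_dir` with no `settings.ngl`."""
    missing = []
    for model_json in sorted(models_dir.rglob("model.json")):
        try:
            data = json.loads(model_json.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            # Skip unreadable or malformed files rather than failing the scan.
            continue
        if not isinstance(data, dict):
            continue
        settings = data.get("settings")
        if not isinstance(settings, dict) or "ngl" not in settings:
            missing.append(model_json)
    return missing
```

Pass the models folder inside the Jan data folder (the path shown under Settings > Advanced Settings > Jan Data Folder) as `models_dir`. The exact subfolder name is not stated in this thread.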

@louis-jan
Contributor

Also, it looks like the issue is not about the model's GGUF file. Could you please share your specs and log file?

@milen-prg
Author

The logs folder is permanently empty.
If I try to import the GGUF file, the model.json is not generated automatically (as it was in older versions), so the model is not recognized: it is not in the list and can't be selected. When I put the old model.json in place, the model appears in the list, but it works only with the CPU, not with the GPU.
I see that, over time, your app is targeting only self-suggested models, which is an enormous problem. I'll search for something more liberal. Thank you.

@louis-jan
Contributor

Hi @milen-prg, it's likely a bug, as the app targets not only self-suggested models but also Hugging Face models.

@milen-prg
Author

In v0.5.6, if I reimport the models, they work with the GPU again.

Several times, after the model replied, the GPU load stayed at 100% and did not stop when I pressed the stop button for the reply or deleted the entire conversation thread. I had to close Jan to avoid overheating the GPU.
I will investigate this new problem. It seems to appear randomly, but frequently with the imported models.

5 participants
@josepgl @milen-prg @imtuyethan @louis-jan and others