
Eval bug: Issue with mradermacher/Nous-Hermes-2-Mixtral-8x7B-DPO-GGUF - Only works with version b4927, newer versions fail with strange output #12504

Closed
boblekov opened this issue Mar 21, 2025 · 1 comment

boblekov commented Mar 21, 2025

Name and Version

./llama-cli --version
version: 4927 (568013d)
built with cc (Debian 12.2.0-14) 12.2.0 for x86_64-linux-gnu

Operating systems

Linux

GGML backends

CPU

Hardware

Intel(R) Xeon(R) CPU E5-2696 v3 @ 2.30GHz

Models

mradermacher/Nous-Hermes-2-Mixtral-8x7B-DPO-GGUF

Problem description & steps to reproduce

Hi,

I encountered an issue with the model mradermacher/Nous-Hermes-2-Mixtral-8x7B-DPO-GGUF. The model works perfectly with version b4927, but with any newer version I tried (e.g., b4928, b4930, b4935, b4936) it loads and starts, yet produces strange, garbled output.

What happens:
- With versions newer than b4927, the model starts but outputs corrupted, non-readable text (gibberish).
- With version b4927 the issue does not occur and the model runs without any problems.

Example of the garbled output (for version b4936):

[Screenshots: garbled llama-cli terminal output]
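
To confirm this is a numeric regression rather than a prompt or chat-template issue, a quick check would be to compare perplexity between a good and a bad build on the same text file (a sketch using the llama-perplexity tool bundled with llama.cpp; the model and test file paths are placeholders):

./llama-perplexity -m /path/to/Nous-Hermes-2-Mixtral-8x7B-DPO.gguf -f wiki.test.raw -c 4096

A broken build would typically report a perplexity orders of magnitude higher than the known-good b4927 build.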

Parameters used:
- Model: mradermacher/Nous-Hermes-2-Mixtral-8x7B-DPO-GGUF
- Command: ./llama-cli -m ... -t 72 -c 4096 --color -i --no-mmap
- System: Intel(R) Xeon(R) CPU E5-2696 v3 @ 2.30GHz

I have tested several versions, and the issue persists with every version after b4927. The performance improvement in the later versions is noticeable, but unfortunately the model no longer generates readable output.

Could you please look into this issue and let me know if there are any changes or fixes in the newer commits that could address this?

Best regards, boblekov

First Bad Commit

No response
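
Since the first bad commit has not been identified, bisecting between the last good and the first bad release tags should pinpoint it. A minimal sketch (assuming a standard CMake build of upstream llama.cpp; the model path is a placeholder):

git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
git bisect start b4928 b4927      # first bad tag, last good tag
cmake -B build && cmake --build build --target llama-cli -j
./build/bin/llama-cli -m /path/to/model.gguf -p "Hello" -n 32
# mark the current commit based on whether the output is readable:
git bisect good                   # or: git bisect bad
# repeat the build/run/mark steps until git reports the first bad commit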

Relevant log output

system_info: n_threads = 72 (n_threads_batch = 72) / 72 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | BMI2 = 1 | LLAMAFILE = 1 | OPENMP = 1 | AARCH64_REPACK = 1 | 

main: interactive mode on.
sampler seed: 2566462078
sampler params: 
        repeat_last_n = 64, repeat_penalty = 1.000, frequency_penalty = 0.000, presence_penalty = 0.000
        dry_multiplier = 0.000, dry_base = 1.750, dry_allowed_length = 2, dry_penalty_last_n = 4096
        top_k = 40, top_p = 0.950, min_p = 0.050, xtc_probability = 0.000, xtc_threshold = 0.100, typical_p = 1.000, top_n_sigma = -1.000, temp = 0.800
        mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampler chain: logits -> logit-bias -> penalties -> dry -> top-k -> typical -> top-p -> min-p -> xtc -> temp-ext -> dist 
generate: n_ctx = 4096, n_batch = 2048, n_predict = -1, n_keep = 1

== Running in interactive mode. ==
 - Press Ctrl+C to interject at any time.
 - Press Return to return control to the AI.
 - To return control without starting a new line, end your input with '/'.
 - If you want to submit another line, end your input with '\'.
 - Not using system message. To change it, set a different value via -sys PROMPT


> Hello

$ $

$"
> 
llama_perf_sampler_print:    sampling time =       1.73 ms /   115 runs   (    0.02 ms per token, 66550.93 tokens per second)
llama_perf_context_print:        load time =   20554.84 ms
llama_perf_context_print: prompt eval time =    2700.95 ms /    79 tokens (   34.19 ms per token,    29.25 tokens per second)
llama_perf_context_print:        eval time =    6132.62 ms /    36 runs   (  170.35 ms per token,     5.87 tokens per second)
llama_perf_context_print:       total time =   23249.99 ms /   115 tokens
Interrupted by user
@pyroxenites

RekaAI_reka-flash-3-IQ4_XS.gguf also produces garbled output.
