Closed
6 changes: 4 additions & 2 deletions CONTRIBUTING.md
@@ -43,8 +43,10 @@ What they cover:
- `--logprob-vectors`: compares local token bytes and top-logprob slices against
official DeepSeek V4 Flash continuation vectors. This catches tokenizer,
template, attention, and logits regressions.
-- `--long-context`: runs a long-context continuation regression from
-  `tests/long_context_security_prompt.txt`.
+- `--long-context`: runs a long-context story fact-recall regression from
+  `tests/long_context_story_prompt.txt`. The model must retrieve spelled-out
+  person-number assignments from a long prose prompt and return `Name=number`
+  lines that the test parses.
- `--tool-call-quality`: exercises actual model behavior for DSML tool-call
emission in both fast and exact paths.
- `--metal-kernels`: isolated Metal kernel numeric checks.
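
For readers unfamiliar with this kind of check, the `Name=number` contract described for `--long-context` can be illustrated with a minimal Python sketch. It is not the project's actual test code, which this diff does not show: the helper names `parse_assignments` and `recall_errors`, the regular expression, and the sample names are all assumptions made for illustration.

```python
import re

# Hypothetical pattern: a name, optional spaces, '=', optional spaces, an integer.
# The real test's parsing rules are not shown in this diff.
LINE_RE = re.compile(r"^\s*([A-Za-z][A-Za-z .'-]*?)\s*=\s*(\d+)\s*$")


def parse_assignments(output):
    """Collect Name=number lines from model output, ignoring any other lines."""
    found = {}
    for line in output.splitlines():
        match = LINE_RE.match(line)
        if match:
            found[match.group(1)] = int(match.group(2))
    return found


def recall_errors(output, expected):
    """Return the sorted names whose recalled number is missing or wrong."""
    got = parse_assignments(output)
    return sorted(name for name, number in expected.items()
                  if got.get(name) != number)


if __name__ == "__main__":
    sample = "Alice=12\nsome stray text\nBob = 8\n"
    print(recall_errors(sample, {"Alice": 12, "Bob": 7, "Cara": 3}))
```

A check built this way would pass only when `recall_errors` returns an empty list, so chatter around the answer lines is tolerated while a wrong or missing assignment is reported by name.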
2 changes: 2 additions & 0 deletions README.md
@@ -18,6 +18,8 @@ This project would not exist without **llama.cpp and GGML**, make sure to read
the acknowledgements section, a big thank you to Georgi Gerganov and all the
other contributors.

## Motivations

Now, back at this project. Why we believe DeepSeek v4 Flash to be a pretty special
model deserving a stand alone engine? Because after comparing it with powerful smaller
dense models, we can report that: