@FIR-757: Update SDK to 0.1.4 and update release to 0.0.3 for tsi-ggml #20
The test results from running ./run_llama_cli.sh with 5 tokens are as follows:
+++
root@agilex7_dk_si_agf014ea:/usr/bin/tsi/v0.1.1.tsv31_06_06_2025/bin# ./run_llama_cli.sh
my cat's name is Max. He'
llama_perf_sampler_print: sampling time = 111.70 ms / 11 runs ( 10.15 ms per token, 98.47 tokens per second)
llama_perf_context_print: load time = 132926.48 ms
llama_perf_context_print: prompt eval time = 109957.33 ms / 6 tokens (18326.22 ms per token, 0.05 tokens per second)
llama_perf_context_print: eval time = 195682.91 ms / 4 runs (48920.73 ms per token, 0.02 tokens per second)
llama_perf_context_print: total time = 328764.01 ms / 10 tokens
GGML Tsavorite Profiling Results:
Calls Total(ms) T/call Self(ms) Function
33160 100086.000 3.018 47907.157 [32%] RuntimeHostShim::awaitCommandListCompletion
18920 29912.952 1.581 29912.952 └─ [10%] [ txe_silu ]
14080 22010.102 1.563 22010.102 └─ [ 7%] [ txe_mult ]
160 253.071 1.582 253.071 └─ [ 0%] [ txe_add ]
33160 1.178 0.000 1.178 └─ [ 0%] TXE 0 Idle
1 114.000 114.000 18.000 [ 0%] GGML Tsavorite
1 96.000 96.000 96.000 └─ [ 0%] RuntimeHostShim::initialize
1 52.000 52.000 52.000 [ 0%] RuntimeHostShim::finalize
33160 26.000 0.001 26.000 [ 0%] RuntimeHostShim::loadBlob
33160 23.000 0.001 23.000 [ 0%] RuntimeHostShim::finalizeCommandList
33160 5.000 0.000 5.000 [ 0%] RuntimeHostShim::addCommandToList
33161 3.000 0.000 3.000 [ 0%] RuntimeHostShim::allocate
33160 3.000 0.000 3.000 [ 0%] RuntimeHostShim::createCommandList
113720 0.000 0.000 0.000 [ 0%] RuntimeHostShim::getShmemManager
33160 0.000 0.000 0.000 [ 0%] RuntimeHostShim::launchBlob
33160 0.000 0.000 0.000 [ 0%] RuntimeHostShim::unloadBlob
33160 0.000 0.000 0.000 [ 0%] RuntimeHostShim::deallocate
412163 308849.000 0.749 308849.000 [100%] TOTAL
root@agilex7_dk_si_agf014ea:/usr/bin/tsi/v0.1.1.tsv31_06_06_2025/bin#
+++
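As a quick sanity check on the timing lines in the log, the per-token figures can be recomputed from the reported totals. The sketch below is illustrative only: `per_token_stats` is a hypothetical helper, not part of llama.cpp or the Tsavorite SDK, and the numbers are copied by hand from the prompt eval and eval lines.

```python
def per_token_stats(total_ms: float, count: int) -> tuple[float, float]:
    """Return (ms per token, tokens per second) for one timing line.

    Hypothetical helper for checking the log by hand; not an SDK API.
    """
    ms_per_token = total_ms / count
    tokens_per_second = count / (total_ms / 1000.0)
    return ms_per_token, tokens_per_second


# Totals and counts copied from the llama_perf_context_print lines in the log.
prompt_ms_per_tok, prompt_tps = per_token_stats(109957.33, 6)
eval_ms_per_tok, eval_tps = per_token_stats(195682.91, 4)

print(f"prompt eval: {prompt_ms_per_tok:.2f} ms/token, {prompt_tps:.2f} tokens/s")
print(f"eval:        {eval_ms_per_tok:.2f} ms/token, {eval_tps:.2f} tokens/s")
```

The recomputed values agree with the figures printed in the log to the precision shown there.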