@atrivedi-tsavoritesi

Make sure to read the contributing guidelines before submitting a PR

The test results are as follows
Model Response
cd /usr/bin/tsi/v0.1.1.tsv31_06_06_2025/bin/; ./run_llama_cli.sh "My cat's name"
" 50 tinyllama-vo-5m-para.gguf tSavorite 1.5 1024 50 0.9 5 12288 0.0
[2018-03-09 13:03:17.788243] 271:272 [ info]  :: </proj/work/mmankali/bld-setuptest/tsirel-31/tsi_yocto_workspace/tsi-apc-manager/platform/rsm_mgr/rsm_process_req.c:129> TXE resource allocation request processed successfully.
 My cat's name was Tim. He loved to play with his toy car. He would run and jump in the park, making loud noises. Tim was very happy with his new toy car.
One day, Tim's mom said, "Tim. You

llama_perf_sampler_print:    sampling time =     999.96 ms /    56 runs   (   17.86 ms per token,    56.00 tokens per second)
llama_perf_context_print:        load time =    1713.55 ms
llama_perf_context_print: prompt eval time =     603.51 ms /     6 tokens (  100.58 ms per token,     9.94 tokens per second)
llama_perf_context_print:        eval time =    7069.36 ms /    49 runs   (  144.27 ms per token,     6.93 tokens per second)
llama_perf_context_print:       total time =   10046.17 ms /    55 tokens
[2018-03-09 13:03:28.875126] 271:272 [ info]  :: </proj/work/mmankali/bld-setuptest/tsirel-31/tsi_yocto_workspace/tsi-apc-manager/platform/rsm_mgr/rsm_process_req.c:145> TXE resource release request processed successfully.

GGML Tsavorite Profiling Results:
------------------------------------------------------------------------------------------------------------------------
Calls  Total(ms)    T/call  Self(ms)  Function
------------------------------------------------------------------------------------------------------------------------
 2715   2720.000     1.002     0.000  [25%] RuntimeHostShim::awaitCommandListCompletion
 1740   2635.984     1.515  2635.984  └─ [24%] [ txe_silu ]
  925   1379.715     1.492  1379.715  └─ [12%] [ txe_mult ]
   50     74.450     1.489    74.450  └─ [ 1%] [ txe_add ]
 2715      0.448     0.000     0.448  └─ [ 0%] TXE 0 Idle
    1     34.000    34.000    34.000  [ 0%] RuntimeHostShim::finalize
    1     16.000    16.000     1.000  [ 0%] GGML Tsavorite
    1     15.000    15.000    15.000  └─ [ 0%] RuntimeHostShim::initialize
 2716      0.000     0.000     0.000  [ 0%] RuntimeHostShim::allocate
 9120      0.000     0.000     0.000  [ 0%] RuntimeHostShim::getShmemManager
 2715      0.000     0.000     0.000  [ 0%] RuntimeHostShim::createCommandList
 2715      0.000     0.000     0.000  [ 0%] RuntimeHostShim::loadBlob
 2715      0.000     0.000     0.000  [ 0%] RuntimeHostShim::launchBlob
 2715      0.000     0.000     0.000  [ 0%] RuntimeHostShim::addCommandToList
 2715      0.000     0.000     0.000  [ 0%] RuntimeHostShim::finalizeCommandList
 2715      0.000     0.000     0.000  [ 0%] RuntimeHostShim::unloadBlob
 2715      0.000     0.000     0.000  [ 0%] RuntimeHostShim::deallocate
========================================================================================================================
33558  11098.000     0.331 11098.000  [100%] TOTAL
========================================================================================================================
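For anyone cross-checking the figures above, the derived columns can be recomputed from the raw times. This is a minimal sketch, not part of the llama.cpp or Tsavorite tooling: the helper names `ms_per_token`, `tokens_per_second`, and `share_percent` are made up for illustration, and the inputs are copied from the log and profiling table above.

```python
def ms_per_token(total_ms: float, count: int) -> float:
    """Average milliseconds spent per token (or per run)."""
    return total_ms / count


def tokens_per_second(total_ms: float, count: int) -> float:
    """Throughput implied by a total time in milliseconds."""
    return count / (total_ms / 1000.0)


def share_percent(self_ms: float, total_ms: float) -> float:
    """Percentage of the profiled total attributed to one entry."""
    return 100.0 * self_ms / total_ms


if __name__ == "__main__":
    # Values copied from the "eval time" line of the log above.
    eval_ms, eval_runs = 7069.36, 49
    print(f"eval: {ms_per_token(eval_ms, eval_runs):.2f} ms per token, "
          f"{tokens_per_second(eval_ms, eval_runs):.2f} tokens per second")

    # Self(ms) values copied from the profiling table; 11098.000 is the TOTAL row.
    total_ms = 11098.000
    for name, self_ms in [("txe_silu", 2635.984),
                          ("txe_mult", 1379.715),
                          ("txe_add", 74.450)]:
        print(f"{name}: {share_percent(self_ms, total_ms):.0f}%")
```

The eval line works out to roughly 144.27 ms per token and 6.93 tokens per second, and the three TXE kernels to about 24%, 12%, and 1% of the total, which agrees with the values reported above.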

The URL used is as follows
http://10.50.0.124:5003/llama-cli?model=tiny-llama&backend=tSavorite&tokens=10&prompt=My+cat%27s+name&repeat-penalty=1.5&batch-size=1024&top-k=50&top-p=0.9&last-n=5&context-length=12288&temp=0.0
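To reproduce the request with different settings, the query string can be assembled rather than edited by hand. The sketch below uses only the Python standard library; the function name `build_llama_cli_url` is hypothetical, and the host, path, and parameter names are taken from the URL above. What the endpoint does with each parameter is not shown in this thread and is not assumed here.

```python
from urllib.parse import urlencode

# Base endpoint copied from the URL above.
BASE_URL = "http://10.50.0.124:5003/llama-cli"


def build_llama_cli_url(params: dict, base_url: str = BASE_URL) -> str:
    """Return base_url with params appended as a form-encoded query string.

    urlencode uses quote_plus by default, so spaces become '+' and the
    apostrophe becomes '%27', matching the encoding seen in the URL above.
    """
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Parameter names and values as they appear in the URL above.
    # Values are kept as strings so "0.0" and "1.5" are sent verbatim.
    params = {
        "model": "tiny-llama",
        "backend": "tSavorite",
        "tokens": "10",
        "prompt": "My cat's name",
        "repeat-penalty": "1.5",
        "batch-size": "1024",
        "top-k": "50",
        "top-p": "0.9",
        "last-n": "5",
        "context-length": "12288",
        "temp": "0.0",
    }
    print(build_llama_cli_url(params))
```

Because dictionaries preserve insertion order, the parameters are emitted in the same order as in the URL above.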
@atrivedi-tsavoritesi
Author

Updated test results

cd /usr/bin/tsi/v0.1.1.tsv31_06_06_2025/bin/; ./run_llama_cli.sh "My cat's name" " 10 tinyllama-vo-5m-para.gguf tSavorite 1.5 1024 50 0.9 5 12288 0.0
[2018-03-09 14:27:48.730593] 271:272 [ info] :: TXE resource allocation request processed successfully.
 My cat's name was Tim. He loved to play with his toy

llama_perf_sampler_print:    sampling time =     200.83 ms /    16 runs   (   12.55 ms per token,    79.67 tokens per second)
llama_perf_context_print:        load time =    1717.23 ms
llama_perf_context_print: prompt eval time =     605.29 ms /     6 tokens (  100.88 ms per token,     9.91 tokens per second)
llama_perf_context_print:        eval time =    1292.54 ms /     9 runs   (  143.62 ms per token,     6.96 tokens per second)
llama_perf_context_print:       total time =    3260.48 ms /    15 tokens
[2018-03-09 14:27:53.028716] 271:272 [ info] :: TXE resource release request processed successfully.

GGML Tsavorite Profiling Results:
------------------------------------------------------------------------------------------------------------------------
Calls  Total(ms)    T/call  Self(ms)  Function
------------------------------------------------------------------------------------------------------------------------
  715    717.000     1.003     0.000  [17%] RuntimeHostShim::awaitCommandListCompletion
  460    696.899     1.515   696.899  └─ [16%] [ txe_silu ]
  245    365.414     1.491   365.414  └─ [ 8%] [ txe_mult ]
   10     14.900     1.490    14.900  └─ [ 0%] [ txe_add ]
  715      0.458     0.001     0.458  └─ [ 0%] TXE 0 Idle
    1     34.000    34.000    34.000  [ 1%] RuntimeHostShim::finalize
    1     15.000    15.000     1.000  [ 0%] GGML Tsavorite
    1     14.000    14.000    14.000  └─ [ 0%] RuntimeHostShim::initialize
  716      0.000     0.000     0.000  [ 0%] RuntimeHostShim::allocate
 2400      0.000     0.000     0.000  [ 0%] RuntimeHostShim::getShmemManager
  715      0.000     0.000     0.000  [ 0%] RuntimeHostShim::createCommandList
  715      0.000     0.000     0.000  [ 0%] RuntimeHostShim::loadBlob
  715      0.000     0.000     0.000  [ 0%] RuntimeHostShim::launchBlob
  715      0.000     0.000     0.000  [ 0%] RuntimeHostShim::addCommandToList
  715      0.000     0.000     0.000  [ 0%] RuntimeHostShim::finalizeCommandList
  715      0.000     0.000     0.000  [ 0%] RuntimeHostShim::unloadBlob
  715      0.000     0.000     0.000  [ 0%] RuntimeHostShim::deallocate
========================================================================================================================
 8838   4310.000     0.488  4310.000  [100%] TOTAL
========================================================================================================================


@akapoor3518 akapoor3518 left a comment

lgtm


@dineshReddy6381 dineshReddy6381 left a comment

Approved

@atrivedi-tsavoritesi atrivedi-tsavoritesi merged commit d733056 into master Jun 18, 2025
@atrivedi-tsavoritesi atrivedi-tsavoritesi deleted the FIR-754 branch June 18, 2025 04:04
4 participants