Merged
115 changes: 114 additions & 1 deletion .wordlist.txt
Original file line number Diff line number Diff line change
@@ -4474,4 +4474,117 @@ AssetLib
PerformanceStudio
VkThread
precompiled
rollouts
rollouts
Bhusari
DLLAMA
FlameGraph
FlameGraphs
JSP
KBC
MMIO
Paravirtualized
PreserveFramePointer
Servlet
TDISP
VirtIO
WebSocket
agentpath
alarmtimer
aoss
apb
ata
bpf
brendangregg
chipidea
clk
cma
counterintuitive
cpuhp
cros
csd
devfreq
devlink
dma
dpaa
dwc
ecurity
edma
evice
filelock
filemap
flamegraphs
fsl
glink
gpu
hcd
hns
hw
hwmon
icmp
initcall
iomap
iommu
ipi
irq
jbd
jvmti
kmem
ksm
kvm
kyber
libata
libperf
lockd
mdio
memcg
mmc
mtu
musb
napi
ncryption
netfs
netlink
nfs
ntegrity
nterface
oom
optee
pagemap
paravirtualized
percpu
printk
pwm
qcom
qdisc
ras
rcu
regmap
rgerganov’s
rotocol
rpcgss
rpmh
rseq
rtc
sched
scmi
scsi
skb
smbus
smp
spi
spmi
sunrpc
swiotlb
tegra
thp
tlb
udp
ufs
untrusted
uring
virtio
vmalloc
vmscan
workqueue
xdp
xhci
24 changes: 15 additions & 9 deletions content/learning-paths/servers-and-cloud-computing/_index.md
@@ -8,8 +8,8 @@ key_ip:
maintopic: true
operatingsystems_filter:
- Android: 2
- - Linux: 154
- - macOS: 10
+ - Linux: 157
+ - macOS: 11
- Windows: 14
pinned_modules:
- module:
@@ -22,8 +22,8 @@ subjects_filter:
- Containers and Virtualization: 29
- Databases: 15
- Libraries: 9
- - ML: 28
- - Performance and Architecture: 60
+ - ML: 29
+ - Performance and Architecture: 62
- Storage: 1
- Web: 10
subtitle: Optimize cloud native apps on Arm for performance and cost
@@ -47,6 +47,8 @@ tools_software_languages_filter:
- ASP.NET Core: 2
- Assembly: 4
- assembly: 1
- Async-profiler: 1
- AWS: 1
- AWS CDK: 2
- AWS CodeBuild: 1
- AWS EC2: 2
@@ -65,7 +67,7 @@ tools_software_languages_filter:
- C++: 8
- C/C++: 2
- Capstone: 1
- - CCA: 6
+ - CCA: 7
- Clair: 1
- Clang: 10
- ClickBench: 1
@@ -77,18 +79,19 @@ tools_software_languages_filter:
- Daytona: 1
- Demo: 3
- Django: 1
- - Docker: 17
+ - Docker: 18
- Envoy: 2
- ExecuTorch: 1
- FAISS: 1
- FlameGraph: 1
- Flink: 1
- Fortran: 1
- FunASR: 1
- FVP: 4
- GCC: 22
- gdb: 1
- Geekbench: 1
- - GenAI: 11
+ - GenAI: 12
- GitHub: 6
- GitLab: 1
- Glibc: 1
@@ -114,7 +117,7 @@ tools_software_languages_filter:
- Linaro Forge: 1
- Litmus7: 1
- Llama.cpp: 1
- - LLM: 9
+ - LLM: 10
- llvm-mca: 1
- LSE: 1
- MariaDB: 1
@@ -132,6 +135,7 @@ tools_software_languages_filter:
- Ollama: 1
- ONNX Runtime: 1
- OpenBLAS: 1
- OpenJDK-21: 1
- OpenShift: 1
- OrchardCore: 1
- PAPI: 1
@@ -144,7 +148,7 @@ tools_software_languages_filter:
- RAG: 1
- Redis: 3
- Remote.It: 2
- - RME: 6
+ - RME: 7
- Runbook: 71
- Rust: 2
- snappy: 1
@@ -161,6 +165,7 @@ tools_software_languages_filter:
- TensorFlow: 2
- Terraform: 11
- ThirdAI: 1
- Tomcat: 1
- Trusted Firmware: 1
- TSan: 1
- TypeScript: 1
@@ -173,6 +178,7 @@ tools_software_languages_filter:
- Whisper: 1
- WindowsPerf: 1
- WordPress: 3
- wrk2: 1
- x265: 1
- zlib: 1
- Zookeeper: 1
@@ -46,7 +46,7 @@ If everything was built correctly, you should see a list of all the available fl

Communication between the master node and the worker nodes occurs through a socket created on each worker. This socket listens for incoming data from the master—such as model parameters, tokens, hidden states, and other inference-related information.
{{% notice Note %}}The RPC feature in llama.cpp is not secure by default, so you should never expose it to the open internet. To mitigate this risk, ensure that the security groups for all your EC2 instances are properly configured—restricting access to only trusted IPs or internal VPC traffic. This helps prevent unauthorized access to the RPC endpoints.{{% /notice %}}
- Use the following command to start the listeneing on the worker nodes:
+ Use the following command to start the listener on the worker nodes:
```bash
bin/rpc-server -p 50052 -H 0.0.0.0 -t 64
```
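
With a worker listening on each node, the master side needs those endpoints as a comma-separated list for the `--rpc "$worker_ips"` flag used later in this Learning Path. A small bash sketch for building that list; the IP addresses below are placeholders for your workers' private IPs:

```shell
# Build the comma-separated endpoint list expected by --rpc.
# The addresses are placeholders; substitute your workers' private IPs.
workers=("10.0.1.10" "10.0.1.11")
worker_ips=$(IFS=','; echo "${workers[*]/%/:50052}")
echo "$worker_ips"   # prints "10.0.1.10:50052,10.0.1.11:50052"
```

The port suffix matches the `-p 50052` value passed to `rpc-server` above; change both together if you pick a different port.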
@@ -190,7 +190,7 @@ llama_perf_context_print: eval time = 77429.95 ms / 127 runs ( 609
llama_perf_context_print: total time = 79394.06 ms / 132 tokens
llama_perf_context_print: graphs reused = 0
```
- That's it! You have sucessfully run the llama-3.1-8B model on CPUs with the power of llama.cpp RPC functionality. The following table provides brief description of the metrics from `llama_perf`: <br><br>
+ That's it! You have successfully run the llama-3.1-8B model on CPUs using the llama.cpp RPC functionality. The following table provides a brief description of the metrics from `llama_perf`: <br><br>

| Log Line | Description |
|-------------------|-----------------------------------------------------------------------------|
@@ -200,11 +200,11 @@ That's it! You have sucessfully run the llama-3.1-8B model on CPUs with the powe
| eval time | Time to generate output tokens by forward-passing through the model. |
| total time | Total time for both prompt processing and token generation (excludes model load). |
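
Plugging in the numbers from the log above, the eval line converts to decode throughput as follows (a quick bash/awk sketch):

```shell
# Decode throughput from the llama_perf eval line:
# 127 output tokens generated in 77429.95 ms.
eval_ms=77429.95
n_tokens=127
tps=$(awk -v ms="$eval_ms" -v n="$n_tokens" 'BEGIN { printf "%.2f", n / (ms / 1000) }')
echo "decode throughput: $tps tokens/sec"   # prints "decode throughput: 1.64 tokens/sec"
```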

- Lastly to set up OpenAI compatible API, you can use the `llama-server` functionality. The process of implementing this is described [here](/learning-paths/servers-and-cloud-computing/llama-cpu) under the "Access the chatbot using the OpenAI-compatible API" section. Here is a snippet, for how to set up llama-server for disributed inference:
+ Lastly, to set up an OpenAI-compatible API, you can use the `llama-server` functionality. The process is described [here](/learning-paths/servers-and-cloud-computing/llama-cpu) under the "Access the chatbot using the OpenAI-compatible API" section. Here is a snippet showing how to set up llama-server for distributed inference:
```bash
bin/llama-server -m /home/ubuntu/model.gguf --port 8080 --rpc "$worker_ips" -ngl 99
```
- At the very end of the output to the above command, you will see somethin like the following:
+ At the very end of the output from the above command, you will see something like the following:
```output
main: server is listening on http://127.0.0.1:8080 - starting the main loop
srv update_slots: all slots are idle
@@ -87,7 +87,7 @@ Move the executable to somewhere in your PATH:
```bash
sudo cp wrk /usr/local/bin
```

- 3. Finally, you can run the benchamrk of Tomcat through wrk2.
+ 3. Finally, you can run the Tomcat benchmark with wrk2.
```bash
wrk -c32 -t16 -R50000 -d60 http://${tomcat_ip}:8080/examples/servlets/servlet/HelloWorldExample
```
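
A note on the flags: `-R50000` sets the total target throughput, which wrk2 spreads across the `-c32` open connections, while `-t16` threads drive those connections for `-d60` seconds. The implied per-connection pacing works out as:

```shell
# Per-connection pacing implied by the wrk2 flags above.
rate=50000   # -R: total target requests/sec
conns=32     # -c: open connections
per_conn=$(awk -v r="$rate" -v c="$conns" 'BEGIN { printf "%.1f", r / c }')
echo "each connection paces ~$per_conn req/s"   # prints "each connection paces ~1562.5 req/s"
```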