Examples and Tutorials

AOCL-DLP ships with example programs in the examples/classic/ directory. Build them with:

cd aocl-dlp
mkdir build && cd build
cmake -DBUILD_EXAMPLES=ON ..
make -j$(nproc)

The compiled example binaries are written to build/examples/classic/.

Example Catalog

Basic GEMM

Example	Description	Key concepts
`simple_gemm_f32.c`	Float32 matrix multiplication	Basic GEMM call, row-major layout
`simple_gemm_bf16.c`	BFloat16 GEMM	BF16 input type, f32 accumulation
`simple_gemm_s8.c`	Signed int8 GEMM	Integer quantized GEMM
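
For orientation, the computation that a row-major float32 GEMM performs can be sketched in plain C. This is a naive reference loop, not AOCL-DLP's API or implementation: the function name `ref_gemm_f32` and its parameters (`lda`, `ldb`, `ldc` as row strides) are hypothetical, and it assumes the conventional GEMM definition C = alpha * A * B + beta * C.

```c
#include <stddef.h>

/* Hypothetical reference helper, not part of AOCL-DLP.
 * Computes C = alpha * A * B + beta * C for row-major matrices:
 *   A is m x k (row stride lda),
 *   B is k x n (row stride ldb),
 *   C is m x n (row stride ldc). */
static void ref_gemm_f32(size_t m, size_t n, size_t k,
                         float alpha, const float *a, size_t lda,
                         const float *b, size_t ldb,
                         float beta, float *c, size_t ldc)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float acc = 0.0f;  /* f32 accumulator for one output element */
            for (size_t p = 0; p < k; ++p) {
                acc += a[i * lda + p] * b[p * ldb + j];
            }
            c[i * ldc + j] = alpha * acc + beta * c[i * ldc + j];
        }
    }
}
```

A loop like this can serve as a correctness check when comparing against the output of `simple_gemm_f32.c` on small matrices.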

Mixed Precision

Example	Description	Key concepts
`simple_gemm_bf16s8.c`	BF16 activations with int8 weights	Mixed-precision, on-the-fly quantization
`simple_gemm_f32s8.c`	F32 activations with int8 weights	Mixed-precision quantized inference
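
One common way to think about F32 activations with int8 weights is sketched below: the weights stay in int8 and are rescaled to float32 as part of the accumulation. This is an illustrative assumption, not a description of what AOCL-DLP does internally. The name `ref_gemm_f32s8` and the per-output-column scale array `b_scale` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference helper, not part of AOCL-DLP.
 * A:       m x k float32 activations, row-major.
 * b_q:     k x n int8 weights, row-major.
 * b_scale: n floats, one dequantization scale per output column (assumed).
 * C:       m x n float32 output, row-major.
 * Each weight is treated as b_q * b_scale[j]; because the scale is constant
 * along k, it is applied once after the accumulation. */
static void ref_gemm_f32s8(size_t m, size_t n, size_t k,
                           const float *a, const int8_t *b_q,
                           const float *b_scale, float *c)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (size_t p = 0; p < k; ++p) {
                acc += a[i * k + p] * (float)b_q[p * n + j];
            }
            c[i * n + j] = acc * b_scale[j];
        }
    }
}
```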

Post-Operations

Example	Description	Key concepts
`simple_gemm_with_bias.c`	GEMM with fused bias addition	`dlp_metadata_t`, BIAS post-op
`simple_gemm_with_relu.c`	GEMM with fused ReLU activation	ELTWISE post-op, RELU
`simple_gemm_with_mish.c`	F32 GEMM with fused Mish activation	`aocl_gemm_f32f32f32of32`, ELTWISE post-op, `MISH` algo_type
`post_ops_combinations.c`	Multiple chained post-operations	Chaining BIAS + ELTWISE, seq_vector
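
To make the effect of chained post-ops concrete, the sketch below applies a bias addition followed by ReLU to an already computed GEMM result. It is a plain C reference, not the library's `dlp_metadata_t` interface: the function name `ref_bias_relu_f32` is hypothetical, and it assumes the bias vector has one entry per output column.

```c
#include <stddef.h>

/* Hypothetical reference helper, not part of AOCL-DLP.
 * Applies two chained post-ops in place to an m x n row-major matrix c:
 *   1. bias:  add bias[j] to every element in column j (assumed layout)
 *   2. ReLU:  replace negative values with zero */
static void ref_bias_relu_f32(size_t m, size_t n, float *c, const float *bias)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float v = c[i * n + j] + bias[j];
            c[i * n + j] = v > 0.0f ? v : 0.0f;
        }
    }
}
```

A fused post-op produces the same values without a second pass over the output matrix, which is the motivation for expressing bias and activation as post-ops instead of separate loops.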

Quantization

Example	Description	Key concepts
`quantization.c`	Symmetric quantization workflow	`DLP_SYMM_STAT_QUANT`, sym_quant APIs
`simple_gemm_s8_sym_quant.c`	s8 x s8 -> f32 GEMM with symmetric static quantization	`aocl_gemm_s8s8s32of32_sym_quant`, `post_op_grp` scales, group_size
`simple_gemm_per_token_quant.c`	W8A8 (s8 x s8) GEMM with per-token (PerM) dequantization of A, including the n=1 decoder path	`aocl_gemm_s8s8s32of32`, SCALE post-op, `DLP_PARAM_DIM_PER_TOKEN`
`simple_gemm_bf16s4.c`	BF16 activations x s4 weights, symmetric weight-only quantization (WOQ)	`aocl_gemm_bf16s4f32of32`, `aocl_reorder_bf16s4f32of32`, `pre_ops->b_scl`
`simple_gemm_bf16u4.c`	BF16 activations x u4 weights, asymmetric WOQ with B zero-point	`aocl_gemm_bf16u4f32of32`, `pre_ops` b_scl + b_zp
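
As background for the symmetric quantization examples, the sketch below shows the general idea in plain C: pick a scale from the largest absolute value, map floats to int8 with a zero-point of 0, and recover approximate floats by multiplying back. This is a generic per-tensor scheme written for illustration. The helper names are hypothetical, and the grouping and scale layout used by the AOCL-DLP sym_quant APIs may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference helper, not part of AOCL-DLP.
 * Per-tensor symmetric quantization of x[0..len) to int8.
 * Returns the scale, so that x[i] is approximately q[i] * scale. */
static float ref_sym_quant_s8(const float *x, size_t len, int8_t *q)
{
    float max_abs = 0.0f;
    for (size_t i = 0; i < len; ++i) {
        float v = x[i] < 0.0f ? -x[i] : x[i];
        if (v > max_abs) max_abs = v;
    }
    /* Map the largest magnitude to 127; fall back to 1 for all-zero input. */
    float scale = max_abs > 0.0f ? max_abs / 127.0f : 1.0f;

    for (size_t i = 0; i < len; ++i) {
        float v = x[i] / scale;
        int r = (int)(v >= 0.0f ? v + 0.5f : v - 0.5f);  /* round half away from zero */
        if (r > 127) r = 127;
        if (r < -127) r = -127;
        q[i] = (int8_t)r;
    }
    return scale;
}

/* Dequantize a single value: symmetric schemes have no zero-point term. */
static float ref_dequant_s8(int8_t q, float scale)
{
    return (float)q * scale;
}
```

Asymmetric schemes, such as the u4 weight example with a B zero-point, additionally subtract a zero-point before multiplying by the scale.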

Batch & Advanced

Example	Description	Key concepts
`batch_gemm.c`	Batch GEMM for multiple matrices	`aocl_batch_gemm_*`, group_count
`matrix_reorder.c`	Pre-reorder weights for repeated use	`aocl_reorder_*`, mem_format_b = 'R'
`eltwise_ops.c`	Standalone element-wise operations	`aocl_gemm_eltwise_ops_*`

Multi-Instance & Utilities

Example	Description	Key concepts
`multi_instance_gemm_f32.c`	Multiple GEMM instances in parallel	Thread-local settings, concurrent calls
`multi_instance_gemm_u8s8.c`	Multi-instance quantized GEMM	Parallel quantized inference
`version.c`	Query library version	`dlp_version_query()`

Suggested Learning Path

If you are new to AOCL-DLP, work through the examples in this order:

1. Quick Start -- Build and run your first program (inline example)
2. simple_gemm_f32.c -- Understand basic GEMM parameters
3. simple_gemm_with_bias.c -- Learn how post-ops work
4. matrix_reorder.c -- Optimize for repeated inference
5. batch_gemm.c -- Process multiple matrices efficiently
6. quantization.c -- Use integer quantization for inference

Then explore the guides for deeper understanding:

GEMM Guide -- All data types, parameters, and reordering
Post-Ops Guide -- Full post-operations reference
Performance Guide -- Threading and optimization

Building Examples Against an Installed Library

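If AOCL-DLP is already installed on your system, you can build an example on its own, without the AOCL-DLP build tree. The commands below assume the library is installed under /usr/local; adjust the -I and -L paths for a different install prefix: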

# Using shared library
gcc -o simple_gemm_f32 simple_gemm_f32.c -I/usr/local/include -L/usr/local/lib -laocl-dlp -lm

# Using static library
gcc -o simple_gemm_f32 simple_gemm_f32.c -I/usr/local/include -L/usr/local/lib \
    -Wl,--whole-archive -laocl-dlp_static -Wl,--no-whole-archive -lstdc++ -lm -fopenmp

See the Integration Guide for CMake-based builds and troubleshooting.
