When I try to run applications/llama_3.2_1b, it throws the following error:
AIE Configuration (✔ = AIE NPU / ✘ = CPU):
[Model] KV Cache ✔
[Decode] GEMV ✔
[Attention] Rope ✔
[Attention] QKV GEMM ✔
[Attention] Fused MHA ✔
[FFN] GEMM ✔
[FFN] Elementwise Mul ✔
[FFN] SiLU ✔
[Transformer] Residual Addition ✔
[Transformer] Pre Norm ✔
[Transformer] Post Norm ✔
[Transformer] Final Norm ✔
[Transformer] Final GEMM ✘
Traceback (most recent call last):
  File "/home/nexaai/Desktop/code/IRON/applications/llama_3.2_1b/inference.py", line 319, in <module>
    inference(
  File "/home/nexaai/Desktop/code/IRON/applications/llama_3.2_1b/inference.py", line 143, in inference
    AIEOperatorBase.compile_all_operators()
  File "/home/nexaai/Desktop/code/IRON/applications/llama_3.2_1b/src/operator/aie_base.py", line 35, in compile_all_operators
    op.compile()
  File "/home/nexaai/Desktop/code/IRON/applications/llama_3.2_1b/src/operator/aie_base.py", line 204, in compile
    comp.compile(compilation_rules, work_list)
  File "/home/nexaai/Desktop/code/IRON/applications/llama_3.2_1b/src/compilation.py", line 496, in compile
    success, artifacts = apply_rules(rules, remaining)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nexaai/Desktop/code/IRON/applications/llama_3.2_1b/src/compilation.py", line 483, in apply_rules
    artifacts = rule.compile(artifacts)
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nexaai/Desktop/code/IRON/applications/llama_3.2_1b/src/compilation.py", line 210, in compile
    mlir_code = callback_function(
                ^^^^^^^^^^^^^^^^^^
  File "/home/nexaai/Desktop/code/IRON/example/mha/mha.py", line 155, in fused_mha
    num_q_blocks % number_of_pipelines == 0
AssertionError: Number of Q blocks must be divisible by number of pipelines (for now)
How can I solve it?
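
For reference, the check that fails in the traceback is a plain divisibility test on `num_q_blocks` and `number_of_pipelines`. Below is a minimal sketch of which pipeline counts would satisfy it, assuming only the condition shown in the assertion. The function name `valid_pipeline_counts` and the example numbers are hypothetical and not taken from IRON; I don't know how `mha.py` actually derives either value.

```python
def valid_pipeline_counts(num_q_blocks: int, max_pipelines: int) -> list[int]:
    """Return every pipeline count p in 1..max_pipelines with num_q_blocks % p == 0.

    Hypothetical helper: it mirrors only the asserted condition
    `num_q_blocks % number_of_pipelines == 0` from the traceback.
    """
    if num_q_blocks <= 0 or max_pipelines <= 0:
        raise ValueError("both arguments must be positive")
    return [p for p in range(1, max_pipelines + 1) if num_q_blocks % p == 0]


if __name__ == "__main__":
    # Illustrative numbers only, not values read from the model config.
    print(valid_pipeline_counts(6, 8))
```

If the values used in my run were known, a helper like this would show whether any pipeline count other than 1 can pass the assertion.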