steven-studio

This patch introduces a small 8-entry ring-buffer "hot cache" inside
IRTranslator::ValueToVRegInfo to reduce the number of DenseMap lookups
on very hot paths (PHIs, arguments, repeated operands).

Implementation details:

  • Two ring buffers are maintained, one per map: (Value* -> VRegListT*) and (Type* -> OffsetListT*).
  • An entry is recorded whenever getVRegs/getOffsets hits an existing mapping or inserts a new one.
  • Both buffers are reset at function boundaries, so cached pointers never outlive the bump-allocated lists they refer to.
  • No semantic changes: this is purely a compile-time optimization.
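
The lookup scheme described above can be sketched roughly as follows. This is not the patch's code: HotCache, CachedMap, MapProbes and the use of std::unordered_map in place of DenseMap are all assumptions made for illustration. The sketch only shows the general shape of a small ring of recent entries consulted before the backing map.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical fixed-size ring of recently used (key, value) pairs.
template <typename K, typename V, std::size_t N = 8>
class HotCache {
  static_assert(N > 0, "need at least one slot");
  struct Slot {
    K Key{};
    V Val{};
    bool Valid = false;
  };
  std::array<Slot, N> Slots{};
  std::size_t Next = 0; // slot that the next new key overwrites

public:
  // Linear scan over the N slots; on a hit, copies the value into Out.
  bool lookup(const K &Key, V &Out) const {
    for (const Slot &S : Slots)
      if (S.Valid && S.Key == Key) {
        Out = S.Val;
        return true;
      }
    return false;
  }

  // Update the key in place if it is already cached, else overwrite a slot.
  void remember(const K &Key, const V &Val) {
    for (Slot &S : Slots)
      if (S.Valid && S.Key == Key) {
        S.Val = Val;
        return;
      }
    assert(Next < N);
    Slots[Next].Key = Key;
    Slots[Next].Val = Val;
    Slots[Next].Valid = true;
    Next = (Next + 1) % N;
  }

  void reset() {
    Slots.fill(Slot{});
    Next = 0;
  }
};

// Hypothetical map wrapper: ring first, backing map second.
// std::unordered_map stands in for DenseMap here.
class CachedMap {
  std::unordered_map<const void *, int *> Map;
  HotCache<const void *, int *> Hot;

public:
  std::size_t MapProbes = 0; // lookups that reached the backing map

  int *find(const void *Key) {
    int *P = nullptr;
    if (Hot.lookup(Key, P))
      return P;
    ++MapProbes;
    auto It = Map.find(Key);
    if (It == Map.end())
      return nullptr;
    Hot.remember(Key, It->second);
    return It->second;
  }

  void insert(const void *Key, int *Val) {
    Map[Key] = Val;
    Hot.remember(Key, Val);
  }

  // Called at a function boundary: drop the map and every cached pointer.
  void reset() {
    Map.clear();
    Hot.reset();
  }
};
```

Resetting the ring together with the map is what keeps a cached pointer from dangling once the backing storage is released. Whether eight linear comparisons beat a single hash probe in practice is something only measurement can settle.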

Benefits:

  • Reduces DenseMap probes in tight loops with repeated operands.
  • Improves IR translation time in GlobalISel on large inputs (an early micro-benchmark shows X% fewer map lookups).

Testing:

  • ninja check-llvm passes; no MIR/CodeGen diffs were observed.

Risk:

  • Low. The cache is local to each function and is fully reset between functions.
  • If the cache is disabled, behavior is identical to pre-patch.

NFC

…bled by default)

Introduce RegAllocSegmentTree pass skeleton and plumbing:
- New files under CodeGen for segment-tree-based RA
- Add CMake gate (LLVM_ENABLE_SEG_TRE_RA, default OFF)
- (Optional) Macro gate EXPERIMENTAL_SEG_TRE_RA around pass registration

Current status:
- Does not fully compile when enabled; kept OUT of default build
- selectOrSplit/enqueue/dequeue and segment tree ops are stubs
- Spiller creation via RequiredAnalyses wired but still under development

Design notes:
- Store LIS/VRM/MF/MRI as pointers; pass by reference where required
- Intend per-physreg data structure for interval storage (no static shared vector)

TODO:
- Fix pointer/reference consistency across all call sites
- Replace static interval buffer with per-physreg containers
- Implement incremental segment-tree updates and queries
- Implement selectOrSplit and spill heuristics
- Add tests; ensure no functional change when gate is OFF

Build/usage:
- OFF by default; enable with -DLLVM_ENABLE_SEG_TRE_RA=ON
- When enabled, use: llc/clang -regalloc=segtre

No functional change when LLVM_ENABLE_SEG_TRE_RA=OFF.

- New allocator: PRCoords + PRTree (range-add / range-max with lazy)
- ensureCoordsAndTree(): rebuild-on-new-endpoints + interval replay
- isPhysRegAvailable(): alias-aware; queries new tree after ensuring coords
- updateSegmentTreeForPhysReg(): writes range add to selected physreg
- init/reset: initialize/clear PRCoords/PRTree/PhysRegIntervals
- SlotIndexes dependency handled; pipeline runs to completion
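
A rough sketch of the technique named in the list above: a lazy range-add / range-max segment tree over compressed endpoints, rebuilt and replayed whenever a new endpoint appears. This is not the PRCoords/PRTree code from the patch. OccupancyTree, assign, isFree and maxOverlap are hypothetical names, plain ints stand in for SlotIndex values, and intervals are assumed to be half-open [Start, End).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical occupancy structure for one physical register.
class OccupancyTree {
  std::vector<int> Coords;                    // sorted unique endpoints
  std::vector<std::pair<int, int>> Intervals; // kept for replay on rebuild
  std::vector<int> Max, Lazy;                 // per-node maximum, pending add

  // Leaf i covers the elementary interval [Coords[i], Coords[i + 1]).
  std::size_t leaves() const { return Coords.size() < 2 ? 0 : Coords.size() - 1; }
  bool has(int X) const { return std::binary_search(Coords.begin(), Coords.end(), X); }
  std::size_t indexOf(int X) const {
    auto It = std::lower_bound(Coords.begin(), Coords.end(), X);
    assert(It != Coords.end() && *It == X);
    return static_cast<std::size_t>(It - Coords.begin());
  }

  void apply(std::size_t Node, int Delta) {
    Max[Node] += Delta;
    Lazy[Node] += Delta;
  }
  // Move a pending add from Node down to its two children.
  void push(std::size_t Node) {
    if (Lazy[Node] == 0)
      return;
    apply(2 * Node, Lazy[Node]);
    apply(2 * Node + 1, Lazy[Node]);
    Lazy[Node] = 0;
  }

  // Node covers leaves [Lo, Hi); the update range is leaves [L, R).
  void add(std::size_t Node, std::size_t Lo, std::size_t Hi, std::size_t L,
           std::size_t R, int Delta) {
    if (R <= Lo || Hi <= L)
      return;
    if (L <= Lo && Hi <= R) {
      apply(Node, Delta);
      return;
    }
    push(Node);
    std::size_t Mid = Lo + (Hi - Lo) / 2;
    add(2 * Node, Lo, Mid, L, R, Delta);
    add(2 * Node + 1, Mid, Hi, L, R, Delta);
    Max[Node] = std::max(Max[2 * Node], Max[2 * Node + 1]);
  }

  // Counts are never negative, so 0 is a safe result for a disjoint range.
  int query(std::size_t Node, std::size_t Lo, std::size_t Hi, std::size_t L,
            std::size_t R) {
    if (R <= Lo || Hi <= L)
      return 0;
    if (L <= Lo && Hi <= R)
      return Max[Node];
    push(Node);
    std::size_t Mid = Lo + (Hi - Lo) / 2;
    return std::max(query(2 * Node, Lo, Mid, L, R),
                    query(2 * Node + 1, Mid, Hi, L, R));
  }

  // If an endpoint is new, rebuild the tree and replay recorded intervals.
  void ensureCoords(int Start, int End) {
    if (has(Start) && has(End))
      return;
    Coords.push_back(Start);
    Coords.push_back(End);
    std::sort(Coords.begin(), Coords.end());
    Coords.erase(std::unique(Coords.begin(), Coords.end()), Coords.end());
    Max.assign(4 * leaves(), 0);
    Lazy.assign(4 * leaves(), 0);
    for (const auto &I : Intervals)
      add(1, 0, leaves(), indexOf(I.first), indexOf(I.second), 1);
  }

public:
  // Largest number of recorded intervals covering any point of [Start, End).
  int maxOverlap(int Start, int End) {
    assert(Start < End);
    ensureCoords(Start, End);
    return query(1, 0, leaves(), indexOf(Start), indexOf(End));
  }
  bool isFree(int Start, int End) { return maxOverlap(Start, End) == 0; }

  // Record [Start, End) as occupied.
  void assign(int Start, int End) {
    assert(Start < End);
    ensureCoords(Start, End);
    add(1, 0, leaves(), indexOf(Start), indexOf(End), 1);
    Intervals.emplace_back(Start, End);
  }
};
```

For example, after assign(10, 20), isFree(20, 30) holds while isFree(15, 30) does not. Every query with an unseen endpoint triggers a full rebuild in this sketch, which matches the rebuild-frequency concern listed under "Known gaps".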

Known gaps:
- Rebuild frequency could be optimized (batch endpoints)
- Legacy MaxEnd helpers still present (unused)
- More tests for alias-heavy and spill cases

Verified:
- llc -O2 -regalloc=segtre produces assembly
- -debug-only=regallocsegtre shows coherent updates/queries
- -verify-machineinstrs passes

- Implement segment tree-based register allocator for LLVM
- Features coordinate compression and lazy propagation
- Add interference checking using interval tree queries
- Include performance benchmarking against RAGreedy
- Currently 6.7x slower than baseline (experimental research)

This adds a prototype pass `-regalloc=intervals`, which uses IntervalSet
instead of SegmentTree. Currently only handles trivial cases.

Adds a minimal factory + regalloc registration for .
Currently delegates to Greedy allocator. No functional change.
llvmbot added labels on Sep 10, 2025: backend:RISC-V, llvm:codegen, llvm:globalisel, PGO (Profile Guided Optimizations), llvm:instcombine (Covers the InstCombine, InstSimplify and AggressiveInstCombine passes), llvm:analysis (Includes value tracking, cost tables and constant folding), llvm:transforms
llvmbot commented Sep 10, 2025

@llvm/pr-subscribers-pgo
@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-risc-v

Author: None (steven-studio)

Patch is 184.62 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/157964.diff

9566 Files Affected:

  • (added) DeepSeekCodeAnalyzer.py (+164)
  • (added) SingleSource/UnitTests/Output/2002-04-17-PrintfChar.test.out (+1)
  • (added) asearch.ll (+1138)
  • (added) code_analysis_report.txt (+254)
  • (added) commit.txt (+4)
  • (added) failed.list (+7)
  • (added) hello ()
  • (added) hello-rv64 ()
  • (added) hello.c (+6)
  • (added) linux (+1)
  • (added) llvm-test-suite (+1)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/IRTranslator.h (+69-8)
  • (modified) llvm/include/llvm/CodeGen/LinkAllCodegenComponents.h (+3-1)
  • (modified) llvm/include/llvm/CodeGen/Passes.h (+5)
  • (added) llvm/include/llvm/CodeGen/RegAllocIntervals.h (+118)
  • (added) llvm/include/llvm/CodeGen/RegAllocSegmentTree.h (+355)
  • (modified) llvm/include/llvm/InitializePasses.h (+1)
  • (modified) llvm/lib/CodeGen/CMakeLists.txt (+2)
  • (modified) llvm/lib/CodeGen/CodeGen.cpp (+1)
  • (modified) llvm/lib/CodeGen/RegAllocGreedy.cpp (+86)
  • (modified) llvm/lib/CodeGen/RegAllocGreedy.h (+22)
  • (added) llvm/lib/CodeGen/RegAllocIntervals.cpp (+470)
  • (added) llvm/lib/CodeGen/RegAllocSegmentTree.cpp (+1510)
  • (added) llvm/lib/Target/MyTarget/CMakeLists.txt (+22)
  • (added) llvm/lib/Target/MyTarget/HelloMinimal.td (+8)
  • (added) llvm/lib/Target/MyTarget/HelloWorld.td (+27)
  • (added) llvm/lib/Target/MyTarget/MyTarget.td (+31)
  • (added) llvm/lib/Target/MyTarget/MyTargetInstrInfo.cpp (+6)
  • (added) llvm/lib/Target/MyTarget/MyTargetInstrInfo.h (+10)
  • (added) llvm/lib/Target/RISCV/RISCVArithmeticCostCalculator.h (+321)
  • (added) llvm/lib/Target/RISCV/RISCVCostCalculatorRegistry.h (+29)
  • (added) llvm/lib/Target/RISCV/RISCVCostConstants.h (+84)
  • (added) llvm/lib/Target/RISCV/RISCVCostValidation.h (+55)
  • (added) llvm/lib/Target/RISCV/RISCVShuffleCostCalculator.h (+439)
  • (added) llvm/lib/Target/RISCV/RISCVTTITester.h (+102)
  • (modified) llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp (+5-1)
  • (added) llvm/test/Analysis/CostModel/RISCV/rvv-shift-cost.ll (+26)
  • (added) llvm/test/Analysis/CostModel/RISCV/saturated-arithmetic.ll (+37)
  • (added) llvm/test/Analysis/CostModel/RISCV/shift-instructions.ll (+26)
  • (added) llvm/test/CodeGen/RISCV/GlobalISel/icmp-ult-imm.ll (+13)
  • (added) llvm/test/CodeGen/RISCV/riscv-tticost-mul-imm.ll (+9)
  • (added) llvm/test/CodeGen/RISCV/riscv-tticost-shifts.ll (+26)
  • (added) llvm/test/CodeGen/RISCV/riscv-tticost-ssub-sat.ll (+12)
  • (added) llvm/test/CodeGen/RISCV/rvv-shift-vv.ll (+27)
  • (added) llvm/test/Transforms/InstCombine/sub-xor-and-equivalence.ll (+118)
  • (modified) llvm/tools/llvm-profgen/ProfiledBinary.cpp (+1-1)
  • (added) my-test/a.out ()
  • (added) my-test/acc.ll (+348)
  • (added) my-test/accumulators.c (+32)
  • (added) my-test/call_intensive_function.c (+18)
  • (added) my-test/compare_each.sh (+57)
  • (added) my-test/compare_ra.sh (+99)
  • (added) my-test/complex_branching.c (+50)
  • (added) my-test/data-dependency/analysis/Makefile (+24)
  • (added) my-test/data-dependency/analysis/analyze_deps.sh (+22)
  • (added) my-test/data-dependency/examples/advanced/complex_dep.c (+24)
  • (added) my-test/data-dependency/examples/advanced/func_dep.c (+12)
  • (added) my-test/data-dependency/examples/advanced/vectorizable.c (+20)
  • (added) my-test/data-dependency/examples/basic/anti_dep.c (+7)
  • (added) my-test/data-dependency/examples/basic/control_dep.c (+8)
  • (added) my-test/data-dependency/examples/basic/dep.c (+6)
  • (added) my-test/data-dependency/examples/basic/output_dep.c (+9)
  • (added) my-test/data-dependency/generated/dep.ll (+60)
  • (added) my-test/dbg.txt (+1)
  • (added) my-test/debug.log (+2)
  • (added) my-test/equivalence-optimization/dag.s (+43)
  • (added) my-test/equivalence-optimization/foo.c (+5)
  • (added) my-test/equivalence-optimization/foo.ll (+47)
  • (added) my-test/equivalence-optimization/gisel.s (+18)
  • (added) my-test/equivalence-optimization/icmp_ult_imm.ll (+9)
  • (added) my-test/equivalence-optimization/test-equivalence.c (+4)
  • (added) my-test/equivalence-optimization/test-equivalence.ll (+44)
  • (added) my-test/extreme_vreg_test.c (+144)
  • (added) my-test/gen_high_pressure_ir.py (+22)
  • (added) my-test/gen_massive_ir.py (+11)
  • (added) my-test/gen_pressure_ir.py (+37)
  • (added) my-test/gen_vreg_stress_ir.py (+25)
  • (added) my-test/greedy.log (+315)
  • (added) my-test/greedy.s (+15)
  • (added) my-test/greedy_heavy.s (+69)
  • (added) my-test/greedy_stats.txt ()
  • (added) my-test/greedy_time.txt (+10)
  • (added) my-test/greedy_timing.txt (+274)
  • (added) my-test/harvest_and_compare.sh (+104)
  • (added) my-test/heavy_computation.c (+24)
  • (added) my-test/hp.ll (+16005)
  • (added) my-test/hp.s (+16)
  • (added) my-test/kernel.c (+29)
  • (added) my-test/llc-trace.json (+1)
  • (added) my-test/massive.ll (+4015)
  • (added) my-test/massive_N1e6.ll (+2000003)
  • (added) my-test/massive_O0.ll (+11118)
  • (added) my-test/massive_O2.ll (+140)
  • (added) my-test/massive_test.c (+2022)
  • (added) my-test/massive_test.o ()
  • (added) my-test/matrix_multiply_complex.c (+34)
  • (added) my-test/neon.ll (+212611)
  • (added) my-test/neon_window.c (+52)
  • (added) my-test/nested_loops_with_arrays.c (+22)
  • (added) my-test/out/summary.csv (+1)
  • (added) my-test/out_greedy.s (+10029)
  • (added) my-test/output.s (+2627)
  • (added) my-test/pbqp_timing.txt (+257)
  • (added) my-test/pressure_block.c (+40)
  • (added) my-test/results/asm/greedy/neon.s (+423827)
  • (added) my-test/results/asm/greedy/t.s (+16)
  • (added) my-test/results/asm/greedy/t_1024.s (+26)
  • (added) my-test/results/asm/greedy/test.s (+29)
  • (added) my-test/results/asm/segtre/neon.s (+389685)
  • (added) my-test/results/asm/segtre/t.s (+16)
  • (added) my-test/results/asm/segtre/t_1024.s (+27)
  • (added) my-test/results/asm/segtre/test.s (+29)
  • (added) my-test/results/logs/greedy/neon.time (+3)
  • (added) my-test/results/logs/greedy/t.time (+3)
  • (added) my-test/results/logs/greedy/t_1024.time (+3)
  • (added) my-test/results/logs/greedy/test.time (+479)
  • (added) my-test/results/logs/segtre/neon.time (+3)
  • (added) my-test/results/logs/segtre/t.time (+3)
  • (added) my-test/results/logs/segtre/t_1024.time (+3)
  • (added) my-test/results/logs/segtre/test.time (+479)
  • (added) my-test/seg.s (+2651)
  • (added) my-test/segtre.s (+188)
  • (added) my-test/segtre_timing.txt (+257)
  • (added) my-test/segtre_timing.txtcat (+2)
  • (added) my-test/segtree.s (+15)
  • (added) my-test/segtree_heavy.s (+69)
  • (added) my-test/segtree_stats.txt ()
  • (added) my-test/segtree_time.txt (+11)
  • (added) my-test/st.log (+3)
  • (added) my-test/summary.csv (+5)
  • (added) my-test/t.ll (+9)
  • (added) my-test/t.s (+18)
  • (added) my-test/t_1024.c (+7)
  • (added) my-test/t_1024.ll (+54)
  • (added) my-test/teaching_examples.c (+37)
  • (added) my-test/test.c (+1)
  • (added) my-test/test.ll (+35)
  • (added) my-test/test.s (+36)
  • (added) my-test/test_basic.ll (+4)
  • (added) my-test/test_heavy_pressure.ll (+58)
  • (added) my-test/test_regalloc.sh (+129)
  • (added) my-test/test_spill.ll (+36)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/functionobjects.ll (+4328)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/functionobjects.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/loop_unroll.ll (+101441)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/loop_unroll.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding.ll (+40281)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding.ll.err (+32)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/simple_types_loop_invariant.ll (+32171)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/simple_types_loop_invariant.ll.err (+32)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/stepanov_abstraction.ll (+6199)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/stepanov_abstraction.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/stepanov_vector.ll (+5308)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/stepanov_vector.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/Large/fasta.ll (+269)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/Large/fasta.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/fannkuch.ll (+345)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/fannkuch.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/n-body.ll (+769)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/n-body.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/nsieve-bits.ll (+223)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/nsieve-bits.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/partialsums.ll (+160)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/partialsums.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/puzzle.ll (+452)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/puzzle.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/recursive.ll (+158)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/recursive.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/spectral-norm.ll (+388)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/spectral-norm.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/CoyoteBench/almabench.ll (+836)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/CoyoteBench/almabench.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/CoyoteBench/fftbench.ll (+2261)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/CoyoteBench/fftbench.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/CoyoteBench/huffbench.ll (+857)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/CoyoteBench/huffbench.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/CoyoteBench/lpbench.ll (+1423)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/CoyoteBench/lpbench.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Dhrystone/dry.ll.err (+129)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Dhrystone/fldry.ll.err (+134)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Linpack/linpack-pc.ll (+6189)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Linpack/linpack-pc.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/McGill/chomp.ll (+1982)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/McGill/chomp.ll.err (+7)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/McGill/exptree.ll (+1070)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/McGill/exptree.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/McGill/misr.ll (+527)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/McGill/misr.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/McGill/queens.ll (+512)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/McGill/queens.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++-EH/spirit.ll (+25360)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++-EH/spirit.ll.err (+234)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/Large/ray.ll (+1512)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/Large/ray.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/Large/sphereflake.ll (+1198)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/Large/sphereflake.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/bigfib.ll (+3325)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/bigfib.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/mandel-text.ll (+113)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/mandel-text.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/oopack_v1p8.ll (+1625)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/oopack_v1p8.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/stepanov_container.ll (+6124)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/stepanov_container.ll.err (+25)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/stepanov_v1p2.ll (+1413)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/stepanov_v1p2.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/ReedSolomon.ll.err (+5)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/aarch64-init-cpu-features.ll (+63)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/aarch64-init-cpu-features.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/dt.ll (+143)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/dt.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/evalloop.ll (+314)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/evalloop.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/fbench.ll.err (+9)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/ffbench.ll (+613)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/ffbench.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-1.ll (+150)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-1.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-2.ll (+159)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-2.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-3.ll (+157)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-3.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-4.ll (+172)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-4.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-5.ll (+284)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-5.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-6.ll (+281)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-6.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-7.ll (+145)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-7.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-8.ll (+287)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-8.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops.ll.err (+75)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/fp-convert.ll (+199)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/fp-convert.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/himenobmtxpa.ll (+2410)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/himenobmtxpa.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/lowercase.ll (+59)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/lowercase.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/mandel-2.ll (+202)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/mandel-2.ll.err (+22)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/mandel.ll (+139)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/mandel.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/matmul_f64_4x4.ll (+246)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/matmul_f64_4x4.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/oourafft.ll (+3525)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/oourafft.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/perlin.ll.err (+5)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/pi.ll (+96)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/pi.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/revertBits.ll (+121)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/revertBits.ll.err (+5)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/richards_benchmark.ll (+1037)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/richards_benchmark.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/salsa20.ll (+337)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/salsa20.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/whetstone.ll (+639)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/whetstone.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/datamining/correlation/correlation.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/datamining/covariance/covariance.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/blas/gemm/gemm.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/blas/gemver/gemver.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/blas/gesummv/gesummv.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/blas/symm/symm.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/blas/syr2k/syr2k.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/blas/syrk/syrk.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/blas/trmm/trmm.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/kernels/2mm/2mm.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/kernels/3mm/3mm.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/kernels/atax/atax.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/kernels/bicg/bicg.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/kernels/doitgen/doitgen.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/kernels/mvt/mvt.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/solvers/cholesky/cholesky.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/solvers/durbin/durbin.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/solvers/gramschmidt/gramschmidt.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/solvers/lu/lu.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/solvers/ludcmp/ludcmp.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/solvers/trisolv/trisolv.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/medley/deriche/deriche.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/medley/floyd-warshall/floyd-warshall.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/medley/nussinov/nussinov.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/stencils/adi/adi.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/stencils/fdtd-2d/fdtd-2d.ll.err (+4)
diff --git a/DeepSeekCodeAnalyzer.py b/DeepSeekCodeAnalyzer.py
new file mode 100644
index 0000000000000..a9dc6545049cf
--- /dev/null
+++ b/DeepSeekCodeAnalyzer.py
@@ -0,0 +1,164 @@
+import requests
+import json
+import os
+from pathlib import Path
+import readline
+import glob
+
+class DeepSeekCodeAnalyzer:
+    def __init__(self, api_key):
+        self.api_key = api_key
+        self.api_url = "https://api.deepseek.com/v1/chat/completions"  # Replace with the actual API endpoint
+        self.headers = {
+            "Authorization": f"Bearer {api_key}",
+            "Content-Type": "application/json"
+        }
+    
+    def analyze_code(self, code_file_path, prompt_template=None):
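+        # Read the code file
+        with open(code_file_path, 'r', encoding='utf-8') as f:
+            code_content = f.read()
+        
+        # Build the prompt
+        if prompt_template is None:
+            prompt = f"""
+Please analyze the time complexity of the following C++ code and identify possible improvements:
+
+{code_content}
+
+Please provide:
+1. Overall time complexity analysis
+2. Time complexity of key functions/methods
+3. Memory usage analysis
+4. Concrete improvement suggestions
+5. Code optimization examples
+"""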
+        else:
+            prompt = prompt_template.format(code=code_content)
+        
+        # Build the API request payload
+        payload = {
+            "model": "deepseek-coder",  # Use the code-specialized model
+            "messages": [
+                {"role": "user", "content": prompt}
+            ],
+            "temperature": 0.1,  # Low randomness for more deterministic answers
+            "max_tokens": 4000   # Adjust as needed
+        }
+        
+        # Send the request
+        try:
+            response = requests.post(
+                self.api_url,
+                headers=self.headers,
+                json=payload,
+                timeout=None  # None = wait indefinitely
+            )
+            response.raise_for_status()
+            
+            # Parse the response
+            result = response.json()
+            analysis = result['choices'][0]['message']['content']
+            
+            return analysis
+            
+        except Exception as e:
+            print(f"API request failed: {e}")
+            return None
+    
+    def save_analysis(self, analysis_result, output_path):
+        with open(output_path, 'w', encoding='utf-8') as f:
+            f.write(analysis_result)
+        print(f"Analysis result saved to: {output_path}")
+
+def _complete_path(text, state):
+    # Expand ~ and $ENV
+    text = os.path.expanduser(os.path.expandvars(text))
+
+    # For an empty string, default to starting from the current directory
+    if not text:
+        text = './'
+
+    # Split into directory and prefix
+    dirname, prefix = os.path.split(text)
+    if not dirname:
+        dirname = '.'
+
+    try:
+        entries = os.listdir(dirname)
+    except Exception:
+        entries = []
+
+    # Collect matches
+    matches = []
+    for name in entries:
+        if name.startswith(prefix):
+            full = os.path.join(dirname, name)
+            # Append '/' automatically when completing a directory
+            if os.path.isdir(full):
+                full += '/'
+            matches.append(full)
+
+    # Also support glob patterns typed by the user (e.g. src/*.cpp)
+    if '*' in text or '?' in text or '[' in text:
+        matches.extend(glob.glob(text))
+
+    # Deduplicate and sort
+    matches = sorted(set(matches))
+
+    return matches[state] if state < len(matches) else None
+
+# Keep '/' out of the delimiter set so that paths stay contiguous
+readline.set_completer_delims(' \t\n;')
+
+# The Python bundled with macOS usually uses libedit; Linux/self-installed Python usually uses GNU readline
+if 'libedit' in readline.__doc__:
+    # libedit binding syntax
+    readline.parse_and_bind("bind ^I rl_complete")
+else:
+    # GNU readline binding syntax
+    readline.parse_and_bind("tab: complete")
+
+readline.set_completer(_complete_path)
+
+# Usage example
+if __name__ == "__main__":
+    # Replace with your API key
+    API_KEY = os.environ.get("DEEPSEEK_API_KEY", "")  # read the key from the environment; never commit a real key
+    
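+    # Create the analyzer instance
+    analyzer = DeepSeekCodeAnalyzer(API_KEY)
+    
+    # Ask the user for the file name
+    code_file = input("Enter the name (including path) of the source file to analyze: ").strip()
+    
+    # Custom prompt (optional)
+    custom_prompt = """
+As a senior C++ performance optimization expert, please analyze the following LLVM IRTranslator code in detail:
+
+{code}
+
+Please focus on:
+1. Algorithmic complexity analysis (best, worst, and average case)
+2. Memory access patterns and cache friendliness
+3. Potential performance bottlenecks
+4. Parallelization opportunities
+5. LLVM-specific best practices
+6. Concrete refactoring suggestions and code examples
+"""
+    
+    # Run the analysis
+    print("Analyzing code, please wait...")
+    result = analyzer.analyze_code(code_file, custom_prompt)
+    
+    if result:
+        # Save the analysis result
+        output_file = "code_analysis_report.txt"
+        analyzer.save_analysis(result, output_file)
+        
+        # Print a preview of the result
+        print("\nAnalysis result preview:")
+        print("=" * 50)
+        print(result)
+    else:
+        print("Analysis failed")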
diff --git a/SingleSource/UnitTests/Output/2002-04-17-PrintfChar.test.out b/SingleSource/UnitTests/Output/2002-04-17-PrintfChar.test.out
new file mode 100644
index 0000000000000..f12351e3edc3d
--- /dev/null
+++ b/SingleSource/UnitTests/Output/2002-04-17-PrintfChar.test.out
@@ -0,0 +1 @@
+'c' 'e'
diff --git a/asearch.ll b/asearch.ll
new file mode 100644
index 0000000000000..e921f58c9fc86
--- /dev/null
+++ b/asearch.ll
@@ -0,0 +1,1138 @@
+; ModuleID = 'MultiSource/Benchmarks/Prolangs-C/agrep/asearch.c'
+source_filename = "MultiSource/Benchmarks/Prolangs-C/agrep/asearch.c"
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128-Fn32"
+target triple = "aarch64-unknown-linux-gnu"
+
+@D_endpos = external local_unnamed_addr global i32, align 4
+@Init1 = external local_unnamed_addr global i32, align 4
+@NO_ERR_MASK = external local_unnamed_addr global i32, align 4
+@Init = external local_unnamed_addr global [0 x i32], align 4
+@Mask = external local_unnamed_addr global [0 x i32], align 4
+@AND = external local_unnamed_addr global i32, align 4
+@endposition = external local_unnamed_addr global i32, align 4
+@INVERSE = external local_unnamed_addr global i32, align 4
+@FILENAMEONLY = external local_unnamed_addr global i32, align 4
+@num_of_matched = external local_unnamed_addr global i32, align 4
+@CurrentFileName = external global [0 x i8], align 1
+@TRUNCATE = external local_unnamed_addr global i32, align 4
+@I = external local_unnamed_addr global i32, align 4
+@DELIMITER = external local_unnamed_addr global i32, align 4
+
+; Function Attrs: nounwind uwtable
+define dso_local void @asearch0(ptr noundef readonly captures(none) %old_D_pat, i32 noundef %text, i32 noundef %D) local_unnamed_addr #0 {
+entry:
+  %A = alloca [10 x i32], align 4
+  %B = alloca [10 x i32], align 4
+  %buffer = alloca [98305 x i8], align 1
+  call void @llvm.lifetime.start.p0(ptr nonnull %A) #6
+  call void @llvm.lifetime.start.p0(ptr nonnull %B) #6
+  call void @llvm.lifetime.start.p0(ptr nonnull %buffer) #6
+  %call = tail call i64 @strlen(ptr noundef nonnull dereferenceable(1) %old_D_pat) #7
+  %conv = trunc i64 %call to i32
+  %arrayidx = getelementptr inbounds nuw i8, ptr %buffer, i64 49151
+  store i8 10, ptr %arrayidx, align 1, !tbaa !6
+  %0 = load i32, ptr @D_endpos, align 4, !tbaa !9
+  %cmp461 = icmp ugt i32 %conv, 1
+  br i1 %cmp461, label %for.body, label %for.end
+
+for.body:                                         ; preds = %entry, %for.body
+  %i.0463 = phi i32 [ %inc, %for.body ], [ 1, %entry ]
+  %D_Mask.0462 = phi i32 [ %or, %for.body ], [ %0, %entry ]
+  %shl = shl i32 %D_Mask.0462, 1
+  %or = or i32 %shl, %D_Mask.0462
+  %inc = add nuw i32 %i.0463, 1
+  %exitcond.not = icmp eq i32 %inc, %conv
+  br i1 %exitcond.not, label %for.end, label %for.body, !llvm.loop !11
+
+for.end:                                          ; preds = %for.body, %entry
+  %D_Mask.0.lcssa = phi i32 [ %0, %entry ], [ %or, %for.body ]
+  %1 = load i32, ptr @Init1, align 4, !tbaa !9
+  %2 = load i32, ptr @NO_ERR_MASK, align 4, !tbaa !9
+  %3 = load i32, ptr @Init, align 4, !tbaa !9
+  %4 = add i32 %D, 1
+  %wide.trip.count = zext i32 %4 to i64
+  %min.iters.check = icmp ult i32 %4, 8
+  br i1 %min.iters.check, label %for.body5.preheader, label %vector.ph
+
+vector.ph:                                        ; preds = %for.end
+  %n.vec = and i64 %wide.trip.count, 4294967288
+  %broadcast.splatinsert = insertelement <4 x i32> poison, i32 %3, i64 0
+  %broadcast.splat = shufflevector <4 x i32> %broadcast.splatinsert, <4 x i32> poison, <4 x i32> zeroinitializer
+  br label %vector.body
+
+vector.body:                                      ; preds = %vector.body, %vector.ph
+  %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
+  %5 = getelementptr inbounds nuw [10 x i32], ptr %B, i64 0, i64 %index
+  %6 = getelementptr inbounds nuw i8, ptr %5, i64 16
+  store <4 x i32> %broadcast.splat, ptr %5, align 4, !tbaa !9
+  store <4 x i32> %broadcast.splat, ptr %6, align 4, !tbaa !9
+  %7 = getelementptr inbounds nuw [10 x i32], ptr %A, i64 0, i64 %index
+  %8 = getelementptr inbounds nuw i8, ptr %7, i64 16
+  store <4 x i32> %broadcast.splat, ptr %7, align 4, !tbaa !9
+  store <4 x i32> %broadcast.splat, ptr %8, align 4, !tbaa !9
+  %index.next = add nuw i64 %index, 8
+  %9 = icmp eq i64 %index.next, %n.vec
+  br i1 %9, label %middle.block, label %vector.body, !llvm.loop !13
+
+middle.block:                                     ; preds = %vector.body
+  %cmp.n = icmp eq i64 %n.vec, %wide.trip.count
+  br i1 %cmp.n, label %while.cond.preheader, label %for.body5.preheader
+
+for.body5.preheader:                              ; preds = %for.end, %middle.block
+  %indvars.iv.ph = phi i64 [ 0, %for.end ], [ %n.vec, %middle.block ]
+  br label %for.body5
+
+while.cond.preheader:                             ; preds = %for.body5, %middle.block
+  %not = xor i32 %D_Mask.0.lcssa, -1
+  %add.ptr = getelementptr inbounds nuw i8, ptr %buffer, i64 49152
+  %call12481 = call i32 @fill_buf(i32 noundef %text, ptr noundef nonnull %add.ptr, i32 noundef 49152) #6
+  %cmp13482 = icmp sgt i32 %call12481, 0
+  br i1 %cmp13482, label %while.body.lr.ph, label %cleanup
+
+while.body.lr.ph:                                 ; preds = %while.cond.preheader
+  %sext = shl i64 %call, 32
+  %conv20 = ashr exact i64 %sext, 32
+  %cmp43.not465 = icmp eq i32 %D, 0
+  %idxprom77 = zext i32 %D to i64
+  %arrayidx78 = getelementptr inbounds nuw [10 x i32], ptr %A, i64 0, i64 %idxprom77
+  %arrayidx205 = getelementptr inbounds nuw [10 x i32], ptr %B, i64 0, i64 %idxprom77
+  %10 = xor i32 %conv, -1
+  %min.iters.check542 = icmp ult i32 %4, 8
+  %n.vec545 = and i64 %wide.trip.count, 4294967288
+  %cmp.n552 = icmp eq i64 %n.vec545, %wide.trip.count
+  %min.iters.check529 = icmp ult i32 %4, 8
+  %n.vec532 = and i64 %wide.trip.count, 4294967288
+  %cmp.n539 = icmp eq i64 %n.vec532, %wide.trip.count
+  br label %while.body
+
+for.body5:                                        ; preds = %for.body5.preheader, %for.body5
+  %indvars.iv = phi i64 [ %indvars.iv.next, %for.body5 ], [ %indvars.iv.ph, %for.body5.preheader ]
+  %arrayidx6 = getelementptr inbounds nuw [10 x i32], ptr %B, i64 0, i64 %indvars.iv
+  store i32 %3, ptr %arrayidx6, align 4, !tbaa !9
+  %arrayidx8 = getelementptr inbounds nuw [10 x i32], ptr %A, i64 0, i64 %indvars.iv
+  store i32 %3, ptr %arrayidx8, align 4, !tbaa !9
+  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
+  %exitcond488.not = icmp eq i64 %indvars.iv.next, %wide.trip.count
+  br i1 %exitcond488.not, label %while.cond.preheader, label %for.body5, !llvm.loop !16
+
+while.body:                                       ; preds = %while.body.lr.ph, %if.end311
+  %call12486 = phi i32 [ %call12481, %while.body.lr.ph ], [ %call12, %if.end311 ]
+  %j.0485 = phi i32 [ 0, %while.body.lr.ph ], [ %j.1.lcssa, %if.end311 ]
+  %lasti.0484 = phi i32 [ 49152, %while.body.lr.ph ], [ %lasti.4, %if.end311 ]
+  %tobool.not483 = phi i32 [ 49151, %while.body.lr.ph ], [ 49152, %if.end311 ]
+  %add = add nuw nsw i32 %call12486, 49152
+  %cmp15 = icmp samesign ult i32 %call12486, 49152
+  br i1 %cmp15, label %if.then17, label %if.end26
+
+if.then17:                                        ; preds = %while.body
+  %idx.ext = zext nneg i32 %add to i64
+  %add.ptr19 = getelementptr inbounds nuw i8, ptr %buffer, i64 %idx.ext
+  %call21 = call ptr @strncpy(ptr noundef nonnull %add.ptr19, ptr noundef nonnull %old_D_pat, i64 noundef %conv20) #6
+  %add22 = add i32 %add, %conv
+  %idxprom23 = zext i32 %add22 to i64
+  %arrayidx24 = getelementptr inbounds nuw [98305 x i8], ptr %buffer, i64 0, i64 %idxprom23
+  store i8 0, ptr %arrayidx24, align 1, !tbaa !6
+  br label %if.end26
+
+if.end26:                                         ; preds = %if.then17, %while.body
+  %end.0 = phi i32 [ %add22, %if.then17 ], [ %add, %while.body ]
+  %cmp28475 = icmp ult i32 %tobool.not483, %end.0
+  br i1 %cmp28475, label %while.body30.lr.ph, label %while.end
+
+while.body30.lr.ph:                               ; preds = %if.end26
+  %sub98 = add nuw nsw i32 %call12486, 49151
+  %.pre = load i32, ptr %B, align 4, !tbaa !9
+  br label %while.body30
+
+while.body30:                                     ; preds = %while.body30.lr.ph, %if.end287
+  %11 = phi i32 [ %.pre, %while.body30.lr.ph ], [ %58, %if.end287 ]
+  %j.1478 = phi i32 [ %j.0485, %while.body30.lr.ph ], [ %j.3, %if.end287 ]
+  %i.2477 = phi i32 [ %tobool.not483, %while.body30.lr.ph ], [ %inc155, %if.end287 ]
+  %lasti.1476 = phi i32 [ %lasti.0484, %while.body30.lr.ph ], [ %lasti.3, %if.end287 ]
+  %inc31 = add nuw i32 %i.2477, 1
+  %idxprom32 = zext i32 %i.2477 to i64
+  %arrayidx33 = getelementptr inbounds nuw [98305 x i8], ptr %buffer, i64 0, i64 %idxprom32
+  %12 = load i8, ptr %arrayidx33, align 1, !tbaa !6
+  %idxprom35 = zext i8 %12 to i64
+  %arrayidx36 = getelementptr inbounds nuw [0 x i32], ptr @Mask, i64 0, i64 %idxprom35
+  %13 = load i32, ptr %arrayidx36, align 4, !tbaa !9
+  %and = and i32 %11, %1
+  %shr = lshr i32 %11, 1
+  %and39 = and i32 %shr, %13
+  %or40 = or i32 %and39, %and
+  store i32 %or40, ptr %A, align 4, !tbaa !9
+  br i1 %cmp43.not465, label %for.end71, label %for.body45
+
+for.body45:                                       ; preds = %while.body30, %for.body45
+  %14 = phi i32 [ %or66, %for.body45 ], [ %or40, %while.body30 ]
+  %15 = phi i32 [ %16, %for.body45 ], [ %11, %while.body30 ]
+  %indvars.iv489 = phi i64 [ %indvars.iv.next490, %for.body45 ], [ 1, %while.body30 ]
+  %arrayidx47 = getelementptr inbounds nuw [10 x i32], ptr %B, i64 0, i64 %indvars.iv489
+  %16 = load i32, ptr %arrayidx47, align 4, !tbaa !9
+  %and48 = and i32 %16, %1
+  %or57 = or i32 %14, %15
+  %shr58 = lshr i32 %or57, 1
+  %and59 = and i32 %shr58, %2
+  %shr63 = lshr i32 %16, 1
+  %and64 = and i32 %shr63, %13
+  %17 = or i32 %and48, %and64
+  %18 = or i32 %17, %and59
+  %or66 = or i32 %18, %15
+  %arrayidx68 = getelementptr inbounds nuw [10 x i32], ptr %A, i64 0, i64 %indvars.iv489
+  store i32 %or66, ptr %arrayidx68, align 4, !tbaa !9
+  %indvars.iv.next490 = add nuw nsw i64 %indvars.iv489, 1
+  %exitcond494.not = icmp eq i64 %indvars.iv.next490, %wide.trip.count
+  br i1 %exitcond494.not, label %for.end71, label %for.body45, !llvm.loop !17
+
+for.end71:                                        ; preds = %for.body45, %while.body30
+  %and73 = and i32 %or40, %0
+  %tobool74.not = icmp eq i32 %and73, 0
+  br i1 %tobool74.not, label %if.end154, label %if.then75
+
+if.then75:                                        ; preds = %for.end71
+  %inc76 = add nsw i32 %j.1478, 1
+  %19 = load i32, ptr %arrayidx78, align 4, !tbaa !9
+  %20 = load i32, ptr @AND, align 4, !tbaa !9
+  %cmp79 = icmp eq i32 %20, 1
+  %.pre523 = load i32, ptr @endposition, align 4
+  %and81 = and i32 %.pre523, %19
+  %cmp82 = icmp eq i32 %and81, %.pre523
+  %or.cond = select i1 %cmp79, i1 %cmp82, i1 false
+  br i1 %or.cond, label %if.then89, label %lor.lhs.false
+
+lor.lhs.false:                                    ; preds = %if.then75
+  %cmp84 = icmp eq i32 %20, 0
+  %tobool87 = icmp ne i32 %and81, 0
+  %21 = select i1 %cmp84, i1 %tobool87, i1 false
+  %land.ext = zext i1 %21 to i32
+  %22 = load i32, ptr @INVERSE, align 4, !tbaa !9
+  %tobool88.not = icmp eq i32 %22, %land.ext
+  br i1 %tobool88.not, label %if.end104, label %if.then89
+
+if.then89:                                        ; preds = %if.then75, %lor.lhs.false
+  %23 = load i32, ptr @FILENAMEONLY, align 4, !tbaa !9
+  %tobool90.not = icmp eq i32 %23, 0
+  br i1 %tobool90.not, label %if.end94, label %cleanup.sink.split
+
+if.end94:                                         ; preds = %if.then89
+  %cmp99.not = icmp slt i32 %lasti.1476, %sub98
+  br i1 %cmp99.not, label %if.then101, label %if.end104
+
+if.then101:                                       ; preds = %if.end94
+  %sub96 = sub i32 %i.2477, %conv
+  call void @output(ptr noundef nonnull %buffer, i32 noundef %lasti.1476, i32 noundef %sub96, i32 noundef %inc76) #6
+  br label %if.end104
+
+if.end104:                                        ; preds = %if.end94, %if.then101, %lor.lhs.false
+  %24 = load i32, ptr @Init, align 4, !tbaa !9
+  br i1 %min.iters.check542, label %for.body109.preheader, label %vector.ph543
+
+vector.ph543:                                     ; preds = %if.end104
+  %broadcast.splatinsert546 = insertelement <4 x i32> poison, i32 %24, i64 0
+  %broadcast.splat547 = shufflevector <4 x i32> %broadcast.splatinsert546, <4 x i32> poison, <4 x i32> zeroinitializer
+  br label %vector.body548
+
+vector.body548:                                   ; preds = %vector.body548, %vector.ph543
+  %index549 = phi i64 [ 0, %vector.ph543 ], [ %index.next550, %vector.body548 ]
+  %25 = getelementptr inbounds nuw [10 x i32], ptr %B, i64 0, i64 %index549
+  %26 = getelementptr inbounds nuw i8, ptr %25, i64 16
+  store <4 x i32> %broadcast.splat547, ptr %25, align 4, !tbaa !9
+  store <4 x i32> %broadcast.splat547, ptr %26, align 4, !tbaa !9
+  %index.next550 = add nuw i64 %index549, 8
+  %27 = icmp eq i64 %index.next550, %n.vec545
+  br i1 %27, label %middle.block551, label %vector.body548, !llvm.loop !18
+
+middle.block551:                                  ; preds = %vector.body548
+  br i1 %cmp.n552, label %for.end114, label %for.body109.preheader
+
+for.body109.preheader:                            ; preds = %if.end104, %middle.block551
+  %indvars.iv495.ph = phi i64 [ 0, %if.end104 ], [ %n.vec545, %middle.block551 ]
+  br label %for.body109
+
+for.body109:                                      ; preds = %for.body109.preheader, %for.body109
+  %indvars.iv495 = phi i64 [ %indvars.iv.next496, %for.body109 ], [ %indvars.iv495.ph, %for.body109.preheader ]
+  %arrayidx111 = getelementptr inbounds nuw [10 x i32], ptr %B, i64 0, i64 %indvars.iv495
+  store i32 %24, ptr %arrayidx111, align 4, !tbaa !9
+  %indvars.iv.next496 = add nuw nsw i64 %indvars.iv495, 1
+  %exitcond499.not = icmp eq i64 %indvars.iv.next496, %wide.trip.count
+  br i1 %exitcond499.not, label %for.end114, label %for.body109, !llvm.loop !19
+
+for.end114:                                       ; preds = %for.body109, %middle.block551
+  %sub105 = sub i32 %inc31, %conv
+  %28 = load i32, ptr %B, align 4, !tbaa !9
+  %and116 = and i32 %28, %1
+  %shr118 = lshr i32 %28, 1
+  %and119 = and i32 %shr118, %13
+  %or120 = or i32 %and119, %and116
+  %and121 = and i32 %or120, %not
+  store i32 %and121, ptr %A, align 4, !tbaa !9
+  br i1 %cmp43.not465, label %if.end154, label %for.body126.lr.ph
+
+for.body126.lr.ph:                                ; preds = %for.end114
+  %29 = load i32, ptr @Init1, align 4, !tbaa !9
+  br label %for.body126
+
+for.body126:                                      ; preds = %for.body126.lr.ph, %for.body126
+  %30 = phi i32 [ %and121, %for.body126.lr.ph ], [ %or148, %for.body126 ]
+  %31 = phi i32 [ %28, %for.body126.lr.ph ], [ %32, %for.body126 ]
+  %indvars.iv500 = phi i64 [ 1, %for.body126.lr.ph ], [ %indvars.iv.next501, %for.body126 ]
+  %arrayidx128 = getelementptr inbounds nuw [10 x i32], ptr %B, i64 0, i64 %indvars.iv500
+  %32 = load i32, ptr %arrayidx128, align 4, !tbaa !9
+  %and129 = and i32 %32, %29
+  %or139 = or i32 %30, %31
+  %shr140 = lshr i32 %or139, 1
+  %and141 = and i32 %shr140, %2
+  %shr145 = lshr i32 %32, 1
+  %and146 = and i32 %shr145, %13
+  %33 = or i32 %and129, %and146
+  %34 = or i32 %33, %and141
+  %or148 = or i32 %34, %31
+  %arrayidx150 = getelementptr inbounds nuw [10 x i32], ptr %A, i64 0, i64 %indvars.iv500
+  store i32 %or148, ptr %arrayidx150, align 4, !tbaa !9
+  %indvars.iv.next501 = add nuw nsw i64 %indvars.iv500, 1
+  %exitcond505.not = icmp eq i64 %indvars.iv.next501, %wide.trip.count
+  br i1 %exitcond505.not, label %if.end154, label %for.body126, !llvm.loop !20
+
+if.end154:                                        ; preds = %for.body126, %for.end114, %for.end71
+  %35 = phi i32 [ %or40, %for.end71 ], [ %and121, %for.end114 ], [ %and121, %for.body126 ]
+  %lasti.2 = phi i32 [ %lasti.1476, %for.end71 ], [ %sub105, %for.end114 ], [ %sub105, %for.body126 ]
+  %j.2 = phi i32 [ %j.1478, %for.end71 ],...
[truncated]

@llvmbot
Member

llvmbot commented Sep 10, 2025

@llvm/pr-subscribers-llvm-analysis

Author: None (steven-studio)

Changes

This patch introduces a small 8-entry ring-buffer "hot cache" inside
IRTranslator::ValueToVRegInfo to reduce the number of DenseMap lookups
on very hot paths (PHIs, arguments, repeated operands).

Implementation details:

  • Two ring buffers are maintained: (Value* -> VRegListT*) and (Type* -> OffsetListT*)
  • Entries are filled when getVRegs/getOffsets hit or insert a new mapping.
  • Reset at function boundaries, so pointers stay valid (bump allocator).
  • No semantic changes: purely compile-time optimization.
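
The bullets above describe a small fixed-size cache in front of a hash map. The following is a rough Python model of that idea, not the patch's C++ code: a plain `dict` stands in for the DenseMap, and the names `HotCache`, `get_or_insert`, `map_lookups`, and `reset` are hypothetical and chosen only for illustration. It assumes the ring is scanned linearly and that a new entry overwrites the oldest slot.

```python
from typing import Callable, Dict, Hashable, List

_EMPTY = object()  # marks a ring slot that holds no entry yet


class HotCache:
    """Small ring buffer of recent (key, value) pairs in front of a dict.

    The dict stands in for the DenseMap; all names here are hypothetical.
    """

    def __init__(self, size: int = 8) -> None:
        self._keys: List[object] = [_EMPTY] * size
        self._vals: List[object] = [None] * size
        self._next = 0                      # slot the next fill overwrites
        self.backing: Dict[Hashable, object] = {}
        self.map_lookups = 0                # number of backing-map probes

    def _remember(self, key: Hashable, val: object) -> None:
        # Overwrite the oldest slot and advance the ring position.
        self._keys[self._next] = key
        self._vals[self._next] = val
        self._next = (self._next + 1) % len(self._keys)

    def get_or_insert(self, key: Hashable, make: Callable[[], object]) -> object:
        # Fast path: linear scan of the ring, no map probe.
        for i, cached in enumerate(self._keys):
            if cached is not _EMPTY and cached == key:
                return self._vals[i]
        # Slow path: probe the map, inserting on a miss, then fill the ring.
        self.map_lookups += 1
        if key not in self.backing:
            self.backing[key] = make()
        val = self.backing[key]
        self._remember(key, val)
        return val

    def reset(self) -> None:
        # Called at a function boundary: drop every cached entry and mapping.
        size = len(self._keys)
        self._keys = [_EMPTY] * size
        self._vals = [None] * size
        self._next = 0
        self.backing.clear()


cache = HotCache(size=8)
for _ in range(100):
    vregs = cache.get_or_insert("%phi.operand", list)
print(cache.map_lookups)  # 1: only the first request reaches the map
```

In this model, a key requested repeatedly costs one map probe in total, while a key pushed out of the ring by eight newer entries falls back to the map on its next request. Whether the real patch sees a similar ratio depends on how often operands repeat within a window of eight, which this sketch does not measure.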

Benefits:

  • Reduces DenseMap probes on tight loops with repeated operands.
  • Improves IR translation time in GlobalISel on large inputs (early micro-benchmark shows X% fewer map lookups).

Testing:

  • ninja check-llvm passed, no MIR/CodeGen diffs observed.

Risk:

  • Low. Cache is local per-function and fully reset between functions.
  • If disabled, behavior is identical to pre-patch.

NFC


Patch is 184.62 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/157964.diff

9566 Files Affected:

  • (added) DeepSeekCodeAnalyzer.py (+164)
  • (added) SingleSource/UnitTests/Output/2002-04-17-PrintfChar.test.out (+1)
  • (added) asearch.ll (+1138)
  • (added) code_analysis_report.txt (+254)
  • (added) commit.txt (+4)
  • (added) failed.list (+7)
  • (added) hello ()
  • (added) hello-rv64 ()
  • (added) hello.c (+6)
  • (added) linux (+1)
  • (added) llvm-test-suite (+1)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/IRTranslator.h (+69-8)
  • (modified) llvm/include/llvm/CodeGen/LinkAllCodegenComponents.h (+3-1)
  • (modified) llvm/include/llvm/CodeGen/Passes.h (+5)
  • (added) llvm/include/llvm/CodeGen/RegAllocIntervals.h (+118)
  • (added) llvm/include/llvm/CodeGen/RegAllocSegmentTree.h (+355)
  • (modified) llvm/include/llvm/InitializePasses.h (+1)
  • (modified) llvm/lib/CodeGen/CMakeLists.txt (+2)
  • (modified) llvm/lib/CodeGen/CodeGen.cpp (+1)
  • (modified) llvm/lib/CodeGen/RegAllocGreedy.cpp (+86)
  • (modified) llvm/lib/CodeGen/RegAllocGreedy.h (+22)
  • (added) llvm/lib/CodeGen/RegAllocIntervals.cpp (+470)
  • (added) llvm/lib/CodeGen/RegAllocSegmentTree.cpp (+1510)
  • (added) llvm/lib/Target/MyTarget/CMakeLists.txt (+22)
  • (added) llvm/lib/Target/MyTarget/HelloMinimal.td (+8)
  • (added) llvm/lib/Target/MyTarget/HelloWorld.td (+27)
  • (added) llvm/lib/Target/MyTarget/MyTarget.td (+31)
  • (added) llvm/lib/Target/MyTarget/MyTargetInstrInfo.cpp (+6)
  • (added) llvm/lib/Target/MyTarget/MyTargetInstrInfo.h (+10)
  • (added) llvm/lib/Target/RISCV/RISCVArithmeticCostCalculator.h (+321)
  • (added) llvm/lib/Target/RISCV/RISCVCostCalculatorRegistry.h (+29)
  • (added) llvm/lib/Target/RISCV/RISCVCostConstants.h (+84)
  • (added) llvm/lib/Target/RISCV/RISCVCostValidation.h (+55)
  • (added) llvm/lib/Target/RISCV/RISCVShuffleCostCalculator.h (+439)
  • (added) llvm/lib/Target/RISCV/RISCVTTITester.h (+102)
  • (modified) llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp (+5-1)
  • (added) llvm/test/Analysis/CostModel/RISCV/rvv-shift-cost.ll (+26)
  • (added) llvm/test/Analysis/CostModel/RISCV/saturated-arithmetic.ll (+37)
  • (added) llvm/test/Analysis/CostModel/RISCV/shift-instructions.ll (+26)
  • (added) llvm/test/CodeGen/RISCV/GlobalISel/icmp-ult-imm.ll (+13)
  • (added) llvm/test/CodeGen/RISCV/riscv-tticost-mul-imm.ll (+9)
  • (added) llvm/test/CodeGen/RISCV/riscv-tticost-shifts.ll (+26)
  • (added) llvm/test/CodeGen/RISCV/riscv-tticost-ssub-sat.ll (+12)
  • (added) llvm/test/CodeGen/RISCV/rvv-shift-vv.ll (+27)
  • (added) llvm/test/Transforms/InstCombine/sub-xor-and-equivalence.ll (+118)
  • (modified) llvm/tools/llvm-profgen/ProfiledBinary.cpp (+1-1)
  • (added) my-test/a.out ()
  • (added) my-test/acc.ll (+348)
  • (added) my-test/accumulators.c (+32)
  • (added) my-test/call_intensive_function.c (+18)
  • (added) my-test/compare_each.sh (+57)
  • (added) my-test/compare_ra.sh (+99)
  • (added) my-test/complex_branching.c (+50)
  • (added) my-test/data-dependency/analysis/Makefile (+24)
  • (added) my-test/data-dependency/analysis/analyze_deps.sh (+22)
  • (added) my-test/data-dependency/examples/advanced/complex_dep.c (+24)
  • (added) my-test/data-dependency/examples/advanced/func_dep.c (+12)
  • (added) my-test/data-dependency/examples/advanced/vectorizable.c (+20)
  • (added) my-test/data-dependency/examples/basic/anti_dep.c (+7)
  • (added) my-test/data-dependency/examples/basic/control_dep.c (+8)
  • (added) my-test/data-dependency/examples/basic/dep.c (+6)
  • (added) my-test/data-dependency/examples/basic/output_dep.c (+9)
  • (added) my-test/data-dependency/generated/dep.ll (+60)
  • (added) my-test/dbg.txt (+1)
  • (added) my-test/debug.log (+2)
  • (added) my-test/equivalence-optimization/dag.s (+43)
  • (added) my-test/equivalence-optimization/foo.c (+5)
  • (added) my-test/equivalence-optimization/foo.ll (+47)
  • (added) my-test/equivalence-optimization/gisel.s (+18)
  • (added) my-test/equivalence-optimization/icmp_ult_imm.ll (+9)
  • (added) my-test/equivalence-optimization/test-equivalence.c (+4)
  • (added) my-test/equivalence-optimization/test-equivalence.ll (+44)
  • (added) my-test/extreme_vreg_test.c (+144)
  • (added) my-test/gen_high_pressure_ir.py (+22)
  • (added) my-test/gen_massive_ir.py (+11)
  • (added) my-test/gen_pressure_ir.py (+37)
  • (added) my-test/gen_vreg_stress_ir.py (+25)
  • (added) my-test/greedy.log (+315)
  • (added) my-test/greedy.s (+15)
  • (added) my-test/greedy_heavy.s (+69)
  • (added) my-test/greedy_stats.txt ()
  • (added) my-test/greedy_time.txt (+10)
  • (added) my-test/greedy_timing.txt (+274)
  • (added) my-test/harvest_and_compare.sh (+104)
  • (added) my-test/heavy_computation.c (+24)
  • (added) my-test/hp.ll (+16005)
  • (added) my-test/hp.s (+16)
  • (added) my-test/kernel.c (+29)
  • (added) my-test/llc-trace.json (+1)
  • (added) my-test/massive.ll (+4015)
  • (added) my-test/massive_N1e6.ll (+2000003)
  • (added) my-test/massive_O0.ll (+11118)
  • (added) my-test/massive_O2.ll (+140)
  • (added) my-test/massive_test.c (+2022)
  • (added) my-test/massive_test.o ()
  • (added) my-test/matrix_multiply_complex.c (+34)
  • (added) my-test/neon.ll (+212611)
  • (added) my-test/neon_window.c (+52)
  • (added) my-test/nested_loops_with_arrays.c (+22)
  • (added) my-test/out/summary.csv (+1)
  • (added) my-test/out_greedy.s (+10029)
  • (added) my-test/output.s (+2627)
  • (added) my-test/pbqp_timing.txt (+257)
  • (added) my-test/pressure_block.c (+40)
  • (added) my-test/results/asm/greedy/neon.s (+423827)
  • (added) my-test/results/asm/greedy/t.s (+16)
  • (added) my-test/results/asm/greedy/t_1024.s (+26)
  • (added) my-test/results/asm/greedy/test.s (+29)
  • (added) my-test/results/asm/segtre/neon.s (+389685)
  • (added) my-test/results/asm/segtre/t.s (+16)
  • (added) my-test/results/asm/segtre/t_1024.s (+27)
  • (added) my-test/results/asm/segtre/test.s (+29)
  • (added) my-test/results/logs/greedy/neon.time (+3)
  • (added) my-test/results/logs/greedy/t.time (+3)
  • (added) my-test/results/logs/greedy/t_1024.time (+3)
  • (added) my-test/results/logs/greedy/test.time (+479)
  • (added) my-test/results/logs/segtre/neon.time (+3)
  • (added) my-test/results/logs/segtre/t.time (+3)
  • (added) my-test/results/logs/segtre/t_1024.time (+3)
  • (added) my-test/results/logs/segtre/test.time (+479)
  • (added) my-test/seg.s (+2651)
  • (added) my-test/segtre.s (+188)
  • (added) my-test/segtre_timing.txt (+257)
  • (added) my-test/segtre_timing.txtcat (+2)
  • (added) my-test/segtree.s (+15)
  • (added) my-test/segtree_heavy.s (+69)
  • (added) my-test/segtree_stats.txt ()
  • (added) my-test/segtree_time.txt (+11)
  • (added) my-test/st.log (+3)
  • (added) my-test/summary.csv (+5)
  • (added) my-test/t.ll (+9)
  • (added) my-test/t.s (+18)
  • (added) my-test/t_1024.c (+7)
  • (added) my-test/t_1024.ll (+54)
  • (added) my-test/teaching_examples.c (+37)
  • (added) my-test/test.c (+1)
  • (added) my-test/test.ll (+35)
  • (added) my-test/test.s (+36)
  • (added) my-test/test_basic.ll (+4)
  • (added) my-test/test_heavy_pressure.ll (+58)
  • (added) my-test/test_regalloc.sh (+129)
  • (added) my-test/test_spill.ll (+36)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/functionobjects.ll (+4328)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/functionobjects.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/loop_unroll.ll (+101441)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/loop_unroll.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding.ll (+40281)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding.ll.err (+32)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/simple_types_loop_invariant.ll (+32171)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/simple_types_loop_invariant.ll.err (+32)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/stepanov_abstraction.ll (+6199)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/stepanov_abstraction.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/stepanov_vector.ll (+5308)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Adobe-C++/stepanov_vector.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/Large/fasta.ll (+269)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/Large/fasta.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/fannkuch.ll (+345)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/fannkuch.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/n-body.ll (+769)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/n-body.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/nsieve-bits.ll (+223)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/nsieve-bits.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/partialsums.ll (+160)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/partialsums.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/puzzle.ll (+452)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/puzzle.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/recursive.ll (+158)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/recursive.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/spectral-norm.ll (+388)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/BenchmarkGame/spectral-norm.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/CoyoteBench/almabench.ll (+836)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/CoyoteBench/almabench.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/CoyoteBench/fftbench.ll (+2261)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/CoyoteBench/fftbench.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/CoyoteBench/huffbench.ll (+857)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/CoyoteBench/huffbench.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/CoyoteBench/lpbench.ll (+1423)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/CoyoteBench/lpbench.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Dhrystone/dry.ll.err (+129)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Dhrystone/fldry.ll.err (+134)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Linpack/linpack-pc.ll (+6189)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Linpack/linpack-pc.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/McGill/chomp.ll (+1982)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/McGill/chomp.ll.err (+7)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/McGill/exptree.ll (+1070)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/McGill/exptree.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/McGill/misr.ll (+527)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/McGill/misr.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/McGill/queens.ll (+512)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/McGill/queens.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++-EH/spirit.ll (+25360)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++-EH/spirit.ll.err (+234)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/Large/ray.ll (+1512)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/Large/ray.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/Large/sphereflake.ll (+1198)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/Large/sphereflake.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/bigfib.ll (+3325)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/bigfib.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/mandel-text.ll (+113)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/mandel-text.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/oopack_v1p8.ll (+1625)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/oopack_v1p8.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/stepanov_container.ll (+6124)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/stepanov_container.ll.err (+25)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/stepanov_v1p2.ll (+1413)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc-C++/stepanov_v1p2.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/ReedSolomon.ll.err (+5)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/aarch64-init-cpu-features.ll (+63)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/aarch64-init-cpu-features.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/dt.ll (+143)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/dt.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/evalloop.ll (+314)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/evalloop.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/fbench.ll.err (+9)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/ffbench.ll (+613)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/ffbench.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-1.ll (+150)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-1.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-2.ll (+159)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-2.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-3.ll (+157)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-3.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-4.ll (+172)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-4.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-5.ll (+284)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-5.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-6.ll (+281)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-6.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-7.ll (+145)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-7.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-8.ll (+287)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops-8.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/flops.ll.err (+75)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/fp-convert.ll (+199)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/fp-convert.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/himenobmtxpa.ll (+2410)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/himenobmtxpa.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/lowercase.ll (+59)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/lowercase.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/mandel-2.ll (+202)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/mandel-2.ll.err (+22)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/mandel.ll (+139)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/mandel.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/matmul_f64_4x4.ll (+246)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/matmul_f64_4x4.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/oourafft.ll (+3525)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/oourafft.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/perlin.ll.err (+5)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/pi.ll (+96)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/pi.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/revertBits.ll (+121)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/revertBits.ll.err (+5)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/richards_benchmark.ll (+1037)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/richards_benchmark.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/salsa20.ll (+337)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/salsa20.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/whetstone.ll (+639)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Misc/whetstone.ll.err ()
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/datamining/correlation/correlation.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/datamining/covariance/covariance.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/blas/gemm/gemm.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/blas/gemver/gemver.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/blas/gesummv/gesummv.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/blas/symm/symm.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/blas/syr2k/syr2k.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/blas/syrk/syrk.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/blas/trmm/trmm.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/kernels/2mm/2mm.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/kernels/3mm/3mm.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/kernels/atax/atax.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/kernels/bicg/bicg.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/kernels/doitgen/doitgen.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/kernels/mvt/mvt.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/solvers/cholesky/cholesky.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/solvers/durbin/durbin.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/solvers/gramschmidt/gramschmidt.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/solvers/lu/lu.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/solvers/ludcmp/ludcmp.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/linear-algebra/solvers/trisolv/trisolv.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/medley/deriche/deriche.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/medley/floyd-warshall/floyd-warshall.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/medley/nussinov/nussinov.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/stencils/adi/adi.ll.err (+4)
  • (added) my-test/testsuite_scan/ir/SingleSource/Benchmarks/Polybench/stencils/fdtd-2d/fdtd-2d.ll.err (+4)
diff --git a/DeepSeekCodeAnalyzer.py b/DeepSeekCodeAnalyzer.py
new file mode 100644
index 0000000000000..a9dc6545049cf
--- /dev/null
+++ b/DeepSeekCodeAnalyzer.py
@@ -0,0 +1,164 @@
+import requests
+import json
+import os
+from pathlib import Path
+import readline
+import glob
+
+class DeepSeekCodeAnalyzer:
+    def __init__(self, api_key):
+        self.api_key = api_key
+        self.api_url = "https://api.deepseek.com/v1/chat/completions"  # replace with the actual API endpoint
+        self.headers = {
+            "Authorization": f"Bearer {api_key}",
+            "Content-Type": "application/json"
+        }
+    
+    def analyze_code(self, code_file_path, prompt_template=None):
+        # Read the code file
+        with open(code_file_path, 'r', encoding='utf-8') as f:
+            code_content = f.read()
+        
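+        # Build the prompt
+        if prompt_template is None:
+            prompt = f"""
+Please analyze the time complexity of the following C++ code and point out what could be improved:
+
+{code_content}
+
+Please provide:
+1. Overall time-complexity analysis
+2. Time complexity of key functions/methods
+3. Memory usage analysis
+4. Concrete improvement suggestions
+5. Code optimization examples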
+"""
+        else:
+            prompt = prompt_template.format(code=code_content)
+        
+        # Build the API request payload
+        payload = {
+            "model": "deepseek-coder",  # use the code-specialized model
+            "messages": [
+                {"role": "user", "content": prompt}
+            ],
+            "temperature": 0.1,  # low randomness for more deterministic answers
+            "max_tokens": 4000   # adjust as needed
+        }
+        
+        # Send the request
+        try:
+            response = requests.post(
+                self.api_url,
+                headers=self.headers,
+                json=payload,
+                timeout=None  # None = wait indefinitely
+            )
+            response.raise_for_status()
+            
+            # Parse the response
+            result = response.json()
+            analysis = result['choices'][0]['message']['content']
+            
+            return analysis
+            
+        except Exception as e:
+            print(f"API request failed: {e}")
+            return None
+    
+    def save_analysis(self, analysis_result, output_path):
+        with open(output_path, 'w', encoding='utf-8') as f:
+            f.write(analysis_result)
+        print(f"Analysis saved to: {output_path}")
+
+def _complete_path(text, state):
+    # Expand ~ and $ENV
+    text = os.path.expanduser(os.path.expandvars(text))
+
+    # For an empty string, default to the current directory
+    if not text:
+        text = './'
+
+    # Split into directory and prefix
+    dirname, prefix = os.path.split(text)
+    if not dirname:
+        dirname = '.'
+
+    try:
+        entries = os.listdir(dirname)
+    except Exception:
+        entries = []
+
+    # Collect matches
+    matches = []
+    for name in entries:
+        if name.startswith(prefix):
+            full = os.path.join(dirname, name)
+            # Append '/' automatically when completing a directory
+            if os.path.isdir(full):
+                full += '/'
+            matches.append(full)
+
+    # Also support glob patterns typed by the user (e.g. src/*.cpp)
+    if '*' in text or '?' in text or '[' in text:
+        matches.extend(glob.glob(text))
+
+    # Deduplicate and sort
+    matches = sorted(set(matches))
+
+    return matches[state] if state < len(matches) else None
+
+# Keep '/' out of the delimiter set so paths stay contiguous
+readline.set_completer_delims(' \t\n;')
+
+# The Python bundled with macOS usually uses libedit; Linux or self-installed Python usually uses GNU readline
+if 'libedit' in (readline.__doc__ or ''):
+    # libedit binding syntax
+    readline.parse_and_bind("bind ^I rl_complete")
+else:
+    # GNU readline binding syntax
+    readline.parse_and_bind("tab: complete")
+
+readline.set_completer(_complete_path)
+
+# Usage example
+if __name__ == "__main__":
+    # Read the API key from the environment instead of hard-coding it
+    API_KEY = os.environ.get("DEEPSEEK_API_KEY", "")
+    
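+    # Create the analyzer instance
+    analyzer = DeepSeekCodeAnalyzer(API_KEY)
+    
+    # Ask the user for the file to analyze
+    code_file = input("Enter the source file to analyze (including path): ").strip()
+    
+    # Custom prompt (optional)
+    custom_prompt = """
+As a senior C++ performance-optimization expert, please analyze the following LLVM IRTranslator code in detail: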
+
+{code}
+
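+Please focus on:
+1. Algorithmic complexity (best, worst, and average case)
+2. Memory access patterns and cache friendliness
+3. Potential performance bottlenecks
+4. Parallelization opportunities
+5. LLVM-specific best practices
+6. Concrete refactoring suggestions with code examples
+"""
+    
+    # Run the analysis
+    print("Analyzing the code, please wait...")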
+    result = analyzer.analyze_code(code_file, custom_prompt)
+    
+    if result:
+        # Save the analysis result
+        output_file = "code_analysis_report.txt"
+        analyzer.save_analysis(result, output_file)
+        
+        # Print the result as a preview
+        print("\nAnalysis preview:")
+        print("=" * 50)
+        print(result)
+    else:
+        print("Analysis failed")
diff --git a/SingleSource/UnitTests/Output/2002-04-17-PrintfChar.test.out b/SingleSource/UnitTests/Output/2002-04-17-PrintfChar.test.out
new file mode 100644
index 0000000000000..f12351e3edc3d
--- /dev/null
+++ b/SingleSource/UnitTests/Output/2002-04-17-PrintfChar.test.out
@@ -0,0 +1 @@
+'c' 'e'
diff --git a/asearch.ll b/asearch.ll
new file mode 100644
index 0000000000000..e921f58c9fc86
--- /dev/null
+++ b/asearch.ll
@@ -0,0 +1,1138 @@
+; ModuleID = 'MultiSource/Benchmarks/Prolangs-C/agrep/asearch.c'
+source_filename = "MultiSource/Benchmarks/Prolangs-C/agrep/asearch.c"
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128-Fn32"
+target triple = "aarch64-unknown-linux-gnu"
+
+@D_endpos = external local_unnamed_addr global i32, align 4
+@Init1 = external local_unnamed_addr global i32, align 4
+@NO_ERR_MASK = external local_unnamed_addr global i32, align 4
+@Init = external local_unnamed_addr global [0 x i32], align 4
+@Mask = external local_unnamed_addr global [0 x i32], align 4
+@AND = external local_unnamed_addr global i32, align 4
+@endposition = external local_unnamed_addr global i32, align 4
+@INVERSE = external local_unnamed_addr global i32, align 4
+@FILENAMEONLY = external local_unnamed_addr global i32, align 4
+@num_of_matched = external local_unnamed_addr global i32, align 4
+@CurrentFileName = external global [0 x i8], align 1
+@TRUNCATE = external local_unnamed_addr global i32, align 4
+@I = external local_unnamed_addr global i32, align 4
+@DELIMITER = external local_unnamed_addr global i32, align 4
+
+; Function Attrs: nounwind uwtable
+define dso_local void @asearch0(ptr noundef readonly captures(none) %old_D_pat, i32 noundef %text, i32 noundef %D) local_unnamed_addr #0 {
+entry:
+  %A = alloca [10 x i32], align 4
+  %B = alloca [10 x i32], align 4
+  %buffer = alloca [98305 x i8], align 1
+  call void @llvm.lifetime.start.p0(ptr nonnull %A) #6
+  call void @llvm.lifetime.start.p0(ptr nonnull %B) #6
+  call void @llvm.lifetime.start.p0(ptr nonnull %buffer) #6
+  %call = tail call i64 @strlen(ptr noundef nonnull dereferenceable(1) %old_D_pat) #7
+  %conv = trunc i64 %call to i32
+  %arrayidx = getelementptr inbounds nuw i8, ptr %buffer, i64 49151
+  store i8 10, ptr %arrayidx, align 1, !tbaa !6
+  %0 = load i32, ptr @D_endpos, align 4, !tbaa !9
+  %cmp461 = icmp ugt i32 %conv, 1
+  br i1 %cmp461, label %for.body, label %for.end
+
+for.body:                                         ; preds = %entry, %for.body
+  %i.0463 = phi i32 [ %inc, %for.body ], [ 1, %entry ]
+  %D_Mask.0462 = phi i32 [ %or, %for.body ], [ %0, %entry ]
+  %shl = shl i32 %D_Mask.0462, 1
+  %or = or i32 %shl, %D_Mask.0462
+  %inc = add nuw i32 %i.0463, 1
+  %exitcond.not = icmp eq i32 %inc, %conv
+  br i1 %exitcond.not, label %for.end, label %for.body, !llvm.loop !11
+
+for.end:                                          ; preds = %for.body, %entry
+  %D_Mask.0.lcssa = phi i32 [ %0, %entry ], [ %or, %for.body ]
+  %1 = load i32, ptr @Init1, align 4, !tbaa !9
+  %2 = load i32, ptr @NO_ERR_MASK, align 4, !tbaa !9
+  %3 = load i32, ptr @Init, align 4, !tbaa !9
+  %4 = add i32 %D, 1
+  %wide.trip.count = zext i32 %4 to i64
+  %min.iters.check = icmp ult i32 %4, 8
+  br i1 %min.iters.check, label %for.body5.preheader, label %vector.ph
+
+vector.ph:                                        ; preds = %for.end
+  %n.vec = and i64 %wide.trip.count, 4294967288
+  %broadcast.splatinsert = insertelement <4 x i32> poison, i32 %3, i64 0
+  %broadcast.splat = shufflevector <4 x i32> %broadcast.splatinsert, <4 x i32> poison, <4 x i32> zeroinitializer
+  br label %vector.body
+
+vector.body:                                      ; preds = %vector.body, %vector.ph
+  %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
+  %5 = getelementptr inbounds nuw [10 x i32], ptr %B, i64 0, i64 %index
+  %6 = getelementptr inbounds nuw i8, ptr %5, i64 16
+  store <4 x i32> %broadcast.splat, ptr %5, align 4, !tbaa !9
+  store <4 x i32> %broadcast.splat, ptr %6, align 4, !tbaa !9
+  %7 = getelementptr inbounds nuw [10 x i32], ptr %A, i64 0, i64 %index
+  %8 = getelementptr inbounds nuw i8, ptr %7, i64 16
+  store <4 x i32> %broadcast.splat, ptr %7, align 4, !tbaa !9
+  store <4 x i32> %broadcast.splat, ptr %8, align 4, !tbaa !9
+  %index.next = add nuw i64 %index, 8
+  %9 = icmp eq i64 %index.next, %n.vec
+  br i1 %9, label %middle.block, label %vector.body, !llvm.loop !13
+
+middle.block:                                     ; preds = %vector.body
+  %cmp.n = icmp eq i64 %n.vec, %wide.trip.count
+  br i1 %cmp.n, label %while.cond.preheader, label %for.body5.preheader
+
+for.body5.preheader:                              ; preds = %for.end, %middle.block
+  %indvars.iv.ph = phi i64 [ 0, %for.end ], [ %n.vec, %middle.block ]
+  br label %for.body5
+
+while.cond.preheader:                             ; preds = %for.body5, %middle.block
+  %not = xor i32 %D_Mask.0.lcssa, -1
+  %add.ptr = getelementptr inbounds nuw i8, ptr %buffer, i64 49152
+  %call12481 = call i32 @fill_buf(i32 noundef %text, ptr noundef nonnull %add.ptr, i32 noundef 49152) #6
+  %cmp13482 = icmp sgt i32 %call12481, 0
+  br i1 %cmp13482, label %while.body.lr.ph, label %cleanup
+
+while.body.lr.ph:                                 ; preds = %while.cond.preheader
+  %sext = shl i64 %call, 32
+  %conv20 = ashr exact i64 %sext, 32
+  %cmp43.not465 = icmp eq i32 %D, 0
+  %idxprom77 = zext i32 %D to i64
+  %arrayidx78 = getelementptr inbounds nuw [10 x i32], ptr %A, i64 0, i64 %idxprom77
+  %arrayidx205 = getelementptr inbounds nuw [10 x i32], ptr %B, i64 0, i64 %idxprom77
+  %10 = xor i32 %conv, -1
+  %min.iters.check542 = icmp ult i32 %4, 8
+  %n.vec545 = and i64 %wide.trip.count, 4294967288
+  %cmp.n552 = icmp eq i64 %n.vec545, %wide.trip.count
+  %min.iters.check529 = icmp ult i32 %4, 8
+  %n.vec532 = and i64 %wide.trip.count, 4294967288
+  %cmp.n539 = icmp eq i64 %n.vec532, %wide.trip.count
+  br label %while.body
+
+for.body5:                                        ; preds = %for.body5.preheader, %for.body5
+  %indvars.iv = phi i64 [ %indvars.iv.next, %for.body5 ], [ %indvars.iv.ph, %for.body5.preheader ]
+  %arrayidx6 = getelementptr inbounds nuw [10 x i32], ptr %B, i64 0, i64 %indvars.iv
+  store i32 %3, ptr %arrayidx6, align 4, !tbaa !9
+  %arrayidx8 = getelementptr inbounds nuw [10 x i32], ptr %A, i64 0, i64 %indvars.iv
+  store i32 %3, ptr %arrayidx8, align 4, !tbaa !9
+  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
+  %exitcond488.not = icmp eq i64 %indvars.iv.next, %wide.trip.count
+  br i1 %exitcond488.not, label %while.cond.preheader, label %for.body5, !llvm.loop !16
+
+while.body:                                       ; preds = %while.body.lr.ph, %if.end311
+  %call12486 = phi i32 [ %call12481, %while.body.lr.ph ], [ %call12, %if.end311 ]
+  %j.0485 = phi i32 [ 0, %while.body.lr.ph ], [ %j.1.lcssa, %if.end311 ]
+  %lasti.0484 = phi i32 [ 49152, %while.body.lr.ph ], [ %lasti.4, %if.end311 ]
+  %tobool.not483 = phi i32 [ 49151, %while.body.lr.ph ], [ 49152, %if.end311 ]
+  %add = add nuw nsw i32 %call12486, 49152
+  %cmp15 = icmp samesign ult i32 %call12486, 49152
+  br i1 %cmp15, label %if.then17, label %if.end26
+
+if.then17:                                        ; preds = %while.body
+  %idx.ext = zext nneg i32 %add to i64
+  %add.ptr19 = getelementptr inbounds nuw i8, ptr %buffer, i64 %idx.ext
+  %call21 = call ptr @strncpy(ptr noundef nonnull %add.ptr19, ptr noundef nonnull %old_D_pat, i64 noundef %conv20) #6
+  %add22 = add i32 %add, %conv
+  %idxprom23 = zext i32 %add22 to i64
+  %arrayidx24 = getelementptr inbounds nuw [98305 x i8], ptr %buffer, i64 0, i64 %idxprom23
+  store i8 0, ptr %arrayidx24, align 1, !tbaa !6
+  br label %if.end26
+
+if.end26:                                         ; preds = %if.then17, %while.body
+  %end.0 = phi i32 [ %add22, %if.then17 ], [ %add, %while.body ]
+  %cmp28475 = icmp ult i32 %tobool.not483, %end.0
+  br i1 %cmp28475, label %while.body30.lr.ph, label %while.end
+
+while.body30.lr.ph:                               ; preds = %if.end26
+  %sub98 = add nuw nsw i32 %call12486, 49151
+  %.pre = load i32, ptr %B, align 4, !tbaa !9
+  br label %while.body30
+
+while.body30:                                     ; preds = %while.body30.lr.ph, %if.end287
+  %11 = phi i32 [ %.pre, %while.body30.lr.ph ], [ %58, %if.end287 ]
+  %j.1478 = phi i32 [ %j.0485, %while.body30.lr.ph ], [ %j.3, %if.end287 ]
+  %i.2477 = phi i32 [ %tobool.not483, %while.body30.lr.ph ], [ %inc155, %if.end287 ]
+  %lasti.1476 = phi i32 [ %lasti.0484, %while.body30.lr.ph ], [ %lasti.3, %if.end287 ]
+  %inc31 = add nuw i32 %i.2477, 1
+  %idxprom32 = zext i32 %i.2477 to i64
+  %arrayidx33 = getelementptr inbounds nuw [98305 x i8], ptr %buffer, i64 0, i64 %idxprom32
+  %12 = load i8, ptr %arrayidx33, align 1, !tbaa !6
+  %idxprom35 = zext i8 %12 to i64
+  %arrayidx36 = getelementptr inbounds nuw [0 x i32], ptr @Mask, i64 0, i64 %idxprom35
+  %13 = load i32, ptr %arrayidx36, align 4, !tbaa !9
+  %and = and i32 %11, %1
+  %shr = lshr i32 %11, 1
+  %and39 = and i32 %shr, %13
+  %or40 = or i32 %and39, %and
+  store i32 %or40, ptr %A, align 4, !tbaa !9
+  br i1 %cmp43.not465, label %for.end71, label %for.body45
+
+for.body45:                                       ; preds = %while.body30, %for.body45
+  %14 = phi i32 [ %or66, %for.body45 ], [ %or40, %while.body30 ]
+  %15 = phi i32 [ %16, %for.body45 ], [ %11, %while.body30 ]
+  %indvars.iv489 = phi i64 [ %indvars.iv.next490, %for.body45 ], [ 1, %while.body30 ]
+  %arrayidx47 = getelementptr inbounds nuw [10 x i32], ptr %B, i64 0, i64 %indvars.iv489
+  %16 = load i32, ptr %arrayidx47, align 4, !tbaa !9
+  %and48 = and i32 %16, %1
+  %or57 = or i32 %14, %15
+  %shr58 = lshr i32 %or57, 1
+  %and59 = and i32 %shr58, %2
+  %shr63 = lshr i32 %16, 1
+  %and64 = and i32 %shr63, %13
+  %17 = or i32 %and48, %and64
+  %18 = or i32 %17, %and59
+  %or66 = or i32 %18, %15
+  %arrayidx68 = getelementptr inbounds nuw [10 x i32], ptr %A, i64 0, i64 %indvars.iv489
+  store i32 %or66, ptr %arrayidx68, align 4, !tbaa !9
+  %indvars.iv.next490 = add nuw nsw i64 %indvars.iv489, 1
+  %exitcond494.not = icmp eq i64 %indvars.iv.next490, %wide.trip.count
+  br i1 %exitcond494.not, label %for.end71, label %for.body45, !llvm.loop !17
+
+for.end71:                                        ; preds = %for.body45, %while.body30
+  %and73 = and i32 %or40, %0
+  %tobool74.not = icmp eq i32 %and73, 0
+  br i1 %tobool74.not, label %if.end154, label %if.then75
+
+if.then75:                                        ; preds = %for.end71
+  %inc76 = add nsw i32 %j.1478, 1
+  %19 = load i32, ptr %arrayidx78, align 4, !tbaa !9
+  %20 = load i32, ptr @AND, align 4, !tbaa !9
+  %cmp79 = icmp eq i32 %20, 1
+  %.pre523 = load i32, ptr @endposition, align 4
+  %and81 = and i32 %.pre523, %19
+  %cmp82 = icmp eq i32 %and81, %.pre523
+  %or.cond = select i1 %cmp79, i1 %cmp82, i1 false
+  br i1 %or.cond, label %if.then89, label %lor.lhs.false
+
+lor.lhs.false:                                    ; preds = %if.then75
+  %cmp84 = icmp eq i32 %20, 0
+  %tobool87 = icmp ne i32 %and81, 0
+  %21 = select i1 %cmp84, i1 %tobool87, i1 false
+  %land.ext = zext i1 %21 to i32
+  %22 = load i32, ptr @INVERSE, align 4, !tbaa !9
+  %tobool88.not = icmp eq i32 %22, %land.ext
+  br i1 %tobool88.not, label %if.end104, label %if.then89
+
+if.then89:                                        ; preds = %if.then75, %lor.lhs.false
+  %23 = load i32, ptr @FILENAMEONLY, align 4, !tbaa !9
+  %tobool90.not = icmp eq i32 %23, 0
+  br i1 %tobool90.not, label %if.end94, label %cleanup.sink.split
+
+if.end94:                                         ; preds = %if.then89
+  %cmp99.not = icmp slt i32 %lasti.1476, %sub98
+  br i1 %cmp99.not, label %if.then101, label %if.end104
+
+if.then101:                                       ; preds = %if.end94
+  %sub96 = sub i32 %i.2477, %conv
+  call void @output(ptr noundef nonnull %buffer, i32 noundef %lasti.1476, i32 noundef %sub96, i32 noundef %inc76) #6
+  br label %if.end104
+
+if.end104:                                        ; preds = %if.end94, %if.then101, %lor.lhs.false
+  %24 = load i32, ptr @Init, align 4, !tbaa !9
+  br i1 %min.iters.check542, label %for.body109.preheader, label %vector.ph543
+
+vector.ph543:                                     ; preds = %if.end104
+  %broadcast.splatinsert546 = insertelement <4 x i32> poison, i32 %24, i64 0
+  %broadcast.splat547 = shufflevector <4 x i32> %broadcast.splatinsert546, <4 x i32> poison, <4 x i32> zeroinitializer
+  br label %vector.body548
+
+vector.body548:                                   ; preds = %vector.body548, %vector.ph543
+  %index549 = phi i64 [ 0, %vector.ph543 ], [ %index.next550, %vector.body548 ]
+  %25 = getelementptr inbounds nuw [10 x i32], ptr %B, i64 0, i64 %index549
+  %26 = getelementptr inbounds nuw i8, ptr %25, i64 16
+  store <4 x i32> %broadcast.splat547, ptr %25, align 4, !tbaa !9
+  store <4 x i32> %broadcast.splat547, ptr %26, align 4, !tbaa !9
+  %index.next550 = add nuw i64 %index549, 8
+  %27 = icmp eq i64 %index.next550, %n.vec545
+  br i1 %27, label %middle.block551, label %vector.body548, !llvm.loop !18
+
+middle.block551:                                  ; preds = %vector.body548
+  br i1 %cmp.n552, label %for.end114, label %for.body109.preheader
+
+for.body109.preheader:                            ; preds = %if.end104, %middle.block551
+  %indvars.iv495.ph = phi i64 [ 0, %if.end104 ], [ %n.vec545, %middle.block551 ]
+  br label %for.body109
+
+for.body109:                                      ; preds = %for.body109.preheader, %for.body109
+  %indvars.iv495 = phi i64 [ %indvars.iv.next496, %for.body109 ], [ %indvars.iv495.ph, %for.body109.preheader ]
+  %arrayidx111 = getelementptr inbounds nuw [10 x i32], ptr %B, i64 0, i64 %indvars.iv495
+  store i32 %24, ptr %arrayidx111, align 4, !tbaa !9
+  %indvars.iv.next496 = add nuw nsw i64 %indvars.iv495, 1
+  %exitcond499.not = icmp eq i64 %indvars.iv.next496, %wide.trip.count
+  br i1 %exitcond499.not, label %for.end114, label %for.body109, !llvm.loop !19
+
+for.end114:                                       ; preds = %for.body109, %middle.block551
+  %sub105 = sub i32 %inc31, %conv
+  %28 = load i32, ptr %B, align 4, !tbaa !9
+  %and116 = and i32 %28, %1
+  %shr118 = lshr i32 %28, 1
+  %and119 = and i32 %shr118, %13
+  %or120 = or i32 %and119, %and116
+  %and121 = and i32 %or120, %not
+  store i32 %and121, ptr %A, align 4, !tbaa !9
+  br i1 %cmp43.not465, label %if.end154, label %for.body126.lr.ph
+
+for.body126.lr.ph:                                ; preds = %for.end114
+  %29 = load i32, ptr @Init1, align 4, !tbaa !9
+  br label %for.body126
+
+for.body126:                                      ; preds = %for.body126.lr.ph, %for.body126
+  %30 = phi i32 [ %and121, %for.body126.lr.ph ], [ %or148, %for.body126 ]
+  %31 = phi i32 [ %28, %for.body126.lr.ph ], [ %32, %for.body126 ]
+  %indvars.iv500 = phi i64 [ 1, %for.body126.lr.ph ], [ %indvars.iv.next501, %for.body126 ]
+  %arrayidx128 = getelementptr inbounds nuw [10 x i32], ptr %B, i64 0, i64 %indvars.iv500
+  %32 = load i32, ptr %arrayidx128, align 4, !tbaa !9
+  %and129 = and i32 %32, %29
+  %or139 = or i32 %30, %31
+  %shr140 = lshr i32 %or139, 1
+  %and141 = and i32 %shr140, %2
+  %shr145 = lshr i32 %32, 1
+  %and146 = and i32 %shr145, %13
+  %33 = or i32 %and129, %and146
+  %34 = or i32 %33, %and141
+  %or148 = or i32 %34, %31
+  %arrayidx150 = getelementptr inbounds nuw [10 x i32], ptr %A, i64 0, i64 %indvars.iv500
+  store i32 %or148, ptr %arrayidx150, align 4, !tbaa !9
+  %indvars.iv.next501 = add nuw nsw i64 %indvars.iv500, 1
+  %exitcond505.not = icmp eq i64 %indvars.iv.next501, %wide.trip.count
+  br i1 %exitcond505.not, label %if.end154, label %for.body126, !llvm.loop !20
+
+if.end154:                                        ; preds = %for.body126, %for.end114, %for.end71
+  %35 = phi i32 [ %or40, %for.end71 ], [ %and121, %for.end114 ], [ %and121, %for.body126 ]
+  %lasti.2 = phi i32 [ %lasti.1476, %for.end71 ], [ %sub105, %for.end114 ], [ %sub105, %for.body126 ]
+  %j.2 = phi i32 [ %j.1478, %for.end71 ],...
[truncated]

@s-barannikov
Contributor

@steven-studio Just letting you know that you created a PR with a 4.5 million line change.

@arsenm
Contributor

arsenm commented Sep 10, 2025

PR is broken, includes 5000+ files of accidental changes

@jrtc27
Collaborator

jrtc27 commented Sep 10, 2025

It also pulls in llvm-test-suite and linux(!) as submodules

@arsenm
Contributor

arsenm commented Sep 11, 2025

PR is still broken with submodules removed

@aengelke
Contributor

aengelke commented Sep 11, 2025

Broken PR aside, if this really is a performance problem (I don't remember seeing this as substantial in my profiles when I looked at it last year), I think we should pursue adding an auxiliary integer field to Value or Instruction and use this instead.

When I did that change to SelectionDAG (ec8bcda, unmerged), however, I didn't feel the improvement was big enough (-0.18%, c-t-t) to push for that aux field. It might be worth reconsidering, but it needs more work on other parts to show that it's worth the hidden complexity.
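
The trade-off in the comment above can be illustrated with a small sketch. It is written in Python for brevity and is not LLVM code: `Value`, its `aux` field, `MapBasedTable`, and `AuxFieldTable` are hypothetical names invented for this illustration, and the real `Value` class has no such field today. The sketch only contrasts a hashed side table with a dense table indexed by an integer stored on the value itself.

```python
from dataclasses import dataclass
from typing import Dict, List

NO_INDEX = -1  # sentinel: no slot assigned to this value yet


@dataclass(eq=False)  # eq=False keeps identity-based hashing
class Value:
    """Hypothetical stand-in for an IR value; `aux` is the proposed scratch field."""
    name: str
    aux: int = NO_INDEX


class MapBasedTable:
    """Side table keyed by the value: every query pays for a hash lookup."""

    def __init__(self) -> None:
        self._vregs: Dict[Value, List[int]] = {}

    def get_vregs(self, v: Value) -> List[int]:
        return self._vregs.setdefault(v, [])


class AuxFieldTable:
    """Dense table indexed by the integer stored on the value itself."""

    def __init__(self) -> None:
        self._vregs: List[List[int]] = []
        self._owners: List[Value] = []

    def get_vregs(self, v: Value) -> List[int]:
        if v.aux == NO_INDEX:
            # First query: hand out the next dense slot and remember the owner.
            v.aux = len(self._vregs)
            self._vregs.append([])
            self._owners.append(v)
        return self._vregs[v.aux]

    def reset(self) -> None:
        # The index lives on the value, so it has to be cleared explicitly
        # before the table is reused (the "hidden complexity" of this scheme).
        for v in self._owners:
            v.aux = NO_INDEX
        self._vregs.clear()
        self._owners.clear()
```

The aux-field variant replaces hashing with a plain array index, but any code that reuses the field has to agree on when it is valid and who resets it, which is the kind of cost the comment weighs against the measured gain.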

Labels
backend:RISC-V, llvm:analysis (includes value tracking, cost tables and constant folding), llvm:codegen, llvm:globalisel, llvm:instcombine (covers the InstCombine, InstSimplify and AggressiveInstCombine passes), llvm:transforms, PGO (Profile Guided Optimizations)

6 participants