[Sema] Implement fix as suggested by FIXME #143142

lux-QAQ · 2025-06-06T13:49:17Z

Fix #143129
Implement fix as suggested by FIXME

github-actions · 2025-06-06T13:49:39Z

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

llvmbot · 2025-06-06T13:50:13Z

@llvm/pr-subscribers-clang

Author: LUX (lux-QAQ)

Changes

Move allocation of deduction failure info from the ASTContext to the heap. Implement corresponding type-safe delete calls in DeductionFailureInfo::Destroy() to prevent memory leaks.

Full diff: https://github.com/llvm/llvm-project/pull/143142.diff

1 Files Affected:

(modified) clang/lib/Sema/SemaOverload.cpp (+23-13)

diff --git a/clang/lib/Sema/SemaOverload.cpp b/clang/lib/Sema/SemaOverload.cpp
index 66f84fc67b52f..0e059239e65cb 100644
--- a/clang/lib/Sema/SemaOverload.cpp
+++ b/clang/lib/Sema/SemaOverload.cpp
@@ -756,8 +756,8 @@ clang::MakeDeductionFailureInfo(ASTContext &Context,
 
   case TemplateDeductionResult::DeducedMismatch:
   case TemplateDeductionResult::DeducedMismatchNested: {
-    // FIXME: Should allocate from normal heap so that we can free this later.
-    auto *Saved = new (Context) DFIDeducedMismatchArgs;
+    // Allocate from normal heap so that we can free this later.
+    auto *Saved = new  DFIDeducedMismatchArgs;
     Saved->FirstArg = Info.FirstArg;
     Saved->SecondArg = Info.SecondArg;
     Saved->TemplateArgs = Info.takeSugared();
@@ -767,8 +767,8 @@ clang::MakeDeductionFailureInfo(ASTContext &Context,
   }
 
   case TemplateDeductionResult::NonDeducedMismatch: {
-    // FIXME: Should allocate from normal heap so that we can free this later.
-    DFIArguments *Saved = new (Context) DFIArguments;
+    // Allocate from normal heap so that we can free this later.
+    DFIArguments *Saved = new  DFIArguments;
     Saved->FirstArg = Info.FirstArg;
     Saved->SecondArg = Info.SecondArg;
     Result.Data = Saved;
@@ -779,8 +779,8 @@ clang::MakeDeductionFailureInfo(ASTContext &Context,
     // FIXME: It's slightly wasteful to allocate two TemplateArguments for this.
   case TemplateDeductionResult::Inconsistent:
   case TemplateDeductionResult::Underqualified: {
-    // FIXME: Should allocate from normal heap so that we can free this later.
-    DFIParamWithArguments *Saved = new (Context) DFIParamWithArguments;
+    // Allocate from normal heap so that we can free this later.
+    DFIParamWithArguments *Saved = new  DFIParamWithArguments;
     Saved->Param = Info.Param;
     Saved->FirstArg = Info.FirstArg;
     Saved->SecondArg = Info.SecondArg;
@@ -799,7 +799,8 @@ clang::MakeDeductionFailureInfo(ASTContext &Context,
     break;
 
   case TemplateDeductionResult::ConstraintsNotSatisfied: {
-    CNSInfo *Saved = new (Context) CNSInfo;
+    // Allocate from normal heap so that we can free this later.
+    CNSInfo *Saved = new CNSInfo;
     Saved->TemplateArgs = Info.takeSugared();
     Saved->Satisfaction = Info.AssociatedConstraintsSatisfaction;
     Result.Data = Saved;
@@ -831,14 +832,21 @@ void DeductionFailureInfo::Destroy() {
   case TemplateDeductionResult::IncompletePack:
   case TemplateDeductionResult::Inconsistent:
   case TemplateDeductionResult::Underqualified:
+    delete static_cast<DFIParamWithArguments*>(Data);
+    Data = nullptr;
+    break;
   case TemplateDeductionResult::DeducedMismatch:
   case TemplateDeductionResult::DeducedMismatchNested:
+    delete static_cast<DFIDeducedMismatchArgs*>(Data);
+    Data = nullptr;
+    break;
   case TemplateDeductionResult::NonDeducedMismatch:
-    // FIXME: Destroy the data?
+    // Destroy the data
+    delete static_cast<DFIArguments*>(Data);
     Data = nullptr;
     break;
 
-  case TemplateDeductionResult::SubstitutionFailure:
+  case TemplateDeductionResult::SubstitutionFailure:{
     // FIXME: Destroy the template argument list?
     Data = nullptr;
     if (PartialDiagnosticAt *Diag = getSFINAEDiagnostic()) {
@@ -846,16 +854,18 @@ void DeductionFailureInfo::Destroy() {
       HasDiagnostic = false;
     }
     break;
-
-  case TemplateDeductionResult::ConstraintsNotSatisfied:
-    // FIXME: Destroy the template argument list?
+  }
+  case TemplateDeductionResult::ConstraintsNotSatisfied:{
+    // Destroy the data
+    CNSInfo *Ptr = static_cast<CNSInfo*>(Data);
+    delete Ptr;
     Data = nullptr;
     if (PartialDiagnosticAt *Diag = getSFINAEDiagnostic()) {
       Diag->~PartialDiagnosticAt();
       HasDiagnostic = false;
     }
     break;
-
+  }
   // Unhandled
   case TemplateDeductionResult::MiscellaneousDeductionFailure:
   case TemplateDeductionResult::AlreadyDiagnosed:

github-actions · 2025-06-09T05:55:47Z

✅ With the latest revision this PR passed the C/C++ code formatter.

Move allocation of deduction failure info from the ASTContext to the heap. Implement corresponding type-safe delete calls in `DeductionFailureInfo::Destroy()` to prevent memory leaks.

Sirraide

I’m not sure this is better? Having to manually delete these is... not great. Have you benchmarked whether this improves memory usage or makes Clang faster?

See also this comment by Aaron on the issue you linked: #143129 (comment)

lux-QAQ · 2025-06-09T13:37:08Z

I ran performance tests by following the LLVM Test Suite Guide and obtained the following results.

The results_old data represents the baseline performance using the original, unmodified Clang. The results_new data shows the performance using the Clang with my modifications applied.
results_new.json
results_old.json

(base) ***@DESKTOP-80T55HI:~/code/llvmleak$ llvm-test-suite/utils/compare.py -m compile_time results_old.json results_new.json
Tests: 3327
Metric: compile_time

Program                                       compile_time
                                              results_old  results_new diff
SingleSour...tTests/2020-01-06-coverage-008     0.00         0.02      484.6%
SingleSour...tTests/2010-05-24-BitfieldTest     0.00         0.01      265.9%
SingleSour.../2006-01-29-SimpleIndirectCall     0.00         0.01      238.9%
SingleSour...006-12-07-Compare64BitConstant     0.00         0.01      186.8%
SingleSour...itTests/2003-07-06-IntOverflow     0.01         0.02      159.7%
SingleSour...UnitTests/2002-05-02-CastTest1     0.00         0.01      117.9%
tools/test/check_env                            0.02         0.05      107.9%
tools/test/ret0                                 0.02         0.05      107.9%
tools/test/abrt                                 0.02         0.05      107.9%
tools/test/ret1                                 0.02         0.05      107.9%
tools/fpcmp-target                              0.02         0.05      107.9%
SingleSour...UnitTests/2006-01-23-UnionInit     0.01         0.02       88.5%
SingleSour.../UnitTests/block-call-r7674133     0.01         0.01       76.2%
SingleSour...UnitTests/2005-05-12-Int64ToFP     0.01         0.02       64.2%
SingleSour.../UnitTests/SignlessTypes/cast2     0.01         0.02       62.4%
                           Geomean difference                           -5.4%
      compile_time
run    results_old  results_new        diff
count  3327.000000  3327.000000  454.000000
mean   0.197357     0.184994    -0.005927
std    2.103227     1.847044     0.393625
min    0.000000     0.000000    -0.797814
25%    0.000000     0.000000    -0.139927
50%    0.000000     0.000000    -0.031750
75%    0.000000     0.000000     0.055100
max    92.449400    75.736300    4.846154
(base) ***@DESKTOP-80T55HI:~/code/llvmleak$ llvm-test-suite/utils/compare.py --filter-short results_old.json results_new.json
Tests: 3327
Short Running: 2390 (filtered out)
Remaining: 937
Metric: exec_time

Program                                       exec_time
                                              results_old results_new
SingleSour...ut-C++/Shootout-C++-sieve.test       0.00        1.05
SingleSour...marks/CoyoteBench/lpbench.test       0.00        1.29
MultiSourc...lt/CrossingThresholds-flt.test       0.00        1.21
SingleSour...ce/Benchmarks/Misc/perlin.test       0.00        1.20
MultiSourc...CI_Purple/SMG2000/smg2000.test       0.00        1.23
MicroBench...HMARK_ANISTROPIC_DIFFUSION/256   20620.89    48887.01
MicroBench..._MemCmp<31, LessThanZero, Mid>     294.71      677.21
MicroBench...HMARK_ANISTROPIC_DIFFUSION/128    6420.34    13718.01
MicroBench...MemCmp<5, LessThanZero, First>     501.73     1004.02
MicroBench...test:benchAutoVecForBigLoopTC8       1.44        2.88
MicroBench...test:benchAutoVecForBigLoopTC2       1.51        3.01
MicroBench...t:BENCHMARK_asinf_novec_float_     109.42      217.25
MicroBench...rks.test:benchForIC4VW4LoopTC4       2.04        4.01
MicroBench...BENCHMARK_ORDERED_DITHER/128/2      37.37       72.65
MicroBench...BENCHMARK_acos_autovec_double_     158.15      302.43
           exec_time
run      results_old    results_new
count  937.000000     937.000000
mean   2723.168099    2608.870807
std    30514.292833   28486.676194
min    0.000000       0.000000
25%    2.978695       3.087209
50%    17.409600      17.785316
75%    281.905940     283.165699
max    666889.746452  622151.000000

Performance Test Results

Following the LLVM Test Suite Guide, I conducted performance tests to evaluate the impact of my changes.

The results are encouraging. All tests passed successfully for both the baseline and modified versions. Furthermore, the preliminary data indicates a compile-time reduction of 5.4% with the modified Clang compared to the original.

Caveats and Future Work

However, I want to be cautious with these initial findings. It is important to note that this is based on a single test run, and I suspect this performance difference could be attributed to random chance or environmental factors.

I am currently busy with my final exams, but as soon as I have time, I plan to conduct more rigorous testing. My plan is to repeat the experiments on my WSL setup and on a separate Mac to obtain more robust and reliable results.

Test Environment & Configuration

To ensure the test was reproducible, the following configuration was used.

Host Compiler:
The same host compiler (installed via apt) was used to build both Clang versions:

$ /usr/bin/clang --version
Ubuntu clang version 18.1.3 (1ubuntu1)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

Build Configuration:
The following environment and CMake command were used to build both the baseline (original) and the modified Clang, ensuring a consistent build configuration between the two.

# Environment variables were set to ensure optimization

unset CFLAGS
unset CXXFLAGS
unset LDFLAGS
unset LD_LIBRARY_PATH
unset CC
unset CXX

export CFLAGS="-O3 -march=native"
export CXXFLAGS="-O3 -march=native" 

# CMake command
cmake -G "Ninja" \
  -S /home/***/llvmtestfix/llvm \
  -B /home/***/llvmtestfix/build \
  -DCMAKE_INSTALL_PREFIX=/home/***/fixtestinstall \
  -DCMAKE_INSTALL_RPATH=/home/***/fixtestinstall/lib \
  -DLLVM_ENABLE_PROJECTS="clang;lld" \
  -DLLVM_ENABLE_RUNTIMES="compiler-rt;libcxx;libcxxabi;libunwind;openmp" \
  -DCMAKE_C_COMPILER=/usr/bin/clang \
  -DCMAKE_CXX_COMPILER=/usr/bin/clang++ \
  -DCMAKE_CXX_COMPILER_TARGET=x86_64-pc-linux-gnu \
  -DCMAKE_C_COMPILER_TARGET=x86_64-pc-linux-gnu \
  -DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-pc-linux-gnu \
  -DCMAKE_LINKER=/usr/bin/ld.lld \
  -DLLVM_ENABLE_LLD=ON \
  -DLLVM_PROFILE_GENERATE=OFF \
  -DLLVM_INCLUDE_DOCS=OFF \
  -DLLVM_BUILD_DOCS=OFF \
  -DLIBCXX_INSTALL_MODULES=ON \
  -DCMAKE_AR=/usr/bin/llvm-ar \
  -DCMAKE_RANLIB=/usr/bin/llvm-ranlib \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_BUILD_LLVM_DYLIB=ON \
  -DLLVM_LINK_LLVM_DYLIB=ON \
  -DLLVM_ENABLE_RTTI=ON \
  -DLLVM_ENABLE_EH=ON \
  -DLLVM_ENABLE_EXCEPTIONS=ON \
  -DLLVM_TARGETS_TO_BUILD="X86"

Sirraide · 2025-06-09T13:39:37Z

You might want to get your branch on https://llvm-compile-time-tracker.com/index.php

CC @nikic

nikic · 2025-06-09T14:37:28Z

Looks neutral: https://llvm-compile-time-tracker.com/compare.php?from=e4060d3beab3b525b49baaa15484e3c595740a2f&to=aace43c7cc60347e1853e55ee7c033a19a9a65ea&stat=instructions:u

max-rss seems to have an improvement on tramp3d-v4 in multiple configurations, so that's probably not noise: https://llvm-compile-time-tracker.com/compare.php?from=e4060d3beab3b525b49baaa15484e3c595740a2f&to=aace43c7cc60347e1853e55ee7c033a19a9a65ea&stat=max-rss

lux-QAQ · 2025-06-10T03:15:58Z

It seems that using manual delete may not be the most reasonable approach.
Is there any other feedback or improvements I should make? Or, should we simply close this PR now?

llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Jun 6, 2025

lux-QAQ changed the title ~~[Sema] Fix memory leak in DeductionFailureInfo (#143129)~~ [Sema] Fix memory leak in DeductionFailureInfo Jun 6, 2025

lux-QAQ changed the title ~~[Sema] Fix memory leak in DeductionFailureInfo~~ [Sema] Implement fix as suggested by FIXME Jun 6, 2025

[Sema] Fix memory leak in DeductionFailureInfo (llvm#143129)

f69abed

Move allocation of deduction failure info from the ASTContext to the heap. Implement corresponding type-safe delete calls in `DeductionFailureInfo::Destroy()` to prevent memory leaks.

lux-QAQ force-pushed the fix-sema-leak branch from 697f75c to f69abed Compare June 9, 2025 09:59

Sirraide reviewed Jun 9, 2025

View reviewed changes

lux-QAQ closed this Jun 10, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Sema] Implement fix as suggested by FIXME #143142

[Sema] Implement fix as suggested by FIXME #143142

Uh oh!

lux-QAQ commented Jun 6, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Jun 6, 2025

Uh oh!

llvmbot commented Jun 6, 2025

Uh oh!

github-actions bot commented Jun 9, 2025 •

edited

Loading

Uh oh!

Sirraide left a comment

Uh oh!

lux-QAQ commented Jun 9, 2025

Uh oh!

Sirraide commented Jun 9, 2025

Uh oh!

nikic commented Jun 9, 2025

Uh oh!

lux-QAQ commented Jun 10, 2025

Uh oh!

Uh oh!

[Sema] Implement fix as suggested by FIXME #143142

[Sema] Implement fix as suggested by FIXME #143142

Uh oh!

Conversation

lux-QAQ commented Jun 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Jun 6, 2025

Uh oh!

llvmbot commented Jun 6, 2025

Uh oh!

github-actions bot commented Jun 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Sirraide left a comment

Choose a reason for hiding this comment

Uh oh!

lux-QAQ commented Jun 9, 2025

Performance Test Results

Caveats and Future Work

Test Environment & Configuration

Uh oh!

Sirraide commented Jun 9, 2025

Uh oh!

nikic commented Jun 9, 2025

Uh oh!

lux-QAQ commented Jun 10, 2025

Uh oh!

Uh oh!

lux-QAQ commented Jun 6, 2025 •

edited

Loading

github-actions bot commented Jun 9, 2025 •

edited

Loading