term-est
Contributor

Fixes #163778 (fix might be indirect?)

Prevents emitting bytecode for casts that have undefined behavior (UB).
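
For illustration, here is a minimal sketch of the general expression shape such a check is concerned with. It is not the reproducer from the linked issue. The names `bitsOf`, `readThroughConst` and `bad` are hypothetical, and the sketch assumes a 32-bit `float` and `unsigned`.

```cpp
#include <cassert>
#include <cstring>

// Well-defined at run time: copy the object representation instead of
// dereferencing a pointer of the wrong type.
// Assumption: float and unsigned are both 32 bits wide.
inline unsigned bitsOf(float F) {
  static_assert(sizeof(unsigned) == sizeof(float),
                "sketch assumes 32-bit float and unsigned");
  unsigned U;
  std::memcpy(&U, &F, sizeof(U));
  return U;
}

// Fine in a constant expression: the pointer conversion only adds a
// qualifier, so the pointee type is unchanged.
constexpr int readThroughConst() {
  int X = 42;
  const int *P = &X;
  return *P;
}
static_assert(readThroughConst() == 42, "no pun, so this folds");

// Not a constant expression (left commented out so the file compiles):
//   constexpr unsigned bad() { float F = 1.0f; return *(unsigned *)&F; }
```

The commented-out `bad` dereferences the result of a pointer cast whose pointee type differs from the source pointee type, which is the kind of expression a constant evaluator has to refuse.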

@llvmbot added the labels clang (Clang issues not falling into any other category), clang:frontend (Language frontend issues, e.g. anything involving "Sema"), and clang:bytecode (Issues for the clang bytecode constexpr interpreter) on Oct 16, 2025
@llvmbot
Member

llvmbot commented Oct 16, 2025

@llvm/pr-subscribers-clang

Author: None (term-est)

Changes


Full diff: https://github.com/llvm/llvm-project/pull/163809.diff

2 Files Affected:

  • (modified) clang/lib/AST/ByteCode/Compiler.cpp (+44)
  • (modified) clang/lib/AST/ByteCode/Compiler.h (+2)
diff --git a/clang/lib/AST/ByteCode/Compiler.cpp b/clang/lib/AST/ByteCode/Compiler.cpp
index 74cae030bb9bb..c30262dd8a323 100644
--- a/clang/lib/AST/ByteCode/Compiler.cpp
+++ b/clang/lib/AST/ByteCode/Compiler.cpp
@@ -212,6 +212,10 @@ template <class Emitter>
 bool Compiler<Emitter>::VisitCastExpr(const CastExpr *CE) {
   const Expr *SubExpr = CE->getSubExpr();
 
+  if (isPunningDereference(SubExpr))
+    return this->emitInvalidCast(CastKind::Reinterpret, /*Fatal=*/true,
+                                 CE);
+
   if (DiscardResult)
     return this->delegate(SubExpr);
 
@@ -5511,6 +5515,46 @@ bool Compiler<Emitter>::maybeEmitDeferredVarInit(const VarDecl *VD) {
   return true;
 }
 
+template <class Emitter>
+bool Compiler<Emitter>::isPunningDereference(const Expr *E)
+{
+  E = E->IgnoreParenImpCasts();
+
+  const auto *UO = dyn_cast<UnaryOperator>(E);
+  if (!UO || UO->getOpcode() != UO_Deref)
+    return false;
+
+  const Expr *Base = UO->getSubExpr()->IgnoreParenImpCasts();
+  const auto *Cast = dyn_cast<CastExpr>(Base);
+  if (!Cast)
+    return false;
+
+  // Only consider reinterpret-ish casts
+  switch (Cast->getCastKind()) {
+    case CK_BitCast:
+    case CK_PointerToIntegral:
+    case CK_IntegralToPointer:
+    case CK_AddressSpaceConversion:
+      break;
+    default:
+      return false; // CK_NoOp etc. are fine
+  }
+
+  QualType DestPtrTy = Cast->getType();
+  QualType SrcPtrTy  = Cast->getSubExpr()->getType();
+  if (!DestPtrTy->isPointerType() || !SrcPtrTy->isPointerType())
+    return true; // super fishy, treat it as a pun
+
+  QualType DestPointee = DestPtrTy->getPointeeType();
+  QualType SrcPointee  = SrcPtrTy->getPointeeType();
+
+  // If pointee types differ (ignoring qualifiers), its a pun
+  if (!Ctx.getASTContext().hasSameUnqualifiedType(DestPointee, SrcPointee))
+    return true;
+
+  return false;
+}
+
 static bool hasTrivialDefaultCtorParent(const FieldDecl *FD) {
   assert(FD);
   assert(FD->getParent()->isUnion());
diff --git a/clang/lib/AST/ByteCode/Compiler.h b/clang/lib/AST/ByteCode/Compiler.h
index 5c46f75af4da3..2e814fe3e623b 100644
--- a/clang/lib/AST/ByteCode/Compiler.h
+++ b/clang/lib/AST/ByteCode/Compiler.h
@@ -423,6 +423,8 @@ class Compiler : public ConstStmtVisitor<Compiler<Emitter>, bool>,
   bool checkLiteralType(const Expr *E);
   bool maybeEmitDeferredVarInit(const VarDecl *VD);
 
+  bool isPunningDereference(const Expr *E);
+
   bool refersToUnion(const Expr *E);
 
 protected:
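
As a reading aid, the decision logic of `isPunningDereference` can be modelled outside Clang. The sketch below is a toy: `ToyCastKind`, `ToyType`, `ptrTo`, `nonPointer` and `isPunningCast` are hypothetical stand-ins rather than Clang APIs, and comparing pointee names as strings only approximates `hasSameUnqualifiedType`.

```cpp
#include <cassert>
#include <string>

// Toy stand-ins for the cast kinds named in the patch, plus two others.
enum class ToyCastKind {
  NoOp,
  BitCast,
  PointerToIntegral,
  IntegralToPointer,
  AddressSpaceConversion,
  DerivedToBase
};

// A type is either a pointer to a named (unqualified) pointee or not a pointer.
struct ToyType {
  bool IsPointer;
  std::string UnqualifiedPointee;
};

inline ToyType ptrTo(const std::string &Pointee) { return ToyType{true, Pointee}; }
inline ToyType nonPointer() { return ToyType{false, ""}; }

// Mirrors the patch's control flow: filter by cast kind, then compare pointees.
inline bool isPunningCast(ToyCastKind Kind, const ToyType &Src,
                          const ToyType &Dest) {
  switch (Kind) {
  case ToyCastKind::BitCast:
  case ToyCastKind::PointerToIntegral:
  case ToyCastKind::IntegralToPointer:
  case ToyCastKind::AddressSpaceConversion:
    break;
  default:
    return false; // other cast kinds are not inspected
  }

  // A listed cast kind with a non-pointer on either side counts as a pun.
  if (!Src.IsPointer || !Dest.IsPointer)
    return true;

  // Differing pointee types (qualifiers already stripped) count as a pun.
  return Src.UnqualifiedPointee != Dest.UnqualifiedPointee;
}
```

In this model, `isPunningCast(ToyCastKind::BitCast, ptrTo("float"), ptrTo("unsigned"))` is a pun, while any `NoOp` or `DerivedToBase` cast is passed through without inspection.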

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff origin/main HEAD --extensions cpp,h -- clang/lib/AST/ByteCode/Compiler.cpp clang/lib/AST/ByteCode/Compiler.h --diff_from_common_commit

⚠️
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing origin/main to the base branch/commit you want to compare against.
⚠️

View the diff from clang-format here.
diff --git a/clang/lib/AST/ByteCode/Compiler.cpp b/clang/lib/AST/ByteCode/Compiler.cpp
index c30262dd8..62d744a16 100644
--- a/clang/lib/AST/ByteCode/Compiler.cpp
+++ b/clang/lib/AST/ByteCode/Compiler.cpp
@@ -213,8 +213,7 @@ bool Compiler<Emitter>::VisitCastExpr(const CastExpr *CE) {
   const Expr *SubExpr = CE->getSubExpr();
 
   if (isPunningDereference(SubExpr))
-    return this->emitInvalidCast(CastKind::Reinterpret, /*Fatal=*/true,
-                                 CE);
+    return this->emitInvalidCast(CastKind::Reinterpret, /*Fatal=*/true, CE);
 
   if (DiscardResult)
     return this->delegate(SubExpr);
@@ -5516,8 +5515,7 @@ bool Compiler<Emitter>::maybeEmitDeferredVarInit(const VarDecl *VD) {
 }
 
 template <class Emitter>
-bool Compiler<Emitter>::isPunningDereference(const Expr *E)
-{
+bool Compiler<Emitter>::isPunningDereference(const Expr *E) {
   E = E->IgnoreParenImpCasts();
 
   const auto *UO = dyn_cast<UnaryOperator>(E);
@@ -5531,22 +5529,22 @@ bool Compiler<Emitter>::isPunningDereference(const Expr *E)
 
   // Only consider reinterpret-ish casts
   switch (Cast->getCastKind()) {
-    case CK_BitCast:
-    case CK_PointerToIntegral:
-    case CK_IntegralToPointer:
-    case CK_AddressSpaceConversion:
-      break;
-    default:
-      return false; // CK_NoOp etc. are fine
+  case CK_BitCast:
+  case CK_PointerToIntegral:
+  case CK_IntegralToPointer:
+  case CK_AddressSpaceConversion:
+    break;
+  default:
+    return false; // CK_NoOp etc. are fine
   }
 
   QualType DestPtrTy = Cast->getType();
-  QualType SrcPtrTy  = Cast->getSubExpr()->getType();
+  QualType SrcPtrTy = Cast->getSubExpr()->getType();
   if (!DestPtrTy->isPointerType() || !SrcPtrTy->isPointerType())
     return true; // super fishy, treat it as a pun
 
   QualType DestPointee = DestPtrTy->getPointeeType();
-  QualType SrcPointee  = SrcPtrTy->getPointeeType();
+  QualType SrcPointee = SrcPtrTy->getPointeeType();
 
   // If pointee types differ (ignoring qualifiers), its a pun
   if (!Ctx.getASTContext().hasSameUnqualifiedType(DestPointee, SrcPointee))

@Sirraide requested a review from tbaederr on October 16, 2025 16:29
@tbaederr
Contributor

Use [clang][bytecode] in the title please.

      break;
    default:
      return false; // CK_NoOp etc. are fine
  }
Contributor

How did you come up with this list?

Contributor Author

It's tentative. I think we need to check whether the cast is actually UB, since some casts are legal and should be allowed in constant evaluation (CE).

As a proof of concept to see whether this fixes the linked issue, I just included every cast kind that remotely resembles a fishy cast.

  const Expr *SubExpr = CE->getSubExpr();

  if (isPunningDereference(SubExpr))
    return this->emitInvalidCast(CastKind::Reinterpret, /*Fatal=*/true,
Contributor

I think it would be better to call this later, once we have already switched on the cast kind.

@term-est
Contributor Author

> Use [clang][bytecode] in the title please.

Yep, I can fix things up. I think there are formatting issues, and some legal downcasts are rejected as well, which needs more work.

I mainly want an opinion on whether we would want this. I am just getting into the bytecode interpreter and don't really know the direction it is heading.

Let me know if you want me to pursue this; otherwise I can close the PR.

Thank you 🍓

@term-est changed the title from "[clang][Interp] Bail out from type-punning casts" to "[clang][bytecode] Bail out from type-punning casts" on Oct 17, 2025
@tbaederr
Contributor

Looking at https://godbolt.org/z/xoM5Tb7cc, I think we should basically reject CK_BitCast casts where the pointee types don't classify to the same PrimType. That's just from a quick glance though; I haven't checked whether that breaks anything.
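
A rough sketch of that rule, written as a standalone toy rather than against the Clang sources: `ToyPrim`, `ToyPointee`, `classifyToy` and `shouldRejectBitCast` are hypothetical names, and the decision to leave pointees that do not classify to a primitive untouched is an assumption of this sketch, not something stated in the comment above.

```cpp
#include <cassert>

// Toy primitive categories, loosely modelled on the idea of a PrimType.
enum class ToyPrim { Sint8, Uint8, Sint32, Uint32, Float, None };

// Toy pointee types; Record stands for anything that is not a primitive.
enum class ToyPointee { SignedChar, UnsignedChar, Int, UnsignedInt, Float, Record };

// Maps a pointee type to its primitive category, or None for non-primitives.
inline ToyPrim classifyToy(ToyPointee T) {
  switch (T) {
  case ToyPointee::SignedChar:
    return ToyPrim::Sint8;
  case ToyPointee::UnsignedChar:
    return ToyPrim::Uint8;
  case ToyPointee::Int:
    return ToyPrim::Sint32;
  case ToyPointee::UnsignedInt:
    return ToyPrim::Uint32;
  case ToyPointee::Float:
    return ToyPrim::Float;
  case ToyPointee::Record:
    return ToyPrim::None;
  }
  return ToyPrim::None;
}

// Rejects a bitcast when both pointees are primitives of different categories.
inline bool shouldRejectBitCast(ToyPointee Src, ToyPointee Dest) {
  ToyPrim S = classifyToy(Src);
  ToyPrim D = classifyToy(Dest);
  // Assumption of this sketch: non-primitive pointees are left to other checks.
  if (S == ToyPrim::None || D == ToyPrim::None)
    return false;
  return S != D;
}
```

Under this model a `float *` to `unsigned *` bitcast is rejected because `Float` and `Uint32` differ, while a cast whose pointees share a category is allowed through.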

Development

Successfully merging this pull request may close these issues.

[clang][bytecode] Assertion `Offset + sizeof(T) <= BS.Pointee->getDescriptor()->getAllocSize()' failed.

3 participants