Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[-Wunsafe-buffer-usage] Introduce std::array fixits #80084

Merged

Conversation

jkorous-apple
Copy link
Contributor

No description provided.

It needs to be used from handleUnsafeVariableGroup function.
…::array

Array subscript on a const size array is not bounds-checked. The idiomatic
replacement is std::array which is bounds-safe in hardened mode of libc++.

This commit extends the fixit-producing machine to consider std::array as a
transformation target type and teaches it to handle the array subscript on const
size arrays with a trivial (empty) fixit.

The transformation is straightforward for most cases:
T array[N]; => std::array<T, N> array;
but for array of const pointers it needs to be:
T* const array[N]; => const std::array<T*, N> array;

This case and other special cases are to be implemented in follow-up
patches.
… fixits

Fixits related to arrays generally change type from T [N] to
std::array<T, N>. N has to be constant but doesn't have to be an integer
literal. It can be for example a constant.

The fixits need to preserve the spelling of the array size.
The assumption is that not specifying the numerical value directly was a
conscious decision of the original author and it'll very likely still apply to
the transformed code.

E. g. for
int buffer[SIZE];

if `buffer` is unsafe it should become:
std::array<int, SIZE> buffer;

rather than:
std::array<int, 42> buffer;
…array fixits

In other words - don't use the type that the source code gets expanded
to in the AST as there's very likely a reason why it is represented via
a typedef or macro in the sourcode.
Specifically applies to text with array element type and array size for
constant size arrays. Trimming there strings on both sides and using
llvm::StringRef::trim().
…deduced

C++ allows the array size to be omitted in declaration if it can be deduced from the
initializer.
E. g.:
int array [] = { 1, 2, 3 };

std::array<T, N> on the other hand requires the size N to be provided.

Currently the fixit produced would use empty space for N which would lead to a
compilation failure. This patch makes the fixit-producing machine just
bail in such cases and adds a FIXME to be addressed later.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:analysis labels Jan 31, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jan 31, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-analysis

Author: None (jkorous-apple)

Changes

Patch is 39.66 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/80084.diff

7 Files Affected:

  • (modified) clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h (+44-3)
  • (modified) clang/lib/Analysis/UnsafeBufferUsage.cpp (+218-138)
  • (modified) clang/lib/Sema/AnalysisBasedWarnings.cpp (+14-2)
  • (added) clang/test/SemaCXX/warn-unsafe-buffer-usage-array.cpp (+24)
  • (modified) clang/test/SemaCXX/warn-unsafe-buffer-usage-debug.cpp (+5-7)
  • (added) clang/test/SemaCXX/warn-unsafe-buffer-usage-fixits-local-var-array.cpp (+167)
  • (modified) clang/test/SemaCXX/warn-unsafe-buffer-usage.cpp (+5)
diff --git a/clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h b/clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h
index aca1ad998822c..2c9ebf3fb42d0 100644
--- a/clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h
+++ b/clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h
@@ -42,6 +42,43 @@ class VariableGroupsManager {
   virtual VarGrpRef getGroupOfParms() const =0;
 };
 
+// FixitStrategy is a map from variables to the way we plan to emit fixes for
+// these variables. It is figured out gradually by trying different fixes
+// for different variables depending on gadgets in which these variables
+// participate.
+class FixitStrategy {
+public:
+  enum class Kind {
+    Wontfix,  // We don't plan to emit a fixit for this variable.
+    Span,     // We recommend replacing the variable with std::span.
+    Iterator, // We recommend replacing the variable with std::span::iterator.
+    Array,    // We recommend replacing the variable with std::array.
+    Vector    // We recommend replacing the variable with std::vector.
+  };
+
+private:
+  using MapTy = llvm::DenseMap<const VarDecl *, Kind>;
+
+  MapTy Map;
+
+public:
+  FixitStrategy() = default;
+  FixitStrategy(const FixitStrategy &) = delete; // Let's avoid copies.
+  FixitStrategy &operator=(const FixitStrategy &) = delete;
+  FixitStrategy(FixitStrategy &&) = default;
+  FixitStrategy &operator=(FixitStrategy &&) = default;
+
+  void set(const VarDecl *VD, Kind K) { Map[VD] = K; }
+
+  Kind lookup(const VarDecl *VD) const {
+    auto I = Map.find(VD);
+    if (I == Map.end())
+      return Kind::Wontfix;
+
+    return I->second;
+  }
+};
+
 /// The interface that lets the caller handle unsafe buffer usage analysis
 /// results by overriding this class's handle... methods.
 class UnsafeBufferUsageHandler {
@@ -58,6 +95,8 @@ class UnsafeBufferUsageHandler {
 #endif
 
 public:
+  enum class TargetType { Span, Array };
+
   UnsafeBufferUsageHandler() = default;
   virtual ~UnsafeBufferUsageHandler() = default;
 
@@ -75,9 +114,11 @@ class UnsafeBufferUsageHandler {
   ///
   /// `D` is the declaration of the callable under analysis that owns `Variable`
   /// and all of its group mates.
-  virtual void handleUnsafeVariableGroup(const VarDecl *Variable,
-                                         const VariableGroupsManager &VarGrpMgr,
-                                         FixItList &&Fixes, const Decl *D) = 0;
+  virtual void
+  handleUnsafeVariableGroup(const VarDecl *Variable,
+                            const VariableGroupsManager &VarGrpMgr,
+                            FixItList &&Fixes, const Decl *D,
+                            const FixitStrategy &VarTargetTypes) = 0;
 
 #ifndef NDEBUG
 public:
diff --git a/clang/lib/Analysis/UnsafeBufferUsage.cpp b/clang/lib/Analysis/UnsafeBufferUsage.cpp
index 823cd2a7b9969..18d0ca7a17d3c 100644
--- a/clang/lib/Analysis/UnsafeBufferUsage.cpp
+++ b/clang/lib/Analysis/UnsafeBufferUsage.cpp
@@ -12,10 +12,13 @@
 #include "clang/AST/RecursiveASTVisitor.h"
 #include "clang/AST/StmtVisitor.h"
 #include "clang/ASTMatchers/ASTMatchFinder.h"
+#include "clang/Basic/CharInfo.h"
+#include "clang/Basic/SourceLocation.h"
 #include "clang/Lex/Lexer.h"
 #include "clang/Lex/Preprocessor.h"
 #include "llvm/ADT/APSInt.h"
 #include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/StringRef.h"
 #include <memory>
 #include <optional>
 #include <queue>
@@ -407,9 +410,6 @@ using DeclUseList = SmallVector<const DeclRefExpr *, 1>;
 
 // Convenience typedef.
 using FixItList = SmallVector<FixItHint, 4>;
-
-// Defined below.
-class Strategy;
 } // namespace
 
 namespace {
@@ -486,7 +486,7 @@ class FixableGadget : public Gadget {
   /// Returns a fixit that would fix the current gadget according to
   /// the current strategy. Returns std::nullopt if the fix cannot be produced;
   /// returns an empty list if no fixes are necessary.
-  virtual std::optional<FixItList> getFixits(const Strategy &) const {
+  virtual std::optional<FixItList> getFixits(const FixitStrategy &) const {
     return std::nullopt;
   }
 
@@ -737,7 +737,8 @@ class PointerInitGadget : public FixableGadget {
     return stmt(PtrInitStmt);
   }
 
-  virtual std::optional<FixItList> getFixits(const Strategy &S) const override;
+  virtual std::optional<FixItList>
+  getFixits(const FixitStrategy &S) const override;
 
   virtual const Stmt *getBaseStmt() const override {
     // FIXME: This needs to be the entire DeclStmt, assuming that this method
@@ -789,7 +790,8 @@ class PointerAssignmentGadget : public FixableGadget {
     return stmt(isInUnspecifiedUntypedContext(PtrAssignExpr));
   }
 
-  virtual std::optional<FixItList> getFixits(const Strategy &S) const override;
+  virtual std::optional<FixItList>
+  getFixits(const FixitStrategy &S) const override;
 
   virtual const Stmt *getBaseStmt() const override {
     // FIXME: This should be the binary operator, assuming that this method
@@ -892,7 +894,8 @@ class ULCArraySubscriptGadget : public FixableGadget {
     return expr(isInUnspecifiedLvalueContext(Target));
   }
 
-  virtual std::optional<FixItList> getFixits(const Strategy &S) const override;
+  virtual std::optional<FixItList>
+  getFixits(const FixitStrategy &S) const override;
 
   virtual const Stmt *getBaseStmt() const override { return Node; }
 
@@ -932,7 +935,8 @@ class UPCStandalonePointerGadget : public FixableGadget {
     return stmt(isInUnspecifiedPointerContext(target));
   }
 
-  virtual std::optional<FixItList> getFixits(const Strategy &S) const override;
+  virtual std::optional<FixItList>
+  getFixits(const FixitStrategy &S) const override;
 
   virtual const Stmt *getBaseStmt() const override { return Node; }
 
@@ -976,7 +980,8 @@ class PointerDereferenceGadget : public FixableGadget {
 
   virtual const Stmt *getBaseStmt() const final { return Op; }
 
-  virtual std::optional<FixItList> getFixits(const Strategy &S) const override;
+  virtual std::optional<FixItList>
+  getFixits(const FixitStrategy &S) const override;
 };
 
 // Represents expressions of the form `&DRE[any]` in the Unspecified Pointer
@@ -1009,7 +1014,8 @@ class UPCAddressofArraySubscriptGadget : public FixableGadget {
             .bind(UPCAddressofArraySubscriptTag)))));
   }
 
-  virtual std::optional<FixItList> getFixits(const Strategy &) const override;
+  virtual std::optional<FixItList>
+  getFixits(const FixitStrategy &) const override;
 
   virtual const Stmt *getBaseStmt() const override { return Node; }
 
@@ -1088,46 +1094,6 @@ class DeclUseTracker {
 };
 } // namespace
 
-namespace {
-// Strategy is a map from variables to the way we plan to emit fixes for
-// these variables. It is figured out gradually by trying different fixes
-// for different variables depending on gadgets in which these variables
-// participate.
-class Strategy {
-public:
-  enum class Kind {
-    Wontfix,  // We don't plan to emit a fixit for this variable.
-    Span,     // We recommend replacing the variable with std::span.
-    Iterator, // We recommend replacing the variable with std::span::iterator.
-    Array,    // We recommend replacing the variable with std::array.
-    Vector    // We recommend replacing the variable with std::vector.
-  };
-
-private:
-  using MapTy = llvm::DenseMap<const VarDecl *, Kind>;
-
-  MapTy Map;
-
-public:
-  Strategy() = default;
-  Strategy(const Strategy &) = delete; // Let's avoid copies.
-  Strategy &operator=(const Strategy &) = delete;
-  Strategy(Strategy &&) = default;
-  Strategy &operator=(Strategy &&) = default;
-
-  void set(const VarDecl *VD, Kind K) { Map[VD] = K; }
-
-  Kind lookup(const VarDecl *VD) const {
-    auto I = Map.find(VD);
-    if (I == Map.end())
-      return Kind::Wontfix;
-
-    return I->second;
-  }
-};
-} // namespace
-
-
 // Representing a pointer type expression of the form `++Ptr` in an Unspecified
 // Pointer Context (UPC):
 class UPCPreIncrementGadget : public FixableGadget {
@@ -1159,7 +1125,8 @@ class UPCPreIncrementGadget : public FixableGadget {
 										  ).bind(UPCPreIncrementTag)))));
   }
 
-  virtual std::optional<FixItList> getFixits(const Strategy &S) const override;
+  virtual std::optional<FixItList>
+  getFixits(const FixitStrategy &S) const override;
 
   virtual const Stmt *getBaseStmt() const override { return Node; }
 
@@ -1204,7 +1171,8 @@ class UUCAddAssignGadget : public FixableGadget {
     // clang-format on
   }
 
-  virtual std::optional<FixItList> getFixits(const Strategy &S) const override;
+  virtual std::optional<FixItList>
+  getFixits(const FixitStrategy &S) const override;
 
   virtual const Stmt *getBaseStmt() const override { return Node; }
 
@@ -1254,7 +1222,8 @@ class DerefSimplePtrArithFixableGadget : public FixableGadget {
     // clang-format on
   }
 
-  virtual std::optional<FixItList> getFixits(const Strategy &s) const final;
+  virtual std::optional<FixItList>
+  getFixits(const FixitStrategy &s) const final;
 
   // TODO remove this method from FixableGadget interface
   virtual const Stmt *getBaseStmt() const final { return nullptr; }
@@ -1464,38 +1433,40 @@ bool clang::internal::anyConflict(const SmallVectorImpl<FixItHint> &FixIts,
 }
 
 std::optional<FixItList>
-PointerAssignmentGadget::getFixits(const Strategy &S) const {
+PointerAssignmentGadget::getFixits(const FixitStrategy &S) const {
   const auto *LeftVD = cast<VarDecl>(PtrLHS->getDecl());
   const auto *RightVD = cast<VarDecl>(PtrRHS->getDecl());
   switch (S.lookup(LeftVD)) {
-    case Strategy::Kind::Span:
-      if (S.lookup(RightVD) == Strategy::Kind::Span)
-        return FixItList{};
-      return std::nullopt;
-    case Strategy::Kind::Wontfix:
-      return std::nullopt;
-    case Strategy::Kind::Iterator:
-    case Strategy::Kind::Array:
-    case Strategy::Kind::Vector:
-      llvm_unreachable("unsupported strategies for FixableGadgets");
+  case FixitStrategy::Kind::Span:
+    if (S.lookup(RightVD) == FixitStrategy::Kind::Span)
+      return FixItList{};
+    return std::nullopt;
+  case FixitStrategy::Kind::Wontfix:
+    return std::nullopt;
+  case FixitStrategy::Kind::Iterator:
+  case FixitStrategy::Kind::Array:
+    return std::nullopt;
+  case FixitStrategy::Kind::Vector:
+    llvm_unreachable("unsupported strategies for FixableGadgets");
   }
   return std::nullopt;
 }
 
 std::optional<FixItList>
-PointerInitGadget::getFixits(const Strategy &S) const {
+PointerInitGadget::getFixits(const FixitStrategy &S) const {
   const auto *LeftVD = PtrInitLHS;
   const auto *RightVD = cast<VarDecl>(PtrInitRHS->getDecl());
   switch (S.lookup(LeftVD)) {
-    case Strategy::Kind::Span:
-      if (S.lookup(RightVD) == Strategy::Kind::Span)
-        return FixItList{};
-      return std::nullopt;
-    case Strategy::Kind::Wontfix:
-      return std::nullopt;
-    case Strategy::Kind::Iterator:
-    case Strategy::Kind::Array:
-    case Strategy::Kind::Vector:
+  case FixitStrategy::Kind::Span:
+    if (S.lookup(RightVD) == FixitStrategy::Kind::Span)
+      return FixItList{};
+    return std::nullopt;
+  case FixitStrategy::Kind::Wontfix:
+    return std::nullopt;
+  case FixitStrategy::Kind::Iterator:
+  case FixitStrategy::Kind::Array:
+    return std::nullopt;
+  case FixitStrategy::Kind::Vector:
     llvm_unreachable("unsupported strategies for FixableGadgets");
   }
   return std::nullopt;
@@ -1512,12 +1483,12 @@ static bool isNonNegativeIntegerExpr(const Expr *Expr, const VarDecl *VD,
 }
 
 std::optional<FixItList>
-ULCArraySubscriptGadget::getFixits(const Strategy &S) const {
+ULCArraySubscriptGadget::getFixits(const FixitStrategy &S) const {
   if (const auto *DRE =
           dyn_cast<DeclRefExpr>(Node->getBase()->IgnoreImpCasts()))
     if (const auto *VD = dyn_cast<VarDecl>(DRE->getDecl())) {
       switch (S.lookup(VD)) {
-      case Strategy::Kind::Span: {
+      case FixitStrategy::Kind::Span: {
 
         // If the index has a negative constant value, we give up as no valid
         // fix-it can be generated:
@@ -1528,10 +1499,11 @@ ULCArraySubscriptGadget::getFixits(const Strategy &S) const {
         // no-op is a good fix-it, otherwise
         return FixItList{};
       }
-      case Strategy::Kind::Wontfix:
-      case Strategy::Kind::Iterator:
-      case Strategy::Kind::Array:
-      case Strategy::Kind::Vector:
+      case FixitStrategy::Kind::Array:
+        return FixItList{};
+      case FixitStrategy::Kind::Wontfix:
+      case FixitStrategy::Kind::Iterator:
+      case FixitStrategy::Kind::Vector:
         llvm_unreachable("unsupported strategies for FixableGadgets");
       }
     }
@@ -1542,17 +1514,18 @@ static std::optional<FixItList> // forward declaration
 fixUPCAddressofArraySubscriptWithSpan(const UnaryOperator *Node);
 
 std::optional<FixItList>
-UPCAddressofArraySubscriptGadget::getFixits(const Strategy &S) const {
+UPCAddressofArraySubscriptGadget::getFixits(const FixitStrategy &S) const {
   auto DREs = getClaimedVarUseSites();
   const auto *VD = cast<VarDecl>(DREs.front()->getDecl());
 
   switch (S.lookup(VD)) {
-  case Strategy::Kind::Span:
+  case FixitStrategy::Kind::Span:
     return fixUPCAddressofArraySubscriptWithSpan(Node);
-  case Strategy::Kind::Wontfix:
-  case Strategy::Kind::Iterator:
-  case Strategy::Kind::Array:
-  case Strategy::Kind::Vector:
+  case FixitStrategy::Kind::Wontfix:
+  case FixitStrategy::Kind::Iterator:
+  case FixitStrategy::Kind::Array:
+    return std::nullopt;
+  case FixitStrategy::Kind::Vector:
     llvm_unreachable("unsupported strategies for FixableGadgets");
   }
   return std::nullopt; // something went wrong, no fix-it
@@ -1803,10 +1776,10 @@ getSpanTypeText(StringRef EltTyText,
 }
 
 std::optional<FixItList>
-DerefSimplePtrArithFixableGadget::getFixits(const Strategy &s) const {
+DerefSimplePtrArithFixableGadget::getFixits(const FixitStrategy &s) const {
   const VarDecl *VD = dyn_cast<VarDecl>(BaseDeclRefExpr->getDecl());
 
-  if (VD && s.lookup(VD) == Strategy::Kind::Span) {
+  if (VD && s.lookup(VD) == FixitStrategy::Kind::Span) {
     ASTContext &Ctx = VD->getASTContext();
     // std::span can't represent elements before its begin()
     if (auto ConstVal = Offset->getIntegerConstantExpr(Ctx))
@@ -1866,10 +1839,10 @@ DerefSimplePtrArithFixableGadget::getFixits(const Strategy &s) const {
 }
 
 std::optional<FixItList>
-PointerDereferenceGadget::getFixits(const Strategy &S) const {
+PointerDereferenceGadget::getFixits(const FixitStrategy &S) const {
   const VarDecl *VD = cast<VarDecl>(BaseDeclRefExpr->getDecl());
   switch (S.lookup(VD)) {
-  case Strategy::Kind::Span: {
+  case FixitStrategy::Kind::Span: {
     ASTContext &Ctx = VD->getASTContext();
     SourceManager &SM = Ctx.getSourceManager();
     // Required changes: *(ptr); => (ptr[0]); and *ptr; => ptr[0]
@@ -1884,11 +1857,12 @@ PointerDereferenceGadget::getFixits(const Strategy &S) const {
     }
     break;
   }
-  case Strategy::Kind::Iterator:
-  case Strategy::Kind::Array:
-  case Strategy::Kind::Vector:
-    llvm_unreachable("Strategy not implemented yet!");
-  case Strategy::Kind::Wontfix:
+  case FixitStrategy::Kind::Iterator:
+  case FixitStrategy::Kind::Array:
+    return std::nullopt;
+  case FixitStrategy::Kind::Vector:
+    llvm_unreachable("FixitStrategy not implemented yet!");
+  case FixitStrategy::Kind::Wontfix:
     llvm_unreachable("Invalid strategy!");
   }
 
@@ -1897,27 +1871,27 @@ PointerDereferenceGadget::getFixits(const Strategy &S) const {
 
 // Generates fix-its replacing an expression of the form UPC(DRE) with
 // `DRE.data()`
-std::optional<FixItList> UPCStandalonePointerGadget::getFixits(const Strategy &S)
-      const {
+std::optional<FixItList>
+UPCStandalonePointerGadget::getFixits(const FixitStrategy &S) const {
   const auto VD = cast<VarDecl>(Node->getDecl());
   switch (S.lookup(VD)) {
-    case Strategy::Kind::Span: {
-      ASTContext &Ctx = VD->getASTContext();
-      SourceManager &SM = Ctx.getSourceManager();
-      // Inserts the .data() after the DRE
-      std::optional<SourceLocation> EndOfOperand =
-          getPastLoc(Node, SM, Ctx.getLangOpts());
-
-      if (EndOfOperand)
-        return FixItList{{FixItHint::CreateInsertion(
-            *EndOfOperand, ".data()")}};
-      // FIXME: Points inside a macro expansion.
-      break;
+  case FixitStrategy::Kind::Span: {
+    ASTContext &Ctx = VD->getASTContext();
+    SourceManager &SM = Ctx.getSourceManager();
+    // Inserts the .data() after the DRE
+    std::optional<SourceLocation> EndOfOperand =
+        getPastLoc(Node, SM, Ctx.getLangOpts());
+
+    if (EndOfOperand)
+      return FixItList{{FixItHint::CreateInsertion(*EndOfOperand, ".data()")}};
+    // FIXME: Points inside a macro expansion.
+    break;
     }
-    case Strategy::Kind::Wontfix:
-    case Strategy::Kind::Iterator:
-    case Strategy::Kind::Array:
-    case Strategy::Kind::Vector:
+    case FixitStrategy::Kind::Wontfix:
+    case FixitStrategy::Kind::Iterator:
+    case FixitStrategy::Kind::Array:
+      return std::nullopt;
+    case FixitStrategy::Kind::Vector:
       llvm_unreachable("unsupported strategies for FixableGadgets");
   }
 
@@ -1962,14 +1936,14 @@ fixUPCAddressofArraySubscriptWithSpan(const UnaryOperator *Node) {
 }
 
 std::optional<FixItList>
-UUCAddAssignGadget::getFixits(const Strategy &S) const {
+UUCAddAssignGadget::getFixits(const FixitStrategy &S) const {
   DeclUseList DREs = getClaimedVarUseSites();
 
   if (DREs.size() != 1)
     return std::nullopt; // In cases of `Ptr += n` where `Ptr` is not a DRE, we
                          // give up
   if (const VarDecl *VD = dyn_cast<VarDecl>(DREs.front()->getDecl())) {
-    if (S.lookup(VD) == Strategy::Kind::Span) {
+    if (S.lookup(VD) == FixitStrategy::Kind::Span) {
       FixItList Fixes;
 
       const Stmt *AddAssignNode = getBaseStmt();
@@ -2003,14 +1977,15 @@ UUCAddAssignGadget::getFixits(const Strategy &S) const {
   return std::nullopt; // Not in the cases that we can handle for now, give up.
 }
 
-std::optional<FixItList> UPCPreIncrementGadget::getFixits(const Strategy &S) const {
+std::optional<FixItList>
+UPCPreIncrementGadget::getFixits(const FixitStrategy &S) const {
   DeclUseList DREs = getClaimedVarUseSites();
 
   if (DREs.size() != 1)
     return std::nullopt; // In cases of `++Ptr` where `Ptr` is not a DRE, we
                          // give up
   if (const VarDecl *VD = dyn_cast<VarDecl>(DREs.front()->getDecl())) {
-    if (S.lookup(VD) == Strategy::Kind::Span) {
+    if (S.lookup(VD) == FixitStrategy::Kind::Span) {
       FixItList Fixes;
       std::stringstream SS;
       const Stmt *PreIncNode = getBaseStmt();
@@ -2261,7 +2236,7 @@ static bool hasConflictingOverload(const FunctionDecl *FD) {
 // }
 //
 static std::optional<FixItList>
-createOverloadsForFixedParams(const Strategy &S, const FunctionDecl *FD,
+createOverloadsForFixedParams(const FixitStrategy &S, const FunctionDecl *FD,
                               const ASTContext &Ctx,
                               UnsafeBufferUsageHandler &Handler) {
   // FIXME: need to make this conflict checking better:
@@ -2278,9 +2253,9 @@ createOverloadsForFixedParams(const Strategy &S, const FunctionDecl *FD,
   for (unsigned i = 0; i < NumParms; i++) {
     const ParmVarDecl *PVD = FD->getParamDecl(i);
 
-    if (S.lookup(PVD) == Strategy::Kind::Wontfix)
+    if (S.lookup(PVD) == FixitStrategy::Kind::Wontfix)
       continue;
-    if (S.lookup(PVD) != Strategy::Kind::Span)
+    if (S.lookup(PVD) != FixitStrategy::Kind::Span)
       // Not supported, not suppose to happen:
       return std::nullopt;
 
@@ -2291,7 +2266,8 @@ createOverloadsForFixedParams(const Strategy &S, const FunctionDecl *FD,
     if (!PteTyText)
       // something wrong in obtaining the text of the pointee type, give up
       return std::nullopt;
-    // FIXME: whether we should create std::span type depends on the Strategy.
+    // FIXME: whether we should create std::span type depends on the
+    // FixitStrategy.
     NewTysTexts[i] = getSpanTypeText(*PteTyText, PteTyQuals);
     ParmsMask[i] = true;
     AtLeastOneParmToFix = true;
@@ -2495,10 +2471,104 @@ static FixItList fixVariableWithSpan(const VarDecl *VD,
   return fixLocalVarDeclWithSpan(VD, Ctx, getUserFillPlaceHolder(), Handler);
 }
 
+static FixItList fixVarDeclWithArray(const VarDecl *D, const ASTContext &Ctx,
+                                     UnsafeBufferUsageHandler &Handler) {
+  FixItList FixIts{};
+
+  if (auto CAT = dyn_cast<clang::ConstantArrayType>(D-...
[truncated]

Copy link

github-actions bot commented Jan 31, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@jkorous-apple jkorous-apple force-pushed the cxx-safe-buffers/std-array-fixits branch from a627add to 04abb73 Compare February 1, 2024 01:51
@@ -58,6 +95,8 @@ class UnsafeBufferUsageHandler {
#endif

public:
enum class TargetType { Span, Array };
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a bit of dead code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally. Removed. Thanks!

// T * foo[10]; => std::array<T *, 10> foo;
//
// However, for const pointers the transformation is different:
// T * const foo[10]; => const std::array<T *, 10> foo;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, wouldn't std::array<T *const, 10> have the exact same behavior in practice? (Though I completely agree that const std::array<T *, 10> is more readable.) (Testbed: https://godbolt.org/z/aeohe6jW9)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, let me simplify this.

Ctx.getLangOpts());
if (!MaybeElemTypeTxt)
return {};
const llvm::StringRef ElemTypeTxt = MaybeElemTypeTxt->trim();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should getRangeText() trim automatically?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now I would leave it as is but if we later find that callers need to trim so often it gets annoying then let's reconsider.

@@ -61,6 +61,7 @@ void testArraySubscripts(int *p, int **pp) {
);

int a[10]; // expected-warning{{'a' is an unsafe buffer that does not perform bounds checks}}
// expected-note@-1{{change type of 'a' to 'std::array' to harden it}}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll probably have a broader discussion about wording but before I forget: "to harden it" may be misleading because we don't know/check whether library hardening is actually turned on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with broader discussion later but don't want to block the PR on it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's important to make sure it's not actively misleading, even if we plan to figure out the perfect phrasing later. Maybe to incapsulate it for hardening purposes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess what we really want to say is "change the type of 'foo' to std::array so when you build the code with hardened libc++ all subscript operations on 'foo' will be automatically bounds-safe" but that is too long and too complex for this particular communication channel.

How about:
"to enable hardening"
"to get it hardened"
"to designate it for hardening"
"to register it for hardening"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"so it can benefit from libc++ hardening"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went with "to label it for hardening".

@jkorous-apple jkorous-apple force-pushed the cxx-safe-buffers/std-array-fixits branch from f3ba5be to 7694a6d Compare February 1, 2024 22:03
// std::array::operator[] has unsigned parameter so the value will be casted to unsigned.
// This will very likely be buffer overflow but hardened std::array catches these at runtime.
buffer[signed_idx] = 0;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we can have a test showing that multi-dimentional arrays will not be fixed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent point!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added couple tests and an explicit check when we're creating fixit for unsafe array declaration.

const SourceLocation LSqBracketLoc = NextTok->getLocation();

// Get the spelling of the array size as written in the source file
// (including macros, etc.).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish we could have a more elegant way to get the spelling of the array size. I feel like TypeLoc is suppose to offer it but it does not behave as I expect in this case. (TypeLoc also disappointed me in dealing with cv-qualifiers. ¯\_(ツ)_/¯)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I expect we'll get a better way of doing this and more when we adopt transformer library in the future. Getting the array size spelling might be already implemented or at least I expect to find better means of doing that than dealing with tokens.
For now I'm happy with a temporary solution while we bootstrap the rest of the machine.

@jkorous-apple jkorous-apple force-pushed the cxx-safe-buffers/std-array-fixits branch from c465195 to 7694a6d Compare February 6, 2024 17:40
…nal C arrays

Proper support for multidimensional arrays is outside of the scope of
initial std::array fixits.
This change makes sure that we don't accidentally emit fixits for multidimensional
arrays until that work is all done. It adds a specific check when transforming
VarDecl type to std::array. Also adds corresponding several testscases.
const QualType &ArrayEltT = CAT->getElementType();
assert(!ArrayEltT.isNull() && "Trying to fix a non-array type variable!");
// FIXME: support multi-dimensional arrays
if (isa<clang::ConstantArrayType>(ArrayEltT))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible that the element type is an array type but not a constant array type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, for example it's possible to mix CSA with C variable length arrays.
https://clang.llvm.org/doxygen/classclang_1_1VariableArrayType.html

void foo(unsigned u) {
    int foo[5][u];
}

Let me fix this and add a testcase.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...and on top of that the if should check for canonical type to not be fooled by typedefs & co.

return {};
const llvm::StringRef ArraySizeTxt = MaybeArraySizeTxt->trim();
if (ArraySizeTxt.empty()) {
// FIXME: Support array size getting determined from the initializer.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, maybe we can get the concrete size value from the array type directly since there is no spelling text for us to preserve?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is technically doable and the code would compile but if we want to preserve the "spirit of the code" then we can do a better job:

int arr1[] = {0, 1, 2};
// = fixit ===>
auto arr1 = std::to_array<int>({0, 1, 2});

The author of the original code likely had a reason why they didn't hardcode the size in the first place.

<< IdentText->str();

FixIts.push_back(FixItHint::CreateReplacement(
SourceRange{D->getBeginLoc(), D->getTypeSpecEndLoc()}, OS.str()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if it is possible to declare a constant array in a way that the type specifier is completely at the left-hand side of the identifier? Like the form T a; while T represents a constant array type.

How about this?

int a[10];
decltype(a) b;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and this maybe?

typedef int ArrayT[10];
ArrayT a;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent question!
It is indeed possible so I added a couple more testcases and verified that the implementation doesn't produce nonsense when it encounters type sugar.

UnsafeBufferUsageHandler &Handler) {
FixItList FixIts{};

if (auto CAT = dyn_cast<clang::ConstantArrayType>(D->getType())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: can we replace this check with an assertion? Because it is already checked at the caller:

 case FixitStrategy::Kind::Array: {
    if (VD->isLocalVarDecl() && isa<clang::ConstantArrayType>(VD->getType()))
      return fixVariableWithArray(VD, Tracker, Ctx, Handler);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the confusion but I actually changed the conditions in the callers in the commit that addresses the type sugar.
d49d5fd

My reasoning is that the layers above just check what kind of variable they are dealing with (e. g. pointer vs array) for which they need canonical type.
These rely on more specific functions to produce the fixits in a given context. Ultimately I see it as responsibility of fixVarDeclWithArray to handle any local variable of an array type and it shouldn't expect only canonical arrays. Maybe there will be some cases of type sugar for which it'll provide fixits in the future.

Copy link
Contributor

@ziqingluo-90 ziqingluo-90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left all my comments here. I didn't think quite through for some of them so feel free to ignore if they don't make sense to you.

Other parts LGTM!

The code that creates fixits from C array to std::array assumes the array type is
not a sugared type like typedef as it would not produce correct fixits otherwise.
The patch adds a note about the condition under which the code works and testcases.
As a general precaution it also makes the format assumption explicit in iteration
over tokens of the declaration.

Contrary to that the higher layers need to use canonical (desugared) type in order
to decide which function is responsible for fixing an array VarDecl.
Array element can be for example VLA which given its non-constant length can't be transformed to std::array.
The hardening has two necessary conditions:
- using a bounds-safe standard library type in the source code
- compiling with hardened mode of libc++

The message shouldn't imply that just fixing the source code hardens it.
@jkorous-apple jkorous-apple merged commit 644ac2a into llvm:main Feb 12, 2024
3 of 4 checks passed
jkorous-apple added a commit to jkorous-apple/llvm-project that referenced this pull request Feb 13, 2024
Array subscript on a const size array is not bounds-checked. The idiomatic
replacement is std::array which is bounds-safe in hardened mode of libc++.

This commit extends the fixit-producing machine to consider std::array as a
transformation target type and teaches it to handle the array subscript on const
size arrays with a trivial (empty) fixit.

(cherry picked from commit 644ac2a)
jkorous-apple added a commit to apple/llvm-project that referenced this pull request Feb 13, 2024
…e/cxx-safe-buffers/std-array-fixits

[-Wunsafe-buffer-usage] Introduce std::array fixits (llvm#80084)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:analysis clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants