[mlir] Fix use-after-free bugs in {RankedTensorType|VectorType}::Builder #68969

MacDue · 2023-10-13T09:45:00Z

Previously, these would set their ArrayRef members to reference their storage SmallVectors after a copy-on-write (COW) operation. This leads to a use-after-free if the builder is copied and the original destroyed (as the new builder would still reference the old SmallVector).

This could easily accidentally occur in code like (annotated):

// 1. `VectorType::Builder(type)` constructs a new temporary builder
// 2. `.dropDim(0)` updates the temporary builder by reference, and returns a `VectorType::Builder&`
//    - Modifying the shape is a COW operation, so `storage` is used, and `shape` updated the reference it
// 3. Assigning the reference to `auto` copies the builder (via the default C++ copy ctor)
//    -  There's no special handling for `shape` and `storage`, so the new shape points to the old builder's `storage`
auto newType = VectorType::Builder(type).dropDim(0);
// 4. When this line is reached the original temporary builder is destroyed
//    - Actually constructing the vector type is now a use-after-free
VectorType newVectorType = VectorType(newType);

This is fixed with these changes by using CopyOnWriteArrayRef<T>, which implements the same functionality, but ensures no
dangling references are possible if it's copied.

The VectorType::Builder also set the ArrayRef scalableDims member to a temporary SmallVector when the provided scalableDims are empty. This again leads to a use-after-free, and is unnecessary as VectorType::get already handles being passed an empty scalableDims array.

These bugs were in-part caught by UBSAN, see:
https://lab.llvm.org/buildbot/#/builders/5/builds/37355

Previously, these would set their ArrayRef members to reference their storage SmallVectors after a copy-on-write operation. This leads to a use-after-free if the builder is copied and the original destroyed (as the new builder would still reference the old SmallVector). The VectorType::Builder also set the ArrayRef<bool> scalableDims member to a temporary SmallVector when the provided scalableDims are empty. This again lead to a use-after-free, and is unnecessary as VectorType::get already handles being passed an empty scalableDims array. These bugs were in-part caught by UBSAN, see: https://lab.llvm.org/buildbot/#/builders/5/builds/37355

llvmbot · 2023-10-13T09:46:07Z

@llvm/pr-subscribers-mlir-core

@llvm/pr-subscribers-mlir

Author: Benjamin Maxwell (MacDue)

Changes

Previously, these would set their ArrayRef members to reference their storage SmallVectors after a copy-on-write operation. This leads to a use-after-free if the builder is copied and the original destroyed (as the new builder would still reference the old SmallVector).

The VectorType::Builder also set the ArrayRef<bool> scalableDims member to a temporary SmallVector when the provided scalableDims are empty. This again lead to a use-after-free, and is unnecessary as VectorType::get already handles being passed an empty scalableDims array.

These bugs were in-part caught by UBSAN, see:
https://lab.llvm.org/buildbot/#/builders/5/builds/37355

Full diff: https://github.com/llvm/llvm-project/pull/68969.diff

1 Files Affected:

(modified) mlir/include/mlir/IR/BuiltinTypes.h (+24-19)

diff --git a/mlir/include/mlir/IR/BuiltinTypes.h b/mlir/include/mlir/IR/BuiltinTypes.h
index 9df5548cd5d939c..13fbae90b68c6cb 100644
--- a/mlir/include/mlir/IR/BuiltinTypes.h
+++ b/mlir/include/mlir/IR/BuiltinTypes.h
@@ -277,7 +277,7 @@ class RankedTensorType::Builder {
     if (storage.empty())
       storage.append(shape.begin(), shape.end());
     storage.erase(storage.begin() + pos);
-    shape = {storage.data(), storage.size()};
+    shape = {};
     return *this;
   }
 
@@ -287,12 +287,17 @@ class RankedTensorType::Builder {
     if (storage.empty())
       storage.append(shape.begin(), shape.end());
     storage.insert(storage.begin() + pos, val);
-    shape = {storage.data(), storage.size()};
+    shape = {};
     return *this;
   }
 
+  /// Returns the current shape.
+  ArrayRef<int64_t> getShape() {
+    return shape.empty() ? ArrayRef(storage) : shape;
+  }
+
   operator RankedTensorType() {
-    return RankedTensorType::get(shape, elementType, encoding);
+    return RankedTensorType::get(getShape(), elementType, encoding);
   }
 
 private:
@@ -319,20 +324,11 @@ class VectorType::Builder {
   /// Build from scratch.
   Builder(ArrayRef<int64_t> shape, Type elementType,
           unsigned numScalableDims = 0, ArrayRef<bool> scalableDims = {})
-      : shape(shape), elementType(elementType) {
-    if (scalableDims.empty())
-      scalableDims = SmallVector<bool>(shape.size(), false);
-    else
-      this->scalableDims = scalableDims;
-  }
+      : shape(shape), elementType(elementType), scalableDims(scalableDims) {}
 
   Builder &setShape(ArrayRef<int64_t> newShape,
                     ArrayRef<bool> newIsScalableDim = {}) {
-    if (newIsScalableDim.empty())
-      scalableDims = SmallVector<bool>(shape.size(), false);
-    else
-      scalableDims = newIsScalableDim;
-
+    scalableDims = newIsScalableDim;
     shape = newShape;
     return *this;
   }
@@ -351,9 +347,8 @@ class VectorType::Builder {
       storageScalableDims.append(scalableDims.begin(), scalableDims.end());
     storage.erase(storage.begin() + pos);
     storageScalableDims.erase(storageScalableDims.begin() + pos);
-    shape = {storage.data(), storage.size()};
-    scalableDims =
-        ArrayRef<bool>(storageScalableDims.data(), storageScalableDims.size());
+    shape = {};
+    scalableDims = {};
     return *this;
   }
 
@@ -363,12 +358,22 @@ class VectorType::Builder {
       storage.append(shape.begin(), shape.end());
     assert(pos < storage.size() && "overflow");
     storage[pos] = val;
-    shape = {storage.data(), storage.size()};
+    shape = {};
     return *this;
   }
 
+  /// Returns the current shape.
+  ArrayRef<int64_t> getShape() {
+    return shape.empty() ? ArrayRef(storage) : shape;
+  }
+
+  /// Returns the current scalable dims.
+  ArrayRef<bool> getScalableDims() {
+    return scalableDims.empty() ? ArrayRef(storageScalableDims) : scalableDims;
+  }
+
   operator VectorType() {
-    return VectorType::get(shape, elementType, scalableDims);
+    return VectorType::get(getShape(), elementType, getScalableDims());
   }
 
 private:

mlir/include/mlir/IR/BuiltinTypes.h

Easy to mess up 🤦

This is a safer helper class that avoids the issues with the previous manual approaches. It also cleans up the builder implementations.

banach-space

Overall makes sense - thanks for the fix! `

CopyOnWriteArrayRef` feels like something that should be moved elsewhere in LLVM/MLIR (i.e. I see it being used/needed more widely). So not sure.

I think that this deserves some space in the summary:

Bug not triggered here:

// One builder is constructed. Updated by reference via the dropDim calls.
// Then converted to a vector type.
VectorType newType = VectorType::Builder(type).dropDim(0).dropDim(1);

Bug triggered here:

// One builder is constructed, updated by reference, then assigned 
// to auto, which ends up copying the builder. The original temporary
// builder is destroyed, and `shape` now points to junk.
auto newType = VectorType::Builder(type).dropDim(0).dropDim(1);
VectorType newVectorType = VectorType(newType);

mlir/include/mlir/IR/BuiltinTypes.h

MacDue · 2023-10-17T11:26:13Z

CopyOnWriteArrayRef` feels like something that should be moved elsewhere in LLVM/MLIR (i.e. I see it being used/needed more widely). So not sure.

I don't think CopyOnWriteArrayRef should be moved somewhere more general just yet though, since it's rather bare bones, implementing only what's needed for the builders.

MacDue · 2023-10-17T11:26:51Z

(I've now put an annotated example of the issue in the summary)

joker-eph

The new CopyOnWriteArrayRef is just the nice abstraction here that makes this safe, thanks! :)

mlir/include/mlir/IR/BuiltinTypes.h

+ Some minor tweaks

banach-space

Lovely clean-up, thank you Ben!

Local branch amd-gfx b9f9dea Merged main:675231eb09ca into amd-gfx:a6e32de5b204 Remote branch main b0b8e83 [mlir] Fix use-after-free bugs in {RankedTensorType|VectorType}::Builder (llvm#68969)

MacDue requested review from c-rhodes, banach-space and nicolasvasilache October 13, 2023 09:45

llvmbot added mlir:core MLIR Core Infrastructure mlir labels Oct 13, 2023

c-rhodes reviewed Oct 16, 2023

View reviewed changes

mlir/include/mlir/IR/BuiltinTypes.h Outdated Show resolved Hide resolved

MacDue added 4 commits October 16, 2023 10:41

Fixup asserts and add some simple tests

1ef6ebb

Add a few more tests

6a462fd

Fix some unsafe std::initializer_list list usages in tests

1fb8e9a

Easy to mess up 🤦

Move builder COW pattern to safe CopyOnWriteArrayRef<T> class.

e7ca4a9

This is a safer helper class that avoids the issues with the previous manual approaches. It also cleans up the builder implementations.

banach-space reviewed Oct 17, 2023

View reviewed changes

mlir/include/mlir/IR/BuiltinTypes.h Outdated Show resolved Hide resolved

joker-eph approved these changes Oct 17, 2023

View reviewed changes

mlir/include/mlir/IR/BuiltinTypes.h Outdated Show resolved Hide resolved

Move CopyOnWriteArrayRef to mlir/Support/ADTExtras.h

1e6248d

+ Some minor tweaks

banach-space approved these changes Oct 18, 2023

View reviewed changes

MacDue merged commit b0b8e83 into llvm:main Oct 18, 2023
3 checks passed

MacDue deleted the ub_fix branch October 18, 2023 15:02

madhur13490 mentioned this pull request Oct 20, 2023

Revert commit ba8565fbcb975e2d067ce3ae5a7dbaae4953edd3 madhur13490/llvm-project#3

Closed

banach-space mentioned this pull request Oct 24, 2023

[mlir][vector] Add scalable vectors to tests for vector.contract #70039

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[mlir] Fix use-after-free bugs in {RankedTensorType|VectorType}::Builder #68969

[mlir] Fix use-after-free bugs in {RankedTensorType|VectorType}::Builder #68969

MacDue commented Oct 13, 2023 •

edited

llvmbot commented Oct 13, 2023 •

edited

banach-space left a comment

MacDue commented Oct 17, 2023

MacDue commented Oct 17, 2023

joker-eph left a comment

banach-space left a comment

[mlir] Fix use-after-free bugs in {RankedTensorType|VectorType}::Builder #68969

[mlir] Fix use-after-free bugs in {RankedTensorType|VectorType}::Builder #68969

Conversation

MacDue commented Oct 13, 2023 • edited

llvmbot commented Oct 13, 2023 • edited

banach-space left a comment

Choose a reason for hiding this comment

MacDue commented Oct 17, 2023

MacDue commented Oct 17, 2023

joker-eph left a comment

Choose a reason for hiding this comment

banach-space left a comment

Choose a reason for hiding this comment

MacDue commented Oct 13, 2023 •

edited

llvmbot commented Oct 13, 2023 •

edited