Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[mlir] Fix use-after-free bugs in {RankedTensorType|VectorType}::Builder #68969

Merged
merged 6 commits into from Oct 18, 2023

Conversation

MacDue
Copy link
Member

@MacDue MacDue commented Oct 13, 2023

Previously, these would set their ArrayRef members to reference their storage SmallVectors after a copy-on-write (COW) operation. This leads to a use-after-free if the builder is copied and the original destroyed (as the new builder would still reference the old SmallVector).

This could easily accidentally occur in code like (annotated):

// 1. `VectorType::Builder(type)` constructs a new temporary builder
// 2. `.dropDim(0)` updates the temporary builder by reference, and returns a `VectorType::Builder&`
//    - Modifying the shape is a COW operation, so `storage` is used, and `shape` updated the reference it
// 3. Assigning the reference to `auto` copies the builder (via the default C++ copy ctor)
//    -  There's no special handling for `shape` and `storage`, so the new shape points to the old builder's `storage`
auto newType = VectorType::Builder(type).dropDim(0);
// 4. When this line is reached the original temporary builder is destroyed
//    - Actually constructing the vector type is now a use-after-free
VectorType newVectorType = VectorType(newType);

This is fixed with these changes by using CopyOnWriteArrayRef<T>, which implements the same functionality, but ensures no
dangling references are possible if it's copied.


The VectorType::Builder also set the ArrayRef scalableDims member to a temporary SmallVector when the provided scalableDims are empty. This again leads to a use-after-free, and is unnecessary as VectorType::get already handles being passed an empty scalableDims array.

These bugs were in-part caught by UBSAN, see:
https://lab.llvm.org/buildbot/#/builders/5/builds/37355

Previously, these would set their ArrayRef members to reference their
storage SmallVectors after a copy-on-write operation. This leads to a
use-after-free if the builder is copied and the original destroyed (as
the new builder would still reference the old SmallVector).

The VectorType::Builder also set the ArrayRef<bool> scalableDims member
to a temporary SmallVector when the provided scalableDims are empty.
This again lead to a use-after-free, and is unnecessary as
VectorType::get already handles being passed an empty scalableDims
array.

These bugs were in-part caught by UBSAN, see:
https://lab.llvm.org/buildbot/#/builders/5/builds/37355
@llvmbot llvmbot added mlir:core MLIR Core Infrastructure mlir labels Oct 13, 2023
@llvmbot
Copy link
Collaborator

llvmbot commented Oct 13, 2023

@llvm/pr-subscribers-mlir-core

@llvm/pr-subscribers-mlir

Author: Benjamin Maxwell (MacDue)

Changes

Previously, these would set their ArrayRef members to reference their storage SmallVectors after a copy-on-write operation. This leads to a use-after-free if the builder is copied and the original destroyed (as the new builder would still reference the old SmallVector).

The VectorType::Builder also set the ArrayRef<bool> scalableDims member to a temporary SmallVector when the provided scalableDims are empty. This again lead to a use-after-free, and is unnecessary as VectorType::get already handles being passed an empty scalableDims array.

These bugs were in-part caught by UBSAN, see:
https://lab.llvm.org/buildbot/#/builders/5/builds/37355


Full diff: https://github.com/llvm/llvm-project/pull/68969.diff

1 Files Affected:

  • (modified) mlir/include/mlir/IR/BuiltinTypes.h (+24-19)
diff --git a/mlir/include/mlir/IR/BuiltinTypes.h b/mlir/include/mlir/IR/BuiltinTypes.h
index 9df5548cd5d939c..13fbae90b68c6cb 100644
--- a/mlir/include/mlir/IR/BuiltinTypes.h
+++ b/mlir/include/mlir/IR/BuiltinTypes.h
@@ -277,7 +277,7 @@ class RankedTensorType::Builder {
     if (storage.empty())
       storage.append(shape.begin(), shape.end());
     storage.erase(storage.begin() + pos);
-    shape = {storage.data(), storage.size()};
+    shape = {};
     return *this;
   }
 
@@ -287,12 +287,17 @@ class RankedTensorType::Builder {
     if (storage.empty())
       storage.append(shape.begin(), shape.end());
     storage.insert(storage.begin() + pos, val);
-    shape = {storage.data(), storage.size()};
+    shape = {};
     return *this;
   }
 
+  /// Returns the current shape.
+  ArrayRef<int64_t> getShape() {
+    return shape.empty() ? ArrayRef(storage) : shape;
+  }
+
   operator RankedTensorType() {
-    return RankedTensorType::get(shape, elementType, encoding);
+    return RankedTensorType::get(getShape(), elementType, encoding);
   }
 
 private:
@@ -319,20 +324,11 @@ class VectorType::Builder {
   /// Build from scratch.
   Builder(ArrayRef<int64_t> shape, Type elementType,
           unsigned numScalableDims = 0, ArrayRef<bool> scalableDims = {})
-      : shape(shape), elementType(elementType) {
-    if (scalableDims.empty())
-      scalableDims = SmallVector<bool>(shape.size(), false);
-    else
-      this->scalableDims = scalableDims;
-  }
+      : shape(shape), elementType(elementType), scalableDims(scalableDims) {}
 
   Builder &setShape(ArrayRef<int64_t> newShape,
                     ArrayRef<bool> newIsScalableDim = {}) {
-    if (newIsScalableDim.empty())
-      scalableDims = SmallVector<bool>(shape.size(), false);
-    else
-      scalableDims = newIsScalableDim;
-
+    scalableDims = newIsScalableDim;
     shape = newShape;
     return *this;
   }
@@ -351,9 +347,8 @@ class VectorType::Builder {
       storageScalableDims.append(scalableDims.begin(), scalableDims.end());
     storage.erase(storage.begin() + pos);
     storageScalableDims.erase(storageScalableDims.begin() + pos);
-    shape = {storage.data(), storage.size()};
-    scalableDims =
-        ArrayRef<bool>(storageScalableDims.data(), storageScalableDims.size());
+    shape = {};
+    scalableDims = {};
     return *this;
   }
 
@@ -363,12 +358,22 @@ class VectorType::Builder {
       storage.append(shape.begin(), shape.end());
     assert(pos < storage.size() && "overflow");
     storage[pos] = val;
-    shape = {storage.data(), storage.size()};
+    shape = {};
     return *this;
   }
 
+  /// Returns the current shape.
+  ArrayRef<int64_t> getShape() {
+    return shape.empty() ? ArrayRef(storage) : shape;
+  }
+
+  /// Returns the current scalable dims.
+  ArrayRef<bool> getScalableDims() {
+    return scalableDims.empty() ? ArrayRef(storageScalableDims) : scalableDims;
+  }
+
   operator VectorType() {
-    return VectorType::get(shape, elementType, scalableDims);
+    return VectorType::get(getShape(), elementType, getScalableDims());
   }
 
 private:

This is a safer helper class that avoids the issues with the previous
manual approaches. It also cleans up the builder implementations.
Copy link
Contributor

@banach-space banach-space left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall makes sense - thanks for the fix! `

CopyOnWriteArrayRef` feels like something that should be moved elsewhere in LLVM/MLIR (i.e. I see it being used/needed more widely). So not sure.

I think that this deserves some space in the summary:

Bug not triggered here:

// One builder is constructed. Updated by reference via the dropDim calls.
// Then converted to a vector type.
VectorType newType = VectorType::Builder(type).dropDim(0).dropDim(1);

Bug triggered here:

// One builder is constructed, updated by reference, then assigned 
// to auto, which ends up copying the builder. The original temporary
// builder is destroyed, and `shape` now points to junk.
auto newType = VectorType::Builder(type).dropDim(0).dropDim(1);
VectorType newVectorType = VectorType(newType);

mlir/include/mlir/IR/BuiltinTypes.h Outdated Show resolved Hide resolved
@MacDue
Copy link
Member Author

MacDue commented Oct 17, 2023

CopyOnWriteArrayRef` feels like something that should be moved elsewhere in LLVM/MLIR (i.e. I see it being used/needed more widely). So not sure.

I don't think CopyOnWriteArrayRef should be moved somewhere more general just yet though, since it's rather bare bones, implementing only what's needed for the builders.

@MacDue
Copy link
Member Author

MacDue commented Oct 17, 2023

(I've now put an annotated example of the issue in the summary)

Copy link
Collaborator

@joker-eph joker-eph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new CopyOnWriteArrayRef is just the nice abstraction here that makes this safe, thanks! :)

mlir/include/mlir/IR/BuiltinTypes.h Outdated Show resolved Hide resolved
Copy link
Contributor

@banach-space banach-space left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lovely clean-up, thank you Ben!

@MacDue MacDue merged commit b0b8e83 into llvm:main Oct 18, 2023
3 checks passed
@MacDue MacDue deleted the ub_fix branch October 18, 2023 15:02
Guzhu-AMD pushed a commit to GPUOpen-Drivers/llvm-project that referenced this pull request Oct 20, 2023
Local branch amd-gfx b9f9dea Merged main:675231eb09ca into amd-gfx:a6e32de5b204
Remote branch main b0b8e83 [mlir] Fix use-after-free bugs in {RankedTensorType|VectorType}::Builder (llvm#68969)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
mlir:core MLIR Core Infrastructure mlir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants