Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compute memory usage per location #538

Merged
merged 3 commits into from
Aug 29, 2023
Merged

Compute memory usage per location #538

merged 3 commits into from
Aug 29, 2023

Conversation

youben11
Copy link
Member

@youben11 youben11 commented Aug 7, 2023

No description provided.

Copy link
Member

@BourgerieQuentin BourgerieQuentin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's live review

#include <mlir/IR/Operation.h>
#include <mlir/Pass/Pass.h>

#include <concretelang/Support/CompilationFeedback.h>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So here and same reflexion for statistics, why not follow the common pass pattern?


namespace mlir {
namespace concretelang {
namespace Concrete {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This enter/exit pass could be factorized

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or even more the nested loop logic too

namespace concretelang {
namespace Concrete {

struct MemoryUsagePass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following the common pass pattern, this should be hidden in cpp and expose uniquely a createAnalysisConcreteMemoryUsage function.


namespace {

template <typename Op> std::string locationOf(Op op) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so all this things could be factorized

pass.iterations /= (uint64_t)numberOfIterations;
return std::nullopt;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

until here

return StringError(
"allocation of buffer with a non-supported element-type");
for (auto size : shape) {
if (size == mlir::ShapedType::kDynamic) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that actually happens?

continue;
// allocation inside the current location are computed separately
auto definingOp = operand.getDefiningOp();
if (definingOp && mlir::isa<memref::AllocaOp>(definingOp) &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess typo here, AllocOp not AllocaOp

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch!

return maybeBufferSize.error();
}
auto bufferSize = maybeBufferSize.value();
auto numberOfUsers = numberOfUsersInLocation(operand, op->getLoc());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its a bit hacky here and I would say it's not working (more than the rounding that could over estimate a bit the memory usage).

Let's consider this example :

%alloc_a = memref.alloc() : memref<...>
...

%alloc_b = memref.alloc() : memref<...>
scf.for ... {
  ...
  scf.for {
    %sub_a = memref.subview %alloc_a
    %sub_b = memref.subview %alloc_b

    op(%sub_a, %sub_b)
  }
}

Here the b location has as memory usage = %alloc_a + %alloc_b + %sub_a + %sub_b while its actually memory footprint should be just %alloc_a + %alloc_b. I think for each location you need to have a set of the actual buffers used, and for each operand of operations you should find back the actual origin of the buffer (i.e. a block argument or the allocation) and update the set.

Typically for the location b here you will have b => %alloc_a, %alloc_b

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I don't think we should consider subviews if we already considered the original buffer, as this shouldn't access a memory region that wasn't accessed before
  • The division is trying to avoid adding a memory buffer multiple times, just because it has been used by different operations, and in this case, subview is an operation that should be ignored, and thus, they aren't added by the ops themselves, but the op op. However, keeping the relationship between subviews and views isn't straightforward.

return error;
}
}
// ignore ops that have no memory effect
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure that is necessary to skip memory effect free ops... ??

}
}
// we already count allocations separately
if (definingOp && mlir::isa<memref::AllocOp>(definingOp) &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this code is not necessary and we should not have a specific handle of the AllocOp, that will make the code simplest, wdyt?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alloc is a simpler case than the generic function which needs to search for the defining op. And as we can add other simple operations in the future, it would be easier to add their respective functions (just like alloc), and ignore them here, instead of extending the logic here (adding two or three more operations here might start making it much more complex)

@youben11 youben11 merged commit 9e8c44e into main Aug 29, 2023
24 of 27 checks passed
@BourgerieQuentin BourgerieQuentin deleted the feat/stats-mem branch September 14, 2023 07:54
andidr added a commit that referenced this pull request Feb 7, 2024
…circuit parametrization

Since conversions at the edge of partitions determined by the
optimizer are specified at the producer of a value,
`MaterializePartitionBoundaryPattern` may agerly replace references to
the original value before conversion, resulting in conflicts.

This commit adds a check to the conflict handling routine that checks
if the producer of an operation with a conflicting operand is a
keyswitch operation that converts the original value of an extra
conversion with the required, non-conflicting type. If this is the
case, the conflict is handled by simply forwarding the original value
from the operand of the producing keyswitch operation.

Resolves Issue #538
(zama-ai/concrete-internal#538).
andidr added a commit that referenced this pull request Feb 7, 2024
…circuit parametrization

Since conversions at the edge of partitions determined by the
optimizer are specified at the producer of a value,
`MaterializePartitionBoundaryPattern` may agerly replace references to
the original value before conversion, resulting in conflicts.

This commit adds a check to the conflict handling routine that checks
if the producer of an operation with a conflicting operand is a
keyswitch operation that converts the original value of an extra
conversion with the required, non-conflicting type. If this is the
case, the conflict is handled by simply forwarding the original value
from the operand of the producing keyswitch operation.

Resolves Issue #538
(zama-ai/concrete-internal#538).
andidr added a commit that referenced this pull request Feb 7, 2024
…circuit parametrization

Since conversions at the edge of partitions determined by the
optimizer are specified at the producer of a value,
`MaterializePartitionBoundaryPattern` may agerly replace references to
the original value before conversion, resulting in conflicts.

This commit adds a check to the conflict handling routine that checks
if the producer of an operation with a conflicting operand is a
keyswitch operation that converts the original value of an extra
conversion with the required, non-conflicting type. If this is the
case, the conflict is handled by simply forwarding the original value
from the operand of the producing keyswitch operation.

Resolves Issue #538
(zama-ai/concrete-internal#538).
andidr added a commit that referenced this pull request Feb 20, 2024
…circuit parametrization

Since conversions at the edge of partitions determined by the
optimizer are specified at the producer of a value,
`MaterializePartitionBoundaryPattern` may eagerly replace references
to the original value before conversion, resulting in conflicts.

This commit adds a check to the conflict handling routine that checks
if the producer of an operation with a conflicting operand is a
keyswitch operation that converts the original value of an extra
conversion with the required, non-conflicting type. If this is the
case, the conflict is handled by simply forwarding the original value
from the operand of the producing keyswitch operation.

Resolves Issue #538
(zama-ai/concrete-internal#538).
andidr added a commit that referenced this pull request Feb 20, 2024
…circuit parametrization

Since conversions at the edge of partitions determined by the
optimizer are specified at the producer of a value,
`MaterializePartitionBoundaryPattern` may eagerly replace references
to the original value before conversion, resulting in conflicts.

This commit adds a check to the conflict handling routine that checks
if the producer of an operation with a conflicting operand is a
keyswitch operation that converts the original value of an extra
conversion with the required, non-conflicting type. If this is the
case, the conflict is handled by simply forwarding the original value
from the operand of the producing keyswitch operation.

Resolves Issue #538
(zama-ai/concrete-internal#538).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants