Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store reused expressions in intermediate variables instead of reevaluating them in forward mode. Remove unnecessary cloning in forward mode. Refactor DerivativeBuilder.cpp #80

Merged

Conversation

efremale
Copy link
Contributor

Forward mode now supports storing expression results in intermediate
variables. This allows to avoid reevaluation if derived expressions are
used on several places. For example, for:

double t = std::sin(x) * std::cos(x);

We now produce:

double _t0 = std::sin(x);
double _t1 = std::cos(x);
double _d_t = sin_darg0(x) * (_d_x) * _t1 + _t0 * cos_darg0(x) * (_d_x);
double t = _t0 * _t1;

Instead of:

double _d_t = sin_darg0(x) * (_d_x) * cos(x) + sin(x) * cos_darg0(x) * (_d_x);
double t = sin(x) * cos(x);

which allows us to avoid reevaluation of sin and cos.

This commit solves the issue #47 in case of forward mode.
For example, when deriving BinaryOperator, instead of cloning its whole
tree and visiting it (potentially cloning the subtrees again), e.g.

auto ClonedBO = Clone(BO); // both LHS and RHS are cloned
auto LHS = ClonedBO->getLHS();
auto LHSDiff = Visit(LHS) // LHS will be cloned inside again
auto RHS = ClonedBO->getRHS();
auto RHSDiff = Visit(RHS);
auto BODiff = ...// Build BODiff from LHSDiff and RHSDiff

we first visit the whole tree and rebuild the original operation from
the results of subtrees, e.g.

auto [LHS, LHSDiff] = Visit(BO->getLHS());
auto [RHS, RHSDiff] = Visit(BO->getRHS());
auto ClonedBO = BuildOp(OpKind, LHS, RHS);
auto BODiff = ...// Build BODiff from LHSDiff and RHSDiff.

Some refactoring was applied to DerivativeBuilder.cpp to support new way
of visiting. Some functions of DerivativeBuilder and Visitors were moved
to VisitorBase.

@efremale efremale force-pushed the forward-mode-store-intermediate branch from 812d32e to e2dc98e Compare August 12, 2018 17:06
return llvm::cast<T>(getStmt());
}
};

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced by StmtDiff class

}
};

static clang::SourceLocation noLoc{};
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved from DerivativeBuilder.cpp to DerivativeBuilder.h

clang::BinaryOperatorKind OpCode,
clang::Expr* L,
clang::Expr* R);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to VisitorBase class

}
/// Output a statement to the current block.
void emitToCurrentBlock(clang::Stmt* S) {
currentBlock().push_back(S);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved from ReverseModeVisitor to VisitorBase

// Shorthands that delegate their functionality to DerviativeBuilder.
// Used to simplify the code.
clang::Stmt* Clone(const clang::Stmt* S);
clang::Expr* Clone(const clang::Expr* E);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to VisitorBase

@efremale efremale force-pushed the forward-mode-store-intermediate branch 5 times, most recently from 9f32086 to 71ace05 Compare August 12, 2018 21:27
Copy link
Owner

@vgvassilev vgvassilev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are some comments from the partial review.

Overall I like the direction. This PR adds a few new features!

It seems that some of the tests are broken by the new functionality. We compute wrongly the derivatives of some functions. I suspect we are missing braces in few places to capture the right operations priorities.

I have added a few style comments inline. The gist is:

  • We should not overuse auto -- if the casual reader cannot deduce (easily) the type of the RHS of the expression we should not use auto.
  • We should not overuse brace-init -- this saves time but makes the code less readable because prevents us to 'inject' more meta information in the form of local variables.
  • We should indent only when it is necessary -- we should try to use the space on the same line as much as we can and break only if we need to.

data[0] = diff;
}

clang::Stmt* Stmt() { return data[1]; }
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name of the methods should contain a verb. This should be something like getOriginalStmt()...

Please update accordingly the rest of the names.

@@ -7,9 +7,11 @@
#ifndef CLAD_DERIVATIVE_BUILDER_H
#define CLAD_DERIVATIVE_BUILDER_H

#include "clang/Sema/Sema.h"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please order the includes alphabetically.

}

/// Get the latest block of code (i.e. place for statements output).
Stmts & currentBlock() {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you rename this to getCurrentBlock, getCurBlock (to be consistent in naming) or getActiveBlock (see comments below).

return m_Blocks.top();
}
/// Create new block.
Stmts & startBlock() {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally find beginBlock/endBlock conveying better the intent.

return CS;
}
/// Output a statement to the current block.
void emitToCurrentBlock(clang::Stmt* S) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In clang lingo emit is often used in the context of codegen. Could we rename the method to addToCurrentBlock or addToActiveBlock?

elseBody);
currentBlock().push_back(ifStmt);
auto ifStmt =
new (m_Context) IfStmt(m_Context,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move to prev line.

// CHECK-NEXT: int _t0 = x * y;
// CHECK-NEXT: int _t1 = _t0 * x;
// CHECK-NEXT: int _t2 = _t1 * 3;
// CHECK-NEXT: return 1 * y + x * _d_y * x + _t0 * 1 * 3 + _t1 * 0 * x + _t2 * 1;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not seem to produce 9x^2y.

// CHECK-NEXT: x *= 0.;
// CHECK-NEXT: return 1.;
// CHECK-NEXT: }
// FIXME: +=, etc. operators are not currently supported, ignore for now.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you open a ticket for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: #85

// CHECK-NEXT: int _t2 = _t1 * y;
// CHECK-NEXT: int _t3 = _t2 / y;
// CHECK-NEXT: int _t4 = _t3 * 3;
// CHECK-NEXT: return (((1 * x + x * 1 * x - _t0 * 1) / (x * x) * y + _t1 * _d_y * y - _t2 * _d_y) / (y * y) * 3 + _t3 * 0 * 3 - _t4 * 0) / (3 * 3);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look like this does not evaluate to 1 but to 1/3xy see here.

@@ -18,39 +18,40 @@ float g(float x) {
}

// CHECK: float g_darg0(float x) {
// CHECK-NEXT: return f_darg0(x * x * x) * (((1.F * x + x * 1.F) * x + x * x * 1.F));
// CHECK: float _t0 = x * x;
// CHECK-NEXT: f_darg0(_t0 * x) * (1.F * x + x * 1.F * x + _t0 * 1.F);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second term of the expression seems to evaluate to 2x^2*x instead of 3x^2.

@efremale efremale force-pushed the forward-mode-store-intermediate branch 3 times, most recently from bb23b07 to ddc36fd Compare August 18, 2018 22:15
Copy link
Owner

@vgvassilev vgvassilev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for improving the code!

I have added a few comments. The gradient computations seem to deviate from the reference files. I have not investigated if the new values are correct and the old wrong or vice versa.

Please add your new changes in a separate commit. This will save me time because I have validated all changes in the test suite. This way I will only look at the changes you made and re-verify the relevant tests. Before merging we will squash all in one.

@@ -8,8 +8,10 @@
#define CLAD_DERIVATIVE_BUILDER_H

#include "clang/AST/RecursiveASTVisitor.h"
#include "clang/Sema/Sema.h"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move it after the StmtVisitor.h please.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

clang::Expr* BuildOp(clang::UnaryOperatorKind OpCode, clang::Expr* E) {
return m_Builder.BuildOp(OpCode, E);
/// Get the latest block of code (i.e. place for statements output).
Stmts & getCurrentBlock() {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you put the & next to Stmts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}
/// Remove the block from the stack, wrap it in CompoundStmt and return it.
clang::CompoundStmt* endBlock() {
auto CS = MakeCompoundStmt(getCurrentBlock());
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we are suboptimal here in the case where we have only 1 statement in the block. Perhaps all current uses do not leave just a single statement in a block but let's protect it from our-future-selves.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In many cases, it is actually desirable to get a CompoundStmt, even if it has only 1 statement inside. E.g. when visiting another CompoundStmt (especially a function body).

clang::Stmt* BuildDeclStmt(llvm::MutableArrayRef<clang::Decl*> DS);

/// Builds a DeclRefExpr to a given Decl.
clang::Expr* BuildDeclRef(clang::VarDecl* D);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like Sema::BuildDeclRefExpr returns a DeclRefExpr, could you narrow the return type of the routine?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sema::BuildDeclRefExpr returns ExprResult (alias for ActionResult<Expr*>), and .get() on it returns Expr*. We can cast it to DeclRefExpr*, though it seems to be unnecessary since in most of the cases getting Expr* is sufficient.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a generic concept in Sema. Here we know what we expect. The return type narrowing has a lot of advantages from making the code clearer to detecting silent change of behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure that BuildDeclRefExpr always returns DeclRefExpr though? For example, BuildBinOp may return either BinaryOperator or CXXOperatorCallExpr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest commit narrows return types.

bool DirectInit = false);
/// Wraps a declaration in DeclStmt.
clang::Stmt* BuildDeclStmt(clang::Decl* D);
clang::Stmt* BuildDeclStmt(llvm::MutableArrayRef<clang::Decl*> DS);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you narrow the return type to clang::DeclStmt in both overloads?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly to the above, with StmtResult/Stmt*.

// CHECK-NEXT: return 0;
// CHECK-NEXT: else
// CHECK-NEXT: } else {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -15,10 +15,11 @@ int f_pow(int arg, int p) {
}

// CHECK: int f_pow_darg0(int arg, int p) {
// CHECK-NEXT: if (p == 0)
// CHECK-NEXT: if (p == 0) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

// CHECK_GRADIENT: float _t3 = (y - yc);
// CHECK_GRADIENT: float _t4 = (z - zc);
// CHECK_GRADIENT: float _t5 = (z - zc);
// CHECK_GRADIENT: return (1.F - 0.F) * _t1 + _t0 * (1.F - 0.F) + (0.F - 0.F) * _t3 + _t2 * (0.F - 0.F) + (0.F - 0.F) * _t5 + _t4 * (0.F - 0.F) - (0.F * r + r * 0.F);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, this yields different result.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to be the same.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I screwed up checking.

// CHECK_GRADIENT: float _t2 = (y - yc);
// CHECK_GRADIENT: float _t3 = (y - yc);
// CHECK_GRADIENT: float _t4 = (z - zc);
// CHECK_GRADIENT: float _t5 = (z - zc);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like _t5, _t3 and _t1 are redundant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the original function uses (x - xc) on several places, we get several temporaries for it. We do not check the original function for repeated expressions, it would add quite a huge computational overhead. Moreover, since it was acceptable for the original function to use the same expression repeatedly, it is presumably acceptable to do so in the derivative.

Copy link
Contributor Author

@efremale efremale Aug 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I.e. the original function is

return (x-xc)*(x-xc) + (y-yc)*(y-yc) + (z-zc)*(z-zc) - r*r;

instead of

auto t0 = (x - xc);
auto t1 = (y - yc);
auto t2 = (z - zc);
return t0 * t0 + t1 * t1 + t2 * t2 - r * r;

Avoiding redundant temporaries would require us to transform former to later, which is computationally expensive in general, we should hope that the one who writes the original function takes care of this.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let's resolve this by opening an issue to document this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: #88

// CHECK_GRADIENT: float _t3 = (y - yc);
// CHECK_GRADIENT: float _t4 = (z - zc);
// CHECK_GRADIENT: float _t5 = (z - zc);
// CHECK_GRADIENT: return (0.F - 0.F) * _t1 + _t0 * (0.F - 0.F) + (0.F - 0.F) * _t3 + _t2 * (0.F - 0.F) + (1.F - 0.F) * _t5 + _t4 * (1.F - 0.F) - (0.F * r + r * 0.F);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to be the same.

@efremale efremale force-pushed the forward-mode-store-intermediate branch 4 times, most recently from d7f60c1 to 818d743 Compare August 19, 2018 18:22
@@ -169,7 +169,12 @@ namespace clad {
template <std::size_t N>
void diag(clang::DiagnosticsEngine::Level level, // Warning or Error
const char (&format)[N],
llvm::ArrayRef<llvm::StringRef> args = {},
llvm::ArrayRef<llvm::StringRef> args,
clang::SourceLocation loc = {});
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We must not provide easy-to-use API for emission of diagnostics with no source location. This makes them quite useless.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@efremale efremale force-pushed the forward-mode-store-intermediate branch 3 times, most recently from d4be164 to f7dd27a Compare August 20, 2018 18:58
Forward mode now supports storing expression results in intermediate
variables. This allows to avoid reevaluation if derived expressions are
used on several places. For example, for:
```
double t = std::sin(x) * std::cos(x);
```
We now produce:
```
double _t0 = std::sin(x);
double _t1 = std::cos(x);
double _d_t = sin_darg0(x) * (_d_x) * _t1 + _t0 * cos_darg0(x) * (_d_x);
double t = _t0 * _t1;
```
Instead of:
```
double _d_t = sin_darg0(x) * (_d_x) * cos(x) + sin(x) * cos_darg0(x) * (_d_x);
double t = sin(x) * cos(x);
```
which allows us to avoid reevaluation of sin and cos.

This commit also removes unnecessary cloning in forward mode, solving the issue vgvassilev#47
in case of forward mode.
For example, when deriving BinaryOperator, instead of cloning its whole
tree and visiting it (potentially cloning the subtrees again), e.g.
```
auto ClonedBO = Clone(BO); // both LHS and RHS are cloned
auto LHS = ClonedBO->getLHS();
auto LHSDiff = Visit(LHS) // LHS will be cloned inside again
auto RHS = ClonedBO->getRHS();
auto RHSDiff = Visit(RHS);
auto BODiff = ...// Build BODiff from LHSDiff and RHSDiff
```
we first visit the whole tree and rebuild the original operation from
the results of subtrees, e.g.
```
auto [LHS, LHSDiff] = Visit(BO->getLHS());
auto [RHS, RHSDiff] = Visit(BO->getRHS());
auto ClonedBO = BuildOp(OpKind, LHS, RHS);
auto BODiff = ...// Build BODiff from LHSDiff and RHSDiff.
```

Some refactoring was applied to DerivativeBuilder.cpp to support new way
of visiting. Some functions of DerivativeBuilder and Visitors were moved
to VisitorBase.
@vgvassilev vgvassilev merged commit 3a058aa into vgvassilev:master Aug 21, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants