Force deabstraction of functions that take or return Tensors if referred in code. #19052
Conversation
@swift-ci please test tensorflow linux
@swift-ci please test tensorflow macos
Agreed that this is not optimal, but I think it should work in the near term. Please file a bug to track improving this. A sketch:

- Make our existing pass over functions, deabstracting as we go.
- Do a second pass where we deabstract all functions, regardless of whether they have tensor args or results. Many would be dead, but the indirectly used ones would now work.
The fix makes sense, unblocking lots of cross-module cases.
There is a more general problem, though: whenever there's a call to an opaque function that calls a protocol method (e.g. Equatable.==) passing tensor arguments, and Tensor's implementation of that protocol method is inlinable, execution will fail. This is because Tensor's protocol implementation never gets partitioned.
To resolve the bigger problem later down the road, partitioning generic functions should be supported and will require a major architectural change.
// Return true if it is referenced somewhere.
// TODO: There might be a better check other than fn->getRefCount() > 0.
// See the clean up code in TFDeabstraction::inlineCalls() function.
return (fn->getRefCount() > 0);
Unnecessary parentheses.
Done.
Having this as a short-term patch sounds good, but please add some unit tests.
To avoid breaking existing unit tests, one option is to flag-control the value of forceTFFunctions, and only turn it on in REPL mode. This will buy us time to rewrite the relevant unit tests -- if we don't want a function to be partitioned (e.g. if the func takes a string/int/float param, and that param is used in a tfop attr), we can mark it "inline always". Once we finish this on all tests, we can then retire that flag.
@@ -277,7 +277,11 @@ class TensorFunctionClassifier {
  /// example) for inlined functions that take and return tensors, since we
  /// know that they are either unreachable or will be inlined into any
  /// clients that use them.
  bool shouldBePartitioned(SILFunction *fn);
  ///
  /// If the flag forceTFFunctions is true, it forces partitioning of functions
the comment does not quite reflect the impl, because there are conditions that cause the function to return false, e.g. if (isAvailableExternally(fn->getLinkage())). Also when a function is marked inline always.
Done.
/// If the flag forceTFFunctions is true, it forces partitioning of functions
/// that operate on Tensor values irrespective of whether they are inlinable,
/// private, etc.
bool shouldBePartitioned(SILFunction *fn, bool forceTFFunctions = false);
since this func does not have many call-sites, it could be less error prone if we don't use a default value for forceTFFunctions.
Done.
// are functions that are *still* referred to in the code and operate on
// tensor values, but have not been partitioned. It can happen in the
// following case, for instance, where `foo` is an external function that
// takes a for which we do not have the body:
what is "takes a for"?
typo. Fixed the comment.
// following case, for instance, where `foo` is an external function that
// takes a closure for which we do not have the body:
//   main() {
//     foo() { $0 -= 0.1 * $1 }
can we turn something like this into a unit test?
Added a unit test in deabstraction_finished.swift
//
// (Note the body of a function may be missing when we are linking against a
// library or in the REPL context where the function was defined on a
// different line.)
nit: "different line" might not be the right description.
@marcrasi showed me earlier that you can copy/paste multi-line text into REPL, and that'll be executed as one "cell".
Calling it REPL line now. Not sure if that is the right terminology.
for (auto &fn : *module) {
  // Skip if it is already partitioned, or if it was ignored only because it
  // operated on tensor values.
  if (partitionedFunctions.count(&fn) > 0 ||
can you add a comment block motivating this two-round / two-pass design?
IIUC, in your example above, if we just use tfc.shouldBePartitioned(&fn, /*forceTFFunctions=*/true) in the first round, and make it the only round, we'll be able to partition the private closure, so it achieves the same correctness.
The two-round design is an optimization to minimize the set of functions to partition (would be nice to have a unit test that shows its effects -- have a function with no caller left after the first round finishes, and confirm it's not partitioned in the second round).
Added a few lines motivating the two round design.
@@ -631,7 +631,8 @@ tf::createTensorToInt1Inst(SILValue value, SILBuilder &builder,
  /// example) for inlined functions that take and return tensors, since we
  /// know that they are either unreachable or will be inlined into any
  /// clients that use them.
consider removing this comment block as it's now out of date (the new param is not documented).
(i'd also be fine with keeping it in sync with the corresponding comment block in the header file.)
Removed the block comment altogether.
@rxwei To help me understand this larger problem, can you provide a concrete example? (It might be useful to create a bug to track that.)
A common, concrete example is to try to use …

This is not a small bug, but a large hole in the architecture, in that there's no way to even reject it from the compiler. These issues are not important in the initial GPE because the library ecosystem using …

I'll definitely file a bug soon. I was also planning to write a detailed document describing the issue, but it's not the best time until we hit the tutorial goal.
@lattner, this PR already implements it in two passes as you outlined. Did I misunderstand your comment?
@bgogul, I believe the difference is: If we change the example from

```swift
main() {
  foo() { $0 -= 0.1 * $1 }
}
```

to

```swift
main() {
  foo() { #tfop("SomeOp", ...) }
}
```

this patch probably won't work. That is, if the private closure does not take or return tensors.

Another issue to take care of (both in Chris' sketch and this PR) is that if the private closure takes a string/int/float param to be used in a tfop attr, as in:

```swift
main() {
  foo() { (s: String, i: Int) -> () in #tfop("SomeOp", intAttr: i, strAttr: s) }
}
```

the private closure cannot be partitioned due to the const tfop attr requirement (as you observed). In that case it'd be useful to issue a good diagnostic message, asking the user to convert the closure to a function, so that it can be annotated with something like "inline always", and as such the compiler won't partition it.
In cases like this, it's also possible to make the closure function serialized (inlinable) by simply adding a …
Added unit tests and addressed comments. PTAL.
Nice!
This is a somewhat hacky patch to deal with SR-8589, related to the Jupyter tutorial. In REPL mode (both in the in-build swift and lldb -r), the SIL module for a REPL expr is thrown away after lowering to LLVM IR. Consequently, when processing subsequent lines, TFDeabstraction does not see the body of some functions being invoked, which causes some functions not to be partitioned.
This PR performs a second round of deabstraction (and subsequent partitioning) of functions taking and returning Tensor values if they are still referred to in the code after the first round.
BTW, this is not only relevant to the Jupyter tutorial. We will encounter this if we allow linking to other swift-tensorflow binaries. We will need to think about how to deal with such cases in general.
@mhong, when I simply removed the condition that checks if a function operates on tensors, some tests started failing as attributes were no longer const-evaluable. That is why I am sending this instead of what we discussed.