-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Treat action code as attachments #3945
Conversation
Codecov Report
@@ Coverage Diff @@
## master #3945 +/- ##
==========================================
- Coverage 85.44% 80.71% -4.74%
==========================================
Files 146 146
Lines 7057 7056 -1
Branches 413 422 +9
==========================================
- Hits 6030 5695 -335
- Misses 1027 1361 +334
Continue to review full report at Codecov.
|
} | ||
} | ||
val attachment: Attachment[String] = { | ||
code |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I read this correctly, an undefined code
results in a DeserializationException? If that's the case, why even bother checking for the "binary" field above if code is None
? The binary field also feels redundant, is it ever used?
As I read it, this could be simplified to:
val (binary, attachment) = code match {
case Some(JsString(c)) =>
(isBinaryCode(c), attFmt[String].read(c))
case _ =>
throw new DeserializationException(s"'code' must be a string defined in 'exec' for '$kind' actions")
}
Am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Chatted with @chetanmeh, I was indeed missing something:
code
can be a JsObject, that breaks my simplification of course.
This PR is ready for review. One aspect which is still to be addressed is should we raise the default |
val (bytes, attachmentType) = if (binary) { | ||
(Base64.getDecoder().decode(code), ContentTypes.`application/octet-stream`) | ||
} else { | ||
(code.getBytes("UTF-8"), ContentTypes.`text/plain(UTF-8)`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please replace "UTF-8" with StandardCharsets.UTF_8
@@ -300,33 +300,28 @@ protected[core] object Exec extends ArgNormalizer[Exec] with DefaultJsonProtocol | |||
case None => throw new DeserializationException(s"kind '$kind' not in $runtimes") | |||
} | |||
|
|||
manifest.attached.map { _ => | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bad push, this is dead code (sorry I forgot to take that out of my patch)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aah ... missed checking prior to push. Fixed it now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chetanmeh, does this use the new database format for newly created actions? If so does this break any existing artifact stores?
@dubee new actions would be stored with {
"code": {
"attachmentName": "couch:f41be90d-5656-47f5-9be9-0d5656f7f555",
"attachmentType": "application/java-archive",
"length": 32768,
"digest": "sha256-5a373715b8cdcd608059debe9dae2ad95087eb5322321d1d0b862f8330a0e54d"
}
} Is that you were looking for or you meant something else? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chetanmeh, you may want to alert the dev-list about this change as it is rather significant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PG2 3498 ⏳
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chetanmeh, have you run performance tests against this to detect any cold start regressions that may arise from the double database fetch?
Ignore contentType info from manifest
c70724d
to
b142bc1
Compare
"attachmentName": "jarfile", | ||
"attachmentType": "application/java-archive" | ||
"attachmentName": "codefile", | ||
"attachmentType": "text/plain" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won’t break pre-existing java actions I presume.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes attachmentName
and attachmentType
are not used in actual flow. ContentType is just stored but not interpreted so far for any purpose
} getOrElse { | ||
.map { _ => | ||
// java actions once stored the attachment in "jar" instead of "code" | ||
val code = obj.fields.get("code").orElse(obj.fields.get("jar")).getOrElse { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can remove this code or jar (separately). It’s been a long time since I made this schema change.
s"finding attachment '[\\w-/:]+' of document 'id: ${action.namespace}/${action.name}").mkString("(?s).*") | ||
val nodeAction = WhiskAction(namespace, aname(), jsDefault(nonInlinedCode(entityStore)), Parameters("x", "b")) | ||
val swiftAction = WhiskAction(namespace, aname(), swift3(nonInlinedCode(entityStore)), Parameters("x", "b")) | ||
val actions = Seq((javaAction, JAVA_DEFAULT), (nodeAction, NODEJS6), (swiftAction, SWIFT3)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there defaults for the other runtimes to use as well? Node 6 and Swift 3 might have short shelf life. Not that imprtant can fix later if there isn’t a default.
@@ -1063,6 +1073,73 @@ class ActionsApiTests extends ControllerTestCommon with WhiskActionsApi { | |||
stream.reset() | |||
} | |||
|
|||
it should "ensure old and new action schemas are supported" in { | |||
implicit val tid = transid() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! Thanks.
| "code": { | ||
| "attachmentName": "foo:bar", | ||
| "attachmentType": "application/java-archive", | ||
| "length": 32768, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this test.
lgtm. As for the performance impact, sure two fetches will be slower than one. The size of the inline chunk could be controlled to tune the performance hit (ie based on expected size of code for example, or set it to 50MB to recover existing behavior). Also whatever comparison in performance will likely not match a production environment and hence skewed. Lastly this is the direction we need to go toward for streaming in any case. |
If env CODE_SIZE is set then code would be padded with space * CODE_SIZE
I have sent a mail on dev sometime back about this change ... would that be ok?. I can also highlight this on wednesday call
@dubee I modified the
I tried to run the tests but the numbers are not stable
Numbers for
Below is aggregated stats As these are run on a local setup with all OpenWhisk component running on same host (client on another) they would not convey any real value. To get true sense of impact we would need to run same test on a more prod like setup where CouchDB is remote and also bit populated otherwise it would be streaming stuff from memory! Unfortunately I do not have access to such a setup so would not be able to get such numbers. |
@chetanmeh, I'll run some performances tests on this. |
What are you hoping to show? The real factor is the size of the Inlined code. If you have to do a fetch for the attachment it’s self evident the cost is additive. Having the size of the average code size and perhaps one std dev would be more informative imo. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PG5 588 ⏳
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chetanmeh, should we update runtimes.json for php to support attachments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PG2 3498 🔵
@dubee Good point. Updated php runtime entry also to use attachment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran some performances tests on these changes. For testing, I lowered the attachment size threshold from 16K to 0KB, so that all action code would be cached. The existing performance tests that we have all passed.
Going to merge this one.
Thanks a lot for taking this over for me @chetanmeh, much appreciated!!
* Treat action code as attachments for all newly created or update action runtimes * Store base64 encoded streams in raw form * Add test for exec deserialization compatibility * Support code stored in jar property * Use octet stream type for binary and text/plain otherwise Ignore contentType info from manifest * Reword exception message to state that code can be string on object * Drop rewrite of jar files as base64 encoded file * Remove unnecessary special handling of java code * Fix expected attachmentType * Add some more compat tests * Simplify json deserialization based on Markus patch * Use std charset * Add support for tweaking code size via CODE_SIZE If env CODE_SIZE is set then code would be padded with space * CODE_SIZE * Use attachment for php runtime also
Treats action code as attachment for all kinds in default runtime.
Supersedes #3286.
Description
This PR builds on #3286 and enables use of attachment mode for all kinds. Some key aspects are
attachmentType
property of runtime.json is now not used. Instead one of the below is usedtext/plain(UTF-8)
- For plain string which are not base64 encodedapplication/octet-stream
- For codes which are Base64 string. Such codes are stored in raw form in persistent storeNote that with support for attachment inlining (#3709) it may happen that small code is still stored as part of same document (but in attachment object format)
Forms of code
Code related to an action can be stored in following forms
String property
For Blackbox action also the code is stored as string property
Attachment
Potential Performance Impact
Currently for non java kinds the code is stored as part of same document. With this change the code may be (if size >
max-inline-size
) stored as attachment. Due to this in case of action invocation if there is a cache miss then it would require 2 remote calls to load the whole action compared to 1 remote call currently.Note that post #2855 the attachments are now also cached. So if more new containers need to be spinned then they would not incur any further performance impact till action is cached.
Benefits
With this change it would be possible to store larger codes for setups using store which place restriction on maximum document size.
Further it also improves the heap usage for normal cases where the code is not required to be loaded. See #2847
TODO
moveCodeToAttachment.py
jar
property nameapplication/octet-stream
for binary code. Even for jars andtext/plain
for non binary codeRelated issue and scope
My changes affect the following components
Types of changes
Checklist: