Make Feral more amenable to SnapStart optimization? #299

Open
cb372 opened this issue Dec 20, 2022 · 1 comment

Comments

cb372 commented Dec 20, 2022

A few weeks ago AWS announced SnapStart, a feature to improve cold-start performance for JVM Lambdas. A few lines of config to make my Lambdas magically faster? Yes please!

With SnapStart, Lambda initializes your function when you publish a function version. Lambda takes a Firecracker microVM snapshot of the memory and disk state of the initialized execution environment, encrypts the snapshot, and caches it for low-latency access. When you invoke the function version for the first time, and as the invocations scale up, Lambda resumes new execution environments from the cached snapshot instead of initializing them from scratch, improving startup latency.

My understanding of that paragraph is that they will "initialize" by calling the constructor on the handler class, but they won't invoke the Lambda, i.e. call the handler method.

Unfortunately (if I'm understanding Feral's code correctly) it appears that Feral Lambdas won't benefit much from this optimization. As I understand it, Feral does pretty much nothing at class initialization time. The resource defined in def init is acquired the first time the Lambda is invoked, and then memoized for reuse by subsequent invocations. So the SnapStart snapshot will capture the JVM startup and a bit of classloading, but none of the work performed while acquiring the init resource.
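To make the distinction concrete, here is a minimal sketch in plain Scala (no Feral or cats-effect) contrasting work deferred to the first invocation with work done in the constructor. `InitDemo`, `Algebra`, `buildAlgebra`, `LazyHandler` and `EagerHandler` are hypothetical names for illustration only; the `lazy val` merely models the memoize-on-first-use behaviour described above and is not Feral's actual implementation.

```scala
object InitDemo {
  // Hypothetical stand-in for an expensive resource; not part of Feral.
  final class Algebra(val label: String)

  // Counts how many times the expensive build step has run.
  var builds = 0
  def buildAlgebra(): Algebra = { builds += 1; new Algebra("ready") }

  // Deferred: the build runs on the first call to `handle` and is then reused.
  // A snapshot taken right after construction would not contain this work.
  final class LazyHandler {
    private lazy val algebra = buildAlgebra()
    def handle(event: String): String = s"${algebra.label}:$event"
  }

  // Eager: the build runs in the constructor, so it has already happened
  // by the time a snapshot is taken after construction.
  final class EagerHandler {
    private val algebra = buildAlgebra()
    def handle(event: String): String = s"${algebra.label}:$event"
  }
}
```

Constructing a `LazyHandler` leaves `builds` unchanged until the first `handle` call, whereas constructing an `EagerHandler` increments it immediately.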

It would be nice if we could make everything that happens in IOSetup happen eagerly at class init time to take full advantage of SnapStart.

For now we can emulate this in user-land by eschewing def init and just doing a good old unsafeRunSync:

import cats.effect.unsafe.implicits.global // implicit IORuntime required by unsafeRunSync()

class MyLambda extends IOLambda.Simple[KinesisStreamEvent, INothing]:
  type Init = Unit

  private def buildAlgebra: IO[MyAlgebra[IO]] = ??? // the stuff that would usually go in `def init`

  // Evaluated eagerly when the handler class is constructed, before any invocation
  private val algebra: MyAlgebra[IO] = buildAlgebra.unsafeRunSync()

  override def apply(
      event: KinesisStreamEvent,
      context: Context[IO],
      init: Init
  ): IO[Option[INothing]] = algebra.process(event, context)

Disclaimer: I haven't done any benchmarking with SnapStart and Feral yet.

@armanbilge
Member

Thanks! @kubukoz recently brought this to my attention as well.

If someone wants to experiment with this I am certainly open to making changes that will improve the experience, although at the moment it's not really obvious to me why they would be necessary.

As you point out, if you have some stuff that you would like captured in the snapshot you can just define an ordinary val.

However, this is icky if the things you are initializing are in IO. But things are in IO because they are side-effects, and I am skeptical that you can snapstart these things at all.

For example, see the "compatibility considerations" in the snapstart documentation, which warn about:

Uniqueness: If your initialization code generates unique content that is included in the snapshot, then the content might not be unique when it is reused across execution environments.

Network connections: The state of connections that your function establishes during the initialization phase isn't guaranteed when Lambda resumes your function from a snapshot.

Temporary data: Some functions download or initialize ephemeral data, such as temporary credentials or cached timestamps, during the initialization phase.

https://docs.aws.amazon.com/lambda/latest/dg/snapstart.html#snapstart-compatibility
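The uniqueness concern can be sketched in plain Scala. This is a simplified model and not SnapStart itself: one object constructed once stands in for state captured in a snapshot and reused by every resumed environment. `UniquenessDemo`, `SnapshottedId` and `FreshId` are hypothetical names.

```scala
import java.util.UUID

object UniquenessDemo {
  // The ID is generated once during initialization. Anything resumed from
  // the same captured state would observe this same value.
  final class SnapshottedId {
    val id: UUID = UUID.randomUUID()
    def handle(): UUID = id
  }

  // The ID is generated inside the handler, so it is evaluated on every
  // invocation rather than at initialization time.
  final class FreshId {
    def handle(): UUID = UUID.randomUUID()
  }
}
```

Under this model, repeated calls to `SnapshottedId#handle` return the same UUID, while `FreshId#handle` produces a new one each time.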

All of these things are side-effects you acquire in IO and are unsurprisingly exactly the sort of thing you cannot or should not snapstart.

Meanwhile, I suspect the sorts of things you can safely snapstart will not be in IO (since they are not side-effects). In which case, you can just define a val :)
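As a rough illustration of the kind of initialization that fits in an ordinary val, the sketch below precomputes pure, deterministic data in the constructor. The names (`PureInitDemo`, `Handler`, `squares`, `idPattern`) are made up for the example, and whether a given piece of real-world state is safe to snapshot remains a judgment call.

```scala
object PureInitDemo {
  final class Handler {
    // Pure, deterministic values: the same on every run, so capturing them
    // at construction time raises none of the uniqueness, connection, or
    // staleness concerns quoted above.
    private val squares: Map[Int, Int] = (1 to 1000).map(n => n -> n * n).toMap
    private val idPattern = "[a-z]+-[0-9]+".r

    def square(n: Int): Option[Int] = squares.get(n)

    // True only if the whole string matches the pattern.
    def isValidId(s: String): Boolean = idPattern.pattern.matcher(s).matches()
  }
}
```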
