Binary size with stratosphere #95

Closed
utdemir opened this issue Jun 3, 2018 · 5 comments

@utdemir

utdemir commented Jun 3, 2018

Recently I noticed that Stratosphere was responsible for a large percentage of my binary size.

For example, the executable below is 3.4 MB (with statically linked Haskell libraries, as is the default in GHC), but if you uncomment the line that pulls Stratosphere into the executable, it immediately becomes 36 MB.

module Main where

import Control.Lens
import Data.Aeson
import Stratosphere

main :: IO ()
main = do
  print $ encode ()
  print $ Just () ^? _Just
  -- print $ template (Resources [])

I am aware that for most use cases this is not a big issue, but I am working on a distributed application where I need to ship the binary around, and this is my main problem with this otherwise excellent library.

I think one of the reasons might be the way ResourceProperties is structured. Since it is a sum type of every possible resource, just depending on it pulls the whole thing to the executable. I can not really think of a solution, but maybe using a typeclass based approach one can just import the resources they need (like amazonka, I think) or we can have an intermediate data type that we first convert the parameters to.
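A minimal sketch of the typeclass idea, to make it concrete. Everything here is hypothetical and not part of Stratosphere's API: the `IsResource` class, the `LambdaFunction` record, and `renderResource` are made up for illustration. The assumption is that each resource would live in its own module with its own instance, so an executable only references the resources it imports. Whether that actually shrinks the binary would depend on how the package is split and linked.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Aeson (Value, encode, object, (.=))
import Data.Text (Text)

-- Hypothetical class: each resource knows its CloudFormation type name
-- and how to render its own properties. No central sum type is needed.
class IsResource a where
  resourceTypeName :: a -> Text
  resourcePropertiesJSON :: a -> Value

-- Hypothetical, heavily simplified resource record (not Stratosphere's).
data LambdaFunction = LambdaFunction
  { lambdaHandler :: Text
  }

instance IsResource LambdaFunction where
  resourceTypeName _ = "AWS::Lambda::Function"
  resourcePropertiesJSON f = object ["Handler" .= lambdaHandler f]

-- Render any resource to the usual Type/Properties JSON shape.
renderResource :: IsResource a => a -> Value
renderResource r =
  object
    [ "Type" .= resourceTypeName r
    , "Properties" .= resourcePropertiesJSON r
    ]

main :: IO ()
main = print (encode (renderResource (LambdaFunction "index.handler")))
```

The trade-off is that nothing in this design can parse an arbitrary template back into typed resources, since there is no closed list of resource types to dispatch on.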

@jdreaver
Contributor

jdreaver commented Jun 3, 2018

The reason we have a big sum type and a single package is so we can read template files with types as well as write them. I'm considering abandoning that requirement though, since I'm not sure anyone does that very often. I'd love to retain some way to read templates in a type-safe way though, since splitting up this package could have a lot of benefits (compile time of dependencies, for one).

Currently the constructors for the big sum type have a type MyResourceProperties :: a -> ResourceProperties. I'm sure we could replace all of those with myResourceProperties :: a -> ResourceProperties and have ResourceProperties be some simple record type. Of course, then we need to actually decide how to split this package up ergonomically, and how to do that automatically.
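A rough sketch of what that replacement could look like, under assumptions: the field names `resourceType` and `resourceProps`, the `BucketProps` record, and the `bucketProperties` function are all hypothetical names chosen for illustration, not the library's actual definitions.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Aeson (ToJSON (..), Value, encode, object, (.=))
import Data.Text (Text)

-- Hypothetical stand-in for one resource's properties.
data BucketProps = BucketProps
  { bucketName :: Text
  }

instance ToJSON BucketProps where
  toJSON p = object ["BucketName" .= bucketName p]

-- A plain record instead of a sum type with one constructor per resource:
-- it only carries the type name and the already-encoded properties.
data ResourceProperties = ResourceProperties
  { resourceType :: Text
  , resourceProps :: Value
  }

instance ToJSON ResourceProperties where
  toJSON r =
    object
      [ "Type" .= resourceType r
      , "Properties" .= resourceProps r
      ]

-- A lowercase function takes the place of the old constructor,
-- keeping the a -> ResourceProperties shape.
bucketProperties :: BucketProps -> ResourceProperties
bucketProperties p = ResourceProperties "AWS::S3::Bucket" (toJSON p)

main :: IO ()
main = print (encode (bucketProperties (BucketProps "my-bucket")))
```

Because `ResourceProperties` no longer mentions every resource type, a function like `bucketProperties` could live in its own module or package. The cost is that reading a template back yields only an untyped `Value` for the properties.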

Another idea is to hard-code GHC options in the cabal file to not run optimizations and focus on saving on binary size. By default it will use -O1, but I wonder if -O0 and some other options will trim things down. This certainly doesn't need to be super fast to generate relatively small JSON files 😄

I'm sure this is all technically possible (and probably not even that hard), but I'll let others chime in with thoughts.


I'm curious about your use case. Do you have a distributed application that generates CloudFormation templates via stratosphere on the fly, so it needs stratosphere embedded? That sounds neat!

@utdemir
Author

utdemir commented Jun 3, 2018

Hi @jdreaver, thank you very much for your detailed answer.

Now I understand the reasoning behind the big sum type, thank you. I understand that being able to read the template files can be useful; so it is totally up to you to decide :).

You are right, the approach you mentioned would work perfectly for me. But then I have no idea how you'd implement the read logic there.

I tried compiling stratosphere with -O0; that reduced the binary size, but not by much: from 36M to 32M. I haven't fiddled with any other flags, though. But I think it might still be worthwhile unless we find a better solution.

As I said, I know that this issue is not a priority, and I am perfectly fine if you decide not to action it; thank you for spending your time on it :).


Here is my usage of stratosphere, specifically this function. That library uploads itself to S3 and uses stratosphere to create a small stack containing a Lambda function, which invokes the same binary on AWS Lambda. In theory I never use stratosphere again on the distributed machines, so I don't actually need to distribute it; but because of the way the rest of the codebase works I need to use exactly the same binary, so I cannot compile my clients separately without stratosphere.

My problem there is that binary size is a bit important for me because:

  1. I upload the binary to S3 first time you run the executable.
  2. Most of the time, every invocation of the lambda function downloads the binary from S3 again.

So having a small binary directly results in getting your result faster. Technically I don't use stratosphere extensively; my template is only about 30 lines, so I could just remove stratosphere and hard-code a JSON string. But I really like the ease of use and safety Stratosphere provides, so it'd be cool to keep using it :).

@utdemir
Author

utdemir commented Mar 11, 2019

After #118, I think we are one step closer to solving this issue. @jdreaver do you have plans for removing the ResourceProperties type in the future? I think it might solve this issue too.

@jdreaver
Contributor

Yeah I'm planning on removing that. I'll definitely need to do it in another PR and solicit feedback from folks, plus use it on all of our templates at Freckle to see how painful migrating is.

@jdreaver
Contributor

Hey @utdemir you mentioned this would be fixed after #121 was merged. Let me know if it isn't!
