Move packaging process into own plugin #1486
Comments
Add zipping of handler directories. Add a walkDirSync method which collects all the file paths in a directory (needed for the .zip file creation for "magic handlers") in a synchronous way.
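For illustration, a minimal synchronous directory walk in Node.js might look like the sketch below; this is illustrative only, not the framework's actual implementation, and the names are assumptions:

```js
// Illustrative sketch of a synchronous directory walk -- not the actual framework code.
const fs = require('fs');
const path = require('path');

// Recursively collect every file path below dirPath.
function walkDirSync(dirPath, filePaths = []) {
  fs.readdirSync(dirPath).forEach((entry) => {
    const fullPath = path.join(dirPath, entry);
    if (fs.statSync(fullPath).isDirectory()) {
      walkDirSync(fullPath, filePaths); // descend into subdirectories
    } else {
      filePaths.push(fullPath);
    }
  });
  return filePaths;
}

// Usage: const files = walkDirSync('handlers');
```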
@pmuens imho we should have a package step. This makes sure other plugins can hook in there as well. And the way to let following plugins know which artefact path they should use could simply be a configuration value in the serverless.yml, e.g.
By default people don't set it; the packaging plugin zips the service and puts it somewhere to be picked up. But if users want to build their own artefact they can, and if the artefact_path variable is present the packaging plugin will simply do nothing.
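As a rough sketch of that opt-out (the artefact_path property name and the config shape are placeholders taken from this discussion, not a finalized API):

```js
// Illustrative only: decide whether the packaging step should do anything.
function resolveArtefact(serviceConfig, zipService) {
  if (serviceConfig.artefact_path) {
    // User built their own artefact -> the packaging plugin does nothing,
    // later plugins simply pick the artefact up from this path.
    return serviceConfig.artefact_path;
  }
  // No artefact configured -> zip the service and hand the result on.
  return zipService(serviceConfig);
}

// Usage sketch:
const artefact = resolveArtefact(
  { artefact_path: 'build/my-service.zip' },
  () => '.serverless/my-service.zip'
);
console.log(artefact); // -> 'build/my-service.zip'
```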
@flomotlik Sounds reasonable. Should the packaging be part of the deploy lifecycle? Or should we define a separate package command?
Imho it has to be in deploy, otherwise it won't run, as people would have to run package:package directly. But we can implement a separate package command in the future imho.
Yep, missed that. Here's how this might look:
Not sure if we should name it …
@pmuens @flomotlik The only downside I see to this is the size of the zip file... For example, one of the current functions we have written in Java with all the jars is 38MB, and the limit is 50MB... Would we have to separate out into different services and then keep track of which lambda function has which handler? There doesn't seem to be a logical way to separate out the functions and services... Also, it would seem like the function type would have to be per service, not per function, is that correct?
@license2e Is that only one function, or are those 38MB a combination of code for multiple functions? We do want to provide a way in the future for you to build function-specific packages, so it would be great to understand a little more about how you're reusing code in services.
Unfortunately one function, although it's currently pulling in ALL AWS SDK jars, so I will have to tweak the Gradle zip settings to only pull in the ones that are actually used...
While I like the simplicity this introduces (for us and for our users), this will make deployment exponentially slower, and will eat up lots of space both on S3 and Lambda, which is the main reason why we went with magic handlers in the first place. If a service has 10 functions, we'll end up deploying the entire service 10 times. Magic handlers allow users to organize their functions and their dependencies in such a way as to minimize including code that is not used by the lambda/function. If all functions have the same code footprint (most of which is not used by that function), then a service is eventually just one big fat function.
@eahefnawy I agree; with the setup we have and the number of functions we are writing (~1000), we will need those magic handlers. Otherwise, we will have to arbitrarily group certain functions together into a service and manually keep track of which service has which function.
At first, I was nervous about this. This potentially introduces problems which a lot of people complained about in V.0. However, I think they can be resolved, and this could ultimately be a step in a better direction. Concerns:
Solutions:
On a side note, I personally favor …
@ac360 I am still nervous about this, for the exact concerns you listed above, and for the folder structure that we will have to adopt.
Hey guys, the main reason we're doing this is to provide a more general way of packaging which can be used for every provider and every language.

With this change we will zip up the service (which is basically a composition of functions) once, upload it to S3 and let all functions point to the one, service-based .zip file. The .zip file might be a little bit bigger, but at the end of the day it's just one .zip file which still contains the functions, compared to multiple .zip files which contain one function per file (so the size should be "nearly" the same).

Another important thing this introduces is a new lifecycle event that plugin authors can hook into, so you can modify the package before it gets deployed. We're also heavily investing in good … We'll test the new package plugin in a real-world scenario to see how it performs. I like the …
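To make the "hookable lifecycle" point concrete, a plugin that modifies the package before deployment could look roughly like the sketch below; the hook name here is an assumption for illustration, not the final lifecycle event:

```js
// Sketch of a plugin hooking into a (hypothetical) packaging lifecycle event.
class ModifyPackagePlugin {
  constructor(serverless, options) {
    this.serverless = serverless;
    this.options = options;
    this.hooks = {
      // Assumed hook name: run after the service was zipped, before it is deployed.
      'after:package:createDeploymentArtifacts': () => this.tweakArtifact(),
    };
  }

  tweakArtifact() {
    // e.g. re-zip with extra files, strip dev dependencies, sign the artifact, ...
    this.serverless.cli.log('Adjusting the deployment package before deploy');
  }
}

module.exports = ModifyPackagePlugin;
```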
Thanks for the clarity @pmuens. Now that I understand this better, and that all functions in a service point to a single zip, this change results in a massive problem: it removes the ability of a function to be the unit of deployment.

By having multiple functions point to the same zip file, you will be forced to deploy functions in bulk if they are in a service, and not deploy/update them individually. The function as the unit of deployment is one of the much-touted benefits of the serverless architecture. It leaves devs in the most agile position. This change would make that impossible for services containing multiple functions. To fix one function, you would have to deploy the other functions as well, risking those breaking. For example, … Further, this wouldn't allow …

Overall, the costs of this approach appear to greatly outweigh the benefits, and it clearly diverges from serverless architecture principles. The current implementation, though slower, is much better, as it retains the "function as the unit of deployment" philosophy, especially when we add an option to target a specific function within a service to deploy. Also, a new …

Recommended Steps Forward:
@ac360 thanks for the detailed explanation. I understand the upsides and downsides of this approach. Could @flomotlik and @eahefnawy chime in on this?
@ac360 makes good points here. It seems that we're returning to the traditional monolithic server approach. For example, a bug in function A would cause function B to break since they all share the same buggy code/zip file. That's not what lambda/serverless was made for imo. Magic handlers and dependency management give users the freedom to architect their apps however they want. Maybe they could use GraphQL if they wanna be monolithic, but at least the function would still be isolated from any potential issues in other functions.

Also, after reading @pmuens's clarification above, I think I initially misunderstood the proposition. If I understand this correctly now, we'll zip and deploy the service only once and simply set a different handler for each function, but ultimately all functions will point to the same zip. Is that correct? In that case you can discard what I mentioned about slowness and having to deploy x times. I thought the same zip would be deployed for each function.

Ultimately, I love the simplicity of this approach, but I'm still a bit worried that users won't have the flexibility they had in v0 with deploying their functions and choosing their dependencies.
Further drawbacks of the new approach:
This new approach still seems like a huge divergence that disables core functionality and goes against best practices. Strongly in favor of bringing back support for single-function deployment. Also, @eahefnawy I'm not as concerned about magic handlers as before, since zipping things up at the service level is still smaller than at the project level, especially if you can specify files to exclude at the function level. I would be OK w/ delaying magic handlers and waiting to see if that is needed within the service level. Maintaining functions as the unit of deployment is the immediate priority, imo.
Imo functions are not the unit of deployment. We're helping teams build services, and those services are the main entity we're deploying for users. When functions get deployed to production from a CI/CD environment, deploying functions independently (and sometimes even not deploying certain functions) can be very dangerous:
While hashing can potentially resolve some of those issues, it introduces a lot of complexity in our deployment, can be the source of huge problems when there are bugs, and also isn't necessarily the most efficient way for us to deploy. When we only build one zip file, push it to S3 and then deploy it through CloudFormation, we only have to do one upload and CloudFormation can use that artefact for all deployed lambda functions.

Deploying functions independently is important for the development workflow, as users will want to deploy independently during development. But not from a CI/CD perspective, and that is what we need to implement first; then we can improve. But the most important reason why we're not deploying independently is that while it does work for Node, it simply won't work for other languages. For Java, for example, building separate zip files for each function with only the code necessary for that function is practically impossible.

Individual rollback would still be possible with the versions available in Lambda, but imho individual rollback is something we should absolutely discourage, as it can very easily lead to issues with different expectations implemented in different functions and the whole system then failing (e.g. different code expects different fields in the database to be available).

If users need to create specific deployment files for specific functions (e.g. 4 of 5 functions can live with the same zip, but function 5 needs to include a few additional things), we can (and I assume will) absolutely implement this.

Those reasons are why we decided to move the packaging out of AWS, generalise it and implement it in a way that makes more sense for CI/CD and makes our code much simpler.
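To illustrate the "one upload, many functions" point, the generated CloudFormation would simply point every function's Code at the same S3 object, roughly like the fragment below (shown as a JS/JSON object with made-up resource names, bucket and key):

```js
// Illustrative CloudFormation fragment: two functions sharing one uploaded artefact.
module.exports = {
  HelloLambdaFunction: {
    Type: 'AWS::Lambda::Function',
    Properties: {
      Handler: 'handlers/hello.handler',
      Runtime: 'nodejs4.3',
      Role: { 'Fn::GetAtt': ['IamRoleLambda', 'Arn'] },
      Code: { S3Bucket: 'my-service-deployment-bucket', S3Key: 'artifacts/my-service.zip' },
    },
  },
  WorldLambdaFunction: {
    Type: 'AWS::Lambda::Function',
    Properties: {
      Handler: 'handlers/world.handler',
      Runtime: 'nodejs4.3',
      Role: { 'Fn::GetAtt': ['IamRoleLambda', 'Arn'] },
      // Same bucket and key -- only the handler differs per function.
      Code: { S3Bucket: 'my-service-deployment-bucket', S3Key: 'artifacts/my-service.zip' },
    },
  },
};
```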
@flomotlik can you elaborate on this:
Are you saying that it can still live within the service, but would have a different deploy command or setting (e.g. functions:function5:standalone = true)? Or are you saying that it would just have to be a separate service altogether?
@ryansb yeah I think that is the best way, though I would also call it package. The way I see this going forward (and we've discussed it before the implementation) is that by default you have one package that is used for every function, but you can use the same include/exclude parameters on each function as well. Every function that doesn't have include/exclude/artifact parameters will use the default package; every other function will get its own artefact.

One thing I would like to get more feedback on is whether all of you consider this a common use case or a rare optimisation. Personally I think single-function artefacts are an optimisation on top of the general package that gets created for the whole service. So for the few functions or services where you need tighter control you can get it, but my assumption is that for most functions you don't need it. So from an implementation perspective here is how I see this going forward:
One of the changes we also need to make in the packaging plugin is how many artefacts we keep around. Currently we only keep the last 5 versions inside of the S3 bucket. We should keep that, but we need to scope it per function, so we keep the last 5 around for each artefact (service-wide and function-specific) and don't just remove random old artefacts. /cc @erikerikson @eahefnawy @pmuens

Also, going forward we're going to change how we deal with issues that are still being discussed. In the past we closed them when we implemented something but kept the discussion going. From now on we're going to label them discussion and keep them open. I didn't want to make it seem like we're closing the discussion by closing the issue, so I'm reopening this now.
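Regarding the retention rule above, a per-artefact cleanup could be sketched like this (aws-sdk v2 calls; the bucket layout and prefixes are made up for illustration):

```js
// Sketch: keep only the newest `keep` artefacts under a given prefix
// (one prefix per scope: service-wide or per-function).
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

async function pruneOldArtefacts(bucket, prefix, keep = 5) {
  const { Contents = [] } = await s3.listObjectsV2({ Bucket: bucket, Prefix: prefix }).promise();
  const stale = Contents
    .sort((a, b) => b.LastModified - a.LastModified) // newest first
    .slice(keep); // everything beyond the newest `keep`
  if (stale.length === 0) return;
  await s3.deleteObjects({
    Bucket: bucket,
    Delete: { Objects: stale.map(({ Key }) => ({ Key })) },
  }).promise();
}

// Called once per artefact scope so old versions of one artefact
// never push out versions of another:
// pruneOldArtefacts('my-deploy-bucket', 'artifacts/my-service/');
// pruneOldArtefacts('my-deploy-bucket', 'artifacts/my-service-functionFive/');
```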
I'd consider it a roughly 80/20 issue: 80% of serious users will never need this function-level splitting, but for 20% of serious users it would be a total dealbreaker not to have it, and they'd need to hack something together or skip using the framework. The per-function include/exclude in addition to the service-level ones would cover the use case I was discussing, and it seems to fit well in the overall design. IMO, for the case where an artifact is set up but so are include/exclude, it should be a hard error, because that's definitely not going to do what they intend.
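A sketch of that resolution rule, including the hard error for conflicting settings; the config shape and helper below are hypothetical:

```js
// Illustrative: pick the artefact for one function based on its (hypothetical) package config.
function resolveFunctionArtefact(fnConfig, serviceArtefactPath) {
  const hasOwnArtefact = Boolean(fnConfig.artifact);
  const hasIncludeExclude = Boolean(fnConfig.include || fnConfig.exclude);

  if (hasOwnArtefact && hasIncludeExclude) {
    // Combining a prebuilt artefact with include/exclude is almost certainly a mistake.
    throw new Error('Cannot combine "artifact" with "include"/"exclude" on the same function');
  }
  if (hasOwnArtefact) return fnConfig.artifact;                 // user-built package
  if (hasIncludeExclude) return buildFunctionPackage(fnConfig); // function-specific zip
  return serviceArtefactPath;                                   // default: service-wide package
}

// Hypothetical stand-in for the per-function zipping step.
function buildFunctionPackage(fnConfig) {
  return `.serverless/${fnConfig.name}.zip`;
}
```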
@flomotlik it almost sounds like you're trying to optimize for disk space and single-deployment upload time. The more constrained resource seems to be cold-start loading time. This seems like it risks forcing users to think about and get into the depths of how packages, deployments and container persistence work. This reminds me a bit of the "everything is Lambda"-centered choice earlier in the SLS lifecycle. It was an external concept/artifact brought in because it seemed like a bright idea, and it had the effect of blocking use cases as well as generating confusion and issues. One package per function seems to reflect more directly the thing that SLS is managing. Can you help me grok what goal(s)/value you're trying to serve or problem(s) you're trying to avoid?
@erikerikson the main goals are:
/cc @HyperBrain as well |
Thanks Flo. Bullets 1-5:
Last weekend, I tried out the new packaging process by deploying services with 1-20 functions. Overall, I really enjoyed it.

Out of the box, CF deployments are slower, but this change makes CF deployments faster. It took 30 seconds to update 20 functions when they are in a single service and each has its own API G endpoint. That's much faster than V.0 could do. With fewer functions in a service, provisioning via CF takes the same amount of time and is a bit slower than V.0; however, uploading them (a large time cost depending on the size of your functions) will be faster. On average, V.1 deployments will likely be a bit slower than V.0. But I think that's a good trade-off because V.1 deployments are all CF and are much safer. We used to run into weird collision issues with teams deploying to the same Lambda function simultaneously and trying to publish unique aliases.

I'm still concerned about updating multiple functions at once, but it seems like we can offer single-function deployments when necessary. So that's resolved.

Versioning is another concern. We'll be creating a lot of unnecessary Lambda versions if all functions of a service are always updated, even though their code may be unmodified. @flomotlik Has anyone put thought into versioning the actual CloudFormation stacks instead? We can still update the Lambda versions, but perhaps relying on CF Stack versions might offer a better experience. Given that a serverless service is code + resources, having versioning and rollback ability at the CF Stack level seems simple and powerful. We could easily download the previous versions from S3 and redeploy them (since they're packaged w/ their serverless.yml files!) to offer a safe rollback experience.

@erikerikson Great comments, as always. I actually believe there is a lot of alignment between @flomotlik's stated goals and what you are describing. Overall, we're offering a better default experience while still maintaining a high degree of flexibility here. So I think there is room for solutions to the points you brought up.

@ryansb Great suggestions/solutions. Thank you!
Not yet, but I like it. We could simply create a folder per deployment and then push artefacts and CF files into it.
What we used to do with project buckets in JAWS is append the stage and version to the S3 key, instead of using subfolders. We also put all Lambda zips in a subfolder within the bucket root, just in case people wanted to put other stuff in the bucket.
@ac360 we now only create one bucket per service/stage/region combination, but in there we could simply create folders for the different versions and put all the files in there.
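For illustration, a folder-per-deployment layout inside such a bucket could be built along these lines; the key scheme here is invented, not the actual one:

```js
// Illustrative only: one possible key layout inside a per-service/stage/region bucket.
function deploymentKey(deploymentTimestamp, fileName) {
  // e.g. serverless/1467020000000/my-service.zip
  //      serverless/1467020000000/cloudformation-template.json
  return ['serverless', String(deploymentTimestamp), fileName].join('/');
}

console.log(deploymentKey(Date.now(), 'my-service.zip'));
```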
@flomotlik Right! But if we add the ability to use a single bucket across multiple services (this feature might have already been added, not sure), keeping the stage in the name will protect us in that case.
@ac360 the most likely option for a single bucket across services is that people manage that bucket themselves imho. I see basically 2 options going forward for the buckets:
This might change if we get different feedback from users.
Funny how things come full circle :) I'm with @flomotlik's last comment. That's exactly how I had it architected in JAWS (you could pass the S3 bucket to use as a CLI opt when creating a project). Only thing I'd add is when you're naming the bucket, please include the region name in it. Ex: …
Yar! 2 is a must-have for us. We appreciated that we would give the bucket …
By convention, not mandate, yes/please?
One question: is the use of aliases and named versions for publishing lambda functions to environments (stages) still an option, instead of naming the lambda functions per stage? For us the number of functions will explode if every environment gets its own function. Supporting aliases and versions should be possible with CloudFormation using the AWS::Lambda::Version and AWS::Lambda::Alias CF types. With versions, even versions with custom names (maybe implied through the user's build system) may be used, and the environment (@erikerikson +1 for the naming) can be easily attached as an alias to that.
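For reference, publishing a version and attaching a stage alias with those CF types looks roughly like this (shown as a JS/JSON fragment; the logical resource names are made up):

```js
// Illustrative CloudFormation fragment: publish a Lambda version and alias it per stage.
module.exports = {
  HelloLambdaVersion: {
    Type: 'AWS::Lambda::Version',
    Properties: {
      FunctionName: { Ref: 'HelloLambdaFunction' },
    },
  },
  HelloLambdaAliasDev: {
    Type: 'AWS::Lambda::Alias',
    Properties: {
      FunctionName: { Ref: 'HelloLambdaFunction' },
      FunctionVersion: { 'Fn::GetAtt': ['HelloLambdaVersion', 'Version'] },
      Name: 'dev', // the stage, usable as the "Qualifier" when invoking
    },
  },
};
```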
@HyperBrain it's dropped for now, but I see your concern. Does your use case require handling stages with aliases? Or could we just add support for aliases/versions in general, and would that work for you? The issue with handling stages via aliases/versions that we've found over the months is that not all event sources and other AWS services are aware of this. Plus, when you deploy to a stage, you gotta navigate to that alias/version, and it confused some people who didn't see their deployment right away. @flomotlik any thoughts on this?
@eahefnawy Since the latest improvements in the AWS Lambda web console, it is far more centered on aliases than on versions, e.g. you now have the prominently placed "Qualifier" selection button in there. For us, we heavily depend on stages/aliases, as we use the Qualifier property for our lambda calls.
@HyperBrain Perfectly said 😊 ... let's see what @flomotlik thinks of this.
Yup, alias support is definitely an important one; we do have an issue open, I just can't seem to find it right now :D
@flomotlik we have an issue for Lambda versioning here: #1457
Closing this issue as #1777 is targeting the per-function packaging and @johncmckim is going to work on it.
The packaging process should be moved into its own plugin.

Status:
Done --> See #1494

This includes:
- Letting the user specify their own artefact in the serverless.yaml file (e.g. artefact_path). This way the deployment plugin will use the user-specified artefact and won't zip up everything on its own

Questions:
- Should it hook into the deploy plugin? Or should we use before:deploy:deploy?

Current implementation:
The current implementation can be found in the awsDeploy plugin.

Open issues:
- Write the generated deployment artifacts (CloudFormation) and the .zip file to the file system without deploying it (moved into own issue: Add DRYrun support #1496)
- Exclude src, pom and target when using Java --> We can define presets which will be automatically applied