Rethink Region/Stage handling and Environment variables for each #1707

Closed · flomotlik opened this issue Jul 29, 2016 · 30 comments

@flomotlik
Contributor

Currently, region and stage handling is clunky, and the two depend on each other in a way that often doesn't reflect how users treat or think about different environments.

This leads to issues where stage/region have to be defined in serverless.yml, like #1555.

In general we have to rethink what regions and stages mean for Serverless: whether they are concepts specific to each provider, or something we would like to keep provider-independent.

flomotlik added this to the v1.0 milestone on Jul 29, 2016
@flomotlik
Contributor Author

@serverless/vip it would be interesting to get your thoughts on this. Do stages and regions relate to each other in your projects? Should we treat them more separately going forward, and which improvements would you like to see in region/stage management?

@kennu
Contributor

kennu commented Jul 29, 2016

In our case (SC5), basically all projects are deployed to a single region (eu-west-1). For us, worrying about multiple regions within a project has added complexity (e.g. variable management). I know that support for multiple regions is necessary, but I would like it to be hidden until needed.

We have needed to deploy projects to multiple AWS accounts, though. (Development, Staging and Production.) Conceptually it might make sense to be able to define deployment targets, which point to account+region combinations. By default there would be just the one you created the project with.

@HyperBrain
Member

I like @kennu's idea of having only one region on project creation. Regions could then follow similar semantics to the stages: you could create or remove a region. Duplicating CF stacks would not be a good idea in that case, as a stage of a service should be deployable across multiple regions.

@pmuens
Contributor

pmuens commented Jul 31, 2016

I would also propose to encapsulate them more and treat them independently.
The need to have everything defined in both places (serverless.yml and serverless.env.yml) is not obvious and, even with docs, hard to understand (e.g. if I don't use variables from serverless.env.yml).

@jthomas did a great writeup here where he pointed out which other issues he faced while implementing OpenWhisk support:
#1700 (comment)

IMHO we need an encapsulated abstraction layer here as other cloud providers may also have different implementations of stages, regions, etc.

@eahefnawy
Member

eahefnawy commented Aug 2, 2016

Ok, let's look at the fundamentals here and challenge our one-year-old assumptions. Here's what the stage, region, and variable concepts are:

Stages

The concept of a stage is simply a completely separate deployment of the service. We can easily achieve that by prefixing the service name with the stage name, hence creating an entirely new CF stack, without needing serverless.env.yml at all.
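
For illustration, a minimal sketch of what that prefixing could look like (the function name is hypothetical, not the framework's actual API):

function stackName(service, stage) {
  // Each stage becomes an entirely separate CloudFormation stack
  // simply by prefixing the service name with the stage name.
  return stage ? `${stage}-${service}` : service;
}

console.log(stackName('myService', 'dev'));        // "dev-myService"
console.log(stackName('myService', 'production')); // "production-myService"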

Regions

The concept of regions turned out to be provider specific (although it is common among providers), as @jthomas pointed out in his IBM write-up linked above. We'll simply take the region as a CLI option or use a default set in serverless.yml.

Variables & Secret Handling

This is where it gets complex, and it's why we have serverless.env.yml in the first place. Variable values usually differ by stage and region, which is why serverless.env.yml is structured the way it is. This is where I think our focus should be.

As an example: suppose you need to provide an authorizer ARN. The ARN will usually be different for the dev and production stages, and maybe even differ between regions.

Alternative Approach

We can solve this issue by relying on local env vars, without needing serverless.env.yml. Here's an example that outlines the different ways available for the user to set the ARN:

NOTE: The following example assumes that the user passes the dev stage and the us-east-1 region as options.

# serverless.yml
service: serviceName
functions:
  hello:
    handler: handler.hello
    events:
      - http:
          path: user/create
          method: get
          authorizer: ${AUTHORIZER_ARN} # this will look for the AUTHORIZER_ARN env var & populate
          authorizer: ${stage:AUTHORIZER_ARN} # this will look for the DEV_AUTHORIZER_ARN env var & populate
          authorizer: ${region:AUTHORIZER_ARN} # this will look for the USEAST1_AUTHORIZER_ARN env var & populate
          authorizer: ${stage:region:AUTHORIZER_ARN} # this will look for the DEV_USEAST1_AUTHORIZER_ARN env var & populate
          authorizer: ${region:stage:AUTHORIZER_ARN} # this will look for the USEAST1_DEV_AUTHORIZER_ARN env var & populate

In this scenario, stage and region are special keywords that refer to the stage and region the user provided as options with the command.
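
A rough sketch of how that resolution could work (the rules here are an assumption based on the examples above, not an actual implementation):

// Hypothetical resolution of ${stage:region:ENV_VAR}-style references:
// the stage/region keywords expand to upper-cased env var name segments.
function resolveVar(expression, options) {
  const normalize = value => value.toUpperCase().replace(/[^A-Z0-9]/g, ''); // 'us-east-1' -> 'USEAST1'
  const parts = expression.split(':'); // e.g. ['stage', 'region', 'AUTHORIZER_ARN']
  const name = parts.pop();            // the raw env var name
  const prefix = parts
    .map(part => normalize(part === 'stage' ? options.stage : options.region))
    .join('_');
  return process.env[prefix ? `${prefix}_${name}` : name];
}

// With --stage dev --region us-east-1:
// resolveVar('stage:region:AUTHORIZER_ARN', { stage: 'dev', region: 'us-east-1' })
// reads the DEV_USEAST1_AUTHORIZER_ARN env var.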

In addition, depending on which stage & region the user provided as options, we'll prefix those to the service name and stack name (and wherever else applicable).

Advantages of this approach:

  • No need for serverless.env.yml anymore, and we don't have to worry about .gitignoring it.
  • Providing a stage is completely optional. We don't even need to set a default dev stage; if the user doesn't provide a stage, we'll deploy the service name as is. In that scenario the stage is merely a prefix. We still have to define a default region, though, because it's required by AWS.
  • The user can choose whether to use the same variable for all stages and regions or for specific ones, whatever fits their use case, thanks to the namespacing (stage:region:ENV_VAR).
  • There's no hierarchy between stages and regions. Regions are not children of stages, and stages are not children of regions. It's just about the namespacing as provided by the user (stage:region:ENV_VAR vs region:stage:ENV_VAR).
  • We'll drop a lot of the stage/region validation that we currently have to do because of the structure in serverless.env.yml, saving us a lot of pain and tedious UX. It all just depends on what the user provides.
  • Secrets are now handled with env vars, as users are used to. This allows a better workflow between teams, and users can optionally use .env files to manage them even better.

NOTE: The above is just for local variables and doesn't consider making environment variables available inside the Lambda runtime on AWS. That's a different issue, but I think we can handle it in a similar fashion.

@serverless/vip What do you think?

@svdgraaf
Contributor

svdgraaf commented Aug 2, 2016

I really like the idea of just using env vars; it makes things a lot more flexible. I actually think we don't need the different ${stage.ENVVAR} forms, as the user can "simply" switch the environment for the current deploy if needed (e.g. load a different .env file, or perhaps we provide a mechanism to supply an .env file). Let the OS take care of the environment; Serverless just takes that environment and runs with it. I think it's also more the 12factor-way?

But perhaps I'm missing a use case where it would be handy to have the ${stage.xyz} variables.

I like the way Flask does this: you can feed it a config file (which is just an environment file) to use as configuration: http://flask.pocoo.org/docs/0.11/config/. It actually uses an environment variable to switch config files.

Something like:

$ sls --config test.env

Or:

$ sls --config production.env

And by default, check if a .env file exists and use that (or use an SLS_CONFIG env variable?):

$ sls

As a bonus: we can then also just use AWS_DEFAULT_REGION as the default region for the aws provider, which should already be set for the aws-cli.
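
A sketch of what that loading order could look like, assuming the dotenv package (the --config flag and the SLS_CONFIG variable are part of the proposal, not existing options):

const dotenv = require('dotenv');

// Hypothetical precedence: explicit --config path, then SLS_CONFIG, then a local .env.
function loadEnvironment(cliConfigPath) {
  dotenv.config({ path: cliConfigPath || process.env.SLS_CONFIG || '.env' });
  // With the environment populated, the region can default to the
  // standard AWS CLI variable.
  return process.env.AWS_DEFAULT_REGION;
}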

@HyperBrain
Member

Currently we use stage configuration variables in our s-resources-cf.json to model conditional resource setups, e.g. a different number of Kinesis shards for dev and prod. Internally SLS handles this by substituting the var definitions in the CF file.
Another argument against using system environment variables: when running builds on CI servers, errors are hard to trace on the one hand, and on the other hand it is quite easy to mix up parallel builds if someone sets something in the global server environment to a wrong value.
For me, static configuration checked in as YML makes much more sense with regard to stability and reproducibility.

@svdgraaf
Contributor

svdgraaf commented Aug 2, 2016

@HyperBrain Hmmm, how are you solving the issue of secrets? Or do you just not have those in the .yml files?

@HyperBrain
Member

The secrets are not kept in the config files. They are injected by the build server (Bamboo) as AWS temporary session credentials. In my opinion, secrets should not be part of any configuration but brought in via some independent second channel.

@svdgraaf
Contributor

svdgraaf commented Aug 2, 2016

@HyperBrain I completely agree with your remark. Using env vars would still work in your setup: you would still have your credentials provided by Bamboo. You could still have your config file checked into a repo if you want to; it would be up to you/Bamboo to provide the correct one.

@eahefnawy
Member

@svdgraaf in the case you proposed, users would have to keep switching vars before each deployment. Do you think that's inconvenient? Or would setting a predefined env var for each stage/region be more inconvenient?

@rajington
Member

rajington commented Aug 2, 2016

I've never done anything too complex with stages/regions, so I'm not sure if this is helpful, but have you considered custom YAML types?

Could do something like this:

# serverless.yml
service: serviceName
functions:
  hello:
    handler: handler.hello
    events:
      - http:
          path: user/create
          method: get
          authorizer: !!env AUTHORIZER_ARN
          authorizer: !!env-stage AUTHORIZER_ARN
          authorizer: !!env-region AUTHORIZER_ARN
          authorizer: !!env-stage-region AUTHORIZER_ARN
          authorizer: !!env-region-stage AUTHORIZER_ARN

Also, there are some cool things you might be able to do with anchors for the more complex multiple-architecture problem.

@svdgraaf
Contributor

svdgraaf commented Aug 2, 2016

@eahefnawy why would they keep switching vars? I don't follow :)

I meant the exact same solution as you provided, but I think the ${region/stage} stuff is perhaps a bit overengineered; I'd say let's keep things simple. If we do decide to implement it like that, we would give power to the user, as they could decide to use that, or just use the AUTHORIZER_ARN var in their current environment and have it resolve correctly according to their needs. So you have my 👍's :)

@rajington
Member

rajington commented Aug 2, 2016

Just to give you an example of what that could look like, this is an arbitrary environment-variable replacement tag.

Assuming process.env['foo'] === 'bar', the env type

const yaml = require('js-yaml');

// A custom scalar type that substitutes ${VAR} references with env var values.
const EnvType = new yaml.Type('!env', {
  kind: 'scalar',
  construct: data =>
    data.replace(/\$\{(.+?)\}/g, (_, key) => process.env[key])
});

converts

test: !env ${foo}Table

into

{ "test": "barTable" }

@nicka
Member

nicka commented Aug 4, 2016

Some thoughts:
In 0.5.x we commit all JSON files within _meta except for dev*.json (we don't store secrets). Each developer has their own AWS account and stack to work on new project features. Why do we commit _meta/**/*.json files? The reason is that users can easily lose stage/stack variables between branching, deployments, etc. Our CI commits any JSON updates made in _meta after a deploy. I'm not saying this is the best solution, but what our team does like about this approach is not losing stage/stack-related variables between developers.

Our concerns about serverless.env.yml:

  • Not being able to ignore developer-specific variables
  • Missing stack-outputs support (why was this removed recently?)
  • No support for getting variables into functions (I know, a whole different topic)

We do like what @eahefnawy mentioned about .env, as it gives back control over what to commit and ignore. A bonus is managing secrets as environment variables within CI. Another solution could be stage-specific env files like serverless.dev.yml.


One more thing regarding stack Outputs. How would you feel about the following within serverless.yml?

Option 1

resources:
  Resources:
    ExampleBucket:
      Type: AWS::S3::Bucket
      Outputs:
        ExampleBucketName:
          Description: ExampleBucket Name
          Value:
            Ref: ExampleBucket
        ExampleBucketEndpoint:
          Description: ExampleBucket Endpoint
          Value:
            Fn::GetAtt:
              - ExampleBucket
              - DomainName

I know this is not the original location within a CloudFormation template, but it might be good UX to specify it per resource?

Option 2

resources:
  Resources:
    ExampleBucket:
      Type: AWS::S3::Bucket

outputs:
  Outputs:
    ExampleBucketName:
      Description: ExampleBucket Name
      Value:
        Ref: ExampleBucket
    ExampleBucketEndpoint:
      Description: ExampleBucket Endpoint
      Value:
        Fn::GetAtt:
          - ExampleBucket
          - DomainName
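
If Option 1 were adopted, the framework would need to hoist those nested Outputs to the top level before deploying, since CloudFormation only accepts Outputs as a top-level template section. A sketch (the function name is illustrative):

// Move per-resource Outputs (Option 1 syntax) into the template's
// top-level Outputs section so the result is valid CloudFormation.
function hoistResourceOutputs(template) {
  const outputs = template.Outputs || {};
  Object.keys(template.Resources || {}).forEach(logicalId => {
    const resource = template.Resources[logicalId];
    Object.assign(outputs, resource.Outputs);
    delete resource.Outputs; // keep the resource definition CF-compliant
  });
  template.Outputs = outputs;
  return template;
}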

@kennu
Contributor

kennu commented Aug 4, 2016

We have also always committed _meta to Git so that multiple developers can work on the same project, excluding _meta/variables/s-variables-dev.json and syncing it with sls meta sync (since it has secrets).

The problem with that approach has been that Serverless 0.5 fails with an error when s-variables-dev.json is missing; developers have had to manually create an empty file first. I really hope the 1.0 solution will let developers run Serverless without configuring anything manually, and just use a simple command to sync the required secret variables to their computer.

Generally speaking, I can see two completely different use cases:

  • Publish a public Serverless open source project to GitHub
  • Share a private Serverless project between developers (or just between a single developer's multiple computers)

The Serverless 0.5 approach has been geared towards the first case and made the second fairly inconvenient.

@HyperBrain
Member

Do complex variable types also work with the environment approach? Currently (0.5.6) we have the VPC definitions defined as arrays in the var files and just use variable replacement in the configuration. That's very convenient, as the number of subnets differs across the stages.

@mthenw
Contributor

mthenw commented Aug 5, 2016

I'm personally not sure there should be direct support for region and stage in the framework. Instead, I would propose something similar to what @svdgraaf proposed, which is "just" config file support. The default config file is serverless.yml.

Framework shouldn't care about region and staging.

The framework is just a tool for managing functions. A minimal framework project should look like this:
serverless.yml

service: hello

provider:
  name: aws
  runtime: nodejs4.3

functions:
  hello:
    handler: index.handle

index.js

module.exports.handle = (event, context, cb) => cb(null,
  { message: 'YO!' }
);

I should be able to deploy it without any other files via serverless deploy (region & creds from ~/.aws/*).

serverless.yml - the default config file

By default the framework should look for a serverless.yml file. This is the default config file.

Config file structure

Most of the stuff we have already:

  • service - service name
  • provider - provider specific config
  • functionNameFormat - by default it's "${service}-${functionName}" but can be anything. The name format can use all values from the config file.
  • vars - custom vars that we want to use in config files
  • ...

Using a different config file

serverless deploy should accept a config file as an optional param:

serverless deploy --config dev.yml

Of course this config has to have the specified structure.

Possibility to overwrite config value

There should be a way to overwrite a config value:

serverless deploy --var 'provider.timeout=100'

or even

serverless deploy --config dev.yml --var 'provider.timeout=100'
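
A sketch of how such an override could be applied to the parsed config (the helper name is illustrative):

// Apply a --var 'provider.timeout=100' style override: walk the dotted
// path, creating intermediate objects, and set the final key.
function applyOverride(config, assignment) {
  const [path, value] = assignment.split('=');
  const keys = path.split('.');
  const last = keys.pop();
  const target = keys.reduce((obj, key) => (obj[key] = obj[key] || {}), config);
  target[last] = value; // note: stays a string unless coerced
  return config;
}

// applyOverride({ provider: {} }, 'provider.timeout=100')
// -> { provider: { timeout: '100' } }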

Use cases

Simple setup (prototyping some app)

Needed files:

  • serverless.yml
  • handler file

Deploy with serverless deploy

App with multiple (dev, test, prod) stages under the same AWS account

In that case we need to have 3 config files, one per stage.

  • serverless.dev.yml
  • serverless.test.yml
  • serverless.prod.yml
  • handler file

Deploy with serverless deploy -c serverless.dev.yml

To reduce copy/pasting (probably most of the values in those config files are the same) we can use JSON-Ref, some kind of inheritance, or accept multiple config files, so a deploy would look like this:

serverless deploy --config general.yml --config dev.yml
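
A sketch of the multiple-config-file variant, assuming js-yaml for parsing (a real implementation would probably deep-merge rather than the shallow merge shown):

const fs = require('fs');
const yaml = require('js-yaml');

// Load each --config file in order; later files overwrite earlier top-level keys.
function loadConfigs(paths) {
  return paths
    .map(path => yaml.safeLoad(fs.readFileSync(path, 'utf8')))
    .reduce((merged, config) => Object.assign(merged, config), {});
}

// serverless deploy --config general.yml --config dev.yml
// -> loadConfigs(['general.yml', 'dev.yml'])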

What's important here is that the user needs to change the default functionNameFormat. general.yml could look like this:

service: hello

provider:
  name: aws
  runtime: nodejs4.3

functionNameFormat: "${service}-${vars.stage}"

functions:
  hello:
    handler: index.handle

and dev.yml like this:

vars:
  stage: dev

App with multiple (dev, test, prod) stages in different AWS accounts

Similar to the previous use case, but dev.yml should overwrite the provider config:

provider:
  access_key: ...
  secret_key: ...

or

provider:
  profile: devaccount

What do you think, guys?

@mthenw
Contributor

mthenw commented Aug 5, 2016

One more comment. To make it easy to configure and set up, there should be some interactive command for generating those files by asking questions like:

  1. What is the name of your service?
  2. What stages do you want to set up?
    etc.

@eahefnawy
Member

@svdgraaf they would have to keep switching env vars because AUTHORIZER_ARN would be different for each stage/region. So when you run serverless deploy -s production -r us-west-2, you have to make sure you've updated the value of the AUTHORIZER_ARN env var to reflect the production version.

Things would be so much simpler if values didn't depend on stage/region. So, as a way to remove this step of keeping the values updated before deployment, what I'm proposing is to keep the values for all the stages/regions you'll be using in different env vars, based on the framework conventions I've mentioned above, and just deploy to whatever stage/region; the framework would pick the right value based on the options you provided.

Although users might not be switching stages/regions that often anyway. So you might be right, this might be overengineered.

@eahefnawy
Member

@HyperBrain I don't think the env var approach would support complex variable types. This seems like something better suited to JSON-Ref. I'm not sure about the nature of your variables, but complex types give the impression that they're not really "secretive" but rather "reusable". So maybe JSON-Ref would work there?

@HyperBrain if you would still prefer managing files rather than env vars, what do you think about using .env files?

@eahefnawy
Member

@rajington Interesting! That's a pretty creative solution! However, it's too dependent on YAML, and because of some indentation/UX issues we've faced with YAML, we're thinking of also supporting JSON files (while keeping YAML the default/preferred format).

@eahefnawy
Member

Framework shouldn't care about region and staging.

+100000

serverless deploy should accept a config file as an optional param

Does that mean that a service doesn't need to have a serverless.yml, but rather optionally depends on what the user passes as the config file?

In that case we need to have 3 config files. One per each stage.

Our main goal for v1 is to reduce boilerplate, so I'm on the fence about that one.

@mthenw Generally it seems that you'd prefer config files over env vars, like @HyperBrain ... Any particular reason for that? Do you think env vars are harder to manage, especially within a team?

@mthenw
Contributor

mthenw commented Aug 5, 2016

Does that mean that a service doesn't need to have a serverless.yml, but rather optionally depends on what the user passes as the config file?

It's loaded by default. If there is no serverless.yml and --config is not passed, an error should be returned.

@mthenw Generally it seems that you'd prefer config files over env vars, like @HyperBrain ... Any particular reason for that? Do you think env vars are harder to manage, especially within a team?

Env vars are fine too. I personally don't like them as they are not explicit.

Terraform solved that (https://www.terraform.io/intro/getting-started/variables.html). There are three options for passing vars:

  • terraform plan -var 'access_key=foo'
  • terraform plan -var-file="secret.tfvars"
  • or export TF_VAR_access_key

We could do something similar with --config, --var, and export SERVERLESS_VAR_access_key.
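
A sketch of collecting such prefixed vars (the SERVERLESS_VAR_ prefix is the proposal above, not an existing convention):

// Gather every SERVERLESS_VAR_* env var into a plain vars object,
// mirroring Terraform's TF_VAR_ convention.
function varsFromEnv(env) {
  const prefix = 'SERVERLESS_VAR_';
  return Object.keys(env)
    .filter(key => key.indexOf(prefix) === 0)
    .reduce((vars, key) => {
      vars[key.slice(prefix.length)] = env[key];
      return vars;
    }, {});
}

// export SERVERLESS_VAR_access_key=foo
// varsFromEnv(process.env) -> { access_key: 'foo' }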

@svdgraaf
Contributor

svdgraaf commented Aug 5, 2016

Maybe a silly suggestion, but couldn't we just expose the environment hash to the variables, e.g. ${env.FOOBAR}, together with the other variables? That would be an easy fix, I think.

That would not solve the issue of configuration loading, but at least the users who want to use environment variables could do so easily.

@nicka
Member

nicka commented Aug 5, 2016

@svdgraaf I like both, but adding ${env.FOO} support would be a super nifty feature for CI and managing secrets. Of course this is possible with .yml config files, but that adds the need to configure files within CI.

My vote would go to .yml config files plus ${env.FOO} support 😉

@jordanmack
Member

jordanmack commented Aug 5, 2016

I second a number of the points made by @kennu, @rajington, @mthenw, and the others.

Remain flexible and minimal. Offer a clear set of recommended conventions, but do not restrict.

Let there be one and only one predefined file that the framework requires: serverless.yml. (And technically you could allow --config to override this.) When you create from a default template it should still include the handler, event, and possibly other files for variables and configuration, but don't hardcode the framework to use specific files. All other resources can be defined in serverless.yml. This maintains ease of use but also keeps the door open for alternative layouts to emerge.

Never assume characteristics of a provider, and avoid marrying general syntax to characteristics.

Both regions and stages are characteristics of the provider. They are not guaranteed to exist on all providers, and new characteristics may emerge that were never anticipated. The syntax for defining configuration and variables should not be tied to regions, stages, or the like; that adds complexity to the framework that really doesn't need to be there. The ability to define things differently for each environment can be addressed by defining targets (directly or through multiple configuration files).

Adopt a future proof deployment model that embraces common use cases, but remains flexible.

Relying on deployment targets would decouple the deployment process from any single provider characteristic. Not to mention that everyone already uses targets; they might call them something different, but they are still targets. Stages are probably the most commonly used target naming convention, and the provider is typically implied since most people only use a single provider. If we take a step back and look at what Serverless is trying to do, there are some ways to really push the envelope: different targets can point to different providers, and a single target could even encompass multiple accounts with multiple providers.

Simplify.

Keep the barriers to entry as low as possible. Reduce required boilerplate by relying on well-documented default values when possible. Avoid complex syntax and overly deep nesting. Allow configuration reuse to avoid repetition and keep the project organized and maintainable.

Relying on YAML anchors would significantly reduce the complexity of any single configuration branch and naturally provides the configuration-inheritance model that Serverless has been gearing towards. Combined with YAML includes (non-standard), configuration files can be broken up and restructured in an intuitive and more easily maintainable way. JSON References/Pointers (also non-standard) would offer equivalent functionality in JSON configurations.
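
To illustrate the anchors point, a stock YAML parser already provides this kind of inheritance via anchors and merge keys (js-yaml shown; the keys here are illustrative):

const yaml = require('js-yaml');

// An anchor (&defaults) plus a merge key (<<) lets one block inherit
// another without any framework-specific machinery.
const doc = yaml.safeLoad(`
defaults: &defaults
  runtime: nodejs4.3
  memorySize: 512

functions:
  hello:
    <<: *defaults
    handler: index.handle
`);
// doc.functions.hello -> { runtime: 'nodejs4.3', memorySize: 512, handler: 'index.handle' }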

@eahefnawy
Member

@ac360's proposal in #1801 is pretty much a combination of the best ideas here. Highly recommended to check it out! It's simply about Serverless Variables, which is the core of this entire discussion.

@svdgraaf
Contributor

svdgraaf commented Aug 9, 2016

I think #1801 is indeed the best proposal so far, I like it, let's go for it! 😄 👍

@flomotlik
Contributor Author

This is now merged with #1834
