Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: sam local start-api --warm-containers doesn't work on docker #6688

Closed
cford1080 opened this issue Feb 14, 2024 · 8 comments
Closed

Bug: sam local start-api --warm-containers doesn't work on docker #6688

cford1080 opened this issue Feb 14, 2024 · 8 comments
Labels
stage/waiting-for-release Fix has been merged to develop and is waiting for a release type/bug

Comments

@cford1080
Copy link

cford1080 commented Feb 14, 2024

Description:

@lucashuy asked that I open a new issue after my comment on this existing (but closed) issue.

When adding the warm-containers option to start-api running inside a docker container, the following exception is thrown:
samcli.lib.utils.file_observer.FileObserverException: Can not observe non exist path

I may be misunderstanding the docs on docker-volume-basedir, but it appears to me that since SAM CLI is running in a Docker container that this directory should be the directory inside the container. However, do notice that it is set to the current working directory with $PWD, which gives it the directory from the host, and it works this way.

Another issue is that the containers spun off to run the lambda when warm-containers is working (see workarounds in steps to reproduce) are not cleaned up when the SAM container goes down.

Finally, I have observed this behavior in both this SAM provided-runtime image as well as this SAM NodeJs image.

Steps to reproduce:

  1. Clone this repo
  2. Run docker compose -f docker-compose.exception.yaml up
  3. Execute: curl localhost:3000

I have also included 3 other Compose files to demonstrate the following behaviors:

  • docker-compose.yaml omits warm-containers and runs as expected (still, however, have a question about docker-volume-basedir)
    • this is what I'm doing currently across many projects, and I want to use warm-containers to speed up test execution
  • docker-compose.workaround1.yaml includes warm-containers and is based on this workaround from a different issue. Everything seems to work except that after bringing down the SAM container, the containers spun off to run the lambda hang around and do not get destroyed.
  • docker-compose.workaround2.yaml includes warm-containers and duplicates the host directory to avoid adding the working_dir docker directive, for whatever reason someone may have for doing so. Everything seems to work except that after bringing down the SAM container, the containers spun off to run the lambda hang around and do not get destroyed.

Observed result:

Result when exception occurs:

sam-local-1  |
sam-local-1  |  SAM CLI now collects telemetry to better understand customer needs.
sam-local-1  |
sam-local-1  |  You can OPT OUT and disable telemetry collection by setting the
sam-local-1  |  environment variable SAM_CLI_TELEMETRY=0 in your shell.
sam-local-1  |  Thanks for your help!
sam-local-1  |
sam-local-1  |  Learn More: https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-telemetry.html
sam-local-1  |
sam-local-1  | 2024-02-13 22:48:07,096 | No config file found in this directory.
sam-local-1  | 2024-02-13 22:48:07,097 | OSError occurred while reading TOML file: [Errno 2] No such file or directory: '/var/task/samconfig.toml'
sam-local-1  | 2024-02-13 22:48:07,097 | Config file location: /var/task/samconfig.toml
sam-local-1  | 2024-02-13 22:48:07,097 | Config file '/var/task/samconfig.toml' does not exist
sam-local-1  | 2024-02-13 22:48:07,100 | Using SAM Template at /var/task/template.yaml
sam-local-1  | 2024-02-13 22:48:07,122 | OSError occurred while reading TOML file: [Errno 2] No such file or directory: '/var/task/samconfig.toml'
sam-local-1  | 2024-02-13 22:48:07,122 | Using config file: samconfig.toml, config environment: default
sam-local-1  | 2024-02-13 22:48:07,122 | Expand command line arguments to:
sam-local-1  | 2024-02-13 22:48:07,122 | --template_file=/var/task/template.yaml --host=0.0.0.0 --port=3000 --skip_pull_image --warm_containers=LAZY --docker_volume_basedir=/home/cford1080/github/sample --container_host=host.docker.internal --container_host_interface=0.0.0.0 --static_dir=public --layer_cache_basedir=/root/.aws-sam/layers-pkg
sam-local-1  | 2024-02-13 22:48:07,183 | local start-api command is called
sam-local-1  | 2024-02-13 22:48:07,184 | No Parameters detected in the template
sam-local-1  | 2024-02-13 22:48:07,198 | There is no customer defined id or cdk path defined for resource LambdaFunction, so we will use the resource logical id as the resource id
sam-local-1  | 2024-02-13 22:48:07,198 | There is no customer defined id or cdk path defined for resource ServerlessRestApi, so we will use the resource logical id as the resource id
sam-local-1  | 2024-02-13 22:48:07,198 | 0 stacks found in the template
sam-local-1  | 2024-02-13 22:48:07,198 | No Parameters detected in the template
sam-local-1  | 2024-02-13 22:48:07,209 | There is no customer defined id or cdk path defined for resource LambdaFunction, so we will use the resource logical id as the resource id
sam-local-1  | 2024-02-13 22:48:07,210 | There is no customer defined id or cdk path defined for resource ServerlessRestApi, so we will use the resource logical id as the resource id
sam-local-1  | 2024-02-13 22:48:07,210 | 2 resources found in the stack
sam-local-1  | 2024-02-13 22:48:07,210 | Found Serverless function with name='LambdaFunction' and CodeUri='./src'
sam-local-1  | 2024-02-13 22:48:07,210 | watch resource /var/task/template.yaml
sam-local-1  | 2024-02-13 22:48:07,210 | Create Observer for resource /var/task/template.yaml with recursive True
sam-local-1  | 2024-02-13 22:48:07,212 | watch resource /var/task/template.yaml's parent /var/task
sam-local-1  | 2024-02-13 22:48:07,212 | Create Observer for resource /var/task with recursive False
sam-local-1  | 2024-02-13 22:48:07,214 | Found '1' API Events in Serverless function with name 'LambdaFunction'
sam-local-1  | 2024-02-13 22:48:07,214 | Detected Inline Swagger definition
sam-local-1  | 2024-02-13 22:48:07,214 | Parsing Swagger document using 2.0 specification
sam-local-1  | 2024-02-13 22:48:07,214 | Found '0' authorizers in resource 'ServerlessRestApi'
sam-local-1  | 2024-02-13 22:48:07,214 | Lambda function integration not found in Swagger document at path='/' method='get'
sam-local-1  | 2024-02-13 22:48:07,214 | Found '0' APIs in resource 'ServerlessRestApi'
sam-local-1  | 2024-02-13 22:48:07,214 | Authorizer not found or disabled, returning early
sam-local-1  | 2024-02-13 22:48:07,214 | Removed duplicates from '0' Explicit APIs and '1' Implicit APIs to produce '1' APIs
sam-local-1  | 2024-02-13 22:48:07,214 | 1 APIs found in the template
sam-local-1  | 2024-02-13 22:48:07,216 | Mounting LambdaFunction at http://0.0.0.0:3000/ [GET]
sam-local-1  | 2024-02-13 22:48:07,216 | You can now browse to the above endpoints to invoke your functions. You do not need to restart/reload SAM CLI while working on your functions, changes will be reflected instantly/automatically. If you used sam build before running local commands, you will need to re-run sam build for the changes to be picked up. You only need to restart SAM CLI if you update your AWS SAM template
sam-local-1  | 2024-02-13 22:48:07,216 | Localhost server is starting up. Multi-threading = True
sam-local-1  | 2024-02-13 22:48:07 WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
sam-local-1  |  * Running on all addresses (0.0.0.0)
sam-local-1  |  * Running on http://127.0.0.1:3000
sam-local-1  |  * Running on http://192.168.16.2:3000
sam-local-1  | 2024-02-13 22:48:07 Press CTRL+C to quit
sam-local-1  | 2024-02-13 22:48:11,221 | Constructed Event 1.0 to invoke Lambda. Event: {'httpMethod': 'GET', 'body': None, 'resource': '/', 'requestContext': {'resourceId': '123456', 'apiId': '1234567890', 'resourcePath': '/', 'httpMethod': 'GET', 'requestId': 'c57aa2ef-fdec-46e6-bde6-eb3b87b8a244', 'accountId': '123456789012', 'stage': 'Prod', 'identity': {'apiKey': None, 'userArn': None, 'cognitoAuthenticationType': None, 'caller': None, 'userAgent': 'Custom User Agent String', 'user': None, 'cognitoIdentityPoolId': None, 'cognitoAuthenticationProvider': None, 'sourceIp': '192.168.16.1', 'accountId': None}, 'extendedRequestId': None, 'path': '/', 'protocol': 'HTTP/1.1', 'domainName': 'localhost:3000', 'requestTimeEpoch': 1707864487, 'requestTime': '13/Feb/2024:22:48:07 +0000'}, 'queryStringParameters': None, 'multiValueQueryStringParameters': None, 'headers': {'Host': 'localhost:3000', 'User-Agent': 'curl/7.81.0', 'Accept': '*/*', 'X-Forwarded-Proto': 'http', 'X-Forwarded-Port': '3000'}, 'multiValueHeaders': {'Host': ['localhost:3000'], 'User-Agent': ['curl/7.81.0'], 'Accept': ['*/*'], 'X-Forwarded-Proto': ['http'], 'X-Forwarded-Port': ['3000']}, 'pathParameters': None, 'stageVariables': None, 'path': '/', 'isBase64Encoded': False}
sam-local-1  | 2024-02-13 22:48:11,222 | Found one Lambda function with name 'LambdaFunction'
sam-local-1  | 2024-02-13 22:48:11,222 | Invoking index.handler (nodejs18.x)
sam-local-1  | 2024-02-13 22:48:11,222 | No environment variables found for function 'LambdaFunction'
sam-local-1  | 2024-02-13 22:48:11,222 | Loading AWS credentials from session with profile 'None'
sam-local-1  | 2024-02-13 22:48:13,235 | Resolving code path. Cwd=/home/cford1080/github/sample, CodeUri=./src
sam-local-1  | 2024-02-13 22:48:13,236 | Resolved absolute path to code is /home/cford1080/github/sample/src
sam-local-1  | 2024-02-13 22:48:13,237 | Exception on / [GET]
sam-local-1  | Traceback (most recent call last):
sam-local-1  |   File "flask/app.py", line 1463, in wsgi_app
sam-local-1  |   File "flask/app.py", line 872, in full_dispatch_request
sam-local-1  |   File "flask/app.py", line 870, in full_dispatch_request
sam-local-1  |   File "flask/app.py", line 855, in dispatch_request
sam-local-1  |   File "samcli/local/apigw/local_apigw_service.py", line 724, in _request_handler
sam-local-1  |   File "samcli/local/apigw/local_apigw_service.py", line 617, in _invoke_lambda_function
sam-local-1  |   File "samcli/commands/local/lib/local_lambda.py", line 147, in invoke
sam-local-1  |   File "samcli/lib/telemetry/metric.py", line 325, in wrapped_func
sam-local-1  |   File "samcli/local/lambdafn/runtime.py", line 199, in invoke
sam-local-1  |   File "samcli/local/lambdafn/runtime.py", line 440, in create
sam-local-1  |   File "samcli/lib/utils/file_observer.py", line 214, in watch
sam-local-1  |   File "samcli/lib/utils/file_observer.py", line 401, in watch
sam-local-1  |   File "samcli/lib/utils/file_observer.py", line 531, in watch
sam-local-1  | samcli.lib.utils.file_observer.FileObserverException: Can not observe non exist path
sam-local-1  | 2024-02-13 22:48:13,241 | Lambda execution failed (<InternalServerError '500: Internal Server Error'>,)
sam-local-1  | 2024-02-13 22:48:13 192.168.16.1 - - [13/Feb/2024 22:48:13] "GET / HTTP/1.1" 502

Result of docker ps when using the workarounds (leftover containers):

CONTAINER ID   IMAGE                                          COMMAND                  CREATED              STATUS              PORTS                    NAMES
a33d86bcff9c   public.ecr.aws/lambda/nodejs:18-rapid-x86_64   "/var/rapid/aws-lamb…"   16 seconds ago       Up 16 seconds       0.0.0.0:5505->8080/tcp   focused_kalam
f67c7993075e   public.ecr.aws/lambda/nodejs:18-rapid-x86_64   "/var/rapid/aws-lamb…"   43 seconds ago       Up 43 seconds       0.0.0.0:5171->8080/tcp   adoring_bassi
a2b3e3d4e40b   public.ecr.aws/lambda/nodejs:18-rapid-x86_64   "/var/rapid/aws-lamb…"   About a minute ago   Up About a minute   0.0.0.0:5063->8080/tcp   happy_hugle
a62ffaf090fd   public.ecr.aws/lambda/nodejs:18-rapid-x86_64   "/var/rapid/aws-lamb…"   About a minute ago   Up About a minute   0.0.0.0:5278->8080/tcp   jovial_ellis

Expected result:

  1. Adding warm-containers to start-api behaves as defined in the docs.
  2. When the SAM container goes down, it takes all its spawned containers with it.

Additional environment details (Ex: Windows, Mac, Amazon Linux etc)

  1. OS: Ubuntu inside of WSL2 on Windows 10 (using Docker Engine only, not Docker Desktop)
  2. sam --version: 1.109.0
  3. AWS region: omitted

Paste the output of sam --info here

{
  "version": "1.109.0",
  "system": {
    "python": "3.11.3",
    "os": "Linux-5.15.133.1-microsoft-standard-WSL2-x86_64-with-glibc2.26"
  },
  "additional_dependencies": {
    "docker_engine": "25.0.2",
    "aws_cdk": "Not available",
    "terraform": "Not available"
  },
  "available_beta_feature_env_vars": [
    "SAM_CLI_BETA_FEATURES",
    "SAM_CLI_BETA_BUILD_PERFORMANCE",
    "SAM_CLI_BETA_TERRAFORM_SUPPORT",
    "SAM_CLI_BETA_RUST_CARGO_LAMBDA"
  ]
}
@cford1080 cford1080 added the stage/needs-triage Automatically applied to new issues and PRs, indicating they haven't been looked at. label Feb 14, 2024
@mildaniel
Copy link
Contributor

Hi @cford1080, thanks for raising this issue. I was able to reproduce the problem and I've raised a PR #6697 that should hopefully resolve it.

There is another factor here, however, which is how Docker runs SAM CLI inside the container. You need to ensure that it is being run as the main process and that it's not being wrapped by sh. In those cases, the SIGTERM interrupt will never reach the SAM CLI process and there's nothing that SAM CLI can do to handle it. This article goes into more detail.

@mildaniel mildaniel added type/bug and removed stage/needs-triage Automatically applied to new issues and PRs, indicating they haven't been looked at. labels Feb 15, 2024
@cford1080
Copy link
Author

@mildaniel
Thanks. I see that addresses the issue of spawned containers not being cleaned up, but what of the issue of warm-containers not working inside of a docker container and producing a FileObserverException?

@ryanb77
Copy link

ryanb77 commented Feb 15, 2024

When using the warm-containers option to start-api running inside a docker container, it will also fail to resolve http://host.docker.internal and a timeout will result.

Omitting the warm-containers for either EAGER or LAZY resolves the issue.

@cford1080
Copy link
Author

cford1080 commented Feb 15, 2024

@ryanb77

When using the warm-containers option to start-api running inside a docker container, it will also fail to resolve http://host.docker.internal and a timeout will result.

This is mitigated by adding the extra_hosts directive:

extra_hosts:
    - host.docker.internal:host-gateway

Omitting the warm-containers for either EAGER or LAZY resolves the issue.

I am currently omitting this option, but I wish to use the feature. If you review the Steps to reproduce, you may notice that I have provided 2 workarounds where warm-containers is working (with exception for cleaning up spawned containers), but requires some suboptimal volume mapping. In the provided logs (not using any workaround, where warm-containers is failing), you may also notice the following:

sam-local-1  | 2024-02-13 22:48:13,235 | Resolving code path. Cwd=/home/cford1080/github/sample, CodeUri=./src
sam-local-1  | 2024-02-13 22:48:13,236 | Resolved absolute path to code is /home/cford1080/github/sample/src

I am at a loss as to why this is using the host directories rather than the container's.

@mildaniel
Copy link
Contributor

We were able to find the root cause. It has to do with the --docker-volume-basedir overriding the file path that the observer watches which happens to be on the host machine instead of inside the container.

I've raised a draft PR which should solve this issue too but the integration tests for this might take me some more time.

@cford1080
Copy link
Author

Thank you, @mildaniel. As I mentioned, this isn't causing any showstopping issues because the option can be left omitted for now, but some testing could be sped up significantly if the container can remain warm throughout.

Does this impact other sam local commands which use these same options? For example, if I open another project which uses start-lambda instead of start-api, would I run into the same issues with the same options, and if so, will the solutions in motion now resolve the issues for these options on all commands, assuming they resolve the issues reported for start-api? Or do the options have different implementations per command that would all need to be addressed individually if issues are present? Hope that all makes sense; I got lost myself along the way and rewrote that a few times.

@mildaniel
Copy link
Contributor

mildaniel commented Feb 16, 2024

These two commands use the same underlying logic for starting and running containers, including file watching and volume mounting. The PRs I've raised above should address the issue for both start-api and start-lambda commands.

@mildaniel mildaniel added the stage/waiting-for-release Fix has been merged to develop and is waiting for a release label Mar 18, 2024
Copy link
Contributor

Patch is released in v1.113.0. Closing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stage/waiting-for-release Fix has been merged to develop and is waiting for a release type/bug
Projects
None yet
Development

No branches or pull requests

3 participants