-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(AWS Lambda): Add support for building Docker images #8725
Conversation
Codecov Report
@@ Coverage Diff @@
## master #8725 +/- ##
==========================================
+ Coverage 87.59% 87.72% +0.13%
==========================================
Files 259 262 +3
Lines 9640 9763 +123
==========================================
+ Hits 8444 8565 +121
- Misses 1196 1198 +2
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @pgrzesik it looks really good, and I think not much is left here. See my comments
Should we try to remove ECR repository only if there are provider.images defined?
Great question. I would attempt to remove repository unconditionally as a whole only when we're removing the service via sls remove
, still I would attempt it in a way in which it doesn't prolong time of sls remove
when repository doesn't exist (I proposed how we can do it in one of the comments
Do you think it's still worth attempting to re-tag the image in case of failure?
I think we may either:
- Skip it, and do not attempt to search and remove unused images
- Always be clean, and always attempt to leave clean slate (attempt to remove any not used images)
Second feels a bit aggressive, I think we may skip it for now, and eventually revisit if concern would be raised. Working with specific use case may give us better idea on what approach would be most optimal, so we may wait for that.
I don't think we can "fix" it other than enforcing serial build process. Do you think it's fine to keep the outputs like that?
I'd rather fix it. Outputs like that are very confusing and may lead to wrong assumptions. I'd fix it by flushing output of processes after they finalize and not on the go
I was thinking that maybe, we should create an "umbrella" property on provider like
ecr
ordocker
Very good call. I propose to use ecr
(AWS pointed to us once that ECR is not only for docker
, in light of that docker
doesn't seem as valid choice)
What is the "recommended" way of handling errors from spawnExt? Should we try to catch them and provide a custom error message for user or just fail whenever process fails?
I think we should catch, and crash with meaningful ServerlessError
. If we pass through original error, it'll be assumed as programmer error (so like 500
), and will be presented to user with full stack trace (and such errors unconditionally are treated as bugs in our logic).
Right now it will fail on build attepmt, but I was thinking that maybe we should fail faster in such case.
If from build fail we can easily diagnose error signals that docker
is not installed. I would keep that, and just throw meaningful ServerlessError
in response.
It'll be great to also test this on Windows. Do you have some access to Windows machine?
lib/plugins/aws/provider.js
Outdated
// TODO: VALIDATE THAT LENGTH | ||
'^[a-z]{1,64}$': { | ||
oneOf: [ | ||
// TODO: Combine into one? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's best to stick to one option, and validate exclusivity of path
and uri
manually.
It's because in next PR will introdoce common options that apply to all scenarios and we should not duplicate their definitions. Other reason is that it'll make validation errors more human
friendly, anyOf
fail in such case could be reported just with provider.images.foo format is not supported
which doesn't inform what properties didn't validate
Otherwise (as a side note), let's always rely on anyOf
(not oneOf
)
With oneOf
AJV always validates all paths (if more that one passes validation fails), and applies coercion if needed in each case, which may malform final value.
With anyOf
AJV stops at first passing option
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, that's a great advice, I'll change it
lib/plugins/aws/provider.js
Outdated
type: 'string', | ||
pattern: | ||
'^\\d+\\.dkr\\.ecr\\.[a-z0-9-]+..amazonaws.com\\/([^@]+)|([^@:]+@sha256:[a-f0-9]{64})$', | ||
oneOf: [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use anyOf
lib/plugins/aws/provider.js
Outdated
@@ -1698,6 +1744,170 @@ class AwsProvider { | |||
return this.getStackResources(res.NextToken, allResources); | |||
}); | |||
} | |||
|
|||
getEcrRepositoryName() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a dedicated naming.js module for such methods
lib/plugins/aws/provider.js
Outdated
if (err.providerError && err.providerError.code === 'RepositoryNotFoundException') { | ||
// TODO: CONSIDER MORE PARAMS FOR REPO CREATION? | ||
const result = await this.request('ECR', 'createRepository', { | ||
repositoryName, | ||
}); | ||
repositoryUri = result.repository.repositoryUri; | ||
} else { | ||
throw err; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style hint.
It usually feels more concise to construct flow as:
if (!x) throw error;
... process ...
Instead of
if (x) {
... process ...
} else {
throw error;
}
lib/plugins/aws/provider.js
Outdated
const { repositoryUri, repositoryName } = await this.getOrCreateEcrRepository(); | ||
|
||
await this.dockerLoginToEcr(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those two can probably be issued in parallel
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you talking about getOrCreateEcrRepository
and dockerLoginToEcr
? I think it's not safe to issue them in parallel as if the repo doesn't exist, we have to create it before trying to use dockerLoginToEcr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm.. but aren't we logging in to ECR in general, and not to specific repo?
Or do you mean that if there's no single repo at given region AWS would reject login?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're totally right, please disregard my previous comment
lib/plugins/aws/remove/lib/ecr.js
Outdated
if (err.providerError && err.providerError.code === 'RepositoryNotFoundException') { | ||
// Pass silently | ||
} else { | ||
throw err; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder whether it's fine to unconditionally throw
here.
e.g. what if user works in region in which AWS ECR is not supported at all (I'm not sure if it can be the case, but if it would we will throw at every removal in such region.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not consider that situation, I'll investigate if such a thing can happen
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to investigate it but I didn't find a single mention of ECR not being available in a specific region - do you envision any other potential issues with unconditionally throwing here @medikoo ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw you already updated to issue it only if it's confirmed that repository exists. In such case I think it's totally safe to run it unconditionally
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct, however, checking for repository existence has similar check at the moment, which can trigger (potentially) the same issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I just saw that. I think we can leave it for now, and eventually we'll react if anything is reported.
I'm not sure how it'll behave in e.g. Chinese regions, which are not accessible if you do not have registered business in China
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about China specifically, but if you don't have access there, how all other API calls via SDK work (or not)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was referring more to case where someone has access but e.g. given AWS service is not provided in the region (I can imagine that china regions could be limited to US ones).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood 👍 I was actually researching availability of ECR service but didn't find a single region where it's not available, including China regions: https://www.amazonaws.cn/en/about-aws/regional-product-services/
lib/plugins/aws/remove/index.js
Outdated
await this.validate(); | ||
await this.emptyS3Bucket(); | ||
const cfData = await this.removeStack(); | ||
await this.monitorStack('delete', cfData); | ||
await this.removeEcrRepository(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to construct this in a way, that we have two methods. First checkIfEcRepositoryExist
, which we should issue at beginning but do not await
,
and await
after monitorStack
succeds, and if it returns true proceed with removeEcrRepository()
This will ensure we do not add time to remove
command for services where ECR is not involved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a clever way to skip adding extra time for people that don't use docker
at all, I'll update it
}, | ||
fnImageWithTag: { | ||
image: '000000000000.dkr.ecr.sa-east-1.amazonaws.com/test-lambda-docker:stable', | ||
}, | ||
fnImageWithExplicitUri: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's maybe prefix name with fnProviderImage
to differentiate those which have image config inline and those which refer to provider.images
Thanks @medikoo for all great points raised
I really like the proposed idea 👍
The only thing I'm worried about here is the potential leftover images generating extra costs for users. Storage in ECR is $0.1/GB/month, which can quickly add up to a significant sum after some time. But I like the idea of potentially revisiting the approach after we get feedback from the community. With that in mind, do you suggest to remove the logic that cleans up unused images after each deploy for now?
Yeah, that's the only way to ensure we don't mix up outputs. The only downside is that users won't see progress of build/push in real-time, but I guess we have to make a tradeoff here.
👍
I really like the proposed idea 👍
The only thing I'm worried about here is the potential leftover images generating extra costs for users. Storage in ECR is $0.1/GB/month, which can quickly add up to a significant sum after some time. But I like the idea of potentially revisiting the approach after we get feedback from the community. With that in mind, do you suggest to remove the logic that cleans up unused images after each deploy for now?
Yeah, that's the only way to ensure we don't mix up outputs. The only downside is that users won't see progress of build/push in real-time, but I guess we have to make a tradeoff here.
👍
I really like the proposed idea 👍
The only thing I'm worried about here is the potential leftover images generating extra costs for users. Storage in ECR is $0.1/GB/month, which can quickly add up to a significant sum after some time. But I like the idea of potentially revisiting the approach after we get feedback from the community. With that in mind, do you suggest to remove the logic that cleans up unused images after each deploy for now?
Yeah, that's the only way to ensure we don't mix up outputs. The only downside is that users won't see progress of build/push in real-time, but I guess we have to make a tradeoff here.
👍
Makes perfect sense, I'll adjust it
Will try to do that, thanks for a suggestion 👍
Yes, I've planned to try to test it out on Windows machine after wrapping up the development. Thanks once again for all the answers and suggestions 🙇 e should catch, and crash with meaningful Makes perfect sense, I'll adjust it
Will try to do that, thanks for a suggestion 👍
Yes, I've planned to try to test it out on Windows machine after wrapping up the development. Thanks once again for all the answers and suggestions 🙇 as programmer error (so like Makes perfect sense, I'll adjust it
Will try to do that, thanks for a suggestion 👍
Yes, I've planned to try to test it out on Windows machine after wrapping up the development. Thanks once again for all the answers and suggestions 🙇 |
I wasn't aware it's the case. In light of that as we already have cleanup logic, maybe let's keep it, and it could be fine to keep the image on failed deployment as you proposed, as I take it's still destined to be removed on cleanup operation with next (successful) deployment if it for some reason it will end as not used (?) |
Yes, in current form, if the deploy will fail for some reason, the image will be kept, but during next deployment we will once again push the newest version and re-tag even if nothing changed for the image and the function will be configured with that latest version. The old ones won't have tags so they will be removed |
async cleanupEcrRepository() { | ||
// TODO: Add check if we should even try removing anything | ||
const repositoryName = this.provider.naming.getEcrRepositoryName(); | ||
const registryId = await this.provider.getAccountId(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you see any issues with memoize'ing getAccountId
call on provider? I'm using it a few times and I dont' want to incur extra API calls - I think it's safe but wanted to double check here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be totally safe. I do not envision scenario where creds change over the course of processing command.
Still there might be an issue if this command is invoked prematurely (e.g. before variables are resolved), but hopefully there won't be such case.
Also memoizing may influence tests. If there are issues, it might be needed to reset cache here:
Lines 22 to 28 in b4fef7d
// Ensure to reset cache for local serverless installation resolution | |
// Leaking it across test files may introduce wrong assumptions when result is used for testing | |
resolveLocalServerless.clear(); | |
// Ensure to reset memoization on artifacts generation after each test file run. | |
// Reason is that homedir is automatically cleaned for each test, | |
// therefore eventually built custom resource file is also removed | |
_ensureArtifact.clear(); |
In addition, I've testing it extensively on Windows machine, I managed to catch one extra bug there but otherwise it works without issues 👍 |
2417185
to
2f7ae4c
Compare
@medikoo I believe it's ready to be reviewed again - I've added all missing pieces, tested it both on Windows and Linux (including regression with just functions), all feedback will be much appreciated 🙇 |
2f7ae4c
to
20f8420
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @pgrzesik that's outstanding!
Concerning implentation I have just some nitpicks, otherwise I think it'll be nice to improve tests organization and we should be good to go
} | ||
|
||
// We should only keep the images that are tagged with imageNames that are defined in service | ||
const imageTagsDefinedInService = Object.keys( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: It'll be nicer to have it as Set
instance
lib/plugins/aws/provider.js
Outdated
images: { | ||
type: 'object', | ||
patternProperties: { | ||
'^[a-z]{1,32}$': { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume it's case insensitve. For the sake of clarity, I would probably add A-Z
also I think we should support numbers and -
(just not as fist chars), what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's in fact case sensitive and you cannot use uppercase. I totally agree that we can use numbers and additional chars though 👍
lib/plugins/aws/provider.js
Outdated
{ $ref: '#/definitions/ecrImageUri' }, | ||
{ | ||
type: 'string', | ||
pattern: '^[a-z]{1,32}$', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'll be nice to keep this pattern in variable, as e.g. we do here:
serverless/lib/configSchema.js
Line 3 in 30015ea
const functionNamePattern = '^[a-zA-Z0-9-_]+$'; |
lib/plugins/aws/provider.js
Outdated
}); | ||
const { authorizationToken, proxyEndpoint } = result.authorizationData[0]; | ||
const decodedAuthToken = Buffer.from(authorizationToken, 'base64').toString().split(':')[1]; | ||
const dockerArgs = ['login', '--username', 'AWS', '--password-stdin', proxyEndpoint]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think docker has an option to simply pass password as an argument (--password
?).
It'll probably be cleaner to use that, instead of playing with stdin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At first I went with --password
, however, Docker complains that it's insecure to use it --password
directly and suggests to use --password-stdin
instead. See message below:
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
In general, even if password-stdin
is used, there's still another warning:
WARNING! Your password will be stored unencrypted in /home/pgrzesik/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store
But we cannot really do anything meaningful about that one if we want to automatically login people to docker
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think here? Should we go back to --password
or keep the current approach (though I agree that it's so-so with stdin
interaction)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's just a printed warning when --password
is used, and not a deprecation of any kind, I would switch to regular param input
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed 👍
lib/plugins/aws/provider.js
Outdated
const pathToDockerfile = path.join(imagePath, 'Dockerfile'); | ||
|
||
try { | ||
const stat = await fs.promises.stat(pathToDockerfile); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Convention is to name stats
the result of fs.stat
lib/plugins/aws/provider.js
Outdated
); | ||
} | ||
|
||
const [{ repositoryUri, repositoryName }, ,] = await Promise.all([ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I guess trailing commas are not needed
}, | ||
}) | ||
).to.be.rejectedWith( | ||
'Either "uri" or "path" property needs to be set on image "invalidimage"' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's rely on codes, as e.g. here:
serverless/test/unit/lib/plugins/aws/package/compile/events/httpApi.test.js
Lines 456 to 458 in 30015ea
).to.eventually.be.rejected.and.have.property( | |
'code', | |
'EXTERNAL_HTTP_API_AUTHORIZER_WITHOUT_EXTERNAL_HTTP_API' |
(Human readable message are subject to typo's, grammar improvements)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good call 👍
@@ -1756,6 +1756,363 @@ describe('lib/plugins/aws/package/compile/functions/index.test.js', () => { | |||
expect(fooFunctionRole).to.equal(fooFunctionConfig.role); | |||
}); | |||
}); | |||
|
|||
it('should fail if `functions[].image` references image without path and uri', async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't those tests duplicate of ones in deployFunction.test.js
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct, just tested in different use case scenario - do you feel like these should be just moved to provider.test.js
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved these to provider.test.js
from both deployFunction.test.js
and functions.test.js
}, | ||
}; | ||
|
||
it('should fail if `functions[].image` references image without path and uri', async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think those tests should be in provider.test.js
where we confirm custom validation of properties.
(in general I would test functionalities in test files which correspond to files in which they were implemented)
Also I guess we can use package
command
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would you go about ensuring that the resolution happens correctly also for deploy function -f functionName
? While the functionality is in provider.js
, both deploy
and deploy function
use it a bit differently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good point (and I think in tons of other places we forgot about packaging that's implied by sls deploy function
.
I think we should probably ensure one sanity test (happy path) for both sls package
(implied by sls deploy
) and sls deploy function
, but then test specific scenarios on just one of them.
Otherwise ideally it would be to have this functionality implemented independently of Serverless
instance (it could take a imageName and imagePath), and then we can do unit tests against that uitlity, and with runServerless
just confirm happy path for sls package
and sls deploy function
.
Still I guess it's not possible until we have #8445 done (and it's likely that some other things will stand in a way)
(e.g with refactor I'm working on, I'm secluding all those bricks into independent utils, that makes right unit testing easier)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like that suggestion and that's what I went with - happy path cases in both functions.test.js
and deployFunction.test.js
and full coverage in provider.test.js
. I agree that ideally, I would love to have image resolution implemented as independent util, but at the moment it's hard as there are direct dependencies on provider
instance (the biggest one on request
that you mentioned). But I know we're on a good path to better modularization 🙌
4fde59f
to
fd1cdf7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @pgrzesik ! Looks great. I have just two last questions to clarify, and we're good to go
lib/plugins/aws/provider.js
Outdated
]; | ||
try { | ||
const { stdBuffer } = await spawnExt('docker', dockerArgs); | ||
process.stdout.write(stdBuffer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the warning that we use --password
now visible in a console?
If it's the case I would probably not show output of this command by default (what's more important is image operations actions, where it's simple login)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's visible in the console, we could hide it but I'm not sure if we should hide information that we're storing the credentials in an unencrypted way locally. That's what is showing right now from docker login
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
WARNING! Your password will be stored unencrypted in /home/pgrzesik/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store
Login Succeeded
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm.. I think it's confusing for user. It's not user that made a choice of how password is transferred, and visible warning may imply to him that something is not right.
I would hide this output and instead show a visible notice:
Logged Docker successfully into AWS ECR.
Note: that authentication token is stored as unencrypted in /home/pgrzesik/.docker/config.json.
Second line I'd put in orange as we do with other warnings. I believe we should also add a note about that into doc.
Also this triggers another question, if user will configure a credential helper that's mentioned in docs, can we somehow omit the login step? (we may ofc leave it for other PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About displaying that note: it might be tricky as the outputs vary between operating systems (as well as the location of .docker/config.json
) - for example, on Windows machine I didn't get that warning. As for the credential helpers, I believe we still have to perform login for two reasons:
- Tokens obtained from AWS are short-lived and there's no information about their validity stored (we could try to fallback to login only if needed (when push operation is denied e.g.), but that still requires logging in)
- Even if credential helper is used (afaik), it's only about where the tokens are stored, e.g. in some kind of encrypted file instead of plaintext as in default, we still have to
login
, but it's just up todocker
to properly store that token duringlogin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I'm thinking about it more - maybe we shouldn't start with docker login
, but rather only issue it if docker push
command fails due to incorrect credentials?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's also https://github.com/awslabs/amazon-ecr-credential-helper utility that would make docker login
step redundant in case someone uses it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but rather only issue it if docker push command fails due to incorrect credentials?
Good idea! Going that road will definitely be nicer. At least user will have a choice and be able to setup authentication via own means if that's preferred.
but if for some reason recognition of authentication error is tricky I would skip it for now.
About displaying that note: it might be tricky as the outputs vary between operating systems
I would probably then search for a phrase stored unencrypted in <somePath>
, and show our note only if we find it in output, otherwise not show it at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've went for this type of output (warning with orange colour):
Serverless: Login to Docker succeeded!
Serverless: WARNING: Docker authentication token will be stored unencrypted. Configure Docker credential helper to remove this warning.
The push refers to repository [600238737408.dkr.ecr.us-east-1.amazonaws.com/serverless-someproject-dev]
bdfa6eb032c6: Preparing
61b3904fa195: Preparing
6a7f9b3693e2: Preparing
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, I would probably add ..will be stored unencrypted in docker config
, so user is not confused where it's stored in general
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good call 👍
24ccb59
to
da640c3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @pgrzesik I have just one style improvement proposal. Let me know
lib/plugins/aws/provider.js
Outdated
try { | ||
({ stdBuffer: pushDockerStdBuffer } = await spawnExt('docker', pushDockerArgs)); | ||
process.stdout.write(pushDockerStdBuffer); | ||
} catch (outerPushErr) { | ||
let errToThrow = outerPushErr; | ||
if ( | ||
outerPushErr.stdBuffer && | ||
outerPushErr.stdBuffer.includes('no basic auth credentials') | ||
) { | ||
errToThrow = null; | ||
await this.dockerLoginToEcr(); | ||
try { | ||
({ stdBuffer: pushDockerStdBuffer } = await spawnExt('docker', pushDockerArgs)); | ||
process.stdout.write(pushDockerStdBuffer); | ||
} catch (innerPushError) { | ||
errToThrow = innerPushError; | ||
} | ||
} | ||
if (errToThrow) { | ||
throw new this.serverless.classes.Error( | ||
`Encountered error during executing: docker ${pushDockerArgs}\nOutput of the command:\n${errToThrow.stdBuffer}`, | ||
'DOCKER_PUSH_ERROR' | ||
); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it'll be nicer to configure with more functional than imperative approach. Something as:
const pushTag = (args, options = {}) => {
try {
spawn('push tag', args);
} catch (error) {
if (!options.isLoggedIn && notLoggedInError) {
login();
return pushTag(args, { isLoggedIn: true })
}
throw error;
}
}
pushTag(args);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great idea - changed
3b22d79
to
e2a5fab
Compare
e2a5fab
to
a4b317a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pgrzesik Great job!
Last thing that was holding us, was random crashes we observed in coverage build run against Node.js v10.
I've attempted to further investigate it, and so far outcome is as follows:
It looks that the orphaned crash happens either on fs.promises.open
, fs.readFile
or fs.open
call. As it's only those functions (with fs.promises.readFile
) which result with such error with no stack trace (Interestingly callback taking ones, sometimes expose stack trace and sometimes not).
I've excluded the fs.promises.readFile
from that list, as having it patched I was able to reproduce the error without info in error message: https://github.com/serverless/serverless/runs/1707179867
At some points I've attempted to patch fs.readFileSync
(although that was a false call, as those always produce stack trace), and after repeated runs got the error, but this time it was OOM error: https://github.com/serverless/serverless/runs/1708122362
This signals to me, that it's likely we're nearing end of memory and it causes the condition which produces the error we're occasionally seing
I had no further success in reproducing the error, when patching fs.promises.open
, fs.promises.readFile
, fs.readFile
and fs.open
. I've restarted build countless times and it always was successful.
It was during the weekend, so I wonder whether traffic on GitHub Actions servers doesn't influence some more restrictive memory usage which then makes error more likely.
I propose to merge it now, and if we observe the issue again, I'd commit-in the patch for fs
methods to have better stack trace if error happens again. Then having it repeated again we'll decide on further steps.
At this point it looks as condition related to tight memory usage, which may no longer be likely (as e.g. we reverted AJV to v6 which is less consuming)
Thanks @medikoo for extremely detailed report of your investigation and all the time you spend researching it 🙇 Your suggestion about running out of memory makes a lot of sense as it was only visible on constrained CI environment, where locally I've never encountered it. |
this doesn't works.... Your Environment Information --------------------------- Serverless: Configuration warning: What i'm doing wrong? |
Hello @brunocasado 👋 I believe you need to update your |
omg i got. sorry, now i realized that i did an update only in the global serverless and not inside the application. |
No worries @brunocasado - hope it works correctly after upgrade 💯 |
thank you! It's working! My opinion: This works much better than zip method. My problem with layer + python requirements ( size problem) is solved. :) |
Hi, are there any docs/ tutorials for this? I'm struggling to find anything that explains using this new feature. Thanks! |
Hello @JacobWithACapitalJ 👋 We have a blog post that includes a walkthrough: https://www.serverless.com/blog/container-support-for-lambda/ |
@pgrzesik is it possible to create an ECR and upload a local image ? just like it's done with lambdas, but with no lambda? I would like to create an ECR and use it's image in a ECS's task definition, creating the image in the same serverless.yml file. |
Hello @paulacampigotto - at the moment such functionality is not supported, building and uploading image is triggered only if the image definition is used for at least one function. |
This is still WIP, but I wanted to consult a few smaller things before cleaning everything up, providing proper test coverage and docs.
A few questions:
serverless.yml
. What do you think here? Should we try to remove ECR repository only if there areprovider.images
defined?imageSha
- ifdeploy
fails, we won't triggercleanupEcrRepository
, which means the image won't be deleted and Lambdas won't be affected. There will be an untagged + tagged image that is not actually used in the repository, but it's going to be fixed automatically after next successful deployment, so unless someone purposefully removes the image from ECR, there shouldn't be any issues. What is your opinion on that? Do you think it's still worth attempting to re-tag the image in case of failure?docker build
anddocker push
processes displayed simultaneously. I don't think we can "fix" it other than enforcing serial build process. Do you think it's fine to keep the outputs like that?provider
, in addition to images. I was thinking that maybe, we should create an "umbrella" property onprovider
likeecr
ordocker
or something along those lines that will hold all related configurations (provider.docker.images
,provider.docker.repository
, etc)? We might get additional ones in the future (e.g. setting to keep old images instead of removing them). What do you think?spawnExt
? Should we try to catch them and provide a custom error message for user or just fail whenever process fails?docker
is installed on the machine before continuing further? Right now it will fail onbuild
attepmt, but I was thinking that maybe we should fail faster in such case.Closes: #8622