Clients read config from file resulting in EMFILE (too many open files) errors #3019

Closed
moltar opened this issue Nov 12, 2021 · 48 comments · Fixed by #3285
Assignees
Labels
bug This issue is a bug.

Comments

@moltar

moltar commented Nov 12, 2021

Describe the bug

This applies to every auto-generated client, but I will use client-sqs as an example.

This is related to: #2271, #2027, #2993


Note that in the snippet below, loadNodeConfig is called multiple times to get the defaults for the client configuration:

return {
...clientSharedValues,
...config,
runtime: "node",
base64Decoder: config?.base64Decoder ?? fromBase64,
base64Encoder: config?.base64Encoder ?? toBase64,
bodyLengthChecker: config?.bodyLengthChecker ?? calculateBodyLength,
credentialDefaultProvider:
config?.credentialDefaultProvider ?? decorateDefaultCredentialProvider(credentialDefaultProvider),
defaultUserAgentProvider:
config?.defaultUserAgentProvider ??
defaultUserAgent({ serviceId: clientSharedValues.serviceId, clientVersion: packageInfo.version }),
maxAttempts: config?.maxAttempts ?? loadNodeConfig(NODE_MAX_ATTEMPT_CONFIG_OPTIONS),
md5: config?.md5 ?? Hash.bind(null, "md5"),
region: config?.region ?? loadNodeConfig(NODE_REGION_CONFIG_OPTIONS, NODE_REGION_CONFIG_FILE_OPTIONS),
requestHandler: config?.requestHandler ?? new NodeHttpHandler(),
retryMode: config?.retryMode ?? loadNodeConfig(NODE_RETRY_MODE_CONFIG_OPTIONS),
sha256: config?.sha256 ?? Hash.bind(null, "sha256"),
streamCollector: config?.streamCollector ?? streamCollector,
useDualstackEndpoint: config?.useDualstackEndpoint ?? loadNodeConfig(NODE_USE_DUALSTACK_ENDPOINT_CONFIG_OPTIONS),
useFipsEndpoint: config?.useFipsEndpoint ?? loadNodeConfig(NODE_USE_FIPS_ENDPOINT_CONFIG_OPTIONS),
utf8Decoder: config?.utf8Decoder ?? fromUtf8,
utf8Encoder: config?.utf8Encoder ?? toUtf8,

loadNodeConfig is the exported function loadConfig from the @aws-sdk/node-config-provider package:

export const loadConfig = <T = string>(
{ environmentVariableSelector, configFileSelector, default: defaultValue }: LoadedConfigSelectors<T>,
configuration: LocalConfigOptions = {}
): Provider<T> =>
memoize(
chain(
fromEnv(environmentVariableSelector),
fromSharedConfigFiles(configFileSelector, configuration),
fromStatic(defaultValue)
)
);

loadConfig then uses fromSharedConfigFiles to load config values from disk.

This is a very undesirable "feature", especially in serverless environments, because under heavy load it results in the following errors:

    error: {
      "type": "NodeError",
      "message": "A system error occurred: uv_os_homedir returned EMFILE (too many open files)",
      "stack":
          SystemError [ERR_SYSTEM_ERROR]: A system error occurred: uv_os_homedir returned EMFILE (too many open files)
              at Object.getHomeDir (/node_modules/@aws-sdk/shared-ini-file-loader/dist-cjs/index.js:82:17)
              at Object.loadSharedConfigFiles (/node_modules/@aws-sdk/shared-ini-file-loader/dist-cjs/index.js:11:89)
              at null.<anonymous> (/node_modules/@aws-sdk/node-config-provider/dist-cjs/fromSharedConfigFiles.js:9:53)
              at null.<anonymous> (/node_modules/@aws-sdk/property-provider/dist-cjs/chain.js:11:28)
              at runMicrotasks (<anonymous>)
              at processTicksAndRejections (internal/process/task_queues.js:95:5)
              at null.coalesceProvider (/node_modules/@aws-sdk/property-provider/dist-cjs/memoize.js:13:24)
              at Object.isConstant (/node_modules/@aws-sdk/property-provider/dist-cjs/memoize.js:24:28)
              at Object.getEndpointFromRegion (/node_modules/@aws-sdk/config-resolver/dist-cjs/endpointsConfig/utils/getEndpointFromRegion.js:12:34)
              at null.buildHttpRpcRequest (/node_modules/@aws-sdk/client-sqs/dist-cjs/protocols/Aws_query.js:2540:68)
      "code": "ERR_SYSTEM_ERROR",
      "info": {
        "errno": -24,
        "code": "EMFILE",
        "message": "too many open files",
        "syscall": "uv_os_homedir"
      },
      "errno": -24,
      "syscall": "uv_os_homedir"
    }

Yes, the call is memoized, but if your Lambda is executed heavily and the client is instantiated within the handler, this call still happens many times.
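To illustrate why memoization alone does not help here, a minimal sketch follows. The `memoize` below is a simplified stand-in, not the SDK's implementation, and `FakeClient` and `readRegionFromDisk` are hypothetical names that only count simulated disk reads.

```typescript
// A provider is an async function that resolves a config value.
type Provider<T> = () => Promise<T>;

// Simplified stand-in for memoize(): keeps the first promise and reuses it.
function memoize<T>(provider: Provider<T>): Provider<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (!cached) {
      cached = provider();
    }
    return cached;
  };
}

// Counts how often the simulated disk read runs.
let diskReads = 0;
const readRegionFromDisk: Provider<string> = async () => {
  diskReads++;
  return "us-east-1";
};

// Hypothetical client: builds a fresh memoized provider in its constructor,
// so the cache belongs to that one instance only.
class FakeClient {
  readonly region: Provider<string>;
  constructor() {
    this.region = memoize(readRegionFromDisk);
  }
}

async function demo(): Promise<number> {
  const a = new FakeClient();
  await a.region();
  await a.region(); // served from a's cache, no second read
  const b = new FakeClient();
  await b.region(); // new instance, new cache, reads again
  return diskReads;
}
```

Calling `demo()` once resolves to 2: one read per client instance, regardless of how often each instance asks for the value.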

Your environment

SDK version number

@aws-sdk/client-sqs@3.40.0

Is the issue in the browser/Node.js/ReactNative?

Node.js

Details of the browser/Node.js/ReactNative version

node -v v14.17.5

Steps to reproduce

import { SQS } from '@aws-sdk/client-sqs'

const sqs = new SQS({})

Observed behavior

    error: {
      "type": "NodeError",
      "message": "A system error occurred: uv_os_homedir returned EMFILE (too many open files)",
      "stack":
          SystemError [ERR_SYSTEM_ERROR]: A system error occurred: uv_os_homedir returned EMFILE (too many open files)
              at Object.getHomeDir (/node_modules/@aws-sdk/shared-ini-file-loader/dist-cjs/index.js:82:17)
              at Object.loadSharedConfigFiles (/node_modules/@aws-sdk/shared-ini-file-loader/dist-cjs/index.js:11:89)
              at null.<anonymous> (/node_modules/@aws-sdk/node-config-provider/dist-cjs/fromSharedConfigFiles.js:9:53)
              at null.<anonymous> (/node_modules/@aws-sdk/property-provider/dist-cjs/chain.js:11:28)
              at runMicrotasks (<anonymous>)
              at processTicksAndRejections (internal/process/task_queues.js:95:5)
              at null.coalesceProvider (/node_modules/@aws-sdk/property-provider/dist-cjs/memoize.js:13:24)
              at Object.isConstant (/node_modules/@aws-sdk/property-provider/dist-cjs/memoize.js:24:28)
              at Object.getEndpointFromRegion (/node_modules/@aws-sdk/config-resolver/dist-cjs/endpointsConfig/utils/getEndpointFromRegion.js:12:34)
              at null.buildHttpRpcRequest (/node_modules/@aws-sdk/client-sqs/dist-cjs/protocols/Aws_query.js:2540:68)
      "code": "ERR_SYSTEM_ERROR",
      "info": {
        "errno": -24,
        "code": "EMFILE",
        "message": "too many open files",
        "syscall": "uv_os_homedir"
      },
      "errno": -24,
      "syscall": "uv_os_homedir"
    }

Expected behavior

  1. Do not error out
  2. Do not read config from disk by default, or allow overriding this behaviour.

Screenshots

N/A

Additional context

N/A

@moltar moltar added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Nov 12, 2021
@moltar
Author

moltar commented Nov 12, 2021

Added a console.log to slurpFile:

const slurpFile = (path: string): Promise<string> =>

const slurpFile = (path) => new Promise((resolve, reject) => {
    console.log(path, new Error().stack)

And tested with the following two-liner file:

import { SQS } from '@aws-sdk/client-sqs'
const sqs = new SQS({})

Results

/Users/user/.aws/config Error: 
    at /Users/user/project/node_modules/@aws-sdk/shared-ini-file-loader/dist-cjs/index.js:66:23
    at new Promise (<anonymous>)
    at slurpFile (/Users/user/project/node_modules/@aws-sdk/shared-ini-file-loader/dist-cjs/index.js:64:29)
    at Object.loadSharedConfigFiles (/Users/user/project/node_modules/@aws-sdk/shared-ini-file-loader/dist-cjs/index.js:13:9)
    at defaultProvider (/Users/user/project/node_modules/@aws-sdk/credential-provider-node/dist-cjs/index.js:17:57)
    at Object.credentialDefaultProvider (/Users/user/project/node_modules/@aws-sdk/client-sts/dist-cjs/defaultRoleAssumers.js:10:68)
    at Object.resolveAwsAuthConfig (/Users/user/project/node_modules/@aws-sdk/middleware-signing/dist-cjs/configurations.js:10:17)
    at new SQSClient (/Users/user/project/node_modules/@aws-sdk/client-sqs/dist-cjs/SQSClient.js:20:48)
    at new SQS (/Users/user/project/node_modules/@aws-sdk/client-sqs/dist-cjs/SQS.js:25:1)
    at Object.<anonymous> (/Users/user/project/test/sdk.ts:3:13)
/Users/user/.aws/credentials Error: 
    at /Users/user/project/node_modules/@aws-sdk/shared-ini-file-loader/dist-cjs/index.js:66:23
    at new Promise (<anonymous>)
    at slurpFile (/Users/user/project/node_modules/@aws-sdk/shared-ini-file-loader/dist-cjs/index.js:64:29)
    at Object.loadSharedConfigFiles (/Users/user/project/node_modules/@aws-sdk/shared-ini-file-loader/dist-cjs/index.js:14:9)
    at defaultProvider (/Users/user/project/node_modules/@aws-sdk/credential-provider-node/dist-cjs/index.js:17:57)
    at Object.credentialDefaultProvider (/Users/user/project/node_modules/@aws-sdk/client-sts/dist-cjs/defaultRoleAssumers.js:10:68)
    at Object.resolveAwsAuthConfig (/Users/user/project/node_modules/@aws-sdk/middleware-signing/dist-cjs/configurations.js:10:17)
    at new SQSClient (/Users/user/project/node_modules/@aws-sdk/client-sqs/dist-cjs/SQSClient.js:20:48)
    at new SQS (/Users/user/project/node_modules/@aws-sdk/client-sqs/dist-cjs/SQS.js:25:1)
    at Object.<anonymous> (/Users/user/project/test/sdk.ts:3:13)
/Users/user/.aws/config Error: 
    at /Users/user/project/node_modules/@aws-sdk/shared-ini-file-loader/dist-cjs/index.js:66:23
    at new Promise (<anonymous>)
    at slurpFile (/Users/user/project/node_modules/@aws-sdk/shared-ini-file-loader/dist-cjs/index.js:64:29)
    at Object.loadSharedConfigFiles (/Users/user/project/node_modules/@aws-sdk/shared-ini-file-loader/dist-cjs/index.js:13:9)
    at /Users/user/project/node_modules/@aws-sdk/node-config-provider/dist-cjs/fromSharedConfigFiles.js:9:53
    at /Users/user/project/node_modules/@aws-sdk/property-provider/dist-cjs/chain.js:11:28
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at coalesceProvider (/Users/user/project/node_modules/@aws-sdk/property-provider/dist-cjs/memoize.js:13:24)
    at isConstant (/Users/user/project/node_modules/@aws-sdk/property-provider/dist-cjs/memoize.js:24:28)
/Users/user/.aws/credentials Error: 
    at /Users/user/project/node_modules/@aws-sdk/shared-ini-file-loader/dist-cjs/index.js:66:23
    at new Promise (<anonymous>)
    at slurpFile (/Users/user/project/node_modules/@aws-sdk/shared-ini-file-loader/dist-cjs/index.js:64:29)
    at Object.loadSharedConfigFiles (/Users/user/project/node_modules/@aws-sdk/shared-ini-file-loader/dist-cjs/index.js:14:9)
    at /Users/user/project/node_modules/@aws-sdk/node-config-provider/dist-cjs/fromSharedConfigFiles.js:9:53
    at /Users/user/project/node_modules/@aws-sdk/property-provider/dist-cjs/chain.js:11:28
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at coalesceProvider (/Users/user/project/node_modules/@aws-sdk/property-provider/dist-cjs/memoize.js:13:24)
    at isConstant (/Users/user/project/node_modules/@aws-sdk/property-provider/dist-cjs/memoize.js:24:28)

As you can see, it is trying to read the same files twice, and makes at least 4 IO operations per instance of a client.

@moltar
Author

moltar commented Nov 12, 2021

Now, making this a bit more realistic and instantiating SQS client in a loop.

This would be the case when it is instantiated inside a lambda handler.

Well, it wouldn't be a loop, but it would be instantiated multiple times when lambda is being executed warm, and very frequently.

import { SQS } from '@aws-sdk/client-sqs'

for (const iter of [1, 2]) {
  const sqs = new SQS({})
}

Then:

ts-node test/sdk.ts | grep Error | wc -l

Results

This results in 8 slurpFile calls for just two client instances!

So the results are never cached globally, and are simply cached per client instance.

@moltar
Author

moltar commented Nov 12, 2021

Here's a way to remove unnecessary calls:

import { SQS } from '@aws-sdk/client-sqs'

const sqs = new SQS({
  // removes first 2 calls
  credentials: {
    accessKeyId: 'x',
    secretAccessKey: '',
  },

  // removes second 2 calls
  defaultUserAgentProvider: async () => [],
})

@moltar
Author

moltar commented Nov 12, 2021

But wait ... there is more!

If you actually try to use the methods of the queue, this will result in even more calls!

import { SQS } from '@aws-sdk/client-sqs'

const sqs = new SQS({
  credentials: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID || '',
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY || '',
    sessionToken: process.env.AWS_SESSION_TOKEN || '',
  },
  region: 'us-east-1',
  defaultUserAgentProvider: async () => [],
})

async function main() {
  await sqs.listQueues({})
}

main()

Results

$> ts-node test/sdk.ts | grep Error | wc -l

8

So we get 8 IO calls for each method call (I tried sendMessage too, it's the same)!

This is ridiculous and needs to be addressed immediately, as it is causing major issues!

@moltar
Author

moltar commented Nov 12, 2021

Interestingly, calling listQueues in a loop does not cause additional calls. So, I guess the results do get cached.

@moltar
Author

moltar commented Nov 12, 2021

If anyone stumbles on this, the only way I found to shut off the file reading is the following hack.

import * as sharedIniFileLoader from '@aws-sdk/shared-ini-file-loader'

Object.assign(sharedIniFileLoader, {
  loadSharedConfigFiles: async (): Promise<sharedIniFileLoader.SharedConfigFiles> => ({
    configFile: {},
    credentialsFile: {},
  }),
})

@moltar moltar closed this as completed Nov 12, 2021
@moltar moltar reopened this Nov 12, 2021
@vudh1 vudh1 self-assigned this Nov 15, 2021
@petermorlion

I can confirm that @moltar 's workaround solves this issue in AWS Lambda functions.

@adrai

adrai commented Jan 30, 2022

The aws-sdk should detect if running in lambda and omit the loadSharedConfigFiles call completely, or at least provide a "supported" way to disable the loadSharedConfigFiles call...

@adrai

adrai commented Jan 31, 2022

Without the workaround suggested by @moltar, v3.49.0 seems to be even worse and opens more file descriptors, probably because of #3192?
//cc @AllanZhengYP @trivikr

@adrai

adrai commented Jan 31, 2022

OK, it seems that with v3.49.0 these types of EMFILE errors can still occur (even with that workaround):

Error: connect EMFILE 52.94.5.100:443 - Local (undefined:undefined)
Error: getaddrinfo EMFILE dynamodb.eu-west-1.amazonaws.com
Error: A system error occurred: uv_os_homedir returned EMFILE (too many open files)

P.S. A Lambda setup like this one increases the probability of such EMFILE errors: https://github.com/fastify/aws-lambda-fastify#lower-cold-start-latency

@adrai

adrai commented Jan 31, 2022

import { DynamoDBClient } from '@aws-sdk/client-dynamodb'
import fs from 'fs'
import { lsof } from 'list-open-files'

let readFileCount = 0
const originalReadFile = fs.readFile
fs.readFile = function (f) {
  // console.log(`reading file: ${f}`)
  readFileCount++
  return originalReadFile.apply(fs, arguments)
}

const clientCreations = 1000
for (let index = 0; index < clientCreations; index++) {
  new DynamoDBClient()
}

console.log(`made ${clientCreations} client instances`)
console.log(`called readFile ${readFileCount} times`)

lsof().then((ret) => {
  const openFiles = ret[0].files.length
  console.log(`${openFiles} open fds`)

  lsof().then((ret) => {
    const openFiles = ret[0].files.length
    console.log(`${openFiles} open fds a bit later`)
  })
})

Results

made 1000 client instances
called readFile 2000 times
4030 open fds
30 open fds a bit later

Besides opening too many file descriptors by reading non-existent config and credential files in Lambda...
Is it possible that Lambda is not "waiting" for the file descriptors to be closed? That would make the problem worse in warm starts.

As it seems there is more generating open file descriptors in v3.47.0 - v3.49.0 than just the readFile calls, I am currently staying on v3.46.0 with the above workaround!

@ffxsam

ffxsam commented Feb 1, 2022

@AllanZhengYP @ajredniwja Could you please check into this when you get a chance? This is a serious issue that impacted our production system recently.

@adrai

adrai commented Feb 3, 2022

@vudh1 you assigned this issue to yourself, can you say something about it?

@adrai

adrai commented Feb 3, 2022

fyi: found an interesting package: https://github.com/samswen/lambda-emfiles

made some tests:

Some insights (same workload for all tests)

1) v3.46.0 without any workarounds:

=> some leaks (around 230 emfiles)

2) v3.49.0 without any workarounds:

=> some leaks and more emfiles (up to more than 600 emfiles)

3) v3.49.0 with loadSharedConfigFiles workaround:

=> far fewer leaks, probably also fewer emfiles (up to more than 400 emfiles)

4) v3.46.0 with loadSharedConfigFiles workaround:

=> far fewer leaks, far fewer emfiles (up to 180 emfiles)

So it seems "4) v3.46.0 with loadSharedConfigFiles workaround" is the best!

@adrai

adrai commented Feb 3, 2022

An additional curiosity:

Defining a custom requestHandler with a very low socketTimeout drastically reduces the emfiles count:

  requestHandler: new NodeHttpHandler({
      socketTimeout: 10000 // <- this decreases the emfiles count, the Node.js default is 120000
  })

@adrai

adrai commented Feb 3, 2022

Yes, can confirm:

still some leaks but, loadSharedConfigFiles workaround + lower socketTimeout = 🚀 👍 (a lot less emfiles)

@moltar
Author

moltar commented Feb 4, 2022

An additional curiosity:

Defining a custom requestHandler with a very low socketTimeout drastically reduces the emfiles count:

  requestHandler: new NodeHttpHandler({
      socketTimeout: 10000 // <- this decreases the emfiles count, the Node.js default is 120000
  })

Great find! Any idea what the default value is?

Edit: I see the comment: 120000.

@moltar
Author

moltar commented Feb 4, 2022

Wondering if this timeout perhaps needs to equal the Lambda timeout? Maybe it could be set automatically based on getRemainingTimeInMillis?

https://docs.aws.amazon.com/lambda/latest/dg/nodejs-context.html
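A rough sketch of that idea, assuming only that the context object exposes `getRemainingTimeInMillis()`. The helper name `socketTimeoutFor`, the 10000 ms cap, and the 500 ms safety margin are all hypothetical choices for illustration, not anything the SDK provides.

```typescript
// Minimal shape of the Lambda context method referenced above.
interface RemainingTimeContext {
  getRemainingTimeInMillis(): number;
}

// Hypothetical helper: picks a socket timeout that never outlives the invocation.
// The result is capped at maxMs and never drops below 1 ms.
function socketTimeoutFor(
  context: RemainingTimeContext,
  maxMs: number = 10000,
  safetyMarginMs: number = 500
): number {
  const budget = context.getRemainingTimeInMillis() - safetyMarginMs;
  return Math.max(1, Math.min(maxMs, budget));
}
```

With 3000 ms remaining this returns 2500; with a full minute remaining it returns the 10000 ms cap.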

@moltar
Author

moltar commented Feb 4, 2022

The only downside with this approach would be that the client has to be instantiated inside the handler. And I usually like to have them outside of the handler to preserve the instances across invocations.
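For reference, the "outside of the handler" pattern can be sketched as below. `FakeQueueClient` is a hypothetical stand-in for an SDK client so that the example runs without any dependencies; it only counts how often it is constructed.

```typescript
// Hypothetical stand-in for an SDK client; counts constructions.
class FakeQueueClient {
  static constructed = 0;
  constructor() {
    FakeQueueClient.constructed++;
  }
  async sendMessage(body: string): Promise<string> {
    return `sent: ${body}`;
  }
}

// Module scope: the instance is kept across warm invocations
// that run in the same execution environment.
let sharedClient: FakeQueueClient | undefined;

function getClient(): FakeQueueClient {
  if (!sharedClient) {
    sharedClient = new FakeQueueClient();
  }
  return sharedClient;
}

// The handler only borrows the shared instance; it never constructs one itself.
async function handler(event: { body: string }): Promise<string> {
  return getClient().sendMessage(event.body);
}
```

However many times `handler` runs in one warm environment, the constructor (and whatever config loading it triggers) runs once.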

@adrai

adrai commented Feb 4, 2022

As this seems to be the timeout for an idle socket, I think in Lambda it could automatically default to a much lower value, at least an order of magnitude lower... so 12 seconds instead of 2 minutes... and even lower, 1 second for example, should work.

It would be nice if an AWS engineer could weigh in on this topic.

proposal:

// https://github.com/aws/aws-sdk-js-v3/blob/main/packages/node-http-handler/src/node-http-handler.ts#L62
socketTimeout: socketTimeout || 10000

and probably also here?

// https://github.com/aws/aws-sdk-js-v3/blob/main/packages/node-http-handler/src/node-http2-handler.ts#L44
this.requestTimeout = requestTimeout || 10000;
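One detail worth checking in a change like this: `value || 10000` also replaces an explicit `0`, and in Node's `socket.setTimeout` a value of 0 means the idle timeout is disabled. A minimal sketch of a default that applies only when the option is omitted (`resolveSocketTimeout` and the constant name are hypothetical, and 10000 ms is just the figure from the proposal):

```typescript
// Default taken from the proposal above; not an SDK constant.
const DEFAULT_SOCKET_TIMEOUT_MS = 10000;

// Uses ?? so that only undefined/null fall back to the default.
// An explicit 0 is passed through unchanged.
function resolveSocketTimeout(socketTimeout?: number): number {
  return socketTimeout ?? DEFAULT_SOCKET_TIMEOUT_MS;
}

// For comparison: the || form treats 0 as "not set".
function resolveSocketTimeoutWithOr(socketTimeout?: number): number {
  return socketTimeout || DEFAULT_SOCKET_TIMEOUT_MS;
}
```

Whether an explicit 0 should be honored is a design decision for the maintainers; the sketch only shows that the two operators differ for that input.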

@ffxsam

ffxsam commented Feb 4, 2022

This issue has been open for nearly three months and no one from AWS has chimed in. I'll reach out to my account manager and see if they can help move things along here.

@adrai

adrai commented Feb 4, 2022

This issue has been open for nearly three months and no one from AWS has chimed in. I'll reach out to my account manager and see if they can help move things along here.

https://twitter.com/trivikram/status/1489588283726204928?s=21

@ffxsam

ffxsam commented Feb 4, 2022

Thanks for that update!

@trivikr
Member

trivikr commented Feb 4, 2022

AWS SDK for JavaScript team discussed this issue in our scrum today.
We're evaluating adding some kind of lock/mutex to fromSharedConfigFiles function, and I'll provide an update here.

@adrai

adrai commented Feb 4, 2022

The leaking file descriptors of the readFile calls is just one part... the file descriptors of the http requests are the other part of the issue, which could be mitigated with a lower socketTimeout value.

@adrai Can you create a new bug report? It would be easy for tracking fixes.

here we go ;-) #3279

@ffxsam

ffxsam commented Feb 4, 2022

Pardon any confusion on my part, but it seems like the file descriptor issue is happening for every single API call, not just upon instantiation. I'm only instantiating S3Client once, but then I call CopyObject 2000 times, and I get the EMFILE error. Does this match other people's experiences?

@adrai

adrai commented Feb 4, 2022

Pardon any confusion on my part, but it seems like the file descriptor issue is happening for every single API call, not just upon instantiation. I'm only instantiating S3Client once, but then I call CopyObject 2000 times, and I get the EMFILE error. Does this match other people's experiences?

That's why I created #3279 😉

@trivikr
Member

trivikr commented Feb 4, 2022

Hi folks, I've posted WIP PR to remove concurrent/duplicate calls in slurpFile at #3281 #3282

Do post your comments if you take a look.
cc folks who had commented on this issue: @adrai @ffxsam @moltar @petermorlion

@ffxsam

ffxsam commented Feb 4, 2022

@trivikr Thanks, Trivikram, much appreciated!

@moltar
Author

moltar commented Feb 4, 2022

Hi folks, I've posted WIP PR to remove concurrent/duplicate calls in slurpFile at #3281 #3282

Do post your comments if you take a look.

cc folks who had commented on this issue: @adrai @ffxsam @moltar @petermorlion

Thank you for looking at the issue.

But why do we need to have any calls in the lambda environment?

We need some way to disable config reading completely.

Thank you.

@trivikr
Member

trivikr commented Feb 7, 2022

Hi folks, I've posted WIP PR to remove concurrent/duplicate calls in slurpFile at #3281 #3282

Since I spent too much time on this issue, I refactored shared-ini-file-loader into multiple components and posted an update in #3285

The fix is a merge of both #3281 and #3282:

  • I had to use a hash and a promise callback queue on slurpFile so that the hash can be specific to a path, and not a mix of configPath+credsPath
  • The lastModified key is removed from the hash, as the data is memoized downstream anyway and readFile will be called just once for any path.
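The general technique of keying reads by path can be sketched as follows. This is an illustration under stated assumptions, not the code merged in #3285: `dedupeByPath` and `Reader` are hypothetical names, and the reader is injected so the example needs no file system access.

```typescript
// Any function that reads a path and resolves to its contents.
type Reader = (path: string) => Promise<string>;

// Wraps a reader so that each path is read at most once.
// Concurrent and later callers for the same path share one promise.
function dedupeByPath(read: Reader): Reader {
  const promisesByPath = new Map<string, Promise<string>>();
  return (path) => {
    let pending = promisesByPath.get(path);
    if (!pending) {
      pending = read(path);
      promisesByPath.set(path, pending);
    }
    return pending;
  };
}
```

Note that in this simple form a rejected promise stays cached too, so a failed read is never retried; a production version would need to decide how to handle that.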

@trivikr
Member

trivikr commented Feb 7, 2022

But why do we need to have any calls in the lambda environment?
We need some way to disable config reading completely.

@moltar Can you create a feature request for this question/proposal?
You can also link #3279 in your feature request under additional context.

@moltar
Author

moltar commented Feb 7, 2022

I feel this is part of the same issue, no?

The client should not check the disk for credentials if there are env variables set, which is always the case in the Lambda env.

I can see it still trying to load ~/.aws/config with env vars present though, which might be valid, in a weird way. E.g. if someone puts the config there during Lambda execution, before SDK instantiation, for whatever reason.

I think one approach would be to continue to make use of the provider pattern, as it is already applied. Just make sure the SDK clients respect any explicit providers given. E.g. there could be a ConfigProvider and a CredentialsProvider; when supplying both with instances of (imaginary) ConfigFromMemoryProvider and CredentialsFromEnvironmentProvider, the client would avoid any disk IO and use only the providers.

If you think this is a viable approach, I will open a separate issue.
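A minimal sketch of what such an environment-only provider could look like. Everything here is hypothetical and mirrors the imaginary CredentialsFromEnvironmentProvider above; the environment is passed in as a plain object so that the example has no dependencies.

```typescript
interface Credentials {
  accessKeyId: string;
  secretAccessKey: string;
  sessionToken?: string;
}

type CredentialProvider = () => Promise<Credentials>;

// Hypothetical provider: resolves credentials from the given environment
// object only, and fails fast instead of falling back to files on disk.
function credentialsFromEnvironment(
  env: Record<string, string | undefined>
): CredentialProvider {
  return async () => {
    const accessKeyId = env["AWS_ACCESS_KEY_ID"];
    const secretAccessKey = env["AWS_SECRET_ACCESS_KEY"];
    if (!accessKeyId || !secretAccessKey) {
      throw new Error("AWS credentials are not set in the environment");
    }
    return { accessKeyId, secretAccessKey, sessionToken: env["AWS_SESSION_TOKEN"] };
  };
}
```

In a Lambda handler one would pass `process.env` as `env`; the point of the pattern is that a client given such a provider explicitly has no reason to touch ~/.aws at all.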

@trivikr
Member

trivikr commented Feb 7, 2022

I feel this is part of the same issue, no?

A request to create a separate issue is for better tracking. This issue has already got 40 comments.

The discussions in this request are currently for reducing readFile calls. In the other issue, we can limit the discussions about skipping readFiles altogether in Lambda environments as you suggested, and how to approach that in AWS SDK for JavaScript and other SDKs.

I think one approach would be to continue to make use of the provider pattern, as it is already applied. Just make sure the SDK clients respect any explicit providers given. E.g. there could be a ConfigProvider and a CredentialsProvider; when supplying both with instances of (imaginary) ConfigFromMemoryProvider and CredentialsFromEnvironmentProvider, the client would avoid any disk IO and use only the providers.
If you think this is a viable approach, I will open a separate issue.

Yup, this is one of the approaches. Please create a feature request with that suggestion, and other alternatives if you have any.

@manchicken

I'm still seeing this issue in 3.52.0.

@adrai

adrai commented Feb 22, 2022

I'm still seeing this issue in 3.52.0.

It's probably caused by the sdk requests: #3279

@manchicken

@adrai totally possible. I'm trying @moltar's work-around right now to see if that helps.

@trivikr
Member

trivikr commented Feb 23, 2022

@adrai totally possible. I'm trying @moltar's work-around right now to see if that helps.

The workaround in #3019 (comment) will remove the remaining two readFile calls per client, but #3279 will still remain.

Please use low socket timeouts as suggested by @adrai

@github-actions

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs and link to relevant comments in this thread.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 10, 2022
@kellertk
Contributor

Reopening this issue per https://twitter.com/thdxr/status/1696615147295379586

@kellertk kellertk reopened this Aug 29, 2023
@kellertk kellertk added the needs-triage This issue or PR still needs to be triaged. label Aug 29, 2023
@trivikr trivikr removed the needs-triage This issue or PR still needs to be triaged. label Aug 30, 2023
@trivikr
Member

trivikr commented Aug 30, 2023

From the tweet

in terms of areas of improvements with these things it's typically death by a thousand cut - the current struggle i'm having is this issue that's been closed and unclear where it landed

It's very clear that the issue discussed in this specific report was resolved in #3285
The testing section of the PR shows how the number of readFile calls is reduced from 4/8/16 to 2 in different scenarios.

I also verified that another issue created as a follow-up was also resolved in #3279 (comment), where we resolve config provider only once.

@trivikr trivikr closed this as completed Aug 30, 2023
@trivikr
Member

trivikr commented Aug 30, 2023

If you're coming across EMFILE or any other issues in your setup, please create a new feature request or upvote existing ones if they're already reported.
