
Getting EMFILE error on @aws-sdk/client-s3 #5273

Closed
3 tasks done
sumangalam17 opened this issue Sep 28, 2023 · 13 comments
Assignees
Labels
bug This issue is a bug. closed-for-staleness p2 This is a standard priority issue response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days.

Comments


sumangalam17 commented Sep 28, 2023

Checkboxes for prior research

Describe the bug

I am calling this function from sendMessage and getting the error below:

const { S3 } = require('@aws-sdk/client-s3');

const saveToS3 = async (bucketName, data) => {
  const uploadParams = {
    Bucket: bucketName,
    Body: JSON.stringify(data)
  };

  const s3 = new S3({ maxAttempts: 3 });

  await s3.putObject(uploadParams)
    .then((data) => {
      logger.info(
        `${componentName}: saveToS3: Successfully uploaded file: `, data);
      return data;
    })
    .catch(err => {
      logger.error(
        `${componentName}: saveToS3: There was an error uploading file: `, err);
    });
}

Error:

{
    "errno": -24,
    "code": "EMFILE",
    "syscall": "getaddrinfo",
    "hostname": "s3.us-east-1.amazonaws.com",
    "$metadata": {
        "attempts": 1,
        "totalRetryDelay": 0
    },
    "stack": "Error: getaddrinfo EMFILE us-east-1.amazonaws.com\n    at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:108:26)"
}

I tried every possibility mentioned in issue #4345, such as:
instantiating the client globally
calling the destroy method
limiting connections to 1

SDK version number

@aws-sdk/client-s3@3.418.0

Which JavaScript Runtime is this issue in?

Node.js

Details of the browser/Node.js/ReactNative version

@types/node 20.2.1

Reproduction Steps

Calling this function many times from sendMessage.

Observed Behavior

N/A

Expected Behavior

N/A

Possible Solution

No response

Additional Information/Context

No response

@sumangalam17 sumangalam17 added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Sep 28, 2023
@yenfryherrerafeliz yenfryherrerafeliz self-assigned this Sep 28, 2023
@ajredniwja
Member

Hi @sumangalam17, thanks for opening this issue. It seems like you are using v2 of the SDK and not `@aws-sdk/client-s3`.

I would recommend using try/catch with await.

Can you try changing your code to something like the below?

For v2:

const AWS = require('aws-sdk');

const s3 = new AWS.S3({ maxRetries: 3 }); // v2 uses maxRetries rather than v3's maxAttempts

const saveToS3 = async(bucketName, data) => {
    const uploadParams = {
        Bucket: bucketName,
        Body: JSON.stringify(data),
        Key: "file.json"
    };

    try {
        const response = await s3.putObject(uploadParams).promise();
        console.log(`Successfully uploaded the file: `, response);
        return response;
    } catch (err) {
        console.error(`Error: uploading`, err);
        throw err;
    }
}

V3 would look like:

const { S3Client, PutObjectCommand } = require("@aws-sdk/client-s3");

const s3 = new S3Client({ region: "YOUR_REGION" });

const saveToS3 = async (bucketName, data) => {
    const uploadParams = {
        Bucket: bucketName,
        Body: JSON.stringify(data),
        Key: "file.json"
    };

    try {
        const response = await s3.send(new PutObjectCommand(uploadParams));
        console.log(`Successfully uploaded the file: `, response);
        return response;
    } catch (err) {
        console.error(`Error: uploading`, err);
        throw err;
    }
}

I tried calling this function several times and wasn't able to reproduce the error.

(async() => {
    try {
        const bucketName = "bucket-name";

        for (let i = 0; i < 1000; i++) {
            const data = { key: `xyz-${i}` }; 
            await saveToS3(bucketName, data);
        }

    } catch (error) {
        console.error("Error uploading:", error);
    }
})();

Please let us know if that doesn't help. Thanks!

@ajredniwja ajredniwja added p2 This is a standard priority issue response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. and removed needs-triage This issue or PR still needs to be triaged. labels Sep 29, 2023

sumangalam17 commented Oct 2, 2023

> I tried calling this function several times and wasn't able to reproduce the error.

Actually, it's not due to calling that function many times. It's because we trigger sendMessage, which in turn triggers this function, creating both the client and the call each time.

I tried using v3 with try/catch as well, but it is still not working.

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Oct 3, 2023

jameshartt commented Oct 17, 2023

We've definitely seen this since we recently upgraded our v3 AWS dependencies; we had to roll back that upgrade. We saw it consistently across many services. We haven't had time to investigate it properly, but it looked like TCP connections were potentially exceeding what a given Lambda can cope with. It only happened on Lambda functions that were being invoked frequently.


DesAWSume commented Nov 20, 2023

I am in the same boat here, using the JavaScript SDK v3 S3 client, running in Lambda.
It randomly throws this error:

{
    "errorType": "Error",
    "errorMessage": "bind EMFILE 0.0.0.0",
    "code": "EMFILE",
    "errno": -24,
    "syscall": "bind",
    "address": "0.0.0.0",
    "stack": [
        "Error: bind EMFILE 0.0.0.0",
        "    at __node_internal_captureLargerStackTrace (node:internal/errors:496:5)",
        "    at __node_internal_exceptionWithHostPort (node:internal/errors:671:12)",
        "    at node:dgram:364:20",
        "    at process.processTicksAndRejections (node:internal/process/task_queues:83:21)"
    ]
}

Going through my code, I am always using await with the PutObjectToS3 function. Can someone investigate this?

@praneetrattan

We have been facing this issue too, but with DynamoDB. We had a Lambda invoked many times concurrently (~503), and this Lambda queries DynamoDB. Of those 503 invocations, about 93 failed with the error:

{
  "errorType": "Error",
  "errorMessage": "getaddrinfo EMFILE dynamodb.us-east-1.amazonaws.com",
  "code": "EMFILE",
  "errno": -24,
  "syscall": "getaddrinfo",
  "hostname": "dynamodb.us-east-1.amazonaws.com",
  "$metadata": {
    "attempts": 1,
    "totalRetryDelay": 0
  },
  "stack": [
    "Error: getaddrinfo EMFILE dynamodb.us-east-1.amazonaws.com",
    "    at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:108:26)"
  ]
}

Is there a known solution for this? Some forums mention it might be a problem with how connections are handled in Node. We just upgraded to v3, and per the docs, AWS_NODEJS_CONNECTION_REUSE_ENABLED is set to true by default.
https://docs.aws.amazon.com/sdk-for-javascript/v3/developer-guide/node-reusing-connections.html

@Altane-be

I've also been having this issue since moving to v3, for a number of Lambdas, when initializing a new instance of the DynamoDB class. Could we have an update on this, please?

@mahdiridho

I get a lot of EMFILE messages when trying to call any API Gateway endpoint from Lambda. I'm still stuck; I've tried most of the articles out there, but no luck.

@randomhash

Describe the bug

Error: getaddrinfo EMFILE s3.eu-central-1.amazonaws.com
    at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:107:26)
    at GetAddrInfoReqWrap.callbackTrampoline (node:internal/async_hooks:128:17)
Processing large amounts of data leads to such errors with the S3/DynamoDB/possibly other clients.

SDK version number
3.362.0

Which JavaScript Runtime is this issue in?
Node.js

Details of the browser/Node.js/ReactNative version
lambda node 18

Reproduction Steps
That is a tough one: large amounts of data processed within Step Functions.

DynamoDB client 1:

import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocument } from '@aws-sdk/lib-dynamodb';
import { NodeHttpHandler } from '@smithy/node-http-handler';
import { Agent } from 'https';

let documentClient: DynamoDBDocument | undefined;
let cachedCredentials: Creds | undefined;

export function getClient(credentials: Creds, forceRecreation: boolean = false): DynamoDBDocument {
  if (forceRecreation || !documentClient || areCredentialsExpired()) {
    documentClient = DynamoDBDocument.from(
      new DynamoDBClient({
        requestHandler: new NodeHttpHandler({
          httpsAgent: new Agent({
            keepAlive: true,
            maxSockets: Infinity,
          }),
        }),
        credentials: {
          accessKeyId: credentials.accessKeyId,
          secretAccessKey: credentials.secretAccessKey,
          sessionToken: credentials.sessionToken,
          expiration: credentials.expiration as Date,
        },
      })
    );

    cachedCredentials = credentials;
  }

  return documentClient;
}

DynamoDB client 2:

export const documentClient = DynamoDBDocument.from(
  new DynamoDBClient({
    logger: console,
    retryStrategy,
  })
);

S3 clients:

import { S3, S3ClientConfig } from '@aws-sdk/client-s3';
import { NodeHttpHandler } from '@smithy/node-http-handler';
import { Agent } from 'https';
import { retryStrategy } from '../retry/index.js';

const s3download = new S3({
  region: process.env.AWS_REGION,
  logger: console,
  forcePathStyle: true,
  requestHandler: new NodeHttpHandler({
    httpsAgent: new Agent({
      timeout: 5000,
      maxSockets: 1000, // default 50
      // keepAlive is a default from AWS SDK. We want to preserve this for
      // performance reasons.
      keepAlive: true,
      keepAliveMsecs: 1000, // default unset,
    }),
    connectionTimeout: 1000,
    requestTimeout: 5000, // default 0
  }),
  retryStrategy,
});

const s3upload = new S3({
  region: process.env.AWS_REGION,
  logger: console,
  forcePathStyle: true,
  requestHandler: new NodeHttpHandler({
    httpsAgent: new Agent({
      maxSockets: 1000, // default 50
      keepAlive: false,
      keepAliveMsecs: 1000, // default unset,
    }),
    connectionTimeout: 1000,
    requestTimeout: 5000, // default 0
  }),
  retryStrategy,
});

export function getClient(params?: S3ClientConfig) {
  if (params) {
    return new S3(params);
  }

  return s3download;
}

Processing large amounts of data in this way leads to repeated EMFILE errors. The S3 clients are instantiated once; there is only one instance of DynamoDB client 2, and client 1 is reset when its credentials expire.

@kuhe kuhe added needs-triage This issue or PR still needs to be triaged. and removed p2 This is a standard priority issue labels Feb 19, 2024
@RanVaknin RanVaknin self-assigned this Feb 21, 2024
@RanVaknin
Contributor

Hi everyone on the thread,

Sorry for the late response.

A lot of folks have commented here and shared info, but the part I'm suspicious about is the actual Lambda handler function itself.

To limit the number of open file descriptors, you can take advantage of Lambda's reuse of the execution environment.

By instantiating the SDK client outside of the handler's code path, you ensure you have one SDK client and that connections will be managed more efficiently.

I added a code example to our docs https://github.com/aws/aws-sdk-js-v3?tab=readme-ov-file#best-practices

Let me know if this helps.
Thanks,
Ran~

Related:
#4345
#5010
#4964
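A minimal sketch of the pattern described above: anything at module scope runs once per execution environment, so the client (and its sockets) is reused across warm invocations. The stub `makeClient` below stands in for a real client constructor such as `new S3Client({})`; the counter only exists to make the reuse observable.

```javascript
let clientsCreated = 0;

// Stub standing in for e.g. `new S3Client({})` from @aws-sdk/client-s3.
function makeClient() {
  clientsCreated += 1;
  return { send: async () => ({ ok: true }) }; // stub for client.send(command)
}

// Module scope: created once per execution environment, NOT per invocation.
const client = makeClient();

async function handler(event) {
  // Reuse the shared client; never call makeClient() inside the handler.
  await client.send(/* new PutObjectCommand({ ... }) */);
  return clientsCreated; // stays 1 however many times the handler runs
}

module.exports = { handler };
```

Creating the client inside `handler` instead would open a fresh connection pool on every invocation, which is one way file descriptors pile up until EMFILE.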

@RanVaknin RanVaknin added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. p2 This is a standard priority issue and removed needs-triage This issue or PR still needs to be triaged. labels Feb 21, 2024

randomhash commented Feb 26, 2024

@RanVaknin hey, just a quick note: any idea how to organize cross-account communication? Or is there a suitable sample for when at least two AWS accounts share services?
Since STS credentials expire, one client instance per Lambda may be a bit problematic, especially when we are running long tasks with complex data transfers.
Calling destroy on the client is an option, but it still doesn't seem to solve the EMFILE error while there is a lot of activity happening within the Lambda itself.

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Feb 27, 2024
@RanVaknin
Contributor

Hi @randomhash ,

As I mentioned in my comment above, it's hard to advise on a specific scenario without seeing your actual handler code.
Calling .destroy() is indeed a workaround, but as you mentioned, it might not cover all cases. The correct approach is to instantiate the SDK client outside of the handler's code path, at module scope, so that when the container is reused, another SDK client won't be created in memory.

If you can share a code example of what your application does we might be able to help you re-arrange your code in a way that avoids recreation of the SDK clients.

Thanks,
Ran~
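For the cross-account question above, one possible arrangement is to cache a single client per role ARN and rebuild it only when its STS credentials are near expiry, destroying the stale client to release its sockets. This is a hypothetical sketch, not SDK API: `getClientFor`, `EXPIRY_MARGIN_MS`, and the injected `fetchCredentials`/`buildClient` factories (which would wrap STS AssumeRole and e.g. `new DynamoDBClient({ credentials })`) are all illustrative names.

```javascript
const EXPIRY_MARGIN_MS = 60 * 1000; // rebuild a minute before expiry

const cache = new Map(); // roleArn -> { client, credentials }

function isExpiring(credentials, now) {
  return credentials.expiration.getTime() - now < EXPIRY_MARGIN_MS;
}

function getClientFor(roleArn, fetchCredentials, buildClient, now = Date.now()) {
  const entry = cache.get(roleArn);
  if (entry && !isExpiring(entry.credentials, now)) {
    return entry.client; // warm path: reuse, no new file descriptors
  }
  if (entry && typeof entry.client.destroy === 'function') {
    entry.client.destroy(); // release sockets held by the stale client
  }
  const credentials = fetchCredentials(roleArn); // e.g. STS AssumeRole
  const client = buildClient(credentials);       // e.g. new DynamoDBClient({ credentials })
  cache.set(roleArn, { client, credentials });
  return client;
}

module.exports = { getClientFor };
```

This keeps the "one client per execution environment" property per role while still honoring credential expiry, which is the tension raised in the comment above.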

@RanVaknin RanVaknin added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Mar 11, 2024

This issue has not received a response in 1 week. If you still think there is a problem, please leave a comment to prevent the issue from automatically closing.

@github-actions github-actions bot added closing-soon This issue will automatically close in 4 days unless further comments are made. closed-for-staleness and removed closing-soon This issue will automatically close in 4 days unless further comments are made. labels Mar 22, 2024

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs and link to relevant comments in this thread.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 10, 2024