
feature request: limit size of data responses #220

Open
mx2323 opened this issue Oct 28, 2021 · 8 comments
Labels
enhancement The issue is a request for improvement or a new feature status-triage_done Initial triage done, will be further handled by the driver team

Comments


mx2323 commented Oct 28, 2021

currently with the node.js snowflake-sdk, if you do a SELECT * FROM LARGE_TABLE, the node.js snowflake client can crash the entire process if even one row in the result set is too large.

we would like to be able to set an option that limits the amount of data held in memory for each request; the sdk could either throw an exception or gracefully return some subset of the received data.

without this, we are UNABLE to prevent a user from overloading the calling node process and causing OOM errors.

note: this is still an issue even with streaming rows, because a single row may still be too large.
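As a sketch of what such a guard could look like today, a caller can combine the driver's existing `streamResult`/`streamRows()` streaming API with a hand-rolled per-row size check. The helper name and the 1 MB limit below are hypothetical; the built-in option this issue asks for does not exist in the driver:

```javascript
// Hypothetical per-row memory guard: measure each row by its JSON size
// and throw instead of letting an oversized row accumulate in memory.
function assertRowWithinLimit(row, maxBytes) {
  const size = Buffer.byteLength(JSON.stringify(row), 'utf8');
  if (size > maxBytes) {
    throw new Error(`Row of ${size} bytes exceeds limit of ${maxBytes} bytes`);
  }
  return size;
}

// With the driver's streaming API, the guard could be applied per row
// (assuming an already-connected `conn`):
//
// const stmt = conn.execute({ sqlText: 'SELECT * FROM LARGE_TABLE', streamResult: true });
// stmt.streamRows()
//   .on('data', (row) => assertRowWithinLimit(row, 1024 * 1024))
//   .on('error', (err) => console.error(err.message))
//   .on('end', () => console.log('done'));
```

This only bounds per-row size, not the total held per request, so it is a partial workaround rather than the requested feature.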

@ghenkhaus

+1

@sfc-gh-jfan sfc-gh-jfan reopened this Jul 6, 2022
@sfc-gh-dszmolka sfc-gh-dszmolka added the enhancement The issue is a request for improvement or a new feature label Jan 23, 2023
@sfc-gh-dszmolka
Collaborator

hi, thank you for submitting this issue. we'll take a look at how this could be handled.

@sfc-gh-dszmolka sfc-gh-dszmolka added the status-in_progress Issue is worked on by the driver team label Mar 30, 2023
@sfc-gh-dszmolka sfc-gh-dszmolka removed the status-in_progress Issue is worked on by the driver team label Dec 28, 2023
@sfc-gh-dszmolka
Collaborator

in the meantime, as a possible workaround, setting max_old_space_size, e.g.
NODE_OPTIONS=--max_old_space_size=4096 (https://stackoverflow.com/questions/38558989/node-js-heap-out-of-memory)

could help mitigate the issue. you are probably already aware of and using something similar as a workaround, but still leaving it here in case anyone new stumbles onto this issue

@bhaskarbanerjee

Seeking help from @mx2323 @ghenkhaus @sfc-gh-dszmolka @sfc-gh-jfan and others.
Tried out this sample code from https://docs.snowflake.com/en/developer-guide/sql-api/submitting-requests, but because my data set size is between 6-7 MB, it is failing with the message `Request to S3/Blob failed`.

```javascript
// Load the Snowflake Node.js driver.
const snowflake = require('snowflake-sdk');

// Create a Connection object that we can use later to connect.
const connection = snowflake.createConnection({
    account: "MY_SF_ACCOUNT",
    database: "MY_DB",
    schema: "MY_SCHEMA",
    warehouse: "MY_WH",
    username: "MY_USER",
    password: "MY_PWD"
});

// Try to connect to Snowflake, and check whether the connection was successful.
// The query is issued from the callback so it only runs once the connection is up.
connection.connect(function (err, conn) {
    if (err) {
        console.error('Unable to connect: ' + err.message);
        return;
    }
    console.log('Successfully connected to Snowflake.');

    conn.execute({
        sqlText: "SELECT * FROM LargeDataSet LIMIT 100",
        // sqlText: "SELECT * FROM LargeDataSet", // fails with "Request to S3/Blob failed"
        complete: function (err, stmt, rows) {
            if (err) {
                console.error('Failed to execute statement due to the following error: ' + err.message);
            } else {
                console.log('Successfully executed statement: ' + stmt.getSqlText());
            }
        }
    });
});
```

We are observing this while upgrading the snowflake-sdk from 1.6.23 to ^1.9.0.
Things work fine with versions 1.6.*, 1.7.0 and 1.8.0.

Is there a resolution for fetching large data sets?

@sfc-gh-dszmolka
Collaborator

hi @bhaskarbanerjee, the issue you're seeing is not related to the original one, which is a feature/improvement request for something that doesn't exist yet. Let's keep this issue for what it was originally intended: tracking the original improvement request.

since small result sets work for you and only bigger ones have problems fetching, I would suspect that the host you're running snowflake-sdk on cannot reach the Snowflake internal stage (= S3 bucket) on which the query results are temporarily stored.

to fix this, I recommend running `select system$allowlist()` in your Snowflake account (perhaps in the GUI) and double-confirming that all of the endpoints listed as STAGE are in fact reachable from the host on which you have this problem. You can even use SnowCD to perform an automated test.
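For reference, `system$allowlist()` returns a JSON array of endpoint entries, and only the entries of type STAGE matter for result fetching. A small helper like the sketch below (the function name is hypothetical, and the `{type, host, port}` entry shape is assumed from the documented output) can pull out just the stage hosts to check:

```javascript
// Given the JSON string returned by SELECT SYSTEM$ALLOWLIST(),
// return the "host:port" pairs of type STAGE -- the endpoints that
// must be reachable for large result sets to download.
function stageHosts(allowlistJson) {
  return JSON.parse(allowlistJson)
    .filter((entry) => entry.type === 'STAGE')
    .map((entry) => `${entry.host}:${entry.port}`);
}
```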

If you have confirmed that nothing blocks connectivity to the stage and it still doesn't work, kindly open a new issue here or open a Snowflake Support case and we can help further.

@bhaskarbanerjee

Thanks @sfc-gh-dszmolka, let me try that, but if it is a server-side problem, then why do v1.6.*-1.8.0 work like a charm for a large data set of 6-7 MB?
The EXACT same query is not working with ^1.9.0 of the sdk.

@bhaskarbanerjee

Verified. We have 2 VPCs listed there and both are set to type= 'STAGE'. @sfc-gh-dszmolka

@bhaskarbanerjee

Ran the SnowCD tool and got msg="Check clear. No error presented." for both VPCs listed there.

@sfc-gh-dszmolka sfc-gh-dszmolka added the status-triage_done Initial triage done, will be further handled by the driver team label Feb 11, 2024