reuse connections on table.insert #41
Comments
I believe what we want is a
Any resolution to this? @alexander-fenster were you able to resolve the issue by setting the forever agent to false on a fork of google-cloud/storage? I have run into the same issue using BigQuery. Is there some part of the google-cloud/bigquery API that also uses google-cloud/storage? If not, @stephenplusplus could you provide a reference for where in the bigquery API repository this update needs to occur?
If we confirm that this PR was a mistake, we would disable it in the same place, and all APIs would be immediately corrected. For any users that want to undo the change from that PR, you can use request overrides to turn Keep-Alive back on:

```js
const BigQuery = require('@google-cloud/bigquery');
const bigquery = new BigQuery({...});

bigquery.interceptors.push({
  request: reqOpts => {
    reqOpts.forever = true;
    return reqOpts;
  },
});
```
Ping. Next steps?
Any updates? @stephenplusplus
Sorry, I was hoping @alexander-fenster would advise. |
Any news on this issue?
@stephenplusplus @alexander-fenster is this still an issue? Haven't seen any chatter about it in a long while |
Just to be safe, it would be great to run a quickie system test that does a bunch of inserts back-to-back, and monitors the number of open network connections. I have no idea how the move to teeny-request could have affected things here tbh
We have resolved the issues we had with BigQuery inserts. We haven't had any since then.
@callmehiphop, can you please add a system test?
This is resolved. No need to follow up any further.
Bumping, as connections do not seem to be reused anymore (anywhere, not only in Cloud Functions). I'm wondering if the switch from `request` to `teeny-request` is the cause.
@JustinBeckwith Some digging:

```ts
const requestDefaults = {
  timeout: 60000,
  gzip: true,
  forever: true,
  pool: {
    maxSockets: Infinity,
  },
};
```

https://github.com/googleapis/nodejs-common/blob/master/src/util.ts#L35-L42
Both the `forever` and the `pool` options get dropped by `requestToFetchOptions`:

```ts
function requestToFetchOptions(reqOpts: r.Options) {
  const options: f.RequestInit = {
    method: reqOpts.method || 'GET',
    ...(reqOpts.timeout && {timeout: reqOpts.timeout}),
    ...(reqOpts.gzip && {compress: reqOpts.gzip}),
  };
  if (typeof reqOpts.json === 'object') {
    // Add Content-type: application/json header
    reqOpts.headers = reqOpts.headers || {};
    reqOpts.headers['Content-Type'] = 'application/json';
    // Set body to JSON representation of value
    options.body = JSON.stringify(reqOpts.json);
  } else {
    if (typeof reqOpts.body !== 'string') {
      options.body = JSON.stringify(reqOpts.body);
    } else {
      options.body = reqOpts.body;
    }
  }
  options.headers = reqOpts.headers as Headers;
  let uri = ((reqOpts as r.OptionsWithUri).uri ||
    (reqOpts as r.OptionsWithUrl).url) as string;
  if (reqOpts.useQuerystring === true || typeof reqOpts.qs === 'object') {
    const qs = require('querystring');
    const params = qs.stringify(reqOpts.qs);
    uri = uri + '?' + params;
  }
  if (reqOpts.proxy || process.env.HTTP_PROXY || process.env.HTTPS_PROXY) {
    const proxy = (process.env.HTTP_PROXY || process.env.HTTPS_PROXY)!;
    options.agent = new HttpsProxyAgent(proxy);
  }
  return {uri, options};
}
```

https://github.com/googleapis/teeny-request/blob/master/src/index.ts#L22-L60

Node HTTP(S) agents default to not reusing sockets.
TL;DR: the current implementation of BigQuery establishes a new connection on each table.insert() call. This causes problems for Cloud Functions because of the connection quota that our users hit. Please update the client libraries to reuse connections.
Why new connections cause problems:
Cloud Functions is a serverless platform for executing snippets of code. Many of our customers use functions to store events in BigQuery. GCF has a quota on connections, and this quota is relatively low because each connection requires a NAT port to be allocated and maintained until TCP packet timeout; and there are only 32k ports per server.
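Back of the envelope (the 32k figure is from above; the 60-second hold time is an assumed illustrative value, not a measured one):

```javascript
// Rough ceiling on sustained insert rate when every table.insert() opens a
// new connection: each NAT port stays allocated until TCP timeout.
const natPorts = 32 * 1024;   // ~32k NAT ports per server (from above)
const portHoldSeconds = 60;   // assumed packet-timeout window (illustrative)
const maxInsertsPerSecond = Math.floor(natPorts / portHoldSeconds);
console.log(maxInsertsPerSecond); // 546
```

A few hundred inserts per second per server is easy to exceed in an event-driven workload, which is why connection reuse matters here.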
Why BigQuery is worse than other client libraries:
I found three types of clients:
It would be nice if all libraries worked as 1.
How to reproduce the problem:
Every call to the function exported by the code below creates a new connection.
******************* function.js ****************
************* package.json ****************
We also found that BigQuery uses IPv4, while other libraries prefer IPv6. This should also be fixed.
[Googlers: internal tracking id b/68240537]