NodeJS client - Memory mode - S3 connection --> client does not automatically reconnect #5929
Comments
Can you please fill out the missing parts of the issue template? Which "latest as of 18-12-2022" version are you referring to: the latest master or the release build? You also said to check your profile for affiliation, but neither your name nor your company is listed there. With regard to your issue, have you enabled object caching?
@dberardo-com have you checked that your WebIdentity token wasn't rotated? If you're running on EKS with WebIdentity, it might happen that the tokens get invalidated, and then you won't have access to S3 because of invalid/missing credentials. See https://medium.com/airwalk/how-a-pod-assumes-an-aws-identity-284fc6fda873 You can also inspect the token with jwt.io and determine whether it's still valid.
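If pasting a token into a website is not an option, the expiry can also be read locally. This is only a minimal sketch, assuming Node.js and a token in the compact `header.payload.signature` form; `jwtExpiry` is a hypothetical helper name, and it reads the `exp` claim without verifying the signature.

```typescript
// Sketch: read the `exp` claim of a JWT. The signature is NOT verified,
// so this only answers "has this token expired?", not "is it authentic?".
interface JwtExpiry {
  expiresAt: Date | null; // null when the token carries no numeric `exp` claim
  expired: boolean;
}

function jwtExpiry(token: string, now: Date = new Date()): JwtExpiry {
  const parts = token.split(".");
  if (parts.length !== 3) {
    throw new Error("not a compact JWT (expected header.payload.signature)");
  }
  // JWT segments are base64url-encoded; map them to the standard alphabet.
  const b64 = parts[1].replace(/-/g, "+").replace(/_/g, "/");
  const payload = JSON.parse(Buffer.from(b64, "base64").toString("utf8"));
  if (typeof payload.exp !== "number") {
    return { expiresAt: null, expired: false };
  }
  // `exp` is expressed in seconds since the Unix epoch.
  const expiresAt = new Date(payload.exp * 1000);
  return { expiresAt, expired: expiresAt.getTime() <= now.getTime() };
}
```

On EKS the projected token is usually a file whose path is given by the `AWS_WEB_IDENTITY_TOKEN_FILE` environment variable, so its contents could be passed to this helper. Whether that applies depends on the setup, which the issue does not describe.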
Hi @tobilg, I see your point. I am using a self-hosted MinIO installation, but I guess the auth logic is the same. Where can I find the JWT stored in DuckDB? And how can I prevent this in a long-running application? I would expect the DuckDB library to handle re-authentication by itself...?
Well, I don't know how this works with MinIO... With S3, you'd have to specify the s3_access_key_id/s3_secret_access_key/s3_session_token (see https://duckdb.org/docs/extensions/httpfs). In EKS, the IAM permissions are usually provided via a service account role, and via the metadata service or webhook endpoint made available to the container you're running. I don't know about your setup, because you didn't explain that in your issue. Re-authentication with potentially new credentials isn't done by DuckDB automatically afaik (and this would hardly be possible, as it relies on the credentials that are given to the container...)
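As an illustration of the settings mentioned above, here is a sketch that builds the corresponding `SET` statements. The setting names (`s3_access_key_id`, `s3_secret_access_key`, `s3_session_token`, `s3_region`, `s3_endpoint`, `s3_url_style`, `s3_use_ssl`) come from the httpfs documentation; the `S3Settings` type, the `s3SetStatements` helper, and the example values are hypothetical.

```typescript
// Sketch: turn a credentials object into httpfs SET statements.
// The interface and function names are illustrative, not part of DuckDB.
interface S3Settings {
  accessKeyId: string;
  secretAccessKey: string;
  sessionToken?: string; // only needed for temporary credentials
  region?: string;
  endpoint?: string;     // host[:port] without scheme, e.g. for a MinIO server
  pathStyle?: boolean;   // path-style URLs instead of virtual-host style
  useSsl?: boolean;
}

// Quote a value as a SQL string literal, doubling embedded single quotes.
const quote = (v: string): string => `'${v.replace(/'/g, "''")}'`;

function s3SetStatements(s: S3Settings): string[] {
  const out: string[] = [
    `SET s3_access_key_id=${quote(s.accessKeyId)};`,
    `SET s3_secret_access_key=${quote(s.secretAccessKey)};`,
  ];
  if (s.sessionToken !== undefined) out.push(`SET s3_session_token=${quote(s.sessionToken)};`);
  if (s.region !== undefined) out.push(`SET s3_region=${quote(s.region)};`);
  if (s.endpoint !== undefined) out.push(`SET s3_endpoint=${quote(s.endpoint)};`);
  if (s.pathStyle) out.push(`SET s3_url_style='path';`);
  if (s.useSsl !== undefined) out.push(`SET s3_use_ssl=${s.useSsl ? "true" : "false"};`);
  return out;
}
```

If the credentials given to the container change, the application would have to run these statements again with the new values. Whether updated settings take effect on an already-open connection in the 0.6 Node.js client is not confirmed in this thread.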
Currently I use the access key/secret key auth strategy, so no session token is involved (at least not as a configuration). I can't say whether auth is the problem; I have more of a feeling that this could be a timeout with a missing reconnection, and in this sense I feel that the comment from @Mause goes in the right direction. If there is any other place where I could look for more verbose logs, I would give it a try.
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days.
This issue was closed because it has been stale for 30 days with no activity.
What happens?
Long running in-memory nodejs clients of duckdb that are fetching from S3 might experience network problems.
I have noticed that my remote duckdb client was not able to "find" any parquet data from S3, although a local development instance was.
I thought the reason could be a network problem that caused a disconnection, so I just restarted the application. After the restart, everything works fine, and the query that was "not finding" files before can now find them.
Is it possible that the nodejs client does not automatically reconnect to S3 after network failures? If so, how can this behavior be added?
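Since a restart clears the problem, one application-level workaround would be to discard and reopen the database handle when a query fails. The following is only a sketch under that assumption, not a confirmed fix or a built-in DuckDB feature: `makeReopeningRunner`, `open`, and `query` are hypothetical names, and the caller supplies the code that actually opens the database and runs the query.

```typescript
// Sketch: run queries against a lazily opened handle and reopen it on failure.
// `open` creates a fresh handle; `query` uses it. Both are caller-supplied.
function makeReopeningRunner<C>(
  open: () => Promise<C>,
  maxAttempts: number = 3,
  delayMs: number = 1000,
) {
  let conn: C | null = null;

  return async function run<T>(query: (conn: C) => Promise<T>): Promise<T> {
    let lastError: unknown;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        let c = conn;
        if (c === null) {
          c = await open();
          conn = c;
        }
        return await query(c);
      } catch (err) {
        lastError = err;
        conn = null; // drop the handle so the next attempt opens a new one
        if (attempt < maxAttempts) {
          await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
        }
      }
    }
    throw lastError;
  };
}
```

With the `duckdb` package, `open` would presumably wrap `new duckdb.Database(':memory:')` plus the S3 configuration, and `query` would wrap a call such as `db.all(...)` in a promise. Note that this sketch retries on every error, including plain SQL errors, so a real application would likely want to inspect the error before retrying.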
To Reproduce
TBH, very hard to reproduce. Is
OS:
k8s
DuckDB Version:
0.6
DuckDB Client:
nodejs
Full Name:
check github profile
Affiliation:
check github profile
Have you tried this on the latest master branch?
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?