trying to push remove or push prune stack traces with no error message #639
Comments
@tommyjcarpenter so it looks like the operator doesn't have a resolver set - can you |
@aricart what are you looking for in the here are the fields: |
here is the redacted server config:
|
I wanted your operator to:
Like this:
|
Your server config looks OK |
Just for a bit of sanity:
# Add the operator
/t/s [1]$ nsc add operator O
[ OK ] generated and stored operator key "OBPLSTGWUQH7F54UPQU7QE53F3YDWGY6OVQJ3IDBQTO6D2USHVBP6TE5"
[ OK ] added operator "O"
[ OK ] When running your own nats-server, make sure they run at least version 2.2.0
# create a sys account
/t/s $ nsc add account SYS
[ OK ] generated and stored account key "AAPMM4G4CYMICDMR5KV2R35KZUPFQ5U6Q2HRLK6J5OP5KBGMYSEWXIIO"
[ OK ] added account "SYS"
# set the sys account on the operator
/t/s $ nsc edit operator --service-url "nats://localhost:4222" --account-jwt-server-url "nats://localhost:4222" --system-account SYS
[ OK ] set account jwt server url to "nats://localhost:4222"
[ OK ] set system account "AAPMM4G4CYMICDMR5KV2R35KZUPFQ5U6Q2HRLK6J5OP5KBGMYSEWXIIO"
[ OK ] added service url "nats://localhost:4222"
[ OK ] edited operator "O"
# generate the resolver config
/t/s $ nsc generate config --nats-resolver --config-file server.conf
[ OK ] wrote server configuration to `/tmp/s/server.conf`
Success!! - generated `/tmp/s/server.conf`
# start the NATS server
/t/s $ nats-server -c server.conf &
[26391] 2024/03/05 15:17:40.669196 [INF] Starting nats-server
[26391] 2024/03/05 15:17:40.669284 [INF] Version: 2.11.0-dev
[26391] 2024/03/05 15:17:40.669286 [INF] Git: [not set]
[26391] 2024/03/05 15:17:40.669288 [INF] Name: NAIPUDMVSKC7PTW45E4WT74CC6XAOAWAVVUZKXTCXDWEX6BLP4EVHMDH
[26391] 2024/03/05 15:17:40.669291 [INF] ID: NAIPUDMVSKC7PTW45E4WT74CC6XAOAWAVVUZKXTCXDWEX6BLP4EVHMDH
[26391] 2024/03/05 15:17:40.669298 [INF] Using configuration file: server.conf
[26391] 2024/03/05 15:17:40.669300 [INF] Trusted Operators
[26391] 2024/03/05 15:17:40.669302 [INF] System : ""
[26391] 2024/03/05 15:17:40.669304 [INF] Operator: "O"
[26391] 2024/03/05 15:17:40.669306 [INF] Issued : 2024-03-05 15:17:16 -0400 AST
[26391] 2024/03/05 15:17:40.669322 [INF] Expires : Never
[26391] 2024/03/05 15:17:40.669698 [INF] Managing all jwt in exclusive directory /tmp/s/jwt
[26391] 2024/03/05 15:17:40.669909 [INF] Listening for client connections on 0.0.0.0:4222
[26391] 2024/03/05 15:17:40.670176 [INF] Server is ready
# Add an account
/t/s $ nsc add account A
[ OK ] generated and stored account key "ABGDFGOB2GCGCM6BJVEOAIVMLX4YNHHG2WMPUAV4KTOEQKTBPBRVP36I"
[ OK ] added account "A"
# Push the account
/t/s $ nsc push
[ OK ] push to nats-server "nats://localhost:4222" using system account "SYS":
[ OK ] push A to nats-server with nats account resolver:
[ OK ] pushed "A" to nats-server NAIPUDMVSKC7PTW45E4WT74CC6XAOAWAVVUZKXTCXDWEX6BLP4EVHMDH: jwt updated
[ OK ] pushed to a total of 1 nats-server
# Delete the account
/t/s $ nsc delete account A
[ OK ] expired account "A"
[ OK ] deleted account
[ OK ] deleted account directory
# Push with --prune
/t/s $ nsc push --prune
[26391] 2024/03/05 15:17:59.618908 [ERR] delete accounts request by OBPLSTGWUQH7F54UPQU7QE53F3YDWGY6OVQJ3IDBQTO6D2USHVBP6TE5 failed - delete must be enabled in server config
[ERR ] prune nats-server with nats account resolver:
[ OK ] list 2 accounts from nats-server NAIPUDMVSKC7PTW45E4WT74CC6XAOAWAVVUZKXTCXDWEX6BLP4EVHMDH:
[ OK ] account AAPMM4G4CYMICDMR5KV2R35KZUPFQ5U6Q2HRLK6J5OP5KBGMYSEWXIIO named SYS exists
[ OK ] account ABGDFGOB2GCGCM6BJVEOAIVMLX4YNHHG2WMPUAV4KTOEQKTBPBRVP36I only exists in server
[ OK ] listed accounts from a total of 1 nats-server
[ERR ] server NAIPUDMVSKC7PTW45E4WT74CC6XAOAWAVVUZKXTCXDWEX6BLP4EVHMDH responded with error: delete accounts request by OBPLSTGWUQH7F54UPQU7QE53F3YDWGY6OVQJ3IDBQTO6D2USHVBP6TE5 failed - delete must be enabled in server config
[ERR ] Fewer server responded to 'prune' (0) than to 'list' (1). Accounts may not be completely pruned.
Error: all jobs failed
# Fix the server config to have `allow_delete: true` and restart the server
/t/s [1]$ vim server.conf
/t/s $ killall nats-server
[26391] 2024/03/05 15:18:30.336226 [INF] Initiating Shutdown...
[26391] 2024/03/05 15:18:30.336315 [INF] Server Exiting..
fish: Job 1, 'nats-server -c server.conf &' has ended
/t/s $ nats-server -c server.conf &
[26422] 2024/03/05 15:18:36.367349 [INF] Starting nats-server
[26422] 2024/03/05 15:18:36.367454 [INF] Version: 2.11.0-dev
[26422] 2024/03/05 15:18:36.367457 [INF] Git: [not set]
[26422] 2024/03/05 15:18:36.367458 [INF] Name: NDNVR2PCT7RXAWMIQIYCHBVZFQARLPIDTA2I2P2MLTAUVEBU3OXK2YYH
[26422] 2024/03/05 15:18:36.367462 [INF] ID: NDNVR2PCT7RXAWMIQIYCHBVZFQARLPIDTA2I2P2MLTAUVEBU3OXK2YYH
[26422] 2024/03/05 15:18:36.367470 [INF] Using configuration file: server.conf
[26422] 2024/03/05 15:18:36.367472 [INF] Trusted Operators
[26422] 2024/03/05 15:18:36.367474 [INF] System : ""
[26422] 2024/03/05 15:18:36.367477 [INF] Operator: "O"
[26422] 2024/03/05 15:18:36.367479 [INF] Issued : 2024-03-05 15:17:16 -0400 AST
[26422] 2024/03/05 15:18:36.367496 [INF] Expires : Never
[26422] 2024/03/05 15:18:36.367885 [INF] Managing all jwt in exclusive directory /tmp/s/jwt
[26422] 2024/03/05 15:18:36.368091 [INF] Listening for client connections on 0.0.0.0:4222
[26422] 2024/03/05 15:18:36.368279 [INF] Server is ready
# Try the prune again
/t/s $ nsc push --prune
[ OK ] prune nats-server with nats account resolver:
[ OK ] list 2 accounts from nats-server NDNVR2PCT7RXAWMIQIYCHBVZFQARLPIDTA2I2P2MLTAUVEBU3OXK2YYH:
[ OK ] account AAPMM4G4CYMICDMR5KV2R35KZUPFQ5U6Q2HRLK6J5OP5KBGMYSEWXIIO named SYS exists
[ OK ] account ABGDFGOB2GCGCM6BJVEOAIVMLX4YNHHG2WMPUAV4KTOEQKTBPBRVP36I only exists in server
[ OK ] listed accounts from a total of 1 nats-server
[ OK ] pruned nats-server NDNVR2PCT7RXAWMIQIYCHBVZFQARLPIDTA2I2P2MLTAUVEBU3OXK2YYH: deleted 1 accounts
/t/s $
|
Not sure what your commands were doing, (the |
So the stack traces you list refers to:
The operator key was found or you would have gotten an error. So not quite sure what that is about. |
@aricart i showed the two commands that generated stacktraces I tried (independently): all i want to do is delete an account from the nats system - but I think |
The
|
As you see in my script it does do the delete.... Can you try what I did locally. |
We do have the System account (as I showed) but seem to be missing the other two fields:
How would I know what to set these to? |
@aricart the plot thickens so my boss, using the exact same operator, and without the
We are on the same binary version 2.8.5. However, he is on an Apple M1 and I am on an Apple M3... I hope it's not that, but I have had issues with architecture versions like this before, so just wanted to put that out there. A stacktrace on a different architecture vs another user in our same company working.. |
Perhaps you tried this already - but I want to make sure we try a new environment. Next repeat my script above, but this time, I would like you to prefix all nsc invocations as:
This will create the entire tree for keys and jwts in the specified directory (outside of where you keep your actual configs and keys) ie
After you verify the repeatability, you can set your old operator to be current by doing: |
@aricart can you please paste the server.conf? I'm following this in-depth guide, which talks about copying a server.conf, but I don't see the file in the docs? https://docs.nats.io/running-a-nats-service/nats_admin/security/jwt#concepts (I'm trying to run stuff locally now; the above thread was from our prod kubernetes deployment) |
Ah - for your local test all you need to do after you create your operator/accounts etc - is |
you can then start the server with |
@aricart I was able to run all of your commands using the local |
@aricart alright - two more pieces of info! I was able to do all of these steps in
However, I can always stacktrace on PROD, here is a fresh test:
looks like it's dying here: Line 449 in f85538f
that's a pretty dense function with pointers and such, hard to tell exactly what's happening, I'm new to this codebase |
OK I see some dodgy code there - there's an error that is added to the report, but the process doesn't stop. |
Two more questions:
|
Also - were you able to run two local tests with the different M3 binary that was failing? I am assuming that you did it on the stage and that worked, but on prod it failed. Also make sure that the operator keys are available in the production environment - the request to prune requires them. You can verify that the keys are available by running
The stored column should be checked. |
@aricart yes if you provide a branch I can build and do like a I was able to use the exact same M3 binary for your local tests, and against our |
I am betting one dollar that the operator keys are not all there, and it is failing to find one. |
I found the issue - let me do the fix |
@tommyjcarpenter If you want to try the branch for the fix, that would be awesome. - Note that you'll have to specify the |
I released a new nsc, you should be able to pull it with |
@tommyjcarpenter thank you for reporting this - if the release doesn't fix it for you, please reopen this issue |
Thanks I’ll try it out! Is it available through homebrew? That’s how I install/update nsc |
yeap - the update command will also do the right thing even if you installed with homebrew... |
@aricart it actually works with the latest... I didn't get an error message about missing an operator key..! It worked. What was the change (and should it have worked?). Thanks much for your attention |
The code had a couple of edge cases. But the one that affected you was that your operator had more than one signing key, but the first ones listed were not there. Go has a quirk where a function that returns an interface can hand back a nil value wrapped in an interface that is itself not nil. So checks for nil fail, since the interface is not nil, and then using it panics because the underlying value is. |
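For anyone who hits this thread later, here is a minimal sketch of the Go behavior described above. It is not the nsc code: `Signer`, `keyPair`, `lookup` and `lookupSafe` are hypothetical names made up for illustration, and the map stands in for whatever key store is actually used.

```go
package main

import "fmt"

// Signer is a hypothetical interface, standing in for a key type
// that a lookup function might return.
type Signer interface {
	Sign(msg string) string
}

// keyPair is a hypothetical concrete type that implements Signer.
type keyPair struct {
	seed string
}

// Sign dereferences k, so calling it on a nil *keyPair panics.
func (k *keyPair) Sign(msg string) string {
	return k.seed + ":" + msg
}

// lookup shows the pitfall: for a missing id the map yields a nil
// *keyPair, and returning it as a Signer wraps that nil pointer in a
// non-nil interface value.
func lookup(store map[string]*keyPair, id string) Signer {
	return store[id]
}

// lookupSafe returns a true nil interface when the key is missing,
// so a plain `== nil` check by the caller works as expected.
func lookupSafe(store map[string]*keyPair, id string) Signer {
	if kp, ok := store[id]; ok && kp != nil {
		return kp
	}
	return nil
}

func main() {
	store := map[string]*keyPair{} // no keys stored

	s := lookup(store, "missing")
	fmt.Println(s == nil) // false: the interface holds a typed nil pointer

	fmt.Println(lookupSafe(store, "missing") == nil) // true

	// Calling s.Sign("x") here would panic with a nil pointer
	// dereference, even though the s == nil check above passed.
}
```

The usual way out is either to return an explicit `nil` interface as in `lookupSafe`, or to return an `error` alongside the value and have the caller stop on it.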
What version were you using?
nsc version 2.8.5
I don't have "nats-server" installed
What environment was the server running in?
Mac OSX 14.2.1
Is this defect reproducible?
I see an account after an nsc pull -A
it's there when I list them:
however, trying to do an nsc delete followed by a push -P, OR a push -R, both stacktrace:
Given the capability you are leveraging, describe your expectation?
I'm trying to delete a NATS account
Given the expectation, what is the defect you are observing?
stacktraces with no error messages