Custom gRPC ports not honored by master.peers and weed shell envvars #4607

Closed
enspritz opened this issue Jun 26, 2023 · 6 comments

@enspritz

Describe the bug
We have elected to use custom gRPC ports for each of the master, volume, and filer instances. So far, there are two use cases where weed is unable to interpret the port specifications outlined in the FAQ:

A) List of master peers

weed master -peers 10.0.0.1:9333.9334,10.0.0.2:9333.9334,10.0.0.3:9333.9334
In this case, the master will for example try to connect to http://10.0.0.1:9333.9334 but then fail.
Possibly, after splitting on commas, the code interprets each string directly as HOST:PORT, without checking whether it is HOST:PORT.gRPC_PORT.
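To make the expected behavior concrete, here is a minimal Go sketch of parsing a peer list where each entry may be HOST:PORT or HOST:PORT.gRPC_PORT. This is not the SeaweedFS implementation; the `peer` type and the `parsePeer` and `parsePeers` functions are hypothetical names made up for illustration, and IPv6 literals are not handled.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// peer is a hypothetical holder for one parsed -peers entry.
type peer struct {
	Host     string
	HTTPPort int
	GRPCPort int // 0 means no explicit gRPC port was given
}

// parsePeer parses "HOST:PORT" or "HOST:PORT.GRPC_PORT".
// IPv6 literals are not handled in this sketch.
func parsePeer(s string) (peer, error) {
	colon := strings.LastIndex(s, ":")
	if colon < 0 {
		return peer{}, fmt.Errorf("missing port in %q", s)
	}
	p := peer{Host: s[:colon]}
	// Everything after the last colon is "PORT" or "PORT.GRPC_PORT".
	httpPart, grpcPart, hasGRPC := strings.Cut(s[colon+1:], ".")
	var err error
	if p.HTTPPort, err = strconv.Atoi(httpPart); err != nil {
		return peer{}, fmt.Errorf("bad HTTP port in %q: %w", s, err)
	}
	if hasGRPC {
		if p.GRPCPort, err = strconv.Atoi(grpcPart); err != nil {
			return peer{}, fmt.Errorf("bad gRPC port in %q: %w", s, err)
		}
	}
	return p, nil
}

// parsePeers splits a comma-separated list and parses each entry.
func parsePeers(list string) ([]peer, error) {
	var peers []peer
	for _, entry := range strings.Split(list, ",") {
		p, err := parsePeer(strings.TrimSpace(entry))
		if err != nil {
			return nil, err
		}
		peers = append(peers, p)
	}
	return peers, nil
}

func main() {
	peers, err := parsePeers("10.0.0.1:9333.9334,10.0.0.2:9333.9334,10.0.0.3:9333.9334")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, p := range peers {
		fmt.Printf("%s http=%d grpc=%d\n", p.Host, p.HTTPPort, p.GRPCPort)
	}
}
```

The point of the sketch is that the comma split and the dot split have to happen as two separate steps, so the ".gRPC_PORT" suffix never reaches code that expects a plain HOST:PORT.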

B) weed shell

sudo docker run --rm -it -e SHELL_FILER=10.0.0.1:9330.9331 -e SHELL_MASTER=10.0.0.1:9333.9334 chrislusf/seaweedfs:3.52 shell fs.verify -v
In this case, the shell will report that it tried and failed to connect to http://10.0.0.1:9333.9334
** Apologies, I'd like to paste in the error output but right now when I try to reproduce, shell responds by printing an endless series of dots and I don't know what it's doing and why...

System Setup
version 30GB 3.52 fb4b61036 linux amd64

@chrislusf
Collaborator

The code looks correct. Need to see logs.

@enspritz
Author

enspritz commented Jun 26, 2023

B) weed shell

OK, interesting: shell fs.verify -v produces this output when the envvar SHELL_MASTER doesn't include the gRPC port and no logging verbosity is specified:

sudo docker run --rm -it -e SHELL_FILER=10.0.0.1:9330.9331 -e SHELL_MASTER=10.0.0.1:9333 chrislusf/seaweedfs:3.52 shell fs.verify -v
.........total 0 directories, 0 files

Is it working properly? Not sure, but there are indeed 0 dirs, 0 files in this fresh cluster.

Now the same command, but with the logging option -v 0, produces an endless stream of dots:

sudo docker run --rm -it -e SHELL_FILER=10.0.0.1:9330.9331 -e SHELL_MASTER=10.0.0.1:9333 chrislusf/seaweedfs:3.52 -v 0 shell fs.verify -v
.....................................................................................................................................................................................................................................................................................................................................................................................................................^C

and setting verbosity to 1 or higher produces:

sudo docker run --rm -it -e SHELL_FILER=10.0.0.1:9330.9331 -e SHELL_MASTER=10.0.0.1:9333 chrislusf/seaweedfs:3.52 -v 1 shell fs.verify -v
I0626 05:45:10.425807 config.go:46 Reading : Config File "security" Not Found in "[/data /root/.seaweedfs /usr/local/etc/seaweedfs /etc/seaweedfs]"
I0626 05:45:10.426492 config.go:46 Reading : Config File "shell" Not Found in "[/data /root/.seaweedfs /usr/local/etc/seaweedfs /etc/seaweedfs]"
I0626 05:45:10.426823 masterclient.go:127 .adminShell masterClient bootstraps with masters map[localhost:9333:localhost:9333]
I0626 05:45:10.426933 masterclient.go:172 .adminShell masterClient Connecting to master localhost:9333
I0626 05:45:10.430106 masterclient.go:180 .adminShell masterClient failed to keep connected to localhost:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:19333: connect: connection refused"
I0626 05:45:10.430170 masterclient.go:260 .adminShell masterClient failed to connect with master localhost:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:19333: connect: connection refused"
..........I0626 05:45:11.430612 masterclient.go:172 .adminShell masterClient Connecting to master localhost:9333
I0626 05:45:11.431317 masterclient.go:180 .adminShell masterClient failed to keep connected to localhost:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:19333: connect: connection refused"
I0626 05:45:11.431371 masterclient.go:260 .adminShell masterClient failed to connect with master localhost:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:19333: connect: connection refused"
...........I0626 05:45:12.431940 masterclient.go:172 .adminShell masterClient Connecting to master localhost:9333
I0626 05:45:12.432610 masterclient.go:180 .adminShell masterClient failed to keep connected to localhost:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:19333: connect: connection refused"
I0626 05:45:12.432649 masterclient.go:260 .adminShell masterClient failed to connect with master localhost:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:19333: connect: connection refused"
..........I0626 05:45:13.433681 masterclient.go:172 .adminShell masterClient Connecting to master localhost:9333
I0626 05:45:13.434384 masterclient.go:180 .adminShell masterClient failed to keep connected to localhost:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:19333: connect: connection refused"
I0626 05:45:13.434428 masterclient.go:260 .adminShell masterClient failed to connect with master localhost:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:19333: connect: connection refused"
........I0626 05:45:14.435477 masterclient.go:172 .adminShell masterClient Connecting to master localhost:9333
I0626 05:45:14.436282 masterclient.go:180 .adminShell masterClient failed to keep connected to localhost:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:19333: connect: connection refused"
I0626 05:45:14.436323 masterclient.go:260 .adminShell masterClient failed to connect with master localhost:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:19333: connect: connection refused"
.....^C

Let's do the same set of commands again, this time specifying the master gRPC port. The result doesn't change.

sudo docker run --rm -it -e SHELL_FILER=10.0.0.1:9330.9331 -e SHELL_MASTER=10.0.0.1:9333.9334 chrislusf/seaweedfs:3.52 shell fs.verify -v
....................................................................................................................................................................................................................................................................................................................................................^C

and again setting verbosity to 1 or higher (this time 4):

sudo docker run --rm -it -e SHELL_FILER=10.0.0.1:9330.9331 -e SHELL_MASTER=10.0.0.1:9333.9334 chrislusf/seaweedfs:3.52 -v 4 shell fs.verify -v
I0626 05:55:09.490348 config.go:46 Reading : Config File "security" Not Found in "[/data /root/.seaweedfs /usr/local/etc/seaweedfs /etc/seaweedfs]"
I0626 05:55:09.491262 config.go:46 Reading : Config File "shell" Not Found in "[/data /root/.seaweedfs /usr/local/etc/seaweedfs /etc/seaweedfs]"
I0626 05:55:09.491551 masterclient.go:127 .adminShell masterClient bootstraps with masters map[localhost:9333:localhost:9333]
I0626 05:55:09.491670 masterclient.go:172 .adminShell masterClient Connecting to master localhost:9333
I0626 05:55:09.493904 masterclient.go:180 .adminShell masterClient failed to keep connected to localhost:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:19333: connect: connection refused"
I0626 05:55:09.493942 masterclient.go:260 .adminShell masterClient failed to connect with master localhost:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:19333: connect: connection refused"
.........I0626 05:55:10.494917 masterclient.go:172 .adminShell masterClient Connecting to master localhost:9333
I0626 05:55:10.495651 masterclient.go:180 .adminShell masterClient failed to keep connected to localhost:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:19333: connect: connection refused"
I0626 05:55:10.495676 masterclient.go:260 .adminShell masterClient failed to connect with master localhost:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:19333: connect: connection refused"
..........I0626 05:55:11.495949 masterclient.go:172 .adminShell masterClient Connecting to master localhost:9333
I0626 05:55:11.496492 masterclient.go:180 .adminShell masterClient failed to keep connected to localhost:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:19333: connect: connection refused"
I0626 05:55:11.496519 masterclient.go:260 .adminShell masterClient failed to connect with master localhost:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:19333: connect: connection refused"
..........I0626 05:55:12.496919 masterclient.go:172 .adminShell masterClient Connecting to master localhost:9333
I0626 05:55:12.497630 masterclient.go:180 .adminShell masterClient failed to keep connected to localhost:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:19333: connect: connection refused"
I0626 05:55:12.497680 masterclient.go:260 .adminShell masterClient failed to connect with master localhost:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:19333: connect: connection refused"
........I0626 05:55:13.498713 masterclient.go:172 .adminShell masterClient Connecting to master localhost:9333
I0626 05:55:13.499484 masterclient.go:180 .adminShell masterClient failed to keep connected to localhost:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:19333: connect: connection refused"
I0626 05:55:13.499538 masterclient.go:260 .adminShell masterClient failed to connect with master localhost:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:19333: connect: connection refused"
.........I0626 05:55:14.499958 masterclient.go:172 .adminShell masterClient Connecting to master localhost:9333
I0626 05:55:14.500697 masterclient.go:180 .adminShell masterClient failed to keep connected to localhost:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:19333: connect: connection refused"
I0626 05:55:14.500735 masterclient.go:260 .adminShell masterClient failed to connect with master localhost:9333: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:19333: connect: connection refused"
..^C

The master gRPC port is specified as 9334, but the shell still attempts to connect to 19333 (the correct HTTP port 9333 plus the assumed +10000 offset).
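For reference, the port resolution I expected can be sketched in Go as follows. This is only an illustration under the assumptions visible in the logs above (default gRPC port = HTTP port + 10000); `grpcAddress` and `grpcPortOffset` are hypothetical names, not SeaweedFS identifiers.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// grpcPortOffset is the default offset observed in the logs (9333 -> 19333).
const grpcPortOffset = 10000

// grpcAddress is a hypothetical helper that returns the address to dial
// for gRPC, given "HOST:PORT" or "HOST:PORT.GRPC_PORT".
func grpcAddress(addr string) (string, error) {
	colon := strings.LastIndex(addr, ":")
	if colon < 0 {
		return "", fmt.Errorf("missing port in %q", addr)
	}
	host, ports := addr[:colon], addr[colon+1:]
	httpPart, grpcPart, explicit := strings.Cut(ports, ".")
	if explicit {
		// An explicit gRPC port wins over the default offset.
		if _, err := strconv.Atoi(grpcPart); err != nil {
			return "", fmt.Errorf("bad gRPC port in %q: %w", addr, err)
		}
		return host + ":" + grpcPart, nil
	}
	httpPort, err := strconv.Atoi(httpPart)
	if err != nil {
		return "", fmt.Errorf("bad HTTP port in %q: %w", addr, err)
	}
	return host + ":" + strconv.Itoa(httpPort+grpcPortOffset), nil
}

func main() {
	for _, addr := range []string{"localhost:9333", "10.0.0.1:9333.9334"} {
		target, err := grpcAddress(addr)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> dial %s\n", addr, target)
	}
}
```

With this logic, SHELL_MASTER=10.0.0.1:9333.9334 would dial 10.0.0.1:9334, while a plain localhost:9333 would fall back to localhost:19333, which is the address seen in the logs.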

@chrislusf
Collaborator

it's connecting to localhost:9333, the SHELL_MASTER value is not passed in successfully. Try to use the binary directly.

@enspritz
Author

it's connecting to localhost:9333, the SHELL_MASTER value is not passed in successfully. Try to use the binary directly.

Um .. Is this a defect that can be fixed in a release, or was I doing something incorrectly?

@enspritz
Author

enspritz commented Jun 27, 2023

A) List of master peers

Error message:

{"error":"Leader URL http://10.0.0.1:9333.9334 Parse Error: parse \"http://10.0.0.1:9333.9334\": invalid port \":9333.9334\" after host"}

@enspritz
Author

Don't know if it's useful info, but the masters are using -raftHashicorp.

kmlebedev pushed a commit to kmlebedev/seaweedfs that referenced this issue Dec 22, 2023