adjust filer startup #4102

Status: Open. zemul wants to merge 5 commits into master.

Conversation

zemul
Contributor

@zemul zemul commented Jan 3, 2023

What problem are we solving?

While a master election is in progress, the filer cannot open its port, so filer startup stalls until a leader is elected.

How are we solving the problem?

Remove the unnecessary blocking wait from filer startup.

How is the PR tested?

Checks

  • I have added unit tests if possible.
  • I will add related wiki document changes and link to this PR after merging.

@chrislusf
Collaborator

Why is this a problem?

@zemul
Contributor Author

zemul commented Jan 3, 2023

Why is this a problem?

In 3.34, my masters did not elect a leader for 4 minutes.
During that time, my filer could not finish starting.

@zemul
Contributor Author

zemul commented Jan 3, 2023

Filer log:

Log file created at: 2022/12/02 10:58:59
Running on machine: hn57
Binary: Built with gc go1.18.2 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss threadid file:line] msg
W1202 10:58:59.997221 filer_server.go:142 skipping default store dir in ./filerldb2
I1202 10:59:00.006877 filer.go:130 existing filer.store.id = 1955893051
I1202 10:59:00.006906 configuration.go:28 configured filer store to mysql
I1202 10:59:35.954871 masterclient.go:266 updateVidMap ignore short heartbeat: volume_location:{}
I1202 10:59:35.954973 masterclient.go:223 .filer masterClient failed to receive from 172.16.254.59:9333: EOF
I1202 11:00:11.958809 masterclient.go:266 updateVidMap ignore short heartbeat: volume_location:{}
I1202 11:00:11.958869 masterclient.go:223 .filer masterClient failed to receive from 172.16.254.57:9333: EOF
I1202 11:00:47.964309 masterclient.go:266 updateVidMap ignore short heartbeat: volume_location:{}
I1202 11:00:47.964370 masterclient.go:223 .filer masterClient failed to receive from 172.16.254.58:9333: EOF
I1202 11:01:24.969882 masterclient.go:266 updateVidMap ignore short heartbeat: volume_location:{}
I1202 11:01:24.969949 masterclient.go:223 .filer masterClient failed to receive from 172.16.254.58:9333: EOF
I1202 11:02:00.974008 masterclient.go:266 updateVidMap ignore short heartbeat: volume_location:{}
I1202 11:02:00.974065 masterclient.go:223 .filer masterClient failed to receive from 172.16.254.59:9333: EOF
I1202 11:02:36.979647 masterclient.go:266 updateVidMap ignore short heartbeat: volume_location:{}
I1202 11:02:36.979716 masterclient.go:223 .filer masterClient failed to receive from 172.16.254.57:9333: EOF
I1202 11:03:13.983438 masterclient.go:266 updateVidMap ignore short heartbeat: volume_location:{}
I1202 11:03:13.983509 masterclient.go:223 .filer masterClient failed to receive from 172.16.254.57:9333: EOF
I1202 11:03:19.413518 masterclient.go:200 .filer masterClient failed to receive from 172.16.254.58:9333: rpc error: code = Unavailable desc = error reading from server: EOF
I1202 11:03:55.416575 masterclient.go:266 updateVidMap ignore short heartbeat: volume_location:{}
I1202 11:03:55.416637 masterclient.go:223 .filer masterClient failed to receive from 172.16.254.59:9333: EOF
I1202 11:04:19.421419 masterclient.go:208 master 172.16.254.58:9333 redirected to leader 172.16.254.57:9333
I1202 11:04:19.439025 master_client.go:20 the cluster has 2 filer
I1202 11:04:19.441596 meta_aggregator.go:97 loopSubscribeToOneFiler read 172.16.254.59:8888 start from 2022-12-02 11:03:19.439038516 +0800 CST 1669950199439038516
I1202 11:04:19.441655 meta_aggregator.go:97 loopSubscribeToOneFiler read 172.16.254.57:8888 start from 2022-12-02 11:03:19.439038516 +0800 CST 1669950199439038516
I1202 11:04:19.442078 meta_aggregator.go:108 subscribing remote 172.16.254.57:8888 meta change: connecting to peer filer 172.16.254.57:8888: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 172.16.254.57:18888: connect: connection refused"
I1202 11:04:19.446425 meta_aggregator.go:194 subscribing remote 172.16.254.59:8888 meta change: 2022-12-02 11:03:19.439038516 +0800 CST, clientId:1923813785
I1202 11:04:19.449388 filer.go:272 Start Seaweed Filer 30GB 3.34 at 172.16.254.57:8888
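
To put a number on the delay, the following sketch computes the time between the first line of the log above and the `Start Seaweed Filer` line. `glogTime` and `startupDelay` are hypothetical helpers written for this illustration; they assume the `[IWEF]mmdd hh:mm:ss` header shown in the log and that both lines fall on the same day.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// glogTime extracts the timestamp from a line such as
// "I1202 11:04:19.449388 filer.go:272 ...". The header carries no year,
// so only month, day, and time of day are parsed.
func glogTime(line string) (time.Time, error) {
	fields := strings.Fields(line)
	if len(fields) < 2 || len(fields[0]) != 5 {
		return time.Time{}, fmt.Errorf("not a glog-style line: %q", line)
	}
	// fields[0] is the severity letter followed by mmdd; fields[1] is hh:mm:ss.micros.
	return time.Parse("0102 15:04:05.000000", fields[0][1:]+" "+fields[1])
}

// startupDelay returns the time elapsed between two log lines.
func startupDelay(first, last string) (time.Duration, error) {
	t0, err := glogTime(first)
	if err != nil {
		return 0, err
	}
	t1, err := glogTime(last)
	if err != nil {
		return 0, err
	}
	return t1.Sub(t0), nil
}

func main() {
	// Both lines are copied from the filer log in this comment.
	first := "W1202 10:58:59.997221 filer_server.go:142 skipping default store dir in ./filerldb2"
	last := "I1202 11:04:19.449388 filer.go:272 Start Seaweed Filer 30GB 3.34 at 172.16.254.57:8888"

	d, err := startupDelay(first, last)
	if err != nil {
		panic(err)
	}
	fmt.Println("time from first log line to filer start:", d)
}
```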

@zemul
Contributor Author

zemul commented Jan 3, 2023

I found that this pull request only solves part of the problem.

existingNodes := fs.filer.ListExistingPeerUpdates()

This call blocks as well.

2 participants