
Master timeouts during dirAssign volume growth #5213

Closed
bvanelst opened this issue Jan 17, 2024 · 3 comments

bvanelst commented Jan 17, 2024

Describe the bug
Timeouts when requesting /dir/assign on the master(s).

System Setup

  • masters: 3 of them, but I get the same issue when I start only one:
  • /usr/local/bin/weed -v=3 -logdir=/var/log/seaweedfs master -mdir=/etc/seaweedfs -ip=10.0.9.15 -port=9333 -metrics.address=10.0.9.17:9091 -defaultReplication=010 -volumePreallocate -garbageThreshold=0.3 -volumeSizeLimitMB=20000 -peers=10.0.9.17:9333,10.0.9.14:9333,10.0.9.15:9333
  • volume servers: 7 of them, using different IPs, racks, and volumes.
  • /usr/local/bin/weed -v=3 -logdir=/var/log/seaweedfs volume -index=leveldb -mserver=10.0.9.17:9333,10.0.9.14:9333,10.0.9.15:9333 -dir=/volumes/98fb3388c280,/volumes/LHHGS,/volumes/e000c055cbe4,/volumes/c5a9aff45527,/volumes/619c9a0827f4,/volumes/f8c44345756f,/volumes/eeedca023938,/volumes/cae089cd2dd9,/volumes/20F30GRVRD,/volumes/KWEGS,/volumes/3d5638f4fd34,/volumes/18f39b04390d,/volumes/6a9e8c97ba2a,/volumes/LDTGS,/volumes/a9a6e2d048de,/volumes/20F308T27D,/volumes/19641ea6d6c5,/volumes/20F30JB3JE,/volumes/6e19dd8da77b,/volumes/3d1614841bcc,/volumes/372cb7e5ac18,/volumes/152d865d39ce,/volumes/20F305249D,/volumes/1289675b7f03,/volumes/222079443d03,/volumes/cc66d284719d,/volumes/ca6f98cd3c16,/volumes/6611045c7cf2,/volumes/381ee044d930,/volumes/ff81968af32c,/volumes/9d611128cfed,/volumes/21F306AD4F,/volumes/595892cb8709,/volumes/0553ccb52b90 -max=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 -concurrentDownloadLimitMB=20000 -concurrentUploadLimitMB=20000 -hasSlowRead=true -readBufferSizeMB=8 -compactionMBps=10 -rack=store02 -ip=10.0.9.2
  • OS version
  • Debian GNU/Linux 12 (bookworm) / Linux store02 6.1.0-17-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.69-1 (2023-12-30) x86_64 GNU/Linux
  • output of weed version
  • version 30GB 3.62 59b8af99b0aca1b9e88fec7b5f27c7d15e5e8604 linux amd64
  • no filer, only masters and volume servers; we have our own metadata store.

Expected behavior
When we request one or more keys from the leader master, we don't expect timeouts; normally we get an instant response. But sometimes (every x minutes) we get a timeout on a request like http://10.0.9.17:9333/dir/assign?collection=nntp&count=10000&replication=001, even when we lower the count. Unfortunately the request itself doesn't return an error, but whenever this happens I see the following log entry:
seaweedfs-master[459769]: I0116 14:50:10.468052 master_server_handlers.go:125 dirAssign volume growth {"collection":"nntp","replication":{"node":1},"ttl":{"Count":0,"Unit":0}} from 10.0.9.12:40308

It looks like this always happens when there is a dirAssign volume growth.
In parallel there are constant POST requests going directly to the volume servers to store data.

I thought a workaround would be to use replication 010 or 002, but that only works for a while.

We just started testing SeaweedFS with 3.60, and after upgrading to 3.62 we still see this issue. (I didn't test older versions.)

Additional context
How I test/reproduce it:

while true; do curl --max-time 3 'http://localhost:9333/dir/assign?collection=nntp&replication=001'; echo "" ; done

Once every couple of seconds/minutes I get a timeout (also when I increase the max-time):

curl: (28) Operation timed out after 3000 milliseconds with 0 bytes received

At that moment I always see a volume growth message in the log:

seaweedfs-master[459769]: I0116 14:50:10.468052 master_server_handlers.go:125 dirAssign volume growth {"collection":"nntp","replication":{"node":1},"ttl":{"Count":0,"Unit":0}} from 10.0.9.12:40308
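
The same loop as a small Go program, in case that's easier to run (a sketch only; the endpoint, collection, and 3-second timeout are taken from the curl command above). It prints how long each assign takes so slow calls can be matched against the volume growth log entries:

package main

import (
    "fmt"
    "io"
    "net/http"
    "time"
)

func main() {
    // Same request as the curl loop, with a 3s client-side timeout.
    client := &http.Client{Timeout: 3 * time.Second}
    for {
        start := time.Now()
        resp, err := client.Get("http://localhost:9333/dir/assign?collection=nntp&replication=001")
        elapsed := time.Since(start)
        if err != nil {
            // This is where the timeouts show up.
            fmt.Printf("assign failed after %v: %v\n", elapsed, err)
            continue
        }
        body, _ := io.ReadAll(resp.Body)
        resp.Body.Close()
        fmt.Printf("assign took %v: %s\n", elapsed, body)
    }
}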

Screen shot
[Screenshot 2024-01-17 at 23:09:24]


bvanelst commented Jan 18, 2024

I downgraded to 3.59 and the timeouts/failed assigns seem to be gone. Also the log message below disappeared.

seaweedfs-master[375357]: I0118 11:52:34.725059 master_server_handlers.go:123 dirAssign volume growth {"collection":"nntp","replication":{"rack":1},"ttl":{"Count":0,"Unit":0},"preallocate":20971520000} from 10.0.9.12:36966

As you can see the traffic on our POST server is much more stable after the downgrade.

[Screenshot 2024-01-18 at 12:52:56]

The disk utilization of the volume servers is also much better.

[Screenshot 2024-01-18 at 14:58:34]

I think it could be related to #5154

@chrislusf
Collaborator

Added a fix for #5154

@BenoitKnecht
Contributor

> Added a fix for #5154

I'm still getting the timeouts described by @bvanelst with that fix, and I think I found what's causing it. In

if err := <-errCh; err != nil {
    writeJsonError(w, r, http.StatusInternalServerError, fmt.Errorf("cannot grow volume group! %v", err))
    return
}

we wait on the req.ErrCh channel. That channel is closed when we exit ProcessGrowRequest() here

if req.ErrCh != nil {
    req.ErrCh <- err
    close(req.ErrCh)
}

but not if we exit it there

} else {
    glog.V(4).Infoln("discard volume grow request")
    time.Sleep(time.Millisecond * 211)
    vl.DoneGrowRequest()
}

It would be easy enough to explicitly close it in that else branch (which I did locally, and it got rid of the timeouts), but I wonder if there are other situations where this channel is not properly closed; e.g. should it be closed when we skip the loop too?

if !ms.Topo.IsLeader() {
    //discard buffered requests
    time.Sleep(time.Second * 1)
    continue
}
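
For reference, this is roughly the change I tried locally in that else branch (same identifiers as in the snippets above; a sketch of the idea, not a polished patch). Closing req.ErrCh without sending anything makes the handler's <-errCh receive a nil error, so the assign request proceeds instead of hanging until the client times out:

} else {
    glog.V(4).Infoln("discard volume grow request")
    time.Sleep(time.Millisecond * 211)
    vl.DoneGrowRequest()
    // Unblock the /dir/assign handler waiting on req.ErrCh; a closed
    // channel yields a nil error, so the handler continues normally.
    if req.ErrCh != nil {
        close(req.ErrCh)
    }
}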

@chrislusf What do you think?
