3.60,3.61,3.62 regression test failure #5154

Closed
chenrui333 opened this issue Jan 2, 2024 · 18 comments

@chenrui333
Contributor

👋 Trying to build the latest release, but ran into a build issue. The error log is below:

Error build log:
Error: seaweedfs: failed
An exception occurred within a child process:
  Minitest::Assertion: --- expected
+++ actual
@@ -1,3 +1,3 @@
 # encoding: US-ASCII
 #    valid: true
-"126fa3d13c112c9c49d563b00836149bed94117edb54101a1a4d9c60ad0244be"
+"e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

Full build log: https://github.com/Homebrew/homebrew-core/actions/runs/7382019708/job/20081913872?pr=158730
Relates to Homebrew/homebrew-core#158730.

@chrislusf
Collaborator

I could not follow the errors. Maybe it is related to #5153.

@chenrui333
Contributor Author

Yeah, it could be. Let me run a quick check.

@chenrui333
Contributor Author

After the patch, it is still the same:

==> Testing seaweedfs
I0102 17:10:58.114325 master.go:269 current: 172.21.82.195:63757 peers:
I0102 17:10:58.142689 file_util.go:27 Folder /private/tmp/seaweedfs-test-20240102-34518-75m69v Permission: -rwx------
I0102 17:10:58.142807 file_util.go:27 Folder /private/tmp/seaweedfs-test-20240102-34518-75m69v Permission: -rwx------
I0102 17:10:58.142982 master.go:269 current: 172.21.82.195:63757 peers:172.21.82.195:63757.63759
I0102 17:10:58.143229 master_server.go:127 Volume Size Limit is 30000 MB
I0102 17:10:58.143382 master.go:150 Start Seaweed Master 30GB 3.60 d4e91b6ad9cda07c9ad790e82593caba975cc8e8 at 0.0.0.0:63757
I0102 17:10:58.143620 raft_server.go:118 Starting RaftServer with 172.21.82.195:63757
I0102 17:10:58.144143 volume_grpc_client_to_master.go:43 checkWithMaster 172.21.82.195:63757: get master 172.21.82.195:63757 configuration: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 172.21.82.195:63759: connect: connection refused"
I0102 17:10:58.155498 raft_server.go:167 current cluster leader:
I0102 17:10:59.936141 volume_grpc_client_to_master.go:43 checkWithMaster 172.21.82.195:63757: get master 172.21.82.195:63757 configuration: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 172.21.82.195:63759: connect: connection refused"
I0102 17:11:01.729481 volume_grpc_client_to_master.go:43 checkWithMaster 172.21.82.195:63757: get master 172.21.82.195:63757 configuration: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 172.21.82.195:63759: connect: connection refused"
I0102 17:11:03.522788 volume_grpc_client_to_master.go:43 checkWithMaster 172.21.82.195:63757: get master 172.21.82.195:63757 configuration: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 172.21.82.195:63759: connect: connection refused"
I0102 17:11:05.316260 volume_grpc_client_to_master.go:43 checkWithMaster 172.21.82.195:63757: get master 172.21.82.195:63757 configuration: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 172.21.82.195:63759: connect: connection refused"
I0102 17:11:07.112930 volume_grpc_client_to_master.go:43 checkWithMaster 172.21.82.195:63757: get master 172.21.82.195:63757 configuration: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 172.21.82.195:63759: connect: connection refused"
I0102 17:11:08.905017 volume_grpc_client_to_master.go:43 checkWithMaster 172.21.82.195:63757: get master 172.21.82.195:63757 configuration: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 172.21.82.195:63759: connect: connection refused"
I0102 17:11:10.698129 volume_grpc_client_to_master.go:43 checkWithMaster 172.21.82.195:63757: get master 172.21.82.195:63757 configuration: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 172.21.82.195:63759: connect: connection refused"
I0102 17:11:12.490310 volume_grpc_client_to_master.go:43 checkWithMaster 172.21.82.195:63757: get master 172.21.82.195:63757 configuration: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 172.21.82.195:63759: connect: connection refused"
I0102 17:11:14.284583 volume_grpc_client_to_master.go:43 checkWithMaster 172.21.82.195:63757: get master 172.21.82.195:63757 configuration: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 172.21.82.195:63759: connect: connection refused"
I0102 17:11:16.077943 volume_grpc_client_to_master.go:43 checkWithMaster 172.21.82.195:63757: get master 172.21.82.195:63757 configuration: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 172.21.82.195:63759: connect: connection refused"
I0102 17:11:16.958814 master_server.go:215 [172.21.82.195:63757.63759]  - is the leader.
I0102 17:11:16.959991 master.go:201 Start Seaweed Master 30GB 3.60 d4e91b6ad9cda07c9ad790e82593caba975cc8e8 grpc server at 0.0.0.0:63759
I0102 17:11:18.461692 masterclient.go:156 No existing leader found!
I0102 17:11:18.461865 raft_server.go:189 Initializing new cluster
I0102 17:11:18.462047 master_server.go:174 leader change event:  => 172.21.82.195:63757.63759
I0102 17:11:18.462112 master_server.go:177 [172.21.82.195:63757.63759] 172.21.82.195:63757.63759 becomes leader.
I0102 17:11:18.583334 master_grpc_server.go:349 + client .master@172.21.82.195:63757.63759
I0102 17:11:18.794923 disk_location.go:238 Store started on dir: /private/tmp/seaweedfs-test-20240102-34518-75m69v with 0 volumes max 8
I0102 17:11:18.795048 disk_location.go:241 Store started on dir: /private/tmp/seaweedfs-test-20240102-34518-75m69v with 0 ec shards
I0102 17:11:18.795653 volume.go:370 Start Seaweed volume server 30GB 3.60 d4e91b6ad9cda07c9ad790e82593caba975cc8e8 at 0.0.0.0:63758
I0102 17:11:18.795954 volume_grpc_client_to_master.go:52 Volume server start with seed master nodes: [172.21.82.195:63757]
I0102 17:11:18.798668 volume_grpc_client_to_master.go:109 Heartbeat to: 172.21.82.195:63757
I0102 17:11:18.799216 node.go:250 topo adds child DefaultDataCenter
I0102 17:11:18.799301 node.go:250 topo:DefaultDataCenter adds child DefaultRack
I0102 17:11:18.799312 node.go:250 topo:DefaultDataCenter:DefaultRack adds child 172.21.82.195:63758
I0102 17:11:18.799352 node.go:250 topo:DefaultDataCenter:DefaultRack:172.21.82.195:63758 adds child
I0102 17:11:18.799362 master_grpc_server.go:138 added volume server 0: 172.21.82.195:63758 [ea788b47-4f3d-4e77-b394-822e1e1b404a]
I0102 17:11:18.799375 master_grpc_server.go:49 found new uuid:172.21.82.195:63758 [ea788b47-4f3d-4e77-b394-822e1e1b404a] , map[172.21.82.195:63758:[ea788b47-4f3d-4e77-b394-822e1e1b404a]]
==> curl http://localhost:63757/dir/assign
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0I0102 17:11:27.696740 master_server_handlers.go:125 dirAssign volume growth {"replication":{},"ttl":{"Count":0,"Unit":0}} from [::1]:63959
I0102 17:11:27.697541 store.go:166 In dir /private/tmp/seaweedfs-test-20240102-34518-75m69v adds volume:1 collection: replicaPlacement:000 ttl:
I0102 17:11:27.697798 volume_loading.go:142 loading memory index /private/tmp/seaweedfs-test-20240102-34518-75m69v/1.idx to memory
I0102 17:11:27.704223 store.go:170 add volume 1
I0102 17:11:27.704248 volume_grpc_client_to_master.go:179 volume server 172.21.82.195:63758 adds volume 1
I0102 17:11:27.704416 volume_growth.go:246 Created Volume 1 on topo:DefaultDataCenter:DefaultRack:172.21.82.195:63758
I0102 17:11:27.704426 volume_layout.go:405 Volume 1 becomes writable
I0102 17:11:27.704430 volume_growth.go:259 Registered Volume 1 on topo:DefaultDataCenter:DefaultRack:172.21.82.195:63758
I0102 17:11:27.704618 store.go:166 In dir /private/tmp/seaweedfs-test-20240102-34518-75m69v adds volume:2 collection: replicaPlacement:000 ttl:
I0102 17:11:27.704924 volume_loading.go:142 loading memory index /private/tmp/seaweedfs-test-20240102-34518-75m69v/2.idx to memory
I0102 17:11:27.709716 store.go:170 add volume 2
I0102 17:11:27.709728 volume_grpc_client_to_master.go:179 volume server 172.21.82.195:63758 adds volume 2
I0102 17:11:27.709827 volume_growth.go:246 Created Volume 2 on topo:DefaultDataCenter:DefaultRack:172.21.82.195:63758
I0102 17:11:27.709832 volume_layout.go:405 Volume 2 becomes writable
I0102 17:11:27.709836 volume_growth.go:259 Registered Volume 2 on topo:DefaultDataCenter:DefaultRack:172.21.82.195:63758
I0102 17:11:27.709958 store.go:166 In dir /private/tmp/seaweedfs-test-20240102-34518-75m69v adds volume:3 collection: replicaPlacement:000 ttl:
I0102 17:11:27.710324 volume_loading.go:142 loading memory index /private/tmp/seaweedfs-test-20240102-34518-75m69v/3.idx to memory
I0102 17:11:27.714995 store.go:170 add volume 3
I0102 17:11:27.715016 volume_grpc_client_to_master.go:179 volume server 172.21.82.195:63758 adds volume 3
I0102 17:11:27.715142 volume_growth.go:246 Created Volume 3 on topo:DefaultDataCenter:DefaultRack:172.21.82.195:63758
I0102 17:11:27.715150 volume_layout.go:405 Volume 3 becomes writable
I0102 17:11:27.715167 volume_growth.go:259 Registered Volume 3 on topo:DefaultDataCenter:DefaultRack:172.21.82.195:63758
I0102 17:11:27.715331 store.go:166 In dir /private/tmp/seaweedfs-test-20240102-34518-75m69v adds volume:4 collection: replicaPlacement:000 ttl:
I0102 17:11:27.715652 volume_loading.go:142 loading memory index /private/tmp/seaweedfs-test-20240102-34518-75m69v/4.idx to memory
I0102 17:11:27.720319 store.go:170 add volume 4
I0102 17:11:27.720333 volume_grpc_client_to_master.go:179 volume server 172.21.82.195:63758 adds volume 4
I0102 17:11:27.720467 volume_growth.go:246 Created Volume 4 on topo:DefaultDataCenter:DefaultRack:172.21.82.195:63758
I0102 17:11:27.720476 volume_layout.go:405 Volume 4 becomes writable
I0102 17:11:27.720480 volume_growth.go:259 Registered Volume 4 on topo:DefaultDataCenter:DefaultRack:172.21.82.195:63758
I0102 17:11:27.720645 store.go:166 In dir /private/tmp/seaweedfs-test-20240102-34518-75m69v adds volume:5 collection: replicaPlacement:000 ttl:
I0102 17:11:27.720985 volume_loading.go:142 loading memory index /private/tmp/seaweedfs-test-20240102-34518-75m69v/5.idx to memory
I0102 17:11:27.725710 store.go:170 add volume 5
I0102 17:11:27.725721 volume_grpc_client_to_master.go:179 volume server 172.21.82.195:63758 adds volume 5
I0102 17:11:27.725807 volume_growth.go:246 Created Volume 5 on topo:DefaultDataCenter:DefaultRack:172.21.82.195:63758
I0102 17:11:27.725814 volume_layout.go:405 Volume 5 becomes writable
I0102 17:11:27.725817 volume_growth.go:259 Registered Volume 5 on topo:DefaultDataCenter:DefaultRack:172.21.82.195:63758
I0102 17:11:27.725986 store.go:166 In dir /private/tmp/seaweedfs-test-20240102-34518-75m69v adds volume:6 collection: replicaPlacement:000 ttl:
I0102 17:11:27.726295 volume_loading.go:142 loading memory index /private/tmp/seaweedfs-test-20240102-34518-75m69v/6.idx to memory
I0102 17:11:27.730753 store.go:170 add volume 6
I0102 17:11:27.730764 volume_grpc_client_to_master.go:179 volume server 172.21.82.195:63758 adds volume 6
I0102 17:11:27.730865 volume_growth.go:246 Created Volume 6 on topo:DefaultDataCenter:DefaultRack:172.21.82.195:63758
I0102 17:11:27.730871 volume_layout.go:405 Volume 6 becomes writable
I0102 17:11:27.730875 volume_growth.go:259 Registered Volume 6 on topo:DefaultDataCenter:DefaultRack:172.21.82.195:63758
I0102 17:11:27.731048 store.go:166 In dir /private/tmp/seaweedfs-test-20240102-34518-75m69v adds volume:7 collection: replicaPlacement:000 ttl:
I0102 17:11:27.731474 volume_loading.go:142 loading memory index /private/tmp/seaweedfs-test-20240102-34518-75m69v/7.idx to memory
I0102 17:11:27.736966 store.go:170 add volume 7
I0102 17:11:27.736981 volume_grpc_client_to_master.go:179 volume server 172.21.82.195:63758 adds volume 7
I0102 17:11:27.737154 volume_growth.go:246 Created Volume 7 on topo:DefaultDataCenter:DefaultRack:172.21.82.195:63758
I0102 17:11:27.737160 volume_layout.go:405 Volume 7 becomes writable
I0102 17:11:27.737164 volume_growth.go:259 Registered Volume 7 on topo:DefaultDataCenter:DefaultRack:172.21.82.195:63758
I0102 17:11:27.737235 common.go:75 response method:GET URL:/dir/assign with httpStatus:406 and JSON:{"error":"failed to find writable volumes for collection: replication:000 ttl: error: No more writable volumes!"}
100   113  100   113    0     0   2722      0 --:--:-- --:--:-- --:--:--  2756
==> curl -F file=@/opt/homebrew/Library/Homebrew/test/support/fixtures/test.png http://localhost:63758/
I0102 17:11:27.768767 volume_server_handlers_write.go:28 NewVolumeId error: strconv.ParseUint: parsing "": invalid syntax
I0102 17:11:27.768790 common.go:75 response method:POST URL:/ with httpStatus:400 and JSON:{"error":"strconv.ParseUint: parsing \"\": invalid syntax"}
==> curl http://localhost:63758/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
Killing child processes...
volume server has been killed
W0102 17:11:27.876574 master_grpc_server.go:100 SendHeartbeat.Recv server 172.21.82.195:63758 : rpc error: code = Canceled desc = context canceled
I0102 17:11:27.876587 master_grpc_server.go:365 - client .master@172.21.82.195:63757.63759
I0102 17:11:27.876597 volume_server.go:139 Stopping volume server...
I0102 17:11:27.876746 masterclient.go:225 .master masterClient failed to receive from 172.21.82.195:63757: rpc error: code = Unavailable desc = error reading from server: EOF
I0102 17:11:27.884242 topology_event_handling.go:87 Removing Volume 4 from the dead volume server 172.21.82.195:63758
I0102 17:11:27.884371 volume_layout.go:444 Volume 4 has 0 replica, less than required 1
I0102 17:11:27.884375 volume_layout.go:392 Volume 4 becomes unwritable
I0102 17:11:27.884378 topology_event_handling.go:87 Removing Volume 5 from the dead volume server 172.21.82.195:63758
I0102 17:11:27.884382 volume_layout.go:444 Volume 5 has 0 replica, less than required 1
I0102 17:11:27.884384 volume_layout.go:392 Volume 5 becomes unwritable
I0102 17:11:27.884385 topology_event_handling.go:87 Removing Volume 6 from the dead volume server 172.21.82.195:63758
I0102 17:11:27.884388 volume_layout.go:444 Volume 6 has 0 replica, less than required 1
I0102 17:11:27.884390 volume_layout.go:392 Volume 6 becomes unwritable
I0102 17:11:27.884391 topology_event_handling.go:87 Removing Volume 7 from the dead volume server 172.21.82.195:63758
I0102 17:11:27.884394 volume_layout.go:444 Volume 7 has 0 replica, less than required 1
I0102 17:11:27.884395 volume_layout.go:392 Volume 7 becomes unwritable
I0102 17:11:27.884397 topology_event_handling.go:87 Removing Volume 1 from the dead volume server 172.21.82.195:63758
I0102 17:11:27.884845 volume_layout.go:444 Volume 1 has 0 replica, less than required 1
I0102 17:11:27.884849 volume_layout.go:392 Volume 1 becomes unwritable
I0102 17:11:27.884852 topology_event_handling.go:87 Removing Volume 2 from the dead volume server 172.21.82.195:63758
I0102 17:11:27.884855 volume_layout.go:444 Volume 2 has 0 replica, less than required 1
I0102 17:11:27.884856 volume_layout.go:392 Volume 2 becomes unwritable
I0102 17:11:27.884860 topology_event_handling.go:87 Removing Volume 3 from the dead volume server 172.21.82.195:63758
I0102 17:11:27.884863 volume_layout.go:444 Volume 3 has 0 replica, less than required 1
I0102 17:11:27.884865 volume_layout.go:392 Volume 3 becomes unwritable
I0102 17:11:27.884880 node.go:262 topo:DefaultDataCenter:DefaultRack removes 172.21.82.195:63758
I0102 17:11:27.884884 master_grpc_server.go:87 unregister disconnected volume server 172.21.82.195:63758
I0102 17:11:27.884887 master_grpc_server.go:58 remove volume server 172.21.82.195:63758, online volume server: map[]
I0102 17:11:27.889379 volume.go:281 stop send heartbeat and wait 10 seconds until shutdown ...

@chenrui333
Contributor Author

Not sure why it is saying "No more writable volumes!"

This is the server config:

    # Start SeaweedFS master server/volume server
    master_port = free_port
    volume_port = free_port
    master_grpc_port = free_port
    volume_grpc_port = free_port

    fork do
      exec bin/"weed", "server", "-dir=#{testpath}", "-ip.bind=0.0.0.0",
           "-master.port=#{master_port}", "-volume.port=#{volume_port}",
           "-master.port.grpc=#{master_grpc_port}", "-volume.port.grpc=#{volume_grpc_port}"
    end
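As a hedged sketch (the helper below is illustrative, not part of the formula): since the failure happens on the assign right after startup, one option is to poll /dir/assign until the master returns a fid before uploading.

    require "json"
    require "net/http"

    # Hypothetical helper: retry /dir/assign until a fid comes back,
    # in case the first request races with volume creation.
    def wait_for_assign(master_port, attempts: 10)
      attempts.times do
        body = Net::HTTP.get(URI("http://localhost:#{master_port}/dir/assign"))
        fid = JSON.parse(body)["fid"]
        return fid if fid
        sleep 1
      end
      raise "no writable volume after #{attempts} attempts"
    end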

@chenrui333
Contributor Author

The manual calls just work for me:

$ curl http://localhost:55281/dir/assign
{"fid":"7,051a67aa86","url":"172.21.82.195:55282","publicUrl":"172.21.82.195:55282","count":1}

$ curl http://localhost:55281/dir/assign
{"fid":"2,066a3da793","url":"172.21.82.195:55282","publicUrl":"172.21.82.195:55282","count":1}

$ curl http://localhost:55281/dir/assign
{"fid":"4,07b46bb754","url":"172.21.82.195:55282","publicUrl":"172.21.82.195:55282","count":1}

$ curl http://localhost:55281/dir/assign
{"fid":"3,08b44e8031","url":"172.21.82.195:55282","publicUrl":"172.21.82.195:55282","count":1}

@chrislusf
Collaborator

Try to limit the volume size to 1000MB and set the volume.max to 0.

@duanhongyi
Contributor

Same error:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x21e4bb9]

goroutine 71 [running]:
github.com/seaweedfs/seaweedfs/weed/command.(*S3Options).startS3Server(0x458a1c0)
        /opt/drycc/go/src/seaweedfs/weed/command/s3.go:226 +0x4b9
github.com/seaweedfs/seaweedfs/weed/command.runFiler.func1(0x0?)
        /opt/drycc/go/src/seaweedfs/weed/command/filer.go:185 +0x31
created by github.com/seaweedfs/seaweedfs/weed/command.runFiler
        /opt/drycc/go/src/seaweedfs/weed/command/filer.go:183 +0x2ae

@chrislusf
Collaborator

The commit for #5153 should have fixed this nil error.

@duanhongyi
Contributor

duanhongyi commented Jan 3, 2024

@chrislusf Could you release a small version to solve this problem, such as 3.61?

@chenrui333
Contributor Author

Try to limit the volume size to 1000MB and set the volume.max to 0.

     fork do
       exec bin/"weed", "server", "-dir=#{testpath}", "-ip.bind=0.0.0.0",
+           "-master.volumeSizeLimitMB=1000", "-volume.max=0",
            "-master.port=#{master_port}", "-volume.port=#{volume_port}",
            "-master.port.grpc=#{master_grpc_port}", "-volume.port.grpc=#{volume_grpc_port}"
     end

Tried it, still the same issue.

@cr0c0dylus

Segmentation violation with filer:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1cf9bf5]

goroutine 80 [running]:
github.com/seaweedfs/seaweedfs/weed/command.(*S3Options).startS3Server(0x3cc9860)
/go/src/github.com/seaweedfs/seaweedfs/weed/command/s3.go:226 +0x495
github.com/seaweedfs/seaweedfs/weed/command.runFiler.func1(0x0?)
/go/src/github.com/seaweedfs/seaweedfs/weed/command/filer.go:185 +0x26
created by github.com/seaweedfs/seaweedfs/weed/command.runFiler in goroutine 1
/go/src/github.com/seaweedfs/seaweedfs/weed/command/filer.go:183 +0x2aa

@jan-berge-ommedal

jan-berge-ommedal commented Jan 15, 2024

I think we see the same thing:

  • the first assign against a fresh weed server provisions writable volumes, then fails the assign request
  • any subsequent assigns work fine

This regressed in 3.60; in 3.59 the first assign works as expected.
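A rough reproduction sketch of that sequence, assuming a freshly started weed server and using a placeholder master port:

    require "net/http"

    assign = URI("http://localhost:9333/dir/assign")  # placeholder master port

    # First assign after startup: triggers volume growth but, per this report,
    # responds with the "No more writable volumes!" error.
    puts Net::HTTP.get(assign)

    # Any subsequent assign succeeds and returns a fid.
    puts Net::HTTP.get(assign)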

@chrislusf
Collaborator

@jan-berge-ommedal thanks for the easy way to reproduce this!

@chrislusf
Collaborator

This should be fixed now in 3002087.

@chenrui333
Contributor Author

Let me give that a shot. Thanks!

@chenrui333
Contributor Author

This should be fixed now in 3002087.

I can confirm that patch works for me!! Thanks!

@chenrui333
Contributor Author

Shipped on the Homebrew side via Homebrew/homebrew-core#160016.

chenrui333 changed the title from "3.60 regression test failure" to "3.60,3.61,3.62 regression test failure" on Jan 18, 2024.