DB update_actor_placement assert #788

Closed
plajjan opened this issue Aug 7, 2022 · 1 comment

plajjan commented Aug 7, 2022

We hit an assertion failure in the test that runs an Acton program against the DB.

ddb_test_server: backend/client_api.c:799: update_actor_placement: Assertion `first_rts != NULL' failed.

The traceback shows that the test fails at line 406 of test_db.py, in test_app_resume_tcp_server:

 Traceback (most recent call last):
  File "/__w/acton/acton/test/./test_db.py", line 406, in test_app_resume_tcp_server
    self.assertEqual(tcp_cmd(self.p, app_port, "GET"), "1")
  File "/__w/acton/acton/test/./test_db.py", line 107, in tcp_cmd
    return tcp_cmd(p, port, cmd, retries-1)
  File "/__w/acton/acton/test/./test_db.py", line 107, in tcp_cmd
    return tcp_cmd(p, port, cmd, retries-1)
  File "/__w/acton/acton/test/./test_db.py", line 88, in tcp_cmd
    raise TcpCmdError(f"Process is dead, returncode: {p.returncode}  stdout: {p.stdout and p.stdout.read()}  stderr: {p.stderr and p.stderr.read()}")
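
The repeated frames at line 107 and the final TcpCmdError suggest a helper that retries the TCP command and gives up once the program under test has exited. The following is a minimal sketch of that pattern, not the actual code in test_db.py: the body of tcp_cmd, the default retry count, and describe_returncode are assumptions made for illustration. It also shows why the reported returncode of -6 fits an assertion failure: on POSIX systems a negative subprocess return code means the process was killed by that signal, and signal 6 is SIGABRT, which is what a failed assert() raises.

```python
import signal
import socket
import time


class TcpCmdError(Exception):
    """Raised when the program under test has exited."""


def describe_returncode(returncode):
    """Explain a subprocess return code (negative means killed by a signal)."""
    if returncode is not None and returncode < 0:
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exit status {returncode}"


def tcp_cmd(p, port, cmd, retries=50):
    """Hypothetical reconstruction: send cmd over TCP, retrying on failure.

    p is a subprocess.Popen-like object exposing poll() and returncode.
    """
    # Stop retrying as soon as the process is gone; nothing will answer.
    if p.poll() is not None:
        raise TcpCmdError(f"Process is dead, {describe_returncode(p.returncode)}")
    try:
        with socket.create_connection(("127.0.0.1", port), timeout=1) as s:
            s.sendall(cmd.encode())
            return s.recv(1024).decode().strip()
    except OSError:
        if retries <= 0:
            raise
        time.sleep(0.05)
        # Each retry adds one stack frame, as seen in the traceback.
        return tcp_cmd(p, port, cmd, retries - 1)
```

With this shape, a crash of the RTS process shows up in the test as TcpCmdError rather than as a connection error, so the real cause has to be read from the captured stderr.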

Looking at test_db.py, this happens after the program is resumed from the DB. Could it be that first_rts is sometimes not set correctly when resuming from the DB?
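
One way such an assertion could fire is sketched below. This is a toy model in Python, not the C code in backend/client_api.c: the Rts class, first_live_rts, and the placement logic are hypothetical, and the meaning of the status values is an assumption (0 is treated as live). What the raw output does show is that the client membership node has status=0 in the first agreed view and status=1 in the second gossip message, which arrives right before the assertion.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Assumption: status 0 means "live". The log does not define status values.
LIVE = 0


@dataclass
class Rts:
    hostname: str
    rts_id: int
    status: int


def first_live_rts(membership: List[Rts]) -> Optional[Rts]:
    """Return the first RTS considered live, or None if there is none."""
    for rts in membership:
        if rts.status == LIVE:
            return rts
    return None


def update_actor_placement(actors: List[int], membership: List[Rts]) -> Dict[int, int]:
    """Hypothetical stand-in: place every actor on the first live RTS."""
    first_rts = first_live_rts(membership)
    # Same shape as the failing check: aborts when no RTS qualifies.
    assert first_rts is not None, "first_rts != NULL"
    return {actor_id: first_rts.rts_id for actor_id in actors}


# First view: the only RTS has status=0, so placement succeeds.
print(update_actor_placement([-30, -23], [Rts("localhost", 65535, 0)]))

# Second view: the only RTS has status=1, so no candidate is found.
try:
    update_actor_placement([-30, -23], [Rts("localhost", 65535, 1)])
except AssertionError as exc:
    print("assertion failed:", exc)
```

If the real code selects first_rts in a similar way, a membership update that marks the only RTS as not live while restored actors are already placed would leave it with no candidate.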

There is also an error about unpacking a client message:

backend/failure_detector/db_queries.c: 2233: RTS   : error unpacking client message

This was from the job https://github.com/actonlang/acton/runs/7710752909?check_suite_focus=true (raw output below):

TcpCmdError: Process is dead, returncode: -6  stdout: b''  stderr: b"07:43:38.373681 \x1b[32mINFO \x1b[0m \x1b[90mrts/rts.c           : 2008:\x1b[0m RTS   : Detected 2 CPUs: Using 4 worker threads, due to low CPU count. No CPU affinity used.\n\n07:43:38.373867 \x1b[32mINFO \x1b[0m \x1b[90mrts/rts.c           : 2062:\x1b[0m RTS   : Starting distributed RTS node, host=localhost, node_id=-1, rack_id=-1, datacenter_id=-1\n\n07:43:38.373880 \x1b[32mINFO \x1b[0m \x1b[90mrts/rts.c           : 2063:\x1b[0m RTS   : Using distributed database backend replication factor of 3\n\n07:43:38.373887 \x1b[32mINFO \x1b[0m \x1b[90mrts/rts.c           : 2075:\x1b[0m RTS   : Using distributed database backend (DDB): 127.0.0.1:30400\n\n07:43:38.373893 \x1b[32mINFO \x1b[0m \x1b[90mrts/rts.c           : 2075:\x1b[0m RTS   : Using distributed database backend (DDB): 127.0.0.1:30402\n\n07:43:38.373899 \x1b[32mINFO \x1b[0m \x1b[90mrts/rts.c           : 2075:\x1b[0m RTS   : Using distributed database backend (DDB): 127.0.0.1:30404\n\n07:43:38.374912 \x1b[31mERROR\x1b[0m \x1b[90mbackend/failure_detector/db_queries.c: 2233:\x1b[0m RTS   : error unpacking client message\n\n07:43:38.375136 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/failure_detector/db_queries.c: 2246:\x1b[0m RTS   : Received gossip message: Membership_agreement_msg(type=2, ack_status=0, nonce=163208764967, Membership(Node(status=0, node_id=95937, rack_id=0, dc_id=0, hostname=127.0.0.1, port=30401), Node(status=0, node_id=95939, rack_id=0, dc_id=0, hostname=127.0.0.1, port=30403), Node(status=0, node_id=95941, rack_id=0, dc_id=0, hostname=127.0.0.1, port=30405), ), Client Membership(Node(status=0, node_id=131071, rack_id=0, dc_id=0, hostname=localhost, port=65535), , VC(95937:60, 95939:56, 95941:53))\nERROR: Server address 127.0.0.1:30400 was already added to membership!\nERROR: Server address 127.0.0.1:30402 was already added to membership!\nERROR: Server address 127.0.0.1:30404 was already added to membership!\n07:43:38.375281 
\x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  588:\x1b[0m RTS   : Added RTS localhost:65535 - 131071 (-181345760) to membership!\n07:43:38.375291 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  788:\x1b[0m RTS   : CLIENT: Updating actor placement. Previous actor membership: actor_membership()\n\n07:43:38.375302 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  825:\x1b[0m RTS   : CLIENT: Actor membership: actor_membership()\n\n07:43:38.375313 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  231:\x1b[0m RTS   : CLIENT: Installed new agreed view Membership_agreement_msg(type=2, ack_status=0, nonce=163208764967, Membership(Node(status=0, node_id=95937, rack_id=0, dc_id=0, hostname=127.0.0.1, port=30401), Node(status=0, node_id=95939, rack_id=0, dc_id=0, hostname=127.0.0.1, port=30403), Node(status=0, node_id=95941, rack_id=0, dc_id=0, hostname=127.0.0.1, port=30405), ), Client Membership(Node(status=0, node_id=131071, rack_id=0, dc_id=0, hostname=localhost, port=65535), , VC(95937:60, 95939:56, 95941:53))\n\n07:43:38.375321 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  234:\x1b[0m RTS   : CLIENT: RTS membership: RTS_membership(RTS(status=0, rack_id=0, dc_id=0, hostname=localhost, rts_id=65535, local_index=131071), )\n\n07:43:38.375374 \x1b[32mINFO \x1b[0m \x1b[90mrts/rts.c           : 2084:\x1b[0m RTS   : Checking for existing actor state in DDB.\n\n07:43:38.375791 \x1b[32mINFO \x1b[0m \x1b[90mrts/rts.c           : 2087:\x1b[0m RTS   : Found 7 existing actors; Restoring actor state from DDB.\n\n07:43:38.376086 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  834:\x1b[0m RTS   : Adding actor -30 to membership!\n07:43:38.376102 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  867:\x1b[0m RTS   : Added actor -30 (-437258406) to membership!\n07:43:38.376109 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  869:\x1b[0m RTS   : CLIENT: Actor membership: actor_membership(Actor(actor_id=-30 (-437258406), rts=localhost:65535, rts_index=131071, 
rack_id=0, dc_id=0, status=0), )\n\n07:43:38.376115 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  834:\x1b[0m RTS   : Adding actor -23 to membership!\n07:43:38.376121 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  867:\x1b[0m RTS   : Added actor -23 (-2031240869) to membership!\n07:43:38.376145 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  869:\x1b[0m RTS   : CLIENT: Actor membership: actor_membership(Actor(actor_id=-30 (-437258406), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-23 (-2031240869), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), )\n\n07:43:38.376155 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  834:\x1b[0m RTS   : Adding actor -18 to membership!\n07:43:38.376161 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  867:\x1b[0m RTS   : Added actor -18 (-883753697) to membership!\n07:43:38.376168 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  869:\x1b[0m RTS   : CLIENT: Actor membership: actor_membership(Actor(actor_id=-30 (-437258406), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-23 (-2031240869), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-18 (-883753697), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), )\n\n07:43:38.376174 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  834:\x1b[0m RTS   : Adding actor -17 to membership!\n07:43:38.376180 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  867:\x1b[0m RTS   : Added actor -17 (1729844353) to membership!\n07:43:38.376187 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  869:\x1b[0m RTS   : CLIENT: Actor membership: actor_membership(Actor(actor_id=-30 (-437258406), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-23 (-2031240869), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-18 (-883753697), rts=localhost:65535, rts_index=131071, 
rack_id=0, dc_id=0, status=0), Actor(actor_id=-17 (1729844353), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), )\n\n07:43:38.376194 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  834:\x1b[0m RTS   : Adding actor -14 to membership!\n07:43:38.376201 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  867:\x1b[0m RTS   : Added actor -14 (-550086187) to membership!\n07:43:38.376208 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  869:\x1b[0m RTS   : CLIENT: Actor membership: actor_membership(Actor(actor_id=-30 (-437258406), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-23 (-2031240869), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-18 (-883753697), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-17 (1729844353), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-14 (-550086187), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), )\n\n07:43:38.376214 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  834:\x1b[0m RTS   : Adding actor -12 to membership!\n07:43:38.376220 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  867:\x1b[0m RTS   : Added actor -12 (-264701623) to membership!\n07:43:38.376228 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  869:\x1b[0m RTS   : CLIENT: Actor membership: actor_membership(Actor(actor_id=-30 (-437258406), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-23 (-2031240869), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-18 (-883753697), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-17 (1729844353), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-14 (-550086187), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-12 (-264701623), rts=localhost:65535, 
rts_index=131071, rack_id=0, dc_id=0, status=0), )\n\n07:43:38.376242 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  834:\x1b[0m RTS   : Adding actor -11 to membership!\n07:43:38.376248 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  867:\x1b[0m RTS   : Added actor -11 (2095136240) to membership!\n07:43:38.376257 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  869:\x1b[0m RTS   : CLIENT: Actor membership: actor_membership(Actor(actor_id=-30 (-437258406), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-23 (-2031240869), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-18 (-883753697), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-17 (1729844353), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-14 (-550086187), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-12 (-264701623), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-11 (2095136240), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), )\n\n07:43:38.376429 \x1b[31mERROR\x1b[0m \x1b[90mbackend/failure_detector/db_queries.c: 2233:\x1b[0m RTS   : error unpacking client message\n\n07:43:38.376486 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/failure_detector/db_queries.c: 2246:\x1b[0m RTS   : Received gossip message: Membership_agreement_msg(type=2, ack_status=2, nonce=129738077111243, Membership(Node(status=0, node_id=95937, rack_id=0, dc_id=0, hostname=127.0.0.1, port=30401), Node(status=0, node_id=95939, rack_id=0, dc_id=0, hostname=127.0.0.1, port=30403), Node(status=0, node_id=95941, rack_id=0, dc_id=0, hostname=127.0.0.1, port=30405), ), Client Membership(Node(status=1, node_id=131071, rack_id=0, dc_id=0, hostname=localhost, port=65535), , VC(95937:60, 95939:55, 95941:54))\nERROR: Server address 127.0.0.1:30400 was already added to membership!\nERROR: Server 
address 127.0.0.1:30402 was already added to membership!\nERROR: Server address 127.0.0.1:30404 was already added to membership!\n07:43:38.376622 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  559:\x1b[0m RTS   : RTS address localhost:65535 was already added to membership, updating status and metadata!\n\n07:43:38.376650 \x1b[36mDEBUG\x1b[0m \x1b[90mbackend/client_api.c:  788:\x1b[0m RTS   : CLIENT: Updating actor placement. Previous actor membership: actor_membership(Actor(actor_id=-30 (-437258406), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-23 (-2031240869), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-18 (-883753697), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-17 (1729844353), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-14 (-550086187), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-12 (-264701623), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), Actor(actor_id=-11 (2095136240), rts=localhost:65535, rts_index=131071, rack_id=0, dc_id=0, status=0), )\n\nddb_test_server: backend/client_api.c:799: update_actor_placement: Assertion `first_rts != NULL' failed.\n"
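
The raw output above is the Python repr of the captured stderr bytes, so color escape codes and newlines appear as literal \x1b[...m and \n. A small helper along these lines can make such output readable; it is only a sketch, and the names clean_log and errors_only are made up for this example.

```python
import re

# Matches ANSI SGR color sequences such as ESC[32m or ESC[0m.
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def clean_log(raw: bytes) -> list:
    """Decode captured stderr, strip color codes, and drop empty lines."""
    text = raw.decode("utf-8", errors="replace")
    text = ANSI_SGR.sub("", text)
    return [line.rstrip() for line in text.splitlines() if line.strip()]


def errors_only(lines: list) -> list:
    """Keep only error lines and assertion failures."""
    return [
        line
        for line in lines
        if " ERROR " in line or line.startswith("ERROR:") or "Assertion" in line
    ]


if __name__ == "__main__":
    sample = (
        b"07:43:38.374912 \x1b[31mERROR\x1b[0m "
        b"\x1b[90mbackend/failure_detector/db_queries.c: 2233:\x1b[0m "
        b"RTS   : error unpacking client message\n\n"
        b"07:43:38.375374 \x1b[32mINFO \x1b[0m "
        b"\x1b[90mrts/rts.c           : 2084:\x1b[0m "
        b"RTS   : Checking for existing actor state in DDB.\n"
    )
    for line in errors_only(clean_log(sample)):
        print(line)
```

In a test, p.stderr.read() could be passed straight to clean_log before the lines are put into the exception message.
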
plajjan added the bug (Something isn't working) and DB (Related to the backend database) labels Aug 7, 2022

aagapi commented Sep 28, 2022

Fixed by #906

aagapi closed this as completed Sep 28, 2022