Conversation

@smklein
Collaborator

@smklein smklein commented Jan 27, 2022

No description provided.

@smklein smklein changed the title from "Use 0.0.0.0 in place of 127.0.0.1 to appease docker" to "Simulated Crucible Agent: Use 0.0.0.0 in place of 127.0.0.1 to appease docker" Jan 27, 2022
@smklein smklein changed the title from "Simulated Crucible Agent: Use 0.0.0.0 in place of 127.0.0.1 to appease docker" to "Simulated Crucible Agent: Use 0.0.0.0 (not 127.0.0.1) to appease docker" Jan 27, 2022
@smklein smklein requested a review from david-crespo January 27, 2022 18:28
@smklein smklein enabled auto-merge (squash) January 27, 2022 18:35
@smklein smklein disabled auto-merge January 27, 2022 18:52
@davepacheco
Collaborator

I think this means the server will be exposed on all local addresses. That feels like not a great default -- somebody running in their home network or even with a local internet-facing IP address would find their server exposed on those networks, too. I'm not sure it's worth the effort to do this here but it'd be nice to make this configurable and use 0.0.0.0 in the Docker environment if that's what we want there and keep using 127.0.0.1 as a more secure default.
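A minimal sketch of the "configurable, with a secure default" idea above, using only the Rust standard library. The helper name `listen_ip` and its `override_ip` parameter are hypothetical and are not taken from this PR; they stand in for whatever CLI flag or config value would carry the address.

```rust
use std::net::{IpAddr, Ipv4Addr};

/// Hypothetical helper (not from this PR): choose the IP to listen on.
/// `override_ip` stands in for a value read from a CLI flag or config file.
/// With no override, fall back to loopback so the server is not exposed
/// on other local networks.
fn listen_ip(override_ip: Option<IpAddr>) -> IpAddr {
    override_ip.unwrap_or(IpAddr::V4(Ipv4Addr::LOCALHOST))
}

fn main() {
    // Default: loopback only.
    let default_ip = listen_ip(None);
    // Docker-style environment: explicitly opt in to all interfaces.
    let docker_ip = listen_ip(Some(IpAddr::V4(Ipv4Addr::UNSPECIFIED)));
    println!("default: {default_ip}, docker: {docker_ip}");
}
```

The point of the design is that exposure on all interfaces becomes an explicit choice by whoever sets up the Docker environment, rather than the behavior everyone gets.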

@david-crespo
Contributor

> I think this means the server will be exposed on all local addresses. That feels like not a great default -- somebody running in their home network or even with a local internet-facing IP address would find their server exposed on those networks, too. I'm not sure it's worth the effort to do this here but it'd be nice to make this configurable and use 0.0.0.0 in the Docker environment if that's what we want there and keep using 127.0.0.1 as a more secure default.

That's the goal; we just want to make sure this fixes the problem I'm seeing on GCP.

@david-crespo
Contributor

welp. docker is not appeased. same error, now with 0.0.0.0 in it

error sending request for url (http://0.0.0.0:41783/crucible/0/regions): error trying to connect: tcp connect error: Connection refused (os error 111), response_code: 500

@david-crespo
Contributor

david-crespo commented Jan 27, 2022

For the record, here are the details. Note that the timestamps on the 500s in the nexus log (bottom) are interleaved with, but don't really line up with, the 200s in the sled-agent log. Very odd!

david_crespo@console-git-bump-api:~$ docker logs sled-agent 2>&1 | grep "Created Simulated Crucible" | tail -10
Jan 27 19:18:08.179 INFO Created Simulated Crucible Server, address: 0.0.0.0:46289, kind: storage, server: 5223674a-d003-4149-97a9-89330cc1bffd, component: SledAgent
Jan 27 19:18:08.555 INFO Created Simulated Crucible Server, address: 0.0.0.0:33293, kind: storage, server: 5223674a-d003-4149-97a9-89330cc1bffd, component: SledAgent
Jan 27 19:18:08.640 INFO Created Simulated Crucible Server, address: 0.0.0.0:35507, kind: storage, server: 5223674a-d003-4149-97a9-89330cc1bffd, component: SledAgent
Jan 27 19:18:08.710 INFO Created Simulated Crucible Server, address: 0.0.0.0:40333, kind: storage, server: 5223674a-d003-4149-97a9-89330cc1bffd, component: SledAgent
Jan 27 19:18:08.752 INFO Created Simulated Crucible Server, address: 0.0.0.0:39349, kind: storage, server: 5223674a-d003-4149-97a9-89330cc1bffd, component: SledAgent
Jan 27 19:18:08.792 INFO Created Simulated Crucible Server, address: 0.0.0.0:35055, kind: storage, server: 5223674a-d003-4149-97a9-89330cc1bffd, component: SledAgent
Jan 27 19:18:08.839 INFO Created Simulated Crucible Server, address: 0.0.0.0:35835, kind: storage, server: 5223674a-d003-4149-97a9-89330cc1bffd, component: SledAgent
Jan 27 19:18:08.904 INFO Created Simulated Crucible Server, address: 0.0.0.0:33285, kind: storage, server: 5223674a-d003-4149-97a9-89330cc1bffd, component: SledAgent
Jan 27 19:18:08.948 INFO Created Simulated Crucible Server, address: 0.0.0.0:42807, kind: storage, server: 5223674a-d003-4149-97a9-89330cc1bffd, component: SledAgent
Jan 27 19:18:09.002 INFO Created Simulated Crucible Server, address: 0.0.0.0:32937, kind: storage, server: 5223674a-d003-4149-97a9-89330cc1bffd, component: SledAgent

david_crespo@console-git-bump-api:~$ docker logs sled-agent 2>&1 | grep /crucible/0/regions
Jan 27 19:18:37.976 INFO request completed, response_code: 200, uri: /crucible/0/regions, method: POST, req_id: d0314241-e177-4238-8615-3bf66066212c, remote_addr: 127.0.0.1:50202, local_addr: 0.0.0.0:35507, component: Simulated CrucibleAgent Dropshot Server, kind: storage, server: 5223674a-d003-4149-97a9-89330cc1bffd, component: SledAgent
Jan 27 19:18:38.377 INFO request completed, response_code: 200, uri: /crucible/0/regions, method: POST, req_id: 6fadd51e-99d1-4a4b-b222-8df8fdb3fc3a, remote_addr: 127.0.0.1:50208, local_addr: 0.0.0.0:35507, component: Simulated CrucibleAgent Dropshot Server, kind: storage, server: 5223674a-d003-4149-97a9-89330cc1bffd, component: SledAgent
Jan 27 19:18:38.754 INFO request completed, response_code: 200, uri: /crucible/0/regions, method: POST, req_id: b79d8a34-a1ee-4466-ad14-2998bfda51ae, remote_addr: 127.0.0.1:50216, local_addr: 0.0.0.0:35507, component: Simulated CrucibleAgent Dropshot Server, kind: storage, server: 5223674a-d003-4149-97a9-89330cc1bffd, component: SledAgent
Jan 27 19:18:38.758 INFO request completed, response_code: 200, uri: /crucible/0/regions, method: POST, req_id: 1b394c16-4381-4344-8821-84ad516c557e, remote_addr: 127.0.0.1:41044, local_addr: 0.0.0.0:39349, component: Simulated CrucibleAgent Dropshot Server, kind: storage, server: 5223674a-d003-4149-97a9-89330cc1bffd, component: SledAgent
Jan 27 19:18:39.139 INFO request completed, response_code: 200, uri: /crucible/0/regions, method: POST, req_id: ab332674-250e-479e-ab80-bd71ea8b5555, remote_addr: 127.0.0.1:36106, local_addr: 0.0.0.0:35055, component: Simulated CrucibleAgent Dropshot Server, kind: storage, server: 5223674a-d003-4149-97a9-89330cc1bffd, component: SledAgent
Jan 27 19:18:39.139 INFO request completed, response_code: 200, uri: /crucible/0/regions, method: POST, req_id: 605c0c23-e822-4aae-838c-bb5b699400af, remote_addr: 127.0.0.1:37994, local_addr: 0.0.0.0:33285, component: Simulated CrucibleAgent Dropshot Server, kind: storage, server: 5223674a-d003-4149-97a9-89330cc1bffd, component: SledAgent
Jan 27 19:18:39.139 INFO request completed, response_code: 200, uri: /crucible/0/regions, method: POST, req_id: a9b95fd1-9a43-4a4b-817c-cff0b8619365, remote_addr: 127.0.0.1:50224, local_addr: 0.0.0.0:35507, component: Simulated CrucibleAgent Dropshot Server, kind: storage, server: 5223674a-d003-4149-97a9-89330cc1bffd, component: SledAgent
Jan 27 19:18:39.902 INFO request completed, response_code: 200, uri: /crucible/0/regions, method: POST, req_id: 00a104eb-f055-444f-9b9b-76796d63219f, remote_addr: 127.0.0.1:57900, local_addr: 0.0.0.0:40333, component: Simulated CrucibleAgent Dropshot Server, kind: storage, server: 5223674a-d003-4149-97a9-89330cc1bffd, component: SledAgent
Jan 27 19:18:40.325 INFO request completed, response_code: 200, uri: /crucible/0/regions, method: POST, req_id: 0327efa3-8a32-4d76-8378-168dac1a4584, remote_addr: 127.0.0.1:57904, local_addr: 0.0.0.0:40333, component: Simulated CrucibleAgent Dropshot Server, kind: storage, server: 5223674a-d003-4149-97a9-89330cc1bffd, component: SledAgent

david_crespo@console-git-bump-api:~$ docker logs nexus 2>&1 | grep /crucible/0/regions
Jan 27 19:18:38.125 INFO request completed, error_message_external: Internal Server Error, error_message_internal: error sending request for url (http://0.0.0.0:44177/crucible/0/regions): error trying to connect: tcp connect error: Connection refused (os error 111), response_code: 500, uri: /organizations/maze-war/projects/prod-online/disks, method: POST, req_id: 52cb41f6-19ef-49f7-9021-3df038e5a144, remote_addr: 127.0.0.1:39404, local_addr: 0.0.0.0:8888, component: dropshot_external, name: e6bff1ff-24fb-49dc-a54e-c6a350cd4d6c
Jan 27 19:18:38.526 INFO request completed, error_message_external: Internal Server Error, error_message_internal: error sending request for url (http://0.0.0.0:37405/crucible/0/regions): error trying to connect: tcp connect error: Connection refused (os error 111), response_code: 500, uri: /organizations/maze-war/projects/prod-online/disks, method: POST, req_id: 4ec89eb7-f422-44b0-baa2-e29be2c5b1f7, remote_addr: 127.0.0.1:39412, local_addr: 0.0.0.0:8888, component: dropshot_external, name: e6bff1ff-24fb-49dc-a54e-c6a350cd4d6c
Jan 27 19:18:38.893 INFO request completed, error_message_external: Internal Server Error, error_message_internal: error sending request for url (http://0.0.0.0:44129/crucible/0/regions): error trying to connect: tcp connect error: Connection refused (os error 111), response_code: 500, uri: /organizations/maze-war/projects/prod-online/disks, method: POST, req_id: 80f910cf-e099-41a0-81c4-49958d017ff9, remote_addr: 127.0.0.1:39420, local_addr: 0.0.0.0:8888, component: dropshot_external, name: e6bff1ff-24fb-49dc-a54e-c6a350cd4d6c
Jan 27 19:18:40.078 INFO request completed, error_message_external: Internal Server Error, error_message_internal: error sending request for url (http://0.0.0.0:41783/crucible/0/regions): error trying to connect: tcp connect error: Connection refused (os error 111), response_code: 500, uri: /organizations/maze-war/projects/prod-online/disks, method: POST, req_id: ae1c28f5-73ad-4799-8a73-213bd6f0978a, remote_addr: 127.0.0.1:39446, local_addr: 0.0.0.0:8888, component: dropshot_external, name: e6bff1ff-24fb-49dc-a54e-c6a350cd4d6c
Jan 27 19:18:40.469 INFO request completed, error_message_external: Internal Server Error, error_message_internal: error sending request for url (http://0.0.0.0:39409/crucible/0/regions): error trying to connect: tcp connect error: Connection refused (os error 111), response_code: 500, uri: /organizations/maze-war/projects/prod-online/disks, method: POST, req_id: 33c3648b-3d0b-46e8-a69c-aa0609994282, remote_addr: 127.0.0.1:39454, local_addr: 0.0.0.0:8888, component: dropshot_external, name: e6bff1ff-24fb-49dc-a54e-c6a350cd4d6c

@smklein
Collaborator Author

smklein commented Jan 27, 2022

The URLs being requested by Nexus also don't seem to align with the ports advertised by Crucible. See http://0.0.0.0:44177/crucible/0/regions: port 44177 shouldn't have been advertised here. I'll take a closer look at the Nexus side, but just checking: we're wiping the DB clean between multiple tests here, right?
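One way to check that mismatch mechanically is to pull the ports out of both logs and compare them. This is a rough sketch, not tooling from the repository: the helper names `port_after` and `unadvertised` are made up, and the sample lines are trimmed copies of the log output above. Keep in mind that the sled-agent listing was cut with `tail -10`, so a port missing from it may still have been advertised earlier.

```rust
use std::collections::HashSet;

/// Hypothetical helper: extract the port that follows `marker` in a log line.
fn port_after(line: &str, marker: &str) -> Option<u16> {
    let rest = &line[line.find(marker)? + marker.len()..];
    let digits: String = rest.chars().take_while(|c| c.is_ascii_digit()).collect();
    digits.parse().ok()
}

/// Hypothetical helper: ports that Nexus requested but that never appear
/// in the given sled-agent "Created Simulated Crucible Server" lines.
fn unadvertised(advertised: &[&str], requested: &[&str]) -> Vec<u16> {
    let known: HashSet<u16> = advertised
        .iter()
        .filter_map(|l| port_after(l, "address: 0.0.0.0:"))
        .collect();
    requested
        .iter()
        .filter_map(|l| port_after(l, "http://0.0.0.0:"))
        .filter(|p| !known.contains(p))
        .collect()
}

fn main() {
    // Trimmed from the sled-agent log above.
    let advertised = [
        "INFO Created Simulated Crucible Server, address: 0.0.0.0:46289, kind: storage",
        "INFO Created Simulated Crucible Server, address: 0.0.0.0:35507, kind: storage",
    ];
    // Trimmed from the nexus log above.
    let requested = [
        "error sending request for url (http://0.0.0.0:44177/crucible/0/regions)",
        "error sending request for url (http://0.0.0.0:41783/crucible/0/regions)",
    ];
    println!("{:?}", unadvertised(&advertised, &requested));
}
```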

@smklein smklein changed the title from "Simulated Crucible Agent: Use 0.0.0.0 (not 127.0.0.1) to appease docker" to "Simulated Crucible Agent: Use CLI-configured IP address" Jan 27, 2022
  .set((
      dsl::time_modified.eq(Utc::now()),
-     dsl::pool_id.eq(excluded(dsl::id)),
+     dsl::pool_id.eq(excluded(dsl::pool_id)),
Collaborator Author

I don't think this was the manifestation of the problem, but this was definitely a bug.

storage: ConfigStorage {
// Create 10 "virtual" U.2s, with 1 TB of storage.
zpools: vec![ConfigZpool { size: 1 << 40 }; 10],
ip: args.sled_agent_addr.ip(),
Collaborator Author

At @david-crespo's suggestion, I'm just re-using whatever IP we supplied for the sled agent here when allocating ports for Crucible too.
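As a rough standard-library illustration of that approach: take the IP from the sled agent's socket address and bind a new listener on it. The helper `bind_on_same_ip` is hypothetical, and using port 0 so that the OS picks a free port is an assumption here; the PR itself only shows `ip: args.sled_agent_addr.ip()`.

```rust
use std::net::{SocketAddr, TcpListener};

/// Hypothetical helper (not from this PR): bind on the same IP as the
/// sled agent. Port 0 asks the OS for any free port (an assumption about
/// how the simulated servers get their ports).
fn bind_on_same_ip(sled_agent_addr: SocketAddr) -> std::io::Result<TcpListener> {
    TcpListener::bind(SocketAddr::new(sled_agent_addr.ip(), 0))
}

fn main() -> std::io::Result<()> {
    // Example address only; the real value would come from the CLI.
    let sled_agent_addr: SocketAddr = "127.0.0.1:12345".parse().expect("valid socket address");
    let listener = bind_on_same_ip(sled_agent_addr)?;
    // The address to advertise is whatever the OS actually assigned.
    println!("simulated server listening on {}", listener.local_addr()?);
    Ok(())
}
```

Advertising the result of `local_addr()` rather than a hard-coded address keeps the advertised IP and port in step with what was actually bound.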

@smklein smklein merged commit 908599d into main Jan 27, 2022
@smklein smklein deleted the storage-addr branch January 27, 2022 20:55
leftwo pushed a commit that referenced this pull request Apr 19, 2024
Propolis changes:
Update h2 dependency
Add NPT ops API definitions from illumos#15639
server: return better HTTP errors when not ensured (#649)

Crucible changes:
Make Region test suite generic across backends (#1263)
Remove async from now-synchronous functions (#1264)
Agent update to support cloning. (#1262)
Remove the Active → Faulted transition (#1260)
Avoid race condition in crutest rand-read/write (#1261)
Add Active -> Offline -> Faulted tests (#1257)
Reorganize dummy downstairs tests (#1253)
Switch to unbounded queues (#1256)
Add Upstairs session ID to dtrace stat probe, cleanup closure (#1254)
Panic instead of returning errors in unit tests (#1251)
Add a clone option to downstairs create (#1249)
leftwo added a commit that referenced this pull request Apr 19, 2024
Propolis changes:
Update h2 dependency
Add NPT ops API definitions from illumos#15639
server: return better HTTP errors when not ensured (#649)

Crucible changes:
Make Region test suite generic across backends (#1263) Remove async from
now-synchronous functions (#1264) Agent update to support cloning.
(#1262)
Remove the Active → Faulted transition (#1260)
Avoid race condition in crutest rand-read/write (#1261) Add Active ->
Offline -> Faulted tests (#1257)
Reorganize dummy downstairs tests (#1253)
Switch to unbounded queues (#1256)
Add Upstairs session ID to dtrace stat probe, cleanup closure (#1254)
Panic instead of returning errors in unit tests (#1251) Add a clone
option to downstairs create (#1249)

Co-authored-by: Alan Hanson <alan@oxide.computer>
4 participants