Current IP pool utilization endpoint #5258
Conversation
```rust
// TODO: should this just return the vec of ranges and live in ip_pool.rs?
// TODO: generally we never retrieve all of anything, how bad is that
```
👁️ 👁️
It probably will not be unusual at all. The IPv6 space is so huge it’s almost laughable. The most common subnet size is a /64, and technically you’re not even supposed to carve out smaller ones at all. We will certainly want a way to report the utilization of ranges up to the full /48 or whatever that we carve out for each VPC.
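To give a sense of the scale involved: the number of addresses in an IPv6 prefix is 2^(128 − prefix length), which only fits in a `u128`. A minimal sketch (the helper name is mine, not from the PR):

```rust
/// Number of addresses in an IPv6 prefix of the given length.
/// (Illustrative helper, not taken from the PR's code.)
fn ipv6_prefix_capacity(prefix_len: u32) -> u128 {
    // A shift by 128 would overflow, so a /0 is out of scope for this sketch.
    assert!((1..=128).contains(&prefix_len));
    1u128 << (128 - prefix_len)
}

fn main() {
    // The common /64 subnet holds 2^64 addresses -- already past u32 range.
    assert_eq!(ipv6_prefix_capacity(64), 1u128 << 64);
    // A /48 carved out per VPC holds 2^80 addresses.
    assert_eq!(ipv6_prefix_capacity(48), 1u128 << 80);
}
```

Even a single /64 range therefore needs more than 64 bits to report its capacity, which is what motivates the `u128` handling below.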
Thanks. I will look into a more elaborate BigInt setup. Probably a string with a custom format like `"uint128"`.
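The string-with-custom-format idea boils down to shipping the 128-bit value as a decimal string, since JSON numbers can't reliably carry it. A stdlib-only sketch of the round trip (in practice serde would wrap this; the function names are mine):

```rust
// Sketch: carry a u128 across JSON as a decimal string, assuming the
// "uint128" custom format discussed in this PR. Not the PR's actual code.
fn to_uint128_string(n: u128) -> String {
    n.to_string()
}

fn from_uint128_string(s: &str) -> Result<u128, std::num::ParseIntError> {
    s.parse::<u128>()
}

fn main() {
    let capacity: u128 = 1u128 << 80; // e.g. a /48's worth of addresses
    let wire = to_uint128_string(capacity);
    // The client parses the string back into a 128-bit integer (or BigInt).
    assert_eq!(from_uint128_string(&wire).unwrap(), capacity);
}
```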
openapi/nexus.json (Outdated)
```json
"allocated": {
  "description": "The number of IPs allocated from the pool\n\nLike total, this is a numeric string with a custom \"uint128\" format representing a 128-bit integer, though in practice it is extremely unlikely to be bigger than 32 bits (2^32 = ~4.29 billion).",
  "type": "string",
  "format": "uint128"
}
```
This is new. Client generators will have to look for this and handle it appropriately.
Actually... this is non-trivial for the TS client. The only transformation of response JSON we currently do is parsing dates from strings, and we do that based solely on information present in the response, i.e., a) the value is a string, b) the key starts with `time_` or equals `timestamp`, and c) the value parses successfully as a date (source).
I had forgotten that while we do use the format info in the schema for generating the types (and the runtime validators we currently only use in the mock API), we do not have any of that info at request/response time, or more precisely, we do not currently encode that info in the generated code that runs at that time. So I will have to think about how we're going to handle running this through BigInt. Some options involve making a change in this PR to make it easier for us in the generator, for example adding a special prefix or suffix on the field name.
cc @ahl
FelixMcFelix left a comment
Thanks a lot! I think this looks good on the omicron side. Please ping me if you need to rework the u128 handling to make it easier for consumers, and I can take another look.
```rust
// This is a bit strange but it should be ok.
// a) SQL tends to represent counts as signed integers even though
//    they can't really be negative, and Diesel follows that. We
//    assume here that it will be a positive i64.
```
I think practically this is a safe assumption; if we're approaching
```rust
use db::schema::ip_pool_range;

let ranges: Vec<IpPoolRange> = ip_pool_range::table
```
This is combining v4 and v6 capacity in a given pool -- is that the intended behaviour long-term? It's fine today since v6 isn't yet functional.
Interesting question! It hadn't occurred to me to give separate allocated and capacity counts for v4 and v6, but it makes a lot of sense, doesn't it? I will think about doing that here; it's always nice to avoid followup API changes where we can. One nice thing about that is that the IPv4 counts can be u32s.
Another way we could do this would be to have a given pool be IPv4 or IPv6 only, but I don't really like that. Right now you can add both IPv4 and IPv6 ranges to a pool, and it would be kind of complicated and/or confusing to require pools to have only one or the other.
All right, check out 39edb8f. A little complicated, maybe, but the end result is

```
{
  ipv4: { allocated, capacity },
  ipv6: { allocated, capacity },
}
```

Sound good?
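A Rust sketch of a response type matching that shape (the type and field names here are mine, not necessarily the PR's):

```rust
/// Hypothetical per-version utilization counts mirroring the JSON shape above.
struct Ipv4Utilization {
    allocated: u32,
    capacity: u32,
}

struct Ipv6Utilization {
    // 128-bit counts; in the API these are serialized as
    // "uint128"-format strings, per the discussion in this PR.
    allocated: u128,
    capacity: u128,
}

struct IpPoolUtilization {
    ipv4: Ipv4Utilization,
    ipv6: Ipv6Utilization,
}

fn main() {
    let util = IpPoolUtilization {
        ipv4: Ipv4Utilization { allocated: 10, capacity: 256 },
        ipv6: Ipv6Utilization { allocated: 0, capacity: 1u128 << 64 },
    };
    assert_eq!(util.ipv4.capacity, 256);
    assert_eq!(util.ipv6.capacity, 1u128 << 64);
}
```

Splitting the versions this way is what lets the v4 side use plain `u32`s while only the v6 side needs the custom `uint128` string format.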
I think so, thanks for doing this! This also helps a lot since external v4s can be quite constrained in practice.
```rust
"Failed to convert i64 {} IPv4 address count to u32",
allocated.ipv4
))
})?,
```
I'm slightly nervous about this but I feel like it's going to be really, really hard to hit this 500!
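The conversion under discussion can be sketched with `u32::try_from`, which only fails for a negative count or one above `u32::MAX` (and since the whole IPv4 space is 2^32 addresses, an allocated count can't legitimately get there). A sketch under those assumptions, with a hypothetical helper name:

```rust
/// Sketch of converting a Diesel-style i64 count into the u32 used in the
/// API response, turning an out-of-range value into an error rather than
/// panicking. Not the PR's exact code.
fn ipv4_count_to_u32(allocated: i64) -> Result<u32, String> {
    u32::try_from(allocated).map_err(|_| {
        format!("Failed to convert i64 {} IPv4 address count to u32", allocated)
    })
}

fn main() {
    assert_eq!(ipv4_count_to_u32(42), Ok(42));
    // Both failure modes would surface as the 500 discussed above.
    assert!(ipv4_count_to_u32(-1).is_err());
    assert!(ipv4_count_to_u32(u32::MAX as i64 + 1).is_err());
}
```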
```rust
// exactly as vulnerable as if it were paginated. If there are more
// than 10,000 ranges in a pool, we will undercount, but I have a
// hard time seeing that as a practical problem.
.limit(10000)
```
In private chat we discussed the alternative of doing this calculation in SQL so we don't have to limit it here. What we found is that neither CRDB nor Postgres support subtraction of IPv6 addresses when the result is greater than a 64 bit integer, and we couldn't figure out how to convert the addresses to 128 bit integers first and then subtract. So it seems we are forced to do this in Rust.
```
> select 'ffff::'::inet - '::1';
ERROR: result out of range
```
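Since the database can't express the 128-bit subtraction, the size of a v6 range can be computed in Rust by converting the addresses to `u128` first (a sketch, assuming `first <= last`; the helper name is mine):

```rust
use std::net::Ipv6Addr;

/// Number of addresses in an inclusive IPv6 range, as a u128.
/// Sketch of the calculation CRDB/Postgres can't do; assumes first <= last.
fn ipv6_range_size(first: Ipv6Addr, last: Ipv6Addr) -> u128 {
    u128::from(last) - u128::from(first) + 1
}

fn main() {
    let first: Ipv6Addr = "fd00::".parse().unwrap();
    let last: Ipv6Addr = "fd00::ffff".parse().unwrap();
    assert_eq!(ipv6_range_size(first, last), 1 << 16);

    // The case that overflowed the database: 'ffff::' - '::1' does not
    // fit in a 64-bit integer, but is fine as a u128.
    let big_first: Ipv6Addr = "::1".parse().unwrap();
    let big_last: Ipv6Addr = "ffff::".parse().unwrap();
    assert!(ipv6_range_size(big_first, big_last) > u64::MAX as u128);
}
```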
Closes #5189
The strangest thing here is that IPv6 `allocated` and `capacity` are both u128 values that we put in the JSON spec as `"type": "string", "format": "uint128"`, where `uint128` is a custom format our client generators will have to look for and handle. `capacity` is huge because IPv6 ranges are huge. IPv4 `allocated` and `capacity` are `u32`.

The strangest thing here is the fact that we have to handle the unlikely possibility that an IPv6 range is too large for us to serialize its length in a JSON response. IPv6 range size is a `u128`, and we seem to be capping our numbers in API responses at `u32`. My proposal for handling this is simply to make the field nullable in the response, and to make clear that a null value means it was too large to serialize. I think this is fine because this should only happen if the operator does something really bad, and it is well understood that the operator can mess up networking with a misconfiguration. Another approach would be to always return the size as a string, but given the unlikelihood of the number being too large (and the fact that it's pathological if it is), returning a numeric string puts undue burden on clients to handle it.