
Optimize VRP struct sizes. Use netip.Prefix instead of net.IPNet. #103

Merged: 1 commit merged into bgp:master on Dec 21, 2023

Conversation

@floatingstatic (Contributor) commented on Dec 14, 2023

I'm evaluating stayrtr in a memory-constrained environment, and its memory footprint could be optimized a bit. It seems the bulk of the memory is allocated during updates, but there are also some things that could be modified to reduce memory usage during steady state. To that end, I'm proposing the following changes:

  • Move from Go 1.17 to Go 1.21

  • Use netip.Prefix instead of net.IPNet in a few structs to reduce field sizes (this requires at least Go 1.18, which is why the base Go version in go.mod changes). The bulk of the lines changed in this PR relate to this swap, but the logic is largely unchanged: we simply replace functions from net with their netip equivalents (a Go sketch of the combined swap and field reordering follows this change list)

  • Reorder fields in the VRP struct to reduce its size from 64 bytes to 40 bytes

Current:

$ structlayout . VRP
VRP.Prefix.IP net.IP: 0-24 (size 24, align 8)
VRP.Prefix.Mask net.IPMask: 24-48 (size 24, align 8)
VRP.MaxLen uint8: 48-49 (size 1, align 1)
padding: 49-52 (size 3, align 0)
VRP.ASN uint32: 52-56 (size 4, align 4)
VRP.Flags uint8: 56-57 (size 1, align 1)
padding: 57-64 (size 7, align 0)

New:

$ structlayout . VRP
VRP.Prefix.ip.addr.hi uint64: 0-8 (size 8, align 8)
VRP.Prefix.ip.addr.lo uint64: 8-16 (size 8, align 8)
VRP.Prefix.ip.z *internal/intern.Value: 16-24 (size 8, align 8)
VRP.Prefix.bitsPlusOne uint8: 24-25 (size 1, align 1)
padding: 25-32 (size 7, align 0)
VRP.ASN uint32: 32-36 (size 4, align 4)
VRP.MaxLen uint8: 36-37 (size 1, align 1)
VRP.Flags uint8: 37-38 (size 1, align 1)
padding: 38-40 (size 2, align 0)
  • Reorder fields in the PDUIPv4Prefix and PDUIPv6Prefix structs to reduce their sizes from 72 bytes to 40 bytes

Current:

$ structlayout . PDUIPv4Prefix
PDUIPv4Prefix.Version uint8: 0-1 (size 1, align 1)
padding: 1-8 (size 7, align 0)
PDUIPv4Prefix.Prefix.IP net.IP: 8-32 (size 24, align 8)
PDUIPv4Prefix.Prefix.Mask net.IPMask: 32-56 (size 24, align 8)
PDUIPv4Prefix.MaxLen uint8: 56-57 (size 1, align 1)
padding: 57-60 (size 3, align 0)
PDUIPv4Prefix.ASN uint32: 60-64 (size 4, align 4)
PDUIPv4Prefix.Flags uint8: 64-65 (size 1, align 1)
padding: 65-72 (size 7, align 0)

$ structlayout . PDUIPv6Prefix
PDUIPv6Prefix.Version uint8: 0-1 (size 1, align 1)
padding: 1-8 (size 7, align 0)
PDUIPv6Prefix.Prefix.IP net.IP: 8-32 (size 24, align 8)
PDUIPv6Prefix.Prefix.Mask net.IPMask: 32-56 (size 24, align 8)
PDUIPv6Prefix.MaxLen uint8: 56-57 (size 1, align 1)
padding: 57-60 (size 3, align 0)
PDUIPv6Prefix.ASN uint32: 60-64 (size 4, align 4)
PDUIPv6Prefix.Flags uint8: 64-65 (size 1, align 1)
padding: 65-72 (size 7, align 0)

New:

$ structlayout . PDUIPv4Prefix
PDUIPv4Prefix.Prefix.ip.addr.hi uint64: 0-8 (size 8, align 8)
PDUIPv4Prefix.Prefix.ip.addr.lo uint64: 8-16 (size 8, align 8)
PDUIPv4Prefix.Prefix.ip.z *internal/intern.Value: 16-24 (size 8, align 8)
PDUIPv4Prefix.Prefix.bitsPlusOne uint8: 24-25 (size 1, align 1)
padding: 25-32 (size 7, align 0)
PDUIPv4Prefix.ASN uint32: 32-36 (size 4, align 4)
PDUIPv4Prefix.Version uint8: 36-37 (size 1, align 1)
PDUIPv4Prefix.MaxLen uint8: 37-38 (size 1, align 1)
PDUIPv4Prefix.Flags uint8: 38-39 (size 1, align 1)
padding: 39-40 (size 1, align 0)

$ structlayout . PDUIPv6Prefix
PDUIPv6Prefix.Prefix.ip.addr.hi uint64: 0-8 (size 8, align 8)
PDUIPv6Prefix.Prefix.ip.addr.lo uint64: 8-16 (size 8, align 8)
PDUIPv6Prefix.Prefix.ip.z *internal/intern.Value: 16-24 (size 8, align 8)
PDUIPv6Prefix.Prefix.bitsPlusOne uint8: 24-25 (size 1, align 1)
padding: 25-32 (size 7, align 0)
PDUIPv6Prefix.ASN uint32: 32-36 (size 4, align 4)
PDUIPv6Prefix.Version uint8: 36-37 (size 1, align 1)
PDUIPv6Prefix.MaxLen uint8: 37-38 (size 1, align 1)
PDUIPv6Prefix.Flags uint8: 38-39 (size 1, align 1)
padding: 39-40 (size 1, align 0)
  • Additional minor fixes to reduce memory: for example, adding a boolean so that ComputeDiff() does not return unchanged VRPs except in tests, as these are only needed for debug logging. We also modify ConvertSDListToMap() to return a map whose values contain only the previous flags rather than the entire struct; as best I can tell, this map's values are only read in one place, and only to get the previous flag values. (A hedged sketch of this map change appears at the end of this description.)
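
As a concrete illustration of the swap and the reordering, here is a minimal, self-contained Go sketch (not the stayrtr source verbatim) that reproduces the two VRP layouts shown above. Go never reorders struct fields, so replacing net.IPNet with netip.Prefix and packing the two uint8 fields directly behind the 4-byte ASN is what removes most of the padding:

package main

import (
	"fmt"
	"net"
	"net/netip"
	"unsafe"
)

// Old layout: net.IPNet carries two 24-byte slice headers (48 bytes on amd64),
// and the field order leaves padding after MaxLen and Flags.
type vrpOld struct {
	Prefix net.IPNet
	MaxLen uint8
	ASN    uint32
	Flags  uint8
}

// New layout: netip.Prefix is a 32-byte value type, and the small fields are
// grouped behind the uint32 ASN to minimise padding.
type vrpNew struct {
	Prefix netip.Prefix
	ASN    uint32
	MaxLen uint8
	Flags  uint8
}

func main() {
	// Prints "64 40" on a 64-bit platform, matching the structlayout output above.
	fmt.Println(unsafe.Sizeof(vrpOld{}), unsafe.Sizeof(vrpNew{}))
}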

Doing a simple side-by-side test, with stayrtr started at roughly the same time on two hosts, we can compare memory RSS of the current master branch (yellow) against my forked branch (green):

[image: memory RSS comparison, master branch (yellow) vs. forked branch (green)]

For what it's worth, it seems most of the memory allocation occurs when updating VRPs. The bulk of this appears to be related to copying data, which I haven't found a clean workaround for, but it may be something that can be addressed in a future PR. A view of this in pprof shows much of it coming from VRP.Copy():

[image: pprof view showing allocations from VRP.Copy()]
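
A hedged illustration of why the copies get cheaper with this swap (this is not the actual VRP.Copy implementation): a deep copy of a net.IPNet has to duplicate two byte slices, two heap allocations per copy, whereas netip.Prefix is a plain value type and an ordinary assignment is already a deep copy.

package main

import (
	"fmt"
	"net"
	"net/netip"
)

// copyIPNet mimics what a deep copy of a net.IPNet-based record must do:
// each slice is duplicated, allocating on every copy.
func copyIPNet(n net.IPNet) net.IPNet {
	return net.IPNet{
		IP:   append(net.IP(nil), n.IP...),
		Mask: append(net.IPMask(nil), n.Mask...),
	}
}

// copyPrefix needs no allocation at all: netip.Prefix contains no slices,
// so a plain value copy is already a deep copy.
func copyPrefix(p netip.Prefix) netip.Prefix {
	return p
}

func main() {
	_, ipnet, _ := net.ParseCIDR("192.0.2.0/24")
	old := copyIPNet(*ipnet)                                  // two allocations
	cur := copyPrefix(netip.MustParsePrefix("192.0.2.0/24")) // zero allocations
	fmt.Println(old.String(), cur.String())
}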

Beyond ensuring all test cases pass, I have also run integration tests with BIRD 2.14 and confirmed that RTR functionality continues to work as before with these changes.
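
As a footnote on the ConvertSDListToMap() item above, the idea is roughly the following hedged sketch. The key construction and the helper name here are illustrative assumptions, not the exact stayrtr code; the point is only that the map values shrink from a whole struct to a single byte:

package main

import (
	"fmt"
	"net/netip"
)

// VRP mirrors the reordered struct from the earlier sketch.
type VRP struct {
	Prefix netip.Prefix
	ASN    uint32
	MaxLen uint8
	Flags  uint8
}

// convertSDListToFlagMap keeps only the previous Flags value per VRP key,
// since that is all the caller reads back, rather than storing whole structs.
func convertSDListToFlagMap(vrps []VRP) map[string]uint8 {
	m := make(map[string]uint8, len(vrps))
	for _, v := range vrps {
		// Hypothetical key: prefix, max-length and ASN packed into a string.
		key := fmt.Sprintf("%s-%d-%d", v.Prefix, v.MaxLen, v.ASN)
		m[key] = v.Flags
	}
	return m
}

func main() {
	vrps := []VRP{{Prefix: netip.MustParsePrefix("192.0.2.0/24"), ASN: 64511, MaxLen: 24, Flags: 1}}
	fmt.Println(convertSDListToFlagMap(vrps)) // map[192.0.2.0/24-24-64511:1]
}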

@job requested a review from ties on December 14, 2023 at 21:42
@benjojo (Collaborator) commented on Dec 15, 2023

On the surface this looks fine; I would however want to see two stayrtrs (old and this PR) running head to head for a few days, with rtrmon comparing the two, just to make sure :)

@ties (Collaborator) commented on Dec 17, 2023

Very nice analysis and a clear change!

> On the surface this looks fine; I would however want to see two stayrtrs (old and this PR) running head to head for a few days, with rtrmon comparing the two, just to make sure :)

That is probably the best way to be sure it is functionally identical. This way you can check that the VRPs converge after a set delay. The relevant metric is vrp_diff, with a visibility threshold high enough for both stayrtr-current and stayrtr-thispr to have updated.

@floatingstatic (Contributor, Author)

For clarification, is this an experiment you need me to run or are you doing this?

@ties (Collaborator) commented on Dec 18, 2023

> For clarification, is this an experiment you need me to run or are you doing this?

I would prefer it if you ran it (I likely won't have time until 2024). Do you have a working setup with Prometheus?

If we agree that convergence should be within 256s, I would check the history for violations of that property

max_over_time(vrp_diff{visibility_seconds="256"}[1h]) > 0

@floatingstatic (Contributor, Author)

OK, hopefully I did this right; I ran something like this on a Linux host with binaries built from my branch:

./stayrtr -bind 127.0.0.1:8282 -cache https://console.rpki-client.org/vrps.json -metrics.addr 127.0.0.1:9847
./rtrmon -addr 0.0.0.0:9866 -primary.host tcp://127.0.0.1:8282 -secondary.host https://console.rpki-client.org/vrps.json

Output looks like the below in Prometheus. I do not see anything with the label visibility_seconds="256"; not sure if that is unexpected:

[image: Prometheus query output for vrp_diff]

For what it's worth, running a brief test using binaries built from upstream looks the same. Anyway, I will leave this running for a couple of days and report back. Thanks!

@ties (Collaborator) commented on Dec 18, 2023

> OK, hopefully I did this right; I ran something like this on a Linux host with binaries built from my branch:
>
> ./stayrtr -bind 127.0.0.1:8282 -cache https://console.rpki-client.org/vrps.json -metrics.addr 127.0.0.1:9847
> ./rtrmon -addr 0.0.0.0:9866 -primary.host tcp://127.0.0.1:8282 -secondary.host https://console.rpki-client.org/vrps.json
>
> Output looks like the below in Prometheus. I do not see anything with the label visibility_seconds="256"; not sure if that is unexpected:

Those labels should be there eventually. Let's see!

@cjeker (Contributor) commented on Dec 20, 2023

This not only blocks #105 but also additional work to make the delta handling for ASPA correct.
It would be nice if this could get some priority to unblock this work.

@floatingstatic (Contributor, Author)

@ties I have had two instances of stayrtr running along with two instances of rtrmon in my local dev environment for a couple of days. There are two jobs, rtrmon and rtrmon-upstream, which cover binaries from my forked branch and current upstream respectively. They were not started at exactly the same time, so there are some minor deltas. The output looks as follows:

[image: vrp_diff output for the rtrmon and rtrmon-upstream jobs]

I'm not sure exactly what you are looking for here; happy to share any additional views you may want to see.

@ties (Collaborator) commented on Dec 20, 2023

> @ties I have had two instances of stayrtr running along with two instances of rtrmon in my local dev environment for a couple of days. There are two jobs, rtrmon and rtrmon-upstream, which cover binaries from my forked branch and current upstream respectively. They were not started at exactly the same time, so there are some minor deltas. The output looks as follows:

Thanks for doing this!

> I'm not sure exactly what you are looking for here; happy to share any additional views you may want to see.

I hoped to see a flat line, as in "the worst case divergence converges within 256s".
Can you try the following case?

  • max_over_time(vrp_diff{visibility_seconds="851"}[1h]) > 0 and check if that is flat at 0 (if not, 851, 1024, 1706, and 3411 are also possible).

Alternatively, rtrmon can compare two stayrtrs, but reading the same JSON should also work if the JSON is from one source of truth.

@ties (Collaborator) commented on Dec 20, 2023

I just realised the default refresh interval is 600s; in that case, 851 or 1024 are the first values that are likely to converge.

I am also collecting data myself now; I should be able to give an update tomorrow.

@floatingstatic (Contributor, Author)

I'm starting to wonder whether this is the best way to check this, or whether I am doing something wrong. Yes, I am using default timers for everything, but both upstream and my fork are showing a non-zero delta, so either the testing methodology is incorrect or I'm doing something wrong here:

[images: vrp_diff graphs for both instances]

For what it's worth, the upstream deltas look "worse" to my eye.

So it sounds like you suggest running two stayrtrs (fork and master branch) and configuring rtrmon to diff those two instead? I can give that a try and report back.

@job (Member) commented on Dec 21, 2023

> So it sounds like you suggest running two stayrtrs (fork and master branch) and configuring rtrmon to diff those two instead? I can give that a try and report back.

That seems helpful indeed - before merging.

@ties (Collaborator) commented on Dec 21, 2023

> I'm starting to wonder whether this is the best way to check this, or whether I am doing something wrong. Yes, I am using default timers for everything, but both upstream and my fork are showing a non-zero delta, so either the testing methodology is incorrect or I'm doing something wrong here:

Or the system does not converge as fast as I thought it would. I mostly run these tools in a monitoring role with (much) lower timers than you would want to run in real life.

> So it sounds like you suggest running two stayrtrs (fork and master branch) and configuring rtrmon to diff those two instead? I can give that a try and report back.

That would be the comparative test that I think will work (I'm writing this before looking at my data). For data collection I used this setup:

# build the main branch + this PR and tag them, i.e.
# docker build . -t stayrtr-structoptimize --target stayrtr
services:
  stayrtr-master:
    image: stayrtr-master
    command:
      - -refresh
      - "60"
      - -cache
      - https://rpki-validator.ripe.net/json
  stayrtr-structoptimize:
    image: stayrtr-structoptimize
    command:
      - -refresh
      - "60"
      - -cache
      - https://rpki-validator.ripe.net/json
      # - https://console.rpki-client.org/rpki.json
  rtrmon:
    image: rpki/rtrmon
    restart: unless-stopped
    ports:
      - 9866:9866
    command:
      - -primary.host
      - tcp://stayrtr-master:8282
      - -secondary.host
      - tcp://stayrtr-structoptimize:8282

My reason for picking rpki-validator.ripe.net/json is that it has a very frequent update interval.

Now, looking at the data, we see that the 1024s visibility window was converged between the two instances at all times:
[screenshot: vrp_diff at the 1024s visibility threshold]

If we look at all the points in time, there were tiny differences that recovered (more aggressive timers might help there).
[screenshot: vrp_diff at all points in time]

I forgot to capture the metrics directly from stayrtr so I do not know the comparative memory usage. But I trust your statistics on that.

It looks good to me. LGTM!

@job merged commit 3289b5d into bgp:master on Dec 21, 2023. 3 checks passed.