Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eviction of entries from owned neighbor cache #83

Closed
whitequark opened this issue Nov 21, 2017 · 7 comments
Closed

Eviction of entries from owned neighbor cache #83

whitequark opened this issue Nov 21, 2017 · 7 comments

Comments

@whitequark
Copy link
Contributor

@whitequark whitequark commented Nov 21, 2017

Right now the neighbor cache will simply grow for as long as it can when unseen before IPs are added. This can potentially take a lot of memory if we're talking about a /8 network that someone runs an ARP scan against. We should probably do something about it.

@squidarth
Copy link
Contributor

@squidarth squidarth commented Jun 6, 2018

hi @whitequark! I'm new here and v. interested in contributing to this library! I want to confirm what needs to happen here before diving in:

It looks like in the case where the neighbor cache is not using heap-allocated storage that there is already logic in place to evict old entries:

     // src/iface/neighbor.rs:94

    Err((protocol_addr, neighbor)) => {
     // If we're going down this branch, it means that a fixed-size cache storage
     // is full, and we need to evict an entry.
     let old_protocol_addr = match self.storage {
         ManagedMap::Borrowed(ref mut pairs) => {
             pairs
                 .iter()
                 .min_by_key(|pair_opt| {
                     let (_protocol_addr, neighbor) = pair_opt.unwrap();
                     neighbor.expires_at
                 })
                 .expect("empty neighbor cache storage") // unwraps min_by_key
                 .unwrap() // unwraps pair
                 .0
         }
         // Owned maps can extend themselves.
         #[cfg(any(feature = "std", feature = "alloc"))]
         ManagedMap::Owned(_) => unreachable!()
     };

     let _old_neighbor =
         self.storage.remove(&old_protocol_addr).unwrap();
     match self.storage.insert(protocol_addr, neighbor) {
         Ok(None) => {
             net_trace!("filled {} => {} (evicted {} => {})",
                        protocol_addr, hardware_addr,
                        old_protocol_addr, _old_neighbor.hardware_addr);
         }
         // We've covered everything else above.
         _ => unreachable!()
     }
 }

is the task here to repeat this logic for the heap-allocated case?

  1. Do we want to make the the maximum size of the cache configurable, or do we want to be opinionated and set it to some reasonable number? Seems like the type of thing that ought to be configurable.

Thanks a lot for any guidance & for building this awesome library--I'm excited to dive in!

@whitequark
Copy link
Contributor Author

@whitequark whitequark commented Jun 6, 2018

Hi @squidarth! The challenge here, first and foremost, in figuring out what logic exactly should be followed here. I didn't really have much time to put into this issue myself. Do you think you could look at other TCP/IP stacks and see what sort of logic they use for the neighbor table?

@squidarth
Copy link
Contributor

@squidarth squidarth commented Jun 6, 2018

Yep--sounds good

@dlrobertson
Copy link
Collaborator

@dlrobertson dlrobertson commented Jun 6, 2018

@squidarth You might also check out the NDISC RFC. It has some good info about the Neighbor Cache. I can't remember if it says anything about evicting entries from the cache though.

@squidarth
Copy link
Contributor

@squidarth squidarth commented Jun 11, 2018

Alright, so today I spent some time digging into how Linux handles cache eviction:

My takeaway: I think we can simply add automatic garbage collection in the fill method that goes through the cache and removes any expired entries.

How the Linux Kernel handles cache eviction

First off, the kernel keeps a lot more state on entries in the ARP translation table does. It refers to items in the table as "neighbors", which go through a state machine (defined here), where they start out as NUD_REACHABLE, but might go into NUD_STALE if they have an old confirmation timestamp. These states are updated in the neigh_timer_handler that I linked to above.

There is another function neigh_periodic_work that actually goes through the items in the neighbor table and removes nodes that are NUD_FAILED.

Garbage collection is also triggered when allocating new neighbors if the number of entries in the table is above some constant value. On linux, these constants live at:

  • /proc/sys/net/ipv4/neigh/default/gc_thresh3
  • /proc/sys/net/ipv4/neigh/default/gc_thresh2

More thoughts

I don't think there's a huge need to redo how the ARP table is managed in smoltcp to adapt to having multiple states, and also think that an approach that uses another thread to garbage collect in the background would add a lot of complexity (we would have to move the NeighborCache into a shared data structure, etc.). An approach that synchronously removes the expired entries I feel like spiritually matches up with what linux does.

Re: the constants that linux provides, we probably want to allow users to supply these, but can use similar defaults.

Let me know what you think!

@whitequark
Copy link
Contributor Author

@whitequark whitequark commented Jun 11, 2018

also think that an approach that uses another thread to garbage collect in the background would add a lot of complexity

No threads--all functionality should be available either with core or with core+alloc.

Otherwise, sounds great.

@dlrobertson
Copy link
Collaborator

@dlrobertson dlrobertson commented Jun 12, 2018

I agree with @whitequark, this sounds great!

I don't think there's a huge need to redo how the ARP table is managed in smoltcp to adapt to having multiple states

The states mentioned above are the states of an entry in the Neighbor Cache required by NDISC that are required by Neighbor Unreachability Detection (NUD). So we will eventually need to add this state to the cache in order to be compliant with the RFC. I have a bullet in the issue #103 for this, but it should probably be its own issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants