
Garbage collection? #32

Open
elidupree opened this issue Aug 29, 2017 · 12 comments

Comments

@elidupree
Owner

TimeSteward is theoretically ideal for incremental garbage collection: it is already obligated to represent its data as small, separable chunks with links between them, and retain immutable copies of old state.

(Here follows some not-perfectly-explained thoughts; maybe I will rewrite them when I'm in a clearer mental state.)

The basic idea is to record when each object is created, and incrementally trace through all objects via their links, starting at the TimeSteward globals. When a trace finishes, the collector iterates through all objects that have been allocated, and drops the ones that were created before the trace started but not reached by it.
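
As a toy illustration of that idea (all names here are mine, not TimeSteward's API), a trace-then-drop pass over timestamped objects might look like:

```rust
use std::collections::{HashMap, HashSet};

// Illustrative stand-in for a TimeSteward object: it records when it
// was created and which other objects it links to.
struct Object {
    created_at: u64,
    links: Vec<u64>,
}

struct Heap {
    objects: HashMap<u64, Object>,
    roots: Vec<u64>, // the TimeSteward globals
}

impl Heap {
    /// Trace from the roots, then drop every object that was created
    /// before `trace_start` but was never reached by the trace.
    /// Objects created during the trace are kept unconditionally.
    fn collect(&mut self, trace_start: u64) {
        let mut reached = HashSet::new();
        let mut stack = self.roots.clone();
        while let Some(id) = stack.pop() {
            if reached.insert(id) {
                if let Some(object) = self.objects.get(&id) {
                    stack.extend(object.links.iter().copied());
                }
            }
        }
        self.objects
            .retain(|id, object| reached.contains(id) || object.created_at >= trace_start);
    }
}
```

An incremental version would process the stack in bounded steps rather than all at once, which is what makes the scheme attractive for real-time use.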

In practice, implementing this will be very complicated. Some things to consider:

  • In what ways does the garbage collection need to support concurrency?
  • Garbage collection depends on user code to implement related traits (such as tracing). But it should remain memory-safe even if the user implements them wrong.
  • A neat way to do it would be to trace the data at a particular ExtendedTime – specifically, a time when forget_before() has been called, meaning that the data can no longer change before that. As the forgotten time moves forward, eventually everything that's inaccessible will be dropped. However, what about the case where lots of computation is done WITHOUT the forgotten time moving forward? If you ran out of memory while doing that, and the garbage collector wasn't able to free it, that would be bad. If we required client code to be able to retain snapshots of the full history instead of just a single moment, this could be handled well, and that would be a nice feature in general… But we might need fancier data structures for that, since it would be too inefficient to simply clone whole DataTimelines.
  • If we're doing this much memory management, maybe it would be possible to avoid using malloc() as well? There are some possible ideas about using memory pools and moving old objects around, leaving behind pointers to where they moved to. Of course, there's a trade-off between saving the work of malloc() and doing the extra work of moving objects and of checking, each time you follow a link, whether the target has moved.
@elidupree
Owner Author

elidupree commented Aug 31, 2017

Client code is already obligated to call destroy_prediction() when a prediction ceases to be accessible.

What if it had to call a function like that for DataHandle as well? It could be like crossbeam::epoch::Guard::unlinked(). The problem being, of course, that that's an unsafe function – so either I would make it safe by using reference counting (at a runtime cost), or make wrappers that do handle management, so that client code doesn't have to use unsafe functions itself (kind of like the way SimpleTimeline currently partially manages predictions).

So instead of making the user implement unsafe trait Trace {…} for all their types, I could have something like…

trait EventAccessor {
  ...
  // with a special behavior for the handle type that means a prediction
  unsafe fn unlinked(&self, handle: &Handle);
}
trait FutureCleanupAccessor {
  ...
  unsafe fn change_unlink_time(&self, handle: &Handle);
}

// Implement a custom derive for this? Or maybe piggyback on Serialize?
unsafe trait WhenUnlinked {
  // For anything that must additionally be unlinked when one thing is unlinked;
  // by default, call this recursively on all fields.
  fn unlinked(&self, accessor: &EventAccessor);
}

pub struct UniqueHandle<T>(Handle<T>);
// Also call the unlinked function of the thing behind the handle
impl<T> WhenUnlinked for UniqueHandle<T> { ... }

pub struct SharedHandle<T>(Handle<(reference_count_by_time: BTreeMap??, T)>);
// Adjust the reference count, and call the unlinked function of the thing
// if the references have run out
impl<T> WhenUnlinked for SharedHandle<T> { ... }

...
...in SimpleTimeline:
????

But my thoughts are not quite organized correctly, as I found when I tried and failed to make SimpleTimeline do all the prediction management. I'll have to think about this more.

@elidupree
Owner Author

Current situation for SimpleTimeline:

What we would WANT to do is, whenever a new value is inserted, we call unlinked() (currently destroy_prediction()) on the previous value. (And we call change_unlink_time when the changes are reverted.) The problem is that this currently wouldn't support UniqueHandle (because the data have to be Clone and you have to clone all of them to update any of them) OR SharedHandle (because it doesn't exist).

Also, it seems weird/asymmetric to have SimpleTimeline automatically call unlinked(), but not also be responsible for establishing links? A practical issue is that if you call create_prediction() and then don't store the prediction anywhere (admittedly, that's currently forbidden, but anyway), then it automatically gets leaked. It seems more like the act of storing a handle should be the thing that gives it its existence, rather than the act of allocating the handle. However, we can't safely drop a handle that was allocated but not stored, because the user might have stored it somewhere else. And we don't want to make every create_prediction() an unsafe call. So we're stuck with this unless we can somehow bundle creation and storing together into a safe call.

Another issue: what if you move a prediction? Like remove it from one SimpleTimeline and put it into another one in the same event. We'd want to unlink it in the first operation and then re-link in the second. But change_unlink_time() (currently change_prediction_destroyer()) is currently only valid in future-cleanup, because it may refer to specific future times. Well… in the canonical simulation, you'd never have a reason to pass anything but None, so we can always have a "relink" function for regular events that's just equivalent to change_unlink_time (_, None). But SimpleTimeline wouldn't know whether you had just created it (so already linked) or whether you were moving it (unlinked).

We could use Drop impls to catch the user leaving any unlinked handles around in memory, but that would have a performance cost, and these audits are too important to memory safety to be disabled in release builds.

@elidupree
Owner Author

Wait a minute:

Even with just Rc/Arc instead of garbage collection, the only source of memory leaks is when we fail to call forget_before() on DataTimelineCells. And the main way we fail to call forget_before() is when the user discards a handle to one of them.

So the API we would need is:
– A trait required by DataHandle, which allows you to iterate the shallowly contained DataTimelineCells (i.e. the ones not behind another DataHandle)
– Some way the TimeSteward would be informed of the existent DataHandles (such as making them only constructible through an accessor), so that TimeSteward can call forget_before() on all of them
– Some way for the TimeSteward to know when it no longer needs to hold onto one of them (maybe forget_before() could return a value indicating that the timeline has now been ENTIRELY forgotten)

In this model, SimpleTimeline would be the structure responsible for remembering when it was destroyed (so that it could tell the TimeSteward it was done forgetting).
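
A minimal sketch of that contract, with entirely made-up names (ToyTimeline stands in for a SimpleTimeline that remembers its own destruction time):

```rust
use std::collections::BTreeMap;

// Hypothetical names, not the real API: forget_before() reports when a
// timeline has been ENTIRELY forgotten, so the TimeSteward can stop
// tracking it.
trait ForgetBefore {
    fn forget_before(&mut self, time: u64) -> bool;
}

// Stand-in for SimpleTimeline: a history of (time, value) changes plus
// the time it was destroyed, after which it may no longer be modified.
struct ToyTimeline {
    changes: BTreeMap<u64, u32>,
    destroyed_at: Option<u64>,
}

impl ForgetBefore for ToyTimeline {
    fn forget_before(&mut self, time: u64) -> bool {
        if matches!(self.destroyed_at, Some(d) if d <= time) {
            // Destruction is now in the forgotten past: drop everything
            // and promise never to hold handles again.
            self.changes.clear();
            return true;
        }
        // Otherwise keep the last change at or before `time` (it is
        // still the current value) and forget everything older.
        if let Some(t) = self.changes.range(..=time).next_back().map(|(t, _)| *t) {
            let newer = self.changes.split_off(&t);
            self.changes = newer;
        }
        false
    }
}
```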

@elidupree
Owner Author

elidupree commented Sep 6, 2017

Notes about "piggyback off of Serialize":

At first glance, this seems like a silly/improper thing to do, but it might actually be the best thing to do? The accessible handles are required to be the same set as the serializable handles. So this would protect against slight differences between my own custom derive and Serialize's.

(Edit: there are some obscure circumstances where "iterate contained DataHandles/DataTimelineCells" could be implemented more efficiently than the Serialize hack, such as a Vec<Option<DataHandle<_>>> where the user knows which indices are Some for some reason. However, allowing the user to implement a more efficient approach can be added on later without breaking the Serialize default.)

As for how it would be done in practice, we can use specialization, something like this:

struct IterateDataTimelineCells {…}

impl Serializer for IterateDataTimelineCells {
  // trivial implementations of every method
}

impl Serialize for DataTimelineCell {
  fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
    if let Some(serializer) = static_downcast::<S, IterateDataTimelineCells>(serializer) {
      // do the specific custom operation
    } else {
      // do real serialization
    }
  }
}

trait StaticDowncast<T> {
  fn static_downcast(self) -> Option<T>;
}
impl<T> StaticDowncast<T> for T {
  fn static_downcast(self) -> Option<T> { Some(self) }
}
impl<T, U> StaticDowncast<T> for U {
  default fn static_downcast(self) -> Option<T> { None }
}
fn static_downcast<T, U>(input: T) -> Option<U> {
  StaticDowncast::<U>::static_downcast(input)
}

@elidupree
Owner Author

Notes about memory safety during deserialization:

The worst-case scenario would be if we have to make there be a thread-local "deserialization in progress" flag, have DataHandles panic if you try to dereference them when the flag is active, and then disable the flag after they're all initialized. This has runtime overhead because you always have to do the checks, and dereferencing these handles happens a lot.

Here's a different approach: note that it's forbidden for DataHandles alone to form cyclic data structures – cyclic data structures can only happen with interior mutability, and interior mutability is only permitted in DataTimelineCells. And DataTimelineCells are opaque without an accessor. So if we serialize in the correct order, we can leave only the DataTimelines inside the DataTimelineCells uninitialized, then go back and initialize them.

First pass: deserialize all handles to their final positions in memory, but DataTimelines are left uninitialized.
Second pass: deserialize all DataTimelines, which may include handles (which are now legitimate!), into a HashMap<id, timeline>.
Third pass: do a pass over all handles, putting the initialized DataTimelines in their places.

However, this still has a problem if the incoming data is invalid – how do you get rid of the uninitialized DataTimelines? You can never drop them and you can't replace them with anything, so you have to mem::forget() them, but it still leaks memory if you don't have something legitimate to replace them with before dropping their containing objects. So it seems like we may need to require trait DataTimeline to provide a legitimate null state that contains no handles.

But if we require that, it gets much easier! We don't have to make them uninitialized at all, just make them null. Then mem::replace() them with the proper values. (We're even allowed to do that within Serialize, because of the interior mutability!) So… We might be able to do the entire deserialization without unsafe code!
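
A sketch of that null-then-replace scheme, with stand-in types (ToyTimeline/ToyCell are illustrative; Default plays the role of the required null state):

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::mem;

// Stand-in for a DataTimeline; Default is the null state with no handles.
#[derive(Default, Debug, PartialEq)]
struct ToyTimeline(Vec<u32>);

// Stand-in for a DataTimelineCell: the interior mutability that lets us
// install real contents after all handles exist.
struct ToyCell(RefCell<ToyTimeline>);

fn install_contents(cells: &HashMap<u64, ToyCell>, mut decoded: HashMap<u64, ToyTimeline>) {
    for (id, cell) in cells {
        if let Some(timeline) = decoded.remove(id) {
            // Second pass: swap the real timeline in over the null one.
            let _null = mem::replace(&mut *cell.0.borrow_mut(), timeline);
        }
        // If the input was invalid and an id is missing, the cell just
        // stays in its valid null state: no leak, no unsafe code.
    }
}
```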

@elidupree
Owner Author

Notes about "serializing in the correct order":

This is trickier than it might seem. Imagine that you have a simple chain, peripheral object->middle object->globals, but globals also contains a timeline pointing at peripheral object. So the serialization order has to be globals, then middle, then peripheral, but the order you find the objects is globals, then peripheral, then middle.

I suppose serialization could explicitly analyze them as a DAG and then serialize them in order by level (first the nodes that reference nothing, then the ones that reference only the first collection, etc.).

There might be a clever way to serialize them in a working order using the stack, but I probably need to worry about stack overflow in the worst-case of serializing a linked list of a zillion objects.
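
A sketch of the level-by-level idea (ids and link lists are stand-ins for objects and their DataHandle references); done iteratively, it also sidesteps the stack-overflow worry:

```rust
use std::collections::BTreeMap;

// Order objects so that every object is serialized after everything it
// references. `links` maps each object id to the ids it references;
// DataHandle graphs are acyclic, so a topological order always exists.
fn serialization_order(links: &BTreeMap<u64, Vec<u64>>) -> Vec<u64> {
    // How many not-yet-serialized objects each object still references.
    let mut pending: BTreeMap<u64, usize> =
        links.iter().map(|(id, out)| (*id, out.len())).collect();
    // Reverse edges: referent -> the objects referencing it.
    let mut referencers: BTreeMap<u64, Vec<u64>> = BTreeMap::new();
    for (id, out) in links {
        for target in out {
            referencers.entry(*target).or_default().push(*id);
        }
    }
    // Start with the objects that reference nothing.
    let mut ready: Vec<u64> = pending
        .iter()
        .filter(|&(_, n)| *n == 0)
        .map(|(id, _)| *id)
        .collect();
    let mut order = Vec::new();
    while let Some(id) = ready.pop() {
        order.push(id);
        for referencer in referencers.get(&id).cloned().unwrap_or_default() {
            let n = pending.get_mut(&referencer).unwrap();
            *n -= 1;
            if *n == 0 {
                ready.push(referencer);
            }
        }
    }
    order
}
```

In the globals/middle/peripheral example, globals references nothing through plain DataHandles (its link to the peripheral object goes through a DataTimelineCell), so it comes out first.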

Anyway, this problem is clearly solvable within the "serialization algorithm" code, and won't require API support.

@elidupree
Owner Author

As for predictions, I've realized that we don't need the full API including change_prediction_destroyer() at all. All we need is a "reference count", but specifically the number of references that are accessible at the time of the predicted event. So, the practical difference from the current system is that instead of create_prediction(), destroy_prediction(), and change_prediction_destroyer(), we'd have create_prediction(), link_prediction(), and unlink_prediction(), all in EventAccessor.

SimpleTimeline could finally take care of all the linking and unlinking.

Implementation-wise, in simple_full, instead of remembering prediction_destroyer, we would remember number_of_links. This could also replace should_be_executed (fiat events would just have the number of links always be 1, or 0 after they are removed). simple_flat would also have to do this additional reference counting, but that seems fine.
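
A sketch of what that record could look like (illustrative names, not the real simple_full internals):

```rust
// A prediction's bookkeeping reduced to a link count: the number of
// references accessible at the time of the predicted event.
struct PredictionRecord {
    number_of_links: usize,
}

impl PredictionRecord {
    fn new() -> Self {
        // create_prediction() starts unlinked; a fiat event would start at 1.
        PredictionRecord { number_of_links: 0 }
    }
    fn link(&mut self) {
        self.number_of_links += 1;
    }
    fn unlink(&mut self) {
        assert!(self.number_of_links > 0, "unlink_prediction without a matching link");
        self.number_of_links -= 1;
    }
    // Replaces the old should_be_executed flag.
    fn should_be_executed(&self) -> bool {
        self.number_of_links > 0
    }
}
```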

Simulations that wanted to be more optimized than this automated reference counting could avoid using SimpleTimeline, and just link and unlink manually. There would only be a tiny performance disadvantage in keeping track of a count that happens to never go above 1, compared with keeping track of a bool, as we do currently. That kind of hyper-fine optimization might be something that we update the API for in the distant future, but that won't be for a long time, and would have to be based on actual experience.

@elidupree
Owner Author

elidupree commented Sep 6, 2017

So, summary of API changes:

  • Replace create_prediction(), destroy_prediction(), and change_prediction_destroyer() with create_prediction(), link_prediction(), and unlink_prediction(). SimpleTimeline handles the linking. Remove IterateUniquelyOwnedPredictions, because Serialize can do the new job.
  • Create DataTimeline::null() (new()? Default?).
  • DataTimeline::forget_before() returns a value. Returning true promises that it contains no handles, and will never contain handles in the future. SimpleTimeline must have a "destroy" function, and be forbidden from being modified after being destroyed, so that its forget_before() can eventually forget all handles in it.
  • Replace DataHandle::new() with EventAccessor::new_handle().

@elidupree
Owner Author

Wait a minute. Under this model, if you created a handle that didn't have any DataTimelineCells in it, then it would be pointless for the TimeSteward to know about it, because it would immediately observe that all DataTimelineCells in it had been completely forgotten, and therefore be permitted to forget about it.

But maybe it's not pointless? We'd like to make this system more real-time by preventing arbitrary-depth Rc dropping. So, imagine if we had a queue of all handles that hadn't been dropped yet, and frequently did incremental passes over the queue. When the pass reaches a handle, we drop it if it has only one reference left (the one in the queue). If not, we call forget_before() on it. (Obviously, if it's dropped, it forgets everything anyway.)

So DataHandle will still be constructed with an accessor, but forget_before() doesn't need a return value.
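
A sketch of such a queue using plain Rc (illustrative, not real TimeSteward code; a real version would also call forget_before() on the handles it keeps):

```rust
use std::collections::VecDeque;
use std::rc::Rc;

// The queue holds one clone of every live handle. Each bounded sweep
// drops handles whose only remaining reference is the queue's own,
// freeing memory a little at a time instead of in deep cascades.
fn sweep<T>(queue: &mut VecDeque<Rc<T>>, budget: usize) {
    for _ in 0..budget.min(queue.len()) {
        if let Some(handle) = queue.pop_front() {
            if Rc::strong_count(&handle) > 1 {
                // Still referenced elsewhere: keep it for a later pass
                // (a real implementation would call forget_before() here).
                queue.push_back(handle);
            }
            // else: `handle` goes out of scope here and the object is freed.
        }
    }
}
```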

@elidupree
Owner Author

elidupree commented Dec 25, 2017

Problem: even with no DataTimelineCells, repeatedly iterating through a list of DataHandle would take O(n^2) operations to drop a linked list that was in the wrong order.

Solution: iterate in reverse insertion order. That way, the references are always dropped before the things they refer to.

The existence of this ordering might also help with the serialization algorithm.
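
A toy demonstration of the reverse-insertion-order claim (Node and the registry are stand-ins): handles can only reference earlier-created objects, so by the time a reverse sweep reaches a handle, it is the last reference, and each drop is shallow instead of a recursive cascade.

```rust
use std::rc::Rc;

// A handle that may reference an earlier-created handle, as in a
// linked list built oldest-node-first.
struct Node {
    next: Option<Rc<Node>>,
}

// Drop the registry's handles in reverse insertion order, returning
// each handle's strong count just before it is dropped. A count of 1
// means the drop is shallow: nothing else still references the node.
fn drop_in_reverse(registry: Vec<Rc<Node>>) -> Vec<usize> {
    registry
        .into_iter()
        .rev()
        .map(|handle| Rc::strong_count(&handle)) // handle is dropped here
        .collect()
}
```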

@elidupree
Owner Author

elidupree commented Dec 29, 2017

Note: keeping an explicit list of DataHandles implies type erasure; we might want to avoid the expense of doing a dynamic dispatch for every DataHandle.

Ideally, we'd want to use the existing structure to iterate the DataHandles. This is already possible using my piggyback-on-Serialize system. We can do it in one pass (which isn't real-time unless done in a different thread), or try to split the pass up by occasionally storing a Fn object (dynamic dispatch, but perhaps far fewer than one per DataHandle). If we split the pass up or run it in a different thread, it may run into inconsistencies if we examine any data that hasn't been finalized yet. It's probably okay to retain all non-finalized data, so we can just "run forget_before on every DataTimeline that's accessible from finalized data" rather than "run forget_before on every DataTimeline". If forget_before never discards non-finalized data, all data will EVENTUALLY be visited (after it gets finalized), and we don't have to be careful of inconsistencies (because accidentally visiting data that's only accessible from non-finalized data doesn't actually do anything). We just need to make sure the algorithm follows all of the links in the forgotten data before forgetting it.

A separate garbage collection thread might be the best idea, so we don't have to split up the pass or avoid using non-real-time structures like HashMaps of handles. And it would also mean that we can just allow a whole bunch of Arcs to get dropped at once, the way they might ordinarily.

A more memory-unsafe idea goes like this: Any data that becomes inaccessible at some simulation-time cannot become accessible again after that time. So, at least theoretically, we are free to discard data that becomes inaccessible during the finalized time period, without checking to make sure that no non-finalized data still references it. This is probably only worth considering if we observe that reference counting costs are actually significant.

@elidupree
Owner Author

(Note: My planned design has changed a bunch since the older comments.)

Broadly speaking, the options for entity deallocation are:

  1. Handles are IDs into a global data structure; data can simply be removed from the data structure. (Cons: Lookup cost every time a handle is dereferenced.)
  2. Handles are reference counted pointers. (Cons: Reference counting cost every time a handle is created or destroyed.)
  3. A tracing garbage collection system. (Cons: Client code is more complex and has a safety obligation.)

With 1 and 2, to deal with cycles in entities' mutable data, you still have to either perform tracing garbage collection or require clients to delete entities explicitly. 2 and 3 also still need a global list so you can find the inaccessible pointers that need to be deallocated (if deleting entities explicitly is required, it might only need to contain the ones that are currently deleted, but I expect that we'll want a global collection of all entities anyway).

The API-affecting nature of 3 is probably prohibitive, but it's worth exploring how it could work. I already expect that handles can only be dereferenced through an Accessor (even if you're only looking at the immutable part); for 3, the Accessor would require not just a &'a EntityHandle<I,M>, but an AccessibleDataGuard<'a, EntityHandle<I,M>>, which is a guard guaranteeing that the returned EntityHandle is accessible within the current accessor; an accessor for an Event would provide an AccessibleDataGuard for that event, and all Accessors would provide an AccessibleDataGuard for the globals, but the only way to prove anything else accessible is to access it through one of the existing AccessibleDataGuards. A custom derive would provide accessor methods for fields; for example, if you had the following entity type:

#[derive(EntityData)]
struct Turret<EntityHandle: EntityHandleTrait> {
  location: Vector,
  target: EntityHandle::Type<(), Monster>,
} 

then the derive would produce a method

fn target<'a>(self: AccessibleDataGuard<'a, Turret<EntityHandle>>) -> AccessibleDataGuard<'a, EntityHandle::Type<(), Monster>> {
   unsafe {AccessibleDataGuard::new_unchecked(&self.target)}
}

So any time you wanted an AccessibleDataGuard to an inner object, you'd have to use a method instead of just a field access. And for things like AccessibleDataGuard<'a, Vec<T>>, I'd implement wrappers around the lookup and iterator methods, so that, for example, what you get from vec.first().unwrap() is an AccessibleDataGuard<'a, T>. (I don't think I can implement Index, though, because it's defined to return an &.)

So far, this seems to add a bit of complexity for client code, but maybe not be prohibitive. However, if you were using a custom container type that TimeSteward hadn't already provided wrapper methods for, you'd have to understand all that stuff in order to make it usable as entity data, and that seems pretty onerous. If the only benefit is to save the runtime cost of reference counting, which is pretty small, then for now, I think I can conclude that this system isn't worth building.

There is technically one other benefit to this system, which is that it provides type-level protections against giving an Accessor an improper EntityHandle (e.g. one that was stored in a thread_local or using forbidden interior mutability, especially one that was constructed for a different TimeSteward object (see #39)). However, I think it's difficult to do any of those things by accident - the above is primarily motivated by formal memory safety concerns rather than "I think a user would be likely to make this mistake in practice" concerns.
