Garbage Collection #88
Comments
Garbage Collector + Memory manager what is stable for use (in COSMOS OS): |
@ArsenShnurkov that code can't be used for a few reasons: Also I severely dislike their code management; it looks very fragmented and difficult to follow. With MOSA we should be able to create a simple garbage collector thanks to the compiler and its IR stages, which allow us to quickly insert code for all platforms and make calls into the platform-specific runtimes. |
I would avoid what I consider to be the rabbit hole of trying to implement a working reference counting garbage collector. What makes MOSA unique is that all the compilation stages track whether a register or stack location holds an object type or something else. This is by design, so that a precise garbage collector can be implemented. I'd rather spend time in that pursuit. |
The problem I see with the way MOSA is at the moment is that while we know this information at compile time, we are missing a lot of it at runtime. With a mark and sweep system, missing even one root would cause a fatal crash of the operating system. As of yet I haven't figured out an efficient implementation of mark and sweep. |
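To make the missed-root concern concrete, here is a minimal mark and sweep sketch over a toy heap. It is illustrative only: the heap model (a dict from object id to the ids it references) and the names `mark`, `sweep`, `collect` and `make_heap` are assumptions for this example, not MOSA code.

```python
def mark(roots, heap):
    """Return the set of object ids reachable from the given roots."""
    marked = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue
        marked.add(obj)
        stack.extend(heap[obj])  # follow outgoing references
    return marked


def sweep(heap, marked):
    """Delete every unmarked object from the heap and return the freed ids."""
    freed = sorted(obj for obj in heap if obj not in marked)
    for obj in freed:
        del heap[obj]
    return freed


def collect(roots, heap):
    return sweep(heap, mark(roots, heap))


def make_heap():
    # Hypothetical toy heap: object id -> list of referenced object ids.
    return {"a": ["b"], "b": [], "c": ["d"], "d": [], "e": []}


# All roots reported: only the unreachable object "e" is reclaimed.
heap_ok = make_heap()
freed_ok = collect(["a", "c"], heap_ok)

# Root "c" is missed: "c" and "d" are freed although the program still uses them.
heap_bad = make_heap()
freed_bad = collect(["a"], heap_bad)
```

In the second run the collector frees objects that are still in use, so any later access through the missed root would touch reclaimed memory. In a kernel that is the fatal crash described above.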
Staccato: A Parallel and Concurrent Real-time Compacting Garbage Collector for Multiprocessors the Garbage Collection Bibliography (up to 2009-2010) Just google query (it gives citation counts): |
My current thoughts and plans for the first basic GC implementation are:
This is a simple design with some advantages, notably an easy/straightforward implementation, a fast memory allocator, and reclamation of unused memory (without compacting), and some disadvantages: internal fragmentation (from objects smaller than 4k) and external fragmentation (less contiguous memory). Again, these are only my initial thoughts; this may change as we get closer to implementation. |
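A rough illustration of the internal fragmentation cost, assuming every allocation is rounded up to whole 4 KiB pages. That rounding rule is only a reading of "objects less than 4k" and is not spelled out in the thread; `PAGE_SIZE`, `pages_needed` and `internal_waste` are hypothetical names.

```python
PAGE_SIZE = 4096  # assumed allocation granularity in bytes


def pages_needed(size):
    """Number of whole pages required to hold `size` bytes."""
    return (size + PAGE_SIZE - 1) // PAGE_SIZE


def internal_waste(size):
    """Bytes allocated but unused when `size` is rounded up to whole pages."""
    return pages_needed(size) * PAGE_SIZE - size


sizes = [24, 100, 4096, 5000]
wasted = [internal_waste(s) for s in sizes]
```

Under this assumption a 24-byte object wastes 4072 bytes of its page, while an object of exactly 4096 bytes wastes nothing. This is why small objects dominate the internal fragmentation.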
GC of .Net itself: https://github.com/dotnet/coreclr/tree/master/src/gc |
https://www.azul.com/resources/azul-technology/azul-c4-garbage-collector/ |
@ArsenShnurkov I'm working on 1) Precise Object Tracking now. It requires structure changes to the compiler to track object references and object offsets. In addition, x86 specific operand types, which are incompatible with other platforms, are being removed. |
@tgiphil, may I ask you to write a wiki page about the current state of GC in MOSA? In particular, how the existing code is structured and how to start reading and understanding it. |
Sure; however, there is no GC code yet. The compiler was recently restructured to support Precise Object Tracking by keeping track of object types and areas where object pointers may be obtrusive (i.e., during a bitwise memory-to-memory copy). Interesting note: With the new GDB Connector (used with the new debugger), it will be possible to write a prototype GC outside of the kernel and rigorously debug and test it. |
2007, MS, STOPLESS: A Real-Time Garbage Collector for Multiprocessors 2001, IBM, A Pure Reference Counting Garbage Collector |
@tgiphil, please read this paper; it should give you some ideas on tracking (compression, GC-unsafe instructions, use of decompiler-like code to calculate instruction length): 1999, Intel, Support for Garbage Collection at Every Instruction in Compiler. What is different or similar in your Precise Object Tracking implementation? |
@ArsenShnurkov - Thanks for sharing the paper. I have not read this one before. Below are my notes for the future GC implementation, including a compact GC map:
It is important to remember that to take a pointer to an object, the object must be fixed first. Once fixed, an object cannot be moved by the GC, so there is no need to track this reference (or any offsets derived from it). |
Comparing the GC in the paper with our proposed MOSA implementation:
|
a paper about interior pointers: I understand why interior pointers are necessary (passing object fields and array elements into methods by reference, delegates for value types?), and I understand why pinning is necessary (to work with external devices like video cards), but I don't understand why unsafe code is necessary in MOSA |
Unsafe code is necessary mostly because it's unavoidable. All the open source versions of the .NET framework use unsafe code internally. Note: The MOSA kernel and device drivers will avoid using any unsafe code. |
Hi! I'd really like to get into this issue. However, I'm unsure where to start with it in MOSA. Could you point me to some locations? |
If you need some help with it, let me know. I have studied the subject
some and got both major books on the subject.
|
@L3tum Great! Let's jump right in: We plan to implement a precise garbage collector (rather than a conservative garbage collector). This means the compiler needs to emit metadata that precisely describes the lifetime of all object references at every point of execution and every memory location. There are four basic pieces of this metadata required to make this happen:
The first item is the most challenging and where we should start. Take a look at PreciseGCStage.cs. It re-uses some data flow analysis from the register allocation, because both attempt to determine the life spans of registers. The register allocator analyzes the life spans of virtual registers to determine which physical register to actually use, while this stage analyzes the life spans of physical registers that contain object references.
Note: For implementation simplicity, we try to keep only base pointers (pointers directly to the root of an object), rather than interior pointers, in all register and stack locations. Memory accesses are then indirect, with the base register pointing to the root of the object and the offset in another register or given as an immediate constant. This avoids having to account for pointers directly into an object, which may be constantly changing (such as within a loop).
At present, PreciseGCStage is vastly incomplete and what is there is untested. It would be helpful to complete and test the implementation that determines when registers contain object references, and then emit compact metadata that describes the live object references at each instruction point. See my July 17 notes, specifically 7, on how to emit compact GC metadata. This is just my initial idea, and there may be much better ways to do this. The key criterion for any representation is that it be compact and relatively fast to decode during a GC cycle. |
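As a rough illustration of what "determining when registers contain object references" means, here is a sketch for straight-line code only. It is not the PreciseGCStage implementation: the instruction format (`def`, `is_ref`, `uses`) and the function name `object_live_ranges` are assumptions for this example, and a real stage would need data flow analysis across basic blocks.

```python
def object_live_ranges(instructions):
    """Return {register: [(first, last), ...]} listing the instruction indices
    (inclusive) on entry to which the register holds a live object reference.

    Each instruction is a dict with:
      "def":    register written, or None
      "is_ref": True if the written value is an object reference
      "uses":   list of registers read
    """
    ranges = {}
    open_defs = {}  # register -> [first index after the def, last use or None]

    def close(reg):
        start, last_use = open_defs.pop(reg)
        if last_use is not None:  # a reference that is never read is dead
            ranges.setdefault(reg, []).append((start, last_use))

    for i, ins in enumerate(instructions):
        for reg in ins["uses"]:
            if reg in open_defs:
                open_defs[reg][1] = i  # extend the range to this use
        reg = ins["def"]
        if reg is not None:
            if reg in open_defs:
                close(reg)  # the old reference is overwritten here
            if ins["is_ref"]:
                open_defs[reg] = [i + 1, None]

    for reg in list(open_defs):
        close(reg)
    return ranges


# Hypothetical five-instruction method body.
program = [
    {"def": "eax", "is_ref": True, "uses": []},              # 0: eax = new object
    {"def": "ebx", "is_ref": True, "uses": ["eax"]},         # 1: ebx = [eax + 8]
    {"def": "ecx", "is_ref": False, "uses": []},             # 2: ecx = 5
    {"def": None, "is_ref": False, "uses": ["ebx", "ecx"]},  # 3: call f(ebx, ecx)
    {"def": "eax", "is_ref": False, "uses": []},             # 4: eax = 0
]
live = object_live_ranges(program)
```

Here `ecx` never appears in the result because it only ever holds an integer, which is the distinction a precise collector relies on.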
Thanks for the detailed answer! So far I think I've figured out where to begin in MOSA. |
So is the long term plan to have two execution / garbage collection models? |
@L3tum Understood. Let's break the live ranges analysis into three steps/parts: 1) the actual analysis, 2) compressing the map, and 3) emitting the metadata/map.
The analysis needs to generate lists of instruction ranges where an object reference is alive. Most of this is already written, since it is based on the same analysis as register allocation; the only difference is that this stage focuses on physical registers. I believe there are some TODO notes in the code. Once the code is drafted, let's add trace code to dump the live range information to Explorer's debug window. We can then review it manually to see if it is working and, if so, move on to the next step.
The next step is to compress the map. My proposal is similar to a traditional run-length compression algorithm, except that it's based not on repeating characters but on the change points of the live ranges (in other words, where a live range starts or ends) within the sets of registers and stack locations. So, gaps where there are no changes do not require any additional bits of encoding. This is important because on x86/x64 instructions are variable length. In addition, a status change can only affect one or two registers at a time. So, we capitalize on the rarity of changes to the live ranges to efficiently encode the map. I'll write up a more detailed specification. (Note: I'm open to other forms of encoding.) Ideally, the compression routine would stream out the metadata map sequentially from the start to the end of a method, so this step would be embedded in the part above.
The actual code that emits the metadata is very similar to writing to a file stream. See examples: ProtectedRegionLayoutStage.cs and MetadataStage.cs. Note: These examples rely heavily on the linker's post processing, which won't be necessary for this map, so ignore all the linker method codes and zero filling.
Don't worry about doing anything wrong; rather, think of this as a learning opportunity.
If you have any questions, you can post them here or on gitter (recommended). |
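The change-point idea can be sketched as follows. This is only one possible reading of the proposal, not the detailed specification mentioned in the thread: the tuple layout `(delta, slot, is_live)`, the slot names, and the functions `encode_change_points` and `live_slots_at` are all assumptions made for illustration. A real encoding would pack the events into bits rather than Python tuples.

```python
def encode_change_points(ranges):
    """Encode live ranges as a sequential list of change points.

    ranges: {slot: [(start, end), ...]}, each slot live on code offsets
    [start, end). Returns a list of (delta, slot, is_live), where delta is
    the distance in bytes from the previous change point.
    """
    events = []
    for slot, spans in ranges.items():
        for start, end in spans:
            events.append((start, slot, True))
            events.append((end, slot, False))
    # At equal offsets, emit range ends before range starts.
    events.sort(key=lambda e: (e[0], e[2], e[1]))

    encoded, previous = [], 0
    for offset, slot, is_live in events:
        encoded.append((offset - previous, slot, is_live))
        previous = offset
    return encoded


def live_slots_at(encoded, target):
    """Replay change points up to `target` and return the live slots there."""
    live, offset = set(), 0
    for delta, slot, is_live in encoded:
        offset += delta
        if offset > target:
            break
        if is_live:
            live.add(slot)
        else:
            live.discard(slot)
    return live


# Hypothetical method: two registers and one stack slot hold references.
ranges = {"eax": [(2, 9)], "ebx": [(5, 20)], "[ebp-8]": [(9, 14)]}
gc_map = encode_change_points(ranges)
at_offset_6 = live_slots_at(gc_map, 6)
at_offset_9 = live_slots_at(gc_map, 9)
```

Six events describe all 20 bytes of code in this example, and offsets where nothing changes cost nothing. Because decoding is a single forward pass, the encoder could also stream events out in method order, as suggested above.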
@evo01 - The focus is short-term -> a simplistic garbage collector for a single-process with multiple threads. |
@tgiphil - Thanks! -Adam |
@tgiphil Alright, thanks! I think I understand MOSA a bit better now so I'll get working. |
Currently there is no way to reclaim memory in MOSA.
This may not seem like a big deal now, but once we get ARM support up and running, MOSA may be run in memory-constrained environments, so it will need to reclaim unused memory or it will run out of memory quite quickly.
Garbage collection in the Runtime will solve this, but the question is how we implement it.
I propose that for starters we look at using a reference counter implementation as it will be the easiest to implement. I am aware that there are issues with it such as circular reference failure but those are minimal cases which are acceptable for MOSA 1.5.
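For reference, the circular reference problem can be shown with a small simulated reference counter. This is a toy model, not a proposed MOSA design; the class `RefCountedHeap` and its methods are hypothetical names.

```python
class RefCountedHeap:
    """Toy heap where each object carries an explicit reference count."""

    def __init__(self):
        self.counts = {}  # object id -> reference count
        self.fields = {}  # object id -> ids referenced by the object

    def alloc(self, obj):
        self.counts[obj] = 0
        self.fields[obj] = []

    def add_ref(self, obj):
        self.counts[obj] += 1

    def store(self, owner, target):
        """Make `owner` hold a reference to `target`."""
        self.fields[owner].append(target)
        self.add_ref(target)

    def release(self, obj):
        """Drop one reference; free the object when the count reaches zero."""
        self.counts[obj] -= 1
        if self.counts[obj] == 0:
            for child in self.fields.pop(obj):
                self.release(child)
            del self.counts[obj]


# Acyclic case: dropping the root reference frees both objects.
acyclic = RefCountedHeap()
acyclic.alloc("a")
acyclic.alloc("b")
acyclic.add_ref("a")      # reference held by a local variable
acyclic.store("a", "b")
acyclic.release("a")      # local variable goes out of scope

# Cyclic case: x and y reference each other, so neither count reaches zero.
cyclic = RefCountedHeap()
cyclic.alloc("x")
cyclic.alloc("y")
cyclic.add_ref("x")       # reference held by a local variable
cyclic.store("x", "y")
cyclic.store("y", "x")
cyclic.release("x")       # local variable goes out of scope
```

After the last line, `x` and `y` are unreachable from the program but still have a count of 1 each, so they are never reclaimed. How often such cycles occur in practice depends on the workload (doubly linked lists and parent/child links are common sources).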