Garbage Collection #88

charsleysa · 2014-09-03T09:30:22Z

Currently there is no way to reclaim memory in MOSA.
This issue may not seem much of a big deal now but when we get ARM support up and running MOSA may be run in memory constrained environments so it will need to reclaim unused memory or it will run out of memory quite quickly.

Garbage collection in the Runtime will solve this but the issue is how do we implement it?

I propose that for starters we look at using a reference counter implementation as it will be the easiest to implement. I am aware that there are issues with it such as circular reference failure but those are minimal cases which are acceptable for MOSA 1.5.

Want to back this issue? Place a bounty on it! We accept bounties via Bountysource.

ArsenShnurkov · 2014-09-05T14:03:21Z

Garbage Collector + Memory manager what is stable for use (in COSMOS OS):
http://cosmos.codeplex.com/discussions/462382

charsleysa · 2014-09-05T14:52:13Z

@ArsenShnurkov that code can't be used for a few reasons:
It's old and outdated
COSMOS don't even use it any more
MOSA and COSMOS have different structures which make code ports almost impossible.

Also I severely dislike their code management, it looks very fragmented and difficult to follow.

With MOSA we should be able to create a simple garbage collector thanks to the compiler and it's IR stages which allow us to quickly insert code to all platforms and allow us to make calls into the platform specific runtimes.

tgiphil · 2014-09-06T02:04:51Z

I would avoid what I consider to be the rabbit hole pursuit of trying to implement a working reference counting garbage collector. What makes MOSA unique is that all the compilation stages maintain if a register or stack location is an object type or something else. This is by design so precise garbage collector can be implemented. I'd rather spend time in that pursuit.

charsleysa · 2014-09-06T02:16:20Z

The problem I see with the way mosa is at the moment is that while we know this information at compile time we are missing a lot of this information at runtime.

With a mark and sweep system, if we miss even one root that would cause a fatal crash of the operating system.

As of yet I haven't figured out an efficient implementation of Mark and sweep.

ArsenShnurkov · 2014-09-06T08:22:37Z

Staccato: A Parallel and Concurrent Real-time Compacting Garbage Collector for Multiprocessors
http://researcher.watson.ibm.com/researcher/files/us-groved/rc24504.pdf
summer of 2006

the Garbage Collection Bibliography (up to 2009-2010)
http://www.cs.kent.ac.uk/people/staff/rej/gcbib/gcbib.pdf

Just google query (it gives citation counts):
https://www.google.com/search?q=%22real-time+garbage+collection%22+filetype%3Apdf

tgiphil · 2014-09-18T06:34:48Z

My current thoughts and plans for the first basic GC implementation are:

Precise object tracking --- the compiler will emit detailed stack and register usage.
Objects are allocated from memory blocks (maybe around 4k each)
Objects can span more than one block.
Memory block allocations are tracked and hold flags to indicate active or not per block
All objects contain a bit to indicate if the object was traversed already during a GC (the meaning of the bit flips each GC cycle)
On garbage collection -

All threads are suspended
All the memory block flags are initially set to not active
Object references on stack and static heap locations are collected ("root references")
All root references are traversed to find additional objects
Object that get traversed have their GC bit set (or flipped) - once flipped they are not traversed again
Memory blocks containing an object that got their GC bit set is set to active
Once all objects are traversed, all memory blocks which remain inactive are either re-used for future memory allocations or released back to the OS.
Finally, all threads are resumed.

This is a simple design with some advantages: notably easy/straightforward implementation, fast memory allocator, reclamation of unused memory (without compacting) and disadvantages: internal fragmentation (by objects less than 4k) and external fragmentation (less continuous memory).

Again, only my initial thoughts - this may change as we get closer to implementation.

GeroL · 2015-02-04T16:46:33Z

GC of .Net itself: https://github.com/dotnet/coreclr/tree/master/src/gc
C++ but still maybe useable somehow

ArsenShnurkov · 2016-10-18T19:57:43Z

https://www.azul.com/resources/azul-technology/azul-c4-garbage-collector/
Continuously Concurrent Compacting Collector
it never falls back to a stop-the-world compaction.
http://stackoverflow.com/questions/5841847/does-net-have-something-similar-to-what-azul-provides-for-java

tgiphil · 2016-10-19T04:44:20Z

@ArsenShnurkov I'm working on 1) Precise Object Tracking now. It requires structure changes to the compiler to track object references and object offsets. In addition, x86 specific operand types, which are incompatible with other platforms, are being removed.

ArsenShnurkov · 2017-07-06T08:05:31Z

@tgiphil, May I ask you write a wiki page about current state of GC in MOSA ? In particular - how the existing code structured and how to start read/understand it.

tgiphil · 2017-07-06T19:44:32Z

Sure; however, there is no GC code yet. The compiler was recently restructured to support Precise Object Tracking by keeping track of object types and areas where object pointers may be obtrusive (i.e.. during a bitwise memory-to-memory copy).

Interesting note: With the new GDB Connector (used with the new debugger), it will be possible to leverage it and write a prototype GC outside of the kernel and rigorously debug and test it.

ArsenShnurkov · 2017-07-10T18:05:35Z

2007, MS, STOPLESS: A Real-Time Garbage Collector for Multiprocessors
It was implemented on top of the Bartok compiler and runtime for C#
this is the first garbage collector that provides real-time support on stock multiprocessor hardware

2001, IBM, A Pure Reference Counting Garbage Collector
collects cyclic garbage
no counters for stack objects

ArsenShnurkov · 2017-07-14T22:19:58Z

@tgiphil, Please read this paper, it shoud give you some ideas on tracking (compressing, GC-unsafe instructions, usage of decompiler-like code to calculate instruction length):

1999, Intel, Support for Garbage Collection at Every Instruction in Compiler
compiler creates map for every instruction:
the set of registers and stack locations and "stack adjustment"

What is different or similar in your Precise Object Tracking implementation?

tgiphil · 2017-07-15T01:46:33Z

@ArsenShnurkov - Thanks for sharing the paper. I have not read this one before.

Below are my notes for the future GC implementation - it includes compact GC map:

1. Create internal GC class
2. Create memory pools 
3. Initialize gen 1 & 3 using a memory pool
4. Add GC.Fixed(object reference) to add the object to the fixed-do-not-move list
5. Create initial static objects in gen 3 since they generally are long lived
6. Create a table lists of pointers to static objects
	a. Loads & stores find & update static objects on this table
	b. This table also forms the one of the roots for GC
7. For each method stack frame, create map with the following:
	a. List of object parameters (offset from EBP)
	b. Compressed map showing for each PC location which registers have an object reference
	c. Compressed map showing for each stack space the PC ranges where the stack spaces contains a live object reference
	d. Idea for compressing both maps:
		i. Map is constructed by following this run length encoding scheme
		ii. The first two bits of every byte represent the instruction code
		iii. Instructions:
			1) 00: INCREMENT  {5 bits: increment}
				® Special: All zero means end of encoding
			2) 01: REGISTER {1 bit: alive or dead} {3 bits: register index}
			3) 10: STACK { 1bit: alive or dead} {5 bits: index} 
				® indexed from EBP / 4 (alignment)
			4) 11: EXTENDED {6 bits: index}
				® Add this index to the next stack instruction's index
	e. Iterative alive analysis finds where registers and stack slots contains object references at each PC position

Important to remember is to take a pointer to an object, it must be fixed first. And once fixed, an object can not be moved by the GC. So there is no need to try to track this reference (and any offsets derived from it).

tgiphil · 2017-07-15T02:05:37Z

In comparison with the GC in the paper vs the our proposed MOSA implementation:

Our proposed GC map is already compact since it spans multiple instructions or live ranges. In contrast with the paper which builds a GC map per instruction and then compresses it using Huffman compression. Huffman compression, which is based on frequency, would not helpful on our compacted data since it already squeezed extraneous bits and there would be little redundancy or patterns. RLE-type compression variations would not help either.
Both GC approaches try to avoid interior pointers. (The last MOSA refactoring was done to specifically address this). Unfortunately, .NET framework implementation does a lot of unsafe things for performance (example, pointers into strings). This will be a challenge for MOSA since it'll have to assume some conservative approaches (assuming some pointers could be references and fixing them to memory) and/or detect those unsafe regions and wait for thread(s) to move outside of those regions before starting a GC.
Both approaches assume a one-to-one mapping of IR instructions. While technically MOSA expands IR instructions into native instructions, it still internal represents them as a single instruction from which a GC map can still be built.

ArsenShnurkov · 2017-07-15T12:56:00Z

a paper about interrior pointers:
2009, MS, Transacting Pointer-based Accesses in an Object-based Software Transactional Memory System
http://transact09.cs.washington.edu/17_paper.pdf
Section 6 describes an important optimization that eliminates runtime overhead in most cases.

I understand why interrior pointers are necessary (passing object fields and array elements into methods by reference, delegates for value types?), I understand why pinning is necessary (to work with external devices like videocards), but I don't understand why unsafe code is necessary in MOSA

tgiphil · 2017-07-19T05:31:51Z

Unsafe code is necessary mostly because it's unavoidable. All the open source versions .NET framework use unsafe code internally. Note: The MOSA kernel and device drivers will avoid using any unsafe code.

L3tum · 2018-07-09T17:54:22Z

Hi!

I'd really like to get into this issue. However, I'm unsure where to start with it in MOSA. Could you point me to some locations?

evo01 · 2018-07-09T22:49:05Z

If you need some help with it, let me know. I have studied the subject some and got both major books on the subject.

tgiphil · 2018-07-09T23:54:12Z

@L3tum Great! Let's jump right in:

We plan to implement a precise garbage collector (rather than a conservative garbage collector). This means the compiler needs to emit metadata that precisely describes the lifetime of all object references at every point of execution and every memory location.

There are four basic pieces of this metadata required to make this happen:

At every point within a method, which registers contain object references,
Stack locations contain object references --- such as with the passing and returning of object references during a method call,
Temporary stack locations contain object references -- usually spill/store locations or non-primitive structs, and
Static objects or structs.

The first item is the most challenging and where we should start. Take a look at PreciseGCStage.cs. It’s re-using some data flow analysis from the register allocation --- because they both attempt to determine the life spans of registers. In the case of the register allocator, it was analyzing the virtual registers life spans to determine what physical register to actually use. While in this stage, the life span analysis is for physical registers that contain object references.

Note: For implementation simplicity, we try to maintain interior pointers (pointers directly to the root of an object) in all register and stack locations. Memory accesses are then indirect with the base register pointing to the root of the objects and offset in other register, or as an immediate constant. This avoids having to account for pointers directly into an object – which may be constantly changing (such as within a loop).

At present, PreciseGCStage is vastly incomplete and what is there is untested. It would be helpful to complete and test the implementation that determines when registers contain object references. And then emit compact metadata that descriptions the live object references at each instruction point.

See my July 17 notes, specifically 7, on to emit a compact GC metadata. This is just my initial idea – and there may be much better ways to do this. The key criteria for any representation is that it be compact and relatively fast to decode during a GC cycle.

L3tum · 2018-07-10T20:13:29Z

Thanks for the detailed answer!

So far I think I've figured out where to begin in MOSA.
Just one question: I'm unsure where/how to emit the metadata/map or what exactly you mean with it. I understand the general concept and structure you're going with, I think, but I'm not really sure where to save the maps etc. to, so to speak, and how to do it in MOSA. I don't want to do anything wrong just because I misunderstood something so I figured I'd ask before I just start with something.

evo01 · 2018-07-10T21:13:55Z

So is the long term plan to have two execution / garbage collection models?
a) One that handles garbage collection for compiled operating system code that has predetermined GC points
b) Another for users programs?

tgiphil · 2018-07-11T01:00:49Z

@L3tum Understood. Let's break the live ranges analysis into three steps/parts: 1) the actual analysis, and 2) compressing the map, and 3) emitting metadata/map.

The analysis needs to generate lists of instruction ranges where an object reference is alive. Most of this is already written since it based on the same analysis for register allocation - the only difference is this stage focuses on physical registers. I believe there are some TODO notes in the code. Once the code is drafted, let's add trace code to dump the live ranges information to Explorer's debug window. We can then review it manually to see if it is working. And if so, move on to the next step.

The next step is to compresses the map. My proposal is similar to traditional a runtime-length compression algorithm, except it's not based on repeating characters, but rather the change points of the live ranges - in other words where a live range starts or ends - within the sets of registers and stack locations. So, gaps where there are no changes do not require any additional bits of encoding. This is important because on x86/x64 instructions are variable length. In addition, a status change can only affect one or two registers at a time. So, we capitalize on the rarity of changes to the live ranges to efficiency encode the map. I'll write up a more detailed specification. (note: I'm open to other forms of encoding).

Ideally, the compression routine would stream out the metadata map sequentially from the start to end of a method. So this step would be embedded in the part above. The actually code that emits of the metadata is very similar to writing to a file stream. See examples: ProtectedRegionLayoutStage.cs and MetadataStage.cs. Note: These examples rely heavily on the linker's post processing --- which won't be necessary for this map --- so ignore all the linker method codes and zero filling.

Don't worry about doing anything wrong - rather think of this as a learning opportunity. If you have any questions, you can post them here or on gitter (recommended).

tgiphil · 2018-07-11T01:04:54Z

@evo01 - The focus is short-term -> a simplistic garbage collector for a single-process with multiple threads.

evo01 · 2018-07-11T03:34:40Z

@tgiphil - Thanks!

-Adam

L3tum · 2018-07-11T10:44:30Z

@tgiphil Alright, thanks!

I think I understand MOSA a bit better now so I'll get working.

charsleysa added Feature labels Sep 3, 2014

charsleysa added this to the 1.5 - ARM Platform Support milestone Sep 3, 2014

tgiphil modified the milestones: Future, 1.5 - ARM Platform Support Dec 26, 2014

tgiphil self-assigned this Oct 18, 2016

tgiphil modified the milestones: Future, 1.9 Release Aug 19, 2017

tgiphil changed the title ~~Kernel Garbage Collection~~ Garbage Collection Aug 19, 2017

tgiphil modified the milestones: 1.9 Release, 2.0 Release Aug 5, 2018

tgiphil modified the milestones: 2.2 Release, 2.4 Release Jan 12, 2023

tgiphil added the High Priority label Jan 30, 2023

tgiphil modified the milestones: 2.4 Release, 2.3 Release Apr 8, 2023

tgiphil modified the milestones: 2.5 Release, 2.6 Release Jan 18, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Garbage Collection #88

Garbage Collection #88

charsleysa commented Sep 3, 2014

ArsenShnurkov commented Sep 5, 2014

charsleysa commented Sep 5, 2014

tgiphil commented Sep 6, 2014

charsleysa commented Sep 6, 2014

ArsenShnurkov commented Sep 6, 2014

tgiphil commented Sep 18, 2014 •

edited

Loading

GeroL commented Feb 4, 2015

ArsenShnurkov commented Oct 18, 2016

tgiphil commented Oct 19, 2016 •

edited

Loading

ArsenShnurkov commented Jul 6, 2017 •

edited

Loading

tgiphil commented Jul 6, 2017 •

edited

Loading

ArsenShnurkov commented Jul 10, 2017 •

edited

Loading

ArsenShnurkov commented Jul 14, 2017

tgiphil commented Jul 15, 2017 •

edited

Loading

tgiphil commented Jul 15, 2017 •

edited

Loading

ArsenShnurkov commented Jul 15, 2017 •

edited

Loading

tgiphil commented Jul 19, 2017 •

edited

Loading

L3tum commented Jul 9, 2018

evo01 commented Jul 9, 2018 via email •

edited

Loading

tgiphil commented Jul 9, 2018 •

edited

Loading

L3tum commented Jul 10, 2018

evo01 commented Jul 10, 2018

tgiphil commented Jul 11, 2018 •

edited

Loading

tgiphil commented Jul 11, 2018

evo01 commented Jul 11, 2018 •

edited

Loading

L3tum commented Jul 11, 2018

Garbage Collection #88

Garbage Collection #88

Comments

charsleysa commented Sep 3, 2014

ArsenShnurkov commented Sep 5, 2014

charsleysa commented Sep 5, 2014

tgiphil commented Sep 6, 2014

charsleysa commented Sep 6, 2014

ArsenShnurkov commented Sep 6, 2014

tgiphil commented Sep 18, 2014 • edited Loading

GeroL commented Feb 4, 2015

ArsenShnurkov commented Oct 18, 2016

tgiphil commented Oct 19, 2016 • edited Loading

ArsenShnurkov commented Jul 6, 2017 • edited Loading

tgiphil commented Jul 6, 2017 • edited Loading

ArsenShnurkov commented Jul 10, 2017 • edited Loading

ArsenShnurkov commented Jul 14, 2017

tgiphil commented Jul 15, 2017 • edited Loading

tgiphil commented Jul 15, 2017 • edited Loading

ArsenShnurkov commented Jul 15, 2017 • edited Loading

tgiphil commented Jul 19, 2017 • edited Loading

L3tum commented Jul 9, 2018

evo01 commented Jul 9, 2018 via email • edited Loading

tgiphil commented Jul 9, 2018 • edited Loading

L3tum commented Jul 10, 2018

evo01 commented Jul 10, 2018

tgiphil commented Jul 11, 2018 • edited Loading

tgiphil commented Jul 11, 2018

evo01 commented Jul 11, 2018 • edited Loading

L3tum commented Jul 11, 2018

tgiphil commented Sep 18, 2014 •

edited

Loading

tgiphil commented Oct 19, 2016 •

edited

Loading

ArsenShnurkov commented Jul 6, 2017 •

edited

Loading

tgiphil commented Jul 6, 2017 •

edited

Loading

ArsenShnurkov commented Jul 10, 2017 •

edited

Loading

tgiphil commented Jul 15, 2017 •

edited

Loading

tgiphil commented Jul 15, 2017 •

edited

Loading

ArsenShnurkov commented Jul 15, 2017 •

edited

Loading

tgiphil commented Jul 19, 2017 •

edited

Loading

evo01 commented Jul 9, 2018 via email •

edited

Loading

tgiphil commented Jul 9, 2018 •

edited

Loading

tgiphil commented Jul 11, 2018 •

edited

Loading

evo01 commented Jul 11, 2018 •

edited

Loading