Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Garbage Collection #88

Open
charsleysa opened this issue Sep 3, 2014 · 26 comments
Open

Garbage Collection #88

charsleysa opened this issue Sep 3, 2014 · 26 comments

Comments

@charsleysa
Copy link
Member

Currently there is no way to reclaim memory in MOSA.
This issue may not seem much of a big deal now but when we get ARM support up and running MOSA may be run in memory constrained environments so it will need to reclaim unused memory or it will run out of memory quite quickly.

Garbage collection in the Runtime will solve this but the issue is how do we implement it?

I propose that for starters we look at using a reference counter implementation as it will be the easiest to implement. I am aware that there are issues with it such as circular reference failure but those are minimal cases which are acceptable for MOSA 1.5.


Want to back this issue? Place a bounty on it! We accept bounties via Bountysource.

@charsleysa charsleysa added this to the 1.5 - ARM Platform Support milestone Sep 3, 2014
@ArsenShnurkov
Copy link

Garbage Collector + Memory manager what is stable for use (in COSMOS OS):
http://cosmos.codeplex.com/discussions/462382

@charsleysa
Copy link
Member Author

@ArsenShnurkov that code can't be used for a few reasons:
It's old and outdated
COSMOS don't even use it any more
MOSA and COSMOS have different structures which make code ports almost impossible.

Also I severely dislike their code management, it looks very fragmented and difficult to follow.

With MOSA we should be able to create a simple garbage collector thanks to the compiler and it's IR stages which allow us to quickly insert code to all platforms and allow us to make calls into the platform specific runtimes.

@tgiphil
Copy link
Member

tgiphil commented Sep 6, 2014

I would avoid what I consider to be the rabbit hole pursuit of trying to implement a working reference counting garbage collector. What makes MOSA unique is that all the compilation stages maintain if a register or stack location is an object type or something else. This is by design so precise garbage collector can be implemented. I'd rather spend time in that pursuit.

@charsleysa
Copy link
Member Author

The problem I see with the way mosa is at the moment is that while we know this information at compile time we are missing a lot of this information at runtime.

With a mark and sweep system, if we miss even one root that would cause a fatal crash of the operating system.

As of yet I haven't figured out an efficient implementation of Mark and sweep.

@ArsenShnurkov
Copy link

Staccato: A Parallel and Concurrent Real-time Compacting Garbage Collector for Multiprocessors
http://researcher.watson.ibm.com/researcher/files/us-groved/rc24504.pdf
summer of 2006

the Garbage Collection Bibliography (up to 2009-2010)
http://www.cs.kent.ac.uk/people/staff/rej/gcbib/gcbib.pdf

Just google query (it gives citation counts):
https://www.google.com/search?q=%22real-time+garbage+collection%22+filetype%3Apdf

@tgiphil
Copy link
Member

tgiphil commented Sep 18, 2014

My current thoughts and plans for the first basic GC implementation are:

  1. Precise object tracking --- the compiler will emit detailed stack and register usage.
  2. Objects are allocated from memory blocks (maybe around 4k each)
  3. Objects can span more than one block.
  4. Memory block allocations are tracked and hold flags to indicate active or not per block
  5. All objects contain a bit to indicate if the object was traversed already during a GC (the meaning of the bit flips each GC cycle)
  6. On garbage collection -
  • All threads are suspended
  • All the memory block flags are initially set to not active
  • Object references on stack and static heap locations are collected ("root references")
  • All root references are traversed to find additional objects
  • Object that get traversed have their GC bit set (or flipped) - once flipped they are not traversed again
  • Memory blocks containing an object that got their GC bit set is set to active
  • Once all objects are traversed, all memory blocks which remain inactive are either re-used for future memory allocations or released back to the OS.
  • Finally, all threads are resumed.

This is a simple design with some advantages: notably easy/straightforward implementation, fast memory allocator, reclamation of unused memory (without compacting) and disadvantages: internal fragmentation (by objects less than 4k) and external fragmentation (less continuous memory).

Again, only my initial thoughts - this may change as we get closer to implementation.

@tgiphil tgiphil modified the milestones: Future, 1.5 - ARM Platform Support Dec 26, 2014
@GeroL
Copy link
Contributor

GeroL commented Feb 4, 2015

GC of .Net itself: https://github.com/dotnet/coreclr/tree/master/src/gc
C++ but still maybe useable somehow

@ArsenShnurkov
Copy link

@tgiphil tgiphil self-assigned this Oct 18, 2016
@tgiphil
Copy link
Member

tgiphil commented Oct 19, 2016

@ArsenShnurkov I'm working on 1) Precise Object Tracking now. It requires structure changes to the compiler to track object references and object offsets. In addition, x86 specific operand types, which are incompatible with other platforms, are being removed.

@ArsenShnurkov
Copy link

ArsenShnurkov commented Jul 6, 2017

@tgiphil, May I ask you write a wiki page about current state of GC in MOSA ? In particular - how the existing code structured and how to start read/understand it.

@tgiphil
Copy link
Member

tgiphil commented Jul 6, 2017

Sure; however, there is no GC code yet. The compiler was recently restructured to support Precise Object Tracking by keeping track of object types and areas where object pointers may be obtrusive (i.e.. during a bitwise memory-to-memory copy).

Interesting note: With the new GDB Connector (used with the new debugger), it will be possible to leverage it and write a prototype GC outside of the kernel and rigorously debug and test it.

@ArsenShnurkov
Copy link

ArsenShnurkov commented Jul 10, 2017

2007, MS, STOPLESS: A Real-Time Garbage Collector for Multiprocessors
    It was implemented on top of the Bartok compiler and runtime for C#
    this is the first garbage collector that provides real-time support on stock multiprocessor hardware

2001, IBM, A Pure Reference Counting Garbage Collector
    collects cyclic garbage
    no counters for stack objects

@ArsenShnurkov
Copy link

@tgiphil, Please read this paper, it shoud give you some ideas on tracking (compressing, GC-unsafe instructions, usage of decompiler-like code to calculate instruction length):

1999, Intel, Support for Garbage Collection at Every Instruction in Compiler
    compiler creates map for every instruction:
    the set of registers and stack locations and "stack adjustment"

What is different or similar in your Precise Object Tracking implementation?

@tgiphil
Copy link
Member

tgiphil commented Jul 15, 2017

@ArsenShnurkov - Thanks for sharing the paper. I have not read this one before.

Below are my notes for the future GC implementation - it includes compact GC map:

1. Create internal GC class
2. Create memory pools 
3. Initialize gen 1 & 3 using a memory pool
4. Add GC.Fixed(object reference) to add the object to the fixed-do-not-move list
5. Create initial static objects in gen 3 since they generally are long lived
6. Create a table lists of pointers to static objects
	a. Loads & stores find & update static objects on this table
	b. This table also forms the one of the roots for GC
7. For each method stack frame, create map with the following:
	a. List of object parameters (offset from EBP)
	b. Compressed map showing for each PC location which registers have an object reference
	c. Compressed map showing for each stack space the PC ranges where the stack spaces contains a live object reference
	d. Idea for compressing both maps:
		i. Map is constructed by following this run length encoding scheme
		ii. The first two bits of every byte represent the instruction code
		iii. Instructions:
			1) 00: INCREMENT  {5 bits: increment}
				® Special: All zero means end of encoding
			2) 01: REGISTER {1 bit: alive or dead} {3 bits: register index}
			3) 10: STACK { 1bit: alive or dead} {5 bits: index} 
				® indexed from EBP / 4 (alignment)
			4) 11: EXTENDED {6 bits: index}
				® Add this index to the next stack instruction's index
	e. Iterative alive analysis finds where registers and stack slots contains object references at each PC position

Important to remember is to take a pointer to an object, it must be fixed first. And once fixed, an object can not be moved by the GC. So there is no need to try to track this reference (and any offsets derived from it).

@tgiphil
Copy link
Member

tgiphil commented Jul 15, 2017

In comparison with the GC in the paper vs the our proposed MOSA implementation:

  • Our proposed GC map is already compact since it spans multiple instructions or live ranges. In contrast with the paper which builds a GC map per instruction and then compresses it using Huffman compression. Huffman compression, which is based on frequency, would not helpful on our compacted data since it already squeezed extraneous bits and there would be little redundancy or patterns. RLE-type compression variations would not help either.
  • Both GC approaches try to avoid interior pointers. (The last MOSA refactoring was done to specifically address this). Unfortunately, .NET framework implementation does a lot of unsafe things for performance (example, pointers into strings). This will be a challenge for MOSA since it'll have to assume some conservative approaches (assuming some pointers could be references and fixing them to memory) and/or detect those unsafe regions and wait for thread(s) to move outside of those regions before starting a GC.
  • Both approaches assume a one-to-one mapping of IR instructions. While technically MOSA expands IR instructions into native instructions, it still internal represents them as a single instruction from which a GC map can still be built.

@ArsenShnurkov
Copy link

ArsenShnurkov commented Jul 15, 2017

a paper about interrior pointers:
2009, MS, Transacting Pointer-based Accesses in an Object-based Software Transactional Memory System
    http://transact09.cs.washington.edu/17_paper.pdf
    Section 6 describes an important optimization that eliminates runtime overhead in most cases.

I understand why interrior pointers are necessary (passing object fields and array elements into methods by reference, delegates for value types?), I understand why pinning is necessary (to work with external devices like videocards), but I don't understand why unsafe code is necessary in MOSA

@tgiphil
Copy link
Member

tgiphil commented Jul 19, 2017

Unsafe code is necessary mostly because it's unavoidable. All the open source versions .NET framework use unsafe code internally. Note: The MOSA kernel and device drivers will avoid using any unsafe code.

@tgiphil tgiphil modified the milestones: Future, 1.9 Release Aug 19, 2017
@tgiphil tgiphil changed the title Kernel Garbage Collection Garbage Collection Aug 19, 2017
@L3tum
Copy link

L3tum commented Jul 9, 2018

Hi!

I'd really like to get into this issue. However, I'm unsure where to start with it in MOSA. Could you point me to some locations?

@evo01
Copy link

evo01 commented Jul 9, 2018 via email

@tgiphil
Copy link
Member

tgiphil commented Jul 9, 2018

@L3tum Great! Let's jump right in:

We plan to implement a precise garbage collector (rather than a conservative garbage collector). This means the compiler needs to emit metadata that precisely describes the lifetime of all object references at every point of execution and every memory location.

There are four basic pieces of this metadata required to make this happen:

  1. At every point within a method, which registers contain object references,
  2. Stack locations contain object references --- such as with the passing and returning of object references during a method call,
  3. Temporary stack locations contain object references -- usually spill/store locations or non-primitive structs, and
  4. Static objects or structs.

The first item is the most challenging and where we should start. Take a look at PreciseGCStage.cs. It’s re-using some data flow analysis from the register allocation --- because they both attempt to determine the life spans of registers. In the case of the register allocator, it was analyzing the virtual registers life spans to determine what physical register to actually use. While in this stage, the life span analysis is for physical registers that contain object references.

Note: For implementation simplicity, we try to maintain interior pointers (pointers directly to the root of an object) in all register and stack locations. Memory accesses are then indirect with the base register pointing to the root of the objects and offset in other register, or as an immediate constant. This avoids having to account for pointers directly into an object – which may be constantly changing (such as within a loop).

At present, PreciseGCStage is vastly incomplete and what is there is untested. It would be helpful to complete and test the implementation that determines when registers contain object references. And then emit compact metadata that descriptions the live object references at each instruction point.

See my July 17 notes, specifically 7, on to emit a compact GC metadata. This is just my initial idea – and there may be much better ways to do this. The key criteria for any representation is that it be compact and relatively fast to decode during a GC cycle.

@L3tum
Copy link

L3tum commented Jul 10, 2018

Thanks for the detailed answer!

So far I think I've figured out where to begin in MOSA.
Just one question: I'm unsure where/how to emit the metadata/map or what exactly you mean with it. I understand the general concept and structure you're going with, I think, but I'm not really sure where to save the maps etc. to, so to speak, and how to do it in MOSA. I don't want to do anything wrong just because I misunderstood something so I figured I'd ask before I just start with something.

@evo01
Copy link

evo01 commented Jul 10, 2018

So is the long term plan to have two execution / garbage collection models?
a) One that handles garbage collection for compiled operating system code that has predetermined GC points
b) Another for users programs?

@tgiphil
Copy link
Member

tgiphil commented Jul 11, 2018

@L3tum Understood. Let's break the live ranges analysis into three steps/parts: 1) the actual analysis, and 2) compressing the map, and 3) emitting metadata/map.

The analysis needs to generate lists of instruction ranges where an object reference is alive. Most of this is already written since it based on the same analysis for register allocation - the only difference is this stage focuses on physical registers. I believe there are some TODO notes in the code. Once the code is drafted, let's add trace code to dump the live ranges information to Explorer's debug window. We can then review it manually to see if it is working. And if so, move on to the next step.

The next step is to compresses the map. My proposal is similar to traditional a runtime-length compression algorithm, except it's not based on repeating characters, but rather the change points of the live ranges - in other words where a live range starts or ends - within the sets of registers and stack locations. So, gaps where there are no changes do not require any additional bits of encoding. This is important because on x86/x64 instructions are variable length. In addition, a status change can only affect one or two registers at a time. So, we capitalize on the rarity of changes to the live ranges to efficiency encode the map. I'll write up a more detailed specification. (note: I'm open to other forms of encoding).

Ideally, the compression routine would stream out the metadata map sequentially from the start to end of a method. So this step would be embedded in the part above. The actually code that emits of the metadata is very similar to writing to a file stream. See examples: ProtectedRegionLayoutStage.cs and MetadataStage.cs. Note: These examples rely heavily on the linker's post processing --- which won't be necessary for this map --- so ignore all the linker method codes and zero filling.

Don't worry about doing anything wrong - rather think of this as a learning opportunity. If you have any questions, you can post them here or on gitter (recommended).

@tgiphil
Copy link
Member

tgiphil commented Jul 11, 2018

@evo01 - The focus is short-term -> a simplistic garbage collector for a single-process with multiple threads.

@evo01
Copy link

evo01 commented Jul 11, 2018

@tgiphil - Thanks!

-Adam

@L3tum
Copy link

L3tum commented Jul 11, 2018

@tgiphil Alright, thanks!

I think I understand MOSA a bit better now so I'll get working.

@tgiphil tgiphil modified the milestones: 1.9 Release, 2.0 Release Aug 5, 2018
@tgiphil tgiphil modified the milestones: 2.2 Release, 2.4 Release Jan 12, 2023
@tgiphil tgiphil modified the milestones: 2.4 Release, 2.3 Release Apr 8, 2023
@tgiphil tgiphil modified the milestones: 2.5 Release, 2.6 Release Jan 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: In Progress
Status: In Progress
Development

No branches or pull requests

6 participants