This repository has been archived by the owner on Nov 1, 2020. It is now read-only.

Multi-module compilation preparation #851

Merged: 1 commit, Feb 26, 2016

nattress
Contributor

Add a new command-line switch to ILC: /multifile. When specified, only
the assemblies passed as input will have methods compiled. Referenced
types / methods from other assemblies are not compiled into the output
object file. This switch is most likely temporary as we hone our
compilation story and implementation.

Each managed module adds pointers to the start and end of a module
global data header to a custom section of the object file, .modules$I.
These entries are merged at link time, producing a list of module headers
(Windows only for now; OSX / Linux needs a tweak to CLI first).

In StartupCodeHelpers.cs, initialize global tables from each module
using the list of pointers that was written to .modules$I. This data is
discovered through two exports (__modules_a and __modules_z) which,
through linker section merging, are placed on either side of the module
header pointers in the final binary.
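
A minimal C++ sketch of the startup walk described above, for illustration only. The real code is C# in StartupCodeHelpers.cs; `ModuleHeader`, `moduleId`, and `CollectModuleIds` are hypothetical names, and the caller passes the range bounds explicitly instead of reading the `__modules_a` / `__modules_z` exports. Skipping null slots is an assumption about linker padding between merged section contributions.

```cpp
#include <vector>

// Hypothetical stand-in for a module's global data header.
struct ModuleHeader {
    int moduleId;
};

// Visit every header pointer in the half-open range [first, last).
// In the linked binary the two bounds would come from the __modules_a and
// __modules_z exports; here they are ordinary parameters.
std::vector<int> CollectModuleIds(ModuleHeader* const* first,
                                  ModuleHeader* const* last) {
    std::vector<int> ids;
    for (ModuleHeader* const* slot = first; slot != last; ++slot) {
        // Assumption: padding between merged contributions is zero-filled,
        // so a null slot is skipped rather than treated as an error.
        if (*slot != nullptr) {
            ids.push_back((*slot)->moduleId);
        }
    }
    return ids;
}
```

Because the range is half-open, an image with no managed modules yields an empty range and the loop body never runs.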

Alter interface dispatch to store its dispatch map table as an
ArrayOfEmbeddedDataNode and place it in the module header list. This
allows each module's EETypes to continue using index-based lookup of
dispatch maps.

Add a new field to EETypes which points at a ModuleInfo* through an
indirection cell. This indirection cell is filled in at runtime
initialization and allows a type to find its dispatch map table.
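
The lookup path through the indirection cell can be sketched roughly as follows. This is a C++ illustration under stated assumptions, not the runtime's actual layout: the field names (`moduleInfoCell`, `dispatchMapIndex`, `dispatchMaps`, `dispatchMapCount`) and the `GetDispatchMap` helper are hypothetical.

```cpp
#include <cstdint>

// Hypothetical dispatch map payload.
struct DispatchMap {
    int entryCount;
};

// Hypothetical per-module record holding that module's dispatch map table.
struct ModuleInfo {
    const DispatchMap* const* dispatchMaps;
    std::uint32_t dispatchMapCount;
};

// Hypothetical slice of an EEType: the cell is emitted empty by the compiler
// and filled in during runtime initialization.
struct EEType {
    ModuleInfo** moduleInfoCell;     // indirection cell
    std::uint32_t dispatchMapIndex;  // index into the owning module's table
};

// Returns the type's dispatch map, or nullptr if the cell has not been
// filled in yet or the index is out of range.
const DispatchMap* GetDispatchMap(const EEType& type) {
    if (type.moduleInfoCell == nullptr) {
        return nullptr;
    }
    const ModuleInfo* module = *type.moduleInfoCell;
    if (module == nullptr || type.dispatchMapIndex >= module->dispatchMapCount) {
        return nullptr;
    }
    return module->dispatchMaps[type.dispatchMapIndex];
}
```

The index stays module-relative, so two modules can both use index 0 without colliding.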

ModuleHeaderSection.cs|h files define the section headers currently
supported.

Move the module info lookup out of the bootstrapping code and into the
runtime.

enum ModuleHeaderSection
{
StringEETypePtr = 0x00,
StringTableStart = 0x01,
Member

We should have only one row per section - that has both start and end.

Something like:

struct ModuleInfoRow {
    int32 SectionID;
    int32 Flags; // Eventually, these flags can say whether start/end are actual pointers, RVAs, GP-relative, ...
    void * Start;
    void * End;
};
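
With one row per section, a consumer could find a section by ID and derive its size from the row alone. The sketch below assumes that `Flags == 0` means `Start` / `End` are plain pointers; `FindSection` and `SectionSizeInBytes` are hypothetical helpers, not part of the PR.

```cpp
#include <cstddef>
#include <cstdint>

struct ModuleInfoRow {
    std::int32_t SectionID;
    std::int32_t Flags;  // assumption: 0 means Start/End are plain pointers
    void* Start;
    void* End;
};

// Linear search for the row describing the given section; nullptr if absent.
const ModuleInfoRow* FindSection(const ModuleInfoRow* rows, std::size_t count,
                                 std::int32_t sectionId) {
    for (std::size_t i = 0; i < count; ++i) {
        if (rows[i].SectionID == sectionId) {
            return &rows[i];
        }
    }
    return nullptr;
}

// Size of the section's data in bytes, valid only for plain-pointer rows.
std::size_t SectionSizeInBytes(const ModuleInfoRow& row) {
    return static_cast<std::size_t>(static_cast<const char*>(row.End) -
                                    static_cast<const char*>(row.Start));
}
```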

Contributor Author

Yeah that's better. The Start / End pattern felt awkward.

@@ -240,6 +250,40 @@ private class BoolArrayEqualityComparer : IEqualityComparer<bool[]>
}
}

private class BlobTupleEqualityComparer : IEqualityComparer<Tuple<string, byte[], int>>
Member

Do we actually need to compare anything but the name? A readonly blob with the same name but different content sounds like an internal compiler error. We would end up with the same symbol for two different blobs.

Contributor Author

You're right - I'll alter this. We were failing to compare Tuple objects as the same when the names in each were identical. I got a bit overly enthusiastic and compared all the contents.
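
A name-only comparison might look like the following C++ analogue of the C# comparer (the real change would be an `IEqualityComparer<Tuple<string, byte[], int>>`). `Blob`, `BlobNameHash`, `BlobNameEqual`, `BlobSet`, and `TryAddBlob` are hypothetical names used only for this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <tuple>
#include <unordered_set>
#include <vector>

// (symbol name, contents, alignment): mirrors Tuple<string, byte[], int>.
using Blob = std::tuple<std::string, std::vector<std::uint8_t>, int>;

// Hash on the name only, so it stays consistent with BlobNameEqual.
struct BlobNameHash {
    std::size_t operator()(const Blob& blob) const {
        return std::hash<std::string>{}(std::get<0>(blob));
    }
};

// Two blobs are the same entry when their symbol names match.
struct BlobNameEqual {
    bool operator()(const Blob& a, const Blob& b) const {
        return std::get<0>(a) == std::get<0>(b);
    }
};

using BlobSet = std::unordered_set<Blob, BlobNameHash, BlobNameEqual>;

// Returns true if the blob was added, false if the name was already present.
bool TryAddBlob(BlobSet& blobs, const Blob& blob) {
    return blobs.insert(blob).second;
}
```

Hashing and equality both look at the name only, which keeps the two functors consistent with each other.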

@MichalStrehovsky
Member

@nattress It feels like the compilation group management is split into multiple places (some stuff in Compilation.cs, some in NodeFactory.cs).

Would it make sense to introduce a new abstract class CompilationGroup that would have two classes deriving from it (SingleFileCompilationGroup and MultifileCompilationGroup)?

CompilationGroup would provide methods to determine whether a type/method is part of it. Maybe it could also provide compilation roots, etc.

We would new up the right thing in Compilation and pass the instance to NodeFactory. Then NodeFactory won't have to care about isMultifile at all.

I'm worried that the compilation group logic will get complex over time (e.g. if we introduce a concept of TOC files to deal with generics). People tend to just add more ifs into various places to handle the complexities and I would prefer to have some design that guides them to do the right thing.
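
The shape of this proposal, sketched in C++ for illustration (the compiler itself is C#). Only the three class names come from the comment above; `ContainsModule`, `ShouldCompileMethod`, and the string-based module identity are hypothetical simplifications.

```cpp
#include <set>
#include <string>
#include <utility>

// Answers "is this part of the current compilation?" for callers that
// should not need to know whether the build is single-file or multifile.
class CompilationGroup {
public:
    virtual ~CompilationGroup() = default;
    virtual bool ContainsModule(const std::string& moduleName) const = 0;
};

// Single-file: everything reachable is compiled into the one output.
class SingleFileCompilationGroup : public CompilationGroup {
public:
    bool ContainsModule(const std::string&) const override { return true; }
};

// Multifile: only the modules passed as input are compiled.
class MultifileCompilationGroup : public CompilationGroup {
public:
    explicit MultifileCompilationGroup(std::set<std::string> inputModules)
        : inputModules_(std::move(inputModules)) {}

    bool ContainsModule(const std::string& moduleName) const override {
        return inputModules_.count(moduleName) != 0;
    }

private:
    std::set<std::string> inputModules_;
};

// Stand-in for a NodeFactory decision: no isMultifile flag is needed here.
bool ShouldCompileMethod(const CompilationGroup& group,
                         const std::string& owningModule) {
    return group.ContainsModule(owningModule);
}
```

Future rules, such as TOC files for generics, would then be added as another override or subclass rather than another `if` in the factory.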

@@ -0,0 +1,23 @@
// Licensed to the .NET Foundation under one or more agreements.
Member

I'm conflicted about the location of this file. While it's nice to have both .cs and .h in a single place, we only put managed stuff into src/Common.

Everything native lives under src/Native, and we maintain that split even for common things like EEType flag definitions.

Member

I agree - we should follow the same convention as what we use for other .h/.cs files.

Contributor Author

I'll move it and use comments to hopefully avoid future disaster. Maybe we ought to look at tooling to keep these kinds of splits in sync. We'll hopefully remember to edit both places but other contributors may not. We have cspp a la AsmOffsets but it muddies the build imo.

// of the enum and deprecated sections should not be removed to preserve ID stability.
//
enum ModuleHeaderSection
{
Member

This should eventually be reconciled with ReadyToRunSectionType from https://github.com/dotnet/coreclr/blob/master/src/inc/readytorun.h. Could you please add a comment about it?

Looking at what is there today - we should start this set of IDs at 200.

Contributor Author

Done
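
For illustration, an ID range that starts at 200 could be expressed like this on the native side. Only the two member names visible in the diff are used, their values here are illustrative, and `kModuleHeaderSectionBase` and `IsModuleHeaderSectionId` are hypothetical helpers rather than anything taken from the PR.

```cpp
#include <cstdint>

// Assumption: IDs below this value are left to ReadyToRunSectionType.
constexpr std::int32_t kModuleHeaderSectionBase = 200;

// Only the two members visible in the diff are listed; values are
// illustrative. New members are appended and old ones are never renumbered
// or removed, so IDs stay stable.
enum class ModuleHeaderSection : std::int32_t {
    StringEETypePtr = kModuleHeaderSectionBase,       // 200
    StringTableStart = kModuleHeaderSectionBase + 1,  // 201
};

// True when the ID falls in the range reserved for module header sections.
constexpr bool IsModuleHeaderSectionId(std::int32_t id) {
    return id >= kModuleHeaderSectionBase;
}

// Compile-time guard against accidentally moving the base.
static_assert(static_cast<std::int32_t>(ModuleHeaderSection::StringEETypePtr) == 200,
              "module header section IDs must start at 200");
```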

Add a new command-line switch to ILC: /multifile. When specified, only
the assemblies passed as input will have methods compiled. Referenced
types / methods from other assemblies are not compiled into the output
object file. This switch is most likely temporary as we hone our
compilation story and implementation.

Each managed module adds pointers to the start and end of a module
global data header to a custom section of the object file, .modules$I.
These entries are merged at link time, producing a list of module headers
(Windows only for now; OSX / Linux needs a tweak to CLI first).

In StartupCodeHelpers.cs, initialize global tables from each module
using the list of pointers that was written to .modules$I. This data is
discovered through two exports (__modules_a and __modules_z) which,
through linker section merging, are placed on either side of the module
header pointers in the final binary.

Alter interface dispatch to store its dispatch map table as an
ArrayOfEmbeddedDataNode and place it in the module header list. This
allows each module's EETypes to continue using index-based lookup of
dispatch maps.

Add a new field to EETypes which points at a ModuleManager* through an
indirection cell. This indirection cell is filled in at runtime
initialization and allows a type to find its dispatch map table.

ModuleHeaderSection.cs|h files define the section headers currently
supported. ModuleHeaderSection's enumerands are in line with the plan for
ReadyToRun.

Move the module info lookup out of the bootstrapping code and into the
runtime.

Place compilation module group logic in a dedicated set of classes,
CompilationModuleGroup, MultiFileCompilationModuleGroup, and
SingleFileCompilationModuleGroup which together abstract the logic for
decisions about which types / methods should be included in compilation
in single vs multi file.

Extract an interface from Compilation for the methods that root methods
/ types / Main so CompilationModuleGroup can root things without having
to know about the Compilation class.
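
A rough C++ sketch of the extracted rooting interface, with hypothetical names throughout (`RootSink`, `RootMethod`, `ModuleGroup`, and a toy `Compilation`); the real types are C# and their members are not shown in this conversation.

```cpp
#include <string>
#include <utility>
#include <vector>

// The narrow interface a module group needs in order to root things.
class RootSink {
public:
    virtual ~RootSink() = default;
    virtual void RootMethod(const std::string& methodName,
                            const std::string& reason) = 0;
};

// Toy stand-in for Compilation: it implements the interface and records roots.
class Compilation final : public RootSink {
public:
    void RootMethod(const std::string& methodName,
                    const std::string& /*reason*/) override {
        roots_.push_back(methodName);
    }
    const std::vector<std::string>& Roots() const { return roots_; }

private:
    std::vector<std::string> roots_;
};

// The group depends only on RootSink, never on Compilation itself.
class ModuleGroup {
public:
    explicit ModuleGroup(std::vector<std::string> entryPoints)
        : entryPoints_(std::move(entryPoints)) {}

    void AddCompilationRoots(RootSink& sink) const {
        for (const std::string& entryPoint : entryPoints_) {
            sink.RootMethod(entryPoint, "module entry point");
        }
    }

private:
    std::vector<std::string> entryPoints_;
};
```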
nattress added a commit that referenced this pull request Feb 26, 2016
Multi-module compilation preparation
@nattress nattress merged commit e4806d6 into dotnet:master Feb 26, 2016
@nattress nattress deleted the multifile branch February 26, 2016 22:36