JITServer AOT cache implementation #12153

Closed
7 tasks done
AlexeyKhrabrov opened this issue Mar 8, 2021 · 8 comments
Labels
comp:jitserver Artifacts related to JIT-as-a-Service project

Comments

@AlexeyKhrabrov
Contributor

AlexeyKhrabrov commented Mar 8, 2021

This issue tracks the progress of contributing the implementation of caching AOT methods at JITServer. Below is the outline of the main stages (the list will be updated with new and completed work items).

@AlexeyKhrabrov
Contributor Author

@mpirvu FYI

@mpirvu mpirvu added this to To do in JIT as a Service via automation Mar 9, 2021
@mpirvu mpirvu added the comp:jitserver Artifacts related to JIT-as-a-Service project label Mar 9, 2021
@mpirvu mpirvu linked a pull request Mar 10, 2021 that will close this issue
@AlexeyKhrabrov
Contributor Author

Here is a description of the JITServer AOT cache design.

When the server performs an AOT compilation for a client JVM, it stores the resulting AOT method body in its AOT cache. The server doesn't have its own SCC, and cached AOT methods are detached from the clients' SCCs that were used to compile them. A cached AOT method body is stored on the server as is, but for each SCC offset in its relocation and validation records we also store the information required to find or recreate the corresponding entity in a different client's SCC. We call these pieces of information serialization records, and we call an AOT method body together with its serialization records a serialized AOT method. We refer to the process of creating these records during out-of-process AOT compilations as AOT method serialization. The process of materializing SCC entities in a different client JVM from serialization records is called AOT method deserialization.

Relocation and validation records in AOT methods contain offsets to the following SCC entities:

  • ROMClass;
  • ROMMethod;
  • Class chain - a list of ROMClass offsets;
  • Well-known classes - a list of class chain offsets.

Corresponding serialization records store the following information:

  • Class record: class name; secure hash (e.g. SHA-256) of the ROMClass body; name of the first class loaded by its class loader.
  • Method record: defining class record; index within the list of methods in the defining class.
  • Class chain record: list of corresponding class records.
  • Well-known classes record: list of corresponding class chain records; bitset describing which classes out of the predefined list of well-known classes are included.
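
As a rough illustration of the record contents listed above, here is a minimal Python sketch. It is not the OpenJ9 implementation (which is C++); all class and field names are hypothetical, and records are nested directly for readability rather than referenced by ID.

```python
import hashlib
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ClassRecord:
    name: str
    rom_class_hash: bytes    # secure hash of the ROMClass body
    loader_first_class: str  # name of the first class loaded by its class loader


@dataclass(frozen=True)
class MethodRecord:
    defining_class: ClassRecord
    index: int               # position in the defining class's method list


@dataclass(frozen=True)
class ClassChainRecord:
    classes: Tuple[ClassRecord, ...]


@dataclass(frozen=True)
class WellKnownClassesRecord:
    chains: Tuple[ClassChainRecord, ...]
    included: int            # bitset over the predefined well-known class list


def make_class_record(name: str, rom_class_body: bytes,
                      loader_first_class: str) -> ClassRecord:
    # SHA-256 is used here because the description names it as an example.
    digest = hashlib.sha256(rom_class_body).digest()
    return ClassRecord(name, digest, loader_first_class)
```

Because the records are frozen dataclasses, two records built from identical inputs compare equal, which mirrors the idea that a record describes an entity independently of any particular client's SCC.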

Finding a ROMClass in the SCC requires its name and a class loader instance. We use the name of the first loaded class as a way to identify class loaders across different client JVMs. This is similar to the heuristic used by local AOT, which identifies class loaders by the class chain of the first loaded class. ROMClass hash is used to efficiently compare ROMClasses across clients for equality.

In order to avoid duplicating the information stored in serialization records, each distinct record is assigned a unique ID that other records and serialized AOT methods use to refer to it. E.g. a class chain record stores a list of class record IDs. Class loader identifying names are actually stored as a separate record type - class loader records. A serialized AOT method body stores a list of <record type, record ID, offset> tuples, where offset is the offset into the method relocation metadata where the corresponding SCC offset (which needs to be updated during deserialization) is located.
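
The ID scheme can be sketched as an interning table. This is only an illustration under assumptions: `RecordTable`, `RecordType`, and `SerializedMethod` are hypothetical names, and IDs here are unique across all record types, which the real implementation may or may not do.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Hashable, List, Tuple


class RecordType(Enum):
    CLASS_LOADER = 0
    CLASS = 1
    METHOD = 2
    CLASS_CHAIN = 3
    WELL_KNOWN_CLASSES = 4


class RecordTable:
    """Assigns one ID per distinct record; equal records share an ID."""

    def __init__(self) -> None:
        self._ids: Dict[Tuple[RecordType, Hashable], int] = {}

    def intern(self, rtype: RecordType, payload: Hashable) -> int:
        key = (rtype, payload)
        if key not in self._ids:
            self._ids[key] = len(self._ids) + 1  # new record: next free ID
        return self._ids[key]                    # existing record: reuse


@dataclass
class SerializedMethod:
    body: bytes
    # (record type, record ID, offset into the relocation metadata)
    refs: List[Tuple[RecordType, int, int]]


table = RecordTable()
loader_id = table.intern(RecordType.CLASS_LOADER, "com/example/Main")
object_id = table.intern(RecordType.CLASS, ("java/lang/Object", b"hash-a", loader_id))
string_id = table.intern(RecordType.CLASS, ("java/lang/String", b"hash-b", loader_id))
# A class chain record stores class record IDs, not the class records themselves.
chain_id = table.intern(RecordType.CLASS_CHAIN, (string_id, object_id))
method = SerializedMethod(b"\x00" * 8, [(RecordType.CLASS_CHAIN, chain_id, 0x40)])
```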

During an out-of-process AOT compilation at the JITServer, the information required to create serialization records is fetched from the client, either on demand or included with the compilation request or responses to existing queries. The serialization records are stored in the AOT cache (new ones are created, existing ones are found and reused). At the end of the compilation, the resulting serialized AOT method body is stored in the AOT cache, and subsequent AOT compilation requests of the same method from other clients can be served with this cached version.

A serialized AOT method is identified by its defining class chain, method index in its defining class, optimization level (normally warm), and its AOT header ID. Each client SCC stores an AOT header that describes relevant CPU features and JVM options assumed for AOT methods stored in this SCC. An AOT method can only be loaded in a different JVM if its AOT header is compatible. The JITServer assigns an AOT header ID to each distinct AOT header body used by its clients.
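
A minimal sketch of the lookup key and header ID assignment described above, assuming a plain in-memory dictionary. `CachedMethodKey`, `AOTHeaderTable`, and `AOTCache` are hypothetical names, and concurrency and memory management are ignored.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class CachedMethodKey:
    defining_class_chain_id: int  # ID of the defining class chain record
    method_index: int             # index within the defining class
    opt_level: str                # normally "warm"
    aot_header_id: int


class AOTHeaderTable:
    """Assigns one ID per distinct AOT header body seen from clients."""

    def __init__(self) -> None:
        self._ids: Dict[bytes, int] = {}

    def id_for(self, header_body: bytes) -> int:
        return self._ids.setdefault(header_body, len(self._ids) + 1)


class AOTCache:
    def __init__(self) -> None:
        self._methods: Dict[CachedMethodKey, bytes] = {}

    def store(self, key: CachedMethodKey, serialized_method: bytes) -> None:
        self._methods.setdefault(key, serialized_method)  # keep the first copy

    def find(self, key: CachedMethodKey) -> Optional[bytes]:
        return self._methods.get(key)
```

In this sketch, a request from a second client hits the cache only if its AOT header body maps to the same header ID; any difference in the header yields a different key and therefore a miss.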

When a client JVM receives a serialized AOT method in response to a compilation request, the method needs to be deserialized before performing the AOT load. For each serialization record, we find or recreate the corresponding SCC entity, and update the corresponding SCC offsets in relocation and validation records. If at any point an SCC entity doesn't match (e.g. a ROMClass hash is different), deserialization fails and the client has to request a new compilation from the server. The result of deserializing each record is cached to avoid repeated lookups and validations (including failed ones). IDs of records cached at the client are communicated back to the server with subsequent compilation requests so that the server doesn't have to send them again. Cached records are invalidated accordingly when classes are unloaded.
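
The client-side result caching could look roughly like the following sketch. `DeserializationCache` and its methods are hypothetical, `None` stands for a failed deserialization, and reporting failed records among the known IDs is an assumption of this sketch, not something the description specifies.

```python
from typing import Callable, Dict, Iterable, Optional, Set


class DeserializationCache:
    """Remembers the outcome of deserializing each record ID."""

    def __init__(self) -> None:
        self._results: Dict[int, Optional[object]] = {}

    def resolve(self, record_id: int,
                lookup: Callable[[], Optional[object]]) -> Optional[object]:
        # Run the (potentially expensive) lookup only once per record ID.
        # A None result is cached too, so failures are not retried.
        if record_id not in self._results:
            self._results[record_id] = lookup()
        return self._results[record_id]

    def known_ids(self) -> Set[int]:
        # IDs the client could report to the server with its next request.
        return set(self._results)

    def invalidate(self, record_ids: Iterable[int]) -> None:
        # Called when the classes behind these records are unloaded.
        for record_id in record_ids:
            self._results.pop(record_id, None)
```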

Deserializing each record type is done as follows:

  • Class loader record: Look up the class loader by the name of its first loaded class in the class loader table.
  • Class record: Get the class loader for the class loader record, and use it to find the RAMClass by name, get its ROMClass, and compare its hash with the one stored in the record. If the RAMClass is not yet loaded, we fail deserialization since the AOT load would fail anyway.
  • Method record: Find the RAM class for the defining class record, and get the ROMMethod by index.
  • Class chain record: Find the RAM class for the leaf class record and get its class chain in the SCC. For each class record in the chain, find the corresponding ROMClass and compare it with the one in the class chain in the SCC.
  • Well-known classes record: Find the well-known class chain offsets in the SCC for the bitset of included classes stored in the record. For each class chain record, find its corresponding class chain and compare it with the one in the well-known class chain offsets in the SCC.
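
The first four steps above can be sketched against a toy model of client state. Everything here is a stand-in: the class loader table is a dictionary keyed by the name of the first loaded class, `LoadedClass` plays the role of a RAMClass with its ROMClass body, and the SCC class chain is a plain list. None of these names come from OpenJ9.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class LoadedClass:
    name: str
    rom_class_body: bytes
    methods: List[str]


# first loaded class name -> {class name -> loaded class}
LoaderTable = Dict[str, Dict[str, LoadedClass]]
# (loader's first loaded class, class name, expected ROMClass hash)
ClassRef = Tuple[str, str, bytes]


def deserialize_class(loaders: LoaderTable, ref: ClassRef) -> Optional[LoadedClass]:
    loader_name, class_name, expected_hash = ref
    loader = loaders.get(loader_name)      # class loader record
    if loader is None:
        return None
    cls = loader.get(class_name)           # find the class by name
    if cls is None:
        return None                        # not loaded yet: fail
    if hashlib.sha256(cls.rom_class_body).digest() != expected_hash:
        return None                        # different ROMClass: fail
    return cls


def deserialize_method(cls: Optional[LoadedClass], index: int) -> Optional[str]:
    if cls is None or not 0 <= index < len(cls.methods):
        return None
    return cls.methods[index]


def deserialize_class_chain(loaders: LoaderTable, chain: List[ClassRef],
                            scc_chain: List[LoadedClass]) -> bool:
    # Every class record must resolve to the class at the same position
    # in the chain that the client's SCC holds for the leaf class.
    if len(chain) != len(scc_chain):
        return False
    return all(deserialize_class(loaders, ref) is scc_cls
               for ref, scc_cls in zip(chain, scc_chain))
```

Returning `None` or `False` on any mismatch corresponds to failing deserialization, after which the client would request a fresh compilation.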

@AlexeyKhrabrov
Contributor Author

@dsouzai FYI

@dsouzai
Contributor

dsouzai commented Mar 12, 2021

Thanks for the design description @AlexeyKhrabrov; I thought there might be a way we didn't need another table, but it turns out that approach would lead to functional issues in AOT. Your approach looks functionally correct.

JIT as a Service automation moved this from To do to Done Mar 19, 2021
@AlexeyKhrabrov
Contributor Author

This was not supposed to be closed by #12154. @mpirvu could you please reopen this issue? (I don't think I have permissions to do that)

@dsouzai dsouzai reopened this Mar 19, 2021
JIT as a Service automation moved this from Done to In progress Mar 19, 2021
@dsouzai
Contributor

dsouzai commented Mar 19, 2021

Seems to have closed automatically; not sure why since #12154 didn't have one of the keywords specified in [1] but somehow it got linked nonetheless :S.

[1] https://docs.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue

@AlexeyKhrabrov
Contributor Author

This issue got linked again to a PR (#12154) that doesn't use any of the keywords that would cause the PR to close it. This is probably an artifact of the "JIT as a Service" automated project board.

@AlexeyKhrabrov
Contributor Author

The last item in the list was essentially resolved in #14207. This issue can be closed now.
