Jato ToDo
=========
Pekka Enberg <penberg@ki.fi>
2014
:toc:
Introduction
------------
This document lists features and improvements to the Jato VM for interested
hackers to work on. The ones in the section labeled as "Priority" are
considered most important.
Priority
--------
This section contains projects that are considered to be most important to
make Jato a top notch alternative JVM for contemporary applications that are
dominated by alternative languages such as Scala, Clojure, and JRuby.
LLVM backend
~~~~~~~~~~~~
Implementing a production-quality JIT compiler for a new architecture is
time-consuming and hard. The goal of this project is to simplify Jato porting
by implementing a code generator using LLVM C bindings. The work has already
been started and it is visible in the ++llvm/core++ branch.
Required skills::
C, LLVM, JVM
Difficulty::
Medium to Hard
Interpreter
~~~~~~~~~~~
Jato currently supports only JIT execution, which is slow for some applications.
For example, startup time for short-lived applications can be reduced with a
simple interpreter. The primary goal of this project is to improve
'vm/interp.c' to cover the whole bytecode instruction set. The secondary goal
is to implement mixed-mode execution, where the VM starts out by interpreting
bytecode and JIT compiles only methods that are invoked frequently.
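
The mixed-mode idea can be sketched with a per-method invocation counter. This is only an illustration: +struct method+, +JIT_THRESHOLD+ and +method_enter()+ are hypothetical names, not Jato's actual data structures, and the threshold value is an arbitrary assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold; a real VM would tune this empirically. */
#define JIT_THRESHOLD 1000

/* Hypothetical per-method state, not Jato's actual struct. */
struct method {
    uint32_t invocation_count; /* bumped on every interpreted call */
    bool compiled;             /* true once native code exists */
};

/* Returns true if this invocation should run as native code. */
static bool method_enter(struct method *m)
{
    if (m->compiled)
        return true;

    if (++m->invocation_count >= JIT_THRESHOLD) {
        /* A real VM would invoke the JIT compiler here. */
        m->compiled = true;
        return true;
    }

    return false; /* keep interpreting */
}
```

Cold methods never pay the compilation cost, which is where the startup time win for short-lived applications would come from.
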
Required skills::
C, JVM
Difficulty::
Medium
Support for OpenJDK
~~~~~~~~~~~~~~~~~~~
Jato uses GNU Classpath to provide essential Java APIs. Unfortunately, the
development of GNU Classpath seems to have slowed after OpenJDK was released.
The purpose of this project is to support OpenJDK as an alternative provider of
essential APIs for better Java compatibility.
Required skills::
C, Java
Difficulty::
Medium
JSR 292: invokedynamic
~~~~~~~~~~~~~~~~~~~~~~
The goal of this project is to add support for the new +invokedynamic+
bytecode instruction specified in JSR 292 that's designed to improve execution
performance of dynamic languages such as JRuby and Redline Smalltalk. Please
note that you also need to do some work in GNU Classpath to be able to run
full invokedynamic applications.
Required skills::
C, Java
Difficulty::
Medium to Hard
Darwin Port
~~~~~~~~~~~
Darwin (a.k.a. Mac OS X) is a popular platform among software developers. The
goal of this project is to get rid of Linuxisms such as:
- Thread-local storage (TLS)
- Signal handling
and be able to run Jato under Darwin.
Required skills::
C, Darwin
Difficulty::
Medium to Hard
Performance
-----------
SSA form
~~~~~~~~
The JIT compiler supports SSA form, but it's currently disabled by default
because it breaks some DaCapo benchmarks. The goal of this project is to fix
all the remaining issues in SSA form conversion so that it can be enabled by
default.
Required skills::
C, x86
Difficulty::
Hard
Method Inlining
~~~~~~~~~~~~~~~
Method inlining is an optimization where a method invocation is replaced with
an inline copy of the invoked method body. As shown by <<Suganuma02>>, static
inlining decisions only make sense for tiny methods where the body of the
method is smaller than the invocation site footprint. The purpose of this
project is to implement method inlining for tiny static methods and optionally,
if inline cache support is added to the VM, inlining of tiny virtual and
interface methods to the inline cache.
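
The size criterion above can be written as a small predicate. This is a sketch under assumptions: +struct callee_info+ and +should_inline()+ are hypothetical names, and both sizes are assumed to be estimates in bytes of native code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a callee; field names are illustrative only. */
struct callee_info {
    size_t body_size; /* estimated native code size of the body, in bytes */
    bool is_static;   /* static methods need no receiver type check */
};

/*
 * Inline only when the callee body is smaller than the code the call
 * sequence itself would occupy (argument setup, call, cleanup).
 */
static bool should_inline(const struct callee_info *callee,
                          size_t call_site_size)
{
    return callee->is_static && callee->body_size < call_site_size;
}
```

Under this rule, inlining can never grow the generated code, which is why it is safe as a static decision.
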
Required skills::
C, x86
Difficulty::
Medium
Array Bounds Check Elimination
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Out-of-bounds array stores and loads are required to throw an
+ArrayIndexOutOfBoundsException+. Currently, the JIT compiler generates a
bounds check for every array access, which is slow, as seen in the SciMark2
benchmark, for example. Fortunately, many of the array bounds checks can be
eliminated, which speeds up SciMark by 40% and SPECjvm98 by 2% on average
<<Wuerthinger07>>.
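
To illustrate what eliminating a check means, the sketch below models the code a JIT might emit for a summing loop, written in C. It uses simple loop versioning (one check hoisted in front of the loop, with a checked slow path) and is not necessarily the algorithm described in <<Wuerthinger07>>. The +struct int_array+ layout and both function names are hypothetical, and a +false+ return stands in for throwing the exception.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical array layout: a length plus a pointer to the elements. */
struct int_array {
    size_t length;
    const int *elems;
};

/* Naive code: one bounds check per access. */
static bool sum_checked(const struct int_array *a, size_t n, long *out)
{
    long sum = 0;

    for (size_t i = 0; i < n; i++) {
        if (i >= a->length)
            return false; /* models the out-of-bounds exception */
        sum += a->elems[i];
    }
    *out = sum;
    return true;
}

/* Optimized code: a single check in front of the loop. */
static bool sum_hoisted(const struct int_array *a, size_t n, long *out)
{
    long sum = 0;

    if (n > a->length)
        return sum_checked(a, n, out); /* slow path keeps exact semantics */

    for (size_t i = 0; i < n; i++)
        sum += a->elems[i]; /* i < n <= length, so no check is needed */
    *out = sum;
    return true;
}
```
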
Required skills::
C
Difficulty::
Medium
Register Allocator Improvements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Linear scan register allocator functions show up high in performance profiles
for many applications. The goal of this project is to improve both the speed of
the register allocator and the generated code. One such improvement would be to
use the SSA form in the register allocator <<Moe02>>.
Required skills::
C
Difficulty::
Medium
Startup Time
~~~~~~~~~~~~
The purpose of this project is to improve VM startup time. There are two
different classes of applications to optimize: command line applications such
as Ant and Maven and graphical user interface applications such as Eclipse.
Both classes might require different kinds of hacks to speed up startup. You
might need to optimize GNU Classpath and the external libraries Jato uses
during VM startup.
Required skills::
C, Java
Difficulty::
Medium
Tail Call Optimizations
~~~~~~~~~~~~~~~~~~~~~~~
Functional languages such as Scala and Clojure make heavy use of tail calls.
The purpose of this project is to implement tail call optimizations to speed up
functional languages on the JVM. For background material, check out HotSpot
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4726340[Request for
Enhancement] on tail-call optimizations and a related blog post by
http://blogs.sun.com/jrose/entry/tail_calls_in_the_vm[John Rose] ("Tail calls
in the VM").
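
As a reminder of what the optimization does, the sketch below shows a tail-recursive function and the loop it is effectively turned into. The function names are made up for illustration, and C stands in for the machine code a JIT would generate.

```c
#include <stdint.h>

/* Tail-recursive form: the recursive call is the last action taken. */
static uint64_t sum_to_rec(uint64_t n, uint64_t acc)
{
    if (n == 0)
        return acc;
    return sum_to_rec(n - 1, acc + n); /* tail call */
}

/*
 * What tail call optimization effectively produces: the arguments are
 * rebound and control jumps back to the top, so the stack does not grow.
 */
static uint64_t sum_to_loop(uint64_t n, uint64_t acc)
{
    for (;;) {
        if (n == 0)
            return acc;
        acc += n;
        n -= 1;
    }
}
```

Without the optimization, deep recursion in the first form consumes one stack frame per call and can overflow the stack; the second form runs in constant stack space.
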
Required skills::
C, x86
Difficulty::
Hard
Virtual Machine
---------------
Ahead-of-time Compilation
~~~~~~~~~~~~~~~~~~~~~~~~~
The goal of this project is to implement support for ahead-of-time compilation.
Required skills::
C, JVM
Difficulty::
Hard
Garbage Collector
-----------------
Exact GC
~~~~~~~~
Jato uses Boehm GC as its garbage collector. While Boehm GC is stable and
reliable, it's unnecessarily slow because it's based on a conservative GC
algorithm. The first step for a fully integrated and fast garbage collector is
to implement an exact GC that uses safepoints (also known as GC points) to
stop the world and GC maps to enumerate the GC root set. There's a broken
implementation of safepoints in 'vm/gc.c' that serves as a starting point for
the project.
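
One common way to represent a GC map is a bitmap per safepoint that records which stack slots hold object references. The sketch below assumes that representation; +struct gc_map+ and +collect_root_slots()+ are hypothetical and do not describe the code in 'vm/gc.c'.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical GC map: bit i set means stack slot i holds a reference. */
struct gc_map {
    uint32_t live_refs; /* this sketch supports at most 32 slots */
    size_t nr_slots;
};

/*
 * Write the indices of all reference-holding slots to 'slots' (at most
 * 'max' of them) and return how many were written.
 */
static size_t collect_root_slots(const struct gc_map *map,
                                 size_t *slots, size_t max)
{
    size_t count = 0;

    for (size_t i = 0; i < map->nr_slots && i < 32 && count < max; i++) {
        if (map->live_refs & (UINT32_C(1) << i))
            slots[count++] = i;
    }
    return count;
}
```

Unlike a conservative collector, an exact GC never has to guess whether a slot that looks like a pointer really is one, because the map says so.
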
Required skills::
C, x86
Difficulty::
Medium
Compacting GC
~~~~~~~~~~~~~
This project is for implementing a compacting GC on top of the core GC to
reduce memory fragmentation and speed up object allocation in the GC. Please
note that this requires support from the VM and the JIT compiler because you
need to be able to move objects during the compacting phase.
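
One reason compaction speeds up allocation is that contiguous free space allows bump-pointer allocation. The sketch below shows that allocator in isolation; +struct bump_heap+ and +bump_alloc()+ are hypothetical names, and the 8-byte alignment is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical heap descriptor: everything in [top, end) is free. */
struct bump_heap {
    uint8_t *top;
    uint8_t *end;
};

/* Allocate 'size' bytes, rounded up to 8, or return NULL if full. */
static void *bump_alloc(struct bump_heap *h, size_t size)
{
    void *p;

    size = (size + 7) & ~(size_t)7;
    if (size > (size_t)(h->end - h->top))
        return NULL; /* a real VM would trigger a collection here */

    p = h->top;
    h->top += size;
    return p;
}
```

Allocation is a comparison and an addition, with no free list to search.
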
Required skills::
C, x86
Difficulty::
Hard
Low Latency GC
~~~~~~~~~~~~~~
The stop-the-world part of a GC can cause latencies on the order of many
milliseconds in Java applications. While there are pauseless GCs for special
purpose hardware, the purpose of this project is to implement GC latency
monitoring tools for the VM and investigate and implement practical solutions
for reducing GC latency on commodity hardware.
Required skills::
C, x86
Difficulty::
Hard
x86
---
x86-64 support
~~~~~~~~~~~~~~
The x86-64 architecture support in native JIT is still buggy. The purpose of
this project is to finish the port and make it stable.
Required skills::
C, x86-64, ABI, assembly
Difficulty::
Hard
Legacy floating point support (x87)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Add instruction selection and code emission for floating point arithmetic using
x87 FPU instructions such as +fadd+ and +fdiv+. The main difference to the
SSE2 approach is that the x87 instructions do not operate on regular
registers but use a stack-based approach, which makes instruction selection
harder. As the SSE2 approach is more performant, the x87 support is only for
compatibility with older x86 CPUs that do not support SSE.
Required skills::
C, x86
Difficulty::
Medium
ARM
---
ARM support
~~~~~~~~~~~
The purpose of this project is to port Jato to the ARM architecture, possibly
including the upcoming 64-bit ARM.
Required skills::
C, ARM, ABI, assembly
Difficulty::
Hard
Portability
-----------
Machine Architectures
~~~~~~~~~~~~~~~~~~~~~
To port the VM to a new architecture, you need to introduce a new
'arch/<target>' directory that implements the following machine architecture
specific parts:
- Instruction encoding (see 'arch/x86/emit.c' and 'arch/x86/encoding.c')
- Instruction selection (see 'arch/x86/insn-selector.brg')
- Exception handling (see 'arch/x86/unwind_{32,64}.S')
- Signal bottom halves (see 'arch/x86/signal.c' and 'arch/x86/signal-bh.sh')
- +sun/misc/Unsafe+ locking primitives (see 'arch/x86/include/arch/cmpxchg*.h')
You also need to make sure the core code in 'vm', 'jit', and 'runtime'
directories works on your architecture. Special attention needs to be paid when
porting to big endian CPUs because of our x86 centric heritage.
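
Java class files store multi-byte values in big-endian order, so code that reads them should not depend on the host byte order. A minimal endian-neutral sketch follows; the helper names +read_be16()+ and +read_be32()+ are illustrative and not taken from the Jato sources.

```c
#include <stdint.h>

/* Assemble a big-endian 16-bit value byte by byte. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

/* Assemble a big-endian 32-bit value byte by byte. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

Because the value is built from individual bytes rather than by casting the buffer to a wider integer type, the result is the same on little-endian and big-endian hosts.
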
Required skills::
C, target machine architecture
Difficulty::
Hard
Windows Port
~~~~~~~~~~~~
The goal of this project is to port Jato to Windows. Please note that the
codebase is currently very GCC and Linux specific, which is likely to bite you
when porting to a native Windows toolchain:
- ABI issues (e.g. Microsoft x64 calling conventions)
- Thread-local storage (TLS)
- Signal handling
- POSIXisms
Required skills::
C, Windows
Difficulty::
Medium to Hard
Clang
~~~~~
Clang is an interesting new compiler built on top of LLVM that claims faster
compilation times and better code generation. The purpose of this project is to
fix blatant GCC-isms in Jato to make it compile with Clang. Please note that you
might need to dive into LLVM and Clang sources if you run into Clang crashes
while compiling the VM.
Required skills::
C, possibly LLVM
Difficulty::
Medium to Hard
JVM
---
Java Native Interface (JNI) support
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The VM has partial support for the Java Native Interface (JNI) API. The goal of
this project is to finish the API.
Required skills::
C, Java
Difficulty::
Medium
JVM Tool Interface (JVM TI)
~~~~~~~~~~~~~~~~~~~~~~~~~~~
The JVM Tool Interface is a native programming interface that's used by
development tools such as the commercial YourKit Java Profiler.
Required skills::
C, Java
Difficulty::
Medium
Tools Support
-------------
Valgrind
~~~~~~~~
You can run Jato under Valgrind to detect bugs in the VM. However, to
accomplish that, Jato generates slightly different code because using signals
for lazy class initialization interacts badly with Valgrind. Furthermore,
throwing a +NullPointerException+ through +SIGSEGV+ also upsets Valgrind.
The goal of this project is to fix up Jato and Valgrind interactions. This
might require changes in both projects.
Required skills::
C, x86
Difficulty::
Medium
GDB
~~~
Jato already hooks into GDB's JIT interface to provide basic debugging
functionality such as backtraces and breakpoints. This could use a cleanup and
various improvements, like the ability to display the arguments passed to a
method. We could implement a better mangling scheme, but it's probably better
to use the new custom debug info support in GDB 7.4. Check the
http://sourceware.org/gdb/onlinedocs/gdb/JIT-Interface.html#JIT-Interface[GDB
documentation] for details.
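
For orientation, the core of the GDB JIT interface is the set of declarations below, which follow the GDB documentation. The helper +register_symfile()+ is a hypothetical illustration of how an in-memory symbol file is announced to the debugger; it is not Jato's actual code.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    JIT_NOACTION = 0,
    JIT_REGISTER_FN,
    JIT_UNREGISTER_FN
} jit_actions_t;

struct jit_code_entry {
    struct jit_code_entry *next_entry;
    struct jit_code_entry *prev_entry;
    const char *symfile_addr;
    uint64_t symfile_size;
};

struct jit_descriptor {
    uint32_t version;
    uint32_t action_flag; /* a jit_actions_t value */
    struct jit_code_entry *relevant_entry;
    struct jit_code_entry *first_entry;
};

/* GDB sets a breakpoint in this function to learn about new code. */
void __attribute__((noinline)) __jit_debug_register_code(void) { }

/* GDB looks up this symbol by name; the version must be 1. */
struct jit_descriptor __jit_debug_descriptor = { 1, 0, 0, 0 };

/* Hypothetical helper: link a new entry and notify the debugger. */
static void register_symfile(struct jit_code_entry *entry,
                             const char *symfile, uint64_t size)
{
    entry->symfile_addr = symfile;
    entry->symfile_size = size;
    entry->prev_entry = NULL;
    entry->next_entry = __jit_debug_descriptor.first_entry;
    if (entry->next_entry)
        entry->next_entry->prev_entry = entry;

    __jit_debug_descriptor.first_entry = entry;
    __jit_debug_descriptor.relevant_entry = entry;
    __jit_debug_descriptor.action_flag = JIT_REGISTER_FN;
    __jit_debug_register_code();
}
```

The symbol file passed in is expected to be an object file image in memory that describes the generated code.
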
Another possible improvement could be minimizing the need to disable
optimizations.
This task should be rather straightforward, but it might require digging
through GDB to figure out possibly missing details from the documentation. It
can be scaled according to the features one proposes to implement.
Required skills::
C, GDB, x86(-64)
Difficulty::
Medium, depends on the scope of the proposal
Other
-----
fork() support
~~~~~~~~~~~~~~
The JRuby VM is unable to support fork() because OpenJDK does not support it.
The goal of this project is to make Java programs running under Jato be able to
call fork() and still have a working system.
Required skills::
C, POSIX
Difficulty::
Medium
POSIX API support
~~~~~~~~~~~~~~~~~
Projects like JRuby and Cassandra rely heavily on OS specific APIs for
correctness and performance reasons. The goal of this project is to support
POSIX APIs in Jato efficiently. Ideally, we should be able to inline the actual
API calls directly to the JIT'd machine code so that there's zero overhead.
As a starting point, you should look into the JNR POSIX project to see what
Jato can do to make it work and perform well out-of-the-box.
Required skills::
C, POSIX
Difficulty::
Medium
JIT API
~~~~~~~
Language VMs that run on top of the JVM need to transform their internal
representation into bytecode for the JIT to kick in. This is wasteful because
the JVM immediately transforms the bytecode into its own intermediate
representation.
The goal of this project is to implement a Java API that allows Java programs
to generate JIT'd code directly and bypass the bytecode conversion layer.
Required skills::
C, POSIX
Difficulty::
Medium
JIT preloading
~~~~~~~~~~~~~~
The goal of this project is to implement support for saving JIT-compiled code
and preloading it to speed up startup time.
Required skills::
C
Difficulty::
Medium
Dalvik
~~~~~~
The Dalvik VM is very similar to the JVM semantically. The goal of this project
is to implement support for running Dalvik applications under Jato.
Required skills::
C, POSIX
Difficulty::
Hard
[bibliography]
References
----------
[bibliography]
- [[[Cooper01]]] Keith Cooper et al. A Simple, Fast Dominance Algorithm. 2001.
http://www.cs.rice.edu/~keith/EMBED/dom.pdf[URL]
- [[[Moe02]]] Hanspeter Mössenböck and Michael Pfeiffer. Linear Scan Register
Allocation in the Context of SSA Form and Register Constraints. 2002.
http://www.ssw.uni-linz.ac.at/Research/Papers/Moe02.html[URL]
- [[[Suganuma02]]] Toshio Suganuma et al. An Empirical Study of Method Inlining
for a Java Just-In-Time Compiler. 2002.
http://www.usenix.org/events/javavm02/suganuma/suganuma_html/[URL]
- [[[Wuerthinger07]]] Thomas Würthinger and Christian Wimmer. Array Bounds
Check Elimination for the Java HotSpot™ Client Compiler. 2007.
http://www.ssw.uni-linz.ac.at/Research/Papers/Wuerthinger07/[URL]