# Foreign Memory API


## Different kinds of memory
- Java GCed memory
- C dynamic memory (malloc)
- file mapped memory (mmap)
- non volatile memory (NVRAM)


## Current two APIs in Java
- java.nio.ByteBuffer
- sun.misc.Unsafe*


> sun.misc.Unsafe is not really supported  


## Example of ByteBuffer
```java
var nativeOrder = ByteOrder.nativeOrder();
var byteBuffer = ByteBuffer.allocateDirect(1024 * 4).order(nativeOrder);
try {
  for (var i = 0; i < 1024; i++) {
    byteBuffer.putInt(i * 4, 42);
  }
} finally {
  UNSAFE.invokeCleaner(byteBuffer);
}
```


## ByteBuffer
- API oriented toward IO buffers
- fast
- cannot release direct memory explicitly (short of `Unsafe.invokeCleaner()`)
- uses `int` indexes (2 GB limit)
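
To make the `int` limit concrete, here is a minimal sketch (the class `ByteBufferLimit` and its method `allocate` are hypothetical names, not part of the JDK). `ByteBuffer.allocateDirect` takes an `int`, so a size expressed as a `long` has to be narrowed first, and any size above `Integer.MAX_VALUE` cannot be represented.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, only here to illustrate the int limit of ByteBuffer
class ByteBufferLimit {
  static ByteBuffer allocate(long byteSize) {
    // allocateDirect takes an int: toIntExact throws ArithmeticException above Integer.MAX_VALUE
    var size = Math.toIntExact(byteSize);
    return ByteBuffer.allocateDirect(size).order(ByteOrder.nativeOrder());
  }

  public static void main(String[] args) {
    System.out.println(allocate(4096).capacity());
    try {
      allocate(3L << 30);  // 3 GB does not fit in an int
    } catch (ArithmeticException e) {
      System.out.println("too big for a ByteBuffer");
    }
  }
}
```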


## Example of Unsafe
```java
long unsafe_addr = UNSAFE.allocateMemory(1024 * 4);
try {
  for (var i = 0; i < 1024; i++) {
    UNSAFE.putInt(unsafe_addr + (i * 4) , 42);
  }
} finally {
  UNSAFE.freeMemory(unsafe_addr);
}
```


## Unsafe
- Really unsafe !
  - crash
  - buffer overflow
- Not that JIT friendly
  - aliasing issue
  - no loop vectorisation


## Need a third API ?
- Unsafe replacement
  - 100% safe and fast
- More low level than ByteBuffer
  - keep ByteBuffer fast but solve the deallocation issue
- Panama (new C <-> Java bridge)
  - support struct, array of structs, etc 
  - nice with small data structures


## Foreign Memory API


## Foreign Memory API - Incubator
This API is currently an incubator module,
so for `java` and `javac` you need to add
the module `jdk.incubator.foreign` to the module graph
```
--add-modules jdk.incubator.foreign
```


then you can import it


In [1]:
import jdk.incubator.foreign.*;

## This API is still in flux
So it may change depending on the Java version


In [2]:
System.out.println("runtime version " + Runtime.version());

runtime version 15-ea+13-487


> Make sure you are using at least Java 15-ea+7


# Memory Segment


## Memory Segment
A memory segment is an area of memory that can be created
from different kinds of memory
- `MemorySegment.ofArray(array)`
- `MemorySegment.allocateNative(8192)`
- `MemorySegment.mapFromPath(path, bytesSize, mapMode)`
- `MemorySegment.ofByteBuffer(byteBuffer)`


## Temporally Bounded
The memory management is __explicit__
- `allocateNative()` allocates the memory
- `close()` releases it.


In [3]:
var segment = MemorySegment.allocateNative(8192);
segment.close();
System.out.println(segment.isAlive());

false


## Temporally Bounded (2)
An access after a `close()` results in a runtime exception


In [4]:
var segment = MemorySegment.allocateNative(8192);
segment.close();
System.out.println(segment.asByteBuffer());

EvalException: Segment is not alive

## Temporally Bounded (3)
A _try-with-resources_ is safer and the syntax is cleaner


In [5]:
try (var segment = MemorySegment.allocateNative(8192)) {
  System.out.println(segment.isAlive());
  System.out.println(segment);
} // the memory is released

true
MemorySegment{ id=0x7d71b4e5 limit: 8192 }


## Spatially Bounded
You can only access the memory inside the bounds of the segment


In [6]:
try (var segment = MemorySegment.allocateNative(8192)) {
  var buffer = segment.asByteBuffer();
  var indexTooBig = 8192;
  System.out.println(buffer.get(indexTooBig));
}

EvalException: null
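
The `null` above is presumably the message of an `IndexOutOfBoundsException` thrown by the bounds check of the `ByteBuffer` view. As a sketch that does not depend on the incubator module, the same check can be observed on a plain direct `ByteBuffer` (the class `BoundsDemo` and the method `readAt` are hypothetical names).

```java
import java.nio.ByteBuffer;

class BoundsDemo {
  // Returns the exception thrown by the read, or null if the read succeeds
  static RuntimeException readAt(ByteBuffer buffer, int index) {
    try {
      buffer.get(index);  // absolute read, checked against the limit of the buffer
      return null;
    } catch (RuntimeException e) {
      return e;
    }
  }

  public static void main(String[] args) {
    var buffer = ByteBuffer.allocateDirect(8192);
    System.out.println(readAt(buffer, 8191));  // last valid index, prints null
    System.out.println(readAt(buffer, 8192));  // one past the end, an IndexOutOfBoundsException
  }
}
```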

## Thread Bounded
Only the thread that allocated the segment can access
the data of that segment


In [7]:
try (var segment = MemorySegment.allocateNative(8192)) {
  var thread = new Thread(() -> {
    var buffer = segment.asByteBuffer();
  });
  thread.start();
  thread.join();
}

Exception in thread "Thread-1" java.lang.IllegalStateException: Attempt to access segment outside owning thread
	at jdk.incubator.foreign/jdk.internal.foreign.MemorySegmentImpl.checkValidState(MemorySegmentImpl.java:159)
	at jdk.incubator.foreign/jdk.internal.foreign.MemorySegmentImpl.asByteBuffer(MemorySegmentImpl.java:125)
	at REPL.$JShell$21.lambda$do_it$$0($JShell$21.java:17)
	at java.base/java.lang.Thread.run(Thread.java:832)


## Thread Bounded (2)
A thread can ask explicitly to access the segment using `acquire()`.


In [None]:
try (var segment = MemorySegment.allocateNative(8192)) {
  var thread = new Thread(() -> {
    try(var acquiredSegment = segment.acquire()) {
      var buffer = acquiredSegment.asByteBuffer();
    }
  });
  thread.start();
  thread.join();
}

> Sharing segments may lead to concurrency issues


## Thread Bounded (3)
An acquired segment must be closed before the segment can be closed.


In [8]:
try (var segment = MemorySegment.allocateNative(8192)) {
  var thread = new Thread(() -> {
    var acquiredSegment = segment.acquire();
    var buffer = acquiredSegment.asByteBuffer();
    // no close !
  });
  thread.start();
  thread.join();
}

EvalException: Cannot close a segment that has active acquired views

## Memory Segment - Summary
- Thread Bounded
- Temporally Bounded
- Spatially Bounded
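
The three guarantees can be pictured with a small wrapper around a `ByteBuffer`. This is only a sketch of the idea under simplifying assumptions: `ConfinedBuffer` is a hypothetical class, it uses heap memory, and it is not how `MemorySegment` is implemented.

```java
import java.nio.ByteBuffer;

// Hypothetical class, a toy model of the three checks done on each access
final class ConfinedBuffer implements AutoCloseable {
  private final ByteBuffer buffer;
  private final Thread owner = Thread.currentThread();
  private boolean alive = true;

  ConfinedBuffer(int byteSize) {
    this.buffer = ByteBuffer.allocate(byteSize);
  }

  private void checkState() {
    if (Thread.currentThread() != owner) {  // thread bounded
      throw new IllegalStateException("access outside owning thread");
    }
    if (!alive) {                           // temporally bounded
      throw new IllegalStateException("not alive");
    }
  }

  byte get(int index) {
    checkState();
    return buffer.get(index);               // spatially bounded: IndexOutOfBoundsException
  }

  void put(int index, byte value) {
    checkState();
    buffer.put(index, value);
  }

  @Override
  public void close() {
    checkState();
    alive = false;                          // every later access fails
  }
}
```

Each access pays for the owner and liveness checks before the bounds check, which is why the JIT has to be able to hoist them out of loops for the API to stay fast.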


# MemoryAddress and VarHandle


## MemoryAddress
An offset inside the segment


In [9]:
try (var segment = MemorySegment.allocateNative(8192)) {
  var base = segment.baseAddress();
  System.out.println(base);
  var newBase = base.addOffset(16);
  System.out.println(newBase);
}

MemoryAddress{ region: MemorySegment{ id=0x7ce0cd85 limit: 8192 } offset=0x0 }
MemoryAddress{ region: MemorySegment{ id=0x7ce0cd85 limit: 8192 } offset=0x10 }


## MemoryAddress is value based
MemoryAddress is not a classical class
- no identity
  - synchronized, wait/notify are not supported
- acts like a primitive type


> Not fully implemented yet !


## VarHandle
A class describing how to access a value
- primitive, struct, array
- byte order (`java.nio.ByteOrder`)
- alignment
- semantics (plain, volatile, opaque)


## VarHandle (2)
A class describing how to access a value


In [10]:
import java.nio.ByteOrder;
var nativeOrder = ByteOrder.nativeOrder();
System.out.println(nativeOrder);

LITTLE_ENDIAN


In [11]:
import java.lang.invoke.VarHandle;
VarHandle intHandle = MemoryHandles.varHandle(int.class, nativeOrder);
System.out.println(intHandle);

VarHandle[varType=int, coord=[interface jdk.internal.access.foreign.MemoryAddressProxy]]


## Get/set one `int` at offset 32


In [12]:
try (var segment = MemorySegment.allocateNative(8192)) {
  var base = segment.baseAddress();
  intHandle.set(base.addOffset(32), 42);
  System.out.println(intHandle.get(base.addOffset(32)));
}

42


## Access and alignment
You cannot read or write a value if the address is not correctly aligned


In [13]:
var longHandle = MemoryHandles.varHandle(long.class, nativeOrder);
try(var segment = MemorySegment.allocateNative(8192)) {
  longHandle.set(segment.baseAddress().addOffset(3), 0L);
}

EvalException: Misaligned access at address: 140481268933715
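
For comparison, here is a sketch of the alignment rule applied by the `ByteBuffer` view handles of `java.base`, which are more lenient: a misaligned plain `get`/`set` is accepted, and only the other access modes (volatile, for instance) throw an `IllegalStateException`. The class and method names are hypothetical.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class AlignmentDemo {
  static final VarHandle LONG =
      MethodHandles.byteBufferViewVarHandle(long[].class, ByteOrder.nativeOrder());

  // alignedSlice(8) guarantees that index 0 of the slice is 8-byte aligned
  private static ByteBuffer alignedBuffer() {
    return ByteBuffer.allocateDirect(64).alignedSlice(Long.BYTES);
  }

  static boolean volatileWriteAccepted(int index) {
    try {
      LONG.setVolatile(alignedBuffer(), index, 0L);
      return true;
    } catch (IllegalStateException e) {  // misaligned access
      return false;
    }
  }

  static boolean plainWriteAccepted(int index) {
    try {
      LONG.set(alignedBuffer(), index, 0L);
      return true;
    } catch (IllegalStateException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(volatileWriteAccepted(8));  // aligned, prints true
    System.out.println(volatileWriteAccepted(3));  // misaligned, prints false
    System.out.println(plainWriteAccepted(3));     // misaligned but plain, prints true
  }
}
```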

## VarHandle Addressing mode
- using a direct mode
  `handle.get(MemoryAddress)`


In [14]:
var intHandle = MemoryHandles.varHandle(int.class, nativeOrder);
System.out.println(intHandle);

VarHandle[varType=int, coord=[interface jdk.internal.access.foreign.MemoryAddressProxy]]


- using an offset (access a member of a struct)


In [15]:
var intHandle = MemoryHandles.varHandle(int.class, nativeOrder);
var structHandle = MemoryHandles.withOffset(intHandle, 8);
System.out.println(structHandle);

VarHandle[varType=int, coord=[interface jdk.internal.access.foreign.MemoryAddressProxy]]


## VarHandle Addressing mode (2)
- using an array mode:
  `handle.get(MemoryAddress, long)`


In [17]:
var intHandle = MemoryHandles.varHandle(int.class, nativeOrder);
var intArrayHandle = MemoryHandles.withStride(intHandle, 4);
System.out.println(intArrayHandle);

VarHandle[varType=int, coord=[interface jdk.internal.access.foreign.MemoryAddressProxy, long]]


## Get/set an array of `int`s using an array handle


In [18]:
try (var segment = MemorySegment.allocateNative(8192)) {
  var base = segment.baseAddress().addOffset(32);
  for (var i = 0 ; i < 128 ; i++) {
    intArrayHandle.set(base, i, 42);
  }
  System.out.println(intArrayHandle.get(base, 64));
}

42


## Get/set an array of `int`s using a direct handle
This may be slower because the _stride_ computation is not hoisted
out of the loop


In [None]:
try (var segment = MemorySegment.allocateNative(8192)) {
  var base = segment.baseAddress().addOffset(32);
  for (var i = 0 ; i < 128 ; i++) {
    intHandle.set(base.addOffset(i * 4), 42);
  }
  System.out.println(intHandle.get(base.addOffset(64 * 4)));
}
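
Both variants end up accessing the same byte offsets, `base + index * stride`; they only differ in who does the multiplication. A sketch of that arithmetic with a `ByteBuffer` view handle from `java.base` (the class `StrideDemo` and its methods are hypothetical names):

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class StrideDemo {
  static final VarHandle INT =
      MethodHandles.byteBufferViewVarHandle(int[].class, ByteOrder.nativeOrder());

  // What an array-mode handle computes from its (base, index) coordinates
  static int byteIndex(int base, int index, int stride) {
    return base + index * stride;
  }

  // Stores i in the i-th int after byte offset 32, then reads one of them back
  static int fillAndRead(int readIndex) {
    var buffer = ByteBuffer.allocateDirect(8192);
    for (var i = 0; i < 128; i++) {
      INT.set(buffer, byteIndex(32, i, Integer.BYTES), i);
    }
    return (int) INT.get(buffer, byteIndex(32, readIndex, Integer.BYTES));
  }

  public static void main(String[] args) {
    System.out.println(byteIndex(32, 64, Integer.BYTES));  // prints 288
    System.out.println(fillAndRead(64));                   // prints 64
  }
}
```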

## MemoryAddress and VarHandle
- MemoryAddress is an offset in the segment
- VarHandle specifies the addressing mode


# MemoryLayout


## Example of MemoryLayout
Represents a C memory layout
```c
struct {
  int a;
  char b[12];
}
```
- __value__ (number of bits + order)
- __group__ (struct or union)
- __sequence__ (array, sized or free)


## Example of MemoryLayout (2)
```c
struct {
  int a;
  char b[12];
}
```


In [None]:
import static jdk.incubator.foreign.MemoryLayout.*;
var layout1 = ofStruct(
    ofValueBits(32, nativeOrder).withName("a"),
    ofSequence(12,
        ofValueBits(8, nativeOrder)
    ).withName("b")
).withBitAlignment(32);
System.out.println(layout1);

## Another example of MemoryLayout
```c
struct {
  double x;
  double y;
} []
```


In [None]:
var layout2 = ofSequence(
    ofStruct(
        ofValueBits(64, nativeOrder).withName("x"),
        ofValueBits(64, nativeOrder).withName("y")
    ).withBitAlignment(64)
);
System.out.println(layout2);

## Operations on a MemoryLayout


`withElementCount()` changes the dimension of a sequence


In [None]:
System.out.println(layout2);
System.out.println(layout2.withElementCount(1024));

## Operations on a MemoryLayout (2)


Use `PathElement.groupElement()` to locate a field inside a struct and
`PathElement.sequenceElement()` to locate an item inside a sequence.


In [None]:
import static jdk.incubator.foreign.MemoryLayout.PathElement.groupElement;
import static jdk.incubator.foreign.MemoryLayout.PathElement.sequenceElement;

`map()` rewrites a field or an array


In [None]:
System.out.println(layout1);
var layout3 = layout1.map(l -> ((SequenceLayout)l).withElementCount(1024), groupElement("b"));
System.out.println(layout3);

# From a MemoryLayout


## VarHandle from a MemoryLayout
Create a VarHandle that accesses a field of a struct


In [None]:
System.out.println(layout1);
var aHandle = layout1.varHandle(int.class, groupElement("a"));
System.out.println(aHandle);

If the primitive type is not compatible with the size of the field, creating the VarHandle fails


In [None]:
var aHandle = layout1.varHandle(long.class, groupElement("a"));
System.out.println(aHandle);

## VarHandle from MemoryLayout (2)
Use `PathElement.sequenceElement()` to locate the elements of a sequence inside a layout.
The VarHandle has a supplementary index parameter when the path leaves a dimension of the array free


In [None]:
System.out.println(layout1);
var bHandle = layout1.varHandle(byte.class, groupElement("b"), sequenceElement());
System.out.println(bHandle);

In [None]:
System.out.println(layout2);
var xHandle = layout2.varHandle(double.class, sequenceElement(), groupElement("x"));
System.out.println(xHandle);

## A MemorySegment from a MemoryLayout
You can ask for a segment of the right size from a layout


In [None]:
var aHandle = layout1.varHandle(int.class, groupElement("a"));
var bHandle = layout1.varHandle(byte.class, groupElement("b"), sequenceElement());
try (var segment = MemorySegment.allocateNative(layout1)) {
  var base = segment.baseAddress();
  aHandle.set(base, 42);
  bHandle.set(base, 7, (byte)42);
  System.out.println(aHandle.get(base) + " " + bHandle.get(base, 7));
}
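
As a point of comparison, this is roughly what the layout spares you: a sketch of the same two accesses with a plain `ByteBuffer`, where the offsets of `a` and `b` are computed by hand. The class `StructDemo` and its constants are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class StructDemo {
  // struct { int a; char b[12]; }: a at byte offset 0, b at byte offset 4, 16 bytes in total
  static final int A_OFFSET = 0;
  static final int B_OFFSET = Integer.BYTES;
  static final int STRUCT_SIZE = B_OFFSET + 12;

  static String writeAndRead() {
    var buffer = ByteBuffer.allocateDirect(STRUCT_SIZE).order(ByteOrder.nativeOrder());
    buffer.putInt(A_OFFSET, 42);          // plays the role of aHandle.set(base, 42)
    buffer.put(B_OFFSET + 7, (byte) 42);  // plays the role of bHandle.set(base, 7, (byte) 42)
    return buffer.getInt(A_OFFSET) + " " + buffer.get(B_OFFSET + 7);
  }

  public static void main(String[] args) {
    System.out.println(writeAndRead());   // prints 42 42
  }
}
```

With a layout, the offsets and the total size are derived from the description of the struct instead of being maintained as constants.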

# Performance


## Perf: Read 8192 bytes as ints
Loop only, constant memory


with an `intHandle`
```java
var sum = 0;
for (var i = 0; i < 1024; i++) {
  sum += (int)INT_HANDLE.get(BASE.addOffset(i * 4));
}
blackhole.consume(sum);
```


## Perf: Read 8192 bytes as ints
Loop only, constant memory


with an `intArrayHandle`
```java
var sum = 0;
for (var i = 0; i < 1024; i++) {
  sum += (int)INT_ARRAY_HANDLE.get(BASE, (long) i);
}
blackhole.consume(sum);
```


## Perf: Read 8192 bytes as ints
Loop only, constant memory


| Benchmark             | Score   | Error    | Units |
| --------------------- | ------- | -------- | ----- |
|bytebuffer             | 245.003 | ±  7.009 | ns/op |
|segment_intArrayHandle | 246.747 | ±  7.306 | ns/op |
|segment_intHandle      | 627.923 | ±  1.115 | ns/op |
|unsafe                 | 240.544 | ±  9.917 | ns/op |


## Perf: Read 8192 bytes as ints (2)
Creation + loop


with an `intArrayHandle`
```java
try(var segment = MemorySegment.allocateNative(8192)) {
  var base = segment.baseAddress();
  var sum = 0;
  for (var i = 0; i < 1024; i++) {
    sum += (int)INT_ARRAY_HANDLE.get(base, (long) i);
  }
  blackhole.consume(sum);
}
```


## Perf: Read 8192 bytes as ints (2)
Creation + loop


| Benchmark             | Score   | Error    | Units |
| --------------------- | ------- | -------- | ----- |
|bytebuffer             | 672.081 | ± 19.294 | ns/op |
|segment_intArrayHandle | 620.525 | ± 10.574 | ns/op |
|segment_intHandle      | 869.473 | ±  6.158 | ns/op |
|unsafe_clean           | 594.088 | ± 23.806 | ns/op |
|unsafe_noclean         | 322.504 | ±  0.473 | ns/op |


## Perf: Write 8192 bytes as ints
Loop only, constant memory


with an `intArrayHandle`
```java
for (var i = 0; i < 1024; i++) {
  INT_ARRAY_HANDLE.set(BASE, (long) i, 42);
}
```


## Perf: Write 8192 bytes as ints
Loop only, constant memory


| Benchmark             | Score   | Error    | Units |
| --------------------- | ------- | -------- | ----- |
|bytebuffer             |  37.467 | ±  0.323 | ns/op |
|segment_intArrayHandle |  32.235 | ±  0.276 | ns/op |
|segment_intHandle      | 544.011 | ± 17.242 | ns/op |
|unsafe                 | 249.486 | ±  7.173 | ns/op |


## Perf: Write 8192 bytes as ints (2)
Creation + loop


```java
try(var segment = MemorySegment.allocateNative(8192)) {
  var base = segment.baseAddress();
  for (var i = 0; i < 1024; i++) {
    INT_ARRAY_HANDLE.set(base, (long) i, 42);
  }
}
```


## Perf: Write 8192 bytes as ints (2)
Creation + loop


| Benchmark             |  Score  | Error    | Units |
| --------------------- | ------- | -------- | ----- |
|bytebuffer             | 473.377 | ±  2.261 | ns/op |
|segment_intArrayHandle | 414.514 | ±  1.010 | ns/op |
|segment_intHandle      | 805.126 | ± 21.460 | ns/op |
|unsafe_clean           | 598.735 | ± 23.234 | ns/op |
|unsafe_noclean         | 328.945 | ±  1.568 | ns/op |


# Missing methods ??


## Provide elementCounts for the sequences after having created the MemoryLayout


In [None]:
MemoryLayout withElementCounts(MemoryLayout layout, Iterator<Integer> counts) {
  if (!counts.hasNext()) {
    return layout;
  }
  if (layout instanceof SequenceLayout seq) {
    // orElseGet() consumes a count only when the sequence has no size yet
    var elementCount = seq.elementCount().orElseGet(counts::next);
    var result = ofSequence(elementCount, withElementCounts(seq.elementLayout(), counts)).withBitAlignment(seq.bitAlignment());
    return seq.name().map(result::withName).orElse(result);
  }
  if (layout instanceof GroupLayout group) {
    var result = ofStruct(group.memberLayouts().stream().map(l -> withElementCounts(l, counts)).toArray(MemoryLayout[]::new)).withBitAlignment(group.bitAlignment());
    return group.name().map(result::withName).orElse(result);
  }
  return layout;
}
MemoryLayout withElementCounts(MemoryLayout layout, int... counts) {
  return withElementCounts(layout, Arrays.stream(counts).boxed().iterator());
}

## How to use it ?


In [None]:
var partialArrayLayout = ofSequence(ofSequence(
    ofStruct(
        ofValueBits(64, nativeOrder).withName("x"),
        ofValueBits(64, nativeOrder).withName("y")
    ).withBitAlignment(64)
));
System.out.println(partialArrayLayout);

In [None]:
var matrix8x8Layout = withElementCounts(partialArrayLayout, 8, 8);
System.out.println(matrix8x8Layout);

## Provide an API entry point to create a MemoryLayout from the String representation instead of using a Constable
One of these formats is lighter than the other, isn't it?


In [None]:
System.out.println(partialArrayLayout);
System.out.println(partialArrayLayout.describeConstable());