Change name from Memory Package to Memory Component.
Fix other misc spelling, etc.
leerho committed Sep 21, 2021
1 parent 9afd5f2 commit 48f580945843229221afe0e864993ddc55e870b3
Showing 9 changed files with 33 additions and 38 deletions.
@@ -43,7 +43,7 @@ layout: doc_page

### Built-In, General Purpose Functions

* General purpose [Memory Package]({{site.docs_dir}}/Memory/MemoryPackage.html) for managing data off the Java Heap.
* General purpose [Memory Component]({{site.docs_dir}}/Memory/MemoryComponent.html) for managing data off the Java Heap.
This gives systems designers the ability to manage their own large data heaps with
dedicated processor threads that would otherwise put undue pressure on the Java heap and
its garbage collection.
@@ -62,9 +62,9 @@ particularly important when processing sensitive user identifiers. Available wit
* [Pre-Sampling]({{site.docs_dir}}/Theta/ThetaPSampling.html). Built-in up-front sampling for cases where additional
control is required to limit overall memory consumption when dealing with millions of sketches. Available with Theta Sketches.

* [Memory Package]({{site.docs_dir}}/Memory/MemoryPackage.html).
* [Memory Component]({{site.docs_dir}}/Memory/MemoryComponent.html).
Large query systems often require their own heaps outside the JVM in order to better manage garbage collection latencies.
The Java sketches utilize this powerful package.
The Java sketches utilize this powerful component.

* Built-in <b>Upper-Bound and Lower-Bound estimators</b>.
You are never in the dark about how good of an estimate the sketch is providing.
@@ -19,24 +19,24 @@ layout: doc_page
specific language governing permissions and limitations
under the License.
-->
## Memory Package
## Memory Component

### Introduction

The primary objective for the _Memory Package_ is to allow high-performance read-write access to Java "off-heap" memory
(also referred to as _direct_, or _native_ memory). However, as documented below, this package has a rich set of other
The primary objective for the _Memory Component_ is to allow high-performance read-write access to Java "off-heap" memory
(also referred to as _direct_, or _native_ memory). However, as documented below, this component has a rich set of other
capabilities as well.

#### Versioning
The _DataSketches_ memory package has its own repository and is released with its own jars in _Maven Central_
The _DataSketches_ memory component has its own repository and is released with its own jars in _Maven Central_
(groupId=org.apache.datasketches, artifactId=datasketches-memory).
This document applies to the memory package versions 0.10.0 and after.
This document applies to the memory component versions 0.10.0 and after.

#### Naming Conventions
To avoid confusion in the documentation the capitalized _Memory_ refers to the code in the
Java _org.apache.datasketches.memory_ package, and the uncapitalized _memory_ refers to computer memory in general.
There is also a class _org.apache.datasketches.memory.Memory_ that should not be confused with the _org.apache.datasketches.memory_ package.
In the text, sometimes _Memory_ refers to the entire package and sometimes to the specific class,
Java _org.apache.datasketches.memory_ component, and the uncapitalized _memory_ refers to computer memory in general.
There is also a class _org.apache.datasketches.memory.Memory_ that should not be confused with the _org.apache.datasketches.memory_ component.
In the text, sometimes _Memory_ refers to the entire component and sometimes to the specific class,
but it should be clear from the context.


@@ -76,42 +76,42 @@ replaces this C++ code with assembly language instructions called "intrinsics",
can be just a single CPU instruction. This results in superior runtime performance that is
very close to what could be achieved if the application was written in C++.

The _Memory_ package is essentially an extension of _Unsafe_ and wraps most of the
The _Memory_ component is essentially an extension of _Unsafe_ and wraps most of the
primitive get and put methods and a few specialized methods into a convenient API
organized around an allocated block of native memory.

The only "official" alternative available to systems developers is to use the Java _ByteBuffer_ class
that also allows access to off-heap memory. However, the _ByteBuffer_ API is extremely limited
and contains serious defects in its design and traps that many users of the _ByteBuffer_ class unwittingly
fall into, which results in corrupted data. This _Memory Package_ has been designed to be a
fall into, which results in corrupted data. This _Memory Component_ has been designed to be a
replacement for the _ByteBuffer_ class.
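
The text above does not say which _ByteBuffer_ traps it has in mind. As one illustration (an assumption on our part, not necessarily the case the authors meant), the JDK documents that `duplicate()` and `slice()` return a view whose byte order is reset to big-endian regardless of the order set on the original buffer. The class and method names in this minimal sketch are hypothetical:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; it uses only java.nio and is not part of the Memory component.
public class ByteBufferOrderTrap {

    // Writes an int in little-endian order, then reads it back through a
    // duplicate() view. The view does not inherit the byte order, so the
    // bytes are interpreted as big-endian and the value comes back reversed.
    static int readThroughDuplicate(int value) {
        ByteBuffer buf = ByteBuffer.allocateDirect(4).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(0, value);
        ByteBuffer view = buf.duplicate(); // byte order is now BIG_ENDIAN
        return view.getInt(0);
    }

    // Same as above, but the byte order is copied onto the view explicitly.
    static int readThroughDuplicateFixed(int value) {
        ByteBuffer buf = ByteBuffer.allocateDirect(4).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(0, value);
        ByteBuffer view = buf.duplicate().order(buf.order());
        return view.getInt(0);
    }

    public static void main(String[] args) {
        System.out.printf("naive: 0x%08X%n", readThroughDuplicate(0x01020304));
        System.out.printf("fixed: 0x%08X%n", readThroughDuplicateFixed(0x01020304));
    }
}
```

No exception is thrown in the naive case; the value is simply wrong, which is how this kind of mistake can end up as corrupted data.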

Using the _Memory_ package cannot be taken lightly, as the systems developer must now be
Using the _Memory_ component cannot be taken lightly, as the systems developer must now be
aware of the importance of memory allocation and deallocation and make sure these resources
are managed properly. To the extent possible, this _Memory Package_ has been designed leveraging Java's own
are managed properly. To the extent possible, this _Memory Component_ has been designed leveraging Java's own
_AutoCloseable_ and _Cleaner_, and it also tracks when allocated memory has been freed and provides safety checks
against the dreaded "use-after-free" case.
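
The general idea of pairing _AutoCloseable_ with a validity check can be sketched as follows. This is not the Memory component's implementation: `TrackedBuffer` and its methods are hypothetical names, and a direct _ByteBuffer_ stands in for the native allocation, so "freeing" here only drops the reference and marks the handle invalid.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of a use-after-free guard; not Memory component code.
public class UseAfterFreeSketch {

    // A handle that refuses all access once it has been closed.
    static final class TrackedBuffer implements AutoCloseable {
        private ByteBuffer buf;
        private boolean valid = true;

        TrackedBuffer(int capacityBytes) {
            buf = ByteBuffer.allocateDirect(capacityBytes);
        }

        private void checkValid() {
            if (!valid) {
                throw new IllegalStateException("resource was already closed");
            }
        }

        long getLong(int offsetBytes) {
            checkValid();
            return buf.getLong(offsetBytes);
        }

        void putLong(int offsetBytes, long value) {
            checkValid();
            buf.putLong(offsetBytes, value);
        }

        @Override
        public void close() {
            valid = false; // every later access fails fast
            buf = null;    // drop the reference to the underlying memory
        }
    }

    // Normal use: the resource is closed automatically at the end of the try block.
    static long roundTrip(long value) {
        try (TrackedBuffer tb = new TrackedBuffer(16)) {
            tb.putLong(8, value);
            return tb.getLong(8);
        }
    }

    // Misuse: a reference escapes the try block and is used after close().
    static boolean detectsUseAfterClose() {
        TrackedBuffer leaked;
        try (TrackedBuffer tb = new TrackedBuffer(16)) {
            leaked = tb;
        }
        try {
            leaked.getLong(0);
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }
}
```

The check costs a branch on every access, which is the kind of trade-off a performance-sensitive design has to weigh against the safety it buys.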

### Architecture
The Memory package is designed around two major types of entities:
The Memory component is designed around two major types of entities:

* _Resources_: A _Resource_ is a collection of consecutive bytes.
* _APIs_: An API is a programming interface for reading and writing to a resource.

#### Resources
The _Memory_ package defines 4 _Resources_, which at their most basic level can be viewed as a collection of consecutive bytes.
The _Memory_ component defines 4 _Resources_, which at their most basic level can be viewed as a collection of consecutive bytes.

* Primitive on-heap arrays: _boolean[], byte[], char[], short[], int[], long[], float[], double[]_.
* Java _ByteBuffers_.
* Off-heap memory. Also called "native" or "direct" memory.
* Memory-mapped files.

It should be noted at the outset that the off-heap memory and the memory-mapped file resources require special handling with respect to allocation and deallocation.
The _Memory Package_ has been designed to access these resources leveraging the Java _AutoCloseable_ interface and the Java internal _Cleaner_ class,
The _Memory Component_ has been designed to access these resources leveraging the Java _AutoCloseable_ interface and the Java internal _Cleaner_ class,
which also provides the JVM with mechanisms for tracking overall use of off-heap memory.

#### APIs
The _Memory_ package defines 5 principal APIs for accessing the above resources.
The _Memory_ component defines 5 principal APIs for accessing the above resources.

* _Memory_: Read-only access using byte offsets from the start of the resource.
* _WritableMemory_: Read/write access using byte offsets from the start of the resource.
@@ -128,7 +128,7 @@ These 5 principal APIs and the four Resources are then multiplexed into 32 API/R
* Little-Endian versus Big-Endian APIs for multibyte primitives

#### Design Goals
These are the major design goals for the _Memory Package_.
These are the major design goals for the _Memory Component_.

* __Common API__. The APIs should be agnostic to the chosen resource, with only a few minor exceptions.
* __Performance is critical__. The architecture has been specifically designed to eliminate unnecessary object and interface redirection.
@@ -151,7 +151,7 @@ These are the major design goals for the _Memory Package_.


#### Diagram of the Core Hierarchy
This includes both package-private classes as well as public classes, but should help the user understand the inner workings of the _Memory Package_.
This includes both package-private classes as well as public classes, but should help the user understand the inner workings of the _Memory Component_.

<img class="doc-img-full" src="{{site.docs_img_dir}}/memory/CoreHierarchy.png" alt="CoreHierarchy.png" />

@@ -67,7 +67,7 @@ This means that the plotted x values form an exponential series of the form <i>2
<img src="{{site.docs_img_dir}}/theta/ErrorSurface2.png" alt="ErrorSurface2" width="150px" />. The cross-sectional slice of this surface is approximately Gaussian like this graph
<img src="{{site.docs_img_dir}}/theta/400px-StandardNormalCurve.png" alt="400px-StandardNormalCurve" width="200px" />,
which has the +/- 1, 2 and 3 standard deviation points on the x-axis marked and the corresponding areas under the curve that represent the associated confidence levels.
* All the pitchfork graphs in this section were generated using the utilities located in the <i>sketches-misc</i> repository, performance package.
* All the pitchfork graphs in this section were generated using the utilities located in the <i>datasketches-characterization</i> repository.

The specifics of the above pitchfork graph:

@@ -63,7 +63,7 @@ Once we have our 30 day sketches, we merge all 30 sketches together into one fin

## The IntegerSketch and Helper classes

To help us code our example we will leverage the [IntegerSketch Package](https://github.com/apache/datasketches-java/tree/master/src/main/java/org/apache/datasketches/tuple/aninteger) from the library. This package consists of 5 classes, the _IntegerSketch_ and 4 helper classes, all of which extend generic classes of the parent _tuple_ package. Normally, the user/developer would develop these 5 classes to solve a particular analysis problem. These 5 classes can serve as an example of how to create your own Tuple Sketch solutions and we will use them to solve our customer engagement problem.
To help us code our example we will leverage the [IntegerSketch package](https://github.com/apache/datasketches-java/tree/master/src/main/java/org/apache/datasketches/tuple/aninteger) from the library. This package consists of 5 classes, the _IntegerSketch_ and 4 helper classes, all of which extend generic classes of the parent _tuple_ package. Normally, the user/developer would develop these 5 classes to solve a particular analysis problem. These 5 classes can serve as an example of how to create your own Tuple Sketch solutions and we will use them to solve our customer engagement problem.

Please refer to the [Tuple Overview](/docs/Tuple/TupleOverview.html) section on this website for a quick review of how the Tuple Sketch works.

15 pom.xml
@@ -89,10 +89,10 @@ under the License.
<!-- END:UNIQUE FOR THIS JAVA COMPONENT -->

<!-- Test -->
<testng.version>7.1.0</testng.version>
<testng.version>7.4.0</testng.version>

<!-- System-wide properties -->
<maven.version>3.0.0</maven.version>
<maven.version>3.5.0</maven.version>
<java.version>1.8</java.version>
<maven.compiler.source>${java.version}</maven.compiler.source>
<maven.compiler.target>${java.version}</maven.compiler.target>
@@ -124,7 +124,7 @@ under the License.
<coveralls-maven-plugin.version>4.3.0</coveralls-maven-plugin.version>
<!-- other -->
<lifecycle-mapping.version>1.0.0</lifecycle-mapping.version>
<git-commit-id-plugin.version>3.0.0</git-commit-id-plugin.version>
<git-commit-id-plugin.version>4.0.4</git-commit-id-plugin.version>
</properties>

<repositories>
@@ -161,12 +161,11 @@ under the License.
</dependency>
<!-- END: UNIQUE FOR THIS JAVA COMPONENT -->

<!-- Test Scope -->
<dependency>
<groupId>org.testng</groupId>
<artifactId>testng</artifactId>
<version>${testng.version}</version>
<scope>test</scope>

</dependency>
</dependencies>

@@ -179,10 +178,6 @@ under the License.
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-deploy-plugin</artifactId>
<version>${maven-deploy-plugin.version}</version>
<configuration>
<updateReleaseInfo>true</updateReleaseInfo>
<!-- see maven-install-plugin -->
</configuration>
</plugin>
<plugin>
<!-- Apache Parent pom, pluginManagement-->
@@ -546,7 +541,6 @@ under the License.
</execution>
</executions>
<configuration>
<injectAllReactorProjects>true</injectAllReactorProjects>
<archive>
<manifest>
<addDefaultEntries>false</addDefaultEntries>
@@ -558,6 +552,7 @@ under the License.
<Build-OS>${os.name} ${os.arch} ${os.version}</Build-OS>
<Implementation-Vendor>The Apache Software Foundation</Implementation-Vendor>
<GroupId-ArtifactId>${project.groupId}:${project.artifactId}</GroupId-ArtifactId>
<!-- these properties are generated by the git-commit-id-plugin during initialize -->
<git-branch>${git.branch}</git-branch>
<git-commit-id>${git.commit.id.full}</git-commit-id>
<git-commit-time>${git.commit.time}</git-commit-time>
@@ -30,7 +30,7 @@
*
* @author Lee Rhodes
*/
@SuppressWarnings("javadoc")
//@SuppressWarnings("javadoc")
public class ByteArrayBuilder {
private byte[] arr_;
private int count_ = 0;
@@ -47,7 +47,7 @@
*
* @author Lee Rhodes
*/
@SuppressWarnings("javadoc")
//@SuppressWarnings("javadoc")
public final class Files {
private static final String LS = System.getProperty("line.separator");
private static final byte CR = 0xD;
@@ -19,10 +19,10 @@
{"class":"Doc", "desc" : "Components", "dir" : "Architecture", "file": "Components" },
{"class":"Doc", "desc" : "Sketches by Component", "dir" : "Architecture", "file": "SketchesByComponent" },
{"class":"Doc", "desc" : "Sketch Criteria", "dir" : "Architecture", "file": "SketchCriteria" },
{ "class":"Dropdown", "desc" : "Memory Package", "array":
{ "class":"Dropdown", "desc" : "Memory Component", "array":
[
{"class":"Doc", "desc" : "Memory Package", "dir" : "Memory", "file": "MemoryPackage" },
{"class":"Doc", "desc" : "Memory Package Performance", "dir" : "Memory", "file": "MemoryPerformance" },
{"class":"Doc", "desc" : "Memory Component", "dir" : "Memory", "file": "MemoryComponent" },
{"class":"Doc", "desc" : "Memory Component Performance", "dir" : "Memory", "file": "MemoryPerformance" },
]
},
{"class":"Doc", "desc" : "Notes on Order Sensitivity", "dir" : "Architecture", "file": "OrderSensitivity" },
