<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Implementing-SQL-Operations:-Select" data-toc-modified-id="Implementing-SQL-Operations:-Select-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Implementing SQL Operations: Select</a></span><ul class="toc-item"><li><span><a href="#Introduction" data-toc-modified-id="Introduction-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Introduction</a></span></li><li><span><a href="#Prerequisites" data-toc-modified-id="Prerequisites-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Prerequisites</a></span></li><li><span><a href="#Initialization" data-toc-modified-id="Initialization-1.3"><span class="toc-item-num">1.3&nbsp;&nbsp;</span>Initialization</a></span><ul class="toc-item"><li><span><a href="#Ensure-database-is-running" data-toc-modified-id="Ensure-database-is-running-1.3.1"><span class="toc-item-num">1.3.1&nbsp;&nbsp;</span>Ensure database is running</a></span></li><li><span><a href="#Download-and-install-additional-components." 
data-toc-modified-id="Download-and-install-additional-components.-1.3.2"><span class="toc-item-num">1.3.2&nbsp;&nbsp;</span>Download and install additional components.</a></span></li></ul></li><li><span><a href="#Connect-to-database-and-populate-test-data" data-toc-modified-id="Connect-to-database-and-populate-test-data-1.4"><span class="toc-item-num">1.4&nbsp;&nbsp;</span>Connect to database and populate test data</a></span><ul class="toc-item"><li><span><a href="#Create-a-secondary-index" data-toc-modified-id="Create-a-secondary-index-1.4.1"><span class="toc-item-num">1.4.1&nbsp;&nbsp;</span>Create a secondary index</a></span></li></ul></li></ul></li><li><span><a href="#Mapping-Components-of-Select-Statement" data-toc-modified-id="Mapping-Components-of-Select-Statement-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Mapping Components of Select Statement</a></span></li><li><span><a href="#Single-Record-Get" data-toc-modified-id="Single-Record-Get-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Single Record Get</a></span></li><li><span><a href="#Batch-Retrieval" data-toc-modified-id="Batch-Retrieval-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Batch Retrieval</a></span></li><li><span><a href="#Predicate-Based-Retrieval" data-toc-modified-id="Predicate-Based-Retrieval-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Predicate Based Retrieval</a></span><ul class="toc-item"><li><span><a href="#Query-Based-on-Index" data-toc-modified-id="Query-Based-on-Index-5.1"><span class="toc-item-num">5.1&nbsp;&nbsp;</span>Query Based on Index</a></span></li><li><span><a href="#Query-Based-on-Expression-Filter" data-toc-modified-id="Query-Based-on-Expression-Filter-5.2"><span class="toc-item-num">5.2&nbsp;&nbsp;</span>Query Based on Expression Filter</a></span></li><li><span><a href="#Scan-Based-on-Expression-Filter" data-toc-modified-id="Scan-Based-on-Expression-Filter-5.3"><span class="toc-item-num">5.3&nbsp;&nbsp;</span>Scan Based on Expression 
Filter</a></span></li></ul></li><li><span><a href="#Computed-Fields-with-Server-Function" data-toc-modified-id="Computed-Fields-with-Server-Function-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Computed Fields with Server Function</a></span></li><li><span><a href="#Takeaways-and-Conclusion" data-toc-modified-id="Takeaways-and-Conclusion-7"><span class="toc-item-num">7&nbsp;&nbsp;</span>Takeaways and Conclusion</a></span></li><li><span><a href="#Clean-up" data-toc-modified-id="Clean-up-8"><span class="toc-item-num">8&nbsp;&nbsp;</span>Clean up</a></span></li><li><span><a href="#Further-Exploration-and-Resources" data-toc-modified-id="Further-Exploration-and-Resources-9"><span class="toc-item-num">9&nbsp;&nbsp;</span>Further Exploration and Resources</a></span><ul class="toc-item"><li><span><a href="#Next-steps" data-toc-modified-id="Next-steps-9.1"><span class="toc-item-num">9.1&nbsp;&nbsp;</span>Next steps</a></span></li></ul></li></ul></div>

# Implementing SQL Operations: Select
This tutorial describes how to implement certain SQL Select statements in Aerospike.

This notebook requires the Aerospike database running on localhost, and that Python and the Aerospike Python client have been installed (`pip install aerospike`). Visit the [Aerospike notebooks repo](https://github.com/aerospike-examples/interactive-notebooks) for additional details and the Docker container.

## Introduction
In this notebook, we will see how specific Select statements in SQL can be implemented in Aerospike. 

SQL is a widely known data access language. If you have used SQL in the past, the examples in this notebook will make it easy to translate specific SQL Select statements. 

This notebook is the first in the SQL Operations series that consists of the following topics:
- Implementing SQL Operations: Select
- Implementing SQL Operations: Update
- Implementing SQL Operations: Create and Delete
- Implementing SQL Operations: Aggregates

The specific topics and SQL Select statements we will discuss include:
- Columns, tables, and predicates
- Single record: Select binNames from namespace.set where id = key
- Batch retrieval: Select binNames from namespace.set where id in key-list
- Predicate based retrieval: Select binNames from namespace.set where predicate
    - Index based query
    - Expression based query
    - Expression based scan
- Computed fields: Select function(\*) from namespace.set where id = key

Aerospike provides both synchronous and asynchronous execution modes for many operations. In this notebook, we will mostly use the synchronous execution mode. Asynchronous execution is a topic for a future tutorial.

## Prerequisites
This tutorial assumes familiarity with the following topics:
- [Hello World](hello_world.ipynb)
- [Aerospike Basic Operations](basic_operations.ipynb)

## Initialization

### Ensure database is running
This notebook requires that the Aerospike database is running. 

In [35]:
import io.github.spencerpark.ijava.IJava;
import io.github.spencerpark.jupyter.kernel.magic.common.Shell;
IJava.getKernelInstance().getMagics().registerMagics(Shell.class);
%sh asd

### Download and install additional components.
Install the Java client.

In [36]:
%%loadFromPOM
<dependencies>
  <dependency>
    <groupId>com.aerospike</groupId>
    <artifactId>aerospike-client</artifactId>
    <version>5.0.0</version>
  </dependency>
</dependencies>

## Connect to database and populate test data
The test data has ten records with user-key "id-1" through "id-10" and two bins (fields), "bin1" and "bin2", in the namespace "test", in each of the set "sql-select-small" and the null set, plus 1000 similarly structured records in the set "sql-select-large". 

In [37]:
import com.aerospike.client.AerospikeClient;
import com.aerospike.client.Bin;
import com.aerospike.client.Key;
import com.aerospike.client.policy.ClientPolicy;

AerospikeClient client = new AerospikeClient("localhost", 3000);
System.out.println("Initialized the client and connected to the cluster.");

String Namespace = "test";
String SmallSet = "sql-select-small";
String LargeSet = "sql-select-large";
String NullSet = "";

// ClientPolicy supplies the default write policy used for the puts below.
ClientPolicy policy = new ClientPolicy();
// Ten records in the small set.
for (int i = 1; i <= 10; i++) {
    Key key = new Key(Namespace, SmallSet, "id-"+i);
    Bin bin1 = new Bin("bin1", i);
    Bin bin2 = new Bin("bin2", 1000+i);
    client.put(policy.writePolicyDefault, key, bin1, bin2);
}
// Ten records in the null set.
for (int i = 1; i <= 10; i++) {
    Key key = new Key(Namespace, NullSet, "id-"+i);
    Bin bin1 = new Bin("bin1", i);
    Bin bin2 = new Bin("bin2", 1000+i);
    client.put(policy.writePolicyDefault, key, bin1, bin2);
}
// One thousand records in the large set.
for (int i = 1; i <= 1000; i++) {
    Key key = new Key(Namespace, LargeSet, "id-"+i);
    Bin bin1 = new Bin("bin1", i);
    Bin bin2 = new Bin("bin2", 1000+i);
    client.put(policy.writePolicyDefault, key, bin1, bin2);
}

System.out.println("Test data populated.");

Initialized the client and connected to the cluster.
Test data populated.

### Create a secondary index
To use the query API with an index predicate, a secondary index must exist on the queried bin. We create a numeric index on "bin1" in the "sql-select-small" set.

In [38]:
import com.aerospike.client.policy.Policy;
import com.aerospike.client.query.IndexType;
import com.aerospike.client.task.IndexTask;
import com.aerospike.client.AerospikeException;
import com.aerospike.client.ResultCode;

String IndexName = "test_small_bin1_number_idx";

Policy policy = new Policy();
policy.socketTimeout = 0; // Do not timeout on index create.

try {
    IndexTask task = client.createIndex(policy, Namespace, SmallSet, IndexName, 
                                        "bin1", IndexType.NUMERIC);
    task.waitTillComplete();
}
catch (AerospikeException ae) {
    if (ae.getResultCode() != ResultCode.INDEX_ALREADY_EXISTS) {
        throw ae;
    }
}

System.out.println(String.format("Created index %s on ns=%s set=%s bin=%s.", 
                                    IndexName, Namespace, SmallSet, "bin1"));

Created index test_small_bin1_number_idx on ns=test set=sql-select-small bin=bin1.

# Mapping Components of Select Statement
**Columns, tables, and predicates**

In Aerospike, a relational database or schema maps to a namespace, a table maps to a set, and a column maps to a bin. Thus, a query "select columns from table where predicate" can be written in Aerospike terminology as "select bins from namespace.set where predicate".

**Record id**

Records are stored in a namespace and organized in sets. Each record is uniquely identified by a key, or id, that consists of a triple: (namespace, set, user-key), where user-key is a unique id within the set. Since the key is always associated with a record, it can be considered metadata or the primary key field, and it is returned in all retrieval APIs.

**A word on Policy**: All APIs take a Policy argument. A policy contains request parameters such as the timeout and maximum retries, as well as operation modifiers such as an expression filter. 
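As a minimal sketch of the policy fields mentioned above, the cell below builds a read policy and uses it for a single get. It reuses `client`, `Namespace`, and `SmallSet` from the setup cell; the timeout and retry values are illustrative assumptions rather than recommendations, and the expression filter assumes a server version that supports expressions.

```java
import com.aerospike.client.Key;
import com.aerospike.client.Record;
import com.aerospike.client.exp.Exp;
import com.aerospike.client.policy.Policy;

// Request parameters (illustrative values only).
Policy readPolicy = new Policy();
readPolicy.socketTimeout = 500;   // socket idle timeout per attempt, in milliseconds
readPolicy.totalTimeout = 1000;   // limit for the whole request, in milliseconds
readPolicy.maxRetries = 2;        // retries after the initial attempt

// Operation modifier: the read applies only if bin1 >= 1 on the server.
readPolicy.filterExp = Exp.build(Exp.ge(Exp.intBin("bin1"), Exp.val(1)));

Record filtered = client.get(readPolicy, new Key(Namespace, SmallSet, "id-1"));
System.out.println(filtered == null ? "no record returned" : "bins: " + filtered.bins);
```

The same policy object can be passed to several calls, so request settings are typically defined once and reused.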

# Single Record Get
**Select bins from namespace.set where id = key**

Let's start with a simple example of a single record retrieval using its key. You can either get the entire record or specific bins.

Select * from namespace.set where id = key
- Record Client::get(Policy policy, Key key)

Select binNames from namespace.set where id = key
- Record Client::get(Policy policy, Key key, String... binNames)
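
The two forms above might be exercised as in the following sketch, which reuses `client`, `Namespace`, and `SmallSet` from the setup cell and reads the record with user-key "id-3". The variable names are chosen for illustration only.

```java
import com.aerospike.client.Key;
import com.aerospike.client.Record;
import com.aerospike.client.policy.Policy;

Key singleKey = new Key(Namespace, SmallSet, "id-3");
Policy getPolicy = new Policy();

// Select * from test.sql-select-small where id = "id-3"
Record allBins = client.get(getPolicy, singleKey);
System.out.println("All bins: " + allBins.bins);

// Select bin2 from test.sql-select-small where id = "id-3"
Record oneBin = client.get(getPolicy, singleKey, "bin2");
System.out.println("bin2 only: " + oneBin.bins);
```

If no record exists for the key, `get` returns null, so production code should check the result before reading bins.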

# Batch Retrieval
**Select column-list from table where id in key-list**

A batch operation operates on a list of records identified by the keys provided. It works like a single record retrieval, except that multiple keys are specified.

**Select * from namespace.set where id in key-list**
- Record[] get(BatchPolicy policy, Key[] keys)

**Select binNames from namespace.set where id in key-list**
- Record[] get(BatchPolicy policy, Key[] keys, String... binNames)
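
A simple batch read could look like the sketch below, assuming the `client`, `Namespace`, and `SmallSet` variables from the setup cell. It fetches three records in one call; the result array is positional, so `records[i]` corresponds to `keys[i]` and is null when that record does not exist.

```java
import com.aerospike.client.Key;
import com.aerospike.client.Record;
import com.aerospike.client.policy.BatchPolicy;

// Select * from test.sql-select-small where id in ("id-1", "id-2", "id-3")
Key[] keys = new Key[3];
for (int i = 0; i < keys.length; i++) {
    keys[i] = new Key(Namespace, SmallSet, "id-" + (i + 1));
}

Record[] records = client.get(new BatchPolicy(), keys);

// Walk keys and results in step; a null entry means the record was not found.
for (int i = 0; i < records.length; i++) {
    String shown = (records[i] == null) ? "not found" : records[i].bins.toString();
    System.out.println(keys[i].userKey + " -> " + shown);
}
```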

A more general form of batch reads is also available that provides a union of simple batch results. It populates the argument records on return.

**(Select binNames from namespace1.set1 where id in key-list) union (Select binNames from namespace2.set2 where id in key-list) ...**
- void get(BatchPolicy policy, List\<BatchRead\> records)
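
The general form might be used as sketched here: each `BatchRead` names its own key and bin selection, and the call fills in the `record` field of every entry. The sketch reuses `client`, `Namespace`, `SmallSet`, and `LargeSet` from the setup cell; the particular keys are arbitrary choices for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import com.aerospike.client.BatchRead;
import com.aerospike.client.Key;
import com.aerospike.client.policy.BatchPolicy;

List<BatchRead> batchReads = new ArrayList<>();
// Select bin1 from test.sql-select-small where id = "id-1"
batchReads.add(new BatchRead(new Key(Namespace, SmallSet, "id-1"), new String[] {"bin1"}));
// Select * from test.sql-select-large where id = "id-500"
batchReads.add(new BatchRead(new Key(Namespace, LargeSet, "id-500"), true));

// The records are populated in place on the list entries.
client.get(new BatchPolicy(), batchReads);

for (BatchRead br : batchReads) {
    String shown = (br.record == null) ? "not found" : br.record.bins.toString();
    System.out.println(br.key.setName + "/" + br.key.userKey + " -> " + shown);
}
```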


# Predicate Based Retrieval
In these operations, records matching a general predicate (filter) are retrieved.

**Select binNames from namespace.set where predicate**

There are multiple ways of performing such a query in Aerospike.
- Query using an index and/or expression filter
- Scan using an expression filter

To leverage an index, one must use a query operation. A query allows using both an index predicate and an expression filter. An expression filter may be used in place of an index predicate, but it will not perform as well. When using only an expression filter, either a query or a scan may be used. An expression filter is specified in the policy, and it can also filter records in operations other than query and scan. 

## Query Based on Index
RecordSet query(QueryPolicy policy, Statement statement)
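
As a sketch of an index based query, the cell below selects the records whose "bin1" value lies between 3 and 7, relying on the numeric index created earlier on "bin1" in the small set. It reuses `client`, `Namespace`, and `SmallSet`; the range bounds and the `indexMatches` counter are illustrative.

```java
import com.aerospike.client.Record;
import com.aerospike.client.policy.QueryPolicy;
import com.aerospike.client.query.Filter;
import com.aerospike.client.query.RecordSet;
import com.aerospike.client.query.Statement;

// Select bin1, bin2 from test.sql-select-small where bin1 between 3 and 7
Statement stmt = new Statement();
stmt.setNamespace(Namespace);
stmt.setSetName(SmallSet);
stmt.setBinNames("bin1", "bin2");
stmt.setFilter(Filter.range("bin1", 3, 7));  // index predicate, inclusive range

int indexMatches = 0;
RecordSet rs = client.query(new QueryPolicy(), stmt);
try {
    // Results stream back from the server; iterate until exhausted.
    while (rs.next()) {
        Record r = rs.getRecord();
        System.out.println(r.bins);
        indexMatches++;
    }
} finally {
    rs.close();  // always release the record set
}
System.out.println("Matched records: " + indexMatches);
```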

## Query Based on Expression Filter
RecordSet query(QueryPolicy policy, Statement statement)
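
A query driven only by an expression filter might look like the following sketch. No index predicate is set on the statement; instead the condition is placed in the policy's `filterExp`. It reuses `client`, `Namespace`, and `SmallSet`, assumes a server version that supports expressions, and the threshold 1008 is an arbitrary example value.

```java
import com.aerospike.client.exp.Exp;
import com.aerospike.client.policy.QueryPolicy;
import com.aerospike.client.query.RecordSet;
import com.aerospike.client.query.Statement;

// Select bin1, bin2 from test.sql-select-small where bin2 >= 1008
QueryPolicy filterPolicy = new QueryPolicy();
filterPolicy.filterExp = Exp.build(Exp.ge(Exp.intBin("bin2"), Exp.val(1008)));

Statement filterStmt = new Statement();
filterStmt.setNamespace(Namespace);
filterStmt.setSetName(SmallSet);
filterStmt.setBinNames("bin1", "bin2");
// No setFilter() call: the predicate lives entirely in the policy.

int filterMatches = 0;
RecordSet filterRs = client.query(filterPolicy, filterStmt);
try {
    while (filterRs.next()) {
        System.out.println(filterRs.getRecord().bins);
        filterMatches++;
    }
} finally {
    filterRs.close();
}
System.out.println("Matched records: " + filterMatches);
```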

## Scan Based on Expression Filter
The scan operation takes a callback object which is called for every record in the result.

void scanAll(ScanPolicy policy, String namespace, String setName, ScanCallback callback, String... binNames)
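
The scan form could be sketched as below, with the callback written as a lambda and the predicate supplied through the policy's expression filter. It reuses `client`, `Namespace`, and `LargeSet`, and assumes a server version that supports expressions. Because the callback may be invoked from several threads, the illustrative counter is an `AtomicInteger`.

```java
import java.util.concurrent.atomic.AtomicInteger;
import com.aerospike.client.Key;
import com.aerospike.client.Record;
import com.aerospike.client.exp.Exp;
import com.aerospike.client.policy.ScanPolicy;

// Select bin1 from test.sql-select-large where bin1 <= 5
ScanPolicy scanPolicy = new ScanPolicy();
scanPolicy.filterExp = Exp.build(Exp.le(Exp.intBin("bin1"), Exp.val(5)));

AtomicInteger scanMatches = new AtomicInteger();
client.scanAll(scanPolicy, Namespace, LargeSet, (Key k, Record r) -> {
    // Called once for every record that passes the filter.
    scanMatches.incrementAndGet();
    System.out.println(r.bins);
}, "bin1");
System.out.println("Matched records: " + scanMatches.get());
```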

# Computed Fields with Server Function
This is a single record retrieval operation. An arbitrary function registered on the server (a UDF) is invoked on a specified record. The API returns a generic Object, which can be a single value or a collection such as a map.

**Select function(\*) from namespace.set where id = key**
- Object execute(WritePolicy policy, Key key, String packageName, String functionName, Value... functionArgs)
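
To illustrate, the sketch below registers a small Lua module and invokes it on one record. The module name `sum_udf` and the function `sum_bins` are hypothetical names invented for this example; the function returns the sum of "bin1" and "bin2". The cell reuses `client`, `Namespace`, and `SmallSet`, and assumes Lua UDFs are enabled on the server.

```java
import com.aerospike.client.Key;
import com.aerospike.client.Language;
import com.aerospike.client.policy.WritePolicy;
import com.aerospike.client.task.RegisterTask;

// Hypothetical Lua function for this sketch: bin1 + bin2 of the given record.
String udfCode =
    "function sum_bins(rec)\n" +
    "    return rec['bin1'] + rec['bin2']\n" +
    "end\n";

// Register the module as sum_udf.lua and wait until the registration completes.
RegisterTask udfTask = client.registerUdfString(null, udfCode, "sum_udf.lua", Language.LUA);
udfTask.waitTillComplete();

// Select sum_bins(*) from test.sql-select-small where id = "id-3"
Key udfKey = new Key(Namespace, SmallSet, "id-3");
Object udfResult = client.execute(new WritePolicy(), udfKey, "sum_udf", "sum_bins");
System.out.println("sum_bins = " + udfResult);

// Remove the example module so the server is left as it was found.
client.removeUdf(null, "sum_udf.lua");
```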

# Takeaways and Conclusion
Many developers who are familiar with SQL would like to see how SQL operations translate to Aerospike. In this tutorial, we looked at how to implement various Select statements. This should prove generally useful irrespective of SQL knowledge. While the examples here use synchronous execution, most can also be performed asynchronously.

# Clean up
Remove tutorial data and close connection.

In [39]:
client.dropIndex(null, Namespace, SmallSet, IndexName);
client.truncate(null, Namespace, null, null);
client.close();
System.out.println("Removed tutorial data and server connection closed.");

Removed tutorial data and server connection closed.


# Further Exploration and Resources
Here are some links for further exploration

Resources
- Related notebooks
    - [Queries](https://github.com/aerospike/aerospike-dev-notebooks.docker/blob/main/notebooks/python/query.ipynb)
- Developer Hub
    - [Java Developers Resources](https://developer.aerospike.com/java-developers)
- Github repos
    - [Java code examples](https://github.com/aerospike/aerospike-client-java/tree/master/examples/src/com/aerospike/examples)
- Documentation
    - [Java Client](https://www.aerospike.com/docs/client/java/index.html)
    - [Java API Reference](https://www.aerospike.com/apidocs/java/)

## Next steps

Visit [Aerospike notebooks repo](https://github.com/aerospike-examples/interactive-notebooks) to run additional Aerospike notebooks. To run a different notebook, download the notebook from the repo to your local machine, and then click on File->Open, and select Upload.