# Using the Platform's NoSQL (Key-Value) API

The platform's NoSQL (a.k.a. "key-value"/"KV") API provides access to the NoSQL database service, which enables storing and consuming data in a tabular format.
For more information, see the platform's [NoSQL-databases](https://www.iguazio.com/docs/v3.0/data-layer/nosql/) documentation.

## Initialize

In [1]:
import v3io.dataplane

Create a dataplane client:

In [2]:
v3io_client = v3io.dataplane.Client()

> **Note**: You can pass the `endpoint` and `access_key` parameters to the client explicitly.
> The following code is equivalent to using the default values:
>
> ``` python
> from os import getenv
> v3io_client = v3io.dataplane.Client(endpoint='http://v3io-webapi:8081',
>                                     access_key=getenv('V3IO_ACCESS_KEY'))
> ```
>
> When running the code remotely, you can obtain the URL of your cluster by copying the API URL of the web-APIs service (`webapi`) from the **Services** dashboard page. You can select between two types of URLs:
>
> - **HTTPS Direct** (recommended) &mdash; a URL of the format `https://<tenant IP>:<web-APIs port>`; for example, `https://default-tenant.app.mycluster.iguazio.com:8443`.
> - **HTTPS** &mdash; a URL of the format `https://webapi.<tenant IP>`; for example, `https://webapi.default-tenant.app.mycluster.iguazio.com`.
>
> For more information, see the platform's [Data-Service Web-API General Structure](https://www.iguazio.com/docs/v3.0/data-layer/reference/web-apis/data-service-web-api-gen-struct/) documentation.

> **Maximum number of parallel connections**: Another noteworthy parameter is `max_connections`, which defines the maximum number of parallel connections used when performing batch operations.
> If left unspecified, the default is 8 connections.
> For more information, see the [Write Multiple Items](#Write-Multiple-Items) section in this tutorial.
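The client parameters described in the notes above can be gathered in one place before the client is created. The following is a minimal sketch, not part of the SDK: `client_kwargs` and `DEFAULT_ENDPOINT` are hypothetical names introduced here, and the only parameters used are the `endpoint`, `access_key`, and `max_connections` parameters mentioned in this tutorial.

```python
import os

# Default endpoint quoted in the note above.
DEFAULT_ENDPOINT = 'http://v3io-webapi:8081'


def client_kwargs(endpoint=None, max_connections=8, env=None):
    """Hypothetical helper: collect keyword arguments for v3io.dataplane.Client."""
    env = os.environ if env is None else env
    if max_connections < 1:
        raise ValueError('max_connections must be at least 1')
    return {
        'endpoint': endpoint or DEFAULT_ENDPOINT,
        'access_key': env.get('V3IO_ACCESS_KEY'),
        'max_connections': max_connections,
    }


# An explicit environment mapping is passed here so that the sketch runs anywhere.
kwargs = client_kwargs(max_connections=16, env={'V3IO_ACCESS_KEY': 'my-key'})
print(kwargs)
# With the SDK installed: v3io_client = v3io.dataplane.Client(**kwargs)
```

Keeping the arguments in a plain dictionary makes it easy to inspect them, or to switch to a remote HTTPS URL, before the client is constructed.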

### Set the Data Path

All data in the platform is stored in user-defined data containers.
This tutorial uses the predefined "users" container.
For more information, refer to the platform's [data-containers](https://www.iguazio.com/docs/v3.0/data-layer/containers/) documentation.

In [3]:
CONTAINER = 'users'

Set the data path for storing the NoSQL (KV) table:

In [4]:
from os import getenv, path

V3IO_USERNAME = getenv('V3IO_USERNAME')
TABLE_PATH = path.join(V3IO_USERNAME, 'examples', 'v3io', 'kv')
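If the `V3IO_USERNAME` environment variable isn't set, `getenv` returns `None` and `path.join` fails with a `TypeError`. The sketch below shows one possible way to fail with a clearer message; `build_table_path` is a hypothetical helper, and it uses `posixpath` on the assumption that table paths in the container always use forward slashes.

```python
import posixpath


def build_table_path(username, *parts):
    """Hypothetical helper: join a user name and sub-directories into a table path."""
    if not username:
        raise ValueError('username is empty; is V3IO_USERNAME set?')
    # posixpath always joins with "/", regardless of the local operating system.
    return posixpath.join(username, *parts)


table_path = build_table_path('iguazio', 'examples', 'v3io', 'kv')
print(table_path)  # iguazio/examples/v3io/kv
```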

## Write an Item

Use the `put` method to create an item with the provided attributes.
If an item with the same name (primary key) already exists in the specified table, the existing item is completely overwritten (replaced with a new item).
If the item or table doesn't exist, the operation creates it.

> **Note**: NoSQL tables in the platform don't need to be created prior to ingestion.
> When writing data to a NoSQL table, if the table doesn't exist, it's automatically created in the specified path as part of the write operation.

Create an example item:

In [5]:
from datetime import datetime
item = {
    'title': "The Godfather",
    'rating': 9.2,
    'release_date': datetime(1972, 3, 24),
    'duration': 175
}

Write to the NoSQL (KV) storage:

In [6]:
print(f'Writing to {TABLE_PATH}')
response = v3io_client.kv.put(container=CONTAINER, table_path=TABLE_PATH, key='tt0068646', attributes=item)
print(f'Status code: {response.status_code}')

Writing to iguazio/examples/v3io/kv
Status code: 200


## Read an Item

Use the `get` method to retrieve the requested attributes of a table item.

In [7]:
response = v3io_client.kv.get(container=CONTAINER, table_path=TABLE_PATH, key='tt0068646')

Print the response output item:

In [8]:
print(response.output.item)

{'title': 'The Godfather', 'rating': 9.2, 'release_date': datetime.datetime(1972, 3, 24, 0, 0), 'duration': 175}


## Update an Item

Use the `update` method to update the attributes of a table item.
If the specified item or table doesn't exist, the operation creates it.

In [9]:
response = v3io_client.kv.update(container=CONTAINER, table_path=TABLE_PATH, key='tt0068646', attributes={'rating': 9.3})
print(response.status_code)

200
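The difference between `put` (replace the whole item) and `update` (change only the given attributes) can be illustrated with a toy in-memory model. This is only a sketch of the semantics described in this tutorial, not the platform's implementation: the `put` and `update` functions below are local stand-ins that operate on a plain dictionary, and the sketch assumes that `update` leaves unspecified attributes unchanged.

```python
def put(table, key, attributes):
    """Stand-in for put: replace the whole item with the given attributes."""
    table[key] = dict(attributes)


def update(table, key, attributes):
    """Stand-in for update: merge attributes into the item, creating it if missing."""
    table.setdefault(key, {}).update(attributes)


table = {}
put(table, 'tt0068646', {'title': 'The Godfather', 'rating': 9.2})

# update keeps the attributes that aren't mentioned ('title').
update(table, 'tt0068646', {'rating': 9.3})
after_update = dict(table['tt0068646'])

# put discards the attributes that aren't mentioned.
put(table, 'tt0068646', {'rating': 9.1})
after_put = dict(table['tt0068646'])

print(after_update)  # {'title': 'The Godfather', 'rating': 9.3}
print(after_put)     # {'rating': 9.1}
```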


## Delete an Item

Use the `delete` method to delete a table item.

In [10]:
response = v3io_client.kv.delete(container=CONTAINER, table_path=TABLE_PATH, key='tt0068646')
print(response.status_code)

204


## Write Multiple Items

To get the highest possible throughput, you can send many requests to the data layer and then wait for all the responses to arrive, rather than sending each request and waiting for its response before sending the next one.
The SDK supports this through batching.
Any API call can be made through the client's built-in `batch` object.
The API call receives the exact same arguments it would normally receive (except for `raise_for_status`), and returns immediately without waiting for the response.
To wait for all pending responses, call the `wait` method of the `batch` object.

> **Note**: The number of parallel connections is determined by the `max_connections` parameter that you set when creating the client.
> For instance, to use 16 parallel connections, create the client at the beginning of the notebook with `v3io_client = v3io.dataplane.Client(max_connections=16)`.
> The default is 8 connections.

In [11]:
movies = [
{'key': "tt0111161",
 'item': {'title': "The Shawshank Redemption",                          'rating': 9.2, 'release_date': datetime(1994, 10, 14), 'duration': 142}},
{'key': "tt0068646",
 'item': {'title': "The Godfather",                                     'rating': 9.1, 'release_date': datetime(1972, 3, 24),  'duration': 175}},
{'key': "tt0071562",
 'item': {'title': "The Godfather: Part II",                            'rating': 9,   'release_date': datetime(1974, 12, 18), 'duration': 202}},
{'key': "tt0468569",
 'item': {'title': "The Dark Knight",                                   'rating': 9,   'release_date': datetime(2008, 7, 18),  'duration': 152}},
{'key': "tt0050083",
 'item': {'title': "12 Angry Men",                                      'rating': 8.9, 'release_date': datetime(1957, 4, 10),  'duration': 96}},
{'key': "tt0108052",
 'item': {'title': "Schindler's List",                                  'rating': 8.9, 'release_date': datetime(1993, 2, 4),   'duration': 195}},
{'key': "tt0167260",
 'item': {'title': "The Lord of the Rings: The Return of the King",     'rating': 8.9, 'release_date': datetime(2003, 12, 17), 'duration': 201}},
{'key': "tt0110912",
 'item': {'title': "Pulp Fiction",                                      'rating': 8.8, 'release_date': datetime(1994, 10, 14), 'duration': 154}},
{'key': "tt0060196",
 'item': {'title': "The Good, the Bad and the Ugly",                    'rating': 8.8, 'release_date': datetime(1967, 12, 29), 'duration': 178}},
{'key': "tt0120737",
 'item': {'title': "The Lord of the Rings: The Fellowship of the Ring", 'rating': 8.8, 'release_date': datetime(2001, 12, 19), 'duration': 178}}
]

In [12]:
for movie in movies:
    v3io_client.batch.kv.put(container=CONTAINER, table_path=TABLE_PATH, key=movie.get('key'), attributes=movie.get('item'))

# wait for all writes to complete
responses = v3io_client.batch.wait()

The looped `put` calls in the previous code block send all the `put` requests to the data layer in parallel.
When `wait` is called, it blocks until all responses arrive, in which case it returns a `Responses` object that contains the `responses` of each call, or until an error occurs, in which case an exception is raised.
You can also pass `raise_for_status` to `wait`; it has the same meaning as in non-batch calls.

> **Note:** The `batch` object is stateful, therefore you can only create one batch at a time.
> However, you can create multiple parallel batches yourself through the client's `create_batch` interface.
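The submit-then-wait pattern that batching relies on can be sketched with the standard library alone. The example below uses a thread pool as a stand-in and makes no claim about how the SDK implements batching internally: `fake_put` is a hypothetical function that simulates a request, and `max_workers` plays the role that `max_connections` plays for the client.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_put(key, attributes):
    """Hypothetical stand-in for a network request; sleeps, then reports success."""
    time.sleep(0.01)
    return {'key': key, 'status_code': 200}


def put_all(items, max_connections=8):
    """Submit every request first, then wait for all of the responses."""
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        # Submitting doesn't block; at most max_connections requests run at a time.
        futures = [pool.submit(fake_put, key, attrs) for key, attrs in items]
        # Waiting happens here; result() re-raises any exception from a request.
        return [future.result() for future in futures]


items = [(f'key-{i}', {'value': i}) for i in range(20)]
results = put_all(items, max_connections=8)
print(len(results), all(r['status_code'] == 200 for r in results))  # 20 True
```

Because the requests overlap, the total time is closer to the slowest group of requests than to the sum of all of them, which is the throughput benefit described in this section.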

## Read Multiple Items

Use the `new_cursor` method to retrieve (read) attributes of multiple items in a table, according to the specified criteria.

In [13]:
items_cursor = v3io_client.kv.new_cursor(container=CONTAINER,
                                         table_path=TABLE_PATH,
                                         attribute_names=['title', 'rating'],
                                         filter_expression='duration < 170')

for item in items_cursor.all():
    print(item)

{'title': 'Pulp Fiction', 'rating': 8.8}
{'title': '12 Angry Men', 'rating': 8.9}
{'title': 'The Shawshank Redemption', 'rating': 9.2}
{'title': 'The Dark Knight', 'rating': 9}
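To make the roles of `attribute_names` and `filter_expression` concrete, here is a rough local equivalent that runs over an in-memory list instead of a table. It's only an illustration of the selection logic: `scan` is a hypothetical function, the filter is written as a Python predicate rather than a platform filter expression, and the three sample items are a subset of the movies used in this tutorial.

```python
movies = [
    {'title': 'The Godfather', 'rating': 9.1, 'duration': 175},
    {'title': '12 Angry Men', 'rating': 8.9, 'duration': 96},
    {'title': 'Pulp Fiction', 'rating': 8.8, 'duration': 154},
]


def scan(items, attribute_names, predicate):
    """Yield the requested attributes of every item that satisfies the predicate."""
    for item in items:
        if predicate(item):
            yield {name: item[name] for name in attribute_names if name in item}


# Local counterpart of filter_expression='duration < 170'.
selected = list(scan(movies, ['title', 'rating'], lambda m: m['duration'] < 170))
for item in selected:
    print(item)
# {'title': '12 Angry Men', 'rating': 8.9}
# {'title': 'Pulp Fiction', 'rating': 8.8}
```

Note that `duration` is used for filtering but doesn't appear in the results, because it isn't listed in the requested attribute names.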


## Create a Schema (Optional)

To support reading and writing NoSQL data using structured-data interfaces &mdash; such as Spark DataFrames, Presto, and [V3IO Frames](https://www.iguazio.com/docs/v3.0/data-layer/reference/frames/) ("Frames") &mdash; the platform uses a schema file that defines the schema of the data structure.
When writing NoSQL data in the platform using a Spark or Frames DataFrame, the schema of the data table is automatically identified and saved, and it's then retrieved when using a structured-data interface to read data from the same table (unless you explicitly define the schema for the read operation).
However, to use a structured-data interface to read NoSQL data that was not written in this manner, you first need to define the table schema:

In [14]:
fields = [
    {
        'name': 'title',
        'type': 'string',
        'nullable': False
    },
    {
        'name': 'rating',
        'type': 'double',
        'nullable': True
    },
    {
        'name': 'release_date',
        'type': 'timestamp',
        'nullable': False
    },
    {
        'name': 'duration',
        'type': 'long',
        'nullable': False
    }
]

In [15]:
response = v3io_client.kv.create_schema(container=CONTAINER, table_path=TABLE_PATH, key='title', fields=fields)
print(response.status_code)

204
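A field list such as the one defined for the schema can also be derived from a sample item rather than typed by hand. The following is a minimal sketch under stated assumptions: `infer_fields` and `TYPE_NAMES` are hypothetical names, and the mapping covers only the four schema types that appear in this tutorial (`string`, `double`, `long`, and `timestamp`).

```python
from datetime import datetime

# Python type -> schema type name; limited to the types used in this tutorial.
TYPE_NAMES = {str: 'string', float: 'double', int: 'long', datetime: 'timestamp'}


def infer_fields(item, nullable=()):
    """Hypothetical helper: build a schema field list from one sample item."""
    fields = []
    for name, value in item.items():
        try:
            # Exact type lookup, so that bool isn't silently treated as int.
            type_name = TYPE_NAMES[type(value)]
        except KeyError:
            raise TypeError(
                f'no schema type known for {name!r}: {type(value).__name__}'
            ) from None
        fields.append({'name': name, 'type': type_name, 'nullable': name in nullable})
    return fields


sample = {
    'title': 'The Godfather',
    'rating': 9.2,
    'release_date': datetime(1972, 3, 24),
    'duration': 175,
}
inferred = infer_fields(sample, nullable={'rating'})
for field in inferred:
    print(field)
```

Inferring from a single item is fragile: some items in this tutorial store `rating` as the integer `9`, which this sketch would map to `long` instead of `double`. Review the inferred types against the whole data set before creating the schema.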


Read the KV table using Frames:

In [16]:
import v3io_frames as v3f

v3f_client = v3f.Client('framesd:8081', container=CONTAINER)
v3f_client.read(backend='kv', table=TABLE_PATH)

| title | duration | rating | release_date |
| --- | --- | --- | --- |
| The Dark Knight | 152 | 9.0 | 2008-07-18 00:00:00+00:00 |
| The Godfather: Part II | 202 | 9.0 | 1974-12-18 00:00:00+00:00 |
| Schindler's List | 195 | 8.9 | 1993-02-04 00:00:00+00:00 |
| 12 Angry Men | 96 | 8.9 | 1957-04-10 00:00:00+00:00 |
| The Lord of the Rings: The Fellowship of the Ring | 178 | 8.8 | 2001-12-19 00:00:00+00:00 |
| The Godfather | 175 | 9.1 | 1972-03-24 00:00:00+00:00 |
| The Shawshank Redemption | 142 | 9.2 | 1994-10-14 00:00:00+00:00 |
| The Lord of the Rings: The Return of the King | 201 | 8.9 | 2003-12-17 00:00:00+00:00 |
| The Good, the Bad and the Ugly | 178 | 8.8 | 1967-12-29 00:00:00+00:00 |
| Pulp Fiction | 154 | 8.8 | 1994-10-14 00:00:00+00:00 |


## Delete the Table

Currently, most platform APIs don't have a dedicated method for deleting a table.
An exception is the V3IO Frames `Client` class, which supports a `delete` method for the NoSQL backend; for more information, see the [Frames documentation](https://www.iguazio.com/docs/v3.0/data-layer/reference/frames/).
With the other APIs, you can use the file-system interface to delete the table directory from the relevant data container:

In [17]:
from os import sep
import shutil
V3IO_TABLE_PATH = path.join(sep, 'v3io', CONTAINER, TABLE_PATH)
shutil.rmtree(V3IO_TABLE_PATH)

Alternatively, you can use the following command:
```
!rm -r $V3IO_TABLE_PATH
```
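Because both deletion approaches remove a directory tree recursively, it can be worth checking the target path first. The sketch below is one possible guard, not a platform API: `remove_table` is a hypothetical helper that refuses to delete anything outside the container directory and reports whether a table directory was actually removed. The demonstration runs against a temporary directory instead of the `/v3io` mount.

```python
import os
import shutil
import tempfile


def remove_table(mount_root, container, table_path):
    """Hypothetical helper: delete a table directory; return False if it's absent."""
    root = os.path.realpath(os.path.join(mount_root, container))
    target = os.path.realpath(os.path.join(root, table_path))
    # Refuse to delete the container itself or anything outside of it.
    if target == root or not target.startswith(root + os.sep):
        raise ValueError(f'refusing to delete {target!r}: outside {root!r}')
    if not os.path.isdir(target):
        return False
    shutil.rmtree(target)
    return True


# A temporary directory stands in for the file-system mount in this demonstration.
with tempfile.TemporaryDirectory() as mount_root:
    os.makedirs(os.path.join(mount_root, 'users', 'iguazio', 'examples', 'v3io', 'kv'))
    first = remove_table(mount_root, 'users', 'iguazio/examples/v3io/kv')
    second = remove_table(mount_root, 'users', 'iguazio/examples/v3io/kv')

print(first, second)  # True False
```

In the notebook, the equivalent call would pass the mount directory, `CONTAINER`, and `TABLE_PATH` from the earlier cells.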