# Using the Platform's Data-Object API

The platform's Simple-Object Web API supports simple data-object operations that resemble those of the Amazon Simple Storage Service (S3) API.
In addition to the S3-like capabilities, the Simple-Object Web API enables appending data to existing objects.

## Initialize

In [1]:
import v3io.dataplane

Create a `dataplane` client:

In [2]:
v3io_client = v3io.dataplane.Client()

> **Note**: You can pass the `endpoint` and `access_key` parameters to the client explicitly.
> The following code is equivalent to using the default values:
>
> ``` python
> from os import getenv
> v3io_client = v3io.dataplane.Client(endpoint='http://v3io-webapi:8081',
>                                     access_key=getenv('V3IO_ACCESS_KEY'))
> ```
>
> When running Python code on a local machine that connects to a remote Iguazio platform, you can obtain the URL of your cluster by copying the API URL of the web-APIs service (`webapi`) from the **Services** dashboard page. You can select between two types of URLs:
>
> - **HTTPS Direct** (recommended) &mdash; a URL of the format `https://<tenant IP>:<web-APIs port>`; for example, `https://default-tenant.app.mycluster.iguazio.com:8443`.
> - **HTTPS** &mdash; a URL of the format `https://webapi.<tenant IP>`; for example, `https://webapi.default-tenant.app.mycluster.iguazio.com`.
>
> You can get the access key from the platform dashboard: select the user-profile picture or icon from the top right corner of any page, and select **Access Keys** from the menu. In the **Access Keys** window, either copy an existing access key or create a new key and copy it. Alternatively, you can get the access key by checking the value of the `V3IO_ACCESS_KEY` environment variable in a web-shell or Jupyter Notebook service.
>
> For more information, see the platform's [Data-Service Web-API General Structure](https://www.iguazio.com/docs/latest-release/data-layer/reference/web-apis/data-service-web-api-gen-struct/) documentation.

> **Maximum number of parallel connections**: Another noteworthy parameter is `max_connections`, which sets the maximum number of parallel connections that the client uses for batch operations.
> If you don't set this parameter, the default is 8 connections.
> For more information, see the [Put Multiple Objects](#Put-Multiple-Objects) section in this tutorial.
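
The parameters described in the notes above can be collected in one place before creating the client. The following is a minimal sketch, not part of the SDK: `client_settings` is a hypothetical helper, the endpoint default and the default of 8 connections are the values quoted in this tutorial, and the access key in the example is a placeholder.

```python
import os

# Defaults quoted in this tutorial; adjust them for your cluster.
DEFAULT_ENDPOINT = 'http://v3io-webapi:8081'
DEFAULT_MAX_CONNECTIONS = 8


def client_settings(endpoint=None, access_key=None, max_connections=None, env=None):
    """Return keyword arguments for the client (hypothetical helper)."""
    env = os.environ if env is None else env
    # Fall back to the V3IO_ACCESS_KEY environment variable
    access_key = access_key or env.get('V3IO_ACCESS_KEY')
    if not access_key:
        raise ValueError('No access key: pass access_key or set V3IO_ACCESS_KEY')
    return {
        'endpoint': endpoint or DEFAULT_ENDPOINT,
        'access_key': access_key,
        'max_connections': max_connections or DEFAULT_MAX_CONNECTIONS,
    }


# Example with a placeholder key and the HTTPS Direct URL format shown above
settings = client_settings(
    endpoint='https://default-tenant.app.mycluster.iguazio.com:8443',
    access_key='<access key>',
    max_connections=16,
)
print(settings)
```

With the `v3io` package installed, passing these values as `v3io.dataplane.Client(**settings)` should then create a client for the remote cluster.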

### Set the Data Path

All data in the platform is stored in user-defined data containers.
This tutorial uses the predefined "users" container.
For more information, refer to the platform's [data-containers](https://www.iguazio.com/docs/latest-release/data-layer/containers/) documentation.

In [3]:
CONTAINER = 'users'

Set the data path for storing the objects:

> **Note**: The following code uses the `V3IO_USERNAME` environment variable to store the data in the current user's folder. When running Python code on a local machine that connects to a remote Iguazio platform, set this variable to the user name that you use to log in to the platform. Alternatively, you can get the user name by checking the value of the `V3IO_USERNAME` environment variable in a web-shell or Jupyter Notebook service.

In [4]:
from os import getenv, path

V3IO_USERNAME = getenv('V3IO_USERNAME')
OBJECTS_PATH = path.join(V3IO_USERNAME, 'data', 'v3io', 'objects')
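
`os.path.join` uses the path separator of the local operating system, so on a Windows machine that connects to a remote platform it would produce backslash-separated paths, whereas the object paths in this tutorial use forward slashes. It also fails with a `TypeError` when `V3IO_USERNAME` is not set. A minimal sketch of a more defensive alternative, assuming a hypothetical helper named `build_objects_path`, joins the path with `posixpath` and fails early with a clear message:

```python
import os
import posixpath


def build_objects_path(username=None):
    """Return '<username>/data/v3io/objects' with forward slashes on any OS."""
    username = username or os.getenv('V3IO_USERNAME')
    if not username:
        raise ValueError('Set V3IO_USERNAME or pass the user name explicitly')
    return posixpath.join(username, 'data', 'v3io', 'objects')


# 'iguazio' is the user name that appears in this tutorial's sample output
print(build_objects_path('iguazio'))
```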

## Put Object

Use the `put` method to add a new object:

In [5]:
text = "It was the best of times,\n\
it was the worst of times,\n\
it was the age of wisdom,\n\
it was the age of foolishness,\n\
it was the epoch of belief,\n\
it was the epoch of incredulity,\n\
" 

In [6]:
OBJECT = path.join(OBJECTS_PATH, 'The Period.txt')
print(f'Writing to {OBJECT}')
response = v3io_client.object.put(container=CONTAINER, path=OBJECT, body=text)
print(f'Status code: {response.status_code}')

Writing to iguazio/data/v3io/objects/The Period.txt
Status code: 200


## Get Object

Use the `get` method to retrieve an object:

In [7]:
response = v3io_client.object.get(container=CONTAINER, path=OBJECT)
print(response.body.decode('utf-8'))

It was the best of times,
it was the worst of times,
it was the age of wisdom,
it was the age of foolishness,
it was the epoch of belief,
it was the epoch of incredulity,



## Append

You can also use the `put` method to append data to an existing object by passing `append=True`.

> **Note**: The option to append data extends the capabilities of the AWS S3 `PUT Object` operation.

In [8]:
text2="it was the season of Light,\n\
it was the season of Darkness,\n\
it was the spring of hope,\n\
it was the winter of despair,\n\
"

In [9]:
response = v3io_client.object.put(container=CONTAINER, path=OBJECT, body=text2, append=True)
print(f'Status code: {response.status_code}')

Status code: 200


In [10]:
response = v3io_client.object.get(container=CONTAINER, path=OBJECT)
print(response.body.decode('utf-8'))

It was the best of times,
it was the worst of times,
it was the age of wisdom,
it was the age of foolishness,
it was the epoch of belief,
it was the epoch of incredulity,
it was the season of Light,
it was the season of Darkness,
it was the spring of hope,
it was the winter of despair,



## Delete Object

Use the `delete` method to delete an object:

In [11]:
response = v3io_client.object.delete(container=CONTAINER, path=OBJECT)
print(response.status_code)

204


## Put Multiple Objects

One way to increase performance is to send many requests to the data layer and then wait for all the responses to arrive, rather than sending each request and waiting for its response before sending the next one.
The SDK supports this through batching.
Any API call can be made through the client's built-in `batch` object.
The API call receives the same arguments that it would normally receive (except for `raise_for_status`), and it returns immediately without waiting for the response to arrive.
To wait for all pending responses, call the `wait` method of the `batch` object.
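
To illustrate the general "send everything, then wait" pattern without a cluster, here is a minimal sketch that uses only the Python standard library. It is not how the SDK implements batching: `fake_put` is a hypothetical stand-in for a network request, and the pool size of 8 merely mirrors the default number of connections mentioned in this tutorial.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_put(path, body):
    """Stand-in for a network request; it only sleeps and reports success."""
    time.sleep(0.01)
    return {'path': path, 'status_code': 200}


paths = [f'obj_{i:02}' for i in range(10)]

# Submit every request first; submit() returns immediately with a future
with ThreadPoolExecutor(max_workers=8) as pool:
    futures = [pool.submit(fake_put, p, 'some text') for p in paths]
    # Then wait for all of the responses, in submission order
    results = [f.result() for f in futures]

print([r['status_code'] for r in results])
```

Because the requests overlap, the total time is closer to the duration of the slowest group of requests than to the sum of all request durations.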

> **Note**: The number of parallel connections is determined by the `max_connections` parameter that is set when the client is created. For example, to use 16 parallel connections, create the client at the beginning of the notebook with `v3io_client = v3io.dataplane.Client(max_connections=16)`. The default is 8 connections.

> **Note**: The SDK also supports an asynchronous API, which can also be useful for putting multiple objects. This capability is not demonstrated here, but you can read about it in the [v3io-py readme](https://github.com/v3io/v3io-py/blob/development/README.md#support-for-asyncio-experimental).

In [12]:
# Template of word sequence

nouns = ['time', 'person', 'year', 'way', 'day', 'thing', 'man', 'world', 'life', 'hand', 'part', 'child', 'eye', 'woman', 'place', 'work', 'week', 'case', 'point', 'government', 'company', 'number', 'group', 'problem', 'fact']
adjectives = ['good', 'new', 'first', 'last', 'long', 'great', 'little', 'own', 'other', 'old', 'right', 'big', 'high', 'different', 'small', 'large', 'next', 'early', 'young', 'important', 'few', 'public', 'bad', 'same', 'able']
prepositions = ['to', 'of', 'in', 'for', 'on', 'with', 'at', 'by', 'from', 'up', 'about', 'into', 'over', 'after']
others = ['the', 'that', 'this', 'my', 'one']

sequence = [nouns, prepositions, others, adjectives, nouns]

In [13]:
import random

random.seed(42)

# Generate a sequence of words

for i in range(10):
    generated_text = " ".join([random.choice(values) for values in sequence])
    print(generated_text)
    v3io_client.batch.object.put(container=CONTAINER, path=path.join(OBJECTS_PATH, f'obj_{i:02}'), body=generated_text)

# Wait for all writes to complete
responses = v3io_client.batch.wait()

company of the same life
world for that same way
number into one first point
woman to the first man
world from one good case
man into one different world
place up this good fact
thing into my right life
day for this last year
eye of this big government


The `put` calls in the loop of the previous code block are sent to the data layer in parallel.
When `wait` is called, it blocks until either all responses arrive, in which case it returns a `Responses` object that contains the `responses` of each call, or an error occurs, in which case an exception is raised.
You can pass `raise_for_status` to `wait`; it behaves the same way as it does for a regular (non-batch) API call.
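
After `wait` returns, it can be convenient to confirm that every request succeeded, especially if `wait` is configured not to raise on failed requests. The sketch below assumes only what this tutorial shows: the returned object exposes a `responses` collection, and each response has a `status_code`. `summarize_status` is a hypothetical helper, and `SimpleNamespace` objects stand in for real responses so that the example runs without a cluster.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_status(responses):
    """Count responses per HTTP status code."""
    return Counter(response.status_code for response in responses)


# Stand-ins for the per-call responses of a batch
fake_responses = [SimpleNamespace(status_code=200) for _ in range(9)]
fake_responses.append(SimpleNamespace(status_code=404))

summary = summarize_status(fake_responses)
# Any status code outside the 2xx range is treated as a failure
failed = [code for code in summary if not 200 <= code < 300]
print(dict(summary), failed)
```

With the batch from the previous cell, the equivalent call would presumably be `summarize_status(responses.responses)`.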

> **Note:** The `batch` object is stateful, so you can only create one batch at a time with it.
> However, you can create multiple parallel batches yourself through the client's `create_batch` interface.

Display the contents of the first object:

In [14]:
response = v3io_client.object.get(container=CONTAINER, path=path.join(OBJECTS_PATH, 'obj_00'))
print(response.body.decode('utf-8'))

company of the same life


Query the container to list the objects:

In [15]:
response = v3io_client.container.list(container=CONTAINER, path=OBJECTS_PATH)

for content in response.output.contents:
    print(content.key)

iguazio/data/v3io/objects/obj_00
iguazio/data/v3io/objects/obj_01
iguazio/data/v3io/objects/obj_02
iguazio/data/v3io/objects/obj_03
iguazio/data/v3io/objects/obj_04
iguazio/data/v3io/objects/obj_05
iguazio/data/v3io/objects/obj_06
iguazio/data/v3io/objects/obj_07
iguazio/data/v3io/objects/obj_08
iguazio/data/v3io/objects/obj_09


## Delete the Objects

When running on the Iguazio platform (not from a remote machine), you can use the file-system interface to delete the objects directory from the relevant data container:

In [16]:
import shutil
from os import sep

V3IO_OBJECTS_PATH = path.join(sep, 'v3io', CONTAINER, OBJECTS_PATH)

shutil.rmtree(V3IO_OBJECTS_PATH)

Alternatively, you can use the following command:

In [None]:
!rm -r $V3IO_OBJECTS_PATH
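
When the code runs on a remote machine, the file-system interface is not an option. One possible alternative, sketched below with only the calls that appear earlier in this tutorial (`container.list`, the `batch` object, and `object.delete`), is to list the objects and delete them in a batch. `delete_listed_objects` and `StubClient` are hypothetical names; the stub is not part of the SDK and only lets the sketch run without a cluster. The sketch assumes that a single listing returns all the objects, and it does not remove the directory itself.

```python
from types import SimpleNamespace


def delete_listed_objects(client, container, objects_path):
    """Queue a batch delete for every object listed under objects_path."""
    listing = client.container.list(container=container, path=objects_path)
    keys = [content.key for content in listing.output.contents]
    for key in keys:
        client.batch.object.delete(container=container, path=key)
    # Block until all queued delete requests have completed
    client.batch.wait()
    return keys


class StubClient:
    """Minimal stand-in that records calls; it is not part of the SDK."""

    def __init__(self, keys):
        self.deleted = []
        contents = [SimpleNamespace(key=key) for key in keys]
        listing = SimpleNamespace(output=SimpleNamespace(contents=contents))
        self.container = SimpleNamespace(list=lambda container, path: listing)
        self.batch = SimpleNamespace(
            object=SimpleNamespace(
                delete=lambda container, path: self.deleted.append(path)),
            wait=lambda: None,
        )


stub = StubClient([f'iguazio/data/v3io/objects/obj_{i:02}' for i in range(3)])
removed = delete_listed_objects(stub, 'users', 'iguazio/data/v3io/objects')
print(removed)
```

Against a real cluster, the call would presumably be `delete_listed_objects(v3io_client, CONTAINER, OBJECTS_PATH)`.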