AIS Python SDK

AIS Python SDK provides a (growing) set of client-side APIs to access and utilize AIS clusters, buckets, and objects.

The project is essentially a Python port of the AIS Go APIs, with the additional goal of being as convenient as possible for Python developers.

Note that only Python 3.x (version 3.6 or later) is currently supported.


Installation

Install as a Package

The latest AIS release can be easily installed either with Anaconda or pip:

$ conda install aistore
$ pip install aistore

Install From Source

If you'd like to work with the current upstream (and don't mind the risk), install the latest master directly from GitHub:

$ git clone https://github.com/NVIDIA/aistore.git

$ cd aistore/python/

# upgrade pip to latest version
$ python -m pip install --upgrade pip

# install dependencies
$ pip install -r aistore/common_requirements

$ pip install -e .
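
After either installation route, a quick sanity check can confirm that the interpreter meets the stated minimum (Python 3.6) and that the `aistore` package can be located. The sketch below uses only the standard library; the helper name `check_environment` is made up for this example and is not part of the SDK.

```python
import importlib.util
import sys


def check_environment(package="aistore", minimum=(3, 6)):
    """Return (python_ok, package_found) for the current interpreter."""
    # The SDK states that Python 3.6 or later is required.
    python_ok = sys.version_info >= minimum
    # find_spec returns None when a top-level package cannot be located.
    package_found = importlib.util.find_spec(package) is not None
    return python_ok, package_found


python_ok, package_found = check_environment()
print("Python version OK:", python_ok)
print("aistore importable:", package_found)
```

If the second line prints `False`, the package is most likely installed into a different environment than the one running the script.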

Quick Start

In order to interact with your running AIS instance, you will need to create a client object:

from aistore.sdk import Client

client = Client("http://localhost:8080")

The newly created client object can be used to interact with your AIS cluster, buckets, and objects. See the examples and the reference docs for more details.
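
When the endpoint differs between environments, it can help to resolve and validate it before constructing the client. The following is a minimal sketch using only the standard library. The helper `resolve_endpoint` is hypothetical, and reading the address from an `AIS_ENDPOINT` environment variable is an assumption of this example rather than documented SDK behavior.

```python
import os
from urllib.parse import urlparse

DEFAULT_ENDPOINT = "http://localhost:8080"


def resolve_endpoint(env=None, default=DEFAULT_ENDPOINT):
    """Pick the cluster endpoint and check that it looks like an HTTP(S) URL."""
    env = os.environ if env is None else env
    # AIS_ENDPOINT is an assumed variable name for this sketch.
    endpoint = env.get("AIS_ENDPOINT", default)
    parsed = urlparse(endpoint)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"expected http(s)://host:port, got {endpoint!r}")
    return endpoint.rstrip("/")


# With the SDK installed and a cluster running, the result feeds the constructor:
#   from aistore.sdk import Client
#   client = Client(resolve_endpoint())
print(resolve_endpoint(env={}))
```

Failing early on a malformed address tends to give a clearer error than a connection failure on the first request.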

External Cloud Storage Buckets

AIS supports a number of different backend providers or, simply, backends.

For exact definitions and related capabilities, please see terminology.

Many bucket and object operations also work on remote cloud buckets, that is, buckets hosted by a third-party backend. To interact with a remote cloud bucket, specify the provider when instantiating the bucket object:

# Head AWS bucket
client.bucket("my-aws-bucket", provider="aws").head()
# Evict GCP bucket
client.bucket("my-gcp-bucket", provider="gcp").evict()
# Get object from Azure bucket
client.bucket("my-azure-bucket", provider="azure").object("filename.ext").get()
# List objects in AWS bucket
client.bucket("my-aws-bucket", provider="aws").list_objects()

Note that certain operations do not support external cloud storage buckets. Refer to the SDK reference documentation for details on which bucket and object operations support remote cloud buckets, and for general information on class and method usage.
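
If bucket locations arrive as URI-style strings, a small helper can split them into the provider and bucket name that `client.bucket(...)` expects. This is an illustrative sketch only: `split_bucket_uri` is a hypothetical helper, and the short scheme aliases (`s3`, `gs`, `az`) and the `ais` provider name are assumptions of this example, so check them against the SDK reference before relying on them.

```python
# Scheme-to-provider mapping; the aliases are assumptions of this sketch.
SCHEME_TO_PROVIDER = {
    "ais": "ais",
    "aws": "aws",
    "s3": "aws",
    "gcp": "gcp",
    "gs": "gcp",
    "azure": "azure",
    "az": "azure",
}


def split_bucket_uri(uri):
    """Split 'provider://bucket/object' into (provider, bucket, object_name)."""
    scheme, sep, rest = uri.partition("://")
    if not sep or scheme not in SCHEME_TO_PROVIDER:
        raise ValueError(f"unsupported bucket URI: {uri!r}")
    # Everything after the first '/' is treated as the object name (may be empty).
    bucket, _, object_name = rest.partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name in {uri!r}")
    return SCHEME_TO_PROVIDER[scheme], bucket, object_name


provider, bucket, object_name = split_bucket_uri("azure://my-azure-bucket/filename.ext")
print(provider, bucket, object_name)

# With a client in hand, the pieces map onto the calls shown above:
#   client.bucket(bucket, provider=provider).object(object_name).get()
```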


HTTPS

The SDK supports HTTPS connectivity if the AIS cluster is configured to use HTTPS. To start using HTTPS:

  1. Set up HTTPS on your cluster: Guide for K8s cluster
  2. If using a self-signed certificate with your own CA, copy the CA certificate to your local machine. If using our built-in cert-manager config to generate your certificates, you can use our playbook.
  3. Options to configure the SDK for HTTPS connectivity:
    • Skip verification (for testing only, insecure):
      • client = Client("https://localhost:8080", skip_verify=True)
    • Point the SDK to your certificate using one of the methods below:
      • Pass the path of the certificate as an argument when creating the client:
        • client = Client("https://localhost:8080", ca_cert="/path/to/cert")
      • Use the environment variable
        • Set AIS_SERVER_CRT to the path of your certificate before initializing the client
    • If your AIS cluster is using a certificate signed by a trusted CA, the client will default to using verification without needing to provide a CA cert.
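
The options listed above can be sketched as one small decision function that builds the keyword arguments for `Client(...)`. The helper name `https_client_kwargs` is hypothetical, and the precedence it applies (skip verification first, then an explicit path, then `AIS_SERVER_CRT`) is an assumption of this example, not documented SDK behavior.

```python
import os


def https_client_kwargs(ca_cert=None, skip_verify=False, env=None):
    """Build keyword arguments for Client(...) from the HTTPS options above."""
    env = os.environ if env is None else env
    if skip_verify:
        # Testing only: certificate verification is disabled entirely.
        return {"skip_verify": True}
    if ca_cert:
        # An explicit path wins over the environment (an assumption of this sketch).
        return {"ca_cert": ca_cert}
    if env.get("AIS_SERVER_CRT"):
        return {"ca_cert": env["AIS_SERVER_CRT"]}
    # Certificate signed by a trusted CA: default verification, nothing extra to pass.
    return {}


print(https_client_kwargs(ca_cert="/path/to/cert", env={}))

# With the SDK installed, the result can be unpacked into the constructor:
#   client = Client("https://localhost:8080", **https_client_kwargs(ca_cert="/path/to/cert"))
```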

ETLs

AIStore also supports ETLs, short for Extract-Transform-Load. Running ETLs in AIS is beneficial because the transformations occur locally, close to the stored data, which contributes to the linear scalability of AIS.

Note: AIS-ETL requires Kubernetes. For more information on deploying AIStore with Kubernetes (or Minikube), refer here.

Check out the provided examples to learn more about working with AIS ETL.
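
As a rough illustration of the kind of transformation an ETL applies, the function below maps an object's bytes to the MD5 hex digest of those bytes. That a code-based ETL accepts a bytes-to-bytes function of this shape is an assumption of this sketch, and the registration call shown in the trailing comment is hypothetical; consult the provided examples for the exact API of your SDK version.

```python
import hashlib


def transform(input_bytes):
    """Map an object's content to the hex MD5 digest of that content."""
    return hashlib.md5(input_bytes).hexdigest().encode()


# Local check, no cluster or Kubernetes needed:
print(transform(b"hello"))

# Hypothetical registration, not verified against a specific SDK version:
#   client.etl("etl-md5").init_code(transform=transform)
```

Testing the transform locally on a few byte strings before deploying it keeps the Kubernetes round trip out of the debugging loop.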


API Documentation

| Module | Summary |
| --- | --- |
| api.py | Contains the Client class, which has methods for making HTTP requests to an AIStore server. Includes factory constructors for the Bucket, Cluster, and Job classes. |
| cluster.py | Contains the Cluster class, which represents a cluster bound to a client and contains all cluster-related operations, including checking the cluster's health and retrieving vital cluster information. |
| bucket.py | Contains the Bucket class, which represents a bucket in an AIS cluster and contains all bucket-related operations, including (but not limited to) creating, deleting, evicting, renaming, and copying. |
| object.py | Contains the Object class, which represents an object belonging to a bucket in an AIS cluster and contains all object-related operations, including (but not limited to) retrieving, adding, and deleting objects. |
| object_group.py | Contains the ObjectGroup class, which represents a collection of objects belonging to a bucket in an AIS cluster. Includes all multi-object operations, such as deleting, evicting, prefetching, copying, and transforming objects. |
| job.py | Contains the Job class and all job-related operations. |
| dsort.py | Contains the Dsort class and all dsort-related operations. |
| etl.py | Contains the Etl class and all ETL-related operations. |

For more information on SDK usage, refer to the SDK reference documentation or see the examples here.

References