# Platform SDK - Quick Start
This SDK allows you to interact with the platform components programmatically.

Every SDK method accepts a `namespace` argument that lets you override the namespace you are currently working in.

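The text above does not show how `namespace` is passed. The sketch below assumes it is an ordinary keyword argument and uses `functools.partial` to bind it once, so it does not have to be repeated on every call. `fake_list` is a hypothetical stand-in for an SDK method such as `DataSources.list`, and `"team-a"` is a made-up namespace name.

```python
from functools import partial


def bind_namespace(method, namespace):
    """Return `method` with the `namespace` keyword pre-filled.

    Assumption: SDK methods accept `namespace` as a keyword argument.
    """
    return partial(method, namespace=namespace)


def fake_list(namespace=None):
    """Hypothetical stand-in for an SDK call such as DataSources.list."""
    return f"listing datasources in {namespace or 'the current namespace'}"


# Bind the (made-up) namespace once and reuse the resulting callable.
list_in_team_a = bind_namespace(fake_list, "team-a")
print(list_in_team_a())
```

A keyword passed at call time still takes precedence over the bound value, so a single call can target a different namespace when needed.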
### What can I do?

* Access your DataSources and create a connector directly from them
* Access your Synthesizers and retrain them
* Generate a sample from a synthesizer

### Which modules are available?

* Datasources
* Synthesizers

## DataSources Module

### What is a DataSource?

A DataSource is an abstract class that exposes methods to access a connector and, through it, the connector's data.

Each data source is responsible for implementing those methods and their behaviour, but all of them return a Dataset. The SDK supports the following datasource actions:

Datasources:

- **DataSources.list** - List all datasources available in the user's namespace.
- **DataSources.get** - Get the details of a particular datasource, such as data type (tabular or time-series), metadata and name.

Datasource:

- **datasource.metadata** - Get the metadata of a selected datasource, including column, variable type and data type information.
- **datasource.read** - Read all data from a selected datasource. This method returns a Dataset object.
- **datasource.read_sample** - Read a sample of size n from a selected datasource. This method returns a Dataset object.

In [1]:
from ydata.platform.datasources import DataSources

### How to list my datasources?

In [2]:
my_datasources = DataSources.list()
print(my_datasources)

[GoogleCloudStorageDataSource(uid=1ee82408-5c7f-41b9-90fd-c0437b87e137, data_type=DataType.TABULAR, file_type=FileType.CSV, path=gs://ydata_testdata/stress_test_data/cardio_stress/*.csv), GoogleCloudStorageDataSource(uid=2d2981da-0f18-4622-8f89-dae9e27be838, data_type=DataType.TABULAR, file_type=FileType.CSV, path=gs://ydata_testdata/tabular/diamonds/data.csv), GoogleCloudStorageDataSource(uid=2dc592ff-3eab-487d-9f4d-bee1dddd2e8f, data_type=DataType.TIMESERIES, file_type=FileType.CSV, path=gs://ydata_testdata/timeseries/energy_spot_prices/spot_prices.csv), GoogleCloudStorageDataSource(uid=5333c910-6185-4290-a2fc-6b31659938d0, data_type=DataType.TIMESERIES, file_type=FileType.CSV, path=gs://ydata_testdata/timeseries/gas/20160930_203718.csv), GoogleCloudStorageDataSource(uid=5759d69f-e127-419d-a382-71d2cc01025a, data_type=DataType.TIMESERIES, file_type=FileType.CSV, path=gs://ydata_testdata/timeseries/paysim/data_with_timestamps.csv), AWSS3DataSource(uid=62f9e9c1-bdb8-4648-b3ef-cf253e517

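The list can be narrowed down with plain Python once it is retrieved. The following is a minimal sketch that keeps only the datasources of one data type. It runs on stand-in objects rather than real SDK results, and it assumes `data_type` is an `Enum` member (as the printed `DataType.TABULAR` suggests); the `DataType` class and the `ds-*` uids below are illustrative only.

```python
from enum import Enum
from types import SimpleNamespace


class DataType(Enum):
    """Stand-in for the SDK's DataType enum (assumed from the printed repr)."""
    TABULAR = "tabular"
    TIMESERIES = "timeseries"


def filter_by_data_type(datasources, type_name):
    """Keep the datasources whose `data_type` member is named `type_name`."""
    return [ds for ds in datasources if ds.data_type.name == type_name]


# Stand-in objects; with the SDK this list would come from DataSources.list().
my_datasources = [
    SimpleNamespace(uid="ds-1", data_type=DataType.TABULAR),
    SimpleNamespace(uid="ds-2", data_type=DataType.TIMESERIES),
    SimpleNamespace(uid="ds-3", data_type=DataType.TIMESERIES),
]

timeseries = filter_by_data_type(my_datasources, "TIMESERIES")
print([ds.uid for ds in timeseries])
```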
### How to access a specific datasource?

In [4]:
datasource = DataSources.get('2d2981da-0f18-4622-8f89-dae9e27be838')

print(f'Data Type {datasource.data_type}')
print(f'\nMetadata {datasource.metadata}')
print(f'\nName {datasource.name}')

Data Type DataType.TABULAR

Metadata columns=[Column(name='carat', data_type='categorical', var_type='float'), Column(name='cut', data_type='categorical', var_type='string'), Column(name='color', data_type='categorical', var_type='string'), Column(name='clarity', data_type='categorical', var_type='string'), Column(name='depth', data_type='categorical', var_type='float'), Column(name='table', data_type='categorical', var_type='float'), Column(name='price', data_type='numerical', var_type='int'), Column(name='x', data_type='categorical', var_type='float'), Column(name='y', data_type='categorical', var_type='float'), Column(name='z', data_type='categorical', var_type='float')]

Name Diamonds

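The metadata output is easier to review when the columns are grouped by their `data_type`. The sketch below does that for a list of column objects. The `Column` dataclass is a stand-in that mirrors the three fields printed above; it is not the SDK's own class, and accessing the list as `datasource.metadata.columns` is an assumption based on the printed output.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Column:
    """Stand-in mirroring the fields shown in the printed metadata."""
    name: str
    data_type: str
    var_type: str


def columns_by_data_type(columns):
    """Group column names under their `data_type` label."""
    groups = defaultdict(list)
    for column in columns:
        groups[column.data_type].append(column.name)
    return dict(groups)


# With the SDK this list would presumably be datasource.metadata.columns.
columns = [
    Column("cut", "categorical", "string"),
    Column("price", "numerical", "int"),
    Column("color", "categorical", "string"),
]
print(columns_by_data_type(columns))
```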

### How to access data?

In [5]:
# connector = datasource.connector
# print(f'Connector: {connector}')

sample_data = datasource.read_sample()
print(sample_data)

# full_data = datasource.read()

Dataset

Shape: (10000, 10)
Schema:
    Column Variable type
0    carat         float
1      cut        string
2    color        string
3  clarity        string
4    depth         float
5    table         float
6    price           int
7        x         float
8        y         float
9        z         float




## Synthesizers Module

The platform SDK's Synthesizers module lets you interact with and consume any synthesizer created through the UI. Through the SDK, the user can perform the following actions:

Synthesizers:

- **Synthesizers.list** - List the synthesizers available in the user's namespace.
- **Synthesizers.get** - Access a particular synthesizer. The user can get the synthesizer's training status and the metadata of the associated dataset.

Synthesizer & Samples:

- **synthesizer.list_samples** - Access the history of samples generated with the synthesizer.
- **synthesizer.sample** - Generate a sample with n records.
- **synthesizer.get_sample** - Access a previously generated sample by its uid.

In [6]:
from ydata.platform.synthesizers import Synthesizers

### How to list my synthesizers?

In [12]:
synthesizers = Synthesizers.list()
synthesizers

[Synthesizer(uid='0096e03c-4a0f-466d-bb47-ff1e8b972c0b', name='Diamonds', data_source_uid='2d2981da-0f18-4622-8f89-dae9e27be838', date=datetime.datetime(2022, 7, 6, 21, 56, 22, tzinfo=datetime.timezone.utc), status=<Status.failed: 'failed'>, metadata=Metadata(columns=[Column(name='carat', data_type='numerical', var_type='float', generation=True), Column(name='cut', data_type='categorical', var_type='string', generation=True), Column(name='color', data_type='categorical', var_type='string', generation=True), Column(name='clarity', data_type='categorical', var_type='string', generation=True), Column(name='depth', data_type='categorical', var_type='float', generation=True), Column(name='table', data_type='categorical', var_type='float', generation=True), Column(name='price', data_type='numerical', var_type='int', generation=True), Column(name='x', data_type='categorical', var_type='float', generation=True), Column(name='y', data_type='categorical', var_type='float', generation=True), Colu

### How to access a specific synthesizer?

In [8]:
synthesizer = Synthesizers.get('44479a21-bc43-49c4-8e82-b52858e11009')
print(f'status {synthesizer.status}')
print(f'metadata {synthesizer.metadata}')

status Status.finished
metadata columns=[Column(name='Column_0', data_type='categorical', var_type='int', generation=True), Column(name='Column_1', data_type='categorical', var_type='string', generation=True), Column(name='Column_2', data_type='numerical', var_type='int', generation=True), Column(name='Column_3', data_type='categorical', var_type='string', generation=True), Column(name='Column_4', data_type='categorical', var_type='int', generation=True), Column(name='Column_5', data_type='categorical', var_type='string', generation=True), Column(name='Column_6', data_type='categorical', var_type='string', generation=True), Column(name='Column_7', data_type='categorical', var_type='string', generation=True), Column(name='Column_8', data_type='categorical', var_type='string', generation=True), Column(name='Column_9', data_type='categorical', var_type='string', generation=True), Column(name='Column_10', data_type='categorical', var_type='int', generation=True), Column(name='Column_11', d

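The outputs above show synthesizers in both `failed` and `finished` states, so it can help to check the status before requesting a sample. This is a minimal sketch on stand-in objects: the `Status` enum below only contains the two values visible in the outputs, the `syn-*` uids are made up, and whether the platform rejects sampling from an unfinished synthesizer is an assumption.

```python
from enum import Enum
from types import SimpleNamespace


class Status(Enum):
    """Stand-in with the two status values that appear in the outputs."""
    finished = "finished"
    failed = "failed"


def ready_to_sample(synthesizers):
    """Return only the synthesizers whose status member is named 'finished'."""
    return [s for s in synthesizers if s.status.name == "finished"]


# Stand-in objects; with the SDK this list would come from Synthesizers.list().
synthesizers = [
    SimpleNamespace(uid="syn-1", name="Diamonds", status=Status.failed),
    SimpleNamespace(uid="syn-2", name="Census", status=Status.finished),
]

for synth in ready_to_sample(synthesizers):
    print(f"{synth.name} ({synth.uid}) is ready")
```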
### Generating a sample from the synthesizer

In [7]:
sample = synthesizer.sample(200)

print(f'generated sample \n{sample}')

generated sample 
uid='3a940d5b-0f88-4fa2-8539-052b45593120' author='luis.portela@ydata.ai' total_records=200 link=''


### List samples for a synthesizer

In [9]:
samples = synthesizer.list_samples()
print(samples)

[Sample(uid='9cbb2893-d95c-43ea-8032-42c29c14e08c', author='fabiana.clemente@ydata.ai', total_records=50000, link='')]

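The sample history can be summarised with plain Python. The sketch below adds up `total_records` and looks up a sample by `uid` in a list shaped like the output above. It uses stand-in objects with made-up uids, not real SDK results.

```python
from types import SimpleNamespace


def total_records(samples):
    """Sum the `total_records` field over all samples."""
    return sum(sample.total_records for sample in samples)


def find_sample(samples, uid):
    """Return the first sample whose `uid` matches, or None if absent."""
    return next((sample for sample in samples if sample.uid == uid), None)


# Stand-in objects; with the SDK this list would come from
# synthesizer.list_samples().
samples = [
    SimpleNamespace(uid="sample-a", total_records=50000),
    SimpleNamespace(uid="sample-b", total_records=200),
]

print(total_records(samples))
print(find_sample(samples, "sample-b"))
```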

### Access a specific sample

In [10]:
sample = synthesizer.get_sample('9cbb2893-d95c-43ea-8032-42c29c14e08c')
print(sample)

uid='9cbb2893-d95c-43ea-8032-42c29c14e08c' author='fabiana.clemente@ydata.ai' total_records=50000 link=''
