# Google Cloud Storage Connector - Quick Start

The GCS connector lets you read data from and write data to Google Cloud Storage, and integrate it with YData's platform.
Reading a dataset from GCS directly into a YData `Dataset` makes it available to the Data Quality, Data Synthetisation and Preprocessing blocks.

The following tutorial covers:
- How to read data from GCS
- How to read a sample of the data from GCS
- How to write data to GCS

In [None]:
# Import the necessary packages
from ydata.connectors import GCSConnector
from ydata.connectors.filetype import FileType
from ydata.utils.formats import read_json

In [None]:
# Load your credentials from a file
token = read_json('{insert-path-to-credentials}')
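
The credentials file is presumably a Google Cloud service account key in JSON format. As a rough sketch of what loading it involves, the helper below uses only the standard library and checks for the `project_id` field that the next cell passes to the connector. `load_keyfile` is a hypothetical name used for illustration, not part of the YData API, and `read_json` may behave differently.

```python
import json
from pathlib import Path


def load_keyfile(path):
    """Parse a JSON credentials file into a dict (hypothetical helper, not YData's API)."""
    with Path(path).open(encoding="utf-8") as handle:
        token = json.load(handle)
    # The connector example reads token['project_id'], so fail early if it is absent.
    if "project_id" not in token:
        raise KeyError("credentials file has no 'project_id' field")
    return token
```

Failing at load time gives a clearer error than a missing key surfacing later, when the connector is instantiated.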

In [None]:
# Instantiate the Connector
connector = GCSConnector(project_id=token['project_id'], keyfile_dict=token)

In [None]:
# Load a dataset
# The file_type argument is optional. If not provided, we will infer it from the path you have provided.
data = connector.read_file('gs://{insert-bucket}/{insert-filepath}', file_type=FileType.CSV)
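
The comment above says the file type can be inferred from the path when `file_type` is omitted. How the connector does this is not shown in this tutorial; one plausible approach is a lookup on the file extension, sketched below. The `infer_file_type` function and the `EXTENSION_TO_TYPE` table are assumptions made for illustration, and the set of types the connector actually supports may differ.

```python
from pathlib import PurePosixPath

# Hypothetical extension-to-type table; the connector's real mapping may differ.
EXTENSION_TO_TYPE = {".csv": "csv", ".parquet": "parquet"}


def infer_file_type(path):
    """Guess a file type from the extension of a gs:// path (illustrative only)."""
    suffix = PurePosixPath(path).suffix.lower()
    try:
        return EXTENSION_TO_TYPE[suffix]
    except KeyError:
        raise ValueError(f"cannot infer a file type from {path!r}") from None
```

When a path has no extension, or an unusual one, passing `file_type` explicitly as in the cell above avoids relying on inference.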

In [None]:
# For a quick glimpse, we can load a small subset of the data (e.g. 1%)
small_data = connector.read_sample('gs://{insert-bucket}/{insert-filepath}', sample_size=0.01)

In [None]:
# We could alternatively define a specific number of rows
very_small_data = connector.read_sample('gs://{insert-bucket}/{insert-filepath}', sample_size=67)
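
The two `read_sample` calls suggest that a `sample_size` below 1 is read as a fraction of the rows, while a whole number is read as an absolute row count. A minimal sketch of that interpretation follows, assuming this is how the argument is resolved; the connector's actual rules, rounding included, are not documented here. `resolve_sample_rows` is a hypothetical name.

```python
def resolve_sample_rows(sample_size, total_rows):
    """Turn a sample_size into a row count (assumed semantics, not YData's code)."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if sample_size < 1:
        # Fractional value: take that share of the rows, but at least one row.
        wanted = max(1, round(total_rows * sample_size))
    else:
        # Whole number: an absolute row count.
        wanted = int(sample_size)
    # Never ask for more rows than the dataset holds.
    return min(wanted, total_rows)
```

Under these assumptions, `sample_size=0.01` on a 10,000-row file would request 100 rows, and `sample_size=67` would request 67 rows regardless of the file size.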

In [None]:
# Now imagine we want to store the sampled data.
connector.write_file(small_data, 'gs://{insert-bucket}/{insert-filepath}')

In [None]:
# Alternatively, we can write a new DataFrame
# (built by hand, because pandas.util.testing was removed in pandas 2.0)
import pandas as pd
dummy_df = pd.DataFrame({'A': [1.0, 2.0, 3.0], 'B': [4.0, 5.0, 6.0]})
connector.write_file(dummy_df, 'gs://{insert-bucket}/{insert-filepath}', write_index=True)

## Advanced
Advanced features enable you to manage Google Cloud Storage directly through the connector.

In [None]:
# Delete a specific blob
connector.delete_blob_if_exists('gs://{insert-bucket}/{insert-filepath}')

In [None]:
# List the contents under a given bucket
connector.ls('gs://{insert-bucket}/')

In [None]:
# List the contents under a given path within a bucket
connector.ls('gs://{insert-bucket}/{insert-path}')
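
Every call in this tutorial addresses data through a `gs://{bucket}/{path}` URI. To make that structure explicit, the sketch below splits such a URI into its bucket and path parts using the standard library. `split_gcs_uri` is a hypothetical helper for illustration; it is not part of the connector.

```python
from urllib.parse import urlparse


def split_gcs_uri(uri):
    """Split 'gs://bucket/path' into (bucket, path); path is '' for the bucket root."""
    parsed = urlparse(uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a gs:// URI: {uri!r}")
    # Drop the leading slash so the path matches the object name inside the bucket.
    return parsed.netloc, parsed.path.lstrip("/")
```

Object names containing `?` or `#` would need different handling, since `urlparse` treats those characters as query and fragment separators.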