# Introduction

In this notebook we provide a simple example of how to use the KaggleStorageClient.

Note that for this notebook to run properly, this repository must be installed as a Python package.

Uploading also requires your Kaggle API credentials to be properly configured on this machine.
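As a quick sanity check before running the rest of the notebook, the sketch below looks for credentials in the places the official Kaggle API client conventionally reads them: the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables, or a `kaggle.json` file in `~/.kaggle` (or in the directory named by `KAGGLE_CONFIG_DIR`). It assumes `KaggleStorageClient` relies on that client; `kaggle_credentials_available` is a hypothetical helper and not part of this repository.

```python
import os
from pathlib import Path


def kaggle_credentials_available(env=None, config_dir=None):
    """Return True if Kaggle credentials appear to be configured.

    Hypothetical helper: it only checks that credentials exist in the
    conventional locations, not that they are valid.
    """
    env = os.environ if env is None else env

    # Option 1: credentials supplied through environment variables.
    if env.get("KAGGLE_USERNAME") and env.get("KAGGLE_KEY"):
        return True

    # Option 2: a kaggle.json file in the configuration directory.
    if config_dir is None:
        config_dir = env.get("KAGGLE_CONFIG_DIR") or Path.home() / ".kaggle"
    return (Path(config_dir) / "kaggle.json").is_file()


print(kaggle_credentials_available())
```

If this prints `False`, the upload cells below will most likely fail with an authentication error.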

## Setting everything up

In [1]:
import os
import pandas as pd

from kaggle_storage_client import KaggleStorageClient

In [2]:
client = KaggleStorageClient()

## Creating the dummiest of datasets

In [3]:
df = pd.DataFrame([{
    'A': 0,
    'B': 1
}])

## Uploading by using the file path

First we create an example directory and save our dataset there. Then we upload it to Kaggle using `add_upload`, passing it the path of the file rather than the data itself.

In [5]:
example_dir = 'example'
file_path = os.path.join(example_dir, 'test.csv')
os.makedirs(example_dir, exist_ok=True)  # Does not fail if the cell is re-run.
df.to_csv(file_path)

In [7]:
dataset = 'example-dataset'  # Name of our Kaggle dataset.
kaggle_file_name = 'example.csv'  # Name of the file in local storage and in the Kaggle dataset.
client.add_upload(dataset, kaggle_file_name, file_path)

upload -dataset example-dataset -filepath data/manuelalvarez/example-dataset/example.csv -folder data/manuelalvarez/example-dataset
remote-status of manuelalvarez/example-dataset is None
Data package template written to: data/manuelalvarez/example-dataset/dataset-metadata.json
{"title": "Example Dataset", "id": "manuelalvarez/example-dataset", "licenses": [{"name": "CC0-1.0"}]}
create data/manuelalvarez/example-dataset
data/manuelalvarez/example-dataset
  f> example.csv
  >> [L001] ,A,B
  >> [L002] 0,0,1
  f> dataset-metadata.json
  >> [L001] {"title": "Example Dataset", "id": "manuelalvarez/example-dataset", "licenses": [{"name": "CC0-1.0"}]}
Starting upload for file example.csv


100%|██████████| 11.0/11.0 [00:03<00:00, 3.25B/s]


Upload successful: example.csv (11B)
Your private Dataset is being created. Please check progress at https://www.kaggle.com/manuelalvarez/example-dataset


In [8]:
# Clean up the local example file and directory.

os.remove(file_path)
os.rmdir(example_dir)

## Uploading using a DataFrame

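Now we upload the same data under a different file name, this time passing the DataFrame itself. To do so, we need to supply one more argument, `content_call`. In this example it is a tuple holding the name of the DataFrame method to call, its positional arguments, and its keyword arguments.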
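To make the shape of `content_call` concrete, here is a minimal sketch of how such a tuple could be turned into file content. This is an assumption about the general idea, not the client's actual implementation; `render_content` is a hypothetical helper introduced only for illustration.

```python
import pandas as pd


def render_content(obj, content_call):
    """Call the named method on obj with the given arguments.

    Hypothetical helper: content_call is assumed to be a tuple of
    (method name, positional arguments, keyword arguments).
    """
    method_name, args, kwargs = content_call
    return getattr(obj, method_name)(*args, **kwargs)


df = pd.DataFrame([{'A': 0, 'B': 1}])
call = ('to_csv', [], {'header': True, 'index': False})

# Equivalent to df.to_csv(header=True, index=False), which returns the CSV as a string.
text = render_content(df, call)
print(text)
```

Because `index=False` is passed, the resulting CSV has no leading index column, unlike the file written with a bare `df.to_csv(file_path)` earlier in this notebook.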

In [9]:
dataset = 'example-dataset'  # Name of our Kaggle dataset.
kaggle_file_name = 'example2.csv'  # Name of the file in local storage and in the Kaggle dataset.
call = ('to_csv', [], {'header': True, 'index': False})  # (method name, args, kwargs)

client.add_upload(dataset, kaggle_file_name, df, call)

upload -dataset example-dataset -filepath data/manuelalvarez/example-dataset/example2.csv -folder data/manuelalvarez/example-dataset
remote-status of manuelalvarez/example-dataset is ready
{"title": "Example Dataset", "id": "manuelalvarez/example-dataset", "licenses": [{"name": "CC0-1.0"}]}
sync data/manuelalvarez/example-dataset
data/manuelalvarez/example-dataset
  f> example2.csv
  >> [L001] A,B
  >> [L002] 0,1
  f> example.csv
  >> [L001] ,A,B
  >> [L002] 0,0,1
  f> dataset-metadata.json
  >> [L001] {"title": "Example Dataset", "id": "manuelalvarez/example-dataset", "licenses": [{"name": "CC0-1.0"}]}
Starting upload for file example2.csv


100%|██████████| 8.00/8.00 [00:03<00:00, 2.11B/s]


Upload successful: example2.csv (8B)
Starting upload for file example.csv


100%|██████████| 11.0/11.0 [00:04<00:00, 2.62B/s]


Upload successful: example.csv (11B)
Dataset version is being created. Please check progress at https://www.kaggle.com/manuelalvarez/example-dataset
