<a href="https://colab.research.google.com/github/chrismarkella/Kaggle-access-from-Google-Colab/blob/master/AccessKaggle.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
import os

import numpy as np
import pandas as pd

from getpass import getpass

In [0]:
import json


def access_kaggle():
    """
    Access Kaggle from Google Colab.

    If /root/.kaggle/kaggle.json does not exist, prompt for the Kaggle
    username and API key and write them to that file.
    """
    KAGGLE_ROOT = os.path.join('/root', '.kaggle')
    KAGGLE_PATH = os.path.join(KAGGLE_ROOT, 'kaggle.json')

    if not os.path.exists(KAGGLE_PATH):
        user = getpass(prompt='Kaggle username: ')
        key = getpass(prompt='Kaggle API key: ')

        os.makedirs(KAGGLE_ROOT, exist_ok=True)
        # json.dump escapes quotes and backslashes in the credentials.
        with open(KAGGLE_PATH, mode='w') as f:
            json.dump({'username': user, 'key': key}, f)
        # Owner-only read/write; the Kaggle CLI warns about looser permissions.
        os.chmod(KAGGLE_PATH, 0o600)
        del user, key
        print('Kaggle is successfully set up. Good to go.')

access_kaggle()


Kaggle username: ··········
Kaggle API key: ··········
Kaggle is successfully set up. Good to go.
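
As a quick sanity check after the setup, the credentials file can be inspected from Python. The helper below is only a sketch: `check_kaggle_credentials` is a hypothetical name, not part of the Kaggle API, and it assumes the file should be JSON with non-empty `username` and `key` fields and permission bits `600`.

```python
import json
import os
import stat


def check_kaggle_credentials(path='/root/.kaggle/kaggle.json'):
    """Return a list of problems found with the credentials file (empty if none)."""
    if not os.path.isfile(path):
        return [f'{path} does not exist']

    problems = []
    # The file should be readable and writable by its owner only.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode != 0o600:
        problems.append(f'permissions are {oct(mode)}, expected 0o600')

    try:
        with open(path) as f:
            data = json.load(f)
    except json.JSONDecodeError as err:
        return problems + [f'not valid JSON: {err}']

    if not isinstance(data, dict):
        return problems + ['top-level JSON value is not an object']

    for field in ('username', 'key'):
        if not data.get(field):
            problems.append(f'missing or empty field: {field}')
    return problems
```

Calling `check_kaggle_credentials()` with no arguments checks the default Colab location and returns an empty list when nothing looks wrong.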


### Searching for "iowa" datasets.
Using the `-s` command option for search.

In [0]:
!kaggle datasets list -s iowa

ref                                                        title                                       size  lastUpdated          downloadCount  
---------------------------------------------------------  -----------------------------------------  -----  -------------------  -------------  
residentmario/iowa-liquor-sales                            Iowa Liquor Sales                          731MB  2017-11-14 19:59:36           4500  
nickptaylor/iowa-house-prices                              Iowa House Prices                          179KB  2018-02-21 22:16:19            729  
naberhausj/ford-cars-in-iowa                               Ford Cars in Iowa                            3MB  2018-12-06 23:08:05             97  
emurphy/ames-iowa-housing-prices-dataset                   Ames Iowa Housing Prices Dataset           190KB  2018-04-25 14:54:54            261  
firmament11/iowa-farm-data                                 IOWA Farm Data                              13KB  2019-04-24 01:3
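
The table output is convenient to read but awkward to process. A possible alternative, sketched below, is to ask the CLI for CSV output and parse it in Python. This assumes the installed Kaggle CLI accepts the `--csv` flag on `datasets list` and that the CSV columns match the table header above; `parse_dataset_csv` and `search_datasets` are hypothetical helper names. The sample rows are copied from the table output above.

```python
import csv
import io
import subprocess


def parse_dataset_csv(text):
    """Parse CSV text (assumed format of `kaggle datasets list --csv`) into dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def search_datasets(term):
    """Run the search through the Kaggle CLI and return the parsed rows."""
    result = subprocess.run(
        ['kaggle', 'datasets', 'list', '-s', term, '--csv'],
        capture_output=True, text=True, check=True,
    )
    return parse_dataset_csv(result.stdout)


# Two rows from the table output above, rewritten as CSV for illustration.
sample = (
    'ref,title,size,lastUpdated,downloadCount\n'
    'residentmario/iowa-liquor-sales,Iowa Liquor Sales,731MB,2017-11-14 19:59:36,4500\n'
    'nickptaylor/iowa-house-prices,Iowa House Prices,179KB,2018-02-21 22:16:19,729\n'
)
rows = parse_dataset_csv(sample)
print([row['ref'] for row in rows])
```

With valid credentials in place, `search_datasets('iowa')` would run the same search as the cell above. If the CLI prints warnings before the CSV header, they would need to be stripped before parsing.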

### List the files in the dataset.

In [0]:
!kaggle datasets files nickptaylor/iowa-house-prices

name        size  creationDate         
---------  -----  -------------------  
test.csv   441KB  2018-02-21 22:16:19  
train.csv  450KB  2018-02-21 22:16:19  


### Downloading only a specific file from the dataset.
Using the `-f file_name` command option.

In [0]:
!kaggle datasets download nickptaylor/iowa-house-prices -f train.csv
!ls -lh
!rm train.csv
print('After deleting the "train.csv" file.')
!ls -lh

Downloading train.csv to /content
  0% 0.00/450k [00:00<?, ?B/s]
100% 450k/450k [00:00<00:00, 66.8MB/s]
total 456K
drwxr-xr-x 1 root root 4.0K Dec 12 16:48 sample_data
-rw-r--r-- 1 root root 450K Dec 18 22:21 train.csv
After deleting the "train.csv" file.
total 4.0K
drwxr-xr-x 1 root root 4.0K Dec 12 16:48 sample_data


### Downloading the full dataset.

In [0]:
!kaggle datasets download nickptaylor/iowa-house-prices
!ls -lh

Downloading iowa-house-prices.zip to /content
  0% 0.00/179k [00:00<?, ?B/s]
100% 179k/179k [00:00<00:00, 63.0MB/s]
total 184K
-rw-r--r-- 1 root root 180K Dec 18 22:21 iowa-house-prices.zip
drwxr-xr-x 1 root root 4.0K Dec 12 16:48 sample_data


### The dataset is downloaded as a zip file. We unzip it and delete the zip file.

In [0]:
!unzip iowa-house-prices.zip
!rm iowa-house-prices.zip
!ls -lh

Archive:  iowa-house-prices.zip
  inflating: test.csv                
  inflating: train.csv               
total 900K
drwxr-xr-x 1 root root 4.0K Dec 12 16:48 sample_data
-rw-r--r-- 1 root root 441K Sep 27 17:59 test.csv
-rw-r--r-- 1 root root 450K Sep 27 17:59 train.csv
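
The same unzip-and-delete step can be done without shell commands, using the standard-library `zipfile` module. This is a minimal sketch; `unzip_and_remove` is a hypothetical helper name, and it assumes the archive is trusted, since `extractall` writes whatever paths the archive contains.

```python
import os
import zipfile


def unzip_and_remove(zip_path, target_dir='.'):
    """Extract every member of zip_path into target_dir, then delete the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
        names = archive.namelist()
    # Remove the archive only after extraction has finished without error.
    os.remove(zip_path)
    return names
```

For the dataset above, the call would be `unzip_and_remove('iowa-house-prices.zip')`, which returns the names of the extracted files.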


### We delete the CSV files.

In [0]:
!rm *.csv
!ls -lh

total 4.0K
drwxr-xr-x 1 root root 4.0K Dec 12 16:48 sample_data


### We can download, unzip and delete the zip file in one step.

Using the `--unzip` command option.

In [0]:
!kaggle datasets download nickptaylor/iowa-house-prices --unzip
!ls -lh

Downloading iowa-house-prices.zip to /content
  0% 0.00/179k [00:00<?, ?B/s]
100% 179k/179k [00:00<00:00, 65.5MB/s]
total 900K
drwxr-xr-x 1 root root 4.0K Dec 12 16:48 sample_data
-rw-r--r-- 1 root root 441K Dec 18 22:21 test.csv
-rw-r--r-- 1 root root 450K Dec 18 22:21 train.csv


In [0]:
!rm *.csv
!ls -lh

total 4.0K
drwxr-xr-x 1 root root 4.0K Dec 12 16:48 sample_data


### Installing the "tree" command. We will use this soon.

In [0]:
!apt-get install tree

Reading package lists... Done
Building dependency tree       
Reading state information... Done
tree is already the newest version (1.7.0-5).
The following package was automatically installed and is no longer required:
  libnvidia-common-430
Use 'apt autoremove' to remove it.
0 upgraded, 0 newly installed, 0 to remove and 7 not upgraded.


### Downloading the zip and unzipping to a folder.
Using the `-p path_to_the_desired_folder` command option.

In [0]:
!kaggle datasets download nickptaylor/iowa-house-prices -p ./dataset/housing/ --unzip
!tree -sh dataset/

Downloading iowa-house-prices.zip to ./dataset/housing
  0% 0.00/179k [00:00<?, ?B/s]
100% 179k/179k [00:00<00:00, 65.2MB/s]
dataset/
└── [4.0K]  housing
    ├── [441K]  test.csv
    └── [450K]  train.csv

1 directory, 2 files
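
If installing `tree` is not an option, a similar listing can be produced with `os.walk`. The sketch below is an approximation, not a reimplementation of `tree -sh`: `human_size` and `list_tree` are hypothetical helper names, the size rounding may differ slightly from `tree`, and the output uses plain indentation instead of box-drawing characters.

```python
import os


def human_size(num_bytes):
    """Format a byte count roughly the way `tree -h` does (e.g. 450K)."""
    size = float(num_bytes)
    for unit in ('', 'K', 'M', 'G'):
        if size < 1024:
            return f'{size:.0f}{unit}'
        size /= 1024
    return f'{size:.0f}T'


def list_tree(root):
    """Return one line per directory and file under root, indented by depth."""
    lines = []
    root = root.rstrip(os.sep)
    for current, dirs, files in os.walk(root):
        dirs.sort()  # Visit subdirectories in a stable, alphabetical order.
        if current == root:
            depth = 0
        else:
            depth = os.path.relpath(current, root).count(os.sep) + 1
        lines.append('    ' * depth + os.path.basename(current) + '/')
        for name in sorted(files):
            size = os.path.getsize(os.path.join(current, name))
            lines.append('    ' * (depth + 1) + f'[{human_size(size)}]  {name}')
    return lines
```

Running `print('\n'.join(list_tree('dataset/')))` after the download would list the `housing` folder and the two CSV files with their sizes.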


In [0]:
!rm -rf dataset/
!ls -lh

total 4.0K
drwxr-xr-x 1 root root 4.0K Dec 12 16:48 sample_data


### Download, unzip to a folder, delete the zip file and do everything in "quiet" mode.
Using the `-q` command option for quiet.

In [0]:
!kaggle datasets download nickptaylor/iowa-house-prices -p ./dataset/housing/ --unzip -q

In [0]:
!tree -sh dataset/

dataset/
└── [4.0K]  housing
    ├── [441K]  test.csv
    └── [450K]  train.csv

1 directory, 2 files
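
The options used in this notebook (`-f`, `-p`, `--unzip`, `-q`) can be combined from plain Python as well, which helps when the download has to run outside a notebook cell. The following is a sketch under that assumption; `build_download_command` and `download_dataset` are hypothetical helper names, and only the flags shown above are covered.

```python
import subprocess


def build_download_command(dataset, file_name=None, path=None,
                           unzip=False, quiet=False):
    """Build the argument list for `kaggle datasets download`."""
    command = ['kaggle', 'datasets', 'download', dataset]
    if file_name is not None:
        command += ['-f', file_name]   # Download a single file only.
    if path is not None:
        command += ['-p', path]        # Download into this folder.
    if unzip:
        command.append('--unzip')      # Unzip and delete the zip file.
    if quiet:
        command.append('-q')           # Suppress the progress output.
    return command


def download_dataset(dataset, **options):
    """Run the download; requires the Kaggle CLI and valid credentials."""
    subprocess.run(build_download_command(dataset, **options), check=True)


command = build_download_command(
    'nickptaylor/iowa-house-prices',
    path='./dataset/housing/', unzip=True, quiet=True)
print(' '.join(command))
```

The printed command is the same one used in the quiet-mode cell above. Passing the arguments as a list avoids shell quoting problems with unusual file or folder names.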
