# File Import/Export in Google Colab

## Uploading files from your local file system

`files.upload` returns a dictionary of the files that were uploaded.
Each key is a file name, and each value is the uploaded data as bytes.

In [None]:
from google.colab import files

uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=fn, length=len(uploaded[fn])))

Saving 2017_StPaul_MN_Real_Estate.csv to 2017_StPaul_MN_Real_Estate.csv
User uploaded file "2017_StPaul_MN_Real_Estate.csv" with length 4096739 bytes
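
Because each value in the returned dictionary is raw bytes, it usually needs decoding before use. The sketch below assumes the upload is a UTF-8 CSV and parses it with the standard `csv` module. The `uploaded` dictionary here is a made-up stand-in for the result of `files.upload()`, and `parse_uploaded_csv` is a hypothetical helper, not part of `google.colab`.

```python
import csv
import io

# Stand-in for the dictionary returned by files.upload(): file name -> raw bytes.
# The file name and contents are made up for illustration.
uploaded = {"example.csv": b"city,price\nSt Paul,250000\nMinneapolis,310000\n"}


def parse_uploaded_csv(uploaded, name, encoding="utf-8"):
    """Decode the uploaded bytes and return the rows as a list of dicts."""
    text = uploaded[name].decode(encoding)
    return list(csv.DictReader(io.StringIO(text)))


rows = parse_uploaded_csv(uploaded, "example.csv")
print(len(rows), "rows; first row:", rows[0])
```

In Colab, the same helper could be called with the real `uploaded` dictionary and the name of the file you selected.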


In [None]:
ls

2017_StPaul_MN_Real_Estate.csv  README.md  sample_data/


Please note that files uploaded to Google Colab do not persist after the session is terminated (typically within a 24-hour window), so do not treat Colab as permanent storage.

## Downloading files to your local file system

`files.download` will invoke a browser download of the file to the user's local computer.

In [None]:
from google.colab import files

files.download('2017_StPaul_MN_Real_Estate.csv')
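
A common use of `files.download` is saving results that were produced in the notebook. The sketch below writes a small CSV and then requests a browser download. The file name `results.csv`, the data, and the `write_results` helper are assumptions for illustration; the import is guarded because `google.colab` is only available inside Colab.

```python
import csv


def write_results(path, rows):
    """Write a small table of results to a CSV file and return its path."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return path


# Hypothetical file name and data, used only for illustration.
result_path = write_results("results.csv", [["name", "score"], ["a", 1], ["b", 2]])

try:
    from google.colab import files  # only importable inside Colab

    files.download(result_path)
except ImportError:
    print("Not running in Colab; file left at", result_path)
```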

## Removing a file from the Colab container

We can use the Linux command `rm` to remove the file we just uploaded:

In [None]:
rm 2017_StPaul_MN_Real_Estate.csv

We can check that the file was removed by listing the files in the current directory (`/content`) with the Linux command `ls`:

In [None]:
ls

README.md  sample_data/
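
The same remove-and-list steps can also be done from Python rather than shell commands, which is convenient inside functions or loops. This is a minimal sketch using only the standard library; `remove_file` and the file name `example.csv` are hypothetical, and the demonstration runs in a temporary directory so nothing real is deleted.

```python
import os
import tempfile
from pathlib import Path


def remove_file(path):
    """Delete a file if it exists; return True if something was removed."""
    p = Path(path)
    if p.is_file():
        p.unlink()
        return True
    return False


# Demonstrated in a temporary directory so nothing real is deleted.
with tempfile.TemporaryDirectory() as workdir:
    target = Path(workdir) / "example.csv"  # hypothetical file name
    target.write_text("a,b\n1,2\n")
    before = sorted(os.listdir(workdir))  # Python counterpart of `ls`
    removed = remove_file(target)         # Python counterpart of `rm`
    after = sorted(os.listdir(workdir))

print(before, removed, after)
```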


## Manual upload/download

To upload a file manually, click the "Files" icon at the top left of your Colab notebook, then click "Upload".

To download a file manually, right-click it in the same panel.

**Note:** The working directory for Colab notebooks is `/content`, and we will only use that folder to avoid breaking anything.

## Mounting Google Drive locally

The example below shows how to mount your Google Drive in your virtual machine using an authorization code, and how to list and copy files there. Once a file is written under the mount point (for example `foo.txt`), it becomes visible in https://drive.google.com/

Note that this only supports reading and writing files. Programmatically changing sharing settings and similar operations requires a different approach, such as the Drive API.

In [None]:
# Mount the drive
from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)

Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly

Enter your authorization code:
··········
Mounted at /content/gdrive
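
Once the Drive is mounted, ordinary Python file operations work on it. The sketch below writes a small text file (named `foo.txt` here) and reads it back. The `write_and_read` helper and the file contents are assumptions for illustration, and the code falls back to a temporary directory when the mount point `/content/gdrive/My Drive` does not exist, for example outside Colab.

```python
import os
import tempfile


def write_and_read(folder, name="foo.txt", text="Hello Google Drive!"):
    """Write a text file into `folder`, then read it back and return its contents."""
    path = os.path.join(folder, name)
    with open(path, "w") as f:
        f.write(text)
    with open(path) as f:
        return f.read()


# Inside Colab, after drive.mount('/content/gdrive'), files written here
# land in Google Drive. Elsewhere, use a temporary directory instead.
drive_root = "/content/gdrive/My Drive"
if os.path.isdir(drive_root):
    contents = write_and_read(drive_root)
else:
    with tempfile.TemporaryDirectory() as tmp:
        contents = write_and_read(tmp)

print(contents)
```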


In [None]:
ls gdrive/My\ Drive/Teaching/QM875-Python-for-Data-Science-Bootcamp-2020-Summer/Python-Bootcamp-master/data

2017_StPaul_MN_Real_Estate.csv  movies_ratings.csv
FremontBridge.csv               state-abbrevs.csv
GOOGL.csv                       state-areas.csv
GOT-battles.csv                 state-population.csv
GOT-character-deaths.csv        Telco-Customer-Churn.csv
movies_metadata.csv


I can now use this path (or any other path in My Drive) to store data that I would like to persist. For instance, the cell below copies `sample_data/california_housing_train.csv` to a folder of my choice in my Google Drive.

In [None]:
cp sample_data/california_housing_train.csv gdrive/My\ Drive/Teaching/QM875-Python-for-Data-Science-Bootcamp-2020-Summer/Python-Bootcamp-master/data

Let's confirm the file was copied properly. We can also go to Drive and visually inspect it:

In [None]:
ls gdrive/My\ Drive/Teaching/QM875-Python-for-Data-Science-Bootcamp-2020-Summer/Python-Bootcamp-master/data

2017_StPaul_MN_Real_Estate.csv  movies_metadata.csv
california_housing_train.csv    movies_ratings.csv
FremontBridge.csv               state-abbrevs.csv
GOOGL.csv                       state-areas.csv
GOT-battles.csv                 state-population.csv
GOT-character-deaths.csv        Telco-Customer-Churn.csv
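
The copy step can also be written in Python with `shutil.copy`, which is handy when the destination folder may not exist yet. This is a sketch under stated assumptions: `copy_to_folder` is a hypothetical helper, and temporary directories stand in for the Colab working directory and the mounted Drive folder.

```python
import shutil
import tempfile
from pathlib import Path


def copy_to_folder(src, dest_folder):
    """Copy `src` into `dest_folder`, creating the folder if needed."""
    dest = Path(dest_folder)
    dest.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy(src, dest))


# Temporary directories stand in for /content and the mounted Drive folder.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "california_housing_train.csv"
    src.write_text("longitude,latitude\n-114.31,34.19\n")  # made-up sample rows
    copied = copy_to_folder(src, Path(tmp) / "gdrive" / "data")
    copied_name = copied.name
    copied_ok = copied.read_text() == src.read_text()

print(copied_name, copied_ok)
```

In Colab, the destination would be a folder under `/content/gdrive/My Drive` after the Drive has been mounted.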


## Your Turn

The goal of this exercise is to upload some files to Colab and then copy them to your Google Drive, so that you know how to import files into Colab and export them from it. Keep in mind that any file in Colab will be deleted within 24 hours (this is a temporary environment), so it is important to know how to export the results of your analysis to somewhere that will persist, such as your Google Drive or your laptop.

1. Go to https://grouplens.org/datasets/movielens/ and download the `ml-latest-small.zip` file. Unzip it, then upload the following files to your Colab (`/content` folder):
  * movies.csv
  * ratings.csv
2. Mount Google Drive locally and copy these two files into a folder called `movie_rating` within your QM875 folder.
  * You can use the following code example to copy `movies.csv` to a folder called `tmp_QM875` (change it to your desired destination):
    ```
    cp movies.csv gdrive/My\ Drive/tmp_QM875
    ```
  * Notice that `gdrive/My\ Drive/` is the root folder of your Drive.
3. Check your Drive and make sure the files are there.
4. Download these files to your local laptop.

In [None]:
# Your code goes here (please use as many cells as needed)