# **File Handling on Google Colab**

Chanin Nantasenamat

[*'Data Professor' YouTube channel*](http://youtube.com/dataprofessor)

In this Jupyter notebook, I will show you how to access Google Drive from within Google Colab so that you can create, read, copy, move and delete files.

---

## **Mounting the Google Drive into Google Colab**

In [1]:
from google.colab import drive
drive.mount('/content/gdrive/', force_remount=True)

Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly

Enter your authorization code:
··········
Mounted at /content/gdrive/


## **List content of directory**

Using the ***ls*** command, we list the contents of the current working directory. At this point we can see the newly mounted **gdrive** directory next to the default **sample_data** directory.

In [2]:
! ls

gdrive	sample_data


To show more details, let's add the -l option.

In [3]:
! ls -l

total 8
drwx------ 4 root root 4096 Mar 27 10:37 gdrive
drwxr-xr-x 1 root root 4096 Mar 18 16:23 sample_data
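
The same kind of listing can also be produced from Python. The snippet below is a minimal sketch using the standard `pathlib` module; the helper name `list_directory` is my own and not part of Colab.

```python
from pathlib import Path


def list_directory(path="."):
    """Return (name, kind, size_in_bytes) tuples for the entries in `path`, sorted by name."""
    entries = []
    for entry in sorted(Path(path).iterdir(), key=lambda p: p.name):
        kind = "dir" if entry.is_dir() else "file"
        # lstat() avoids errors on broken symbolic links
        entries.append((entry.name, kind, entry.lstat().st_size))
    return entries


# Print a simple listing of the current working directory
for name, kind, size in list_directory("."):
    print(f"{kind:<4} {size:>8} {name}")
```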


## **Create a directory**

Use the **mkdir** command to create a directory in the Bash command line.

In [0]:
mkdir compiled_data

Now, let's look for the newly created directory.

In [5]:
! ls

compiled_data  gdrive  sample_data
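
A directory can also be created from Python with `os.makedirs`. In this sketch a temporary directory stands in for `/content` (an assumption made so the example runs anywhere); `exist_ok=True` makes the call safe to repeat when the cell is re-run.

```python
import os
import tempfile

# A temporary directory stands in for /content so the sketch runs anywhere.
base_dir = tempfile.mkdtemp()
target = os.path.join(base_dir, "compiled_data")

os.makedirs(target, exist_ok=True)  # creates the directory
os.makedirs(target, exist_ok=True)  # second call is a no-op instead of an error

print(os.listdir(base_dir))  # ['compiled_data']
```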


## **Create files**

### **Create files in Bash command line**

Create a simple text file from the **Bash** command line.

In [0]:
! echo "The quick brown fox jumped over the lazy cat." > data.txt

### **Create files in Python**

Create a simple text file from within **Python**.

In [0]:
# The with-statement closes the file automatically after writing
with open("data2.txt", "w") as data2:
    data2.write("The quick brown fox jumped over the lazy cat.")

### **Download files from the Internet**

We will be downloading some data from the Data Professor GitHub.

In [8]:
! wget https://github.com/dataprofessor/data/raw/master/weather-weka.csv

--2020-03-27 10:37:52--  https://github.com/dataprofessor/data/raw/master/weather-weka.csv
Resolving github.com (github.com)... 140.82.113.3
Connecting to github.com (github.com)|140.82.113.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/dataprofessor/data/master/weather-weka.csv [following]
--2020-03-27 10:37:53--  https://raw.githubusercontent.com/dataprofessor/data/master/weather-weka.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 362 [text/plain]
Saving to: ‘weather-weka.csv’


2020-03-27 10:37:53 (53.7 MB/s) - ‘weather-weka.csv’ saved [362/362]
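
If you prefer to stay in Python, a download like the one above can be sketched with the standard `urllib` module. The helper name `download` is hypothetical, and the usage line is commented out because it needs network access.

```python
import os
import urllib.parse
import urllib.request


def download(url, dest_dir="."):
    """Download `url` into `dest_dir`, naming the file after the last URL path segment."""
    filename = os.path.basename(urllib.parse.urlparse(url).path)
    dest_path = os.path.join(dest_dir, filename)
    urllib.request.urlretrieve(url, dest_path)
    return dest_path


# Usage in Colab (requires network access):
# download("https://github.com/dataprofessor/data/raw/master/weather-weka.csv")
```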



## **Read files**

### **Read files in Bash command line**

Using the ***cat*** command, we display the contents of the newly created files.

In [9]:
! cat data.txt

The quick brown fox jumped over the lazy cat.


In [10]:
! cat data2.txt

The quick brown fox jumped over the lazy cat.

### **Read files in Python**

In [0]:
data = open("data.txt", "r")

In [0]:
data_content = data.read()

In [13]:
data_content

'The quick brown fox jumped over the lazy cat.\n'

In [14]:
data_content.strip('\n')

'The quick brown fox jumped over the lazy cat.'
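
The cells above leave the file handle open. A more compact pattern, sketched below, reads the file inside a `with` block so that it is closed automatically. The temporary path is only a stand-in for `data.txt` so that the example is self-contained.

```python
import os
import tempfile

# Write a small stand-in for data.txt into a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "data.txt")
with open(path, "w") as f:
    f.write("The quick brown fox jumped over the lazy cat.\n")

# The with-statement closes the file automatically, even if an error occurs.
with open(path, "r") as f:
    data_content = f.read()

clean = data_content.rstrip("\n")  # drop only the trailing newline
print(repr(clean))
```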

## **Accessing the Google Drive from Google Colab**

Let's look at the contents of the **"Colab Notebooks"** folder in our Google Drive.

In [15]:
! ls "/content/gdrive/My Drive/Colab Notebooks/"

CDD-ML-Part-1-chembl.ipynb		   data
CDD-ML-Part-2-rdkit.ipynb		   hyperparameter-tuning.ipynb
CDD-molecular-visualization.ipynb	   My-First-Notebook.ipynb
CDD-protein-ligand-docking.ipynb	   progress_bar.ipynb
Colab_File_handling_on_Google_Colab.ipynb  Untitled0.ipynb


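Now, let's try to change our working directory to the **"Colab Notebooks"** folder in our Google Drive. Note that `! cd` runs in a temporary subshell, so the change does not carry over to later cells; to change the notebook's working directory persistently, use the `%cd` magic instead.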

In [0]:
! cd "/content/gdrive/My Drive/Colab Notebooks/"

List the contents of the current working directory. Because the `! cd` above did not persist, this is still /content rather than the **Colab Notebooks** folder.

In [17]:
! ls

compiled_data  data2.txt  data.txt  gdrive  sample_data  weather-weka.csv
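
Why the working directory is unchanged can be reproduced in plain Python. In this sketch, `subprocess.run` plays the role of a `!` command (a child shell that exits right away), while `os.chdir` changes the directory of the Python process itself, much like the `%cd` magic does in a notebook. The temporary directory is a stand-in for the Google Drive folder.

```python
import os
import shlex
import subprocess
import tempfile

start_dir = os.getcwd()
target = tempfile.mkdtemp()  # stand-in for the Google Drive folder

# Like `! cd ...`: the cd happens in a child shell that exits immediately.
subprocess.run(f"cd {shlex.quote(target)}", shell=True, check=True)
after_subshell = os.getcwd()

# os.chdir changes the working directory of the current Python process.
os.chdir(target)
after_chdir = os.getcwd()

os.chdir(start_dir)  # restore the original directory

print(after_subshell == start_dir)            # True: the subshell cd had no effect
print(os.path.samefile(after_chdir, target))  # True: os.chdir did take effect
```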


## **Copy files from Google Drive into your Google Colab**

We should be aware that contents of the Colab working directory will automatically be removed when the session ends. The path of the Colab working directory can be shown by using the **pwd** command.

In [18]:
! pwd

/content


In the following cells, I will copy some files from a directory called **"dataset"** on my **Google Drive** to Google Colab.

First, list the contents of the **"dataset"** directory on my Google Drive.

(Note: Replace **"dataset"** with the name of your own directory on your Google Drive.)

In [19]:
! ls "/content/gdrive/My Drive/dataset/"

dhfr.csv


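Changing to this directory with `! cd` has no lasting effect. This is not a Google Drive restriction: each `!` command runs in its own subshell, so the effect of `cd` is lost as soon as that command finishes, and the following `! pwd` still reports /content.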

In [20]:
! cd "/content/gdrive/My Drive/dataset/"
! pwd

/content


List the contents of the current working directory. Note that the dhfr.csv file is not found here.

In [21]:
! ls

compiled_data  data2.txt  data.txt  gdrive  sample_data  weather-weka.csv


Let's directly copy files from the **Google Drive** to **Colab working directory**. The full path to the dhfr.csv file on my Google Drive is at "/content/gdrive/My Drive/dataset/dhfr.csv" while the "." refers to the current working directory of Colab.

(Note: Replace **"dataset"** with the name of your own directory on your Google Drive.)

In [0]:
! cp "/content/gdrive/My Drive/dataset/dhfr.csv" .

List the contents of the current working directory again. As we have just copied the **dhfr.csv** file from Google Drive into the current working directory, we should now see the **dhfr.csv** file here.

In [23]:
! ls

compiled_data  data.txt  gdrive       weather-weka.csv
data2.txt      dhfr.csv  sample_data
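
The same copy can be done from Python with `shutil.copy`. The sketch below uses two temporary directories as stand-ins for the Drive folder and the Colab working directory (an assumption so that it runs without a mounted Drive); in Colab you would pass the real paths instead.

```python
import os
import shutil
import tempfile

# Temporary directories stand in for the Drive folder and the Colab working directory.
drive_dir = tempfile.mkdtemp()
work_dir = tempfile.mkdtemp()

source = os.path.join(drive_dir, "dhfr.csv")
with open(source, "w") as f:
    f.write("placeholder,content\n")

# The destination may be a directory; the file keeps its name.
copied_path = shutil.copy(source, work_dir)

print(os.path.basename(copied_path))  # dhfr.csv
print(os.path.exists(source))         # True: the original is still in place
```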


## **Copy files from Google Colab to Google Drive**

In [24]:
! ls "/content/gdrive/My Drive/dataset/"

dhfr.csv


In [0]:
! cp data.txt data2.txt "/content/gdrive/My Drive/dataset/"

In [26]:
! ls "/content/gdrive/My Drive/dataset/"

data2.txt  data.txt  dhfr.csv


## **Move files from Google Colab to Google Drive**

In [27]:
! ls

compiled_data  data.txt  gdrive       weather-weka.csv
data2.txt      dhfr.csv  sample_data


Use the **mv** command to move files to the Google Drive.

In [0]:
! mv weather-weka.csv "/content/gdrive/My Drive/dataset/"

Let's list the contents of the **dataset** directory on Google Drive to see whether the moved file is indeed present.

In [29]:
! ls "/content/gdrive/My Drive/dataset/"

data2.txt  data.txt  dhfr.csv  weather-weka.csv


The **mv** command can also be used to rename files.

In [0]:
! mv dhfr.csv dhfr2.csv
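
Both uses of **mv** have Python counterparts: `shutil.move` for moving a file into another directory and `os.rename` for renaming it in place. This is a sketch with temporary directories standing in for `/content` and the Drive **dataset** folder; the file contents are placeholders.

```python
import os
import shutil
import tempfile

work_dir = tempfile.mkdtemp()   # stand-in for /content
drive_dir = tempfile.mkdtemp()  # stand-in for the Drive "dataset" folder

for name in ("weather-weka.csv", "dhfr.csv"):
    with open(os.path.join(work_dir, name), "w") as f:
        f.write("placeholder\n")

# Move a file into another directory (like `mv file dir/`).
shutil.move(os.path.join(work_dir, "weather-weka.csv"), drive_dir)

# Rename a file within the same directory (like `mv old new`).
os.rename(os.path.join(work_dir, "dhfr.csv"), os.path.join(work_dir, "dhfr2.csv"))

print(sorted(os.listdir(work_dir)))   # ['dhfr2.csv']
print(sorted(os.listdir(drive_dir)))  # ['weather-weka.csv']
```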

## **Delete files**

In [31]:
! ls

compiled_data  data2.txt  data.txt  dhfr2.csv  gdrive  sample_data


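Use the **rm** command to delete a file of interest. Here, **rm** reports an error because weather-weka.csv was moved to Google Drive in the previous section and is no longer in the working directory.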

In [32]:
! rm weather-weka.csv

rm: cannot remove 'weather-weka.csv': No such file or directory
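
In Python, a missing file raises `FileNotFoundError` rather than printing a message. The sketch below wraps `os.remove` in a small hypothetical helper, `remove_file`, that reports whether anything was deleted; the temporary directory again stands in for `/content`.

```python
import os
import tempfile

work_dir = tempfile.mkdtemp()  # stand-in for /content
existing = os.path.join(work_dir, "dhfr2.csv")
missing = os.path.join(work_dir, "weather-weka.csv")

with open(existing, "w") as f:
    f.write("placeholder\n")


def remove_file(path):
    """Delete `path`; return True if a file was removed, False if it did not exist."""
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        return False


print(remove_file(existing))  # True
print(remove_file(missing))   # False
```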


## **Delete a directory**

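Before we can show you how to delete a directory containing files, we will first create and populate one. Note that the `! cd tmp_data` cell below has no lasting effect (each `!` command runs in its own subshell), so the file is downloaded to /content and then moved into **tmp_data** with **mv**.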

In [0]:
! mkdir tmp_data

In [0]:
! cd tmp_data

In [35]:
! wget https://github.com/dataprofessor/data/raw/master/heart-disease-cleveland.csv

--2020-03-27 10:39:00--  https://github.com/dataprofessor/data/raw/master/heart-disease-cleveland.csv
Resolving github.com (github.com)... 140.82.113.4
Connecting to github.com (github.com)|140.82.113.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/dataprofessor/data/master/heart-disease-cleveland.csv [following]
--2020-03-27 10:39:01--  https://raw.githubusercontent.com/dataprofessor/data/master/heart-disease-cleveland.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 12237 (12K) [text/plain]
Saving to: ‘heart-disease-cleveland.csv’


2020-03-27 10:39:02 (129 MB/s) - ‘heart-disease-cleveland.csv’ saved [12237/12237]



In [0]:
! mv heart-disease-cleveland.csv tmp_data

In [37]:
! ls tmp_data

heart-disease-cleveland.csv


Now, we can delete the **tmp_data** directory, which contains a CSV file. A plain **rm** refuses to remove a directory, so we use the **rm** command together with the -r option.

In [38]:
! rm tmp_data

rm: cannot remove 'tmp_data': Is a directory


In [0]:
! rm -r tmp_data

In [40]:
! ls

compiled_data  data2.txt  data.txt  dhfr2.csv  gdrive  sample_data


Deleting an empty directory can also be done using the **rm** command together with the -r option.

In [0]:
! rm -r compiled_data

In [42]:
! ls

data2.txt  data.txt  dhfr2.csv	gdrive	sample_data
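
The difference between removing an empty and a non-empty directory also shows up in Python: `os.rmdir` only works on empty directories, while `shutil.rmtree` removes a directory together with its contents. A minimal sketch, using a temporary directory as a stand-in for `/content` and a placeholder file:

```python
import os
import shutil
import tempfile

work_dir = tempfile.mkdtemp()  # stand-in for /content
empty_dir = os.path.join(work_dir, "compiled_data")
full_dir = os.path.join(work_dir, "tmp_data")
os.mkdir(empty_dir)
os.mkdir(full_dir)
with open(os.path.join(full_dir, "heart-disease-cleveland.csv"), "w") as f:
    f.write("placeholder\n")

os.rmdir(empty_dir)  # works only on empty directories

try:
    os.rmdir(full_dir)  # fails because the directory is not empty
    rmdir_failed = False
except OSError:
    rmdir_failed = True

shutil.rmtree(full_dir)  # removes the directory and everything inside it

print(rmdir_failed)          # True
print(os.listdir(work_dir))  # []
```

Use `shutil.rmtree` with care, since it deletes recursively without asking for confirmation.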


---