<a href="https://colab.research.google.com/github/CompPsychology/psych290_colab_public/blob/main/notebooks/week-03/W3_Tutorial_05B_mini_tutorial_saving_SQLite_in_GoogleDrive_(dla_tutorial).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# W3 Tutorial 5b -- Mini Tutorial on databases with Google Drive (DB: dla_tutorial) (2025-03)

(c) Johannes Eichstaedt, Samuel Campione & the World Well-Being Project, 2025.

✋🏻✋🏻 NOTE - You need to create a copy of this notebook before you work through it. Click the "Save a copy in Drive" option in the File menu and save it to your Google Drive.

✉️🐞 If you find a bug/something doesn't work, please slack us a screenshot, or email johannes.courses@gmail.com.

Up until now, we have created our SQLite databases from scratch every time you ran the `Setting up Colab` section. That's not ideal if you want to keep feature tables or other tables you've been working on. (Some feature tables we work with take a long time to extract!)

So, once you have a database you want to keep, we can send a copy of it to your Google Drive. This way, the next time you boot up Colab you can load your database and get working right away! 🤩👌


Here's the flow for copying your data to and from Google Drive:

```
1) Install DLATK
2) Copy database from Google Drive OR insert CSV data into SQLite
3) Set up database connection
4) Your language analysis
5) SAVE your database back to Google Drive

```

## 1) Setting up Colab with DLATK and SQLite

### 1a) Install DLATK

In [None]:
# assign the database name
database = "dla_tutorial"

In [None]:
# 1a) Install DLATK

# installing DLATK and necessary packages
!git clone -b psych290 https://github.com/dlatk/dlatk.git
!pip install -r dlatk/install/requirements.txt
!pip install dlatk/
!pip install wordcloud langid jupysql

### 1b) Copy your database into Colab

This is the part where we normally download a fresh copy of the data (in this case, the `dla_tutorial` data) and convert the CSVs to SQLite `.db` files.

**At the end of Tutorial 5**, you ran a cell that saved your `dla_tutorial` database to your Google Drive in a folder called `sqlite_databases`.

In the following cell, Google will ask you to allow this notebook to access your Drive. Click yes and follow the prompts to log in and grant access!

In [None]:
# 1b) Mount Google Drive & copy database to Colab

# connects & mounts your Google Drive to this colab space
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

# make sure the sqlite_data folder exists in this Colab
!mkdir -p "sqlite_data"

# copies {database}.db from your Drive into the sqlite_data folder in this Colab
!cp "/content/drive/MyDrive/sqlite_databases/{database}.db" "sqlite_data/"

print(f"Database {database} copied from Google Drive!")

Mounted at /content/drive
Database dla_tutorial copied from Google Drive!


You should now see your `{database}.db` in the Colab `sqlite_data` folder on the left! 🥳
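
One thing to keep in mind: the `print` in the cell above runs even if the `cp` command fails (for example, when the `.db` file isn't in your Drive yet). Here is a minimal sketch of the same copy step in plain Python that stops with an error instead. The helper name `copy_database` is made up for this example and is not part of DLATK or Colab.

```python
import shutil
from pathlib import Path


def copy_database(database, source_dir, target_dir="sqlite_data"):
    """Copy <database>.db from source_dir into target_dir and return the new path."""
    source = Path(source_dir) / f"{database}.db"
    if not source.is_file():
        # fail loudly instead of reporting success for a file that isn't there
        raise FileNotFoundError(f"No saved database found at {source}")

    target_folder = Path(target_dir)
    target_folder.mkdir(parents=True, exist_ok=True)  # create the folder if needed

    # copy2 copies the file contents and tries to keep its timestamps
    return Path(shutil.copy2(source, target_folder / source.name))
```

After mounting your Drive, you could call it as `copy_database(database, "/content/drive/MyDrive/sqlite_databases")`, assuming your databases live in the `sqlite_databases` folder used in this tutorial.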

### Remember: if working with new data...
💡💡 If you're working with new data (data that has never been stored in Google Drive), you need to upload it to Colab and turn it into a SQLite database (e.g., `new_data.db`). We can do this in R later! But for now, here's one way.

```
import os
from dlatk.tools.importmethods import csvToSQLite

# define name of new database
database = "album"

# give the path to the database -- sqlite_data/[database].db
database_path = os.path.join("sqlite_data", database)

# import CSVs into tables in this database
csvToSQLite(
    "album/data/album.csv",  # path to table csv    
    database_path,   
    "album"                  # new table name
)

csvToSQLite(
    "album/data/track.csv",  # path to table csv
    database_path,
    "track"                  # new table name
)
```
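
If you're curious what "turning a CSV into a SQLite table" involves, here is a rough sketch using only Python's standard library. This is **not** how DLATK's `csvToSQLite` is implemented; the function name `csv_to_sqlite_sketch` is made up, and storing every column as `TEXT` is a simplifying assumption for illustration.

```python
import csv
import sqlite3


def csv_to_sqlite_sketch(csv_path, db_path, table):
    """Load a CSV with a header row into a fresh SQLite table; return the row count."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader)  # first row holds the column names
        rows = list(reader)    # remaining rows hold the data

    # assumption for this sketch: every column is stored as TEXT
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)

    connection = sqlite3.connect(db_path)
    try:
        with connection:  # commits on success, rolls back on error
            connection.execute(f'DROP TABLE IF EXISTS "{table}"')
            connection.execute(f'CREATE TABLE "{table}" ({columns})')
            connection.executemany(
                f'INSERT INTO "{table}" VALUES ({placeholders})', rows
            )
    finally:
        connection.close()
    return len(rows)
```

For real course data, stick with `csvToSQLite` as shown above; the sketch is only meant to make the idea concrete.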

### 1c) Set up database connection

In [None]:
# 1c) Set up database connection

# loads the %%sql extension
%load_ext sql

# creates a SQLAlchemy engine that points at the database file
from sqlalchemy import create_engine
tutorial_db_engine = create_engine(f"sqlite:///sqlite_data/{database}.db?charset=utf8mb4")

# connect the extension to the database
%sql tutorial_db_engine

# set the output limit to 50
%config SqlMagic.displaylimit = 50

And now you're set up! Let's check our tables.


In [None]:
res = %sqlcmd tables
print(res)

Yay, they're all still there! At this point, you would go about your language analysis.
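
If you'd like a second check that doesn't depend on the `%sql` extension, here is a small sketch that asks the `.db` file directly which tables it holds, using Python's built-in `sqlite3` module. The helper name `list_tables` is made up for this example. Note that `sqlite3.connect` creates an empty file if the path doesn't exist, so an empty list can also mean the path is wrong.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of the user tables in a SQLite file, sorted alphabetically."""
    connection = sqlite3.connect(db_path)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "  # skip SQLite's internal tables
            "ORDER BY name"
        ).fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]
```

In this notebook you could run `list_tables(f"sqlite_data/{database}.db")` and compare the result with the output of `%sqlcmd tables`.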

## 2) Saving your database

⚠️ ⚠️ Once you're done with your analysis and ready to close your Colab, **don't forget to save your database** back to Drive!

In [None]:
database = "dla_tutorial" # you probably already ran this!

In [None]:
# Save your database in Google Drive

# mount Google Drive (if not already)
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

# if it's your first time, create a directory in your Drive for sqlite DBs
# (-p means nothing happens if the directory already exists)
!mkdir -p "/content/drive/MyDrive/sqlite_databases"

# copy the database file to your Drive (-f force-writes over the old database with the changes)
!cp -f "sqlite_data/{database}.db" "/content/drive/MyDrive/sqlite_databases/"

# print a message to confirm it's done
print(f"Database {database} has been copied to your Google Drive with success!")
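
A plain `cp` works well when nothing is writing to the database at that moment. If a write might still be in progress, a file copy can catch the database in a half-finished state. As a hedged alternative, Python's `sqlite3` module offers `Connection.backup` (Python 3.7+), which writes a consistent snapshot. The helper name `backup_database` below is made up for this sketch.

```python
import sqlite3


def backup_database(source_path, target_path):
    """Write a consistent snapshot of the SQLite file at source_path to target_path."""
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(target_path)
    try:
        # copies the whole database through SQLite's online backup API
        source.backup(target)
    finally:
        target.close()
        source.close()
```

Whether writing the snapshot straight into the mounted Drive folder behaves reliably is an assumption you should test yourself. A cautious variant is to back up to a local file first (e.g., `backup_database(f"sqlite_data/{database}.db", f"{database}_snapshot.db")`) and then copy that snapshot to Drive with `cp`.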

We can check to make sure it's there by running this.

In [None]:
!ls -lh "/content/drive/MyDrive/sqlite_databases"
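
The listing above tells you the file exists and how big it is. If you want more assurance that the copy in Drive is identical to the one in Colab, one option is to compare checksums. This is a minimal sketch; the helper names `file_sha256` and `copies_match` are made up for this example.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(local_path, drive_path):
    """Return True only if both files exist and have identical contents."""
    local, saved = Path(local_path), Path(drive_path)
    if not (local.is_file() and saved.is_file()):
        return False
    return file_sha256(local) == file_sha256(saved)
```

With the paths used in this tutorial, the call would be `copies_match(f"sqlite_data/{database}.db", f"/content/drive/MyDrive/sqlite_databases/{database}.db")`.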

Now, the next homework! 😎