
### **Google Colab and SQL**

Google Colab is an invaluable resource for data scientists and machine learning practitioners. It offers a free, collaborative, cloud-based platform that allows users to write and execute Python code. However, it's less known that SQL, the standard language for data manipulation and queries, can be used directly within Colab.

This guide will cover two methods for utilizing SQL within Google Colab: employing Python's SQLite library and utilizing magic commands.

One way to use SQL in Google Colab is through Python's built-in `sqlite3` module. This method involves importing the module, establishing a connection to a database, and executing SQL queries, either through the connection itself or with the pandas library's `read_sql_query()` function.

Another method is through SQL magic commands, akin to those in Jupyter Notebooks. To employ SQL magic commands, one must install the `ipython-sql` library and activate the SQL extension with the `%load_ext sql` command. After activating the extension, you can connect to a database with the `%sql` command and run SQL queries using the `%%sql` cell magic command.

### **Method 1: Using SQLite Database**
SQLite is a serverless, self-contained, zero-configuration database engine that doesn't require additional installation steps. Python includes built-in support for SQLite.

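To start, import the `sqlite3` module and establish a connection to the database in your Google Colab notebook. This example uses a file-based database, `sales.db`, which SQLite creates in the notebook's working directory if it does not already exist. A file is used because the cells below close and reopen the connection, and the data has to survive in between.

Passing `':memory:'` instead of a file name creates an in-memory database, which is stored in RAM instead of on disk and is discarded when the connection is closed.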
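
As a minimal sketch of the in-memory behaviour described above, the following cell uses only the standard library. The `demo` table and its columns are hypothetical names chosen for illustration; they are not part of the sales example.

```python
import sqlite3

# ':memory:' asks SQLite for a private database held in RAM.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER, label TEXT)")  # hypothetical table
conn.execute("INSERT INTO demo VALUES (1, 'first')")
rows_before = conn.execute("SELECT COUNT(*) FROM demo").fetchone()[0]
conn.close()

# A new connection to ':memory:' starts from a fresh, empty database,
# so the table created above is no longer there.
conn2 = sqlite3.connect(":memory:")
tables_after = conn2.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
conn2.close()

print(rows_before, tables_after)
```

This is why an in-memory database suits a single-cell experiment, while a workflow that reconnects across cells needs a file.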

In [1]:
# Import the library
import sqlite3

In [2]:
# Create the connection to the database (sales.db is created if missing)
conn = sqlite3.connect('sales.db')
print("Opened database successfully")

Opened database successfully


In [3]:
# CREATING THE TABLE

conn.execute('''
CREATE TABLE IF NOT EXISTS sales(
                      order_id varchar(10),
                      item_code varchar(8),
                      discount integer,
                      price decimal(10,2) );''')

conn.commit()

print("Table created successfully")

conn.close()

Table created successfully
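
One way to check what was actually created is SQLite's `PRAGMA table_info` statement, which lists a table's columns. The sketch below is an illustration only: it assumes a fresh in-memory database rather than `sales.db`, so that it can run on its own.

```python
import sqlite3

# Assumption: an in-memory database stands in for sales.db here.
conn = sqlite3.connect(":memory:")
conn.execute('''
CREATE TABLE IF NOT EXISTS sales(
    order_id varchar(10),
    item_code varchar(8),
    discount integer,
    price decimal(10,2) );''')

# Each row is (cid, name, type, notnull, dflt_value, pk).
columns = conn.execute("PRAGMA table_info(sales)").fetchall()
column_names = [name for _cid, name, *_rest in columns]

for _cid, name, col_type, *_rest in columns:
    print(name, col_type)

conn.close()
```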


In [4]:
conn = sqlite3.connect('sales.db')
# INSERTING VALUES

conn.execute("INSERT INTO sales VALUES('PO100', 'item100', 1, 53);")
conn.execute("INSERT INTO sales VALUES('PO200', 'item200', 2, 42);")
conn.execute("INSERT INTO sales VALUES('PO300', 'item200', 2, 50);")
conn.execute("INSERT INTO sales VALUES('PO400', 'item200', 2, 49);")
conn.execute("INSERT INTO sales VALUES('PO500', 'item100', 1, 45);")
conn.execute("INSERT INTO sales VALUES('PO600', 'item100', 1, 55);")

conn.commit()
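
The six inserts above can also be written as one parameterized statement. This is a minimal sketch rather than the notebook's own approach: it uses `executemany()` with `?` placeholders from the standard `sqlite3` module and, as an assumption, an in-memory database so the cell is self-contained.

```python
import sqlite3

# Same six rows as in the cell above.
rows = [
    ("PO100", "item100", 1, 53),
    ("PO200", "item200", 2, 42),
    ("PO300", "item200", 2, 50),
    ("PO400", "item200", 2, 49),
    ("PO500", "item100", 1, 45),
    ("PO600", "item100", 1, 55),
]

conn = sqlite3.connect(":memory:")  # assumption: stands in for sales.db
conn.execute('''
CREATE TABLE IF NOT EXISTS sales(
    order_id varchar(10),
    item_code varchar(8),
    discount integer,
    price decimal(10,2) );''')

# Using the connection as a context manager commits on success
# and rolls back if an exception is raised.
with conn:
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)

row_count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(row_count)
conn.close()
```

Placeholders keep the values separate from the SQL text, which avoids quoting mistakes when the data comes from variables instead of literals.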

In [5]:
# Average price by item code

conn = sqlite3.connect('sales.db')

cursor = conn.execute(''' SELECT item_code,
                          AVG(price) AS avg_price
                          FROM sales
                          GROUP BY item_code;''')

for row in cursor:
  print(row)
conn.close()

('item100', 51.0)
('item200', 47.0)
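
The introduction mentions pandas' `read_sql_query()` as a way to run queries; the cell above iterates over a cursor instead. The following sketch shows how the same aggregate could be loaded into a DataFrame. It assumes pandas is installed (as it is by default in Colab) and, to stay self-contained, rebuilds the six rows in an in-memory database instead of reading `sales.db`.

```python
import sqlite3
import pandas as pd

rows = [
    ("PO100", "item100", 1, 53),
    ("PO200", "item200", 2, 42),
    ("PO300", "item200", 2, 50),
    ("PO400", "item200", 2, 49),
    ("PO500", "item100", 1, 45),
    ("PO600", "item100", 1, 55),
]

conn = sqlite3.connect(":memory:")  # assumption: stands in for sales.db
conn.execute('''
CREATE TABLE sales(
    order_id varchar(10),
    item_code varchar(8),
    discount integer,
    price decimal(10,2) );''')
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
conn.commit()

# read_sql_query runs the SQL and returns the result set as a DataFrame.
df = pd.read_sql_query(
    """SELECT item_code, AVG(price) AS avg_price
       FROM sales
       GROUP BY item_code
       ORDER BY item_code;""",
    conn,
)
conn.close()

print(df)
```

A DataFrame is convenient when the query result feeds further analysis or plotting in the same notebook.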


In [9]:
# Filter on the aggregated average by wrapping the GROUP BY in a sub-query

conn = sqlite3.connect('sales.db')

cursor = conn.execute(''' SELECT item_code, avg_price
                          FROM (

                          -- Here we make our sub-query:
                            SELECT item_code,
                            AVG(price) AS avg_price
                            FROM sales
                            GROUP BY item_code) tp
                          -- End of the sub-query

                          WHERE avg_price > 49;''')

for row in cursor:
  print(row)
conn.close()

('item100', 51.0)
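
For this kind of filter on an aggregate, SQL's `HAVING` clause is a common alternative to the sub-query. The sketch below runs both forms side by side to show they select the same groups. It is an illustration, not the notebook's method, and it assumes a fresh in-memory copy of the six rows rather than `sales.db`.

```python
import sqlite3

rows = [
    ("PO100", "item100", 1, 53),
    ("PO200", "item200", 2, 42),
    ("PO300", "item200", 2, 50),
    ("PO400", "item200", 2, 49),
    ("PO500", "item100", 1, 45),
    ("PO600", "item100", 1, 55),
]

conn = sqlite3.connect(":memory:")  # assumption: stands in for sales.db
conn.execute('''
CREATE TABLE sales(
    order_id varchar(10),
    item_code varchar(8),
    discount integer,
    price decimal(10,2) );''')
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
conn.commit()

# Sub-query form: aggregate first, then filter the derived table.
subquery_rows = conn.execute('''
    SELECT item_code, avg_price
    FROM (SELECT item_code, AVG(price) AS avg_price
          FROM sales
          GROUP BY item_code) tp
    WHERE avg_price > 49;''').fetchall()

# HAVING form: filter the groups directly after aggregation.
having_rows = conn.execute('''
    SELECT item_code, AVG(price) AS avg_price
    FROM sales
    GROUP BY item_code
    HAVING AVG(price) > 49;''').fetchall()

conn.close()
print(subquery_rows)
print(having_rows)
```

A plain `WHERE AVG(price) > 49` would not work here, because `WHERE` is evaluated before grouping; that is the reason a sub-query or `HAVING` is needed.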


### **Method 2: Using Magic Commands**
Magic commands are special IPython commands, prefixed with `%` or `%%`, that extend a notebook beyond plain Python. One of them is the SQL magic provided by the `ipython-sql` extension, which allows writing SQL queries directly in a notebook cell.

First, we need to install the ipython-sql extension. This can be done directly in a Colab cell:

In [10]:
# Install ipython-sql
!pip install ipython-sql --quiet


Next, load the SQL extension and connect to an in-memory SQLite database:

In [11]:
# Load the SQL extension
%load_ext sql

# Connect to an in-memory SQLite database
%sql sqlite://


Now you can write SQL queries using the `%sql` or `%%sql` magic commands. `%sql` runs a single line as SQL, while `%%sql` must be the first line of a cell and runs the rest of that cell as SQL.

Here’s an example of executing SQL commands to create a table, insert data, and run a query using %%sql.

In [12]:
%%sql
CREATE TABLE IF NOT EXISTS sales(
                      order_id varchar(10),
                      item_code varchar(8),
                      discount integer,
                      price decimal(10,2) );

 * sqlite://
Done.


[]

In [13]:
%%sql
INSERT INTO sales VALUES('PO100', 'item100', 1, 53);
INSERT INTO sales VALUES('PO200', 'item200', 2, 42);
INSERT INTO sales VALUES('PO300', 'item200', 2, 50);
INSERT INTO sales VALUES('PO400', 'item200', 2, 49);
INSERT INTO sales VALUES('PO500', 'item100', 1, 45);
INSERT INTO sales VALUES('PO600', 'item100', 1, 55);


 * sqlite://
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.


[]

In [14]:
%%sql
SELECT item_code, AVG(price) AS avg_price
FROM sales
GROUP BY item_code;

 * sqlite://
Done.


item_code,avg_price
item100,51.0
item200,47.0


In [16]:
%%sql

SELECT item_code, avg_price
                          FROM (

                          -- Here we make our sub-query:
                            SELECT item_code,
                            AVG(price) AS avg_price
                            FROM sales
                            GROUP BY item_code) tp
                          -- End of the sub-query

                          WHERE avg_price > 48;

 * sqlite://
Done.


item_code,avg_price
item100,51.0
