# SQL in Jupyter tutorial (part 2)
### How to connect to a SQLite database using Jupyter

In this notebook I show how to connect to a SQLite database from Jupyter and run queries in individual cells.  
To use SQL commands as magics, you need to load the appropriate extension. Be sure to install the `ipython-sql` package beforehand.

In [1]:
%load_ext sql

Then we connect to the database using the command below. Remember to use three slashes ('///') after `sqlite:`.

In [2]:
%sql sqlite:///data/test.db

'Connected: @data/test.db'
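The three slashes come from the SQLAlchemy-style URL that `ipython-sql` accepts: `sqlite://` is the scheme, and the third slash starts a path relative to the working directory. An absolute POSIX path therefore ends up with four slashes. Here is a minimal sketch of how such URLs are put together; the helper name `sqlite_url` and the `/home/user` path are hypothetical and only serve as illustration.

```python
from pathlib import PurePosixPath


def sqlite_url(db_path: str) -> str:
    """Build a SQLAlchemy-style SQLite URL from a POSIX path (hypothetical helper)."""
    path = PurePosixPath(db_path)
    # "sqlite:///" + relative path  -> three slashes
    # "sqlite:///" + "/abs/path"    -> four slashes in total
    return "sqlite:///" + str(path)


relative_url = sqlite_url("data/test.db")
absolute_url = sqlite_url("/home/user/data/test.db")  # hypothetical absolute path

print(relative_url)  # sqlite:///data/test.db
print(absolute_url)  # sqlite:////home/user/data/test.db
```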

We are now connected to the database. With a single percent sign (*%sql*, a line magic), only that line is treated as an SQL query. With two percent signs (*%%sql*, a cell magic), the entire content of the cell is treated as an SQL query.

Below is a query that returns the names of all tables in the 'test' database.

In [3]:
%%sql
SELECT name FROM sqlite_master WHERE type='table';

 * sqlite:///data/test.db
Done.


name
clients
transactions
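For comparison, the same catalog query can be run without the magic, using Python's standard-library `sqlite3` module. This is only a sketch: it uses an in-memory database as a stand-in for 'test.db', with column lists trimmed for brevity.

```python
import sqlite3

# In-memory stand-in for data/test.db; the columns here are an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clients (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE transactions (id INTEGER, amount INTEGER)")

# sqlite_master is SQLite's built-in catalog of schema objects.
rows = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
).fetchall()

# Each row is a 1-tuple, so unpack it into a plain list of names.
table_names = [name for (name,) in rows]
print(table_names)  # ['clients', 'transactions']
conn.close()
```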


Let's see what's in the 'clients' table.

In [5]:
%%sql
SELECT * FROM clients 
LIMIT 10
;

 * sqlite:///test.db
Done.


id,name,gender,city,country
1,Atalanta MacMenamin,F,Charleston,United States
2,Doralynne Boulds,M,San Jose,United States
3,Herc Zarfat,M,Charleston,United States
4,Van Kivelle,M,Oklahoma City,United States
5,Cos Teggin,F,Naples,United States
6,Selia Dameisele,F,Houston,United States
7,Xymenes McGhie,M,Las Vegas,United States
8,Gloriana Smethurst,F,Tulsa,United States
9,Enoch Overil,F,Fort Worth,United States
10,Mickie Myles,M,Tuscaloosa,United States


As you can see, the data is presented as a readable table, similar to a pandas DataFrame. Let's check the first 10 records from the 'transactions' table.

In [6]:
%%sql
SELECT * FROM transactions 
LIMIT 10
;

 * sqlite:///test.db
Done.


id,amount,category,date,time,credit_card
16,170,Sports,2023-01-05,7:12,americanexpress
19,262,Beauty,2023-01-04,0:17,visa
2,83,Automotive,2023-01-09,23:18,americanexpress
7,402,Automotive,2023-01-02,19:13,mastercard
11,301,Baby,2023-01-06,5:31,americanexpress
27,444,Computers,2023-01-06,4:40,mastercard
27,314,Health,2023-01-01,14:27,americanexpress
15,490,Toys,2023-01-11,15:40,americanexpress
18,252,Music,2023-01-06,7:59,visa
20,177,Sports,2023-01-01,17:38,mastercard


This way we can start practicing SQL using only Jupyter. I encourage you to start experimenting with the 'test.db' database prepared for this tutorial.

### Some basic SQL query examples

Let's run some simple queries to test how SQL works in Jupyter. For example, let's check which customers spent the most, using basic clauses such as JOIN, GROUP BY and ORDER BY. The query below returns the 10 customers with the highest total spending.

In [None]:
%%sql
SELECT c.name, SUM(t.amount) AS total FROM clients c
LEFT JOIN transactions t ON c.id=t.id
GROUP BY c.id
ORDER BY total DESC
LIMIT 10
;

 * sqlite:///test.db
Done.


name,total
Sheeree Mucillo,13336
Mickie Myles,11993
Cos Teggin,10576
Rosmunda Hellikes,9809
Farlee Cowburn,9737
Doralynne Boulds,9661
Herta Jellis,9589
Melessa Hackforth,9492
Sharron Rann,9442
Grenville Roughley,9419
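To see how JOIN, GROUP BY and ORDER BY interact, here is a small sketch with Python's `sqlite3` module and a hypothetical three-client database (the names and amounts are made up). It assumes, as the join condition above suggests, that `transactions.id` stores the client's id. The client with no transactions still appears because of the LEFT JOIN, with a NULL total that SQLite sorts last in descending order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE transactions (id INTEGER, amount INTEGER);
    INSERT INTO clients VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO transactions VALUES (1, 100), (2, 250), (1, 75);
""")

# One output row per client; SUM adds up that client's transactions.
top_spenders = conn.execute("""
    SELECT c.name, SUM(t.amount) AS total
    FROM clients c
    LEFT JOIN transactions t ON c.id = t.id
    GROUP BY c.id
    ORDER BY total DESC
    LIMIT 10
""").fetchall()

print(top_spenders)  # [('Bob', 250), ('Ann', 175), ('Cy', None)]
conn.close()
```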


We can then select all transactions made by Sheeree Mucillo using the WHERE clause and sort them by date.

In [15]:
%%sql
SELECT name, amount, category, date, time, credit_card FROM clients c
LEFT JOIN transactions t ON c.id=t.id
WHERE name = 'Sheeree Mucillo'
ORDER BY STRFTIME('%Y-%m-%d',date)
;

 * sqlite:///test.db
Done.


name,amount,category,date,time,credit_card
Sheeree Mucillo,500,Grocery,2023-01-01,10:42,visa
Sheeree Mucillo,324,Grocery,2023-01-01,11:15,americanexpress
Sheeree Mucillo,11,Computers,2023-01-01,11:58,mastercard
Sheeree Mucillo,115,Baby,2023-01-01,11:59,americanexpress
Sheeree Mucillo,186,Health,2023-01-01,20:30,americanexpress
Sheeree Mucillo,263,Kids,2023-01-01,3:28,mastercard
Sheeree Mucillo,464,Computers,2023-01-01,4:19,americanexpress
Sheeree Mucillo,266,Music,2023-01-02,21:07,visa
Sheeree Mucillo,487,Music,2023-01-02,3:28,americanexpress
Sheeree Mucillo,289,Movies,2023-01-02,7:33,visa
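Note that within a single day the times in the output are not in clock order: the `time` values are text without a leading zero, so '3:28' does not sort before '20:30'. One possible way to get a fully chronological order is to left-pad the time before sorting. The sketch below uses Python's `sqlite3` module with a few rows modeled on the output above; the `id` value and the trimmed column list are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (id INTEGER, amount INTEGER, date TEXT, time TEXT);
    INSERT INTO transactions VALUES
        (1, 186, '2023-01-01', '20:30'),
        (1, 263, '2023-01-01', '3:28'),
        (1, 500, '2023-01-01', '10:42'),
        (1, 266, '2023-01-02', '21:07');
""")

# ISO dates sort correctly as text. For the time, prepend '0' and keep the
# last five characters, so '3:28' becomes '03:28' and '20:30' stays '20:30'.
ordered = conn.execute("""
    SELECT date, time, amount
    FROM transactions
    ORDER BY date, substr('0' || time, -5)
""").fetchall()

for row in ordered:
    print(row)
conn.close()
```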


Finally, let's calculate how much Sheeree Mucillo spent in total each day and how many individual transactions there were.

In [22]:
%%sql
SELECT date, SUM(amount) AS total_amount, COUNT(name) AS transactions FROM clients c
LEFT JOIN transactions t ON c.id=t.id
WHERE name='Sheeree Mucillo'
GROUP BY STRFTIME('%Y-%m-%d',date)
ORDER BY STRFTIME('%Y-%m-%d',date)
;

 * sqlite:///test.db
Done.


date,total_amount,transactions
2023-01-01,1863,7
2023-01-02,1042,3
2023-01-03,1983,6
2023-01-04,446,1
2023-01-05,369,2
2023-01-06,1112,5
2023-01-07,1065,4
2023-01-08,461,4
2023-01-09,478,1
2023-01-10,1141,4
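The same kind of daily summary can be sketched in plain Python with `sqlite3`, passing the client name as a query parameter instead of writing it into the SQL text. The data below is hypothetical, and the sketch uses an inner JOIN rather than the LEFT JOIN above, so a client without transactions would simply return no rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE transactions (id INTEGER, amount INTEGER, date TEXT);
    INSERT INTO clients VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO transactions VALUES
        (1, 100, '2023-01-01'),
        (1, 50,  '2023-01-01'),
        (1, 30,  '2023-01-02'),
        (2, 999, '2023-01-01');
""")

# The "?" placeholder is filled from the tuple passed to execute().
daily = conn.execute("""
    SELECT t.date, SUM(t.amount) AS total_amount, COUNT(*) AS n_transactions
    FROM clients c
    JOIN transactions t ON c.id = t.id
    WHERE c.name = ?
    GROUP BY t.date
    ORDER BY t.date
""", ("Ann",)).fetchall()

print(daily)  # [('2023-01-01', 150, 2), ('2023-01-02', 30, 1)]
conn.close()
```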


This is just one example of how you can analyze data stored in a SQL database using Jupyter. Even with a database as simple as the one used here, we can easily practice all the basic SQL operations.