# Running SQL in a Jupyter notebook

## Prerequisites

The following assumes that you have:

1. Completed **Step 1: Set up your SQL database**.
2. Installed `ipython-sql`, `SQLAlchemy` version 1.4.46, and `psycopg2`.

## Loading `ipython-sql`

To get started, we first need to load the `ipython-sql` extension so that its commands become available in the notebook:


In [4]:
%load_ext sql

## Connecting to a database

To connect to a database, we have to use a connection string [in a format that SQLAlchemy understands](https://docs.sqlalchemy.org/en/20/core/engines.html#database-urls).

For Postgres, that format looks something like this:

```
postgresql+psycopg2://username:password@host:5432/database
```

If we're using the default user and no password is set, we can simplify this by omitting them:

```
postgresql+psycopg2://host:5432/database
```

So to connect to our local database, which is called `intro_to_sql_for_analytics`, we can use:

```
postgresql+psycopg2://localhost:5432/intro_to_sql_for_analytics
```
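As a rough illustration of how the pieces of this format fit together, the sketch below assembles the connection string from its parts in Python. The `build_postgres_url` helper and its parameter names are hypothetical (they are not part of SQLAlchemy or `ipython-sql`), and the percent-encoding step assumes your credentials might contain characters such as `@` or `:` that would otherwise be read as URL delimiters.

```python
from urllib.parse import quote_plus


def build_postgres_url(database, host="localhost", port=5432,
                       username=None, password=None):
    """Assemble a SQLAlchemy-style Postgres URL (hypothetical helper)."""
    credentials = ""
    if username:
        # Percent-encode so characters such as '@' or ':' in the
        # credentials cannot be mistaken for URL delimiters.
        credentials = quote_plus(username)
        if password:
            credentials += ":" + quote_plus(password)
        credentials += "@"
    # With no username, the credentials part is simply omitted.
    return f"postgresql+psycopg2://{credentials}{host}:{port}/{database}"


local_url = build_postgres_url("intro_to_sql_for_analytics")
print(local_url)
```

Called with only the database name, the helper falls back to `localhost` and port `5432` and leaves out the credentials, which matches the simplified form used for the local database.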

We can connect by supplying this connection string to the `%sql` command:


In [9]:
%sql postgresql+psycopg2://localhost:5432/intro_to_sql_for_analytics

Running the above command will establish a connection to the database. 
Subsequent queries in other cells of this notebook will run against this connection.

In [10]:
# If the above produces an error, uncomment the lines below and try them instead,
# replacing `username` and `password` with your own Postgres credentials:

# from sqlalchemy import create_engine
# engine = create_engine('postgresql+psycopg2://username:password@localhost/intro_to_sql_for_analytics')
# %sql $engine.url

## Running queries

Next, run a SQL query against the database you just connected to using the `%%sql` magic command:

In [12]:
%%sql

SELECT
    *
FROM billboard_top_100_year_end
LIMIT 5


 * postgresql+psycopg2://localhost:5432/intro_to_sql_for_analytics
5 rows affected.


index,year,year_rank,group_name,artist,song_name,id
0,1956,1,Elvis Presley,Elvis Presley,Heartbreak Hotel,1.0
1,1956,2,Elvis Presley,Elvis Presley,Don't Be Cruel,2.0
2,1956,3,Nelson Riddle,Nelson Riddle,Lisbon Antigua,3.0
3,1956,4,Platters,Platters,My Prayer,4.0
4,1956,5,Gogi Grant,Gogi Grant,The Wayward Wind,5.0


We've used `%%sql` here because the query spans multiple lines, but we could also write it on one line using `%sql`:

In [13]:
%sql SELECT * FROM billboard_top_100_year_end LIMIT 5

 * postgresql+psycopg2://localhost:5432/intro_to_sql_for_analytics
5 rows affected.


index,year,year_rank,group_name,artist,song_name,id
0,1956,1,Elvis Presley,Elvis Presley,Heartbreak Hotel,1.0
1,1956,2,Elvis Presley,Elvis Presley,Don't Be Cruel,2.0
2,1956,3,Nelson Riddle,Nelson Riddle,Lisbon Antigua,3.0
3,1956,4,Platters,Platters,My Prayer,4.0
4,1956,5,Gogi Grant,Gogi Grant,The Wayward Wind,5.0


In the remaining bites, we'll omit the `%sql` / `%%sql` prefix, but you'll need to add one or the other, depending on whether your query fits on one line or spans several.
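To make the choice between the two magics concrete, here is a small sketch of the rule in Python. The `as_sql_cell` helper is hypothetical and only builds the text you would type into a notebook cell; it does not run anything against the database.

```python
def as_sql_cell(query):
    """Return notebook cell text that runs `query` through ipython-sql."""
    lines = query.strip().splitlines()
    if not lines:
        raise ValueError("query is empty")
    if len(lines) == 1:
        # A one-line query fits on the same line as the line magic.
        return "%sql " + lines[0].strip()
    # A multi-line query needs the cell magic on its own first line.
    return "%%sql\n" + "\n".join(lines)


print(as_sql_cell("SELECT * FROM billboard_top_100_year_end LIMIT 5"))
print(as_sql_cell("SELECT\n    *\nFROM billboard_top_100_year_end\nLIMIT 5"))
```

The first call yields a single `%sql` line, while the second puts `%%sql` at the top of the cell followed by the query, mirroring the two cells shown earlier.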

## Check your results

If everything was set up correctly, running the query should return a table containing the following rows:

```
+------+-----------+---------------+---------------+------------------+----+
| year | year_rank |  group_name   |    artist     |    song_name     | id |
+------+-----------+---------------+---------------+------------------+----+
| 1956 |         1 | Elvis Presley | Elvis Presley | Heartbreak Hotel |  1 |
| 1956 |         2 | Elvis Presley | Elvis Presley | Don't Be Cruel   |  2 |
| 1956 |         3 | Nelson Riddle | Nelson Riddle | Lisbon Antigua   |  3 |
| 1956 |         4 | Platters      | Platters      | My Prayer        |  4 |
| 1956 |         5 | Gogi Grant    | Gogi Grant    | The Wayward Wind |  5 |
+------+-----------+---------------+---------------+------------------+----+
```
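If you'd rather check the result programmatically than by eye, something like the following sketch could work. It assumes the query's rows are available as dictionaries keyed by column name (how you obtain them depends on your setup). `matches_expected` and `sample_rows` are hypothetical names, and the sample rows simply copy the notebook output shown earlier, including the extra `index` column and the float `id` values.

```python
# Columns and values copied from the expected table above.
COLUMNS = ("year", "year_rank", "group_name", "artist", "song_name", "id")
EXPECTED = [
    (1956, 1, "Elvis Presley", "Elvis Presley", "Heartbreak Hotel", 1),
    (1956, 2, "Elvis Presley", "Elvis Presley", "Don't Be Cruel", 2),
    (1956, 3, "Nelson Riddle", "Nelson Riddle", "Lisbon Antigua", 3),
    (1956, 4, "Platters", "Platters", "My Prayer", 4),
    (1956, 5, "Gogi Grant", "Gogi Grant", "The Wayward Wind", 5),
]


def matches_expected(rows):
    """Compare rows (dicts keyed by column name) with the expected table."""
    # Extra columns such as `index` are ignored; 1.0 == 1 in Python, so
    # float ids still compare equal to the integers in EXPECTED.
    actual = [tuple(row[name] for name in COLUMNS) for row in rows]
    return actual == EXPECTED


# Hypothetical stand-in for the query result, built from the notebook output.
sample_rows = [
    dict(zip(("index",) + COLUMNS, (i, *expected[:-1], float(expected[-1]))))
    for i, expected in enumerate(EXPECTED)
]
print(matches_expected(sample_rows))
```

Selecting columns by name keeps the comparison independent of column order and of any additional columns your table happens to have.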

You've successfully connected to a database and run SQL from a Jupyter notebook.