# Exploring the Solar System with SQL

This notebook walks through basic queries using:
* `SELECT` statements
* `ORDER BY` and `LIMIT`
* Aggregate functions
* `GROUP BY` statements
* `HAVING` clause
* a simple subquery

Another goal is to provide a space for you to practice SQL queries.


### Preliminaries
We import the necessary modules.

In [22]:
import sqlite3
import pandas as pd

Connect to a database.

In [23]:
conn = sqlite3.connect('db_creation/astronomy.db')

Attach a cursor to the connection to read the data.

In [24]:
cur = conn.cursor()
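
The connect, cursor, execute, fetch sequence can be tried out without `astronomy.db`. The following is a minimal sketch, assuming an in-memory database and a throwaway table called `demo` (both hypothetical and not part of this notebook's data). The names `demo_conn` and `demo_cur` are chosen so that the notebook's own `conn` and `cur` are left untouched.

```python
import sqlite3

# Assumption: an in-memory database and a throwaway table named `demo`,
# used only so this sketch runs without astronomy.db.
demo_conn = sqlite3.connect(":memory:")
demo_cur = demo_conn.cursor()

# Create and fill the hypothetical table.
demo_cur.execute("CREATE TABLE demo (id INTEGER, label TEXT);")
demo_cur.executemany("INSERT INTO demo VALUES (?, ?);", [(1, "a"), (2, "b")])
demo_conn.commit()

# Run a query, then pull the results through the cursor.
demo_cur.execute("SELECT * FROM demo;")
demo_rows = demo_cur.fetchall()                       # list of tuples, one per row
demo_columns = [d[0] for d in demo_cur.description]   # column names of the last query

demo_conn.close()
```

After `execute`, the cursor holds the result set: `fetchall()` returns the rows, and `description` describes the columns of the most recent query.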

### The Data

Consider the following table, `black_holes`, which lists each black hole's name and mass.
* Mass is given in Solar masses and written in scientific notation: `base_mass x 10^power`. For example, the black hole in the Sombrero Galaxy has a mass of `1x10⁹` Solar masses, that is, a billion times the mass of the Sun.
![image.png](attachment:image.png)
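
To make the `base_mass x 10^power` convention concrete, here is a small sketch that converts the two columns into a single number of Solar masses. The rows are copied from the `black_holes` table; the helper `solar_masses` is a hypothetical name introduced only for this illustration.

```python
# Rows taken from the black_holes table: (name, base_mass, power).
rows = [
    ("M104 - Sombrero Galaxy", 1, 9),
    ("M31 - Andromeda Galaxy", 1, 8),
    ("NGC 4889", 1, 10),
    ("Cygnus X-1", 15, 0),
]


def solar_masses(base_mass, power):
    """Return base_mass x 10^power as a whole number of Solar masses."""
    return base_mass * 10 ** power


# Map each name to its mass in Solar masses.
masses = {name: solar_masses(base, power) for name, base, power in rows}

for name, mass in masses.items():
    print(f"{name}: {mass:.1e} Solar masses")
```

Note that a `power` of 0 means the mass is simply `base_mass`, so Cygnus X-1 is 15 Solar masses.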

### `SELECT`, `LIMIT`, and Pandas

Run the next two cells. The first queries the `black_holes` table in the database. The second cell transforms the data into a Pandas DataFrame. 

In [38]:
# Select every row and column from the black_holes table.
query = """
        SELECT *
        FROM black_holes;
        """
cur.execute(query)

<sqlite3.Cursor at 0x109c5c0a0>

In [39]:
df = pd.DataFrame(cur.fetchall())
df.columns = [x[0] for x in cur.description]  # for column names
df

Unnamed: 0,num,name,base_mass,power
0,1,M104 - Sombrero Galaxy,1,9
1,2,M31 - Andromeda Galaxy,1,8
2,3,NGC 4889,1,10
3,4,Cygnus X-1,15,0
4,1,M104 - Sombrero Galaxy,1,9
5,2,M31 - Andromeda Galaxy,1,8
6,3,NGC 4889,1,10
7,4,Cygnus X-1,15,0
8,1,M104 - Sombrero Galaxy,1,9
9,2,M31 - Andromeda Galaxy,1,8
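
The two-step pattern above (run the query, then build a DataFrame from `fetchall()` and `cur.description`) can also be done in one call with `pd.read_sql_query`. This is a sketch of that alternative, not a replacement for the cells above. It assumes an in-memory stand-in for `astronomy.db`, and the column types in the `CREATE TABLE` statement are a guess, since the real schema is not shown here.

```python
import sqlite3
import pandas as pd

# Assumption: an in-memory stand-in for astronomy.db, filled with four rows
# copied from the black_holes output shown above. Column types are guessed.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE black_holes "
    "(num INTEGER, name TEXT, base_mass INTEGER, power INTEGER);"
)
demo_conn.executemany(
    "INSERT INTO black_holes VALUES (?, ?, ?, ?);",
    [
        (1, "M104 - Sombrero Galaxy", 1, 9),
        (2, "M31 - Andromeda Galaxy", 1, 8),
        (3, "NGC 4889", 1, 10),
        (4, "Cygnus X-1", 15, 0),
    ],
)
demo_conn.commit()

# read_sql_query runs the query and labels the columns in a single step.
demo_df = pd.read_sql_query("SELECT * FROM black_holes;", demo_conn)
demo_conn.close()

print(demo_df)
```

The cursor-based version is still worth knowing, because it shows what the one-liner does for you: fetch the rows, then read the column names from the query's description.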


Try this: Select and display the first two rows of the `black_holes` table. (_HINT: use_ `LIMIT`.)

In [12]:
# Select and display the first two rows of the table.
# Your code here

Select the `name` of each item in the `black_holes` table and display the result in a pandas DataFrame.

In [40]:
# Select only the name column and display it in a pandas DataFrame.
query = """
        SELECT name
        FROM black_holes;
        """
cur.execute(query)
df = pd.DataFrame(cur.fetchall())
df.columns = [x[0] for x in cur.description]  # for column names
df

Unnamed: 0,name
0,M104 - Sombrero Galaxy
1,M31 - Andromeda Galaxy
2,NGC 4889
3,Cygnus X-1
4,M104 - Sombrero Galaxy
5,M31 - Andromeda Galaxy
6,NGC 4889
7,Cygnus X-1
8,M104 - Sombrero Galaxy
9,M31 - Andromeda Galaxy


Try this: Select the `name` and `power` of each item in the `black_holes` table.

In [41]:
# Select and display the name and power columns in a pandas DataFrame.
# Your code here

Try this: Now, order the information about black holes by descending power. (_Hint: use_ `ORDER BY` and `DESC`.)