# Selecting Data - Lab


## Introduction 

NASA wants to go to Mars! Before they build their rocket, NASA needs to track information about all of the planets in the Solar System. In this lab, you'll practice querying the database with various `SELECT` statements. You will select specific columns and use other SQL clauses such as `WHERE`, `ORDER BY`, and `LIMIT` to return only the data you need.

<img src="./images/planets.png" width="600">

## Objectives
You will be able to:
* Connect to a SQL database using Python
* Retrieve all information from a SQL table
* Retrieve a subset of records from a table using a `WHERE` clause
* Write SQL queries to filter and order results
* Retrieve a subset of columns from a table

## Connecting to the Database

To get started, import `sqlite3` as well as `pandas` for conveniently displaying results. Then, connect to the SQLite database located at `planets.db`. 

In [1]:

# Your code here
import pandas as pd
import sqlite3 as sq3

conn = sq3.connect('planets.db')
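
Note that `sqlite3.connect` creates a new, empty database file when the path does not exist, so a typo in the filename does not raise an error. One way to confirm the connection sees the expected table is to query `sqlite_master`. The sketch below is a minimal illustration that uses a throwaway in-memory database with a stand-in `planets` table (an assumption for the example, not the lab's `planets.db`):

```python
import sqlite3

# Throwaway in-memory database standing in for planets.db (assumption for this sketch)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE planets (id INTEGER PRIMARY KEY, name TEXT)")

# sqlite_master lists every object in the database; keep only the tables
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table';"
).fetchall()
print(tables)  # [('planets',)]
```

Running the same `SELECT` against the lab connection should list `planets`; an empty list suggests the path points at a freshly created file.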

## Database Schema

This database contains a single table, `planets`. This is the schema:

```
CREATE TABLE planets (
  id INTEGER PRIMARY KEY,
  name TEXT,
  color TEXT,
  num_of_moons INTEGER,
  mass REAL,
  rings BOOLEAN
);
```

The data looks something like this:

| id | name    | color      | num_of_moons | mass   | rings |
| -- | ------- | ---------- | ------------ | ------ | ----- |
| 1  | Mercury | gray       | 0            | 0.55   | FALSE |
| 2  | Venus   | yellow     | 0            | 0.82   | FALSE |
| 3  | Earth   | blue       | 1            | 1.00   | FALSE |
| 4  | Mars    | red        | 2            | 0.11   | FALSE |
| 5  | Jupiter | orange     | 67           | 317.90 | FALSE |
| 6  | Saturn  | hazel      | 62           | 95.19  | TRUE  |
| 7  | Uranus  | light blue | 27           | 14.54  | TRUE  |
| 8  | Neptune | dark blue  | 14           | 17.15  | TRUE  |

In [4]:
cur = conn.cursor()

cur.execute('''INSERT INTO planets VALUES (13, 'Void', 'electric blue', .5, -86, TRUE )''')

<sqlite3.Cursor at 0x18ee6549500>

Write SQL queries for each of the statements below using the same pandas wrapping syntax from the previous lesson.

## Select just the name and color of each planet

In [10]:
# Your code here
df = pd.DataFrame(cur.execute('''SELECT name, color FROM planets;''').fetchall())


In [11]:
df

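|   | 0       | 1             |
| - | ------- | ------------- |
| 0 | Mercury | gray          |
| 1 | Venus   | yellow        |
| 2 | Earth   | blue          |
| 3 | Mars    | red           |
| 4 | Jupiter | orange        |
| 5 | Saturn  | hazel         |
| 6 | Uranus  | light blue    |
| 7 | Neptune | dark blue     |
| 8 | Void    | electric blue |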
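
The DataFrame above has the column labels `0` and `1` because `fetchall()` returns plain tuples without column names. One possible way to carry the names over is to read them from `cursor.description`. This is a minimal sketch on a small in-memory sample (the sample rows and the in-memory database are assumptions for illustration, not the lab's `planets.db`):

```python
import sqlite3
import pandas as pd

# Throwaway in-memory database with a few sample rows (assumption for this sketch)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE planets (id INTEGER PRIMARY KEY, name TEXT, color TEXT)")
conn.executemany(
    "INSERT INTO planets (name, color) VALUES (?, ?)",
    [("Mercury", "gray"), ("Venus", "yellow"), ("Earth", "blue")],
)

cur = conn.cursor()
cur.execute("SELECT name, color FROM planets;")

# cursor.description holds one 7-tuple per result column; item 0 is the column name
columns = [col[0] for col in cur.description]
df = pd.DataFrame(cur.fetchall(), columns=columns)
print(df)
```

The same pattern should work with the lab's `cur` after any `execute` call, as long as `description` is read before running the next query.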


## Select all columns for each planet whose mass is greater than 1.00


In [12]:
# Your code here
mgrtthn1 = pd.DataFrame(cur.execute('''SELECT * FROM planets WHERE mass > 1;''').fetchall())
mgrtthn1

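|   | 0 | 1       | 2          | 3  | 4      | 5 |
| - | - | ------- | ---------- | -- | ------ | - |
| 0 | 5 | Jupiter | orange     | 68 | 317.9  | 0 |
| 1 | 6 | Saturn  | hazel      | 62 | 95.19  | 1 |
| 2 | 7 | Uranus  | light blue | 27 | 14.54  | 1 |
| 3 | 8 | Neptune | dark blue  | 14 | 17.15  | 1 |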


## Select the name and mass of each planet whose mass is less than or equal to 1.00

In [13]:
# Your code here
nm_and_ms_lte1 = pd.DataFrame(cur.execute('''SELECT name, mass FROM planets WHERE mass <= 1;''').fetchall())
nm_and_ms_lte1

|   | 0       | 1     |
| - | ------- | ----- |
| 0 | Mercury | 0.55  |
| 1 | Venus   | 0.82  |
| 2 | Earth   | 1.0   |
| 3 | Mars    | 0.11  |
| 4 | Void    | -86.0 |


## Select the name and color of each planet that has more than 10 moons

In [14]:
# Your code here
nm_clr_mns = pd.DataFrame(cur.execute('''SELECT name, color FROM planets WHERE num_of_moons > 10;''').fetchall())

In [15]:
nm_clr_mns

|   | 0       | 1          |
| - | ------- | ---------- |
| 0 | Jupiter | orange     |
| 1 | Saturn  | hazel      |
| 2 | Uranus  | light blue |
| 3 | Neptune | dark blue  |


## Select the planet that has at least one moon and a mass less than 1.00

In [16]:
# Your code here
weird_planet = pd.DataFrame(cur.execute('''SELECT * FROM planets WHERE (num_of_moons >= 1) AND (mass < 1);''').fetchall())

In [17]:
weird_planet

|   | 0 | 1    | 2   | 3 | 4    | 5 |
| - | - | ---- | --- | - | ---- | - |
| 0 | 4 | Mars | red | 2 | 0.11 | 0 |


## Select the name and color of planets that have a color of blue, light blue, or dark blue

In [19]:
# Your code here
blue_planets = pd.DataFrame(cur.execute('''SELECT name, color FROM planets WHERE color LIKE '%blue%';''').fetchall())
blue_planets

|   | 0       | 1             |
| - | ------- | ------------- |
| 0 | Earth   | blue          |
| 1 | Uranus  | light blue    |
| 2 | Neptune | dark blue     |
| 3 | Void    | electric blue |
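
Notice that the pattern `'%blue%'` matches any color containing "blue", which is why the extra `Void` row with `electric blue` shows up even though the prompt names only three colors. An `IN` list matches those three values exactly. The sketch below compares the two filters on a small in-memory sample (the sample rows are assumptions chosen to mirror the output above):

```python
import sqlite3

# Throwaway in-memory database with sample rows (assumption for this sketch)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE planets (name TEXT, color TEXT)")
conn.executemany(
    "INSERT INTO planets VALUES (?, ?)",
    [
        ("Earth", "blue"),
        ("Mars", "red"),
        ("Uranus", "light blue"),
        ("Neptune", "dark blue"),
        ("Void", "electric blue"),
    ],
)

# LIKE with wildcards: any color that contains the substring 'blue'
like_names = [row[0] for row in conn.execute(
    "SELECT name FROM planets WHERE color LIKE '%blue%' ORDER BY rowid;"
)]

# IN: only the three exact color values named in the prompt
in_names = [row[0] for row in conn.execute(
    "SELECT name FROM planets "
    "WHERE color IN ('blue', 'light blue', 'dark blue') ORDER BY rowid;"
)]

print(like_names)  # ['Earth', 'Uranus', 'Neptune', 'Void']
print(in_names)    # ['Earth', 'Uranus', 'Neptune']
```

Which filter is appropriate depends on whether "any shade of blue" or "exactly these three colors" is the intended reading.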


## Select the name, color, and number of moons for the 4 largest planets that don't have rings and order them from largest to smallest

Note: even though the schema states that `rings` is a `BOOLEAN` and the example table shows values `TRUE` and `FALSE`, SQLite does not actually support booleans natively. From the [documentation](https://www.sqlite.org/datatype3.html#boolean_datatype):

> SQLite does not have a separate Boolean storage class. Instead, Boolean values are stored as integers 0 (false) and 1 (true).

Keep this in mind when you are filtering for "planets that don't have rings".
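
To see the integer storage in practice, the following minimal sketch inserts Python `bool` values into a `BOOLEAN` column of an in-memory table (the two sample rows are assumptions for illustration) and inspects what SQLite actually stored:

```python
import sqlite3

# Throwaway in-memory database with two sample rows (assumption for this sketch)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE planets (name TEXT, rings BOOLEAN)")
conn.executemany(
    "INSERT INTO planets VALUES (?, ?)",
    [("Earth", False), ("Saturn", True)],
)

# typeof() reports the storage class SQLite used for each value
rows = conn.execute(
    "SELECT name, rings, typeof(rings) FROM planets ORDER BY name;"
).fetchall()
print(rows)  # [('Earth', 0, 'integer'), ('Saturn', 1, 'integer')]

# Filtering on the integer value directly selects the ringless planets
no_rings = conn.execute("SELECT name FROM planets WHERE rings = 0;").fetchall()
print(no_rings)  # [('Earth',)]
```

Comparing against `0` works on any SQLite version, whereas the `TRUE` and `FALSE` keywords are only recognized by SQLite 3.23.0 and later.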

In [21]:
# Your code here
four_largest = pd.DataFrame(cur.execute('''SELECT name, color, num_of_moons 
                                            FROM planets 
                                            WHERE rings = FALSE
                                            ORDER BY mass DESC
                                            LIMIT 4;''').fetchall())
four_largest

|   | 0       | 1      | 2  |
| - | ------- | ------ | -- |
| 0 | Jupiter | orange | 68 |
| 1 | Earth   | blue   | 1  |
| 2 | Venus   | yellow | 0  |
| 3 | Mercury | gray   | 0  |
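
The query above relies on the order in which SQL applies its clauses: `WHERE` filters rows first, `ORDER BY` sorts what remains, and `LIMIT` then keeps the top rows. The sketch below illustrates this on an in-memory sample (the rows are assumptions taken from the example table in this lab, using `0` and `1` for `rings`):

```python
import sqlite3

# Throwaway in-memory database with sample rows (assumption for this sketch)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE planets (name TEXT, mass REAL, rings INTEGER)")
conn.executemany(
    "INSERT INTO planets VALUES (?, ?, ?)",
    [
        ("Mercury", 0.55, 0),
        ("Venus", 0.82, 0),
        ("Earth", 1.00, 0),
        ("Mars", 0.11, 0),
        ("Jupiter", 317.90, 0),
        ("Saturn", 95.19, 1),
    ],
)

# Saturn is removed by WHERE before sorting, so it never competes for a top-4 slot
largest = [row[0] for row in conn.execute(
    "SELECT name FROM planets WHERE rings = 0 ORDER BY mass DESC LIMIT 4;"
)]
print(largest)  # ['Jupiter', 'Earth', 'Venus', 'Mercury']
```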


## Summary

Congratulations! NASA is one step closer to embarking upon its mission to Mars. In this lab, you practiced writing `SELECT` statements that query a single table to get specific information. You also used other clauses and specified column names to cherry-pick the data you wanted to retrieve.