In [1]:
import psycopg2

# Prepare the Database

Run `docker-compose up` to get Postgres up and running.

In [2]:
credentials = "dbname='astronomy' user='user' host='localhost' password='pass'"
conn = psycopg2.connect(credentials)
cur = conn.cursor()

cur.execute("""
DROP TABLE IF EXISTS Planet;
DROP TABLE IF EXISTS Star;

CREATE TABLE Star (
  kepler_id INTEGER PRIMARY KEY,
  t_eff INTEGER,
  radius FLOAT
);


COPY Star (kepler_id, t_eff, radius) FROM '/data/week4/data1/stars.csv' CSV;
""")
conn.commit()
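
Note that `COPY ... FROM '/path'` reads the file on the Postgres server (here, inside the container), so the path must exist there. If a CSV only exists on the client machine, psycopg2's `cursor.copy_expert` can stream it through `COPY ... FROM STDIN` instead. The helper below is a minimal sketch of that idea; the name `copy_csv` and the local file path in the usage comment are hypothetical, not part of the original notebook.

```python
def copy_csv(cur, table, columns, fileobj):
    """Stream a CSV file object into `table` via COPY ... FROM STDIN.

    `cur` is expected to be a psycopg2 cursor. `table` and `columns` are
    interpolated into the SQL text, so they must be trusted identifiers.
    """
    sql = "COPY {} ({}) FROM STDIN WITH (FORMAT csv)".format(
        table, ", ".join(columns)
    )
    cur.copy_expert(sql, fileobj)
    return sql


# Usage with the notebook's cursor (hypothetical local path):
# with open("stars.csv") as f:
#     copy_csv(cur, "Star", ["kepler_id", "t_eff", "radius"], f)
# conn.commit()
```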

# Task 1
As a warm-up, add the following stars to the existing Star table:

| kepler_id | t_eff | radius |
|-----------|-------|--------|
| 7115384   | 3789  | 27.384 |
| 8106973   | 5810  | 0.811  |
| 9391817   | 6200  | 0.958  |

In [3]:
cur.execute("""
INSERT INTO Star (kepler_id, t_eff, radius) VALUES
  (7115384, 3789, 27.384),
  (8106973, 5810, 0.811),
  (9391817, 6200, 0.958);
""")
conn.commit()
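
The same insert can also be written with query parameters instead of literal values in the SQL string, which is handy when the rows come from Python data. The sketch below uses Python's built-in `sqlite3` with an in-memory database as a stand-in so that it runs without the Postgres container; that substitution is an assumption made for illustration. With psycopg2 the placeholder would be `%s` rather than `?`.

```python
import sqlite3

# In-memory SQLite stands in for Postgres here (assumption for illustration).
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE Star (kepler_id INTEGER PRIMARY KEY, t_eff INTEGER, radius FLOAT)"
)

# The three stars from the task table.
new_stars = [
    (7115384, 3789, 27.384),
    (8106973, 5810, 0.811),
    (9391817, 6200, 0.958),
]

# One parameterized statement, executed once per row.
demo.executemany(
    "INSERT INTO Star (kepler_id, t_eff, radius) VALUES (?, ?, ?)", new_stars
)
demo.commit()

# Read the rows back to confirm they were stored.
inserted = demo.execute(
    "SELECT kepler_id, t_eff, radius FROM Star ORDER BY kepler_id"
).fetchall()
print(inserted)
```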

# Add Planet Table

In [4]:
cur.execute("""
DROP TABLE IF EXISTS Planet;
DROP TABLE IF EXISTS Star;

CREATE TABLE Planet (
  kepler_id INTEGER NOT NULL,
  koi_name VARCHAR(20) NOT NULL,
  kepler_name VARCHAR(20),
  status VARCHAR(20) NOT NULL,
  period FLOAT,
  radius FLOAT,
  t_eq INTEGER
);


COPY Planet (kepler_id, koi_name, kepler_name, status, period, radius, t_eq)
  FROM '/data/week4/data1/planets.csv' CSV;
""")
conn.commit()

# Task 2

Your task is to fix a `Planet` table that has been corrupted.

There are two problems: some of the rows were copied with a negative radius
and the unconfirmed planets were given fake Kepler names. To fix this, you'll
have to:

- Update the Kepler names of planets which don't have a confirmed status
  (replace the `kepler_name` with `NULL`).
- Delete rows where the `radius` is negative.

Have a look at the table first to find out what needs to be changed.

In [5]:
cur.execute("""
UPDATE Planet SET kepler_name = NULL WHERE status != 'CONFIRMED';
DELETE FROM Planet WHERE radius < 0;
""")
conn.commit()
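
One way to check a cleanup like this is to count the offending rows afterwards. The sketch below runs the same two statements against an in-memory `sqlite3` database (a stand-in for Postgres, assumed here so the example runs on its own). The three rows are made up for illustration and are not taken from `planets.csv`.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE Planet (koi_name TEXT, kepler_name TEXT, status TEXT, radius FLOAT)"
)
# Made-up rows for illustration only.
demo.executemany(
    "INSERT INTO Planet VALUES (?, ?, ?, ?)",
    [
        ("K00001.01", "Demo-1 b", "CONFIRMED", 1.5),   # clean row
        ("K00002.01", "Demo-2 b", "CANDIDATE", 2.0),   # fake name, unconfirmed
        ("K00003.01", "Demo-3 b", "CONFIRMED", -4.2),  # negative radius
    ],
)

# Run each statement separately so rowcount refers to that statement.
renamed = demo.execute(
    "UPDATE Planet SET kepler_name = NULL WHERE status != 'CONFIRMED'"
).rowcount
deleted = demo.execute("DELETE FROM Planet WHERE radius < 0").rowcount
demo.commit()

# After the cleanup, both counts should be zero.
bad_names = demo.execute(
    "SELECT COUNT(*) FROM Planet "
    "WHERE status != 'CONFIRMED' AND kepler_name IS NOT NULL"
).fetchone()[0]
bad_radii = demo.execute(
    "SELECT COUNT(*) FROM Planet WHERE radius < 0"
).fetchone()[0]
print(renamed, deleted, bad_names, bad_radii)
```

With psycopg2, `cur.rowcount` after a multi-statement `execute` reflects only the last statement, so splitting the statements is useful when the per-statement counts matter.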

# Task 2

Now you're ready to create your first table. Your task is to set up a new
`Planet` table and fill it with the planets listed below.

Your table should consist of the following attributes in this order:

- `kepler_id`, as type `INTEGER`
- `koi_name`, as type `VARCHAR(15)`
- `kepler_name`, as type `VARCHAR(15)`
- `status`, as type `VARCHAR(20)`
- `radius`, as type `FLOAT`

Add the following constraints:

- make all attributes `NOT NULL` except `kepler_name`;
- make the `koi_name` a unique attribute.

Finally, insert the following planets into your new table:

| kepler_id | koi_name  | kepler_name  | status    | radius  |
|-----------|-----------|--------------|-----------|---------|
| 6862328   | K00865.01 | NULL         | CANDIDATE | 119.021 |
| 10187017  | K00082.05 | Kepler-102 b | CONFIRMED | 5.286   |
| 10187017  | K00082.04 | Kepler-102 c | CONFIRMED | 7.071   |

Don't forget to use `NULL` for the missing planet name!

In [6]:
cur.execute("""
DROP TABLE IF EXISTS Planet;
DROP TABLE IF EXISTS Star;

CREATE TABLE Planet (
  kepler_id INTEGER NOT NULL,
  koi_name VARCHAR(15) NOT NULL UNIQUE,
  kepler_name VARCHAR(15),
  status VARCHAR(20) NOT NULL,
  radius FLOAT NOT NULL
);
INSERT INTO Planet VALUES
  (6862328, 'K00865.01', NULL, 'CANDIDATE', 119.021),
  (10187017, 'K00082.05', 'Kepler-102 b', 'CONFIRMED', 5.286),
  (10187017, 'K00082.04', 'Kepler-102 c', 'CONFIRMED', 7.071);
""")
conn.commit()
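
To see what the constraints actually do, it can help to try inserting rows that break them. The sketch below uses an in-memory `sqlite3` database as a stand-in for Postgres (an assumption so the example is self-contained). The helper `rejected` and the KOI names `K00865.02` and `K00865.03` are hypothetical. Note that SQLite does not enforce the `VARCHAR(15)` length limit, whereas Postgres does.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
demo.execute("""
CREATE TABLE Planet (
  kepler_id INTEGER NOT NULL,
  koi_name VARCHAR(15) NOT NULL UNIQUE,
  kepler_name VARCHAR(15),
  status VARCHAR(20) NOT NULL,
  radius FLOAT NOT NULL
)""")
demo.execute(
    "INSERT INTO Planet VALUES (6862328, 'K00865.01', NULL, 'CANDIDATE', 119.021)"
)


def rejected(sql):
    """Return True if the statement violates a table constraint."""
    try:
        demo.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Same koi_name as the existing row: violates UNIQUE.
duplicate_koi = rejected(
    "INSERT INTO Planet VALUES (6862328, 'K00865.01', NULL, 'CANDIDATE', 1.0)"
)
# NULL status: violates NOT NULL.
missing_status = rejected(
    "INSERT INTO Planet VALUES (6862328, 'K00865.02', NULL, NULL, 1.0)"
)
# NULL kepler_name is allowed, so this insert goes through.
missing_name = rejected(
    "INSERT INTO Planet VALUES (6862328, 'K00865.03', NULL, 'CANDIDATE', 1.0)"
)
print(duplicate_koi, missing_status, missing_name)
```

With psycopg2 the corresponding exception is `psycopg2.IntegrityError`, and the connection needs a `conn.rollback()` before it accepts further statements.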

# Task 3

You can now create the `Star` and `Planet` tables that we've been querying
throughout these activities and fill the tables using the two CSV files which
contain the star and planet data.

The tables and CSV files should contain the columns as follows:

- `stars.csv`:
  - `kepler_id` as an integer and **primary key**;
  - `t_eff` as an integer;
  - `radius` as a float.
- `planets.csv`:
  - `kepler_id` as an integer and a **foreign key** referencing `Star` on
    `kepler_id`;
  - `koi_name` as a variable char (max 20) and **primary key**;
  - `kepler_name` as a variable char (max 20);
  - `status` as a variable char (max 20);
  - `period` as a float;
  - `radius` as a float;
  - `t_eq` as an integer.
  
Add these additional constraints to your tables:

- Ensure `t_eff` and `radius` in `Star` cannot be `NULL`.
- Ensure the `status` in `Planet` cannot be `NULL`.

In [7]:
cur.execute("""
-- Drop Planet first: once its foreign key exists, Star cannot be dropped before it.
DROP TABLE IF EXISTS Planet;
DROP TABLE IF EXISTS Star;

CREATE TABLE Star (
  kepler_id INTEGER PRIMARY KEY,
  t_eff INTEGER NOT NULL,
  radius FLOAT NOT NULL
);

CREATE TABLE Planet (
  kepler_id INTEGER REFERENCES Star (kepler_id),
  koi_name VARCHAR(20) PRIMARY KEY,
  kepler_name VARCHAR(20),
  status VARCHAR(20) NOT NULL,
  period FLOAT,
  radius FLOAT,
  t_eq INTEGER
);

COPY Star (kepler_id, t_eff, radius)
  FROM '/data/week4/data2/stars.csv' CSV;
COPY Planet (kepler_id, koi_name, kepler_name, status, period, radius, t_eq)
  FROM '/data/week4/data2/planets.csv' CSV;
""")
conn.commit()
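
The foreign key means every `Planet.kepler_id` must already exist in `Star`, which is why `Star` is created and loaded first. The sketch below illustrates this with an in-memory `sqlite3` database standing in for Postgres (an assumption so the example runs on its own). The table is trimmed to three columns, and the rows and the helper `try_insert` are made up for illustration.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is on; Postgres always does.
demo.execute("PRAGMA foreign_keys = ON")
demo.execute(
    "CREATE TABLE Star ("
    "kepler_id INTEGER PRIMARY KEY, t_eff INTEGER NOT NULL, radius FLOAT NOT NULL)"
)
demo.execute("""
CREATE TABLE Planet (
  kepler_id INTEGER REFERENCES Star (kepler_id),
  koi_name VARCHAR(20) PRIMARY KEY,
  status VARCHAR(20) NOT NULL
)""")

# Made-up star for illustration only.
demo.execute("INSERT INTO Star VALUES (1, 5000, 1.0)")


def try_insert(kepler_id, koi_name):
    """Insert a candidate planet and report whether the database accepted it."""
    try:
        demo.execute(
            "INSERT INTO Planet VALUES (?, ?, 'CANDIDATE')", (kepler_id, koi_name)
        )
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


known_star = try_insert(1, "K00001.01")    # star 1 exists
unknown_star = try_insert(2, "K00002.01")  # no star with kepler_id 2
print(known_star, unknown_star)
```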

# Task 4

Your task is to add two columns to the existing `Star` table to hold the
equatorial coordinates as RA and declination angles and then fill the new
columns with data. Call the attributes `ra` and `decl`.

The new star data, including the equatorial coordinates, is stored in
`stars_full.csv`. The attributes in this file are ordered as follows:
`(kepler_id, t_eff, radius, ra, decl)`.

To update the data in the table to fill in values for the new attributes,
use the `DELETE` statement to empty the table and then copy the full CSV file
in.

## Prepare Database

In [8]:
cur.execute("""
DROP TABLE IF EXISTS Planet;
DROP TABLE IF EXISTS Star;

CREATE TABLE Star (
  kepler_id INTEGER PRIMARY KEY,
  t_eff INTEGER,
  radius FLOAT
);


COPY Star (kepler_id, t_eff, radius) FROM '/data/week4/data3/stars.csv' CSV;
""")
conn.commit()

## Solution

In [9]:
cur.execute("""
ALTER TABLE Star
  ADD COLUMN ra FLOAT,
  ADD COLUMN decl FLOAT;

DELETE FROM Star;

COPY Star (kepler_id, t_eff, radius, ra, decl)
  FROM '/data/week4/data3/stars_full.csv' CSV;
""")
conn.commit()
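
The reason for emptying and reloading the table is that `ADD COLUMN` leaves the new attributes `NULL` in every existing row. The sketch below walks through the same steps with an in-memory `sqlite3` database as a stand-in for Postgres (an assumption so the example runs without the container). The CSV content is made up and only mimics the column order of `stars_full.csv`. Rows are loaded with `executemany` because SQLite has no `COPY`.

```python
import csv
import io
import sqlite3

demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE Star (kepler_id INTEGER PRIMARY KEY, t_eff INTEGER, radius FLOAT)"
)
demo.execute("INSERT INTO Star VALUES (1, 5000, 1.0)")  # made-up row

# SQLite adds one column per ALTER TABLE; Postgres accepts both in one statement.
demo.execute("ALTER TABLE Star ADD COLUMN ra FLOAT")
demo.execute("ALTER TABLE Star ADD COLUMN decl FLOAT")

# Existing rows now have NULL in the new columns.
before = demo.execute("SELECT ra, decl FROM Star").fetchall()

# Made-up stand-in for stars_full.csv: kepler_id, t_eff, radius, ra, decl
full_csv = io.StringIO("1,5000,1.0,290.5,44.1\n2,6000,1.2,291.0,45.3\n")
rows = [
    (int(kid), int(t_eff), float(radius), float(ra), float(decl))
    for kid, t_eff, radius, ra, decl in csv.reader(full_csv)
]

# Empty the table, then reload it with the full set of attributes.
demo.execute("DELETE FROM Star")
demo.executemany("INSERT INTO Star VALUES (?, ?, ?, ?, ?)", rows)
demo.commit()

missing = demo.execute(
    "SELECT COUNT(*) FROM Star WHERE ra IS NULL OR decl IS NULL"
).fetchone()[0]
total = demo.execute("SELECT COUNT(*) FROM Star").fetchone()[0]
print(before, missing, total)
```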