<a href="https://colab.research.google.com/github/brendanpshea/database_sql/blob/main/Intro_SQL_03_Data_Definition_Language.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Intro to SQL 3: Data Definition Language


*Database and SQL | Brendan Shea, PhD (Brendan.Shea@rctc.edu)*

This lecture provides an overview of SQL's **data definition language (DDL)** and its capacity to define and alter tables and the attributes they contain. We'll be using PostgreSQL in this lecture, but other enterprise-scale RDBMSs (such as Oracle, MySQL, and SQL Server) have very similar capabilities, and the examples here can be adapted to these systems with only minimal changes. However, as always, you'll need to check the documentation!

#Setup: Run These Cells First
Let's first set up our Postgres database server and connect to it. We'll be starting with an empty database (with no tables in it). 


In [None]:
# Some Python packages we need to install for the lab.
!pip install wget --quiet
!pip install sqlalchemy --quiet
!pip install ipython-sql --quiet

# Install postgresql server
!sudo apt-get -y -qq update
!sudo apt-get -y -qq install postgresql
!pip install pgspecial --quiet

!sudo service postgresql start


# Set up a password `postgres` for username `postgres`
!sudo -u postgres psql -U postgres -c "ALTER USER postgres PASSWORD 'postgres';"

# Set up a postgres database with name `my_data` to be used
!sudo -u postgres psql -U postgres -c 'DROP DATABASE IF EXISTS my_data;'

!sudo -u postgres psql -U postgres -c 'CREATE DATABASE my_data;'

# Postgres variables
%env DB_NAME=my_data
%env DB_HOST=localhost
%env DB_PORT=5432
%env DB_USER=postgres
%env DB_PASS=postgres

# Finally, let's make a connection with the database
%load_ext sql
%sql postgresql://$DB_USER:$DB_PASS@$DB_HOST/$DB_NAME

 * Starting PostgreSQL 10 database server
   ...done.
ALTER ROLE
DROP DATABASE
CREATE DATABASE
env: DB_NAME=my_data
env: DB_HOST=localhost
env: DB_PORT=5432
env: DB_USER=postgres
env: DB_PASS=postgres

'Connected: postgres@my_data'

#Getting to Know SQL Datatypes
This is an overview of the common SQL datatypes. Most implementations of SQL (SQLite, SQL Server, Oracle, Postgres, etc.) will also provide *additional* types, and the exact size limits differ from system to system. We'll see some of these below.

##Text types

| Data type       | Description |
|-----------------|-------------|
| CHAR(size)      | Holds a fixed-length string (can contain letters, numbers, and special characters). The fixed size is specified in parentheses. The maximum size depends on the system (MySQL, for example, allows up to 255 characters). |
| VARCHAR(size)   | Holds a variable-length string (can contain letters, numbers, and special characters). The maximum size is specified in parentheses. Limits vary by system: old MySQL versions capped this at 255 characters, while Postgres allows far larger sizes. |
| TEXT            | Holds a variable-length string with no declared size. The maximum length depends on the system: 65,535 bytes in MySQL, and about 1 GB in Postgres. |
| BLOB (or BYTEA) | For BLOBs (Binary Large OBjects). This is BLOB in SQLite and MySQL, or BYTEA in Postgres. Used to store files. A MySQL BLOB holds up to 65,535 bytes; a Postgres BYTEA value can hold up to about 1 GB. |

##Number types

| Data type | Description |
|-----------|-------------|
| SMALLINT | Stores whole numbers from -32768 to 32767. |
| INT (or INTEGER) | Stores whole numbers from -2147483648 to 2147483647. |
| BIGINT | Stores whole numbers from -9223372036854775808 to 9223372036854775807. |
| REAL (or FLOAT4) | A single-precision floating point number. Should NOT be used to store exact values. |
| DOUBLE PRECISION (or FLOAT8) | A double-precision floating point number. Should NOT be used to store exact values. |
| DECIMAL(precision,scale) or NUMERIC(precision,scale) (NUMBER(precision,scale) in Oracle) | An exact numeric value of *precision* total digits. The number *scale* indicates how many of these are to the right of the decimal point. |

##Date types

| Data type | Description |
|-----------|-------------|
| DATE      | A date. Format: YYYY-MM-DD. Note: In MySQL, the supported range is from '1000-01-01' to '9999-12-31'. |
| DATETIME  | A date and time combination. This is a MySQL type; Postgres uses TIMESTAMP instead. Format: YYYY-MM-DD HH:MI:SS. Note: The supported range is from '1000-01-01 00:00:00' to '9999-12-31 23:59:59'. |
| TIMESTAMP | A timestamp. Format: YYYY-MM-DD HH:MI:SS. Note: In MySQL, TIMESTAMP values are stored as the number of seconds since the Unix epoch ('1970-01-01 00:00:00' UTC), so the supported range is from '1970-01-01 00:00:01' UTC to '2038-01-19 03:14:07' UTC. |
| TIME      | A time. Format: HH:MI:SS. Note: In MySQL, the supported range is from '-838:59:59' to '838:59:59'. |
| YEAR      | A year in two-digit or four-digit format. This is a MySQL type. Note: Values allowed in four-digit format: 1901 to 2155. Values allowed in two-digit format: 70 to 69, representing years from 1970 to 2069. |

##JSON and XML

| Data type   | Description                                                                                                                                                                                                                              |
|-------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| JSON      | Semi-structured data in JavaScript Object Notation. Most databases will provide functions for easily searching *within* this field (which is itself like a little non-relational database). Important for many "big data" applications. |
| XML       | Semi-structured data in Extensible Markup Language. Similar in function and format to JSON, and often used to exchange data between applications. |

#1. Our Data Model
We'll be creating a database that *implements* the following data model. In this case, we'll be storing data about patients, doctors, and their lab results. We'll be building this step by step, in order to get a good sense of how SQL **data definition language (DDL)** can be used to create, modify, and delete tables in which we can store our data.

*[Figure: data model diagram (Medical.drawio)]*

#CREATEing and DROPping Tables
To begin with, let's create three tables: one for Patients, one for Doctors, and one for Labs. Right now, our only goal is to create the tables and their "inherent" attributes. Later, we'll worry about how to capture additional relationships and constraints.

The CREATE TABLE statement has the following form:

```
CREATE TABLE table_name (
  column_1_name column_1_type [other constraints],
  column_2_name column_2_type [other constraints],
  ...
);
```
The result of this is the creation of a table with the name table_name and the columns that you have specified.

The DROP TABLE statement deletes a table. Some common versions of this are:


```
DROP TABLE table_1;                 -- Deletes table_1
DROP TABLE IF EXISTS table_1;        -- Useful if you don't know if table_1 exists, and don't want an error.
DROP TABLE table_1 CASCADE;         -- Deletes the table, plus anything that depends on it (such as foreign-key constraints in other tables)
```




In my comments below, I've provided an overview of some of the considerations that are in play here. 

In [None]:
%%sql 
-- Create a table for patients
DROP TABLE IF EXISTS Patient;   -- if the table already exists, drop it

CREATE TABLE Patient(
  id          INTEGER,        -- ids are almost always ints or chars. Later, we'll see a special way of handling this in Postgres.
  dob         DATE,           -- patient's date of birth
  address     VARCHAR (128),  -- patient's mailing address. We'll assume no more than 128 characters.
  name        VARCHAR (50),   -- patient name, up to 50 characters
  weight_kg   NUMERIC (5,2),  -- patient's weight. No more than 999.99 kg! Keep 2 decimals.
  med_hist    JSON,           -- a field to store the patient's medical history. This is a non-relational "database" within our database.
  photo       BYTEA           -- a photo (perhaps a .gif or .jpg) of the patient. In other RDBMSs, this might be a BLOB.
);

 * postgresql://postgres:***@localhost/my_data
Done.
Done.


[]

In [None]:
%%sql 
-- Create a table for doctors

DROP TABLE IF EXISTS Doctor;

CREATE TABLE Doctor(
  id            CHAR(6),          -- IDs can also be chars, and some database textbooks recommend this. This should NOT be anything meaningful (such as name or email)
  name          VARCHAR (32),
  salary        NUMERIC(8,2),  -- No more than 999,999.99 as a salary
  med_license   CHAR(12)    -- A medical license is referred to by a fixed-length string 
)

 * postgresql://postgres:***@localhost/my_data
Done.
Done.


[]

In [None]:
%%sql 
-- Create a table for lab results

DROP TABLE IF EXISTS Lab;

CREATE TABLE Lab(
  date        DATE,     -- The date the lab was administered
  glucose     REAL,     -- The patient's fasting glucose. The results of measurements like this are often stored as real values.
  platelets   DOUBLE PRECISION    -- The patient's platelet count. DOUBLE PRECISION is like REAL, but uses 8 bytes instead of 4, giving roughly twice the precision.
);

 * postgresql://postgres:***@localhost/my_data
Done.
Done.


[]
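Why do we use NUMERIC for weights and salaries but REAL or DOUBLE PRECISION for lab measurements? Here is a minimal sketch in plain Python (no database needed). It assumes that Python's `float` behaves like DOUBLE PRECISION, that a 4-byte `struct` round trip approximates what a REAL column keeps, and that `decimal.Decimal` plays the role of NUMERIC. The helper `as_real` is a hypothetical name used only for this illustration.

```python
import struct
from decimal import Decimal

def as_real(x: float) -> float:
    """Round-trip x through a 4-byte float, roughly what a REAL column keeps."""
    return struct.unpack("f", struct.pack("f", x))[0]

# Binary floating point cannot represent 0.1 exactly, so small errors creep in.
float_sum = 0.1 + 0.2           # 8-byte float, like DOUBLE PRECISION
real_value = as_real(0.1)       # 4-byte float, like REAL

# Decimal does exact base-10 arithmetic, like NUMERIC(precision, scale).
exact_sum = Decimal("0.1") + Decimal("0.2")

print(float_sum == 0.3)               # False
print(exact_sum == Decimal("0.3"))    # True
# The 4-byte value drifts further from 0.1 than the 8-byte sum drifts from 0.3.
print(abs(real_value - 0.1) > abs(float_sum - 0.3))   # True
```

Tiny errors like these are harmless for a glucose reading, but not for money, which is why salary is declared as NUMERIC.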

##Primary Keys and UNIQUE values, and NOT NULL
OK, so we've got some tables set up! However, our work isn't done yet. As you'll recall from earlier classes, each table in a relational database MUST have a primary key. Postgres has allowed us to create tables without declaring a key (not all DBMSs will do this, and some, such as SQLite, quietly assign a hidden integer row ID behind the scenes). However, it's best to tell the DBMS exactly what we want the key to be.

Here, we need to think about things like:
1. Which field (or fields) would make the best primary key.
2. Which fields we *need* data for (and thus can't be null).
3. Which fields (besides the primary key) must be unique.

In [None]:
%%sql 
-- A slightly updated version of the Patient table.
DROP TABLE IF EXISTS Patient;  

CREATE TABLE Patient(
  id          SERIAL PRIMARY KEY,  -- "SERIAL" is an auto-incrementing INTEGER that increases by 1 for each new entry. A common choice for Postgres primary keys.
  dob         DATE NOT NULL,       -- NOT NULL means that we *must* enter a dob for a patient
  address     VARCHAR (128),       -- Since there is no "NOT NULL," this field might be blank
  name        VARCHAR (50) NOT NULL,    -- Again, each patient must have a name    
  weight_kg   NUMERIC (5,2),          
  med_hist    JSON,            
  photo       BYTEA            
);

 * postgresql://postgres:***@localhost/my_data
Done.
Done.


[]

In [None]:
%%sql 
-- An updated version of the table for doctors

DROP TABLE IF EXISTS Doctor;

CREATE TABLE Doctor(
  id            CHAR(6) PRIMARY KEY,  -- let's make this our primary key
  name          VARCHAR (32) NOT NULL,
  salary        NUMERIC(8,2),       
  med_license   CHAR(12) UNIQUE    -- Each medical license code must be UNIQUE 
)

 * postgresql://postgres:***@localhost/my_data
Done.
Done.


[]

In [None]:
%%sql 
-- Update lab table. We don't have a primary key yet--more on this later!

DROP TABLE IF EXISTS Lab;

CREATE TABLE Lab(
  date        DATE NOT NULL,    
  glucose     REAL NOT NULL,     
  platelets   DOUBLE PRECISION NOT NULL   
);

 * postgresql://postgres:***@localhost/my_data
Done.
Done.


[]
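To see what PRIMARY KEY, NOT NULL, and UNIQUE actually do for us, here is a small sketch that tries a few bad inserts against the Doctor table. As an assumption for portability, it uses Python's built-in `sqlite3` module with an in-memory database rather than our Postgres server. The error messages differ between the two systems, but in both cases the offending rows are refused. `try_insert` is a hypothetical helper written just for this demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database (SQLite, not Postgres)
conn.execute("""
    CREATE TABLE Doctor(
      id          CHAR(6) PRIMARY KEY,
      name        VARCHAR(32) NOT NULL,
      salary      NUMERIC(8,2),
      med_license CHAR(12) UNIQUE
    )
""")

def try_insert(row):
    """Attempt one insert; return 'ok' or the constraint error message."""
    try:
        conn.execute(
            "INSERT INTO Doctor (id, name, med_license) VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as err:
        return str(err)

results = [
    try_insert(("df76fx", "Gregory House", "MN_65dsfh43s")),  # a valid row
    try_insert(("df76fx", "Meredith Grey", "MN_863h53er3")),  # duplicate primary key
    try_insert(("nh38nk", None, "MN_863h53er3")),             # missing name
    try_insert(("ty54dd", "Doogie Howser", "MN_65dsfh43s")),  # duplicate license
]
for outcome in results:
    print(outcome)

# Only the first, valid row made it into the table.
row_count = conn.execute("SELECT COUNT(*) FROM Doctor").fetchone()[0]
print(row_count)
```

Only the first insert succeeds; the other three are rejected before they can damage the data.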

##DEFAULT values and CHECKing constraints
One thing that databases can do very well (and that spreadsheets can't) is to make sure that our data "make sense." Databases can do this in several ways:

1. They can assign "default" values to fields where there is no information.
2. They can remember "constraints" on the possible values for each field. If an end-user tries to enter some value that *violates* these constraints, they won't allow the user to do so.

In [None]:
%%sql 
-- Another update to the Patient table.
DROP TABLE IF EXISTS Patient;  

CREATE TABLE Patient(
  id          SERIAL PRIMARY KEY,           
  dob         DATE NOT NULL CHECK (dob > '1800/1/1' AND dob < NOW()), -- Let's make sure the dob makes sense           
  address     VARCHAR (128),       
  name        VARCHAR (50) NOT NULL,      
  weight_kg   NUMERIC (5,2) CHECK (weight_kg > 0 AND weight_kg < 300), -- Weight must be between 0 and 300 kg.         
  med_hist    JSON,            
  photo       BYTEA            
);

 * postgresql://postgres:***@localhost/my_data
Done.
Done.


[]

In [None]:
%%sql 
-- Let's add a default salary to Doctor

DROP TABLE IF EXISTS Doctor;

CREATE TABLE Doctor(
  id           CHAR(6) PRIMARY KEY,  
  name          VARCHAR (32) NOT NULL,
  salary        NUMERIC(8,2) DEFAULT 150000.00,    -- Here the "default" might be the salary for newly hired MDs        
  med_license   CHAR(12) UNIQUE    
)

 * postgresql://postgres:***@localhost/my_data
Done.
Done.


[]

In [None]:
%%sql 
-- Update lab table

DROP TABLE IF EXISTS Lab;

CREATE TABLE Lab(
  date        DATE NOT NULL DEFAULT NOW(),           -- Absent other info, assume the lab took place today    
  glucose     REAL NOT NULL CHECK (glucose > 0),     -- No negative numbers              
  platelets   DOUBLE PRECISION NOT NULL CHECK(platelets > 0)
);

 * postgresql://postgres:***@localhost/my_data
Done.
Done.


[]
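Here is a hedged sketch of DEFAULT and CHECK in action. Again, as an assumption, it uses Python's built-in `sqlite3` with an in-memory database as a stand-in for Postgres, and the tables are trimmed down to the columns we need. The patient named 'Typo Patient' is made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for our Postgres database
conn.execute("""
    CREATE TABLE Doctor(
      id     CHAR(6) PRIMARY KEY,
      name   VARCHAR(32) NOT NULL,
      salary NUMERIC(8,2) DEFAULT 150000.00
    )
""")
conn.execute("""
    CREATE TABLE Patient(
      id        INTEGER PRIMARY KEY,
      name      VARCHAR(50) NOT NULL,
      weight_kg NUMERIC(5,2) CHECK (weight_kg > 0 AND weight_kg < 300)
    )
""")

# No salary supplied, so the DEFAULT fills it in.
conn.execute("INSERT INTO Doctor (id, name) VALUES ('df76fx', 'Gregory House')")
default_salary = conn.execute("SELECT salary FROM Doctor").fetchone()[0]

# A plausible weight passes the CHECK; a data-entry typo is rejected.
conn.execute("INSERT INTO Patient (name, weight_kg) VALUES ('Nancy Drew', 54.54)")
try:
    conn.execute("INSERT INTO Patient (name, weight_kg) VALUES ('Typo Patient', 5454)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# A missing (NULL) weight is still allowed: CHECK only rejects values
# for which the condition is false, and NULL is "unknown", not false.
conn.execute("INSERT INTO Patient (name) VALUES ('Veronica Mars')")
patient_count = conn.execute("SELECT COUNT(*) FROM Patient").fetchone()[0]

print(default_salary, rejected, patient_count)
```

Notice that CHECK alone does not force a value to be present. If a weight is required, combine CHECK with NOT NULL, as we did for `dob`.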

##Foreign Keys and Cascades
Alright! We've almost finished setting up our tables. Now we need to somehow link them together. Let's assume that, according to our data model:
1. Each patient can have one doctor as their primary care provider. New patients might not have a primary care provider.
2. Every lab must be associated with both a patient (it's their blood!) and a doctor (who ordered the lab).
3. We also need to decide what happens if we DELETE a patient or physician from our database. (For example, a patient leaves the system, or a doctor retires).

In [None]:
%%sql 
-- Let's add a primary care provider to each patient's records

DROP TABLE IF EXISTS Patient CASCADE;     -- CASCADE also drops anything that depends on Patient (such as Lab's foreign-key constraint), but not the Lab table itself

CREATE TABLE Patient(
  id          SERIAL PRIMARY KEY,       
  dob         DATE NOT NULL CHECK (dob > '1850/1/1' AND dob < NOW()), 
  address     VARCHAR (128),
  name        VARCHAR (50) NOT NULL,      
  weight_kg   NUMERIC (5,2) CHECK (weight_kg > 0 AND weight_kg < 300),          
  med_hist    JSON,            
  photo       BYTEA,
  primary_care CHAR(6),            -- Store a doctor id here.
  FOREIGN KEY(primary_care) REFERENCES Doctor(id)   -- Let the RDBMS know this is a foreign key, so it can ensure referential integrity   
      ON DELETE SET NULL        -- if a Doctor is removed from the system, set the primary_care field to null
)
;

 * postgresql://postgres:***@localhost/my_data
Done.
Done.


[]

In [None]:
%%sql 
-- Update lab table with references to doctor and patient 

DROP TABLE IF EXISTS Lab;

CREATE TABLE Lab(
  id          SERIAL PRIMARY KEY,  -- it may be best to use a serial primary key
  date        DATE NOT NULL DEFAULT NOW(),              
  glucose     REAL NOT NULL CHECK (glucose > 0),                   
  platelets   DOUBLE PRECISION NOT NULL CHECK(platelets > 0),
  patient_id  INTEGER,
  doctor_id   CHAR(6),
  FOREIGN KEY (patient_id) REFERENCES Patient(id)
    ON DELETE CASCADE,            -- if a patient is deleted from the system, remove their lab results
  FOREIGN KEY (doctor_id) REFERENCES Doctor(id)
    ON DELETE SET NULL          -- if a doctor is deleted from the system, just remove their id from the record
);

 * postgresql://postgres:***@localhost/my_data
Done.
Done.


[]
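The two ON DELETE rules are easiest to understand by watching them fire. The sketch below is an illustration under assumptions: it uses Python's built-in `sqlite3` with an in-memory database instead of Postgres, and the tables are cut down to a few columns. One SQLite-specific detail: foreign keys are only enforced after `PRAGMA foreign_keys = ON`, whereas Postgres always enforces them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-only: enforcement is off by default

conn.execute("CREATE TABLE Doctor(id CHAR(6) PRIMARY KEY, name VARCHAR(32) NOT NULL)")
conn.execute("""
    CREATE TABLE Patient(
      id           INTEGER PRIMARY KEY,
      name         VARCHAR(50) NOT NULL,
      primary_care CHAR(6),
      FOREIGN KEY (primary_care) REFERENCES Doctor(id) ON DELETE SET NULL
    )
""")
conn.execute("""
    CREATE TABLE Lab(
      id         INTEGER PRIMARY KEY,
      glucose    REAL NOT NULL,
      patient_id INTEGER,
      doctor_id  CHAR(6),
      FOREIGN KEY (patient_id) REFERENCES Patient(id) ON DELETE CASCADE,
      FOREIGN KEY (doctor_id)  REFERENCES Doctor(id)  ON DELETE SET NULL
    )
""")
conn.execute("INSERT INTO Doctor VALUES ('df76fx', 'Gregory House')")
conn.execute("INSERT INTO Patient VALUES (1, 'Sherlock Holmes', 'df76fx')")
conn.execute("INSERT INTO Lab (glucose, patient_id, doctor_id) VALUES (78, 1, 'df76fx')")

# Referential integrity: a lab for a patient who doesn't exist is refused.
try:
    conn.execute("INSERT INTO Lab (glucose, patient_id) VALUES (84, 99)")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

# ON DELETE SET NULL: removing the doctor blanks out references to them.
conn.execute("DELETE FROM Doctor WHERE id = 'df76fx'")
primary_care = conn.execute("SELECT primary_care FROM Patient").fetchone()[0]
lab_doctor = conn.execute("SELECT doctor_id FROM Lab").fetchone()[0]

# ON DELETE CASCADE: removing the patient removes their labs too.
conn.execute("DELETE FROM Patient WHERE id = 1")
labs_left = conn.execute("SELECT COUNT(*) FROM Lab").fetchone()[0]

print(orphan_allowed, primary_care, lab_doctor, labs_left)
```

SET NULL keeps the patient and lab rows but forgets the doctor, while CASCADE removes the dependent lab rows entirely. Which one is right depends on what the data mean, so this is a design decision, not a technical one.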

#ALTERing Tables and Their Attributes
We can also use the ALTER TABLE command to change our tables. Among other things we can:
1. ALTER TABLE ... ADD column_name to add a column
2. ALTER TABLE ... DROP column_name to delete a column
3. ALTER TABLE ... ALTER column_name to change a column in various ways. For example, we might change its datatype or its DEFAULT value.

RDBMSs can do these sorts of things much more efficiently than spreadsheets (yay!). However, in many cases, you'll want to be particularly careful when you "alter" primary keys. (And, in fact, many RDBMSs won't allow you to alter such keys.)

In [None]:
%%sql 
-- Some patients have really long names. Let's alter the patient table to increase max name size from 50 to 100
ALTER TABLE Patient ALTER COLUMN name TYPE VARCHAR(100);

-- Let's add an optional email column to the doctor table!
ALTER TABLE Doctor ADD COLUMN email VARCHAR(64);

--Let's get rid of the platelets column from the lab
ALTER TABLE Lab DROP COLUMN platelets;

--Instead, let's add a lab column that will track "LDL cholesterol"
ALTER TABLE Lab 
  ADD COLUMN ldl REAL CHECK (ldl > 0);


 * postgresql://postgres:***@localhost/my_data
Done.
Done.
Done.
Done.


[]
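What happens to the rows that already exist when we add a column? The following sketch checks. It assumes Python's built-in `sqlite3` and an in-memory database as a stand-in for Postgres, and it only exercises ADD COLUMN, since SQLite's support for the other ALTER TABLE forms is more limited than Postgres's. `column_names` is a hypothetical helper that relies on SQLite's `PRAGMA table_info`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for our Postgres database
conn.execute("CREATE TABLE Doctor(id CHAR(6) PRIMARY KEY, name VARCHAR(32) NOT NULL)")
conn.execute("INSERT INTO Doctor VALUES ('df76fx', 'Gregory House')")

def column_names(table):
    """List a table's columns; the name is the second field of each table_info row."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

before = column_names("Doctor")
conn.execute("ALTER TABLE Doctor ADD COLUMN email VARCHAR(64)")
after = column_names("Doctor")

# The existing row is kept; its new column simply starts out as NULL.
existing_email = conn.execute("SELECT email FROM Doctor").fetchone()[0]

print(before)          # ['id', 'name']
print(after)           # ['id', 'name', 'email']
print(existing_email)  # None
```

Because existing rows get NULL in the new column, an added column usually needs to be optional or to come with a DEFAULT.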

#Inserting Data Into Our Tables
We've finished making our tables (woo-hoo!). Now, it's time to insert some data into them. In SQL we insert data using the following command:



```
INSERT INTO table_name (attribute_list) VALUES (value_of_attribute_list)
```

Let's see how this works in action!


In [None]:
%%sql
-- Let's insert data for five patients

DELETE FROM Patient;  -- First, delete any data from table

-- Here come the inserts. Postgres will take care of assigning a primary key.
INSERT INTO Patient (name, dob, weight_kg) VALUES ('Sherlock Holmes', '1/6/1853', 80.6);
INSERT INTO Patient (name, dob, weight_kg) VALUES ('Nancy Drew', '8/31/2000', 54.54);
INSERT INTO Patient (name, dob, weight_kg) VALUES ('Veronica Mars', '5/31/1981', 50);
INSERT INTO Patient (name, dob, weight_kg) VALUES ('Harry Blackstone', '3/14/1975', 65.4);
INSERT INTO Patient (name, dob, weight_kg) VALUES ('Odafin Fin Tutuola ', '11/05/1970', 74.7);

--Now, show the result of our inserts.
SELECT * FROM Patient;

 * postgresql://postgres:***@localhost/my_data
0 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
5 rows affected.


id,dob,address,name,weight_kg,med_hist,photo,primary_care
1,1853-01-06,,Sherlock Holmes,80.6,,,
2,2000-08-31,,Nancy Drew,54.54,,,
3,1981-05-31,,Veronica Mars,50.0,,,
4,1975-03-14,,Harry Blackstone,65.4,,,
5,1970-11-05,,Odafin Fin Tutuola,74.7,,,


In [None]:
%%sql
-- Let's insert data for a few doctors

DELETE FROM Doctor;  -- First, delete any data from table

-- Notice we need to specify primary keys here!
INSERT INTO Doctor (id, name, med_license) VALUES ('df76fx', 'Gregory House', 'MN_65dsfh43s');
INSERT INTO Doctor (id, name, med_license) VALUES ('nh38nk', 'Meredith Grey', 'MN_863h53er3');
INSERT INTO Doctor (id, name, med_license) VALUES ('ty54dd', 'Doogie Howser', 'MN_673fsfsm4');

SELECT * FROM Doctor;

 * postgresql://postgres:***@localhost/my_data
0 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
3 rows affected.


id,name,salary,med_license,email
df76fx,Gregory House,150000.0,MN_65dsfh43s,
nh38nk,Meredith Grey,150000.0,MN_863h53er3,
ty54dd,Doogie Howser,150000.0,MN_673fsfsm4,


In [None]:
%%sql
-- Finally, let's insert some sample lab results.
-- We're going to let the date default to NOW()
-- Notice we need to look up patient and doctor IDs!

INSERT INTO Lab (patient_id, doctor_id, glucose, ldl) VALUES(1, 'df76fx', 78, 123);
INSERT INTO Lab (patient_id, doctor_id, glucose, ldl) VALUES(2, 'df76fx', 84, 98);
INSERT INTO Lab (patient_id, doctor_id, glucose, ldl) VALUES(3, 'nh38nk', 154, 164);
INSERT INTO Lab (patient_id, doctor_id, glucose, ldl) VALUES(4, 'ty54dd', 138, 203);

 * postgresql://postgres:***@localhost/my_data
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.


[]
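When inserts come from a program rather than from hand-typed SQL, the values are usually passed as parameters. Here is a minimal sketch of that pattern, assuming Python's built-in `sqlite3` and an in-memory database in place of Postgres. In SQLite, `INTEGER PRIMARY KEY` plays roughly the role that SERIAL plays in Postgres, and `?` is SQLite's placeholder syntax (other drivers use different placeholders).

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for our Postgres database
conn.execute("""
    CREATE TABLE Patient(
      id        INTEGER PRIMARY KEY,   -- auto-assigned, much like SERIAL
      name      VARCHAR(50) NOT NULL,
      dob       DATE NOT NULL,
      address   VARCHAR(128),
      weight_kg NUMERIC(5,2)
    )
""")

patients = [
    ("Sherlock Holmes", "1853-01-06", 80.6),
    ("Nancy Drew", "2000-08-31", 54.54),
    ("Veronica Mars", "1981-05-31", 50),
]
# The column list and the VALUES list line up position by position.
conn.executemany(
    "INSERT INTO Patient (name, dob, weight_kg) VALUES (?, ?, ?)", patients)

# Columns we left out (id, address) are filled in automatically or left NULL.
rows = conn.execute("SELECT id, name, address FROM Patient ORDER BY id").fetchall()
for row in rows:
    print(row)
```

Passing values as parameters, rather than pasting them into the SQL string, also protects against quoting mistakes in names like O'Brien.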

#Updating and Deleting Data
We can also update and delete values in our table using the following commands:


```
-- Let's update some rows!
UPDATE table_name 
  SET column_1 = value_1, column_2 = value_2, ...
  [WHERE condition(s)];

-- And let's delete some rows
DELETE FROM table_name [WHERE condition(s)];
```
Be careful about using DELETE FROM (or UPDATE) without a WHERE clause. If you don't specify which rows to update or delete, SQL will assume you want ALL of them. So "DELETE FROM table_name" deletes ALL of the rows in a table. (Unlike DROP TABLE, though, it doesn't actually delete the table itself.)
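To make the warning concrete, this sketch counts how many rows each statement touches. It is an illustration under assumptions: it uses Python's built-in `sqlite3` with an in-memory database rather than Postgres, and the salary changes are made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for our Postgres database
conn.execute(
    "CREATE TABLE Doctor(id CHAR(6) PRIMARY KEY, name VARCHAR(32), salary NUMERIC(8,2))")
conn.executemany("INSERT INTO Doctor VALUES (?, ?, ?)", [
    ("df76fx", "Gregory House", 150000),
    ("nh38nk", "Meredith Grey", 150000),
    ("ty54dd", "Doogie Howser", 150000),
])

# With a WHERE clause, only the matching row changes.
with_where = conn.execute(
    "UPDATE Doctor SET salary = salary * 2 WHERE id = 'df76fx'").rowcount

# Without one, every row changes.
without_where = conn.execute("UPDATE Doctor SET salary = 0").rowcount

# DELETE without WHERE empties the table...
conn.execute("DELETE FROM Doctor")
remaining = conn.execute("SELECT COUNT(*) FROM Doctor").fetchone()[0]

# ...but the (now empty) table itself still exists.
table_exists = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table' AND name = 'Doctor'"
).fetchone()[0] == 1

print(with_where, without_where, remaining, table_exists)
```

A useful habit is to run a SELECT with the same WHERE clause first, and check that it returns exactly the rows you intend to change.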


In [None]:
%%sql 
-- First, let's give all of our doctors a 10% raise to their salary
UPDATE Doctor SET salary = salary * 1.1;
SELECT * FROM Doctor;

 * postgresql://postgres:***@localhost/my_data
3 rows affected.
3 rows affected.


id,name,salary,med_license,email
df76fx,Gregory House,165000.0,MN_65dsfh43s,
nh38nk,Meredith Grey,165000.0,MN_863h53er3,
ty54dd,Doogie Howser,165000.0,MN_673fsfsm4,


In [None]:
%%sql 
--Now, let's update Sherlock Holmes's address
UPDATE Patient
SET address = '221b Baker St, London NW1 6XE, United Kingdom'
WHERE name = 'Sherlock Holmes';

SELECT * FROM Patient;

 * postgresql://postgres:***@localhost/my_data
1 rows affected.
5 rows affected.


id,dob,address,name,weight_kg,med_hist,photo,primary_care
2,2000-08-31,,Nancy Drew,54.54,,,
3,1981-05-31,,Veronica Mars,50.0,,,
4,1975-03-14,,Harry Blackstone,65.4,,,
5,1970-11-05,,Odafin Fin Tutuola,74.7,,,
1,1853-01-06,"221b Baker St, London NW1 6XE, United Kingdom",Sherlock Holmes,80.6,,,


In [None]:
%%sql
--Now let's delete 'Harry Blackstone' from our patient database
DELETE FROM Patient
WHERE name = 'Harry Blackstone';

 * postgresql://postgres:***@localhost/my_data
1 rows affected.


[]

In [None]:
%%sql
--Finally, let's delete ALL the data from the lab table, just to see what happens
DELETE FROM Lab;
SELECT * FROM Lab;

 * postgresql://postgres:***@localhost/my_data
3 rows affected.
0 rows affected.


id,date,glucose,patient_id,doctor_id,ldl


#Bulk Data Import
It's very time-consuming to enter data in this way and, in general, we don't need to do this. In particular, we often want to load data from **comma-separated values (CSV)** files. Postgres makes this easy with its COPY command, and most other RDBMSs offer a similar bulk-load tool.

In [None]:
%%sql
--First, let's delete the existing data from our tables
DELETE FROM Patient;
DELETE FROM Doctor;
DELETE FROM Lab;

 * postgresql://postgres:***@localhost/my_data
4 rows affected.
3 rows affected.
0 rows affected.


[]

In [None]:
# Now, let's download our data files
!curl -O 'https://raw.githubusercontent.com/brendanpshea/database_class/main/doctor_data.csv'
!curl -O 'https://raw.githubusercontent.com/brendanpshea/database_class/main/patient_data.csv'
!curl -O 'https://raw.githubusercontent.com/brendanpshea/database_class/main/lab_data.csv'

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1038  100  1038    0     0   5164      0 --:--:-- --:--:-- --:--:--  5190
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 43762  100 43762    0     0   188k      0 --:--:-- --:--:-- --:--:--  188k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 86786  100 86786    0     0   399k      0 --:--:-- --:--:-- --:--:--  399k


In [None]:
%%sql
-- Finally, let's load them into our database
COPY Doctor(id, name, salary, med_license) FROM '/content/doctor_data.csv'
  DELIMITER ',';

 * postgresql://postgres:***@localhost/my_data
25 rows affected.


[]

In [None]:
%%sql
--load patient data
COPY Patient(id,name,dob,address,weight_kg,primary_care) FROM '/content/patient_data.csv'
  DELIMITER ',' CSV;

 * postgresql://postgres:***@localhost/my_data
499 rows affected.


[]

In [None]:
%%sql
-- load lab data
COPY Lab(patient_id, doctor_id, date, glucose, ldl) 
FROM '/content/lab_data.csv' DELIMITER ',' CSV;

 * postgresql://postgres:***@localhost/my_data
1999 rows affected.


[]
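If your database (or your permissions) doesn't offer a COPY-style command, the same job can be done from a short script. The sketch below is one hedged way to do it, assuming Python's built-in `csv` and `sqlite3` modules and an in-memory SQLite database in place of Postgres. The two CSV rows are made up for illustration and are not the contents of `doctor_data.csv`.

```python
import csv
import io
import sqlite3

# Hypothetical CSV text, in the same column order that the COPY command expects.
csv_text = (
    "df76fx,Gregory House,150000.00,MN_65dsfh43s\n"
    "nh38nk,Meredith Grey,165000.00,MN_863h53er3\n"
)

conn = sqlite3.connect(":memory:")  # SQLite stand-in for our Postgres database
conn.execute("""
    CREATE TABLE Doctor(
      id          CHAR(6) PRIMARY KEY,
      name        VARCHAR(32) NOT NULL,
      salary      NUMERIC(8,2),
      med_license CHAR(12) UNIQUE
    )
""")

# csv.reader yields one list of strings per line; executemany inserts each one.
reader = csv.reader(io.StringIO(csv_text))
conn.executemany(
    "INSERT INTO Doctor (id, name, salary, med_license) VALUES (?, ?, ?, ?)", reader)

loaded = conn.execute("SELECT COUNT(*) FROM Doctor").fetchone()[0]
top_earner = conn.execute(
    "SELECT name FROM Doctor ORDER BY salary DESC LIMIT 1").fetchone()[0]
print(loaded, top_earner)
```

For a real file, you would replace `io.StringIO(csv_text)` with an opened file object. For large files, a built-in bulk loader like COPY is typically much faster than row-by-row inserts.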

In [None]:
%%sql
--Finally, let's see how this all fits together
--Which patients have the highest LDL?
SELECT P.name as "Patient", D.name as "Doctor", 
  P.weight_kg, L.date as "Lab Date", L.glucose, L.ldl
  FROM Lab L JOIN Patient P ON L.patient_id = P.id
  JOIN Doctor D ON L.doctor_id = D.id
  ORDER by L.ldl DESC
  LIMIT 20;

 * postgresql://postgres:***@localhost/my_data
20 rows affected.


Patient,Doctor,weight_kg,Lab Date,glucose,ldl
Sharon Harris,Eric Guzman,58.2,2020-02-23,66.24,217.21
Michael Miller,Raymond Jacobson,53.4,2021-10-14,94.95,200.08
Casey Hess DDS,Toni Copeland,54.2,2020-12-11,78.12,199.89
Crystal Chapman,Christopher Smith,76.1,2021-08-29,68.21,193.91
Frank Jones,Stephanie King,54.7,2022-01-14,84.35,190.02
Brandi Edwards,Carol Lee,68.9,2021-03-22,63.0,186.12
Steven Williams,Toni Copeland,66.1,2020-10-02,88.5,185.96
Ashley West,Eric Guzman,79.9,2022-01-09,120.08,185.23
Christina Murray,Timothy Salinas,48.9,2020-02-29,81.71,184.27
Blake Potter,Eric Guzman,57.1,2020-10-01,91.42,184.22
