# Advanced SQL

### Using distinct and coalesce

- The coalesce function, given two or more parameters, returns the first value that is not null


In [1]:
import psycopg2
%load_ext sql

In [2]:
%%sql
postgresql://postgres:password@localhost/advanced_sql

In [3]:
%%sql
select coalesce(NULL, 'test')

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


coalesce
test


- the function returns 'test' because the first argument is NULL and the second is not null

In [4]:
%%sql
select coalesce('orange', 'test')

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


coalesce
orange


- the function returns the first argument, because 'orange' is not null
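
The rule also holds for more than two arguments. The following is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory database in place of the PostgreSQL connection used in this notebook (an assumption made only to keep the example self-contained):

```python
import sqlite3

# Assumption: SQLite in memory stands in for the PostgreSQL database of the notebook.
conn = sqlite3.connect(":memory:")

# The first two arguments are NULL, so the third one is returned.
first_non_null = conn.execute("select coalesce(NULL, NULL, 'third')").fetchone()[0]

# When every argument is NULL, the result is NULL (None in Python).
all_null = conn.execute("select coalesce(NULL, NULL)").fetchone()[0]

print(first_non_null)
print(all_null)
conn.close()
```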

In [5]:
%%sql 
select description, coalesce(description, 'No description') 
from categories order by description

 * postgresql://postgres:***@localhost/advanced_sql
6 rows affected.


description,coalesce
fruits,fruits
fruits,fruits
fruits,fruits
vegetable,vegetable
vegetable,vegetable
,No description


In [6]:
%%sql 
select coalesce(description, 'No description') as description
from categories

 * postgresql://postgres:***@localhost/advanced_sql
6 rows affected.


description
fruits
fruits
vegetable
No description
fruits
vegetable


- if we want an alias that contains spaces or capital letters, we have to enclose it in double quotes ("").

In [7]:
%%sql 
select coalesce(description, 'No description') as "Description"
from categories

 * postgresql://postgres:***@localhost/advanced_sql
6 rows affected.


Description
fruits
fruits
vegetable
No description
fruits
vegetable


In [8]:
%%sql
select distinct coalesce(description, 'No description') as "Description"
from categories

 * postgresql://postgres:***@localhost/advanced_sql
3 rows affected.


Description
fruits
vegetable
No description


- we have used the select distinct statement
- this query returns only distinct values, so each description appears once
- internally, the database has to sort or hash the data to find the duplicates
- which means that the query may become slower as the number of records increases
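
One way to see what select distinct does is to compare it with a group by on the same expression. The sketch below assumes SQLite (via Python's sqlite3 module) instead of PostgreSQL and re-creates a small categories table with the descriptions shown above:

```python
import sqlite3

# Assumption: an in-memory SQLite table that mirrors the categories table of the notebook.
conn = sqlite3.connect(":memory:")
conn.execute("create table categories (pk integer primary key, title text, description text)")
conn.executemany(
    "insert into categories values (?, ?, ?)",
    [
        (1, "apple", "fruits"),
        (2, "orange", "fruits"),
        (3, "lettuce", "vegetable"),
        (4, "lemon", None),
        (5, "apricot", "fruits"),
        (6, "tomato", "vegetable"),
    ],
)

distinct_rows = conn.execute(
    "select distinct coalesce(description, 'No description') from categories"
).fetchall()

# Grouping by the same expression removes duplicates in the same way.
grouped_rows = conn.execute(
    "select coalesce(description, 'No description') as d from categories group by d"
).fetchall()

print(sorted(distinct_rows))
print(sorted(grouped_rows))
conn.close()
```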

## Using subqueries

- subqueries are nested queries

### Using the IN/NOT IN condition

Get the rows with pk 1 or 2:

In [9]:
%%sql
select * from categories where pk=1 or pk=2 

 * postgresql://postgres:***@localhost/advanced_sql
2 rows affected.


pk,title,description
1,apple,fruits
2,orange,fruits


In [10]:
%%sql
select * from categories where pk in (1, 2) 

 * postgresql://postgres:***@localhost/advanced_sql
2 rows affected.


pk,title,description
1,apple,fruits
2,orange,fruits


- get all rows except those with pk 1 or 2:

In [11]:
%%sql
select * from categories where not (pk=1 or pk=2) 

 * postgresql://postgres:***@localhost/advanced_sql
4 rows affected.


pk,title,description
3,lettuce,vegetable
4,lemon,
5,apricot,fruits
6,tomato,vegetable


- the *not in* operator reverses the functionality of the *in* operator

In [12]:
%%sql
select * from categories where pk not in (1, 2) 

 * postgresql://postgres:***@localhost/advanced_sql
4 rows affected.


pk,title,description
3,lettuce,vegetable
4,lemon,
5,apricot,fruits
6,tomato,vegetable


The records in the posts table:

In [13]:
%%sql
select pk, title, content, author, category from posts

 * postgresql://postgres:***@localhost/advanced_sql
6 rows affected.


pk,title,content,author,category
1,my orange,my orange is the best orange in the world,1,2
2,my apple,my apple is the best orange in the world,1,1
3,Re:my orange,No! It's my orange the best orange in the world,2,2
4,my tomato,my tomato is the best orange in the world,2,6
5,my new orange,this my post on my new orange,1,2
6,my banana,hello b,10,11


Search for all posts that belong to the orange category using a subquery.


In [14]:
%%sql
select pk,title, content, author, category from posts where category in 
(select pk from categories where title = 'orange')

 * postgresql://postgres:***@localhost/advanced_sql
3 rows affected.


pk,title,content,author,category
1,my orange,my orange is the best orange in the world,1,2
3,Re:my orange,No! It's my orange the best orange in the world,2,2
5,my new orange,this my post on my new orange,1,2


In [15]:
%%sql
select pk from categories where title = 'orange'

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


pk
2


In [16]:
%%sql
select pk,title, content, author, category from posts where category not in 
(select pk from categories where title = 'orange')

 * postgresql://postgres:***@localhost/advanced_sql
3 rows affected.


pk,title,content,author,category
2,my apple,my apple is the best orange in the world,1,1
4,my tomato,my tomato is the best orange in the world,2,6
6,my banana,hello b,10,11


In [17]:
%%sql
select p.pk as p_pk,p.title, p.category, c.pk as c_pk, c.title from posts as p, categories as c 
where p.category = c.pk and c.title = 'orange'



 * postgresql://postgres:***@localhost/advanced_sql
3 rows affected.


p_pk,title,category,c_pk,title_1
1,my orange,2,2,orange
3,Re:my orange,2,2,orange
5,my new orange,2,2,orange


### Using the EXISTS/NOT EXISTS condition

The EXISTS condition checks whether a subquery returns at least one row; if it does, the condition is true.
For example:

In [18]:
%%sql
select pk,title, content, author, category from posts where exists 
(select pk from categories where title = 'orange' and posts.category=categories.pk)

 * postgresql://postgres:***@localhost/advanced_sql
3 rows affected.


pk,title,content,author,category
1,my orange,my orange is the best orange in the world,1,2
3,Re:my orange,No! It's my orange the best orange in the world,2,2
5,my new orange,this my post on my new orange,1,2


- for each post in the posts table, the subquery looks in the categories table for a category that matches the post (posts.category=categories.pk) and has the title 'orange'
- the post is returned only if the subquery finds at least one such row

Both queries written with the in condition and with the exists condition are called **semi-join queries**.
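
The NOT EXISTS form reverses the test and keeps the posts for which the subquery finds no row. A minimal sketch, assuming SQLite through Python's sqlite3 module and a reduced copy of the posts and categories tables (only the columns needed here):

```python
import sqlite3

# Assumption: in-memory SQLite tables that mirror the notebook data.
conn = sqlite3.connect(":memory:")
conn.execute("create table categories (pk integer primary key, title text)")
conn.execute("create table posts (pk integer primary key, title text, category integer)")
conn.executemany(
    "insert into categories values (?, ?)",
    [(1, "apple"), (2, "orange"), (3, "lettuce"), (4, "lemon"), (5, "apricot"), (6, "tomato")],
)
conn.executemany(
    "insert into posts values (?, ?, ?)",
    [
        (1, "my orange", 2),
        (2, "my apple", 1),
        (3, "Re:my orange", 2),
        (4, "my tomato", 6),
        (5, "my new orange", 2),
        (6, "my banana", 11),
    ],
)

# Keep every post for which no matching 'orange' category exists.
not_orange = conn.execute(
    """
    select pk, title from posts
    where not exists (
        select 1 from categories
        where title = 'orange' and posts.category = categories.pk
    )
    order by pk
    """
).fetchall()

print(not_orange)
conn.close()
```

The post with category 11 is also returned, because no category row matches it at all.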

## Learning joins
- a join combines rows from two or more tables

For example, the following query returns all the combinations of the rows from the categories and the posts tables:

In [19]:
%%sql
select c.pk,c.title,p.pk,p.category,p.title from categories c,posts p




 * postgresql://postgres:***@localhost/advanced_sql
36 rows affected.


pk,title,pk_1,category,title_1
1,apple,1,2,my orange
2,orange,1,2,my orange
3,lettuce,1,2,my orange
4,lemon,1,2,my orange
5,apricot,1,2,my orange
6,tomato,1,2,my orange
1,apple,2,1,my apple
2,orange,2,1,my apple
3,lettuce,2,1,my apple
4,lemon,2,1,my apple


- this query makes a Cartesian product between categories and posts.
- it is called a **cross join**
- it can also be written as:

In [20]:
%%sql
select c.pk,c.title,p.pk,p.category,p.title from categories c CROSS JOIN posts p

 * postgresql://postgres:***@localhost/advanced_sql
36 rows affected.


pk,title,pk_1,category,title_1
1,apple,1,2,my orange
2,orange,1,2,my orange
3,lettuce,1,2,my orange
4,lemon,1,2,my orange
5,apricot,1,2,my orange
6,tomato,1,2,my orange
1,apple,2,1,my apple
2,orange,2,1,my apple
3,lettuce,2,1,my apple
4,lemon,2,1,my apple


![](cross-join.png)
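
The size of a cross join is the product of the two row counts, which is why 6 categories and 6 posts give 36 rows above. A small sketch of the same effect, assuming SQLite via Python's sqlite3 module and two reduced, hypothetical tables with 3 and 2 rows:

```python
import sqlite3

# Assumption: tiny in-memory tables, not the full tables of the notebook.
conn = sqlite3.connect(":memory:")
conn.execute("create table categories (pk integer, title text)")
conn.execute("create table posts (pk integer, title text)")
conn.executemany(
    "insert into categories values (?, ?)",
    [(1, "apple"), (2, "orange"), (3, "lettuce")],
)
conn.executemany(
    "insert into posts values (?, ?)",
    [(1, "my orange"), (2, "my apple")],
)

# Every category is paired with every post: 3 * 2 combinations.
pairs = conn.execute(
    "select c.title, p.title from categories c cross join posts p"
).fetchall()

print(len(pairs))
conn.close()
```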

### Using INNER JOIN

- The inner join keyword selects records that have matching values in both tables

![](inner_join.png)

In [21]:
%%sql
select c.pk,c.title,p.pk,p.category,p.title from categories c,posts p 
where c.pk=p.category

 * postgresql://postgres:***@localhost/advanced_sql
5 rows affected.


pk,title,pk_1,category,title_1
2,orange,1,2,my orange
1,apple,2,1,my apple
2,orange,3,2,Re:my orange
6,tomato,4,6,my tomato
2,orange,5,2,my new orange


In [22]:
%%sql
select c.pk,c.title,p.pk,p.category,p.title from categories c 
inner join posts p on c.pk=p.category

 * postgresql://postgres:***@localhost/advanced_sql
5 rows affected.


pk,title,pk_1,category,title_1
2,orange,1,2,my orange
1,apple,2,1,my apple
2,orange,3,2,Re:my orange
6,tomato,4,6,my tomato
2,orange,5,2,my new orange


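### Inner JOIN versus EXISTS/IN
- Using an inner join, we can rewrite queries that are written with the IN or EXISTS condition.
- The join form is often preferred because it can perform better than the IN or EXISTS form, although the query planner frequently produces a similar plan for both.
- Unlike IN or EXISTS, a join can return the same row more than once when several rows match on the other side.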



In [23]:
%%sql
select p.pk,p.title,p.content,p.author,p.category from categories c 
inner join posts p on c.pk=p.category where c.title='orange'

 * postgresql://postgres:***@localhost/advanced_sql
3 rows affected.


pk,title,content,author,category
1,my orange,my orange is the best orange in the world,1,2
3,Re:my orange,No! It's my orange the best orange in the world,2,2
5,my new orange,this my post on my new orange,1,2
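
One difference to keep in mind when replacing EXISTS with a join is row multiplication. The sketch below lists the categories that have at least one post, once with an inner join and once with EXISTS. It assumes SQLite via Python's sqlite3 module and reduced copies of the two tables:

```python
import sqlite3

# Assumption: in-memory SQLite tables that mirror the notebook data.
conn = sqlite3.connect(":memory:")
conn.execute("create table categories (pk integer primary key, title text)")
conn.execute("create table posts (pk integer primary key, title text, category integer)")
conn.executemany(
    "insert into categories values (?, ?)",
    [(1, "apple"), (2, "orange"), (3, "lettuce"), (4, "lemon"), (5, "apricot"), (6, "tomato")],
)
conn.executemany(
    "insert into posts values (?, ?, ?)",
    [
        (1, "my orange", 2),
        (2, "my apple", 1),
        (3, "Re:my orange", 2),
        (4, "my tomato", 6),
        (5, "my new orange", 2),
        (6, "my banana", 11),
    ],
)

# The join returns one row per matching post, so 'orange' appears three times.
join_rows = conn.execute(
    "select c.pk, c.title from categories c "
    "inner join posts p on p.category = c.pk order by c.pk"
).fetchall()

# EXISTS returns each category at most once.
exists_rows = conn.execute(
    "select c.pk, c.title from categories c "
    "where exists (select 1 from posts p where p.category = c.pk) order by c.pk"
).fetchall()

# Adding distinct to the join gives the same rows as EXISTS.
distinct_join_rows = conn.execute(
    "select distinct c.pk, c.title from categories c "
    "inner join posts p on p.category = c.pk order by c.pk"
).fetchall()

print(len(join_rows), len(exists_rows))
conn.close()
```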


### Using Left joins



In [24]:
%%sql
select c.*,p.category, p.title from categories c 
left join posts p on p.category=c.pk

 * postgresql://postgres:***@localhost/advanced_sql
8 rows affected.


pk,title,description,category,title_1
2,orange,fruits,2.0,my orange
1,apple,fruits,1.0,my apple
2,orange,fruits,2.0,Re:my orange
6,tomato,vegetable,6.0,my tomato
2,orange,fruits,2.0,my new orange
5,apricot,fruits,,
4,lemon,,,
3,lettuce,vegetable,,


- this query returns all records of the categories table together with the matching records
from the posts table.
- if the second table (posts) has no match for a category, its columns are null in the result.

![](left-join.png)

- Suppose we want to search for all categories that do not have posts:

In [25]:
%%sql
select c.* from categories c
where c.pk not in 
(select category from posts)


 * postgresql://postgres:***@localhost/advanced_sql
3 rows affected.


pk,title,description
3,lettuce,vegetable
4,lemon,
5,apricot,fruits


In [26]:
%%sql
select c.* from categories c 
left join posts p on p.category=c.pk
where p.category is null

 * postgresql://postgres:***@localhost/advanced_sql
3 rows affected.


pk,title,description
3,lettuce,vegetable
4,lemon,
5,apricot,fruits
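
The two queries agree here because no post has a NULL category. If the subquery of a NOT IN returns a NULL, the condition can never be true and no row is returned, while the left join form is not affected. A sketch of that pitfall, assuming SQLite via Python's sqlite3 module, reduced tables, and one extra hypothetical post (pk 7) without a category:

```python
import sqlite3

# Assumption: in-memory SQLite tables; the post with pk 7 is invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute("create table categories (pk integer primary key, title text)")
conn.execute("create table posts (pk integer primary key, title text, category integer)")
conn.executemany(
    "insert into categories values (?, ?)",
    [(1, "apple"), (2, "orange"), (3, "lettuce"), (4, "lemon"), (5, "apricot"), (6, "tomato")],
)
conn.executemany(
    "insert into posts values (?, ?, ?)",
    [
        (1, "my orange", 2),
        (2, "my apple", 1),
        (3, "Re:my orange", 2),
        (4, "my tomato", 6),
        (5, "my new orange", 2),
        (6, "my banana", 11),
        (7, "draft without category", None),
    ],
)

# NOT IN against a list that contains NULL: the comparison is unknown, so no row qualifies.
not_in_rows = conn.execute(
    "select c.pk from categories c where c.pk not in (select category from posts)"
).fetchall()

# Filtering the NULLs out of the subquery restores the expected result.
not_in_filtered = conn.execute(
    "select c.pk from categories c "
    "where c.pk not in (select category from posts where category is not null) "
    "order by c.pk"
).fetchall()

# The left join form gives the expected result without extra filtering.
left_join_rows = conn.execute(
    "select c.pk from categories c "
    "left join posts p on p.category = c.pk "
    "where p.category is null order by c.pk"
).fetchall()

print(not_in_rows, not_in_filtered, left_join_rows)
conn.close()
```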


### Using Right join
- we can obtain the same result as the left join above by swapping the two tables and using a right join:


In [27]:
%%sql
select c.*,p.category,p.title from posts p right join categories c on c.pk=p.category;

 * postgresql://postgres:***@localhost/advanced_sql
8 rows affected.


pk,title,description,category,title_1
2,orange,fruits,2.0,my orange
1,apple,fruits,1.0,my apple
2,orange,fruits,2.0,Re:my orange
6,tomato,vegetable,6.0,my tomato
2,orange,fruits,2.0,my new orange
5,apricot,fruits,,
4,lemon,,,
3,lettuce,vegetable,,


In [28]:
%%sql
select c.*,p.category, p.title from categories c 
right join posts p on p.category=c.pk

 * postgresql://postgres:***@localhost/advanced_sql
6 rows affected.


pk,title,description,category,title_1
2.0,orange,fruits,2,my orange
1.0,apple,fruits,1,my apple
2.0,orange,fruits,2,Re:my orange
6.0,tomato,vegetable,6,my tomato
2.0,orange,fruits,2,my new orange
,,,11,my banana


![](right-join.png)

### Full outer join
- a full outer join combines the results of the left join and the right join: it returns the matched rows plus the unmatched rows from both tables

![](full-join.png)

In [29]:
%%sql
select *
from categories c
full join posts p on p.category = c.pk 



 * postgresql://postgres:***@localhost/advanced_sql
9 rows affected.


pk,title,description,pk_1,title_1,content,author,category
2.0,orange,fruits,1.0,my orange,my orange is the best orange in the world,1.0,2.0
1.0,apple,fruits,2.0,my apple,my apple is the best orange in the world,1.0,1.0
2.0,orange,fruits,3.0,Re:my orange,No! It's my orange the best orange in the world,2.0,2.0
6.0,tomato,vegetable,4.0,my tomato,my tomato is the best orange in the world,2.0,6.0
2.0,orange,fruits,5.0,my new orange,this my post on my new orange,1.0,2.0
,,,6.0,my banana,hello b,10.0,11.0
5.0,apricot,fruits,,,,,
4.0,lemon,,,,,,
3.0,lettuce,vegetable,,,,,
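
The idea of "left join plus right join" can be spelled out by hand: take the left join and append the posts that match no category. The sketch below does this on purpose without the full join keyword, assuming SQLite via Python's sqlite3 module and reduced copies of the tables:

```python
import sqlite3

# Assumption: in-memory SQLite tables that mirror the notebook data.
conn = sqlite3.connect(":memory:")
conn.execute("create table categories (pk integer primary key, title text)")
conn.execute("create table posts (pk integer primary key, title text, category integer)")
conn.executemany(
    "insert into categories values (?, ?)",
    [(1, "apple"), (2, "orange"), (3, "lettuce"), (4, "lemon"), (5, "apricot"), (6, "tomato")],
)
conn.executemany(
    "insert into posts values (?, ?, ?)",
    [
        (1, "my orange", 2),
        (2, "my apple", 1),
        (3, "Re:my orange", 2),
        (4, "my tomato", 6),
        (5, "my new orange", 2),
        (6, "my banana", 11),
    ],
)

full_rows = conn.execute(
    """
    -- all categories, with their posts when there are any
    select c.pk, c.title, p.pk, p.title
    from categories c left join posts p on p.category = c.pk
    union all
    -- plus the posts that have no matching category
    select null, null, p.pk, p.title
    from posts p
    where not exists (select 1 from categories c where c.pk = p.category)
    """
).fetchall()

print(len(full_rows))
conn.close()
```

The row count matches the 9 rows returned by the full join above.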


### Using self-join

Suppose we wanted to find all posts that belong to author 2 that have the same
category as those entered by author 1.

Step 1: search all records that belong to author 1

In [30]:
%%sql
select distinct p1.title, p1.author, p1.category from posts p1
where p1.author = 1

 * postgresql://postgres:***@localhost/advanced_sql
3 rows affected.


title,author,category
my apple,1,1
my new orange,1,2
my orange,1,2


Step 2: search all records that belong to author 2

In [31]:
%%sql
select distinct p2.title, p2.author, p2.category from posts p2
where p2.author = 2

 * postgresql://postgres:***@localhost/advanced_sql
2 rows affected.


title,author,category
my tomato,2,6
Re:my orange,2,2


In [32]:
%%sql
select distinct p2.title, p2.author, p2.category from posts p2, posts p1
where p1.category=p2.category and
p1.author = 1 and p2.author=2


 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


title,author,category
Re:my orange,2,2


In [33]:
%%sql
select distinct p2.title, p2.author, p2.category from posts p2
inner join posts p1 on (p1.category=p2.category)
where p1.author = 1 and p2.author=2

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


title,author,category
Re:my orange,2,2


- aliases must be used for the table names when a self-join is performed

## Multiple joins

In [34]:
%%sql
select * from student

 * postgresql://postgres:***@localhost/advanced_sql
12 rows affected.


id,name,city,mentor_id,local_mentor
12,Wayne Green,New York,,1
2,Maria Highsmith,New York,3.0,1
10,John Goldwin,Chicago,6.0,2
3,Aimaar Abdul,Chicago,1.0,2
7,Irmgard Seekircher,Berlin,7.0,3
5,Gerald Hutticher,Berlin,6.0,3
4,Gudrun Schmidt,Berlin,5.0,3
11,Emilio Ramiro,Barcelona,,6
6,Itzi Elizabal,Barcelona,4.0,6
1,Dolores Perez,Barcelona,2.0,6


In [35]:
%%sql
select * from mentor

 * postgresql://postgres:***@localhost/advanced_sql
9 rows affected.


id,name,city
1,Peter Smith,New York
2,Laura Wild,Chicago
3,Julius Maxim,Berlin
4,Melinda O'Connor,Berlin
5,Patricia Boulard,Marseille
6,Julia Vila,Barcelona
7,Fabienne Martin,Paris
8,Rose Dupond,Brussels
9,Ahmed Ali,Marseille


In [36]:
%%sql
SELECT
    s.name,
    m.name
FROM student s
LEFT JOIN mentor m ON s.mentor_id = m.id

ORDER BY s.name;


 * postgresql://postgres:***@localhost/advanced_sql
12 rows affected.


name,name_1
Aimaar Abdul,Peter Smith
Alex Anjou,Julius Maxim
Christian Blanc,Melinda O'Connor
Dolores Perez,Laura Wild
Emilio Ramiro,
Gerald Hutticher,Julia Vila
Gudrun Schmidt,Patricia Boulard
Irmgard Seekircher,Fabienne Martin
Itzi Elizabal,Melinda O'Connor
John Goldwin,Julia Vila


In [37]:
%%sql
SELECT
    s.name,
    m.name
FROM student s
LEFT JOIN mentor m ON s.local_mentor = m.id

ORDER BY s.name;

 * postgresql://postgres:***@localhost/advanced_sql
12 rows affected.


name,name_1
Aimaar Abdul,Laura Wild
Alex Anjou,Fabienne Martin
Christian Blanc,Fabienne Martin
Dolores Perez,Julia Vila
Emilio Ramiro,Julia Vila
Gerald Hutticher,Julius Maxim
Gudrun Schmidt,Julius Maxim
Irmgard Seekircher,Julius Maxim
Itzi Elizabal,Julia Vila
John Goldwin,Laura Wild


In [38]:
%%sql
SELECT
    s.name,
    m1.name,
    m2.name
FROM student s
LEFT JOIN mentor m1 ON s.mentor_id = m1.id
LEFT JOIN mentor m2 ON s.local_mentor = m2.id

ORDER BY s.name;

 * postgresql://postgres:***@localhost/advanced_sql
12 rows affected.


name,name_1,name_2
Aimaar Abdul,Peter Smith,Laura Wild
Alex Anjou,Julius Maxim,Fabienne Martin
Christian Blanc,Melinda O'Connor,Fabienne Martin
Dolores Perez,Laura Wild,Julia Vila
Emilio Ramiro,,Julia Vila
Gerald Hutticher,Julia Vila,Julius Maxim
Gudrun Schmidt,Patricia Boulard,Julius Maxim
Irmgard Seekircher,Fabienne Martin,Julius Maxim
Itzi Elizabal,Melinda O'Connor,Julia Vila
John Goldwin,Julia Vila,Laura Wild


In [39]:
%%sql
select s.name, m.name, m2.name from student s, mentor m, mentor m2
where m.id = s.mentor_id and
m2.id = s.local_mentor
order by s.name

 * postgresql://postgres:***@localhost/advanced_sql
10 rows affected.


name,name_1,name_2
Aimaar Abdul,Peter Smith,Laura Wild
Alex Anjou,Julius Maxim,Fabienne Martin
Christian Blanc,Melinda O'Connor,Fabienne Martin
Dolores Perez,Laura Wild,Julia Vila
Gerald Hutticher,Julia Vila,Julius Maxim
Gudrun Schmidt,Patricia Boulard,Julius Maxim
Irmgard Seekircher,Fabienne Martin,Julius Maxim
Itzi Elizabal,Melinda O'Connor,Julia Vila
John Goldwin,Julia Vila,Laura Wild
Maria Highsmith,Julius Maxim,Peter Smith


#### 16.3.23 - Session with Jaman - Advanced Datatypes



### Enumerated Types
- Enumerated (enum) types are data types that comprise a static, ordered set of values.
- An example of an enum type might be the days of the week, or a set of status values for a piece of data.
- Enum types are created using the CREATE TYPE command, for example:

In [40]:
%%sql
CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy', 'happy');
-- CREATE TYPE mood3 AS ENUM ('sad', 'ok', 'happy', '1');
-- CREATE TYPE mood2 AS ENUM ('sad', 'ok', 'happy', 'happy'); 
-- duplicate key value violates unique constraint

 * postgresql://postgres:***@localhost/advanced_sql
(psycopg2.errors.DuplicateObject) type "mood" already exists

[SQL: CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy', 'happy');]
(Background on this error at: https://sqlalche.me/e/20/f405)


In [41]:
%%sql
CREATE TABLE person (
    name text,
    current_mood mood
);
INSERT INTO person VALUES ('Moe', 'happy');
SELECT * FROM person WHERE current_mood = 'happy';

 * postgresql://postgres:***@localhost/advanced_sql
(psycopg2.errors.DuplicateTable) relation "person" already exists

[SQL: CREATE TABLE person (
    name text,
    current_mood mood
);]
(Background on this error at: https://sqlalche.me/e/20/f405)


In [42]:
%%sql
INSERT INTO person VALUES ('Larry', 'sad');
INSERT INTO person VALUES ('Curly', 'ok');


 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.
1 rows affected.


[]

### Ordering
- The ordering of the values in an enum type is the order in which the values were listed when the type was created.

In [43]:
%%sql
SELECT * FROM person
order by current_mood

 * postgresql://postgres:***@localhost/advanced_sql
8 rows affected.


name,current_mood
Larry,sad
Larry,sad
Larry,sad
Curly,ok
Curly,ok
Curly,ok
Moe,happy
bob,blissful


### Ordering alphabetically

In [44]:
%%sql
SELECT * FROM person
order by current_mood::varchar

 * postgresql://postgres:***@localhost/advanced_sql
8 rows affected.


name,current_mood
bob,blissful
Moe,happy
Curly,ok
Curly,ok
Curly,ok
Larry,sad
Larry,sad
Larry,sad
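
The difference between the enum ordering and the alphabetical ordering can be sketched in plain Python. The label list below is a hypothetical declaration order (it assumes 'blissful' was declared after 'happy'), and the rows are a reduced sample of the person table:

```python
# Hypothetical declaration order of the enum labels.
mood_labels = ["sad", "ok", "happy", "blissful"]

# Position of each label in the declaration; this is what the enum ordering uses.
rank = {label: position for position, label in enumerate(mood_labels)}

people = [("Moe", "happy"), ("Larry", "sad"), ("Curly", "ok"), ("bob", "blissful")]

# Like: order by current_mood
by_enum_order = sorted(people, key=lambda row: rank[row[1]])

# Like: order by current_mood::varchar
by_text = sorted(people, key=lambda row: row[1])

# Like: where current_mood > 'sad'
greater_than_sad = [row for row in people if rank[row[1]] > rank["sad"]]

print(by_enum_order)
print(by_text)
print(greater_than_sad)
```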


In [45]:
%%sql
SELECT * FROM person
order by current_mood desc;

 * postgresql://postgres:***@localhost/advanced_sql
8 rows affected.


name,current_mood
bob,blissful
Moe,happy
Curly,ok
Curly,ok
Curly,ok
Larry,sad
Larry,sad
Larry,sad


In [46]:
%%sql
SELECT * FROM person WHERE current_mood > 'sad';

 * postgresql://postgres:***@localhost/advanced_sql
5 rows affected.


name,current_mood
Moe,happy
Curly,ok
Curly,ok
bob,blissful
Curly,ok


- Enum labels are case sensitive, so 'happy' is not the same as 'HAPPY'.
- White space in the labels is significant too.
- New values can be added to an existing enum type, and existing values can be renamed.
- Existing values cannot be removed from an enum type without dropping and re-creating the type.

In [47]:
%%sql
ALTER TYPE mood ADD VALUE 'very happy' AFTER 'happy';

 * postgresql://postgres:***@localhost/advanced_sql
Done.


[]

In [48]:
%%sql
INSERT INTO person VALUES ('bob', 'very happy');

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


[]

In [49]:
%%sql
SELECT * FROM person WHERE current_mood > 'sad'
order by current_mood

 * postgresql://postgres:***@localhost/advanced_sql
6 rows affected.


name,current_mood
Curly,ok
Curly,ok
Curly,ok
Moe,happy
bob,very happy
bob,blissful


In [50]:
%%sql 
ALTER TYPE mood RENAME VALUE 'very happy' TO 'blissful';

 * postgresql://postgres:***@localhost/advanced_sql
(psycopg2.errors.DuplicateObject) enum label "blissful" already exists

[SQL: ALTER TYPE mood RENAME VALUE 'very happy' TO 'blissful';]
(Background on this error at: https://sqlalche.me/e/20/f405)


In [51]:
%%sql
SELECT * FROM person WHERE current_mood > 'sad'
order by current_mood

 * postgresql://postgres:***@localhost/advanced_sql
6 rows affected.


name,current_mood
Curly,ok
Curly,ok
Curly,ok
Moe,happy
bob,very happy
bob,blissful


### UUID - Universally Unique Identifiers
- A UUID is a 128-bit quantity generated by an algorithm chosen to make it very unlikely that the same identifier will be generated by anyone else in the known universe using the same algorithm.
- These identifiers provide a better uniqueness guarantee than sequence generators, which are only unique within a single database.

For example:

In [52]:
cat /etc/machine-id

c9328289eab9467794ac0f031fdb958f


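- on some phones, the device identifier (IMEI) is displayed when you type *#06#
- an IMEI is a 15-digit number rather than a UUID; the uuid column named imei_number in the table below is used only for illustration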

In [53]:
%%sql
create table mobile_phones(
    id serial,
    brand_name varchar(30),
    model text,
    operating_sys text,
    imei_number uuid
)

 * postgresql://postgres:***@localhost/advanced_sql
(psycopg2.errors.DuplicateTable) relation "mobile_phones" already exists

[SQL: create table mobile_phones(
    id serial,
    brand_name varchar(30),
    model text,
    operating_sys text,
    imei_number uuid
)]
(Background on this error at: https://sqlalche.me/e/20/f405)


In [54]:
%%sql
create extension if not exists "uuid-ossp"


 * postgresql://postgres:***@localhost/advanced_sql
Done.


[]

In [55]:
%%sql
select uuid_generate_v4() -- better than v1; completely random 

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


uuid_generate_v4
ee4114ff-6950-4bd0-900a-756c489c8900


In [56]:
%%sql
select uuid_generate_v1() -- built from the MAC address and a timestamp; less private, because the MAC address is exposed

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


uuid_generate_v1
3806beb4-c715-11ed-a3fc-4bef70bfdade
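
The same two UUID versions are available in Python's standard uuid module, which makes it easy to inspect them outside the database. This is only a side illustration and not part of the uuid-ossp extension:

```python
import uuid

# Version 4: generated from random numbers.
random_id = uuid.uuid4()

# Version 1: generated from a node identifier (usually the MAC address) and a timestamp.
time_based_id = uuid.uuid1()

print(random_id, random_id.version)
print(time_based_id, time_based_id.version)

# A UUID is 16 bytes, which is 128 bits.
bit_length = len(random_id.bytes) * 8
print(bit_length)
```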


In [57]:
%%sql
insert into mobile_phones(
    brand_name, model, operating_sys, imei_number)
values
('Samsung', 'Galaxy S23', 'Android', uuid_generate_v4())

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


[]

In [58]:
%%sql
alter table mobile_phones 
alter column imei_number 
set default uuid_generate_v4()

 * postgresql://postgres:***@localhost/advanced_sql
Done.


[]

In [59]:
%%sql
insert into mobile_phones(
    brand_name, model, operating_sys)
values
('Apple', 'iphone 14 pro', 'ios')

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


[]

In [60]:
%%sql
select * from mobile_phones

 * postgresql://postgres:***@localhost/advanced_sql
8 rows affected.


id,brand_name,model,operating_sys,imei_number,specs,spec_binary
1,Samsung,Galaxy S23,Android,01852630-83b3-467f-8d5e-c8595fb211ec,,
2,Apple,iphone 14 pro,ios,fafa1ad1-c0f7-44d5-b990-43df1849e2d6,"{'camera': '12MP', 'Memory': '128GB'}",
3,Apple,iphone 14 pro,ios,ec8acd8e-018f-4e75-b38c-8985e183e91e,"{'camera': '12MP', 'Memory': '128GB'}",
4,Apple,iphone 14 pro,ios,e11dfc09-0c4a-40b2-b2de-3d14cb627991,"{'camera': {'front': '12mp', 'back': '48mp'}}",
5,Apple,iphone 14 pro,ios,016224ca-5fb6-4d27-ab18-36fadce5e295,,"{'camera': {'back': '48mp', 'front': '12mp'}}"
6,Apple,iphone 14 pro,ios,76a6ce77-d0e2-45a1-8029-84eb9308129d,{'camera': '12mb'},
7,Samsung,Galaxy S23,Android,1832cf40-87d1-4757-80f6-a56d050b6aab,,
8,Apple,iphone 14 pro,ios,05ebcda0-1e67-46c3-a55d-967f8a832136,,


### JSON Types


In [61]:
%%sql 
alter table mobile_phones 
add column specs JSON

 * postgresql://postgres:***@localhost/advanced_sql
(psycopg2.errors.DuplicateColumn) column "specs" of relation "mobile_phones" already exists

[SQL: alter table mobile_phones 
add column specs JSON]
(Background on this error at: https://sqlalche.me/e/20/f405)


In [62]:
%%sql

update mobile_phones
set specs = '{"camera": "12MP", "Memory":"128GB"}'
where brand_name = 'Apple'

 * postgresql://postgres:***@localhost/advanced_sql
6 rows affected.


[]

In [63]:
%%sql
select * from mobile_phones

 * postgresql://postgres:***@localhost/advanced_sql
8 rows affected.


id,brand_name,model,operating_sys,imei_number,specs,spec_binary
1,Samsung,Galaxy S23,Android,01852630-83b3-467f-8d5e-c8595fb211ec,,
7,Samsung,Galaxy S23,Android,1832cf40-87d1-4757-80f6-a56d050b6aab,,
2,Apple,iphone 14 pro,ios,fafa1ad1-c0f7-44d5-b990-43df1849e2d6,"{'camera': '12MP', 'Memory': '128GB'}",
3,Apple,iphone 14 pro,ios,ec8acd8e-018f-4e75-b38c-8985e183e91e,"{'camera': '12MP', 'Memory': '128GB'}",
4,Apple,iphone 14 pro,ios,e11dfc09-0c4a-40b2-b2de-3d14cb627991,"{'camera': '12MP', 'Memory': '128GB'}",
5,Apple,iphone 14 pro,ios,016224ca-5fb6-4d27-ab18-36fadce5e295,"{'camera': '12MP', 'Memory': '128GB'}","{'camera': {'back': '48mp', 'front': '12mp'}}"
6,Apple,iphone 14 pro,ios,76a6ce77-d0e2-45a1-8029-84eb9308129d,"{'camera': '12MP', 'Memory': '128GB'}",
8,Apple,iphone 14 pro,ios,05ebcda0-1e67-46c3-a55d-967f8a832136,"{'camera': '12MP', 'Memory': '128GB'}",


In [64]:
%%sql
select specs->'camera' from mobile_phones

 * postgresql://postgres:***@localhost/advanced_sql
8 rows affected.


?column?
""
""
12MP
12MP
12MP
12MP
12MP
12MP


In [65]:
%%sql
select pg_typeof(specs->'camera') from mobile_phones

 * postgresql://postgres:***@localhost/advanced_sql
8 rows affected.


pg_typeof
json
json
json
json
json
json
json
json


In [66]:
%%sql
select pg_typeof(specs->>'camera') from mobile_phones -- ->> returns the JSON value as text

 * postgresql://postgres:***@localhost/advanced_sql
8 rows affected.


pg_typeof
text
text
text
text
text
text
text
text
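
The difference between -> and ->> is easy to miss in this notebook, because the driver appears to decode JSON results into Python values before they are displayed. A rough analogy with Python's json module, using the specs document from above:

```python
import json

specs = '{"camera": "12MP", "Memory": "128GB"}'
camera_value = json.loads(specs)["camera"]

# Analogy for specs->'camera': the result is still JSON, so a string keeps its quotes.
as_json = json.dumps(camera_value)

# Analogy for specs->>'camera': the result is plain text without JSON quoting.
as_text = camera_value

print(as_json)
print(as_text)
```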


In [67]:
%%sql
SELECT '"foo"'::jsonb @> '"foo"'::jsonb;

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


?column?
True


In [68]:
%%sql
insert into mobile_phones(
    brand_name, model, operating_sys, specs)
values
('Apple', 'iphone 14 pro', 'ios', '{"camera":{"front":"12mp","back":"48mp"}}')

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


[]

In [69]:
%%sql
select specs#> '{camera, back}' from mobile_phones

 * postgresql://postgres:***@localhost/advanced_sql
9 rows affected.


?column?
""
""
""
""
""
""
""
""
48mp


In [70]:
%%sql
select specs->'camera'->'back' from mobile_phones

 * postgresql://postgres:***@localhost/advanced_sql
9 rows affected.


?column?
""
""
""
""
""
""
""
""
48mp
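
Both forms walk the document key by key and give NULL as soon as a step is missing, which is why only the nested document returns a value. A sketch of that lookup in Python, where extract_path is a hypothetical helper written for this example:

```python
import json


def extract_path(document, path):
    """Follow the keys in order; return None (like SQL NULL) if a step is missing."""
    current = document
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


nested_specs = json.loads('{"camera": {"front": "12mp", "back": "48mp"}}')
flat_specs = json.loads('{"camera": "12MP", "Memory": "128GB"}')

# Like specs #> '{camera, back}' or specs->'camera'->'back'
back_camera = extract_path(nested_specs, ["camera", "back"])

# 'camera' is a plain string here, so there is no 'back' key to follow.
missing = extract_path(flat_specs, ["camera", "back"])

print(back_camera)
print(missing)
```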


In [71]:
%%sql
alter table mobile_phones
add column spec_binary JSONB

 * postgresql://postgres:***@localhost/advanced_sql
(psycopg2.errors.DuplicateColumn) column "spec_binary" of relation "mobile_phones" already exists

[SQL: alter table mobile_phones
add column spec_binary JSONB]
(Background on this error at: https://sqlalche.me/e/20/f405)


In [72]:
%%sql
insert into mobile_phones(
    brand_name, model, operating_sys, spec_binary)
values
('Apple', 'iphone 14 pro', 'ios', '{"camera":{"front":"12mp","back":"48mp"}}')

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


[]

In [73]:
%%sql
select * from mobile_phones

 * postgresql://postgres:***@localhost/advanced_sql
10 rows affected.


id,brand_name,model,operating_sys,imei_number,specs,spec_binary
1,Samsung,Galaxy S23,Android,01852630-83b3-467f-8d5e-c8595fb211ec,,
7,Samsung,Galaxy S23,Android,1832cf40-87d1-4757-80f6-a56d050b6aab,,
2,Apple,iphone 14 pro,ios,fafa1ad1-c0f7-44d5-b990-43df1849e2d6,"{'camera': '12MP', 'Memory': '128GB'}",
3,Apple,iphone 14 pro,ios,ec8acd8e-018f-4e75-b38c-8985e183e91e,"{'camera': '12MP', 'Memory': '128GB'}",
4,Apple,iphone 14 pro,ios,e11dfc09-0c4a-40b2-b2de-3d14cb627991,"{'camera': '12MP', 'Memory': '128GB'}",
5,Apple,iphone 14 pro,ios,016224ca-5fb6-4d27-ab18-36fadce5e295,"{'camera': '12MP', 'Memory': '128GB'}","{'camera': {'back': '48mp', 'front': '12mp'}}"
6,Apple,iphone 14 pro,ios,76a6ce77-d0e2-45a1-8029-84eb9308129d,"{'camera': '12MP', 'Memory': '128GB'}",
8,Apple,iphone 14 pro,ios,05ebcda0-1e67-46c3-a55d-967f8a832136,"{'camera': '12MP', 'Memory': '128GB'}",
9,Apple,iphone 14 pro,ios,55bf785c-fe46-4809-badf-5e5a71a1beba,"{'camera': {'front': '12mp', 'back': '48mp'}}",
10,Apple,iphone 14 pro,ios,98490ac1-3dc6-4d5f-a6ac-6942572f39ce,,"{'camera': {'back': '48mp', 'front': '12mp'}}"


In [74]:
%%sql
select spec_binary@>'{"camera":{"front":"12mp","back":"48mp"}}' from mobile_phones

 * postgresql://postgres:***@localhost/advanced_sql
10 rows affected.


?column?
""
""
""
""
""
True
""
""
""
True


In [75]:
%%sql
select spec_binary?'{"camera":{"front":"12mp","back":"48mp"}}' from mobile_phones

 * postgresql://postgres:***@localhost/advanced_sql
10 rows affected.


?column?
""
""
""
""
""
False
""
""
""
False


In [76]:
%%sql
insert into mobile_phones(
    brand_name, model, operating_sys, specs)
values
('Apple', 'iphone 14 pro', 'ios', '{"camera": "12mb"}')

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


[]

In [77]:
%%sql
select spec_binary ? 'camera' from mobile_phones

 * postgresql://postgres:***@localhost/advanced_sql
11 rows affected.


?column?
""
""
""
""
""
True
""
""
""
True


In [78]:
%%sql
SELECT '{"camera": "12mb"}'::jsonb ? 'camera'

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


?column?
True
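
The results above show that @> tests whether one document is contained in another, while ? only tests whether a string is a top-level key. The following is a simplified Python sketch of both checks. The functions contains and has_key are hypothetical helpers, and they do not cover every rule of jsonb (for example, the special handling of scalars inside top-level arrays is left out):

```python
def contains(left, right):
    """Simplified model of jsonb containment (left @> right)."""
    if isinstance(left, dict) and isinstance(right, dict):
        # Every key/value pair of the right object must be contained in the left one.
        return all(key in left and contains(left[key], value) for key, value in right.items())
    if isinstance(left, list) and isinstance(right, list):
        # Every element of the right array must be contained in some left element.
        return all(any(contains(l_item, r_item) for l_item in left) for r_item in right)
    # Scalars must be equal and of the same type.
    return type(left) is type(right) and left == right


def has_key(document, key):
    """Simplified model of the ? operator: is the string a top-level key?"""
    return isinstance(document, dict) and key in document


spec = {"camera": {"front": "12mp", "back": "48mp"}}

print(contains(spec, {"camera": {"back": "48mp"}}))
print(contains(spec, {"camera": "12mp"}))
print(has_key(spec, "camera"))
print(has_key(spec, '{"camera":{"front":"12mp","back":"48mp"}}'))
```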


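- The || operator concatenates two jsonb values. For two objects it generates an object containing the union of their keys, taking the second object's value when there are duplicate keys. For two arrays it generates an array containing all the elements of both, as in the example below.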
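
The object case is not run in this notebook, so here is a rough Python analogy for both cases. The dictionaries are hypothetical values chosen for the example:

```python
# Hypothetical objects with one shared key ("memory").
left_object = {"camera": "12MP", "memory": "128GB"}
right_object = {"memory": "256GB", "color": "black"}

# Union of the keys; on a duplicate key the value of the second object wins.
merged = {**left_object, **right_object}

# For arrays, the elements of both sides are kept in order.
joined = ["a", "b"] + ["c", "d"]

print(merged)
print(joined)
```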

In [79]:
%%sql
select '["a", "b"]'::jsonb || '["c", "d"]'::jsonb

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


?column?
"['a', 'b', 'c', 'd']"


#### 17.3.23 - Session with Jaman - Advanced Datatypes II
- PostgreSQL allows columns of a table to be defined as variable-length multidimensional arrays. 
- Arrays of any built-in or user-defined base type, enum type, composite type, range type, or domain can be created.

In [80]:
%%sql 
create table aldi(
    id serial,
    city varchar(30),
    shop_number int[]
)

 * postgresql://postgres:***@localhost/advanced_sql
(psycopg2.errors.DuplicateTable) relation "aldi" already exists

[SQL: create table aldi(
    id serial,
    city varchar(30),
    shop_number int[]
)]
(Background on this error at: https://sqlalche.me/e/20/f405)


In [81]:
%%sql 
CREATE TABLE sal_emp (
    name            text,
    pay_by_quarter  integer[],
    schedule        text[][]
);

 * postgresql://postgres:***@localhost/advanced_sql
(psycopg2.errors.DuplicateTable) relation "sal_emp" already exists

[SQL: CREATE TABLE sal_emp (
    name            text,
    pay_by_quarter  integer[],
    schedule        text[][]
);]
(Background on this error at: https://sqlalche.me/e/20/f405)


In [82]:
%%sql
INSERT INTO sal_emp
    VALUES ('Bill',
    '{10000, 10000, 10000, 10000}',
    '{{"meeting", "lunch"}, {"training", "presentation"}}');

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


[]

In [83]:
%%sql
INSERT INTO sal_emp
    VALUES ('Carol',
    '{20000, 25000, 25000, 25000}',
    '{{"breakfast", "consulting"}, {"meeting", "lunch"}}');

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


[]

In [84]:
%%sql
insert into aldi(city, shop_number) values
('Berlin', Array[1, 2, 3, 8, 90])

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


[]

In [85]:
%%sql
select * from aldi;

 * postgresql://postgres:***@localhost/advanced_sql
3 rows affected.


id,city,shop_number,images,opend_on,opened_time,opened_dtz,diff_time
2,Hamburg,"[10000, 10000, 10000, 10000, 12, 13]",,,,,
1,Berlin,"[1, 2, 3, 8, 90, 12, 13]",,2023-03-17,09:10:40,2023-03-17 08:10:40+01:00,"768 days, 0:00:00"
3,Berlin,"[1, 2, 3, 8, 90]",,,,,


In [86]:
%%sql
insert into aldi(city, shop_number) values
('Hamburg', '{10000, 10000, 10000, 10000}')

 * postgresql://postgres:***@localhost/advanced_sql
(psycopg2.errors.InvalidTextRepresentation) malformed array literal: "(10000, 10000, 10000, 10000)"
LINE 2: ('Hamburg', '(10000, 10000, 10000, 10000)')
                    ^
DETAIL:  Array value must start with "{" or dimension information.

[SQL: insert into aldi(city, shop_number) values
('Hamburg', '(10000, 10000, 10000, 10000)')]
(Background on this error at: https://sqlalche.me/e/20/9h9h)


In [87]:
%%sql
select * from aldi;

 * postgresql://postgres:***@localhost/advanced_sql
3 rows affected.


id,city,shop_number,images,opend_on,opened_time,opened_dtz,diff_time
2,Hamburg,"[10000, 10000, 10000, 10000, 12, 13]",,,,,
1,Berlin,"[1, 2, 3, 8, 90, 12, 13]",,2023-03-17,09:10:40,2023-03-17 08:10:40+01:00,"768 days, 0:00:00"
3,Berlin,"[1, 2, 3, 8, 90]",,,,,


In [88]:
%%sql
select * from sal_emp

 * postgresql://postgres:***@localhost/advanced_sql
4 rows affected.


name,pay_by_quarter,schedule
Bill,"[10000, 10000, 10000, 10000]","[['meeting', 'lunch'], ['training', 'presentation']]"
Carol,"[20000, 25000, 25000, 25000]","[['breakfast', 'consulting'], ['meeting', 'lunch']]"
Bill,"[10000, 10000, 10000, 10000]","[['meeting', 'lunch'], ['training', 'presentation']]"
Carol,"[20000, 25000, 25000, 25000]","[['breakfast', 'consulting'], ['meeting', 'lunch']]"


- Array subscripts are written within square brackets
- By default PostgreSQL arrays are one-based: an array of n elements starts with array[1] and ends with array[n]

In [89]:
%%sql
SELECT * FROM sal_emp WHERE pay_by_quarter[1] != pay_by_quarter[2]

 * postgresql://postgres:***@localhost/advanced_sql
2 rows affected.


name,pay_by_quarter,schedule
Carol,"[20000, 25000, 25000, 25000]","[['breakfast', 'consulting'], ['meeting', 'lunch']]"
Carol,"[20000, 25000, 25000, 25000]","[['breakfast', 'consulting'], ['meeting', 'lunch']]"


We can also access arbitrary rectangular slices of an array with the notation lower:upper, where both bounds are inclusive

In [90]:
%%sql
SELECT name, pay_by_quarter[1:3] FROM sal_emp 

 * postgresql://postgres:***@localhost/advanced_sql
4 rows affected.


name,pay_by_quarter
Bill,"[10000, 10000, 10000]"
Carol,"[20000, 25000, 25000]"
Bill,"[10000, 10000, 10000]"
Carol,"[20000, 25000, 25000]"


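- The contains operator '@>' returns true if every element of the right-hand array also appears in the left-hand array (order and duplicates do not matter)

In [91]:
%%sql
select pay_by_quarter @> Array[20000, 25000] from sal_emp -- true where both values occur in the array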

 * postgresql://postgres:***@localhost/advanced_sql
4 rows affected.


?column?
False
True
False
True


In [92]:
%%sql
select name, unnest(schedule) from sal_emp

 * postgresql://postgres:***@localhost/advanced_sql
16 rows affected.


name,unnest
Bill,meeting
Bill,lunch
Bill,training
Bill,presentation
Carol,breakfast
Carol,consulting
Carol,meeting
Carol,lunch
Bill,meeting
Bill,lunch


In [93]:
%%sql
select name, cardinality(pay_by_quarter), cardinality(schedule) from sal_emp

 * postgresql://postgres:***@localhost/advanced_sql
4 rows affected.


name,cardinality,cardinality_1
Bill,4,4
Carol,4,4
Bill,4,4
Carol,4,4


In [94]:
%%sql
select name, (schedule) from sal_emp

 * postgresql://postgres:***@localhost/advanced_sql
4 rows affected.


name,schedule
Bill,"[['meeting', 'lunch'], ['training', 'presentation']]"
Carol,"[['breakfast', 'consulting'], ['meeting', 'lunch']]"
Bill,"[['meeting', 'lunch'], ['training', 'presentation']]"
Carol,"[['breakfast', 'consulting'], ['meeting', 'lunch']]"


In [95]:
%%sql
select * from sal_emp
where 'consulting' = any (schedule)

 * postgresql://postgres:***@localhost/advanced_sql
2 rows affected.


name,pay_by_quarter,schedule
Carol,"[20000, 25000, 25000, 25000]","[['breakfast', 'consulting'], ['meeting', 'lunch']]"
Carol,"[20000, 25000, 25000, 25000]","[['breakfast', 'consulting'], ['meeting', 'lunch']]"


### && Overlap operator

In [96]:
%%sql
select schedule && Array['breakfast', 'training'], schedule from sal_emp

 * postgresql://postgres:***@localhost/advanced_sql
4 rows affected.


?column?,schedule
True,"[['meeting', 'lunch'], ['training', 'presentation']]"
True,"[['breakfast', 'consulting'], ['meeting', 'lunch']]"
True,"[['meeting', 'lunch'], ['training', 'presentation']]"
True,"[['breakfast', 'consulting'], ['meeting', 'lunch']]"
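
The semantics of '@>' and '&&' can be mimicked with Python sets. The following is only a rough analogy, not how PostgreSQL implements the operators: the helper names `flatten`, `contains` and `overlaps` are made up for this sketch, and it assumes the arrays contain no NULL elements. The data is copied from the sal_emp rows above.

```python
def flatten(arr):
    """Yield the scalar elements of a nested list (the operators ignore dimensions)."""
    for x in arr:
        if isinstance(x, list):
            yield from flatten(x)
        else:
            yield x


def contains(left, right):
    """Rough analogue of left @> right: every element of right occurs in left."""
    return set(flatten(right)) <= set(flatten(left))


def overlaps(left, right):
    """Rough analogue of left && right: the arrays share at least one element."""
    return not set(flatten(left)).isdisjoint(flatten(right))


bill_pay = [10000, 10000, 10000, 10000]
carol_pay = [20000, 25000, 25000, 25000]
bill_schedule = [["meeting", "lunch"], ["training", "presentation"]]
carol_schedule = [["breakfast", "consulting"], ["meeting", "lunch"]]

print(contains(bill_pay, [20000, 25000]))                 # False
print(contains(carol_pay, [20000, 25000]))                # True
print(overlaps(bill_schedule, ["breakfast", "training"]))   # True, shares 'training'
print(overlaps(carol_schedule, ["breakfast", "training"]))  # True, shares 'breakfast'
```

The printed values match the ?column? results of the two queries above.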


In [97]:
%%sql
update aldi set shop_number = array_append(shop_number, 11);
update aldi set shop_number = array_cat(shop_number, array[12,13]) 

 * postgresql://postgres:***@localhost/advanced_sql
3 rows affected.
3 rows affected.


[]

In [98]:
%%sql
select * from aldi

 * postgresql://postgres:***@localhost/advanced_sql
3 rows affected.


id,city,shop_number,images,opend_on,opened_time,opened_dtz,diff_time
2,Hamburg,"[10000, 10000, 10000, 10000, 12, 13, 11, 12, 13]",,,,,
1,Berlin,"[1, 2, 3, 8, 90, 12, 13, 11, 12, 13]",,2023-03-17,09:10:40,2023-03-17 08:10:40+01:00,"768 days, 0:00:00"
3,Berlin,"[1, 2, 3, 8, 90, 11, 12, 13]",,,,,


In [99]:
%%sql
select array_append(ARRAY[1,2], 3)

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


array_append
"[1, 2, 3]"


In [100]:
%%sql
update aldi set shop_number = array_remove(shop_number, 11);
select * from aldi

 * postgresql://postgres:***@localhost/advanced_sql
3 rows affected.
3 rows affected.


id,city,shop_number,images,opend_on,opened_time,opened_dtz,diff_time
2,Hamburg,"[10000, 10000, 10000, 10000, 12, 13, 12, 13]",,,,,
1,Berlin,"[1, 2, 3, 8, 90, 12, 13, 12, 13]",,2023-03-17,09:10:40,2023-03-17 08:10:40+01:00,"768 days, 0:00:00"
3,Berlin,"[1, 2, 3, 8, 90, 12, 13]",,,,,


In [101]:
%%sql
select array_length(shop_number, 1) from aldi

 * postgresql://postgres:***@localhost/advanced_sql
3 rows affected.


array_length
8
9
7


In [102]:
%%sql
select *,array_length(schedule,1) from sal_emp

 * postgresql://postgres:***@localhost/advanced_sql
4 rows affected.


name,pay_by_quarter,schedule,array_length
Bill,"[10000, 10000, 10000, 10000]","[['meeting', 'lunch'], ['training', 'presentation']]",2
Carol,"[20000, 25000, 25000, 25000]","[['breakfast', 'consulting'], ['meeting', 'lunch']]",2
Bill,"[10000, 10000, 10000, 10000]","[['meeting', 'lunch'], ['training', 'presentation']]",2
Carol,"[20000, 25000, 25000, 25000]","[['breakfast', 'consulting'], ['meeting', 'lunch']]",2


In [103]:
%%sql
update sal_emp 
set array_cat(schedule[1], array_append(schedule[2], 'brunch'))
where name = 'Carol'

 * postgresql://postgres:***@localhost/advanced_sql
(psycopg2.errors.SyntaxError) syntax error at or near "("
LINE 2: set array_cat(schedule[1], array_append(schedule[2], 'brunch...
                     ^

[SQL: update sal_emp 
set array_cat(schedule[1], array_append(schedule[2], 'brunch'))
where name = 'Carol']
(Background on this error at: https://sqlalche.me/e/20/f405)

- The update fails because the set clause lacks a target column; it must have the form set schedule = followed by the new value.


In [104]:
%%sql
select schedule from sal_emp;



 * postgresql://postgres:***@localhost/advanced_sql
4 rows affected.


schedule
"[['meeting', 'lunch'], ['training', 'presentation']]"
"[['breakfast', 'consulting'], ['meeting', 'lunch']]"
"[['meeting', 'lunch'], ['training', 'presentation']]"
"[['breakfast', 'consulting'], ['meeting', 'lunch']]"


In [105]:
%%sql
select array_cat(schedule[1:1], schedule[2:2]) from sal_emp

 * postgresql://postgres:***@localhost/advanced_sql
4 rows affected.


array_cat
"[['meeting', 'lunch'], ['training', 'presentation']]"
"[['breakfast', 'consulting'], ['meeting', 'lunch']]"
"[['meeting', 'lunch'], ['training', 'presentation']]"
"[['breakfast', 'consulting'], ['meeting', 'lunch']]"


In [106]:
%%sql
select array_agg(ELEMENTS) from
(select unnest(schedule[1:1]) from sal_emp where name = 'Carol') as ELEMENTS

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


array_agg
"{(breakfast),(consulting),(breakfast),(consulting)}"


In [107]:
%%sql
alter table aldi
add column images bytea;

 * postgresql://postgres:***@localhost/advanced_sql
(psycopg2.errors.DuplicateColumn) column "images" of relation "aldi" already exists

[SQL: alter table aldi
add column images bytea;]
(Background on this error at: https://sqlalche.me/e/20/f405)


In [108]:
%%sql
update aldi set images = null where id=1

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


[]

In [109]:
%%sql
select * from aldi

 * postgresql://postgres:***@localhost/advanced_sql
3 rows affected.


id,city,shop_number,images,opend_on,opened_time,opened_dtz,diff_time
2,Hamburg,"[10000, 10000, 10000, 10000, 12, 13, 12, 13]",,,,,
3,Berlin,"[1, 2, 3, 8, 90, 12, 13]",,,,,
1,Berlin,"[1, 2, 3, 8, 90, 12, 13, 12, 13]",,2023-03-17,09:10:40,2023-03-17 08:10:40+01:00,"768 days, 0:00:00"


In [110]:
%%sql
alter table aldi
add column opend_on date

 * postgresql://postgres:***@localhost/advanced_sql
(psycopg2.errors.DuplicateColumn) column "opend_on" of relation "aldi" already exists

[SQL: alter table aldi
add column opend_on date]
(Background on this error at: https://sqlalche.me/e/20/f405)


In [111]:
%%sql
update aldi set opend_on = '2023-03-17' where id=1

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


[]

In [112]:
%%sql
select * from aldi

 * postgresql://postgres:***@localhost/advanced_sql
3 rows affected.


id,city,shop_number,images,opend_on,opened_time,opened_dtz,diff_time
2,Hamburg,"[10000, 10000, 10000, 10000, 12, 13, 12, 13]",,,,,
3,Berlin,"[1, 2, 3, 8, 90, 12, 13]",,,,,
1,Berlin,"[1, 2, 3, 8, 90, 12, 13, 12, 13]",,2023-03-17,09:10:40,2023-03-17 08:10:40+01:00,"768 days, 0:00:00"


In [113]:
%%sql
alter table aldi
add column opened_time time

 * postgresql://postgres:***@localhost/advanced_sql
(psycopg2.errors.DuplicateColumn) column "opened_time" of relation "aldi" already exists

[SQL: alter table aldi
add column opened_time time]
(Background on this error at: https://sqlalche.me/e/20/f405)


In [114]:
%%sql
update aldi set opened_time = '09:10:40+2' where id=1;
select * from aldi

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.
3 rows affected.


id,city,shop_number,images,opend_on,opened_time,opened_dtz,diff_time
2,Hamburg,"[10000, 10000, 10000, 10000, 12, 13, 12, 13]",,,,,
3,Berlin,"[1, 2, 3, 8, 90, 12, 13]",,,,,
1,Berlin,"[1, 2, 3, 8, 90, 12, 13, 12, 13]",,2023-03-17,09:10:40,2023-03-17 08:10:40+01:00,"768 days, 0:00:00"


In [115]:
%%sql
alter table aldi
add column opened_dtz timestamp with time zone

 * postgresql://postgres:***@localhost/advanced_sql
(psycopg2.errors.DuplicateColumn) column "opened_dtz" of relation "aldi" already exists

[SQL: alter table aldi
add column opened_dtz timestamp with time zone]
(Background on this error at: https://sqlalche.me/e/20/f405)


In [116]:
%%sql
update aldi set opened_dtz = '2023-03-17 09:10:40+2' where id=1;
select * from aldi

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.
3 rows affected.


id,city,shop_number,images,opend_on,opened_time,opened_dtz,diff_time
2,Hamburg,"[10000, 10000, 10000, 10000, 12, 13, 12, 13]",,,,,
3,Berlin,"[1, 2, 3, 8, 90, 12, 13]",,,,,
1,Berlin,"[1, 2, 3, 8, 90, 12, 13, 12, 13]",,2023-03-17,09:10:40,2023-03-17 08:10:40+01:00,"768 days, 0:00:00"


In [117]:
%%sql
alter table aldi
add column diff_time interval day 

 * postgresql://postgres:***@localhost/advanced_sql
(psycopg2.errors.DuplicateColumn) column "diff_time" of relation "aldi" already exists

[SQL: alter table aldi
add column diff_time interval day]
(Background on this error at: https://sqlalche.me/e/20/f405)


In [118]:
%%sql
update aldi set diff_time = 'P2Y1M1W1DT1H1M1S' where id=1;
select * from aldi

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.
3 rows affected.


id,city,shop_number,images,opend_on,opened_time,opened_dtz,diff_time
2,Hamburg,"[10000, 10000, 10000, 10000, 12, 13, 12, 13]",,,,,
3,Berlin,"[1, 2, 3, 8, 90, 12, 13]",,,,,
1,Berlin,"[1, 2, 3, 8, 90, 12, 13, 12, 13]",,2023-03-17,09:10:40,2023-03-17 08:10:40+01:00,"768 days, 0:00:00"


### Recap/Subqueries II

Suppose we have a parent table of items and a child table of stocks whose rows reference the item table.

Now let's produce a list of all items in stock without using joins

In [163]:
%%sql
SELECT
    location,
    item_id,
    (
        select name from item where stock.item_id = id
    ) as name,
    quantity
FROM stock
ORDER BY location, name, quantity

In [170]:
%%sql
SELECT
  	location,
  	item_id,
  	(
        select id from item where stock.item_id = id -- a scalar subquery must return exactly one column and at most one row, hence the where clause
  	), quantity 
FROM stock
ORDER BY location
limit 10

 * postgresql://postgres:***@localhost/advanced_sql
10 rows affected.


location,item_id,id,quantity
Overseas,1,1,8
Overseas,3,3,8
Overseas,4,4,0
Overseas,5,5,5
Overseas,6,6,6
Overseas,7,7,6
Overseas,8,8,8
Overseas,9,9,7
Overseas,10,10,5
Overseas,12,12,9


## With inner join

In [171]:
%%sql
SELECT
  	location,
  	item_id,
    item.id,
    quantity
from stock
inner join item
on item.id = stock.item_id
ORDER BY location
limit 10

 * postgresql://postgres:***@localhost/advanced_sql
10 rows affected.


location,item_id,id,quantity
Overseas,1,1,8
Overseas,3,3,8
Overseas,4,4,0
Overseas,5,5,5
Overseas,6,6,6
Overseas,7,7,6
Overseas,8,8,8
Overseas,9,9,7
Overseas,10,10,5
Overseas,12,12,9
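
The equivalence of the scalar subquery and the inner join can be checked on a tiny data set. This is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL (an assumption made only to keep the example self-contained); the four rows are copied from the outputs in this notebook.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table item (id integer primary key, name text);
    create table stock (location text, item_id integer, quantity integer);
    insert into item values (1, 'Cheap Tablet'), (4, 'Brand new Notebook');
    insert into stock values ('Overseas', 1, 8), ('Overseas', 4, 0);
""")

# Variant 1: correlated scalar subquery, evaluated once per stock row
with_subquery = conn.execute("""
    select location,
           (select name from item where item.id = stock.item_id) as name,
           quantity
    from stock
    order by item_id
""").fetchall()

# Variant 2: inner join between stock and item
with_join = conn.execute("""
    select location, item.name, quantity
    from stock
    inner join item on item.id = stock.item_id
    order by stock.item_id
""").fetchall()

print(with_subquery)  # [('Overseas', 'Cheap Tablet', 8), ('Overseas', 'Brand new Notebook', 0)]
print(with_subquery == with_join)  # True
```

The two variants only differ when a stock row has no matching item: the subquery then yields NULL for the name, while the inner join drops the row.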


Let's find the items with the maximum quantity:
1. get the highest quantity using the stock table
2. return the names of those items using the item and stock tables

In [187]:
%%sql

    SELECT quantity
    FROM stock
    ORDER BY quantity DESC
    LIMIT 1

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


quantity
9


In [179]:
%%sql
SELECT
  	(
  		SELECT name
  		FROM item
  		WHERE stock.item_id = item.id
  	) AS item,
  	location,
  	quantity
FROM stock
WHERE quantity = (
    SELECT quantity
    FROM stock
    ORDER BY quantity DESC
    LIMIT 1
)
ORDER BY item, location

 * postgresql://postgres:***@localhost/advanced_sql
23 rows affected.


item,location,quantity
Brand new Monitor,Overseas,9
Brand new Notebook,Outskirts 1,9
Cheap Bike,Overseas,9
Cheap garden table,Outskirts 2,9
Cheap Keyboard,Central Warehouse,9
Cheap Laptop,Outskirts 1,9
Cheap Mouse,Outskirts 2,9
Cheap Notebook,Outskirts 2,9
Cheap Smartphone,Overseas,9
Exceptional Jeans,Outskirts 2,9


Let's try to find items which are out of stock everywhere, i.e. whose quantity is zero at every location.

### Using the IN operator

In [130]:
%%sql 
select * from stock where item_id not in
(select item_id from stock where quantity > 0)

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


location,item_id,quantity
Central Warehouse,103,0


- Why does the following query yield a different result?

In [127]:
%%sql
select * from item where id in
(select item_id from stock where quantity = 0)
order by name


 * postgresql://postgres:***@localhost/advanced_sql
33 rows affected.


id,name
22,Brand new Mouse
53,Brand new Mouse
4,Brand new Notebook
76,Brand new T-shirt
40,Cheap Laptop
41,Cheap Mouse
1,Cheap Tablet
94,Cheap T-shirt
103,Cheap T-shirt
32,Exceptional Jeans


In [133]:
%%sql
select * from stock
where item_id=72


 * postgresql://postgres:***@localhost/advanced_sql
3 rows affected.


location,item_id,quantity
Overseas,72,0
Central Warehouse,72,4
Outskirts 1,72,7
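
The difference between the two queries becomes visible on the four stock rows shown above for items 72 and 103. The sketch below again uses Python's `sqlite3` module as a stand-in for PostgreSQL (an assumption for the sake of a self-contained example). The first query keeps only items that never have a positive quantity, the second keeps every item that has a quantity of zero at one location or more.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table stock (location text, item_id integer, quantity integer);
    insert into stock values
        ('Overseas', 72, 0),
        ('Central Warehouse', 72, 4),
        ('Outskirts 1', 72, 7),
        ('Central Warehouse', 103, 0);
""")

# Items that have no row with a positive quantity
never_in_stock = conn.execute("""
    select distinct item_id from stock
    where item_id not in (select item_id from stock where quantity > 0)
    order by item_id
""").fetchall()

# Items that have a row with quantity zero, regardless of their other rows
zero_somewhere = conn.execute("""
    select distinct item_id from stock
    where item_id in (select item_id from stock where quantity = 0)
    order by item_id
""").fetchall()

print(never_in_stock)  # [(103,)]
print(zero_somewhere)  # [(72,), (103,)]
```

Item 72 is out of stock Overseas but available in two other locations, so only the second query returns it.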


## Now with join

In [137]:
%%sql
select s1.* from stock s1
left join (select * from stock where quantity > 0) s2
on s1.item_id = s2.item_id
where s2.item_id is null;

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


location,item_id,quantity
Central Warehouse,103,0


## Aggregate Function

- Aggregate functions perform a calculation on a set of rows and return a single value,
for example:
1. avg()
2. count()
3. max()
4. min()
5. sum()
6. (array_agg()) 

In [244]:
%%sql
select avg(quantity), sum(quantity)::numeric/count(*) from stock

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


avg,?column?
4.2904290429042895,4.2904290429042895


In [260]:
%%sql
select avg(quantity), sum(quantity)::numeric/count(id)  from 
(
select * from item
left join stock
on item.id = stock.item_id
where stock.item_id is null or stock.item_id = 10
) as out_of_stock


 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


avg,?column?
3.6666666666666665,1.8333333333333333
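
The two columns differ because avg() skips NULL values, whereas count(id) counts every joined row, including items without any stock entry. A minimal sketch with made-up rows (the item names and quantities are hypothetical) and Python's `sqlite3` module standing in for PostgreSQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table item (id integer primary key, name text);
    create table stock (location text, item_id integer, quantity integer);
    insert into item values (1, 'stocked item'), (2, 'never stocked item');
    insert into stock values ('A', 1, 4), ('B', 1, 2);
""")

# The left join produces three rows; the row for item 2 has quantity NULL
row = conn.execute("""
    select avg(quantity),                       -- NULLs are ignored: (4 + 2) / 2
           sum(quantity) * 1.0 / count(item.id), -- every row counts: 6 / 3
           count(quantity),
           count(item.id)
    from item
    left join stock on item.id = stock.item_id
""").fetchone()

print(row)  # (3.0, 2.0, 2, 3)
```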


## Group by

- aggregate functions are typically used in conjunction with the group by clause
- the group by clause splits a result into groups of rows and the aggregate functions perform a calculation on each group
- to form the groups the data is sorted or hashed internally

Count the number of items per location:

In [205]:
%%sql 
select location, count(*) as number_of from stock
group by location
order by number_of

 * postgresql://postgres:***@localhost/advanced_sql
4 rows affected.


location,number_of
Overseas,72
Central Warehouse,72
Outskirts 1,76
Outskirts 2,83


- if we want to filter on the result of an aggregate function in a group by query, then we have to use the having clause

Now let's group by location with a condition: the count should be higher than 72:

In [207]:
%%sql 
select location, count(*) as number_of from stock
group by location
having count(*) > 72
order by number_of

 * postgresql://postgres:***@localhost/advanced_sql
2 rows affected.


location,number_of
Outskirts 1,76
Outskirts 2,83
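
The difference between where and having can be sketched as follows: where filters rows before they are grouped, having filters the groups after aggregation. The rows and location names below are hypothetical, and Python's `sqlite3` module is used as a stand-in for PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table stock (location text, item_id integer, quantity integer);
    insert into stock values
        ('North', 1, 5), ('North', 2, 0), ('North', 3, 7),
        ('South', 1, 3), ('South', 2, 0);
""")

# where: rows with quantity 0 are discarded before counting
stocked = conn.execute("""
    select location, count(*) from stock
    where quantity > 0
    group by location
    order by location
""").fetchall()

# having: all rows are counted, then small groups are discarded
big_groups = conn.execute("""
    select location, count(*) from stock
    group by location
    having count(*) > 2
    order by location
""").fetchall()

print(stocked)     # [('North', 2), ('South', 1)]
print(big_groups)  # [('North', 3)]
```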


## Out-of-stock items with group by
- get the sum of quantities per item in stock
- keep only the items whose sum is zero

In [138]:
%%sql
select item_id, sum(quantity) from stock group by item_id
having sum(quantity) = 0

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


item_id,sum
103,0


In [196]:
%%sql
select * from item
left join
(
select item_id, sum(quantity) from stock group by item_id
having sum(quantity) = 0
) gr_stock
on item.id = gr_stock.item_id
where gr_stock.item_id is not null;

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


id,name,item_id,sum
103,Cheap T-shirt,103,0


### Maximum of each group

In [222]:
%%sql
select count(distinct item_id) from stock;

 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


count
102


In [231]:
%%sql
select item_id, item.name, max(quantity) as maximum from stock
inner join item 
on item.id = stock.item_id
group by stock.item_id, item.name

 * postgresql://postgres:***@localhost/advanced_sql
102 rows affected.


item_id,name,maximum
5,Exceptional Jeans,6
10,Exceptional Notebook,5
98,Used garden table,8
99,Standard Tablet,5
84,Used Laptop,8
64,Exceptional Keyboard,7
74,Used Bike,6
17,Cheap Bike,9
101,Cheap Smartphone,9
91,Exceptional Chair,7


In [237]:
%%sql
select *  from stock
inner join
(
select item_id, item.name as name, max(quantity) as maximum from stock
inner join item 
on item.id = stock.item_id
group by stock.item_id, item.name
) as agg
on stock.item_id = agg.item_id and stock.quantity = agg.maximum

 * postgresql://postgres:***@localhost/advanced_sql
116 rows affected.


location,item_id,quantity,item_id_1,name,maximum
Central Warehouse,5,6,5,Exceptional Jeans,6
Outskirts 2,10,5,10,Exceptional Notebook,5
Overseas,10,5,10,Exceptional Notebook,5
Outskirts 2,98,8,98,Used garden table,8
Central Warehouse,99,5,99,Standard Tablet,5
Overseas,99,5,99,Standard Tablet,5
Central Warehouse,84,8,84,Used Laptop,8
Overseas,64,7,64,Exceptional Keyboard,7
Overseas,74,6,74,Used Bike,6
Overseas,17,9,17,Cheap Bike,9
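
The join above returns more rows (116) than there are items (102) because an item can reach its maximum at several locations. A minimal sketch of this pattern, with Python's `sqlite3` module standing in for PostgreSQL; the rows with quantity 6 and 5 are taken from the output above, the two lower quantities are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table stock (location text, item_id integer, quantity integer);
    insert into stock values
        ('Central Warehouse', 5, 6), ('Overseas', 5, 2),
        ('Outskirts 2', 10, 5), ('Overseas', 10, 5), ('Outskirts 1', 10, 1);
""")

# Join every stock row against the maximum of its item;
# rows that tie for the maximum are all kept
rows = conn.execute("""
    select stock.location, stock.item_id, stock.quantity
    from stock
    inner join (
        select item_id, max(quantity) as maximum
        from stock
        group by item_id
    ) as agg
    on stock.item_id = agg.item_id and stock.quantity = agg.maximum
    order by stock.item_id, stock.location
""").fetchall()

print(rows)
# [('Central Warehouse', 5, 6), ('Outskirts 2', 10, 5), ('Overseas', 10, 5)]
```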


How do you think array_agg() works?

In [210]:
%%sql
select * from item 
inner join (
select item_id, array_agg(location) from stock
group by item_id
) aggregate
on item.id = aggregate.item_id



 * postgresql://postgres:***@localhost/advanced_sql
102 rows affected.


id,name,item_id,array_agg
55,Cheap Tablet,55,"{""Outskirts 2"",Overseas,""Central Warehouse"",""Outskirts 1""}"
27,Brand new Notebook,27,"{""Outskirts 1"",""Outskirts 2"",""Central Warehouse"",Overseas}"
23,Standard Chair,23,"{""Outskirts 1"",Overseas,""Outskirts 2"",""Central Warehouse""}"
56,Used Mouse,56,"{""Outskirts 2"",""Outskirts 1"",""Central Warehouse"",Overseas}"
58,Standard Keyboard,58,"{""Outskirts 1"",Overseas,""Central Warehouse"",""Outskirts 2""}"
91,Exceptional Chair,91,"{""Central Warehouse"",Overseas,""Outskirts 1"",""Outskirts 2""}"
8,Standard Bike,8,"{Overseas,""Outskirts 1"",""Central Warehouse""}"
87,Refurbished Mouse,87,"{Overseas,""Central Warehouse"",""Outskirts 2""}"
74,Used Bike,74,"{""Central Warehouse"",""Outskirts 2"",""Outskirts 1"",Overseas}"
29,Brand new Mouse,29,"{""Outskirts 1"",""Central Warehouse""}"
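
Conceptually, array_agg() collects the values of each group into one array instead of reducing them to a single number. The pure Python sketch below is only an analogy of that behaviour; the variable names are made up, and the rows are the locations of items 8 and 29 from the output above.

```python
from collections import defaultdict

# (location, item_id) pairs as they might appear in the stock table
stock_rows = [
    ("Overseas", 8),
    ("Outskirts 1", 8),
    ("Central Warehouse", 8),
    ("Outskirts 1", 29),
    ("Central Warehouse", 29),
]

# group by item_id and append every location of the group to a list
locations_per_item = defaultdict(list)
for location, item_id in stock_rows:
    locations_per_item[item_id].append(location)

print(dict(locations_per_item))
# {8: ['Overseas', 'Outskirts 1', 'Central Warehouse'], 29: ['Outskirts 1', 'Central Warehouse']}
```

In SQL the order of the elements inside each array is not guaranteed unless array_agg() is given an order by, which is why the locations appear in varying order in the output above.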


## Get duplicates

In [241]:
%%sql
select count(*) from
(
select item_id, count(*) from stock
group by item_id
having count(*) > 1
) as dubs


 * postgresql://postgres:***@localhost/advanced_sql
1 rows affected.


count
96
