# Lecture 5: Transactions, ACID, subqueries

## Announcements

- Get ready for the quiz. See the Lecture 4 announcement for the details we discussed.
- In case you haven't noticed, we have posted additional materials and installation instructions. The installation instructions were all for last week.

## Halfway check

- We learned many SQL keywords and how to use them. Some of them are listed here:
    - SELECT
    - FROM
    - WHERE
    - GROUP BY
    - HAVING
    - ORDER BY
    - LIMIT

- We also looked into the different types of joins, how to use them, and what the results look like for each join.
    - INNER JOIN
    - LEFT JOIN
    - RIGHT JOIN
    - FULL OUTER JOIN
    - CROSS JOIN
    - NATURAL JOIN
- We spent quite some time understanding candidate, primary, and foreign keys: how to use them and the properties of each.
- We saw how and where to specify referential actions and how they behave.

## Today's theme
- Transactions and ACID
- Subqueries (correlated and uncorrelated)
- More keywords: ANY, ALL, and EXISTS.

## Transactions: all or nothing

Think about a banking system where two things have to happen together. For example, an e-transfer sends money from your account to someone else's account.
If you view this as two operations,
- one debits money from the source account, and
- the other credits money to the target account.

```{admonition} Discussion question: What if an outage happens between the two operations? What should we do?

<img src="img/discuss.png" width="120">
```

```{toggle}
- Either all of the changes are made, or none of them are.

In other words, the changes need to be considered and treated as a single logical unit of work: either all changes are successfully made, or all of them should fail.
```

In database (and in banking) terms, each such unit of work is called a transaction. Here is how you do it in SQL:

```sql
BEGIN TRANSACTION;

-- debiting source account
UPDATE
    accounts
SET
    balance = balance - 1000.0
WHERE
    account_number = 1132;

-- crediting target account
UPDATE
    accounts
SET
    balance = balance + 1000.0
WHERE
    account_number = 1279;
    
COMMIT;
```

```{important}
Three things to keep in mind:

- BEGIN TRANSACTION: starts a transaction.
- COMMIT: ends the transaction and makes all of its changes permanent.
- ROLLBACK: ends the transaction and discards all of its changes, reverting the database to the state it was in before the transaction started.
```
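
The all-or-nothing behaviour can be tried out without a Postgres server. The following is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for Postgres. The `transfer` and `balances` helpers, the starting balances, and the `CHECK (balance >= 0)` rule are assumptions made for illustration; only the account numbers come from the example above.

```python
import sqlite3

# isolation_level=None lets us issue BEGIN / COMMIT / ROLLBACK ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE accounts ("
    " account_number INTEGER PRIMARY KEY,"
    " balance REAL NOT NULL CHECK (balance >= 0))"  # assumed rule: no overdraft
)
conn.execute("INSERT INTO accounts VALUES (1132, 1500.0), (1279, 200.0)")


def transfer(conn, source, target, amount):
    """Move `amount` from `source` to `target` as one unit of work."""
    conn.execute("BEGIN")
    try:
        # crediting target account
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE account_number = ?",
            (amount, target),
        )
        # debiting source account; violates the CHECK if funds are insufficient
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE account_number = ?",
            (amount, source),
        )
    except sqlite3.Error:
        conn.execute("ROLLBACK")  # discard both updates, including the credit
        return False
    conn.execute("COMMIT")  # make both updates permanent
    return True


def balances(conn):
    rows = conn.execute(
        "SELECT account_number, balance FROM accounts ORDER BY account_number"
    ).fetchall()
    return dict(rows)


ok_first = transfer(conn, 1132, 1279, 1000.0)   # enough funds: committed
after_first = balances(conn)
ok_second = transfer(conn, 1132, 1279, 1000.0)  # only 500 left: rolled back
after_second = balances(conn)
print(ok_first, after_first)
print(ok_second, after_second)
```

In the second call the credit has already been applied when the debit fails, so the `ROLLBACK` is what keeps money from appearing out of nowhere.
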
Let's see this in action.

### Transactions Demo

```{note}
Here, we will interact with the database in two different ways: through the SQL magics and through psycopg2. The database treats these as two separate connections.
```

In [49]:
# This is how you deal with credentials in a notebook
import json
import urllib.parse
import psycopg2
with open('../lectures/data/credentials.json') as f:
    login = json.load(f)
    
username = login['user']
password = urllib.parse.quote(login['password'])
host = login['host']
port = login['port']

In [50]:
%load_ext sql
%config SqlMagic.displaylimit = 30


In [51]:
%sql postgresql://{username}:{password}@{host}:{port}/mds

In [52]:
%%sql

DROP TABLE IF EXISTS
    instructor,
    instructor_course,
    course_cohort
;

CREATE TABLE instructor (
    id INTEGER PRIMARY KEY,
    name TEXT,
    email TEXT,
    phone VARCHAR(12),
    department VARCHAR(50)
    )
;

INSERT INTO
    instructor (id, name, email, phone, department)
VALUES
    (1, 'Mike', 'mike@mds.ubc.ca', '605-332-2343', 'Computer Science'),
    (2, 'Tiffany', 'tiff@mds.ubc.ca', '445-794-2233', 'Neuroscience'),
    (3, 'Arman', 'arman@mds.ubc.ca', '935-738-5796', 'Physics'),
    (4, 'Varada', 'varada@mds.ubc.ca', '243-924-4446', 'Computer Science'),
    (5, 'Quan', 'quan@mds.ubc.ca', '644-818-0254', 'Economics'),
    (6, 'Joel', 'joel@mds.ubc.ca', '773-432-7669', 'Biomedical Engineering'),
    (7, 'Florencia', 'flor@mds.ubc.ca', '773-926-2837', 'Biology'),
    (8, 'Alexi', 'alexiu@mds.ubc.ca', '421-888-4550', 'Statistics'),
    (15, 'Vincenzo', 'vincenzo@mds.ubc.ca', '776-543-1212', 'Statistics'),
    (19, 'Gittu', 'gittu@mds.ubc.ca', '776-334-1132', 'Biomedical Engineering'),
    (16, 'Jessica', 'jessica@mds.ubc.ca', '211-990-1762', 'Computer Science')
;

    
CREATE TABLE instructor_course (
    id SERIAL PRIMARY KEY,
    instructor_id INTEGER,
    course TEXT,
    enrollment INTEGER,
    begins DATE
    )
;

INSERT INTO
    instructor_course (instructor_id, course, enrollment, begins)
VALUES
    (8, 'Statistical Inference and Computation I', 125, '2021-10-01'),
    (8, 'Regression II', 102, '2022-02-05'),
    (1, 'Descriptive Statistics and Probability', 79, '2021-09-10'),
    (1, 'Algorithms and Data Structures', 25, '2021-10-01'),
    (3, 'Algorithms and Data Structures', 25, '2021-10-01'),
    (3, 'Python Programming', 133, '2021-09-07'),
    (3, 'Databases & Data Retrieval', 118, '2021-11-16'),
    (6, 'Visualization I', 155, '2021-10-01'),
    (6, 'Privacy, Ethics & Security', 148, '2022-03-01'),
    (2, 'Programming for Data Manipulation', 160, '2021-09-08'),
    (7, 'Data Science Workflows', 98, '2021-09-15'),
    (2, 'Data Science Workflows', 98, '2021-09-15'),
    (12, 'Web & Cloud Computing', 78, '2022-02-10'),
    (10, 'Introduction to Optimization', NULL, '2022-09-01'),
    (9, 'Parallel Computing', NULL, '2023-01-10'),
    (13, 'Natural Language Processing', NULL, '2023-09-10')
;

CREATE TABLE course_cohort (
    id INTEGER,
    cohort VARCHAR(7)
    )
;

INSERT INTO
    course_cohort (id, cohort)
VALUES
    (13, 'MDS-CL'),
    (8, 'MDS-CL'),
    (1, 'MDS-CL'),
    (3, 'MDS-CL'),
    (1, 'MDS-V'),
    (9, 'MDS-V'),
    (9, 'MDS-V'),
    (3, 'MDS-V')
;

   postgresql://postgres:***@localhost:5432/imdb
 * postgresql://postgres:***@localhost:5432/mds
   postgresql://postgres:***@localhost:5432/world
Done.
Done.
11 rows affected.
Done.
16 rows affected.
Done.
8 rows affected.


[]

In [53]:
%sql SELECT * FROM instructor;

   postgresql://postgres:***@localhost:5432/imdb
 * postgresql://postgres:***@localhost:5432/mds
   postgresql://postgres:***@localhost:5432/world
11 rows affected.


id,name,email,phone,department
1,Mike,mike@mds.ubc.ca,605-332-2343,Computer Science
2,Tiffany,tiff@mds.ubc.ca,445-794-2233,Neuroscience
3,Arman,arman@mds.ubc.ca,935-738-5796,Physics
4,Varada,varada@mds.ubc.ca,243-924-4446,Computer Science
5,Quan,quan@mds.ubc.ca,644-818-0254,Economics
6,Joel,joel@mds.ubc.ca,773-432-7669,Biomedical Engineering
7,Florencia,flor@mds.ubc.ca,773-926-2837,Biology
8,Alexi,alexiu@mds.ubc.ca,421-888-4550,Statistics
15,Vincenzo,vincenzo@mds.ubc.ca,776-543-1212,Statistics
19,Gittu,gittu@mds.ubc.ca,776-334-1132,Biomedical Engineering


In [54]:
conn = psycopg2.connect(database='mds', **login)

In [55]:
cur = conn.cursor()
cur.execute("""
    BEGIN TRANSACTION;
    
    UPDATE
        instructor
    SET
        phone = NULL;
""")

In [56]:
cur.execute("SELECT * FROM instructor")
cur.fetchall()

[(1, 'Mike', 'mike@mds.ubc.ca', None, 'Computer Science'),
 (2, 'Tiffany', 'tiff@mds.ubc.ca', None, 'Neuroscience'),
 (3, 'Arman', 'arman@mds.ubc.ca', None, 'Physics'),
 (4, 'Varada', 'varada@mds.ubc.ca', None, 'Computer Science'),
 (5, 'Quan', 'quan@mds.ubc.ca', None, 'Economics'),
 (6, 'Joel', 'joel@mds.ubc.ca', None, 'Biomedical Engineering'),
 (7, 'Florencia', 'flor@mds.ubc.ca', None, 'Biology'),
 (8, 'Alexi', 'alexiu@mds.ubc.ca', None, 'Statistics'),
 (15, 'Vincenzo', 'vincenzo@mds.ubc.ca', None, 'Statistics'),
 (19, 'Gittu', 'gittu@mds.ubc.ca', None, 'Biomedical Engineering'),
 (16, 'Jessica', 'jessica@mds.ubc.ca', None, 'Computer Science')]

In [57]:
%sql SELECT * FROM instructor;

   postgresql://postgres:***@localhost:5432/imdb
 * postgresql://postgres:***@localhost:5432/mds
   postgresql://postgres:***@localhost:5432/world
11 rows affected.


id,name,email,phone,department
1,Mike,mike@mds.ubc.ca,605-332-2343,Computer Science
2,Tiffany,tiff@mds.ubc.ca,445-794-2233,Neuroscience
3,Arman,arman@mds.ubc.ca,935-738-5796,Physics
4,Varada,varada@mds.ubc.ca,243-924-4446,Computer Science
5,Quan,quan@mds.ubc.ca,644-818-0254,Economics
6,Joel,joel@mds.ubc.ca,773-432-7669,Biomedical Engineering
7,Florencia,flor@mds.ubc.ca,773-926-2837,Biology
8,Alexi,alexiu@mds.ubc.ca,421-888-4550,Statistics
15,Vincenzo,vincenzo@mds.ubc.ca,776-543-1212,Statistics
19,Gittu,gittu@mds.ubc.ca,776-334-1132,Biomedical Engineering


```{admonition} Discussion Question: Why am I still getting those phone numbers? I thought I updated them to NULL.

<img src="img/discuss.png" width="120">
```

In [58]:
cur.execute("COMMIT;")

In [59]:
%sql SELECT * FROM instructor;

   postgresql://postgres:***@localhost:5432/imdb
 * postgresql://postgres:***@localhost:5432/mds
   postgresql://postgres:***@localhost:5432/world
11 rows affected.


id,name,email,phone,department
1,Mike,mike@mds.ubc.ca,,Computer Science
2,Tiffany,tiff@mds.ubc.ca,,Neuroscience
3,Arman,arman@mds.ubc.ca,,Physics
4,Varada,varada@mds.ubc.ca,,Computer Science
5,Quan,quan@mds.ubc.ca,,Economics
6,Joel,joel@mds.ubc.ca,,Biomedical Engineering
7,Florencia,flor@mds.ubc.ca,,Biology
8,Alexi,alexiu@mds.ubc.ca,,Statistics
15,Vincenzo,vincenzo@mds.ubc.ca,,Statistics
19,Gittu,gittu@mds.ubc.ca,,Biomedical Engineering


```{note}
Transactions are isolated from each other. Until you commit your changes, other connections (users) cannot see them in the database.
```
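
The same two-connection experiment can be reproduced in a self-contained way. This is a sketch that uses a temporary SQLite file in place of the Postgres `mds` database; the `writer` and `reader` names, the `phones` helper, and the one-row table are assumptions for illustration.

```python
import os
import sqlite3
import tempfile

# A file-backed database so that two separate connections can share it.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

writer = sqlite3.connect(db_path, isolation_level=None)
reader = sqlite3.connect(db_path, isolation_level=None)

writer.execute("CREATE TABLE instructor (id INTEGER PRIMARY KEY, phone TEXT)")
writer.execute("INSERT INTO instructor VALUES (1, '605-332-2343')")


def phones(conn):
    return conn.execute("SELECT phone FROM instructor ORDER BY id").fetchall()


writer.execute("BEGIN")
writer.execute("UPDATE instructor SET phone = NULL")

seen_by_writer = phones(writer)   # the open transaction sees its own change
seen_by_reader = phones(reader)   # the other connection still sees the old value

writer.execute("COMMIT")
seen_after_commit = phones(reader)  # after the commit, the change is visible

writer.close()
reader.close()
print(seen_by_writer, seen_by_reader, seen_after_commit)
```

The writer plays the role of the psycopg2 connection and the reader plays the role of the SQL magic connection.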

<img src="img/commitdraw.jpeg" width="600">

Let's look at an example of ROLLBACK.

In [60]:
cur.execute("""
    BEGIN TRANSACTION;
    
    DELETE FROM instructor;
    """)

<img src="img/omg.png" width="300">

In [62]:
cur.execute("ROLLBACK")

Since the transaction was rolled back, all the rows are still there:

In [63]:
cur.execute("SELECT * FROM instructor")
cur.fetchall()

[(1, 'Mike', 'mike@mds.ubc.ca', None, 'Computer Science'),
 (2, 'Tiffany', 'tiff@mds.ubc.ca', None, 'Neuroscience'),
 (3, 'Arman', 'arman@mds.ubc.ca', None, 'Physics'),
 (4, 'Varada', 'varada@mds.ubc.ca', None, 'Computer Science'),
 (5, 'Quan', 'quan@mds.ubc.ca', None, 'Economics'),
 (6, 'Joel', 'joel@mds.ubc.ca', None, 'Biomedical Engineering'),
 (7, 'Florencia', 'flor@mds.ubc.ca', None, 'Biology'),
 (8, 'Alexi', 'alexiu@mds.ubc.ca', None, 'Statistics'),
 (15, 'Vincenzo', 'vincenzo@mds.ubc.ca', None, 'Statistics'),
 (19, 'Gittu', 'gittu@mds.ubc.ca', None, 'Biomedical Engineering'),
 (16, 'Jessica', 'jessica@mds.ubc.ca', None, 'Computer Science')]

```{note}
When using Postgres through psycopg2 in Python, it is important not to leave transactions open unintentionally. By default, even a `SELECT` statement begins a transaction (see [here](https://www.psycopg.org/docs/usage.html#transactions-control)), and unless the transaction is committed or rolled back explicitly, the session remains in the undesirable "idle in transaction" state.

To avoid an unintentionally long-running transaction, we can do one of the following:

- Use autocommit mode by running `conn.autocommit = True`. In this mode, every executed statement is automatically committed if successful and rolled back if an error occurs.

- Use a `with` context manager for the connection and the cursor:
  >with conn, conn.cursor() as cur:
  >
  >   cur.execute("SELECT * FROM instructor;")

The transaction is committed when the connection exits the `with` block without raising an exception; otherwise, the transaction is rolled back.
```
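
The commit-or-rollback behaviour of the `with` block can be observed in a small runnable sketch. It uses the standard-library `sqlite3` module, whose connection context manager also commits on a normal exit and rolls back on an exception; it is a stand-in for psycopg2 here, and the table contents and the simulated error are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO instructor VALUES (1, 'Mike'), (2, 'Tiffany')")
conn.commit()

# Normal exit from the block: the pending change is committed.
with conn:
    conn.execute("UPDATE instructor SET name = 'Michael' WHERE id = 1")

# An exception inside the block: the pending change is rolled back.
try:
    with conn:
        conn.execute("DELETE FROM instructor")
        raise RuntimeError("simulated failure in the Python code")
except RuntimeError:
    pass

names = [row[0] for row in conn.execute("SELECT name FROM instructor ORDER BY id")]
print(names)
```

The update survives and the delete does not, which mirrors what the `with conn, conn.cursor() as cur:` pattern is meant to give you in psycopg2.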

```{note}
For SQL magic users, autocommit is set to true by default, so you don't have to worry about it.
```

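If a statement fails with an error, its changes are not applied: in autocommit mode the failed statement is rolled back implicitly, and inside an explicit transaction Postgres aborts the transaction and rejects further statements until you issue a ROLLBACK.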

```{admonition} Discussion: When would you want to roll back explicitly?
<img src="img/discuss.png" width="120">
```

```{toggle}
- The person is unhappy with the changes and wants to discard them.
- I want to test something and don't want to make a permanent change.
- Database code usually runs as part of a larger Python program. When the Python side hits an error partway through, you may want to roll back the database changes made so far.
```

## ACID (Key Properties of a DBMS)

We often refer to ACID or ACID-compliant databases. With modern NoSQL databases, we also hear about BASE databases. However, ACID was one of the fundamental properties of most early databases and is a reason for their broad adoption. It is an acronym that stands for:

- ***A***tomicity

The unit of operation within an ACID database is the transaction. A transaction may involve one command or multiple commands. When multiple commands modify data, if one of them fails, the transaction is not committed (i.e., none of the operations are performed).

E.g., We have a database with one table (Values) and two columns (a and b). We have an operation where, on a particular row, a value is subtracted from a and added to b. This might be similar to a banking transaction, where I withdraw money from my account to pay for a new pair of slippers. The operation takes place in two steps:

- Remove value from a
- Place value in b

Atomicity ensures that if a system failure occurs between steps one and two, or if either step one or step two is invalid for any other reason, the entire transaction will not be processed.


- ***C***onsistency

All data within the database follows the defined rules for the database, the defined rules for any one table, and the defined rules for any one field. All operations on the database will return the same value, given the same set of underlying data each time.

E.g., We have a table `marks` that gives the breakdown of the marking scheme for various courses. It is divided into quiz, participation, midterm, and final columns. All fields must be integers, and because these values make up the course marking scheme, each row must sum to 100. We can add a CHECK constraint to the table that ensures this.

A database that ensures consistency would reject any transaction whose final result leaves a row with a marking sum <> 100.
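
The marking-scheme rule can be sketched as follows. This is a minimal illustration using the standard-library `sqlite3` module rather than Postgres; the `marks` column names follow the description above, while the course names and weights are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE marks (
        course TEXT PRIMARY KEY,
        quiz INTEGER NOT NULL,
        participation INTEGER NOT NULL,
        midterm INTEGER NOT NULL,
        final INTEGER NOT NULL,
        CHECK (quiz + participation + midterm + final = 100)
    )
    """
)

# A valid marking scheme: 20 + 10 + 30 + 40 = 100.
conn.execute("INSERT INTO marks VALUES ('Databases', 20, 10, 30, 40)")

# An invalid scheme: 20 + 10 + 30 + 50 = 110, so the database rejects it.
try:
    conn.execute("INSERT INTO marks VALUES ('Statistics', 20, 10, 30, 50)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

courses = [row[0] for row in conn.execute("SELECT course FROM marks")]
print(rejected, courses)
```

Only the row that satisfies the constraint ends up in the table.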


- ***I***solation

Any transaction occurs in isolation. One transaction cannot affect another transaction. When a transaction acts on the data, the tables or rows on which it acts are effectively locked to other transactions.

If two transactions modify the same table, the sequence of operations will ensure that the first transaction completes before the second transaction can change the data.

E.g., In that same `marks` table as before, let's assume that a department has decided that all final exam weightings will be reduced by 5% and that this value will be shifted to participation. At the same time, an individual instructor in the program has decided to increase the final weighting and reduce the midterm weighting. Isolation ensures that the two transactions are applied one after the other rather than interleaved, so neither works from a half-updated row.

- ***D***urability

The results of any transaction that completes successfully are permanently stored within the database.

E.g., Data written by a completed transaction stays in the database, even after a crash or restart, unless another transaction removes it.

In [26]:
%sql postgresql://{username}:{password}@{host}:{port}/imdb

## Subqueries

```{admonition} Query: Using the world database, find the countries with surface area above the average value of all countries in the world.

<img src="img/discuss.png" width="120">
```

```{margin}
<img src="img/fromlecture3.png" width="300">
```

```{admonition} iclicker: Who thinks the below query will work?

    SELECT name
    FROM country
    WHERE surfacearea > AVG(surfacearea);

A) Yes, it will work

B) No, it will not work
```

```{toggle}
An aggregation function CANNOT be used in the `WHERE` clause, because `WHERE` filters individual rows before any aggregation takes place.
```

Instead, we do this:

```{toggle}
    SELECT name
    FROM country
    WHERE surfacearea > (SELECT AVG(surfacearea) FROM country);
```
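
Both versions of the query can be tried on a toy table. The sketch below uses the standard-library `sqlite3` module instead of the Postgres `world` database (SQLite also rejects an aggregate in `WHERE`); the three countries and their surface areas are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (name TEXT, surfacearea REAL)")
conn.executemany(
    "INSERT INTO country VALUES (?, ?)",
    [("A", 100.0), ("B", 200.0), ("C", 900.0)],  # average = 400.0
)

# The aggregate in WHERE is rejected by the database.
try:
    conn.execute("SELECT name FROM country WHERE surfacearea > AVG(surfacearea)")
    aggregate_in_where_failed = False
except sqlite3.Error:
    aggregate_in_where_failed = True

# The subquery computes the average once; the outer query compares each row to it.
above_average = conn.execute(
    "SELECT name FROM country"
    " WHERE surfacearea > (SELECT AVG(surfacearea) FROM country)"
).fetchall()
print(aggregate_in_where_failed, above_average)
```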

The following animation shows how the query is executed:

<img src="img/avgsubquery.png" width="600">

```{admonition} Query: Find the number of movies in the movie_genres table that are NOT listed as 'drama'.

<img src="img/discuss.png" width="120">
```

```{admonition} iclicker: Who thinks the below query is correct for the above question?

SELECT COUNT(DISTINCT movie_id)
FROM movie_genres
WHERE genre <> 'drama';

A) Yes, it is correct

B) No, it is incorrect
```

We tried this question, and the following is what most of us did. We talked about the edge case. Here is the animation of what we did.

```{margin}
<img src="img/fromlecture2.png" width="300">
```

<img src="img/dramaquery.png" width="600">

A movie that is listed under 'drama' and also under another genre still has a non-drama row, so the query above counts it. To handle this edge case, we need to use a subquery.

In [27]:
%%sql
SELECT COUNT(DISTINCT movie_id)
FROM movie_genres
WHERE genre <> 'drama';

 * postgresql://postgres:***@localhost:5432/imdb
   postgresql://postgres:***@localhost:5432/mds
   postgresql://postgres:***@localhost:5432/world
1 rows affected.


count
22883


In [28]:
%%sql
SELECT COUNT(DISTINCT movie_id)
FROM movie_genres
WHERE movie_id NOT IN ( SELECT movie_id FROM movie_genres
        WHERE genre = 'drama');

 * postgresql://postgres:***@localhost:5432/imdb
   postgresql://postgres:***@localhost:5432/mds
   postgresql://postgres:***@localhost:5432/world
1 rows affected.


count
10060


Let's get to the execution by animating the scenario. Here is the pdf of the animation.

```{margin}
<img src="img/wowsubquery.png" width="300">
```

<img src="img/dramaquerycorre.png" width="600">
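
The difference between the two counts is easy to see on a handful of rows. This is a small sketch with the standard-library `sqlite3` module; the five rows below are invented and are not taken from the `imdb` database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movie_genres (movie_id INTEGER, genre TEXT)")
conn.executemany(
    "INSERT INTO movie_genres VALUES (?, ?)",
    [
        (1, "drama"),
        (1, "comedy"),  # movie 1 is both a drama and a comedy
        (2, "comedy"),
        (3, "drama"),
        (4, "action"),
    ],
)

# Filters rows, not movies: movie 1 slips through via its comedy row.
naive = conn.execute(
    "SELECT COUNT(DISTINCT movie_id) FROM movie_genres WHERE genre <> 'drama'"
).fetchone()[0]

# Excludes every movie that has a drama row anywhere in the table.
correct = conn.execute(
    "SELECT COUNT(DISTINCT movie_id) FROM movie_genres"
    " WHERE movie_id NOT IN"
    " (SELECT movie_id FROM movie_genres WHERE genre = 'drama')"
).fetchone()[0]
print(naive, correct)
```

The naive query counts movies 1, 2, and 4, while the subquery version counts only movies 2 and 4.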

Let's look at another example. 

```{admonition} SQL query: Example: Retrieve the names of countries whose capital cities have a population larger than 5 million.
<img src="img/discuss.png" width="120">
```

```{toggle}
```sql
SELECT name FROM country
WHERE capital IN (
        SELECT id
        FROM city
        WHERE population > 5000000);
```


```{tip}
I won't be asking you to write a SQL query using a certain technique, such as subqueries or joins. You are free to write the SQL query the way you like, as long as it is correct.
```

In [31]:
%sql postgresql://{username}:{password}@{host}:{port}/world

In [None]:
%%sql
SELECT name
FROM country
WHERE capital IN ( SELECT id FROM city WHERE population > 5000000);

   postgresql://postgres:***@localhost:5432/imdb
   postgresql://postgres:***@localhost:5432/mds
 * postgresql://postgres:***@localhost:5432/world
13 rows affected.


name
United Kingdom
Egypt
Indonesia
Iran
Japan
China
Colombia
"Congo, The Democratic Republic of the"
South Korea
Mexico


```{important}
- A subquery should always be enclosed in parentheses, e.g. (SELECT ...).
- Unlike regular queries, subqueries should not be terminated by a semicolon.
- The main SQL statement is sometimes called the outer query, and the subquery is called the inner query.
- Where can we use a subquery? A subquery can be used in the SELECT, FROM, WHERE, and HAVING clauses. Subqueries are most commonly used in the WHERE clause.
```

Finally, here is a question from lab 1. Remember this question? 

```{margin}
<img src="img/discussedlab1.png" width="300">
```

```{admonition} SQL query: We want to find out what percentage of movies in the "imdb" database are rated no less than 7. Write a query that computes that percentage value with two digits after the decimal point, and prints the output as e.g. "10.25%".

For this question, write one query to find the total count, and use the result manually in another query to compute the percentage (You will learn how to do this in a single query soon!). 

Now how can we do this using subqueries in a single query? 
```

```{toggle}
```sql
-- multiply by 100.0 first: integer / integer would truncate to 0
SELECT ROUND(COUNT(*) * 100.0 / (SELECT COUNT(*) FROM movies), 2) || '%'
FROM movies
WHERE rating >= 7;
```

```{important}
Note: A subquery in the SELECT clause should always return a single value, not a column or rows of values.
```
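
One thing to watch for in a percentage query like this is integer division: in both Postgres and SQLite, dividing one integer count by another truncates the result. The sketch below shows the effect with the standard-library `sqlite3` module on eight invented ratings. The final formatting is done in Python here because SQLite's `ROUND` does not pad to two decimals; that step is an assumption of this sketch, not part of the lab question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (id INTEGER PRIMARY KEY, rating REAL)")
conn.executemany(
    "INSERT INTO movies (rating) VALUES (?)",
    [(8.1,), (7.0,), (9.3,), (6.9,), (5.5,), (4.0,), (6.0,), (3.2,)],
)

# Integer / integer truncates, so 3 / 8 becomes 0 before the multiplication.
truncated = conn.execute(
    "SELECT COUNT(*) / (SELECT COUNT(*) FROM movies) * 100"
    " FROM movies WHERE rating >= 7"
).fetchone()[0]

# Multiplying by 100.0 first forces decimal arithmetic.
percentage = conn.execute(
    "SELECT ROUND(COUNT(*) * 100.0 / (SELECT COUNT(*) FROM movies), 2)"
    " FROM movies WHERE rating >= 7"
).fetchone()[0]

label = f"{percentage:.2f}%"
print(truncated, percentage, label)
```

Three of the eight movies are rated at least 7, so the second query gives 37.5 while the first gives 0.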

In [16]:
%sql postgresql://{username}:{password}@{host}:{port}/world

SQL query: Example: Retrieve the names of countries where English is an official language and that have a population of over 1 million.

In [None]:
%%sql

SELECT
    name
FROM
    country
WHERE
    population > 1000000
    AND
    code IN (
        SELECT
            countrycode
        FROM
            countrylanguage
        WHERE
            language = 'English'
            AND
            isofficial = True
    )
;

   postgresql://postgres:***@localhost:5432/imdb
   postgresql://postgres:***@localhost:5432/mds
 * postgresql://postgres:***@localhost:5432/world
10 rows affected.


name
Australia
United Kingdom
South Africa
Hong Kong
Ireland
Canada
Lesotho
New Zealand
United States
Zimbabwe


## Correlated subqueries

The subqueries we have seen so far are simple (uncorrelated) subqueries: the subquery is computed just once, and its result is then used in the outer query.

Correlated subqueries reference columns of the outer query, so they are executed for each row of the outer query.

Example: Which countries have the largest population in the continent where they are located?

In [18]:
%%sql

SELECT c1.name, c1.continent
FROM country c1
WHERE c1.population = (SELECT MAX(c2.population) 
                        FROM country c2
                        WHERE c2.continent = c1.continent);

   postgresql://postgres:***@localhost:5432/imdb
   postgresql://postgres:***@localhost:5432/mds
 * postgresql://postgres:***@localhost:5432/world
11 rows affected.


name,continent
Australia,Oceania
Brazil,South America
China,Asia
Nigeria,Africa
Russian Federation,Europe
United States,North America
Antarctica,Antarctica
Bouvet Island,Antarctica
South Georgia and the South Sandwich Islands,Antarctica
Heard Island and McDonald Islands,Antarctica


Here is the animation of the query execution:

<img src="img/correlated1.png" width="600">
<img src="img/correlated2.png" width="600">
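
The "once per outer row" behaviour can be made explicit by writing the same logic as a loop. The sketch below runs the correlated query with the standard-library `sqlite3` module on an invented five-row `country` table and then repeats the computation in plain Python; the data and the `by_loop` name are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (name TEXT, continent TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO country VALUES (?, ?, ?)",
    [
        ("A", "X", 10),
        ("B", "X", 30),
        ("C", "Y", 5),
        ("D", "Y", 5),  # tie: both C and D hold the maximum for continent Y
        ("E", "Z", 7),
    ],
)

largest = conn.execute(
    """
    SELECT c1.name, c1.continent
    FROM country c1
    WHERE c1.population = (SELECT MAX(c2.population)
                           FROM country c2
                           WHERE c2.continent = c1.continent)
    ORDER BY c1.name
    """
).fetchall()

# The same logic as an explicit loop: for every outer row, recompute the
# maximum population of that row's continent and compare.
rows = conn.execute("SELECT name, continent, population FROM country").fetchall()
by_loop = sorted(
    (name, continent)
    for name, continent, population in rows
    if population == max(p for _, c, p in rows if c == continent)
)
print(largest)
```

Rows that tie for the maximum are all returned, which is presumably why several Antarctica territories appear in the output above.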


**Example:** Write a query that returns names of countries whose capital city is not their most populated city.

In [19]:
%%sql

SELECT
    co.name
FROM
    country co
JOIN
    city ci
ON
    co.capital = ci.id
WHERE
    ci.population <> (
        SELECT
            MAX(ci2.population)
        FROM
            city ci2
        WHERE
            ci.countrycode = ci2.countrycode
    )
ORDER BY
    co.population DESC
;

   postgresql://postgres:***@localhost:5432/imdb
   postgresql://postgres:***@localhost:5432/mds
 * postgresql://postgres:***@localhost:5432/world
48 rows affected.


name
China
India
United States
Brazil
Pakistan
Nigeria
Vietnam
Philippines
Turkey
South Africa


## ANY and ALL

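`ALL` is true when the comparison holds for every value returned by the subquery, and `ANY` is true when it holds for at least one of them.

The following animation shows how the query is executed: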

<img src="img/sql-any-operator.png" width="600">

Example: Find all non-European countries whose population is larger than every European country.

In [20]:
%%sql

SELECT
    name
FROM
    country
WHERE
    continent <> 'Europe'
    AND
    population > ALL (
        SELECT
            population
        FROM
            country
        WHERE
            continent = 'Europe'
    )
;

   postgresql://postgres:***@localhost:5432/imdb
   postgresql://postgres:***@localhost:5432/mds
 * postgresql://postgres:***@localhost:5432/world
6 rows affected.


name
Brazil
Indonesia
India
China
Pakistan
United States


An alternative way:
```sql
SELECT name
FROM country
WHERE continent <> 'Europe'
  AND population > (SELECT MAX(population)
                     FROM country
                     WHERE continent = 'Europe');

```

Example: Find all European countries whose population is smaller than at least one city in the US.

In [21]:
%%sql

SELECT
    name
FROM
    country
WHERE
    continent = 'Europe'
    AND
    population < ANY (
        SELECT
            population
        FROM
            city
        WHERE
            countrycode = (
                SELECT
                    code
                FROM
                    country
                WHERE
                    name ILIKE '%United%States'
            )
    )
;

   postgresql://postgres:***@localhost:5432/imdb
   postgresql://postgres:***@localhost:5432/mds
 * postgresql://postgres:***@localhost:5432/world
26 rows affected.


name
Albania
Andorra
Bosnia and Herzegovina
Faroe Islands
Gibraltar
Svalbard and Jan Mayen
Ireland
Iceland
Croatia
Latvia


An alternative way:
```sql
SELECT name
FROM country
WHERE continent = 'Europe'
  AND population < (SELECT MAX(population) FROM city
                     WHERE countrycode = (SELECT code
                                           FROM country
                                           WHERE name ILIKE '%United%States'));
```
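
The meaning of `ALL` and `ANY`, and why the `MAX` rewrites give the same rows, can be sketched with Python's built-in `all()` and `any()`. This is plain Python rather than SQL, and every number and name below is invented for illustration.

```python
european = [80, 60, 10]            # hypothetical European country populations
others = {"P": 100, "Q": 70, "R": 5}
us_cities = [8, 4, 1]              # hypothetical US city populations
countries = {"S": 9, "T": 6, "U": 3}

# population > ALL (subquery): larger than every value in the list.
larger_than_all = sorted(
    n for n, p in others.items() if all(p > e for e in european)
)
# For a non-empty list this is the same as comparing against the maximum.
larger_than_max = sorted(n for n, p in others.items() if p > max(european))

# population < ANY (subquery): smaller than at least one value in the list.
smaller_than_any = sorted(
    n for n, p in countries.items() if any(p < c for c in us_cities)
)
smaller_than_max = sorted(n for n, p in countries.items() if p < max(us_cities))

# Edge case: over an empty list, ALL is true and ANY is false.
all_on_empty = all(1 > e for e in [])
any_on_empty = any(1 < e for e in [])
print(larger_than_all, smaller_than_any, all_on_empty, any_on_empty)
```

The empty case is where the SQL forms can differ: `> ALL` over a subquery with no rows is true, whereas `> (SELECT MAX(...))` compares against NULL and returns no rows.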

## EXISTS

`EXISTS` is true when its subquery returns at least one row, and false otherwise.

The following animation shows how the query is executed:

<img src="img/exists.png" width="500">

Example: Find all countries that have at least one city with a population greater than 5 million.

In [22]:
%%sql

SELECT co.name
FROM country co
WHERE EXISTS (SELECT * FROM city ci
        WHERE co.code = ci.countrycode AND ci.population > 5000000);

   postgresql://postgres:***@localhost:5432/imdb
   postgresql://postgres:***@localhost:5432/mds
 * postgresql://postgres:***@localhost:5432/world
18 rows affected.


name
Brazil
United Kingdom
Egypt
Indonesia
India
Iran
Japan
China
Colombia
"Congo, The Democratic Republic of the"


An alternative way:

```sql
SELECT co.name
FROM country co
WHERE co.code = ANY (
    SELECT ci.countrycode
    FROM city ci
    WHERE ci.population > 5000000
);
```

```{admonition} Iclicker: True/False: Both Query 1 and Query 2 will return the same result.

- A. True

- B. False

Query 1:

    SELECT co.name
    FROM country co
    WHERE EXISTS (SELECT * FROM city ci
            WHERE co.code = ci.countrycode AND ci.population > 5000000);

Query 2:

    SELECT co.name
    FROM country co
    WHERE EXISTS (SELECT co.code FROM city ci
            WHERE co.code = ci.countrycode AND ci.population > 5000000);
```

```{toggle}
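A (True). `EXISTS` only checks whether the subquery returns at least one row, so what appears in the subquery's SELECT list does not change the result.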
```
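
The iclicker answer can be checked on toy data. The sketch below uses the standard-library `sqlite3` module with invented `country` and `city` rows; the `names` helper, which plugs different select lists into the `EXISTS` subquery, is a hypothetical convenience for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (code TEXT PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE city (name TEXT, countrycode TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO country VALUES (?, ?)",
    [("AAA", "A"), ("BBB", "B"), ("CCC", "C")],  # country C has no cities at all
)
conn.executemany(
    "INSERT INTO city VALUES (?, ?, ?)",
    [("a1", "AAA", 6000000), ("a2", "AAA", 7000000), ("b1", "BBB", 100000)],
)


def names(select_list):
    """Run the EXISTS query with the given select list in the subquery."""
    sql = f"""
        SELECT co.name FROM country co
        WHERE EXISTS (SELECT {select_list} FROM city ci
                      WHERE co.code = ci.countrycode AND ci.population > 5000000)
        ORDER BY co.name
    """
    return [row[0] for row in conn.execute(sql)]


with_star = names("*")
with_code = names("co.code")
with_constant = names("1")

# The same question asked with IN instead of EXISTS.
with_in = [row[0] for row in conn.execute("""
    SELECT name FROM country
    WHERE code IN (SELECT countrycode FROM city WHERE population > 5000000)
    ORDER BY name
""")]
print(with_star, with_code, with_constant, with_in)
```

Country A has two qualifying cities but is returned once, since `EXISTS` only asks whether at least one matching row exists.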

## Moral of the story
- We learned about transactions and the importance of rollback and commit.
- We learned about ACID properties of transactions.
- Subqueries let us answer questions that a single flat query cannot. We saw that they are useful in many situations.
- We picked up some things to keep in mind when writing subqueries.
- Correlated subqueries are executed for each row of the outer query, whereas simple subqueries are executed just once.
- We saw how to use ANY, ALL, and EXISTS operators in subqueries.