# Working in Sets

Set theory is a branch of discrete mathematics that deals with collections
of objects.

There is a lot of conceptual overlap between set theory and
relational database concepts. 

It is no wonder that the output of a query is
frequently called a result**set**.

Primitive set-theoretic operations like union, intersection, and
difference are increasingly supported in various SQL implementations.

# Union

The union operation combines the elements of two sets, listing each element only once.

set1 = { 1, 3, 5 }

set2 = { 1, 2, 3 }

set1 UNION set2 = { 1, 3, 5, 2 }
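
The same union can be tried out in code. The snippet below is a minimal sketch that uses Python's built-in `set` type (Python is an assumption here, chosen only because it has a native set type; it is not part of the original SQL material).

```python
# The two example sets from the text.
set1 = {1, 3, 5}
set2 = {1, 2, 3}

# The | operator returns the union; shared elements (1 and 3) appear only once.
union = set1 | set2

print(sorted(union))  # [1, 2, 3, 5]
```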



Let us see how to use the union operation in SQL. We start by looking at
the contents of our table.

```
tesdb=# SELECT * FROM proglang_tbl;
```

Suppose we want to get the list of creation years of languages standardized
by either ANSI or ISO.

```
tesdb=# SELECT year FROM proglang_tbl
        WHERE standard='ANSI'
        UNION
        SELECT year FROM proglang_tbl
        WHERE standard='ISO';


tesdb=# SELECT standard FROM proglang_tbl
        WHERE language = 'Fortran'
        UNION
        SELECT standard FROM proglang_tbl
        WHERE language = 'APL';  

tesdb=# SELECT standard FROM proglang_tbl
        WHERE language = 'Fortran'
        UNION
        SELECT standard FROM proglang_tbl
        WHERE language = 'APT';               
```


```
tesdb=# SELECT standard FROM proglang_tbl
        WHERE language = 'Fortran'
        UNION ALL
        SELECT standard FROM proglang_tbl
        WHERE language = 'APL';
```

With UNION ALL, the SQL engine does not have to bother with checking for
duplicates, so repeated values are kept in the output.

If you have constructed your participating queries in such a way that there
are no repeated values, using UNION ALL can improve your query processing time.
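
To see the difference between UNION and UNION ALL without a running database server, here is a small sketch using Python's standard `sqlite3` module with an in-memory database. The choice of SQLite and the two sample rows are assumptions made for illustration; they stand in for the `tesdb` session and are not the full `proglang_tbl` table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE proglang_tbl (language TEXT, year INTEGER, standard TEXT)"
)
# Illustrative rows only (assumed for this sketch, not the complete table).
conn.executemany(
    "INSERT INTO proglang_tbl VALUES (?, ?, ?)",
    [("Fortran", 1957, "ANSI"), ("APL", 1964, "ANSI")],
)

query = """
    SELECT standard FROM proglang_tbl WHERE language = 'Fortran'
    {op}
    SELECT standard FROM proglang_tbl WHERE language = 'APL'
"""

# UNION removes duplicates; UNION ALL keeps every row from both queries.
union_rows = conn.execute(query.format(op="UNION")).fetchall()
union_all_rows = conn.execute(query.format(op="UNION ALL")).fetchall()

print(union_rows)      # [('ANSI',)]
print(union_all_rows)  # [('ANSI',), ('ANSI',)]
```

Both languages share the same standard in this sample, so UNION collapses the two rows into one while UNION ALL returns both.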

# Intersection

The intersection operation outputs only the elements common to both sets.

set1 = { 1, 3, 5 }

set2 = { 1, 2, 3 }

set1 INTERSECTION set2 = { 1, 3 }

As with union, each common value is displayed only once. Duplicates
are removed from the final result set.


Using the INTERSECT operator in SQL:

```
tesdb=# SELECT standard FROM proglang_tbl
        WHERE year=1964
        INTERSECT
        SELECT standard FROM proglang_tbl
        WHERE year=1957;
```

The INTERSECT operator finds the exact values that are common to the
results of the two queries.

Let’s see what happens when we add another column to the result list.



```
tesdb=# SELECT year, standard FROM proglang_tbl
        WHERE year=1964
        INTERSECT
        SELECT year, standard FROM proglang_tbl
        WHERE year=1957;
```

Each query now produces a different combined value of (year, standard), so no row is common to both, giving us a net zero result.
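
The effect of the extra column can be reproduced with a short sketch using Python's standard `sqlite3` module. SQLite and the two sample rows are assumptions for illustration only; they are not the full `proglang_tbl` table from the session above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE proglang_tbl (language TEXT, year INTEGER, standard TEXT)"
)
# Illustrative rows only (assumed for this sketch).
conn.executemany(
    "INSERT INTO proglang_tbl VALUES (?, ?, ?)",
    [("Fortran", 1957, "ANSI"), ("APL", 1964, "ANSI")],
)

# One column: both queries return 'ANSI', so it is the common value.
one_col = conn.execute("""
    SELECT standard FROM proglang_tbl WHERE year = 1964
    INTERSECT
    SELECT standard FROM proglang_tbl WHERE year = 1957
""").fetchall()

# Two columns: (1964, 'ANSI') and (1957, 'ANSI') differ, so nothing is common.
two_col = conn.execute("""
    SELECT year, standard FROM proglang_tbl WHERE year = 1964
    INTERSECT
    SELECT year, standard FROM proglang_tbl WHERE year = 1957
""").fetchall()

print(one_col)  # [('ANSI',)]
print(two_col)  # []
```

INTERSECT compares whole rows, so every selected column has to match for a row to survive.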

# Difference

The difference set1 - set2 consists of all elements in set1 that do not
occur in set2. Unlike union and intersection, the order of the operands matters.

set1 = { 1, 3, 5 }

set2 = { 1, 2, 3 }


set1 DIFFERENCE set2 = { 5 }

set2 DIFFERENCE set1 = { 2 }
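
As a quick check that the order of the operands matters, here is a minimal sketch using Python's built-in `set` type (Python is an assumption here, used only for illustration).

```python
# The two example sets from the text.
set1 = {1, 3, 5}
set2 = {1, 2, 3}

# The - operator keeps elements of the left set that are absent from the right.
diff_1_2 = set1 - set2
diff_2_1 = set2 - set1

print(diff_1_2)  # {5}
print(diff_2_1)  # {2}
```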



```
tesdb=# INSERT INTO proglang_tbl
        (id, language, author, year, standard)
        VALUES
        (9, 'RPG', 'IBM', 1964, 'ISO');

tesdb=# SELECT * FROM proglang_tbl;        
```

Suppose we wish to list the years of creation of languages that
were standardized by ISO but not by ANSI (Listing 13-11). From our
source table, we find that three languages were standardized by ISO, with
years 1972, 1959, and 1964. But since APL, which was eventually standardized
by ANSI, was also created in 1964, we should ideally be left with the answer
1972 and 1959.

```
tesdb=# SELECT year FROM proglang_tbl
        WHERE standard IN ('ISO')
        AND standard NOT IN ('ANSI');      

```

We thought 1964 would be ineligible, yet it still appears in the result.

Here is what happened. First there was a scan of ISO rows, giving us
three values. Then ANSI rows were discounted, but from the table as a whole
rather than from the first result, because the WHERE conditions are tested
against each row individually. So while the APL 1964 row was left off, the
freshly inserted RPG 1964 row still remained.

Set difference with the EXCEPT operator compares the two result sets themselves, which gives the intended answer:

```
tesdb=# SELECT year FROM proglang_tbl WHERE standard IN ('ISO')
        EXCEPT
        SELECT year FROM proglang_tbl WHERE standard IN ('ANSI');
```
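
The contrast between the WHERE-based attempt and EXCEPT can be sketched with Python's standard `sqlite3` module. SQLite is an assumption standing in for the `tesdb` session, and the four sample rows are illustrative: the years and standards follow the text, while the language names for the 1972 and 1959 rows are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE proglang_tbl (language TEXT, year INTEGER, standard TEXT)"
)
# Illustrative rows only; the names 'Prolog' and 'APT' are assumed here.
conn.executemany(
    "INSERT INTO proglang_tbl VALUES (?, ?, ?)",
    [
        ("Prolog", 1972, "ISO"),
        ("APT", 1959, "ISO"),
        ("APL", 1964, "ANSI"),
        ("RPG", 1964, "ISO"),
    ],
)

# Row-by-row filter: the RPG row is ISO and not ANSI, so 1964 survives.
filter_years = sorted(conn.execute("""
    SELECT year FROM proglang_tbl
    WHERE standard IN ('ISO') AND standard NOT IN ('ANSI')
""").fetchall())

# Set difference: 1964 is in the ANSI result too, so EXCEPT removes it.
except_years = sorted(conn.execute("""
    SELECT year FROM proglang_tbl WHERE standard IN ('ISO')
    EXCEPT
    SELECT year FROM proglang_tbl WHERE standard IN ('ANSI')
""").fetchall())

print(filter_years)  # [(1959,), (1964,), (1972,)]
print(except_years)  # [(1959,), (1972,)]
```

The results are sorted in Python only to make the printed order predictable; neither query guarantees an ordering by itself.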