# Introduction

In this session
-------

You will learn the basics of set theory. The basic set operations are as follows:

- Union
- Intersection
- Difference

You will also learn about various types of joins, such as the following:

- Inner join
- Left join
- Right join

# Set Theory

Consider two sets **A and B** containing `even numbers` and `prime numbers`, respectively.

For ease of understanding, let's consider only positive values less than 10.

Therefore, the sets will be

    A = { 2, 4, 6, 8 } and B = { 2, 3, 5, 7 }

In [2]:

    A = { 2, 4, 6, 8 }
    B = { 2, 3, 5, 7 }

    A, B

Out[2]:

    ({2, 4, 6, 8}, {2, 3, 5, 7})

In [3]:

    # A ∪ B = { 2, 3, 4, 5, 6, 7, 8 }
    A | B

Out[3]:

    {2, 3, 4, 5, 6, 7, 8}

In [4]:

    # A ∩ B = { 2 }
    A & B

Out[4]:

    {2}

In [5]:

    # A - B = { 4, 6, 8 }
    A - B

Out[5]:

    {4, 6, 8}

Set Theory
-----

Write the value of B - A from the example in the text given above.

----------

`Suggested Answer`

{ 3, 5, 7 }

In [6]:

    B - A

Out[6]:

    {3, 5, 7}

Set Theory
------

In a class of 50 students, 10 enrolled in both English and Hindi.

32 enrolled in English in total.

If every student in the class enrolled in at least one of the two subjects,

then how many students enrolled in only Hindi and not English?

You can use a Venn Diagram to solve this question.

Go through the link given below carefully, especially the first example problem.

https://brilliant.org/wiki/venn-diagram/

    22

    18

    28

    42

---------

    18

    ✓ Correct
    Feedback:

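Denote the sets of students who enrolled in English and Hindi by E and H, respectively.

The number of elements in E = 32.

Since every student enrolled in at least one of the two subjects, the students who enrolled in only Hindi are exactly those who did not enroll in English.

So, the number of elements in H - E

    = Total number of elements - Number of elements in E
    = 50 - 32 = 18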
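
The count can be double-checked with Python sets. This is only a sketch: the student IDs 1 to 50 and the way students are assigned to each subject are made up for illustration; only the totals (50, 32 and 10) come from the question.

```python
# Hypothetical student IDs 1..50; only the totals come from the question.
students = set(range(1, 51))
E = set(range(1, 33))    # 32 students enrolled in English
H = set(range(23, 51))   # chosen so that exactly 10 students overlap with E

assert E | H == students  # every student takes at least one subject
assert len(E & H) == 10   # 10 students take both subjects

only_hindi = H - E
print(len(only_hindi))    # 18
```

Note that the 10 students enrolled in both subjects are not needed for the final count; they only determine how many students take Hindi in total.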

Set Theory
-------

Fill in the blank.

If A and B are two sets, then A ∩ (B ∪ A) equals _____.

    A

    B

    A - B

    None of the above.

---------

    A

    ✓ Correct
    Feedback:

B ∪ A contains every element that belongs to A or to B, so A is a subset of B ∪ A.

A ∩ (B ∪ A) contains the elements that are common to A and B ∪ A.

Since every element of A is already in B ∪ A, the result is just A.
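
This identity is known as the absorption law. A quick check in Python, reusing the example sets of even and prime numbers below 10 from the start of the session (one example illustrates the law but does not prove it):

```python
A = {2, 4, 6, 8}
B = {2, 3, 5, 7}

# A is a subset of B | A, so intersecting with A gives back A itself.
result = A & (B | A)
print(result == A)  # True
```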

# SQL Joins

![ss](https://i.stack.imgur.com/UI25E.jpg)

Joins in a Star Schema
------

What is the maximum number of tables that you need to join in a single query to relate any two tables in a star schema?

    2

    3

    4

    There is no such limit

----------

    3

    ✓ Correct
    Feedback:

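Correct. If you look at the ERD for the market fact schema, you can see that the dimension tables do not connect to any table other than the central fact table.

So, relating any two dimension tables requires joining at most three tables: the two dimension tables and the fact table between them.

The same is applicable for any other star schema.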

Types of Joins
--------

Consider the two tables given below.

<center>

**`Employees`**

![ss](https://images.upgrad.com/1634e3f7-73b5-45fd-bd0c-399c5447e4f9-Employees.PNG)

**`Departments`**

![ss](https://images.upgrad.com/c646995c-e298-4e27-9e9f-f48869477723-Departments.PNG)

</center>

Which query would you write to retrieve all employees from the Employees table along with the respective departments that can be obtained from the Departments table?

----------

    select last_name, dept_name
    from employees e
    left join departments d
    on e.dept_id = d.dept_id;

`✓ Correct`

**Feedback:**

This query would return all rows from the Employees table and only the matching rows from the Departments table; an employee without a matching department gets a null dept_name.

This is what the question asks for.

-----------

    select last_name, dept_name
    from employees e
    right join departments d
    on e.dept_id = d.dept_id;

----------

    select last_name, dept_name
    from employees e
    inner join departments d
    on e.dept_id = d.dept_id;

--------

    None of the above

---------
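
The difference between the left join and inner join options can be tried out with a small sketch. It uses Python's built-in sqlite3 module so that it is self-contained; the rows and the emp_id column are made up for illustration and are not taken from the tables pictured above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table departments (dept_id integer primary key, dept_name text);
    create table employees (emp_id integer primary key, last_name text, dept_id integer);
    -- Hypothetical rows: Rao has no department, and Legal has no employees.
    insert into departments values (10, 'Sales'), (20, 'Legal');
    insert into employees values (1, 'Mehta', 10), (2, 'Rao', null);
""")

query = """
    select last_name, dept_name
    from employees e
    {join_type} join departments d
    on e.dept_id = d.dept_id
    order by last_name;
"""
left_rows = conn.execute(query.format(join_type="left")).fetchall()
inner_rows = conn.execute(query.format(join_type="inner")).fetchall()

print(left_rows)   # [('Mehta', 'Sales'), ('Rao', None)]
print(inner_rows)  # [('Mehta', 'Sales')]
```

The left join keeps the employee without a department, which the inner join drops.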

# Types of Joins: A Demonstration

Examples of Multi-Joins
--------

Can you think of an example where you would need to use a multi-join,

i.e., query data from more than two tables at once?

-----

`Suggested Answer`

Consider three tables:

1. Vehicles table containing the columns vehicle id, colour id and customer id
2. Colours table containing the columns colour id and colour (black, white etc.)
3. Customers table containing the columns customer id and customer name

You would need to write a query using multi-joins to know which vehicle is owned by which customer as well as which colour that vehicle is.
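
A minimal sketch of such a multi-join, using Python's built-in sqlite3 module. The snake_case column names and the sample rows are assumptions based on the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table colours (colour_id integer primary key, colour text);
    create table customers (customer_id integer primary key, customer_name text);
    create table vehicles (vehicle_id integer primary key, colour_id integer, customer_id integer);
    -- Hypothetical sample rows.
    insert into colours values (1, 'black'), (2, 'white');
    insert into customers values (1, 'Asha'), (2, 'Vikram');
    insert into vehicles values (101, 2, 1), (102, 1, 2);
""")

# Two joins in one query: vehicles -> customers and vehicles -> colours.
rows = conn.execute("""
    select v.vehicle_id, cu.customer_name, co.colour
    from vehicles v
    inner join customers cu on v.customer_id = cu.customer_id
    inner join colours co on v.colour_id = co.colour_id
    order by v.vehicle_id;
""").fetchall()

print(rows)  # [(101, 'Asha', 'white'), (102, 'Vikram', 'black')]
```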

'On' vs 'Using' Keyword
-------

State whether the following statement is true or false.

Every SQL statement containing a 'join' keyword can be rewritten with a 'using' keyword.

    True

    False

---------

    False

    ✓ Correct
    Feedback:

The 'using' keyword can be used in a query to join two tables only when the common column in the two tables has the same name in both tables.
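
The sketch below illustrates this with two hypothetical tables whose key columns have different names. It runs on SQLite through Python's built-in sqlite3 module, which is an assumption made to keep the example self-contained; the exact error raised will differ in other database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables: the key is called id in one table and cust_id in the other.
    create table customers (id integer primary key, name text);
    create table orders (order_id integer primary key, cust_id integer);
    insert into customers values (1, 'Asha');
    insert into orders values (500, 1);
""")

# 'on' works even though the two column names differ.
on_rows = conn.execute(
    "select name, order_id from customers c inner join orders o on c.id = o.cust_id"
).fetchall()
print(on_rows)  # [('Asha', 500)]

# 'using' needs a column with the same name in both tables, so this fails.
try:
    conn.execute("select name, order_id from customers inner join orders using (cust_id)")
    using_failed = False
except sqlite3.OperationalError:
    using_failed = True
print(using_failed)  # True
```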

Shorten the Clause
--------

You just saw the clause *City = 'Delhi' or City = 'Patna'* in a query.

Can you think of a shorter way of rewriting this clause?

---------

`Suggested Answer`

You can rewrite the clause as City in ('Delhi', 'Patna').

The `in` keyword comes in handy when you have multiple `or` conditions on the same column in a single query.
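
A quick way to confirm that the two forms are equivalent is to run both against the same data. The table and rows below are hypothetical, and the sketch uses Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table and rows.
    create table customers (name text, City text);
    insert into customers values ('Asha', 'Delhi'), ('Ravi', 'Patna'), ('Meera', 'Pune');
""")

with_or = conn.execute(
    "select name from customers where City = 'Delhi' or City = 'Patna' order by name"
).fetchall()
with_in = conn.execute(
    "select name from customers where City in ('Delhi', 'Patna') order by name"
).fetchall()

print(with_in)             # [('Asha',), ('Ravi',)]
print(with_or == with_in)  # True
```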

Explain the Query
-----

Explain in your own words what the following query is trying to achieve.

    select Customer_Name, sum(Order_Quantity) as No_Of_Orders
    from cust_dimen c
    inner join market_fact_full m
    on c.cust_id = m.cust_id
    group by Customer_Name
    order by No_Of_Orders desc
    limit 1;

----------

`Suggested Answer`

The above query finds the customer with the highest total order quantity.

An inner join is used because the customer names and the order quantities are located in different tables, and you need only the matching rows.

The table aliases make it clear whether the common column (cust_id) is referenced from the cust_dimen or the market_fact_full table.

The rows are grouped by customer name because a customer may feature multiple times in the market_fact_full table, and sum(Order_Quantity) adds up the quantities within each group. Note that, despite the alias No_Of_Orders, this is the total quantity ordered rather than a count of orders.

The groups are then sorted in descending order of this total.

Finally, the limit keyword keeps only the topmost row, which gives the customer who ordered the largest total quantity.
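
To see the query at work, the sketch below runs it on a tiny made-up data set with Python's built-in sqlite3 module. The rows are hypothetical, and only the columns used by the query are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical rows; the real tables have more columns.
    create table cust_dimen (cust_id text primary key, Customer_Name text);
    create table market_fact_full (cust_id text, Order_Quantity integer);
    insert into cust_dimen values ('C1', 'Asha'), ('C2', 'Ravi');
    insert into market_fact_full values ('C1', 5), ('C2', 20), ('C1', 10);
""")

top = conn.execute("""
    select Customer_Name, sum(Order_Quantity) as No_Of_Orders
    from cust_dimen c
    inner join market_fact_full m
    on c.cust_id = m.cust_id
    group by Customer_Name
    order by No_Of_Orders desc
    limit 1;
""").fetchall()

print(top)  # [('Ravi', 20)]
```

In this sample, Asha appears in two fact rows (total 15) while Ravi appears in one (total 20), so the query returns Ravi: sum(Order_Quantity) adds up quantities and does not count rows.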

Types of Joins
-----

**Description**

Given a table named customers and a table named orders with the following columns:

<center>

![ss](https://media-doselect.s3.amazonaws.com/generic/0wR1ZKYKEd9pqMp1WYXz4ORgK/Customers.PNG)

![ss](https://media-doselect.s3.amazonaws.com/generic/8vzoaArYoGG3jLgp3Q1d8NM7/Orders.PNG)

</center>

Write a query to list the names of all customers who have placed at least one order.

**Sample Output**

    customerName

    Atelier graphique

------

    use upgrad;

    select distinct customerName
    from customers c inner join orders o
    on c.customerNumber = o.customerNumber;

# Outer Joins: A Demonstration

Outer Joins vs Inner Joins
----

You just saw an example where an outer join would be preferred to an inner join.

Can you think of another such scenario?

---------

`Suggested Answer`

Consider a table 'Loans' containing information about loans, including the loan type, principal amount and the rate of interest.

Now, consider another table 'Applicants' that contains the applicant name, address and loan_id.

The loan_id would be null if an applicant has applied for a loan but it has not been sanctioned yet.

It would be better to use an outer join in this case so that you also get information on those applicants whose loans have not been approved yet.
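
This scenario can be sketched with Python's built-in sqlite3 module. The exact column names and the two sample applicants are assumptions based on the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical column names and rows based on the description.
    create table loans (loan_id integer primary key, loan_type text, principal real, interest_rate real);
    create table applicants (applicant_name text, address text, loan_id integer);
    insert into loans values (1, 'Home', 2500000, 8.5);
    insert into applicants values ('Asha', 'Delhi', 1), ('Ravi', 'Patna', null);
""")

# The left (outer) join keeps Ravi even though no loan has been sanctioned yet.
rows = conn.execute("""
    select a.applicant_name, l.loan_type
    from applicants a
    left join loans l on a.loan_id = l.loan_id
    order by a.applicant_name;
""").fetchall()

print(rows)  # [('Asha', 'Home'), ('Ravi', None)]
```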

Inner vs. Outer Join
------

What is the difference between an inner join and an outer join operation?

    There is no difference.

    An inner join preserves a few values that are otherwise lost in an outer join.

    An outer join preserves a few values that are otherwise lost in an inner join.

    An outer join can be used only on outer queries,
    whereas an inner join operation can be used in subqueries.

---------

    An outer join preserves a few values that are otherwise lost in an inner join.

    ✓ Correct
    Feedback:

An outer join keeps the rows of one table (or of both tables) that have no match in the other table and fills the columns of the other table with null.

An inner join drops such non-matching rows altogether.

# Views with Joins

Views with joins are especially useful if you need to keep data from multiple tables available in a single place for ready reference.
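
A minimal sketch of a view built on a join, using Python's built-in sqlite3 module. The view name customer_orders, the orderNumber column and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables and rows.
    create table customers (customerNumber integer primary key, customerName text);
    create table orders (orderNumber integer primary key, customerNumber integer);
    insert into customers values (103, 'Atelier graphique');
    insert into orders values (10123, 103), (10298, 103);

    -- The view saves the join query under a name; it does not copy the data.
    create view customer_orders as
        select c.customerName, o.orderNumber
        from customers c
        inner join orders o on c.customerNumber = o.customerNumber;
""")

# The view can now be queried like a table.
rows = conn.execute("select * from customer_orders order by orderNumber").fetchall()
print(rows)  # [('Atelier graphique', 10123), ('Atelier graphique', 10298)]
```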

# Set Operations with SQL

*Two tables are **union-compatible** if*:

1. They have the same number of attributes, and
2. The attribute types are compatible, i.e., the corresponding attributes have the same data type.
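
Union-compatible tables can be combined with `union` or `union all`. The sketch below uses two hypothetical single-column tables holding the even and prime numbers below 10, and runs on Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two hypothetical union-compatible tables: one integer column each.
    create table evens (n integer);
    create table primes (n integer);
    insert into evens values (2), (4), (6), (8);
    insert into primes values (2), (3), (5), (7);
""")

union_rows = conn.execute(
    "select n from evens union select n from primes order by n"
).fetchall()
union_all_rows = conn.execute(
    "select n from evens union all select n from primes order by n"
).fetchall()

print([r[0] for r in union_rows])      # [2, 3, 4, 5, 6, 7, 8]
print([r[0] for r in union_all_rows])  # [2, 2, 3, 4, 5, 6, 7, 8]
```

The value 2 is in both tables, so it appears once with `union` and twice with `union all`.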

Union vs Union all
--------

State whether the following statement is true or false.

It is preferable to use '`union all`' instead of '`union`' in a query.

    True

    False

--------

    False

    ✓ Correct
    Feedback:

Using `union` eliminates duplicate values in the result set, making it easier to look at and analyse the resulting output.

`union all` returns all values, including duplicates.

## Practice Questions

Joins
------

What will be the output of the following query?

    select id, col_1, col_2
    from table_1 a
    inner join table_2 b
    using(id)
    inner join table_3 c
    using(id_2)

 --------

    This query will throw an error message.

    This query will perform the inner join of table_1 and table_3 only.

    This query will perform an inner join of table_1 and table_2 on ‘id’
    and then perform an inner join on ‘id_2’ with table_3.

    This query will perform an inner join of table_2 and table_3 on ‘id_2’
    and then perform an inner join on ‘id’ with table_1.

--------

    This query will perform an inner join of table_1 and table_2 on ‘id’
    and then perform an inner join on ‘id_2’ with table_3.

    ✓ Correct
    Feedback:

This statement is correct.

The joins are evaluated from left to right: table_1 and table_2 are first inner joined on ‘id’, and the result is then inner joined with table_3 on ‘id_2’.
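
The behaviour can be reproduced on a small example with Python's built-in sqlite3 module. The question does not say which table holds which column, so the schemas and rows below are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schemas: the question does not specify the columns of each table.
    create table table_1 (id integer, col_1 text);
    create table table_2 (id integer, id_2 integer);
    create table table_3 (id_2 integer, col_2 text);
    insert into table_1 values (1, 'a'), (2, 'b');
    insert into table_2 values (1, 10), (2, 20);
    insert into table_3 values (10, 'x');
""")

rows = conn.execute("""
    select id, col_1, col_2
    from table_1 a
    inner join table_2 b
    using(id)
    inner join table_3 c
    using(id_2)
""").fetchall()

print(rows)  # [(1, 'a', 'x')]
```

Both rows survive the first join on id, but only the row with id_2 = 10 has a match in table_3, so a single row remains.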

Joins
----

You have been given the following two tables `‘Transactions’` and `‘Company’`.

<center>

**`Transactions`**

![ss](https://images.upgrad.com/44817ce1-4833-41f8-90cd-c88431be393a-img62.PNG)

**`Company`**

![ss](https://images.upgrad.com/bdbf28f9-2bf3-422a-81a0-dbbf57eb1be2-img63.PNG)

</center>

If you perform an inner join on these two tables on ‘Company_id’,

then how many rows will the resulting table contain?

    4

    3

    2

    5

--------

    3

    ✓ Correct
    Feedback:

When you perform the inner join on these tables, the three rows of the ‘Transactions’ table in which the ‘Company_id’ is equal to B, B and E will appear in the result.

Joins
-------

You have been given the following three tables `‘Transactions’`, `‘Company’` and `‘Headquarter’`.

<center>

**`Transactions`**

![ss](https://images.upgrad.com/fa8165b1-a15a-45b2-bee0-874b2e6248af-img64.PNG)

**`Headquarter`**

![ss](https://images.upgrad.com/7e57e4cb-0827-4959-8633-684de126028a-img65.PNG)

**`Company`**

![ss](https://images.upgrad.com/a40bb455-f475-4d19-82d6-443262bbd8ab-img66.PNG)

</center>

Suppose you want to get the company name corresponding to the highest amount paid in the `‘Transactions’` table.

Which method will be the most efficient to perform such an operation?

--------

- Perform the operation in three steps:

  - First, get the ‘Company_Headquarter_id’ corresponding to the highest amount
    from the ‘Transactions’ table.

  - Then, find the ‘Company_id’ from the ‘Headquarter’ table that corresponds
    to the ‘Company_headquarter_id’ obtained from the previous step.

  - Perform the operation on the ‘Company’ table to get the ‘Company_name’.

-------

- Perform a nested query operation among the three tables.

----------

- Inner join the three tables and query for the
  maximum amount in the inner joined table directly.

---------

    Inner join the three tables and query for the
    maximum amount in the inner joined table directly.

    ✓ Correct
    Feedback:

Performing inner joins on the three tables and querying the company name corresponding to the highest amount is the most efficient way to perform this task.

Joins
-------

Suppose you have been given the following two tables `‘Customers’` and `‘Orders’`.

<center>

**`Customers`**

![ss](https://images.upgrad.com/526257cb-c72f-4ab7-8f94-9f6d7e2afff5-img54.PNG)

**`Orders`**

![ss](https://images.upgrad.com/2f88bd43-f7b0-42e2-a686-be32a83cde45-img55.PNG)

</center>

These two tables are to be joined on the `‘dept_no’` key.

Match each join type with the resulting table.

![ss](https://images.upgrad.com/4eb5ba93-8d3c-47bf-88f1-4033fee7132c-img67.PNG)

Select the correct option from below.

    A-2, B-1, C-4, D-5, E-3

    A-1, B-3, C-2, D-5, E-4

    A-4, B-3, C-1, D-2, E-5

    A-5, B-3, C-1, D-2, E-4

-------

    A-5, B-3, C-1, D-2, E-4

    ✓ Correct
    Feedback:

*Understand the Venn diagram and answer this question.*

Summary
-----

You learnt the following topics in this session:

- Some basic set operations such as union, intersection and difference
- Types of joins: inner joins, left joins and right joins
- Writing simple and complex queries involving multiple joins and views
- Using 'union' and 'union all' in queries