## SQL UNION

### What is SQL UNION?

In SQL, **UNION** is an operator that combines the results of two or more `SELECT` queries into a single result set. It is particularly useful when you want to combine data from different tables, or when several queries share the same column structure and you want to merge their results.

### How is SQL UNION used?

Using **UNION** comes down to writing each `SELECT` query as usual and placing the operator between them, so that their results are stacked on top of each other.

Here's how SQL UNION is used:

1. **Identify the SELECT Queries:** Determine the SELECT queries that you want to combine. The queries should have the same number of columns, and the columns' data types should be compatible. The order of columns in the SELECT queries must match.

2. **Use the UNION Operator:** In your SQL query, use the UNION keyword to combine the SELECT queries. Place the UNION operator between each SELECT query.

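3. **Ensure Data Compatibility:** Check that corresponding columns in each SELECT query have compatible data types and appear in the same order. The data types do not have to be identical, and the column names may differ; the result set takes its column names from the first SELECT query.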

4. **Decide How to Handle Duplicate Rows:** By default, the UNION operator removes duplicate rows from the final result set. If you want to keep duplicates, use the UNION ALL operator instead of UNION.

5. **Retrieve Data:** Execute the SQL query, and the result will be a single result set containing the combined rows from all the SELECT queries.
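The column requirements in steps 1 and 3 can be sketched with a case where one table lacks a column. The tables `staff` and `interns` below are hypothetical, invented only for this illustration, and the SQL is kept plain so that it should run on common engines such as PostgreSQL, MySQL, or SQLite:

```sql
-- Hypothetical tables for illustration only.
CREATE TABLE staff (
    staff_id   INT,
    staff_name VARCHAR(100),
    department VARCHAR(50)
);

CREATE TABLE interns (
    intern_id   INT,
    intern_name VARCHAR(100)   -- note: no department column
);

INSERT INTO staff (staff_id, staff_name, department) VALUES
(1, 'Alice', 'HR');

INSERT INTO interns (intern_id, intern_name) VALUES
(201, 'Bob');

-- Both SELECTs return three columns in the same order: id, name, department.
-- The literal 'Unassigned' fills the column that interns does not have.
SELECT staff_id, staff_name, department FROM staff
UNION
SELECT intern_id, intern_name, 'Unassigned' FROM interns;
```

Without the literal, the second query would return only two columns and the database would reject the statement because the column counts differ.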

Example:

Consider two tables, "employees" and "contractors," with similar structures:

Table: employees

| emp_id | emp_name   | department |
|--------|------------|------------|
| 1      | John Doe   | HR         |
| 2      | Jane Smith | Sales      |

Table: contractors

| cont_id | cont_name    | department |
|---------|--------------|------------|
| 101     | Mark Johnson | Finance    |
| 102     | Sarah Brown  | IT         |

To combine the data from both tables into a single result set, you can use UNION:

```sql
SELECT emp_id, emp_name, department FROM employees
UNION
SELECT cont_id, cont_name, department FROM contractors;
```

The result set will be:

| emp_id | emp_name     | department |
|--------|--------------|------------|
| 1      | John Doe     | HR         |
| 2      | Jane Smith   | Sales      |
| 101    | Mark Johnson | Finance    |
| 102    | Sarah Brown  | IT         |

The UNION operator combines the rows from the "employees" and "contractors" tables into a single result set. The column names come from the first SELECT query, which is why the contractor rows appear under `emp_id` and `emp_name`. The result set contains only distinct rows, because UNION removes duplicates by default. If you want to keep duplicates, use the UNION ALL operator instead.
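The difference between the two operators shows up only when the inputs overlap. As a minimal sketch, assume two hypothetical single-column tables, `office_cities` and `warehouse_cities` (names and data invented for illustration), that share one value:

```sql
-- Hypothetical tables for illustration only.
CREATE TABLE office_cities (city VARCHAR(50));
CREATE TABLE warehouse_cities (city VARCHAR(50));

INSERT INTO office_cities (city) VALUES ('Berlin'), ('Paris');
INSERT INTO warehouse_cities (city) VALUES ('Paris'), ('Rome');

-- UNION removes the duplicate: Berlin, Paris, Rome (3 rows).
SELECT city FROM office_cities
UNION
SELECT city FROM warehouse_cities
ORDER BY city;

-- UNION ALL keeps every row: Berlin, Paris, Paris, Rome (4 rows).
SELECT city FROM office_cities
UNION ALL
SELECT city FROM warehouse_cities
ORDER BY city;
```

Because UNION ALL skips the duplicate-removal step, it is usually the cheaper choice when duplicates are impossible or acceptable.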

### How does SQL UNION work?

**SQL UNION** works by combining the results of two or more `SELECT` queries into a single result set. It allows you to stack the rows from different queries on top of each other, creating a unified output.

Here's a step-by-step explanation of how **SQL UNION** works:

Consider two tables, `"employees"` and `"contractors"`, with similar structures:

Table: employees

| emp_id | emp_name   | department |
|--------|------------|------------|
| 1      | John Doe   | HR         |
| 2      | Jane Smith | Sales      |

Table: contractors

| cont_id | cont_name    | department |
|---------|--------------|------------|
| 101     | Mark Johnson | Finance    |
| 102     | Sarah Brown  | IT         |

Example SQL query with **UNION**:

```sql
SELECT emp_id, emp_name, department FROM employees
UNION
SELECT cont_id, cont_name, department FROM contractors;
```

1. The first `SELECT` query retrieves data from the `"employees"` table:

   
   | emp_id | emp_name   | department |
   |--------|------------|------------|
   | 1      | John Doe   | HR         |
   | 2      | Jane Smith | Sales      |
   

2. The second `SELECT` query retrieves data from the `"contractors"` table:

  
   | cont_id | cont_name    | department |
   |---------|--------------|------------|
   | 101     | Mark Johnson | Finance    |
   | 102     | Sarah Brown  | IT         |
  

3. The **UNION** operator combines the rows from both `SELECT` queries and removes duplicate rows (if any) to create a new result set:
   
   | emp_id | emp_name     | department |
   |--------|--------------|------------|
   | 1      | John Doe     | HR         |
   | 2      | Jane Smith   | Sales      |
   | 101    | Mark Johnson | Finance    |
   | 102    | Sarah Brown  | IT         |
   

The final result set contains all the rows from both tables, merged into a single output with any duplicate rows removed. The order of the rows is not guaranteed unless you add an `ORDER BY` clause at the end of the query. This is how **SQL UNION** works, allowing you to combine data from multiple sources with similar structures into one consolidated result set.
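As a sketch of how the combined result can be named and sorted, the query below rebuilds the two example tables and adds aliases plus a trailing `ORDER BY`. The column types and the aliases `person_id` and `person_name` are assumptions made for this illustration; the source shows only the table contents.

```sql
-- Column types are assumptions; only the table contents come from the example.
CREATE TABLE employees (
    emp_id     INT,
    emp_name   VARCHAR(100),
    department VARCHAR(50)
);

CREATE TABLE contractors (
    cont_id    INT,
    cont_name  VARCHAR(100),
    department VARCHAR(50)
);

INSERT INTO employees (emp_id, emp_name, department) VALUES
(1, 'John Doe', 'HR'),
(2, 'Jane Smith', 'Sales');

INSERT INTO contractors (cont_id, cont_name, department) VALUES
(101, 'Mark Johnson', 'Finance'),
(102, 'Sarah Brown', 'IT');

-- Aliases in the first SELECT name the columns of the whole result.
-- The single ORDER BY at the end sorts the combined rows, not just the last query.
SELECT emp_id AS person_id, emp_name AS person_name, department FROM employees
UNION
SELECT cont_id, cont_name, department FROM contractors
ORDER BY person_id;
```

With the sample data, the rows come back in the order 1, 2, 101, 102 under the headers `person_id`, `person_name`, and `department`.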

### What is the difference between SQL JOIN and SQL UNION?

SQL UNION and SQL JOIN are used for different purposes and have distinct functionalities:

1. **SQL UNION:**
   - **SQL UNION** is used to combine the results of two or more `SELECT` queries into a single result set.
   - It requires the `SELECT` queries to return the same number of columns, in the same order, with compatible data types.
   - It eliminates duplicate rows from the final result set by default, ensuring that each row is unique.
   - It does not consider any relationship between tables; instead, it simply stacks the results vertically.
   - The **UNION** operator is used for combining the results, and it does not require a common column or key between the tables.
   - Example: 
     ```sql
     SELECT column1, column2 FROM table1
     UNION
     SELECT column1, column2 FROM table2;
     ```

2. **SQL JOIN:**
   - **SQL JOIN** is used to combine rows from two or more related tables based on a common column or key.
   - It is used to retrieve data from multiple tables simultaneously, merging information from different sources into a single result set.
   - It typically relies on a related column or key between the tables to decide which rows are combined.
   - The **JOIN** operation combines the rows based on matching values in the specified columns.
   - There are different types of **JOINs**, such as **INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN**, each with specific behaviors.
   - Example: 
     ```sql
     -- Qualify column names so they are unambiguous when both tables share them.
     SELECT table1.column1, table2.column2
     FROM table1
     INNER JOIN table2 ON table1.common_column = table2.common_column;
     ```

In summary, the main difference between **SQL UNION** and **SQL JOIN** is their purpose and behavior. **UNION** is used to combine the results of multiple queries with the same structure, while **JOIN** is used to retrieve data from related tables based on a common column or key. Each serves a unique role in SQL queries, allowing for powerful data manipulation and retrieval capabilities.
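The contrast can be sketched on concrete data: UNION adds rows, while JOIN adds columns. The `departments` table, its `location` column, and all column types below are hypothetical additions for this illustration; only the contents of `employees` and `contractors` come from the earlier example.

```sql
-- Column types are assumptions; the source shows only the table contents.
CREATE TABLE employees (
    emp_id     INT,
    emp_name   VARCHAR(100),
    department VARCHAR(50)
);

CREATE TABLE contractors (
    cont_id    INT,
    cont_name  VARCHAR(100),
    department VARCHAR(50)
);

-- Hypothetical lookup table, not part of the original example.
CREATE TABLE departments (
    department VARCHAR(50),
    location   VARCHAR(50)
);

INSERT INTO employees (emp_id, emp_name, department) VALUES
(1, 'John Doe', 'HR'),
(2, 'Jane Smith', 'Sales');

INSERT INTO contractors (cont_id, cont_name, department) VALUES
(101, 'Mark Johnson', 'Finance'),
(102, 'Sarah Brown', 'IT');

INSERT INTO departments (department, location) VALUES
('HR', 'London'),
('Sales', 'New York');

-- UNION stacks rows: 2 columns, 4 rows (people from both tables).
SELECT emp_name AS person_name, department FROM employees
UNION
SELECT cont_name, department FROM contractors;

-- JOIN widens rows: each employee gains a location column taken from
-- the matching departments row (3 columns, 2 rows).
SELECT e.emp_name, e.department, d.location
FROM employees e
INNER JOIN departments d ON e.department = d.department;
```

The UNION query needs no relationship between the two tables, whereas the JOIN query only returns rows where the `department` values match.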

Note: Refer to the following link to understand the execution order of the query.

### Theory Questions:

1. What is the primary purpose of the UNION operator in SQL?

2. How is the UNION operator different from the UNION ALL operator?

3. What requirements must two SELECT statements meet to be combined using the UNION operator?

4. Why is it important for the columns in the SELECT statements to have compatible data types when using UNION?

5. How does the UNION operator differ from the JOIN operator in SQL?

6. Can you explain a scenario where you would use UNION instead of a JOIN, and vice versa?

7. Describe a real-world scenario where the UNION operator might be particularly useful.

8. How can you ensure that the results combined using UNION do not contain any duplicate rows?

9. When combining results from multiple tables using UNION, how can you ensure that the order of columns matches across the SELECT statements?

10. Can you think of any performance considerations to keep in mind when using the UNION operator, especially with large datasets?

11. How can you optimize a query that uses UNION to ensure that it runs efficiently and retrieves results quickly?

### Table setup queries for the practice questions:

**Books Table:**

```sql
CREATE TABLE Books (
    book_id INT,
    title VARCHAR(255),
    author VARCHAR(255),
    publication_year INT
);

INSERT INTO Books (book_id, title, author, publication_year) VALUES
(1, 'The Great Gatsby', 'F. Scott Fitzgerald', 1925),
(2, 'To Kill a Mockingbird', 'Harper Lee', 1960),
(3, '1984', 'George Orwell', 1949);
```

**Magazines Table:**

```sql
CREATE TABLE Magazines (
    magazine_id INT,
    title VARCHAR(255),
    editor VARCHAR(255),
    publication_year INT
);

INSERT INTO Magazines (magazine_id, title, editor, publication_year) VALUES
(1, 'National Geographic', 'Susan Goldberg', 1888),
(2, 'The New Yorker', 'David Remnick', 1925),
(3, 'TIME', 'Edward Felsenthal', 1923);
```

### Easy Level:

Q. Combine the title column from both Books and Magazines tables.

Q. List all unique publication years from the Books and Magazines tables.

Q. Retrieve a list of titles, including both books and magazines.

Q. Display all distinct publication years from the books and magazines in a single list.

Q. Create a list of unique titles from the Books and Magazines tables.

### Medium Level:

Q. Combine the authors from the Books table and the editors from the Magazines table into one list.

Q. Create a unified list of book titles and magazine titles.

Q. List all unique publication years from books and magazines, sorted in descending order.

Q. Display a list of all authors and editors, without duplicates.

Q. Retrieve a list of all titles published in the year 2020 from both books and magazines.

### Hard Level:

Q. Get a combined list of titles and their publication years from both books and magazines.

Q. Create a unified list of titles and their sources (book or magazine).

Q. List all publication years, along with a label identifying whether the year is from a book or a magazine.

Q. Combine the titles of books and magazines, along with an indicator of the type of publication.

Q. Create a list of all authors and editors from books and magazines, tagged with the type of their publication.