# HW3s - Brian Flores

## CH6 Set operators

## Q1

Explain the difference between the `UNION ALL` and `UNION` operators. In what cases are they equivalent?  When they are equivalent, which one should you use?

## **Explanation:**

The **difference** between the `UNION ALL` and `UNION` operators is how they treat duplicate rows. Both unify the rows of two input queries. `UNION ALL` keeps **every row, duplicates included**, so its result is a **multiset** rather than a set. In set theory, a set is a collection of items with **no duplicates**, so `UNION`, which is implicitly distinct, removes duplicate rows from the unified result.

The two operators are **_equivalent_** exactly **when the unified result cannot contain duplicates**: no row repeats within either input, and no row appears in both inputs. In that case `UNION` has nothing to remove, so it returns the same rows as `UNION ALL`.

To decide which one to use, we first have to establish that the inputs cannot produce duplicates. `UNION` carries an implicit distinct step that checks for duplicates at runtime, which costs time. **Therefore, when we know that no duplicates can occur, it is best to use the `UNION ALL` operator**, since it performs no duplicate check.
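
As a small illustration of the difference, the sketch below runs both operators over the same two inputs. The inline `VALUES` data and the aliases `E` and `S` are made up for this example and are not part of the course databases.

```sql
-- Hypothetical inline data; E, S and the city values are illustration only.
SELECT city
FROM (VALUES ('London'), ('Seattle')) AS E(city)

UNION ALL   -- keeps every row from both inputs: 4 rows, 'London' twice

SELECT city
FROM (VALUES ('London'), ('Paris')) AS S(city);

SELECT city
FROM (VALUES ('London'), ('Seattle')) AS E(city)

UNION       -- removes duplicates: 3 rows, 'London' once

SELECT city
FROM (VALUES ('London'), ('Paris')) AS S(city);
```

If 'London' were missing from one of the inputs, both queries would return the same three rows, which is the case where `UNION ALL` is the cheaper choice.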

---


## Q6

You are given the following query:
```sql 
SELECT country, region, city
FROM HR.Employees

UNION ALL

SELECT country, region, city
FROM Production.Suppliers;
```

You are asked to add logic to the query 
such that it would guarantee that the rows from Employees
would be returned in the output before the rows from Suppliers,
and within each segment, the rows should be sorted
by country, region, city

**Tables involved:** NORTHWINDS database, Employees and Suppliers tables

**Desired output:**

```
country         region          city
--------------- --------------- ---------------
UK              NULL            London
UK              NULL            London
UK              NULL            London
UK              NULL            London
USA             WA              Kirkland
USA             WA              Redmond
USA             WA              Seattle
USA             WA              Seattle
USA             WA              Tacoma
Australia       NSW             Sydney
Australia       Victoria        Melbourne
Brazil          NULL            Sao Paulo
Canada          Québec          Montréal
Canada          Québec          Ste-Hyacinthe
Denmark         NULL            Lyngby
Finland         NULL            Lappeenranta
France          NULL            Annecy
France          NULL            Montceau
France          NULL            Paris
Germany         NULL            Berlin
Germany         NULL            Cuxhaven
Germany         NULL            Frankfurt
Italy           NULL            Ravenna
Italy           NULL            Salerno
Japan           NULL            Osaka
Japan           NULL            Tokyo
Netherlands     NULL            Zaandam
Norway          NULL            Sandvika
Singapore       NULL            Singapore
Spain           Asturias        Oviedo
Sweden          NULL            Göteborg
Sweden          NULL            Stockholm
UK              NULL            London
UK              NULL            Manchester
USA             LA              New Orleans
USA             MA              Boston
USA             MI              Ann Arbor
USA             OR              Bend
```

(38 row(s) affected)

---

## **Proposition, Table, Columns and the Predicate.**

**Proposition:** Retrieve all rows from `[HumanResources].[Employee]` and `[Production].[Supplier]`, returning the columns **country**, **region**, and **city**. Return duplicate rows if there are any, and make sure the rows from the Employees table come before the rows from the Suppliers table. Within each segment, sort the rows by country, region, and city.

**Tables:** `[HumanResources].[Employee]` and `[Production].[Supplier]`

**Columns:** country, region, city. A single set of names cannot be given because the columns are named differently in each table (for example, `EmployeeCountry` and `SupplierCountry`).

**Predicate:** 

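- Apply the `UNION ALL` operator to return the multiset of both tables
- Tag each input with a constant `sortOrder` column (1 for Employees, 2 for Suppliers)
- Wrap the set operator in a table expression and sort the outer query by `sortOrder`, then country, region, and city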


```sql
-- Q6

-- Original query for TSQLV6

USE TSQLV6;

SELECT country, region, city
FROM (
      SELECT sortOrder = 1, country, region, city
      FROM [HR].[Employees]

      UNION ALL

      SELECT sortOrder = 2, country, region, city
      FROM [Production].[Suppliers]
     ) AS U
ORDER BY sortOrder, country, region, city;

-- New query for Northwinds2022TSQLV7

USE Northwinds2022TSQLV7;

SELECT EmployeeCountry, EmployeeRegion, EmployeeCity
FROM (
      SELECT sortOrder = 1, EmployeeCountry, EmployeeRegion, EmployeeCity
      FROM [HumanResources].[Employee]

      UNION ALL

      SELECT sortOrder = 2, SupplierCountry, SupplierRegion, SupplierCity
      FROM [Production].[Supplier]
     ) AS U
ORDER BY sortOrder, EmployeeCountry, EmployeeRegion, EmployeeCity;
```
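
To try the `sortOrder` idea without either database, here is a minimal sketch over inline data. The `VALUES` rows and the aliases `E` and `S` are hypothetical stand-ins for the Employees and Suppliers tables.

```sql
-- Hypothetical inline data standing in for the two tables.
SELECT country, city
FROM (
      -- sortOrder = 1 marks the first segment
      SELECT sortOrder = 1, country, city
      FROM (VALUES ('USA', 'Seattle'), ('UK', 'London')) AS E(country, city)

      UNION ALL

      -- sortOrder = 2 marks the second segment
      SELECT sortOrder = 2, country, city
      FROM (VALUES ('Japan', 'Tokyo'), ('Brazil', 'Sao Paulo')) AS S(country, city)
     ) AS U
-- Segment first, then alphabetical within each segment
ORDER BY sortOrder, country, city;
```

The two `E` rows (UK, then USA) should come back before the two `S` rows (Brazil, then Japan), even though Brazil sorts ahead of UK alphabetically. The constant column is what keeps the segments apart; it is used only for ordering and is left out of the outer `SELECT` list.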

# Propositions

Translating queries to problems.

## Circumventing Unsupported Logical Phases
In this section we look for a way to apply other operators to the result of a set operator. Recall that the only logical phase you can apply directly to the result of a set operator is `ORDER BY`.

Also remember that an input to a set operator must be a table (a set), not a cursor, so an input query cannot have its own `ORDER BY`. How can we get around these hurdles?

---

### CULP.1

Collect the distinct rows (columns country, region, and city) from the `[HumanResources].[Employee]` table and the `[Sales].[Customer]` table. Then, for each country, show how many distinct locations appear in the combined result.

**Tables Involved**: `[HumanResources].[Employee]` and `[Sales].[Customer]`

**Constraints**: Use `UNION`, and apply `GROUP BY` to its result.

**Takeaways**: Wrap the set operator in a table expression (a derived table in parentheses); you can then apply `GROUP BY` to it in the outer query.

```sql
-- CULP.1

-- Original query for TSQLV6

USE TSQLV6;

SELECT country, COUNT(*) AS numlocations
FROM (SELECT country, region, city FROM [HR].[Employees]

      UNION

      SELECT country, region, city FROM [Sales].[Customers]) AS U
GROUP BY country;

-- New query for Northwinds2022TSQLV7

USE Northwinds2022TSQLV7;

SELECT EmployeeCountry, COUNT(*) AS numLocations
FROM (SELECT EmployeeCountry, EmployeeRegion, EmployeeCity FROM [HumanResources].[Employee]

      UNION

      SELECT CustomerCountry, CustomerRegion, CustomerCity FROM [Sales].[Customer]) AS U
GROUP BY EmployeeCountry;
```
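
A derived table is not the only kind of table expression that works here. As a sketch of the same idea, the version below uses a common table expression (CTE) instead. The CTE name `Locations` and the inline `VALUES` data are assumptions made up for this example.

```sql
-- Hypothetical inline data; Locations is an illustrative CTE name.
WITH Locations AS
(
    SELECT country, city
    FROM (VALUES ('USA', 'Seattle'), ('UK', 'London')) AS E(country, city)

    UNION   -- duplicates across the two inputs are removed here

    SELECT country, city
    FROM (VALUES ('USA', 'Seattle'), ('USA', 'Portland'), ('UK', 'London')) AS C(country, city)
)
-- GROUP BY is applied to the result of the set operator
SELECT country, COUNT(*) AS numlocations
FROM Locations
GROUP BY country;
```

With this data the unified set has three distinct locations, so the query should report 1 for UK and 2 for USA.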


### CULP.2

Combine the two most recent orders of employee 3 with the two most recent orders of employee 5. Pick the top 2 by order date, descending, using the order ID as a tiebreaker. If there are duplicate rows, make sure to return them. Display the employee ID, order ID, and order date.

**Tables Involved**: `[Sales].[Order]`

**Constraints**: Use `WHERE empid = 3` and `WHERE empid = 5`. Order each employee's orders by orderdate and orderid, descending, so that higher values come first.

**Takeaways**: An input to a set operator cannot have its own `ORDER BY`. To order within an input, put `TOP (N)` with `ORDER BY` inside a table expression, so the set operator still receives a table.


```sql
-- CULP.2

-- Original query for TSQLV6

USE TSQLV6;

SELECT empid, orderid, orderdate
FROM (SELECT TOP (2) empid, orderid, orderdate
      FROM [Sales].[Orders]
      WHERE empid = 3
      ORDER BY orderdate DESC, orderid DESC) AS D1

UNION ALL

SELECT empid, orderid, orderdate
FROM (SELECT TOP (2) empid, orderid, orderdate
      FROM [Sales].[Orders]
      WHERE empid = 5
      ORDER BY orderdate DESC, orderid DESC) AS D2;

-- New query for Northwinds2022TSQLV7

USE Northwinds2022TSQLV7;

SELECT EmployeeId, OrderId, OrderDate
FROM (SELECT TOP (2) EmployeeId, OrderId, OrderDate
      FROM [Sales].[Order]
      WHERE EmployeeId = 3
      ORDER BY OrderDate DESC, OrderId DESC) AS D1

UNION ALL

SELECT EmployeeId, OrderId, OrderDate
FROM (SELECT TOP (2) EmployeeId, OrderId, OrderDate
      FROM [Sales].[Order]
      WHERE EmployeeId = 5
      ORDER BY OrderDate DESC, OrderId DESC) AS D2;
```

```
empid,orderid,orderdate
3,11063,2022-04-30
3,11057,2022-04-29
5,11043,2022-04-22
5,10954,2022-03-17
```

```
EmployeeId,OrderId,OrderDate
3,11063,2016-04-30
3,11057,2016-04-29
5,11043,2016-04-22
5,10954,2016-03-17
```
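
`TOP` is not the only filter that makes `ORDER BY` legal inside a table expression in T-SQL; `OFFSET ... FETCH` (SQL Server 2012 and later) does as well. The sketch below is a possible variant of the same pattern, not the assigned solution. The CTE name `O` and its inline rows are hypothetical stand-ins for the orders table.

```sql
-- Hypothetical inline orders; O and its values are illustration only.
WITH O AS
(
    SELECT empid, orderid
    FROM (VALUES (3, 101), (3, 102), (3, 103),
                 (5, 201), (5, 202), (5, 203)) AS V(empid, orderid)
)
SELECT empid, orderid
FROM (SELECT empid, orderid
      FROM O
      WHERE empid = 3
      ORDER BY orderid DESC
      OFFSET 0 ROWS FETCH NEXT 2 ROWS ONLY) AS D1   -- two highest order IDs for employee 3

UNION ALL

SELECT empid, orderid
FROM (SELECT empid, orderid
      FROM O
      WHERE empid = 5
      ORDER BY orderid DESC
      OFFSET 0 ROWS FETCH NEXT 2 ROWS ONLY) AS D2;  -- two highest order IDs for employee 5
```

This should return orders 103 and 102 for employee 3 and orders 203 and 202 for employee 5. As with the `TOP` version, the inner `ORDER BY` only decides which rows are kept; the order of the final result is not guaranteed unless an outer `ORDER BY` is added.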
