# Week 1: CTEs
## What are CTEs?
CTE stands for "Common Table Expression". You can think of it as a named intermediate result that you can reference several times within a single query. The important point is that a CTE defines a table and that every column of that table must have a name. Each derived column in the CTE's `SELECT` statement therefore needs an alias, unless the names are given in a column list after the CTE name.
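As a small illustration of the naming rule, the sketch below runs a CTE against a toy table through Python's built-in `sqlite3` module. The table `orders`, its columns and the CTE name `cte_totals` are made up for this example, and SQLite stands in for SQL Server only so that the snippet runs anywhere. It names the derived column once with an alias and once with a column list after the CTE name.

```python
import sqlite3

# Toy stand-in for an orders table (hypothetical data, not WideWorldImporters).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10), (2, 10), (3, 20)])

# Variant 1: the derived column COUNT(*) gets its name from an alias.
with_alias = conn.execute("""
    WITH cte_totals AS (
        SELECT customer_id, COUNT(*) AS total_orders
        FROM orders
        GROUP BY customer_id
    )
    SELECT customer_id, total_orders FROM cte_totals ORDER BY customer_id
""").fetchall()

# Variant 2: the names come from a column list after the CTE name.
with_column_list = conn.execute("""
    WITH cte_totals (customer_id, total_orders) AS (
        SELECT customer_id, COUNT(*)
        FROM orders
        GROUP BY customer_id
    )
    SELECT customer_id, total_orders FROM cte_totals ORDER BY customer_id
""").fetchall()

print(with_alias)        # one (customer_id, total_orders) tuple per customer
print(with_column_list)  # same rows as the alias variant
```

Both variants return the same rows; which one reads better is mostly a matter of taste.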
## Where do I use CTEs?
There are many scenarios where a CTE is appropriate, even though it is not always the best or fastest way to write a query. Even when a CTE does not improve performance, it usually improves readability.
CTEs are very useful in places where you would otherwise use nested queries, that is, where different parts of a query work at different levels of granularity.
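To make the comparison with nested queries concrete, here is a minimal sketch that writes the same query twice: once with a nested `SELECT` in the `FROM` clause and once with a CTE. It uses Python's built-in `sqlite3` module and a hypothetical three-row `orders` table; all names are assumptions for illustration. The outer query works at order level, the inner one at customer level.

```python
import sqlite3

# Hypothetical toy data: customer 10 has two orders, customer 20 has one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10), (2, 10), (3, 20)])

# Nested form: the customer-level aggregate sits inside the FROM clause.
nested = conn.execute("""
    SELECT o.order_id, o.customer_id, t.total_orders
    FROM orders AS o
    JOIN (
        SELECT customer_id, COUNT(*) AS total_orders
        FROM orders
        GROUP BY customer_id
    ) AS t ON t.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()

# CTE form: the same aggregate is named up front and joined like a table.
with_cte = conn.execute("""
    WITH cte_totals AS (
        SELECT customer_id, COUNT(*) AS total_orders
        FROM orders
        GROUP BY customer_id
    )
    SELECT o.order_id, o.customer_id, t.total_orders
    FROM orders AS o
    JOIN cte_totals AS t ON t.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()

print(nested == with_cte)  # both forms return the same rows
```

The result is identical; the CTE form simply lets the reader meet the customer-level step before the query that uses it.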
Let's look at an example that queries the `[Sales].[Orders]` table of the WWI (WideWorldImporters) database. First of all, we want to find out which customers ordered several times from the same salesperson on the same day. This is a simple `GROUP BY` query:

In [1]:
SELECT [CustomerID]
      ,[SalespersonPersonID]
      ,[OrderDate]
      ,count(distinct OrderID) as OrdersAtDateBySalesPerson
  FROM [WideWorldImporters].[Sales].[Orders] 
group by CustomerId, SalespersonPersonID, OrderDate
order by 4 desc

CustomerID,SalespersonPersonID,OrderDate,OrdersAtDateBySalesPerson
81,8,2014-03-13,4
125,2,2013-03-30,4
569,3,2014-01-23,4
65,8,2016-04-01,4
855,20,2013-12-03,4
104,14,2015-01-26,4
24,13,2013-07-25,4
78,14,2015-10-27,4
967,15,2014-08-14,4
115,14,2015-08-21,4


In the next step, we add the respective customer's total number of orders to each row. The naïve approach would be to use a window function and count the orders per customer with the same `COUNT(distinct OrderID)` expression that we use for the grouping:

In [2]:
SELECT [CustomerID]
      ,[SalespersonPersonID]
      ,[OrderDate]
      ,COUNT(distinct OrderID) as OrdersAtDateBySalesPerson
      ,COUNT(distinct OrderID) OVER (partition by CustomerID) as TotalOrders
  FROM [WideWorldImporters].[Sales].[Orders] 
group by CustomerId, SalespersonPersonID, OrderDate
order by 4 desc

Msg 10759, Level 15, State 1, Line 5
Use of DISTINCT is not allowed with the OVER clause.

However, this approach fails because `DISTINCT` is not allowed in window functions.
If the table is guaranteed to contain exactly one row per order, dropping `DISTINCT` gives the same result, so let's try the following:

In [1]:
SELECT [CustomerID]
      ,[SalespersonPersonID]
      ,[OrderDate]
      ,COUNT(OrderID) as OrdersAtDateBySalesPerson
      ,COUNT(OrderID) OVER (partition by CustomerID) as TotalOrders
  FROM [WideWorldImporters].[Sales].[Orders] 
group by CustomerId, SalespersonPersonID, OrderDate
order by 4 desc

Msg 8120, Level 16, State 1, Line 5
Column 'WideWorldImporters.Sales.Orders.OrderID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

This time the message says that the `OrderID` column is invalid because it is not contained in the `GROUP BY`. Window functions are evaluated after the grouping, so they can only see the grouping columns and aggregates, and `OrderID` is neither.
We can work around this problem too by simply counting rows with `COUNT(1)`:

In [3]:
SELECT [CustomerID]
      ,[SalespersonPersonID]
      ,[OrderDate]
      ,COUNT(1) as OrdersAtDateBySalesPerson
      ,COUNT(1) OVER (partition by CustomerID) as TotalOrders
  FROM [WideWorldImporters].[Sales].[Orders] 
group by CustomerId, SalespersonPersonID, OrderDate
order by 4 desc

CustomerID,SalespersonPersonID,OrderDate,OrdersAtDateBySalesPerson,TotalOrders
78,14,2015-10-27,4,118
104,14,2015-01-26,4,100
569,3,2014-01-23,4,122
855,20,2013-12-03,4,90
24,13,2013-07-25,4,95
65,8,2016-04-01,4,103
81,8,2014-03-13,4,106
115,14,2015-08-21,4,108
125,2,2013-03-30,4,99
575,13,2014-02-06,4,109
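
To check what a window function actually counts once `GROUP BY` is involved, here is a minimal sketch on a four-row toy table, run through Python's built-in `sqlite3` module (window functions need SQLite 3.25 or newer). The table, its column names and its data are assumptions for illustration, not WideWorldImporters. Because the window function runs over the grouped rows, `COUNT(*) OVER` counts groups per customer, while `SUM(COUNT(*)) OVER` adds up the per-group order counts.

```python
import sqlite3

# Hypothetical data: customer 10 has three orders in two (salesperson, date)
# groups, customer 20 has a single order.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        salesperson_id INTEGER,
        order_date TEXT
    )
""")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", [
    (1, 10, 1, "2024-01-01"),
    (2, 10, 1, "2024-01-01"),
    (3, 10, 1, "2024-01-02"),
    (4, 20, 2, "2024-01-01"),
])

rows = conn.execute("""
    SELECT customer_id, salesperson_id, order_date,
           COUNT(*) AS orders_in_group,
           -- counts the grouped rows of the customer, not the orders
           COUNT(*) OVER (PARTITION BY customer_id) AS groups_per_customer,
           -- sums the per-group counts, giving the customer's order total
           SUM(COUNT(*)) OVER (PARTITION BY customer_id) AS orders_per_customer
    FROM orders
    GROUP BY customer_id, salesperson_id, order_date
    ORDER BY customer_id, order_date
""").fetchall()

for row in rows:
    print(row)
```

For customer 10 the two window columns differ (two groups, three orders), which is the case to watch for whenever a group contains more than one order.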


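Note that the window function runs over the grouped rows, so `TotalOrders` here counts a customer's (salesperson, date) groups rather than individual orders; `SUM(COUNT(1)) OVER (PARTITION BY CustomerID)` would add up the orders themselves. Now, however, we no longer want the total itself but a customer rank that is calculated from it: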

In [4]:
SELECT [CustomerID]
      ,[SalespersonPersonID]
      ,[OrderDate]
      ,COUNT(1) as OrdersAtDateBySalesPerson
      ,DENSE_RANK() OVER (ORDER BY COUNT(1) OVER (partition by CustomerID)) as CustRank
  FROM [WideWorldImporters].[Sales].[Orders] 
group by CustomerId, SalespersonPersonID, OrderDate
order by 4 desc

Msg 4109, Level 15, State 1, Line 5
Windowed functions cannot be used in the context of another windowed function or aggregate.

This is where window functions alone reach their limit, because one window function must not be nested inside another.
CTEs solve this problem. Just as we could count each customer's total number of orders in a view and then join that view to our query, we can do the counting in a CTE. We write the CTE first; you can always recognize one by the keyword `WITH` at its start. We then join this CTE to our query, which calculates the rank of each row:

In [5]:
WITH cte_rank as (
    SELECT [CustomerID]
        ,COUNT(1) as TotalOrders
    FROM [WideWorldImporters].[Sales].[Orders] 
    GROUP BY CustomerID
)
SELECT o.[CustomerID]
    ,[SalespersonPersonID]
    ,[OrderDate]
    ,COUNT(1) OVER (PARTITION BY o.CustomerID, SalespersonPersonID, OrderDate)  as OrdersAtDateBySalesPerson
    ,DENSE_RANK() OVER (ORDER BY r.TotalOrders)
FROM [WideWorldImporters].[Sales].[Orders] o 
LEFT JOIN cte_rank r ON o.CustomerID = r.CustomerID
ORDER BY 4 DESC

CustomerID,SalespersonPersonID,OrderDate,OrdersAtDateBySalesPerson,(No column name)
855,20,2013-12-03,4,50
855,20,2013-12-03,4,50
24,13,2013-07-25,4,53
24,13,2013-07-25,4,53
125,2,2013-03-30,4,57
125,2,2013-03-30,4,57
104,14,2015-01-26,4,61
104,14,2015-01-26,4,61
575,13,2014-02-06,4,68
575,13,2014-02-06,4,68


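So here the CTE replaces a nested `SELECT` statement or a view that we could otherwise use for the same purpose. Note that the outer query no longer groups, so it returns one row per order, which is why identical rows repeat in the output.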
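The same idea can be tried on a toy table. The following sketch is a variation on the query above, not a copy of it: it keeps the `GROUP BY` in the outer query so that each (customer, salesperson, date) group appears once, and it gives the rank column a name. It runs through Python's built-in `sqlite3` module (SQLite 3.25 or newer for window functions); the table, column names and data are assumptions for illustration.

```python
import sqlite3

# Hypothetical data: customer 10 has three orders, customer 20 has one.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        salesperson_id INTEGER,
        order_date TEXT
    )
""")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", [
    (1, 10, 1, "2024-01-01"),
    (2, 10, 1, "2024-01-01"),
    (3, 10, 1, "2024-01-02"),
    (4, 20, 2, "2024-01-01"),
])

ranked = conn.execute("""
    WITH cte_totals AS (
        -- customer-level step: one row per customer
        SELECT customer_id, COUNT(*) AS total_orders
        FROM orders
        GROUP BY customer_id
    )
    SELECT o.customer_id, o.salesperson_id, o.order_date,
           COUNT(*) AS orders_in_group,
           -- ascending, as in the query above: fewest orders gets rank 1
           DENSE_RANK() OVER (ORDER BY t.total_orders) AS cust_rank
    FROM orders AS o
    JOIN cte_totals AS t ON t.customer_id = o.customer_id
    GROUP BY o.customer_id, o.salesperson_id, o.order_date, t.total_orders
    ORDER BY o.customer_id, o.order_date
""").fetchall()

for row in ranked:
    print(row)
```

Because the total comes from the CTE as an ordinary column, the rank needs only one window function and nothing has to be nested.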
Similarly, CTEs let us filter on the result of `ROW_NUMBER` or `RANK`, which cannot be used directly in a `WHERE` clause. If we only want to see a customer's first order of the day, we number that customer's orders on that day and select the one with row number 1:

In [7]:
WITH cte_rownumbers AS (
    select [CustomerID]
        ,[OrderDate]
        -- OrderDate is constant within each partition, so order by OrderID for a deterministic first row
        ,ROW_NUMBER() OVER (PARTITION BY [CustomerID], [OrderDate] ORDER BY [OrderID]) as RowNumber
    FROM [WideWorldImporters].[Sales].[Orders]
)
SELECT * FROM cte_rownumbers
WHERE RowNumber = 1

CustomerID,OrderDate,RowNumber
1,2013-03-21,1
1,2013-03-26,1
1,2013-04-13,1
1,2013-05-23,1
1,2013-06-05,1
1,2013-06-14,1
1,2013-06-24,1
1,2013-07-20,1
1,2013-07-29,1
1,2013-08-06,1


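The same pattern can also delete duplicates from a table: number the rows within each group of duplicates and delete every row whose number is greater than 1.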
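A minimal sketch of such a clean-up, again on a made-up table and run through Python's built-in `sqlite3` module (SQLite 3.25 or newer). The table `staged_orders` and its data are assumptions for illustration. In T-SQL the `DELETE` can target the CTE directly (`DELETE FROM cte_rownumbers WHERE RowNumber > 1`); SQLite does not allow that, so this version matches the numbered rows back to the table through SQLite's implicit `rowid`.

```python
import sqlite3

# Hypothetical staging table without a key, loaded with duplicate rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staged_orders (order_id INTEGER, customer_id INTEGER)")
conn.executemany("INSERT INTO staged_orders VALUES (?, ?)", [
    (1, 10),
    (1, 10),
    (2, 20),
    (1, 10),
])

rows_before = conn.execute("SELECT COUNT(*) FROM staged_orders").fetchone()[0]

conn.execute("""
    WITH cte_rownumbers AS (
        -- number the rows inside each group of identical rows
        SELECT rowid AS rid,
               ROW_NUMBER() OVER (
                   PARTITION BY order_id, customer_id ORDER BY rowid
               ) AS rn
        FROM staged_orders
    )
    -- keep row 1 of every group, delete the rest
    DELETE FROM staged_orders
    WHERE rowid IN (SELECT rid FROM cte_rownumbers WHERE rn > 1)
""")

rows_after = conn.execute("SELECT COUNT(*) FROM staged_orders").fetchone()[0]
remaining = conn.execute(
    "SELECT order_id, customer_id FROM staged_orders ORDER BY order_id"
).fetchall()

print(rows_before, rows_after)
print(remaining)
```

Running the `SELECT` part of the CTE on its own first is a cheap way to check which rows would be removed before deleting anything.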

### References
- [Official Documentation from Microsoft](https://docs.microsoft.com/de-de/sql/t-sql/queries/with-common-table-expression-transact-sql?view=sql-server-2017)