# Efficient Queries

### Introduction

When writing code, remember that our priorities are to make it work, make it right (meaning write clean code), and then make it fast.

In this lesson, we'll talk about techniques for optimizing SQL queries.

### Reducing Repetition

The first technique is simply to look for and reduce repetition in SQL queries.  Repetition often creeps in when a query has several CTEs or subqueries.  Different CTEs may select from and join the same tables, when a later CTE could simply reference the earlier one.

```sql
with customer_orders as (
    select * from customers join orders on orders.customer_id = customers.id
),
customer_order_locations as (
    select * from customers
    join orders on orders.customer_id = customers.id
    join locations on customers.location_id = locations.id
)
select * from customer_order_locations
```

In the statement above, we load and join customers and orders a second time in the `customer_order_locations` CTE.  Instead, we could have referenced the `customer_orders` CTE and joined locations from there.
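As a sketch of that refactor, the second CTE below selects from `customer_orders` rather than repeating the join.  It assumes that `location_id` is a column on `customers` and that `orders` has an `id` column.  The aliases `customer_id` and `order_id` are chosen here for illustration, and listing the columns explicitly avoids having two columns named `id` come out of the first CTE.

```sql
with customer_orders as (
    -- join customers and orders once
    select
        customers.id as customer_id,
        customers.location_id,  -- assumed to live on customers
        orders.id as order_id   -- assumed column on orders
    from customers
    join orders on orders.customer_id = customers.id
),
customer_order_locations as (
    -- reuse the CTE above instead of joining customers and orders again
    select
        customer_orders.customer_id,
        customer_orders.order_id,
        locations.id as location_id
    from customer_orders
    join locations on customer_orders.location_id = locations.id
)
select * from customer_order_locations;
```

How much time this saves depends on how the database plans CTEs, but the join logic now lives in one place.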


The repetition above may seem pretty obvious, but it is easy to miss in a long query with many CTEs.

How do you spot it?

Just press `ctrl + f` and search for a table name to see whether the same join or subquery is performed more than once.

### Reducing Columns in Select Clauses

Selecting more columns takes more time than selecting fewer columns, because all of that data needs to be loaded into memory.  Certain columns, like text columns or columns that store images, may contain a lot of data.  It's a good idea to make sure that each column we load into memory is actually needed.
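A small illustration, using the `customers` table from earlier and assuming we only need each customer's id and first name:

```sql
-- loads every column, including any large text columns
select * from customers;

-- loads only the columns we actually need
select id, first_name from customers;
```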

### Avoiding Distinct

Using `distinct` in a select statement can be particularly slow and should be avoided when possible.  Here is one common way a database performs `distinct` under the hood.

1. First, it fetches the candidate rows into a temporary area.
2. Then it sorts the candidate rows.
3. It then moves through the sorted list of candidate rows to skip duplicate rows. Typically, it’ll stream the non-duplicate rows directly to the application at this point. 

> Read more [here](https://www.quora.com/Does-select-distinct-slow-down-a-query)

There are often ways to avoid `select distinct`.  For example, instead of selecting distinct first names, we may be able to group by first_name.  Or we may be able to avoid the duplicate values altogether by removing a join.
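As a sketch of the group by alternative, the two queries below return the same set of first names from the `customers` table.  Many query planners treat them almost identically, so it is worth comparing both on real data before assuming a speedup.

```sql
-- distinct version
select distinct first_name from customers;

-- group by version, returning one row per first name
select first_name from customers group by first_name;
```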

For example, look at the query below:

```sql
select distinct customers.id from customers join orders on orders.customer_id = customers.id
```

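We get duplicate customer ids in the first place because a customer has many orders and we are joining orders.  If we only need the customer ids, we can drop the join and select from customers directly, which removes the need for distinct.  If we need only the customers who have orders, a subquery that checks for a matching order avoids the duplicates as well.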
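Two possible rewrites, sketched under the assumption that `customers.id` is unique (for example, a primary key):

```sql
-- every customer id: no join needed, and id is already unique
select id from customers;

-- only customers with at least one order: exists checks whether a
-- matching order is present, so each customer appears at most once
select customers.id
from customers
where exists (
    select 1 from orders where orders.customer_id = customers.id
);
```

Note that the first rewrite also returns customers without any orders, so it only replaces the original query when that is acceptable.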

### Be Careful with Wildcards

Let's consider the following query.

```sql
select * from customers WHERE first_name LIKE '%than';
```

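This query can take a while to run.  With the wildcard at the start of the pattern, the database cannot use an ordinary index on first_name to narrow the search, so it has to scan every row and move through each string looking for a match at the end.

However, if the wildcard is at the end of the pattern, the query is less costly: the database can check for a match at the start of each string and move on, and it can often use an index to find the matching rows.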

```sql
select * from customers WHERE first_name LIKE 'nath%';
```
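To make the difference concrete, here is a minimal sketch that adds an index on `first_name` and runs both patterns.  The index name `idx_customers_first_name` is hypothetical.  Whether the index is actually used for the prefix search depends on the database and on settings such as collation, so it is worth confirming with the database's query plan output.

```sql
-- hypothetical index name; the column comes from the queries above
create index idx_customers_first_name on customers (first_name);

-- trailing wildcard: the fixed prefix 'nath' can be looked up in the index
select * from customers where first_name like 'nath%';

-- leading wildcard: no fixed prefix, so the index cannot narrow the search
select * from customers where first_name like '%than';
```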

### Remember Indexes and Joins

Finally, remember our tips from other lessons.

* We can speed up our where clauses with indexes.  To make sure those indexes are actually used by our query, avoid wrapping the indexed column in a function in the where clause.

* Be aware of joins.  Reduce the data being joined, either by reducing the number of tables used or by adding a where clause to reduce the number of rows joined.  Remember our tips to speed up joins:
    * Only join on necessary tables.
    * Reduce the data as much as possible before any joins (for example, group by before joining).
    * Join on indexed or integer columns.
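The sketch below illustrates two of these tips with the tables used in this lesson.  It assumes an index exists on `customers.first_name`, and the CTE name `order_counts` is made up for the example.  The first two queries only return the same rows if names are stored with consistent capitalization.

```sql
-- wrapping the column in a function usually prevents a plain index
-- on first_name from being used
select * from customers where lower(first_name) = 'nathan';

-- comparing the bare column lets the database use the index
select * from customers where first_name = 'Nathan';

-- reduce orders to one row per customer before joining
with order_counts as (
    select customer_id, count(*) as order_count
    from orders
    group by customer_id
)
select customers.id, customers.first_name, order_counts.order_count
from customers
join order_counts on order_counts.customer_id = customers.id;
```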
    

To review these tips and see some more, check out the resources below.

### Resources

[Devart SQL queries](https://blog.devart.com/how-to-optimize-sql-query.html)

[More advanced optimization tips](https://help.hcltechsw.com/commerce/7.0.0/com.ibm.commerce.developer.soa.doc/refs/rsdperformanceworkspaces.html)



