# Filtering, sorting and calculating data with SQL

## Basics of Filtering with SQL

* Why filter?
    * Be specific about the data you want to retrieve
    * Reduce the number of records you retrieve
    * Increase query performance
    * Reduce the strain on the client application
    * Governance limitations

* Filtering is done with WHERE Clause Operators:

In [None]:
# WHERE Clause Operators
statement = """

SELECT column_name, column_name
FROM table_name
WHERE column_name operator value;

"""

* Possible operators:
    * = equal
    * <> not equal
    * \> greater than
    * < less than
    * \>= greater than or equal
    * <= less than or equal
    * BETWEEN ... AND
    * IS NULL
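
The operators above can be tried out against a small throwaway database. The following is a minimal sketch using Python's built-in `sqlite3` module; the `Products` rows and the `names` helper are hypothetical and only serve to illustrate `BETWEEN`, `<>`, and `IS NULL`.

```python
import sqlite3

# Hypothetical in-memory table; the rows are illustrative, not real data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Products (ProductName TEXT, UnitPrice REAL, SupplierID INTEGER)"
)
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?)",
    [
        ("Tofu", 23.25, 6),
        ("Konbu", 6.0, 6),
        ("Chai", 18.0, 1),
        ("Mystery Item", None, 2),  # None is stored as NULL
    ],
)


def names(where_clause):
    """Return the ProductName values matching a WHERE clause, sorted by name."""
    sql = f"SELECT ProductName FROM Products WHERE {where_clause} ORDER BY ProductName"
    return [row[0] for row in conn.execute(sql)]


# BETWEEN includes both endpoints.
between_rows = names("UnitPrice BETWEEN 6 AND 18")
# Comparisons never match NULL, so the NULL-priced row is skipped here.
not_equal_rows = names("UnitPrice <> 18")
# IS NULL is the way to select rows with a missing value.
null_rows = names("UnitPrice IS NULL")
```

Note that the row with a NULL price appears only in the `IS NULL` result: a comparison such as `<>` does not match NULL values.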



## Advanced filtering: IN, OR, and NOT

* IN Operator
    * Specifies a set of values to match against
    * Comma delimited list of values
    * Enclosed in ()

In [None]:
# IN Operator example
statement = """

SELECT
ProductID
,UnitPrice
,SupplierID
FROM Products
WHERE SupplierID IN (9, 10, 11);

"""

* OR Operator
    * The DBMS typically will not evaluate the second condition in a WHERE clause if the first condition is met
    * Use for any rows matching the specific conditions

In [None]:
# OR Operator example
statement = """

SELECT
ProductName
,ProductID
,UnitPrice
,SupplierID
FROM Products
WHERE ProductName = 'Tofu' OR ProductName = 'Konbu';

"""

* IN vs. OR
    * IN returns the same rows as a chain of OR conditions on the same column
    * Benefits of IN:
        * Long list of options
        * IN often executes faster than a long chain of ORs (this depends on the DBMS)
        * Don't have to think about the order with IN
        * Can contain another SELECT for subqueries
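
To make the comparison concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `Suppliers` table, its `Country` column, and all rows are assumptions made up for this illustration. It shows an `IN` list, the equivalent `OR` chain, and an `IN` that takes its values from a subquery.

```python
import sqlite3

# Hypothetical tables and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Suppliers (SupplierID INTEGER, Country TEXT);
    CREATE TABLE Products (ProductID INTEGER, SupplierID INTEGER);
    INSERT INTO Suppliers VALUES (9, 'Sweden'), (10, 'Brazil'), (11, 'Germany');
    INSERT INTO Products VALUES (1, 9), (2, 10), (3, 11), (4, 12);
    """
)


def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]


# IN with a comma-delimited list of values.
in_ids = ids(
    "SELECT ProductID FROM Products WHERE SupplierID IN (9, 10) ORDER BY ProductID"
)

# The same filter written as a chain of OR conditions.
or_ids = ids(
    "SELECT ProductID FROM Products "
    "WHERE SupplierID = 9 OR SupplierID = 10 ORDER BY ProductID"
)

# IN with a subquery: the list of values comes from another SELECT.
subquery_ids = ids(
    "SELECT ProductID FROM Products "
    "WHERE SupplierID IN (SELECT SupplierID FROM Suppliers WHERE Country = 'Germany') "
    "ORDER BY ProductID"
)
```

Here `in_ids` and `or_ids` hold the same rows, while `subquery_ids` holds only the products whose supplier is returned by the inner query.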

* Using OR with AND
    * You have to be very careful: AND is evaluated before OR, so you can get different results depending on how you group the conditions. Use parentheses to make the intended order explicit
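
The effect of grouping can be seen in a minimal sketch with Python's built-in `sqlite3` module. The `Products` rows and the `product_ids` helper are hypothetical; the two queries differ only in their parentheses.

```python
import sqlite3

# Hypothetical rows: (ProductID, SupplierID, UnitPrice), for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Products (ProductID INTEGER, SupplierID INTEGER, UnitPrice REAL)"
)
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?)",
    [(1, 9, 20.0), (2, 9, 10.0), (3, 11, 20.0), (4, 11, 10.0)],
)


def product_ids(where_clause):
    """Return the ProductID values matching a WHERE clause, in ascending order."""
    sql = f"SELECT ProductID FROM Products WHERE {where_clause} ORDER BY ProductID"
    return [row[0] for row in conn.execute(sql)]


# Without parentheses, AND binds first:
# SupplierID = 9 OR (SupplierID = 11 AND UnitPrice > 15)
ungrouped = product_ids("SupplierID = 9 OR SupplierID = 11 AND UnitPrice > 15")

# With parentheses, the price filter applies to both suppliers.
grouped = product_ids("(SupplierID = 9 OR SupplierID = 11) AND UnitPrice > 15")
```

In this sketch `ungrouped` also includes product 2, which costs less than 15, because the price condition is attached only to supplier 11.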

In [1]:
# Example of NOT operator
statement = """

SELECT *
FROM Employees
WHERE NOT City='London' AND
NOT City='Seattle';

"""
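
As a minimal sketch of how NOT behaves, the snippet below runs a query of this shape with Python's built-in `sqlite3` module and compares it with a `NOT IN` list. The `Employees` rows and the `FirstName` column are assumptions used only for illustration.

```python
import sqlite3

# Hypothetical rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (FirstName TEXT, City TEXT)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?)",
    [
        ("Nancy", "Seattle"),
        ("Andrew", "Tacoma"),
        ("Steven", "London"),
        ("Janet", "Kirkland"),
    ],
)


def first_names(where_clause):
    """Return the FirstName values matching a WHERE clause, sorted by name."""
    sql = f"SELECT FirstName FROM Employees WHERE {where_clause} ORDER BY FirstName"
    return [row[0] for row in conn.execute(sql)]


# Two negated conditions joined with AND.
not_and = first_names("NOT City = 'London' AND NOT City = 'Seattle'")

# The same filter written as a negated IN list.
not_in = first_names("City NOT IN ('London', 'Seattle')")
```

Both forms exclude the employees in London and Seattle and keep everyone else.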

## Using wildcards in SQL

* What are wildcards?
    * Special character used to match parts of a value
    * Search pattern made from literal text, wildcard characters, or a combination
    * Uses LIKE as an operator (though it is technically a predicate)
    * Can only be used with strings, not with non-text datatypes
    * Helpful for data scientists as they explore string variables

* Using % wildcards
    * **'%Pizza'** - Grabs anything ending with the word Pizza
    * **'Pizza%'** - Grabs anything starting with the word Pizza
    * **'%Pizza%'** - Grabs anything containing the word Pizza, with any text before or after it
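
The three patterns can be compared side by side in a minimal sketch using Python's built-in `sqlite3` module. The `Menu` table, its `ItemName` column, and the rows are hypothetical. One caveat: SQLite's `LIKE` is case-insensitive for ASCII characters by default, which may differ in other DBMSs.

```python
import sqlite3

# Hypothetical table and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Menu (ItemName TEXT)")
conn.executemany(
    "INSERT INTO Menu VALUES (?)",
    [("Cheese Pizza",), ("Pizza Margherita",), ("Best Pizza Place",), ("Pasta",)],
)


def matches(pattern):
    """Return the ItemName values matching a LIKE pattern, sorted by name."""
    sql = "SELECT ItemName FROM Menu WHERE ItemName LIKE ? ORDER BY ItemName"
    return [row[0] for row in conn.execute(sql, (pattern,))]


ends_with = matches("%Pizza")    # Pizza at the end of the value
starts_with = matches("Pizza%")  # Pizza at the start of the value
contains = matches("%Pizza%")    # Pizza anywhere in the value
```

Because `%` also matches zero characters, `'%Pizza%'` returns the values found by the other two patterns as well.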