# Intermediate SQL: Filtering Records

Here you can access every table used in the course. To access each table, you will need to specify the `cinema` schema in your queries (e.g., `cinema.reviews` for the `reviews` `table.`

## Chapter 3: Filtering Numbers
* `WHERE`: allows us to filter to see only rows within a field meeting certain criteria
    * Comparison operators for WHERE: `>`,`<`,`>=`,`<=`,`=`,`<>`
        * `<>` excludes a rows with a certain value (like not equal to)
    * Can use these with strings (country names, position titles, etc. ) or numbers 
        * For strings, have to use '' around item to filter for
* `WHERE` comes before `LIMIT` but after`FROM`

In [27]:
-- With numbers in comparison operator
SELECT title
FROM cinema.films
WHERE release_year > 1960;
-- With string in comparison operator
SELECT title
FROM cinema.films
WHERE country = 'Germany';
-- With `WHERE` and `LIMIT`
SELECT title
FROM cinema.films
WHERE country = 'Germany'
LIMIT 5;

Unnamed: 0,title
0,Metropolis
1,Pandora's Box
2,City of Angels
3,Clay Pigeons
4,Run Lola Run


## Chapter 4: Multiple Criteria
* If we want to filter by multiple field values, use `OR`, `AND`, `BETWEEN`
    * `OR` and `AND` ARE combined with `WHERE` and field must be repeated
    * When combinging filtering condictions, enclose individual clauses in parentheses
    * `BETWEEN` is inclusive and includes end values

In [9]:
-- Using OR 
SELECT title
FROM cinema.films
WHERE release_year = 1994
	OR release_year = 2000;
-- Using AND
SELECT title
FROM cinema.films
WHERE release_year > 1994 
	AND release_year < 2000;
-- Using OR and AND
SELECT title
FROM cinema.films
WHERE (release_year = 1994 OR release_year = 1995)
	AND (certification = 'PG' OR certification = 'R');
-- Using BETWEEN, AND
SELECT title
FROM cinema.films
WHERE release_year
	BETWEEN 1994 AND 2000;
-- ALTERNATIVE:
SELECT title
FROM cinema.films
WHERE release_year >= 1994
	AND release_year <= 2000;
-- Using all of the above
SELECT title
FROM cinema.films
WHERE release_year
	BETWEEN 1994 AND 2000 AND country = 'UK';
-- Select all details for German-language films released after 2000 but before 2010 using only WHERE and AND.
SELECT *
FROM films
WHERE language = 'German'
    AND (release_year > 2000 AND release_year < 2010);
-- Write a query to get the title and release_year of films released in 1990 or 1999, which were in English or Spanish and took in more than $2,000,000 gross
SELECT title, release_year
FROM films
WHERE (release_year = 1990 OR release_year = 1999)
	AND (language = 'English' OR language = 'Spanish') 
	AND gross > 2000000
-- Use BETWEEN with AND on the films database to get the title and release_year of all Spanish-language films released between 1990 and 2000 (inclusive) with budgets over $100 million
SELECT title, release_year
FROM films
WHERE release_year BETWEEN 1990 AND 2000
	AND budget > 100000000
	AND (language = 'Spanish' OR language = 'French');

Unnamed: 0,title
0,Four Weddings and a Funeral
1,The Hudsucker Proxy
2,Dead Man Walking
3,GoldenEye
4,Richard III
...,...
59,Snatch
60,The Claim
61,The Claim
62,The House of Mirth


## Chapter 5: Filtering text
* `WHERE` can also filter text patterns using `LIKE`, `NOT LIKE`, or `IN` using wildcards
    * wildcards:
        * `%` (matches zero, one or many characters in field value listed)
        * `_` (matches single character)
    * `LIKE` is used w/ `WHERE` using a wildcard to identify patterns in text
    * `NOT LIKE` is used w/ `WHERE` to find records that DON'T match specified pattern
    * These are case sensitive
* Wildcards can be placed anywhere and can be combined
* To filter based on many conditions or a range of numbers, can use several `OR`s or use `IN`
    * Enclose various conditions in parentheses

In [21]:
-- Using LIKE with %
SELECT name
FROM cinema.people 
WHERE name LIKE 'Ade%';
-- Using LIKE with _
SELECT name
FROM cinema.people
WHERE name LIKE 'Ev_';
-- Using NOT LIKE
SELECT name
FROM cinema.people
WHERE name NOT LIKE 'A.%';
-- Using wildcards in other positions
SELECT name 
FROM cinema.people
WHERE name LIKE '%r';
SELECT name
FROM cinema.people
WHERE name LIKE '_____t';
-- Using IN
SELECT title
FROM cinema.films
WHERE release_year IN (1920,1930,1940);
SELECT title
FROM cinema.films
WHERE country IN ('Germany','France');
-- Select the names of people whose names have 'r' as the second letter
SELECT name
FROM cinema.people
WHERE name LIKE '_r%';
-- Find the title, certification, and language all films certified NC-17 or R that are in English, Italian, or Greek
SELECT title, certification, language
FROM cinema.films
WHERE certification IN ('NC-17', 'R') 
    AND language IN ('English', 'Greek', 'Italian');
-- CHALLENGE: Find out what unique 90's films we have in our dataset that would be suitable for English-speaking teens
SELECT COUNT (DISTINCT title) AS nineties_english_films_for_teens
FROM films
WHERE release_year BETWEEN 1990 AND 1999
	AND language = 'English'
	AND certification IN ('G', 'PG', 'PG-13');

Unnamed: 0,title,certification,language
0,Psycho,R,English
1,A Fistful of Dollars,R,Italian
2,Rosemary's Baby,R,English
3,The Wild Bunch,R,English
4,Catch-22,R,English
...,...,...,...
2001,The Neon Demon,R,English
2002,The Perfect Match,R,English
2003,The Purge: Election Year,R,English
2004,The Veil,R,English


## Chapter 6: Null Values
* When using `COUNT` can include missing or not include missing values using:
    * to not include missing: `COUNT(field_name)`
    * to include missing: `COUNT(*)`
* Missing values are `null`: no data was inputed
* To see how much data is missing use `IS NULL`
* To see exclusively data that meets a criterion and has a value reported, use `IS NOT NULL`
* Using `COUNT(field_name)` automatically excludes null values

In [26]:
SELECT COUNT(name) AS no_birthdates
FROM cinema.people
WHERE birthdate IS NULL;
-- To filter for field value and exclude missing rows
SELECT COUNT(name) AS count_birthdates
FROM cinema.people
WHERE birthdate IS NOT NULL;
-- List all film titles with missing budgets
SELECT title AS no_budget_info
FROM cinema.films
WHERE budget IS NULL;
-- Count the number of films we have language data for
SELECT COUNT(*) AS count_language_known
FROM cinema.films
WHERE language IS NOT NULL;

Unnamed: 0,count_language_known
0,4968
