# 1. Introduction to Logical Operators
In the previous mission, we covered the basics of databases, SQL, and the SELECT statement. In this mission, we'll explore how to express more complex filtering criteria. We'll continue to work with the data set, recent-grads.csv, which we've loaded into the table recent_grads. Here's a preview:

![](Picture3.png)

We learned about SQL's comparison operators in the last lesson:

- Less than: <
- Less than or equal to: <=
- Greater than: >
- Greater than or equal to: >=
- Equal to: =
- Not equal to: !=

These were useful for expressing our filtering criteria, or condition, in the WHERE statement. But what if we want to use multiple filtering criteria to specify the data we want from the database?

Logical operators are keywords we can use to combine filtering criteria and express more specific conditions. Here are the two basic logical operators we use most often:

- OR (returns rows where either Condition1 or Condition2 is true)
- AND (returns rows where both Condition1 and Condition2 are true)
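
To get a feel for how the two operators combine truth values, here's a minimal sketch that asks SQLite to evaluate AND and OR on literal values. It assumes an in-memory database (so it doesn't touch jobs.db), and the variable names are ours, chosen only for illustration. SQLite represents true as 1 and false as 0.

```python
import sqlite3

# In-memory database: nothing here touches the lesson's jobs.db file.
conn = sqlite3.connect(':memory:')

# SQLite represents true as 1 and false as 0.
and_result = conn.execute('SELECT 1 AND 0;').fetchone()[0]
or_result = conn.execute('SELECT 1 OR 0;').fetchone()[0]

print('true AND false ->', and_result)
print('true OR false  ->', or_result)

conn.close()
```

AND gives 0 here because only one side is true, while OR gives 1 because one true side is enough.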

# 2. Returning Multiple Conditions With AND
The following pseudo-code will help you conceptualize how we use the AND operator with a WHERE statement:

    SELECT [column1, column2,...] FROM [table1]
    WHERE [condition1] AND [condition2]
Now we can write a SQL query that returns all of the female-majority majors with more than 10000 employed graduates.

Let's see what this query looks like:

    SELECT Major,ShareWomen,Employed FROM recent_grads 
    WHERE ShareWomen>0.5 AND Employed>10000;
We want the database to return all of the rows where both conditions are true:

1. ShareWomen > 0.5
2. Employed > 10000
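
To make the "both conditions are true" idea concrete, the sketch below evaluates each condition row by row. It's only an illustration: the toy_grads table and its four rows are made up for this example (they are not from recent_grads), and it runs against an in-memory database.

```python
import sqlite3

conn = sqlite3.connect(':memory:')

# Hypothetical toy table with made-up rows (not the real recent_grads data).
conn.execute('CREATE TABLE toy_grads (Major TEXT, ShareWomen REAL, Employed INTEGER);')
conn.executemany('INSERT INTO toy_grads VALUES (?, ?, ?);', [
    ('MAJOR A', 0.70, 15000),  # passes both conditions
    ('MAJOR B', 0.70, 5000),   # fails the Employed condition
    ('MAJOR C', 0.30, 15000),  # fails the ShareWomen condition
    ('MAJOR D', 0.30, 5000),   # fails both conditions
])

# Each comparison yields 1 (true) or 0 (false); AND is 1 only when both are 1.
checks = conn.execute('''SELECT Major,
       ShareWomen > 0.5,
       Employed > 10000,
       ShareWomen > 0.5 AND Employed > 10000
FROM toy_grads;''').fetchall()

for row in checks:
    print(row)

conn.close()
```

Only MAJOR A ends with a 1 in the last column, so it's the only row a WHERE statement with this AND condition would keep.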

**Instructions**

- Run the query above, which returns all of the female-majority majors with more than 10000 employed graduates.
- Use the LIMIT statement to return just the first 10 results.

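    import sqlite3

    # Connect to the lesson database and create a cursor for running queries.
    jobs = sqlite3.connect('jobs.db')
    c = jobs.cursor()

    # Female-majority majors with more than 10000 employed graduates (first 10 rows).
    for row in c.execute('''SELECT Major,ShareWomen,Employed
    FROM recent_grads
    WHERE ShareWomen>0.5 AND Employed>10000
    LIMIT 10;'''):
        print(row)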

('COMPUTER SCIENCE', 0.578766338, 102087)
('NURSING', 0.896018988, 180903)
('COMPUTER AND INFORMATION SYSTEMS', 0.7077185020000001, 28459)
('INTERNATIONAL RELATIONS', 0.632986838, 21190)
('AGRICULTURE PRODUCTION AND MANAGEMENT', 0.59420765, 12323)
('CHEMISTRY', 0.5051405379999999, 48535)
('BUSINESS MANAGEMENT AND ADMINISTRATION', 0.580948004, 276234)
('BIOCHEMICAL SCIENCES', 0.515406449, 25678)
('HUMAN RESOURCES AND PERSONNEL MANAGEMENT', 0.672161443, 20760)
('MISCELLANEOUS HEALTH MEDICAL PROFESSIONS', 0.702020202, 10076)


# 3. Returning One of Several Conditions With OR

We used the AND operator to specify that our filter needs to pass two Boolean conditions. Both of the conditions had to evaluate to True for the record to appear in the result set. If we wanted to specify a filter that meets either of the conditions instead, we would use the OR operator.

    SELECT [column1, column2,...] FROM [table1]
    WHERE [condition1] OR [condition2]
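
As a rough illustration of how OR differs from AND, the sketch below runs both against the same data. The toy_grads table, its rows, and the majors helper are all hypothetical names invented for this example, and the database lives in memory only.

```python
import sqlite3

conn = sqlite3.connect(':memory:')

# Hypothetical toy table with made-up rows (not the real recent_grads data).
conn.execute('CREATE TABLE toy_grads (Major TEXT, Median INTEGER, Unemployed INTEGER);')
conn.executemany('INSERT INTO toy_grads VALUES (?, ?, ?);', [
    ('MAJOR A', 50000, 200),   # meets both conditions
    ('MAJOR B', 50000, 5000),  # meets only the Median condition
    ('MAJOR C', 8000, 200),    # meets only the Unemployed condition
    ('MAJOR D', 8000, 5000),   # meets neither condition
])

def majors(where_clause):
    """Return the set of toy majors matching the given WHERE condition."""
    sql = 'SELECT Major FROM toy_grads WHERE ' + where_clause + ';'
    return {row[0] for row in conn.execute(sql)}

with_or = majors('Median >= 10000 OR Unemployed <= 1000')
with_and = majors('Median >= 10000 AND Unemployed <= 1000')

print('OR :', sorted(with_or))
print('AND:', sorted(with_and))

conn.close()
```

OR keeps MAJOR A, MAJOR B, and MAJOR C (at least one condition holds for each), while AND keeps only MAJOR A.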

We'll dive straight into a practice problem because we use the OR and AND operators in similar ways.

**Instructions**

Write a SQL query that returns the first 20 majors that either:
- have a Median salary greater than or equal to 10,000, or
- have 1,000 or fewer Unemployed people

We only want to include the following columns in the results, and in this order:
- Major
- Median
- Unemployed

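    def print_sql(sql_command):
        """Run a query on the cursor c (created earlier) and print each row."""
        for row in c.execute(sql_command):
            print(row)

    # First 20 majors with a Median salary of at least 10000 or at most 1000 Unemployed.
    x = '''SELECT Major,Median,Unemployed
    FROM recent_grads
    WHERE Median >= 10000 OR Unemployed <= 1000
    LIMIT 20;'''

    print_sql(x)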

('PETROLEUM ENGINEERING', 110000, 37)
('MINING AND MINERAL ENGINEERING', 75000, 85)
('METALLURGICAL ENGINEERING', 73000, 16)
('NAVAL ARCHITECTURE AND MARINE ENGINEERING', 70000, 40)
('CHEMICAL ENGINEERING', 65000, 1672)
('NUCLEAR ENGINEERING', 65000, 400)
('ACTUARIAL SCIENCE', 62000, 308)
('ASTRONOMY AND ASTROPHYSICS', 62000, 33)
('MECHANICAL ENGINEERING', 60000, 4650)
('ELECTRICAL ENGINEERING', 60000, 3895)
('COMPUTER ENGINEERING', 60000, 2275)
('AEROSPACE ENGINEERING', 60000, 794)
('BIOMEDICAL ENGINEERING', 60000, 1019)
('MATERIALS SCIENCE', 60000, 78)
('ENGINEERING MECHANICS PHYSICS AND SCIENCE', 58000, 23)
('BIOLOGICAL ENGINEERING', 57100, 589)
('INDUSTRIAL AND MANUFACTURING ENGINEERING', 57000, 699)
('GENERAL ENGINEERING', 56000, 2859)
('ARCHITECTURAL ENGINEERING', 54000, 170)
('COURT REPORTING', 54000, 11)


# 4. Grouping Operators With Parentheses

There's a certain class of questions that we can't answer using only the techniques we've learned so far. For example, if we wanted to write a query that returned all Engineering majors that either had mostly female graduates or an unemployment rate below 5.1%, we would need to use parentheses to express this more complex logic.

The three raw conditions we'll need are:

    Major_category = 'Engineering'
    ShareWomen > 0.5
    Unemployment_rate < 0.051

Here's what the SQL query looks like using parentheses:

    select Major, Major_category, ShareWomen, Unemployment_rate
    from recent_grads
    where (Major_category = 'Engineering') and (ShareWomen > 0.5 or Unemployment_rate < 0.051);

The first thing you may notice is that we didn't capitalize any of the operators or statements in the query. SQL's built-in keywords are case-insensitive, which means we don't have to capitalize operators like AND or statements like SELECT.
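
Here's a small sketch of that case-insensitivity, again using a made-up in-memory toy_grads table rather than the lesson's data. It also shows one caveat we're confident holds for SQLite's default settings: the keywords are case-insensitive, but text values compared with = are not.

```python
import sqlite3

conn = sqlite3.connect(':memory:')

# Hypothetical toy table with made-up rows (not the real recent_grads data).
conn.execute('CREATE TABLE toy_grads (Major TEXT, ShareWomen REAL);')
conn.executemany('INSERT INTO toy_grads VALUES (?, ?);', [
    ('MAJOR A', 0.7),
    ('MAJOR B', 0.3),
])

# Same query, written with upper-case and lower-case keywords.
upper = conn.execute('SELECT Major FROM toy_grads WHERE ShareWomen > 0.5;').fetchall()
lower = conn.execute('select Major from toy_grads where ShareWomen > 0.5;').fetchall()
print(upper == lower)

# Keywords ignore case, but text values don't (with SQLite's default collation).
exact = conn.execute("select Major from toy_grads where Major = 'MAJOR A';").fetchall()
wrong_case = conn.execute("select Major from toy_grads where Major = 'major a';").fetchall()
print(exact, wrong_case)

conn.close()
```

The two keyword styles return identical rows, while the lower-case text value 'major a' matches nothing.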

The second thing you may notice is how we enclosed the logic we wanted to be evaluated together in parentheses. This is very similar to how we group mathematical calculations together in a particular order. The parentheses make it explicitly clear to the database that we want all of the rows where both of the parenthesized expressions evaluate to True:

    (Major_category = 'Engineering') -> True or False
    (ShareWomen > 0.5 or Unemployment_rate < 0.051) -> True or False

If we had written the where statement without any parentheses, the database wouldn't guess our intentions. SQL evaluates and before or, so it would actually execute the following query instead:

    where (Major_category = 'Engineering' and ShareWomen > 0.5) or (Unemployment_rate < 0.051)

Leaving the parentheses out means the and is applied first, no matter where it appears in the statement. The query would then also return every non-Engineering major with an unemployment rate below 5.1%, which isn't the data we want. Now let's run our intended query and see the results!
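
To see the difference the parentheses make, here's a minimal sketch that runs the grouped and ungrouped versions of the condition side by side. The toy_grads table and its four rows are made up for this example (one non-Engineering row is included on purpose), and the database is in memory only.

```python
import sqlite3

conn = sqlite3.connect(':memory:')

# Hypothetical toy table with made-up rows (not the real recent_grads data).
conn.execute('''CREATE TABLE toy_grads
    (Major TEXT, Major_category TEXT, ShareWomen REAL, Unemployment_rate REAL);''')
conn.executemany('INSERT INTO toy_grads VALUES (?, ?, ?, ?);', [
    ('ENG 1', 'Engineering', 0.6, 0.09),  # Engineering, mostly women
    ('ENG 2', 'Engineering', 0.2, 0.03),  # Engineering, low unemployment
    ('ENG 3', 'Engineering', 0.2, 0.09),  # Engineering, meets neither
    ('ART 1', 'Arts', 0.6, 0.03),         # not Engineering, meets both
])

# With parentheses: Engineering AND (either of the other two conditions).
grouped = {row[0] for row in conn.execute('''select Major from toy_grads
where (Major_category = 'Engineering') and (ShareWomen > 0.5 or Unemployment_rate < 0.051);''')}

# Without parentheses: "and" is applied before "or".
ungrouped = {row[0] for row in conn.execute('''select Major from toy_grads
where Major_category = 'Engineering' and ShareWomen > 0.5 or Unemployment_rate < 0.051;''')}

print('grouped  :', sorted(grouped))
print('ungrouped:', sorted(ungrouped))

conn.close()
```

The grouped query returns only ENG 1 and ENG 2. The ungrouped one also lets ART 1 through, because its low unemployment rate satisfies the or on its own.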

**Instructions**
- Run the query we explored above, which returns all Engineering majors that:
    - either had mostly women graduates
    - or had an unemployment rate below 5.1%, which was the rate in August 2015
- We're interested in returning the Major, Major_category, ShareWomen, and Unemployment_rate columns.

    # Engineering majors with mostly women graduates or an unemployment rate below 5.1%.
    sql = '''select Major, Major_category, ShareWomen, Unemployment_rate
    from recent_grads
    where (Major_category = 'Engineering') and (ShareWomen > 0.5 or Unemployment_rate < 0.051);'''

    print_sql(sql)

('PETROLEUM ENGINEERING', 'Engineering', 0.120564344, 0.018380527)
('METALLURGICAL ENGINEERING', 'Engineering', 0.153037383, 0.024096386)
('NAVAL ARCHITECTURE AND MARINE ENGINEERING', 'Engineering', 0.107313196, 0.050125313)
('MATERIALS SCIENCE', 'Engineering', 0.310820285, 0.023042836)
('ENGINEERING MECHANICS PHYSICS AND SCIENCE', 'Engineering', 0.183985189, 0.006334343)
('INDUSTRIAL AND MANUFACTURING ENGINEERING', 'Engineering', 0.34347321799999997, 0.042875544)
('MATERIALS ENGINEERING AND MATERIALS SCIENCE', 'Engineering', 0.292607004, 0.027788805)
('ENVIRONMENTAL ENGINEERING', 'Engineering', 0.558548009, 0.093588575)
('INDUSTRIAL PRODUCTION TECHNOLOGIES', 'Engineering', 0.75047259, 0.028308097)
('ENGINEERING AND INDUSTRIAL MANAGEMENT', 'Engineering', 0.174122505, 0.03365166)


# 5. Practice Grouping Operators

In this step, you'll practice grouping operators to express more complex logic.

**Instructions**

Find all majors that meet all of the following criteria:
- Major_category of Business or Arts or Health
- Employed students greater than 20,000 or Unemployment_rate below 5.1%

We're only interested in the following columns (in the following order):
- Major
- Major_category
- Employed
- Unemployment_rate

Return all of the results (don't apply a limit).

    # Business, Arts, or Health majors with more than 20000 employed or unemployment below 5.1%.
    x = '''select Major, Major_category, Employed, Unemployment_rate
    from recent_grads
    where (Major_category = 'Business' or Major_category = 'Arts' or Major_category = 'Health')
    and (Employed > 20000 or Unemployment_rate < 0.051);'''

    print_sql(x)

('OPERATIONS LOGISTICS AND E-COMMERCE', 'Business', 10027, 0.047858702999999995)
('NURSING', 'Health', 180903, 0.04486272400000001)
('FINANCE', 'Business', 145696, 0.060686356)
('ACCOUNTING', 'Business', 165527, 0.069749014)
('MEDICAL TECHNOLOGIES TECHNICIANS', 'Health', 13150, 0.03698279)
('MEDICAL ASSISTING SERVICES', 'Health', 9168, 0.042506527)
('GENERAL BUSINESS', 'Business', 190183, 0.072861468)
('BUSINESS MANAGEMENT AND ADMINISTRATION', 'Business', 276234, 0.07221834099999999)
('MARKETING AND MARKETING RESEARCH', 'Business', 178862, 0.061215064000000007)
('HUMAN RESOURCES AND PERSONNEL MANAGEMENT', 'Business', 20760, 0.059569649)
('COMMERCIAL ART AND GRAPHIC DESIGN', 'Arts', 83483, 0.096797577)
('TREATMENT THERAPY PROFESSIONS', 'Health', 37861, 0.059821207)
('HOSPITALITY MANAGEMENT', 'Business', 36728, 0.061169193)
('GENERAL MEDICAL AND HEALTH SERVICES', 'Health', 24406, 0.082101621)
('FILM VIDEO AND PHOTOGRAPHIC ARTS', 'Arts', 31433, 0.10577224)
('MUSIC', 'Arts', 47662, 0.07595

# 6. Order Results With ORDER BY

The database has been ordering all of our results by the Rank column, because that's how the original data set ordered the data. This may not make sense for all queries, though. SQL comes with an ORDER BY statement that allows us to specify how we want the database to order our results. To use the ORDER BY statement, we need to specify the column we want to order the results by, and whether we want to order them in ascending (low to high) or descending (high to low) order.

    SELECT [column1, column2,...] FROM [table1]
    WHERE [conditions]..
    ORDER BY column1 [ASC or DESC]

We use ASC to order from low to high, and DESC to order from high to low. SQL uses the standard methods of ordering -- alphabetically for text fields and numerically for numeric fields. This means that if we order by a text field in descending order, the results will be in reverse alphabetical order.

The following code selects the Employed column, orders it in ascending order (low to high), and limits the results to the first 10:

    select Employed
    from recent_grads
    order by Employed asc
    limit 10;
    
This query returns the lowest 10 values in the Employed column. First, it puts the values in Employed in ascending order, then returns the first 10 values under the new ordering.
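
The sketch below tries the same pattern on a tiny in-memory table, so the effect of asc and desc is easy to check by eye. The toy_grads table and its five rows are made up for this example and are not part of recent_grads.

```python
import sqlite3

conn = sqlite3.connect(':memory:')

# Hypothetical toy table with made-up rows, inserted in no particular order.
conn.execute('CREATE TABLE toy_grads (Major TEXT, Employed INTEGER);')
conn.executemany('INSERT INTO toy_grads VALUES (?, ?);', [
    ('MAJOR C', 500),
    ('MAJOR A', 100),
    ('MAJOR E', 300),
    ('MAJOR B', 200),
    ('MAJOR D', 400),
])

# Lowest three values: sort ascending first, then keep the first three rows.
lowest = [row[0] for row in conn.execute(
    'select Employed from toy_grads order by Employed asc limit 3;')]

# Highest three values: same idea, sorted descending.
highest = [row[0] for row in conn.execute(
    'select Employed from toy_grads order by Employed desc limit 3;')]

# Text columns sort alphabetically, so desc gives reverse alphabetical order.
reverse_alpha = [row[0] for row in conn.execute(
    'select Major from toy_grads order by Major desc limit 3;')]

print(lowest)
print(highest)
print(reverse_alpha)

conn.close()
```

This prints [100, 200, 300], then [500, 400, 300], then ['MAJOR E', 'MAJOR D', 'MAJOR C'].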

**Instructions**

- Return the first 10 values in the Major column in reverse alphabetical order.

    # First 10 majors in reverse alphabetical order.
    x = '''select Major
    from recent_grads
    order by Major desc
    limit 10;'''

    print_sql(x)

('ZOOLOGY',)
('VISUAL AND PERFORMING ARTS',)
('UNITED STATES HISTORY',)
('TREATMENT THERAPY PROFESSIONS',)
('TRANSPORTATION SCIENCES AND TECHNOLOGIES',)
('THEOLOGY AND RELIGIOUS VOCATIONS',)
('TEACHER EDUCATION: MULTIPLE LEVELS',)
('STUDIO ARTS',)
('STATISTICS AND DECISION SCIENCE',)
('SPECIAL NEEDS EDUCATION',)


# 7. Order Results Based on Multiple Columns

SQL also allows us to specify multiple columns in the ORDER BY statement. If multiple rows have the same values in one column, for example, we can order by that column first, then by a different column. You may have done something similar with a Microsoft Excel spreadsheet.

Here's what the pseudocode for this looks like:

    select [column1, column2..]
    from table_name
    order by column1 [asc or desc], column2 [asc or desc]
    
Ordering by multiple columns is especially useful when working with people's names, because databases often have separate columns for first and last names. We can specify that we want to order or alphabetize query results by Last Name and First Name. After alphabetizing all last names, the database will alphabetize all rows that have the same values for Last Name by First Name.

    Last Name    First Name
    Khan         Sal
    Khan         Tony
    Prescot      Pete
    Prescot      Russ
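
Here's a minimal sketch of that name example. It assumes a hypothetical people table with last_name and first_name columns (names we made up for illustration), filled with the four names above in scrambled order.

```python
import sqlite3

conn = sqlite3.connect(':memory:')

# Hypothetical table; the rows are the four names above, inserted out of order.
conn.execute('CREATE TABLE people (last_name TEXT, first_name TEXT);')
conn.executemany('INSERT INTO people VALUES (?, ?);', [
    ('Prescot', 'Russ'),
    ('Khan', 'Tony'),
    ('Prescot', 'Pete'),
    ('Khan', 'Sal'),
])

# Sort by last name first; first name only breaks ties between equal last names.
both_asc = conn.execute('''select last_name, first_name
from people
order by last_name asc, first_name asc;''').fetchall()

# Each column gets its own direction: last names A to Z, first names Z to A.
mixed = conn.execute('''select last_name, first_name
from people
order by last_name asc, first_name desc;''').fetchall()

for row in both_asc:
    print(row)
print(mixed)

conn.close()
```

The first query reproduces the table above. In the second, the Khan rows still come before the Prescot rows, but Tony now precedes Sal and Russ precedes Pete.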

Now it's your turn!

**Instructions**

Write a query that orders the majors by Major in ascending order, then by Median salary in descending order. We're interested in selecting only these columns, in the following order:
- Major_category
- Median
- Major

Limit the query to just the first 10 results.

    # Order by Major (A to Z), breaking ties by Median salary (high to low).
    x = '''select Major_category, Median, Major
    from recent_grads
    order by Major asc, Median desc
    limit 10;'''

    print_sql(x)

('Business', 45000, 'ACCOUNTING')
('Business', 62000, 'ACTUARIAL SCIENCE')
('Communications & Journalism', 35000, 'ADVERTISING AND PUBLIC RELATIONS')
('Engineering', 60000, 'AEROSPACE ENGINEERING')
('Agriculture & Natural Resources', 40000, 'AGRICULTURAL ECONOMICS')
('Agriculture & Natural Resources', 40000, 'AGRICULTURE PRODUCTION AND MANAGEMENT')
('Agriculture & Natural Resources', 30000, 'ANIMAL SCIENCES')
('Humanities & Liberal Arts', 28000, 'ANTHROPOLOGY AND ARCHEOLOGY')
('Computers & Mathematics', 45000, 'APPLIED MATHEMATICS')
('Engineering', 54000, 'ARCHITECTURAL ENGINEERING')


# 8. Next Steps
This lesson gave you some practice with writing and running SQL queries. The next mission in this course is a challenge that will give you an opportunity to apply what you've learned so far.