### Order By

You'll learn how to **change the order** of your results using the ORDER BY clause.

**ORDER BY** is usually the *last clause in your query*, and it sorts the results returned by the rest of your query.

Let's look at some examples using this table. Note that the rows are not sorted by the ID column:

|  ID   |        Name        | Animal |
| :---: | :----------------: | :----: |
|   1   | Dr. Harris Bonkers | Rabbit |
|   4   |        Tom         |  Cat   |
|   2   |        Moon        |  Dog   |
|   3   |       Ripley       |  Cat   |


In [None]:
# Select all columns and order the rows by ID (ascending by default)
# Note: BigQuery table names are quoted with backticks, not single quotes
QUERY = """
    SELECT *
    FROM `bigquery-public-data.pet_records.pets`
    ORDER BY ID
"""
# It also works with string columns, which are sorted alphabetically
QUERY = """
    SELECT *
    FROM `bigquery-public-data.pet_records.pets`
    ORDER BY Animal
"""
# You can reverse the order using the DESC keyword (short for 'descending')
QUERY = """
    SELECT *
    FROM `bigquery-public-data.pet_records.pets`
    ORDER BY Animal DESC
"""

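To try the three ORDER BY variants without a BigQuery account, here is a minimal sketch that loads the example table into an in-memory SQLite database. SQLite is only a stand-in for BigQuery here (an assumption for illustration), and the `run` helper is hypothetical; the ORDER BY behavior shown is the same for these simple cases.

```python
import sqlite3

# In-memory SQLite database standing in for the BigQuery table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pets (ID INTEGER, Name TEXT, Animal TEXT)")
conn.executemany(
    "INSERT INTO pets VALUES (?, ?, ?)",
    [
        (1, "Dr. Harris Bonkers", "Rabbit"),
        (4, "Tom", "Cat"),
        (2, "Moon", "Dog"),
        (3, "Ripley", "Cat"),
    ],
)


def run(sql):
    """Hypothetical helper: run a query and return all rows as tuples."""
    return conn.execute(sql).fetchall()


by_id = run("SELECT * FROM pets ORDER BY ID")
by_animal = run("SELECT * FROM pets ORDER BY Animal")
by_animal_desc = run("SELECT * FROM pets ORDER BY Animal DESC")

print([row[0] for row in by_id])           # [1, 2, 3, 4]
print([row[2] for row in by_animal])       # ['Cat', 'Cat', 'Dog', 'Rabbit']
print([row[2] for row in by_animal_desc])  # ['Rabbit', 'Dog', 'Cat', 'Cat']
```
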
### Dates
https://www.kaggle.com/dansbecker/order-by#Dates

Next, we'll talk about dates, because they come up very frequently in real-world databases. There are two ways that dates can be stored in BigQuery: as a DATE or as a DATETIME.

The DATE format has the year first, then the month, and then the day. It looks like this:

> YYYY-[M]M-[D]D \
> **YYYY:** Four-digit year \
> **[M]M:** One or two digit month \
> **[D]D:** One or two digit day
> 
> So **2019-01-10** is interpreted as **January 10, 2019.**

The DATETIME format is like the DATE format, but with the time added at the end.
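
As a rough illustration of these formats, the sketch below parses date strings with Python's standard `datetime` module. This is not BigQuery's own parser, and the DATETIME value is a made-up example; it only shows that one- or two-digit months and days describe the same date, and that a DATETIME is a date plus a time of day.

```python
from datetime import datetime

# strptime accepts one- or two-digit months and days,
# much like the YYYY-[M]M-[D]D pattern described above.
d1 = datetime.strptime("2019-01-10", "%Y-%m-%d").date()
d2 = datetime.strptime("2019-1-10", "%Y-%m-%d").date()

# Hypothetical DATETIME value: the same date with a time appended.
dt = datetime.strptime("2019-01-10 14:30:00", "%Y-%m-%d %H:%M:%S")

print(d1 == d2)                   # True
print(d1.year, d1.month, d1.day)  # 2019 1 10
print(dt.date() == d1, dt.hour)   # True 14
```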

#### EXTRACT
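Often you'll need to get just one part of a date, such as the day or the year. You can do this with **EXTRACT**. Consider the same table with an added Date column: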

|  ID   |        Name        | Animal |    Date    |
| :---: | :----------------: | :----: | :--------: |
|   1   | Dr. Harris Bonkers | Rabbit | 2022-02-01 |
|   4   |        Tom         |  Cat   | 2022-02-02 |
|   2   |        Moon        |  Dog   | 2022-02-03 |
|   3   |       Ripley       |  Cat   | 2022-02-04 |

In [1]:
# Instead of DAY you can use other date parts such as WEEK, MONTH, QUARTER or YEAR; check the docs for the full list
QUERY = """
    SELECT Name, EXTRACT(DAY FROM Date) AS Day
    FROM `bigquery-public-data.pet_records.pets`
"""

And you'll get something like this (EXTRACT returns the day as an integer):

|        Name        |  Day  |
| :----------------: | :---: |
| Dr. Harris Bonkers |   1   |
|        Tom         |   2   |
|        Moon        |   3   |
|       Ripley       |   4   |
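
To make the idea concrete outside SQL, here is a small Python sketch of what extracting the day does for each row. It uses the example rows from the table above; `date.fromisoformat(...).day` is a Python stand-in for `EXTRACT(DAY FROM Date)`, not how BigQuery implements it.

```python
from datetime import date

# (Name, Date) pairs taken from the example table.
rows = [
    ("Dr. Harris Bonkers", "2022-02-01"),
    ("Tom", "2022-02-02"),
    ("Moon", "2022-02-03"),
    ("Ripley", "2022-02-04"),
]

# Parse each YYYY-MM-DD string and keep only the day of the month.
days = [(name, date.fromisoformat(d).day) for name, d in rows]

for name, day in days:
    print(name, day)
```

Swapping `.day` for `.month` or `.year` mirrors using a different date part in EXTRACT.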

##### Example with real dataset

In [3]:
from google.cloud import bigquery
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file("secrets.json")
client = bigquery.Client(credentials=credentials)
dataset = client.get_dataset('bigquery-public-data.nhtsa_traffic_fatalities')
table_ref = dataset.table("accident_2015")
table = client.get_table(table_ref)
df = client.list_rows(table, max_results=5).to_dataframe()
df

Unnamed: 0,state_number,state_name,consecutive_number,number_of_vehicle_forms_submitted_all,number_of_motor_vehicles_in_transport_mvit,number_of_parked_working_vehicles,number_of_forms_submitted_for_persons_not_in_motor_vehicles,number_of_persons_not_in_motor_vehicles_in_transport_mvit,number_of_persons_in_motor_vehicles_in_transport_mvit,number_of_forms_submitted_for_persons_in_motor_vehicles,...,minute_of_ems_arrival_at_hospital,related_factors_crash_level_1,related_factors_crash_level_1_name,related_factors_crash_level_2,related_factors_crash_level_2_name,related_factors_crash_level_3,related_factors_crash_level_3_name,number_of_fatalities,number_of_drunk_drivers,timestamp_of_crash
0,19,Iowa,190204,1,1,0,0,0,1,1,...,2,0,,0,,0,,1,1,2015-09-11 20:20:00+00:00
1,19,Iowa,190233,1,1,0,0,0,1,1,...,88,0,,0,,0,,1,1,2015-11-01 00:30:00+00:00
2,19,Iowa,190179,1,1,0,0,0,2,2,...,1,0,,0,,0,,1,0,2015-05-04 16:18:00+00:00
3,19,Iowa,190248,1,1,0,0,0,4,4,...,99,0,,0,,0,,2,0,2015-11-17 12:26:00+00:00
4,19,Iowa,190231,1,1,0,0,0,1,1,...,88,0,,0,,0,,1,0,2015-10-31 04:49:00+00:00


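Let's use the table to **determine how the number of accidents varies with the day of the week**. Since:
- the **consecutive_number** column contains a unique ID for each accident, and
- the **timestamp_of_crash** column contains the date of the accident in DATETIME format,

we can EXTRACT the day of the week from **timestamp_of_crash**, GROUP BY that value, and COUNT the **consecutive_number** column to get the number of accidents for each day. The ORDER BY clause then sorts the result so that the days with the most accidents come first.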

In [5]:
# Query to find out the number of accidents for each day of the week
query = """
    SELECT COUNT(consecutive_number) AS num_accidents, 
    EXTRACT(DAYOFWEEK FROM timestamp_of_crash) AS day_of_week
    FROM `bigquery-public-data.nhtsa_traffic_fatalities.accident_2015`
    GROUP BY day_of_week
    ORDER BY num_accidents DESC
"""
query_job = client.query(query)
accidents_by_day = query_job.to_dataframe()
accidents_by_day

Unnamed: 0,num_accidents,day_of_week
0,5659,7
1,5298,1
2,4916,6
3,4460,5
4,4182,4
5,4038,2
6,3985,3
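
The `day_of_week` values are easier to read as names. In BigQuery, DAYOFWEEK returns an integer from 1 (Sunday) to 7 (Saturday), so a small lookup is enough. The sketch below copies the counts from the output above into a plain list; the `DAY_NAMES` and `labeled` names are just illustrative.

```python
# BigQuery's DAYOFWEEK returns 1 for Sunday through 7 for Saturday.
DAY_NAMES = [
    "Sunday", "Monday", "Tuesday", "Wednesday",
    "Thursday", "Friday", "Saturday",
]

# (num_accidents, day_of_week) pairs copied from the query output above.
results = [
    (5659, 7), (5298, 1), (4916, 6), (4460, 5),
    (4182, 4), (4038, 2), (3985, 3),
]

# Replace each day number with its name, keeping the sorted order.
labeled = [(DAY_NAMES[day - 1], count) for count, day in results]

for name, count in labeled:
    print(f"{name}: {count}")
```

Read this way, the most accidents in this table fall on Saturday and Sunday, and the fewest on Tuesday.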
