# Calculations with SQL

## Scenario

I will analyze subway ridership data to help improve the quality of New York City's public transportation.

## Dataset

I will use the BigQuery public dataset called `new_york_subway` with the full path `bigquery-public-data.new_york_subway`. In this dataset, I will specifically use the table `subway_ridership_2013_present` for my analysis.

## Query: Explore the change in weekly ridership from 2013 to 2014

By subtracting the number of riders in 2013 from the number of riders in 2014, I can determine whether ridership increased or decreased at each station. I execute the following query:

In [None]:
SELECT
    station_name,
    ridership_2013,
    ridership_2014,
    -- Subtract 2013 from 2014 to determine change
    ridership_2014 - ridership_2013 AS ridership_change
FROM
    bigquery-public-data.new_york_subway.subway_ridership_2013_present;

The query produces a table showing the ridership totals for each station in 2013 and 2014, as well as a column indicating the amount by which ridership increased or decreased. The station *Times Sq - 42 St / 42 St* had an increase of 7212 riders in 2014, while the station *4 Av* had 321 fewer riders in 2014, as shown below:

![Query results for 2013 to 2014 ridership change](c05m04-query-2013-2014-change.png 'Query results for 2013 to 2014 ridership change')
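
The raw difference can be hard to compare across stations of very different sizes. The following is a minimal sketch of one possible extension, assuming the ridership columns are numeric: it ranks stations by their change and adds a percent change. The alias `pct_change` is my own name for illustration and is not a column in the table.

```sql
SELECT
    station_name,
    -- Same subtraction as above: positive means more riders in 2014
    ridership_2014 - ridership_2013 AS ridership_change,
    -- SAFE_DIVIDE returns NULL instead of an error if ridership_2013 is 0
    -- (pct_change is a hypothetical alias chosen for this sketch)
    ROUND(SAFE_DIVIDE(ridership_2014 - ridership_2013, ridership_2013) * 100, 2) AS pct_change
FROM
    `bigquery-public-data.new_york_subway.subway_ridership_2013_present`
ORDER BY
    -- Largest increases first; largest decreases end up at the bottom
    ridership_change DESC;
```

Sorting by `ridership_change` puts the stations with the largest gains at the top of the result, so individual rows no longer have to be found by scrolling.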

## Query: Average weekly ridership from 2013 to 2016

To calculate the average weekly ridership for 2013 through 2016, I combine multiple arithmetic operators: I add the ridership totals for the four years together and divide the sum by 4. The parentheses ensure that the addition is performed before the division. The query I execute is as follows:

In [None]:
SELECT
    station_name,
    ridership_2013,
    ridership_2014,
    ridership_2015,
    ridership_2016,
    /* Combine multiple arithmetic operators to calculate
       the average weekly ridership from 2013 to 2016 */
    (ridership_2013 + ridership_2014 + ridership_2015 + ridership_2016) / 4 AS ridership_average
FROM
    bigquery-public-data.new_york_subway.subway_ridership_2013_present;

The query output is a table showing the ridership at each station for 2013, 2014, 2015 and 2016, along with the average ridership over the four years. With the trends clearly visible, I can conclude that the station *Myrtle - Wyckoff Avs* had a steady year-on-year increase in weekly ridership from 2013 to 2016. Interestingly, from 2014 to 2016, weekly ridership at *Myrtle - Wyckoff Avs* consistently exceeded the 2013-2016 average in the `ridership_average` column. I include a screenshot of the query output below:

![Query results average ridership from 2013 to 2016](c05m04-query-2013-2016-avg.png 'Query results average ridership from 2013 to 2016')
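
The "steady increase" observation can also be checked with comparison operators rather than by reading the table. The following is a minimal sketch, assuming the four ridership columns are numeric. It keeps only stations whose ridership rose in every consecutive year. Rows with a NULL in any of the compared columns are dropped, because a comparison against NULL does not evaluate to true.

```sql
SELECT
    station_name,
    ridership_2013,
    ridership_2014,
    ridership_2015,
    ridership_2016,
    -- Same four-year average as in the query above
    (ridership_2013 + ridership_2014 + ridership_2015 + ridership_2016) / 4 AS ridership_average
FROM
    `bigquery-public-data.new_york_subway.subway_ridership_2013_present`
WHERE
    -- Each year must exceed the year before it
    ridership_2014 > ridership_2013
    AND ridership_2015 > ridership_2014
    AND ridership_2016 > ridership_2015;
```

If the conclusion above holds, *Myrtle - Wyckoff Avs* should appear in the result of this query.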