# Use subqueries to refine data

## Scenario

I work for an organization that is responsible for the safety, efficiency, and maintenance of transportation systems in my city. I have been asked to gather information about the use of Citi Bikes in New York City. This information will be used to convince the mayor and other city officials to invest in a bike-sharing and rental system that helps push the city toward its environmental and sustainability goals.

For this purpose, I will use three SQL queries to:

- find the average trip duration by station,
- compare each station's average trip duration with the overall average, and
- determine the five stations with the longest mean trip durations.

## Dataset

I will obtain this information using the BigQuery public dataset `new_york_citibike` with the full path `bigquery-public-data.new_york_citibike`. This dataset has the following tables:

- `citibike_stations`
- `citibike_trips`
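
Before writing the queries, it can help to check which columns `citibike_trips` exposes. The following is a minimal sketch, assuming the dataset's `INFORMATION_SCHEMA.COLUMNS` view is readable from my project. The queries in this notebook rely only on the `start_station_id` and `tripduration` columns.

```sql
-- Sketch: list the columns of citibike_trips with their data types.
-- Assumes read access to the dataset's INFORMATION_SCHEMA views.
SELECT
    column_name,
    data_type
FROM
    `bigquery-public-data`.new_york_citibike.INFORMATION_SCHEMA.COLUMNS
WHERE
    table_name = 'citibike_trips'
ORDER BY
    ordinal_position;
```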

## Query: Average trip duration by station

To find the average trip duration by station, I run the following query on the `citibike_trips` table. It uses a subquery in the FROM clause:

In [None]:
/* Outer query: return each start station id together with the
   average duration of the trips that started from that station */
SELECT
    avg_trip_duration.start_station_id,
    avg_trip_duration.avg_duration

FROM
    -- Subquery: average trip duration per start station
    (
        SELECT
            start_station_id,
            ROUND(AVG(tripduration), 2) AS avg_duration
        FROM
            bigquery-public-data.new_york_citibike.citibike_trips
        GROUP BY
            start_station_id
    ) AS avg_trip_duration -- Subquery alias

-- Longest average trip duration first
ORDER BY
    avg_duration DESC;

The query returns every start station ID together with the average duration of the trips that started from that station, sorted from the longest average to the shortest. Below is a preview of the output:

![Average trip duration by station](c05m03-query-avg-trip-duration.png 'Average trip duration by station')
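
The same result can also be written with a common table expression (a WITH clause) instead of a subquery in the FROM clause, which some readers find easier to follow. The sketch below is an alternative phrasing, not the query I ran above. The CTE name `avg_trip_duration` simply reuses the subquery alias.

```sql
-- Alternative sketch: the FROM-clause subquery expressed as a CTE
WITH avg_trip_duration AS (
    SELECT
        start_station_id,
        ROUND(AVG(tripduration), 2) AS avg_duration
    FROM
        `bigquery-public-data.new_york_citibike.citibike_trips`
    GROUP BY
        start_station_id
)
SELECT
    start_station_id,
    avg_duration
FROM
    avg_trip_duration
ORDER BY
    avg_duration DESC;
```

Both forms compute the per-station average once and then read from that intermediate result. The CTE only moves the definition above the outer query.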

## Query: Compare trip duration by station

To compare the average trip duration per station to the overall average trip duration from all stations