# Create temporary tables

## Scenario

A bikeshare company has reached a milestone, and their marketing team wants to write a blog post announcing the popularity of their most-used bike. They want to include the name of the station where that bike can most likely be found, so they asked me to determine which bike is used most often.

For the purpose of my analysis, I will assume that most-used bike means the bike with the longest total trip duration and not the bike with the most trips.
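To see how much this assumption matters, the two definitions can be compared side by side. The query below is a minimal sketch, not part of the original analysis: it uses the same `bikeshare_trips` table and `duration_minutes` column that appear later in this write-up, and the aliases `total_minutes` and `trip_count` are names chosen here for illustration. The backticks around the table path are an optional quoting style.

```sql
-- Sketch: compare two possible definitions of "most-used" for each bike
SELECT
  bike_id,
  SUM(duration_minutes) AS total_minutes,  -- definition assumed in this analysis
  COUNT(*) AS trip_count                   -- alternative definition: number of trips
FROM
  `bigquery-public-data.austin_bikeshare.bikeshare_trips`
GROUP BY
  bike_id
-- Rank by the assumed definition; swap in trip_count to rank by the alternative
ORDER BY
  total_minutes DESC
LIMIT 5;
```

If the same bike tops the list under both orderings, the choice of definition does not change the answer; if not, the blog post should state which definition it uses.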

## Dataset

I will use the BigQuery public dataset called `austin_bikeshare` with the full path `bigquery-public-data.austin_bikeshare`, specifically the `bikeshare_trips` table.

## Create a temporary table

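To find the ID number of the bike with the longest total trip time in minutes, I will create a temporary table that sums the trip minutes for each bike and keeps only the bike with the highest total. The temporary table is defined with a `WITH` clause, so it exists only for the duration of the query that uses it. It is set up as follows: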

In [None]:
-- Temporary table to find bike with longest total trip time
WITH longest_used_bike AS (
    SELECT
      bike_id,
      SUM(duration_minutes) AS trip_duration
    FROM
      bigquery-public-data.austin_bikeshare.bikeshare_trips
    GROUP BY
      bike_id
    -- Sort bikes from longest to shortest total trip duration
    ORDER BY
      trip_duration DESC
    -- Only return the bike with the longest total trip duration
    LIMIT 1
  )

## Query temporary table

I cannot run the query at this point because the temporary table has been defined but not yet queried. Next, I will write the query that finds the station where the bike with the longest total trip time most often started its trips. To do this, I will join the temporary table to the original table, count the trips per start station, and return the station with the highest count as follows:

In [None]:
-- Query to find where identified bike starts most often
SELECT
  trips.start_station_id,
  trips.bike_id,
  COUNT(*) AS trip_count
FROM
  -- Retrieve from temporary table
  longest_used_bike AS longest
INNER JOIN
  -- Join original table
  bigquery-public-data.austin_bikeshare.bikeshare_trips AS trips
ON longest.bike_id = trips.bike_id
GROUP BY
  trips.start_station_id,
  trips.bike_id
ORDER BY
  trip_count DESC
LIMIT 1;

When the temporary table and the query are run together as one statement, the output is as shown below. The most-used bike is number 370, and it can most likely be found at station number 3798, where it started 177 trips.

![Longest used bike starting from station](c05m04-query-bike-longest-used.png 'Longest used bike starting from station')
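
For reference, the two cells above combine into one statement along these lines. This is a sketch assembled from the code already shown, with the table path wrapped in backticks as an optional quoting style. The `WITH` clause and the final `SELECT` must be submitted together, because the temporary table does not persist between separate runs.

```sql
-- Temporary table: the single bike with the longest total trip time
WITH longest_used_bike AS (
    SELECT
      bike_id,
      SUM(duration_minutes) AS trip_duration
    FROM
      `bigquery-public-data.austin_bikeshare.bikeshare_trips`
    GROUP BY
      bike_id
    ORDER BY
      trip_duration DESC
    LIMIT 1
  )

-- Main query: the station where that bike most often starts a trip
SELECT
  trips.start_station_id,
  trips.bike_id,
  COUNT(*) AS trip_count
FROM
  longest_used_bike AS longest
INNER JOIN
  `bigquery-public-data.austin_bikeshare.bikeshare_trips` AS trips
ON longest.bike_id = trips.bike_id
GROUP BY
  trips.start_station_id,
  trips.bike_id
ORDER BY
  trip_count DESC
LIMIT 1;
```

Because the temporary table returns exactly one row, the inner join acts as a filter that keeps only the trips made by that bike.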