In [1]:
%load_ext google.cloud.bigquery

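## Correlated subquery (without WITH)

Count, per day, the round trips that lasted more than twice the average round-trip duration at their start station.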

In [2]:
%%bigquery

SELECT
  start_date,
  COUNT(*) AS num_long_trips
FROM
  -- Derived table: round trips only (start and end at the same station)
  (
  SELECT
    start_station_name,
    duration,
    EXTRACT(DATE
    FROM
      start_date) AS start_date
  FROM
    dataflow-templates-327714.bigquery_examples.cycle_hire
  WHERE
    start_station_name = end_station_name) AS roundtrips
WHERE
  -- Correlated subquery: average round-trip duration at the current row's station
  duration > 2 * (
  SELECT
    AVG(duration) AS avg_duration
  FROM
    dataflow-templates-327714.bigquery_examples.cycle_hire
  WHERE
    start_station_name = end_station_name
    AND roundtrips.start_station_name = start_station_name)
GROUP BY
  start_date
ORDER BY
  num_long_trips DESC
LIMIT
  5

Query complete after 0.03s: 100% | 6/6 [00:00<00:00, 1409.77query/s]
Downloading: 100% | 5/5 [00:02<00:00, 1.75rows/s]


|   | start_date | num_long_trips |
|---|------------|----------------|
| 0 | 2016-12-25 | 740 |
| 1 | 2016-05-08 | 714 |
| 2 | 2017-04-09 | 667 |
| 3 | 2015-08-01 | 663 |
| 4 | 2015-12-25 | 660 |
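
Conceptually, the correlated subquery is evaluated once per candidate row: for each round trip, it recomputes the average round-trip duration at that row's start station and keeps the row only if its duration exceeds twice that average. The following is a minimal pure-Python sketch of that logic, assuming a handful of made-up rows (they are not taken from `cycle_hire`). The names `trips`, `station_avg`, and `long_trips` are illustrative, and a real query engine is free to plan the work differently.

```python
from collections import Counter
from statistics import mean

# Hypothetical toy rows: (start_station, end_station, duration, start_date)
trips = [
    ("A", "A", 100, "2016-12-25"),
    ("A", "A", 100, "2016-12-25"),
    ("A", "A", 100, "2016-12-25"),
    ("A", "A", 100, "2016-12-25"),
    ("A", "A", 800, "2016-12-25"),
    ("A", "A", 800, "2016-12-25"),
    ("B", "B", 50, "2016-05-08"),
    ("B", "B", 50, "2016-05-08"),
    ("B", "B", 500, "2016-05-08"),
    ("A", "B", 9999, "2016-05-08"),  # not a round trip, so it is filtered out
]

# Equivalent of the derived table: keep round trips only.
roundtrips = [t for t in trips if t[0] == t[1]]


def station_avg(station):
    """Average round-trip duration at one station (the correlated subquery)."""
    return mean(d for s, _, d, _ in roundtrips if s == station)


# Outer WHERE + GROUP BY: the average is recomputed for every row checked.
long_trips = Counter(
    date
    for station, _, duration, date in roundtrips
    if duration > 2 * station_avg(station)
)

# ORDER BY num_long_trips DESC LIMIT 5
print(long_trips.most_common(5))
```

In this sketch `station_avg` is called for every round trip, which mirrors the per-row dependency that makes the subquery "correlated".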


## Optimized using WITH

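The same query rewritten with `WITH` for reusability and readability: `roundtrips` is defined once and reused, and the per-station average moves out of the correlated subquery into `station_avg`, which is joined back on `start_station_name`.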

In [3]:
%%bigquery
WITH 
roundtrips AS (
  SELECT
    start_station_name,
    duration,
    EXTRACT(DATE
    FROM
      start_date) AS start_date
  FROM
    dataflow-templates-327714.bigquery_examples.cycle_hire
  WHERE
    start_station_name = end_station_name),

station_avg AS (
  SELECT
    start_station_name,
    AVG(duration) AS avg_duration
  FROM
    roundtrips
  GROUP BY
    start_station_name)
SELECT
  start_date,
  COUNT(*) AS num_long_trips
FROM
  roundtrips
JOIN
  station_avg
USING
  (start_station_name)
WHERE
  duration > 2 * avg_duration
GROUP BY
  start_date
ORDER BY
  num_long_trips DESC
LIMIT
  5

Query complete after 0.00s: 100% | 6/6 [00:00<00:00, 2099.43query/s]
Downloading: 100% | 5/5 [00:01<00:00, 4.03rows/s]


|   | start_date | num_long_trips |
|---|------------|----------------|
| 0 | 2016-12-25 | 740 |
| 1 | 2016-05-08 | 714 |
| 2 | 2017-04-09 | 667 |
| 3 | 2015-08-01 | 663 |
| 4 | 2015-12-25 | 660 |
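
Both query shapes return the same rows here. As a rough local check of that equivalence, the sketch below runs a correlated-subquery version and a `WITH` version against a tiny in-memory SQLite table. Several things are assumptions made for illustration: SQLite stands in for BigQuery, so `DATE(...)` replaces `EXTRACT(DATE FROM ...)`; the inner table gets an alias `c` to make the correlation explicit; a `start_date` tie-breaker is added to `ORDER BY` so the output is deterministic; and the rows are made up.

```python
import sqlite3

# Hypothetical toy rows, not taken from the real cycle_hire table.
rows = [
    ("A", "A", 100, "2016-12-25 09:00:00"),
    ("A", "A", 100, "2016-12-25 10:00:00"),
    ("A", "A", 100, "2016-12-25 11:00:00"),
    ("A", "A", 100, "2016-12-25 12:00:00"),
    ("A", "A", 800, "2016-12-25 13:00:00"),
    ("A", "A", 800, "2016-12-25 14:00:00"),
    ("B", "B", 50, "2016-05-08 09:00:00"),
    ("B", "B", 50, "2016-05-08 10:00:00"),
    ("B", "B", 500, "2016-05-08 11:00:00"),
    ("A", "B", 9999, "2016-05-08 12:00:00"),  # not a round trip
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cycle_hire (start_station_name TEXT, end_station_name TEXT,"
    " duration INTEGER, start_date TEXT)"
)
conn.executemany("INSERT INTO cycle_hire VALUES (?, ?, ?, ?)", rows)

CORRELATED = """
SELECT start_date, COUNT(*) AS num_long_trips
FROM (
  SELECT start_station_name, duration, DATE(start_date) AS start_date
  FROM cycle_hire
  WHERE start_station_name = end_station_name) AS roundtrips
WHERE duration > 2 * (
  SELECT AVG(c.duration)
  FROM cycle_hire AS c
  WHERE c.start_station_name = c.end_station_name
    AND c.start_station_name = roundtrips.start_station_name)
GROUP BY start_date
ORDER BY num_long_trips DESC, start_date
LIMIT 5
"""

WITH_CTE = """
WITH roundtrips AS (
  SELECT start_station_name, duration, DATE(start_date) AS start_date
  FROM cycle_hire
  WHERE start_station_name = end_station_name),
station_avg AS (
  SELECT start_station_name, AVG(duration) AS avg_duration
  FROM roundtrips
  GROUP BY start_station_name)
SELECT start_date, COUNT(*) AS num_long_trips
FROM roundtrips
JOIN station_avg USING (start_station_name)
WHERE duration > 2 * avg_duration
GROUP BY start_date
ORDER BY num_long_trips DESC, start_date
LIMIT 5
"""

correlated_result = conn.execute(CORRELATED).fetchall()
with_result = conn.execute(WITH_CTE).fetchall()
print(correlated_result)
print(with_result)
```

On this toy data both statements should print the same list, which is the property the notebook relies on when replacing one form with the other.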
