Skip to content

Making the HTTP2 query cheaper #110

Open
@tunetheweb

Description

@tunetheweb

We have a HTTP/2 requests graph which does a look up on the $_protocol field in the requests.payload column. This currently costs 211TB and costs an estimated $1,058 (yes - one thousand bucks!!!) and counting, to run and is re-run every month. Which is quite frankly ridiculous. It also takes forever to run and sometimes times out.

I wanted to add an HTTP/3 graph since it's getting out there but can't justify doubling that cost! While our generous benefactor may be able to absorb that, others can't, and I think we should be setting a better examples here.

If we use the summary_requests table and use the reqHttpVersion, or respHttpVersion (or both!) then the cost plummets to 363GB and or an estimated $1.77!!! And the data looks pretty similar (not exactly the same as requests and summary_requests look to have slight differences in number of rows, but close enough).

However, there is an issue as these fields had bad data for a long time (relevant WPT issue and was only fixed from October 2020. I would prefer to track the growth longer than that and ideally back to 2015 when HTTP/2 was launched.

So we've a few choices:

  1. Fix up the bad data. Ideally we'd join requests to summary_requests and update the bad reqHttpVersion, or respHttpVersion values to the $._protocol field but can't figure out how to do that.
  2. Patch the bad data by saying ori:, us:, od:, me: or : / values are effectively HTTP/2. This isn't always the case and there are a small number of HTTP/1.1 connections which give those values, but it's close enough and a lot easier to run this clean up than option 1 (unless there is a way to join these two tables I'm not seeing?).
  3. Have a hacky SQL (see below) to patch it in the query instead. Seems a bit of a hack.
  4. Add the protocol column to summary_requests table and backfill all the old values. Seems like quite an effort.
  5. Wait until we reorganised the tables like we've talked about.
  6. Leave as is and just implement HTTP/3 query in cheaper manner.

Thoughts?

#standardSQL
SELECT
  SUBSTR(_TABLE_SUFFIX, 0, 10) AS date,
  UNIX_DATE(CAST(REPLACE(SUBSTR(_TABLE_SUFFIX, 0, 10), '_', '-') AS DATE)) * 1000 * 60 * 60 * 24 AS timestamp,
  IF(ENDS_WITH(_TABLE_SUFFIX, 'desktop'), 'desktop', 'mobile') AS client,
  ROUND(SUM(IF(respHttpVersion = 'HTTP/2'
               OR respHttpVersion = 'ori' -- bad value that mostly means HTTP/2 (parsed incorrectly from :authority:)
               OR respHttpVersion = 'us:' -- bad value that mostly means HTTP/2 (parsed incorrectly from :status:)
               OR respHttpVersion = 'od:' -- bad value that mostly means HTTP/2 (parsed incorrectly from :method:)
               OR respHttpVersion = 'me:' -- bad value that mostly means HTTP/2 (parsed incorrectly from :scheme:)
               OR respHttpVersion = ': /' -- bad value that mostly means HTTP/2 (parsed incorrectly from :path:)
               OR reqHttpVersion = 'HTTP/2'
               OR reqHttpVersion = 'ori' -- bad value that mostly means HTTP/2 (parsed incorrectly from :authority:)
               OR reqHttpVersion = 'us:' -- bad value that mostly means HTTP/2 (parsed incorrectly from :status:)
               OR reqHttpVersion = 'od:' -- bad value that mostly means HTTP/2 (parsed incorrectly from :method:)
               OR reqHttpVersion = 'me:' -- bad value that mostly means HTTP/2 (parsed incorrectly from :scheme:)
               OR reqHttpVersion = ': /' -- bad value that mostly means HTTP/2 (parsed incorrectly from :path:)
             , 1, 0)) * 100 / COUNT(0), 2) AS percent
FROM
  `httparchive.summary_requests.*`
GROUP BY
  date,
  timestamp,
  client
ORDER BY
  date DESC,
  client

Here's the comparison of what that comes back with compared to the current production site:

date timestamp client percent curr_pct diff
2021_05_01 1.6198E+12 desktop 64.55 64.8 0.25
2021_05_01 1.6198E+12 mobile 64.96 65.3 0.34
2021_04_01 1.6172E+12 desktop 68.46 68.6 0.14
2021_04_01 1.6172E+12 mobile 67.47 67.6 0.13
2021_03_01 1.6146E+12 desktop 68.5 68.6 0.1
2021_03_01 1.6146E+12 mobile 68.15 68.3 0.15
2021_02_01 1.6121E+12 desktop 68.15 68.3 0.15
2021_02_01 1.6121E+12 mobile 68.05 68.2 0.15
2021_01_01 1.6095E+12 desktop 67.19 67.3 0.11
        67.5 67.5
2020_12_01 1.6068E+12 desktop 66.75 66.9 0.15
2020_12_01 1.6068E+12 mobile 67.11 67.3 0.19
2020_11_01 1.6042E+12 desktop 65.95 66.1 0.15
2020_11_01 1.6042E+12 mobile 66.24 66.4 0.16
2020_10_01 1.6015E+12 desktop 65.57 65.7 0.13
2020_10_01 1.6015E+12 mobile 65.46 65.6 0.14
2020_09_01 1.5989E+12 desktop 63.52 64.8 1.28
2020_09_01 1.5989E+12 mobile 65.61 64.9 -0.71
2020_08_01 1.5962E+12 desktop 62.53 63.7 1.17
2020_08_01 1.5962E+12 mobile 65.09 63.8 -1.29
2020_07_01 1.5936E+12 desktop 62.23 64.2 1.97
2020_07_01 1.5936E+12 mobile 64.43 64.2 -0.23
2020_06_01 1.591E+12 desktop 61.46 64.4 2.94
2020_06_01 1.591E+12 mobile 62.34 64.5 2.16
2020_05_01 1.5883E+12 desktop 60.63 63.4 2.77
2020_05_01 1.5883E+12 mobile 61.79 63.8 2.01
2020_04_01 1.5857E+12 desktop 59.6 62.2 2.6
2020_04_01 1.5857E+12 mobile 60.6 62.4 1.8
2020_03_01 1.583E+12 desktop 59.79 62.3 2.51
2020_03_01 1.583E+12 mobile 60.68 62.5 1.82
2020_02_01 1.5805E+12 desktop 60.32 63.5 3.18
2020_02_01 1.5805E+12 mobile 60.91 63.1 2.19
2020_01_01 1.5778E+12 desktop 55.1 59.2 4.1
2020_01_01 1.5778E+12 mobile 55.11 59.3 4.19
2019_12_01 1.5752E+12 desktop 54.37 58.9 4.53
2019_12_01 1.5752E+12 mobile 54.27 58.9 4.63
2019_11_01 1.5726E+12 desktop 47.22 58 10.78
2019_11_01 1.5726E+12 mobile 53.51 58.2 4.69
2019_10_01 1.5699E+12 desktop 52.55 57.1 4.55
2019_10_01 1.5699E+12 mobile 52.43 56.9 4.47
2019_09_01 1.5673E+12 desktop 51.8 56.2 4.4
2019_09_01 1.5673E+12 mobile 53.47 56 2.53
2019_08_01 1.5646E+12 desktop 51.4 55.7 4.3
2019_08_01 1.5646E+12 mobile 55.16 55.5 0.34
2019_07_01 1.5619E+12 desktop 51.81 54.9 3.09
2019_07_01 1.5619E+12 mobile 54.53 54.8 0.27
2019_06_01 1.5593E+12 desktop 50.83 53.8 2.97
2019_06_01 1.5593E+12 mobile 50.21 53.3 3.09
2019_05_01 1.5567E+12 desktop 48.16 53.1 4.94
2019_05_01 1.5567E+12 mobile 47.38 52.6 5.22
2019_04_01 1.5541E+12 desktop 45.57 52.3 6.73
2019_04_01 1.5541E+12 mobile 44.19 52 7.81
2019_03_01 1.5514E+12 desktop 48.49 50.6 2.11
2019_03_01 1.5514E+12 mobile 47.34 50.7 3.36
2019_02_01 1.549E+12 desktop 49.63 49.7 0.07
2019_02_01 1.549E+12 mobile 49.79 49.8 0.01
        48.3 48.3
        48.3 48.3
2018_12_15 1.5448E+12 desktop 32.8 47.8 15
2018_12_15 1.5448E+12 mobile 36.73 48.9 12.17
        49.1 49.1
        48.8 48.8
2018_11_15 1.5422E+12 desktop 46.92 48.4 1.48
2018_11_15 1.5422E+12 mobile 46.87 48.4 1.53
2018_11_01 1.541E+12 desktop 46.27 47.8 1.53
        47.5 47.5
2018_10_15 1.5396E+12 desktop 45.53 46.5 0.97
2018_10_15 1.5396E+12 mobile 45.13 46.2 1.07
2018_10_01 1.5384E+12 desktop 45.89 46 0.11
2018_10_01 1.5384E+12 mobile 45.53 45.5 -0.03
2018_09_15 1.537E+12 desktop 45.66 45.8 0.14
2018_09_15 1.537E+12 mobile 45.19 45.2 0.01
2018_09_01 1.5358E+12 desktop 44.83 45 0.17
2018_09_01 1.5358E+12 mobile 44.6 44.6 0
2018_08_15 1.5343E+12 desktop 44.65 44.8 0.15
        44.9 44.9
2018_08_01 1.5331E+12 desktop 44.26 44.4 0.14
2018_08_01 1.5331E+12 mobile 44.61 44.6 -0.01
2018_07_15 1.5316E+12 desktop 43.77 44 0.23
2018_07_15 1.5316E+12 mobile 44.3 44.3 0
2018_07_01 1.5304E+12 desktop 43.42 43.6 0.18
2018_07_01 1.5304E+12 mobile 41.37 41.6 0.23
2018_06_15 1.529E+12 desktop 38.59 38.8 0.21
2018_06_15 1.529E+12 mobile 40.36 40.6 0.24
2018_06_01 1.5278E+12 desktop 38.17 38.2 0.03
2018_06_01 1.5278E+12 mobile 39.9 40.1 0.2
2018_05_15 1.5263E+12 desktop 38.16 38.3 0.14
2018_05_15 1.5263E+12 mobile 39.56 39.7 0.14
2018_05_01 1.5251E+12 desktop 37.94 38 0.06
2018_05_01 1.5251E+12 mobile 39.21 39.4 0.19
2018_04_15 1.5238E+12 desktop 37.59 37.6 0.01
2018_04_15 1.5238E+12 mobile 39.16 39.4 0.24
        37.1 37.1
        38.7 38.7
2018_03_15 1.5211E+12 desktop 36.67 36.8 0.13
2018_03_15 1.5211E+12 mobile 37.82 38 0.18
2018_03_01 1.5199E+12 desktop 35.9 35.9 0
2018_03_01 1.5199E+12 mobile 37.1 37.3 0.2
2018_02_15 1.5187E+12 desktop 35.46 35.5 0.04
2018_02_15 1.5187E+12 mobile 36.39 36.5 0.11
2018_02_01 1.5174E+12 desktop 35.23 35.3 0.07
2018_02_01 1.5174E+12 mobile 35.98 36.1 0.12
2018_01_15 1.516E+12 desktop 33.9 34 0.1
2018_01_15 1.516E+12 mobile 34.69 34.8 0.11
2018_01_01 1.5148E+12 desktop 33.3 33.7 0.4
2018_01_01 1.5148E+12 mobile 34.3 34.7 0.4
2017_12_15 1.5133E+12 desktop 33 33.4 0.4
2017_12_15 1.5133E+12 mobile 34.03 34.4 0.37
2017_12_01 1.5121E+12 desktop 31.92 32.4 0.48
2017_12_01 1.5121E+12 mobile 32.58 33.1 0.52
2017_11_15 1.5107E+12 desktop 31.39 31.8 0.41
        32.6 32.6
2017_11_01 1.5095E+12 desktop 31.11 31.5 0.39
2017_11_01 1.5095E+12 mobile 31.76 32.4 0.64
2017_10_15 1.508E+12 desktop 30.19 30.6 0.41
2017_10_15 1.508E+12 mobile 31.06 31.5 0.44
2017_10_01 1.5068E+12 desktop 29.89 30.2 0.31
2017_10_01 1.5068E+12 mobile 30.54 31.1 0.56
2017_09_15 1.5054E+12 desktop 28.88 29.2 0.32
2017_09_15 1.5054E+12 mobile 29.43 30 0.57
2017_09_01 1.5042E+12 desktop 28.21 0 -28.21
2017_09_01 1.5042E+12 mobile 29 0.1 -28.9
2017_08_15 1.5028E+12 desktop 27.25 0 -27.25
2017_08_15 1.5028E+12 mobile 28.07 0 -28.07
2017_08_01 1.5015E+12 desktop 26.76 0 -26.76
2017_08_01 1.5015E+12 mobile 27.41 0 -27.41
2017_07_15 1.5001E+12 desktop 26.63 26.5 -0.13
2017_07_15 1.5001E+12 mobile 27.02 27.1 0.08
2017_07_01 1.4989E+12 desktop 26.14 26 -0.14
2017_07_01 1.4989E+12 mobile 26.44 26.5 0.06
2017_06_15 1.4975E+12 desktop 25.29 25.2 -0.09
2017_06_15 1.4975E+12 mobile 25.88 26 0.12
2017_06_01 1.4963E+12 desktop 25.05 25 -0.05
2017_06_01 1.4963E+12 mobile 25.47 25.7 0.23
2017_05_15 1.4948E+12 desktop 25.02 24.9 -0.12
2017_05_15 1.4948E+12 mobile 25.29 25.5 0.21
2017_05_01 1.4936E+12 desktop 24.87 23.9 -0.97
2017_05_01 1.4936E+12 mobile 24.49 23.8 -0.69
2017_04_15 1.4922E+12 desktop 25.12 24.9 -0.22
2017_04_15 1.4922E+12 mobile 25.41 25.2 -0.21
2017_04_01 1.491E+12 desktop 24.55 24.7 0.15
2017_04_01 1.491E+12 mobile 24.69 24.9 0.21
2017_03_15 1.4895E+12 desktop 23.78 24 0.22
2017_03_15 1.4895E+12 mobile 23.69 23.9 0.21
2017_03_01 1.4883E+12 desktop 23.4 23.4 0
2017_03_01 1.4883E+12 mobile 23.3 23.4 0.1
2017_02_15 1.4871E+12 desktop 23.07 23.1 0.03
2017_02_15 1.4871E+12 mobile 22.91 23.1 0.19
2017_02_01 1.4859E+12 desktop 22.74 22.8 0.06
2017_02_01 1.4859E+12 mobile 22.85 22.9 0.05
        22 22
2017_01_15 1.4844E+12 mobile 22 22 0
        21.3 21.3
2017_01_01 1.4832E+12 mobile 21.58 21.6 0.02
2016_12_15 1.4818E+12 desktop 19.68 20.9 1.22
        21.3 21.3
        20.7 20.7
        21.2 21.2
2016_11_15 1.4792E+12 desktop 20.54 20.3 -0.24
2016_11_15 1.4792E+12 mobile 20.55 20.6 0.05
2016_11_01 1.478E+12 desktop 20.25 20.3 0.05
2016_11_01 1.478E+12 mobile 19.91 20 0.09
2016_10_15 1.4765E+12 desktop 18.66 18.6 -0.06
2016_10_15 1.4765E+12 mobile 19.37 19.7 0.33
2016_10_01 1.4753E+12 desktop 18.5 18.7 0.2
2016_10_01 1.4753E+12 mobile 19.32 19.5 0.18
2016_09_15 1.4739E+12 desktop 17.11 17.4 0.29
2016_09_15 1.4739E+12 mobile 17.29 17.5 0.21
2016_09_01 1.4727E+12 desktop 16.45 16.5 0.05
2016_09_01 1.4727E+12 mobile 16.66 16.5 -0.16
2016_08_15 1.4712E+12 desktop 16.49 16.5 0.01
2016_08_15 1.4712E+12 mobile 16.4 16.4 0
2016_08_01 1.47E+12 desktop 16.36 16.4 0.04
        16.2 16.2
2016_07_15 1.4685E+12 desktop 15.9 0 -15.9
        0 0
2016_07_01 1.4673E+12 desktop 15.47 0 -15.47
        0 0
2016_06_15 1.4659E+12 desktop 15.16 0 -15.16
        0 0
2016_06_01 1.4647E+12 desktop 13.72 0 -13.72
        0 0
2016_05_15 1.4633E+12 desktop 13.15 0 -13.15
        0 0
2016_05_01 1.4621E+12 desktop 0 0 0
        0 0
2016_04_15 1.4607E+12 desktop 0 0 0
        0 0
2016_04_01 1.4595E+12 desktop 0 0 0
        0 0
2016_03_15 1.458E+12 desktop 0 0 0
        0 0
2016_03_01 1.4568E+12 desktop 0 0 0
2016_03_01 1.4568E+12 mobile 0 0 0
2016_02_15 1.4555E+12 desktop 0 0 0
2016_02_15 1.4555E+12 mobile 0 0 0
2016_02_01 1.4543E+12 desktop 0 0 0
2016_02_01 1.4543E+12 mobile 0 0 0
2016_01_15 1.4528E+12 desktop 0 0 0
2016_01_15 1.4528E+12 mobile 0 0 0
2016_01_01 1.4516E+12 desktop 0 0 0
2016_01_01 1.4516E+12 mobile 0 0 0

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions