The latest lighthouse table, lighthouse.2018_10_15, is 237 GB. Querying all lighthouse tables currently processes 4.15 TB and takes several minutes to run.
- identify parts of the JSON payload that contribute significantly to its size but are unnecessary or unlikely to have analytical value
- modify the Dataflow pipeline to omit these parts of the payload
- profit
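
A rough sketch of the first two steps, in case it helps get started. It assumes each payload is a JSON object that fits in memory. The helper names `size_by_key` and `omit_paths`, the sample report, and the dotted path are all hypothetical placeholders, not a recommendation of which fields to drop.

```python
import json


def size_by_key(report):
    """Return {top-level key: bytes of its compact JSON encoding}, largest first."""
    sizes = {
        key: len(json.dumps(value, separators=(",", ":")).encode("utf-8"))
        for key, value in report.items()
    }
    return dict(sorted(sizes.items(), key=lambda item: item[1], reverse=True))


def omit_paths(report, paths):
    """Return a deep copy of report with each dotted path removed, if present."""
    trimmed = json.loads(json.dumps(report))  # deep copy of JSON-compatible data
    for path in paths:
        *parents, leaf = path.split(".")
        node = trimmed
        for part in parents:
            # Stop walking as soon as the path leaves the nested objects.
            node = node.get(part) if isinstance(node, dict) else None
            if node is None:
                break
        if isinstance(node, dict):
            node.pop(leaf, None)
    return trimmed


# Made-up sample payload; the keys and sizes are illustrative only.
sample = {
    "fetchTime": "2018-10-15T00:00:00.000Z",
    "audits": {
        "screenshot-thumbnails": {"details": {"items": ["x" * 1000]}},
        "first-contentful-paint": {"score": 0.9},
    },
}

# Hypothetical omit list; the real one would come from the size analysis.
OMIT = ["audits.screenshot-thumbnails.details"]

print(size_by_key(sample))
print(size_by_key(omit_paths(sample, OMIT)))
```

Running `size_by_key` over a sample of real payloads should show which keys dominate. Something like `omit_paths` could then be applied to each payload in the pipeline before the row is written, assuming the payload is available as parsed JSON at that point.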