Reduce size of Lighthouse payload #21

Open
@rviscomi

The latest table, lighthouse.2018_10_15, is 237 GB. Querying all lighthouse tables currently processes 4.15 TB and takes several minutes to run.

  1. identify parts of the JSON payload that contribute significantly to its size but are unnecessary or unlikely to have analytical value
  2. modify the Dataflow pipeline to omit these parts of the payload
  3. profit
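
Steps 1 and 2 could be prototyped roughly as follows. This is only a sketch, not the pipeline's actual code: the function names and the `PATHS_TO_DROP` list are hypothetical, and the paths listed are illustrative guesses at large, low-value parts of a Lighthouse report. The real list would come out of step 1.

```python
import copy
import json
from collections import Counter

# Hypothetical key paths to omit; chosen for illustration only.
PATHS_TO_DROP = [
    ("audits", "screenshot-thumbnails"),
    ("audits", "final-screenshot"),
    ("i18n",),
]


def size_by_key(report: dict) -> Counter:
    """Step 1: serialized size in bytes of each top-level key."""
    sizes = Counter()
    for key, value in report.items():
        encoded = json.dumps(value, separators=(",", ":")).encode("utf-8")
        sizes[key] = len(encoded)
    return sizes


def drop_paths(report: dict, paths=PATHS_TO_DROP) -> dict:
    """Step 2: return a copy of the report without the given key paths."""
    pruned = copy.deepcopy(report)
    for path in paths:
        node = pruned
        # Walk down to the parent of the key to remove.
        for key in path[:-1]:
            node = node.get(key)
            if not isinstance(node, dict):
                break  # Path is absent in this report; nothing to drop.
        else:
            node.pop(path[-1], None)
    return pruned


def prune_payload(payload: str, paths=PATHS_TO_DROP) -> str:
    """Parse a JSON report string, drop the paths, and re-serialize compactly."""
    return json.dumps(drop_paths(json.loads(payload), paths), separators=(",", ":"))


# Made-up sample report, only to exercise the functions above.
sample = {
    "audits": {
        "final-screenshot": {"details": {"data": "x" * 1000}},
        "first-contentful-paint": {"score": 0.9},
    },
    "i18n": {"rendererFormattedStrings": {"a": "b"}},
    "categories": {"performance": {"score": 0.8}},
}
sample_sizes = size_by_key(sample)
sample_pruned = prune_payload(json.dumps(sample))
```

Calling `size_by_key(report).most_common(10)` on a handful of real reports would show which top-level keys dominate the payload. If the pipeline is Python-based, `prune_payload` might then be applied to each report before it is written to the table.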

Labels: enhancement (New feature or request)