ADR 13 - Store Content Items JSON in Items Dimension #922
Each Content Item is defined by a JSON document that includes all the
metadata, the links, and the content details. The links include
information about organisations, policies, and taxonomies. The details
include the actual content of the page, the attachments, and other
necessary fields such as title, description, publishing app, and
rendering app.
In the Data Warehouse, we initially stored each Event together with its
JSON, but we decided to remove the JSON because our database was growing
too quickly. Later on, we found a bug in our code: we were not filtering
Publishing-API events correctly. On some days we received more than 4
million events, and each one of them created a new version of a Content
Item with no real changes.
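To illustrate the kind of filtering that avoids those empty versions, here is a minimal sketch in Python. It is not the fix that was applied in our code. It assumes each event is a dict with hypothetical `content_id` and `payload` keys, and it treats an event as a real change only when a hash of its canonical JSON differs from the last one seen for that Content Item.

```python
import hashlib
import json


def content_fingerprint(payload):
    """Hash a canonical JSON rendering so that key order does not matter."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def filter_real_changes(events, latest_fingerprints=None):
    """Keep only the events whose payload differs from the last one seen.

    `events` is an iterable of dicts with the hypothetical keys
    `content_id` and `payload`. `latest_fingerprints` optionally maps a
    content_id to the fingerprint of the version already stored.
    """
    seen = dict(latest_fingerprints or {})
    changed = []
    for event in events:
        content_id = event["content_id"]
        fingerprint = content_fingerprint(event["payload"])
        if seen.get(content_id) != fingerprint:
            # The payload really changed, so this event deserves a new version.
            seen[content_id] = fingerprint
            changed.append(event)
    return changed
```

With a filter along these lines, repeated events that carry an identical payload would not produce a new version of the Content Item.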
As of today, the Content Items dimension is growing very slowly, as it
only tracks changes related to content updates. The current pace of
growth is around 600-800 rows per day, which is very small for a Data
Warehouse.
So I would like to explore this decision:
Store the JSON of every Content Item version that adds a row to the
Items dimension. This is equivalent to storing all the information that
we currently have about a Dimension Item.
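A rough sketch of what this could look like, assuming one JSON column per dimension row. The table name `dimensions_items` and the columns `content_id`, `title`, and `raw_json` are hypothetical, and SQLite is used only so that the example is self-contained; it says nothing about the actual Data Warehouse schema or database.

```python
import json
import sqlite3


def create_items_table(conn):
    """Create a hypothetical Items dimension with a column for the raw JSON."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS dimensions_items (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            content_id TEXT NOT NULL,
            title TEXT,
            raw_json TEXT NOT NULL
        )
        """
    )


def add_item_version(conn, content_item):
    """Insert one new dimension row and keep the full JSON next to it."""
    cursor = conn.execute(
        "INSERT INTO dimensions_items (content_id, title, raw_json) "
        "VALUES (?, ?, ?)",
        (
            content_item["content_id"],
            content_item.get("title"),
            json.dumps(content_item, sort_keys=True),
        ),
    )
    conn.commit()
    return cursor.lastrowid


def load_item_json(conn, row_id):
    """Return the stored JSON for a dimension row, or None if it is missing."""
    row = conn.execute(
        "SELECT raw_json FROM dimensions_items WHERE id = ?", (row_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Because the JSON is written only when a new dimension row is created, the stored volume would follow the growth of the dimension rather than the number of incoming events.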