Broadcast tracker script config updates #5806

Merged
apata merged 5 commits into master from script-v2/broadcast-cache-updates
Oct 27, 2025

@apata
Contributor

@apata apata commented Oct 14, 2025

Changes

Broadcasts tracker script config creations and updates to the other app nodes, reducing the delay between creating or updating a script and being able to reliably receive it.
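
For illustration only, here is a minimal sketch of the broadcast idea: write to the local cache, then ask every connected node to do the same write. It is not the PR's implementation. The module name `BroadcastPutSketch`, the ETS table, and the use of `:erpc.multicast/4` are all assumptions chosen to keep the example self-contained.

```elixir
defmodule BroadcastPutSketch do
  # Hypothetical ETS-backed cache; the real cache module is not shown in this PR page.
  @table :broadcast_put_sketch

  # Creates the named table once; safe to call repeatedly from the owning process.
  def init do
    if :ets.whereis(@table) == :undefined do
      :ets.new(@table, [:named_table, :public, :set])
    end

    :ok
  end

  # Local write only.
  def put(key, value) do
    true = :ets.insert(@table, {key, value})
    :ok
  end

  def get(key) do
    case :ets.lookup(@table, key) do
      [{^key, value}] -> {:ok, value}
      [] -> :error
    end
  end

  # Local write plus a fire-and-forget request to all connected nodes.
  # On a non-distributed node Node.list/0 is empty, so only the local write happens.
  @spec broadcast_put(any(), any()) :: :ok
  def broadcast_put(key, value) do
    :ok = put(key, value)
    :erpc.multicast(Node.list(), __MODULE__, :put, [key, value])
  end
end
```

With a broadcast like this, the other nodes do not have to wait for their next periodic cache refresh before they can serve a newly created or updated script.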

Tests

  • Automated tests have been added
  • This PR does not require tests

Changelog

  • Entry has been added to changelog
  • This PR does not make a user-facing change

Documentation

  • Docs have been updated
  • This change does not need a documentation update

Dark mode

  • The UI has been tested both in dark and light mode
  • This PR does not change the UI

@apata apata added the preview label Oct 14, 2025
@github-actions

Preview environment👷🏼‍♀️🏗️
PR-5806

require Logger

@spec broadcast_put(any(), Keyword.t()) :: :ok
@spec broadcast_put(any(), any(), Keyword.t()) :: :ok
Member

👍


def broadcast_script_update(tracker_script_configuration),
do:
PlausibleWeb.TrackerScriptCache.broadcast_put(
Member

Minor, just noticed this: by convention, the *Web namespace is for HTTP things. The cache is completely agnostic in that sense, even if Web things are its clients.

Contributor Author

Thanks, I hadn't considered the namespace question. At the moment we have the script at Plausible.Site.TrackerScriptConfiguration. By following existing naming principles, its Cache should be at Plausible.Site.TrackerScriptConfiguration.Cache, correct?

Contributor Author

Ah damn, it's not so straightforward. It caches the script, not the configuration. The script's module is PlausibleWeb.Tracker. So PlausibleWeb.Tracker.Cache or refactor also PlausibleWeb.Tracker to Plausible.Tracker?

Member

> refactor also PlausibleWeb.Tracker to Plausible.Tracker

Probably this, but we can live with not doing it

end

defp cache_content(tracker_script_configuration) do
def cache_content(tracker_script_configuration) do
Member

A bit unclear what this function is for. It returns either always true, or some string for CE? And build_script for CE does an EEx-like string replace but without using EEx? Very confusing. 🤔

Contributor Author

Thanks! Agreed that this could use a comment. Will put something like the following:

On EE, we cache that this particular script exists to prevent Postgres stress from lookups of non-existent script IDs. The heavy lifting of caching is left to the CDN. If the ID exists, we always generate a fresh version of the script.

On CE, since we don't anticipate them using a CDN, we cache the rendered script.

Regarding the CE solution, it means that every refresh interval, every script is re-rendered by the server. This doesn't seem ideal actually.

Regarding building the script, we planned to use EEx but there were issues. There's a discussion on the PR that introduced it.
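
As a rough sketch of the EE/CE split described above (not the actual code): the cached value is just `true` on EE, and the rendered script on CE. The module name, the explicit `:ee` / `:ce` argument, the template text, and the `<%= domain %>` placeholder are all hypothetical. The real code may select the edition at compile time and uses its own template.

```elixir
defmodule CacheContentSketch do
  # Hypothetical template; the placeholder is replaced with a plain
  # string substitution, without involving EEx.
  @template "window.plausible_domain = \"<%= domain %>\";"

  # EE: only record that the script ID exists. The script itself is
  # rendered fresh on request and cached by the CDN.
  def cache_content(_config, :ee), do: true

  # CE: no CDN is assumed, so the rendered script is what gets cached.
  def cache_content(config, :ce), do: build_script(config)

  # Rendering needs the site domain, which is why the config must carry
  # its site association on CE.
  def build_script(%{site: %{domain: domain}}) do
    String.replace(@template, "<%= domain %>", domain)
  end
end
```

On EE a cache hit answers "does this ID exist?", which is enough to avoid Postgres lookups for non-existent script IDs. On CE a cache hit returns the script body directly.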

Contributor Author

@apata apata Oct 16, 2025

> Regarding the CE solution, it means that every refresh interval, every script is re-rendered by the server. This doesn't seem ideal actually.

Just realised that on CE only the updated scripts are re-rendered by the server at each refresh interval, not every script. That's not a problem.

Member

@aerosol aerosol left a comment

The CI error looks related to the change

)
) do
created_config =
Repo.preload(created_config, :site)
Member

Do we need the two extra DB queries if site is already available?

Contributor Author

Thanks!

I think we don't. In these functions, site is actually available as an argument, so theoretically we don't need additional DB queries at all.

Refresh all updated scripts -> load tracker script config with site association, run generate_script/1
Get or create config -> (site is argument) receive tracker script config without site association, run generate_script/2
Update config -> (site is argument) receive tracker script config without site association, run generate_script/2

I started to refactor it but it got messy so I pushed the interim state.

Member

What I meant was

%{created_config | site: site}

if this doesn't carry any risk of outdated reads
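
A self-contained sketch of that suggestion, with hypothetical stand-in structs rather than the real Ecto schemas: the map-update syntax sets the already-loaded site on the config without running any query.

```elixir
defmodule AttachSiteSketch do
  # Stand-ins for the real schemas; field names are assumptions.
  defmodule Site do
    defstruct [:id, :domain]
  end

  defmodule Config do
    # :not_loaded mimics an association that has not been preloaded.
    defstruct [:id, :site_id, site: :not_loaded]
  end

  # Reuses the site the caller already holds instead of preloading it.
  # The update syntax raises if :site is not an existing key.
  def attach_site(%Config{} = created_config, %Site{} = site) do
    %{created_config | site: site}
  end
end
```

The trade-off is that the attached site is only as fresh as the copy the caller holds, and it carries whatever else is loaded on that copy.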

Contributor Author

@apata apata Oct 22, 2025

> if this doesn't carry any risk of outdated reads

Good question.

What we need from the site association is site.domain for building the script for the cache

On CE:
T1 This conn: Starts process to update tracker script config
T2 Other conn: Starts process to update domain name
T3 Other conn: Finishes process to update domain name
T4 This conn: Expands stale site domain to tracker script
T5 This conn: Caches the stale tracker script and broadcasts this to all nodes
T6 Application: Checks for tracker_script_config entities that have been updated in the last 15 minutes, doesn't find anything
Result: the stale tracker script stays cached until the next refresh_all, which may be as much as 180 minutes away
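
The CE sequence above can be replayed deterministically in a small sketch. Everything here is a simplification made up for illustration: the "database" is a plain map, `render/1` stands in for building the script, and the two connections are just sequential steps.

```elixir
defmodule StaleReadSketch do
  # Stand-in for rendering the tracker script from the site domain.
  def render(domain), do: "script for #{domain}"

  def run do
    db = %{site: %{domain: "old.example.com"}}

    # T1: this conn starts the config update holding its own copy of the site.
    site_in_this_conn = db.site

    # T2-T3: the other conn changes the domain name.
    db = put_in(db, [:site, :domain], "new.example.com")

    # T4-T5: this conn renders and caches from its stale copy.
    stale = render(site_in_this_conn.domain)

    # Alternative: re-read the site just before rendering.
    fresh = render(db.site.domain)

    %{stale: stale, fresh: fresh}
  end
end
```

Calling `StaleReadSketch.run()` shows the two outcomes side by side: the copy taken at T1 yields a script for the old domain, while a re-read yields one for the new domain.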

On EE:
T1 This conn: Starts process to update tracker script config
T2 Other conn: Starts process to update domain name
T3 Other conn: Finishes process to update domain name and starts a job that clears the cached tracker script on CDN in 10s
T4 This conn: Caches that a tracker script with this ID exists and broadcasts this to all nodes
T5 Application: Checks for tracker_script_config entities that have been updated in the last 15 minutes, doesn't find anything
Result: local cache is correct, CDN cache is purged

So there is some risk with outdated reads on CE. I don't think it's a huge risk, because we keep handling events with the stale domain for a while.

But I also see other risks in doing it like this. The site from the function arguments has a bunch more associations loaded. By setting it on the struct, we can accidentally expose those associations down the line.

Contributor Author

@aerosol I know you already approved the version with the preloads, but I didn't think it was good enough, so I had another look at this.

First I tried injecting the site from args only when it doesn't exist and is needed, accepting the small risk on CE from stale domain reads.

Then I realized we don't need to accept that small risk: on CE we don't need to optimize Postgres queries so much.

I'm now proposing we reload the config with necessary associations after insert and update operations, but only on CE. I also changed the base query a little bit, so as not to select the whole site when only one field is needed.
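
A rough sketch of the shape of this proposal, under stated assumptions: the module name, the explicit `:ee` / `:ce` argument, and `reload_fun` are hypothetical. `reload_fun` stands in for a query that re-fetches the config together with only the site domain; the actual query is not reproduced here.

```elixir
defmodule ReloadAfterWriteSketch do
  # EE: the written config is used as-is, so no extra query is issued.
  def after_write(config, :ee, _reload_fun), do: config

  # CE: re-fetch the config after the insert or update, so the script is
  # rendered from the current domain rather than a copy held by the caller.
  def after_write(config, :ce, reload_fun) when is_function(reload_fun, 1) do
    reload_fun.(config.id)
  end
end
```

Passing the reload as a function keeps the sketch free of database dependencies. On CE the extra query is the price for not caching a script built from a stale domain.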

@apata apata force-pushed the script-v2/broadcast-cache-updates branch from f104dfa to 472847b Compare October 23, 2025 10:29
@apata apata added this pull request to the merge queue Oct 27, 2025
Merged via the queue into master with commit 0e0415f Oct 27, 2025
35 of 50 checks passed
@apata apata deleted the script-v2/broadcast-cache-updates branch February 11, 2026 11:33

2 participants