@hardillb
Contributor

@hardillb hardillb commented Sep 12, 2025

part of https://github.com/FlowFuse/customer/issues/392

Description

Gathers the certified nodes count every 24 hrs

Dual implementation: telemetry and a standalone housekeeping task.

Need to pick one before merging.

Related Issue(s)

https://github.com/FlowFuse/customer/issues/392

Checklist

  • I have read the contribution guidelines
  • Suitable unit/system level tests have been added and they pass
  • Documentation has been updated
    • Upgrade instructions
    • Configuration details
    • Concepts
  • Changes flowforge.yml?
    • Issue/PR raised on FlowFuse/helm to update ConfigMap Template
    • Issue/PR raised on FlowFuse/CloudProject to update values for Staging/Production
  • Link to Changelog Entry PR, or note why one is not needed.

Labels

  • Includes a DB migration? -> add the area:migration label

@codecov

codecov bot commented Sep 12, 2025

Codecov Report

❌ Patch coverage is 20.00000% with 44 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.64%. Comparing base (3409cf0) to head (d1a20ef).
⚠️ Report is 338 commits behind head on main.

Files with missing lines | Patch % | Lines
forge/housekeeper/tasks/certifiedNodes.js | 18.51% | 44 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6017      +/-   ##
==========================================
- Coverage   76.79%   76.64%   -0.16%     
==========================================
  Files         384      387       +3     
  Lines       19439    19557     +118     
  Branches     4671     4703      +32     
==========================================
+ Hits        14928    14989      +61     
- Misses       4511     4568      +57     
Flag | Coverage Δ
backend | 76.64% <20.00%> (-0.16%) ⬇️

@hardillb hardillb changed the title from "First pass" to "Certified Nodes usage telemetry" Sep 15, 2025
@hardillb hardillb requested a review from knolleary September 26, 2025 14:18
@hardillb
Contributor Author

The consumer for this is all up and running now

The only outstanding question is whether the scopes and upload URL should be configurable.

@hardillb hardillb marked this pull request as ready for review September 26, 2025 14:21
@hardillb
Contributor Author

@knolleary if you get a second

lastSeenAt: { [Op.gte]: new Date(now - 1000 * 60 * 60 * 24) }
}
})
for (const dev of runningDevices) {
Member

Need to review this in more detail - not sure how this'll handle devices in dev mode.

Contributor Author

We can't easily count devices in dev mode until they push a snapshot, which this should pick up. But if they add nodes and stay in devMode we will never know

@hardillb
Contributor Author

@knolleary this now runs the queries in batches of 20 every second until complete, to reduce load on the DB. We can tune the time period if 20/s is still too fast.

const instancePromise = new Promise((resolve, reject) => {
// uses getRuntimeSettings to include template merge
let instanceOffset = 0
const instanceInterval = setInterval(async () => {
Member

If, for any reason, it takes longer than 1s to process a whole batch of twenty, this will end up scheduling batches in parallel - putting even more load on the DB than if it was just doing them all in a single batch.

Let's revert the batching change here and go back to a simple iteration over the list. Add some log statements at the start/end of the task so we can check back on how long it takes to run on FFC. We can then decide if it needs batching etc.
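
For illustration only, a minimal sketch of the simple iteration with start/end logging suggested above. It is not the code from this PR: `runSequentially`, `worker` and the log messages are hypothetical names, and `log` is assumed to be any object with an `info` method (it defaults to `console` here).

```javascript
// Hypothetical helper, not taken from the PR: process items one at a time
// and log the total duration so the run time can be checked afterwards.
async function runSequentially (items, worker, log = console) {
    const startedAt = Date.now()
    log.info(`certified nodes task started: ${items.length} items`)
    let processed = 0
    for (const item of items) {
        // Awaiting here keeps exactly one item in flight, so a slow query
        // delays the next one instead of running alongside it.
        await worker(item)
        processed++
    }
    log.info(`certified nodes task finished: ${processed} items in ${Date.now() - startedAt}ms`)
    return processed
}
```

Unlike a `setInterval` callback, the loop cannot start the next item until the previous one has settled, so a slow database only makes the task take longer rather than stacking up concurrent queries.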

Member

@knolleary knolleary left a comment

Sorry for bouncing this again. See my other comment - let's revert the batching change (for both instances and devices), add some logging and see how it does on FFC.

@knolleary knolleary merged commit 2ec399c into main Nov 17, 2025
24 of 25 checks passed
@knolleary knolleary deleted the certified-nodes-telemetry branch November 17, 2025 11:05