feat(cdp): Use lazy loader for hog functions #30235
PR Summary
This PR reverts the lazy loading functionality for hog functions in the CDP component, focusing on fixing issues with filtering and groups parsing. Here are the key changes:
- Removed HogFunctionManagerLazyService and its associated configuration flag, reverting to direct function loading through HogFunctionManagerService
- Changed function reloading from reloadAllHogFunctions() to a targeted onHogFunctionsReloaded(teamId, [itemId]) pattern
- Added validation for group properties in GroupsManagerService to ensure group types and keys are strings
- Modified service initialization order by removing explicit start()/stop() lifecycle methods from CdpApi
- Improved parallel fetching of team hog functions and team data in _parseKafkaBatch for better performance
The changes appear to be a third attempt at implementing this feature, suggesting previous versions had issues that needed addressing. The removal of lazy loading aims to resolve problems with filtering and groups parsing while maintaining core functionality.
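To make the group-property validation mentioned in the summary concrete, here is a minimal sketch. It is not the code from GroupsManagerService: the function name parseGroupKeys and the assumed input shape (an object mapping group type to group key) are illustrative assumptions only.

```typescript
// Hypothetical helper, not taken from the PR: keeps only the entries whose
// group type is a non-empty string and whose group key is a string.
function parseGroupKeys(rawGroups: unknown): Record<string, string> {
    const validGroups: Record<string, string> = {}

    // Anything that is not a plain object (null, arrays, primitives) yields no groups.
    if (typeof rawGroups !== 'object' || rawGroups === null || Array.isArray(rawGroups)) {
        return validGroups
    }

    const candidate = rawGroups as Record<string, unknown>
    for (const groupType of Object.keys(candidate)) {
        const groupKey = candidate[groupType]
        // Skip entries such as { project: 42 } or { team: null }.
        if (groupType.length > 0 && typeof groupKey === 'string') {
            validGroups[groupType] = groupKey
        }
    }
    return validGroups
}
```

Dropping invalid entries rather than throwing is a design choice of this sketch; the actual service may handle malformed groups differently.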
24 file(s) reviewed, 6 comment(s)
      })
      // Trigger the reload that django would do
    - await processor.hogFunctionManager.reloadAllHogFunctions()
    + processor['hogFunctionManager']['onHogFunctionsReloaded'](team.id, [item.id])
style: accessing private members with bracket notation suggests these properties should potentially be protected instead of private for testing purposes
      })
      // Trigger the reload that django would do
    - await processor.hogFunctionManager.reloadAllHogFunctions()
    + processor['hogFunctionManager']['onHogFunctionsReloaded'](teamId, [item.id])
style: Using private property access with ['hogFunctionManager'] is brittle. Consider exposing a public method for testing or using dependency injection.
      const insertHogFunction = async (hogFunction: Partial<HogFunctionType>) => {
          const item = await _insertHogFunction(hub.postgres, team.id, hogFunction)
          // Trigger the reload that django would do
    -     await processor.hogFunctionManager.reloadAllHogFunctions()
    +     processor['hogFunctionManager']['onHogFunctionsReloaded'](team.id, [item.id])
          return item
      }
style: The insertHogFunction helper now uses private property access (['hogFunctionManager']['onHogFunctionsReloaded']) which makes the code brittle to refactoring. Consider exposing this as a public method or creating a test-specific interface.
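One way to act on this suggestion is to keep the handler private and add a thin public wrapper that tests can call. The sketch below is an illustration under assumptions: ExampleHogFunctionManager, notifyHogFunctionsReloaded and getStaleIds are hypothetical names, and the handler body is a stand-in, not the real HogFunctionManagerService logic.

```typescript
// Hypothetical manager used only to illustrate the pattern.
class ExampleHogFunctionManager {
    // Function IDs flagged for reload, grouped by team ID.
    private staleIdsByTeam: Record<number, string[]> = {}

    // Stand-in for the private handler that the tests reach via bracket notation.
    private onHogFunctionsReloaded(teamId: number, hogFunctionIds: string[]): void {
        const existing = this.staleIdsByTeam[teamId] || []
        this.staleIdsByTeam[teamId] = existing.concat(hogFunctionIds)
    }

    // Public entry point: tests call this instead of manager['onHogFunctionsReloaded'].
    public notifyHogFunctionsReloaded(teamId: number, hogFunctionIds: string[]): void {
        this.onHogFunctionsReloaded(teamId, hogFunctionIds)
    }

    // Read-only view so tests can assert on the outcome.
    public getStaleIds(teamId: number): string[] {
        return (this.staleIdsByTeam[teamId] || []).slice()
    }
}
```

With a wrapper like this, a rename of the private handler produces a compile error in one place instead of silently breaking string-keyed lookups scattered across test files.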
    const [teamHogFunctions, team] = await Promise.all([
        this.hogFunctionManager.getHogFunctionsForTeam(clickHouseEvent.team_id, [
            'destination',
        ]),
        this.hub.teamManager.fetchTeam(clickHouseEvent.team_id),
    ])
logic: Fetching the team and its functions in parallel could lead to a race condition if the team is deleted between the two fetches. Consider fetching the team first and returning early if it is not found.
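A rough sketch of the sequential alternative follows. The Team, HogFunction and Loaders types and the loadTeamAndFunctions name are assumptions made for illustration; only the two method names fetchTeam and getHogFunctionsForTeam mirror the snippet under review, and their real signatures may differ.

```typescript
// Simplified stand-in types, assumed for this sketch.
interface Team {
    id: number
}

interface HogFunction {
    id: string
    teamId: number
}

interface Loaders {
    fetchTeam(teamId: number): Promise<Team | null>
    getHogFunctionsForTeam(teamId: number, types: string[]): Promise<HogFunction[]>
}

// Fetch the team first; only load its functions if the team still exists.
async function loadTeamAndFunctions(
    loaders: Loaders,
    teamId: number
): Promise<{ team: Team; teamHogFunctions: HogFunction[] } | null> {
    const team = await loaders.fetchTeam(teamId)
    if (!team) {
        // Early return: no function lookup happens for a missing team.
        return null
    }
    const teamHogFunctions = await loaders.getHogFunctionsForTeam(teamId, ['destination'])
    return { team, teamHogFunctions }
}
```

The trade-off is latency: the two lookups now run one after the other, so this gives up the parallelism that the Promise.all version provides in exchange for skipping the function lookup when the team is gone.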
plugin-server/src/cdp/hog-transformations/hog-transformer.service.test.ts (resolved)
    await manager.reloadAllHogFunctions()
    const teamFunctions = manager.getTeamHogFunctions(teamId2)
    manager['onHogFunctionsReloaded'](teamId2, [hogFunctions[2].id, hogFunctions[1].id])
logic: Reloading with incorrect function IDs - using hogFunctions[2].id but working with teamId2 functions
Problem
Attempt number 3
Changes
Does this work well for both Cloud and self-hosted?
How did you test this code?