Feat/analytics reporting engine #109
Adds new environment variables to `.env.example` required for the AnalyticsSyncWorker to connect to third-party reporting APIs.
Includes a property ID for Google Analytics and project/service account credentials for Mixpanel, ensuring all secrets are managed centrally.
Adds static getters to the `EnvironmentConfig` class to safely read the new Google Analytics and Mixpanel credentials from the environment.
This makes the credentials available to the application's dependency injection system in a type-safe and centralized manner.
Creates strongly-typed Dart models to parse the JSON response from the Google Analytics Data API's `runReport` method.
These models (`RunReportResponse`, `GARow`, `GADimensionValue`, `GAMetricValue`) ensure type safety and robust error handling when processing data fetched from Google Analytics.
Creates strongly-typed Dart models to parse the JSON response from various Mixpanel API endpoints.
This includes a generic `MixpanelResponse` and specific data structures for handling results from segmentation and trends queries, ensuring type-safe data processing.
Creates the `AnalyticsReportingClient` abstract class, which defines the contract for fetching aggregated analytics data from any third-party provider.
This interface includes methods for getting time-series data, single metric values, and ranked lists, ensuring that the `AnalyticsSyncService` can work with any provider implementation in a uniform way.
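The contract described above can be sketched as follows. This is a minimal illustration in Python rather than the project's Dart; the method names mirror `getTimeSeries`, `getMetricTotal`, and `getRankedList` from the commits in this PR, while the parameter lists, the `DataPoint` and `RankedListItem` fields, and the `InMemoryClient` stub are assumptions made for the example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DataPoint:
    label: str    # assumed field: ISO date of the bucket
    value: float  # assumed field: metric value for that bucket


@dataclass(frozen=True)
class RankedListItem:
    entity_id: str  # assumed field
    value: float    # assumed field


class AnalyticsReportingClient(ABC):
    """Provider-agnostic contract; signatures are illustrative."""

    @abstractmethod
    def get_time_series(self, metric: str, start: date, end: date) -> list[DataPoint]:
        ...

    @abstractmethod
    def get_metric_total(self, metric: str, start: date, end: date) -> float:
        ...

    @abstractmethod
    def get_ranked_list(
        self, metric: str, dimension: str, start: date, end: date, limit: int = 10
    ) -> list[RankedListItem]:
        ...


class InMemoryClient(AnalyticsReportingClient):
    """Hypothetical stub provider, used only to exercise the contract."""

    def __init__(self, daily: dict[date, float]) -> None:
        self._daily = daily

    def get_time_series(self, metric: str, start: date, end: date) -> list[DataPoint]:
        # Return the days inside the inclusive range, oldest first.
        return [
            DataPoint(day.isoformat(), value)
            for day, value in sorted(self._daily.items())
            if start <= day <= end
        ]

    def get_metric_total(self, metric: str, start: date, end: date) -> float:
        # Derive the total from the time series instead of a second query.
        return sum(p.value for p in self.get_time_series(metric, start, end))

    def get_ranked_list(
        self, metric: str, dimension: str, start: date, end: date, limit: int = 10
    ) -> list[RankedListItem]:
        return []  # the stub has no dimensional data
```

Because callers depend only on the abstract class, a sync service written against it would not need to know which provider is active.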
Creates the `GoogleAnalyticsDataClient`, a concrete implementation of `AnalyticsReportingClient` for fetching data from the Google Analytics Data API.
This client reuses the existing `FirebaseAuthenticator` to obtain OAuth2 tokens and constructs the appropriate `runReport` requests to query metrics and dimensions from the GA4 property. It handles the transformation of the API response into the application's standard `DataPoint` and `RankedListItem` models.
Creates the `MixpanelDataClient`, a concrete implementation of `AnalyticsReportingClient` for fetching data from the Mixpanel API.
This client uses Basic Authentication with a service account and constructs requests for Mixpanel's segmentation and trends endpoints. It transforms the API responses into the application's standard `DataPoint` and `RankedListItem` models.
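For readers unfamiliar with HTTP Basic Authentication, the header such a client sends is the Base64 encoding of `username:secret`. The sketch below is in Python for illustration only (the PR itself is Dart); `basic_auth_header` is a hypothetical helper, not a function from this codebase.

```python
import base64


def basic_auth_header(username: str, secret: str) -> dict[str, str]:
    """Build an HTTP Basic Authorization header from service account credentials."""
    raw = f"{username}:{secret}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

The credentials themselves would come from the environment variables added earlier in this PR, never from source code.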
Creates the `AnalyticsSyncService`, the core orchestrator for the background worker. This service reads the remote config to determine the active provider, instantiates the correct reporting client, and iterates through all `KpiCardId`, `ChartCardId`, and `RankedListCardId` enums.
For each ID, it fetches the corresponding data from the provider, transforms it into the appropriate `KpiCardData`, `ChartCardData`, or `RankedListCardData` model, and upserts it into the database using the generic repositories. This service encapsulates the entire ETL (Extract, Transform, Load) logic.
Adds a new standalone Dart application in the `bin/` directory. This script serves as the entry point for the `AnalyticsSyncWorker` process. It initializes the application dependencies, retrieves the `AnalyticsSyncService`, and executes its `run()` method. This executable can be compiled and run by a cron job to perform the periodic data synchronization.
Adds a new `analytics.read` permission to the `Permissions` class. This permission will be used to grant dashboard administrators access to the pre-aggregated analytics data models.
Assigns the new `analytics.read` permission to the `_dashboardAdminPermissions` set. This ensures that users with the `admin` dashboard role can access the analytics card data via the generic `/data` API route.
…ummary
Updates the `DataOperationRegistry` to map the new analytics models (`kpi_card_data`, `chart_card_data`, `ranked_list_card_data`) to their corresponding `DataRepository` operations.
This enables the generic `/data` route to serve the pre-aggregated analytics data to the dashboard. The obsolete `DashboardSummaryService` and its related operations are also removed.
Updates the `DatabaseSeedingService` to ensure indexes and placeholder documents are created for the new analytics collections (`kpi_card_data`, `chart_card_data`, `ranked_list_card_data`).
This structural seeding prevents `NotFound` errors on the dashboard before the first `AnalyticsSyncWorker` run and ensures the API always returns a valid, empty object.
- Removed the 'dashboard_summary' ModelConfig from the model registry
- This change affects the registry/model_registry.dart file

…ric mapping functionalities
- Inject headlineRepository into GoogleAnalyticsDataClient and MixpanelDataClient
- Add headlineRepository and analyticsMetricMapper to AnalyticsSyncService
- Create AnalyticsMetricMapper instance in AppDependencies

… repositories
- Remove DashboardSummaryService import and usage
- Add imports for KpiCardDataRepository, ChartCardDataRepository, and RankedListCardDataRepository
- Update middleware to use new repository providers instead of DashboardSummaryService

- Add implementation for fetching ranked list from Google Analytics
- Improve error handling and logging for getTimeSeries and getMetricTotal methods
- Add DataRepository dependency for headline management

…ic total calculation
- Add implementation for getRankedList method in MixpanelDataClient
- Improve getMetricTotal by utilizing existing time series data
- Enhance getTimeSeries to handle different metric names
- Add DataRepository dependency for future enhancements

- Create AnalyticsMetricMapper class to map internal analytics card IDs to provider-specific metrics
- Add support for Firebase, Mixpanel, and Demo analytics providers
- Implement mappings for KPI cards, chart cards, and ranked list cards
- Define ProviderMetrics typedef for holding metric and dimension names
- Include error handling for demo provider which does not have metrics

…KPI, Chart, and RankedList cards
- Add logic to fetch data from analytics provider based on active provider config
- Implement KPI, Chart, and RankedList metric mapping
- Add time frame calculation and data point enrichment for ranked list cards
- Include headline repository for enriching ranked list items
- Refactor sync processes to handle different analytics providers

- Export 'analytics_metric_mapper.dart' from the analytics service library
- This change allows other parts of the application to use the analytics metric mapper functionality

Creates a new `AnalyticsQuery` sealed class hierarchy. This replaces
the fragile pattern of passing primitive strings for metrics and
dimensions.
By defining structured query types like `EventCountQuery` and
`StandardMetricQuery`, we create a strong contract between the sync
service and the data clients, improving type safety and making the
system more expressive and maintainable.
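The idea of replacing primitive strings with structured query types can be sketched as below. The class names `EventCountQuery` and `StandardMetricQuery` come from this commit; the fields, the `describe` function, and the use of Python instead of a Dart sealed class are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class EventCountQuery:
    event_name: str  # assumed field: name of the tracked event to count


@dataclass(frozen=True)
class StandardMetricQuery:
    metric: str  # assumed field: provider-independent metric name


# Closed set of query types, standing in for a sealed class hierarchy.
AnalyticsQuery = Union[EventCountQuery, StandardMetricQuery]


def describe(query: AnalyticsQuery) -> str:
    """Dispatch on the query type; unknown types fail loudly."""
    if isinstance(query, EventCountQuery):
        return f"count of event '{query.event_name}'"
    if isinstance(query, StandardMetricQuery):
        return f"standard metric '{query.metric}'"
    raise TypeError(f"unsupported query type: {type(query).__name__}")
```

A data client would branch on the query type in the same way when building a provider request, so a bare string can no longer be passed where a query is expected.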
Summary of Changes
Hello @fulleni, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request delivers a robust analytics reporting engine, transforming how the application gathers and presents key performance indicators and insights. It establishes a scheduled, provider-agnostic data pipeline that extracts raw data, processes it into structured metrics, and stores it efficiently for rapid dashboard display. This foundational change significantly enhances the application's analytical capabilities, moving from basic on-demand summaries to a comprehensive, high-performance system for understanding user behavior and application performance.
…rformance section
- Rename 'Powerful, Provider-Agnostic Analytics Pipeline' to 'Unified Business Intelligence Engine'
- Update description to highlight dual-source ETL and combination of user behavior analytics with operational metrics
- Remove 'Architecture & Infrastructure' section as it was redundant
Code Review
This pull request introduces a comprehensive and well-architected analytics reporting engine. The new feature includes a standalone worker for data synchronization, support for both Google Analytics and Mixpanel, and a clear separation of concerns with dedicated services, mappers, and data models. The addition of extensive tests for the new functionality is commendable. I've identified one critical bug related to how database-driven metrics are resolved, which could lead to incorrect analytics data. I've also included a couple of medium-severity suggestions to improve code safety and clarity regarding a dependency downgrade and null-safety patterns. Overall, this is a strong feature addition.
- Remove unnecessary null-aware operator from analyticsSyncService call
- Improve code readability and performance slightly

- Change AnalyticsSyncService type from nullable to non-nullable
- Ensure consistent null safety handling across app dependencies

- Update database metric names to include specific resource and measurement
- Affected KPIs: headlines, sources, source followers, topics, topic followers, reports, user role distribution, views by topic, headlines by source, source engagement by type, source status distribution, topic engagement, breaking news distribution, reactions by type, report resolution time, reports by reason, app review feedback
- Update ranked list metrics for most followed sources and topics

- Rename metrics to follow a consistent naming convention
- Categorize metrics under users, reports, engagements, app_reviews, headlines, sources, and topics
- Update method calls and variable names accordingly

- Update metric names to be more specific and consistent
- Implement a more robust repository lookup mechanism
- Add error handling for invalid or unknown metrics
- Refactor switch statement for better readability and maintainability

- Update metric names to use the new hierarchical format
- Change 'database:userRoleDistribution' to 'database:users:userRoleDistribution'
- Change 'database:reportsByReason' to 'database:reports:byReason'
- Change 'database:reactionsByType' to 'database:engagements:reactionsByType'
- Change 'database:appReviewFeedback' to 'database:app_reviews:feedback'
- Change 'database:topicEngagement' to 'database:headlines:topicEngagement'
- Change 'database:sourcesByFollowers' to 'database:sources:byFollowers'
- Change 'database:topicsByFollowers' to 'database:topics:byFollowers'
- Change 'database:avgReportResolutionTime' to 'database:reports:avgResolutionTime'
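The hierarchical `database:<category>:<measurement>` format from the commits above lends itself to a strict parser that rejects the old flat names. The following is a hedged sketch in Python (the PR is Dart); `parse_database_metric` and `DatabaseMetric` are hypothetical names, and the category set is taken from the list in the renaming commit.

```python
from typing import NamedTuple

# Categories listed in the renaming commit.
KNOWN_CATEGORIES = frozenset(
    {"users", "reports", "engagements", "app_reviews", "headlines", "sources", "topics"}
)


class DatabaseMetric(NamedTuple):
    category: str
    measurement: str


def parse_database_metric(name: str) -> DatabaseMetric:
    """Split 'database:<category>:<measurement>' and validate each part."""
    parts = name.split(":")
    if len(parts) != 3 or parts[0] != "database":
        raise ValueError(f"not a hierarchical database metric: {name!r}")
    _, category, measurement = parts
    if category not in KNOWN_CATEGORIES:
        raise ValueError(f"unknown metric category: {category!r}")
    if not measurement:
        raise ValueError(f"missing measurement in metric: {name!r}")
    return DatabaseMetric(category, measurement)
```

With this shape, the category can select the repository and the measurement can select the aggregation, which is one way to read the "more robust repository lookup mechanism" mentioned above.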
/gemini review
Code Review
This pull request introduces a comprehensive analytics reporting engine, which is a significant and well-architected feature. The use of a provider-agnostic design with a separate worker for ETL processes is excellent. The code is generally of high quality, with good separation of concerns and extensive test coverage.
My review focuses on improving robustness, efficiency, and maintainability. I've identified a critical issue where new analytics models are not registered, which would cause API failures. I've also found several areas where error handling can be improved to prevent potential runtime crashes from unsafe assumptions about API responses or aggregation results. Additionally, there are opportunities to optimize performance in the Mixpanel client and to improve dependency management practices in pubspec.yaml. Overall, these are great changes that build a solid foundation for analytics.
- Refactor the initial configuration setup to use a more generic sanitization process
- Replace insertOne with updateOne using $setOnInsert to prevent overwriting admin changes
- Add sanitization for ad platform and analytics provider, defaulting to non-demo options
- Ensure proper timestamps are set for initial creation
- Remove redundant existence check, simplifying the overall logic
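The effect of an upsert with `$setOnInsert` is that seed defaults are written only when the document does not exist yet, so later admin edits survive a re-run of the seeder. The sketch below imitates that behavior with a plain dictionary in Python; it does not talk to MongoDB, and `seed_if_absent`, `build_seed_update`, and the `createdAt`/`updatedAt` field names are assumptions for illustration.

```python
from datetime import datetime, timezone


def build_seed_update(defaults: dict) -> dict:
    """Shape of the update document an upsert with $setOnInsert would send."""
    now = datetime.now(timezone.utc)
    return {"$setOnInsert": {**defaults, "createdAt": now, "updatedAt": now}}


def seed_if_absent(collection: dict, doc_id: str, defaults: dict) -> bool:
    """In-memory stand-in for an upsert that uses only $setOnInsert.

    Returns True when the document was created, False when it already
    existed and was therefore left untouched.
    """
    if doc_id in collection:
        return False
    fields = build_seed_update(defaults)["$setOnInsert"]
    collection[doc_id] = {"_id": doc_id, **fields}
    return True
```

Because a single call covers both the "exists" and "missing" cases, the separate existence check mentioned in the commit becomes unnecessary.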
- Add KpiCardData, ChartCardData, and RankedListCardData models to the registry
- Set up appropriate permissions for analytics read access
- Define model configurations including ID retrieval and permission settings

- Use `firstOrNull` instead of accessing the first element directly
- Add null safety checks for label and value in `_formatDataPoints` method
- Implement null checks and logging for ranked list items

- Use firstOrNull instead of first to avoid exceptions when lists are empty
- Add null checks for dimensionValues and metricValues to ensure safe access
- Update methods to handle cases where data might be missing without causing errors

- Remove unnecessary fallback to first value in segmentation data
- Implement correct logic to fetch total value without 'unit' parameter
- Update API call to use segmentation endpoint for total calculation
- Add error handling for empty response values

- Update `getValue` method to handle cases where there are multiple values in the segmentation data
- Instead of returning just the first value, it now sums all values in the list
- This change ensures more accurate reporting when multiple engagement types are present

- Make unit parameter optional in MixpanelSegmentationRequest class
- Update props getter to handle nullable unit
- Adjust toJson method to accommodate nullable unit value

…h nullable 'unit'
fix(analytics): add missing 'unit' parameter for Mixpanel
- Improve code consistency and prepare for future metric work

- Include MixpanelTimeUnit.day in contentPropertiesEvent query to ensure test consistency

- Modify GARow constructor to make dimensionValues a required parameter
- This change ensures that dimensionValues are always provided, improving data integrity
/gemini review
Code Review
This pull request introduces a comprehensive and well-architected analytics reporting engine. The new system is provider-agnostic, supports both Google Analytics and Mixpanel, and uses a scheduled worker to pre-aggregate data, which is an excellent design for dashboard performance. The introduction of structured query objects, dedicated client implementations, and a clear separation of concerns with mappers and query builders is commendable. I've identified a few areas for improvement, mainly concerning code health with a disabled lint, a minor null-safety pattern, and an opportunity for refactoring to reduce code duplication in the main sync service. Overall, this is a strong feature addition that significantly enhances the application's capabilities.
Status
READY
Description
This pull request delivers a robust analytics reporting engine, transforming how the application gathers and presents key performance indicators and insights. It establishes a scheduled, provider-agnostic data pipeline that extracts raw data, processes it into structured metrics, and stores it efficiently for rapid dashboard display. This foundational change significantly enhances the application's analytical capabilities, moving from basic on-demand summaries to a comprehensive, high-performance system for understanding user behavior and application performance.
Type of Change