Enhance Sunbird Ed's Data Pipeline to Operate on Sunbird Obsrv. #6

Open
6 tasks
manjudr opened this issue May 3, 2023 · 0 comments

Comments

@manjudr
Collaborator

manjudr commented May 3, 2023

Project Details

What is Sunbird Obsrv?

Sunbird Obsrv comprises several pluggable tools and microservices that come together to enable observability features on any platform/solution. This includes the ability to capture granular events via telemetry, create measures, and observe various events/actions carried out by the system/users/devices (like IoT devices) on any platform/solution.

Sunbird Obsrv can be utilized as an independent building block by adopters or as part of a system that employs other Sunbird building blocks. Sunbird Obsrv comes with a set of microservices, APIs, and some utility SDKs to make it easy for adopters to rapidly enable powerful data processing and aggregation infrastructure to process telemetry data, validate telemetry stream data, as well as aggregate and generate actionable insights via APIs. It also has built-in open data cataloguing and publishing capability. It is built keeping extensibility in mind so that adopters have the flexibility to adapt the telemetry and tools to their specific use cases.
More details are here

Features to be implemented

Enhance the Sunbird Ed datasets (telemetry and summary) to run on the Sunbird Obsrv platform.

  • Current Scenario: Sunbird Obsrv is a platform that lets users configure multiple datasets and ingest them into an analytical data store.
    This makes it easier to ingest and analyze large amounts of data and to extract insights from it. In addition to data ingestion, Sunbird Obsrv provides APIs for CRUD (Create, Read, Update, Delete) operations on datasets, so users can manage their data and change it as needed.
    Sunbird Obsrv also offers APIs for data ingestion and data query. The data ingestion API loads data into the data store from external sources, while the data query API retrieves data from the analytical data store.
    At present, Sunbird Ed has summary and telemetry datasets that are not configured on the Sunbird Obsrv platform.
    The goal is to configure these datasets to run on the Sunbird Obsrv platform.

  • Acceptance Criteria: All datasets are configured, data is ingested into them, and visualization charts are generated, all using the Sunbird Obsrv APIs.
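
As a rough illustration of the ingestion and query steps described above, the Python sketch below builds the two request payloads without sending them. Everything in it is an assumption made for illustration: the endpoint paths (`/data/in/{dataset_id}`, `/data/query`), the `{"data": {"events": [...]}}` envelope, and the helper names are hypothetical and should be checked against the Obsrv API documentation. The query body follows the Apache Druid native timeseries query format, on the assumption that the data query API forwards such queries to Druid.

```python
from __future__ import annotations

import json
from typing import Any

# Hypothetical endpoint paths: placeholders, not confirmed Obsrv routes.
DATA_IN_PATH = "/data/in/{dataset_id}"
DATA_QUERY_PATH = "/data/query"


def build_ingest_request(dataset_id: str, events: list[dict[str, Any]]) -> dict[str, Any]:
    """Wrap raw events in a request for the (assumed) data ingestion endpoint."""
    if not events:
        raise ValueError("at least one event is required")
    return {
        "path": DATA_IN_PATH.format(dataset_id=dataset_id),
        "body": {"data": {"events": events}},  # assumed envelope shape
    }


def build_timeseries_query(data_source: str, interval: str) -> dict[str, Any]:
    """Build a Druid native timeseries query that counts rows per day."""
    return {
        "path": DATA_QUERY_PATH,
        "body": {
            "queryType": "timeseries",
            "dataSource": data_source,
            "granularity": "day",
            "intervals": [interval],  # ISO 8601 interval, start/end
            "aggregations": [{"type": "count", "name": "total_events"}],
        },
    }


if __name__ == "__main__":
    # Illustrative telemetry event; the field values are made up.
    sample_event = {"eid": "IMPRESSION", "ets": 1683072000000, "mid": "example-mid-1"}
    print(json.dumps(build_ingest_request("telemetry", [sample_event]), indent=2))
    print(json.dumps(build_timeseries_query("telemetry-raw", "2023-05-01/2023-05-08"), indent=2))
```

Keeping payload construction separate from the HTTP call makes it possible to review and unit-test the request shapes before pointing them at a running Obsrv instance.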

Learning Path

  • Complexity - Large
  • Skills Required - Flink, Apache Druid, APIs, Querying
  • Name of Mentors - @manjudr @anandp504
  • Project size - 8 Weeks

Product Set Up

Milestones

  • Understanding of the existing Sunbird Ed dataset configurations
  • Understanding of Obsrv APIs
  • Configure the datasets (telemetry, summary)
  • Configure the data sources (telemetry-raw, telemetry-rollup, telemetry-weekly-rollup, summary-raw)
  • Index the data using the Data IN API
  • Create visualization charts for each dataset.
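
One way to keep the dataset and data source milestones organized is to describe the four data sources as plain data before calling any API. The sketch below is only a planning aid under stated assumptions: the mapping of each data source to a dataset is inferred from the names, the granularities are guesses (daily for `telemetry-rollup`, weekly for `telemetry-weekly-rollup`), and the `DataSource` class and helper functions are hypothetical. The `granularitySpec` dictionary follows the shape used in Apache Druid ingestion specs.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    name: str
    dataset: str              # parent dataset, inferred from the name
    segment_granularity: str  # assumed value
    query_granularity: str    # assumed value
    rollup: bool


# Names come from the milestone list; every other field is an assumption.
DATA_SOURCES = [
    DataSource("telemetry-raw", "telemetry", "day", "none", False),
    DataSource("telemetry-rollup", "telemetry", "day", "day", True),
    DataSource("telemetry-weekly-rollup", "telemetry", "week", "week", True),
    DataSource("summary-raw", "summary", "day", "none", False),
]


def granularity_spec(source: DataSource) -> dict[str, object]:
    """Return a Druid-style granularitySpec for one data source."""
    return {
        "type": "uniform",
        "segmentGranularity": source.segment_granularity,
        "queryGranularity": source.query_granularity,
        "rollup": source.rollup,
    }


def sources_by_dataset(sources: list[DataSource]) -> dict[str, list[str]]:
    """Group data source names under their parent dataset."""
    grouped: dict[str, list[str]] = {}
    for source in sources:
        grouped.setdefault(source.dataset, []).append(source.name)
    return grouped


if __name__ == "__main__":
    for dataset, names in sources_by_dataset(DATA_SOURCES).items():
        print(dataset, "->", ", ".join(names))
```

A table like this can be reviewed against the existing Sunbird Ed configurations first, then translated into the actual Obsrv dataset and data source API calls.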
manjudr pushed a commit that referenced this issue Jun 1, 2023
…t to process stats for all events by removing the event itself
manjudr pushed a commit that referenced this issue Jun 1, 2023
#5 [SV] - Master dataset processing pipeline