cmcd-toolkit is a set of basic tools to collect and analyze data sent by video players with CMCD v2 support.
This repository provides tools to collect and analyze CMCD v2 data locally and on different cloud providers.
The project is organized into the following modules:
- collector: Responsible for receiving, processing, and forwarding CMCD data to a DB or storage instance. It can be configured to work with different backends like Fluentd or Google Pub/Sub.
- fluentd: A data collector for unified logging. It's primarily used in the local setup to forward CMCD data from the collector to an InfluxDB instance for quick visualization with Grafana. It can be configured to output to other destinations like BigQuery, or to apply filters and data enrichment (like adding GeoIP data based on the user IP address).
- gcloud-big-table: Contains the schema definition (bigquery-cmcdv2-schema.json) for storing CMCD data in Google BigQuery within the Google Cloud flavor.
- gcloud-collector-function: A Google Cloud Function implementation of the CMCD collector. This is an alternative to the Docker-based collector for serverless deployments on Google Cloud.
- grafana: Contains Grafana configurations for visualizing CMCD data. It includes dashboards for both local and Google Cloud setups.
- player: A web-based video player based on dash.js, used for testing and demonstrating CMCD data collection. It can be configured to send CMCD data to the collector.
This setup allows you to run the cmcd-toolkit on your local machine using Docker.
This setup uses the following modules: player, collector, fluentd and grafana.
How to run:
- Run docker compose up (or docker compose -f docker-compose.local.yml up).
- The player will be available at http://localhost:8080.
- Press the "Collector" button to start sending CMCD v2 data to the local collector.
- Play any DASH content in the player.
- Log in to Grafana at http://localhost:8081
  - User: admin
  - Password: grafana
- Open a Grafana dashboard to start analyzing the CMCD data from the player.
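Once the stack is up, a quick check that the player and Grafana are answering can save some guesswork. The following is a minimal sketch and not part of the toolkit: it only assumes the two local URLs listed above, and the names is_reachable and LOCAL_SERVICES are hypothetical.

```python
import http.client
import urllib.error
import urllib.request

# URLs taken from the local setup steps; the dictionary itself is illustrative.
LOCAL_SERVICES = {
    "player": "http://localhost:8080",
    "grafana": "http://localhost:8081",
}


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at url, whatever the status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status, so it is up.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, DNS failure, malformed response, etc.
        return False


if __name__ == "__main__":
    for name, url in LOCAL_SERVICES.items():
        state = "reachable" if is_reachable(url) else "not reachable"
        print(f"{name}: {url} is {state}")
```

A service reported as not reachable right after docker compose up may simply still be starting.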
Assuming you already have a Google Cloud account, the high-level steps to deploy are:
- Create a CMCD BigQuery table using the schema found in the gcloud-big-table folder.
- Create a Pub/Sub topic and a subscription to the CMCD BigQuery table.
- Create a Cloud Run Function with the code found in index.js from collector-gcloud-function. This will give you a {public url} for the collector.
- In the player found in the player folder, configure the following URLs:
  - For response mode: {public url}/cmcd/response-mode
  - For event mode: {public url}/cmcd/event-mode
- (Optional) Create a bucket in Cloud Storage with public access and deploy the player for testing the system.
- (Optional) Create a bucket in Cloud Storage and subscribe it to the Pub/Sub CMCD topic for long-term CMCD storage.
- (Optional) Connect a Grafana instance using the BigQuery plugin. Find a Grafana config example in the grafana folder.
To collect CMCD data from a player other than the one pre-configured in this project, you must configure CMCD v2 and set the response and event mode endpoints to the following URLs (note that both collector and gcloud-collector-function have the same API):
- CMCD Response mode: {collector_domain}:{collector_port}/cmcd/response-mode
- CMCD Event mode: {collector_domain}:{collector_port}/cmcd/event-mode
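The two endpoint URLs can be derived from the collector address as sketched below. This is only an illustration: the function name cmcd_endpoints, the http scheme default, and the example host and port are assumptions, not values defined by this project.

```python
from typing import Dict


def cmcd_endpoints(collector_domain: str, collector_port: int,
                   scheme: str = "http") -> Dict[str, str]:
    """Build the response-mode and event-mode URLs for a collector.

    The paths come from the collector API described above; the scheme is an
    assumption and should match how your collector is actually exposed.
    """
    base = f"{scheme}://{collector_domain}:{collector_port}"
    return {
        "response_mode": f"{base}/cmcd/response-mode",
        "event_mode": f"{base}/cmcd/event-mode",
    }


if __name__ == "__main__":
    # Hypothetical host and port, used only to show the resulting shape.
    for mode, url in cmcd_endpoints("collector.example.com", 3000).items():
        print(f"{mode}: {url}")
```

The resulting values are what you would paste into the CMCD v2 configuration of your own player.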
Copy docker-compose.develop.yml to docker-compose.override.yml and then run docker compose up. You will be able to modify the code while the project is running.
Notice:
- If you are making changes in the fluentd configuration, you MAY need to delete the Docker volume of InfluxDB to see the changes applied.
- After changing the codebase, you can run all the unit tests using this command: docker compose -f docker-compose.test.yml up
This project is licensed under the Apache 2.0 License. See the LICENSE file for more details.