AIStreamer Ingestion Library

Google Cloud Video Intelligence Streaming API enables real-time streaming analysis for live media and archived data. Supported features include:

  1. Live Label Detection

  2. Live Shot Change Detection

  3. Live Explicit Content Detection

  4. Live Object Tracking

The AIStreamer ingestion library provides a set of open source interfaces and example code for connecting to the Google Cloud Video Intelligence Streaming API. The library supports:

  1. File Streaming

  2. HTTP Live Streaming (HLS): an HTTP-based media streaming and communication protocol.

  3. Real Time Streaming Protocol (RTSP): a network control protocol for streaming media servers. It is used in conjunction with the Real-time Transport Protocol (RTP) and the RTP Control Protocol (RTCP).

  4. Real Time Messaging Protocol (RTMP): a protocol for streaming audio, video and data over the Internet.

To start using AIStreamer

AIStreamer ingestion library provides a Docker example. Please refer to individual documentation:

Code architecture

AIStreamer ingestion library includes the following three directories:

  • client: Python & C++ client libraries for connecting to Cloud Video Intelligence.
  • env: Docker example for AIStreamer ingestion.
  • proto: Proto definitions and gRPC interface for Cloud Video Intelligence.

Third-party dependency

The open source AIStreamer ingestion library is based on the following Google-owned and third-party open source libraries.

  • Bazel: A build and test tool with multi-language support.
  • gRPC: A high performance, open-source universal RPC framework.
  • Protobuf: Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data.
  • rules_protobuf: Bazel rules for building protocol buffers and gRPC services.
  • glog: C++ implementation of the Google logging module.
  • gflags: C++ library that implements command-line flag processing.
  • ffmpeg: A complete, cross-platform solution to record, convert and stream audio and video.
  • GStreamer: Another cross-platform multimedia processing and streaming framework.

Protobuf definition

Google Cloud Video Intelligence Streaming API supports the following features in video_intelligence_streaming.proto.

// Streaming video annotation feature.
enum StreamingFeature {
  // Unspecified.
  // Label detection. Detect objects, such as dog or flower.
  // Shot change detection.
  // Explicit content detection.
  // Object tracking.
}

The AIStreamer ingestion client sends StreamingAnnotateVideoRequest messages to the Google Cloud Video Intelligence Streaming API servers. The first StreamingAnnotateVideoRequest message must contain only a StreamingVideoConfig and cannot include input_content. Live annotation results can optionally be stored in a customer-specified GCS bucket; this storage option is disabled by default.

// The top-level message sent by the client for the `StreamingAnnotateVideo`
// method. Multiple `StreamingAnnotateVideoRequest` messages are sent.
// The first message must only contain a `StreamingVideoConfig` message.
// All subsequent messages must only contain `input_content` data.
message StreamingAnnotateVideoRequest {
  // *Required* The streaming request, which is either a streaming config or
  // video content.
  oneof streaming_request {
    // Provides information to the annotator, specifying how to process the
    // request. The first `StreamingAnnotateVideoRequest` message must only
    // contain a `video_config` message.
    StreamingVideoConfig video_config = 1;

    // The video data to be annotated. Chunks of video data are sequentially
    // sent in `StreamingAnnotateVideoRequest` messages. Except the initial
    // `StreamingAnnotateVideoRequest` message containing only
    // `video_config`, all subsequent `StreamingAnnotateVideoRequest`
    // messages must only contain `input_content` field.
    bytes input_content = 2;
  }
}
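
The message ordering described above can be sketched in Python as a generator. This is a minimal illustration, not the library's client code: the `StreamingAnnotateVideoRequest` dataclass is a stand-in for the generated proto class, and `build_requests`, `read_chunks`, the dict-shaped config, and the feature string are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class StreamingAnnotateVideoRequest:
    """Stand-in for the proto message; exactly one field should be set."""
    video_config: Optional[dict] = None
    input_content: Optional[bytes] = None


def build_requests(video_config: dict,
                   chunks: Iterable[bytes]) -> Iterator[StreamingAnnotateVideoRequest]:
    """Yield the config-only message first, then one message per video chunk."""
    yield StreamingAnnotateVideoRequest(video_config=video_config)
    for chunk in chunks:
        if chunk:  # skip empty reads so no message is sent without content
            yield StreamingAnnotateVideoRequest(input_content=chunk)


def read_chunks(data: bytes, chunk_size: int) -> Iterator[bytes]:
    """Split a byte string into fixed-size chunks (the last may be shorter)."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


# Hypothetical usage: a 10-byte "video" sent in 4-byte chunks.
requests = list(build_requests({"feature": "LABEL_DETECTION"},
                               read_chunks(b"0123456789", 4)))
```

Keeping the ordering rule inside one generator means callers cannot accidentally send content before the config.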

// Provides information to the annotator that specifies how to process the
// request.
message StreamingVideoConfig {
  // Requested annotation feature.
  StreamingFeature feature = 1;

  // Config for requested annotation feature.
  oneof streaming_config {
    // Config for SHOT_CHANGE_DETECTION.
    StreamingShotChangeDetectionConfig shot_change_detection_config = 2;

    // Config for LABEL_DETECTION.
    StreamingLabelDetectionConfig label_detection_config = 3;

    StreamingExplicitContentDetectionConfig explicit_content_detection_config =
        4;

    StreamingObjectTrackingConfig object_tracking_config = 5;
  }

  // Streaming storage option. By default: storage is disabled.
  StreamingStorageConfig storage_config = 30;
}
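
As a rough sketch of how a client might assemble a `StreamingVideoConfig`, the Python below pairs each feature with its config field and checks the `oneof` constraint before sending. The `Feature` enum names, the dict representation, and both helper functions are illustrative assumptions. Only the config field names come from the proto above, and the rule that the config must match the requested feature is assumed rather than documented here.

```python
from enum import Enum


class Feature(Enum):
    # Illustrative names; each value is the matching oneof field in the proto.
    LABEL_DETECTION = "label_detection_config"
    SHOT_CHANGE_DETECTION = "shot_change_detection_config"
    EXPLICIT_CONTENT_DETECTION = "explicit_content_detection_config"
    OBJECT_TRACKING = "object_tracking_config"


def make_video_config(feature, feature_config=None, storage_config=None):
    """Build a dict shaped like StreamingVideoConfig."""
    config = {"feature": feature.name}
    if feature_config is not None:
        config[feature.value] = feature_config
    if storage_config is not None:
        config["storage_config"] = storage_config
    return config


def check_video_config(config):
    """Raise ValueError if several configs are set or one mismatches the feature."""
    config_fields = {f.value for f in Feature}
    present = [key for key in config if key in config_fields]
    if len(present) > 1:
        raise ValueError("streaming_config is a oneof; set at most one config")
    expected = Feature[config["feature"]].value
    # Assumption: the per-feature config should belong to the requested feature.
    if present and present[0] != expected:
        raise ValueError(f"{present[0]} does not match feature {config['feature']}")
```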

// Config for streaming storage option.
message StreamingStorageConfig {
  // Enable streaming storage. Default: false.
  bool enable_storage_annotation_result = 1;

  // GCS URI to store all annotation results for one client. The client should
  // specify this field as the top-level storage directory. Annotation results
  // of different sessions will be put into different sub-directories denoted
  // by project_name and session_id. All sub-directories will be generated
  // automatically and made accessible to the client in the response proto.
  // URIs must be specified in the following format: `gs://bucket-id/object-id`.
  // `bucket-id` should be a valid GCS bucket created by the client, and the
  // bucket permissions must be configured properly. `object-id` can be an
  // arbitrary string that makes sense to the client. Other URI formats will
  // return an error and cause a GCS write failure.
  string annotation_result_storage_directory = 3;
}
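
Because a malformed URI leads to a write failure, a client may want to check the `gs://bucket-id/object-id` shape before opening a stream. The following Python sketch does only that. The function names and the regular expression are assumptions for illustration, and the check is deliberately looser than the real GCS bucket naming rules.

```python
import re

# Shape check only: gs://bucket-id/object-id with a non-empty object-id.
_GCS_URI = re.compile(r"^gs://(?P<bucket>[^/]+)/(?P<object>.+)$")


def parse_storage_directory(uri: str):
    """Return (bucket_id, object_id), or raise ValueError for other formats."""
    match = _GCS_URI.match(uri)
    if match is None:
        raise ValueError(f"expected gs://bucket-id/object-id, got {uri!r}")
    return match.group("bucket"), match.group("object")


def make_storage_config(uri: str) -> dict:
    """Build a dict shaped like StreamingStorageConfig with storage enabled."""
    parse_storage_directory(uri)  # fail early on a malformed URI
    return {
        "enable_storage_annotation_result": True,
        "annotation_result_storage_directory": uri,
    }
```

Validating on the client side surfaces the mistake before any video bytes are sent, rather than after the server reports a write failure.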

AIStreamer ingestion client receives StreamingAnnotateVideoResponse from Google Cloud Video Intelligence Streaming API servers.

// `StreamingAnnotateVideoResponse` is the only message returned to the client
// by `StreamingAnnotateVideo`. A series of zero or more
// `StreamingAnnotateVideoResponse` messages are streamed back to the client.
message StreamingAnnotateVideoResponse {
  // If set, returns a [google.rpc.Status][] message that
  // specifies the error for the operation.
  google.rpc.Status error = 1;

  // Streaming annotation results.
  StreamingVideoAnnotationResults annotation_results = 2;
}

// Streaming annotation results corresponding to a portion of the video
// that is currently being processed.
message StreamingVideoAnnotationResults {
  // Shot annotation results. Each shot is represented as a video segment.
  repeated VideoSegment shot_annotations = 1;

  // Label annotation results.
  repeated LabelAnnotation label_annotations = 2;

  // Explicit content detection results.
  ExplicitContentAnnotation explicit_annotation = 3;

  // Object tracking results.
  repeated ObjectTrackingAnnotation object_annotations = 4;
}
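
One way a receiver might handle each response is to check the error first and then look at which result lists are populated. The Python below is a sketch under stated assumptions: `Response` and `Results` are simplified stand-ins for the proto messages (the error is a plain string here, not a `google.rpc.Status`), and `summarize` is a hypothetical helper.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Results:
    """Stand-in for StreamingVideoAnnotationResults."""
    shot_annotations: List[object] = field(default_factory=list)
    label_annotations: List[object] = field(default_factory=list)
    explicit_annotation: Optional[object] = None
    object_annotations: List[object] = field(default_factory=list)


@dataclass
class Response:
    """Stand-in for StreamingAnnotateVideoResponse."""
    error: Optional[str] = None
    annotation_results: Optional[Results] = None


def summarize(response: Response) -> dict:
    """Raise on an error response; otherwise count the results by type."""
    if response.error is not None:
        raise RuntimeError(f"streaming annotation failed: {response.error}")
    results = response.annotation_results or Results()
    return {
        "shots": len(results.shot_annotations),
        "labels": len(results.label_annotations),
        "explicit": results.explicit_annotation is not None,
        "objects": len(results.object_annotations),
    }
```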

Bidirectional streaming gRPC interface

The AIStreamer ingestion client uses a bidirectional streaming gRPC interface to talk to the Google Cloud Video Intelligence Streaming API servers. The bidirectional streaming gRPC interface is defined as StreamingVideoIntelligenceService.

// Service that implements streaming Google Cloud Video Intelligence API.
service StreamingVideoIntelligenceService {
  // Performs video annotation with bidirectional streaming: emitting results
  // while sending video/audio bytes.
  // This method is only available via the gRPC API (not REST).
  rpc StreamingAnnotateVideo(stream StreamingAnnotateVideoRequest)
    returns (stream StreamingAnnotateVideoResponse);
}

The AIStreamer ingestion client must use two threads (a sender thread and a receiver thread) to support the bidirectional streaming gRPC interface. For Python and C++ examples related to AIStreamer, see the client directory. For basic gRPC concepts and how they work, see the gRPC documentation.
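
The sender/receiver split can be sketched in Python without a network connection. This is a simulation, not the code in the client directory: `fake_streaming_annotate_video` stands in for the real RPC, and a real client would presumably pass the generated gRPC stub method instead, on the assumption that it accepts a request iterator and returns a response iterator. All function names and the dict-shaped messages are hypothetical.

```python
import queue
import threading

_DONE = object()  # sentinel: the sender has no more requests


def fake_streaming_annotate_video(request_iterator):
    """Stand-in for the RPC: yields one response per content message."""
    for index, request in enumerate(request_iterator):
        if index == 0:
            continue  # the first message carries only the config
        yield {"bytes_seen": len(request["input_content"])}


def run_session(video_config, chunks, rpc=fake_streaming_annotate_video):
    """Send config and chunks on one thread while reading responses on another."""
    outbound = queue.Queue()
    responses = []

    def sender():
        outbound.put({"video_config": video_config})
        for chunk in chunks:
            outbound.put({"input_content": chunk})
        outbound.put(_DONE)

    def request_iterator():
        # Drains the queue that the sender thread fills.
        while True:
            item = outbound.get()
            if item is _DONE:
                return
            yield item

    def receiver():
        for response in rpc(request_iterator()):
            responses.append(response)

    threads = [threading.Thread(target=sender), threading.Thread(target=receiver)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return responses
```

The queue decouples the two directions: the sender never waits for results, and the receiver sees each response as soon as the call yields it.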
