# Split VI Insights into Segments

Video Indexer (VI) generates multiple AI-based insights on uploaded videos.<br>
A small sample of these insights includes:
* Audio transcript
* Textual characters that appear in video frames, extracted using OCR
* Topic detection

One interesting insight that VI produces is a segmentation of the video into scenes and shots, which allows a more granular view of the video.

VI generates a long JSON file that summarizes all of these insights.
By indexing the JSON file in Azure Cognitive Search, we can search these insights and find **videos** containing any of the identified insights.

### Why split VI insights into smaller segments?
Azure simple search ranks documents with BM25, which scores a document by the frequency of the query terms in it, normalized by the document's length.
Using the full JSON as a single document for BM25 evaluation may dilute the important information we want to search.
Splitting the VI insights into smaller, interval-based files allows us to identify areas of interest and pinpoint specific times in the video.
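
To make the dilution effect concrete, here is a minimal sketch of the standard BM25 score for a single query term. The function name `bm25_term_score` and all the numbers are hypothetical and chosen only for illustration; `k1=1.2` and `b=0.75` are commonly used defaults, and the sketch is not the search service's actual implementation.

```python
import math

def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Score of one query term for one document under the standard BM25 formula."""
    # Rare terms (low doc_freq) get a higher inverse document frequency.
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Documents longer than average are penalized, shorter ones are boosted.
    length_norm = 1 - b + b * doc_len / avg_doc_len
    return idf * tf * (k1 + 1) / (tf + k1 * length_norm)

# Hypothetical numbers: the term occurs twice in each document, but one document
# is a 5,000-token full insights file and the other is a 200-token segment.
# All other statistics are held equal to isolate the effect of document length.
full_doc = bm25_term_score(tf=2, doc_len=5000, avg_doc_len=1000, n_docs=100, doc_freq=10)
segment = bm25_term_score(tf=2, doc_len=200, avg_doc_len=1000, n_docs=100, doc_freq=10)
print(full_doc < segment)
```

With the same number of matches, the shorter segment scores higher than the long document, which is the motivation for indexing segments rather than the full file.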

### How does the Splitter work?
The `VIinsightsToSegmentsParser` iterates over the various insights generated by the Video Indexer and splits the file into segments using one of the following segment types:
1. _scenes_ - The insights in the JSON file are split based on the `scenes` key in the insights.
2. _shots_ - The insights in the JSON file are split based on the `shots` key in the insights. Each scene may be split into multiple shots. A new shot represents a change in the angle of the camera.
3. _intervals_ - Equal-sized segments, split every n seconds. For example, a video with a duration of 100 seconds and a 30-second interval will be split into 4 segments.
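
The interval arithmetic in the last item can be sketched as follows. The helper `interval_starts` is hypothetical and is not part of `VIinsightsToSegmentsParser`; it only shows how a 100-second video with a 30-second interval yields 4 segments, the last one being shorter.

```python
import math

def interval_starts(duration, interval_duration):
    """Return the start time (in seconds) of each equal-sized segment."""
    # Round up so that a trailing partial interval still gets its own segment.
    n_segments = math.ceil(duration / interval_duration)
    return [i * interval_duration for i in range(n_segments)]

print(interval_starts(100, 30))  # [0, 30, 60, 90]
```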

### How does assignment work?
A particular insight may span multiple segments. For example, a car may appear in a video for 30 seconds while the scene ends at 00:00:25, meaning the car appears in multiple scenes. How do we split the insights into segments in this case?
The `overlap` parameter defines how the assignment works:
1. _first_ - Assigns an insight to the **first** relevant segment, meaning that each insight instance appears in only 1 segment.
2. _duplicate_ - Assigns the insight to **every** segment relevant to the instance.
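
The two modes can be illustrated with a small sketch. The function `assign_to_segments` and the scene boundaries below are assumptions made for this example and do not reflect the parser's internal code; the sketch only mirrors the car example above, where the instance runs from 0 to 30 seconds and the first scene ends at 25 seconds.

```python
def assign_to_segments(start, end, segments, overlap="first"):
    """Return the start times of the segments an insight instance is assigned to.

    segments is a list of (segment_start, segment_end) pairs in seconds,
    sorted by segment_start.
    """
    # A segment is relevant when its time range intersects the instance's range.
    hits = [seg_start for seg_start, seg_end in segments
            if start < seg_end and end > seg_start]
    if overlap == "first":
        return hits[:1]
    if overlap == "duplicate":
        return hits
    raise ValueError(f"unknown overlap mode: {overlap}")

# Hypothetical scenes: the first ends at 25 seconds, the second at 60 seconds.
scenes = [(0, 25), (25, 60)]
print(assign_to_segments(0, 30, scenes, overlap="first"))      # [0]
print(assign_to_segments(0, 30, scenes, overlap="duplicate"))  # [0, 25]
```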


## Load VI Insights File (will be read from the API)

In [1]:
"""
Copyright (c) Microsoft Corporation.
Licensed under the MIT license.
"""
import json

def read_json_file(location):
    # Open the JSON file; the context manager closes it after reading
    with open(location, "r") as f:
        # Return the JSON object as a dictionary
        return json.load(f)

data = read_json_file('common/notebooks/demo/insights_splitter/insights/exmp2.json')


## Load the Parser

In [2]:
from enrichment.insights_splitter.insights_to_segments_splitter import VIinsightsToSegmentsParser
configuration = read_json_file("common/enrichment/insights_splitter/splitter_configuration.jsonc")
print(configuration.keys())
parser = VIinsightsToSegmentsParser(configuration)

dict_keys(['INSIGHTS_STATIC_FIELDS', 'SUMMARIZED_INSIGHTS_TO_KEEP', 'INSIGHTS_TO_PARSE', 'DEFAULT_OVERLAP', 'DEFAULT_INTERVAL_DURATION'])


## Let's parse the insights into **equal**-length intervals

In [None]:
interval_segments = parser.split_vi_insights(data,
                                            segment_type = 'interval',
                                            interval_duration = 5,
                                            keys_to_extract=['transcript','ocr'],
                                            overlap = 'first'
                                            )


#### Look at the generated segment keys

In [None]:
interval_segments.keys()

#### Now we can look at the segment covering 10<X<=15 seconds and at all the transcripts generated in that interval.
You can see that the second instance starts at time 00:00:14 and lasts until 00:00:21 (spanning multiple segments).<br>
Because `overlap` is set to `first`, we won't find this transcript in any additional segments.


In [None]:
interval_segments[10]['videos'][0]['insights']['transcript']

## Let's parse the insights based on VI **scenes**
The example we're running includes 6 scenes, so we expect to have 6 generated segments

In [None]:
len(data['videos'][0]['insights']['scenes'])

In [None]:
scene_segments = parser.split_vi_insights(data,
                                            segment_type = 'scenes',
                                            keys_to_extract=['transcript','ocr'],
                                            overlap = 'first'
                                            )

In [None]:
scene_segments.keys()

#### Let's look at the first transcript entry in the segment starting at second 69

In [None]:
scene_segments[69]['videos'][0]['insights']['transcript'][0]

## Let's look at the **duplicate** overlap methodology

In [3]:
shots_segments = parser.split_vi_insights(data,
                                        segment_type = 'shots',
                                        keys_to_extract=['labels'],
                                        overlap = 'duplicate'
                                        )

In the original insights, there is a _man_ label that appears at several timestamps.
One of its instances spans 01:15 (75 seconds) to 01:25 (85 seconds).


In [12]:
[x for x in data['videos'][0]['insights']['labels'] if x['name']=='man'][0]['instances'][1]


{'confidence': 0.9764,
 'adjustedStart': '0:01:15.075',
 'adjustedEnd': '0:01:25.1183666',
 'start': '0:01:15.075',
 'end': '0:01:25.1183666'}

This insight spans 2 segments, 69 and 85,
so we expect to see it duplicated across those segments.
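
As a sanity check on that expectation, the sketch below converts the instance's timestamps to seconds and looks up which shot segments they fall into. The helpers `to_seconds` and `overlapping_segments` are hypothetical and written only for this check; the segment start keys are copied from the `shots_segments.keys()` output in this notebook.

```python
from bisect import bisect_right

def to_seconds(timestamp):
    """Convert an 'H:MM:SS.fff' string to seconds."""
    hours, minutes, seconds = timestamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)

def overlapping_segments(start, end, segment_starts):
    """Return the start keys of the segments that the [start, end] range touches.

    segment_starts must be sorted; each segment runs until the next start key.
    """
    first = bisect_right(segment_starts, start) - 1
    last = bisect_right(segment_starts, end) - 1
    return segment_starts[max(first, 0): last + 1]

keys = [0.0, 5.0, 15.0, 53.0, 65.0, 69.0, 85.0, 94.0, 100.0, 113.0, 124.0]
start = to_seconds("0:01:15.075")
end = to_seconds("0:01:25.1183666")
print(overlapping_segments(start, end, keys))  # [69.0, 85.0]
```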


In [15]:
shots_segments.keys()

odict_keys([0.0, 5.0, 15.0, 53.0, 65.0, 69.0, 85.0, 94.0, 100.0, 113.0, 124.0])

In [37]:
shots_segments[85]['videos'][0]['insights']['labels'][3]['instances'][1]

{'confidence': 0.9764,
 'adjustedStart': '0:01:15.075',
 'adjustedEnd': '0:01:25.1183666',
 'start': '0:01:15.075',
 'end': '0:01:25.1183666'}

In [35]:
shots_segments[69]['videos'][0]['insights']['labels'][7]['instances'][1]

{'confidence': 0.9764,
 'adjustedStart': '0:01:15.075',
 'adjustedEnd': '0:01:25.1183666',
 'start': '0:01:15.075',
 'end': '0:01:25.1183666'}