<a href="https://colab.research.google.com/github/pavlinov/code_examples/blob/master/A_Pavlinov_for_Backend_Challenge.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
"""
Backend Challenge for Developer Stardome 🌟

Please find the technical challenge for the Backend Developer role below.

We know technical challenges require some personal time commitment, so just take your time to complete it.
We’d just like you to share an estimated delivery date once you get a chance to review it.

If you have any questions or doubts, please feel free to reach out to the person who has sent you this document.

Thank you and good luck! 🌐🚀🌌

================================== Challenge ==================================

The high-level goal is to take a series of video files, transcribe them using open-source models, and make the transcripts searchable using embedding-based search.

Please solve this challenge in Python and share detailed instructions on how to run the code.

● 🎙️➡️📝 Take input as a set of video files (.mp4) and run them through any publicly available speech-to-text (S2T) engine.
● 📝➡️🔢 Chunk the output transcript, vectorize it using open-source text embedding models (BERT is one alternative, but feel free to explore others).
● 🗃️ Store the embeddings of the transcript chunks in an open-source graph database that can support vector search. A popular example is FAISS, but feel free to explore other options.
● 🔍 On the querying side, the user input is a text query, which should be used to search the top K results against the vector DB.

⭐⭐⭐ Bonus points:
● If your search system can support searching against both text and embeddings in a single query and rerank the results based on relevance.
"""
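
The bonus item above asks for one query that searches both text and embeddings and then reranks the results. One common way to merge two rankings is reciprocal rank fusion (RRF). The sketch below is illustrative only and is not part of the submitted solution: the function name, the document ids, and the choice of `k=60` (a commonly used default) are assumptions.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60, top_k=5):
    """Fuse several ranked lists of document ids into one ranked list.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    where rank is 1-based within each list.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so the order is stable.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused[:top_k]

# Hypothetical ids: one ranking from a keyword match, one from vector search.
keyword_hits = ["chunk-2", "chunk-7", "chunk-1"]
vector_hits = ["chunk-7", "chunk-3", "chunk-2"]

for doc_id, score in reciprocal_rank_fusion([keyword_hits, vector_hits]):
    print(f"{doc_id}\t{score:.4f}")
```

Because RRF uses only rank positions, it does not need the keyword score and the cosine score to be on the same scale.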

In [27]:
!pip -q install python-dotenv tiktoken
!pip -q install openai
!pip -q install milvus pymilvus
!pip -q install langchain
!pip -q install pydub
!pip -q install --upgrade --force-reinstall sentence_transformers
!pip -q install instructor pydantic
!pip install --upgrade --force-reinstall nltk pyarrow


import os
# Before running, set the Colab secret OPENAI_API_KEY  <<<<<<<<<< READ THIS FIRST
from google.colab import userdata
OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
lida 0.0.10 requires fastapi, which is not installed.
lida 0.0.10 requires kaleido, which is not installed.
lida 0.0.10 requires python-multipart, which is not installed.
lida 0.0.10 requires uvicorn, which is not installed.
llmx 0.0.15a0 requires cohere, which is not installed.
gcsfs 2023.6.0 requires fsspec==2023.6.0, but you have fsspec 2023.12.2 which is incompatible.
imageio 2.31.6 requires pillow<10.1.0,>=8.3.2, but you have pillow 10.2.0 which is incompatible.
tensorflow-probability 0.22.0 requires typing-extensions<4.6.0, but you have typing-extensions 4.9.0 which is incompatible.
torchaudio 2.1.0+cu121 requires torch==2.1.0, but you have torch 2.1.2 which is incompatible.
torchdata 0.7.0 requires torch==2.1.0, but you have torch 2.1.2 which is incompatible.
torchtext 0.16.0 requires torch==2.1.0, but

In [4]:
from milvus import default_server
# Start Milvus on local env
default_server.stop()
default_server.set_base_dir('milvus_data')
default_server.cleanup()
default_server.start()

In [5]:
# STOP Vector DB
#default_server.stop()

In [2]:
patch = """--- __init__.py	2024-01-18 09:55:13.548485781 +0000
+++ __init___new.py	2024-01-18 09:55:47.077306881 +0000
@@ -11,8 +11,8 @@
 __version__ = "0.0.0.dev"


-with suppress(DistributionNotFound):
-    __version__ = get_distribution("pymilvus").version
+#with suppress(DistributionNotFound):
+#    __version__ = get_distribution("pymilvus").version


 def get_commit(version: str = "", short: bool = True) -> str:

"""
patch_path = '/root/pymilvus_client___init__.py.patch'
with open(patch_path, 'w') as file:
  file.write(patch)
  print(f"Patch created: {patch_path}")

!!patch --dry-run '/usr/local/lib/python3.10/dist-packages/pymilvus/client/__init__.py' '/root/pymilvus_client___init__.py.patch'
!!patch '/usr/local/lib/python3.10/dist-packages/pymilvus/client/__init__.py' '/root/pymilvus_client___init__.py.patch'

Patch created: /root/pymilvus_client___init__.py.patch


['patching file /usr/local/lib/python3.10/dist-packages/pymilvus/client/__init__.py',
 'Reversed (or previously applied) patch detected!  Assume -R? [n] ',
 'Apply anyway? [n] ',
 'Skipping patch.',
 '1 out of 1 hunk ignored -- saving rejects to file /usr/local/lib/python3.10/dist-packages/pymilvus/client/__init__.py.rej']
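
The output above shows `patch` skipping the hunk because the file was already patched by an earlier run. A possible alternative, sketched here under the assumption that commenting out the two lines is all that is needed, is a small idempotent helper that can be re-run safely. The helper name `comment_out_lines` and the `TARGETS` set are hypothetical and are not used elsewhere in this notebook.

```python
from pathlib import Path

def comment_out_lines(path, targets):
    """Prefix '#' to every line whose stripped text is in `targets`.

    Returns the number of lines changed; 0 means nothing was left to patch.
    """
    file_path = Path(path)
    changed = 0
    patched = []
    for line in file_path.read_text().splitlines(keepends=True):
        # Lines commented out by an earlier run start with '#', so they no longer match.
        if line.strip() in targets:
            patched.append("#" + line)
            changed += 1
        else:
            patched.append(line)
    if changed:
        file_path.write_text("".join(patched))
    return changed

# The two lines that the patch above comments out.
TARGETS = {
    "with suppress(DistributionNotFound):",
    '__version__ = get_distribution("pymilvus").version',
}
```

Calling `comment_out_lines('/usr/local/lib/python3.10/dist-packages/pymilvus/client/__init__.py', TARGETS)` would return 2 on an untouched file and 0 on later runs.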

In [5]:
from pymilvus import utility, connections

# Constants
_HOST = "127.0.0.1"
_PORT = default_server.listen_port

_COLLECTION_NAME = 'VIDOSO_AUDIO'
_DIM = 384
_INDEX_PARAM = {
    "metric_type": "COSINE",
    "index_type": "HNSW",
    "params": {"M": 8, "efConstruction": 64},
}

_QUERY_PARAM = {
    "metric_type": "COSINE",
    "params": {"ef": 64},
}

# TODO: !!! Patch the module so the connection can start in Google Colab:
# /usr/local/lib/python3.10/dist-packages/pymilvus/client/__init__.py
# +14 #with suppress(DistributionNotFound):
# +15 #     __version__ = get_distribution("pymilvus").version

def create_connection():
    print(f"Create connection...")
    connections.disconnect("default")
    connections.connect("default", timeout=10, host=_HOST, port=_PORT)
    print(f"List connections:")
    print(connections.list_connections())

create_connection()

Create connection...
List connections:
[('default', <pymilvus.client.grpc_handler.GrpcHandler object at 0x7924a208ae30>)]
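
The constants above select `COSINE` as the metric and `HNSW` as the index type. HNSW is an approximate nearest-neighbour index, so it may occasionally miss a true neighbour in exchange for speed. As a point of reference, this is a minimal brute-force sketch of the exact top-K cosine search that the index approximates. It uses toy 3-dimensional vectors rather than `_DIM = 384`, and the function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)

def top_k_exact(query_vec, stored, k=5):
    """Score every stored vector against the query and keep the k best."""
    scored = [(cosine_similarity(query_vec, vec), key) for key, vec in stored.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(key, score) for score, key in scored[:k]]

# Toy data standing in for stored transcript embeddings.
toy_vectors = {"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0], "c": [1.0, 1.0, 0.0]}
print(top_k_exact([1.0, 0.1, 0.0], toy_vectors, k=2))
```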


In [6]:
print(utility.get_server_version())
print(connections.list_connections())

def list_collections():
    print("List collections:")
    print(utility.list_collections())

list_collections()

def drop_collection(name):
  if utility.has_collection(name):
    print(f"Dropping collection: {name}")
    utility.drop_collection(name)

v2.3.3-lite
[('default', <pymilvus.client.grpc_handler.GrpcHandler object at 0x7924a208ae30>)]
List collections:
[]


In [7]:
from pymilvus import FieldSchema, CollectionSchema, DataType, Collection

def create_vector_collection(collection_name):
    drop_collection(collection_name)

    fields = [
      primary_field   := FieldSchema(name='pk',       dtype=DataType.INT64,        auto_id=True,     is_primary=True,  description="int64", ),
      vector_field    := FieldSchema(name='vector',   dtype=DataType.FLOAT_VECTOR, dim=_DIM,         is_primary=False, description="float vector", ),
      file_name_field := FieldSchema(name="fileName", dtype=DataType.VARCHAR,      max_length=256,                     description="File name", ),
      text_field      := FieldSchema(name="text",     dtype=DataType.VARCHAR,      max_length=65535,                   description="Text description of the artifact", ),
    ]


    schema = CollectionSchema(fields, description="collection description")

    collection = Collection(name=collection_name, schema=schema)

    collection.create_index(field_name="vector", index_params=_INDEX_PARAM)

    collection.load()

    print("Collection created:", collection_name)
    return collection

# create collection
collection = create_vector_collection(_COLLECTION_NAME)


if utility.has_collection(_COLLECTION_NAME):
    print(f"Collection {_COLLECTION_NAME}")

list_collections()

Collection created: VIDOSO_AUDIO
Collection VIDOSO_AUDIO
List collections:
['VIDOSO_AUDIO']


In [9]:
import os
import openai

#from dotenv import load_dotenv
#load_dotenv()
# OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

from google.colab import userdata
OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

openai.api_key = OPENAI_API_KEY

In [9]:
#!pip install --upgrade --force-reinstall nltk pyarrow

Collecting nltk
  Using cached nltk-3.8.1-py3-none-any.whl (1.5 MB)
Collecting pyarrow
  Using cached pyarrow-14.0.2-cp310-cp310-manylinux_2_28_x86_64.whl (38.0 MB)
Collecting click (from nltk)
  Using cached click-8.1.7-py3-none-any.whl (97 kB)
Collecting joblib (from nltk)
  Using cached joblib-1.3.2-py3-none-any.whl (302 kB)
Collecting regex>=2021.8.3 (from nltk)
  Using cached regex-2023.12.25-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (773 kB)
Collecting tqdm (from nltk)
  Using cached tqdm-4.66.1-py3-none-any.whl (78 kB)
Collecting numpy>=1.16.6 (from pyarrow)
  Using cached numpy-1.26.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.2 MB)
Installing collected packages: tqdm, regex, numpy, joblib, click, pyarrow, nltk
  Attempting uninstall: tqdm
    Found existing installation: tqdm 4.66.1
    Uninstalling tqdm-4.66.1:
      Successfully uninstalled tqdm-4.66.1
  Attempting uninstall: regex
    Found existing installation: regex 2023.12.25
    Uni

In [10]:
from langchain.vectorstores import Milvus
from pprint import pprint
from langchain_community.embeddings import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
# Reinstall nltk and pyarrow if this cell fails with:
# AttributeError: module 'numpy.linalg._umath_linalg' has no attribute '_ilp64'

VECTOR_DSN = {"host": _HOST, "port": _PORT}

pprint(embeddings)
pprint(f"Collection: {_COLLECTION_NAME}")
pprint(VECTOR_DSN)

vector_store = Milvus(embedding_function=embeddings, collection_name=_COLLECTION_NAME, connection_args=VECTOR_DSN)

#vector_store


HuggingFaceEmbeddings(client=SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
  (2): Normalize()
), model_name='all-MiniLM-L6-v2', cache_folder=None, model_kwargs={}, encode_kwargs={}, multi_process=False)
'Collection: VIDOSO_AUDIO'
{'host': '127.0.0.1', 'port': 19530}


In [11]:
from langchain.docstore.document import Document

def make_documents(texts, file_name='fileName.txt'):
    prepared_documents = []
    for item_text in texts:
        text_document = Document(
            page_content=item_text,
            metadata={
                "fileName": file_name,
                # ...
            },
        )
        prepared_documents.append(text_document)
    return prepared_documents

#prepared_documents = make_documents(texts, 'fileName.txt')

#prepared_documents
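
`make_documents` wraps each input text as a single `Document`, so the size of each text decides how much of it the embedding can represent. The `HuggingFaceEmbeddings` printout above reports `'max_seq_length': 256`, and input beyond that limit is truncated before embedding. A minimal sketch of a word-window chunker that could be applied to a transcript first is shown below. The function name `chunk_words` and the sizes are assumptions, and words are only a rough proxy for model tokens.

```python
def chunk_words(text, max_words=150, overlap=30):
    """Split text into windows of at most max_words words.

    Consecutive windows share `overlap` words so that a sentence cut at a
    boundary still appears whole in one of the two neighbouring chunks.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reaches the end of the text
    return chunks

sample = " ".join(f"w{i}" for i in range(10))
print(chunk_words(sample, max_words=4, overlap=1))
```

The resulting list can be passed straight to `make_documents(chunks, file_name=...)`, since that function already accepts a list of texts.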

In [78]:
#vector_store.add_documents(prepared_documents)

[447063204769563312,
 447063204769563313,
 447063204769563314,
 447063204769563315]

In [None]:
# UPLOAD VIDEO FILE
from google.colab import files
uploaded = files.upload()

In [14]:
from werkzeug.utils import secure_filename
#uploaded.__dir__()
for uploaded_file in uploaded:
    ss_filename = secure_filename(uploaded_file)
    print(f"{uploaded_file:<44} {ss_filename:<44}")

#own_file, _ = uploaded.popitem()
#print(own_file)

Ground Effects Vehicles.mp4                  Ground_Effects_Vehicles.mp4                 
Anduril Lattice Counter Drone System.mp4     Anduril_Lattice_Counter_Drone_System.mp4    
videoplayback.mp4                            videoplayback.mp4                           


In [15]:
import os

def create_audio_folder():
    # Create the 'audio' folder if it does not already exist
    audio_folder = 'audio'
    os.makedirs(audio_folder, exist_ok=True)

In [16]:
# Split the audio file into chunks
import time
from pydub import AudioSegment
from pydub.utils import make_chunks


from openai import OpenAI

def split_audio(filepath, chunk_duration):
    """
    Splits the audio file into chunks of 'chunk_duration' seconds.
    Returns a list of file paths for the chunks.
    """
    # Load the audio file
    audio = AudioSegment.from_file(filepath)

    # Convert the chunk duration from seconds to milliseconds
    chunk_length_ms = chunk_duration * 1000

    # Split file into chunks
    chunks = make_chunks(audio, chunk_length_ms)

    # Directory to save chunks
    chunk_dir = os.path.join(os.path.dirname(filepath), 'chunks')
    os.makedirs(chunk_dir, exist_ok=True)

    chunk_filepaths = []

    timestamp = int(time.time() * 1000)

    # Export each chunk as a separate file
    for i, chunk in enumerate(chunks):
        chunk_name = f"chunk{i}-{timestamp}.mp3"  # Change format as needed
        chunk_path = os.path.join(chunk_dir, chunk_name)
        chunk.export(chunk_path, format="mp3")  # Change format as needed
        chunk_filepaths.append(chunk_path)

    return chunk_filepaths


def transcribe_audio(file_path):
    """
    Transcribe the given audio file using OpenAI's Whisper API.
    """
    # Create an OpenAI client with the API key loaded earlier
    client = OpenAI(api_key=OPENAI_API_KEY)

    # Open the audio file and send it to OpenAI for transcription
    try:
        with open(file_path, 'rb') as audio_file:
            response = client.audio.transcriptions.create(
                model="whisper-1",
                file=audio_file,
                response_format = "text"
            )
            transcription = response
            print(f"Transcription: {transcription}")
    except Exception as e:
        print(f"An error occurred during transcription: {e}")
        transcription = ""

    return transcription

all_trans = {}

for uploaded_video_file in uploaded:
    print(f"{uploaded_video_file} - in processing")
    create_audio_folder()
    sanitized_filename = secure_filename(uploaded_video_file)
    chunks = split_audio(filepath=uploaded_video_file, chunk_duration=480)  # 480 seconds = 8 minutes

    # Process each chunk
    transcriptions = []
    for chunk in chunks:
        transcription = transcribe_audio(chunk)  # Synchronous call; blocks until the API responds
        transcriptions.append(transcription)

    print(transcriptions)

    # Save in the vector store
    prepared_documents = make_documents(transcriptions, file_name=uploaded_video_file)
    print(prepared_documents)

    ids = vector_store.add_documents(prepared_documents)
    print(ids)

    trans = {
        'filename': uploaded_video_file,
        'ids' : ids,
        'transcriptions': transcriptions,
        'prepared_documents': prepared_documents
    }
    all_trans[uploaded_video_file] = trans




Ground Effects Vehicles.mp4 - in processing
Transcription: I'm John Schuster, and I've been working with wings and ground effect for many years. Somewhat stimulated by the work that Dr. Alexander Lippisch did back in the 60s at Collins Radio Company where he developed the concept of the ram wing and the airfoil boat as they called it. This model here works on that principle and it's a model of one of Lippisch's later ground effect airplanes, their wing and ground effect. It's a Dorney Air model called the X-113, and you can look that up, very similar here except it will have sponsons because it takes off from water, and it will also have polyhedral wingtips somewhat like we have on this model in order to improve the stability of the airplane. The way this concept works is that it's called a captured air bubble. As the airplane moves along the ground, air is pushed under the wing and the trailing edges are sealed against the water or the ground, so the air basically comes to a halt unde
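
`transcribe_audio` above prints any error and returns an empty string, so a transient API failure silently produces an empty transcript for that chunk. One possible refinement, sketched here and not part of the original run, is to retry with exponential backoff. The helper `with_retries` and the stand-in `flaky_transcribe` are hypothetical, and the sketch assumes the wrapped call is allowed to raise instead of catching its own exceptions.

```python
import time

def with_retries(fn, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn() and retry on any exception, doubling the wait each time.

    The last exception is re-raised once all attempts are used up.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as exc:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            delay = base_delay * 2 ** attempt
            print(f"Attempt {attempt + 1} failed ({exc}); retrying in {delay:.1f}s")
            sleep(delay)

# Demo with a stand-in for the API call that fails twice, then succeeds.
calls = {"count": 0}

def flaky_transcribe():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return "transcript text"

# A no-op sleep keeps the demo instant; omit it to get real delays.
print(with_retries(flaky_transcribe, sleep=lambda seconds: None))
```

Passing `sleep` as a parameter keeps the helper easy to test without waiting for real delays.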

In [26]:
collection = Collection(_COLLECTION_NAME)
_QUERY_PARAM = {
    "metric_type": "COSINE",
    "params": {"ef": 64},
}

import textwrap

def query(query, embed = None, top_k = 5):
    text, expr = query
    if text is not None :
      embed = [embeddings.embed_query(text)]
    print(f"text: {text}, expr: {expr}")
    print(embed)
    res = collection.search(embed,
                            anns_field='vector',
                            expr = expr,
                            param=_QUERY_PARAM,
                            limit = top_k,
                            output_fields=['pk', 'text', 'fileName'])
    # res holds one list of hits per query vector
    for hits in res:
        print('Description:', text, 'Expression:', expr)
        print('Results:')
        for rank, hit in enumerate(hits, start=1):
            print('\t' + 'Rank:', rank, 'PK:', hit.id, "Score:", hit.score, "<<<=== SCORE")
            print('\t\t' + 'File:', hit.entity.get('fileName'))
            print(textwrap.fill(hit.entity.get('text'), 88))
            print()

print("\n\n1) Query")
query(query=('Ukraine', 'fileName == "videoplayback.mp4"'), top_k=5)

print("\n\n2) Query ABSENT")
query(query=('He devoured a large quantity of pizza pie.', 'fileName == "fileName.txt"'), top_k=3)

print("\n\n3) Query")
query(query=(None,''), embed = [[0.0926586389541626, 0.10387899726629257, 0.02529732510447502, 0.07277646660804749, -0.09093283861875534, -0.02728363871574402, 0.0971444621682167, 0.06315481662750244, -0.006986891385167837, -0.08392268419265747, 0.01635073497891426, -0.0845772922039032, -0.01996513083577156, 0.028048578649759293, 0.026462916284799576, -0.11682454496622086, 0.11175842583179474, -0.019626278430223465, 0.004995360504835844, -0.03965060040354729, 0.05882453918457031, 0.031169146299362183, 0.02696499042212963, -0.05094258114695549, 0.0168340764939785, 0.013032169081270695, 0.015177512541413307, -0.09198766201734543, -0.007175867911428213, -0.036262597888708115, 0.00489719957113266, 0.046671703457832336, 0.05361352115869522, -0.047655750066041946, 0.04349580407142639, 0.0027455524541437626, 0.029793819412589073, -0.0436551570892334, 0.10616374760866165, 0.011718306690454483, 0.05967254936695099, 0.005528587382286787, 0.008567244745790958, 0.0020710784010589123, -0.02138313092291355, -0.013332471251487732, -0.07864672690629959, -0.022216206416487694, 0.050501499325037, 0.06512919068336487, -0.06660318374633789, -0.013271432369947433, 0.01898864097893238, -0.06039217486977577, 0.030402135103940964, -0.06714612990617752, 0.08087906986474991, 0.03060770407319069, 0.005723696667701006, -0.017260415479540825, -0.12468782812356949, -0.0235749538987875, 0.020616866648197174, 0.012743238359689713, 0.036718111485242844, -0.10431462526321411, -0.006439814809709787, -0.07772962003946304, -0.031953923404216766, 0.10103640705347061, 0.040175627917051315, 0.0724654570221901, 0.02476118877530098, -0.054198428988456726, -0.004520978778600693, -0.0386611707508564, 0.009580434300005436, -0.06233298033475876, -0.03669625148177147, -0.03803732246160507, 0.022714102640748024, 0.05617288872599602, -0.07597879320383072, 0.014564639888703823, -0.03854035586118698, 0.03904009237885475, -0.010148956440389156, -0.0006213047308847308, -0.04762570932507515, 0.03536989167332649, 
-0.0009647439583204687, -0.04930942505598068, -0.06874970346689224, 0.026779457926750183, -0.0548597127199173, -0.01728372648358345, -0.06143305450677872, -0.08544590324163437, -0.006181738805025816, 0.03193461522459984, 0.017922723665833473, 0.05423593893647194, 0.043810758739709854, -0.06143258139491081, 0.018590431660413742, 0.008964299224317074, -0.048544615507125854, 0.07939977943897247, 0.04870941489934921, 0.05023882910609245, -0.01509187277406454, -0.02796880714595318, -0.015311243012547493, -0.0244072787463665, 0.015338653698563576, 0.012920107692480087, 0.03404207527637482, -0.043722301721572876, -0.08840589970350266, 0.03347388654947281, 0.04067583754658699, 0.004066161811351776, -0.03975914418697357, 0.08265876024961472, -0.06772125512361526, -0.02502240613102913, -0.009103707037866116, -4.37171051048212e-33, 0.011438705958425999, -0.01165514811873436, 0.03374417871236801, 0.038243647664785385, 0.035488374531269073, 0.08459470421075821, -0.007342718541622162, 0.03705568239092827, -0.00926301535218954, -0.04624079912900925, 0.00176063587423414, -0.027906371280550957, -0.06502698361873627, 0.1026429608464241, -0.0005057338275946677, -0.03148965165019035, 0.02335231751203537, 0.046192944049835205, 0.02962225303053856, -0.08259624242782593, -0.01032576896250248, -0.06712818890810013, 0.005046976264566183, -0.012643732130527496, -0.01615000143647194, 0.03815736249089241, -0.0384981743991375, 0.002516858046874404, -0.030432768166065216, -0.0003836556279566139, 0.036077626049518585, 0.04007953032851219, 0.042268481105566025, 0.031312521547079086, 0.11821186542510986, -0.01619795523583889, -0.03402530774474144, 0.041535478085279465, -0.1024170070886612, 0.003834794508293271, -0.01642497442662716, 0.026925617828965187, 0.031583938747644424, 0.0043254392221570015, -0.14780984818935394, -0.0013432303676381707, -0.04860621690750122, 0.03370083123445511, 0.030036138370633125, -0.03732965886592865, 0.0840776190161705, 0.020545972511172295, 0.06505880504846573, 
-0.028838856145739555, -0.01303718239068985, -0.03180849552154541, 0.043376050889492035, -0.047743555158376694, -0.0018119545420631766, 0.004866858944296837, 0.07727839797735214, 0.06888441741466522, 0.00323205953463912, 0.054522521793842316, -0.10034676641225815, -0.008383814245462418, -0.024407081305980682, 0.01965228281915188, -0.0568317212164402, 0.06453584879636765, -0.0339960977435112, -0.02589862421154976, 0.07468190044164658, -0.05200282484292984, -0.11259341984987259, -0.03287438675761223, -0.04328393191099167, 0.01541831437498331, -0.017607716843485832, 0.036103326827287674, 0.04278700426220894, -0.036166269332170486, 0.0733690857887268, -0.047152526676654816, -0.06574824452400208, 0.05068761855363846, 0.023568615317344666, -0.05823178589344025, 0.06138287112116814, 0.04279528185725212, -0.0466163195669651, -0.050261739641427994, 0.029362615197896957, -0.061820607632398605, -0.11941111087799072, 1.4155788782376662e-33, -0.09106606990098953, 0.08200039714574814, -0.012199334800243378, 0.016970228403806686, -0.0061735836789011955, -0.03462458401918411, -0.06933239102363586, 0.014518879354000092, -0.01627506874501705, -0.09359608590602875, -0.022512471303343773, 0.011373378336429596, 0.12351550161838531, -0.04109586030244827, -0.004215164575725794, 0.11798997223377228, 0.03776470944285393, 0.061211343854665756, -0.008860506117343903, 0.017018551006913185, -0.06936217844486237, -0.0481453500688076, 0.027989724650979042, 0.053524330258369446, -0.030000004917383194, 0.06286395341157913, 0.02359664998948574, 0.006903366651386023, -0.05026064068078995, -0.002289589261636138, -0.04232688620686531, -0.05779300257563591, -0.02501806430518627, -0.013917786069214344, -0.02288943901658058, 0.09178279340267181, -0.010095297358930111, 0.03424561396241188, 0.036238107830286026, -0.039615143090486526, -0.04291335120797157, -0.06609933823347092, 0.027705848217010498, 0.06659886986017227, 0.0438113771378994, -0.025898879393935204, 0.03193632885813713, -0.022084593772888184, 
0.03863959014415741, 0.0754740834236145, -0.011217174120247364, 0.0035740546882152557, -0.010626089759171009, 0.021468453109264374, 0.015233520418405533, 0.06683364510536194, -0.037982191890478134, 0.07035081088542938, -0.011910528875887394, -0.0893973708152771, -0.08400291949510574, -0.02802358567714691, 0.02073383890092373, 0.05715271458029747, -0.011274737305939198, 0.034678198397159576, 0.02664036862552166, -0.025614092126488686, -0.013190102763473988, 0.03359784930944443, -0.018544139340519905, 0.06211165711283684, 0.05550236999988556, 0.017478445544838905, -0.04125456511974335, -0.029141783714294434, -0.09825509786605835, -0.014822030439972878, -0.043641746044158936, 0.0005448315641842782, -0.02674930915236473, -0.07762695848941803, -0.002081233076751232, 0.024496326223015785, 0.03294822946190834, -0.08187809586524963, 0.0014667524956166744, -0.1092124879360199, -0.07265260815620422, 0.05271127447485924, -0.03906848654150963, 0.009134151972830296, 0.06368061155080795, -0.0259409099817276, 0.0829569548368454, -1.798183646428697e-08, 0.022204622626304626, -0.03584187105298042, -0.033664874732494354, 0.014221438206732273, 0.10967861115932465, -0.043677568435668945, -0.03691470995545387, 0.02896362729370594, -0.03736387565732002, 0.07086551934480667, -0.14406882226467133, 0.0727689266204834, 0.07011039555072784, 0.0073583098128438, 0.0046136993914842606, -0.043869711458683014, 0.02903830073773861, -0.03811796382069588, -0.03826652839779854, 0.0315927118062973, -0.021266378462314606, 0.01757352240383625, 0.02317694202065468, -0.04139934107661247, 0.04187537357211113, -0.0021486927289515734, -0.016432354226708412, 0.03809574618935585, -0.0028453872073441744, 0.022026343271136284, 0.05107757821679115, -0.026283428072929382, -0.12029750645160675, -0.038429830223321915, 0.0519731380045414, 0.022366836667060852, -0.054642681032419205, -0.03892935812473297, 0.05961472541093826, -0.01803513616323471, -0.045195382088422775, 0.05682922154664993, 0.03402811288833618, 
0.04018094018101692, 0.00910746119916439, -0.008804233744740486, -0.07732673734426498, -0.008398588746786118, 0.004259184468537569, 0.08520487695932388, 0.0026367269456386566, 0.09572847932577133, 0.009628902189433575, -0.02094009704887867, 0.1389080286026001, -0.07229219377040863, 0.06461729109287262, 0.03899853676557541, 0.016131602227687836, 0.08523229509592056, -0.006593836471438408, 0.06992322951555252, 0.06427562981843948, -0.11694642156362534]])





1) Query
text: Ukraine, expr: fileName == "videoplayback.mp4"
[[0.035374291241168976, 0.042139098048210144, -0.03631306439638138, 0.047212567180395126, -0.005990773439407349, -0.0172591470181942, 0.06458228081464767, -0.03907416760921478, 0.060019560158252716, -0.019359348341822624, -0.05810486897826195, -0.028217393904924393, 0.03245609253644943, 0.036385077983140945, -0.03041859157383442, -0.032956335693597794, -0.03929492458701134, -0.14634238183498383, -0.051932770758867264, -0.08852016180753708, -0.04158470034599304, -0.009792939759790897, 0.07422547042369843, 0.020048413425683975, 0.07841993123292923, 0.013229778036475182, 0.04070732370018959, 0.028417538851499557, 0.02665472775697708, -0.04742160066962242, -0.008339102379977703, -0.07979680597782135, 0.019042091444134712, -0.04242469370365143, -0.0035160325933247805, 0.052693430334329605, -0.010908707045018673, -0.06763660162687302, -0.010740851052105427, 0.055566154420375824, -0.01854785904288292, -0.06257452070713043, -0.015