# Vector DB comparison

## [Vector DB Dashboard](https://superlinked.com/vector-db-comparison) (click for more information)

|       **Feature**      |      **FAISS**      |                        **ChromaDB**                       |      **Pinecone**      |                          **Weaviate**                          |                    **Qdrant**                    |          **LanceDB**          |
|:----------------------:|:-------------------:|:---------------------------------------------------------:|:----------------------:|:--------------------------------------------------------------:|:------------------------------------------------:|:-----------------------------:|
| **Type**               | Library             | Library + DB                                              | Managed Cloud DB       | Managed DB + OSS                                               | Managed DB + OSS                                 | Managed DB + OSS              |
| **Hosting**            | Local               | Local (some cloud support)                                | Cloud-only             | Cloud & Self-host                                              | Cloud & Self-host                                | Cloud & Self-host             |
| **Persistent Storage** | No (in-memory)      | Yes                                                       | Yes                    | Yes                                                            | Yes                                              | Yes (S3)                      |
| **Scalability**        | Manual              | Limited                                                   | Auto-scaling           | Auto-scaling                                                   | Auto-scaling                                     | Auto-scaling                  |
| **API Access**         | No REST API         | Python API                                                | REST/gRPC API          | REST/gRPC API                                                  | REST/gRPC API                                    | Pandas Style API              |
| **Indexing Option**    | IVF, HNSW, PQ, Flat | HNSW, SPANN                                               | Proprietary            | HNSW, Flat, IVF, Flat-BQ                                       | HNSW, IVF, PQ                                    | HNSW, IVF, PQ                 |
| **BM-25**              | No                  | No                                                        | No                     | Yes                                                            | No                                               | Yes                           |
| **Hybrid search**      | No                  | No                                                        | Yes                    | Yes                                                            | Yes                                              | Yes                           |
| **Metadata Filtering** | Manual              | Built-in                                                  | Built-in               | Built-in                                                       | Built-in                                         | Built-in (limited)            |
| **Replication**        | No                  | No                                                        | Yes                    | Yes                                                            | Yes                                              | NA                            |
| **Embeddings Storage** | Vectors only        | Vectors + Metadata + Documents                            | Vectors + Metadata     | Vectors + Metadata + Schema                                    | Vectors + Payload (custom metadata)              | Tabular + Vector              |
| **Integrations**       | Custom              | LangChain, LlamaIndex                                     | LangChain, LlamaIndex  | LangChain, LlamaIndex                                          | LangChain, LlamaIndex                            | LangChain, LlamaIndex         |
| **License**            | MIT                 | Apache 2.0                                                | Proprietary SaaS       | BSD                                                            | Apache 2.0                                       | Apache 2.0                    |
| **Best For**           | Fast local search   | Embedding + metadata                                      | Production SaaS        | Knowledge graph & vector search                                | Vector search with filtering                     | Tabular data + vector search  |
| **Docker image**       | No                  | [chromadb](https://hub.docker.com/r/chromadb/chroma/tags) | NA                     | [weaviate](https://hub.docker.com/r/semitechnologies/weaviate) | [qdrant](https://hub.docker.com/r/qdrant/qdrant) | NA                            |
| **Dev Language**       | C++                 | Rust                                                      | Rust                   | Go                                                             | Rust                                             | Rust                          |
| **Multi-tenant**       | No                  | Yes                                                       | Yes (via namespace)    | Yes                                                            | Yes (via collection/metadata)                    | No                            |

## Comparing different vector databases

### Step 1 - Prepare test data

In [1]:
sentences = [
	"Artificial intelligence is transforming modern healthcare through diagnostic tools.",
	"Machine learning algorithms can predict patient outcomes with high accuracy.",
	"Neural networks require large amounts of training data to be effective.",
	"Cloud computing enables scalable AI deployment across industries.",
	"Natural language processing allows computers to understand human speech.",
	"Deep learning models excel at image recognition tasks.",
	"Data privacy remains a major concern in AI implementation.",
	"Quantum computing promises to revolutionize complex calculations.",
	"Robotics automation is changing manufacturing processes worldwide.",
	"Computer vision systems can identify objects in real-time video.",
	"Ethical AI development requires careful consideration of bias.",
	"The Internet of Things connects everyday devices to the cloud.",
	"Blockchain technology provides secure decentralized transactions.",
	"5G networks enable faster data transfer for mobile applications.",
	"Virtual reality creates immersive digital experiences for users.",
	"Cybersecurity threats continue to evolve with advancing technology.",
	"Big data analytics helps businesses make informed decisions.",
	"Autonomous vehicles use sensors and AI to navigate roads safely.",
	"Edge computing processes data closer to the source for reduced latency.",
	"Augmented reality overlays digital information onto the real world.",
	"Python is the most popular programming language for data science.",
	"Reinforcement learning allows AI to learn through trial and error.",
	"Semiconductor chips are essential components in all computing devices.",
	"Digital transformation affects every industry in the modern economy.",
	"AI ethics committees are being formed to guide responsible development.",
]

In [2]:
queries = [
	"AI in healthcare",
	"Machine learning applications",
	"Technology security concerns",
	"Neural networks and data requirements",
	"Real-time computer vision systems",
]

In [3]:
import os
from dotenv import load_dotenv, find_dotenv

load_dotenv(find_dotenv())
EURI_API_KEY = os.getenv("EURI_API_KEY")
PINECONE_API_KEY = os.getenv("PINECONE_API_KEY")
WEAVIATE_API_KEY = os.getenv("WEAVIATE_API_KEY")
WEAVIATE_REST_END_POINT = os.getenv("WEAVIATE_REST_ENDPOINT")
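`os.getenv` returns `None` for a variable that is not set, so a missing key only surfaces later as an authentication error. A minimal sketch of an early check follows; the helper name `missing_env_vars` and the `REQUIRED_KEYS` tuple are illustrative assumptions, not part of the original notebook.

```python
import os

# Variable names taken from the cell above; extend the tuple if more services are added.
REQUIRED_KEYS = (
    "EURI_API_KEY",
    "PINECONE_API_KEY",
    "WEAVIATE_API_KEY",
    "WEAVIATE_REST_ENDPOINT",
)


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_KEYS)
if missing:
    # Report every missing key at once rather than failing on the first API call.
    print(f"Missing environment variables: {', '.join(missing)}")
```

Running the check right after `load_dotenv` makes it clear whether a failure comes from the `.env` file or from the service itself.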

In [4]:
import requests
import numpy as np

In [5]:
def generate_embeddings(text: str) -> np.ndarray:
	"""Request an embedding for `text` and return it as a float32 vector."""
	url = "https://api.euron.one/api/v1/euri/embeddings"
	headers = {
		"Content-Type": "application/json",
		"Authorization": f"Bearer {EURI_API_KEY}",
	}
	payload = {"input": text, "model": "text-embedding-3-small"}

	# Use a timeout and surface HTTP errors instead of parsing an error body.
	response = requests.post(url, headers=headers, json=payload, timeout=30)
	response.raise_for_status()
	data = response.json()

	embedding = np.array(data["data"][0]["embedding"], dtype=np.float32)

	return embedding

In [6]:
text = "The weather is sunny today."

embedding = generate_embeddings(text)

In [7]:
print(f"embedding shape: {embedding.shape} and embedding type: {embedding.dtype}")

embedding shape: (1536,) and embedding type: float32


In [8]:
embeddings = []
for sentence in sentences:
	emb = generate_embeddings(text=sentence)
	embeddings.append(emb)

In [9]:
embeddings

[array([ 0.00979638, -0.02955946,  0.01706536, ...,  0.00463446,
        -0.03079018,  0.02714195], shape=(1536,), dtype=float32),
 array([-0.00839084, -0.00398261,  0.03710102, ..., -0.00083121,
        -0.01360887, -0.00225663], shape=(1536,), dtype=float32),
 array([ 0.00685719,  0.01064172,  0.03272605, ..., -0.02078094,
         0.00418498,  0.01802759], shape=(1536,), dtype=float32),
 array([-0.00989157, -0.03057791,  0.04363609, ..., -0.01714974,
        -0.01270996,  0.04276554], shape=(1536,), dtype=float32),
 array([-0.03052658,  0.0204996 , -0.0008052 , ..., -0.00183322,
         0.02163397,  0.03516533], shape=(1536,), dtype=float32),
 array([ 0.01004403, -0.03063818, -0.0151439 , ...,  0.01736293,
         0.01142606,  0.00414122], shape=(1536,), dtype=float32),
 array([ 0.03147297,  0.00470725,  0.03637992, ..., -0.01520014,
         0.03124474,  0.00258043], shape=(1536,), dtype=float32),
 array([-0.02740892,  0.01017416, -0.0199316 , ..., -0.02636115,
         0.0294806

In [10]:
len(embeddings[0])

1536

In [9]:
embeddings_array = np.vstack(embeddings)
embeddings_array.shape

(25, 1536)

In [10]:
dimension = embeddings_array.shape[1]
dimension

1536

### Running in FAISS

In [13]:
import faiss

In [14]:
index = faiss.IndexFlatL2(dimension)
index

<faiss.swigfaiss.IndexFlatL2; proxy of <Swig Object of type 'faiss::IndexFlatL2 *' at 0x114642640> >

In [15]:
index.add(embeddings_array)
index.ntotal

25

In [16]:
query = queries[0]
query

'AI in healthcare'

In [17]:
query_vec = generate_embeddings(text=query).reshape(1, -1)
query_vec.shape

(1, 1536)

In [18]:
distance, indices = index.search(query_vec, 2)

In [19]:
print(
	f"Query: {query}\n"
	f"Top 2 most similar sentences:\n"
	f"{sentences[indices[0][0]]}\n"
	f"{sentences[indices[0][1]]}\n"
)

Query: AI in healthcare
Top 2 most similar sentences:
Artificial intelligence is transforming modern healthcare through diagnostic tools.
Cloud computing enables scalable AI deployment across industries.
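`IndexFlatL2` does an exhaustive search: it compares the query against every stored vector and reports squared Euclidean distances. The dependency-free sketch below reproduces that logic on made-up 3-dimensional vectors; the function names and the toy data are assumptions for illustration and do not come from FAISS.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_l2_search(vectors, query, k):
    """Score every stored vector and return the k nearest as (distance, index) pairs."""
    scored = sorted((squared_l2(vec, query), idx) for idx, vec in enumerate(vectors))
    return scored[:k]


# Toy 3-dimensional data, made up for illustration (the notebook uses 1536 dimensions).
toy_vectors = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
]
toy_query = [1.0, 0.0, 0.0]

top2 = flat_l2_search(toy_vectors, toy_query, k=2)
print(top2)
```

The cost grows linearly with the number of stored vectors, which is fine for 25 sentences but is the reason approximate indexes such as IVF or HNSW exist for larger collections.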



In [20]:
# save index to disk
faiss.write_index(index, "../data/faiss_index.faiss")

## Chroma DB

In [13]:
import chromadb

In [117]:
import chromadb.config

In [118]:
chroma_client = chromadb.PersistentClient(path="../data/chroma_db")

ValueError: An instance of Chroma already exists for ephemeral with different settings

In [98]:
collection_list = chroma_client.list_collections()
if len(collection_list) > 0:
	print(collection_list)

In [97]:
chroma_client.delete_collection(name="chroma_db_test")

In [99]:
chroma_collection = chroma_client.create_collection(
	name="chroma_db_test",
)

In [100]:
len(embeddings_array[0])

1536

In [104]:
chroma_collection.add(
	documents=sentences,
	embeddings=embeddings_array,
	ids=[f"rec_{i}" for i in range(len(sentences))],
)

In [105]:
chroma_collection.count()

25

In [106]:
chroma_collection.get()

{'ids': ['rec_0',
  'rec_1',
  'rec_2',
  'rec_3',
  'rec_4',
  'rec_5',
  'rec_6',
  'rec_7',
  'rec_8',
  'rec_9',
  'rec_10',
  'rec_11',
  'rec_12',
  'rec_13',
  'rec_14',
  'rec_15',
  'rec_16',
  'rec_17',
  'rec_18',
  'rec_19',
  'rec_20',
  'rec_21',
  'rec_22',
  'rec_23',
  'rec_24'],
 'embeddings': None,
 'documents': ['Artificial intelligence is transforming modern healthcare through diagnostic tools.',
  'Machine learning algorithms can predict patient outcomes with high accuracy.',
  'Neural networks require large amounts of training data to be effective.',
  'Cloud computing enables scalable AI deployment across industries.',
  'Natural language processing allows computers to understand human speech.',
  'Deep learning models excel at image recognition tasks.',
  'Data privacy remains a major concern in AI implementation.',
  'Quantum computing promises to revolutionize complex calculations.',
  'Robotics automation is changing manufacturing processes worldwide.',
  'C

In [107]:
query

'AI in healthcare'

In [108]:
query_vec = generate_embeddings(text=query).reshape(1, -1)

In [109]:
result = chroma_collection.query(query_embeddings=query_vec, n_results=2)

In [110]:
print(result)

{'ids': [['rec_0', 'rec_3']], 'embeddings': None, 'documents': [['Artificial intelligence is transforming modern healthcare through diagnostic tools.', 'Cloud computing enables scalable AI deployment across industries.']], 'uris': None, 'included': ['metadatas', 'documents', 'distances'], 'data': None, 'metadatas': [[None, None]], 'distances': [[0.7103437781333923, 0.9832593202590942]]}


In [111]:
def print_chroma_results_detailed(result, query_text=None):
	"""
	Detailed formatting with metadata support.
	"""
	if query_text:
		print(f"🔍 QUERY: '{query_text}'")
		print("=" * 80)

	documents = result["documents"][0]
	distances = result["distances"][0]
	ids = result["ids"][0]
	metadatas = (
		result["metadatas"][0] if result["metadatas"] else [None] * len(documents)
	)

	print(f"📊 Found {len(documents)} results (lower distance = more similar)\n")

	for i, (doc_id, document, distance, metadata) in enumerate(
		zip(ids, documents, distances, metadatas)
	):
		print(f"🏆 RANK {i + 1} | Distance: {distance:.4f} | ID: {doc_id}")
		print(f"📄 Document: {document}")
		if metadata:
			print(f"📋 Metadata: {metadata}")
		print("-" * 80)

In [112]:
print_chroma_results_detailed(result, query_text=query)

🔍 QUERY: 'AI in healthcare'
📊 Found 2 results (lower distance = more similar)

🏆 RANK 1 | Distance: 0.7103 | ID: rec_0
📄 Document: Artificial intelligence is transforming modern healthcare through diagnostic tools.
--------------------------------------------------------------------------------
🏆 RANK 2 | Distance: 0.9833 | ID: rec_3
📄 Document: Cloud computing enables scalable AI deployment across industries.
--------------------------------------------------------------------------------


## Pinecone DB

In [13]:
from pinecone import Pinecone

In [14]:
pc = Pinecone(api_key=PINECONE_API_KEY)

In [17]:
from tqdm import tqdm as notebook_tqdm
index = pc.Index(name="compare-vector-db")  # index created manually in the Pinecone console with dimension 1536

In [20]:
index.upsert(
    vectors = [
        (
        "0",
        embeddings_array[0].tolist(),
        {"text": sentences[0]}
        )
    ]
)

{'upserted_count': 1}

### Creating a Pinecone-compatible record structure with custom metadata

In [21]:
from datetime import datetime
records = []
for i, (text, emb) in enumerate(zip(sentences, embeddings_array)):
    records.append(
        (
            f'id_{i}',
            emb.tolist(),
            {
                "text": text,
                "inserted_on" : str(datetime.now()),
                "inserted_by" : 'batch_python'
            }
        )
    )
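With 25 records a single upsert is enough, but larger collections are usually sent in batches. The sketch below splits a record list into fixed-size chunks; the helper name `chunked`, the batch size of 10, and the dummy records are illustrative assumptions.

```python
from itertools import islice


def chunked(items, batch_size):
    """Yield successive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Stand-in records with the same (id, values, metadata) shape as above; values are dummies.
demo_records = [(f"id_{i}", [0.0, 0.0], {"text": f"sentence {i}"}) for i in range(25)]

batch_sizes = [len(batch) for batch in chunked(demo_records, batch_size=10)]
print(batch_sizes)
```

Each batch could then be passed to `index.upsert(vectors=batch)` in a loop, in place of one call with the full list.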

In [22]:
records[0]

('id_0',
 [0.009796377271413803,
  -0.029559455811977386,
  0.017065364867448807,
  0.019296061247587204,
  0.03424061834812164,
  -0.013098464347422123,
  -0.009060137905180454,
  0.054899271577596664,
  0.0063239652663469315,
  0.02257067710161209,
  0.011384236626327038,
  -0.06922846287488937,
  0.012658919207751751,
  -0.04747094586491585,
  -0.016153307631611824,
  -0.012911657802760601,
  0.0018296093912795186,
  -0.0019669674802571535,
  0.00877443328499794,
  0.01140621304512024,
  0.027383703738451004,
  -0.002911990974098444,
  0.031251706182956696,
  -0.03169125318527222,
  0.03566914051771164,
  -0.03476807102560997,
  -0.028548499569296837,
  -0.01384569238871336,
  -0.02126302756369114,
  0.02151576615869999,
  -0.03004295565187931,
  -0.023229995742440224,
  0.003194948425516486,
  -0.02650461159646511,
  0.007395357824862003,
  0.049448903650045395,
  -0.0496247224509716,
  0.02184542641043663,
  0.04138323664665222,
  0.038438279181718826,
  0.005736072547733784,
  0.

In [24]:
index.upsert(vectors=records)

{'upserted_count': 25}

In [25]:
index.describe_index_stats()

{'dimension': 1536,
 'index_fullness': 0.0,
 'metric': 'cosine',
 'namespaces': {'': {'vector_count': 26}},
 'total_vector_count': 26,
 'vector_type': 'dense'}

In [26]:
index.fetch(ids=['id_17'])

FetchResponse(namespace='', vectors={'id_17': Vector(id='id_17', values=[0.00860816706, -0.00987092406, -0.00555274729, 0.0322453938, 0.0242404193, -0.0373415202, -0.0184452683, 0.0032048088, 0.0190766454, 0.0668810084, -0.0188624281, -0.0492024124, -0.0131687485, -0.0317944102, 0.0159423035, -0.0358081721, -0.0361013114, 0.0357179753, 0.021185, -0.0204408746, 0.0222109891, 0.0251649376, 0.036822889, -0.0164271127, -0.000687751453, -0.0529004857, 0.0347709097, 0.044512175, -0.0198320448, 0.000751875807, 0.0471278839, -0.0179153606, -0.0108574526, -0.0391229093, 0.00192232162, 0.0247816015, 0.022380108, -0.00829247851, 0.0122329555, -0.014499153, 0.0260894559, -0.00836576335, 0.035920918, 0.0370483808, -0.04261804, -0.0103275459, -0.0164383873, 0.0162241682, -0.0157280862, 0.0256159212, -0.0284345746, -0.00449011475, -0.0177575164, 0.00192795892, -0.0545240305, 0.0502847768, -0.00575287174, 0.00684932759, 0.0307345968, 0.0324483365, -0.0091268, -0.00450702664, -0.00254383474, -0.0074525

In [27]:
query

NameError: name 'query' is not defined

In [28]:
query = queries[0]
query

'AI in healthcare'

## Note: the query vector must be converted to a list before it is used with Pinecone

In [33]:
query_vec = generate_embeddings(text=query).reshape(1, -1).tolist()

In [35]:
query_vec

[[-0.04016132652759552,
  -0.021061312407255173,
  0.03964837267994881,
  0.0357559509575367,
  0.019205622375011444,
  -0.026402072980999947,
  -0.014777617529034615,
  0.04577365145087242,
  -0.018768101930618286,
  0.0015605511143803596,
  0.01493602991104126,
  -0.07205503433942795,
  -0.011549021117389202,
  -0.0006477937567979097,
  -0.021544091403484344,
  -0.026386987417936325,
  -0.004145125392824411,
  -0.03886385262012482,
  -0.014272206462919712,
  -0.008501468226313591,
  0.01329910196363926,
  0.04517017677426338,
  0.0174102820456028,
  -0.015826158225536346,
  -0.006136596202850342,
  -0.02069922536611557,
  0.013148232363164425,
  -0.01728958636522293,
  0.0012908728094771504,
  -0.022449307143688202,
  0.00841094646602869,
  -0.02290191315114498,
  -0.035574909299612045,
  -0.02193635143339634,
  0.00466939527541399,
  0.024380428716540337,
  -0.020156096667051315,
  0.030897969380021095,
  0.02287174016237259,
  0.0025968325790017843,
  0.023912735283374786,
  0.0208

In [38]:
result = index.query(vector=query_vec, top_k=2, include_metadata=True)

In [41]:
def analyze_pinecone_results(result, query_text=None, show_metadata=True, show_usage=True):
    """
    Pretty-print Pinecone query matches with scores and optional metadata.
    Note: `show_usage` is accepted but not used by this function.
    """
    if query_text:
        print(f"🔍 QUERY: '{query_text}'")
        print("=" * 80)
    
    matches = result.get('matches', [])
    namespace = result.get('namespace', 'default')
    
    if not matches:
        print("❌ No matches found.")
        return
    
    print(f"📊 Found {len(matches)} results in namespace '{namespace}'")
    print(f"💡 Higher score = more similar (cosine similarity)\n")
    
    for i, match in enumerate(matches):
        score = match.get('score', 0)
        doc_id = match.get('id', 'N/A')
        metadata = match.get('metadata', {})
        text = metadata.get('text', 'No text available')
        
        # Calculate similarity percentage
        similarity_pct = score * 100
        
        # Visual score indicator
        score_bar = "█" * int(score * 20) + "░" * (20 - int(score * 20))
        
        print(f"🏆 RANK {i+1}")
        print(f"   📍 ID: {doc_id}")
        print(f"   📈 Score: {score:.4f} ({similarity_pct:.1f}%)")
        print(f"   📊 {score_bar}")
        print(f"   📄 Text: {text}")
        
        if show_metadata and metadata:
            # Exclude text from metadata display since we already show it
            other_metadata = {k: v for k, v in metadata.items() if k != 'text'}
            if other_metadata:
                print(f"   📋 Metadata:")
                for key, value in other_metadata.items():
                    print(f"      • {key}: {value}")
        
        print("-" * 80)

In [42]:
analyze_pinecone_results(result, query, show_metadata=True, show_usage=True)

🔍 QUERY: 'AI in healthcare'
📊 Found 2 results in namespace ''
💡 Higher score = more similar (cosine similarity)

🏆 RANK 1
   📍 ID: id_0
   📈 Score: 0.6443 (64.4%)
   📊 ████████████░░░░░░░░
   📄 Text: Artificial intelligence is transforming modern healthcare through diagnostic tools.
   📋 Metadata:
      • inserted_by: batch_python
      • inserted_on: 2025-09-24 10:38:17.490094
--------------------------------------------------------------------------------
🏆 RANK 2
   📍 ID: 0
   📈 Score: 0.6443 (64.4%)
   📊 ████████████░░░░░░░░
   📄 Text: Artificial intelligence is transforming modern healthcare through diagnostic tools.
--------------------------------------------------------------------------------
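Both ranks point to the same sentence because the index holds it twice: once from the single test upsert with ID `0` and once from the batch upsert with ID `id_0`, which is also why the index stats report 26 vectors for 25 sentences. Removing the stray record from the index would fix this at the source. As an alternative, the sketch below filters duplicates on the client side; plain dictionaries stand in for the Pinecone response, and the helper name is an assumption.

```python
def dedupe_matches_by_text(matches):
    """Keep the first (highest-ranked) match for each distinct metadata text."""
    seen_texts = set()
    unique = []
    for match in matches:
        text = match.get("metadata", {}).get("text")
        if text is not None and text in seen_texts:
            continue  # same sentence already returned at a better rank
        if text is not None:
            seen_texts.add(text)
        unique.append(match)
    return unique


# Hand-built stand-in mirroring the two matches printed above.
SENTENCE = "Artificial intelligence is transforming modern healthcare through diagnostic tools."
demo_matches = [
    {"id": "id_0", "score": 0.6443, "metadata": {"text": SENTENCE}},
    {"id": "0", "score": 0.6443, "metadata": {"text": SENTENCE}},
]

print([m["id"] for m in dedupe_matches_by_text(demo_matches)])
```

When deduplicating this way, it helps to request a larger `top_k` than needed so that enough distinct results remain after filtering.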


## Weaviate DB

In [11]:
import weaviate
from weaviate.classes.init import Auth

In [13]:
client = weaviate.connect_to_weaviate_cloud(
    cluster_url=WEAVIATE_REST_END_POINT,
    auth_credentials=Auth.api_key(WEAVIATE_API_KEY),
)

In [14]:
print(client.is_ready())

True


In [15]:
import weaviate.classes as wvc

In [16]:
weaviate_colleciton = client.collections.create(
    name="vectordb_compare",
    vector_config=wvc.config.Configure.Vectors.self_provided(),
)

### Creating the record structure expected by Weaviate (default metadata is also created on insert)

In [17]:
from datetime import datetime
records = []
for i, (text, emb) in enumerate(zip(sentences, embeddings_array)):
    records.append(wvc.data.DataObject(
            properties = {
                "text": text,
                "inserted_on" : str(datetime.now()),
                "inserted_by" : 'batch_python'
            },
            vector = emb.tolist()
    ))

In [18]:
weaviate_colleciton.data.insert_many(records)

BatchObjectReturn(_all_responses=[UUID('258d8b85-0297-46e8-aad4-c8339a7170a3'), UUID('177a6d2d-54f5-4e3f-8198-c4684c5ea062'), UUID('057683d0-f6b1-45dd-89fb-dd03404a57ec'), UUID('3ad937f1-e727-4353-b072-e2f90a2da1cd'), UUID('83de45d9-5fad-4cf7-98ad-ee3273d9d7bd'), UUID('0c31caf3-e4f2-4118-b2cb-a29cecbf8108'), UUID('ebf9c2fe-7ce0-4d2e-8e53-0561adf1f751'), UUID('259c4bf0-2390-4134-9e83-46c9fdfa7488'), UUID('9c3479c1-17b6-41b9-a760-4d3cfbcbfbb5'), UUID('ded6229a-ab21-47fc-8ec6-c93d82dd1aa0'), UUID('91db67f3-5ba4-421a-ab8d-79567da1a997'), UUID('bc488902-156a-4e76-ba02-3c9c63f484e6'), UUID('c3abf0f4-3ef4-4468-82db-07afbbc43144'), UUID('607c095c-014f-418d-b79a-0f821d8be204'), UUID('09a6e866-5d85-462e-86fe-a6836785a200'), UUID('312c4d51-78d0-47dd-bafc-b08d77ef27d1'), UUID('9556633c-83a9-4c38-82fc-19deca43f98e'), UUID('1a507786-6568-45f9-a0c0-6db252e132af'), UUID('ebe5ee11-69e1-438e-bc90-9c3f02bdabf3'), UUID('ef12554a-4c5b-4c45-9228-6f286c7677a4'), UUID('54a8d078-fc5b-4b53-bd67-9c0eaea206f6'), 

In [19]:
query = queries[0]
query

'AI in healthcare'

In [35]:
query_vec = generate_embeddings(text=query).reshape(1, -1).tolist()

In [36]:
query_vec[0]

[-0.040189046412706375,
 -0.021105283871293068,
 0.03967612236738205,
 0.03578393906354904,
 0.019174277782440186,
 -0.026415547356009483,
 -0.014776716008782387,
 0.04574068635702133,
 -0.01878204382956028,
 0.0015425413148477674,
 0.014995462261140347,
 -0.07199028879404068,
 -0.0115181440487504,
 -0.0006477541755884886,
 -0.021512605249881744,
 -0.02637029066681862,
 -0.004159958567470312,
 -0.0388614796102047,
 -0.014293964020907879,
 -0.008485862985253334,
 0.013283204287290573,
 0.04516742005944252,
 0.01743939146399498,
 -0.015825191512703896,
 -0.006151307839900255,
 -0.020758306607604027,
 0.013132344000041485,
 -0.017288530245423317,
 0.001303994213230908,
 -0.02247810736298561,
 0.008417976088821888,
 -0.02290051430463791,
 -0.035602908581495285,
 -0.021980270743370056,
 0.004699282348155975,
 0.024394026026129723,
 -0.020185038447380066,
 0.030896084383130074,
 0.0229306872934103,
 0.002562730573117733,
 0.02391127496957779,
 0.020878994837403297,
 -0.002962508937343955,
 0

In [43]:
response = weaviate_colleciton.query.near_vector(
    near_vector=query_vec[0],
    limit=2,
    return_metadata=wvc.query.MetadataQuery(certainty=True, distance=True)
)

In [44]:
for o in response.objects:
    print(o.properties)
    print(o.metadata.distance)

{'text': 'Artificial intelligence is transforming modern healthcare through diagnostic tools.', 'inserted_on': '2025-09-24 13:30:05.148075', 'inserted_by': 'batch_python'}
0.355183482170105
{'text': 'Cloud computing enables scalable AI deployment across industries.', 'inserted_on': '2025-09-24 13:30:05.148387', 'inserted_by': 'batch_python'}
0.49164456129074097
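The numbers reported by the different databases are not directly comparable because each uses its own metric: Pinecone printed a cosine similarity (0.6443), Chroma a distance of 0.7103, and Weaviate a distance of 0.3552 for the same top result. For unit-length vectors the metrics are linked by `||a - b||² = 2 - 2·cos(a, b)`. The sketch below applies that identity under three assumptions that the notebook does not verify: the embeddings are unit-normalized, the Chroma collection uses its default squared-L2 space, and the Weaviate collection uses its default cosine distance.

```python
def squared_l2_from_cosine(cos_sim):
    """Squared Euclidean distance between two unit vectors with the given cosine similarity."""
    return 2.0 - 2.0 * cos_sim


def cosine_distance_from_cosine(cos_sim):
    """Cosine distance, defined as 1 - cosine similarity."""
    return 1.0 - cos_sim


pinecone_score = 0.6443  # top-1 cosine similarity reported in the Pinecone section

# Compare these with the Chroma (0.7103) and Weaviate (0.3552) distances printed earlier.
print(round(squared_l2_from_cosine(pinecone_score), 4))
print(round(cosine_distance_from_cosine(pinecone_score), 4))
```

The converted values land close to the Chroma and Weaviate outputs, which suggests all three systems rank the top result consistently. The small remaining gaps are plausibly due to the query being re-embedded for each section, since the printed query vectors differ slightly between runs.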
