<a href="https://colab.research.google.com/github/graphlit/graphlit-samples/blob/main/python/Notebook%20Examples/Graphlit_2024_12_25_Compare_RAG_Performance_between_OpenAI%2C_Groq%2C_Anthropic_and_more.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Description**

This example shows how to compare the completion time and output of different LLMs when each one answers the same RAG prompt over the same ingested content.

**Requirements**

Before running this notebook, you will need to [sign up](https://docs.graphlit.dev/getting-started/signup) for Graphlit and [create a project](https://docs.graphlit.dev/getting-started/create-project).

You will need the Graphlit organization ID, preview environment ID and JWT secret from your created project.

Assign these properties as Colab secrets: GRAPHLIT_ORGANIZATION_ID, GRAPHLIT_ENVIRONMENT_ID and GRAPHLIT_JWT_SECRET.


---

Install Graphlit Python client SDK

In [23]:
!pip install --upgrade graphlit-client



Initialize Graphlit

In [24]:
import os
from google.colab import userdata
from graphlit import Graphlit
from graphlit_api import input_types, enums, exceptions

os.environ['GRAPHLIT_ORGANIZATION_ID'] = userdata.get('GRAPHLIT_ORGANIZATION_ID')
os.environ['GRAPHLIT_ENVIRONMENT_ID'] = userdata.get('GRAPHLIT_ENVIRONMENT_ID')
os.environ['GRAPHLIT_JWT_SECRET'] = userdata.get('GRAPHLIT_JWT_SECRET')

graphlit = Graphlit()
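The initialization above reads credentials from Colab secrets, which only works inside Colab. As a minimal sketch for running the same notebook elsewhere, the hypothetical helper `get_secret` below (not part of the Graphlit SDK) tries Colab first and falls back to ordinary environment variables. It assumes `userdata.get` returns the secret value, as used in the cell above.

```python
import os


def get_secret(name: str) -> str:
    """Read a secret from Colab if available, otherwise from the environment.

    `get_secret` is a hypothetical helper for illustration only.
    """
    try:
        # google.colab is only importable inside a Colab runtime
        from google.colab import userdata
    except ImportError:
        value = os.environ.get(name)
    else:
        value = userdata.get(name)

    if not value:
        raise RuntimeError(f"Missing required secret: {name}")
    return value
```

With this in place, a line such as `os.environ['GRAPHLIT_JWT_SECRET'] = get_secret('GRAPHLIT_JWT_SECRET')` would behave the same in Colab and fail with a clear message elsewhere if the variable is unset.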

Define Graphlit helper functions

In [25]:
from typing import List, Optional

async def ingest_uri(uri: str):
    if graphlit.client is None:
        return None

    try:
        # Using synchronous mode, so the notebook waits for the content to be ingested
        response = await graphlit.client.ingest_uri(uri=uri, is_synchronous=True)

        return response.ingest_uri.id if response.ingest_uri is not None else None
    except exceptions.GraphQLClientError as e:
        print(str(e))
        return None

async def create_openai_specification(model: enums.OpenAIModels):
    if graphlit.client is None:
        return None

    input = input_types.SpecificationInput(
        name=f"OpenAI [{str(model)}]",
        type=enums.SpecificationTypes.COMPLETION,
        serviceType=enums.ModelServiceTypes.OPEN_AI,
        openAI=input_types.OpenAIModelPropertiesInput(
            model=model
        )
    )

    try:
        response = await graphlit.client.create_specification(input)

        return response.create_specification.id if response.create_specification is not None else None
    except exceptions.GraphQLClientError as e:
        print(str(e))
        return None

async def create_cohere_specification(model: enums.CohereModels):
    if graphlit.client is None:
        return None

    input = input_types.SpecificationInput(
        name=f"Cohere [{str(model)}]",
        type=enums.SpecificationTypes.COMPLETION,
        serviceType=enums.ModelServiceTypes.COHERE,
        cohere=input_types.CohereModelPropertiesInput(
            model=model
        )
    )

    try:
        response = await graphlit.client.create_specification(input)

        return response.create_specification.id if response.create_specification is not None else None
    except exceptions.GraphQLClientError as e:
        print(str(e))
        return None

async def create_google_specification(model: enums.GoogleModels):
    if graphlit.client is None:
        return None

    input = input_types.SpecificationInput(
        name=f"Google [{str(model)}]",
        type=enums.SpecificationTypes.COMPLETION,
        serviceType=enums.ModelServiceTypes.GOOGLE,
        google=input_types.GoogleModelPropertiesInput(
            model=model
        )
    )

    try:
        response = await graphlit.client.create_specification(input)

        return response.create_specification.id if response.create_specification is not None else None
    except exceptions.GraphQLClientError as e:
        print(str(e))
        return None

async def create_groq_specification(model: enums.GroqModels):
    if graphlit.client is None:
        return None

    input = input_types.SpecificationInput(
        name=f"Groq [{str(model)}]",
        type=enums.SpecificationTypes.COMPLETION,
        serviceType=enums.ModelServiceTypes.GROQ,
        groq=input_types.GroqModelPropertiesInput(
            model=model
        )
    )

    try:
        response = await graphlit.client.create_specification(input)

        return response.create_specification.id if response.create_specification is not None else None
    except exceptions.GraphQLClientError as e:
        print(str(e))
        return None

async def create_anthropic_specification(model: enums.AnthropicModels):
    if graphlit.client is None:
        return None

    input = input_types.SpecificationInput(
        name=f"Anthropic [{str(model)}]",
        type=enums.SpecificationTypes.COMPLETION,
        serviceType=enums.ModelServiceTypes.ANTHROPIC,
        anthropic=input_types.AnthropicModelPropertiesInput(
            model=model
        )
    )

    try:
        response = await graphlit.client.create_specification(input)

        return response.create_specification.id if response.create_specification is not None else None
    except exceptions.GraphQLClientError as e:
        print(str(e))
        return None

async def create_deepseek_specification(model: enums.DeepseekModels):
    if graphlit.client is None:
        return None

    input = input_types.SpecificationInput(
        name=f"Deepseek [{str(model)}]",
        type=enums.SpecificationTypes.COMPLETION,
        serviceType=enums.ModelServiceTypes.DEEPSEEK,
        deepseek=input_types.DeepseekModelPropertiesInput(
            model=model
        )
    )

    try:
        response = await graphlit.client.create_specification(input)

        return response.create_specification.id if response.create_specification is not None else None
    except exceptions.GraphQLClientError as e:
        print(str(e))
        return None

async def create_conversation(specification_id: str):
    if graphlit.client is None:
        return None

    input = input_types.ConversationInput(
        name="Conversation",
        specification=input_types.EntityReferenceInput(
            id=specification_id
        )
    )

    try:
        response = await graphlit.client.create_conversation(input)

        return response.create_conversation.id if response.create_conversation is not None else None
    except exceptions.GraphQLClientError as e:
        print(str(e))
        return None

async def delete_conversation(conversation_id: str):
    if graphlit.client is None:
        return None

    if conversation_id is not None:
        _ = await graphlit.client.delete_conversation(conversation_id)

async def prompt_conversation(conversation_id: str, prompt: str):
    if graphlit.client is None:
        return None, None

    try:
        response = await graphlit.client.prompt_conversation(prompt, conversation_id, include_details=True)

        message = response.prompt_conversation.message if response.prompt_conversation is not None else None
        details = response.prompt_conversation.details if response.prompt_conversation is not None else None

        return message, details
    except exceptions.GraphQLClientError as e:
        print(str(e))
        return None, None

async def delete_all_specifications():
    if graphlit.client is None:
        return None

    _ = await graphlit.client.delete_all_specifications(is_synchronous=True)

async def delete_all_conversations():
    if graphlit.client is None:
        return None

    _ = await graphlit.client.delete_all_conversations(is_synchronous=True)

async def delete_all_contents():
    if graphlit.client is None:
        return None

    _ = await graphlit.client.delete_all_contents(is_synchronous=True)


Execute Graphlit example

In [26]:
from IPython.display import display, Markdown, HTML
import time

# Remove any existing contents, conversations and specifications; only needed for notebook example
await delete_all_conversations()
await delete_all_specifications()
await delete_all_contents()

print('Deleted all contents, conversations and specifications.')

content_id = await ingest_uri(uri="https://graphlitplatform.blob.core.windows.net/samples/Unstructured%20Data%20is%20Dark%20Data%20Podcast.mp3")

if content_id is not None:
    print(f'Ingested content [{content_id}]:')

Deleted all contents, conversations and specifications.
Ingested content [a2664d78-a881-4f02-b4c1-ed30871c71e8]:


In [27]:
    # Specify the RAG prompt
    prompt = "In 3-5 detailed paragraphs, explain unstructured data and its usefulness for knowledge capture and retrieval."
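Each model cell in this notebook repeats the same sequence: create a specification, create a conversation, prompt it once, then delete the conversation. As a rough sketch of how that sequence could be factored out, the hypothetical `run_model` helper below takes the four steps as async callables, so it does not depend on any particular SDK call. The name `run_model` and its parameter names are assumptions made for illustration.

```python
import asyncio  # only needed to run the sketch outside a notebook
from typing import Any, Awaitable, Callable, Optional, Tuple


async def run_model(
    create_spec: Callable[[], Awaitable[Optional[str]]],
    create_conv: Callable[[str], Awaitable[Optional[str]]],
    prompt_conv: Callable[[str, str], Awaitable[Tuple[Any, Any]]],
    delete_conv: Callable[[str], Awaitable[Any]],
    prompt: str,
) -> Tuple[Any, Any]:
    """Create a specification and a conversation, prompt once, then clean up.

    Returns (message, details), or (None, None) if either creation step fails.
    """
    specification_id = await create_spec()
    if specification_id is None:
        return None, None

    conversation_id = await create_conv(specification_id)
    if conversation_id is None:
        return None, None

    try:
        return await prompt_conv(conversation_id, prompt)
    finally:
        # Delete the conversation even if prompting raises
        await delete_conv(conversation_id)
```

In this notebook it could be called as `message, details = await run_model(lambda: create_openai_specification(enums.OpenAIModels.GPT4O_128K), create_conversation, prompt_conversation, delete_conversation, prompt)`, with the display logic kept in the cell.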

Create OpenAI GPT-4o specification.

In [28]:
    specification_id = await create_openai_specification(enums.OpenAIModels.GPT4O_128K)

    if specification_id is not None:
        print(f'Created OpenAI GPT-4o specification [{specification_id}].')

        conversation_id = await create_conversation(specification_id)

        if conversation_id is not None:
            print(f'Created conversation [{conversation_id}].')

            message, details = await prompt_conversation(conversation_id, prompt)

            if message is not None:
                display(Markdown(f'### Completion took [{message.completion_time}]:'))
                display(Markdown(f'**User:**\n{prompt}'))
                display(Markdown(f'**Assistant:**'))
                print(message.message)
                print()

            if details is not None:
                display(Markdown('### Details:'))
                display(Markdown(f'**Model**: {details.model_service} {details.model}'))
                display(Markdown(f'**Token Limit**: {details.token_limit}'))
                display(Markdown(f'**Completion Token Limit**: {details.completion_token_limit}'))

            await delete_conversation(conversation_id)

Created OpenAI GPT-4o specification [0ccfca3b-4053-49c8-ac90-a7f223cb35d9].
Created conversation [4a39d2ad-b0b7-41cc-8e32-201e4266e1e6].


### Completion took [PT6.5102745S]:

**User:**
In 3-5 detailed paragraphs, explain unstructured data and its usefulness for knowledge capture and retrieval.

**Assistant:**

Unstructured data refers to information that does not have a predefined data model or is not organized in a pre-defined manner. This type of data includes a wide variety of formats such as text, images, audio, video, and more complex data types like 3D models and point clouds. Unlike structured data, which is neatly organized in databases with clear relationships, unstructured data is often stored in its raw form, making it more challenging to process and analyze. Despite this complexity, unstructured data is incredibly valuable because it represents the vast majority of data generated today, encompassing everything from social media posts and emails to satellite imagery and IoT sensor data.

The usefulness of unstructured data lies in its potential for knowledge capture and retrieval. By leveraging advanced technologies such as machine learning and natural language processing, organizations can extract meaningful insights from unstructured data. For instance, images can be analyzed to

### Details:

**Model**: OPEN_AI GPT-4o 128k (Latest)

**Token Limit**: 128000

**Completion Token Limit**: 16384

Create Groq Llama 3.2 specification.

In [29]:
    specification_id = await create_groq_specification(enums.GroqModels.LLAMA_3_2_11B_VISION_PREVIEW)

    if specification_id is not None:
        print(f'Created Groq Llama 3.2 specification [{specification_id}].')

        conversation_id = await create_conversation(specification_id)

        if conversation_id is not None:
            print(f'Created conversation [{conversation_id}].')

            message, details = await prompt_conversation(conversation_id, prompt)

            if message is not None:
                display(Markdown(f'### Completion took [{message.completion_time}]:'))
                display(Markdown(f'**User:**\n{prompt}'))
                display(Markdown(f'**Assistant:**'))
                print(message.message)
                print()

            if details is not None:
                display(Markdown('### Details:'))
                display(Markdown(f'**Model**: {details.model_service} {details.model}'))
                display(Markdown(f'**Token Limit**: {details.token_limit}'))
                display(Markdown(f'**Completion Token Limit**: {details.completion_token_limit}'))

            await delete_conversation(conversation_id)

Created Groq Llama 3.2 specification [3c88c956-adcb-420e-be89-8698850c1f22].
Created conversation [faeeae2a-22c6-4afd-8c66-88477482cf71].


### Completion took [PT1.2487748S]:

**User:**
In 3-5 detailed paragraphs, explain unstructured data and its usefulness for knowledge capture and retrieval.

**Assistant:**

Unstructured data refers to a broad set of data that lacks a predefined structure or organization, making it difficult to analyze and process using traditional data management tools. This type of data can include everything from images and audio files to documents and emails, as well as 3D models and geometry point clouds. According to Kirk Marple, the founder of Unstruct Data, unstructured data is a "broad set of file-based" data that is often overlooked in favor of more structured data.

Despite its challenges, unstructured data holds significant value for knowledge capture and retrieval. By leveraging machine learning and artificial intelligence, it is possible to extract insights and meaning from unstructured data, creating a knowledge graph that can be used to make connections and inferences between different pieces of information. This can be particularly useful in industries such as geospatial, where data is often generated by sensors and other devices, and needs to be analyzed 

### Details:

**Model**: GROQ LLaMA 3.2 11b Vision Preview

**Token Limit**: 8192

**Completion Token Limit**: 820

Create Groq Llama 3.3 specification.

In [30]:
    specification_id = await create_groq_specification(enums.GroqModels.LLAMA_3_3_70B)

    if specification_id is not None:
        print(f'Created Groq Llama 3.3 specification [{specification_id}].')

        conversation_id = await create_conversation(specification_id)

        if conversation_id is not None:
            print(f'Created conversation [{conversation_id}].')

            message, details = await prompt_conversation(conversation_id, prompt)

            if message is not None:
                display(Markdown(f'### Completion took [{message.completion_time}]:'))
                display(Markdown(f'**User:**\n{prompt}'))
                display(Markdown(f'**Assistant:**'))
                print(message.message)
                print()

            if details is not None:
                display(Markdown('### Details:'))
                display(Markdown(f'**Model**: {details.model_service} {details.model}'))
                display(Markdown(f'**Token Limit**: {details.token_limit}'))
                display(Markdown(f'**Completion Token Limit**: {details.completion_token_limit}'))

            await delete_conversation(conversation_id)

Created Groq Llama 3.3 specification [100afb5b-3998-47a1-806d-661eb5bcc697].
Created conversation [06189990-fe41-4bfe-a9ba-95358fb7aaaf].


### Completion took [PT3.3312745S]:

**User:**
In 3-5 detailed paragraphs, explain unstructured data and its usefulness for knowledge capture and retrieval.

**Assistant:**

Unstructured data refers to a broad set of data that does not have a predefined format or organization, making it difficult to search and analyze using traditional methods. This type of data can come in various forms, including images, audio, videos, documents, emails, and social media posts. Despite its lack of structure, unstructured data can be incredibly valuable for knowledge capture and retrieval, as it often contains rich and nuanced information that can provide insights and context. For instance, an image of a piece of equipment can reveal details about its condition, location, and surroundings, while an audio recording of a meeting can capture discussions, decisions, and action items.

The usefulness of unstructured data lies in its ability to provide a more complete and accurate picture of a particular situation or phenomenon. By analyzing unstructured data, organizations can gain a deeper understanding of their customers, operations, and environments, which can inform decisi

### Details:

**Model**: GROQ LLaMA 3.3 70b

**Token Limit**: 8192

**Completion Token Limit**: 820

Create Anthropic Sonnet 3.5 specification.

In [31]:
    specification_id = await create_anthropic_specification(enums.AnthropicModels.CLAUDE_3_5_SONNET_20241022)

    if specification_id is not None:
        print(f'Created Anthropic Sonnet 3.5 specification [{specification_id}].')

        conversation_id = await create_conversation(specification_id)

        if conversation_id is not None:
            print(f'Created conversation [{conversation_id}].')

            message, details = await prompt_conversation(conversation_id, prompt)

            if message is not None:
                display(Markdown(f'### Completion took [{message.completion_time}]:'))
                display(Markdown(f'**User:**\n{prompt}'))
                display(Markdown(f'**Assistant:**'))
                print(message.message)
                print()

            if details is not None:
                display(Markdown('### Details:'))
                display(Markdown(f'**Model**: {details.model_service} {details.model}'))
                display(Markdown(f'**Token Limit**: {details.token_limit}'))
                display(Markdown(f'**Completion Token Limit**: {details.completion_token_limit}'))

            await delete_conversation(conversation_id)

Created Anthropic Sonnet 3.5 specification [9205cf85-52e2-469c-ae0e-1bbe7aed3965].
Created conversation [164b2f67-1222-403e-a027-ddda6e128fb6].


### Completion took [PT11.0075589S]:

**User:**
In 3-5 detailed paragraphs, explain unstructured data and its usefulness for knowledge capture and retrieval.

**Assistant:**

Unstructured data encompasses a broad range of file-based content including images, audio, 3D geometry, point clouds, documents, and emails. While these files have defined structures and formats, they are considered "unstructured" because their valuable content and meaning requires additional processing and analysis to extract. This data often contains rich metadata at multiple levels - from basic file headers and technical details to deeper semantic meaning derived through advanced processing.

The real power of unstructured data emerges through the creation of knowledge graphs and relationship networks. By analyzing the content through techniques like computer vision, natural language processing, and audio transcription, organizations can extract meaningful insights and create connections between different pieces of information. This allows for dynamic discovery of relationships between people, places, things and concepts that may not be immediately obvious when looking at individual

### Details:

**Model**: ANTHROPIC Claude 3.5 Sonnet (10-22-2024 version)

**Token Limit**: 200000

**Completion Token Limit**: 8191

Create Anthropic Haiku 3.5 specification.

In [37]:
    specification_id = await create_anthropic_specification(enums.AnthropicModels.CLAUDE_3_5_HAIKU_20241022)

    if specification_id is not None:
        print(f'Created Anthropic Haiku 3.5 specification [{specification_id}].')

        conversation_id = await create_conversation(specification_id)

        if conversation_id is not None:
            print(f'Created conversation [{conversation_id}].')

            message, details = await prompt_conversation(conversation_id, prompt)

            if message is not None:
                display(Markdown(f'### Completion took [{message.completion_time}]:'))
                display(Markdown(f'**User:**\n{prompt}'))
                display(Markdown(f'**Assistant:**'))
                print(message.message)
                print()

            if details is not None:
                display(Markdown('### Details:'))
                display(Markdown(f'**Model**: {details.model_service} {details.model}'))
                display(Markdown(f'**Token Limit**: {details.token_limit}'))
                display(Markdown(f'**Completion Token Limit**: {details.completion_token_limit}'))

            await delete_conversation(conversation_id)

Created Anthropic Haiku 3.5 specification [fe01f6c9-ce5e-402c-a142-8f3adeac5b85].
Created conversation [7a15fea4-89ea-4b52-ada1-35d4b9637c41].


### Completion took [PT13.1650699S]:

**User:**
In 3-5 detailed paragraphs, explain unstructured data and its usefulness for knowledge capture and retrieval.

**Assistant:**

Unstructured data represents a broad category of digital information that includes imagery, audio, 3D geometry, point clouds, documents, and emails, which traditionally have been challenging to systematically analyze and extract insights from. Unlike structured data stored in traditional databases, unstructured data requires sophisticated processing techniques to transform raw files into meaningful, searchable information. The key challenge is converting these diverse file types into a format where relationships, context, and insights can be discovered and leveraged.

The process of working with unstructured data involves multiple layers of metadata extraction, starting with "first order" metadata like file headers and embedded information, progressing to "second order" metadata through techniques like object detection and computer vision, and ultimately reaching "third order" metadata where machine learning can create sophisticated inferences and connections. For example, an image of 

### Details:

**Model**: ANTHROPIC Claude 3.5 Haiku (10-22-2024 version)

**Token Limit**: 200000

**Completion Token Limit**: 8191

Create Deepseek Chat specification.

In [32]:
    specification_id = await create_deepseek_specification(enums.DeepseekModels.CHAT)

    if specification_id is not None:
        print(f'Created Deepseek Chat specification [{specification_id}].')

        conversation_id = await create_conversation(specification_id)

        if conversation_id is not None:
            print(f'Created conversation [{conversation_id}].')

            message, details = await prompt_conversation(conversation_id, prompt)

            if message is not None:
                display(Markdown(f'### Completion took [{message.completion_time}]:'))
                display(Markdown(f'**User:**\n{prompt}'))
                display(Markdown(f'**Assistant:**'))
                print(message.message)
                print()

            if details is not None:
                display(Markdown('### Details:'))
                display(Markdown(f'**Model**: {details.model_service} {details.model}'))
                display(Markdown(f'**Token Limit**: {details.token_limit}'))
                display(Markdown(f'**Completion Token Limit**: {details.completion_token_limit}'))

            await delete_conversation(conversation_id)

Created Deepseek Chat specification [47c3307f-5fbc-4b9f-bf68-862c913eb6ad].
Created conversation [8157b083-16bf-4919-b321-f58c9f6af036].


### Completion took [PT12.656048S]:

**User:**
In 3-5 detailed paragraphs, explain unstructured data and its usefulness for knowledge capture and retrieval.

**Assistant:**

Unstructured data refers to information that does not have a predefined data model or is not organized in a structured format like a database. It encompasses a wide range of file types, including images, audio, videos, 3D geometry, point clouds, documents, emails, and more. Unlike structured data, which is neatly organized into rows and columns, unstructured data is often raw and unprocessed, making it challenging to analyze directly. However, it is incredibly valuable because it captures real-world information in its most natural form, such as photos of physical assets, maintenance reports, or even recorded Zoom meetings about inspections. This data is often generated by drones, robots, mobile phones, or other devices that capture real-world assets, making it a critical resource for industries like geospatial, engineering, and infrastructure.

The usefulness of unstructured data lies in its ability to provide rich, contextual insights when properly processed and analyzed. For instance

### Details:

**Model**: DEEPSEEK Deepseek Chat

**Token Limit**: 128000

**Completion Token Limit**: 4095

Create Google Gemini 1.5 Flash specification.

In [33]:
    specification_id = await create_google_specification(enums.GoogleModels.GEMINI_1_5_FLASH_002)

    if specification_id is not None:
        print(f'Created Google Gemini 1.5 Flash specification [{specification_id}].')

        conversation_id = await create_conversation(specification_id)

        if conversation_id is not None:
            print(f'Created conversation [{conversation_id}].')

            message, details = await prompt_conversation(conversation_id, prompt)

            if message is not None:
                display(Markdown(f'### Completion took [{message.completion_time}]:'))
                display(Markdown(f'**User:**\n{prompt}'))
                display(Markdown(f'**Assistant:**'))
                print(message.message)
                print()

            if details is not None:
                display(Markdown('### Details:'))
                display(Markdown(f'**Model**: {details.model_service} {details.model}'))
                display(Markdown(f'**Token Limit**: {details.token_limit}'))
                display(Markdown(f'**Completion Token Limit**: {details.completion_token_limit}'))

            await delete_conversation(conversation_id)

Created Google Gemini 1.5 Flash specification [5410dc05-d3a5-4525-bef3-0e1fe4cf71e5].
Created conversation [327e0f1f-a27a-4eb4-b9a4-1d0d6acb16ec].


### Completion took [PT4.298679S]:

**User:**
In 3-5 detailed paragraphs, explain unstructured data and its usefulness for knowledge capture and retrieval.

**Assistant:**

Unstructured data encompasses a wide range of information formats, including images, audio, 3D models, documents, and emails, lacking the rigid structure of traditional databases.  While often perceived as disorganized, this data is incredibly rich in potential insights.  The challenge lies in effectively processing and extracting meaningful information from these diverse sources.  This is where the concept of metadata becomes crucial, allowing for the organization and contextualization of unstructured data.  Different levels of metadata exist, from basic file headers (first-order) to complex inferences derived from machine learning (third-order), enabling increasingly sophisticated knowledge capture.

The process of transforming unstructured data into usable knowledge involves several key steps.  First, initial metadata is extracted from the files themselves.  Then, more detailed information is derived through techniques like object detection in images or sentiment analysis in text.  

### Details:

**Model**: GOOGLE Gemini 1.5 Flash (002 version)

**Token Limit**: 1048576

**Completion Token Limit**: 8191

Create Google Gemini 2.0 Flash specification.

In [34]:
    specification_id = await create_google_specification(enums.GoogleModels.GEMINI_2_0_FLASH_EXPERIMENTAL)

    if specification_id is not None:
        print(f'Created Google Gemini 2.0 Flash specification [{specification_id}].')

        conversation_id = await create_conversation(specification_id)

        if conversation_id is not None:
            print(f'Created conversation [{conversation_id}].')

            message, details = await prompt_conversation(conversation_id, prompt)

            if message is not None:
                display(Markdown(f'### Completion took [{message.completion_time}]:'))
                display(Markdown(f'**User:**\n{prompt}'))
                display(Markdown(f'**Assistant:**'))
                print(message.message)
                print()

            if details is not None:
                display(Markdown('### Details:'))
                display(Markdown(f'**Model**: {details.model_service} {details.model}'))
                display(Markdown(f'**Token Limit**: {details.token_limit}'))
                display(Markdown(f'**Completion Token Limit**: {details.completion_token_limit}'))

            await delete_conversation(conversation_id)

Created Google Gemini 2.0 Flash specification [35604323-8d5d-4532-8bfe-f0293b7f6259].
Created conversation [b8462250-d561-48de-b8cb-ca4e90d6c95e].


### Completion took [PT4.137467S]:

**User:**
In 3-5 detailed paragraphs, explain unstructured data and its usefulness for knowledge capture and retrieval.

**Assistant:**

Unstructured data encompasses a wide range of file-based information, including imagery, audio, 3D models, documents, and emails. While these files have a defined structure, the content within them is often considered unstructured because it lacks a predefined format for easy analysis. This is in contrast to structured data, which is organized in databases with clear schemas. The challenge with unstructured data lies in extracting meaningful insights from the raw content, which requires parsing files and identifying relevant information.

Metadata plays a crucial role in making unstructured data more accessible and useful. First-order metadata, such as EXIF data in images or file headers, provides basic information about the file itself. Second-order metadata involves analyzing the content of the file, such as identifying objects in an image or extracting terms from a document. Third-order metadata goes further by creating inferences and connections between the data and real-world enti

### Details:

**Model**: GOOGLE Gemini 2.0 Flash (Experimental)

**Token Limit**: 1048576

**Completion Token Limit**: 8191

Create Cohere R7B specification.

In [35]:
    specification_id = await create_cohere_specification(enums.CohereModels.COMMAND_R7_B_202412)

    if specification_id is not None:
        print(f'Created Cohere R7B specification [{specification_id}].')

        conversation_id = await create_conversation(specification_id)

        if conversation_id is not None:
            print(f'Created conversation [{conversation_id}].')

            message, details = await prompt_conversation(conversation_id, prompt)

            if message is not None:
                display(Markdown(f'### Completion took [{message.completion_time}]:'))
                display(Markdown(f'**User:**\n{prompt}'))
                display(Markdown(f'**Assistant:**'))
                print(message.message)
                print()

            if details is not None:
                display(Markdown('### Details:'))
                display(Markdown(f'**Model**: {details.model_service} {details.model}'))
                display(Markdown(f'**Token Limit**: {details.token_limit}'))
                display(Markdown(f'**Completion Token Limit**: {details.completion_token_limit}'))

            await delete_conversation(conversation_id)

Created Cohere R7B specification [0004b180-1b7a-4a88-852e-2db2c60c125c].
Created conversation [7bbfaf0d-4d88-4007-a291-ccf7a84f57cf].


### Completion took [PT6.9068167S]:

**User:**
In 3-5 detailed paragraphs, explain unstructured data and its usefulness for knowledge capture and retrieval.

**Assistant:**

Unstructured data encompasses a broad range of information, including imagery, audio, 3D geometry, point clouds, documents, and emails. It is characterized by its lack of a predefined structure, which sets it apart from structured data. While structured data is organized and easily searchable, unstructured data requires additional processing to extract meaningful insights. This is where metadata comes into play, acting as a bridge between the raw data and the information it contains.

The value of unstructured data lies in its ability to capture and contextualize real-world entities, such as people, places, and things. By adding metadata, which provides additional context, unstructured data can be transformed into a powerful tool for knowledge capture and retrieval. For example, in the geospatial domain, unstructured data can include satellite imagery, drone footage, and mobile phone data, all of which can be enriched with metadata to create a comprehensive knowledge graph.

One of the

### Details:

**Model**: COHERE Command R7B (2024-12 version)

**Token Limit**: 128000

**Completion Token Limit**: 4095
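The completion times printed above appear as ISO 8601 durations such as `PT6.5102745S`, which are awkward to compare by eye. The sketch below converts them to seconds and sorts the models from fastest to slowest, using the values from the outputs of this run. It assumes the durations only ever use hour, minute and second components; `duration_to_seconds` is a hypothetical helper, not part of the Graphlit SDK. These are single-run timings, so the ordering is indicative only.

```python
import re

# Completion times copied from the outputs of this notebook run
completion_times = {
    "OpenAI GPT-4o 128k": "PT6.5102745S",
    "Groq LLaMA 3.2 11b Vision Preview": "PT1.2487748S",
    "Groq LLaMA 3.3 70b": "PT3.3312745S",
    "Anthropic Claude 3.5 Sonnet": "PT11.0075589S",
    "Anthropic Claude 3.5 Haiku": "PT13.1650699S",
    "Deepseek Chat": "PT12.656048S",
    "Google Gemini 1.5 Flash": "PT4.298679S",
    "Google Gemini 2.0 Flash": "PT4.137467S",
    "Cohere Command R7B": "PT6.9068167S",
}

# Matches time-only durations: optional hours, minutes and (fractional) seconds
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?$")


def duration_to_seconds(value: str) -> float:
    """Convert a time-only ISO 8601 duration (e.g. 'PT1M30.5S') to seconds."""
    match = _DURATION.match(value)
    if match is None:
        raise ValueError(f"Unsupported duration: {value!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds or 0)


# Sort models from fastest to slowest completion
ranked = sorted(completion_times.items(), key=lambda item: duration_to_seconds(item[1]))

for name, raw in ranked:
    print(f"{duration_to_seconds(raw):7.2f}s  {name}")
```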