<a href="https://colab.research.google.com/github/graphlit/graphlit-samples/blob/main/python/Notebook%20Examples/Graphlit_2024_12_27_Publish_Audio_Summary_of_Year_in_Review.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Description**

This example shows how to ingest the Graphlit changelog, use OpenAI o1 to write a comprehensive year-in-review, and publish it as audio using an [ElevenLabs](https://elevenlabs.io/) voice.

**Requirements**

Prior to running this notebook, you will need to [sign up](https://docs.graphlit.dev/getting-started/signup) for Graphlit, and [create a project](https://docs.graphlit.dev/getting-started/create-project).

You will need the Graphlit organization ID, preview environment ID and JWT secret from your created project.

Assign these properties as Colab secrets: GRAPHLIT_ORGANIZATION_ID, GRAPHLIT_ENVIRONMENT_ID and GRAPHLIT_JWT_SECRET.
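
Before initializing the client, it can help to confirm that all three values are actually present. The following is a minimal sketch, not part of the Graphlit SDK: the helper name `missing_settings` and the constant `REQUIRED_SETTINGS` are hypothetical, and the check only looks at a plain mapping such as `os.environ`.

```python
import os
from typing import Iterable, List, Mapping

# Names taken from the Requirements section above.
REQUIRED_SETTINGS = (
    "GRAPHLIT_ORGANIZATION_ID",
    "GRAPHLIT_ENVIRONMENT_ID",
    "GRAPHLIT_JWT_SECRET",
)

def missing_settings(env: Mapping[str, str],
                     names: Iterable[str] = REQUIRED_SETTINGS) -> List[str]:
    """Return the names that are absent or blank in `env` (hypothetical helper)."""
    return [name for name in names if not (env.get(name) or "").strip()]

if __name__ == "__main__":
    missing = missing_settings(os.environ)
    if missing:
        print("Missing settings: " + ", ".join(missing))
    else:
        print("All Graphlit settings are present.")
```

Running the check after the environment variables are assigned, and before `Graphlit()` is constructed, surfaces a missing secret early instead of as an authentication error later.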


---

Install Graphlit Python client SDK

In [3]:
!pip install --upgrade graphlit-client



In [4]:
!pip install --upgrade isodate

Collecting isodate
  Downloading isodate-0.7.2-py3-none-any.whl.metadata (11 kB)
Downloading isodate-0.7.2-py3-none-any.whl (22 kB)
Installing collected packages: isodate
Successfully installed isodate-0.7.2


In [5]:
import os
from google.colab import userdata
from graphlit import Graphlit
from graphlit_api import input_types, enums, exceptions

os.environ['GRAPHLIT_ORGANIZATION_ID'] = userdata.get('GRAPHLIT_ORGANIZATION_ID')
os.environ['GRAPHLIT_ENVIRONMENT_ID'] = userdata.get('GRAPHLIT_ENVIRONMENT_ID')
os.environ['GRAPHLIT_JWT_SECRET'] = userdata.get('GRAPHLIT_JWT_SECRET')

graphlit = Graphlit()

Define Graphlit helper functions

In [6]:
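from datetime import timedelta
from typing import List, Optional

import isodate  # used by dump_usage_record to parse ISO 8601 durations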

async def create_specification(model: enums.OpenAIModels):
    if graphlit.client is None:
        return None

    input = input_types.SpecificationInput(
        name=f"OpenAI [{model}]",
        type=enums.SpecificationTypes.COMPLETION,
        serviceType=enums.ModelServiceTypes.OPEN_AI,
        openAI=input_types.OpenAIModelPropertiesInput(
            model=model,
        )
    )

    try:
        response = await graphlit.client.create_specification(input)

        return response.create_specification.id if response.create_specification is not None else None
    except exceptions.GraphQLClientError as e:
        print(str(e))
        return None

    return None

async def create_web_feed(uri: str, correlation_id: Optional[str], limit: Optional[int] = None):
    if graphlit.client is None:
        return None

    input = input_types.FeedInput(
        name=uri,
        type=enums.FeedTypes.WEB,
        web=input_types.WebFeedPropertiesInput(
            uri=uri,
            readLimit=limit if limit is not None else 100
        )
    )

    try:
        response = await graphlit.client.create_feed(input, correlation_id=correlation_id)

        return response.create_feed.id if response.create_feed is not None else None
    except exceptions.GraphQLClientError as e:
        print(str(e))
        return None

    return None

async def is_feed_done(feed_id: str):
    if graphlit.client is None:
        return None

    response = await graphlit.client.is_feed_done(feed_id)

    return response.is_feed_done.result if response.is_feed_done is not None else None


async def lookup_usage(correlation_id: str):
    if graphlit.client is None:
        return None

    try:
        response = await graphlit.client.lookup_usage(correlation_id)

        return response.lookup_usage if response.lookup_usage is not None else None
    except exceptions.GraphQLClientError as e:
        print(str(e))
        return None

async def lookup_credits(correlation_id: str):
    if graphlit.client is None:
        return None

    try:
        response = await graphlit.client.lookup_credits(correlation_id)

        return response.lookup_credits if response.lookup_credits is not None else None
    except exceptions.GraphQLClientError as e:
        print(str(e))
        return None


def dump_usage_record(record):
    print(f"{record.date}: {record.name}")

    duration = isodate.parse_duration(record.duration)

    if record.workflow:
        print(f"- Workflow [{record.workflow}] took {duration}, used credits [{record.credits:.8f}]")
    else:
        print(f"- Operation took {duration}, used credits [{record.credits:.8f}]")

    if record.entity_id:
        if record.entity_type:
            if record.entity_type == enums.EntityTypes.CONTENT and record.content_type:
                print(f"- {record.entity_type} [{record.entity_id}]: Content type [{record.content_type}], file type [{record.file_type}]")
            else:
                print(f"- {record.entity_type} [{record.entity_id}]")
        else:
            print(f"- Entity [{record.entity_id}]")

    if record.model_service:
        print(f"- Model service [{record.model_service}], model name [{record.model_name}]")

    if record.processor_name:
        if record.processor_name in ["Deepgram Audio Transcription", "Assembly.AI Audio Transcription"]:
            length = timedelta(milliseconds=record.count or 0)

            if record.model_name:
                print(f"- Processor name [{record.processor_name}], model name [{record.model_name}], length [{length}]")
            else:
                print(f"- Processor name [{record.processor_name}], length [{length}]")
        else:
            if record.count:
                if record.model_name:
                    print(f"- Processor name [{record.processor_name}], model name [{record.model_name}], units [{record.count}]")
                else:
                    print(f"- Processor name [{record.processor_name}], units [{record.count}]")
            else:
                if record.model_name:
                    print(f"- Processor name [{record.processor_name}], model name [{record.model_name}]")
                else:
                    print(f"- Processor name [{record.processor_name}]")

    if record.uri:
        print(f"- URI [{record.uri}]")

    if record.name == "Prompt completion":
        if record.prompt:
            print(f"- Prompt [{record.prompt_tokens} tokens (includes RAG context tokens)]:")
            print(record.prompt)

        if record.completion:
            print(f"- Completion [{record.completion_tokens} tokens (includes JSON guardrails tokens)], throughput: {record.throughput:.3f} tokens/sec:")
            print(record.completion)

    elif record.name == "Text embedding":
        if record.prompt_tokens is not None:
            print(f"- Text embedding [{record.prompt_tokens} tokens], throughput: {record.throughput:.3f} tokens/sec")

    elif record.name == "Document preparation":
        if record.prompt_tokens is not None and record.completion_tokens is not None:
            print(f"- Document preparation [{record.prompt_tokens} input tokens, {record.completion_tokens} output tokens], throughput: {record.throughput:.3f} tokens/sec")

    elif record.name == "Data extraction":
        if record.prompt_tokens is not None and record.completion_tokens is not None:
            print(f"- Data extraction [{record.prompt_tokens} input tokens, {record.completion_tokens} output tokens], throughput: {record.throughput:.3f} tokens/sec")

    elif record.name == "GraphQL":
        if record.request:
            print(f"- Request:")
            print(record.request)

        if record.variables:
            print(f"- Variables:")
            print(record.variables)

        if record.response:
            print(f"- Response:")
            print(record.response)

    if record.name.startswith("Upload"):
        print(f"- File upload [{record.count} bytes], throughput: {record.throughput:.3f} bytes/sec")

    print()

async def get_content(content_id: str):
    if graphlit.client is None:
        return None

    response = await graphlit.client.get_content(content_id)

    return response.content

async def query_contents(feed_id: str):
    if graphlit.client is None:
        return None

    try:
        response = await graphlit.client.query_contents(
            filter=input_types.ContentFilter(
                feeds=[
                    input_types.EntityReferenceFilter(
                        id=feed_id
                    )
                ]
            )
        )

        return response.contents.results if response.contents is not None else None
    except exceptions.GraphQLClientError as e:
        print(str(e))
        return None

async def publish_contents(feed_id: str, summary_specification_id: str, publish_specification_id: str, summary_prompt: str, publish_prompt: str, correlation_id: str, voice_id: Optional[str] = None):
    if graphlit.client is None:
        return None

    try:
        response = await graphlit.client.publish_contents(
            name="Published Summary",
            connector=input_types.ContentPublishingConnectorInput(
               type=enums.ContentPublishingServiceTypes.ELEVEN_LABS_AUDIO,
               format=enums.ContentPublishingFormats.MP3,
               elevenLabs=input_types.ElevenLabsPublishingPropertiesInput(
                   model=enums.ElevenLabsModels.TURBO_V2_5,
                   voice=voice_id if voice_id is not None else "ZF6FPAbjXT4488VcRRnw" # ElevenLabs Amelia voice
               )
            ),
            summary_prompt=summary_prompt,
            summary_specification=input_types.EntityReferenceInput(
                id=summary_specification_id
            ),
            publish_prompt = publish_prompt,
            publish_specification=input_types.EntityReferenceInput(
                id=publish_specification_id
            ),
            filter=input_types.ContentFilter(
                feeds=[input_types.EntityReferenceFilter(id=feed_id)]
            ),
            is_synchronous=True,
            correlation_id=correlation_id
        )

        return response.publish_contents.content.id if response.publish_contents is not None and response.publish_contents.content is not None else None
    except exceptions.GraphQLClientError as e:
        print(str(e))
        return None

async def delete_all_specifications():
    if graphlit.client is None:
        return None

    _ = await graphlit.client.delete_all_specifications(is_synchronous=True)

async def delete_all_feeds():
    if graphlit.client is None:
        return None

    _ = await graphlit.client.delete_all_feeds(is_synchronous=True)

async def delete_all_contents():
    if graphlit.client is None:
        return None

    _ = await graphlit.client.delete_all_contents(is_synchronous=True)
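
Feed ingestion is asynchronous, so callers have to poll `is_feed_done` until it reports completion. The sketch below shows one way to poll without blocking the event loop and with an upper bound on the wait. It is an illustration under assumptions, not part of the Graphlit SDK: `wait_until_done`, its `interval` and `timeout` parameters, and their defaults are hypothetical.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional

async def wait_until_done(
    check: Callable[[], Awaitable[Optional[bool]]],
    interval: float = 10.0,
    timeout: float = 600.0,
) -> bool:
    """Poll `check` until it returns a truthy value or `timeout` seconds pass.

    Returns True when `check` succeeded, False when the deadline was reached.
    """
    deadline = time.monotonic() + timeout
    while True:
        if await check():
            return True
        # Give up if another full interval would run past the deadline.
        if time.monotonic() + interval > deadline:
            return False
        # asyncio.sleep yields to the event loop, unlike time.sleep.
        await asyncio.sleep(interval)
```

With the helpers defined above, a call could look like `done = await wait_until_done(lambda: is_feed_done(feed_id))`. A `False` result means the feed did not finish within the timeout, which the caller can report instead of looping indefinitely.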

In [7]:
import time
import isodate
from IPython.display import display, Markdown, HTML
from datetime import datetime, timedelta

# Remove any existing feeds, contents and specifications; only needed for notebook example
await delete_all_feeds()
await delete_all_specifications()
await delete_all_contents()

print('Deleted all feeds, contents and specifications.')

# NOTE: create a unique cost correlation ID
ingestion_correlation_id = datetime.now().isoformat()
publish_correlation_id = datetime.now().isoformat()

uri = "https://changelog.graphlit.dev"
limit = 100 # maximum number of web pages to ingest

feed_id = await create_web_feed(uri, ingestion_correlation_id, limit)

if feed_id is not None:
    print(f'Created feed [{feed_id}]: {uri}')

    # Wait for feed to complete, since ingestion happens asynchronously
    done = False
    time.sleep(5)
    while not done:
        done = await is_feed_done(feed_id)

        if not done:
            time.sleep(10)

    print(f'Completed feed [{feed_id}].')

    # Query contents by feed
    contents = await query_contents(feed_id)

    if contents is not None:
        print(f'Found {len(contents)} contents in feed [{feed_id}].')
        print()

        for content in contents:
            if content is not None:

                display(Markdown(f'# Ingested content [{content.id}]'))

                print(f'Text Mezzanine: {content.text_uri}')

                print(content.markdown)

Deleted all feeds, contents and specifications.
Created feed [b3609ece-d81d-4a38-b359-30951f4a1931]: https://changelog.graphlit.dev
Completed feed [b3609ece-d81d-4a38-b359-30951f4a1931].
Found 48 contents in feed [b3609ece-d81d-4a38-b359-30951f4a1931].



# Ingested content [989c5568-cfd4-4ee4-ba79-73e3552cfdc2]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/989c5568-cfd4-4ee4-ba79-73e3552cfdc2/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎒	September 2024

# September 30: Support for Azure AI Inference models, Mistral Pixtral and latest Google Gemini models
### New Features

- 💡 Graphlit now supports the Azure AI Model Inference API (aka Models as a Service) model service which offers serverless hosting to many models such as Meta Llama 3.2, Cohere Command-R, and many more. For Azure AI, all models are 'custom', and you will need to provide the serverless endpoint, API key and number of tokens accepted in context window, after provisioning the model of your choice.
- We have added support for the multimodal Mistral Pixtral model, under the model enum PIXTRAL_12B_2409.
- We have added versioned model enums for Google Gemini, so you can access GEMINI_1_5_FLASH_001, GEMINI_1_5_FLASH_002,

# Ingested content [9e6ef8e1-5448-4933-9705-d9d9502b384c]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/9e6ef8e1-5448-4933-9705-d9d9502b384c/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎒	September 2024

# September 3: Support for web search feeds, model deprecations
### New Features

- 💡 Graphlit now supports web search feeds, using the Tavily and Exa.AI web search APIs. You can choose the SEARCH feed type, and assign your search text property, and we will ingest the referenced web pages from the search results. Optionally, you can select the search service via the serviceType property under search feed properties. By default, Graphlit will use the Tavily API.
- ⚡ We have deprecated these OpenAI models, according to the future support OpenAI is providing to these legacy models: GPT35_TURBO, GPT35_TURBO_0613, GPT35_TURBO_16K, GPT35_TURBO_16K_0125, GPT35_TURBO_16K_0613, GPT35_TURBO_16K_1106, GPT4, GPT4_0613, GPT4_32K, GPT4_32K_0613, 

# Ingested content [090640e2-097b-4684-90fa-86ab0cba94fc]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/090640e2-097b-4684-90fa-86ab0cba94fc/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🛠️	September 2023

# September 4: Workflow configuration; support for Notion feeds; document OCR
### New Features

- 🔥 Added Workflow entity to data model for configuring stages of content workflow; can be assigned to Feed or with ingestPage, ingestFile, or ingestText mutations to control how content is ingested, prepared, extracted and enriched into the knowledge graph.
- 💡 Added support for Notion feeds: now can create feed to ingest files from Notion pages or databases (i.e. wikis).
- 💡 Added support for API-created Observation entities, which allow for custom observations of observable entities (i.e. Person, Label) on Content.
- 💡 Added support for Azure AI Document Intelligence as an optional method for preparing PDF files, using OCR and advance

# Ingested content [a704d7fa-1161-4d3a-9be7-569d3af4d1cf]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/a704d7fa-1161-4d3a-9be7-569d3af4d1cf/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎒	September 2024

# September 26: Support for Google AI and Cerebras models, and latest Groq models
### New Features

- 💡 Graphlit now supports the Cerebras model service which offers the LLAMA_3_1_70B and LLAMA_3_1_8B models.
- 💡 Graphlit now supports the Google AI model service which offers the GEMINI_1_5_PRO and GEMINI_1_5_FLASH models.
- We have added support for the latest Groq Llama 3.2 preview models, including LLAMA_3_2_1B_PREVIEW, LLAMA_3_2_3B_PREVIEW, LLAMA_3_2_11B_TEXT_PREVIEW, and LLAMA_3_2_90B_TEXT_PREVIEW. We have also added support for the Llama 3.2 multimodal model LLAMA_3_2_11B_VISION_PREVIEW.
- We have added a new specification parameter to the promptConversation mutation. Now you can specify your initial specification for a new con

# Ingested content [3a2727e0-f370-4b5c-9d04-59e54995f166]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/3a2727e0-f370-4b5c-9d04-59e54995f166/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🛠️	September 2023

# September 24: Support for YouTube feeds; added documentation; bug fixes
### New Features

- 🔥 Graphlit now supports YouTube feeds, where you can ingest a set of YouTube videos, or an entire YouTube playlist or channel. Note, we currently support only the ingestion of audio from YouTube videos, which gets transcribed and added to your conversational knowledge graph.

### New Documentation

- Added documentation for observable entities mutations and queries (Label, Category, Person, Organization, Place, Event, Product, Repo, Software).
- Added documentation for using custom Azure OpenAI and OpenAI models with Specifications

### Bugs Fixed

- GPLA-1459: LLM prompt formatting was exceeding the token budget with long user prompts.
- 

# Ingested content [550d544c-3dc6-4fae-a4d1-89577544d9fc]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/550d544c-3dc6-4fae-a4d1-89577544d9fc/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🛠️	September 2023

# September 20: Paid subscription plans; support for custom observed entities & Azure OpenAI GPT-4
### New Features

- 🔥 Graphlit now supports paid Hobby, Starter and Growth tiers for projects, in addition to the existing Free tier. Starting at $49/mo, plus $0.10/credit for usage, we now support higher quota based on your subscribed tier. By providing a payment method for your organization in the Developer Portal, you can upgrade each project individually to the tier that fits your application's needs.
- 💡 Added GraphQL mutations for the creation, update and deletion of observed entities (i.e. Person, Organization, Place, Product, Event, Label, Category).
- 💡 Added new observed entity types to knowledge graph: Repo (i.e. Git repo),

# Ingested content [85a254e5-593b-4823-b1b9-d48506b794f8]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/85a254e5-593b-4823-b1b9-d48506b794f8/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎃	October 2024

# October 9: Support for GitHub repository feeds, bug fixes
### New Features

- 💡 Graphlit now supports GitHub feeds, by providing the repository owner and name similar to GitHub Issues feeds, and will ingest code files from any GitHub repository.

### Bugs Fixed

- GPLA-3262: Missing row separator in table markdown formatting



# Ingested content [3f9aa839-735f-4aa9-a599-c7d59d2898aa]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/3f9aa839-735f-4aa9-a599-c7d59d2898aa/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎃	October 2024

# October 7: Support for Anthropic and Gemini tool calling
### New Features

- 💡 Graphlit now supports tool calling with Anthropic and Google Gemini models.
- ⚡ We have removed the uri property for tools from ToolDefinitionInput, such that inline webhook tools are no longer supported. Now you can define any external tools to be called, and those can support sync or async data access to fulfill the tool call.



# Ingested content [49836c55-86df-40ce-9f5e-98b3eb2ebc41]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/49836c55-86df-40ce-9f5e-98b3eb2ebc41/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎃	October 2024

# October 31: Support for simulated tool calling, bug fixes
### New Features

- Graphlit now supports simulated tool calling for LLMs which don't natively support it, such as OpenAI o1-preview and o1-mini. Tool schema will be formatted into the LLM prompt context, and tool responses are parsed out of the JSON formatted response.
- ⚡ Given customer feedback, we have lowered the vector and hybrid thresholds used by the semantic search. Previously, some content at a low relevance was being excluded from the semantic search results. Now, more low-relevance content will be included in the results, used by the RAG pipeline. Reranking can be used to sort the search results for relevance.

### Bugs Fixed

- GPLA-3357: Not extracting all image

# Ingested content [0e61c44c-cae3-4a7a-95f0-081150f5860e]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/0e61c44c-cae3-4a7a-95f0-081150f5860e/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎃	October 2024

# October 3: Support tool calling, ingestBatch mutation, Gemini Flash 1.5 8b, bug fixes
### New Features

- 💡 Graphlit now supports the ingestBatch mutation, which accepts an array of URIs to files or web pages, and will asynchronously ingest these into content objects.
- 💡 Graphlit now supports the continueConversation mutation, which accepts an array of called tool responses. Also, promptConversation now accepts an array of tool definitions. When tools are called by the LLM, the assistant message returned from promptConversation will have a list of toolCalls which need to responded to from your calling code. These responses are to be provided back to the LLM via the continueConversation mutation.
- 💡 Graphlit now supports tool calli

# Ingested content [8e5925f7-0358-42ff-84df-8522c7f265a8]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/8e5925f7-0358-42ff-84df-8522c7f265a8/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎃	October 2024

# October 22: Support for latest Anthropic Sonnet 3.5 model, Cohere image embeddings
### New Features

- Graphlit now supports the latest Anthropic Sonnet 3.5 model (released 10/22/2024). We have added date-versions model enums for the Anthropic models: CLAUDE_3_5_SONNET_20240620, CLAUDE_3_5_SONNET_20241022, CLAUDE_3_HAIKU_20240307, CLAUDE_3_OPUS_20240229, CLAUDE_3_SONNET_20240229. The existing model enums will target the latest released models, as specified by Anthropic.
- Graphlit now supports image embeddings using the Cohere Embed 3.0 models.



# Ingested content [bf1f3903-d17d-4ec6-b76a-cd7687b49dbf]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/bf1f3903-d17d-4ec6-b76a-cd7687b49dbf/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎒	September 2024

# September 1: Support for FHIR enrichment, latest Cohere models, bug fixes
### New Features

- 💡 Graphlit now supports entity enrichment from Fast Healthcare Interoperability Resources (FHIR) servers. You can provide the endpoint for a FHIR server, and Graphlit will enrich medical-related entities from the data found in the FHIR server.
- Added support for latest Cohere models (COMMAND_R_202408, COMMAND_R_PLUS_202408) and added datestamped model enums for the previous versions (COMMAND_R_202403, COMMAND_R_PLUS_202404). The latest model enums (COMMAND_R and COMMAND_R_PLUS) currently point to the models (COMMAND_R_202403 and COMMAND_R_PLUS_202404) as specified by the Cohere API.
- Added support for the latest Azure AI Document Intell

# Ingested content [cfa6a062-2cfa-4994-a066-6443539c4553]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/cfa6a062-2cfa-4994-a066-6443539c4553/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎃	October 2024

# October 21: Support OpenAI, Cohere, Jina, Mistral, Voyage and Google AI embedding models
### New Features

- 💡 Graphlit now supports the configuration of image and text embedding models, at the Project level. You can create an embedding specification for a text or image embedding model, and then assign that to the Project, and all further embedding requests will use that embedding model. See this Colab notebook for an example of how to configure the project.
- 💡 Graphlit now supports the OpenAI Embedding-3-Small and Embedding-3-Large, Cohere Embed 3.0, Jina Embed 3.0, Mistral Embed, and Voyage 2.0 and 3.0 text embedding models. Graphlit also now supports Jina CLIP image embeddings, which are used by default for image search.
- Graph

# Ingested content [53a0f36a-7c33-4514-901f-4865aa3fe0a9]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/53a0f36a-7c33-4514-901f-4865aa3fe0a9/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎃	October 2023

# October 30: Optimized conversation responses; added observable aliases; bug fixes
### New Features

- 💡 Graphlit now supports 'aliases' of observable names, as the alternateNames property. When an observed entity, such as Organization, is enriched, we store the original name and the enriched name as an alias. For example, "OpenAI" may be enriched to "OpenAI, Inc.", and we store "OpenAI" as an alias, and update the name to "OpenAI, Inc.".
- 💡 Added workflows filter to ContentCriteriaInput type, for filtering content by workflow(s) when creating conversation.
- Optimized formatting of content sources into prompt context, for more accurate conversation responses.
- Optimized formatting of extracted text from Slack messages, for better 

# Ingested content [e1cd970b-fde4-41b4-892a-1a64cb531334]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/e1cd970b-fde4-41b4-892a-1a64cb531334/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎃	October 2023

# October 15: Support for Anthropic Claude models, Slack feeds and entity enrichment
### New Features

- 🔥 Graphlit now supports Anthropic Claude and Anthropic Claude Instant large language models.
- 🔥 Graphlit now supports Slack feeds, and will ingest Slack messages and linked file attachments from a Slack channel. Note, this requires the creation of a Slack bot which has been added to the appropriate Slack channel.
- 💡 Added support for entity enrichment to workflow object, which offers Diffbot, Wikipedia and Crunchbase enrichment of observed entities, such as Person, Organization and Place.
- 💡 Added support for text extraction from images. When using Azure Image Analytics for entity extraction, Graphlit will extract and store any 

# Ingested content [119976d5-ad8a-4a2a-8c4d-e008bc61e848]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/119976d5-ad8a-4a2a-8c4d-e008bc61e848/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🦃	November 2024

# November 4: Support for Anthropic Claude 3.5 Haiku, bug fixes
### New Features

- Graphlit now supports the latest Anthropic Haiku 3.5 model, with the model enum CLAUDE_3_5_HAIKU_20241022.
- ⚡ Once a project has hit the free tier quota, we will now automatically disable all feeds. Once the project has been upgraded to a paid tier, you can use the enableFeed mutation to re-enable your existing feeds to continue ingestion.
- ⚡ We have added the disableFallback flag to the RetrievalStrategyInput type, so you can disable the default behavior of falling back to the previous conversation's contents, or worst-case, falling back to the most recently uploaded content. By setting disableFallback to true, conversations will only attempt to re

# Ingested content [91123175-dc50-4c4e-9421-971e59c75969]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/91123175-dc50-4c4e-9421-971e59c75969/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🦃	November 2024

# November 24: Support for direct LLM prompt, multi-turn image analysis, bug fixes
### New Features

- 💡 Graphlit now supports multi-turn analysis of images with the reviseImage and reviseEncodedImage mutations. You can provide an LLM prompt and either a URI or Base-64 encoded image and MIME type, along with an optional LLM specification. This can be used for analyzing any image and having a multi-turn conversation with the LLM to revise the output from the LLM. (Colab Notebook Example)
- 💡 Graphlit now supports directly prompting an LLM with the prompt mutation, bypassing any RAG content retrieval, while providing an optional list of previous conversation messages. This also accepts an optional LLM specification. (Colab Notebook Exa

# Ingested content [55afbdb7-30a8-4eaa-92aa-02ae98fb57a5]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/55afbdb7-30a8-4eaa-92aa-02ae98fb57a5/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🦃	November 2024

# November 10: Support for web search, multi-turn content summarization, Deepgram language detection
### New Features

- 💡 Graphlit now supports web search with the searchWeb mutation. You can select the search service, either Tavily or Exa.AI, and provide the search query and number of search results to be returned. This is different than the web search feed, in that searchWeb returns the relevant text from the web page and the web page URL from each search hit, but does not ingest each of the web pages. This new mutation is optimized to be used from within an LLM tool.
- 💡 Graphlit now supports multi-turn summarization of content with the reviseContent mutation. You can provide an LLM prompt and a content reference, along with an o

# Ingested content [f9db5d97-0274-4ca3-b867-3f2c46a5359e]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/f9db5d97-0274-4ca3-b867-3f2c46a5359e/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🦃	November 2024

# November 16: Support for image description, multi-turn text summarization
### New Features

- 💡 Graphlit now supports multi-turn summarization of text with the reviseText mutation. You can provide an LLM prompt and text string, along with an optional specification. This can be used for summarizing any raw text and having a multi-turn conversation with the LLM to revise the output from the LLM. (Colab Notebook Example)
- 💡 Graphlit now supports image descriptions using vision LLMs, without needing to ingest the image first. With the new describeImage mutation, which takes a URI, and describeEncodedImage mutation, which takes a Base-64 encoded image and MIME type, you can use any vision LLM to prompt an image description. These mutat

# Ingested content [375b16d0-9e97-44ae-b1a9-4933a2e34836]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/375b16d0-9e97-44ae-b1a9-4933a2e34836/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
💐	May 2024

# May 5: Support for Jina and Pongo rerankers, Microsoft Teams feed, new YouTube downloader, bug fixes
### New Features

- 💡 Graphlit now supports the Jina reranker and Pongo semantic filtering (reranking), in the Specification object. Now you can choose between COHERE, PONGO and JINA for your reranking serviceType.
- 💡 Graphlit now supports Microsoft Teams feeds for reading messages from Teams channels.
- Given changes in YouTube video player HTML, we have rewritten the YouTube downloader to support the new page format.
- Added better handling of HTTP errors when validating URIs. Previously some websites were returning HTTP 403 (Forbidden) errors when validating their URI, or downloading content. Now Graphlit is able to scrape these site

# Ingested content [8d95b393-a271-4cb7-956b-6f2808937151]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/8d95b393-a271-4cb7-956b-6f2808937151/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
💐	May 2024

# May 15: Support for GraphRAG, OpenAI GPT-4o model, performance improvements and bug fixes
### New Features

- 💡 Graphlit now supports GraphRAG, where the extracted entities in the knowledge graph can be added as additional context to your RAG conversation. Also, with GraphRAG, entities can be extracted from the user prompt, and used as additional content filters - or can be used to query related content sources, which are combined with the vector search results. This can be configured by specifying your graphStrategy in the Specification object.
- 💡 Graphlit now supports LLM revisions within RAG conversations, where the LLM can be prompted to revise its initial completion response. From our testing, this has been shown to provide 35% m

# Ingested content [2611e484-0933-494b-bebe-511835b64bb9]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/2611e484-0933-494b-bebe-511835b64bb9/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🍀	March 2024

# March 23: Support for Linear, GitHub Issues and Jira issue feeds, ingest files via Web feed sitemap
### New Features

- 💡 Graphlit now supports Linear, GitHub Issues and Atlassian Jira feeds. Graphlit will ingest issues (aka tasks, stories) from these issue-tracking services as individual content items, which will be made searchable and conversational.
- 💡 Added support for the ISSUE content type, which includes metadata such as title, authors, commenters, status, type, project and team.
- 💡 Added support for default feed read limit. Now, if you don't specify the readLimit property on feeds, it will default to reading 100 content items. You can override this default by assigning a custom read limit, which has no upper bounds. However, one-

# Ingested content [feb66b5f-ba9e-448e-9909-ba5e2fdbf56d]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/feb66b5f-ba9e-448e-9909-ba5e2fdbf56d/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🍀	March 2024

# March 13: Support for Claude 3 Haiku model, direct ingestion of Base64 encoded files
### New Features

- 💡 Graphlit now supports the Claude 3 Haiku model.
- Added support for direct ingestion of Base64 encoded files with the ingestEncodedFile mutation. You can pass a Base64 encoded string and MIME type of the file, and it will be ingested into the Graphlit Platform.
- Added modelService and model properties to ConversationMessage type, which return the model service and model which was used for the LLM completion.


# Ingested content [3b356809-5179-4144-80bc-baf4dd184462]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/3b356809-5179-4144-80bc-baf4dd184462/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🍀	March 2024

# March 10: Support for Claude 3, Mistral and Groq models, usage/credits telemetry, bug fixes
### New Features

- 💡 Graphlit now supports a Command-Line Interface (CLI) for directly accessing the Graphlit Data API without writing code. See the documentation here.
- 💡 Graphlit now supports the Groq Platform, and models such as Mixtral 8x7b.
- 💡 Graphlit now supports Claude 3 Opus and Sonnet models.
- 💡 Graphlit now supports Mistral La Plateforme, and models such as Mistral Small, Medium, and Large and Mixtral 8x7b.
- 💡 Graphlit now supports the latest v4 of Azure Document Intelligence, including their new models such as Credit Card, Marriage Certificate, and Mortgage documents.
- Added support for detailed usage and credits telemetry via

# Ingested content [bbe12218-8806-409d-a79b-148d8f0eb515]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/bbe12218-8806-409d-a79b-148d8f0eb515/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎓	June 2024

# June 21: Support for the Claude 3.5 Sonnet model, knowledge graph semantic search, and bug fixes
### New Features

- 💡 Graphlit now supports the Anthropic Claude 3.5 Sonnet model, which can be assigned with the CLAUDE_3_5_SONNET model enum.
- 💡 Graphlit now supports semantic search of observable entities in the knowledge graph, such as Person, Organization and Place. These entity types will now have vector embeddings created from their enriched metadata, and support searching by similar text, and searching by similar entities.
- ⚡ We have changed the Google Drive and Google Email feed properties to require the Google OAuth client ID and client secret, along with the existing refresh token, for proper authentication against Google APIs.

# Ingested content [dd2cef1a-5292-4654-a2e2-d40b4f82db84]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/dd2cef1a-5292-4654-a2e2-d40b4f82db84/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
☀️	July 2024

# July 4: Support for webhook Alerts, keywords summarization, Deepseek 128k context window, bug fixes
### New Features

- 💡 Graphlit now supports webhook Alerts. In addition to Slack notifications, you can now receive an HTTP POST webhook with the results of the published text (or text and audio URI) from a prompted alert.
- Updated the Deepseek chat and coder models to support a 128k token context window.
- Added customSummary property to Content object, which returns the custom summary generated via preparation workflow.
- Added keywords summarization type, which is now stored in keywords property in Content object.
- Added slackChannels query, which returns the list of Slack channels from the workspace authenticated by the Slack bot 

# Ingested content [595fbc8a-88d8-4e2f-a3f0-33a86c892d76]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/595fbc8a-88d8-4e2f-a3f0-33a86c892d76/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎓	June 2024

# June 9: Support for Deepseek models, JSON-LD webpage parsing, performance improvements and bug fixes
### New Features

- 💡 Graphlit now supports Deepseek LLMs for prompt completion. We offer the deepseek-chat and deepseek-coder models.
- 💡 Graphlit now supports parsing embedded JSON-LD from web pages. If a web page contains 'script' tags with JSON-LD, we will automatically parse and inject into the knowledge graph.
- ⚡ We have changed the default model for entity extraction and image completions to be OpenAI GPT-4o. This provides faster performance and better quality output.
- ⚡ We have changed the behavior of knowledge graph generation, from a prompted conversation, to be opt-in. In order to receive the graph's nodes and edges with th

# Ingested content [ce68f926-3ab6-4a73-8f33-3f5e552c2714]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/ce68f926-3ab6-4a73-8f33-3f5e552c2714/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
☀️	July 2024

# July 28: Support for indexing workflow stage, Azure AI language detection, bug fixes
### New Features

- Added indexing workflow stage. This provides for configuration of indexing services, which may infer metadata from the content.
- Added AZURE_AI_LANGUAGE content indexing service, which supports inferring the language of extracted text or transcript.
- Added support for language content metadata. This returns a list of languages in ISO 639-1 format, which may have been inferred from the extracted text or transcript.
- Added support for MODEL_IMAGE extraction service. This provides integration with vision models beyond those provided by OpenAI. You can assign a custom specification and bring-your-own API key for image extraction mod

# Ingested content [23ae5fe0-5e6d-411c-b219-5ebaf593062e]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/23ae5fe0-5e6d-411c-b219-5ebaf593062e/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
☀️	July 2024

# July 25: Support for Mistral Large 2 & Nemo, Groq Llama 3.1 models, bug fixes
### New Features

- 💡 Graphlit now supports the Mistral Large 2 and Mistral Nemo models. The existing MISTRAL_LARGE model enum now will use Mistral Large 2.
- 💡 Graphlit now supports the Llama 3.1 8b, 70b and 405b models on Groq. (Note, these are rate-limited according to Groq's platform constraints.)
- Added support for revision strategy on data extraction specifications. Now you can prompt the LLM to revise its previous data extraction response, similar to the existing completion revision strategy.
- Added version property for AzureDocumentPreparationProperties type for assigning the API version used by Azure AI Document Intelligence. By default, Graphlit 

# Ingested content [51e379d2-307c-42f3-b5cf-eb3b02c6f068]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/51e379d2-307c-42f3-b5cf-eb3b02c6f068/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎇	July 2023

# July 15: Support for SharePoint feeds, new Conversation features
### New Features

- 💡 Added support for SharePoint feeds: now can create feed to ingest files from SharePoint document library (and optionally, folder within document library)
- 💡 Added support for PII detection during entity extraction from text documents and audio transcripts: now we will create labels such as PII: Social Security Number automatically when PII is detected
- 💡 Added support for developer's own OpenAI API keys and Azure OpenAI deployments in Specifications
- ℹ️ Changed semantics of deleteFeed to delete the contents ingested by the feed; since contents are linked to feeds, now feeds can be disabled, while keeping the lineage to the feed, and if feeds are d

# Ingested content [636bd286-d807-4838-af4d-0c37d24772b7]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/636bd286-d807-4838-af4d-0c37d24772b7/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
☀️	July 2024

# July 19: Support for OpenAI GPT-4o Mini, BYO-key for Azure AI, similarity by summary, bug fixes
### New Features

- 💡 Graphlit now supports the OpenAI GPT-4o Mini model, with 16k output tokens.
- 💡 Graphlit now supports 'bring-your-own-key' for Azure AI Document Intelligence models. We have added a custom endpoint and key property, which can be assigned to use your own Azure AI resource.
- Updated to use Jina reranker v2 (jina-reranker-v2-base-multilingual) by default.
- Updated to assign the summary, bullets, etc properties when calling summarizeContents mutation. Now when summarizing contents, we will store the resulting summary in the content itself, in addition to returning the summarized results.
- Added relevance property to all

# Ingested content [5c9b0f96-4efc-4319-b2c2-0fc27875f030]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/5c9b0f96-4efc-4319-b2c2-0fc27875f030/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎆	January 2024

# January 22: Support for Google and Microsoft email feeds, reingest content in-place, bug fixes
### New Features

- 💡 Graphlit now supports Google and Microsoft email feeds. Email feeds can be created to ingest past emails, or poll for new emails. Emails create an EMAIL content type. Attachment files can optionally be extracted from emails, and will be linked to their parent email content. If assigning a workflow to the feed, the workflow will be applied both to the email content and the extracted attachment files.
- 💡 Graphlit now supports reingesting content in-place. The ingestText, ingestPage and ingestFile mutations now take an optional id parameter for an existing content object. If this id is provided, the existing content wil

# Ingested content [3f2f43b0-49a5-462a-88c6-e0ecd92f41c1]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/3f2f43b0-49a5-462a-88c6-e0ecd92f41c1/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎆	January 2024

# January 18: Support for content publishing, LLM tools, CLIP image embeddings, bug fixes
### New Features

- 💡 Graphlit now supports content publishing, where documents, audio transcripts and even image descriptions, can be summarized, and repurposed into blog posts, emails or AI-generated podcasts. With the new publishContents mutation, you can configure LLM prompts for summarization and publishing, and assign specifications to use different models and/or system prompts for each step in the process. The published content will be reingested into Graphlit, and can be searched or used for conversations, like any other form of content.
- 💡 Graphlit now supports publishing conversations as content with the new publishConversation mutatio

# Ingested content [a41c7ac9-806a-4fd2-9652-759775a70d8c]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/a41c7ac9-806a-4fd2-9652-759775a70d8c/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🌧️	February 2024

# February 21: Support for OneDrive and Google Drive feeds, extract images from PDFs, bug fixes
### New Features

- 💡 Graphlit now supports OneDrive and Google Drive feeds. Files can be ingested from OneDrive or Google Drive, including shared drives where the authenticated user has access. Both OneDrive and Google Drive support the reading of existing files, and tracking new files added to storage with recurrent feeds.
- 💡 Graphlit now supports email backup files, such as EML or MSG, which will be assigned the EMAIL file type. During email file preparation, we will automatically extract and ingest any file attachments.
- 💡 Graphlit now automatically extracts embedded images in PDF files, ingests them as content objects, and links th

# Ingested content [cd3f1b88-90bb-4de6-a91a-4c74ecfbcf5c]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/cd3f1b88-90bb-4de6-a91a-4c74ecfbcf5c/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎄	December 2024

# December 1: Support for retrieval-only RAG pipeline, bug fixes
### New Features

- 💡 Graphlit now supports formatting of LLM-ready prompts with our RAG pipeline, via the new formatConversation and completeConversation mutations. This is valuable for supporting LLM streaming by directly calling the LLM from your application, and using Graphlit for RAG retrieval and conversation history. (Colab Notebook Example)
- We have added support for inline hyperlinks in extracted text from documents and web pages.

### Bugs Fixed

- GPLA-3466: Owner ID should accept any non-whitespace string
- GPLA-3458: Not getting Person-to-Organization edges from entity extraction

# Ingested content [bae5b595-6a1e-4dcb-95aa-3eaf6b55cc52]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/bae5b595-6a1e-4dcb-95aa-3eaf6b55cc52/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🌧️	February 2024

# February 2: Support for Semantic Alerts, OpenAI 0125 models, performance enhancements, bug fixes
### New Features

- 💡 Graphlit now supports Semantic Alerts, which allows for LLM summarization and publishing of content, on a periodic basis. This is useful for generating daily reports from email, Slack or other time-based feeds. Alerts support the same publishing options, i.e. audio and text, as the publishContents mutation.
- 💡 Graphlit now supports the latest OpenAI 0125 model versions, for GPT-4 and GPT-3.5 Turbo. We will add support for Azure OpenAI when Microsoft releases support for these.
- Slack feeds now support a listing type field, where you can specify if you want PAST or NEW Slack messages in the feed.
- 🔥 This release

# Ingested content [da1c0b2c-4bbf-4467-af16-f80122aeedd3]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/da1c0b2c-4bbf-4467-af16-f80122aeedd3/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎄	December 2024

# December 9: Support for website mapping, web page screenshots, Groq Llama 3.3 model, bug fixes
### New Features

- 💡 Graphlit now supports mapping a website with the mapWeb mutation. You can provide a URL to a website, and the query will return a list of URLs based on the sitemap.xml (or sitemap-index.xml) file, at or underneath the provided URL.
- 💡 Graphlit now supports the generation of web page screenshots with the screenshotPage mutation. By providing the URL of a web page, and optionally, the maximum desired height of the screenshot, we will screenshot the webpage and ingest it automatically as content. You can provide an optional workflow, which will be applied to the ingested image content, for operations like generating imag

# Ingested content [7fff5e48-bdc8-4dbc-ad26-da250228e7ee]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/7fff5e48-bdc8-4dbc-ad26-da250228e7ee/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎄	December 2024

# December 22: Support for Dropbox, Box, Intercom and Zendesk feeds, OpenAI o1, Gemini 2.0, bug fixes
### New Features

- 💡 Graphlit now supports Dropbox feeds for ingesting files on the Dropbox cloud service. Dropbox feeds require your appKey, appSecret, redirectUri and refreshToken to be assigned. The feed also accepts an optional path parameter to read files from a specific Dropbox folder.
- 💡 Graphlit now supports Box feeds for ingesting files on the Box cloud service. Box feeds require your clientId, clientSecret, redirectUri and refreshToken to be assigned.
- 💡 Graphlit now supports Intercom feeds for ingesting Intercom Articles and Tickets. We will ingest Intercom Articles as PAGE content type, and Tickets as ISSUE content type. Inte

# Ingested content [0ec87d20-b463-4f65-a2f4-e32bc44f1b0d]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/0ec87d20-b463-4f65-a2f4-e32bc44f1b0d/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎄	December 2023

# December 10: Support for OpenAI GPT-4 Turbo, Llama 2 and Mistral models; query by example, bug fixes
### New Features

- 💡 Graphlit now supports the OpenAI GPT-4 Turbo 128k model, both in Azure OpenAI and native OpenAI services. Added new model enum GPT4_TURBO_VISION_128K.
- 💡 Graphlit now supports Llama 2 7b, 13b, 70b models and Mistral 7b model, via Replicate. Developers can use their own Replicate API key, or be charged as credits for Graphlit usage.
- 💡 Graphlit now supports the Anthropic Claude 2.1 model. Added new model enum CLAUDE_2_1.
- 💡 Graphlit now supports the OpenAI GPT-4 Vision model for image descriptions and text extraction. Added new model enum GPT4_TURBO_VISION_128K. See usage example in "Multimodal RAG" blog post

# Ingested content [a9adfd3e-0f66-40b0-9039-c00cd8fbf761]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/a9adfd3e-0f66-40b0-9039-c00cd8fbf761/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎂	August 2024

# August 8: Support for LLM-based document extraction, .NET SDK, bug fixes
### New Features

- 💡 Graphlit now supports LLM-based document preparation, using vision-capable models such as OpenAI GPT-4o and Anthropic Sonnet 3.5. This is available via the MODEL_DOCUMENT preparation service type, and you can assign a custom specification object and bring your own LLM keys.
- 💡 Graphlit now provides an open source .NET SDK, supporting .NET 6 and .NET 8 (and above). SDK package can be found on Nuget.org. Code samples can be found on GitHub.
- Added identifier property to Content object for mapping content to external database identifiers. This is supported for content filtering as well.
- Added support for Claude 3 vision models for image-bas

# Ingested content [26deb281-fa20-4f6d-839b-31033675141b]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/26deb281-fa20-4f6d-839b-31033675141b/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎂	August 2024

# August 20: Support for medical entities, Anthropic prompt caching, bug fixes
### New Features

- 💡 Graphlit now supports the extraction of medical-related entities: MedicalStudy, MedicalCondition, MedicalGuideline, MedicalDrug, MedicalDrugClass, MedicalIndication, MedicalContraindication, MedicalTest, MedicalDevice, MedicalTherapy, and MedicalProcedure.
- 💡 Graphlit now supports medical-related entities in GraphRAG, and via API for queries and mutations.
- Added support for Anthropic prompt caching. When using Anthropic Sonnet 3.5 or Haiku 3, Anthropic will now cache the entity extraction and LLM document preparation system prompts, which saves on token cost and increases performance.

### Bugs Fixed

- GPLA-3104: Should default sear

# Ingested content [4f6da3ce-9feb-4ecb-9c05-dcefa5c6de6b]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/4f6da3ce-9feb-4ecb-9c05-dcefa5c6de6b/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎂	August 2023

# August 3: New data model for Observations, new Category entity
### New Features

- 💡 Revised data model for Observations, Occurrences and observables (i.e. Person, Organization). Now after entity extraction, content will have one Observation for each observed entity, and a list of occurrences. Occurrence now supports text, time and image occurrence types. (Text: page index, time: start/end timestamp, image: bounding box) Observations now have ObservableType and Observable fields, which specify the observed entity type and entity reference.
- 💡 Added Category entity to GraphQL data model, which supports PII categories such as Phone Number or Credit Card Number.
- Added probability field to model properties, for the LLM's token probabi

# Ingested content [0cbb6490-3cff-4aa5-8d05-b94ab7c3d531]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/0cbb6490-3cff-4aa5-8d05-b94ab7c3d531/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎂	August 2024

# August 11: Support for Azure AI Document Intelligence by default, language-aware summaries
### New Features

- Added support for language-aware summaries when using LLM-based document extraction. Now the summaries for tables and sections generated by the LLM will follow the language of the source text.
- Added support for language-aware entity descriptions when using LLM-based entity extraction. Now the entity descriptions generated by the LLM will follow the language of the source text.
- ⚡ We have changed the default document preparation method to use Azure AI Document Intelligence, rather than our built-in document parsers. We have found that the fidelity of Azure AI is considerably better for complex PDFs, and provides better sup

# Ingested content [448971ac-c917-4d00-a846-52a5844472f9]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/448971ac-c917-4d00-a846-52a5844472f9/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎂	August 2023

# August 9: Support direct text, Markdown and HTML ingestion; new Specification LLM strategy
### New Features

- 💡 Added ingestText mutation which supports direct Content ingestion of plain text, Markdown and HTML. Now, if you have pre-scraped HTML or Markdown text, you can ingest it into Graphlit without reading from a URL.
- 💡 Added Specification strategy property, which allows customization of the LLM context when prompting a conversation. ConversationStrategy now provides Windowed and Summarized message histories, as well as configuration of the weight between existing conversation messages and Content text pages (or audio transcript segments) in the LLM context.
- 💡 Added auto-summarization of extracted text and audio transcripts.

# Ingested content [a540acea-c8ed-4352-a5cc-91c3dbe4ba91]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/a540acea-c8ed-4352-a5cc-91c3dbe4ba91/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎂	August 2023

# August 17: Prepare for usage-based billing; append SAS tokens to URIs
### New Features

- ℹ️ Behind the scenes, Graphlit is preparing to launch usage-based billing. This release put in place the infrastructure to track billable events. Organizations now have a Stripe customer associated with them, and Graphlit projects are auto-subscribed to a Free/Hobby pricing plan. In a future release, we will provide the ability to upgrade to a paid plan in the Graphlit Developer Portal. Also, we will provide visualization of usage, on granular basis, in the Portal.
- 💡 Content URIs now have Shared Access Signature (SAS) token appended, so they are accessible after query. For example, content.transcriptUri will now be able to be downloaded or use

# Ingested content [cd1c5052-f3bb-4747-b785-5e83ed6dfd80]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/cd1c5052-f3bb-4747-b785-5e83ed6dfd80/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🐇	April 2024

# April 23: Support for Python and TypeScript SDKs, latest OpenAI, Cohere & Groq models, bug fixes
### New Features

- 💡 Graphlit now supports a native Python SDK, using Pydantic types. The Python SDK is code-generated from the current GraphQL schema, but does not require GraphQL knowledge. You can find the latest PyPi package here. The Streamlit sample applications have been updated to use the new Python SDK.
- 💡 Graphlit now supports a native Node.js SDK, using TypeScript types. The Node.js SDK is code-generated from the current GraphQL schema, but does not require GraphQL knowledge. You can find the latest NPM package here.
- 💡 Graphlit now supports the 2024-04-09 models in the OpenAI model service. GPT4_TURBO-128K will give the late

# Ingested content [39b8a12e-de8a-41c9-9224-4b0e8e0909d8]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/39b8a12e-de8a-41c9-9224-4b0e8e0909d8/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🐇	April 2024

# April 7: Support for Discord feeds, Cohere reranking, section-aware chunking and retrieval
### New Features

- 💡 Graphlit now supports Discord feeds. By connecting to a Discord channel and providing a bot token, you can ingest all Discord messages and file attachments.
- 💡 Graphlit now supports Cohere reranking after content retrieval in RAG pipeline. You can optionally use the Cohere rerank model to semantically rerank the semantic search results, before providing as context to the LLM.
- Added support for section-aware text chunking and retrieval. Now, when using section-aware document preparation, such as Azure AI Document Intelligence, Graphlit will store the extracted text according to the semantic chunks (i.e. sections). The tex

# Ingested content [e46ab310-60bd-439d-9471-0cfebd1843cd]

Text Mezzanine: https://graphlit20241212dc396403.blob.core.windows.net/files/e46ab310-60bd-439d-9471-0cfebd1843cd/Mezzanine/page.json?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D
🎄	December 2024

# December 27: Support for LLM fallbacks, native Google Docs formats, website unblocking, bug fixes
### New Features

- 💡 Graphlit now supports LLM fallbacks which can help protect your application from model provider downtime. By assigning the fallbacks property when creating your conversation, you can provide an optional list of LLM specifications to be used (in order). These fallback specifications will only be used when we fail to prompt the conversation via the main specification. Caveat: the RAG pipeline will only use the strategies provided in the main specification for prompt rewriting, content retrieval, etc. Content is not re-retrieved upon fallback - the formatted LLM prompt will be tried against each fallback specificati
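
The fallback behavior described in the last changelog entry above can be illustrated with a small, library-independent sketch. This is not Graphlit's implementation and does not call the Graphlit SDK: `complete_with_fallbacks`, `unavailable_model` and `backup_model` are hypothetical names, and each "specification" is assumed to be a plain Python callable that takes the already-formatted prompt and returns a completion. The point it shows is the ordering: the main callable is tried first, the fallbacks are tried in list order only after a failure, and the same prompt is reused each time.

```python
from typing import Callable, Optional, Sequence

# Hypothetical stand-in for an LLM specification: prompt in, completion out.
Completion = Callable[[str], str]


def complete_with_fallbacks(
    prompt: str,
    main: Completion,
    fallbacks: Sequence[Completion] = (),
) -> str:
    """Try the main completion callable, then each fallback in order.

    The same formatted prompt is passed to every callable; nothing is
    re-retrieved or re-formatted between attempts.
    """
    last_error: Optional[Exception] = None
    for complete in (main, *fallbacks):
        try:
            return complete(prompt)
        except Exception as error:  # any failure moves on to the next callable
            last_error = error
    raise RuntimeError("All specifications failed") from last_error


# Illustrative callables (assumptions, not real model clients).
def unavailable_model(prompt: str) -> str:
    raise ConnectionError("provider is down")


def backup_model(prompt: str) -> str:
    return f"backup answered: {prompt}"


result = complete_with_fallbacks("Summarize 2024", unavailable_model, [backup_model])
print(result)
```

Catching every `Exception` keeps the sketch short; a real client would likely only fall back on provider errors such as timeouts or rate limits.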

In [8]:
# Assign the ElevenLabs voice ID to use
voice_id = "ZF6FPAbjXT4488VcRRnw" # ElevenLabs Amelia voice

# Prompt which gets run on each web page to summarize key points
summary_prompt = """
You are an AI assistant that extracts the most important information from product changelog pages.

You are being provided a changelog web page for one of many releases of the Graphlit Platform in 2024.

Your task is to produce a concise summary that covers:

New Features – Briefly list or describe each new capability.
Enhancements/Improvements – Any notable improvements or changes.
Bug Fixes – Summaries of what was fixed and why it matters.
Other Key Details – Any version numbers, feature flags, or breaking changes.
Dates – When a feature was released.
Value – What this offers to developers.
Keep it succinct, accurate, and organized. Use short sentences or bullet points so it’s easy to incorporate into a map/reduce pipeline. Omit any superfluous text.

Output:
A concise summary in bullet points highlighting the essential updates from the changelog.
"""

# Prompt which gets run against all summaries (in map/reduce manner) to generate final script for ElevenLabs audio
publish_prompt = """
You are an enthusiastic host focused on developer marketing, and you work for Graphlit, which is creating a 2024 year-in-review of its API-based platform.

Don't refer to yourself in the script. Just talk to the audience.

Don't add in any podcast-like references like intro music, sound effects, etc.  This will be used with a text-to-speech API to generate an audio recording.

Your audience is somewhat technical — software engineers, product builders, and tech-savvy product managers — so the script should be clear, concise, and sprinkled with a bit of technical depth.

Using the provided changelog for the Graphlit Platform, create a podcast-like script that:

- Sets the stage with a warm, engaging introduction.
- Highlights each new feature, explaining how it helps developers or teams be more productive, efficient, or creative.
- Refers to when a feature was released.
- Mentions any model updates and why they matter for technical use cases.
- Reviews notable bug fixes, providing just enough context to show the improvements without overwhelming detail.
- Closes with a quick recap and a call to action, encouraging listeners to try out the new features or learn more.

At the very end, mention that the listener can signup for free at graphlit.com and try out all these features.
Also, mention that in 2025, Graphlit will be offering exciting new features to accelerate the building of AI agents.

The tone should be friendly, positive, and confident—like a technology evangelist who’s genuinely excited about these updates.

Keep it interesting and conversational, but maintain enough depth to engage developers who care about how things work under the hood.
Use analogies or practical examples to illustrate why certain features are useful.
Feel free to add transitions such as “Now, let’s dive in,” or “Moving on to our next highlight” to keep it flowing.

Output: A detailed, TTS-ready 10-minute long script that hits all the points above.
"""

if feed_id is not None:
    summary_specification_id = await create_specification(enums.OpenAIModels.GPT4O_MINI_128K)

    if summary_specification_id is not None:
        print(f'Created summary specification [{summary_specification_id}]:')

        publish_specification_id = await create_specification(enums.OpenAIModels.O1_200K)

        if publish_specification_id is not None:
            print(f'Created publish specification [{publish_specification_id}]:')

            display(Markdown('### Publishing Contents...'))

            published_content_id = await publish_contents(feed_id, summary_specification_id, publish_specification_id, summary_prompt, publish_prompt, publish_correlation_id, voice_id)

            if published_content_id is not None:
                print(f'Completed publishing content [{published_content_id}].')

                # Need to reload content to get presigned URL to MP3
                published_content = await get_content(published_content_id)

                if published_content is not None:
                    display(Markdown(f'### Published [{published_content.name}]({published_content.audio_uri})'))

                    display(HTML(f"""
                    <audio controls>
                    <source src="{published_content.audio_uri}" type="audio/mp3">
                    Your browser does not support the audio element.
                    </audio>
                    """))

                    # After the audio is generated, we ingest the MP3 as a new content object in Graphlit, and it gets auto-transcribed
                    display(Markdown('### Transcript'))
                    display(Markdown(published_content.markdown))


Created summary specification [7cf625a7-1add-4983-8c19-c15328c9aa1f]:
Created publish specification [a58f46d3-59d6-4727-959f-06e57ca42611]:


### Publishing Contents...

Completed publishing content [3932f651-18ae-40d1-b548-51b6612ec4d8].


### Published [Published Summary.mp3](https://graphlit20241212dc396403.blob.core.windows.net/files/3932f651-18ae-40d1-b548-51b6612ec4d8/Mezzanine/Published%20Summary.mp3?sv=2025-01-05&se=2024-12-29T09%3A26%3A13Z&sr=c&sp=rl&sig=k3QvrRHHMyWK6xXy5Vc2jkdg9RZ3SFIa%2BJ1qCFMQuUs%3D)

### Transcript

[00:00:00] Hello, everyone,

[00:00:01] and welcome to a special year in review for the Graphlet platform.

[00:00:06] It's been an incredible stretch of milestones

[00:00:10] and enhancements

[00:00:11] all designed to make your development workflow simpler,

[00:00:14] more powerful,

[00:00:15] and more creative.

[00:00:18] Let's explore how each month's updates can help you, whether you're coding in Python, dot net, JavaScript, or any other environment.

[00:00:27] Let's start back in July 2023.

[00:00:31] One key addition was SharePoint feed support, allowing direct file ingestion from SharePoint libraries.

[00:00:37] This came with new conversation features, such as timestamped messages

[00:00:42] to help keep your discussions organized and relevant.

[00:00:45] These upgrades reduced friction when pulling in data from corporate SharePoint,

[00:00:50] making you more efficient in building knowledge based applications.

[00:00:54] Moving into August 2023,

[00:00:56] Grafler introduced a new data model for observations,

[00:01:00] enabling more structured ways to track entities like people or organizations.

[00:01:06] Then came a direct text ingestion for plain text, markdown,

[00:01:10] or HTML,

[00:01:11] eliminating the need for a file or URL.

[00:01:14] This is perfect when you just want to send raw data into Graphlet and get back structured content or embeddings for analysis.

[00:01:23] Also, in August, usage based billing infrastructure was rolled out.

[00:01:27] While still offering a free tier to get started right away, it set the stage for future expansions of credit based

[00:01:34] usage, helpful for devs who prefer pay as you go models.

[00:01:39] September 2023

[00:01:40] ushered in a wave of productivity features.

[00:01:43] Workflow configuration

[00:01:44] let you define how your content passes through ingestion,

[00:01:48] enrichment, or summarization

[00:01:49] steps with support for Notion feeds to quickly import your team's notes and pages.

[00:01:56] Paid subscription plans also launched, providing hobby, starter,

[00:02:00] and growth tiers,

[00:02:02] each with different quotas.

[00:02:04] For those who love video content, YouTube feed support arrived,

[00:02:09] complete with automatic transcription for ingesting the audio.

[00:02:13] Now your dev team can treat video content just like any text document

[00:02:18] and search or summarize it with ease.

[00:02:21] October

[00:02:22] 2023

[00:02:23] focused on deeper language model integrations.

[00:02:26] Anthropic Claude models joined the party for advanced text completions,

[00:02:30] while Slack feed ingestion

[00:02:33] made it easier for your dev teams to bring chat messages

[00:02:36] and file attachments

[00:02:37] into the knowledge base.

[00:02:40] The platform also

[00:02:41] expanded entity enrichment,

[00:02:44] letting you layer in details from sources like Wikipedia

[00:02:47] or Crunchbase.

[00:02:49] By the end of the month,

[00:02:51] conversation responses were further optimized, and you could add aliases to your observed entities,

[00:02:56] especially handy when there are multiple ways to refer to the same person or place.

[00:03:02] December

[00:03:03] 2023

[00:03:03] brought some major model updates.

[00:03:06] The platform introduced OpenAI GPT 4 Turbo with a 128

[00:03:11] k token window and also added llama 2, Mistral, and anthropic Claude 2.1.

[00:03:19] All these models expand your options for generating text,

[00:03:22] summarizing content, or performing semantic searches.

[00:03:27] There was also support for query by example,

[00:03:30] letting you simply submit a sample snippet of text or conversation to find related content.

[00:03:37] Jumping into January 2024,

[00:03:39] new content publishing features transformed how you repurpose

[00:03:43] and summarize documents,

[00:03:45] audio transcripts,

[00:03:46] and images.

[00:03:49] LLM tools, including function calls in OpenAI models,

[00:03:53] make it simpler to integrate prompt driven tasks directly into your code.

[00:03:59] Additionally, Google and Microsoft email feeds became 1st class citizens,

[00:04:03] enabling ingestion of both historic and new emails,

[00:04:07] attachments included.

[00:04:10] That month also introduced reingestion capabilities

[00:04:13] so you can update existing content in place with minimal fuss.

[00:04:18] February

[00:04:19] 2024

[00:04:20] introduced semantic alerts

[00:04:22] for automatically generating daily or periodic summaries,

[00:04:26] perfect for receiving quick bulletins on everything new in your workspace.

[00:04:31] OneDrive and Google Drive feed support was expanded,

[00:04:35] including the ability to ingest images

[00:04:38] from PDFs or to capture embedded image

[00:04:41] images in your documents.

[00:04:43] These features cut down on tedious overhead

[00:04:46] so you can focus on building robust apps.

[00:04:50] March 2 24 saw comprehensive usage and credits telemetry

[00:04:55] so you can monitor precisely how many tokens or data credits your team is using.

[00:05:01] For devs who prefer command line utilities,

[00:05:04] a new CLI tool was released for efficient graphlet data

[00:05:08] API interactions.

[00:05:10] Further in March came support for additional issue tracking feeds,

[00:05:14] linear, GitHub issues, and Jira,

[00:05:17] turning your project tickets into fully searchable items.

[00:05:21] And for file management, you could now ingest base 64 encoded files directly.

[00:05:28] April 2024

[00:05:30] was all about chat and ingestion flows.

[00:05:33] Discord feed support arrived,

[00:05:35] letting you bring chat messages

[00:05:37] and attached media directly into Graphlet.

[00:05:40] Cohere re ranking and improved text chunking

[00:05:43] enhanced the retrieval augmented generation pipeline.

[00:05:47] Most notably, native SDKs in Python and TypeScript

[00:05:51] came out later in April.

[00:05:53] Code generated from the GraphQL schema, so you no longer need to be a GraphQL expert to start building.

[00:06:00] There were also brand new model integrations for OpenAI's

[00:06:04] g GPT

[00:06:05] 4 expansions

[00:06:07] as well as support for advanced

[00:06:09] and coherent models.

[00:06:11] May 2024

[00:06:13] introduced new re ranking services from Jina and Pongo,

[00:06:17] plus the brand new graphRAG feature,

[00:06:19] a system that harnesses extracted entities to supercharge your your retrieval augmented generation workflow.

[00:06:27] Open AI GPT 4 o replaced Azure Open AI GPT 3.5,

[00:06:33] 16 k as the default model,

[00:06:35] promising a more robust experience.

[00:06:38] For Microsoft Teams fans, feed support allowed you to ingest channel messages automatically

[00:06:44] while performance optimizations

[00:06:46] in entity extraction boosted speeds for large datasets.

[00:06:51] In June 2024,

[00:06:53] developers gained the ability to pass embedded JSON LD data from web pages,

[00:06:59] enriching

[00:07:00] your knowledge graph with schema.org

[00:07:02] or other structured data.

[00:07:05] Deep seek model support was introduced. So

[00:07:08] if you prefer their AI completions and coders,

[00:07:11] you have that option too.

[00:07:14] Then in late June came the Claude 3.5 Sonnet model and semantic search for observable entities,

[00:07:21] a boon for advanced knowledge graph queries when dealing with people,

[00:07:26] organizations,

[00:07:27] or places.

[00:07:29] July 2024

[00:07:31] was packed.

[00:07:33] Webhook alerts went live,

[00:07:35] enabling automatic notifications when new content is published,

[00:07:39] and the deep seek 128 k token context was introduced

[00:07:43] to handle giant transcripts or text blocks in a single pass.

[00:07:48] Then soon after, gpt4

[00:07:50] o Mini arrived,

[00:07:52] offering a more lightweight

[00:07:54] gpt4

[00:07:55] variant for faster completions.

[00:07:57] Updates to Mistral Large 2 and Nemo

[00:08:00] plus new GrokLama

[00:08:02] 3.1 models delivered even more alternatives to handle your AI workloads.

[00:08:07] The next update introduced an indexing workflow stage and Azure AI language detection

[00:08:14] so you can automatically detect and store the languages a piece of content is written in.

[00:08:20] In August 2024,

[00:08:22] the platform unveiled an open source dotnet SDK on Nougat,

[00:08:26] giving

[00:08:27] Csearchiles developers

[00:08:28] first class

[00:08:30] access to the Graphlet APIs.

[00:08:33] Document preparation

[00:08:34] switched to Azure AI document intelligence by default,

[00:08:38] boosting accuracy for complex PDFs.

[00:08:41] Medical entities were also added,

[00:08:44] letting you extract detailed references

[00:08:46] to medical conditions,

[00:08:48] drugs, or procedures.

[00:08:50] Meanwhile, anthropic prompt caching

[00:08:53] reduced token costs,

[00:08:55] speeding up repeated queries.

[00:08:58] September 24 stepped up the AI model offerings further with FHIR enrichment for health care data and the newly updated Coherent models.

[00:09:08] Google AI search feeds and Cerebras model support also landed,

[00:09:12] plus more advanced grok llama versions.

[00:09:15] By now, you could combine advanced conversation prompts with fallback retrieval of relevant content,

[00:09:22] ensuring your AI app stays robust even if the main model hits a snag.

[00:09:26] October 2, 2024 introduced a flurry of tool calling capabilities

[00:09:31] across Anthropic, Google Gemini, Mistral, and more.

[00:09:36] This let large language models handle tasks by calling external tools in real time, bridging the gap between pure text generation

[00:09:45] and practical app logic.

[00:09:47] GitHub repository feeds also debuted so you can auto ingest code files

[00:09:52] while conversation interactions got new ways to simulate tool calls or to pass tool responses from JSON.

[00:10:00] November

[00:10:01] 2024

[00:10:02] delivered web search features.

[00:10:04] Imagine searching the open web,

[00:10:06] retrieving

[00:10:07] relevant pages,

[00:10:09] and combining them with your content all in one shot.

[00:10:14] Multi turn text summarization

[00:10:16] and multi turn image analysis

[00:10:18] made it possible to revise or refine a summary

[00:10:22] over a series of LLM prompts.

[00:10:24] Deepgram language detection took speech transcription to the next level,

[00:10:29] and feed management was enhanced so you can control usage or upgrade easily when you run out of free tier credits.

[00:10:36] Finally, December 2024

[00:10:39] concluded with major expansions in the retrieval pipeline.

[00:10:43] You can run a retrieval only rag process or map entire websites with the new site mapping features. Summaries and bullet points are more accurate than ever with refined chunking strategies.

[00:10:54] And let's not forget the big arrival

[00:10:56] of new model support for Grok Yammer

[00:10:59] 3.3

[00:11:00] or the ability to handle advanced content from Dropbox,

[00:11:04] Box, Intercom,

[00:11:05] Zendesk, and beyond.

[00:11:08] Throughout these releases, loads of bugs were squashed

[00:11:11] from synchronizing your search index in real time

[00:11:14] to extracting images from PDFs

[00:11:16] to properly handling text from every corner of your data sources.

[00:11:21] Each fix was about giving you a more stable and confident development experience.

[00:11:27] In short, the Graphlet platform has advanced leaps and bounds this past year.

[00:11:33] From brand new SDKs

[00:11:35] and improved content ingestion

[00:11:37] to powerful LLM integrations

[00:11:40] and next level conversation features, there's something here for every team building with AI.

[00:11:46] The best way to learn more?

[00:11:48] Try it. You can sign up for free right now at graphlet.com

[00:11:53] and explore all these capabilities for yourself.

[00:11:57] And here's one last bit of news.

[00:12:00] 2025

[00:12:01] is right around the corner,

[00:12:03] and the team is preparing even more excitement,

[00:12:07] especially for building, testing, and deploying AI agents.

[00:12:13] Keep an eye out for those features because they're going to transform

[00:12:17] how you build intelligent

[00:12:20] automated systems.

[00:12:22] Thanks for tuning in to this whirlwind tour of Graphlet's 24 updates.

[00:12:27] Good luck with your next AI driven project,

[00:12:30] and we can't wait to see the amazing creations you'll build with these new tools.

[00:12:36] Enjoy exploring the platform,

[00:12:38] and happy coding.
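
The two prompts in the publishing cell are applied in a map/reduce manner: the summary prompt runs on each changelog page, and the publish prompt runs once over all of the summaries. The following is a minimal sketch of that flow, independent of the Graphlit API. `map_reduce_publish` and the `complete` callable are hypothetical names, and the stub below stands in for a real model call.

```python
from typing import Callable, List


def map_reduce_publish(
    pages: List[str],
    summary_prompt: str,
    publish_prompt: str,
    complete: Callable[[str, str], str],
) -> str:
    # Map step: summarize each page independently.
    summaries = [complete(summary_prompt, page) for page in pages]
    # Reduce step: generate one script from all of the summaries.
    return complete(publish_prompt, "\n\n".join(summaries))


def stub_complete(prompt: str, text: str) -> str:
    # Stand-in for an LLM call: tags the text with the first word of the prompt.
    return f"[{prompt.split()[0]}] {text}"


script = map_reduce_publish(
    ["January release notes", "February release notes"],
    "Summarize this page.",
    "Write a script.",
    stub_complete,
)
print(script)
```

Keeping the map step per page means each summary call only sees one page, so a smaller model can handle it, while the single reduce call sees every summary at once. That is consistent with the notebook using GPT-4o Mini for summaries and O1 for the final script.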



In [9]:
import time
from IPython.display import display, HTML, JSON, Markdown
from datetime import datetime, timedelta

time.sleep(10) # give it some time for billing events to catch up

credits = await lookup_credits(ingestion_correlation_id)

if credits is not None:
    display(Markdown(f"### Credits used: {credits.credits:.6f} for ingestion"))
    print(f"- storage [{credits.storage_ratio:.2f}%], compute [{credits.compute_ratio:.2f}%]")
    print(f"- embedding [{credits.embedding_ratio:.2f}%], completion [{credits.completion_ratio:.2f}%]")
    print(f"- ingestion [{credits.ingestion_ratio:.2f}%], indexing [{credits.indexing_ratio:.2f}%], preparation [{credits.preparation_ratio:.2f}%], extraction [{credits.extraction_ratio:.2f}%], enrichment [{credits.enrichment_ratio:.2f}%], publishing [{credits.publishing_ratio:.2f}%]")
    print(f"- search [{credits.search_ratio:.2f}%], conversation [{credits.conversation_ratio:.2f}%]")
    print()

usage = await lookup_usage(ingestion_correlation_id)

if usage is not None:
    display(Markdown("### Usage records:"))

    for record in usage:
        dump_usage_record(record)
    print()


### Credits used: 3.826806 for ingestion

- storage [1.21%], compute [45.40%]
- embedding [4.18%], completion [0.00%]
- ingestion [0.00%], indexing [0.00%], preparation [49.21%], extraction [0.00%], enrichment [0.00%], publishing [0.00%]
- search [0.00%], conversation [0.00%]



### Usage records:

2024-12-29T03:26:12.046Z: Serverless compute
- Workflow [Entity Event] took 0:00:07.950695, used credits [0.01431456]
- CONTENT [550d544c-3dc6-4fae-a4d1-89577544d9fc]

2024-12-29T03:26:11.897Z: Text embedding
- Workflow [Preparation] took 0:00:00.362794, used credits [0.00113600]
- CONTENT [550d544c-3dc6-4fae-a4d1-89577544d9fc]: Content type [PAGE], file type [DOCUMENT]
- Model service [OpenAI], model name [Ada_002]
- Text embedding [568 tokens], throughput: 1565.627 tokens/sec

2024-12-29T03:26:11.733Z: Text embedding
- Workflow [Preparation] took 0:00:00.205283, used credits [0.00093600]
- CONTENT [550d544c-3dc6-4fae-a4d1-89577544d9fc]: Content type [PAGE], file type [DOCUMENT]
- Model service [OpenAI], model name [Ada_002]
- Text embedding [468 tokens], throughput: 2279.781 tokens/sec

2024-12-29T03:26:10.545Z: Upload Mezzanine
- Workflow [Preparation] took 0:00:00.053080, used credits [0.00001700]
- CONTENT [550d544c-3dc6-4fae-a4d1-89577544d9fc]: Content type [PAGE], file type [DAT

In [10]:
credits = await lookup_credits(publish_correlation_id)

if credits is not None:
    display(Markdown(f"### Credits used: {credits.credits:.6f} for publishing"))
    print(f"- storage [{credits.storage_ratio:.2f}%], compute [{credits.compute_ratio:.2f}%]")
    print(f"- embedding [{credits.embedding_ratio:.2f}%], completion [{credits.completion_ratio:.2f}%]")
    print(f"- ingestion [{credits.ingestion_ratio:.2f}%], indexing [{credits.indexing_ratio:.2f}%], preparation [{credits.preparation_ratio:.2f}%], extraction [{credits.extraction_ratio:.2f}%], enrichment [{credits.enrichment_ratio:.2f}%], publishing [{credits.publishing_ratio:.2f}%]")
    print(f"- search [{credits.search_ratio:.2f}%], conversation [{credits.conversation_ratio:.2f}%]")
    print()

usage = await lookup_usage(publish_correlation_id)

if usage is not None:
    display(Markdown("### Usage records:"))

    for record in usage:
        dump_usage_record(record)
    print()


### Credits used: 57.772510 for publishing

- storage [0.25%], compute [0.09%]
- embedding [0.02%], completion [26.31%]
- ingestion [0.00%], indexing [0.00%], preparation [2.55%], extraction [0.00%], enrichment [0.00%], publishing [70.75%]
- search [0.02%], conversation [0.00%]
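
The ratios above are printed as percentages, so the approximate credits per category can be recovered by multiplying each one by the total. Here is a small sketch using the figures from this run. `credits_by_category` is a hypothetical helper, not part of the Graphlit SDK, and it assumes each ratio is a percentage of the total credits.

```python
from typing import Dict


def credits_by_category(total: float, ratios: Dict[str, float]) -> Dict[str, float]:
    # Each ratio is assumed to be a percentage (0-100) of the total credits.
    return {name: total * pct / 100.0 for name, pct in ratios.items()}


publishing_run = credits_by_category(
    57.772510,
    {"publishing": 70.75, "completion": 26.31, "preparation": 2.55},
)

for name, used in publishing_run.items():
    print(f"{name}: {used:.2f} credits")
```

Under that assumption, most of the cost of this run comes from the publishing category, followed by completion.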



### Usage records:

2024-12-29T03:29:18.453Z: Serverless compute
- Workflow [Entity Event] took 0:00:16.635641, used credits [0.02995108]
- CONTENT [3932f651-18ae-40d1-b548-51b6612ec4d8]

2024-12-29T03:29:18.070Z: Text embedding
- Workflow [Preparation] took 0:00:00.573852, used credits [0.00410200]
- CONTENT [3932f651-18ae-40d1-b548-51b6612ec4d8]: Content type [FILE], file type [AUDIO]
- Model service [OpenAI], model name [Ada_002]
- Text embedding [2051 tokens], throughput: 3574.094 tokens/sec

2024-12-29T03:29:17.932Z: Text embedding
- Workflow [Preparation] took 0:00:00.469946, used credits [0.00411200]
- CONTENT [3932f651-18ae-40d1-b548-51b6612ec4d8]: Content type [FILE], file type [AUDIO]
- Model service [OpenAI], model name [Ada_002]
- Text embedding [2056 tokens], throughput: 4374.972 tokens/sec

2024-12-29T03:29:17.020Z: GraphQL
- Operation took 0:02:58.736666, used credits [0.00000000]
- Request:
mutation PublishContents($summaryPrompt: String, $publishPrompt: String!, $connector: ContentPublish