In [1]:
%pip install --upgrade --quiet pip setuptools wheel
%pip install --upgrade --quiet  langchain langchain-openai faiss-cpu tiktoken crate 'crate[sqlalchemy]' pandas jq 
%pip install --use-pep517 --quiet python-dotenv

Note: you may need to restart the kernel to use updated packages.


# Use CrateDB as fulltext search retriever and Mistral-7B as language model

## Set up environment variables

In [2]:
import os

from dotenv import load_dotenv

load_dotenv()

True
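
`load_dotenv()` returns `True` as soon as a `.env` file is found, which does not guarantee that every variable used later is defined. The following is a minimal sketch of a sanity check. The three `CRATEDB_*` names are the ones read further down in this notebook; `OPENAI_API_KEY` is assumed to be the variable that `langchain-openai` reads by default, and `missing_env_vars` is a hypothetical helper, not part of any library.

```python
import os

# Variable names read later in this notebook; OPENAI_API_KEY is an assumption
# about what langchain-openai looks up by default.
REQUIRED_VARS = ("OPENAI_API_KEY", "CRATEDB_USER", "CRATEDB_PASS", "CRATEDB_SERVER")


def missing_env_vars(names, env=None):
    """Return the names that are not defined in the environment mapping.

    An empty value counts as defined, because a CrateDB password may be empty.
    """
    env = os.environ if env is None else env
    return [name for name in names if name not in env]


# Demonstration with a stand-in mapping instead of the real environment.
example_env = {
    "CRATEDB_USER": "crate",
    "CRATEDB_PASS": "",
    "CRATEDB_SERVER": "localhost:4201",
}
print(missing_env_vars(REQUIRED_VARS, example_env))
```

Calling `missing_env_vars(REQUIRED_VARS)` without a second argument checks the real process environment.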

## Set up embeddings

In [3]:
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()
len(embeddings.embed_query("a"))

1536

In [4]:
conn_url = "crate://{user}:{password}@{server}".format(
    user=os.environ["CRATEDB_USER"],
    password=os.environ["CRATEDB_PASS"],
    server=os.environ["CRATEDB_SERVER"],
)
conn_url

'crate://crate:@localhost:4201'
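
The plain `str.format` call above works for the empty password shown here, but a password containing characters such as `@`, `:` or `/` would produce an ambiguous URL. The sketch below percent-encodes the credentials first. `build_crate_url` is a hypothetical helper and the second set of credentials is made up for illustration.

```python
from urllib.parse import quote


def build_crate_url(user: str, password: str, server: str) -> str:
    """Build a crate:// URL, percent-encoding user and password."""
    return "crate://{user}:{password}@{server}".format(
        user=quote(user, safe=""),
        password=quote(password, safe=""),
        server=server,
    )


# Same values as in the notebook output above.
print(build_crate_url("crate", "", "localhost:4201"))
# Made-up credentials with reserved characters in the password.
print(build_crate_url("admin", "p@ss:w/rd", "localhost:4201"))
```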

In [5]:
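# Load the JSON files and attach source metadata to every record
from langchain_community.document_loaders import JSONLoader, DirectoryLoader


def metadata_func(record: dict, metadata: dict) -> dict:
    # Keep the page URL and title of each record
    metadata["source_url"] = record.get("url")
    metadata["source_title"] = record.get("title")

    # Replace the default "source" (set by the loader) with the page URL,
    # so that answers can link to the original page
    if "source" in metadata:
        metadata["source"] = metadata["source_url"]

    return metadata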


loader = DirectoryLoader(
    './',
    glob="everything-*.json",
    loader_cls=JSONLoader,
    loader_kwargs={
        "jq_schema": ".[]",
        "text_content": False,
        "content_key": "html",
        "metadata_func": metadata_func,
    }
)

data = loader.load()
# data[:1]
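
The loader configuration implies that each `everything-*.json` file holds a JSON array of objects with `url`, `title` and `html` keys. The sketch below runs `metadata_func` on one made-up record to show the resulting metadata. The sample record and file path are hypothetical, and the initial `source` and `seq_num` values are an assumption about what `JSONLoader` pre-fills before calling the function.

```python
import json


def metadata_func(record: dict, metadata: dict) -> dict:
    # Same logic as in the cell above.
    metadata["source_url"] = record.get("url")
    metadata["source_title"] = record.get("title")
    if "source" in metadata:
        metadata["source"] = metadata["source_url"]
    return metadata


# Hypothetical content of one "everything-*.json" file.
sample_records = json.loads(
    '[{"url": "https://example.com/docs/page", '
    '"title": "Example page", "html": "<p>Hello</p>"}]'
)

loaded = []
for seq_num, record in enumerate(sample_records, start=1):
    # Assumed defaults: the file path as "source" and a running "seq_num".
    defaults = {"source": "/data/everything-1.json", "seq_num": seq_num}
    loaded.append((record["html"], metadata_func(record, defaults)))

print(loaded[0])
```

After the call, `source` points at the page URL instead of the local file, which is what the prompt later uses for links.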

In [6]:
# split documents
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    separators=[
        "\n\n",
        "\n",
        " ",
        ".",
        ",",
    ],
    chunk_size=500,
    chunk_overlap=50,
    length_function=len,
    is_separator_regex=False,
)

docs_splits = text_splitter.split_documents(data)
# docs_splits[:2]
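
To make the effect of `chunk_size=500` and `chunk_overlap=50` concrete, here is a deliberately simplified sketch that cuts fixed-size windows with overlap. It is not the algorithm used by `RecursiveCharacterTextSplitter`, which additionally tries to break at the listed separators; `sliding_chunks` is a hypothetical helper.

```python
def sliding_chunks(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list:
    """Cut text into windows of chunk_size characters that overlap by chunk_overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


text = "abcdefghij" * 120  # 1200 characters
chunks = sliding_chunks(text)
print([len(chunk) for chunk in chunks])  # [500, 500, 300]
```

The last 50 characters of each chunk reappear at the start of the next one, so a sentence cut at a chunk border is still fully contained in at least one chunk.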

## RAG search, indexing pipeline

In [7]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI

In [8]:
from rag.vectorstore.crate import CrateVectorStore

vectorstore = CrateVectorStore.from_documents(
    # Assumes the data was already imported. Passing an empty list lets the
    # notebook be re-run quickly without re-indexing.
    documents=[],
    # documents=docs_splits,
    embedding=embeddings,
    database_kwargs={
        "database_uri": conn_url,
    },
    # vectorstore_kwargs={
    #    "drop_if_exists" : True,
    # },
)
vectorstore

<rag.vectorstore.crate.CrateVectorStore at 0x31d0d6750>

In [35]:
from langchain.retrievers import EnsembleRetriever

retriever = EnsembleRetriever(
    retrievers=[
        vectorstore.as_retriever(
            search_kwargs={"k": 10, "fetch_k": 100, "algorithm": "knn"}
        ),
        vectorstore.as_retriever(
            search_kwargs={"k": 10, "fetch_k": 100, "algorithm": "fulltext"}
        )
    ],
    weights=[0.5, 0.5],
)
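
The ensemble combines the ranked results of both retrievers. To my understanding, `EnsembleRetriever` does this with weighted Reciprocal Rank Fusion (RRF). The sketch below illustrates that technique on plain document ids. It is not LangChain's code; the constant `c=60` and the sample ids are assumptions.

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, c=60):
    """Fuse ranked lists of document ids with weighted Reciprocal Rank Fusion."""
    if len(rankings) != len(weights):
        raise ValueError("need exactly one weight per ranking")
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            # Every list adds weight / (rank + c) for each document it returns.
            scores[doc_id] += weight / (rank + c)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)


knn_hits = ["a", "b", "c"]       # hypothetical vector search result
fulltext_hits = ["b", "d", "a"]  # hypothetical fulltext search result
print(weighted_rrf([knn_hits, fulltext_hits], weights=[0.5, 0.5]))
# ['b', 'a', 'd', 'c']
```

Documents found by both retrievers (`a` and `b`) end up ahead of those found by only one, which is the point of combining the two searches.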

In [36]:
import json

In [37]:
template = """Answer the question based only on the following context, if possible use links inside answer to reference the source, use markdown:

today date is 2024 April 3rd

{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
model = ChatOpenAI()


def format_docs(docs):
    # Serialize the retrieved chunks and their source URLs as JSON for the prompt
    return json.dumps([{"text": d.page_content, "source": d.metadata.get('source')} for d in docs])


chain = (
        {"context": retriever | format_docs,
         "question": RunnablePassthrough()}
        | prompt
        | model
        | StrOutputParser()
)

# result = chain.invoke("How to limit permissions?")
# result = chain.invoke(" How AWS marketplace works, and why I cannot see deployment in my account?")
# result = chain.invoke("What are edge regions and how to use them?")
result = chain.invoke("Write me example of using blobs?")
# result = chain.invoke("How to use BLOB store in CrateDB? and what are the benefits?")
result


'To use blobs in CrateDB, you first need to create a blob table using the Crate Shell. You can issue a SQL statement like this:\n\n```sh\ncrash -c "create blob table myblobs clustered into 3 shards with (number_of_replicas=1)"\n```\n\nAfter creating the blob table, you can upload a blob by issuing a PUT request like this:\n\n```sh\ncurl -isSX PUT \'127.0.0.1:4200/_blobs/myblobs/your_blob_id\' -d \'contents\'\n```\n\nTo retrieve a blob, you can use a GET request:\n\n```sh\ncurl -sS \'127.0.0.1:4200/_blobs/myblobs/your_blob_id\' contents\n```\n\nAnd to delete a blob, you can use a DELETE request:\n\n```sh\ncurl -isS -XDELETE \'127.0.0.1:4200/_blobs/myblobs/your_blob_id\'\n```\n\nFor more information on using blobs in CrateDB, you can refer to the [official documentation](https://cratedb.com/docs/python/en/latest/blobs.html).'
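
The `{context}` slot of the prompt receives whatever `format_docs` returns. The sketch below shows that JSON string for two stand-in documents. `FakeDoc` is a hypothetical substitute for LangChain's `Document` class that only carries the two attributes `format_docs` reads, and the sample texts and URL are made up.

```python
import json
from dataclasses import dataclass, field


@dataclass
class FakeDoc:
    """Stand-in exposing the two attributes format_docs reads from a document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs):
    # Same serialization as in the chain above.
    return json.dumps(
        [{"text": d.page_content, "source": d.metadata.get("source")} for d in docs]
    )


docs = [
    FakeDoc("Blob tables store binary data.", {"source": "https://example.com/blobs"}),
    FakeDoc("No metadata here."),
]
context = format_docs(docs)
print(context)
```

A document without a `source` entry is serialized with `"source": null`, so the model has no link to cite for that chunk.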

In [38]:
from IPython.display import display, Markdown

display(Markdown(result))

To use blobs in CrateDB, you first need to create a blob table using the Crate Shell. You can issue a SQL statement like this:

```sh
crash -c "create blob table myblobs clustered into 3 shards with (number_of_replicas=1)"
```

After creating the blob table, you can upload a blob by issuing a PUT request like this:

```sh
curl -isSX PUT '127.0.0.1:4200/_blobs/myblobs/your_blob_id' -d 'contents'
```

To retrieve a blob, you can use a GET request:

```sh
curl -sS '127.0.0.1:4200/_blobs/myblobs/your_blob_id' contents
```

And to delete a blob, you can use a DELETE request:

```sh
curl -isS -XDELETE '127.0.0.1:4200/_blobs/myblobs/your_blob_id'
```

For more information on using blobs in CrateDB, you can refer to the [official documentation](https://cratedb.com/docs/python/en/latest/blobs.html).
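
The generated answer leaves `your_blob_id` unexplained. As far as I know, CrateDB expects the blob id to be the SHA-1 hex digest of the uploaded content. Under that assumption, the URL for the requests above can be derived as sketched here; `blob_url` is a hypothetical helper and no request is actually sent.

```python
import hashlib


def blob_url(base_url: str, table: str, content: bytes) -> str:
    """Return the blob URL, using the SHA-1 digest of the content as blob id."""
    digest = hashlib.sha1(content).hexdigest()
    return f"{base_url.rstrip('/')}/_blobs/{table}/{digest}"


url = blob_url("http://127.0.0.1:4200", "myblobs", b"contents")
print(url)
```

The same URL would then serve for the PUT, GET and DELETE requests on that blob.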

In [39]:
display(Markdown(chain.invoke("What are edge regions and how to use them?")))

Edge regions are custom regions created for hosting CrateDB Edge clusters either locally or on existing cloud providers without relying on the default cloud regions. To create an edge region, users can follow these steps:

1. **Creating a Custom Region**: 
   - Users can create a custom region by signing up or logging into the CrateDB Cloud Console and going to the Regions tab in the Subscription overview.
   - Click on "Create Edge region" to create a custom edge region.
   - Once the region appears in the regions list, a script will be provided that can be copied into the CLI for installation confirmation.
   - The script may prompt for installation of prerequisite tools as needed.
   - To configure necessary storage classes, users can follow the instructions provided. 
   - Source: [CrateDB Edge Regions Creation](https://cratedb.com/docs/cloud/en/latest/tutorials/edge/managed-kubernetes.html#edge-providers)

2. **Deploying CrateDB Cloud on Kubernetes**:
   - For users with access to CrateDB Cloud on Kubernetes, the Regions tab allows the deployment of CrateDB Cloud on Kubernetes clusters in a custom region.
   - Provide a name for the custom region and click "Create edge region" to deploy the cluster.
   - Once created, the custom region will appear for further configuration.
   - Source: [CrateDB Cloud on Kubernetes Deployment](https://cratedb.com/docs/cloud/en/latest/reference/overview.html#import)

3. **Upgrading Edge Regions**:
   - Components of deployed Edge Regions are not updated automatically, and users should update their edge regions regularly for new features, bug fixes, and security updates.
   - If an edge region is outdated, users will see an "Upgrade this Edge region" button next to the region.
   - Clicking it will display a command that updates the Edge Region, which should be pasted into the environment where the Edge cluster is deployed.
   - Source: [Edge Regions Upgradation](https://cratedb.com/docs/cloud/en/latest/tutorials/edge/introduction.html#edge-disclaimer)

By following these steps, users can effectively create, deploy, and maintain custom edge regions for hosting CrateDB Edge clusters based on their infrastructure requirements.

In [40]:
display(Markdown(chain.invoke("How AWS marketplace works, and why I cannot see deployment in my account?")))

To deploy a cluster on CrateDB Cloud via AWS Marketplace, you will need to sign up for an AWS Marketplace account. The hourly usage is billed directly by Amazon, not by Crate.io. You can find the CrateDB Cloud offer on the AWS Marketplace page by searching for "CrateDB Cloud" in the search bar or going directly to the AWS offer page. Once you subscribe to the CrateDB Cloud offering, it may take up to 10 minutes for the subscription to be confirmed and usable in the CrateDB Cloud console. 

If you are unable to see the deployment in your account, it may be because the subscription process is still pending or has not been completed. Additionally, to delete a cluster created via AWS Marketplace, you must unsubscribe from the offer on the AWS Marketplace website. Make sure you are logged in with the account used to subscribe to the offer, find your account name in the top right corner, and select "Your Marketplace Software" from the dropdown menu.

For more information, you can refer to the source documentation on [AWS Marketplace deployment with CrateDB Cloud](https://cratedb.com/docs/cloud/en/latest/tutorials/deploy/marketplace/subscribe-aws.html) and [deleting a cluster via AWS Marketplace](https://cratedb.com/docs/cloud/en/latest/howtos/delete-cluster.html).

In [41]:
display(Markdown(chain.invoke("What are recent blog posts about CrateDB?")))

Recent blog posts about CrateDB include topics such as setting up a small CrateDB cluster with Docker and useful Docker commands for performance analysis. Additionally, there are posts discussing distributed query execution in CrateDB, indexing and storage in CrateDB, and handling dynamic objects in CrateDB. 

Sources: 
- [Setting up a CrateDB cluster with Docker](https://cratedb.com/blog/tag/metrics)
- [Distributed query execution in CrateDB](https://cratedb.com/blog/how-to-automatically-create-and-manage-database-backups)
- [Indexing and Storage in CrateDB](https://cratedb.com/product/features/indexing-columnar-storage-aggregations)

In [42]:
display(Markdown(chain.invoke("Write me example python code to use CrateDB?")))

To use CrateDB with Python, you can follow the example code provided in the [CrateDB Python driver documentation](https://cratedb.com/connect/python). Here is a basic example of how you can connect to CrateDB using the Python driver:

```python
from crate.client import connect

# Establish connection to CrateDB
connection = connect("http://localhost:4200/")

# Create a cursor object
cursor = connection.cursor()

# Execute a query
cursor.execute("SELECT * FROM my_table")

# Fetch the results
results = cursor.fetchall()

# Print the results
for row in results:
    print(row)

# Close the cursor and connection
cursor.close()
connection.close()
```

This code snippet demonstrates how to connect to CrateDB, execute a query, fetch the results, and then print them out. Make sure to replace `"http://localhost:4200/"` with the actual connection string for your CrateDB instance.

In [43]:
display(Markdown(chain.invoke("Write me example golang code to use CrateDB?")))

To use CrateDB with Go, you can utilize the pgx driver specifically designed for PostgreSQL. Here is an example of Go code that demonstrates how to connect to CrateDB using pgx:

```go
package main

import (
    "context"
    "fmt"
    "os"

    "github.com/jackc/pgx/v4"
)

func main() {
    // Define the connection string
    connString := "postgres://crate@localhost:5432/mydb"

    // Establish a connection to CrateDB
    conn, err := pgx.Connect(context.Background(), connString)
    if err != nil {
        fmt.Fprintf(os.Stderr, "Unable to connect to database: %v\n", err)
        os.Exit(1)
    }
    defer conn.Close(context.Background())

    // Perform database operations
    // Example: Querying data
    rows, err := conn.Query(context.Background(), "SELECT * FROM my_table")
    if err != nil {
        fmt.Fprintf(os.Stderr, "Query failed: %v\n", err)
        os.Exit(1)
    }
    defer rows.Close()

    for rows.Next() {
        var id int
        var name string
        if err := rows.Scan(&id, &name); err != nil {
            fmt.Fprintf(os.Stderr, "Scan failed: %v\n", err)
            os.Exit(1)
        }
        fmt.Printf("ID: %d, Name: %s\n", id, name)
    }
}
```

For further details and more examples, you can refer to the [CrateDB documentation on connecting to CrateDB with Go](https://cratedb.com/connect/go).

In [44]:
display(Markdown(chain.invoke("create RAG search with CrateDB and OpenAI?")))

To create a Retrieval Augmented Generation (RAG) search using CrateDB and OpenAI, you can leverage vector search to use embeddings and generative AI. The RAG approach is based on using CrateDB as a vector store and the OpenAI embedding model. By following the RAG workflow with CrateDB, you can identify key data sets for training, create a high-quality prompt for content generation, build a knowledge-based index, and optimize retrieval of information from a large collection of data.

For more detailed information on how to implement RAG search with CrateDB and OpenAI, you can refer to the following sources:

1. [Leverage Vector Search to Use Embeddings and Generative AI: Retrieval Augmented Generation (RAG) with CrateDB](https://cratedb.com/blog/leverage-vector-search-to-use-embeddings-and-generative-ai-retrieval-augmented-generation-rag-with-cratedb)
2. [Figure 1: RAG workflow with CrateDB](https://cratedb.com/blog/leverage-vector-search-to-use-embeddings-and-generative-ai-retrieval-augmented-generation-rag-with-cratedb)
3. [The role of vector store and vector similarity search](https://cratedb.com/blog/leverage-vector-search-to-use-embeddings-and-generative-ai-retrieval-augmented-generation-rag-with-cratedb)
4. [In this post, we will introduce the RAG approach based on CrateDB as a vector store and the OpenAI embedding model](https://cratedb.com/blog/leverage-vector-search-to-use-embeddings-and-generative-ai-retrieval-augmented-generation-rag-with-cratedb)

In [45]:
display(Markdown(chain.invoke("how to alter table and add fulltext index?")))

To alter a table and add a fulltext index in CrateDB, you can follow the syntax provided in the documentation. First, you need to create a table with the desired columns. Then, you can alter the table to add a fulltext index using the `ALTER TABLE` statement with the `ADD INDEX` clause.

Here is an example of how to add a fulltext index to a table in CrateDB:

```sql
ALTER TABLE table_name ADD INDEX index_name USING FULLTEXT(columns) WITH (analyzer = 'english');
```

Make sure to replace `table_name` with the name of your table, `index_name` with the desired name for the index, and `columns` with the column(s) you want to include in the fulltext index.

For more information on creating fulltext indices in CrateDB, you can refer to the official documentation [here](https://cratedb.com/docs/crate/reference/en/master/general/ddl/fulltext-indices.html).

In [46]:
display(Markdown(chain.invoke("how to alter table and add vector type field that allows for KNN search?")))

To alter a table and add a vector type field that allows for KNN search in CrateDB, you can follow the steps below:

1. Make sure you are using CrateDB version 5.5 or higher, as this version introduced the vector support and KNN search functionality.

2. Create a new table or use an existing table where you want to add the vector type field. For example, you can create a table named `my_data` with a field `xs` of type `FLOAT_VECTOR(2)`.

```sql
CREATE TABLE my_data (
  xs FLOAT_VECTOR(2)
);
```

3. Use the `ALTER TABLE` command to add a new field of type `FLOAT_VECTOR(n)` to the existing table. Replace `n` with the desired length of the vector. For example, to add a new field named `new_vector_field` of type `FLOAT_VECTOR(3)` to the `my_data` table, you can use the following command:

```sql
ALTER TABLE my_data ADD COLUMN new_vector_field FLOAT_VECTOR(3);
```

4. Once you have added the vector type field to the table, you can insert vector data into the table. For example:

```sql
INSERT INTO my_data VALUES ([1.6, 2.7]), ([4.6, 7.8]);
```

5. You can then perform KNN search queries on the vector data stored in the table using the `knn_match` function. For example, to find the k-nearest neighbors of a query vector `[3.14, 8]` in the `xs` field of the `my_data` table, you can use the following query:

```sql
SELECT xs, _score FROM my_data
WHERE knn_match(xs, [3.14, 8], 2)
ORDER BY _score DESC;
```

By following these steps, you can alter a table in CrateDB and add a vector type field that allows for KNN search. You can find more information about vector support and KNN search in CrateDB in the [official documentation](https://cratedb.com/blog/unlocking-the-power-of-vector-support-and-knn-search-in-cratedb).
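
To illustrate what a k-nearest-neighbor query computes, the sketch below ranks the two vectors from the answer by their distance to the query vector. It is an exact brute-force search assuming Euclidean distance, not CrateDB's index-based implementation; `nearest_neighbors` is a hypothetical helper.

```python
import math


def nearest_neighbors(vectors, query, k):
    """Return the k vectors closest to query by Euclidean distance, nearest first."""
    return sorted(vectors, key=lambda v: math.dist(v, query))[:k]


stored = [[1.6, 2.7], [4.6, 7.8]]  # the rows inserted in the answer above
print(nearest_neighbors(stored, [3.14, 8], k=2))
# [[4.6, 7.8], [1.6, 2.7]]
```

The vector `[4.6, 7.8]` comes first because it lies about 1.47 away from the query, compared with about 5.52 for `[1.6, 2.7]`.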

In [47]:
display(Markdown(chain.invoke("create table with fields ID, name, vector, and index vector field for KNN search?")))

To create a table with fields ID, name, vector, and index the vector field for KNN search in CrateDB, you can use the following SQL commands:

```sql
CREATE TABLE my_table (
  ID INTEGER PRIMARY KEY,
  name TEXT,
  vector FLOAT_VECTOR(2),
  INDEX vector_ft USING FULLTEXT(vector)
);
```

This SQL statement creates a table called `my_table` with fields `ID` as the primary key, `name` as text, `vector` as a float vector with two components, and an index `vector_ft` on the `vector` field for full-text search to enable KNN search in CrateDB.

For more information, you can refer to the source: [CrateDB Multi-Model Database Solutions](https://cratedb.com/solutions/multi-model-database)

In [48]:
display(Markdown(chain.invoke("What are limits and limitations of CrateDB?")))

The limitations of the single-node CRFEE plan for CrateDB include the lack of capabilities such as high speed, scalability, and high-availability that are typically found in a standard distributed cluster. However, it is possible to easily create a new cluster with another plan from within the Cloud Console. 

Source: [CrateDB Limitations](https://cratedb.com/lp-crfree?hsCtaTracking=43b563de-8b00-42d1-b008-73ca8a3353a1|398e0b9d-de53-4207-9b91-a092772b42e3#main-content)

In [49]:
display(Markdown(chain.invoke("What are the benefits of using CrateDB?")))

The benefits of using CrateDB include:

- **High performance**: CrateDB provides high-performance capabilities with query response time in milliseconds to process and analyze data efficiently. It offers a high-performance distributed query engine, writes, and reads, enhancing query performance significantly. ([source](https://cratedb.com/product/features/query-performance))

- **Scalability**: CrateDB offers horizontal scalability, allowing users to add as many nodes as needed. It is a multi-model database engine with support for structured, semi-structured, and unstructured schemas. It also provides multi-platform support, as it can run anywhere. ([source](https://cratedb.com/customers/abb))

- **Flexibility**: CrateDB is an open-source, multi-model, and distributed database that offers flexibility in managing extensive concurrent reads and writes efficiently. It supports multiple data types and provides a single source of truth updated in near real-time. ([source](https://cratedb.com/solutions/database-consolidation))

- **Simplicity**: CrateDB simplifies data infrastructure and helps overcome the challenges of growing complexity and technical debt. It uses native SQL as its query language, reducing the learning curve and allowing users to focus on query logic. The fully distributed query engine and columnar storage bring benefits such as immediate data availability for queries, hyper-fast aggregations, and in-memory SQL query performance. ([source](https://cratedb.com/product/features/distributed-database), [source](https://cratedb.com/product/features/query-performance))

- **Cost-effectiveness**: CrateDB reduces total cost of ownership (TCO) by delivering high performance, scalability, and flexibility. It simplifies management and maintenance, provides data synchronization, and empowers data applications with support for AI/ML. ([source](https://cratedb.com/blog/what-is-data-consolidation-an-overview))

- **Real-time capabilities**: CrateDB supports real-time full-text search over millions of documents in just a few seconds. It provides a powerful REST API for managing and accessing various aspects of CrateDB programmatically. ([source](https://cratedb.com/blog/automating-export-of-cratedb-data-to-s3-using-apache-airflow), [source](https://cratedb.com/product/features/rest-api))

Overall, using CrateDB can offer significant benefits in terms of performance, scalability, flexibility, simplicity, cost-effectiveness, and real-time capabilities.

In [50]:
display(Markdown(chain.invoke("What are technical limitations?")))

Technical limitations include the size and complexity of machine-generated data, diverse architecture of machine data pipelines, dated or proprietary communication protocols used by historians, specialized client applications required by some historians, and constraints on database capacity and performance due to strong consistency in relational databases. These limitations can hinder the realization of modern architectural design, real-time analytics requirements, and integration with other systems and tools. For more information, you can refer to the sources [here](https://cratedb.com/blog/new-partner-on-board-welcome-roosi), [here](https://cratedb.com/blog/time-series-databases-operational-historians), and [here](https://cratedb.com/blog/myths-relational-databases-operational-historians).

In [51]:
display(Markdown(chain.invoke("Does index creation block write operations?")))

Yes, according to the CrateDB documentation, if the `write.wait_for_active_shards` setting is set to 1 and a node is stopped, the write operations would block until the replica is fully replicated again or the write operations would timeout. This means that index creation can indeed block write operations. You can read more about this in the CrateDB documentation [here](https://cratedb.com/docs/crate/reference/en/5.6/sql/statements/create-table.html).

In [52]:
display(Markdown(chain.invoke("Does crate supports conditional indices")))

Based on the information provided, CrateDB automatically indexes every attribute by default, utilizing strategies like Inverted Index for text values, Block k-d trees for numeric, date, and geospatial values, and Hierarchical Navigable Small World (HNSW) graphs for high dimensional vectors ([source](https://cratedb.com/product/features/data-storage)). Additionally, CrateDB's indexing strategy, based on a Lucene index, automatically generates indexes for all attributes regardless of their depth, enabling rapid search capabilities for stored objects and facilitating efficient updates ([source](https://cratedb.com/solutions/json-database)). 

Therefore, it seems that CrateDB does not support conditional indices in the traditional sense, as it automatically indexes attributes based on predefined strategies.

In [53]:
display(Markdown(chain.invoke("How to create ID field that is autoincremented?")))

To create an auto-incremented ID field in CrateDB, you can use the `_id` system column as a primary key. This column contains a unique identifier for each record and its value is deterministic. This means that two individual records in different tables with the same primary key values will also have identical `_id` values. You can refer to the CrateDB documentation on [performance and selects](https://cratedb.com/docs/guide/performance/selects.html) for more information.

In [54]:
display(Markdown(chain.invoke("how to create analysers for fulltext search?")))

To create analyzers for fulltext search in CrateDB, you can use language-specific analyzers, tokenizers, and token-filters. You can get proper search results for data provided in a certain language by utilizing these elements. 

One way to create an analyzer is by defining a fulltext index with an analyzer. CrateDB provides the option to use the built-in English analyzer or create custom analyzers. 

For more information on creating analyzers for fulltext search in CrateDB, you can refer to the documentation on [Fulltext Indices](https://cratedb.com/docs/crate/reference/en/4.8/general/ddl/fulltext-indices.html) and [Analyzers](https://cratedb.com/docs/crate/reference/en/3.3/general/ddl/analyzers.html). 

Additionally, you can also check out the [CrateDB blog](https://cratedb.com/blog/crate-for-pythonistas-with-sqlalchemy) for practical examples and guidance on creating analyzers for fulltext search.

In [55]:
display(Markdown(chain.invoke("give me information about password and admin")))

Based on the provided context, password authentication is required for accessing the Admin UI in CrateDB. Users need to provide a password in addition to their username when using the password authentication method. The Admin UI allows users to display cluster health, monitoring, checks, list available nodes in the cluster, list tables, and control access via the CrateDB REST and PostgreSQL wire protocol interfaces and command line tools.

For more information about password authentication and the Admin UI in CrateDB, you can refer to the following sources:
- [CrateDB Documentation on Authentication Methods](https://cratedb.com/docs/crate/reference/en/5.6/admin/auth/methods.html)
- [CrateDB Blog Announcing CrateDB 2.3](https://cratedb.com/blog/announcing-cratedb-2-3)
- [CrateDB Blog on Visualizing Time Series Data with Grafana and CrateDB](https://cratedb.com/blog/visualizing-time-series-data-with-grafana-and-cratedb)
- [CrateDB Documentation on Deletion Protection, Credentials, and Upgrading CrateDB](https://cratedb.com/docs/cloud/en/latest/reference/overview.html#import)

In [56]:
display(Markdown(chain.invoke("Shared file system implementation of the BlobStoreRepository")))

The shared file system implementation of the BlobStoreRepository in CrateDB allows for defining a custom directory path for storing blob data, separate from the normal data path. This enables storing normal data on a fast SSD and blob data on a large, cost-effective spinning disk. This feature simplifies data management and optimizes storage resources. You can learn more about this feature in the CrateDB documentation on [Custom location for storing blob data](https://cratedb.com/docs/crate/reference/en/5.6/general/blobs.html).

In [57]:
display(Markdown(chain.invoke("Is Cloud UI opensource?")))

Based on the information provided, the Cloud UI offered by CrateDB is not open source. The blog post mentions that despite not being a good match for a specific use case, they loved the Cloud UI and its various features. Additionally, the blog post discussing the farewell to the CrateDB Enterprise License FAQ also states that the future is moving towards fully managed SaaS solutions, indicating that the Cloud UI is not open source. 

Sources:
- [Comparing Databases in Industrial IoT Use Case](https://cratedb.com/blog/comparing-databases-industrial-iot-use-case)
- [Farewell to the CrateDB Enterprise License FAQ](https://cratedb.com/blog/farewell-to-the-cratedb-enterprise-license-faq)

In [58]:
display(Markdown(chain.invoke("How to do fusion search and connect vector search with fulltext search")))

To perform fusion search and connect vector search with full-text search, you can leverage the advanced search capabilities offered by CrateDB. This includes combining vector, full-text, and keyword searches to improve semantic similarity and keyword matching, enhancing search precision and relevance. By utilizing these features, you can integrate vector search with full-text search seamlessly, enhancing the overall search experience. For more information on how CrateDB facilitates this fusion search, you can refer to the source [here](https://cratedb.com/blog/open-source-vector-database).

In [60]:
display(Markdown(chain.invoke("How to MATH fulltext ")))

To use fulltext search in CrateDB, you need to create a fulltext index with an analyzer for the column you want to search. Different types of fulltext indices exist with different goals. However, it's not possible to query multiple index columns with different index types within the same MATCH predicate. More information can be found in the CrateDB documentation on [fulltext searches](https://cratedb.com/docs/crate/reference/en/5.6/general/dql/fulltext.html).