diff --git a/docs/developer-guides/ai-features.md b/docs/developer-guides/ai-features.md new file mode 100644 index 00000000..e81962c8 --- /dev/null +++ b/docs/developer-guides/ai-features.md @@ -0,0 +1,167 @@ +--- +title: Utilizing AI Features in SettleMint - OpenAI Nodes and pgvector in Hasura +description: A Guide to Building an AI-Powered Workflow with OpenAI Nodes and Vector Storage in Hasura +sidebar_position: 2 +keywords: [integration studio, OpenAI, Hasura, pgvector, AI, SettleMint] +--- + +This guide will demonstrate how to use the **SettleMint Integration Studio** to create a flow that incorporates OpenAI nodes for vectorization and utilizes the `pgvector` plugin in Hasura for similarity searches. If you are new to SettleMint, check out the [Getting Started Guide](../about-settlemint/0_intro.mdx). + +In this guide, you will learn to create workflows that: +- Use **OpenAI nodes** to vectorize data. +- Store vectorized data in **Hasura** using `pgvector`. +- Conduct similarity searches to find relevant matches for new queries. + +### Prerequisites +- A SettleMint Platform account with **Integration Studio** and **Hasura** deployed +- Access to the Integration Studio and Hasura consoles in your SettleMint environment +- An OpenAI API key for using the OpenAI nodes +- A data source to vectorize (e.g., Graph Node, Attestation Indexer, or external API endpoint) + +### Example Flow Available +The Integration Studio includes a pre-built AI example flow that demonstrates these concepts. The flow uses the SettleMint Platform's attestation indexer as a data source, showing how to: +- Fetch attestation data via HTTP endpoint +- Process and vectorize the attestation content +- Store vectors in Hasura +- Perform similarity searches + +You can use this flow as a reference while building your own implementation. Each step described in this guide can be found in the example flow. + +--- + +## Part 1: Creating a Workflow to Fetch, Vectorize, and Store Data + +### Step 1: Set Up Vector Storage in Hasura + +1. Access your SettleMint's Hasura instance through the admin console. + +2. Create a new table called `document_embeddings` with the following columns: + - `id` (type: UUID, primary key) + - `embedding` (type: vector(1536)) - For storing OpenAI embeddings + +### Step 2: Set Up the Integration Studio Flow + +1. **Open Integration Studio** in SettleMint and click on **Create Flow** to start a new workflow. + +### Step 3: Fetch Data from an External API + +1. **Add an HTTP Request Node** to retrieve data from an external API, such as a document or product listing service. +2. Configure the **API endpoint** and any necessary authentication settings. +3. **Add a JSON Node** to parse the response data, focusing on fields like `id` and `content` for further processing. + +### Step 4: Vectorize Data with OpenAI Node + +1. **Insert an OpenAI Node** in the workflow: + - Use this node to generate vector embeddings for the text data using OpenAI's Embedding API. + - Configure the OpenAI node to use the appropriate model and input data, such as `text-embedding-ada-002`. + +![Create an OpenAI node](../../static/img/developer-guides/openai-node.png) + +### Step 5: Store Vectors in Hasura with pgvector + +1. **Add a GraphQL Node** to save the vector embeddings and data `id` in Hasura. +2. Set up a **GraphQL Mutation** to store the vectors and associated IDs in a table enabled with `pgvector`. + +Example Mutation: +```graphql +mutation insertVector($id: uuid!, $vector: [Float!]!) { + insert_vectors(objects: {id: $id, vector: $vector}) { + affected_rows + } +} +``` + +3. Ensure correct data mapping from the fetched data and vectorized output. + +### Step 6: Deploy and Test the Workflow + +1. **Deploy the Flow** within Integration Studio and **run it** to confirm that data is fetched, vectorized, and stored in Hasura. +2. **Verify Hasura Data** by checking the table to ensure vectorized entries and corresponding IDs are stored correctly. + +--- + +## Part 2: Setting Up a Similarity Search Endpoint + +### Step 1: Create a POST Endpoint + +1. **Add an HTTP POST Node** to accept a JSON payload with a `query` string to be vectorized and compared to stored data. + +Payload Example: +```json +{ + "query": "input string for similarity search" +} +``` + +2. **Parse the Request** by adding a JSON node to extract the `query` field from the incoming POST request. + +### Step 2: Vectorize the Input Query + +1. **Add an OpenAI Node** to convert the incoming `query` string into a vector representation. + +Example Configuration: +```text +Model: text-embedding-ada-002 +Input: {{msg.payload.query}} +``` + +### Step 3: Perform a Similarity Search with Hasura + +1. **Add a GraphQL Node** to perform a vector similarity search within Hasura using the `pgvector` plugin. +2. Use a **GraphQL Query** to order results by similarity, returning the top 5 most similar records. + +Example Query: +```graphql +query searchVectors($vector: [Float!]!) { + vectors(order_by: {vector: {_vector_distance: $vector}}, limit: 5) { + id + vector + } +} +``` + +3. Map the vector from the OpenAI node output as the `vector` input for the Hasura query. + +### Step 4: Format and Return the Results + +1. **Add a Function Node** to format the response, listing the top 5 matches in a structured JSON format. + +### Step 5: Test the Flow + +1. **Deploy the Flow** and send a POST request to confirm the similarity search functionality. +2. **Verify Response** to ensure that the flow accurately returns the top 5 matches from the vectorized data in Hasura. + +--- + +## Next Steps + +Now that you have built an AI-powered workflow, here are some blockchain-specific applications you can explore: + +### Vectorize On-Chain Data +- Index and vectorize smart contract events for similarity-based event monitoring +- Create embeddings from transaction data to detect patterns or anomalies +- Vectorize NFT metadata for content-based recommendations +- Build semantic search for on-chain attestations + +### Advanced Use Cases +- Combine transaction data with natural language descriptions for enhanced search +- Create AI-powered analytics dashboards using vectorized blockchain metrics +- Implement fraud detection by vectorizing transaction patterns +- Build a semantic search engine for smart contract code and documentation + +### Integration Ideas +- Connect to multiple blockchain indexers to vectorize data across networks +- Combine off-chain and on-chain data vectors for comprehensive analysis +- Set up automated alerts based on similarity to known patterns +- Create a knowledge base from vectorized blockchain documentation + +For further resources, check out: + +- [SettleMint Integration Studio Documentation](https://console.settlemint.com/documentation/docs/using-platform/integration-studio/) +- [Node-RED Documentation](https://nodered.org/docs/) +- [OpenAI API Documentation](https://beta.openai.com/docs/) +- [Hasura pgvector Documentation](https://hasura.io/docs/3.0/connectors/postgresql/native-operations/vector-search/) + +--- + +This guide should enable you to build AI-powered workflows with SettleMint's new OpenAI nodes and `pgvector` support in Hasura for efficient similarity searches. \ No newline at end of file diff --git a/static/img/developer-guides/openai-node.png b/static/img/developer-guides/openai-node.png new file mode 100644 index 00000000..49a531f4 Binary files /dev/null and b/static/img/developer-guides/openai-node.png differ