## VectorSearch.ipynb

### Written by Taiob Ali

@sqlworldwide

Reference: [Azure OpenAI Embeddings](https://github.com/AzureSQLDB/GenAILab/blob/main/docs/2-creating-embedding-and-storing-in-SQL-database.md)

Create a stored procedure that generates embeddings. You will need to change the URL and api-key values to match your own Azure OpenAI deployment.

An embedding is a special format of data representation that machine learning models and algorithms can easily use. The embedding is an information-dense representation of the semantic meaning of a piece of text. Each embedding is a vector of floating-point numbers, such that the distance between two embeddings in the vector space is correlated with the semantic similarity between the two inputs in their original format. For example, if two texts are similar, then their vector representations should also be similar.
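To make the link between distance and similarity concrete, here is a minimal sketch that uses the same `vector_distance` function as the queries later in this notebook. The three toy vectors are made up for illustration; real embeddings from the model used here have 1536 dimensions.

```sql
-- Toy 3-dimensional vectors (illustrative values, not real embeddings)
DECLARE @a vector(3) = '[1, 0, 0]';
DECLARE @b vector(3) = '[0.9, 0.1, 0]';  -- points in almost the same direction as @a
DECLARE @c vector(3) = '[0, 1, 0]';      -- orthogonal to @a

SELECT
    vector_distance('cosine', @a, @a) AS same_vector,      -- approximately 0
    vector_distance('cosine', @a, @b) AS similar_vector,   -- small, close to 0
    vector_distance('cosine', @a, @c) AS unrelated_vector; -- approximately 1
```

Cosine distance is one minus the cosine similarity, so smaller values mean the two inputs are closer in meaning. That is why the search queries below order by distance ascending.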

In [None]:
CREATE OR ALTER PROCEDURE dbo.create_embeddings
@inputText nvarchar(max),
@embedding vector(1536) OUT
AS
DECLARE @url nvarchar(4000) = N'https://ta-openai.openai.azure.com/openai/deployments/ta-model-text-embedding-ada-002/embeddings?api-version=2023-05-15';

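-- Replace the placeholder with your own Azure OpenAI key; do not share a real key in a notebook
DECLARE @headers nvarchar(300) = N'{"api-key": "<your-api-key>"}';

-- json_object escapes quotes, backslashes, and control characters in @inputText,
-- so arbitrary product text still produces a valid JSON payload
DECLARE @payload nvarchar(max) = json_object('input': @inputText);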
DECLARE @retval int, @response nvarchar(max);

exec @retval = sp_invoke_external_rest_endpoint 
    @url = @url,
    @method = 'POST',
    @headers = @headers,
    @payload = @payload,
    @timeout = 230,
    @response = @response output;

DECLARE @re vector(1536);
IF (@retval = 0)
BEGIN
    -- The embedding array sits under result.data[0] in the REST response
    SET @re = cast(json_query(@response, '$.result.data[0].embedding') AS vector(1536));
END
ELSE
BEGIN
    -- isnull keeps the message from collapsing to NULL when a JSON path is missing
    DECLARE @msg nvarchar(max) =
        'Error calling OpenAI API' + char(13) + char(10) +
        '[HTTP Status: ' + isnull(json_value(@response, '$.response.status.http.code'), 'unknown') + '] ' +
        isnull(json_value(@response, '$.result.error.message'), '');
    THROW 50000, @msg, 1;
END

SET @embedding = @re;

RETURN @retval
GO
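The procedure above passes the API key as a literal header string, which is easy to leak when a notebook is shared. As a hedged alternative, `sp_invoke_external_rest_endpoint` also accepts a `@credential` parameter that refers to a database scoped credential. The sketch below assumes a database master key already exists; the key placeholder and the sample input text are illustrative and not part of the original notebook.

```sql
-- Assumption: a database master key already exists in this database.
-- The credential name must be a URL that is a prefix of the endpoint being called.
IF NOT EXISTS (SELECT 1 FROM sys.database_scoped_credentials
               WHERE name = N'https://ta-openai.openai.azure.com')
BEGIN
    CREATE DATABASE SCOPED CREDENTIAL [https://ta-openai.openai.azure.com]
    WITH IDENTITY = 'HTTPEndpointHeaders',
         SECRET = '{"api-key": "<your-api-key>"}';  -- placeholder, not a real key
END
GO

DECLARE @retval int, @response nvarchar(max);
DECLARE @payload nvarchar(max) = json_object('input': N'outdoor patio umbrella');

EXEC @retval = sp_invoke_external_rest_endpoint
    @url = N'https://ta-openai.openai.azure.com/openai/deployments/ta-model-text-embedding-ada-002/embeddings?api-version=2023-05-15',
    @method = 'POST',
    @credential = [https://ta-openai.openai.azure.com],  -- used instead of the @headers literal
    @payload = @payload,
    @response = @response OUTPUT;

-- Inspect the outcome of the call
SELECT @retval AS return_code,
       json_value(@response, '$.response.status.http.code') AS http_status;
```

With this approach the key lives in the credential rather than in the procedure text, so the procedure body can be shared without exposing it.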

In [None]:
/*
A function to clean up your data: it keeps only letters, digits, spaces, and . , ! ?
(written by my colleague Howard Dunn)
*/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE OR ALTER FUNCTION [dbo].[cleanString] (@str NVARCHAR(MAX))
RETURNS NVARCHAR(MAX)
AS
BEGIN
    DECLARE @i INT = 1
    DECLARE @cleaned NVARCHAR(MAX) = ''

    WHILE @i <= LEN(@str)
    BEGIN
        IF SUBSTRING(@str, @i, 1) LIKE '[a-zA-Z0-9 .,!?]'
            SET @cleaned = @cleaned + SUBSTRING(@str, @i, 1)
        SET @i = @i + 1
    END

    RETURN @cleaned
END
GO

In [None]:
DROP TABLE IF EXISTS dbo.vectorTable;
SELECT TOP 250 ID, product_name, sku, brand, review_count, description
INTO dbo.vectorTable
FROM [dbo].[walmartProducts]
WHERE ID NOT IN (2, 7)
ORDER BY [ID];
GO

ALTER TABLE dbo.vectorTable
ADD description_vector vector(1536) NULL;
GO

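DECLARE @i int = 1;
-- Loop up to the highest ID actually present instead of a hard-coded limit
DECLARE @maxId int = (SELECT MAX(ID) FROM dbo.vectorTable);
DECLARE @text nvarchar(max);
DECLARE @vector vector(1536);

WHILE @i <= @maxId
BEGIN
    -- @text is NULL when no row has this ID, so the IF below skips gaps in the ID sequence
    SET @text = (SELECT isnull([product_name], '') + ': ' + isnull([brand], '') + ': ' + isnull([description], '')
                 FROM dbo.vectorTable
                 WHERE ID = @i);

    IF (@text <> '')
    BEGIN
        BEGIN TRY
            EXEC dbo.create_embeddings @text, @vector OUTPUT;
            UPDATE dbo.vectorTable SET [description_vector] = @vector WHERE ID = @i;
        END TRY
        BEGIN CATCH
            -- Report the failure and continue with the next row
            SELECT ERROR_NUMBER() AS ErrorNumber,
                   ERROR_MESSAGE() AS ErrorMessage;
        END CATCH
    END

    SET @i = @i + 1;
END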

In [None]:
DELETE FROM dbo.vectortable WHERE description_vector IS NULL;
SELECT Count(*) FROM dbo.vectortable;
SELECT TOP 10 * FROM dbo.vectortable;

In [None]:
-- Declare the search text
declare @search_text nvarchar(max) = 'help me plan a high school graduation party';

-- Declare a variable to hold the search vector
declare @search_vector vector(1536);

-- Generate the search vector using the 'create_embeddings' stored procedure
exec dbo.create_embeddings @search_text, @search_vector output;

-- Perform the search query
SELECT TOP(10) 
  product_name, brand, DESCRIPTION,
  -- Calculate the cosine distance between the search vector and product description vectors
  vector_distance('cosine', @search_vector, description_vector) AS distance
FROM [dbo].[vectorTable]
WHERE vector_distance('cosine', @search_vector, description_vector) IS NOT NULL
ORDER BY distance; -- Order by the closest distance

### Filtered Semantic Search with SQL

Reference: [Filtered Semantic Search with SQL](https://github.com/AzureSQLDB/GenAILab/blob/main/docs/4-filtered-semantic-search.md#filtered-semantic-search-with-sql)

This section shows how to implement a filtered semantic search in SQL. This kind of search, also called hybrid search here, combines traditional SQL predicates with vector-based similarity search so that only rows matching the filter are ranked by semantic distance.

### SQL Query for Hybrid Search

The following SQL script demonstrates a hybrid search in a SQL database. It uses vector embeddings to find the products most relevant to a textual description and restricts the results to products that offer free 30-day returns.

In [None]:
-- Declare the search text
declare @search_text nvarchar(max) = 'help me plan a high school graduation party';

-- Declare a variable to hold the search vector
declare @search_vector vector(1536);

-- Generate the search vector using the 'create_embeddings' stored procedure
exec dbo.create_embeddings @search_text, @search_vector output;

-- Perform the search query
SELECT TOP(10) 
  vt.product_name, vt.brand, vt.DESCRIPTION,
  -- Calculate the cosine distance between the search vector and product description vectors
  vector_distance('cosine', @search_vector, description_vector) AS distance
FROM [dbo].[vectorTable] AS vt
JOIN dbo.walmartProducts AS wpn
ON vt.id = wpn.id
WHERE vector_distance('cosine', @search_vector, description_vector) IS NOT NULL
AND wpn.free_returns ='Free 30-day returns'
ORDER BY distance; -- Order by the closest distance
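The same filter-then-rank pattern can be tried without calling the embedding endpoint. The following is a minimal sketch on made-up data: the `#demo` table, its rows, and the 3-dimensional vectors are all hypothetical, and it assumes your database allows `vector` columns in temporary tables.

```sql
-- Toy data: 3-dimensional vectors stand in for the 1536-dimensional embeddings
DROP TABLE IF EXISTS #demo;
CREATE TABLE #demo
(
    id int NOT NULL,
    product_name nvarchar(50) NOT NULL,
    free_returns nvarchar(50) NULL,
    description_vector vector(3) NOT NULL
);

INSERT INTO #demo (id, product_name, free_returns, description_vector)
VALUES
    (1, N'Party banner', N'Free 30-day returns', '[0.9, 0.1, 0.0]'),
    (2, N'Balloon set',  NULL,                   '[1.0, 0.0, 0.0]'),
    (3, N'Garden hose',  N'Free 30-day returns', '[0.0, 1.0, 0.0]');

-- Hypothetical search vector; in the notebook this comes from dbo.create_embeddings
DECLARE @q vector(3) = '[1, 0, 0]';

-- The WHERE clause removes rows first; the remaining rows are ranked by distance
SELECT id, product_name,
       vector_distance('cosine', @q, description_vector) AS distance
FROM #demo
WHERE free_returns = N'Free 30-day returns'
ORDER BY distance;
```

Row 2 is the closest match to the search vector, but it does not satisfy the filter, so the result lists row 1 followed by row 3. The relational predicate decides which rows are eligible, and the vector distance only orders the rows that remain.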

### Azure OpenAI Recommendations

Copied and edited from [here](https://github.com/AzureSQLDB/GenAILab/blob/main/docs/5-azure-openai-recommendation.md).

In [7]:
declare @search_text nvarchar(max) = 'help me plan a high school graduation party'

-- Get the search vector for the search text
declare @search_vector vector(1536)
exec dbo.create_embeddings @search_text, @search_vector output;

-- Get the top 25 products that are closest to the search vector
drop table if exists #t;
with cte as 
(
    select
        id, product_name, [description], description_vector,
        row_number() over (partition by product_name order by id) as rn
    from [dbo].[vectorTable]
    where vector_distance('cosine', @search_vector, description_vector) is not null
), 
cte2 as -- remove duplicates
(
    select 
        *
    from
        cte 
    where
        rn = 1
)
select top(25)
    id, product_name, [description],
    vector_distance('cosine', @search_vector, description_vector) as distance
into
    #t
from 
    cte2
order by 
    distance;

-- Aggregate the search results to make them easily consumable by the LLM
declare @search_output nvarchar(max);
select 
    @search_output = string_agg(cast(t.[id] as varchar(10)) +'=>' + t.[product_name] + '=>' + t.[description], char(13) + char(10))
from 
    #t as t;

-- Generate the payload for the LLM
declare @llm_payload nvarchar(max);
set @llm_payload = 
json_object(
    'messages': json_array(
            json_object(
                'role':'system',
                'content':'
                    You are an awesome AI shopping assistant  tasked with helping users find appropriate items they are looking for the occasion. 
                    You have access to a list of products, each with an ID, product name, and description, provided to you in the format of "Id=>Product=>Description". 
                    When users ask for products for specific occasions, you can leverage this information to provide creative and personalized suggestions. 
                    Your goal is to assist users in planning memorable celebrations using the available products.
                '
            ),
            json_object(
                'role':'user',
                'content': '## Source ##
                    ' + @search_output + '
                    ## End ##

                    Your answer needs to be a json object with the following format.
                    {
                        "answer": // the answer to the question, add a source reference to the end of each sentence. Source reference is the product Id.
                        "products": // a comma-separated list of product ids that you used to come up with the answer.
                        "thoughts": // brief thoughts on how you came up with the answer, e.g. what sources you used, what you thought about, etc.
                    }'
            ),
            json_object(
                'role':'user',
                'content': @search_text
            )
    ),
    'max_tokens': 800,
    'temperature': 0.3,
    'frequency_penalty': 0,
    'presence_penalty': 0,
    'top_p': 0.95,
    'stop': null
);

-- Invoke the LLM to get the response
declare @retval int, @response nvarchar(max);
-- Replace the placeholder with your own Azure OpenAI key
declare @headers nvarchar(300) = N'{"api-key": "<your-api-key>", "content-type": "application/json"}';
exec @retval = sp_invoke_external_rest_endpoint
    @url = N'https://ta-openai.openai.azure.com/openai/deployments/ta-model-gpt-4/chat/completions?api-version=2024-08-01-preview',
    @headers = @headers,
    @method = 'POST',    
    @timeout = 120,
    @payload = @llm_payload,
    @response = @response output;
select @retval as 'Return Code', @response as 'Response';

-- Get the answer from the response
select [key], [value] 
from openjson(( 
    select t.value 
    from openjson(@response, '$.result.choices') c cross apply openjson(c.value, '$.message') t
    where t.[key] = 'content'
))

Return Code,Response
0,"{""response"":{""status"":{""http"":{""code"":200,""description"":""""}},""headers"":{""Cache-Control"":""no-cache, must-revalidate"",""Date"":""Fri, 06 Dec 2024 12:01:15 GMT"",""Content-Length"":""2282"",""Content-Type"":""application\/json"",""access-control-allow-origin"":""*"",""apim-request-id"":""b446b7b2-8755-465b-a9df-a2efb59a1b5d"",""strict-transport-security"":""max-age=31536000; includeSubDomains; preload"",""x-content-type-options"":""nosniff"",""x-ms-region"":""East US 2"",""x-ratelimit-remaining-requests"":""57"",""x-ratelimit-remaining-tokens"":""50147"",""x-accel-buffering"":""no"",""x-ms-rai-invoked"":""true"",""x-request-id"":""f00d435e-4be4-4d1a-8323-e74c65da3e44"",""azureml-model-session"":""d001-20241125160720"",""x-envoy-upstream-service-time"":""5834"",""x-ms-client-request-id"":""b446b7b2-8755-465b-a9df-a2efb59a1b5d""}},""result"":{""choices"":[{""content_filter_results"":{""hate"":{""filtered"":false,""severity"":""safe""},""protected_material_code"":{""filtered"":false,""detected"":false},""protected_material_text"":{""filtered"":false,""detected"":false},""self_harm"":{""filtered"":false,""severity"":""safe""},""sexual"":{""filtered"":false,""severity"":""safe""},""violence"":{""filtered"":false,""severity"":""safe""}},""finish_reason"":""stop"",""index"":0,""logprobs"":null,""message"":{""content"":""{\n \""answer\"": \""For a memorable high school graduation party, start by setting up a comfortable and stylish seating area with the WestinTrends Julia 10 Ft Outdoor Patio Cantilever Umbrella, which provides ample shade and a chic ambiance for your garden or backyard gathering. Source: 49. Add a touch of elegance with the Amay Grommet Top Blackout Curtain Panels in White, which can be used to create a beautiful backdrop or partition areas within your party space. Source: 76. 
For entertainment, consider setting up areas for games or photo booths with props, and don't forget to curate a playlist of the graduate's favorite songs to keep the energy lively. Lastly, serve refreshments and snacks on the UDPATIO High Back Outdoor Dining Chairs, ensuring comfort for your guests as they celebrate this significant milestone. Source: 51.\"",\n \""products\"": \""49, 76, 51\"",\n \""thoughts\"": \""I selected the outdoor patio umbrella and dining chairs to provide comfortable seating and a pleasant environment for the party. The blackout curtain panels were chosen to enhance the decor and can be creatively used for setting up distinct areas or as elegant backdrops. These items together help create a functional and festive atmosphere suitable for a high school graduation party.\""\n}"",""role"":""assistant""}}],""created"":1733486470,""id"":""chatcmpl-AbRJ8e25uZMeCQXKMACjPvrFj2qGq"",""model"":""gpt-4-turbo-2024-04-09"",""object"":""chat.completion"",""prompt_filter_results"":[{""prompt_index"":0,""content_filter_results"":{""hate"":{""filtered"":false,""severity"":""safe""},""jailbreak"":{""filtered"":false,""detected"":false},""self_harm"":{""filtered"":false,""severity"":""safe""},""sexual"":{""filtered"":false,""severity"":""safe""},""violence"":{""filtered"":false,""severity"":""safe""}}}],""system_fingerprint"":""fp_5603ee5e2e"",""usage"":{""completion_tokens"":259,""prompt_tokens"":6154,""total_tokens"":6413}}}"


key,value
answer,"For a memorable high school graduation party, start by setting up a comfortable and stylish seating area with the WestinTrends Julia 10 Ft Outdoor Patio Cantilever Umbrella, which provides ample shade and a chic ambiance for your garden or backyard gathering. Source: 49. Add a touch of elegance with the Amay Grommet Top Blackout Curtain Panels in White, which can be used to create a beautiful backdrop or partition areas within your party space. Source: 76. For entertainment, consider setting up areas for games or photo booths with props, and don't forget to curate a playlist of the graduate's favorite songs to keep the energy lively. Lastly, serve refreshments and snacks on the UDPATIO High Back Outdoor Dining Chairs, ensuring comfort for your guests as they celebrate this significant milestone. Source: 51."
products,"49, 76, 51"
thoughts,I selected the outdoor patio umbrella and dining chairs to provide comfortable seating and a pleasant environment for the party. The blackout curtain panels were chosen to enhance the decor and can be creatively used for setting up distinct areas or as elegant backdrops. These items together help create a functional and festive atmosphere suitable for a high school graduation party.
