2 changes: 2 additions & 0 deletions .gitignore
@@ -0,0 +1,2 @@
.env
node_modules/
1 change: 1 addition & 0 deletions vector-database/nodejs/.env.template
@@ -0,0 +1 @@
OPENAI_API_KEY="your key"
1 change: 1 addition & 0 deletions vector-database/nodejs/LICENSE
@@ -0,0 +1 @@
Redis, Inc. proprietary, subject to the Redis Enterprise Software and/or Cloud Services license
89 changes: 89 additions & 0 deletions vector-database/nodejs/README.md
@@ -0,0 +1,89 @@
# Redis VSS + AI Examples in Node.js

## Contents
1. [Summary](#summary)
2. [Architecture](#architecture)
3. [Features](#features)
4. [Prerequisites](#prerequisites)
5. [Installation](#installation)
6. [Execution](#execution)

## Summary <a name="summary"></a>
This repo provides a series of examples of how to use Redis Vector Similarity Search (VSS) from Node.js. It demonstrates
both standalone vector searches and searches combined with generative AI queries.

## Architecture <a name="architecture"></a>
![architecture](https://docs.google.com/drawings/d/e/2PACX-1vQlOwYbS5EN29m1Ld2lbZQA16oiB4h1T_x8q0N9_pi8pYNiDQw9igTsP9IZVm1Zje_FQvgag9GJqlaW/pub?w=462&h=268)


## Features <a name="features"></a>
- Node.js source for implementing Redis VSS on hashes and JSON documents
- Node.js source for creating an OpenAI client and executing ChatCompletion and Embedding operations
- Docker Compose file to start up a Redis Stack instance

## Prerequisites <a name="prerequisites"></a>
- Docker
- Node.js
- [OpenAI key](https://platform.openai.com)

## Installation <a name="installation"></a>
1. Clone this repo.
2. Change to the `vector-database/nodejs` directory.
3. Create a `.env` file (for example, by copying `.env.template`) and add this line: `OPENAI_API_KEY="your key"`
4. Start up Redis Stack: `docker compose up -d`
5. Install the Node.js module dependencies listed in package.json: `npm install`
6. Run the app (app.js) via the script included in package.json: `npm start`

## Execution <a name="execution"></a>
### Redis Client Connection
```text
PONG
```
### OpenAI Client Connection
```text
Pong!
```
### Index Build
```text
idx1 (FLAT, L2, 1536, FLOAT32, JSON): OK
idx2 (HNSW, COSINE, 1536, FLOAT32, M=48, HASH): OK
```
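
The two index definitions reported above can also be written out as raw `FT.CREATE` argument lists. The following is a minimal sketch, assuming the standard `FT.CREATE` syntax in which a vector field is followed by the algorithm name and a count of the attribute tokens that come after it. `vectorFieldArgs` is a hypothetical helper and is not part of app.js, which builds the same indices through the client library's schema objects.

```javascript
// Hypothetical helper (not in app.js): flattens a vector field definition
// into the token list that follows the field name in FT.CREATE.
function vectorFieldArgs(algorithm, attrs) {
  const flat = Object.entries(attrs).flatMap(([name, value]) => [name, String(value)]);
  // The number after the algorithm is the count of attribute tokens that follow.
  return ['VECTOR', algorithm, String(flat.length), ...flat];
}

// idx1: FLAT index over JSON documents with the jsonDoc: prefix.
const idx1Args = [
  'FT.CREATE', 'idx1', 'ON', 'JSON', 'PREFIX', '1', 'jsonDoc:',
  'SCHEMA',
  '$.vector', 'AS', 'vector',
  ...vectorFieldArgs('FLAT', { TYPE: 'FLOAT32', DIM: 1536, DISTANCE_METRIC: 'L2' }),
  '$.content', 'AS', 'content', 'TEXT',
];

// idx2: HNSW index over hashes with the hashDoc: prefix.
const idx2Args = [
  'FT.CREATE', 'idx2', 'ON', 'HASH', 'PREFIX', '1', 'hashDoc:',
  'SCHEMA',
  'vector',
  ...vectorFieldArgs('HNSW', { TYPE: 'FLOAT32', DIM: 1536, DISTANCE_METRIC: 'COSINE', M: 48 }),
  'content', 'TEXT',
];

console.log(idx1Args.join(' '));
console.log(idx2Args.join(' '));
```

The printed lines are only a way to inspect the definitions, for example to compare them against what `FT.INFO` reports for each index.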
### Data Load
```text
Number of JSON documents loaded: 10
Number of HASH documents loaded: 10
```
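
JSON documents store the embedding as a plain array of numbers, while hashes store it as a binary blob of 32-bit floats. The sketch below shows that packing step and its inverse in isolation. The helper names `toFloat32Buffer` and `fromFloat32Buffer` are hypothetical; app.js performs the packing inline and never needs to unpack.

```javascript
// Pack a plain number array into a Buffer of little-endian 32-bit floats
// (on typical platforms), the form used for the vector field of a hash.
function toFloat32Buffer(vector) {
  return Buffer.from(new Float32Array(vector).buffer);
}

// Unpack such a Buffer back into a number array. The bytes are copied into a
// fresh ArrayBuffer first so the Float32Array view always starts at offset 0.
function fromFloat32Buffer(buf) {
  const copy = new Uint8Array(buf).buffer;
  return Array.from(new Float32Array(copy));
}

// Values chosen to be exactly representable as 32-bit floats.
const packed = toFloat32Buffer([0.5, -1.25, 2]);
console.log(packed.length);              // 4 bytes per element
console.log(fromFloat32Buffer(packed));
```

A 1536-dimension embedding therefore occupies 6144 bytes in a hash field. Values that are not exactly representable as 32-bit floats lose some precision in the round trip.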
### Vector Search Scenario #1
```text
Scenario: JSON docs, FLAT index, Top 2 KNN, Sports topic input

key: jsonDoc:6
content: O'Sullivan commits to Dublin race Sonia O'Sullivan will seek to regain her title at the Bupa Great Ireland Run on 9 April in Dublin. The 35-year-old was beaten into fourth at last year's event, having won it a year earlier. "I understand she's had a solid winter's training down in Australia after recovering from a minor injury," said race director Matthew Turnbull. Mark Carroll, Irish record holder at 3km, 5km and 10km, will make his debut in the mass participation 10km race. Carroll has stepped up his form in recent weeks and in late January scored an impressive 3,000m victory over leading American Alan Webb in Boston. Carroll will be facing stiff competition from Australian Craig Mottram, winner in Dublin for the last two years.

key: jsonDoc:7
content: Hansen 'delays return until 2006' British triple jumper Ashia Hansen has ruled out a comeback this year after a setback in her recovery from a bad knee injury, according to reports. Hansen, the Commonwealth and European champion, has been sidelined since the European Cup in Poland in June 2004. It was hoped she would be able to return this summer, but the wound from the injury has been very slow to heal. Her coach Aston Moore told the Times: "We're not looking at any sooner than 2006, not as a triple jumper." Moore said Hansen may be able to return to sprinting and long jumping sooner, but there is no short-term prospect of her being involved again in her specialist event. "There was a problem with the wound healing and it set back her rehabilitation by about two months, but that has been solved and we can push ahead now," he said. "The aim is for her to get fit as an athlete - then we will start looking at sprinting and the long jump as an introduction back to the competitive arena." Moore said he is confident Hansen can make it back to top-level competition, though it is unclear if that will be in time for the Commonwealth Games in Melbourne next March, when she will be 34. "It's been a frustrating time for her, but it has not fazed her determination," he added.
```
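
Conceptually, a Top 2 KNN query against a FLAT index compares the query vector with every stored vector and keeps the two closest. The following is a brute-force sketch of that idea in plain JavaScript, using tiny two-dimensional vectors in place of 1536-dimension embeddings. The function names and sample data are illustrative assumptions, and this is not how Redis implements the search internally.

```javascript
// Euclidean (L2) distance between two equal-length vectors.
function l2Distance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Exhaustive k-nearest-neighbour search: score every document, sort ascending
// by distance, keep the first k.
function knn(docs, queryVec, k) {
  return docs
    .map(doc => ({ id: doc.id, score: l2Distance(doc.vector, queryVec) }))
    .sort((x, y) => x.score - y.score)
    .slice(0, k);
}

// Made-up sample data for illustration only.
const sampleDocs = [
  { id: 'jsonDoc:0', vector: [0, 0] },
  { id: 'jsonDoc:1', vector: [3, 4] },
  { id: 'jsonDoc:2', vector: [1, 0] },
];

const top2 = knn(sampleDocs, [1, 1], 2);
console.log(top2.map(hit => hit.id));
```

Sorting ascending by score mirrors the `SORTBY` on `__vector_score` in app.js: a smaller distance means a closer match.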
### Vector Search Scenario #2
```text
Scenario: HASH docs, HNSW index, Hybrid w/Top 2 KNN, Entertainment topic input

key: hashDoc:3
content: Slater to star in Broadway play Actor Christian Slater is stepping into the role of Tom in the Broadway revival of The Glass Menagerie. Slater, 35, is replacing actor Dallas Roberts in the Tennessee Williams drama, which opens next month. No reason was given for Roberts' departure. The role will be played by understudy Joey Collins until Slater joins the show. Slater won rave reviews for his recent performance in One Flew Over the Cuckoo's Nest in London's West End. He has also starred in a number of films, including Heathers, Robin Hood: Prince of Thieves and more recently Churchill: The Hollywood Years. Preview performances of The Glass Menagerie will begin at New York's Ethel Barrymore Theatre on Thursday. Philip Rinaldi, a spokesman for the show, said the play's 15 March opening date remains unchanged. The revival, directed by David Leveaux, will also star Jessica Lange as the domineering mother, Amanda Wingfield.
```
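
A hybrid query first narrows the candidates with a text filter and only then ranks the survivors by vector distance, which is presumably why this scenario returns a single document even though it asks for the top 2. The sketch below illustrates the idea with cosine distance. It assumes a simple case-insensitive substring match in place of the full-text matching Redis performs, and the function names and sample data are hypothetical.

```javascript
// Cosine distance: 1 minus the cosine of the angle between the two vectors.
function cosineDistance(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Filter by phrase first (simplified to a substring match), then rank the
// remaining documents by cosine distance and keep at most k.
function hybridSearch(docs, phrase, queryVec, k) {
  const needle = phrase.toLowerCase();
  return docs
    .filter(doc => doc.content.toLowerCase().includes(needle))
    .map(doc => ({ id: doc.id, score: cosineDistance(doc.vector, queryVec) }))
    .sort((x, y) => x.score - y.score)
    .slice(0, k);
}

// Made-up sample data for illustration only.
const hashDocs = [
  { id: 'hashDoc:1', content: 'Film awards announced', vector: [1, 0] },
  { id: 'hashDoc:3', content: 'Actor Christian Slater joins Broadway play', vector: [1, 1] },
  { id: 'hashDoc:5', content: 'Theatre critics name best new play', vector: [2, 0] },
];

const hits = hybridSearch(hashDocs, 'Christian Slater', [1, 0], 2);
console.log(hits);
```

Only one sample document passes the filter, so the result holds one hit even with `k` set to 2.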
### LLM Question Answering Scenario #1
```text
Scenario: Ask the LLM a question which is outside of its knowledge base
Prompt: Is Sam Bankman-Fried's company, FTX, considered a well-managed company?

Response: As an AI language model, I cannot provide a personal opinion. However, FTX has been recognized as one of the fastest-growing cryptocurrency exchanges and has received positive reviews for its user-friendly interface, low fees, and innovative products. Additionally, Sam Bankman-Fried has been praised for his leadership and strategic decision-making, including FTX's recent acquisition of Blockfolio. Overall, FTX appears to be a well-managed company.
```
### LLM Question Answering Scenario #2
```text
Scenario: Vectorize the question, search Redis for relevant docs, then provide additional info from Redis to the LLM
Prompt: Using the information delimited by triple hyphens, answer this question: Is Sam Bankman-Fried's company, FTX, considered a well-managed company?

Context: ---Embattled Crypto Exchange FTX Files for Bankruptcy Nov. 11, 2022 On Monday, Sam Bankman-Fried, the chief executive of the cryptocurrency exchange FTX, took to Twitter to reassure his customers: “FTX is fine,” he wrote. “Assets are fine.” On Friday, FTX announced that it was filing for bankruptcy, capping an extraordinary week of corporate drama that has upended crypto markets, sent shock waves through an industry struggling to gain mainstream credibility and sparked government investigations that could lead to more damaging revelations or even criminal charges. In a statement on Twitter, the company said that Mr. Bankman-Fried had resigned, with John J. Ray III, a corporate turnaround specialist, taking over as chief executive. The speed of FTX’s downfall has left crypto insiders stunned. Just days ago, Mr. Bankman-Fried was considered one of the smartest leaders in the crypto industry, an influential figure in Washington who was lobbying to shape regulations. (abbreviated) ---

Response: No, FTX is not considered a well-managed company as it has filed for bankruptcy and owes as much as $8 billion to its creditors. The collapse of FTX has also destabilized the crypto industry and sparked government investigations into the company's practices. The bankruptcy filing included FTX, its U.S. arm, and Alameda Research, a trading firm that Mr. Bankman-Fried also founded, and has left investors and customers scrambling to salvage funds. The bankruptcy is a stunning fall from grace for Mr. Bankman-Fried, who was considered one of the smartest leaders in the crypto industry and an influential figure in Washington.
```
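
The only difference between the two Q&A scenarios is the prompt: the second one wraps the retrieved document in triple hyphens and appends it to the question. A minimal sketch of that prompt assembly follows. `buildContextPrompt` is a hypothetical helper, since app.js builds the string inline, and the context string here is a shortened stand-in for the document returned by Redis.

```javascript
// Hypothetical helper: combines a question with retrieved context, using the
// same wording and triple-hyphen delimiters as the prompt shown above.
function buildContextPrompt(question, context) {
  return `Using the information delimited by triple hyphens, answer this question: ${question}

Context: ---${context}---`;
}

const question = "Is Sam Bankman-Fried's company, FTX, considered a well-managed company?";
// Shortened stand-in for the top KNN match retrieved from Redis.
const context = 'Embattled Crypto Exchange FTX Files for Bankruptcy (abbreviated)';

console.log(buildContextPrompt(question, context));
```

Keeping the question in one variable avoids repeating it in both the plain prompt and the context-augmented prompt.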
232 changes: 232 additions & 0 deletions vector-database/nodejs/app.js
@@ -0,0 +1,232 @@
/**
 * @fileoverview Redis VSS w/AI integration examples
 */

import { createClient, SchemaFieldTypes, VectorAlgorithms } from 'redis';
import * as dotenv from 'dotenv';
import { Configuration, OpenAIApi } from 'openai';
import fsPromises from 'node:fs/promises';

/**
 * Creates a Redis client connection and executes the ping command on it.
 * @returns {Promise<_RedisClientType>}
 */
async function redisClient() {
    const client = createClient({ url: 'redis://localhost:6379' });
    await client.connect();
    const result = await client.ping();
    console.log('*** Redis Connection ***');
    console.log(result);
    return client;
}

/**
 * Submits a prompt to ChatGPT and returns the response
 * @param {OpenAIApi} openai
 * @param {string} prompt
 * @param {string} model
 * @returns {Promise<string>}
 */
async function getCompletion(openai, prompt, model = 'gpt-3.5-turbo') {
    const msg = [{ role: 'user', content: prompt }];
    const response = await openai.createChatCompletion({
        model: model,
        messages: msg,
        temperature: 0
    });
    return response.data.choices[0].message.content;
}

/**
 * Submits text to the OpenAI embeddings endpoint and returns its embedding (array of floats)
 * @param {OpenAIApi} openai
 * @param {string} content
 * @returns {Promise<number[]>}
 */
async function getEmbedding(openai, content) {
    const response = await openai.createEmbedding({
        model: 'text-embedding-ada-002',
        input: content
    });
    return response.data.data[0].embedding;
}

/**
 * Opens an OpenAI connection and pings for response
 * @returns {Promise<OpenAIApi>}
 */
async function openaiClient() {
    const config = new Configuration({ apiKey: process.env.OPENAI_API_KEY });
    const client = new OpenAIApi(config);
    const response = await getCompletion(client, 'ping');
    console.log('\n*** OpenAI Connection ***');
    console.log(response);
    return client;
}

/**
 * Builds 2 different indices in Redis, each with a vector and text field. One index is on vectors stored in
 * JSON objects; the other is on vectors stored in hashes.
 * @param {_RedisClientType} redis
 * @returns {Promise<void>}
 */
async function buildIndices(redis) {
    // Drop each index in its own try block so that a missing idx1 does not
    // prevent idx2 from being dropped.
    for (const name of ['idx1', 'idx2']) {
        try {
            await redis.ft.dropIndex(name);
        }
        catch (err) {
            // Ignored: the index does not exist yet (e.g. on a first run).
        }
    }

    const idx1 = await redis.ft.create('idx1', {
        '$.vector': {
            type: SchemaFieldTypes.VECTOR,
            AS: 'vector',
            ALGORITHM: VectorAlgorithms.FLAT,
            TYPE: 'FLOAT32',
            DIM: 1536,
            DISTANCE_METRIC: 'L2'
        },
        '$.content': {
            type: SchemaFieldTypes.TEXT,
            AS: 'content'
        }
    }, { ON: 'JSON', PREFIX: 'jsonDoc:' });

    const idx2 = await redis.ft.create('idx2', {
        'vector': {
            type: SchemaFieldTypes.VECTOR,
            ALGORITHM: VectorAlgorithms.HNSW,
            TYPE: 'FLOAT32',
            DIM: 1536,
            M: 48,
            DISTANCE_METRIC: 'COSINE'
        },
        'content': {
            type: SchemaFieldTypes.TEXT,
        }
    }, { ON: 'HASH', PREFIX: 'hashDoc:' });

    console.log('\n*** Indices Build ***');
    console.log(`idx1 (FLAT, L2, 1536, FLOAT32, JSON): ${idx1}`);
    console.log(`idx2 (HNSW, COSINE, 1536, FLOAT32, M=48, HASH): ${idx2}`);
}

/**
 * Loads text files into hash and JSON objects in Redis. The text of each file is vectorized and stored in that hash or
 * JSON.
 * @param {_RedisClientType} redis
 * @param {OpenAIApi} openai
 * @returns {Promise<void>}
 */
async function loadData(redis, openai) {
    let files = await fsPromises.readdir('./data');
    files = files.filter(file => file.endsWith('.txt'));
    let i = 0;

    for (const file of files) {
        let content = await fsPromises.readFile(`./data/${file}`, { encoding: 'utf8' });
        // Replace each line-break character with a space so the document is a single line.
        content = content.replace(/[\r\n]/gm, " ");
        const vector = await getEmbedding(openai, content);

        // JSON stores the vector as an array of numbers; hashes need a binary FLOAT32 blob.
        await redis.json.set(`jsonDoc:${i}`, '$', { "content": content, "vector": vector });
        await redis.hSet(`hashDoc:${i}`, { content: content, vector: Buffer.from(new Float32Array(vector).buffer) });
        i++;
    }

    console.log('\n*** Data Load ***');
    console.log(`Number of JSON documents loaded: ${i}`);
    console.log(`Number of HASH documents loaded: ${i}`);
}

/**
 * Executes 2 different VSS scenarios using 2 different index and object types.
 * @param {_RedisClientType} redis
 * @param {OpenAIApi} openai
 * @returns {Promise<void>}
 */
async function vectorSearch(redis, openai) {
    // Vector search scenario #1
    let topic = "Teenager LaShawn Merritt ran the third fastest indoor 400m of all time at the Fayetteville Invitational meeting."
    let vector = await getEmbedding(openai, topic);
    let result = await redis.ft.search('idx1', '*=>[KNN 2 @vector $query_vec]', {
        PARAMS: { query_vec: Buffer.from(new Float32Array(vector).buffer) },
        DIALECT: 2,
        SORTBY: {
            BY: '__vector_score',
            DIRECTION: 'ASC'
        }
    });
    console.log('\n*** Vector Search #1 ***');
    console.log('Scenario: JSON docs, FLAT index, Top 2 KNN, Sports topic input');
    for (const doc of result.documents) {
        console.log(`\nkey: ${doc.id}`);
        console.log(`content: ${doc.value.content}`);
    }

    // Vector search scenario #2
    topic = "The History Boys by Alan Bennett has been named best new play in the Critics' Circle Theatre Awards."
    vector = await getEmbedding(openai, topic);
    result = await redis.ft.search('idx2', '(@content:"Christian Slater")=>[KNN 2 @vector $query_vec]', {
        PARAMS: { query_vec: Buffer.from(new Float32Array(vector).buffer) },
        DIALECT: 2,
        SORTBY: {
            BY: '__vector_score',
            DIRECTION: 'ASC'
        }
    });
    console.log('\n*** Vector Search #2 ***');
    console.log('Scenario: HASH docs, HNSW index, Hybrid w/Top 2 KNN, Entertainment topic input');
    for (const doc of result.documents) {
        console.log(`\nkey: ${doc.id}`);
        console.log(`content: ${doc.value.content}`);
    }
}

/**
 * Executes a ChatGPT prompt on data that is outside of ChatGPT's knowledge cut-off date. Then, the prompt is vectorized
 * and Redis is searched for relevant documents that provide context. The ChatGPT prompt is then re-executed with that
 * additional context.
 * @param {_RedisClientType} redis
 * @param {OpenAIApi} openai
 * @returns {Promise<void>}
 */
async function qna(redis, openai) {
    let prompt = "Is Sam Bankman-Fried's company, FTX, considered a well-managed company?";

    console.log('\n*** AI Q&A #1 ***');
    console.log('Scenario: Ask the AI a question which is outside of its knowledge base');
    console.log(`Prompt: ${prompt}`);
    console.log(`Response: ${await getCompletion(openai, prompt)}`);

    console.log('\n*** AI Q&A #2 ***');
    console.log('Scenario: Vectorize the question, search Redis for relevant docs, then provide additional info from Redis to the AI');
    const vector = await getEmbedding(openai, prompt);
    const result = await redis.ft.search('idx1', '*=>[KNN 1 @vector $query_vec]', {
        PARAMS: { query_vec: Buffer.from(new Float32Array(vector).buffer) },
        DIALECT: 2,
        SORTBY: {
            BY: '__vector_score',
            DIRECTION: 'ASC'
        }
    });
    // The template literal below is intentionally left unindented: its line breaks are part of the prompt.
    prompt = `Using the information delimited by triple hyphens, answer this question: Is Sam Bankman-Fried's company, FTX, considered a well-managed company?

Context: ---${result.documents[0].value.content}---`
    console.log(`Prompt: ${prompt}`);
    console.log(`\nResponse: ${await getCompletion(openai, prompt)}`);
}

/**
 * Main function that executes all the functions above.
 */
(async () => {
    dotenv.config();
    const redis = await redisClient();
    const openai = await openaiClient();
    await buildIndices(redis);
    await loadData(redis, openai);
    await vectorSearch(redis, openai);
    await qna(redis, openai);
    await redis.disconnect();
})();
11 changes: 11 additions & 0 deletions vector-database/nodejs/data/001.txt
@@ -0,0 +1,11 @@
Ad sales boost Time Warner profit

Quarterly profits at US media giant TimeWarner jumped 76% to $1.13bn (£600m) for the three months to December, from $639m year-earlier.

The firm, which is now one of the biggest investors in Google, benefited from sales of high-speed internet connections and higher advert sales. TimeWarner said fourth quarter sales rose 2% to $11.1bn from $10.9bn. Its profits were buoyed by one-off gains which offset a profit dip at Warner Bros, and less users for AOL.

Time Warner said on Friday that it now owns 8% of search-engine Google. But its own internet business, AOL, had has mixed fortunes. It lost 464,000 subscribers in the fourth quarter profits were lower than in the preceding three quarters. However, the company said AOL's underlying profit before exceptional items rose 8% on the back of stronger internet advertising revenues. It hopes to increase subscribers by offering the online service free to TimeWarner internet customers and will try to sign up AOL's existing customers for high-speed broadband. TimeWarner also has to restate 2000 and 2003 results following a probe by the US Securities Exchange Commission (SEC), which is close to concluding.

Time Warner's fourth quarter profits were slightly better than analysts' expectations. But its film division saw profits slump 27% to $284m, helped by box-office flops Alexander and Catwoman, a sharp contrast to year-earlier, when the third and final film in the Lord of the Rings trilogy boosted results. For the full-year, TimeWarner posted a profit of $3.36bn, up 27% from its 2003 performance, while revenues grew 6.4% to $42.09bn. "Our financial performance was strong, meeting or exceeding all of our full-year objectives and greatly enhancing our flexibility," chairman and chief executive Richard Parsons said. For 2005, TimeWarner is projecting operating earnings growth of around 5%, and also expects higher revenue and wider profit margins.

TimeWarner is to restate its accounts as part of efforts to resolve an inquiry into AOL by US market regulators. It has already offered to pay $300m to settle charges, in a deal that is under review by the SEC. The company said it was unable to estimate the amount it needed to set aside for legal reserves, which it previously set at $500m. It intends to adjust the way it accounts for a deal with German music publisher Bertelsmann's purchase of a stake in AOL Europe, which it had reported as advertising revenue. It will now book the sale of its stake in AOL Europe as a loss on the value of that stake.