Sahha x RAG

This project uses Sahha demo data to run LLM inference over time-series health data. Inference is currently supported over daily and week-long windows.

What we're building

We are building an app that takes text files, embeds their contents as vectors, stores the vectors in Pinecone, and supports semantic search over the data.

For anyone wondering what semantic search is, here is an overview (taken directly from ChatGPT (GPT-4)):

Semantic search refers to a search approach that understands the user's intent and the contextual meaning of search queries, instead of merely matching keywords.

It uses natural language processing and machine learning to interpret the semantics, or meaning, behind queries. This results in more accurate and relevant search results. Semantic search can consider user intent, query context, synonym recognition, and natural language understanding. Its applications range from web search engines to personalized recommendation systems.
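To make the idea concrete, here is a minimal sketch of the ranking step behind semantic search: the query and the documents are represented as vectors, and results are ordered by cosine similarity rather than by keyword overlap. The three-dimensional vectors and the names cosineSimilarity and rankBySimilarity are illustrative assumptions; in this app the vectors would come from an embedding model and the search itself would be handled by Pinecone.

```typescript
// Toy illustration of vector ranking; not the project's actual search code.
type Doc = { id: string; vector: number[] };

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the topK documents most similar to the query, best match first.
function rankBySimilarity(
  query: number[],
  docs: Doc[],
  topK: number
): { id: string; score: number }[] {
  return docs
    .map((d) => ({ id: d.id, score: cosineSimilarity(query, d.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Hypothetical hand-made vectors standing in for real embeddings.
const sampleDocs: Doc[] = [
  { id: "sleep-summary", vector: [0.9, 0.1, 0.0] },
  { id: "step-count", vector: [0.1, 0.9, 0.1] },
  { id: "heart-rate", vector: [0.2, 0.2, 0.9] },
];

const ranked = rankBySimilarity([0.8, 0.2, 0.1], sampleDocs, 2);
console.log(ranked.map((r) => r.id));
```

The query vector points mostly along the first dimension, so the document whose vector does the same ranks first, even though no keywords are compared at any point.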

Running the app

In this section I will walk you through how to deploy and run this app.

Prerequisites

To run this app, you need the following:

An OpenAI API key
A Pinecone API key

Up and running

To run the app locally, follow these steps:

1. Clone this repo.
2. Change into the directory and install the dependencies using either npm or Yarn.
3. Copy .example.env.local to a new file called .env.local and update it with your API keys and environment. Make sure the environment is one actually assigned to you by Pinecone, such as us-west4-gcp-free.
4. (Optional) Add your own custom text or Markdown files to the /documents folder.
5. Run the app:

npm run dev
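Before starting the dev server, it can help to confirm that .env.local is complete, since missing or blank keys are a common cause of start-up failures. The check below is a sketch, not code from this repo: the variable names OPENAI_API_KEY, PINECONE_API_KEY, and PINECONE_ENVIRONMENT are assumptions, so match them to whatever .example.env.local actually defines.

```typescript
// Hypothetical variable names; check .example.env.local for the real ones.
const REQUIRED_ENV_VARS = [
  "OPENAI_API_KEY",
  "PINECONE_API_KEY",
  "PINECONE_ENVIRONMENT",
];

// Return the names of required variables that are unset or blank.
function findMissingEnv(env: { [key: string]: string | undefined }): string[] {
  return REQUIRED_ENV_VARS.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });
}

// Example with a plain object; in the app you would pass process.env.
const missingVars = findMissingEnv({
  OPENAI_API_KEY: "sk-example",
  PINECONE_API_KEY: "",
});
console.log(missingVars);
```

Taking the environment as a parameter keeps the function easy to test; a caller could throw an error listing the missing names before any API client is created.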

Need to know

When you create the embeddings and the index, the index can take 2-4 minutes to fully initialize. The utils include a setTimeout of 180 seconds that waits for the index to be created.

If initialization takes longer than that, the first attempt to create the embeddings will fail. If this happens, watch the index status in the Pinecone console until creation finishes, then run the function again.
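A fixed wait can be too short or needlessly long. One alternative is to poll until the index reports that it is ready, with an overall timeout. The sketch below takes the readiness check as a parameter, because the exact Pinecone client call depends on the SDK version you have installed. The names waitUntilReady and checkReady and the default timings are assumptions for illustration, not the logic in this repo's utils.

```typescript
// Poll `checkReady` until it resolves to true or `timeoutMs` elapses.
// Resolves to true if ready in time, false on timeout.
async function waitUntilReady(
  checkReady: () => Promise<boolean>,
  timeoutMs: number = 180000,
  intervalMs: number = 5000
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await checkReady()) return true;
    // Give up if the next poll would land after the deadline.
    if (Date.now() + intervalMs > deadline) return false;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Stand-in check that reports ready on the third call.
let pollCount = 0;
const fakeCheck = async (): Promise<boolean> => {
  pollCount += 1;
  return pollCount >= 3;
};

waitUntilReady(fakeCheck, 1000, 10).then((ready) => {
  console.log(ready ? "index ready" : "timed out");
});
```

In real use, checkReady would wrap whatever call your Pinecone client provides for reading index status, and a false result would be the cue to check the console and retry.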

The base of this project was guided by this Node.js tutorial, restructured and ported over to Next.js. You can also follow them here on Twitter!

Getting your data

I recommend checking out GPT Repository Loader, which makes it simple to turn any GitHub repo into a text format that preserves the file structure and file contents. The output is easy to chop up and save into Pinecone using this codebase.
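As a rough illustration of the "chop up" step, the sketch below splits a long string into overlapping fixed-size chunks before embedding. The function name chunkText and the size and overlap values are assumptions for illustration, not values taken from this codebase.

```typescript
// Split text into chunks of at most `size` characters, where each chunk
// starts with the last `overlap` characters of the previous one.
function chunkText(text: string, size: number, overlap: number): string[] {
  if (size <= 0) throw new Error("size must be positive");
  if (overlap < 0 || overlap >= size) {
    throw new Error("overlap must be in the range [0, size)");
  }
  const chunks: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    // Stop once a chunk has reached the end of the text.
    if (start + size >= text.length) break;
  }
  return chunks;
}

const chunkExample = chunkText("abcdefghij", 4, 1);
console.log(chunkExample); // [ 'abcd', 'defg', 'ghij' ]
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, at the cost of storing some text twice.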
