Doxie is an experiment in LLM-based information retrieval. Or in plain words: Doxie uses ChatGPT to answer questions based on the content of one or more websites, documents, or whatever other text content you want to feed it. It exposes a simple, ChatGPT-like interface.
Doxie implements the most basic RAG loop imaginable, with simple context management. Have a look at these source files:
- Scraping and preparing data from a specific source: `src/server/berufslexikon.ts`
- Embedding data:
  - `src/server/embedder.ts`: chunking and embedding
  - `src/server/embedder-cli.ts`: takes the output of the scraper and converts it into an `xxx.embeddings.bin` file to be loaded into the vector database
- Vector database, loads a file generated by `embedder-cli.ts` into a collection and lets you query it: `src/server/rag.ts`
- Chat session management, keeps track of chat sessions, expands queries for RAG, and submits user queries plus RAG context to GPT-3.5-turbo to get hopefully meaningful answers with source citations: `src/server/chatsessions.ts`
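To make that loop concrete, here is a minimal TypeScript sketch of the whole round trip: embed the query, fetch the most similar chunks, and ask GPT-3.5-turbo to answer from them with citations. It uses the `openai` npm package; the `Chunk` type, `cosine` helper, and `answer` function are illustrative stand-ins, not Doxie's actual API, and the query expansion done by `chatsessions.ts` is omitted:

```typescript
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Illustrative in-memory "collection"; Doxie's real one lives in src/server/rag.ts.
interface Chunk {
    text: string;
    source: string;
    embedding: number[];
}

function cosine(a: number[], b: number[]): number {
    let dot = 0, normA = 0, normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

async function answer(query: string, collection: Chunk[], k = 5): Promise<string> {
    // 1. Embed the user query.
    const embedded = await openai.embeddings.create({ model: "text-embedding-ada-002", input: query });
    const queryEmbedding = embedded.data[0].embedding;

    // 2. Retrieve the k chunks most similar to the query.
    const context = [...collection]
        .sort((a, b) => cosine(b.embedding, queryEmbedding) - cosine(a.embedding, queryEmbedding))
        .slice(0, k);

    // 3. Ask GPT-3.5-turbo to answer from the retrieved context, citing sources.
    const completion = await openai.chat.completions.create({
        model: "gpt-3.5-turbo",
        messages: [
            { role: "system", content: "Answer only from the provided context and cite the sources you used." },
            {
                role: "user",
                content: context.map((c) => `[${c.source}]\n${c.text}`).join("\n\n") + `\n\nQuestion: ${query}`,
            },
        ],
    });
    return completion.choices[0].message.content ?? "";
}
```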
## Requirements

To run Doxie you need the following software installed on your system:

- Node.js (the dev workflow uses `npm run dev`)
- Docker (the data files live under `./docker/data/`)
## Data

Doxie currently does not have a proper ingestion pipeline and is hard-coded for the test use case. This will probably, maybe, be amended in the coming days/weeks, depending on my mood.
If you want to play around with Doxie on some real-world data, run the `./download-testdata.sh` script in the root folder. It will download a data set with embeddings generated from the AMS Berufslexikon to `./docker/data/berufslexikon.embeddings.bin`. You can then follow the instructions in the Development section below.
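For a feel of what loading such a file into an in-memory collection might look like, here is a TypeScript sketch. The binary layout assumed below (a length-prefixed UTF-8 text followed by a float32 vector per record) is purely an illustration; the actual format is whatever `embedder-cli.ts` writes:

```typescript
import * as fs from "fs";

// ASSUMED record layout, for illustration only - the real format is defined
// by src/server/embedder-cli.ts and may differ:
// [u32 text length][UTF-8 text][u32 dimensions][dimensions x f32 embedding]
interface EmbeddedChunk {
    text: string;
    embedding: number[];
}

function loadEmbeddings(path: string): EmbeddedChunk[] {
    const buf = fs.readFileSync(path);
    const chunks: EmbeddedChunk[] = [];
    let offset = 0;
    while (offset < buf.length) {
        const textLength = buf.readUInt32LE(offset);
        offset += 4;
        const text = buf.toString("utf8", offset, offset + textLength);
        offset += textLength;
        const dims = buf.readUInt32LE(offset);
        offset += 4;
        const embedding: number[] = [];
        for (let i = 0; i < dims; i++) {
            embedding.push(buf.readFloatLE(offset));
            offset += 4;
        }
        chunks.push({ text, embedding });
    }
    return chunks;
}

// Usage: const collection = loadEmbeddings("./docker/data/berufslexikon.embeddings.bin");
```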
## Development

Run `npm run dev` to start the local development setup.
In VS Code, run the `dev` launch configuration, which will attach the debugger to the server, spawn a browser window, and also attach to that for frontend debugging.
## Deployment

- Deploy backend & frontend: `./publish.sh server`
- Deploy just the frontend: `./publish.sh`