Commit

Merge branch 'main' into patch-3
adarsh-jha-dev committed Nov 2, 2023
2 parents 461901d + c9dd219 commit b5437d0
Showing 25 changed files with 422 additions and 95 deletions.
4 changes: 1 addition & 3 deletions CONTRIBUTING.md
@@ -2,8 +2,6 @@

Thank you for choosing to contribute to DocsGPT! We are all very grateful!

-### [🎉 Join the Hacktoberfest with DocsGPT and Earn a Free T-shirt! 🎉](https://github.com/arc53/DocsGPT/blob/main/HACKTOBERFEST.md)

# We accept different types of contributions

📣 **Discussions** - Engage in conversations, start new topics, or help answer questions.
@@ -73,7 +71,7 @@ Here's a step-by-step guide on how to contribute to DocsGPT:
- Before you make any changes, make sure that your fork is in sync to avoid merge conflicts using:
```shell
git remote add upstream https://github.com/arc53/DocsGPT.git
-git pull upstream master
+git pull upstream main
```

4. **Create and Switch to a New Branch:**
22 changes: 10 additions & 12 deletions README.md
@@ -18,27 +18,25 @@ Say goodbye to time-consuming manual searches, and let <strong><a href="https://
<a href="https://github.com/arc53/DocsGPT">![link to main GitHub showing Forks number](https://img.shields.io/github/forks/arc53/docsgpt?style=social)</a>
<a href="https://github.com/arc53/DocsGPT/blob/main/LICENSE">![link to license file](https://img.shields.io/github/license/arc53/docsgpt)</a>
<a href="https://discord.gg/n5BX8dh8rU">![link to discord](https://img.shields.io/discord/1070046503302877216)</a>
-<a href="https://twitter.com/ATushynski">![X (formerly Twitter) URL](https://img.shields.io/twitter/url?url=https%3A%2F%2Ftwitter.com%2FATushynski)</a>
+<a href="https://twitter.com/ATushynski">![X (formerly Twitter) URL](https://img.shields.io/twitter/follow/ATushynski)</a>


</div>

-### Production Support / Help for companies:
+### Production Support / Help for Companies:

We're eager to provide personalized assistance when deploying your DocsGPT to a live environment.

- [Book Demo :wave:](https://airtable.com/appdeaL0F1qV8Bl2C/shrrJF1Ll7btCJRbP)
- [Send Email :email:](mailto:contact@arc53.com?subject=DocsGPT%20support%2Fsolutions)

-### [:tada: Join the Hacktoberfest with DocsGPT and Earn a Free T-shirt! :tada:](https://github.com/arc53/DocsGPT/blob/main/HACKTOBERFEST.md)

![video-example-of-docs-gpt](https://d3dg1063dc54p9.cloudfront.net/videos/demov3.gif)

## Roadmap

You can find our roadmap [here](https://github.com/orgs/arc53/projects/2). Please don't hesitate to contribute or create issues, it helps us improve DocsGPT!

-## Our Open-Source models optimized for DocsGPT:
+## Our Open-Source Models Optimized for DocsGPT:

| Name | Base Model | Requirements (or similar) |
| --------------------------------------------------------------------- | ----------- | ------------------------- |
@@ -52,7 +50,7 @@ If you don't have enough resources to run it, you can use bitsnbytes to quantize

![Main features of DocsGPT showcasing six main features](https://user-images.githubusercontent.com/17906039/220427472-2644cff4-7666-46a5-819f-fc4a521f63c7.png)

-## Useful links
+## Useful Links

- :mag: :fire: [Live preview](https://docsgpt.arc53.com/)

@@ -66,7 +64,7 @@ If you don't have enough resources to run it, you can use bitsnbytes to quantize

- :house: :closed_lock_with_key: [How to host it locally (so all data will stay on-premises)](https://docs.docsgpt.co.uk/Guides/How-to-use-different-LLM)

-## Project structure
+## Project Structure

- Application - Flask app (main application).

@@ -104,9 +102,9 @@ Otherwise, refer to this Guide:

To stop, just run `Ctrl + C`.

-## Development environments
+## Development Environments

-### Spin up mongo and redis
+### Spin up Mongo and Redis

For development, only two containers are used from [docker-compose.yaml](https://github.com/arc53/DocsGPT/blob/main/docker-compose.yaml) (by deleting all services except for Redis and Mongo).
See file [docker-compose-dev.yaml](./docker-compose-dev.yaml).
@@ -118,7 +116,7 @@ docker compose -f docker-compose-dev.yaml build
docker compose -f docker-compose-dev.yaml up -d
```

-### Run the backend
+### Run the Backend

Make sure you have Python 3.10 or 3.11 installed.

@@ -153,7 +151,7 @@ pip install -r requirements.txt
4. Run the app using `flask --app application/app.py run --host=0.0.0.0 --port=7091`.
5. Start worker with `celery -A application.app.celery worker -l INFO`.

-### Start frontend
+### Start Frontend

Make sure you have Node version 16 or higher.

@@ -176,7 +174,7 @@ Please refer to the [CONTRIBUTING.md](CONTRIBUTING.md) file for information abou

We as members, contributors, and leaders, pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation. Please refer to the [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md) file for more information about contributing.

-## Many Thanks To Our Contributors
+## Many Thanks To Our Contributors

<a href="https://github.com/arc53/DocsGPT/graphs/contributors" alt="View Contributors">
<img src="https://contrib.rocks/image?repo=arc53/DocsGPT" alt="Contributors" />
2 changes: 1 addition & 1 deletion application/core/settings.py
@@ -19,7 +19,7 @@ class Settings(BaseSettings):
API_URL: str = "http://localhost:7091" # backend url for celery worker

API_KEY: str = None # LLM api key
-EMBEDDINGS_KEY: str = None # api key for embeddings (if using openai, just copy API_KEY
+EMBEDDINGS_KEY: str = None # api key for embeddings (if using openai, just copy API_KEY)
OPENAI_API_BASE: str = None # azure openai api base url
OPENAI_API_VERSION: str = None # azure openai api version
AZURE_DEPLOYMENT_NAME: str = None # azure deployment name for answering
17 changes: 17 additions & 0 deletions application/worker.py
@@ -20,17 +20,34 @@
pass


# Define a function to extract metadata from a given filename.
def metadata_from_filename(title):
    store = '/'.join(title.split('/')[1:3])
    return {'title': title, 'store': store}


# Define a function to generate a random string of a given length.
def generate_random_string(length):
    return ''.join([string.ascii_letters[i % 52] for i in range(length)])

current_dir = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

# Define the main function for ingesting and processing documents.
def ingest_worker(self, directory, formats, name_job, filename, user):
    """
    Ingest and process documents.
    Args:
        self: Reference to the instance of the task.
        directory (str): Specifies the directory for ingesting ('inputs' or 'temp').
        formats (list of str): List of file extensions to consider for ingestion (e.g., [".rst", ".md"]).
        name_job (str): Name of the job for this ingestion task.
        filename (str): Name of the file to be ingested.
        user (str): Identifier for the user initiating the ingestion.
    Returns:
        dict: Information about the completed ingestion task, including input parameters and a "limited" flag.
    """
    # directory = 'inputs' or 'temp'
    # formats = [".rst", ".md"]
    input_files = None
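
Reviewer sketch (not part of the commit): the two helpers in the hunk above can be exercised on their own. `metadata_from_filename` takes the second and third path components as the store, and `generate_random_string`, as written, is deterministic: it cycles through `string.ascii_letters` instead of drawing random characters. The sample path and the `random_string` alternative below are assumptions added for illustration only.

```python
import secrets
import string


def metadata_from_filename(title):
    # Same logic as the hunk above: path components 1 and 2 form the store.
    store = '/'.join(title.split('/')[1:3])
    return {'title': title, 'store': store}


def generate_random_string(length):
    # Same logic as the hunk above: deterministic, cycles through the 52 letters.
    return ''.join([string.ascii_letters[i % 52] for i in range(length)])


def random_string(length):
    # Hypothetical alternative (not in the commit): picks each letter at random.
    return ''.join(secrets.choice(string.ascii_letters) for _ in range(length))


if __name__ == "__main__":
    # Hypothetical sample path, chosen only to show the slicing.
    print(metadata_from_filename('inputs/local/docs/readme.md'))
    # The same length always yields the same string.
    print(generate_random_string(5))
    print(random_string(5))
```

If the generated string is meant to avoid collisions between jobs, something along the lines of `random_string` may be closer to what the comment describes.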
2 changes: 1 addition & 1 deletion codecov.yml
@@ -1,2 +1,2 @@
ignore:
- "*/tests/*
- "*/tests/*"
2 changes: 1 addition & 1 deletion docs/pages/Deploying/Railway-Deploying.md
@@ -7,7 +7,7 @@ Here's a step-by-step guide on how to host DocsGPT on Railway App.



-At first Clone and setup the project locally to run , test and Modify.
+At first Clone and set up the project locally to run , test and Modify.



7 changes: 6 additions & 1 deletion docs/pages/Developing/API-docs.md
@@ -55,6 +55,8 @@ This endpoint will make sure documentation is loaded on the server (just run it

**Request:**

**Method**: `POST`

**Headers**: Content-Type should be set to `application/json; charset=utf-8`

**Request Body**: JSON object with the field:
@@ -116,6 +118,7 @@ This endpoint is used to upload a file that needs to be trained, response is JSO
**Request:**

**Method**: `POST`

**Request Body**: A multipart/form-data form with file upload and additional fields, including `user` and `name`.

HTML example:
@@ -143,7 +146,9 @@ JSON response with a status and a task ID that can be used to check the task's p
This endpoint is used to get the status of a task (`task_id`) from `/api/upload`

**Request:**
-**Method**: `GE`T

+**Method**: `GET`

**Query Parameter**: `task_id` (task ID to check)
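
Reviewer sketch (not part of the commit): the request described here, a `GET` with `task_id` as a query parameter, could look roughly like this in Python. The endpoint path is not visible in this hunk, so `TASK_STATUS_PATH` is a placeholder assumption, as is the expectation that the response body is JSON. The base URL in the test is the `API_URL` default shown in `settings.py`.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the real path is not shown in this hunk; this value is a placeholder.
TASK_STATUS_PATH = "/api/task_status"


def task_status_url(base_url, task_id):
    # GET request: task_id travels as a query parameter, not in a request body.
    query = urlencode({"task_id": task_id})
    return f"{base_url.rstrip('/')}{TASK_STATUS_PATH}?{query}"


def fetch_task_status(base_url, task_id):
    # Sends the GET request; assumes the server answers with a JSON document.
    with urlopen(task_status_url(base_url, task_id)) as response:
        return json.load(response)
```

Building the query with `urlencode` keeps task IDs with unusual characters from breaking the URL.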

**Sample JavaScript Fetch Request:**
8 changes: 4 additions & 4 deletions docs/pages/Guides/Customising-prompts.md
@@ -1,25 +1,25 @@
# Customizing the Main Prompt

-To customize the main prompt for DocsGPT, follow these steps:
+Customizing the main prompt for DocsGPT gives you the ability to tailor the AI's responses to your specific requirements. By modifying the prompt text, you can achieve more accurate and relevant answers. Here's how you can do it:

1. Navigate to `/application/prompts/combine_prompt.txt`.

-2. Edit the `combine_prompt.txt` file to modify the prompt text. You can experiment with different phrasings and structures to see how the model responds.
+2. Open the `combine_prompt.txt` file and modify the prompt text to suit your needs. You can experiment with different phrasings and structures to observe how the model responds. The main prompt serves as guidance to the AI model on how to generate responses.

## Example Prompt Modification

**Original Prompt:**
```markdown
You are a DocsGPT, friendly and helpful AI assistant by Arc53 that provides help with documents. You give thorough answers with code examples if possible.
-Use the following pieces of context to help answer the users question. If its not relevant to the question, provide friendly responses.
+Use the following pieces of context to help answer the users question. If it's not relevant to the question, provide friendly responses.
You have access to chat history, and can use it to help answer the question.
When using code examples, use the following format:

(code)
{summaries}
```


Feel free to customize the prompt to align it with your specific use case or the kind of responses you want from the AI. For example, you can focus on specific document types, industries, or topics to get more targeted results.

## Conclusion

32 changes: 18 additions & 14 deletions docs/pages/Guides/How-to-train-on-other-documentation.md
@@ -1,43 +1,47 @@
## How to train on other documentation
This AI can use any documentation, but first it needs to be prepared for similarity search.

This AI can utilize any documentation, but it requires preparation for similarity search. Follow these steps to get your documentation ready:

**Step 1: Prepare Your Documentation**
![video-example-of-how-to-do-it](https://d3dg1063dc54p9.cloudfront.net/videos/how-to-vectorise.gif)

Start by going to `/scripts/` folder.

If you open this file, you will see that it uses RST files from the folder to create a `index.faiss` and `index.pkl`.

It currently uses OPEN_AI to create the vector store, so make sure your documentation is not too big. Pandas cost me around $3-$4.
It currently uses OPENAI to create the vector store, so make sure your documentation is not too large. Using Pandas cost me around $3-$4.

You can usually find documentation on Github in `docs/` folder for most open-source projects.
You can typically find documentation on GitHub in the `docs/` folder for most open-source projects.

### 1. Find documentation in .rst/.md and create a folder with it in your scripts directory
### 1. Find documentation in .rst/.md format and create a folder with it in your scripts directory.
- Name it `inputs/`.
- Put all your .rst/.md files in there.
- The search is recursive, so you don't need to flatten them.

If there are no .rst/.md files just convert whatever you find to .txt file and feed it. (don't forget to change the extension in script)
If there are no .rst/.md files, convert whatever you find to a .txt file and feed it. (Don't forget to change the extension in the script).

### 2. Create .env file in `scripts/` folder
And write your OpenAI API key inside
`OPENAI_API_KEY=<your-api-key>`.
### Step 2: Configure Your OpenAI API Key
1. Create a .env file in the scripts/ folder.
- Add your OpenAI API key inside: OPENAI_API_KEY=<your-api-key>.

### 3. Run scripts/ingest.py
### Step 3: Run the Ingestion Script

`python ingest.py ingest`

It will tell you how much it will cost.
It will provide you with the estimated cost.

### 4. Move `index.faiss` and `index.pkl` generated in `scripts/output` to `application/` folder.
### Step 4: Move `index.faiss` and `index.pkl` generated in `scripts/output` to `application/` folder.
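
Reviewer sketch (not part of the commit): Step 4 is a manual file move, which could be scripted along these lines. The function name `move_index_files` is hypothetical; the file names and folders are the ones named in the step.

```python
import shutil
from pathlib import Path


def move_index_files(output_dir, application_dir):
    # Moves the two generated vector-store files into the application folder.
    moved = []
    for name in ("index.faiss", "index.pkl"):
        source = Path(output_dir) / name
        if not source.exists():
            raise FileNotFoundError(f"{source} not found; run the ingestion step first")
        target = Path(application_dir) / name
        shutil.move(str(source), str(target))
        moved.append(target)
    return moved

# Example call from the repository root (paths taken from Step 4):
# move_index_files("scripts/output", "application")
```

Failing early when a file is missing makes it obvious that the ingestion script has not produced its output yet.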


### 5. Run web app
Once you run it will use new context that is relevant to your documentation.
Make sure you select default in the dropdown in the UI.
### Step 5: Run the Web App
Once you run it, it will use new context relevant to your documentation.Make sure you select default in the dropdown in the UI.

## Customization
You can learn more about options while running ingest.py by running:
- Make sure you select 'default' from the dropdown in the UI.

## Customization
You can learn more about options while running ingest.py by executing:
`python ingest.py --help`
| Options | |
|:--------------------------------:|:------------------------------------------------------------------------------------------------------------------------------:|
5 changes: 4 additions & 1 deletion docs/pages/Guides/How-to-use-different-LLM.md
@@ -19,11 +19,14 @@ You can omit the keys if users provide their own. Ensure you set `LLM_NAME` and
## Step 2: Choose Your Models

**Options for `LLM_NAME`:**
-- OpenAI ([More details](https://platform.openai.com/docs/models))
+- openai ([More details](https://platform.openai.com/docs/models))
- anthropic ([More details](https://docs.anthropic.com/claude/reference/selecting-a-model))
- manifest ([More details](https://python.langchain.com/docs/integrations/llms/manifest))
- cohere ([More details](https://docs.cohere.com/docs/llmu))
- Arc53/DocsGPT-7B ([More details](https://huggingface.co/Arc53/DocsGPT-7B))
- Arc53/docsgpt-14b ([More details](https://huggingface.co/Arc53/docsgpt-14b))
- Arc53/docsgpt-7b-falcon ([More details](https://huggingface.co/Arc53/docsgpt-7b-falcon))
- Arc53/docsgpt-40b-falcon ([More details](https://huggingface.co/Arc53/docsgpt-40b-falcon))
- llama.cpp ([More details](https://python.langchain.com/docs/integrations/llms/llamacpp))

**Options for `EMBEDDINGS_NAME`:**
2 changes: 0 additions & 2 deletions docs/pages/index.mdx
@@ -25,8 +25,6 @@ DocsGPT 🦖 is an innovative open-source tool designed to simplify the retrieva

Try it yourself: [https://docsgpt.arc53.com/](https://docsgpt.arc53.com/)

-### [🎉 Join the Hacktoberfest with DocsGPT and Contribute to Earn a Free T-shirt!](https://github.com/arc53/DocsGPT/blob/main/HACKTOBERFEST.md)

<Cards
num={3}
children={Object.keys(allGuides).map((key, i) => (
2 changes: 1 addition & 1 deletion extensions/react-widget/README.md
@@ -1,7 +1,7 @@
# DocsGPT react widget


-THis widget will allow you to embed a DocsGPT assistant in your react app.
+This widget will allow you to embed a DocsGPT assistant in your React app.

## Installation

2 changes: 2 additions & 0 deletions frontend/src/App.tsx
@@ -6,6 +6,7 @@ import PageNotFound from './PageNotFound';
import { inject } from '@vercel/analytics';
import { useMediaQuery } from './hooks';
import { useState } from 'react';
import Setting from './Setting';

inject();

@@ -27,6 +28,7 @@ export default function App() {
<Route path="/" element={<Conversation />} />
<Route path="/about" element={<About />} />
<Route path="*" element={<PageNotFound />} />
<Route path="/settings" element={<Setting />} />
</Routes>
</div>
</div>