Merge branch 'main' into patch-3

arc53 · Nov 2, 2023 · b5437d0
2 parents 461901d + c9dd219
Showing 25 changed files with 422 additions and 95 deletions.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -2,8 +2,6 @@
 
 Thank you for choosing to contribute to DocsGPT! We are all very grateful! 
 
-### [🎉 Join the Hacktoberfest with DocsGPT and Earn a Free T-shirt! 🎉](https://github.com/arc53/DocsGPT/blob/main/HACKTOBERFEST.md)
-
 # We accept different types of contributions
 
 📣 **Discussions** - Engage in conversations, start new topics, or help answer questions.
@@ -73,7 +71,7 @@ Here's a step-by-step guide on how to contribute to DocsGPT:
    - Before you make any changes, make sure that your fork is in sync to avoid merge conflicts using:
      ```shell
      git remote add upstream https://github.com/arc53/DocsGPT.git
-     git pull upstream master
+     git pull upstream main
      ```
 
 4. **Create and Switch to a New Branch:**

diff --git a/README.md b/README.md
@@ -18,27 +18,25 @@ Say goodbye to time-consuming manual searches, and let <strong><a href="https://
   <a href="https://github.com/arc53/DocsGPT">![link to main GitHub showing Forks number](https://img.shields.io/github/forks/arc53/docsgpt?style=social)</a>
   <a href="https://github.com/arc53/DocsGPT/blob/main/LICENSE">![link to license file](https://img.shields.io/github/license/arc53/docsgpt)</a>
   <a href="https://discord.gg/n5BX8dh8rU">![link to discord](https://img.shields.io/discord/1070046503302877216)</a>
-  <a href="https://twitter.com/ATushynski">![X (formerly Twitter) URL](https://img.shields.io/twitter/url?url=https%3A%2F%2Ftwitter.com%2FATushynski)</a>
+  <a href="https://twitter.com/ATushynski">![X (formerly Twitter) URL](https://img.shields.io/twitter/follow/ATushynski)</a>
 
 
 </div>
 
-### Production Support / Help for companies:
+### Production Support / Help for Companies:
 
 We're eager to provide personalized assistance when deploying your DocsGPT to a live environment.
 
 - [Book Demo :wave:](https://airtable.com/appdeaL0F1qV8Bl2C/shrrJF1Ll7btCJRbP)
 - [Send Email :email:](mailto:contact@arc53.com?subject=DocsGPT%20support%2Fsolutions)
 
-### [:tada: Join the Hacktoberfest with DocsGPT and Earn a Free T-shirt! :tada:](https://github.com/arc53/DocsGPT/blob/main/HACKTOBERFEST.md)
-
 ![video-example-of-docs-gpt](https://d3dg1063dc54p9.cloudfront.net/videos/demov3.gif)
 
 ## Roadmap
 
 You can find our roadmap [here](https://github.com/orgs/arc53/projects/2). Please don't hesitate to contribute or create issues, it helps us improve DocsGPT!
 
-## Our Open-Source models optimized for DocsGPT:
+## Our Open-Source Models Optimized for DocsGPT:
 
 | Name                                                                  | Base Model  | Requirements (or similar) |
 | --------------------------------------------------------------------- | ----------- | ------------------------- |
@@ -52,7 +50,7 @@ If you don't have enough resources to run it, you can use bitsnbytes to quantize
 
 ![Main features of DocsGPT showcasing six main features](https://user-images.githubusercontent.com/17906039/220427472-2644cff4-7666-46a5-819f-fc4a521f63c7.png)
 
-## Useful links
+## Useful Links
 
 - :mag: :fire: [Live preview](https://docsgpt.arc53.com/)
 
@@ -66,7 +64,7 @@ If you don't have enough resources to run it, you can use bitsnbytes to quantize
 
 - :house: :closed_lock_with_key: [How to host it locally (so all data will stay on-premises)](https://docs.docsgpt.co.uk/Guides/How-to-use-different-LLM)
 
-## Project structure
+## Project Structure
 
 - Application - Flask app (main application).
 
@@ -104,9 +102,9 @@ Otherwise, refer to this Guide:
 
 To stop, just run `Ctrl + C`.
 
-## Development environments
+## Development Environments
 
-### Spin up mongo and redis
+### Spin up Mongo and Redis
 
 For development, only two containers are used from [docker-compose.yaml](https://github.com/arc53/DocsGPT/blob/main/docker-compose.yaml) (by deleting all services except for Redis and Mongo).
 See file [docker-compose-dev.yaml](./docker-compose-dev.yaml).
@@ -118,7 +116,7 @@ docker compose -f docker-compose-dev.yaml build
 docker compose -f docker-compose-dev.yaml up -d
 ```
 
-### Run the backend
+### Run the Backend
 
 Make sure you have Python 3.10 or 3.11 installed.
 
@@ -153,7 +151,7 @@ pip install -r requirements.txt
 4. Run the app using `flask --app application/app.py run --host=0.0.0.0 --port=7091`.
 5. Start worker with `celery -A application.app.celery worker -l INFO`.
 
-### Start frontend
+### Start Frontend
 
 Make sure you have Node version 16 or higher.
 
@@ -176,7 +174,7 @@ Please refer to the [CONTRIBUTING.md](CONTRIBUTING.md) file for information abou
 
 We as members, contributors, and leaders, pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation. Please refer to the [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md) file for more information about contributing.
 
-## Many Thanks To Our Contributors
+## Many Thanks To Our Contributors⚡
 
 <a href="https://github.com/arc53/DocsGPT/graphs/contributors" alt="View Contributors">
   <img src="https://contrib.rocks/image?repo=arc53/DocsGPT" alt="Contributors" />

diff --git a/application/core/settings.py b/application/core/settings.py
@@ -19,7 +19,7 @@ class Settings(BaseSettings):
     API_URL: str = "http://localhost:7091"  # backend url for celery worker
 
     API_KEY: str = None  # LLM api key
-    EMBEDDINGS_KEY: str = None  # api key for embeddings (if using openai, just copy API_KEY
+    EMBEDDINGS_KEY: str = None  # api key for embeddings (if using openai, just copy API_KEY)
     OPENAI_API_BASE: str = None  # azure openai api base url
     OPENAI_API_VERSION: str = None  # azure openai api version
     AZURE_DEPLOYMENT_NAME: str = None  # azure deployment name for answering

diff --git a/application/worker.py b/application/worker.py
@@ -20,17 +20,34 @@
     pass
 
 
+# Define a function to extract metadata from a given filename.
 def metadata_from_filename(title):
     store = '/'.join(title.split('/')[1:3])
     return {'title': title, 'store': store}
 
 
+# Define a function to generate a random string of a given length.
 def generate_random_string(length):
     return ''.join([string.ascii_letters[i % 52] for i in range(length)])
 
 current_dir = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 
+# Define the main function for ingesting and processing documents.
 def ingest_worker(self, directory, formats, name_job, filename, user):
+    """
+    Ingest and process documents.
+
+    Args:
+        self: Reference to the instance of the task.
+        directory (str): Specifies the directory for ingesting ('inputs' or 'temp').
+        formats (list of str): List of file extensions to consider for ingestion (e.g., [".rst", ".md"]).
+        name_job (str): Name of the job for this ingestion task.
+        filename (str): Name of the file to be ingested.
+        user (str): Identifier for the user initiating the ingestion.
+
+    Returns:
+        dict: Information about the completed ingestion task, including input parameters and a "limited" flag.
+    """
     # directory = 'inputs' or 'temp'
     # formats = [".rst", ".md"]
     input_files = None
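
For review context, the two helpers that this hunk annotates can be exercised on their own. The sketch below copies their bodies from the hunk as shown; the sample path `inputs/local/guide.md` is a made-up illustration, not a path taken from the repository. One thing it makes visible: `generate_random_string` as written is deterministic, since it cycles through `string.ascii_letters` by index, so the new comment calling it random is looser than the code.

```python
import string


# Copied from the hunk above: the 'store' is built from path segments 1 and 2.
def metadata_from_filename(title):
    store = '/'.join(title.split('/')[1:3])
    return {'title': title, 'store': store}


# Copied from the hunk above: despite the name, no randomness is involved.
def generate_random_string(length):
    return ''.join([string.ascii_letters[i % 52] for i in range(length)])


# Hypothetical sample path, used only for illustration.
sample = metadata_from_filename('inputs/local/guide.md')

# Two calls with the same length always produce the same string.
first = generate_random_string(5)
second = generate_random_string(5)

print(sample['store'])
print(first == second)
```

If unpredictable names are actually needed, something like `random.choices` or the `secrets` module would be the usual route, but that would be a behavior change beyond what this commit does.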

diff --git a/codecov.yml b/codecov.yml
@@ -1,2 +1,2 @@
 ignore:
-  - "*/tests/*”
+  - "*/tests/*"
diff --git a/docs/pages/Deploying/Railway-Deploying.md b/docs/pages/Deploying/Railway-Deploying.md
@@ -7,7 +7,7 @@ Here's a step-by-step guide on how to host DocsGPT on Railway App.
 
 
 
-At first Clone and setup the project locally to run , test and Modify.
+At first Clone and set up the project locally to run , test and Modify.
 
 
 

diff --git a/docs/pages/Developing/API-docs.md b/docs/pages/Developing/API-docs.md
@@ -55,6 +55,8 @@ This endpoint will make sure documentation is loaded on the server (just run it
 
 **Request:**
 
+**Method**: `POST`
+
 **Headers**: Content-Type should be set to `application/json; charset=utf-8`
 
 **Request Body**: JSON object with the field:
@@ -116,6 +118,7 @@ This endpoint is used to upload a file that needs to be trained, response is JSO
 **Request:**
 
 **Method**: `POST`
+
 **Request Body**: A multipart/form-data form with file upload and additional fields, including `user` and `name`.
 
 HTML example:
@@ -143,7 +146,9 @@ JSON response with a status and a task ID that can be used to check the task's p
 This endpoint is used to get the status of a task (`task_id`) from `/api/upload`
 
 **Request:**
-**Method**: `GE`T
+
+**Method**: `GET`
+
 **Query Parameter**: `task_id` (task ID to check)
 
 **Sample JavaScript Fetch Request:**
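
The status request described in this hunk (a `GET` with a `task_id` query parameter) can also be sketched in Python. This is a minimal illustration, not the project's client: the path `/api/task_status` and the helper name `build_task_status_url` are assumptions that do not appear in this diff, and the base URL reuses the `http://localhost:7091` default shown in `application/core/settings.py`. The sketch only builds the request URL; it makes no network call.

```python
from urllib.parse import urlencode, urljoin

# Default backend URL as shown in application/core/settings.py.
BASE_URL = "http://localhost:7091"

# Assumption: the status endpoint path is not shown in this diff.
TASK_STATUS_PATH = "/api/task_status"


def build_task_status_url(task_id, base_url=BASE_URL):
    """Build the GET URL that carries task_id as a query parameter."""
    query = urlencode({"task_id": task_id})
    return f"{urljoin(base_url, TASK_STATUS_PATH)}?{query}"


# Hypothetical task ID; real IDs come from the /api/upload response.
url = build_task_status_url("abc 123")
print(url)
```

Using `urlencode` rather than string concatenation keeps the query parameter safely escaped if a task ID ever contains reserved characters.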

diff --git a/docs/pages/Guides/Customising-prompts.md b/docs/pages/Guides/Customising-prompts.md
@@ -1,25 +1,25 @@
 # Customizing the Main Prompt
 
-To customize the main prompt for DocsGPT, follow these steps:
+Customizing the main prompt for DocsGPT gives you the ability to tailor the AI's responses to your specific requirements. By modifying the prompt text, you can achieve more accurate and relevant answers. Here's how you can do it:
 
 1. Navigate to `/application/prompts/combine_prompt.txt`.
 
-2. Edit the `combine_prompt.txt` file to modify the prompt text. You can experiment with different phrasings and structures to see how the model responds.
+2. Open the `combine_prompt.txt` file and modify the prompt text to suit your needs. You can experiment with different phrasings and structures to observe how the model responds. The main prompt serves as guidance to the AI model on how to generate responses.
 
 ## Example Prompt Modification
 
 **Original Prompt:**
 ```markdown
 You are a DocsGPT, friendly and helpful AI assistant by Arc53 that provides help with documents. You give thorough answers with code examples if possible.
-Use the following pieces of context to help answer the users question. If its not relevant to the question, provide friendly responses.
+Use the following pieces of context to help answer the users question. If it's not relevant to the question, provide friendly responses.
 You have access to chat history, and can use it to help answer the question.
 When using code examples, use the following format:
 
 (code)
 {summaries}
 ```
 
-
+Feel free to customize the prompt to align it with your specific use case or the kind of responses you want from the AI. For example, you can focus on specific document types, industries, or topics to get more targeted results.
 
 ## Conclusion
 

diff --git a/docs/pages/Guides/How-to-train-on-other-documentation.md b/docs/pages/Guides/How-to-train-on-other-documentation.md
@@ -1,43 +1,47 @@
 ## How to train on other documentation
-This AI can use any documentation, but first it needs to be prepared for similarity search. 
 
+This AI can utilize any documentation, but it requires preparation for similarity search. Follow these steps to get your documentation ready:
+
+**Step 1: Prepare Your Documentation**
 ![video-example-of-how-to-do-it](https://d3dg1063dc54p9.cloudfront.net/videos/how-to-vectorise.gif)
 
 Start by going to `/scripts/` folder.
 
 If you open this file, you will see that it uses RST files from the folder to create a `index.faiss` and `index.pkl`. 
 
-It currently uses OPEN_AI to create the vector store, so make sure your documentation is not too big. Pandas cost me around $3-$4.
+It currently uses OPENAI to create the vector store, so make sure your documentation is not too large. Using Pandas cost me around $3-$4.
 
-You can usually find documentation on Github in `docs/` folder for most open-source projects.
+You can typically find documentation on GitHub in the `docs/` folder for most open-source projects.
 
-### 1. Find documentation in .rst/.md and create a folder with it in your scripts directory
+### 1. Find documentation in .rst/.md format and create a folder with it in your scripts directory.
 - Name it `inputs/`.
 - Put all your .rst/.md files in there.  
 - The search is recursive, so you don't need to flatten them.
 
-If there are no .rst/.md files just convert whatever you find to .txt file and feed it. (don't forget to change the extension in script)
+If there are no .rst/.md files, convert whatever you find to a .txt file and feed it. (Don't forget to change the extension in the script).
 
-### 2. Create .env file in `scripts/` folder
-And write your OpenAI API key inside
-`OPENAI_API_KEY=<your-api-key>`.
+### Step 2: Configure Your OpenAI API Key
+1. Create a .env file in the scripts/ folder.
+  - Add your OpenAI API key inside: OPENAI_API_KEY=<your-api-key>.
 
-### 3. Run scripts/ingest.py
+### Step 3: Run the Ingestion Script
 
 `python ingest.py ingest`
 
-It will tell you how much it will cost.
+It will provide you with the estimated cost.
 
-### 4. Move `index.faiss` and `index.pkl` generated in `scripts/output` to `application/` folder. 
+### Step 4: Move `index.faiss` and `index.pkl` generated in `scripts/output` to `application/` folder. 
 
 
-### 5. Run web app
-Once you run it will use new context that is relevant to your documentation.  
-Make sure you select default in the dropdown in the UI.
+### Step 5: Run the Web App
+Once you run it, it will use new context relevant to your documentation.Make sure you select default in the dropdown in the UI.
 
 ## Customization 
 You can learn more about options while running ingest.py by running:
+  - Make sure you select 'default' from the dropdown in the UI.
 
+## Customization
+You can learn more about options while running ingest.py by executing:
 `python ingest.py --help`
 |              Options             |                                                                                                                                |
 |:--------------------------------:|:------------------------------------------------------------------------------------------------------------------------------:|

diff --git a/docs/pages/Guides/How-to-use-different-LLM.md b/docs/pages/Guides/How-to-use-different-LLM.md
@@ -19,11 +19,14 @@ You can omit the keys if users provide their own. Ensure you set `LLM_NAME` and
 ## Step 2: Choose Your Models
 
 **Options for `LLM_NAME`:**
-- OpenAI ([More details](https://platform.openai.com/docs/models))
+- openai ([More details](https://platform.openai.com/docs/models))
+- anthropic ([More details](https://docs.anthropic.com/claude/reference/selecting-a-model))
 - manifest ([More details](https://python.langchain.com/docs/integrations/llms/manifest))
 - cohere ([More details](https://docs.cohere.com/docs/llmu))
+- Arc53/DocsGPT-7B ([More details](https://huggingface.co/Arc53/DocsGPT-7B))
 - Arc53/docsgpt-14b ([More details](https://huggingface.co/Arc53/docsgpt-14b))
 - Arc53/docsgpt-7b-falcon ([More details](https://huggingface.co/Arc53/docsgpt-7b-falcon))
+- Arc53/docsgpt-40b-falcon ([More details](https://huggingface.co/Arc53/docsgpt-40b-falcon))
 - llama.cpp ([More details](https://python.langchain.com/docs/integrations/llms/llamacpp))
 
 **Options for `EMBEDDINGS_NAME`:**

diff --git a/docs/pages/index.mdx b/docs/pages/index.mdx
@@ -25,8 +25,6 @@ DocsGPT 🦖 is an innovative open-source tool designed to simplify the retrieva
 
 Try it yourself: [https://docsgpt.arc53.com/](https://docsgpt.arc53.com/)
 
-### [🎉 Join the Hacktoberfest with DocsGPT and Contribute to Earn a Free T-shirt!](https://github.com/arc53/DocsGPT/blob/main/HACKTOBERFEST.md)
-
 <Cards
       num={3}
       children={Object.keys(allGuides).map((key, i) => (

diff --git a/extensions/react-widget/README.md b/extensions/react-widget/README.md
@@ -1,7 +1,7 @@
 # DocsGPT react widget
 
 
-THis widget will allow you to embed a DocsGPT assistant in your react app.
+This widget will allow you to embed a DocsGPT assistant in your React app.
 
 ## Installation
 

diff --git a/frontend/src/App.tsx b/frontend/src/App.tsx
@@ -6,6 +6,7 @@ import PageNotFound from './PageNotFound';
 import { inject } from '@vercel/analytics';
 import { useMediaQuery } from './hooks';
 import { useState } from 'react';
+import Setting from './Setting';
 
 inject();
 
@@ -27,6 +28,7 @@ export default function App() {
           <Route path="/" element={<Conversation />} />
           <Route path="/about" element={<About />} />
           <Route path="*" element={<PageNotFound />} />
+          <Route path="/settings" element={<Setting />} />
         </Routes>
       </div>
     </div>