Merge pull request #250 from stochasticai/tushar/docs
Documentation revamping
StochasticRomanAgeev committed Sep 6, 2023
2 parents 9b98c68 + 85cf772 commit 4c71825
Showing 56 changed files with 7,785 additions and 3,129 deletions.
1,156 changes: 1,144 additions & 12 deletions .github/stochastic_logo_dark.svg
1,255 changes: 1,243 additions & 12 deletions .github/stochastic_logo_light.svg
30 changes: 30 additions & 0 deletions README.md
@@ -220,6 +220,36 @@ model = BaseModel.load("x/distilgpt2_lora_finetuned_alpaca")

<br>

## Supported Models
Below is a list of all the models supported via the `BaseModel` class of `xTuring`, along with the keys used to load them.

| Model | Key |
| -- | -- |
|Bloom | bloom|
|Cerebras | cerebras|
|DistilGPT-2 | distilgpt2|
|Falcon-7B | falcon|
|Galactica | galactica|
|GPT-J | gptj|
|GPT-2 | gpt2|
|LLaMA | llama|
|LLaMA 2 | llama2|
|OPT-1.3B | opt|

The keys above load the base variants of the LLMs. The templates below give the keys for their `LoRA`, `INT8` and `INT8 + LoRA` versions; the `INT4 + LoRA` versions are loaded through a dedicated class, as described after the table.

| Version | Template |
| -- | -- |
| LoRA| `<model_key>_lora`|
| INT8| `<model_key>_int8`|
| INT8 + LoRA| `<model_key>_lora_int8`|
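
For example, combining a key from the first table with a template gives the key for a variant. A minimal sketch, assuming the `BaseModel.create` entry point used elsewhere in the xTuring docs:

```python
from xturing.models import BaseModel

# 'gpt2' (model key) + '_lora' (template) -> LoRA version of GPT-2
model = BaseModel.create('gpt2_lora')
```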

To load any model's __`INT4 + LoRA`__ version, you will need to use the `GenericLoraKbitModel` class from `xturing.models`. Below is how to use it:
```python
from xturing.models import GenericLoraKbitModel

model = GenericLoraKbitModel('<model_path>')
```
The `<model_path>` can be replaced with your local directory or any Hugging Face Hub model, such as `facebook/opt-1.3b`.

## 📈 Roadmap
- [x] Support for `LLaMA`, `GPT-J`, `GPT-2`, `OPT`, `Cerebras-GPT`, `Galactica` and `Bloom` models
- [x] Dataset generation using self-instruction
17 changes: 0 additions & 17 deletions docs/docs/about.md

This file was deleted.

9 changes: 9 additions & 0 deletions docs/docs/advanced/_category_.json
@@ -0,0 +1,9 @@
{
"label": "🧗🏻 Advanced Topics",
"position": 3,
"collapsed": true,
"link": {
"type": "doc",
"id": "advanced"
}
}
12 changes: 12 additions & 0 deletions docs/docs/advanced/advanced.md
@@ -0,0 +1,12 @@
---
sidebar_position: 3
title: 🧗🏻 Advanced topics
description: Guide for people who want to customise xTuring even further.
---

import DocCardList from '@theme/DocCardList';


# 🧗🏻 Advanced Topics

<DocCardList />
180 changes: 180 additions & 0 deletions docs/docs/advanced/anymodel.md
@@ -0,0 +1,180 @@
---
title: 🌦️ Work with any model
description: Work with any model via the GenericModel wrapper
sidebar_position: 2
---

<!-- ## class `GenericModel` -->
<!-- ## Load Any Model via `GenericModel` wrapper -->
The `GenericModel` class makes it possible to test and fine-tune models that are not directly available via the `BaseModel` class. Apart from this base class, the classes below can be used to load models for memory-efficient computation:

| Class Name | Description |
| ---------- | ----------- |
| `GenericModel` | Loads the normal version of the model |
| `GenericInt8Model` | Loads the model ready to fine-tune in __INT8__ precision |
| `GenericLoraModel` | Loads the model ready to fine-tune using the __LoRA__ technique |
| `GenericLoraInt8Model` | Loads the model ready to fine-tune using the __LoRA__ technique in __INT8__ precision |
| `GenericLoraKbitModel` | Loads the model ready to fine-tune using the __LoRA__ technique in __INT4__ precision |
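
All of these classes share the same constructor signature, so switching to a memory-efficient variant is a one-line change. A sketch, with the model path as an example only:

```python
from xturing.models import GenericLoraInt8Model

# Same call shape as GenericModel, but ready to fine-tune with LoRA in INT8 precision
model = GenericLoraInt8Model('facebook/opt-1.3b')
```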

<!-- Let us circle back to the above example and see how we can replicate the results of the `BaseModel` class as shown [here](/overview/quickstart/load_save_models). -->

<!-- Start by downloading the Alpaca dataset from [here](https://d33tr4pxdm6e2j.cloudfront.net/public_content/tutorials/datasets/alpaca_data.zip) and extract it to a folder. We will load this dataset using the `InstructionDataset` class. -->

<!-- ```python
from xturing.datasets import InstructionDataset
dataset_path = './alpaca_data'
dataset = InstructionDataset(dataset_path)
``` -->


To initialize the model, run the following lines:
```python
from xturing.models import GenericLoraModel

model_path = 'aleksickx/llama-7b-hf'

model = GenericLoraModel(model_path)
```
The `model_path` can be a locally saved model or any model available on Hugging Face's [Model Hub](https://huggingface.co/models).

To fine-tune the model on a dataset, we will use the default fine-tuning configuration.
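
The `dataset` below is assumed to have been loaded beforehand, mirroring the commented-out example above. A sketch, assuming the Alpaca data has been downloaded and extracted locally:

```python
from xturing.datasets import InstructionDataset

# Path to the extracted Alpaca dataset (assumed to exist locally)
dataset = InstructionDataset('./alpaca_data')
```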

```python
model.finetune(dataset=dataset)
```

To see how to load a pre-defined dataset, go [here](/overview/quickstart/prepare); to see how to generate a dataset, refer to [this](/advanced/generate) page.

Let's test our fine-tuned model by running some inference.

```python
output = model.generate(texts=["Why are LLMs becoming so important?"])
```
We can print the `output` variable to see the results.
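
For example, mirroring the inference snippets later on this page:

```python
# Inspect the generated text
print("Generated output: {}".format(output))
```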

Next, we save our fine-tuned model using the `.save()` method, passing the path of the directory in which to store it.

```python
model.save('/path/to/a/directory/')
```
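
The model can later be reloaded from that directory. A sketch, mirroring the fine-tuned-model loading example further down this page:

```python
from xturing.models import GenericModel

# Load the fine-tuned weights written by model.save()
model = GenericModel('/path/to/a/directory/')
```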

We can also see our model(s) in action with a beautiful UI by launching the playground locally.

```python
from xturing.ui.playground import Playground

Playground().launch()
```

<!-- ## GenericModel classes
The `GenericModel` classes consists of:
1. `GenericModel`
2. `GenericInt8Model`
3. `GenericLoraModel`
4. `GenericLoraInt8Model`
5. `GenericLoraKbitModel`
The code snippets below work for all of the above classes; simply replace `GenericModel` with the class of your choice. They are very similar to the snippets above, with only slight differences.
### 1. Load a pre-trained and/or fine-tuned model
To load a pre-trained (or fine-tuned) model, run the following line of code. This will load the model with the default weights in the case of a pre-trained model, and the weights which were saved in the case of a fine-tuned one.
```python
from xturing.models import GenericModel
model = GenericModel("<model_path>")
'''
The <model_path> can be a path to a local model, for example, "./saved_model", or a model from the HuggingFace Hub, for example, "facebook/opt-1.3b"
For example,
model = GenericModel('./saved_model')
OR
model = GenericModel('facebook/opt-1.3b')
'''
```
### 2. Save a fine-tuned model
After fine-tuning your model, you can save it as simply as:
```python
model.save("/path/to/a/directory")
```
Remember that the path you specify should be a directory. If the directory doesn't exist, it will be created.
The model weights are saved in two files: the whole model weights, including base model parameters and LoRA parameters, are stored in the `pytorch_model.bin` file, while the LoRA parameters alone are stored in the `adapter_model.bin` file.
<details>
<summary> <h3> Examples to load fine-tuned and pre-trained models</h3> </summary>
1. To load a pre-trained model
```python
## Make the necessary imports
from xturing.models import GenericModel
## Loading the model
model = GenericModel("facebook/opt-1.3b")
## Saving the model
model.save("/path/to/a/directory")
```
2. To load a fine-tuned model
```python
## Make the necessary imports
from xturing.models import GenericModel
## Loading the model
model = GenericModel("./saved_model")
```
</details>
## Inference via `GenericModel`
Once you have fine-tuned your model, you can run inference as simply as follows.
### Using a local model
Start with loading your model from a checkpoint after fine-tuning it.
```python
# Make the necessary imports
from xturing.models import GenericModel
# Load the desired model
model = GenericModel("/path/to/local/model")
```
Next, we can run inference on our model using the `.generate()` method.
```python
# Make inference
output = model.generate(texts=["Why are the LLMs so important?"])
# Print the generated outputs
print("Generated output: {}".format(output))
```
### Using a pretrained model
Start with loading your model with the default weights.
```python
# Make the necessary imports
from xturing.models import GenericModel
# Load the desired model
model = GenericModel("llama_lora")
```
Next, we can run inference on our model using the `.generate()` method.
```python
# Make inference
output = model.generate(texts=["Why are the LLMs so important?"])
# Print the generated outputs
print("Generated output: {}".format(output))
``` -->
@@ -1,19 +1,24 @@
---
title: FastAPI server
title: ⚡️ FastAPI server
description: FastAPI inference server
sidebar_position: 3
---

Once you have fine-tuned your model, you can run the inference using a FastAPI server.
# ⚡️ Running model inference with FastAPI Server

### 1. Launch API server from CLI
<!-- Once you have fine-tuned your model, you can run the inference using a FastAPI server. -->
After successfully fine-tuning your model, you can perform inference using a FastAPI server. The following steps guide you through launching and utilizing the API server for your fine-tuned model.

### 1. Launch API server from Command Line Interface (CLI)

To initiate the API server, execute the following command in your command line interface:

```sh
xturing api -m "/path/to/the/model"
$ xturing api -m "/path/to/the/model"
```

:::info
Model path should be a directory containing a valid `xturing.json` config file.
Ensure that the model path you provide is a directory containing a valid xturing.json configuration file.
:::

### 2. Health check API
@@ -69,3 +74,5 @@ Model path should be a directory containing a valid `xturing.json` config file.
"response": ["JP Morgan is multinational investment bank and financial service headquartered in New York city."]
}
```
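
As an illustration, the generation endpoint can be called from Python. This is a hypothetical sketch: the endpoint path, port, and request payload shape are assumptions based on the response shown above, not confirmed by this diff:

```python
import requests

# Hypothetical endpoint and payload; adjust to the server's actual API
resp = requests.post(
    "http://localhost:5000/api/generate",
    json={"prompt": ["What is JP Morgan?"]},
)
print(resp.json()["response"])
```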

By following these steps, you can effectively run your fine-tuned model for text generation through the FastAPI server, facilitating seamless inference with structured requests and responses.