# Code Alpaca

GitHub: https://github.com/sahil280114/codealpaca?tab=readme-ov-file#fine-tuning

## Stanford Alpaca

The current Alpaca model is fine-tuned from a 7B LLaMA model on 52K instruction-following examples generated with the techniques from the Self-Instruct paper, with some modifications:

* We used `text-davinci-003` to generate the instruction data instead of `davinci`.

* We wrote a new prompt (`prompt.txt`) that explicitly states the requirements for instruction generation to `text-davinci-003`.

* We adopted much more aggressive batch decoding, i.e., generating 20 instructions at once, which significantly reduced the cost of data generation.

* We simplified the data generation pipeline by discarding the difference between classification and non-classification instructions.

* We only generated a single instance for each instruction.

This produced an instruction-following dataset with 52K examples obtained at a much lower cost. We also found our 52K generated examples to be much more diverse than the data released by Self-Instruct.

`prompt.txt`:

```{tip}
You are asked to come up with a set of 20 diverse task instructions. These task instructions will be given to a GPT model and we will evaluate the GPT model for completing the instructions.

Here are the requirements:<br>
1. Try not to repeat the verb for each instruction to maximize diversity.<br>
2. The language used for the instruction also should be diverse. For example, you should combine questions with imperative instrucitons.<br>
3. The type of instructions should be diverse. The list should include diverse types of tasks like open-ended generation, classification, editing, etc.<br>
4. A GPT language model should be able to complete the instruction. For example, do not ask the assistant to create any visual or audio output. For another example, do not ask the assistant to wake you up at 5pm or set a reminder because it cannot perform any action.<br>
5. The instructions should be in English.<br>
6. The instructions should be 1 to 2 sentences long. Either an imperative sentence or a question is permitted.<br>
7. You should generate an appropriate input to the instruction. The input field should contain a specific example provided for the instruction. It should involve realistic data and should not contain simple placeholders. The input should provide substantial content to make the instruction challenging but should ideally not exceed 100 words.<br>
8. Not all instructions require input. For example, when a instruction asks about some general information, "what is the highest peak in the world", it is not necssary to provide a specific context. In this case, we simply put "<noinput>" in the input field.<br>
9. The output should be an appropriate response to the instruction and the input. Make sure the output is less than 100 words.

List of 20 tasks:
```
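Requirement 8 of the prompt asks the model to write `<noinput>` when an instruction needs no input, so the generated records have to be cleaned before they are stored with an empty `input` field. The following is a minimal sketch of that cleanup step, not the pipeline's actual code; the function name `normalize_input` and the case-insensitive comparison are assumptions made for illustration.

```python
# Placeholder named in requirement 8 of the prompt above.
NO_INPUT_MARKER = "<noinput>"


def normalize_input(raw_input: str) -> str:
    """Return an empty string when the model emitted the no-input placeholder."""
    cleaned = raw_input.strip()
    # Assumption: compare case-insensitively in case the model varies the casing.
    if cleaned.lower() == NO_INPUT_MARKER:
        return ""
    return cleaned
```

For example, `normalize_input("<noinput>")` yields an empty string, while a real input such as `"[3, 1, 2]"` is returned unchanged.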

## Code Alpaca

Code Alpaca is fully based on Stanford Alpaca and changes only the data used for training.

* Modified prompt to focus on code generation/editing/optimization tasks instead of general tasks.

* Modified seed tasks to only be related to code generation.

This produced an instruction-following dataset with 20K examples obtained at a much lower cost (less than $200).

### Data

`data/code_alpaca_20k.json` contains the 20K instruction-following examples used for fine-tuning the Code Alpaca model. This JSON file is a list of dictionaries, each with the fields `instruction`, `input`, and `output`.
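As a rough sketch of how this file could be read and sanity-checked, the snippet below loads a JSON list and verifies that every record carries the three documented fields. The helper name `load_examples` and the validation behavior are assumptions for illustration and are not part of the repository.

```python
import json
from pathlib import Path

# The three fields documented for each record.
REQUIRED_FIELDS = {"instruction", "input", "output"}


def load_examples(path):
    """Load a JSON list of records and check each one has the documented fields."""
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(records, list):
        raise ValueError("expected a JSON list of records")
    for index, record in enumerate(records):
        missing = REQUIRED_FIELDS - set(record)
        if missing:
            raise ValueError(f"record {index} is missing fields: {sorted(missing)}")
    return records
```

Calling `load_examples("data/code_alpaca_20k.json")` from the repository root would return the list of dictionaries, assuming the file is present at that path.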

We used the following prompts for fine-tuning the model:

* for examples with a non-empty input field:

    ```
    Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

    ### Instruction:
    {instruction}

    ### Input:
    {input}

    ### Response:
    ```

* for examples with an empty input field:

    ```
    Below is an instruction that describes a task. Write a response that appropriately completes the request.

    ### Instruction:
    {instruction}

    ### Response:
    ```

During inference, we use the user instruction with an empty input field (second option).
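The choice between the two templates can be sketched as a small helper that switches on whether the input field is empty. This is an illustrative sketch rather than the project's training code: the names `PROMPT_WITH_INPUT`, `PROMPT_NO_INPUT`, and `build_prompt` are hypothetical, and the exact whitespace around the section headers is an assumption.

```python
# Template for examples with a non-empty input field.
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:"
)

# Template for examples with an empty input field (also used at inference).
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:"
)


def build_prompt(instruction: str, input_text: str = "") -> str:
    """Pick the template based on whether the input field is empty."""
    if input_text.strip():
        return PROMPT_WITH_INPUT.format(instruction=instruction, input=input_text)
    return PROMPT_NO_INPUT.format(instruction=instruction)
```

At inference time, `build_prompt("Write a function that reverses a string.")` would produce the second template, matching the empty-input case described above.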