# Code Llama by Meta AI

<!--<badge>--><a href="https://colab.research.google.com/github/kuennethgroup/MLinMS/blob/main/1_exercise/deploy_codellama_13B_tutorial.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a><!--</badge>-->

> A state-of-the-art large language model for coding

Links:
* https://ai.meta.com/blog/code-llama-large-language-model-coding/
* https://github.com/facebookresearch/codellama

Code Llama is a version of Llama 2 specialized for programming tasks. It was created by training Llama 2 further on code-specific datasets. Given a prompt in either code or natural language, it can generate code and natural-language explanations of code. It supports many widely used programming languages, including Python, C++, Java, PHP, TypeScript (JavaScript), C#, Rust, and others.

Code Llama is free for both research and commercial use. It is built on Llama 2 and comes in three variants:


*   A foundational model that knows many programming languages
*   A specialized model for Python, tailored to excel in Python-related tasks
*   An Instruct model, fine-tuned to comprehend and respond to natural language instructions

## Instructions

1. (optional) Save the notebook in your Google Drive: File -> Save a copy in Drive
2. Change runtime type to T4 in the top right corner
3. Runtime -> Run all
4. While waiting for the process to complete, we watch the video below together
5. Read through the notes in each cell and follow the installation process
6. Wait for the gradio link to appear at the end of the notebook (e.g., Running on public URL: https://ff64af66ebaae89911.gradio.live)
7. You have now deployed your own LLM for programming tasks.
8. Ask as many coding questions as you want :)



In [3]:
# @title Watch this video on how Google Colab works
from IPython.display import YouTubeVideo

# Pass the start time (in seconds) as a keyword argument rather than inside the video ID
YouTubeVideo("RLYoEyIHL6A", start=4)

In [None]:
# Colab starts a virtual machine in the background. The virtual machine is a fully fledged Linux system.
# Commands for the Linux system are prefixed with "!"
# Command to list the files and directories on Linux
!ls

In [None]:
# Change directory to /content. "/" is the root directory on Linux systems; all paths start at the root directory
# Use the %cd magic here: "!cd" runs in a temporary subshell, so the change would not persist
%cd /content

In [None]:
# ls to show directories and files of /content
!ls

In [None]:
# "Git" is a very powerful version control tool that takes care of versioning your code.
!git help

In [None]:
# We use git to clone "text-generation-webui" from GitHub. GitHub stores and manages your code, and tracks and controls changes to the code base.
# "text-generation-webui" allows us to deploy trained LLMs using a chat-like interface (similar to ChatGPT)
!git clone -b v2.5 https://github.com/camenduru/text-generation-webui

In [None]:
# Change to the new directory that was created when cloning text-generation-webui
%cd /content/text-generation-webui

In [None]:
# pip is a tool for installing Python packages. Syntax: pip install pkg_name, pip uninstall pkg_name, ...
# We install all packages defined in the file "requirements.txt"
!pip install -q -r requirements.txt

In [None]:
# We want to use the dark theme of the text-webui 🌑
!echo "dark_theme: true" > /content/settings.yaml
# And a specific chat style
!echo "chat_style: wpp" >> /content/settings.yaml


In [None]:
from pathlib import Path
import requests
import shutil
from tqdm.auto import tqdm


# Function to download a file
def download_file(url, dest):
    # Make an HTTP request within a context manager
    with requests.get(url, stream=True) as r:
        # Stop early on HTTP errors (e.g., 404) instead of saving an error page
        r.raise_for_status()

        # Read the content length (in bytes) from the header; fall back to 0 if it is missing
        total_length = int(r.headers.get("Content-Length", 0))

        # Show a progress bar via tqdm
        with tqdm.wrapattr(r.raw, "read", total=total_length, desc="") as raw:

            # Save the output to dest
            with open(dest, "wb") as output:
                shutil.copyfileobj(raw, output)

# Files available at
urls = [
    "https://huggingface.co/TheBloke/CodeLlama-13B-Instruct-GGUF/resolve/main/codellama-13b-instruct.Q5_0.gguf",
    'https://huggingface.co/TheBloke/CodeLlama-13B-Instruct-GGUF/raw/main/config.json'
    ]

# Save the download to the following directory

model_dir = Path('/content/text-generation-webui/models/codellama-13b-instruct')
# Create the directory (and any missing parent directories) if it does not exist yet
model_dir.mkdir(parents=True, exist_ok=True)

# Download the urls
for url in urls:
    save_to = model_dir / Path(url).name
    download_file(url, save_to)
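
An interrupted download still leaves a file on disk, so a quick sanity check before starting the server can prevent a confusing failure later. The sketch below is not part of the original notebook. It assumes the model file follows the GGUF container format, whose files begin with the four ASCII bytes `GGUF`; the helper name `looks_like_gguf` is made up for illustration.

```python
from pathlib import Path


def looks_like_gguf(path):
    """Return True if the file at `path` starts with the GGUF magic bytes.

    Hypothetical helper for illustration; not part of text-generation-webui.
    """
    path = Path(path)
    if not path.is_file():
        return False
    with open(path, "rb") as f:
        # GGUF files start with the 4-byte magic b"GGUF"
        return f.read(4) == b"GGUF"
```

In this notebook it could be called as `looks_like_gguf(model_dir / "codellama-13b-instruct.Q5_0.gguf")` once the download loop has finished. It only inspects the first four bytes, so it cannot detect a file that was cut off partway through.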

In [None]:
%cd /content/text-generation-webui
!python server.py --share --settings /content/settings.yaml --model {model_dir} --gpu-memory 15000MB --n-gpu-layers 36

## Example prompts

1. Provide the answer in Python. Write a function that adds two numbers
2. Provide the answer in Python. Write a function to download the file https://huggingface.co/TheBloke/CodeLlama-13B-Instruct-GGUF/raw/main/config.json to the directory /content/models
3. Provide the answer in Python. Write a function that computes the set of sums of all contiguous sublists of a given list.
4. Provide the answer in Python. Write the class for a basic neural network implementation in PyTorch
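
Prompt 3 is a handy check on whether the model's answer is actually correct, because the expected result is easy to compute by hand. Below is a minimal reference sketch to compare against. It assumes "contiguous sublists" means the non-empty slices `lst[i:j]`; the function name is an illustrative choice, and the model may well produce a different but equally valid solution.

```python
def contiguous_sublist_sums(numbers):
    """Return the set of sums of all non-empty contiguous sublists."""
    sums = set()
    for start in range(len(numbers)):
        running_total = 0
        # Extend the sublist one element at a time and record each sum
        for value in numbers[start:]:
            running_total += value
            sums.add(running_total)
    return sums
```

For the list `[1, 2, 3]` the contiguous sublists are `[1]`, `[1, 2]`, `[1, 2, 3]`, `[2]`, `[2, 3]`, and `[3]`, so the function returns the set containing 1, 2, 3, 5, and 6.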

## Template for code


```
[INST] Write code to solve the following coding problem that obeys the constraints and passes the example test cases. Please wrap your code answer using ```:
{prompt}
[/INST]
```
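
The template above can also be filled in programmatically. The sketch below is one way to do it with plain string formatting. The names `CODE_TEMPLATE` and `build_prompt` are illustrative assumptions, not part of Code Llama or text-generation-webui, and how the resulting string is sent to the model is left open.

```python
# Three backticks, built this way so the example itself stays easy to read
FENCE = "`" * 3

# The template text from above; {prompt} is the only placeholder
CODE_TEMPLATE = (
    "[INST] Write code to solve the following coding problem that obeys "
    "the constraints and passes the example test cases. "
    "Please wrap your code answer using " + FENCE + ":\n"
    "{prompt}\n"
    "[/INST]"
)


def build_prompt(task):
    """Insert a task description into the instruction template."""
    return CODE_TEMPLATE.format(prompt=task)
```

For example, `build_prompt("Write a function that adds two numbers")` returns the template with the task on its own line between the instruction and the closing `[/INST]` tag.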
