In [1]:
from IPython.display import clear_output

In [2]:
# No need to run this on Colab; these libraries come pre-installed there.
# %pip install torch torchvision torchaudio

# Content:

In this demo, we will do some AI-based code generation, like the kind done by GitHub Copilot, Codeium, or other code completion services.

The model we will use is Code Llama. Code Llama models are Llama 2 models fine-tuned for coding tasks. Code Llama is available in several sizes, and we'll use the 7B-parameter variant.


For this, we need to install the llama-cpp-python library and download the model weights file. The file can be downloaded from the Hugging Face [repo](https://huggingface.co/TheBloke/CodeLlama-7B-GGUF) of [TheBloke](https://huggingface.co/TheBloke). Credit goes to him for quantizing the model, saving it in formats such as GGML and GGUF, and sharing it with the community. He has published many other models on his profile that you can check out, including different versions of Llama.

## Downloading model file

In [3]:
!wget https://huggingface.co/TheBloke/CodeLlama-7B-GGUF/resolve/main/codellama-7b.Q5_K_M.gguf

clear_output()
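
Because `clear_output()` hides whatever `wget` printed, a failed download can go unnoticed until the model refuses to load. The sketch below is one way to sanity-check the file: it relies on the fact that GGUF files begin with the four magic bytes `GGUF`. The helper name `looks_like_gguf` is made up for this notebook, and passing the check does not prove that the download is complete.

```python
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # GGUF files start with these four bytes


def looks_like_gguf(path):
    """Return True if the file at `path` exists and starts with the GGUF magic bytes."""
    path = Path(path)
    if not path.is_file():
        return False
    with path.open("rb") as f:
        return f.read(4) == GGUF_MAGIC
```

Calling `looks_like_gguf("codellama-7b.Q5_K_M.gguf")` right after the download cell should return `True`. A `False` usually means the file is missing or that an error page was saved in its place.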

## Installing llama-cpp-python

llama-cpp-python can be built with several hardware acceleration backends.

We will go with CUDA. Check out the [GitHub repo](https://github.com/abetlen/llama-cpp-python) for more options.

In [4]:
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python==0.2.74  # This takes a few mins when building wheel. Be patient.

Collecting llama-cpp-python
  Downloading llama_cpp_python-0.2.79.tar.gz (50.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50.3/50.3 MB 13.0 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Collecting diskcache>=5.6.1 (from llama-cpp-python)
  Downloading diskcache-5.6.3-py3-none-any.whl (45 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.5/45.5 kB 6.6 MB/s eta 0:00:00
Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... done
  Created wheel for llama-cpp-python: filename=llama_cpp_python-0.2.79-cp310-cp310-linux_x86_64.whl size=172348951 sha256=9a7230215712e5c9d21b89b26776d555c5e5a88ef19168afc8fbb1f332eba27a
  Stored in direc

## Running Code Llama

In [5]:
import json

from llama_cpp import Llama

In [7]:
model = Llama(
    "codellama-7b.Q5_K_M.gguf",
    n_gpu_layers=-1,  # -1 offloads all model layers to the GPU
    n_ctx=2048,
)

clear_output()

Let's try a code generation example. How about a function that opens an RGB image, converts it to greyscale, and saves it?

In [8]:
prompt = """
from PIL import Image

def convert_to_greyscale(input_path, output_path):
"""

In [9]:
output = model(
    prompt,
    max_tokens=None,  # no explicit limit: generation runs until end-of-sequence or until the context window (n_ctx) is full
    temperature=0.5,
)


llama_print_timings:        load time =     406.14 ms
llama_print_timings:      sample time =    1051.62 ms /  2019 runs   (    0.52 ms per token,  1919.89 tokens per second)
llama_print_timings: prompt eval time =     405.59 ms /    29 tokens (   13.99 ms per token,    71.50 tokens per second)
llama_print_timings:        eval time =   57862.54 ms /  2018 runs   (   28.67 ms per token,    34.88 tokens per second)
llama_print_timings:       total time =   63874.75 ms /  2047 tokens
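
The timing lines above can be checked with a little arithmetic. The sketch below parses them and recomputes the generation speed. The regular expression is written against the exact line format shown in this notebook and is an assumption about that format, not an official parser. The names `TIMING_RE`, `parse_timings`, and `tokens_per_second` are made up for illustration.

```python
import re

# Timing lines copied from the cell output above.
LOG = """
llama_print_timings:        load time =     406.14 ms
llama_print_timings:      sample time =    1051.62 ms /  2019 runs   (    0.52 ms per token,  1919.89 tokens per second)
llama_print_timings: prompt eval time =     405.59 ms /    29 tokens (   13.99 ms per token,    71.50 tokens per second)
llama_print_timings:        eval time =   57862.54 ms /  2018 runs   (   28.67 ms per token,    34.88 tokens per second)
llama_print_timings:       total time =   63874.75 ms /  2047 tokens
"""

# Captures the stage name, the elapsed milliseconds, and the optional token count.
TIMING_RE = re.compile(
    r"llama_print_timings:\s+(?P<name>[a-z ]+?) time\s+=\s+(?P<ms>[\d.]+) ms"
    r"(?:\s+/\s+(?P<count>\d+) (?:runs|tokens))?"
)


def parse_timings(log_text):
    """Return {stage: (milliseconds, token_count or None)} for each timing line."""
    timings = {}
    for match in TIMING_RE.finditer(log_text):
        count = match.group("count")
        timings[match.group("name")] = (
            float(match.group("ms")),
            int(count) if count else None,
        )
    return timings


def tokens_per_second(ms, count):
    """Convert an elapsed time in milliseconds and a token count to tokens/second."""
    return count / (ms / 1000.0)


timings = parse_timings(LOG)
eval_ms, eval_tokens = timings["eval"]
print(f"generation speed: {tokens_per_second(eval_ms, eval_tokens):.2f} tokens/s")
print("prompt + generated tokens:", timings["prompt eval"][1] + eval_tokens)
```

The 29 prompt tokens plus 2018 generated tokens add up to the reported total of 2047, one short of `n_ctx=2048`. That suggests the run ended because the context window filled up, not because the model decided it was finished.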


In [12]:
output.keys()

dict_keys(['id', 'object', 'created', 'model', 'choices', 'usage'])
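
The top-level keys above mirror the OpenAI-style completion format. As a rough sketch, the helper below pulls out the fields that are usually of interest. The nested key names (`finish_reason`, `prompt_tokens`, `completion_tokens`) are assumptions based on that format, so the code reads them with `.get()`. `mock_output` is a hand-written stand-in with made-up values, not a real response from the model.

```python
def summarize_completion(output):
    """Collect the generated text, the stop reason, and token counts from a response dict."""
    choice = output["choices"][0]
    usage = output["usage"]
    return {
        "text": choice["text"],
        # Assumed keys; .get() returns None if a key is absent.
        "finish_reason": choice.get("finish_reason"),
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
    }


# Hypothetical response used only to exercise the helper; every value is invented.
mock_output = {
    "id": "cmpl-example",
    "object": "text_completion",
    "created": 0,
    "model": "codellama-7b.Q5_K_M.gguf",
    "choices": [
        {"text": "    return 1\n", "index": 0, "logprobs": None, "finish_reason": "length"}
    ],
    "usage": {"prompt_tokens": 29, "completion_tokens": 5, "total_tokens": 34},
}

summary = summarize_completion(mock_output)
print(summary["finish_reason"], summary["completion_tokens"])
```

Running `summarize_completion(output)` on the real `output` works the same way. If the assumed keys are present, a `finish_reason` of `"length"` would indicate that generation was cut off by a token or context limit.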

In [11]:
print(output["choices"][0]['text'])

    """
    Converts a color image to greyscale.

    Parameters:
        input_path (str): Path to the input image.
        output_path (str): Path to save the converted image.

    Returns:
        None
    """
    img = Image.open(input_path)
    img = img.convert('L')
    img.save(output_path)

def convert_to_binary(input_path, output_path):
    """
    Converts an image to a binary image by thresholding.

    Parameters:
        input_path (str): Path to the input image.
        output_path (str): Path to save the converted image.

    Returns:
        None
    """
    img = Image.open(input_path)
    # Convert to greyscale and threshold
    img = img.convert('L')
    img = img.point(lambda x : 255 if x > 128 else 0, '1')
    img.save(output_path)

def convert_to_negative(input_path, output_path):
    """
    Converts an image to negative by inverting the pixels.

    Parameters:
        input_path (str): Path to the input image.
        output_path (str): Path to save the convert
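
The model finished the requested function and then kept inventing new ones until the text was cut off. One simple way to keep only the requested function is to post-process the completion and drop everything from the first unindented line onward. The sketch below does that. `keep_first_block` is a made-up helper name, and the approach assumes the prompt ended with a `def ...:` line, so that the wanted body is the indented text at the start of the completion.

```python
def keep_first_block(completion):
    """Keep the leading indented lines and drop everything from the first unindented line on."""
    kept = []
    for line in completion.splitlines():
        # A non-empty line that starts in column 0 begins a new top-level statement.
        if line.strip() and not line[0].isspace():
            break
        kept.append(line)
    return "\n".join(kept).rstrip() + "\n"


# Shortened stand-in for the generated text shown above.
sample = (
    "    img = Image.open(input_path)\n"
    "    img = img.convert('L')\n"
    "    img.save(output_path)\n"
    "\n"
    "def convert_to_binary(input_path, output_path):\n"
    "    pass\n"
)
print(keep_first_block(sample))
```

This is a heuristic: a docstring containing an unindented line would be cut short. Generation can also be bounded at call time by passing a finite `max_tokens`, or a `stop` sequence such as `stop=["\ndef "]`, to the model call, which saves the time spent producing the extra functions.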

Let's try something a little more complex now.

How about a deep learning example?

In [41]:
prompt = """
# Simple script to download MNIST, Create a pytorch classifier class, and training it for the MNIST.

import torch
"""

In [42]:
output = model(prompt, max_tokens=None, temperature=0.1)

Llama.generate: prefix-match hit

llama_print_timings:        load time =     406.14 ms
llama_print_timings:      sample time =     536.29 ms /  1009 runs   (    0.53 ms per token,  1881.44 tokens per second)
llama_print_timings: prompt eval time =     121.30 ms /    24 tokens (    5.05 ms per token,   197.86 tokens per second)
llama_print_timings:        eval time =   28394.03 ms /  1008 runs   (   28.17 ms per token,    35.50 tokens per second)
llama_print_timings:       total time =   30370.10 ms /  1032 tokens


In [44]:
print(output["choices"][0]['text'])

from torch import nn
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt
import numpy as np
import os

# Download MNIST dataset
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transforms.ToTensor())
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transforms.ToTensor())

# Create data loaders
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=1024, shuffle=False)

# Define a classifier
class Classifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=5)
        self.pool1 = nn.MaxPool2d(kernel_size=2)
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=5)
        self.pool2 = nn.MaxPool2d(kernel_size=2)
        self.fc1 = nn.Linear(in_features=7*7*64, out_features=1024)
     