<a href="https://colab.research.google.com/github/tykiww/LLM_Implementations/blob/main/llama_exploration.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Llama Initial Exploration

So, this isn't actually an exploration of the full LLaMA v1 model.

Rather, it uses the research-only LLaMA 7B model adapted with Alpaca weights.

The adapted model is fine-tuned on the Alpaca data using LoRA.

Inference requires a GPU.
Luckily, a T4 or RTX 3080 is sufficient, so cheapo tyki is willing to work with it.

I am following the examples given by Sam Witteveen (https://www.youtube.com/watch?v=JzBR8oieyy8&list=WL&index=3) and adapting the code for my own purposes.
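
As a rough sanity check on the GPU claim, here is a back-of-the-envelope sketch of how much memory the weights alone need at different precisions. It assumes a nominal 7 billion parameters (the exact count is a bit lower) and ignores activations, the KV cache, and framework overhead, so treat the numbers as a lower bound. The helper name `weight_memory_gib` is made up for illustration.

```python
def weight_memory_gib(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed for the model weights only, in GiB."""
    return n_params * bytes_per_param / 1024**3


# Assumption: nominal "7B" parameter count; the real model is slightly smaller.
N_PARAMS = 7e9

# Bytes per parameter for each storage precision.
for label, nbytes in [("fp32", 4), ("fp16", 2), ("int8", 1)]:
    print(f"{label}: {weight_memory_gib(N_PARAMS, nbytes):.1f} GiB")
```

In 8-bit the weights come to roughly 6.5 GiB, versus about 13 GiB in fp16, which is why the 8-bit load used further down leaves headroom on a 16 GB T4 and can plausibly fit on a 10 GB RTX 3080.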

### Setup environment

In [None]:
!pip install -q datasets loralib sentencepiece
!pip uninstall -y transformers # -y skips the confirmation prompt; clears any earlier install when re-running this notebook
!pip install -q git+https://github.com/zphang/transformers@c3dc391 # transformers pulled in from gh
!pip -q install git+https://github.com/huggingface/peft.git # peft from hf
!pip -q install bitsandbytes torch
!python --version

In [1]:
# what gpu (colab)?
from torch import cuda

def gpu_info():
    """Searches for a supported GPU. Catches and prints any error."""
    try:
        device = cuda.get_device_name()
        n_gpus = cuda.device_count()
        print(device + ',', n_gpus, 'NVIDIA GPU(s) found.')
    except Exception as e:
        print('Supported NVIDIA GPU not found or encountered an error:\n', e)


gpu_info()

Tesla T4, 1 NVIDIA GPU(s) found.


### Load Pretrained Models and Weights

In [4]:
from peft import PeftModel
from transformers import LLaMATokenizer, LLaMAForCausalLM, GenerationConfig
import textwrap

In [6]:

# retrieve pretrained tokenizer
tokenizer = LLaMATokenizer.from_pretrained("decapoda-research/llama-7b-hf")

# retrieving the 7b llama model from huggingface
model = LLaMAForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    load_in_8bit=True,
    device_map="auto")

# produce fine tuned model using alpaca-lora weights
model = PeftModel.from_pretrained(model, "tykiww/alpaca7B-lora")



Loading checkpoint shards:   0%|          | 0/33 [00:00<?, ?it/s]

Downloading (…)/adapter_config.json:   0%|          | 0.00/370 [00:00<?, ?B/s]

Downloading adapter_model.bin:   0%|          | 0.00/8.43M [00:00<?, ?B/s]
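
The adapter download is tiny compared with the base model, and a quick count suggests why. LoRA leaves each original weight matrix frozen and learns two small matrices of rank `r` alongside it, so a `d_in x d_out` matrix gains only `r * (d_in + d_out)` trainable parameters. The sketch below assumes rank 8 applied to two projection matrices per layer, which is a common alpaca-lora setup; I have not confirmed it against this adapter's `adapter_config.json`. The hidden size (4096) and layer count (32) are those of LLaMA 7B.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_in x d_out weight matrix."""
    # One (d_in x rank) matrix plus one (rank x d_out) matrix.
    return rank * (d_in + d_out)


HIDDEN = 4096          # LLaMA 7B hidden size
N_LAYERS = 32          # LLaMA 7B transformer layers
RANK = 8               # assumption: not read from the adapter config
TARGETS_PER_LAYER = 2  # assumption: two attention projections per layer

lora_total = lora_param_count(HIDDEN, HIDDEN, RANK) * TARGETS_PER_LAYER * N_LAYERS
full_total = HIDDEN * HIDDEN * TARGETS_PER_LAYER * N_LAYERS

print("LoRA parameters:", lora_total)
print("Same matrices, full fine-tune:", full_total)
print("Adapter size at 2 bytes/param (MB):", lora_total * 2 / 1e6)
```

Under those assumptions the adapter holds about 4.2 million parameters, or roughly 8.4 MB at 16-bit precision, which is close to the 8.43M `adapter_model.bin` shown above.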

### Define models for usage

In [41]:
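def model_definition():
    """Build the generation settings passed to model.generate()."""
    # temperature and top_p only affect sampling (do_sample=True);
    # do_sample is not set here, so it stays at its default
    config = GenerationConfig(
        temperature=0.2,
        top_p=0.95,
        repetition_penalty=1.2)
    return config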


def tokenization(text):
    """Tokenize text and return the input IDs as a PyTorch tensor on the GPU."""
    tokens = tokenizer(text, return_tensors="pt")
    return tokens["input_ids"].cuda()


def alpaca_model(text):
    """Generate from the prompt and return the raw generate() output (sequences and scores)."""

    generation_output = model.generate(
        input_ids=tokenization(text),
        generation_config=model_definition(),
        return_dict_in_generate=True,
        output_scores=True,
        max_new_tokens=512)

    return generation_output


def decode(generated_output):
    """Decode every generated sequence and concatenate the results into one string."""
    generated_text = ""
    for s in generated_output.sequences:
        generated_text += tokenizer.decode(s)

    return generated_text


def alpaca(text, print_only=True):
    """Generate and decode a completion; print it, or return it when print_only is False."""
    output = alpaca_model(text)
    output = decode(output)
    if print_only:
        print(output)
    else:
        return output

### Output Generation

In [49]:
# 7/10. I like it.
alpaca(
"""
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
How do you add, subtract, or multiply numbers on an HP12 C financial calculator?

### Response:
""")

 
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
How do you add, subtract, or multiply numbers on an HP12 C financial calculator?

### Response:
To perform addition and subtraction operations in the HP-12C Financial Calculator, press [+] to enter the first number followed by pressing [=]. To perform multiplication operation, press [*] then follow it with the second number.
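
Two things stand out in the output above: the prompt follows a fixed Alpaca-style template, and the decoded text echoes the whole prompt before the answer. A small sketch of helpers for both is below. `build_prompt` and `extract_response` are hypothetical names of my own, not part of the notebook or of any library, and the template is copied from the prompt used above (minus the leading blank line).

```python
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_prompt(instruction: str) -> str:
    """Wrap a bare instruction in the Alpaca-style template."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


def extract_response(generated_text: str) -> str:
    """Return only the text after the first '### Response:' marker.

    Falls back to the whole (stripped) text if the marker is missing.
    """
    marker = "### Response:"
    _, found, tail = generated_text.partition(marker)
    return tail.strip() if found else generated_text.strip()


# Intended use with the functions defined earlier (needs the loaded model):
#   text = alpaca(build_prompt("How do I ...?"), print_only=False)
#   print(extract_response(text))
```

Keeping the template in one place avoids retyping it for every cell, and splitting on the marker keeps the answer separate from the echoed prompt. Special tokens emitted by `tokenizer.decode` are not handled here.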
