<a href="https://colab.research.google.com/github/robgon-art/personal-llama/blob/main/4_Evaluate_LLaMa_13B_Inference.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install git+https://github.com/robgon-art/llama-cpp-python.git --force-reinstall --upgrade --no-cache-dir --verbose
!pip install llama_index huggingface_hub sentence-transformers
!pip uninstall -y numpy
!pip install numpy==1.25.2

In [None]:
!gdown 14j_8FcDCJEt5-LPCwXDuU8Yrjpr7nTAp

Downloading...
From: https://drive.google.com/uc?id=14j_8FcDCJEt5-LPCwXDuU8Yrjpr7nTAp
To: /content/robgon_qa.csv


In [None]:
model_size = "13B"

if model_size == "7B":
  model_name_or_path = "TheBloke/Llama-2-7B-chat-GGUF"
  model_basename = "llama-2-7b-chat.Q4_K_M.gguf"
elif model_size == "13B":
  model_name_or_path = "TheBloke/Llama-2-13B-chat-GGUF"
  model_basename = "llama-2-13b-chat.Q4_K_M.gguf"
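else:
  # Fail immediately; otherwise the download cell fails later with a NameError
  raise ValueError(f"Invalid model size: {model_size!r} (expected '7B' or '13B')")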

In [None]:
from huggingface_hub import hf_hub_download
model_path = hf_hub_download(repo_id=model_name_or_path, filename=model_basename)


In [None]:
# Wrap words in text output in Colab
from IPython.display import HTML, display

def set_css():
  display(HTML('''
  <style>
    pre {
        white-space: pre-wrap;
    }
  </style>
  '''))
get_ipython().events.register('pre_run_cell', set_css)

In [None]:
from llama_index.llms import LlamaCPP
from llama_index.llms.llama_utils import messages_to_prompt, completion_to_prompt

In [None]:
llm = LlamaCPP(
    # You can pass in the URL to a GGUF model to download it automatically
    model_url=None,
    # optionally, you can set the path to a pre-downloaded model instead of model_url
    model_path=model_path,
    temperature=0.1,
    max_new_tokens=512,
    # Llama 2 has a 4096-token context window; use a smaller value to leave some headroom
    context_window=3900,
    # kwargs to pass to __call__()
    generate_kwargs={},
    # kwargs to pass to __init__()
    # set to at least 1 to use GPU
    model_kwargs={"n_gpu_layers": 43},
    # transform inputs into Llama2 format
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
    verbose=True,
)

AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 


In [None]:
# try to answer without any references
response = llm.complete("What is Muybridge Derby?")
print(response.text.strip())

Muybridge Derby was a series of photographs taken by Eadweard Muybridge in 1878 at the Palo Alto Stock Farm in California. The photographs were taken using a sequence of cameras placed along a track, and they captured the motion of a horse named "Debris" as it galloped at various speeds. The resulting images were published as a book called "Animal Locomotion" and are considered to be some of the earliest examples of motion photography. The Muybridge Derby is still studied today by photographers, artists, and scientists interested in the study of movement and locomotion.


In [None]:
from sentence_transformers import SentenceTransformer

# Load the model
encoder = SentenceTransformer('sentence-transformers/all-mpnet-base-v2')


In [None]:
import numpy as np
sentences = ["This is an example sentence", "This is another example sentence"]
embeddings = encoder.encode(sentences)
print(embeddings)

# Normalize the embeddings
text_features_1 = embeddings[0] / np.linalg.norm(embeddings[0], axis=-1, keepdims=True)
text_features_2 = embeddings[1] / np.linalg.norm(embeddings[1], axis=-1, keepdims=True)

# Calculate the cosine similarity
similarity = np.dot(text_features_1, text_features_2.T)
print(similarity)

[[ 0.02250259 -0.07829171 -0.02303071 ... -0.0082793   0.02652686
  -0.00201896]
 [ 0.05012981 -0.03327598 -0.01251665 ... -0.01040575  0.02814883
  -0.01429978]]
0.90200233
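The normalize-then-dot steps above can be wrapped in a small helper. This is only a minimal sketch: `cosine_similarity` is a hypothetical name that does not appear in the notebook, and it assumes both inputs are 1-D vectors of equal length with non-zero norm. Toy vectors stand in for the sentence embeddings so the snippet runs without downloading the encoder.

```python
import numpy as np

def cosine_similarity(a, b) -> float:
    """Cosine similarity between two 1-D vectors (hypothetical helper)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    # Scale each vector to unit length, then take the dot product
    a_unit = a / np.linalg.norm(a)
    b_unit = b / np.linalg.norm(b)
    return float(np.dot(a_unit, b_unit))

# Toy vectors in place of encoder.encode(...) output
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(same_direction, orthogonal)
```

With real embeddings the call would look like `cosine_similarity(embeddings[0], embeddings[1])`.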


In [None]:
import pandas as pd

# Read the CSV file into a DataFrame
df = pd.read_csv("/content/robgon_qa.csv")
df

Unnamed: 0,doc_metadata,doc_content,question,answer
0,"{'title': 'Writing Songs with GPT-4: Part 2, C...",Source Code\nThe source code for this project ...,What is the main objective of Robert A. Gonsal...,The main objective of his project 'GPT-4 Chord...
1,"{'title': '#Hands-on Tutorials', 'subtitle': '...",Using CLIP to Filter the Images for Training\n...,How did Robert use CLIP to filter the images f...,Robert used CLIP to filter the images by compa...
2,{'title': 'Using AI to Create New Comic Strips...,"Final Thoughts\nComparing the two systems, I f...",What are the limitations Robert A. Gonsalves f...,Robert found that while DALL-E could generate ...
3,{'title': 'Using AI to Create New Comic Strips...,"Mark Madness\nFor the Mark Madness comic, I us...",How did Robert A. Gonsalves modify the images ...,"Robert cleaned up the images in Photoshop, add..."
4,{'title': 'BIG.art: Using Machine Learning to ...,Source Code\nThe source code for this projec...,What machine learning tools does Robert A. Gon...,He uses GLIDE and BSRGAN to create these high-...
...,...,...,...,...
95,{'title': 'Frost Songs: Using AI to Generate M...,"Background\nFor the last five months, I have b...",What was Robert A. Gonsalves' critique of curr...,Robert A. Gonsalves noted that while AI models...
96,{'title': 'AI-Memer: Using Machine Learning to...,Next Steps\nAlthough the results are pretty go...,What is the next model that the developers beh...,The developers at EleutherAI are planning to b...
97,{'title': 'AI-Memer: Using Machine Learning to...,GPT-3 Da Vinci\nOpenAI’s GPT-3 Da Vinci is cur...,How does Robert A. Gonsalves use GPT-3 Da Vinc...,Robert A. Gonsalves uses GPT-3 Da Vinci by cre...
98,{'title': 'Benford’s Law — A Simple Explanatio...,"Summary\nIn this article, I gave an overview o...",What are the three real datasets that Robert A...,Robert A. Gonsalves used the datasets of city/...


In [None]:
# Iterate over the first five rows
for index, row in df.iloc[:5].iterrows():
  print(f"Row {index+1}:")
  print(f"doc_metadata: {row['doc_metadata']}")
  print(f"doc_content: {row['doc_content']}")
  print(f"question: {row['question']}")
  print(f"answer: {row['answer']}\n")
  print("************************************")


Row 1:
doc_metadata: {'title': 'Writing Songs with GPT-4: Part 2, Chords', 'subtitle': 'How to use the latest large language model from OpenAI to help compose chords for original songs', 'author': 'Robert A. Gonsalves', 'date': 'GPT-4 Chords - https://medium.com/towards-data-science/writing-songs-with-gpt-4-part-2-chords-173cfda0e5a1', 'nickname': 'GPT-4 Chords', 'url': 'https://medium.com/towards-data-science/writing-songs-with-gpt-4-part-2-chords-173cfda0e5a1'}
doc_content: Source Code
The source code for this project is available on  GitHub .
question: What is the main objective of Robert A. Gonsalves' project 'GPT-4 Chords'?
answer: The main objective of his project 'GPT-4 Chords' is to demonstrate how to use the latest large language model from OpenAI, GPT-4, to help compose chords for original songs.

************************************
Row 2:
doc_metadata: {'title': '#Hands-on Tutorials', 'subtitle': '# MAGNet: Modern Art Generator using Deep Neural Networks', 'author': '## How

In [None]:
prefix = "Answer the question briefly."

In [None]:
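# Score every row: prompt LLaMa with the document context, then compare
# its answer to the GPT-4 reference answer using embedding similarity
scores = []
for index, row in df.iterrows():
  # Replace the author's name with a pronoun so name mentions don't skew the similarity
  reference_answer = row['answer'].replace("Robert A. Gonsalves", "he").replace("Robert", "he")
  prompt = row['doc_metadata'] + "\n" + row['doc_content'] + "\n\n" + prefix + "\n\n" + row['question']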
  print("Index:           ", index)
  print("Reference:       ", row['doc_content'])
  print("Question:        ", row['question'])
  print("GPT-4 Answer:    ", reference_answer)
  response = llm.complete(prompt)
  answer = response.text.strip().replace("Robert A. Gonsalves", "he").replace("Robert", "he")
  print("LLaMa 13B Answer:", answer)

  sentences = [reference_answer, answer]
  embeddings = encoder.encode(sentences)

  # Normalize the embeddings
  text_features_1 = embeddings[0] / np.linalg.norm(embeddings[0], axis=-1, keepdims=True)
  text_features_2 = embeddings[1] / np.linalg.norm(embeddings[1], axis=-1, keepdims=True)

  # Calculate the cosine similarity
  similarity = np.dot(text_features_1, text_features_2.T)
  scores.append(similarity)
  print("LLaMa 13B Score: ", similarity)
  print()

Index:            0
Reference:        Source Code
The source code for this project is available on  GitHub .
Question:         What is the main objective of Robert A. Gonsalves' project 'GPT-4 Chords'?
GPT-4 Answer:     The main objective of his project 'GPT-4 Chords' is to demonstrate how to use the latest large language model from OpenAI, GPT-4, to help compose chords for original songs.
LLaMa 13B Answer: Based on the information provided, the main objective of he' project "GPT-4 Chords" is to use the latest large language model from OpenAI to help compose chords for original songs.
LLaMa 13B Score:  0.9687352

Index:            1
Reference:        Using CLIP to Filter the Images for Training
After cropping the images, I ended up having over 12,000 paintings to work with. That’s enough to train a GAN, but not all of the paintings are good. Just because a painter is tagged on WikiArt as being “modern” doesn’t mean that all of their works are good examples of modern painting. I used CL
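
The plain `.replace()` calls in the loop turn possessives such as "Robert A. Gonsalves' project" into "he' project", as the first LLaMa answer above shows. One possible refinement, sketched here, handles possessives first. The helper `normalize_name` and both patterns are hypothetical and are not part of the notebook; using them would change the answer text and therefore the scores reported here.

```python
import re

# Possessive forms are matched first so they become "his" rather than "he'"
_POSSESSIVE = re.compile(r"Robert(?: A\. Gonsalves)?(?:'s|’s|'|’)")
_NAME = re.compile(r"Robert(?: A\. Gonsalves)?")

def normalize_name(text: str) -> str:
    """Replace the author's name with a pronoun (hypothetical helper)."""
    text = _POSSESSIVE.sub("his", text)
    return _NAME.sub("he", text)

print(normalize_name("the main objective of Robert A. Gonsalves' project"))
print(normalize_name("Robert used CLIP to filter the images"))
```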

In [None]:
# Collect the similarity scores into a 1D array
print("num scores:  ", len(scores))
arr = np.array(scores)

# Calculate summary statistics
average = np.mean(arr)
median = np.median(arr)
low = np.min(arr)
high = np.max(arr)

# Print the results
print("low score:   ", round(low*100, 3))
print("mean score:  ", round(average*100, 3))
print("median score:", round(median*100, 3))
print("high score:  ", round(high*100, 3))

num scores:   100
low score:    59.663
mean score:   84.501
median score: 84.785
high score:   98.218


In [None]:
# Open a file in write mode. If the file doesn't exist, it will be created.
with open('llama13b.txt', 'w') as file:
    # Iterate through the array
    for item in scores:
        # Write each item to the file, converting it to a string and adding a newline character
        file.write(str(item) + '\n')
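
Because the file holds one score per line, it can be read back later with `np.loadtxt`, for example to compare runs. The sketch below is an illustration under stated assumptions: it writes three toy values to a temporary directory instead of using the notebook's real `scores` list, and the temporary path is there only to keep the example self-contained.

```python
import os
import tempfile
import numpy as np

# Toy values stand in for the notebook's `scores` list (assumption for illustration)
scores = [np.float32(0.9687352), np.float32(0.59663), np.float32(0.84501)]

# Write one score per line, the same way the cell above does
path = os.path.join(tempfile.mkdtemp(), "llama13b.txt")
with open(path, "w") as file:
    for item in scores:
        file.write(str(item) + "\n")

# Read the scores back as a 1-D float array and summarize them
loaded = np.loadtxt(path)
print("num scores:", len(loaded))
print("mean score:", round(float(loaded.mean()) * 100, 3))
```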