<a href="https://colab.research.google.com/github/robgon-art/personal-llama/blob/main/1_Inference_with_LLaMa2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install git+https://github.com/robgon-art/llama-cpp-python.git --force-reinstall --upgrade --no-cache-dir --verbose
!pip install llama_index huggingface_hub sentence-transformers
!pip uninstall -y numpy
!pip install numpy==1.25.2

In [None]:
!gdown 15vLnOyyJBtkjhizR-FqMiILpSD-tYgT3
!unzip robgon_articles_md.zip

In [None]:
model_size = "7B"

if model_size == "7B":
  model_name_or_path = "TheBloke/Llama-2-7B-chat-GGUF"
  model_basename = "llama-2-7b-chat.Q4_K_M.gguf"
elif model_size == "13B":
  model_name_or_path = "TheBloke/Llama-2-13B-chat-GGUF"
  model_basename = "llama-2-13b-chat.Q4_K_M.gguf"
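else:
  # Fail early: otherwise model_name_or_path stays undefined and the download cell breaks
  raise ValueError(f"Invalid model size: {model_size!r} (expected '7B' or '13B')")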

In [None]:
from huggingface_hub import hf_hub_download
model_path = hf_hub_download(repo_id=model_name_or_path, filename=model_basename)

In [None]:
from IPython.display import HTML, display

def set_css():
  display(HTML('''
  <style>
    pre {
        white-space: pre-wrap;
    }
  </style>
  '''))
get_ipython().events.register('pre_run_cell', set_css)

In [None]:
from langchain.embeddings import HuggingFaceEmbeddings
from llama_index import (
    SimpleDirectoryReader,
    VectorStoreIndex,
    ServiceContext,
)
from llama_index.llms import LlamaCPP
from llama_index.llms.llama_utils import messages_to_prompt, completion_to_prompt

In [None]:
llm = LlamaCPP(
    # You can pass in the URL to a GGUF model to download it automatically
    model_url=None,
    # optionally, you can set the path to a pre-downloaded model instead of model_url
    model_path=model_path,
    temperature=0.1,
    max_new_tokens=512,
    # Llama 2 has a 4096-token context window; use a lower value to leave some headroom
    context_window=3900,
    # kwargs to pass to __call__()
    generate_kwargs={},
    # kwargs to pass to __init__();
    # set n_gpu_layers to at least 1 to offload layers to the GPU
    model_kwargs={"n_gpu_layers": 43},
    # transform inputs into Llama2 format
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
    verbose=True,
)

AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 
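
The `context_window` and `max_new_tokens` settings above share one token budget. The snippet below is only back-of-the-envelope arithmetic, assuming the prompt and the generated tokens must fit together inside the configured window; it does not show how llama_index packs the prompt internally. The constant name `MODEL_CONTEXT_WINDOW` is made up for this illustration.

```python
# Assumed values: 4096 is the Llama 2 window named in the cell comment;
# the other two are the settings passed to LlamaCPP above.
MODEL_CONTEXT_WINDOW = 4096
context_window = 3900
max_new_tokens = 512

# Tokens left unused as a safety margin below the model's hard limit
headroom = MODEL_CONTEXT_WINDOW - context_window

# Tokens available for the system prompt, retrieved text, and chat history
prompt_budget = context_window - max_new_tokens

print(f"headroom: {headroom} tokens")
print(f"prompt budget: {prompt_budget} tokens")
```

Raising `max_new_tokens` allows longer answers but, under this assumption, leaves less room for retrieved article text.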


In [None]:
response_iter = llm.stream_complete("What is Muybridge Derby?")

for response in response_iter:
    print(response.delta, end="", flush=True)

  Thank you for asking! I'm here to help you with your question. However, I must inform you that "Muybridge Derby" is not a real or recognized term in any context. It's possible that it may be a misspelling or a made-up term, and I cannot provide information on something that does not exist.
If you could provide more context or clarify the term you are referring to, I would be happy to help you to the best of my abilities.

In [None]:
import glob
import pandas as pd

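def extract_info_from_file(filename):
    # Read only the header. Assumed layout: title, subtitle, author, date, "nickname - url"
    with open(filename, 'r', encoding='utf-8') as f:
        lines = [line.strip() for line in f.readlines()[:6]]

    # Split the "nickname - url" line once, at the first ' - ' separator
    nickname, url = lines[4].split(' - ', 1)

    info = {
        'filename': filename,
        'title': lines[0].replace('# ', ''),
        'subtitle': lines[1].replace('## ', ''),
        'author': lines[2],
        # The date sits on the fourth line (index 3); index 4 is the "nickname - url" line
        'date': lines[3].replace('</br>', ''),
        'nickname': nickname,
        'url': url.replace('</br>', '')
    }
    return info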

files = glob.glob('/content/robgon_articles_md/*.md')
data_list = [extract_info_from_file(file) for file in files]

df = pd.DataFrame(data_list)
df

Unnamed: 0,filename,title,subtitle,author,date,nickname,url
0,/content/robgon_articles_md/2020-09-01_got-wri...,Got Writer’s Block? It’s GPT-2 to the Rescue!,Using AI to create plot summaries of books tha...,Robert A. Gonsalves,PlotJam - https://medium.com/towards-data-scie...,PlotJam,https://medium.com/towards-data-science/got-wr...
1,/content/robgon_articles_md/2021-07-05_magnet-...,#Hands-on Tutorials,# MAGNet: Modern Art Generator using Deep Neur...,"## How I used CLIP, SWAGAN, and a custom genet...",MAGnet - https://medium.com/towards-data-scien...,MAGnet,https://medium.com/towards-data-science/magnet...
2,/content/robgon_articles_md/2022-08-08_explori...,Exploring DALL-E for Digital Art Creation,I tested OpenAI’s text-to-image generator to s...,Robert A. Gonsalves,DALL-E Art - https://medium.com/towards-data-s...,DALL-E Art,https://medium.com/towards-data-science/explor...
3,/content/robgon_articles_md/2021-10-27_spookyg...,SpookyGAN - Rendering Scary Faces with Machine...,"How to use StyleGAN 2, VQGAN, and CLIP to crea...",Robert A. Gonsalves,SpookyGAN - https://medium.com/towards-data-sc...,SpookyGAN,https://medium.com/towards-data-science/spooky...
4,/content/robgon_articles_md/2022-03-08_deep-ha...,Deep Haiku: Teaching GPT-J to Compose with Syl...,How to generate rhythmic prose after fine-tuni...,Robert A. Gonsalves,Deep Haiku - https://medium.com/towards-data-s...,Deep Haiku,https://medium.com/towards-data-science/deep-h...
5,/content/robgon_articles_md/2023-07-25_muybrid...,Muybridge Derby: Bringing Animal Locomotion Ph...,How I used Midjourney and RunwayML to transfor...,Robert A. Gonsalves,Muybridge Derby - https://medium.com/towards-d...,Muybridge Derby,https://medium.com/towards-data-science/muybri...
6,/content/robgon_articles_md/2021-02-01_invento...,InventorBot: Using AI to Generate New Ideas in...,How a neural network trained on the US Patent ...,Robert A. Gonsalves,InventorBot - https://medium.com/geekculture/i...,InventorBot,https://medium.com/geekculture/inventorbot-usi...
7,/content/robgon_articles_md/2023-01-04_using-c...,Using ChatGPT as a Creative Writing Partner — ...,How the latest language model from OpenAI can ...,Robert A. Gonsalves,ChatGPT Prose - https://medium.com/towards-dat...,ChatGPT Prose,https://medium.com/towards-data-science/using-...
8,/content/robgon_articles_md/2022-11-09_digital...,"Digital Art Showdown: Stable Diffusion, DALL-E...",A comparison of popular AI diffusion models fo...,Robert A. Gonsalves,Digital Art Showdown - https://medium.com/towa...,Digital Art Showdown,https://medium.com/towards-data-science/digita...
9,/content/robgon_articles_md/2021-06-03_ai-meme...,AI-Memer: Using Machine Learning to Create Fun...,How to create new memes using images from Wiki...,Robert A. Gonsalves,AIMemer - https://medium.com/towards-data-scie...,AIMemer,https://medium.com/towards-data-science/ai-mem...
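
Row 1 in the table above is shifted by one column because that file starts with an extra `#Hands-on Tutorials` line before the real title. One possible way to tolerate such a line is sketched below, under the assumption that the title is the first line starting with `# ` (hash plus space) and the subtitle follows it. `parse_title_block` and the sample lines other than the kicker are hypothetical, not part of the notebook.

```python
def parse_title_block(lines):
    """Return (title, subtitle), skipping any lines before the first '# ' heading."""
    for i, line in enumerate(lines):
        if line.startswith('# '):
            title = line[2:]
            # The subtitle is assumed to be the next line, usually prefixed with '## '
            subtitle = lines[i + 1] if i + 1 < len(lines) else ''
            if subtitle.startswith('## '):
                subtitle = subtitle[3:]
            return title, subtitle
    # No '# ' heading found
    return None, None

# Placeholder header with a kicker line like the one seen in row 1
sample = ["#Hands-on Tutorials", "# Example Title", "## Example subtitle", "Author Name"]
print(parse_title_block(sample))
```

The remaining fields (author, date, nickname, URL) could then be read relative to the index where the title was found, rather than from fixed positions.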


In [None]:
# use Huggingface embeddings
embed_model = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-mpnet-base-v2"
)

# create a service context
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)

def filename_fn(filename):
  entry = df[df['filename'] == filename].iloc[0].to_dict()
  entry.pop('filename', None)  # Remove the 'filename' key if it exists
  return entry

# load documents
documents = SimpleDirectoryReader("/content/robgon_articles_md", file_metadata=filename_fn).load_data()

The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.


[nltk_data] Downloading package punkt to /tmp/llama_index...
[nltk_data]   Unzipping tokenizers/punkt.zip.


In [None]:
print("Num documents:", len(documents))
print()
for doc in documents[:5]:
  print(doc.metadata)
  print(doc.get_content())
  print("****")

Num documents: 632

{'title': 'MachineRay: Using AI to Create Abstract Art', 'subtitle': 'How I trained a GAN using public domain paintings', 'author': 'Robert A. Gonsalves', 'date': 'MachineRay - https://medium.com/towards-data-science/machineray-using-ai-to-create-abstract-art-39829438076a', 'nickname': 'MachineRay', 'url': 'https://medium.com/towards-data-science/machineray-using-ai-to-create-abstract-art-39829438076a'}


MachineRay: Using AI to Create Abstract Art
Robert A. Gonsalves
Aug 3, 2020
MachineRay - https://medium.com/towards-data-science/machineray-using-ai-to-create-abstract-art-39829438076a
For the past three months, I have been exploring the latest techniques in Artificial Intelligence (AI) and Machine Learning (ML) to create abstract art. During my investigation, I learned that three things are needed to create abstract paintings: (A) source images, (B) an ML model, and (C) a lot of time to train the model on a high-end GPU. Before I discuss my work, let’s take a look

In [None]:
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
system_prompt = """You are a helpful assistant who answers questions about Robert A. Gonsalves's projects.
 Answer the questions in a straightforward way, using the provided context, but don't mention terms like 'the context' or 'the articles provided'.
 Use the metadata in the context to identify the articles, and only refer to articles that are relevant to the user's question.
 """
chat_engine = index.as_chat_engine(chat_mode='context', similarity_top_k=2, system_prompt=system_prompt)
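
With `similarity_top_k=2`, the chat engine passes the two best-matching text chunks to the model for each question. The toy sketch below illustrates the general idea of top-k retrieval by cosine similarity using hand-made 2-D vectors. It is not llama_index's implementation; `cosine`, `top_k`, and the chunk names are invented for this example, and real embeddings from `all-mpnet-base-v2` have far more dimensions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec, chunk_vecs, k=2):
    """Return the names of the k chunks most similar to the query vector."""
    ranked = sorted(
        chunk_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]

# Made-up embeddings for three chunks and one query
chunks = {
    "chunk_a": [1.0, 0.0],
    "chunk_b": [0.8, 0.6],
    "chunk_c": [0.0, 1.0],
}
print(top_k([1.0, 0.1], chunks, k=2))
```

A larger `k` gives the model more material to draw on, at the cost of more prompt tokens per question.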

In [None]:
def print_response(response):
  started_printing = False
  for token in response.response_gen:
    if token == '</s>':
      break
    if not started_printing:
      token = token.lstrip()
      if token:
        started_printing = True
    print(token, end="")
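
To check what `print_response` filters out without loading the model, the same logic can be run on a hand-written token list. In this sketch, `filter_tokens` and the sample tokens are stand-ins made up for illustration; the function returns the text instead of printing it so the result is easy to inspect.

```python
def filter_tokens(tokens):
    """Mirror print_response: drop leading whitespace and stop at '</s>'."""
    pieces = []
    started = False
    for token in tokens:
        if token == '</s>':
            # End-of-sequence marker: ignore it and everything after it
            break
        if not started:
            # Strip whitespace until the first visible character appears
            token = token.lstrip()
            if token:
                started = True
        pieces.append(token)
    return "".join(pieces)

# Hypothetical token stream with leading whitespace and a trailing marker
sample_tokens = ["  ", " Muy", "bridge", " Derby", "</s>", "ignored"]
print(filter_tokens(sample_tokens))
```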

In [None]:
# Global list
articles = []

def get_articles(response):
    global articles  # Indicate that we're using the global articles list

    for r in response.sources:
        # Split the content by lines
        lines = r.content.split('\n')

        for line in lines:
            if line.startswith("title:") and line not in articles:
                articles.append(line)  # Add to the global articles list
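
`get_articles` relies on each retrieved source carrying its metadata as `title: ...` lines in its text. The sketch below exercises the same collection logic on stub objects built with `SimpleNamespace`. The stubs, the `collect_titles` name, and the article titles are placeholders, and the exact text layout of real sources is an assumption.

```python
from types import SimpleNamespace

def collect_titles(response, seen):
    """Append each new 'title:' line found in the response sources to `seen`."""
    for source in response.sources:
        for line in source.content.split('\n'):
            if line.startswith("title:") and line not in seen:
                seen.append(line)
    return seen

# Two stub sources from the same article plus one from another article
fake_response = SimpleNamespace(sources=[
    SimpleNamespace(content="title: Example Article A\nauthor: Someone\n\nFirst chunk"),
    SimpleNamespace(content="title: Example Article A\nauthor: Someone\n\nSecond chunk"),
    SimpleNamespace(content="title: Example Article B\nauthor: Someone\n\nOther chunk"),
])

titles = collect_titles(fake_response, [])
print(titles)
```

Passing the list in explicitly, as done here, avoids the global variable; the notebook's version keeps the global so later cells can read `articles` directly.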

In [None]:
def format_query(query, articles):
    context_string = "These articles were referenced in the previous answer:\n"
    if articles:
        context_string += "\n".join(articles)
    context_string += "\n"
    return context_string + query + " Be brief."
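
To see the exact text that a follow-up question turns into, `format_query` can be called on its own. The function is repeated here so the snippet runs standalone; the article title is a placeholder.

```python
def format_query(query, articles):
    # Same logic as the cell above, repeated so this snippet is self-contained
    context_string = "These articles were referenced in the previous answer:\n"
    if articles:
        context_string += "\n".join(articles)
    context_string += "\n"
    return context_string + query + " Be brief."

# Placeholder title line in the same "title: ..." form that get_articles collects
prompt = format_query("Tell me more about the AI systems.", ["title: Example Article A"])
print(prompt)
```

The preamble reminds the model which articles the conversation is about, so a vague follow-up such as "Tell me more" stays anchored to them.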

In [None]:
chat_engine.reset()
articles = []

In [None]:
query = "What is Muybridge Derby?" # @param {type:"string"}
response = chat_engine.stream_chat(query)
get_articles(response)
print_response(response)

Muybridge Derby is a project by Robert A. Gonsalves where he used Midjourney and RunwayML to transform Eadweard Muybridge's photo sequences into high-resolution videos.

In [None]:
query = "Tell me more about the AI systems." # @param {type:"string"}
query = format_query(query, articles)
response = chat_engine.stream_chat(query)
get_articles(response)
print_response(response)

In the Muybridge Derby project, Robert A. Gonsalves used two AI systems to transform Eadweard Muybridge's photo sequences into high-resolution videos:
1. Midjourney: a tool for creating interactive visual stories using AI-generated images and text.
2. RunwayML: a platform for creating, training, and deploying machine learning models.
These systems allowed Gonsalves to transform Muybridge's static photographs into dynamic videos that bring the animal locomotion to life.

In [None]:
query = "What music did he use in the video?" # @param {type:"string"}
query = format_query(query, articles)
response = chat_engine.stream_chat(query)
get_articles(response)
print_response(response)

In the Muybridge Derby project, Robert A. Gonsalves used a song generated with AI for a previous article as the music played over the credits. The song is called "I'll Get There When I Get There," which is kinda appropriate for a derby race.

In [None]:
print(articles)