# Using llama-cpp-python to summarize a number of wiki articles

## Installation
1. Fork https://github.com/abetlen/llama-cpp-python/forkand clone       
```
git clone https://github.com/<your-git-id>/llama-cpp-python llama
```
2. Update the source tree (download llama-cpp)
```
cd llama
git pull origin
git submodule init
git submodule update
```
3. Install the llama-cpp-python
```
python -m pip install --upgrade --force-reinstall --no-cache-dir .
```

## Coding the summarizer
See following

In [3]:
!pip3 install huggingface-hub




[notice] A new release of pip is available: 23.2 -> 24.0
[notice] To update, run: python.exe -m pip install --upgrade pip


Download the llama2 model which is in pytorch .gguf format.

In [4]:
# Download the model to directory "./models"
from huggingface_hub import snapshot_download
import pathlib

model_id="TheBloke/Llama-2-7B-GGUF"
model_path = pathlib.Path("models/llama-2-7b-gguf")
if not model_path.exists():
    model_path.mkdir(parents = True, exist_ok = True)
    snapshot_download(repo_id=model_id, local_dir="models/llama-2-7B-gguf",
                    local_dir_use_symlinks=False, revision="main")

In [5]:
!pip3 install -r ../vendor/llama.cpp/requirements.txt

ERROR: Could not open requirements file: [Errno 2] No such file or directory: '../vendor/llama.cpp/requirements.txt'

[notice] A new release of pip is available: 23.2 -> 24.0
[notice] To update, run: python.exe -m pip install --upgrade pip


Convert .bin model to gguf model. Check out the ref link https://www.secondstate.io/articles/convert-pytorch-to-gguf/. But following two lines of code still don't work.

In [6]:
!python ../vendor/llama.cpp/convert.py models/llama-2-7b-chat

python: can't open file 'c:\\projects\\llmplayground\\vendor\\llama.cpp\\convert.py': [Errno 2] No such file or directory


In [7]:
!python ../vendor/llama.cpp/convert-llama-ggml-to-gguf.py --input ./models --output llama-2-7b-chat.ggmlv3.gguf 

python: can't open file 'c:\\projects\\llmplayground\\vendor\\llama.cpp\\convert-llama-ggml-to-gguf.py': [Errno 2] No such file or directory


References
- https://github.com/tushitdave/Text_summarization/blob/main/Llama_2_Text_Summ.ipynb
- https://medium.com/@tushitdavergtu/llama2-and-text-summarization-e3eafb51fe28

In [8]:
GREEN = '\033[92m'
END_COLOR = '\033[0m'

import logging as logging
from llama_cpp import Llama
import torch as torch

In [9]:
def load_model(device_type, model_id, model_path, model_basename=None):

    logging.info(f"Loading Model: {model_id}, on: {device_type}")
    logging.info("This action can take a few minutes!")

    if model_basename is not None:
        if ".ggml" in model_basename:
            logging.info("Using Llamacpp for GGML quantized models")

            max_ctx_size = 4096
            kwargs = {
                "model_path": model_path,
                "n_ctx": max_ctx_size,
                "max_tokens": max_ctx_size,
            }
            if device_type.lower() == "mps":
                kwargs["n_gpu_layers"] = 1000
            if device_type.lower() == "cuda":
                kwargs["n_gpu_layers"] = 1000
                kwargs["n_batch"] = max_ctx_size
            return Llama(**kwargs)

In [10]:
DEVICE_TYPE = "cuda" if torch.cuda.is_available() else "cpu"
SHOW_SOURCES = True
logging.info(f"Running on: {DEVICE_TYPE}")
logging.info(f"Display Source Documents set to: {SHOW_SOURCES}")

In [11]:
model_id = "TheBloke/Llama-2-7B-Chat-GGML"
model_basename = "llama-2-7b-chat.ggmlv3.q4_0.bin"
LLM = load_model(device_type=DEVICE_TYPE, model_path="./models/llama-2-7b-gguf/llama-2-7b.Q4_0.gguf", model_id=model_id, model_basename=model_basename)

ValueError: Model path does not exist: ./models/llama-2-7b-gguf/llama-2-7b.Q4_0.gguf

In [None]:
!pip3 install transformers_longformer_tokenizer

ERROR: Could not find a version that satisfies the requirement transformers_longformer_tokenizer (from versions: none)
ERROR: No matching distribution found for transformers_longformer_tokenizer

[notice] A new release of pip is available: 23.2 -> 24.0
[notice] To update, run: python.exe -m pip install --upgrade pip


In [None]:
from transformers import LongformerTokenizer
import requests
import re


tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")

def fetch_and_save_wiki_text(title):
    response = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params={
            "action": "query",
            "format": "json",
            "titles": title,
            "prop": "extracts",
            "explaintext": True,
        },
    ).json()
    
    page = next(iter(response["query"]["pages"].values()))
    wiki_text = page["extract"]
    
    return wiki_text

def clean_text(text):
    # Remove special characters except "."
    text = re.sub(r'[^A-Za-z0-9\s.\(\)\[\]\{\}]+', '', text)
    # Convert to lowercase
    text = text.lower()
    # Remove extra whitespace
    text = ' '.join(text.split())
    return text

def count_tokens(text):
    tokens = tokenizer.encode(text, add_special_tokens=True)
    return len(tokens)

In [None]:
import pandas as pd

wonders_cities = [
    'Beirut',
    'Doha',
    'Durban',
    'Havana',
    'Kuala Lumpur',
    'La Paz',
    'Vigan',
]

data = []
for wonder_city in wonders_cities:
    info = fetch_and_save_wiki_text(wonder_city)
    tokens = tokenizer.encode(info, add_special_tokens=True, truncation=True, max_length=30000)
    num_tokens = len(tokens)
    data.append([wonder_city, info, num_tokens])

df = pd.DataFrame(data, columns=["wonder_city", "information", "num_tokens"])


In [None]:
df.head()

Unnamed: 0,wonder_city,information,num_tokens
0,Beirut,"Beirut ( bay-ROOT; Arabic: بيروت, romanized: )...",12729
1,Doha,"Doha (Arabic: الدوحة, romanized: ad-Dawḥa [adˈ...",11117
2,Durban,"Durban ( DUR-bən; Zulu: eThekwini, from itheku...",8350
3,Havana,Havana (; Spanish: La Habana [la aˈβana] ; Luc...,30000
4,Kuala Lumpur,"Kuala Lumpur (Malaysian: [ˈkualə, -a ˈlumpo(r)...",12925


In [None]:
df["cleaned_information"] = df["information"].apply(clean_text)
df["token_count"] = df["cleaned_information"].apply(count_tokens)
df.head()

Token indices sequence length is longer than the specified maximum sequence length for this model (12083 > 4096). Running this sequence through the model will result in indexing errors


Unnamed: 0,wonder_city,information,num_tokens,cleaned_information,token_count
0,Beirut,"Beirut ( bay-ROOT; Arabic: بيروت, romanized: )...",12729,beirut ( bayroot arabic romanized ) is the cap...,12083
1,Doha,"Doha (Arabic: الدوحة, romanized: ad-Dawḥa [adˈ...",11117,doha (arabic romanized addawa [addua] or adda)...,10143
2,Durban,"Durban ( DUR-bən; Zulu: eThekwini, from itheku...",8350,durban ( durbn zulu ethekwini from itheku mean...,7733
3,Havana,Havana (; Spanish: La Habana [la aˈβana] ; Luc...,30000,havana ( spanish la habana [la aana] lucumi il...,28948
4,Kuala Lumpur,"Kuala Lumpur (Malaysian: [ˈkualə, -a ˈlumpo(r)...",12925,kuala lumpur (malaysian [kual a lumpo(r) (r)])...,12674


In [None]:
!pip3 install langchain

Collecting langchain
  Obtaining dependency information for langchain from https://files.pythonhosted.org/packages/ed/3e/93045d37eba24e0b5eb05312e30cd9e12805ea5f1ae9ba51ec8a7d2f5372/langchain-0.1.16-py3-none-any.whl.metadata
  Downloading langchain-0.1.16-py3-none-any.whl.metadata (13 kB)
Collecting SQLAlchemy<3,>=1.4 (from langchain)
  Obtaining dependency information for SQLAlchemy<3,>=1.4 from https://files.pythonhosted.org/packages/29/82/3e4ca1381a3b0e80f03ba3fafbf047ed6c5f75ff4fd79f1726952c06f604/SQLAlchemy-2.0.29-cp310-cp310-win_amd64.whl.metadata
  Downloading SQLAlchemy-2.0.29-cp310-cp310-win_amd64.whl.metadata (9.8 kB)
Collecting aiohttp<4.0.0,>=3.8.3 (from langchain)
  Obtaining dependency information for aiohttp<4.0.0,>=3.8.3 from https://files.pythonhosted.org/packages/60/69/3febe2b4a12bc34721eb2ddb60b50d9e7fc8bdac98abb4019ffcd8032272/aiohttp-3.9.5-cp310-cp310-win_amd64.whl.metadata
  Downloading aiohttp-3.9.5-cp310-cp310-win_amd64.whl.metadata (7.7 kB)
Collecting async-tim

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
jupyterlab-server 2.22.1 requires requests>=2.28, but you have requests 2.26.0 which is incompatible.
nerfstudio 0.2.2 requires protobuf!=3.20.0,<=3.20.3, but you have protobuf 4.25.3 which is incompatible.
nerfstudio 0.2.2 requires torch<2.0.0,>=1.12.1, but you have torch 2.1.2+cu121 which is incompatible.

[notice] A new release of pip is available: 23.2 -> 24.0
[notice] To update, run: python.exe -m pip install --upgrade pip


In [None]:
from langchain import PromptTemplate
from langchain.chains import LLMChain

def generate_summary(text_chunk, llm_model):
    # Defining the template to generate summary
    #template = """
    #Write a concise summary of the text, return your responses with 5 lines that cover the key points of the text.
    #```{text}```
    #SUMMARY:
    #"""
    #prompt = PromptTemplate(template=template, input_variables=["text"])
    #llm_chain = LLMChain(prompt = prompt, llm = llm_model)

    #summary = llm_chain.run(text_chunk)

    prompts = """
       Write a concise summary of the text, return your responses with 5 lines that cover the key points of the text.
        {}
       SUMMARY:
    """.format(text_chunk)

    output = llm_model(
      prompts, 
      max_tokens = 32, # Generate up to 32 tokens, set to None to generate up to the end of the context window
      stop=["Q:", "\n"], # Stop generating just before the model would generate a new question
      echo=True # Echo the prompt back in the output
    ) # Generate a completion, can also call create_completion

    summary = ""
    res = output["choices"][0]["text"]
    start_index = res.find("SUMMARY:")
    if start_index != -1:
      start_index += 8
      summary = res[start_index:].strip()

    return summary

In [None]:
!pip3 install textsplitter
!pip3 install -qU langchain-text-splitters




[notice] A new release of pip is available: 23.2 -> 24.0
[notice] To update, run: python.exe -m pip install --upgrade pip

[notice] A new release of pip is available: 23.2 -> 24.0
[notice] To update, run: python.exe -m pip install --upgrade pip


In [None]:
print(LLM.model_path)

./models/llama-2-7b-gguf/llama-2-7b.Q4_0.gguf


In [None]:
from langchain_text_splitters import RecursiveCharacterTextSplitter
from tqdm import tqdm

text_splitter = RecursiveCharacterTextSplitter(chunk_size=4096, chunk_overlap=50, length_function=len)

df["summary"] = ""

for index, row in tqdm(df.iterrows(), total=len(df), desc="Generating Summaries"):
    wonder_city = row["wonder_city"]
    text_chunk = row["cleaned_information"]
    chunks = text_splitter.split_text(text_chunk)
    chunk_summaries = []

    for chunk in chunks:
        summary = generate_summary(chunk, LLM)
        chunk_summaries.append(summary)

    combined_summary = "\n".join(chunk_summaries)
    df.at[index, "summary"] = combined_summary

    break

NameError: name 'df' is not defined

In [None]:
j = {'id': 'cmpl-d8c40893-068b-4955-a759-b1867b066d3c', 'object': 'text_completion', 'created': 1713749359, 'model': './models/llama-2-7b-gguf/llama-2-7b.Q4_0.gguf', 'choices': [{'text': '\n       Q:Write a concise summary of the text, return your responses with 5 lines that cover the key points of the text.\n        beirut ( bayroot arabic romanized ) is the capital and largest city of lebanon. as of 2014 greater beirut has a population of 2.5 million which makes it the thirdlargest city in the levant region and the thirteenthlargest in the arab world. the city is situated on a peninsula at the midpoint of lebanons mediterranean coast. beirut has been inhabited for more than 5000 years making it one of the oldest cities in the world. beirut is lebanons seat of government and plays a central role in the lebanese economy with many banks and corporations based in the city. beirut is an important seaport for the country and region and rated a beta world city by the globalization and world cities research network. beirut was severely damaged by the lebanese civil war the 2006 lebanon war and the 2020 massive explosion in the port of beirut. its architectural and demographic structure underwent major change in recent decades. names the english name beirut is an early transcription of the arabic name bayrt (). the same names transcription into french is beyrouth which was sometimes used during lebanons french mandate. the arabic name derives from phoenician brt ( brt). this was a modification of the canaanite and phoenician word brt later brt meaning wells in reference to the sites accessible water table. the name is first attested in the 14th century bc when it was mentioned in three akkadian cuneiform tablets of the amarna letters letters sent by king ammunira of biruta to amenhotep iii or amenhotep iv of egypt. biruta was also mentioned in the amarna letters from king ribhadda of byblos. the greeks hellenised the name as bryts (ancient greek ) which the romans latinised as berytus. when it attained the status of a roman colony it was notionally refounded and its official name was emended to colonia iulia augusta felix berytus to include its imperial sponsors. at the time of the crusades the city was known in french as barut or baruth. prehistory beirut was settled over 5000 years ago and there is evidence that the surrounding area had already been inhabited for tens of thousands of years prior to this. several prehistoric archaeological sites have been discovered within the urban area of beirut revealing flint tools from sequential periods dating from the middle palaeolithic and upper paleolithic through the neolithic to the bronze age. beirut i (minet elhosn) was listed as the town of beirut (french beyrouth ville) by louis burkhalter and said to be on the beach near the orient and bassoul hotels on the avenue des franais in central beirut. the site was discovered by lortet in 1894 and discussed by godefroy zumoffen in 1900. the flint industry from the site was described as mousterian and is held by the museum of fine arts of lyon. beirut ii (umm elkhatib) was suggested by burkhalter to have been south of tarik el jedideh where p.e. gigues discovered a copper age flint industry at around 100 metres (328 feet) above sea level. the site had been built on and destroyed by 1948. beirut iii (furn eshshebbak) listed as plateau tabet was suggested to have been located on the left bank of the beirut river. burkhalter suggested that it was west of the damascus road although this determination has been criticized by lorraine copeland. p. e. gigues discovered a series of neolithic flint tools on the surface along with the remains of a structure suggested to be a hut circle. auguste bergy discussed polished axes that were also found at this site which has now completely disappeared as a result of construction and urbanization of the area. beirut iv (furn eshshebbak river banks) was also on the left bank of the river and on either side of the road leading eastwards from the furn esh shebbak police station towards the river that marked the city limits. the area was covered in red sand that represented quaternary river terraces. the site was found by jesuit father dillenseger and published by fellow jesuits godefroy zumoffen raoul describes and auguste bergy. collections from the site were made by bergy describes and another jesuit paul\n       A:\n    1. I think that there are a lot of people who want to have a vacation in Lebanon, because it is very beautiful. I think it is', 'index': 0, 'logprobs': None, 'finish_reason': 'length'}], 'usage': {'prompt_tokens': 1106, 'completion_tokens': 32, 'total_tokens': 1138}}
res = j["choices"][0]["text"]
start_index = res.find("A:")
if start_index != -1:
    start_index += 2
    print(res[start_index:].strip())


1. I think that there are a lot of people who want to have a vacation in Lebanon, because it is very beautiful. I think it is


In [None]:
df[["wonder_city", "summary"]]

Unnamed: 0,wonder_city,summary
0,Beirut,\n\n\n\n\n\n\n\n\n\n\n\n\n
1,Doha,
2,Durban,
3,Havana,
4,Kuala Lumpur,
5,La Paz,
6,Vigan,
