In [1]:
# python.exe -m pip install --upgrade pip
# pip install ipykernel
# pip install jupyter
# python -m pip install bitsandbytes --prefer-binary --extra-index-url=https://jllllll.github.io/bitsandbytes-windows-webui
# pip install langchain einops accelerate transformers scipy xformers
# pip install torch==2.0.1+cu117 torchvision==0.15.2+cu117 -f https://download.pytorch.org/whl/torch_stable.html
# pip install sentencepiece sentence_transformers
# pip install llama-index
# pip install llama-hub
# pip install streamlit

In [1]:
# Import transformer classes
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer
from transformers import BitsAndBytesConfig
# Import torch
import torch

In [2]:
torch.cuda.is_available()

True

In [3]:
# model name
name = "meta-llama/Llama-2-7b-chat-hf"
# name = "togethercomputer/LLaMA-2-7B-32K"
# Set auth variable from hugging face
auth_token = ""
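Hard-coding the token in a notebook makes it easy to leak when the file is shared. Below is a minimal sketch of reading it from the environment instead. The variable name `HF_TOKEN` and the helper `read_hf_token` are assumptions for illustration and are not part of the original notebook.

```python
import os


def read_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Return the Hugging Face token stored in `env_var`, or "" if it is unset.

    `HF_TOKEN` is an assumed variable name; use whatever name you export.
    """
    return os.environ.get(env_var, "").strip()


# Falls back to an empty string, matching the placeholder used in this notebook.
auth_token = read_hf_token()
```

An empty result means the variable was not set, so gated models such as the Llama 2 checkpoints would still fail to download until a real token is provided.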

In [4]:
# Create tokenizer
tokenizer = AutoTokenizer.from_pretrained(
    name,
    cache_dir="./model/",
    use_auth_token=auth_token,
)

To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to see activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development


In [5]:
# Create quantization config
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
)
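To see why 4-bit loading matters on a consumer GPU, here is a rough back-of-the-envelope sketch of the memory taken by the weights alone at different precisions. The round figure of 7 billion parameters is an assumption for a "7B" model, and the estimate ignores quantization constants, activations, the KV cache, and CUDA overhead, so real usage will be higher.

```python
def weight_memory_gib(num_params: int, bits_per_param: int) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 2**30


# Assumed round parameter count for a "7B" model (illustrative only).
NUM_PARAMS = 7_000_000_000

for label, bits in [("fp16/bf16", 16), ("int8", 8), ("4-bit (nf4)", 4)]:
    print(f"{label:>12}: {weight_memory_gib(NUM_PARAMS, bits):.2f} GiB")
```

Under these assumptions the 4-bit weights need roughly a quarter of the space of the 16-bit weights, which is what makes the model fit on smaller cards.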

In [6]:
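# Create model
model = AutoModelForCausalLM.from_pretrained(
    name,
    cache_dir="./model/",
    use_auth_token=auth_token,
    # Dynamic RoPE scaling; the factor is written as a float
    rope_scaling={"type": "dynamic", "factor": 2.0},
    # 4-bit loading is requested through quantization_config alone;
    # also passing load_in_4bit=True is redundant and is rejected by some transformers versions
    quantization_config=quantization_config,
)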



bin d:\Git Repo Local Archive\GAN Poetry\project\Lib\site-packages\bitsandbytes\libbitsandbytes_cuda117.dll


In [8]:
# Setup prompt
prompt = "### User:What is the fastest car in the world and how much does it cost? ### Assistant:"
# Pass the prompt to the tokenizer
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Setup text streamer
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

In [9]:
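# Actually run the thing
# Unpack inputs so the attention mask is passed along with the input ids.
# max_new_tokens expects an integer; 512 is an arbitrary cap, adjust as needed.
output = model.generate(**inputs, streamer=streamer, use_cache=True, max_new_tokens=512)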

The fastest car in the world is subjective and can vary depending on the criteria used to measure speed. Unterscheidung between a sports car and a supercar. A sports car is typically designed for speed and handling on a racetrack, while a supercar is a high-performance vehicle with luxury features and amenities.

Some of the fastest cars in the world include:

1. Bugatti Chiron Super Sport - 300+ mph (483+ km/h) - $3 million+
2. Hennessey Venom F5 - 301+ mph (485+ km/h) - $1.6 million+
3. Koenigsegg Regera - 248+ mph (399+ km/h) - $2.5 million+
4. Pagani Huayra BC - 238+ mph (383+ km/h) - $2.8 million+
5. Rimac C_Two - 258+ mph (415+ km/h) - $1.9 million+

Note: These prices are approximate and can vary depending on location, taxes, and other factors.

It's important to note that these cars are not only fast but also very expensive, and their prices can vary greatly depending on the brand, model, and other factors. Additionally, the top speed of a car is not the only factor to consider
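The text above appears piece by piece because `generate` hands token ids to the streamer as they are produced. The following toy sketch illustrates that callback pattern with a `put`/`end` interface modelled on the streamer classes in `transformers`. `ToyStreamer`, `fake_generate`, and the tiny vocabulary are hypothetical stand-ins, not library code.

```python
from typing import Dict, List


class ToyStreamer:
    """Prints decoded text chunk by chunk, optionally skipping the prompt."""

    def __init__(self, vocab: Dict[int, str], skip_prompt: bool = True) -> None:
        self.vocab = vocab
        self.skip_prompt = skip_prompt
        self.next_is_prompt = True  # the first call carries the prompt ids
        self.pieces: List[str] = []

    def put(self, token_ids: List[int]) -> None:
        # Drop the first batch of ids when the prompt should not be echoed.
        if self.next_is_prompt:
            self.next_is_prompt = False
            if self.skip_prompt:
                return
        text = "".join(self.vocab[i] for i in token_ids)
        self.pieces.append(text)
        print(text, end="", flush=True)

    def end(self) -> None:
        print()  # finish the line once generation stops


def fake_generate(prompt_ids: List[int], new_ids: List[int], streamer: ToyStreamer) -> List[int]:
    """Stand-in for a generation loop: push the prompt, then one token at a time."""
    streamer.put(prompt_ids)
    for tok in new_ids:
        streamer.put([tok])
    streamer.end()
    return prompt_ids + new_ids


vocab = {0: "Q: ", 1: "fast", 2: "est", 3: " car"}
streamer = ToyStreamer(vocab)
output_ids = fake_generate([0], [1, 2, 3], streamer)
```

Note that the returned ids still contain the prompt even though the streamer skipped it, which mirrors why the decoded `output` in the next cells starts with the original question.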

In [10]:
# Convert the output tokens back to text
output_text = tokenizer.decode(output[0], skip_special_tokens=True)

In [11]:
print(output_text)

### User:What is the fastest car in the world and how much does it cost? ### Assistant:The fastest car in the world is subjective and can vary depending on the criteria used to measure speed. Unterscheidung between a sports car and a supercar. A sports car is typically designed for speed and handling on a racetrack, while a supercar is a high-performance vehicle with luxury features and amenities.

Some of the fastest cars in the world include:

1. Bugatti Chiron Super Sport - 300+ mph (483+ km/h) - $3 million+
2. Hennessey Venom F5 - 301+ mph (485+ km/h) - $1.6 million+
3. Koenigsegg Regera - 248+ mph (399+ km/h) - $2.5 million+
4. Pagani Huayra BC - 238+ mph (383+ km/h) - $2.8 million+
5. Rimac C_Two - 258+ mph (415+ km/h) - $1.9 million+

Note: These prices are approximate and can vary depending on location, taxes, and other factors.

It's important to note that these cars are not only fast but also very expensive, and their prices can vary greatly depending on the brand, model, and
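The decoded text repeats the prompt because the generated sequence starts with the prompt's own token ids. Below is a minimal sketch of dropping that prefix, using plain Python lists and made-up ids in place of tensors; `strip_prompt` is a hypothetical helper.

```python
from typing import List


def strip_prompt(output_ids: List[int], prompt_ids: List[int]) -> List[int]:
    """Return only the newly generated ids, assuming the output starts with the prompt."""
    if output_ids[: len(prompt_ids)] != prompt_ids:
        raise ValueError("output does not start with the prompt ids")
    return output_ids[len(prompt_ids):]


prompt_ids = [1, 835, 4911]                 # made-up ids for illustration
output_ids = prompt_ids + [450, 29871, 2]   # prompt followed by new tokens
new_ids = strip_prompt(output_ids, prompt_ids)
print(new_ids)
```

With the tensors used in this notebook, the equivalent slice would be `output[0][inputs["input_ids"].shape[1]:]`, which can then be passed to `tokenizer.decode`.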

In [9]:
# Import the prompt wrapper...but for llama index
from llama_index.prompts.prompts import SimpleInputPrompt
# Create a system prompt
system_prompt = """<s>[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as 
helpfully as possible, while being safe. Your answers should not include
any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content.
Please ensure that your responses are socially unbiased and positive in nature.

If a question does not make any sense, or is not factually coherent, explain 
why instead of answering something not correct. If you don't know the answer 
to a question, please don't share false information.

Your goal is to provide answers relating to the Constitution of India.<</SYS>>"""
# Throw together the query wrapper
query_wrapper_prompt = SimpleInputPrompt("{query_str} [/INST]")

In [10]:
# Complete the query prompt
query_wrapper_prompt.format(query_str='hello')

'hello [/INST]'
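The wrapped query only closes the `[INST]` block; the opening tag and the `<<SYS>>` section come from the system prompt. Here is a plain-string sketch of how the two pieces plausibly fit together. The shortened system prompt, the `build_prompt` helper, and the single-space join are assumptions for illustration; the exact concatenation inside the LlamaIndex wrapper may differ.

```python
SYSTEM_PROMPT = "<s>[INST] <<SYS>>\nYou are a helpful assistant.<</SYS>>"
QUERY_TEMPLATE = "{query_str} [/INST]"


def build_prompt(query: str, system_prompt: str = SYSTEM_PROMPT,
                 template: str = QUERY_TEMPLATE) -> str:
    """Join the system prompt and the wrapped query with a single space (assumed)."""
    wrapped = template.format(query_str=query)
    return f"{system_prompt} {wrapped}"


full_prompt = build_prompt("hello")
print(full_prompt)
```

The result contains exactly one `[INST] ... [/INST]` pair, which is the shape the Llama 2 chat models were trained on.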

In [11]:
# Import the llama index HF Wrapper
from llama_index.llms import HuggingFaceLLM
# Create a HF LLM using the llama index wrapper
llm = HuggingFaceLLM(
    context_window=4096,
    max_new_tokens=1024,
    system_prompt=system_prompt,
    query_wrapper_prompt=query_wrapper_prompt,
    model=model,
    tokenizer=tokenizer
)

In [12]:
# Bring in embeddings wrapper
from llama_index.embeddings import LangchainEmbedding
# Bring in HF embeddings - need these to represent document chunks
from langchain.embeddings.huggingface import HuggingFaceEmbeddings

In [13]:
# Create an Embeddings instance
embeddings = LangchainEmbedding(
    HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
)


In [14]:
# Bring in stuff to change service context
from llama_index import set_global_service_context
from llama_index import ServiceContext

In [15]:
# Create new service context instance
service_context = ServiceContext.from_defaults(
    chunk_size=1024,
    llm=llm,
    embed_model=embeddings
)
# And set the service context
set_global_service_context(service_context)

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\Sagar\AppData\Local\llama_index...
[nltk_data]   Unzipping tokenizers\punkt.zip.
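The `chunk_size=1024` setting controls how documents are cut into pieces before they are embedded. The sketch below shows the general idea using whitespace-separated words as a stand-in for tokens. LlamaIndex's own splitter is token and sentence aware, so `chunk_words` is a simplified, hypothetical illustration rather than the library's algorithm.

```python
from typing import List


def chunk_words(text: str, chunk_size: int, overlap: int = 0) -> List[str]:
    """Split text into chunks of at most `chunk_size` words, sharing `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


text = " ".join(f"w{i}" for i in range(10))
chunks = chunk_words(text, chunk_size=4, overlap=1)
print(chunks)
```

Larger chunks give the model more context per retrieved piece but leave room for fewer pieces inside the context window, so the value is a trade-off rather than a fixed rule.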


In [16]:
# Import classes to load documents
from llama_index import VectorStoreIndex, download_loader

In [19]:
# Download PDF loader
PyMuPDFReader = download_loader("PyMuPDFReader")
# Create PDF loader
loader = PyMuPDFReader()
# Load documents
documents = loader.load(
    file_path=r"D:\Git Repo Local Archive\GAN Poetry\10_Two-stage ELM for phishing Web pages detection using.pdf",
    metadata=True,
)

In [20]:
# Create an index - we'll be able to query this in a sec
index = VectorStoreIndex.from_documents(documents)

In [21]:
# Setup index query engine using LLM
query_engine = index.as_query_engine()
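When a question arrives, a vector index retrieves the chunks whose embeddings are most similar to the question's embedding and passes them to the LLM. Here is a toy sketch of that retrieval step with hand-written three-dimensional vectors instead of real embeddings. `cosine_similarity`, `top_k`, and `k=2` are illustrative choices, not the library's internals.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: Sequence[float],
          chunk_vecs: Sequence[Sequence[float]],
          k: int = 2) -> List[Tuple[int, float]]:
    """Return (chunk index, similarity) pairs for the k most similar chunks."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for chunk embeddings.
chunk_vecs = [[1.0, 0.0, 0.0], [0.7, 0.7, 0.0], [0.0, 0.0, 1.0]]
query_vec = [1.0, 0.1, 0.0]
hits = top_k(query_vec, chunk_vecs, k=2)
print(hits)
```

In the real pipeline the vectors come from the embedding model configured in the service context, and the text of the selected chunks is inserted into the prompt sent to the LLM.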

In [22]:
# Test out a query in Natural language
response = query_engine.query("Summarize this article in 2000 words")

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


In [23]:
print(response.response)

 The article discusses the problem of phishing on the internet and the need for an efficient method for detecting phishing websites. The authors propose a two-stage framework using Extreme Learning Machine (ELM) for phishing detection, which combines textual content features and rule-based features.

The first stage of the framework involves using ELM to construct a textual content classification model, which predicts the labels of textual contents. The authors propose nine basic features, including the URL, IP address, domain name, and MX record, to construct the textual content features. They also introduce a rule-based feature that represents the relationship between internal and external links on a website.

In the second stage, the authors utilize a linear combination model-based ensemble ELM (LC-ELM) to construct a detection module on the hybrid features. The hybrid features include the textual content labels predicted by the ELM model and the rule-based feature. The LC-ELM model