## LlamaIndex and Llama 2 tutorial on Lonestar 6

Llama 2 is Meta's open source large language model (LLM). LlamaIndex is a Python library that connects your data to LLMs such as Llama 2, so you can quickly use unstructured data as the basis for chats and other outputs.


In [1]:
import os
import logging
import sys
from IPython.display import Markdown, display
from llama_index.query_engine import CitationQueryEngine
from llama_index.llms import HuggingFaceLLM
from llama_index.prompts import PromptTemplate, PromptType
from pathlib import Path
from llama_index import download_loader, KnowledgeGraphIndex, SimpleDirectoryReader
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

from LLM_location import *

In [2]:
import torch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


## Set your working directory
Change your working directory to your Scratch location. This improves performance and ensures you have access to the model you rsynced earlier.

In [3]:
scratch = ! echo $SCRATCH

! pwd


/work/06659/wmobley/ls6/sites-and-stories-nlp-jupyterenv
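The cell above prints the current directory but does not move into Scratch. A minimal sketch of the switch, assuming the `SCRATCH` environment variable is set on the system (it falls back to the current directory otherwise); `resolve_workdir` is a hypothetical helper, not part of this tutorial's code:

```python
import os
from pathlib import Path


def resolve_workdir(env_var="SCRATCH"):
    """Return the directory named by env_var, or the current directory if it is unset or missing."""
    value = os.environ.get(env_var)
    if value and Path(value).is_dir():
        return Path(value)
    return Path.cwd()


# os.chdir changes the directory of the running kernel,
# so the change persists for later cells.
workdir = resolve_workdir()
os.chdir(workdir)
print(f"Working directory: {Path.cwd()}")
```

Using `os.chdir` rather than a `!` shell command matters here because shell commands launched with `!` run in a subshell and do not change the notebook's own working directory.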


## Access the model
Next we'll access the models. Four models are available: the 7-billion and 13-billion-parameter versions, each as a base model and as a chat model. The folder also contains the 70B-parameter models; however, we have not tested their performance on the LS6 dev machines.



In [4]:
model = llm_location()
c = corpus()
display(c.fc)


FileChooser(path='/work/06659/wmobley/ls6/sites-and-stories-nlp-jupyterenv', filename='', title='', show_hidde…

## Select Model
For this script we will choose the Llama 2 13B parameter chat model.

In [5]:
display(model.dropdown)

Dropdown(description='Number:', index=3, options=('LLAMA2 7B', 'LLAMA2 7B CHAT', 'LLAMA2 13B', 'LLAMA2 13B CHA…

In [6]:
c.unzip()

## Load the Model
Next we'll load the model. If the model can't be found locally, it will be downloaded.

In [7]:

llm = HuggingFaceLLM(
    context_window=4096,
    max_new_tokens=2048,
    generate_kwargs={"temperature": 0.0, "do_sample": False},
    # query_wrapper_prompt=query_wrapper_prompt,
    tokenizer_name=model.get_llm_path(),
    model_name=model.get_llm_path(),
    device_map="balanced",
    # Change the settings below depending on your GPU,
    # for example by adding "torch_dtype": torch.float16.
    model_kwargs={"load_in_8bit": False, "cache_dir": f"{scratch[0]}"},
)




Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]
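When deciding whether to set `load_in_8bit` or `torch_dtype`, a rough estimate of the memory needed for the weights can help. The sketch below is back-of-the-envelope arithmetic only: the parameter counts are the rounded nominal sizes (an assumption), and it ignores activations, the key-value cache, and framework overhead, so real usage will be higher.

```python
def weight_memory_gb(n_params, bytes_per_param):
    """Approximate memory for model weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}

# Rounded nominal parameter counts (assumption, not exact model sizes).
for name, n_params in [("7B", 7e9), ("13B", 13e9), ("70B", 70e9)]:
    row = ", ".join(
        f"{dtype}: {weight_memory_gb(n_params, nbytes):.0f} GB"
        for dtype, nbytes in BYTES_PER_PARAM.items()
    )
    print(f"Llama 2 {name} -> {row}")
```

Comparing these figures with the total memory of the GPUs on your node gives a first indication of which precision is feasible.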

## Load the PDF documents 

In [8]:
required_exts = ['.pdf']
reader = SimpleDirectoryReader(
    input_dir=c.corpus_path,

    required_exts=required_exts,
    recursive=True,
)

documents = reader.load_data()
print(f"Loaded {len(documents)} docs")

ImportError: pypdf is required to read PDF files: `pip install pypdf`
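The error above means the PDF reader's dependency is not installed in the active environment. One way to catch this before loading documents is to check for the package first. This is a small sketch using only the standard library; `missing_packages` is a hypothetical helper name.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(["pypdf"])
if missing:
    print("Install before loading PDFs: pip install " + " ".join(missing))
else:
    print("pypdf is available")
```

After installing the package, rerun the loading cell so that `documents` is defined for the cells that follow.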

In [None]:
from llama_index import VectorStoreIndex, ServiceContext, set_global_service_context


In [None]:
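# Bundle the LLM with a locally run embedding model and make it the global default.
service_context = ServiceContext.from_defaults(
    llm=llm, embed_model="local"
)
set_global_service_context(service_context)

# Embed the loaded documents and store them in a vector index.
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

# Build a query engine that answers from the index and cites its sources.
query_engine = CitationQueryEngine.from_args(
    index,
    # number of most similar chunks retrieved for each query
    similarity_top_k=3,
    # here we can control how granular citation sources are, the default is 512
    citation_chunk_size=150,
)


def query(text):
    """Run a query, display the answer in bold, and return the response object."""
    response = query_engine.query(text)
    display(Markdown(f"<b>{response}</b>"))
    return response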
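To build intuition for what `citation_chunk_size` and `similarity_top_k` control, here is a toy, standard-library illustration of chunking followed by top-k retrieval. It is not how LlamaIndex works internally: it assumes whitespace-separated words instead of tokens and bag-of-words cosine similarity instead of learned embeddings, and all function names are hypothetical.

```python
import math
from collections import Counter


def chunk_words(text, chunk_size):
    """Split text into consecutive chunks of at most chunk_size whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def cosine(a, b):
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(question, chunks, k):
    """Return the k (score, chunk) pairs most similar to the question, best first."""
    q = Counter(question.lower().split())
    scored = [(cosine(q, Counter(c.lower().split())), c) for c in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


# Made-up text for illustration only.
text = (
    "Robust decision making stress tests strategies across many futures. "
    "Sampling methods generate the futures. "
    "Adaptive plans switch actions when tipping points are reached."
)
chunks = chunk_words(text, chunk_size=8)
for score, chunk in top_k("which sampling methods generate futures", chunks, k=2):
    print(f"{score:.2f}  {chunk}")
```

Smaller chunks give more precise citations but less context per retrieved piece; a larger `k` passes more material to the LLM at the cost of a longer prompt.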

In [None]:
# response = query("What algorithms are used for optimization ?")


In [None]:
query("What sampling approaches are used for estimating states of the world?")


In [None]:
response = query("How computationally intensive are DMDU Processes?")


In [None]:
response = query("what makes robust decision making difficult to understand? Please cite sources")


In [None]:
response = query("what is an integrated modeling platform? Please provide source bibliography")


In [None]:
for source in response.source_nodes:
    display(source.node.metadata)
    display(Markdown(source.node.get_text()))
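When many sources are returned, a compact numbered listing can be easier to scan than the full text of each node. The sketch below relies only on the `node.metadata` and `node.get_text()` accessors used above; `format_sources` is a hypothetical helper, and the stand-in objects and their metadata keys are invented so the example runs without a loaded model.

```python
from types import SimpleNamespace


def format_sources(source_nodes, preview_chars=80):
    """Build one numbered line per source: its metadata plus a short text preview."""
    lines = []
    for number, source in enumerate(source_nodes, start=1):
        # Collapse newlines and repeated spaces so each source fits on one line.
        text = " ".join(source.node.get_text().split())
        preview = text[:preview_chars] + ("..." if len(text) > preview_chars else "")
        lines.append(f"[{number}] {source.node.metadata} :: {preview}")
    return lines


# Hypothetical stand-in for response.source_nodes; the metadata keys are assumptions.
fake_node = SimpleNamespace(
    metadata={"file_name": "example.pdf", "page_label": "3"},
    get_text=lambda: "Source 1:\nRobust decision making evaluates strategies over many futures.",
)
fake_sources = [SimpleNamespace(node=fake_node)]
for line in format_sources(fake_sources):
    print(line)
```

In the notebook, the same helper could be called as `format_sources(response.source_nodes)` after any query.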

In [None]:
response = query("What are the limitations when applying robust decision making to a problem.")


In [None]:
for source in response.source_nodes:
    display(Markdown(source.node.get_text()))
    display(source.node.metadata)

In [None]:
response = query("What are five features that can improve user experience in decision making under deep uncertainty software? Please cite your sources")


In [None]:
response = query("what features can reduce the complexity of the decision making under deep uncertainty process? Please cite your sources")


In [None]:
response = query("Decision Making under deep uncertainty has a high learning curve, what could be added to a gui to reduce this learning curve? Please cite your sources")


In [None]:
for source in response.source_nodes:
    display(Markdown(source.node.get_text()))
    display(source.node.metadata)

In [None]:
query("""What are the types of DMDU analyses used within the documents? """)


In [None]:
query("""DMDU uses the following steps: 1) Problem Framing
	- Identify
		- objectives
		- constraints
		- major uncertainties
		- definition of success.
2) Identify when the status quo starts to fail. 
	- Simulate Business as Usual
	- Identify tipping points into a failing system
		- Need to identify rules for switching interventions
3) Identify and measure interventions
4) Explore Pathways from interventions
5) Design Adaptive Plan


Provide a list of triplets that identifies the algorithms used with each part.\n\n\
                          """)
