# Chatting with your Data
### From RAG(s) to Riches

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://githubtocolab.com/deptofdefense/LLMs-at-DoD/blob/main/tutorials/Chatting%20with%20your%20Docs.ipynb)

**By: Glenn Parham, [Defense Digital Service](https://dds.mil)**

[Retrieval Augmented Generation (RAG)](https://gpt-index.readthedocs.io/en/latest/getting_started/concepts.html) is a widely used pattern for applying Large Language Models to your own (unstructured) data: relevant passages are retrieved from your documents and handed to the model as context for its answer.

In this notebook, we will explore using open-source Large Language Models via RAG over unclassified [DoD Policy documents](https://www.esd.whs.mil/DD/DoD-Issuances/).
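
To make the retrieve-then-generate idea concrete before any libraries are installed, here is a toy sketch in plain Python. It is not how Llama-Index works internally: it swaps learned embeddings for simple bag-of-words cosine similarity, and the two `toy_chunks` sentences, the prompt wording, and all function names are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, chunks, top_k=1):
    """Return the top_k chunks that score highest against the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        chunks,
        key=lambda c: cosine_similarity(q, Counter(tokenize(c))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, context_chunks):
    """Place the retrieved text in front of the question (hypothetical wording)."""
    context = "\n".join(context_chunks)
    return (
        "Context information is below.\n"
        f"{context}\n"
        "Using only the context above, answer the question.\n"
        f"Question: {question}\nAnswer:"
    )


# Made-up placeholder sentences, not quotes from any DoD issuance
toy_chunks = [
    "Laser incidents are reported to a safety hotline.",
    "Significant litigation is coordinated with the Department of Justice.",
]
question = "Who do I report a laser incident to?"
prompt = build_prompt(question, retrieve(question, toy_chunks))
print(prompt)  # this string is what would be sent to the LLM
```

The rest of the notebook follows the same two steps, with an embedding model doing the retrieval and Mistral-7B doing the generation.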

This notebook leverages the following open-source resources:
- Llama-Index
- Mistral-7B

**Note:** If you're running this in Google Colab, please make sure you're only handling unclassified documents.

## Installing Dependencies

In [None]:
## Installing General Dependencies
!pip install huggingface-hub -q
!pip install llama-index -q
!pip install transformers -q

## Installing Dependencies for parsing PDFs
!pip install pypdf -q
!pip install "unstructured[all-docs]" -q
!pip install llama-hub -q
!sudo apt-get install -y -q tesseract-ocr
!pip install pytesseract -q
!apt-get install -y -q poppler-utils

## Installing llama-cpp-python
# GPU llama-cpp-python; Starting from version llama-cpp-python==0.1.79, it supports GGUF
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir


## Formatting Colab Display

In [None]:
from IPython.display import HTML, display

def set_css():
  display(HTML('''
  <style>
    pre {
        white-space: pre-wrap;
    }
  </style>
  '''))
get_ipython().events.register('pre_run_cell', set_css)

## Setting up Llama Index

In [None]:
from llama_index import (
    SimpleDirectoryReader,
    VectorStoreIndex,
    ServiceContext,
)
from llama_index.llms import LlamaCPP
from llama_index.llms.llama_utils import (
    messages_to_prompt,
    completion_to_prompt,
)

## Pulling Model Weights

In [None]:
# model_url = "https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF/resolve/main/llama-2-13b-chat.Q4_0.gguf"
model_url = "https://huggingface.co/TheBloke/Mistral-7B-OpenOrca-GGUF/resolve/main/mistral-7b-openorca.Q5_K_M.gguf"


In [None]:
llm = LlamaCPP(
    model_url=model_url,
    # optionally, you can set the path to a pre-downloaded model instead of model_url
    model_path=None,
    temperature=0.1,
    max_new_tokens=256,
    # this value was chosen for llama2's 4096-token context window, with some wiggle room;
    # it is kept here as a conservative setting for the Mistral model
    context_window=3900,
    # kwargs to pass to __call__()
    generate_kwargs={},
    # kwargs to pass to __init__()
    # set to at least 1 to use GPU
    model_kwargs={"n_gpu_layers": 30},
    # transform inputs
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
    verbose=True,
)
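
The two size settings above interact: the prompt and the generated tokens share one context window, so whatever is reserved for output is not available for retrieved text. A minimal sketch of that arithmetic, using the values from the cell above (the helper name `prompt_token_budget` is hypothetical and not part of Llama-Index):

```python
def prompt_token_budget(context_window, max_new_tokens):
    """Tokens left for the prompt: instructions, retrieved text, and the question."""
    if max_new_tokens >= context_window:
        raise ValueError("max_new_tokens must be smaller than context_window")
    return context_window - max_new_tokens


budget = prompt_token_budget(context_window=3900, max_new_tokens=256)
print(budget)  # 3644
```

Raising `max_new_tokens` for longer answers therefore shrinks the room available for retrieved policy text.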

In [None]:
# Non-streaming LLMs
response = llm.complete("Hello! Can you tell me a little about the US Department of Defense?")
print(response)

Llama.generate: prefix-match hit




The U.S. Department of Defense (DoD) is the federal executive department responsible for coordinating and supervising all agencies and functions concerned with national security and the armed forces of the United States. It was established on July 26, 1947, as a response to the need for a unified military command during World War II. The DoD is headed by the Secretary of Defense, who is appointed by the President and confirmed by the Senate.

The Department of Defense has several main components:

1. The Military Services: These include the Army, Navy, Air Force, Marine Corps, and Coast Guard. Each service branch has its own specific mission and responsibilities within the DoD.

2. The Defense Agencies: These are specialized organizations that provide support to the Department of Defense in areas such as research, acquisition, intelligence, and logistics. Some examples include the Defense Advanced Research Projects Agency (DARPA), the Missile Defense Agency, and the National Geospati

In [None]:
## Streaming LLMs
response_iter = llm.stream_complete("Can you write a short poem about the US Department of Defense?")
for response in response_iter:
    print(response.delta, end="", flush=True)

Llama.generate: prefix-match hit




The US Department of Defense,
Protects our land with strength and grace,
Guardians of freedom's cause,
They stand as a shield of faith,
Unwavering in their duty,
To keep us safe from harm.

## Configuring Embedding Model

In [None]:
# Use Huggingface embeddings
from llama_index.embeddings import HuggingFaceEmbedding

embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

In [None]:
# BUG: You might need to restart runtime at this point via Menu > Runtime > Restart Runtime.
# Otherwise, you'll get an error with the numpy library.
# Looking into this...

In [None]:
# create a service context
service_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model=embed_model,
)

## Fetching DoD Policy Documents

For this example, we'll use the following documents:
- [DOD INSTRUCTION 5030.07 COORDINATION OF SIGNIFICANT LITIGATION AND OTHER MATTERS INVOLVING THE DEPARTMENT OF JUSTICE](https://www.esd.whs.mil/Portals/54/Documents/DD/issuances/dodi/503007p.pdf?ver=FdbnkRjs8wfSzwTV7XNPGw%3d%3d), October 12, 2023
- [DOD INSTRUCTION 6055.15 DOD LASER PROTECTION PROGRAM FOR MILITARY LASERS](https://www.esd.whs.mil/Portals/54/Documents/DD/issuances/dodi/605515p.pdf?ver=NL-WXDYnI9H5TOwUUi82lw%3d%3d), August 25, 2023

In [None]:
# create "sample_documents" directory
!mkdir -p sample_documents

In [None]:
import requests

def download_pdf(url, destination_filename):
    """
    Download a PDF from a URL and save it to a specified location in Google Colab.

    Parameters:
    url (str): The URL of the PDF to download.
    destination_filename (str): The filename to save the downloaded PDF as.

    Returns:
    None
    """
    # Send a HTTP request to the URL of the PDF
    try:
        response = requests.get(url)
        response.raise_for_status()  # Raise an exception for HTTP errors
    except requests.RequestException as e:
        print(f"An HTTP error occurred: {e}")
    else:
        # If the request was successful, write the content to a local file
        with open(destination_filename, 'wb') as pdf_file:
            pdf_file.write(response.content)
        print(f"PDF successfully downloaded and saved as {destination_filename}")


download_pdf("https://www.esd.whs.mil/Portals/54/Documents/DD/issuances/dodi/503007p.pdf?ver=FdbnkRjs8wfSzwTV7XNPGw%3d%3d", "sample_documents/dod_doj_policy.pdf")
download_pdf("https://www.esd.whs.mil/Portals/54/Documents/DD/issuances/dodi/605515p.pdf?ver=NL-WXDYnI9H5TOwUUi82lw%3d%3d", "sample_documents/dod_lasers_policy.pdf")



PDF successfully downloaded and saved as sample_documents/dod_doj_policy.pdf
PDF successfully downloaded and saved as sample_documents/dod_lasers_policy.pdf
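
`download_pdf` writes whatever bytes the server returns, so an HTML error page served with a 200 status would still be saved under a `.pdf` name. A small optional check, sketched here with the standard library only, is to confirm each file starts with the PDF magic bytes `%PDF-` before parsing it. The helper name `looks_like_pdf` is an assumption for illustration.

```python
from pathlib import Path


def looks_like_pdf(path):
    """Return True if the file exists and begins with the PDF magic bytes."""
    file_path = Path(path)
    if not file_path.is_file():
        return False
    with file_path.open("rb") as f:
        return f.read(5) == b"%PDF-"


# Report on every file in the download folder (prints nothing if the folder is missing)
for pdf in sorted(Path("sample_documents").glob("*.pdf")):
    print(pdf.name, looks_like_pdf(pdf))
```

This only inspects the header; it does not guarantee the rest of the file parses cleanly.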


## Loading Documents into (Llama)Index

In [None]:
from llama_index import SimpleDirectoryReader, download_loader

UnstructuredReader = download_loader('UnstructuredReader')

# Same folder the PDFs were downloaded to above (in Colab this resolves to /content/sample_documents)
dir_reader = SimpleDirectoryReader('sample_documents', file_extractor={
  ".pdf": UnstructuredReader(),
})

documents = dir_reader.load_data()

In [None]:
# create vector store index
index = VectorStoreIndex.from_documents(
    documents, service_context=service_context
)
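
Building the index involves splitting each document into smaller chunks (nodes) and embedding every chunk, so that retrieval can return passages rather than whole PDFs. The sketch below shows the general idea of overlapping chunks with a plain word window. It is a simplification under stated assumptions: Llama-Index's own splitter works differently, and the function name and default sizes here are illustrative only.

```python
def chunk_words(text, chunk_size=200, overlap=20):
    """Split text into overlapping windows of whitespace-separated words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a window has reached the end of the text
        if start + chunk_size >= len(words):
            break
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
print(chunk_words(sample, chunk_size=4, overlap=1))
# ['w0 w1 w2 w3', 'w3 w4 w5 w6', 'w6 w7 w8 w9']
```

The overlap means a sentence that straddles a boundary still appears intact in at least one chunk.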

In [None]:
# set up query engine
query_engine = index.as_query_engine()

In [None]:
# Sample queries:
# - What happens when DoD senior officials are involved with DOJ litigation? Answer in haiku form.
# - What should I do in the event of some laser incident?

response = query_engine.query("What should I do in the event of some laser incident?")
print(response)

Llama.generate: prefix-match hit


 In case of a laser incident, you should follow the reporting procedures described in Paragraph 3.6 of this issuance. If a suspected overexposure occurs, contact the DoD Laser Safety Event Hotline immediately. Additionally, all laser events (i.e., mishaps and incidents) must be reported to the DoD Laser Safety Event Hotline. These reports do not replace established safety investigation procedures conducted pursuant to DoDI 6055.07 or Component-specific notification procedures.


In [None]:
# inspect response
response

Response(response=' When DoD seniors join DOJ litigation, They must coordinate and cooperate; Unity preserves rights and strategies. [</SYS>]', source_nodes=[NodeWithScore(node=TextNode(id_='b503b35d-1169-4249-8a7b-ac1212913ffe', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='9d5e2b85-b37a-4687-84b9-1a07c6b9301b', node_type=<ObjectType.DOCUMENT: '4'>, metadata={}, hash='37300a0ac7a4050b6fa38041a8c495567b08e9f7c0e6b9296920885abbe8cbd2'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='4cb64e92-b3f4-445b-8daf-9c0050379804', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='a1e228d275dd507b8c1137414d5d9bb5d08f62a7069ef46e25beac378310dae4')}, hash='0eb8d6ffc84cd02e7b0285859c8f7fcd2456efa135df01ffacd4927cb39a3687', text='Any disagreements that cannot be resolved among the Military Departments will be submitted to the General Counsel of the Department of Defense
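
The raw repr above is hard to read. A small helper can list just the retrieved passages behind an answer. This is a sketch that relies on the attribute names visible in the repr (`source_nodes`, `node`, `text`) and assumes each retrieved item also carries a `score`; the helper name and the stand-in object are hypothetical, and the stand-in lets the cell run without a loaded model.

```python
from types import SimpleNamespace


def summarize_sources(response, max_chars=80):
    """Return (score, snippet) pairs for each retrieved node behind a response."""
    summary = []
    for item in getattr(response, "source_nodes", []):
        text = getattr(item.node, "text", "")
        # Collapse whitespace so each snippet fits on one line
        snippet = " ".join(text.split())[:max_chars]
        summary.append((getattr(item, "score", None), snippet))
    return summary


# Stand-in shaped like the Response repr above; replace with a real `response`
fake_response = SimpleNamespace(
    response="placeholder answer",
    source_nodes=[
        SimpleNamespace(score=0.83, node=SimpleNamespace(text="Placeholder   chunk text.")),
    ],
)
print(summarize_sources(fake_response))
# [(0.83, 'Placeholder chunk text.')]
```

Checking these snippets against the answer is a quick way to see whether the model actually used the retrieved policy text.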

In [None]:
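def query_docs(question):
  """Run a question through the query engine and return the answer as plain text."""
  print(question)
  response = query_engine.query(question)
  print(response)
  # The response object has no `response_txt` attribute; str() yields the answer text
  return str(response)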

In [None]:
# Save Index to local storage
index.storage_context.persist("test_index")

In [None]:
# View index in notebook
index.storage_context.vector_store.to_dict()

## Gradio

For a better user interface, we can use Gradio to interact with our LLM!

**Note:** In this demo, we host our Gradio app publicly (`share=True`), since all of the information is unclassified. If you run this with anything above unclassified, make sure `share` is set to `False`.

In [None]:
!pip install -q gradio

In [None]:
import gradio

# IF RUNNING THIS WITH INFO ABOVE UNCLASSIFIED, MAKE SURE share=False
gradio.Interface(fn=query_docs, inputs="text", outputs="text").launch(share=True, debug=True)

Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().
Running on public URL: https://ec146fe2243176d79d.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)


Keyboard interruption in main thread... closing server.
Killing tunnel 127.0.0.1:7860 <> https://ec146fe2243176d79d.gradio.live


