This notebook follows the tutorial linked below to set up LLaMA-2 and LangChain, then adds our own corpus of IRS documents.

https://medium.com/@murtuza753/using-llama-2-0-faiss-and-langchain-for-question-answering-on-your-own-data-682241488476

First, load LLaMA-2 with 4-bit quantization.

In [None]:
from torch import cuda, bfloat16
import transformers

model_id = 'meta-llama/Llama-2-7b-chat-hf'

device = f'cuda:{cuda.current_device()}' if cuda.is_available() else 'cpu'

# set quantization configuration to load large model with less GPU memory
bnb_config = transformers.BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type='nf4',
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=bfloat16
)

# begin initializing HF items, you need an access token
hf_auth = 'redacted'
model_config = transformers.AutoConfig.from_pretrained(
    model_id,
    use_auth_token=hf_auth
)

model = transformers.AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    config=model_config,
    quantization_config=bnb_config,
    device_map='auto',
    use_auth_token=hf_auth
)

# enable evaluation mode to allow model inference
model.eval()

print(f"Model loaded on {device}")




Model loaded on cuda:0
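
As a rough check on why the 4-bit quantization configuration above reduces GPU memory, the sketch below estimates weight storage at different precisions. The parameter count is an assumption taken from the "7b" in the model id, and the estimate ignores activations, the KV cache, and the extra metadata that quantization stores.

```python
# Rough weight-memory estimate; ignores activations, KV cache, and quantization metadata.
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Return the approximate weight storage in gigabytes (10^9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

N_PARAMS = 7e9  # assumption: "7b" in the model id read as 7 billion parameters

for label, bits in [("bfloat16", 16), ("int8", 8), ("nf4 (4-bit)", 4)]:
    print(f"{label:>12}: ~{weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

Under these assumptions the 4-bit weights need about a quarter of the memory of the 16-bit weights.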


# Load the tokenizer

In [None]:
tokenizer = transformers.AutoTokenizer.from_pretrained(
    model_id,
    use_auth_token=hf_auth
)




In [None]:
stop_list = ['\nHuman:', '\n```\n']

# note: the tokenizer prepends the BOS token (id 1) to each stop sequence, as the output shows
stop_token_ids = [tokenizer(x)['input_ids'] for x in stop_list]
stop_token_ids

[[1, 29871, 13, 29950, 7889, 29901], [1, 29871, 13, 28956, 13]]

In [None]:
import torch

stop_token_ids = [torch.LongTensor(x).to(device) for x in stop_token_ids]
stop_token_ids

[tensor([    1, 29871,    13, 29950,  7889, 29901], device='cuda:0'),
 tensor([    1, 29871,    13, 28956,    13], device='cuda:0')]

In [None]:
from transformers import StoppingCriteria, StoppingCriteriaList

# define custom stopping criteria object
class StopOnTokens(StoppingCriteria):
    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        for stop_ids in stop_token_ids:
            if torch.eq(input_ids[0][-len(stop_ids):], stop_ids).all():
                return True
        return False

stopping_criteria = StoppingCriteriaList([StopOnTokens()])
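
The stopping check above is a suffix comparison: generation stops when the ids generated so far end with one of the stop sequences. The pure-Python sketch below illustrates that comparison and one possible pitfall: the tokenized stop sequences shown earlier begin with the BOS id 1, which would not normally appear in the middle of a generated sequence. The `generated` ids and the helper name `ends_with_any` are hypothetical, and whether the model really emits these exact ids for "\nHuman:" in running text has not been verified here.

```python
from typing import List, Sequence

def ends_with_any(generated: Sequence[int], stop_sequences: List[Sequence[int]]) -> bool:
    """Return True if `generated` ends with any of the stop sequences."""
    for stop in stop_sequences:
        n = len(stop)
        # guard against empty stop sequences and sequences longer than the input
        if n and len(generated) >= n and list(generated[-n:]) == list(stop):
            return True
    return False

# stop sequence for '\nHuman:' as printed by the tokenizer cell (leading 1 is BOS)
with_bos = [[1, 29871, 13, 29950, 7889, 29901]]
# same sequence with the leading BOS id dropped
without_bos = [seq[1:] for seq in with_bos]

# hypothetical ids: BOS, two made-up prompt tokens, then the stop sequence without BOS
generated = [1, 100, 200, 29871, 13, 29950, 7889, 29901]

print(ends_with_any(generated, with_bos))     # False
print(ends_with_any(generated, without_bos))  # True
```

If the stop sequences never seem to trigger, tokenizing them without special tokens is one thing to try.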

Instantiate the text generation pipeline

In [None]:
generate_text = transformers.pipeline(
    model=model,
    tokenizer=tokenizer,
    return_full_text=True,  # langchain expects the full text
    task='text-generation',
    stopping_criteria=stopping_criteria,  # without this model rambles during chat
    temperature=0.1,  # 'randomness' of outputs; must be > 0, and lower values are more deterministic
    max_new_tokens=512,  # max number of tokens to generate in the output
    repetition_penalty=1.1  # without this output begins repeating
)
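
To make the effect of `temperature` concrete, here is a minimal sketch of temperature-scaled softmax over a toy set of logits. The logits and the function name `softmax_with_temperature` are illustrative assumptions, not part of the pipeline; the point is only that dividing logits by a small temperature concentrates probability on the top token.

```python
import math
from typing import List

def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens

for t in (0.1, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a temperature of 0.1 nearly all probability mass sits on the highest-scoring token, so sampling behaves almost like greedy decoding.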

Evaluate the baseline LLaMA-2 response

In [None]:
res = generate_text("How do I file my taxes?")
print(res[0]["generated_text"])

How do I file my taxes?
 Unterscheidung between a "tax return" and an "income tax return". In the United States, the term "taxes" generally refers to the federal income tax, which is a tax on income earned by individuals and businesses. The process of filing taxes involves preparing and submitting a tax return to the relevant tax authority, such as the Internal Revenue Service (IRS) for federal taxes or a state or local tax authority for state or local taxes.

There are several steps involved in filing taxes:

1. Gather all necessary documents: This includes W-2 forms from your employer(s), 1099 forms for any self-employment income, interest statements from banks and other financial institutions, and any other income or deduction documentation.
2. Choose a filing status: Your filing status will depend on your marital status and family situation. The most common filing statuses are single, married filing jointly, married filing separately, head of household, and qualifying widow(er) wit

In [None]:
from langchain.llms import HuggingFacePipeline

llm = HuggingFacePipeline(pipeline=generate_text)

# checking again that everything is working fine
llm(prompt="How do I file my taxes?")

"\n Unterscheidung between a tax return and an amended tax return.\nHow to file your taxes online with TurboTax.\nHow to file your taxes by mail with the IRS.\nWhat are the tax filing deadlines for 2023?\nWhat happens if you don't file your taxes on time?\nHow to avoid penalties for not filing your taxes on time.\nWhat is the difference between a tax extension and a tax amendment?\nHow to file for an automatic six-month tax extension.\nHow to file for an automatic two-year tax extension.\nWhat are the tax filing requirements for self-employed individuals?\nHow to file taxes as a self-employed individual.\nWhat are the tax filing requirements for independent contractors?\nHow to file taxes as an independent contractor.\nWhat are the tax filing requirements for freelancers?\nHow to file taxes as a freelancer.\nWhat are the tax filing requirements for small business owners?\nHow to file taxes as a small business owner.\nWhat are the tax filing requirements for nonprofit organizations?\nHo

Now load our own documents with LangChain

In [None]:
# upload IRS PDFs that have been converted to txt

from google.colab import files
uploaded = files.upload()

Saving f8879.txt to f8879.txt
Saving f8845.txt to f8845.txt
Saving p5859.txt to p5859.txt
Saving p5681.txt to p5681.txt
Saving p6388.txt to p6388.txt
Saving f8689.txt to f8689.txt
Saving f15057.txt to f15057.txt
Saving f1120.txt to f1120.txt
Saving p5118.txt to p5118.txt


In [None]:
# check uploaded files

import os
os.listdir('.')

['.config',
 'f15057.txt',
 'p6388.txt',
 'f1120.txt',
 'f8845.txt',
 'p5859.txt',
 'p5118.txt',
 'p5681.txt',
 'f8879.txt',
 'f8689.txt',
 'sample_data']

In [None]:
import os
import codecs
from langchain.document_loaders import TextLoader

# prep files with correct encoding and add to langchain documents

def convert_and_load_text_files(source_encoding='latin-1'):
    """Re-encode every .txt file in the working directory as ASCII and load it with LangChain."""
    loaded = []

    # List all .txt files in the current working directory
    file_names = [f for f in os.listdir('.') if os.path.isfile(f) and f.lower().endswith('.txt')]

    for file_name in file_names:
        # Read the content of the file using the assumed source encoding
        with codecs.open(file_name, 'r', encoding=source_encoding, errors='replace') as file:
            content = file.read()

        # Overwrite the file in ASCII; non-ASCII characters become '?'
        with codecs.open(file_name, 'w', encoding='ascii', errors='replace') as file:
            file.write(content)

        # TextLoader.load() returns a list of Document objects for the file
        doc = TextLoader(file_name).load()
        print(doc)
        loaded.append(doc)

    return loaded

docs = convert_and_load_text_files()


[Document(page_content="Catalog Number 71284W www.irs.gov Form 15057 (2-2019)Form 15057  \n(February 2019)Department of the Treasury - Internal Revenue Service\nAgreement to Rescind  \nNotice of Final Partnership Adjustment\n(See Instructions on Reverse)Audit control number\nTaxpayer ID Number (TIN)\nPursuant to section 6231(d) of the Internal Revenue Code,  \n(name of partnership)\n at\n(number, street, city or town, state, ZIP code) and the Commissioner of\nInternal Revenue agree to the following:\n1. The parties agree to rescind the notice of final partnership adjustment, issued on  to the \npartnership for the taxable year ending .(date of notice of final partnership adjustment)\n2. The parties agree that the period of limitations on making adjustments under section 6235 has not expired as to the above tax year \nand can be further extended at the time of this agreement or at a later date under applicable provisions of the Internal Revenue \nCode.\n3. The parties acknowledge that t
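
The conversion step replaces every character that ASCII cannot represent with `?`. The sketch below shows what that does to a curly-quoted phrase under two assumed source encodings. It is one possible explanation for the `???Form 8689???` runs that appear in the retrieved text later in this notebook, on the assumption (not verified here) that the .txt files are actually UTF-8. The helper name `to_ascii` is hypothetical.

```python
def to_ascii(raw: bytes, source_encoding: str) -> str:
    """Decode bytes with the given encoding, then force the text into ASCII."""
    text = raw.decode(source_encoding, errors="replace")
    # characters outside ASCII are replaced with '?'
    return text.encode("ascii", errors="replace").decode("ascii")

# a phrase with curly quotes, stored as UTF-8 bytes (assumed file encoding)
raw = "enter \u201cForm 8689\u201d".encode("utf-8")

print(to_ascii(raw, "latin-1"))  # enter ???Form 8689???
print(to_ascii(raw, "utf-8"))    # enter ?Form 8689?
```

Each curly quote is three bytes in UTF-8, so reading the bytes as latin-1 turns it into three characters, and each of those becomes a `?`.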

In [None]:
print(docs)



Finish preprocessing docs for retrieval

In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=20)

# collect the chunks from every document, not just the last one
all_splits = []
for doc in docs:
    splits = text_splitter.split_documents(doc)
    print(len(splits))
    all_splits.extend(splits)

4
55
22
2
180
273
335
12
16
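
To illustrate what `chunk_size` and `chunk_overlap` control, here is a simplified fixed-window chunker. It is only a sketch: `RecursiveCharacterTextSplitter` also tries to split on separators such as paragraph and line breaks, which this version does not attempt. The function name `chunk_text` and the toy input are assumptions for illustration.

```python
from typing import List

def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks

# toy input: 50 characters cycling through the alphabet
text = "".join(chr(ord("a") + i % 26) for i in range(50))
chunks = chunk_text(text, chunk_size=20, chunk_overlap=5)

for c in chunks:
    print(len(c), c)
```

The overlap means the end of one chunk is repeated at the start of the next, which helps keep a sentence that straddles a boundary retrievable from at least one chunk.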


In [None]:
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

model_name = "sentence-transformers/all-mpnet-base-v2"
model_kwargs = {"device": "cuda"}

embeddings = HuggingFaceEmbeddings(model_name=model_name, model_kwargs=model_kwargs)

# storing embeddings in the vector store
vectorstore = FAISS.from_documents(all_splits, embeddings)
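
Conceptually, the vector store maps each chunk to an embedding vector and answers a query by returning the chunks whose vectors are closest to the query's vector. The brute-force search below is a stand-in for that idea, not FAISS itself. The three-dimensional vectors are made up, and the choice of Euclidean distance is an assumption; the metric actually used depends on how the index is configured.

```python
import math
from typing import List, Tuple

def l2_distance(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_k(query: List[float], vectors: List[List[float]], k: int) -> List[Tuple[int, float]]:
    """Return (index, distance) pairs for the k vectors closest to the query."""
    scored = [(i, l2_distance(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]

# hypothetical embeddings for three chunks and one query
chunk_vectors = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.8, 0.2, 0.1]]
query_vector = [1.0, 0.0, 0.0]

for index, distance in top_k(query_vector, chunk_vectors, k=2):
    print(index, round(distance, 3))
```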

In [None]:
from langchain.chains import ConversationalRetrievalChain

chain = ConversationalRetrievalChain.from_llm(llm, vectorstore.as_retriever(), return_source_documents=True)
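
Roughly, a conversational retrieval chain folds the chat history into the question, retrieves relevant chunks, and places them in the prompt sent to the LLM. The sketch below mimics that flow with plain functions. Everything in it is a hypothetical stand-in: the prompt wording, `toy_retrieve`, and `toy_generate` are not LangChain's actual templates or components, and the real chain uses the LLM itself to rephrase follow-up questions.

```python
from typing import Callable, Dict, List, Tuple

def answer_with_retrieval(
    question: str,
    chat_history: List[Tuple[str, str]],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str], str],
) -> Dict[str, object]:
    """Retrieve context for a question and build a prompt that includes it."""
    # 1. fold earlier turns into the retrieval query (simplified)
    history_text = "\n".join(f"Human: {q}\nAssistant: {a}" for q, a in chat_history)
    standalone = f"{history_text}\nFollow-up: {question}" if chat_history else question
    # 2. fetch the most relevant chunks
    sources = retrieve(standalone)
    # 3. put the chunks into the prompt ahead of the question
    context = "\n\n".join(sources)
    prompt = f"Use the context to answer.\n\nContext:\n{context}\n\nQuestion: {question}\nAnswer:"
    return {"answer": generate(prompt), "source_documents": sources}

# toy stand-ins so the sketch runs without a model or vector store
def toy_retrieve(query: str) -> List[str]:
    return ["Chunk about Form 8689.", "Chunk about filing deadlines."]

def toy_generate(prompt: str) -> str:
    return f"(answer based on {prompt.count('Chunk')} retrieved chunks)"

result = answer_with_retrieval("How do I file my taxes?", [], toy_retrieve, toy_generate)
print(result["answer"])
print(result["source_documents"])
```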

Evaluate RAG-LLaMA with the same query

In [None]:
chat_history = []

query = "How do I file my taxes?"
result = chain({"question": query, "chat_history": chat_history})

print(result['answer'])

 Form 8689 is used to determine how much US tax you owe to the USVI. You can either file it with your US tax return or separately with the VIs Bureau of Internal Revenue.


In [None]:
print(result['source_documents'])

[Document(page_content='37 Income tax withheld by the USVI ............... 37\n38 2023 estimated tax payments and amount applied from 2022 return ... 38\n39 Amount paid with Form 4868 (extension request) .......... 39\n40 Add lines 37 through 39. These are your total payments to the USVI ........... 40\n41 Enter the smaller of line 36 or line 40. Add this amount to the total payments line of your tax return. \nOn the dotted line next to it, enter ???Form 8689??? and show this amount ........... 41\n42 Overpayment to the USVI. If line 40 is more than line 36, subtract line 36 from line 40 ..... 42\n43 Amount of line 42 you want refunded to you ................... 43\n44 Amount of line 42 you want applied to your 2024 estimated tax .... 44\n45 Amount you owe to the USVI. If line 40 is less than line 36, subtract line 40 from line 36 ..... 45\n46 \n Enter the amount from line 45 that you will pay when you file your income tax return. Add this amount', metadata={'source': 'f8689.txt'}), Do