<a href="https://colab.research.google.com/github/himajavadaga/GenAI-Langchain/blob/main/Compliance_Rag_modular_code.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
# Install necessary packages
!pip install -qqq langchain llama-index llama-hub openai accelerate==0.21.0 bitsandbytes==0.40.2 transformers sentence_transformers InstructorEmbedding chromadb pypdf pdf2image
!pip install "beautifulsoup4>=4.12.2"
!pip install --upgrade llama_index MainContentExtractor
!pip install langchain_community
!pip install langchain-huggingface
!pip install llama-index-embeddings-huggingface
!pip install llama-index
%pip install llama-index-vector-stores-chroma
%pip install llama-index-llms-huggingface
!pip install bitsandbytes
!pip install --upgrade accelerate

Collecting accelerate
  Using cached accelerate-0.34.0-py3-none-any.whl.metadata (19 kB)
Using cached accelerate-0.34.0-py3-none-any.whl (324 kB)
Installing collected packages: accelerate
  Attempting uninstall: accelerate
    Found existing installation: accelerate 0.21.0
    Uninstalling accelerate-0.21.0:
      Successfully uninstalled accelerate-0.21.0
Successfully installed accelerate-0.34.0


In [2]:
!apt-get install -y python3.10-venv

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
python3.10-venv is already the newest version (3.10.12-1~22.04.5).
0 upgraded, 0 newly installed, 0 to remove and 49 not upgraded.


In [3]:
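# Create a Python virtual environment.
# Note: each `!` command runs in its own subshell, so the activation below
# does not carry over to later cells or to the notebook kernel.
!python -m venv myenv
!source myenv/bin/activate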
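
Because the activation runs in a throwaway subshell, it can help to confirm which interpreter the kernel is really using. A minimal sketch using only the standard library; `in_virtualenv` is a hypothetical helper name and not part of the original notebook.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Kernel interpreter:", sys.executable)
print("Running inside a virtual environment:", in_virtualenv())
```

If this reports that the kernel is not in a virtual environment, the packages installed with `!pip` went into the kernel's own environment rather than into `myenv`.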

In [4]:
# Imports
import torch
import logging
import sys
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TextStreamer,
    pipeline,
)
import chromadb

In [5]:
from langchain import HuggingFacePipeline, PromptTemplate
from langchain.chains import RetrievalQA
from langchain.embeddings import HuggingFaceInstructEmbeddings
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from langchain.vectorstores import chroma

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, StorageContext, load_index_from_storage, service_context


In [6]:
import chromadb
chroma_client = chromadb.EphemeralClient()
# get_or_create_collection avoids an error when this cell is re-run
chroma_collection = chroma_client.get_or_create_collection("stores")

In [7]:
from IPython.display import Markdown, display
from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.core import Settings






In [8]:
from llama_index.vector_stores.chroma import ChromaVectorStore

In [9]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.response.notebook_utils import display_response
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.indices.postprocessor import SimilarityPostprocessor, KeywordNodePostprocessor
from llama_index.core.response_synthesizers import get_response_synthesizer
from llama_index.core import SummaryIndex
from llama_index.core.prompts import PromptTemplate
from llama_index.core.storage.storage_context import StorageContext
from llama_index.core.response.pprint_utils import pprint_response

In [10]:
from llama_index.core.text_splitter import SentenceSplitter

In [11]:
# Logging configuration: basicConfig already attaches a stdout handler,
# so adding a second StreamHandler would print every record twice.
logging.basicConfig(stream=sys.stdout, level=logging.INFO)

In [12]:
# Mount Google Drive for document storage and access
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [14]:
# Create the data folder that holds the PDFs downloaded from the web
!mkdir -p data

!wget -O ./data/ffiec.pdf 'https://bsaaml.ffiec.gov/docs/manual/BSA_AML_Man_2014_v2_CDDBO.pdf' # Downloads a pdf file to the data directory

--2024-09-03 15:17:10--  https://bsaaml.ffiec.gov/docs/manual/BSA_AML_Man_2014_v2_CDDBO.pdf
Resolving bsaaml.ffiec.gov (bsaaml.ffiec.gov)... 70.33.178.56
Connecting to bsaaml.ffiec.gov (bsaaml.ffiec.gov)|70.33.178.56|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2960413 (2.8M) [application/pdf]
Saving to: ‘./data/ffiec.pdf’


2024-09-03 15:17:23 (442 KB/s) - ‘./data/ffiec.pdf’ saved [2960413/2960413]



In [15]:
from llama_index.core import VectorStoreIndex, download_loader

# Load external data sources.
# Note: download_loader is deprecated in recent llama-index releases and emits a warning here.
MainContentExtractorReader = download_loader("MainContentExtractorReader")
loader = MainContentExtractorReader()
documents_link = loader.load_data(urls = ['https://www.fincen.gov/resources/'])
documents_pdf = SimpleDirectoryReader('/content/data').load_data()
documents = documents_pdf + documents_link

  MainContentExtractorReader = download_loader("MainContentExtractorReader")


In [16]:
hf_token = "huggingfacetoken"  # placeholder: replace with your own Hugging Face access token
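
Hardcoding a token in a shared notebook is easy to leak. A minimal sketch of reading it from the environment instead, assuming the token has been exported under the name `HF_TOKEN`; both that variable name and the `resolve_hf_token` helper are assumptions for illustration, not something the original notebook defines.

```python
import os


def resolve_hf_token(env=os.environ, var_name="HF_TOKEN"):
    """Return the token stored in `var_name`, or None when it is missing or blank."""
    token = env.get(var_name, "").strip()
    return token or None


token = resolve_hf_token()
# Report only whether a token was found; never print the token itself.
print("Hugging Face token found:", token is not None)
```

The returned value could then be assigned to `hf_token` in place of the placeholder string.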

In [17]:
# Check if CUDA is available
if torch.cuda.is_available():
    # Set device to CUDA if available
    device_map = "cuda"
    print("CUDA is available. Using device:", device_map)
else:
    # Set device to CPU if CUDA is not available
    device_map = "cpu"
    print("CUDA is not available. Using device:", device_map)
    quantization_config = None # Disable quantization


CUDA is available. Using device: cuda


In [18]:
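# Build the 4-bit config only when a GPU is present; otherwise keep the
# CPU fallback (quantization_config = None) set in the previous cell.
if torch.cuda.is_available():
    quantization_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
    )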
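
The device check and the quantization settings belong to one decision: whether the model can be loaded in 4-bit on a GPU or must fall back to the CPU. A rough sketch of that decision as a single pure function, assuming (as the CPU branch in the earlier cell implies) that 4-bit loading is only used when CUDA is available. `LoadPlan` and `plan_model_loading` are hypothetical names, and the function takes a plain boolean so it can be tried without `torch` installed.

```python
from typing import NamedTuple


class LoadPlan(NamedTuple):
    device_map: str  # value intended for the device_map argument
    use_4bit: bool   # whether to build a 4-bit quantization config


def plan_model_loading(cuda_available: bool) -> LoadPlan:
    """Pick the device and decide whether 4-bit quantization should be enabled."""
    if cuda_available:
        return LoadPlan(device_map="cuda", use_4bit=True)
    # Assumption: no quantization on CPU, matching the notebook's CPU branch.
    return LoadPlan(device_map="cpu", use_4bit=False)


print(plan_model_loading(True))
print(plan_model_loading(False))
```

In the notebook, the argument would come from `torch.cuda.is_available()`.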

In [19]:
#LLM configuration

llm = HuggingFaceLLM(
    model_name="meta-llama/Llama-2-7b-chat-hf",
    tokenizer_name="meta-llama/Llama-2-7b-chat-hf",
    query_wrapper_prompt=PromptTemplate(" [INST] {query_str} [/INST] "),
    context_window=3900,
    model_kwargs={"token": hf_token, "quantization_config": quantization_config},
    tokenizer_kwargs={"token": hf_token},
    device_map="auto",
)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



You are calling `save_pretrained` to a 4-bit converted model, but your `bitsandbytes` version doesn't support it. If you want to save 4-bit models, make sure to have `bitsandbytes>=0.41.3` installed.



In [20]:
#Set up Embedding Model

from huggingface_hub import notebook_login
from langchain_community.embeddings import HuggingFaceInstructEmbeddings

notebook_login()



In [21]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
embed_model = HuggingFaceEmbedding(model_name = "BAAI/bge-small-en-v1.5")



In [23]:
from llama_index.core import Settings

# bge embedding model
Settings.embed_model = embed_model

# Llama-2-7b-chat model
Settings.llm = llm

In [24]:

from llama_index.core.node_parser import SentenceSplitter

index = VectorStoreIndex.from_documents(
    documents,
    transformations = [SentenceSplitter(chunk_size = 512)]
)
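
To make the chunking step concrete, here is a simplified stand-in written in plain Python. It is not how `SentenceSplitter` is implemented: the real splitter measures `chunk_size` in tokens and also supports overlap between chunks, whereas this sketch counts whitespace-separated words and uses a naive sentence regex. `chunk_sentences` is a hypothetical name.

```python
import re


def chunk_sentences(text: str, max_words: int = 512) -> list[str]:
    """Group whole sentences into chunks of at most `max_words` words where possible."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Start a new chunk when this sentence would exceed the word budget.
        if current and count + n_words > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = "One two three. Four five. Six seven eight nine."
for chunk in chunk_sentences(sample, max_words=5):
    print(repr(chunk))
```

A single sentence longer than the budget is kept whole in its own chunk here, which is one of the simplifications of this sketch.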


In [25]:
#Storing

import chromadb
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import StorageContext

In [26]:
chroma_client = chromadb.PersistentClient()
# get_or_create_collection avoids an error when the collection already exists on disk
chroma_collection = chroma_client.get_or_create_collection("final_db")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
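
At query time, a vector store compares the query embedding with the stored embeddings and returns the closest entries. A toy sketch of that ranking step in plain Python, using cosine similarity for illustration; the metric Chroma actually applies depends on how the collection is configured, and the vectors, ids, and helper names below are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, stored, k=2):
    """Return the k (id, score) pairs most similar to the query, best first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in stored.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical 2-dimensional embeddings keyed by document id.
toy_store = {"a": [1.0, 0.0], "b": [0.7, 0.7], "c": [0.0, 1.0]}
results = top_k([1.0, 0.1], toy_store, k=2)
print(results)
```

Real embeddings such as those from `BAAI/bge-small-en-v1.5` have hundreds of dimensions, and production stores use approximate indexes rather than a full scan, but the ranking idea is the same.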

In [27]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)
query_engine = index.as_query_engine()
response = query_engine.query("What are the red flags specific to payment processors that have been highlighted by FinCEN and FFIEC?")
print(response)

Based on the provided context information, the red flags specific to payment processors that have been highlighted by FinCEN and FFIEC are:

1. Unidentified or unverifiable nature and source of transactions: If a bank is unable to identify and understand the nature and source of transactions processed through an account, it may be vulnerable to suspicious activity.
2. High-risk merchants: Payment processors may process transactions for higher-risk merchants who may use third-party processors to avoid direct bank relationships. Banks should be aware of the heightened risk of returns and use of services by higher-risk merchants.
3. Lack of adequate due diligence: Banks should ensure that their payment processors have an effective means of verifying their merchant clients' identities and business practices. A bank may assess the risks associated with payment processors by considering the following:
	* Implementing a policy that requires an initial background check of the processor and its