# Lex Guardian Notebook
In this notebook, we experiment with the implementation of Lex Guardian, a question-answering assistant for Kenyan law.

## Import the Necessary Files and Libraries

In [2]:
# Set the working directory to the project root
import os

os.chdir("c:\\Users\\Spectra\\LexGuardian")
os.getcwd()

'c:\\Users\\Spectra\\LexGuardian'

In [3]:
# Import libraries
from dotenv import load_dotenv
from lex_guardian.utils import load_config
from qdrant_client import QdrantClient
from langchain_community.vectorstores import Qdrant
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.llms import HuggingFaceEndpoint
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

In [4]:
# Load config and dotenv
config = load_config()
load_dotenv()

True
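The later cells read `config["data"]["data_path"]` and `config["vectorstore"]["url"]`, so the loaded config is assumed to be a nested mapping with at least those keys. A small check like the one below can surface a missing key early. `require_keys` is a hypothetical helper, not part of `lex_guardian`, and the URL is a placeholder rather than the project's real setting.

```python
def require_keys(config, required):
    """Return the dotted names of required (section, key) pairs missing from config."""
    missing = []
    for section, key in required:
        # A missing section is treated the same as a missing key.
        if key not in config.get(section, {}):
            missing.append(f"{section}.{key}")
    return missing


# Assumed shape of the config, based only on the keys this notebook reads.
example_config = {
    "data": {"data_path": "data/raw/TheConstitutionOfKenya.pdf"},
    "vectorstore": {"url": "https://example.invalid:6333"},  # placeholder URL
}
REQUIRED = [("data", "data_path"), ("vectorstore", "url")]

missing = require_keys(example_config, REQUIRED)
print(missing)
```

An empty list means every key the notebook depends on is present.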

## Process the documents

In [18]:
# Load the data

## Instantiate loader
loader = PyPDFLoader(config["data"]["data_path"])

## load the data
documents = loader.load()

## Split documents
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
text_chunks = text_splitter.split_documents(documents)
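To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-width chunking with overlap. It is a simplification: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraphs and sentences, which this toy `sliding_chunks` helper (a hypothetical name) ignores.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of at most chunk_size characters,
    where consecutive windows share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
```

The last three characters of each chunk reappear at the start of the next one, so a sentence cut at a chunk boundary still appears whole in at least one chunk when it is shorter than the overlap.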

In [20]:
## Instantiate the vector store and load documents
url = config["vectorstore"]["url"]
api_key = os.getenv("QDRANT_API_KEY")


model_name = "BAAI/bge-base-en-v1.5"
embeddings = HuggingFaceEmbeddings(
    model_name=model_name,
)

vectorstore = Qdrant.from_documents(
    text_chunks,
    embeddings,
    url=url,
    prefer_grpc=True,
    api_key=api_key,
    collection_name="constitution",
)

2024-07-03 18:15:58,317 - sentence_transformers.SentenceTransformer - INFO - Use pytorch device_name: cpu
2024-07-03 18:15:58,319 - sentence_transformers.SentenceTransformer - INFO - Load pretrained SentenceTransformer: BAAI/bge-base-en-v1.5



## Retriever Initialization

In [21]:
# Instantiate retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 5})
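Conceptually, `search_kwargs={"k": 5}` asks the vector store for the five chunks whose embeddings are closest to the query embedding. The sketch below illustrates that idea with toy 2-D vectors. The vectors, the `top_k` helper, and the choice of cosine similarity are illustrative assumptions; Qdrant's actual indexing and distance settings are not shown.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def top_k(query_vec, indexed, k):
    """Return the texts of the k entries most similar to query_vec."""
    ranked = sorted(indexed, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy index: (chunk text, made-up embedding).
toy_index = [
    ("rights of arrested persons", [0.9, 0.1]),
    ("access to justice", [0.7, 0.3]),
    ("county governments", [0.1, 0.9]),
]

results = top_k([1.0, 0.0], toy_index, k=2)
print(results)
```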

In [23]:
# Define the prompt
prompt = """
You are a knowledgeable and friendly legal assistant with expertise in Kenyan law. Your task is to help the common person understand various aspects of Kenyan law in a simple, clear, and easy-to-understand manner.

When responding, please follow these guidelines:

- Simplify Legal Terms: Use simple language and avoid legal jargon. When legal terms must be used, provide clear explanations.
- Provide Examples: Where possible, use examples or analogies to make complex legal concepts easier to grasp.
- Be Concise: Keep your responses short and to the point, but ensure they are complete and informative.
- Be Neutral and Informative: Provide information neutrally without offering personal opinions or legal advice.

Use the following pieces of retrieved context to answer the question.

Question: {question}
Context: {context}
Answer:
"""

prompt_template = PromptTemplate.from_template(prompt)
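The `{question}` and `{context}` placeholders are filled in when the chain runs. The plain-Python sketch below mimics that substitution with `str.format` on a shortened copy of the template. It only illustrates the idea and does not call `PromptTemplate`; the question and context strings are made up for the example.

```python
# Shortened copy of the template's final lines, for illustration only.
template = (
    "Use the following pieces of retrieved context to answer the question.\n\n"
    "Question: {question}\n"
    "Context: {context}\n"
    "Answer:"
)

filled = template.format(
    question="What does Article 49 cover?",
    context="Rights of arrested persons.",
)
print(filled)
```

The trailing `Answer:` is left open so that the model continues the text from that point.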

## Create QA Chain

In [24]:
# Post-processing: join the retrieved chunks into a single context string
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)
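To see what `format_docs` produces, it can be run on stand-in objects. `FakeDoc` below is a hypothetical substitute for LangChain's `Document`, carrying only the `page_content` attribute that the function reads.

```python
from dataclasses import dataclass


@dataclass
class FakeDoc:
    """Stand-in for a retrieved document; only page_content is needed here."""
    page_content: str


def format_docs(docs):
    # Same body as the notebook cell: chunks separated by a blank line.
    return "\n\n".join(doc.page_content for doc in docs)


docs = [FakeDoc("Access to justice."), FakeDoc("Rights of arrested persons.")]
context = format_docs(docs)
print(context)
```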

In [31]:
# LLM

repo_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"

llm = HuggingFaceEndpoint(
    repo_id=repo_id,
    temperature=0.3,
)

The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: read).
Your token has been saved to C:\Users\Spectra\.cache\huggingface\token
Login successful


In [32]:
# Chain using LangChain Expression Language
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt_template
    | llm
    | StrOutputParser()
)
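The data flow of the chain can be mimicked with plain functions. This is only a sketch of the order of operations, with hypothetical stand-ins (`fake_retriever`, `fake_llm`) in place of the Qdrant retriever and the hosted model. It does not use LangChain's runnable machinery.

```python
def fake_retriever(query):
    """Stand-in retriever: returns fixed chunks regardless of the query."""
    return ["Rights of arrested persons.", "Access to justice."]


def join_chunks(chunks):
    """Plays the role of format_docs for plain strings."""
    return "\n\n".join(chunks)


def fill_prompt(inputs):
    """Plays the role of the prompt template."""
    return f"Question: {inputs['question']}\nContext: {inputs['context']}\nAnswer:"


def fake_llm(prompt_text):
    """Stand-in model: reports the prompt size instead of generating text."""
    return f"[model reply to a {len(prompt_text)}-character prompt]"


def run_chain(query):
    # The dict step sends the same query down both branches:
    # one retrieves and formats context, the other passes the query through.
    inputs = {"context": join_chunks(fake_retriever(query)), "question": query}
    prompt_text = fill_prompt(inputs)
    return fake_llm(prompt_text)


answer = run_chain("my laws when I get arrested")
print(answer)
```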

In [16]:
def stream_response(query):
    """Print the chain's answer to `query` chunk by chunk as it is generated."""
    for chunk in rag_chain.stream(query):
        print(chunk, end="", flush=True)
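`stream_response` prints the chunks but returns nothing, so the answer cannot be reused afterwards. One possible variant, sketched here with a hypothetical `fake_stream` generator standing in for `rag_chain.stream`, prints each chunk and also returns the assembled text.

```python
def fake_stream(query):
    """Stand-in for rag_chain.stream: yields a fixed answer in pieces."""
    yield from ["Article ", "49 ", "covers ", "arrested persons."]


def stream_and_collect(stream):
    """Print chunks as they arrive and return the full text at the end."""
    parts = []
    for chunk in stream:
        print(chunk, end="", flush=True)
        parts.append(chunk)
    print()  # finish the line once the stream is exhausted
    return "".join(parts)


full_text = stream_and_collect(fake_stream("my laws when I get arrested"))
```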

In [43]:
stream_response(
    "What are the rights of an arrested person and what section of the constitution guarantees these rights?"
)

The rights of an arrested person in Kenya are guaranteed by the Constitution, specifically Article 49. According to this Article, an arrested person has the right to:

1. Be informed promptly, in a language they understand, of the reason for their arrest, their right to remain silent, and the consequences of not remaining silent.
2. Remain silent.
3. Be brought before a court of law within 24 hours of being arrested, or as soon as reasonably possible.
4. Be released on bond or bail, unless there are compelling reasons not to do so.
5. Be represented by a legal practitioner of their choice.
6. Have an interpreter if they do not understand the language used in court.
7. Be treated with humanity and respect for their inherent dignity.

Additionally, Article 25 of the Constitution states that certain rights and fundamental freedoms cannot be limited, including freedom from torture and cruel, inhuman or degrading treatment or punishment, freedom from slavery or servitude, the right to a fai

In [9]:
from lex_guardian.rag import instantiate_db, setup_retriever, setup_llm, setup_rag_chain

In [7]:
db = instantiate_db(config)
retriever = setup_retriever(db)

2024-07-04 16:31:05,998 - sentence_transformers.SentenceTransformer - INFO - Use pytorch device_name: cpu
2024-07-04 16:31:06,002 - sentence_transformers.SentenceTransformer - INFO - Load pretrained SentenceTransformer: BAAI/bge-base-en-v1.5


In [8]:
# `get_relevant_documents` is deprecated in recent LangChain releases; `invoke` is the replacement
retriever.invoke("my laws when I get arrested")

2024-07-04 16:32:57,261 - httpx - INFO - HTTP Request: POST https://b439d150-8555-4215-96d3-e48eb3ad8e86.europe-west3-0.gcp.cloud.qdrant.io:6333/collections/constitution/points/search "HTTP/1.1 200 OK"


[Document(metadata={'page': 27, 'source': 'data/raw/TheConstitutionOfKenya.pdf', '_id': 'd2f427a3-e603-4f67-a7ac-930648de3f82', '_collection_name': 'constitution'}, page_content='(b) promote efﬁcient administration.\nAccess to justice.\n48. \xa0The State shall ensure access to justice for all persons and, if any fee is\nrequired, it shall be reasonable and shall not impede access to justice.\nRights of arrested persons.\n49. \xa0 (1)\xa0\xa0An arrested person has the right —\n(a) to be informed promptly, in a language that the person understands,\nof—\n(i) the reason for the arrest;\n(ii) the right to remain silent; and\n(iii) the consequences of not remaining silent;\n(b) to remain silent;'),
 Document(metadata={'page': 27, 'source': 'data/raw/TheConstitutionOfKenya.pdf', '_id': '8e0e9da3-55e4-4ab3-ad98-5d113d136408', '_collection_name': 'constitution'}, page_content='(b) promote efﬁcient administration.\nAccess to justice.\n48. \xa0The State shall ensure access to justice for all per

In [11]:
llm = setup_llm()



The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: read).
Your token has been saved to C:\Users\Spectra\.cache\huggingface\token
Login successful


In [14]:
rag_chain = setup_rag_chain(llm=llm, retriever=retriever)

In [15]:
rag_chain

{
  context: VectorStoreRetriever(tags=['Qdrant', 'HuggingFaceEmbeddings'], vectorstore=<langchain_community.vectorstores.qdrant.Qdrant object at 0x000001E94D20C690>, search_kwargs={'k': 5})
           | RunnableLambda(format_docs),
  question: RunnablePassthrough()
}
| PromptTemplate(input_variables=['context', 'question'], template='\n    You are a knowledgeable and friendly legal assistant with expertise in Kenyan law. Your task is to help the common person understand various aspects of Kenyan law in a simple, clear, and easy-to-understand manner.\n\n    When responding, please follow these guidelines:\n\n    - Simplify Legal Terms: Use simple language and avoid legal jargon. When legal terms must be used, provide clear explanations.\n    - Provide Examples: Where possible, use examples or analogies to make complex legal concepts easier to grasp.\n    - Be Concise: Keep your responses short and to the point, but ensure they are complete and informative.\n    - Be Neutral and Informa

In [17]:
stream_response("my laws when I get arrested")

2024-07-04 16:36:56,947 - httpx - INFO - HTTP Request: POST https://b439d150-8555-4215-96d3-e48eb3ad8e86.europe-west3-0.gcp.cloud.qdrant.io:6333/collections/constitution/points/search "HTTP/1.1 200 OK"



    When you get arrested in Kenya, the law guarantees you certain rights to ensure fairness and access to justice. Here are some of the key rights:

    1. **Be informed of the reason for arrest**: You have the right to be told why you are being arrested. This should be done in a language you understand.

    2. **Right to remain silent**: You can choose not to say anything that could be used as evidence against you.

    3. **Right not to be compelled to make a confession or admission**: You cannot be forced to admit to something that could be used as evidence against you.

    4. **Right to be held separately**: If you are arrested, you should be held separately from people who are serving a sentence.

    5. **Right to be brought before a court**: You should be taken to court as soon as possible, but not later than 24 hours after being arrested. If the 24 hours ends outside ordinary court hours or on a day that is not an ordinary court day, you should be taken to court the next da