# **Krack Hack**
## **Theme:**
1) AI for Sustainable development goals
2) Intelligent Financial Advisor

### **Problem Statement:**
Many individuals lack access to personalized financial advice and struggle to analyze their own cash flow. There is also a need to leverage AI for the Sustainable Development Goals (SDGs), for example by promoting financial empowerment so that individuals can make sound financial decisions and plans.

### **Solution (a financial advisory application built with GPT-3.5 Turbo, FAISS, LangChain, RAG, and Next.js, with voice features)**
Develop an intelligent financial advisor platform that uses artificial intelligence to offer users personalized financial advice and cash flow analysis. The platform helps users make informed decisions about investments and financial goals, thereby promoting financial empowerment. This aligns with the SDGs, since financial empowerment supports sustainable development.

**Data Information**
   - The data consists of 9 books:
   1) Traction: Get a Grip on Your Business - Gino Wickman
   2) Rich Dad Poor Dad - Robert T. Kiyosaki
   3) The Intelligent Investor - Benjamin Graham
   4) The Millionaire Next Door - Thomas J. Stanley & William D. Danko (1998)
   5) The Total Money Makeover - Dave Ramsey
   6) The Little Book of Common Sense Investing - John C. Bogle
   7) The Psychology of Money - Morgan Housel
   8) Thinking, Fast and Slow - Daniel Kahneman
   9) I Will Teach You To Be Rich - Ramit Sethi
   
**Loading the documents and splitting the text:**
   - Load every PDF file in the data directory with `PyPDFLoader` (through `DirectoryLoader`) and extract its text
   - Split the text into smaller, overlapping chunks using `RecursiveCharacterTextSplitter` from `langchain.text_splitter`
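
As a rough illustration of what `chunk_size=500` and `chunk_overlap=50` mean, the sketch below cuts a string into fixed-size character windows that share a 50-character overlap. This is a simplification under stated assumptions: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraphs and sentences, so its chunks will generally differ from these. The function name `split_with_overlap` is hypothetical and is not part of LangChain.

```python
def split_with_overlap(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Cut `text` into windows of at most `chunk_size` characters.

    Consecutive windows share `chunk_overlap` characters, so a sentence that
    straddles a boundary still appears whole in at least one chunk.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


# Synthetic 1200-character input, used only to make the window sizes visible
sample = "".join(str(i % 10) for i in range(1200))
chunks = split_with_overlap(sample)
print([len(c) for c in chunks])          # [500, 500, 300]
print(chunks[0][-50:] == chunks[1][:50])  # True
```

The overlap trades a little extra storage for a lower chance that the retriever misses a passage split across two chunks.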

**Text Embeddings:**
   - Text embeddings are generated using the `HuggingFaceEmbeddings` class from `langchain_community.embeddings`.
   - The model used for embedding is 'sentence-transformers/all-MiniLM-L6-v2', and it is configured to run on the CPU.

**Converting to vectors and saving them**
   - Convert the text chunks to vectors with the `FAISS` class from `langchain_community.vectorstores`, then save the resulting index to disk

In [75]:
# !pip install langchain pypdf pypdf2 sentence_transformers

In [76]:
# !pip install accelerate

In [77]:
#<-------------------------------------------------------------------------------------------------->
#Importing the required libraries
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import PyPDFLoader, DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
#<-------------------------------------------------------------------------------------------------->
DATA_PATH = 'books/'  #Path containing my data
DB_FAISS_PATH = 'vectorstore/db_faiss'  #Path where we store the embeddings of the data
#<-------------------------------------------------------------------------------------------------->
#Function for creating embeddings of my data
def create_vector_db():
    #Loading the data
    loader = DirectoryLoader(DATA_PATH,glob='*.pdf',loader_cls=PyPDFLoader)
    documents = loader.load()

    #Splitting the data
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500,chunk_overlap=50)
    texts = text_splitter.split_documents(documents)

    #Converting to embeddings using sentence transformer model
    embeddings = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2',
                                       model_kwargs={'device': 'cpu'})
    db = FAISS.from_documents(texts, embeddings)

    #Saving the model
    db.save_local(DB_FAISS_PATH)

    #Returns the database
    return db
#<-------------------------------------------------------------------------------------------------->

# **Creating a RAG Using LangChain and FAISS**

    Calling the function created above which converts the text data into embeddings

In [78]:
# !pip install faiss-cpu

In [79]:
# #<-------------------------------------------------------------------------------------------------->
# #Storing the embeddings into db1
# db1=create_vector_db()
# #<-------------------------------------------------------------------------------------------------->

    Loading the embeddings created above

In [80]:
# #<-------------------------------------------------------------------------------------------------->
embeddings = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2',
                                       model_kwargs={'device': 'cpu'})
db1 = FAISS.load_local(DB_FAISS_PATH,embeddings,allow_dangerous_deserialization=True)
# #<-------------------------------------------------------------------------------------------------->

- The cell below creates a retriever from the vector store (`db1`). It is configured for similarity search and returns the `k` = 100 chunks most similar to a given query.
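
To make "similarity search with `k`" concrete, here is a minimal brute-force sketch that ranks stored vectors by distance to a query vector and keeps the closest `k`. It assumes Euclidean distance as the similarity measure (the actual index may use a different metric), and the three-dimensional vectors and texts are made up for illustration; real `all-MiniLM-L6-v2` embeddings have 384 dimensions.

```python
import math


def top_k(query_vec, stored, k=2):
    """Return the k (text, distance) pairs closest to query_vec."""
    scored = [(text, math.dist(query_vec, vec)) for text, vec in stored]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:k]


# Hypothetical toy "embeddings", not produced by any real model
stored = [
    ("saving for retirement", [0.9, 0.1, 0.0]),
    ("paying off debt", [0.7, 0.3, 0.1]),
    ("pruning apple trees", [0.0, 0.2, 0.9]),
]
query_vec = [0.8, 0.2, 0.0]

for text, distance in top_k(query_vec, stored, k=2):
    print(f"{distance:.3f}  {text}")
```

A large `k` such as 100 returns many loosely related chunks; a smaller value keeps the prompt shorter and more focused.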

In [81]:
#<-------------------------------------------------------------------------------------------------->
#It checks similar content
retriever = db1.as_retriever(
   search_type="similarity",
   search_kwargs={'k': 100}
)
#<-------------------------------------------------------------------------------------------------->

- Check that the vector database can retrieve chunks similar to a given prompt
- At this stage the output comes directly from the vector database; no LLM is involved yet

In [82]:
#<-------------------------------------------------------------------------------------------------->
#Query to ask from the database
query = "How to handle apple scab disease"

#Fetching it from above
docs = db1.similarity_search(query)
print(docs[0].page_content)
#<-------------------------------------------------------------------------------------------------->

and penetration of host cuticle just as with  ascospores. Late in the season, hyphae grow 
deep into leaf tissue to start the overwintering  phase of the life cycle.  Control of the apple 
scab pathogen is primarily with fungicides on  a timed schedule. Removal of infected or 
dead plant tissues and use of resistant varieties are other recommended practices. 
The ascomycetes described in this chapte r include a significant number of plant


The search returns a lot of text, and it is not obvious which parts are relevant. Fortunately, an LLM can parse this information much faster than we can. All we need to do is pass the output of the vector store (`db1`) to the chat model. To do that, we retrieve the most similar chunks and insert them into the prompt as context.

In [83]:
# #<-------------------------------------------------------------------------------------------------->
def augment_prompt(query: str):
    # get the top 10 results from the knowledge base
    results = db1.similarity_search(query, k=10)
    # get the text from the results
    source_knowledge = "\n".join([x.page_content for x in results])
    # feed into an augmented prompt
    augmented_prompt = f"""Using the contexts below, answer the query.

    Contexts:
    {source_knowledge}

    Query: {query}"""
    return augmented_prompt
# #<-------------------------------------------------------------------------------------------------->

Using this we produce an augmented prompt:

In [84]:
# #<-------------------------------------------------------------------------------------------------->
query = "How to handle apple scab disease"
print(augment_prompt(query))
# #<-------------------------------------------------------------------------------------------------->

Using the contexts below, answer the query.

    Contexts:
    and penetration of host cuticle just as with  ascospores. Late in the season, hyphae grow 
deep into leaf tissue to start the overwintering  phase of the life cycle.  Control of the apple 
scab pathogen is primarily with fungicides on  a timed schedule. Removal of infected or 
dead plant tissues and use of resistant varieties are other recommended practices. 
The ascomycetes described in this chapte r include a significant number of plant
Many diseases, such as apple scab, are more  severe on young tissue because pathogens 
infect this tissue more readily. Plants are ofte n more susceptible to diseases when they are 
young or when the plant or particular plant parts are actively growing. Young tissue can 
be more susceptible to disease because natural barriers (e.g., cuticular coats) have yet to develop. There are also plant diseases, such as Sclerotinia blight of peanut caused by
plant, especially when cultivars differ in 
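
The augmentation step can also be sketched in isolation, without a vector store. In the sketch below, `build_augmented_prompt` and the two example chunks are hypothetical stand-ins: the chunks play the role of the `page_content` strings that a similarity search would return.

```python
def build_augmented_prompt(query: str, chunks: list[str]) -> str:
    """Join retrieved chunks and place them ahead of the query."""
    source_knowledge = "\n".join(chunks)
    return (
        "Using the contexts below, answer the query.\n\n"
        f"Contexts:\n{source_knowledge}\n\n"
        f"Query: {query}"
    )


# Hypothetical chunks standing in for the results of a similarity search
retrieved = [
    "Pay off high-interest debt before investing.",
    "Keep three to six months of expenses as an emergency fund.",
]
prompt = build_augmented_prompt("How should I prioritise my savings?", retrieved)
print(prompt)
```

Keeping the retrieval and the prompt assembly in separate functions makes it easy to test the prompt format without loading the index.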

# **Building LLM Chain for Question-Answering and Integrating Retrieval-Augmented Generation (RAG)**

- We use the OpenAI API to access the GPT-3.5 Turbo model

- The `prompt_template` variable is a string that contains the template for the prompt given to the model. The template has an instruction section, which holds the retrieved context (`{context}`), and a question section, which holds the actual question (`{question}`).

In [85]:
# !pip install "openai>=1.0"  # openai.chat.completions.create (used below) requires the 1.x client

In [86]:
# #<-------------------------------------------------------------------------------------------------->
#Importing Libraries
import transformers
from transformers import pipeline
import os
import openai
# #<-------------------------------------------------------------------------------------------------->
# Loading the GPT-3.5 Turbo model
# The API key is read from the OPENAI_API_KEY environment variable instead of being hard-coded
openai.api_key = os.environ['OPENAI_API_KEY']
model_id = 'gpt-3.5-turbo'
# #<-------------------------------------------------------------------------------------------------->
# Generating template of prompt to give to my model
prompt_template = """
### [INSTRUCTION]
Answer the question based on your knowledge of crop management system:

{context}

### QUESTION:
{question}

[/INSTRUCTION]
"""
# #<-------------------------------------------------------------------------------------------------->

- `generate_prompt(context, question)` fills the prompt template, and `generate_respons(prompt)` sends the result to the **LLM chain**. Because the context comes from the data stored in the vector database, the two together form a **RAG chain**.
- The order of operations is: retrieve the context related to the query from the **vector database**, build the prompt from the template, and then request a response from the OpenAI GPT-3.5 Turbo model, which answers based on that context.
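
The retrieve, build, generate order described above can be sketched with the retriever and the model passed in as plain callables, which lets the flow run without an index or an API key. Everything here is a hypothetical stand-in: `answer`, `fake_retrieve`, `fake_llm`, and the shortened template are illustrative names, not part of the notebook or of any library.

```python
PROMPT_TEMPLATE = "Context:\n{context}\n\nQuestion:\n{question}"


def answer(question, retrieve, llm, k=3):
    """Run one RAG round: retrieve chunks, build the prompt, call the model."""
    chunks = retrieve(question, k)            # 1. fetch context
    context = "\n".join(chunks)
    prompt = PROMPT_TEMPLATE.format(context=context, question=question)  # 2. build prompt
    return llm(prompt)                        # 3. generate the reply


def fake_retrieve(question, k):
    """Stub retriever: returns the first k entries of a fixed corpus."""
    corpus = ["chunk A", "chunk B", "chunk C", "chunk D"]
    return corpus[:k]


def fake_llm(prompt):
    """Stub model: reports how many context chunks reached the prompt."""
    return f"stub reply to a prompt with {prompt.count('chunk')} context chunks"


print(answer("How should I budget?", fake_retrieve, fake_llm, k=2))
```

In the real pipeline the two stubs would be replaced by the vector-store search and the OpenAI call; the control flow stays the same.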

In [87]:
#<-------------------------------------------------------------------------------------------------->
# Create prompt from prompt template
def generate_prompt(context, question):
    prompt = prompt_template.format(context=context, question=question)
    return prompt

# #<-------------------------------------------------------------------------------------------------->

# Creating LLM chain
def generate_respons(prompt):
    # Create a list of messages
    messages = [
        {"role": "system", "content": "You have good knowledge about crop management system."},
        {"role": "user", "content": prompt},
    ]

    # Call the OpenAI API with the list of messages
    # client = openai(api_key='...')
    response = openai.chat.completions.create(
        model=model_id,
        messages=messages,
        temperature=0.5,
        max_tokens=2500,
        top_p=1.0,
        frequency_penalty=0.0,
        presence_penalty=0.0
    )

    # Return the text content of the response
    return response.choices[0].message.content.strip()
# #<-------------------------------------------------------------------------------------------------->

# **Testing our model created on RAG chain**
- Print the LLM's answer to a query, using context retrieved from the vector database

In [88]:
#<-------------------------------------------------------------------------------------------------->
# Query to be asked
query = "How to handle apple scab disease"

# Invoking query in pipeline
context = augment_prompt(query)
prompt = generate_prompt(context, query)
response = generate_respons(prompt)
# Storing output text
output = response
print(output)
# #<-------------------------------------------------------------------------------------------------->

To handle apple scab disease, the following practices can be implemented:

1. **Fungicide Application**: Control of the apple scab pathogen is primarily done through fungicides on a timed schedule. Initiate a fungicidal spray program when conditions favor infection.

2. **Cultural Practices**:
    - **Remove Infected or Dead Plant Tissues**: Regularly remove infected or dead plant tissues to reduce the spread of the disease.
    - **Use Resistant Varieties**: Plant resistant apple varieties to reduce the susceptibility to apple scab.

3. **Monitoring and Forecasting**:
    - Monitor for initial inoculum and weather conditions conducive for spore release and infection. Initiate control measures when infection is favored.
  
4. **Diagnostic Information**:
    - Complete diagnostic information sheets or questionnaires when sending samples for diagnosis to understand the extent of the infection.

5. **Seek Expert Assistance**:
    - If needed, seek assistance from county extension agents o

# **Voice-Recognition**
- Adding voice functionality

In [89]:
# # #<-------------------------------------------------------------------------------------------------->
# #Importing the libraries
# import speech_recognition as sr
# import pyttsx3
# # #<-------------------------------------------------------------------------------------------------->

- Building the functions required for the voice features:
1) Capture voice input from the microphone
2) Convert voice to text
3) Text to speech
4) Speech to text

In [90]:
# # #<-------------------------------------------------------------------------------------------------->
# recognizer = sr.Recognizer()
# def capture_voice_input():
#     with sr.Microphone() as source:
#         print("Listening...")
#         audio = recognizer.listen(source)
#     return audio
# # #<-------------------------------------------------------------------------------------------------->
# def convert_voice_to_text(audio):
#     try:
#         text = recognizer.recognize_google(audio)
#         print("You said: " + text)
#     except sr.UnknownValueError:
#         text = ""
#         print("Sorry, I didn't understand that.")
#     except sr.RequestError as e:
#         text = ""
#         print("Error; {0}".format(e))
#     return text
# # #<-------------------------------------------------------------------------------------------------->
# def text_to_speech(text):
#     engine = pyttsx3.init()
#     engine.say(text)
#     engine.runAndWait()
# # #<-------------------------------------------------------------------------------------------------->
# def speech_to_text():
#     recognizer = sr.Recognizer()
#     with sr.Microphone() as source:
#         print("Say something:")
#         audio = recognizer.listen(source)

#     try:
#         print("You said: " + recognizer.recognize_google(audio))
#     except sr.UnknownValueError:
#         print("Could not understand audio")
#     except sr.RequestError as e:
#         print("Could not request results; {0}".format(e))

# # #<-------------------------------------------------------------------------------------------------->

# **Summarizing the generated response by our model**
- Using the `facebook/bart-large-cnn` model for this task

In [91]:
# # #<-------------------------------------------------------------------------------------------------->
# #Importing libraries for text summarization
# from transformers import BartTokenizer, BartForConditionalGeneration
# model_name = "facebook/bart-large-cnn"
# tokenizer = BartTokenizer.from_pretrained(model_name)
# model = BartForConditionalGeneration.from_pretrained(model_name)
# # #<-------------------------------------------------------------------------------------------------->

In [92]:
# # #<-------------------------------------------------------------------------------------------------->
# def summarze_output(output):
#     input_text = output
#     # BART needs no task prefix (unlike T5), so the text is encoded as-is
#     inputs = tokenizer.encode(input_text, return_tensors="pt", max_length=1024, truncation=True)
#     summary_ids = model.generate(inputs, max_length=100, min_length=50, length_penalty=2.0, num_beams=4, early_stopping=True)
#     summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
#     return summary
# # #<-------------------------------------------------------------------------------------------------->

**Calling the above functions and integrating them**

In [93]:
# #<-------------------------------------------------------------------------------------------------->
# audio = capture_voice_input()
# query = convert_voice_to_text(audio)
# context = augment_prompt(query)
# prompt = generate_prompt(context, query)
# response = generate_respons(prompt)
# # #<-------------------------------------------------------------------------------------------------->
# out = response
# print(out)
# #<-------------------------------------------------------------------------------------------------->
# print(summarze_output(out))
# #<-------------------------------------------------------------------------------------------------->


In [94]:
# !pip install fastapi uvicorn

In [95]:
from fastapi import FastAPI, Query
from fastapi.middleware.cors import CORSMiddleware
app = FastAPI()
origins = [
    "http://localhost:3000",  # Add other allowed origins as needed
]

app.add_middleware(
    CORSMiddleware,
    allow_origins=origins,
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.get("/generate_response/")
async def generate_response(input_data: str = Query(..., title="input_data", description="Enter your prompt here")):
    context = augment_prompt(input_data)
    prompt = generate_prompt(context, input_data)
    response = generate_respons(prompt)
    # summarze_output(response)
    print(response)
    return {"response": response}


In [None]:
import asyncio
import uvicorn
if __name__ == "__main__":
    config = uvicorn.Config(app)
    server = uvicorn.Server(config)
    await server.serve()

INFO:     Started server process [8532]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)


Hello! How can I assist you today with crop management system?
INFO:     127.0.0.1:65063 - "GET /generate_response/?input_data=Hi HTTP/1.1" 200 OK
To handle apple scab disease, several management practices can be implemented:

1. **Fungicide Application**: Control of the apple scab pathogen is primarily done through the application of fungicides on a timed schedule. Fungicides should be applied preventatively before the disease appears, following the recommendations provided by agricultural extension services or plant pathologists.

2. **Sanitation**: Removal of infected or dead plant tissues, such as fallen leaves, can help reduce the overwintering of the apple scab pathogen. This practice reduces the initial inoculum for the next growing season.

3. **Resistant Varieties**: Planting resistant apple varieties can help reduce the impact of apple scab disease. Resistant varieties are less susceptible to infection, providing a natural defense against the pathogen.

4. **Monitoring and Fo

INFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [8532]
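
A client for the endpoint above might look like the following sketch. It assumes the server is reachable at `http://127.0.0.1:8000`, as in the log, and that the reply is the JSON object `{"response": ...}` returned by the handler. The helper names `build_url` and `ask` are hypothetical. Only `build_url` is exercised here, since `ask` needs a running server.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Address taken from the Uvicorn log; adjust if the server runs elsewhere
BASE_URL = "http://127.0.0.1:8000/generate_response/"


def build_url(prompt: str) -> str:
    """URL-encode the prompt into the `input_data` query parameter."""
    return f"{BASE_URL}?{urlencode({'input_data': prompt})}"


def ask(prompt: str, timeout: float = 60.0) -> str:
    """Send the prompt to the running server and return the answer text."""
    with urlopen(build_url(prompt), timeout=timeout) as reply:
        return json.load(reply)["response"]


print(build_url("How do I start investing?"))
# http://127.0.0.1:8000/generate_response/?input_data=How+do+I+start+investing%3F
```

Encoding the query with `urlencode` avoids malformed URLs when the prompt contains spaces, question marks, or ampersands.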
