# **Krack Hack**
## **Theme:**
1) AI for Sustainable Development Goals
2) Intelligent Financial Advisor

### **Problem Statement:**
Many individuals lack access to personalized financial advice and struggle to analyze their own cash flow. There is also a need to use AI to address Sustainable Development Goals (SDGs) such as financial empowerment, enabling individuals to make sound financial decisions and plans.

### **Solution (Building an application for financial advisory using the GPT-3.5 Turbo model, FAISS, LangChain, RAG, and Next.js with voice features)**
Develop an intelligent financial advisor platform that uses artificial intelligence to offer personalized financial advice and cash flow analysis. The platform helps users make informed decisions about investments and financial goals, thereby promoting financial empowerment. This aligns with the SDGs by addressing the need for financial empowerment to promote sustainable development.

**Data Information**
   - The data consists of 9 books:
   1) Traction: Get a Grip on Your Business - Gino Wickman
   2) Rich Dad Poor Dad
   3) The Intelligent Investor - Benjamin Graham
   4) The Millionaire Next Door - Thomas J. Stanley & William D. Danko (1998)
   5) The Total Money Makeover - Dave Ramsey
   6) The Little Book of Common Sense Investing
   7) The Psychology of Money
   8) Thinking, Fast and Slow
   9) I Will Teach You To Be Rich - Ramit Sethi
   
**Loading the document and splitting text:**
   - Load the PDF files with `PyPDFLoader` and extract their text.
   - Split the text into smaller chunks with `RecursiveCharacterTextSplitter` from `langchain.text_splitter` (chunk size 500, overlap 50).
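
To make the chunk size and overlap settings concrete, here is a minimal sketch of a fixed-window splitter. It is a simplification and not what `RecursiveCharacterTextSplitter` does internally (that class also tries to split on separators such as paragraphs and sentences); the function name `split_with_overlap` is hypothetical and only the values 500 and 50 come from this notebook.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> List[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "x" * 1200
print([len(chunk) for chunk in split_with_overlap(sample)])  # [500, 500, 300]
```

The overlap means the last 50 characters of one chunk are repeated at the start of the next, so a sentence cut at a chunk boundary still appears whole in at least one chunk.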

**Text Embeddings:**
   - Text embeddings are generated using the `HuggingFaceEmbeddings` class from `langchain_community.embeddings`.
   - The model used for embedding is 'sentence-transformers/all-MiniLM-L6-v2', and it is configured to run on the CPU.

**Converting to vectors and saving it**
   - Convert the text chunks to vectors with the `FAISS` class from `langchain_community.vectorstores`, then save the index to disk.

In [24]:
#<-------------------------------------------------------------------------------------------------->
#Importing the required libraries
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import PyPDFLoader, DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter 
#<-------------------------------------------------------------------------------------------------->
DATA_PATH = 'data/'  #Path containing my data
DB_FAISS_PATH = 'vectorstore/db_faiss'  #Path where we store the embeddings of the data
#<-------------------------------------------------------------------------------------------------->
#Function for creating embeddings of my data
def create_vector_db():
    #Loading the data
    loader = DirectoryLoader(DATA_PATH,glob='*.pdf',loader_cls=PyPDFLoader)
    documents = loader.load()

    #Splitting the data
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500,chunk_overlap=50)
    texts = text_splitter.split_documents(documents)

    #Converting to embeddings using sentence transformer model 
    embeddings = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2',
                                       model_kwargs={'device': 'cpu'})
    db = FAISS.from_documents(texts, embeddings)

    #Saving the FAISS index to disk
    db.save_local(DB_FAISS_PATH)

    #Returns the database
    return db
#<-------------------------------------------------------------------------------------------------->

# **Creating a RAG Using LangChain and FAISS**

    Calling the function defined above, which converts the text data into embeddings and saves the index (commented out here; the saved index is loaded in the next cell)

In [25]:
# #<-------------------------------------------------------------------------------------------------->
# #Storing the embeddings into db1
# db1=create_vector_db()
# #<-------------------------------------------------------------------------------------------------->

    Loading the saved embeddings from disk with the same embedding model

In [26]:
# #<-------------------------------------------------------------------------------------------------->
embeddings = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2',
                                       model_kwargs={'device': 'cpu'})
db1 = FAISS.load_local(DB_FAISS_PATH,embeddings)
# #<-------------------------------------------------------------------------------------------------->

- The next cell creates a retriever from the vector store (`db1`). The retriever is configured for similarity search and returns the 100 chunks most similar to a given query (`k=100`).

In [27]:
#<-------------------------------------------------------------------------------------------------->
#It checks similar content
retriever = db1.as_retriever(
   search_type="similarity",
   search_kwargs={'k': 100}
)
#<-------------------------------------------------------------------------------------------------->
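
Conceptually, the similarity search behind the retriever ranks stored chunk vectors by how close they are to the query vector and keeps the top `k`. The sketch below illustrates that idea with toy 2-dimensional vectors and Euclidean distance. It is an assumption-laden illustration: the real index and distance metric are handled by FAISS, the real vectors come from the embedding model, and the name `top_k` is hypothetical.

```python
import math
from typing import List, Sequence


def top_k(query_vec: Sequence[float], doc_vecs: List[Sequence[float]], k: int = 2) -> List[int]:
    """Return the indices of the k vectors closest to query_vec (smallest distance first)."""
    scored = [(math.dist(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort()  # sort by distance, then by index for ties
    return [idx for _, idx in scored[:k]]


# Toy "embeddings": the first two point in almost the same direction as the query
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
print(top_k([1.0, 0.05], docs, k=2))  # [0, 1]
```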

- Check the vector database to see whether it can retrieve chunks of content similar to a given prompt.
- At this stage the output comes straight from the vector database; no language model is involved yet.

In [28]:
#<-------------------------------------------------------------------------------------------------->
#Query to ask from the database
query = "I want to start investing, give me some tips"

#Fetching it from above
docs = db1.similarity_search(query)
print(docs[0].page_content)
#<-------------------------------------------------------------------------------------------------->

buy . Investing is not buying. It’ s more a case of knowing.
3. Organize smart people.
Intelligent people are those who work with or hire a person who is
more intelligent than they are. When you need advice, make sure
you choose your advisor wisely .
There is a lot to learn, but the rewards can be astronomical. If you do
not want to learn those skills, then being a type-one investor is highly
recommended. It is what you know that is your greatest wealth. It is what


The search returns a lot of text, and it is not obvious which parts are relevant. Fortunately, the LLM can parse this information much faster than we can. All we need to do is pass the output of the vector store to the chat model, which the following function does by building an augmented prompt.

In [29]:
# #<-------------------------------------------------------------------------------------------------->
def augment_prompt(query: str):
    # get top 10 results from knowledge base
    results = db1.similarity_search(query, k=10)
    # get the text from the results
    source_knowledge = "\n".join([x.page_content for x in results])
    # feed into an augmented prompt
    augmented_prompt = f"""Using the contexts below, answer the query.

    Contexts:
    {source_knowledge}

    Query: {query}"""
    return augmented_prompt
# #<-------------------------------------------------------------------------------------------------->

Using this we produce an augmented prompt:

In [30]:
# #<-------------------------------------------------------------------------------------------------->
query = "I want to start investing, give me some tips"
print(augment_prompt(query))
# #<-------------------------------------------------------------------------------------------------->

Using the contexts below, answer the query.

    Contexts:
    buy . Investing is not buying. It’ s more a case of knowing.
3. Organize smart people.
Intelligent people are those who work with or hire a person who is
more intelligent than they are. When you need advice, make sure
you choose your advisor wisely .
There is a lot to learn, but the rewards can be astronomical. If you do
not want to learn those skills, then being a type-one investor is highly
recommended. It is what you know that is your greatest wealth. It is what
now	you	understand	concepts,	like	automation	and	IRAs,	that	would	have
seemed	foreign	just	a	few	weeks	ago.	The	best	thing	you	can	do	is	be	a	great
example	to	others	and,	if	they	want	your	advice,	share	this	book	with	them.
Ignore	the	noise.	Remember,	investing	shouldn’t	be	dramatic	or	even	fun
—it	should	be	methodical,	calm,	and	as	fun	as	watching	grass	grow.	(What
you	can	
do
	with	your	investments—and	your	Rich	Life—
that’s
	fun!)
good investors.
Education and

# **Building LLM Chain for Question-Answering and Integrating Retrieval Augmented Generation (RAG)**

- We use the OpenAI API to access the GPT-3.5 Turbo model.

- The prompt_template variable is a string that contains a template for the prompt that will be given to the model. This template includes an instruction section, which provides context for the question, and a question section, which contains the actual question.

In [31]:
# #<-------------------------------------------------------------------------------------------------->
#Importing Libraries
import os
import transformers
from transformers import pipeline
import openai
# #<-------------------------------------------------------------------------------------------------->
# Loading the GPT-3.5 Turbo model
# Read the API key from the OPENAI_API_KEY environment variable instead of hard-coding a secret
openai.api_key = os.environ['OPENAI_API_KEY']
model_id = 'gpt-3.5-turbo'
# #<-------------------------------------------------------------------------------------------------->
# Generating template of prompt to give to my model
prompt_template = """
### [INSTRUCTION]
Answer the question based on your financial advisory, investment, stock, loans knowledge. Here is context to help:

{context}

### QUESTION:
{question}

[/INSTRUCTION]
"""
# #<-------------------------------------------------------------------------------------------------->

- The function `generate_respons(prompt)` generates a response using the **LLM chain**. Because the prompt already carries context retrieved from the vector database, the whole flow forms a **RAG chain**.
- The functions work together: the **vector database** is searched first for context related to the query, `generate_prompt` fills the template with that context and the question, and `generate_respons` sends the resulting prompt to the OpenAI GPT-3.5 Turbo model to get an answer grounded in the context.

In [32]:
#<-------------------------------------------------------------------------------------------------->
# Create prompt from prompt template
def generate_prompt(context, question):
    prompt = prompt_template.format(context=context, question=question)
    return prompt

# #<-------------------------------------------------------------------------------------------------->

# Creating LLM chain
def generate_respons(prompt):
    # Create a list of messages
    messages = [
        {"role": "system", "content": "You are a helpful financial advisor."},
        {"role": "user", "content": prompt},
    ]

    # Call the OpenAI API with the list of messages
    response = openai.ChatCompletion.create(
        model=model_id,
        messages=messages,
        temperature=0.5,
        max_tokens=2500,
        top_p=1.0,
        frequency_penalty=0.0,
        presence_penalty=0.0
    )

    # Return the text content of the response
    return response.choices[0].message.content.strip()
# #<-------------------------------------------------------------------------------------------------->
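
To see the order of the steps without calling the OpenAI API, the flow can be sketched with stand-in functions. Everything in this block is hypothetical and for illustration only: `answer_with_rag`, `fake_retrieve`, `fake_llm`, and the shortened `PROMPT_TEMPLATE` are not part of the notebook. In the real pipeline the retrieval step is `db1.similarity_search` and the model call is `generate_respons`.

```python
from typing import Callable, List

# Simplified stand-in for the notebook's prompt_template
PROMPT_TEMPLATE = "Context:\n{context}\n\nQuestion:\n{question}"


def answer_with_rag(
    question: str,
    retrieve: Callable[[str, int], List[str]],
    llm: Callable[[str], str],
    k: int = 3,
) -> str:
    """Retrieve k chunks, place them in the prompt, then ask the model."""
    chunks = retrieve(question, k)              # 1) retrieval
    context = "\n".join(chunks)                 # 2) join chunks into one context string
    prompt = PROMPT_TEMPLATE.format(context=context, question=question)  # 3) augmentation
    return llm(prompt)                          # 4) generation


def fake_retrieve(question: str, k: int) -> List[str]:
    # Hypothetical corpus standing in for the FAISS index
    corpus = ["Diversify across asset classes.", "Start with a small amount.", "Set clear goals."]
    return corpus[:k]


def fake_llm(prompt: str) -> str:
    # Hypothetical model that only reports how much prompt text it received
    return f"(model reply based on {len(prompt)} characters of prompt)"


print(answer_with_rag("How do I start investing?", fake_retrieve, fake_llm, k=2))
```

Passing the retriever and the model in as arguments keeps the flow testable offline; swapping in the real functions does not change the order of the steps.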

# **Testing our model created on RAG chain**
- Print the LLM's answer after giving it context retrieved from the vector database.

In [33]:
#<-------------------------------------------------------------------------------------------------->
# Query to be asked
query = "I want to start investing, give me some tips"

# Invoking query in pipeline
context = augment_prompt(query)
prompt = generate_prompt(context, query)
response = generate_respons(prompt)
# Storing output text
output = response
print(output)
# #<-------------------------------------------------------------------------------------------------->

Here are some tips to help you get started with investing:

1. Educate yourself: Take the time to learn about different investment options, such as stocks, bonds, mutual funds, and real estate. Understand the risks and potential returns associated with each type of investment.

2. Set clear financial goals: Determine why you want to invest and what you hope to achieve. This will help guide your investment decisions and keep you focused.

3. Start small: Begin with a small amount of money that you are comfortable investing. This allows you to gain experience and confidence without risking too much.

4. Diversify your portfolio: Spread your investments across different asset classes and industries to reduce risk. Diversification helps protect your portfolio from the impact of any single investment performing poorly.

5. Consider your risk tolerance: Understand how much risk you are willing to take on. Investments with higher potential returns often come with higher risks. Make sure your 

# **Voice-Recognition**
- Adding voice functionality

In [34]:
# # #<-------------------------------------------------------------------------------------------------->
# #Importing the libraries
# import speech_recognition as sr
# import pyttsx3
# # #<-------------------------------------------------------------------------------------------------->

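- Building all the required functions for handling voice features:
1) Capture voice input from the microphone
2) Convert voice to text
3) Text to speech
4) Speech to text (a standalone helper that records and transcribes in one call)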

In [35]:
# # #<-------------------------------------------------------------------------------------------------->
# recognizer = sr.Recognizer()
# def capture_voice_input():
#     with sr.Microphone() as source:
#         print("Listening...")
#         audio = recognizer.listen(source)
#     return audio
# # #<-------------------------------------------------------------------------------------------------->
# def convert_voice_to_text(audio):
#     try:
#         text = recognizer.recognize_google(audio)
#         print("You said: " + text)
#     except sr.UnknownValueError:
#         text = ""
#         print("Sorry, I didn't understand that.")
#     except sr.RequestError as e:
#         text = ""
#         print("Error; {0}".format(e))
#     return text
# # #<-------------------------------------------------------------------------------------------------->
# def text_to_speech(text):
#     engine = pyttsx3.init()
#     engine.say(text)
#     engine.runAndWait()
# # #<-------------------------------------------------------------------------------------------------->
# def speech_to_text():
#     recognizer = sr.Recognizer()
#     with sr.Microphone() as source:
#         print("Say something:")
#         audio = recognizer.listen(source)

#     try:
#         print("You said: " + recognizer.recognize_google(audio))
#     except sr.UnknownValueError:
#         print("Could not understand audio")
#     except sr.RequestError as e:
#         print("Could not request results; {0}".format(e))

# # #<-------------------------------------------------------------------------------------------------->

# **Summarizing the generated response by our model**
- Using the `facebook/bart-large-cnn` model for this task

In [36]:
# # #<-------------------------------------------------------------------------------------------------->
# #Importing libraries for text summarization
# from transformers import BartTokenizer, BartForConditionalGeneration
# model_name = "facebook/bart-large-cnn"
# tokenizer = BartTokenizer.from_pretrained(model_name)
# model = BartForConditionalGeneration.from_pretrained(model_name)
# # #<-------------------------------------------------------------------------------------------------->

In [37]:
# # #<-------------------------------------------------------------------------------------------------->
# def summarze_output(output):
#     input_text =output
#     inputs = tokenizer.encode("summarize: " + input_text, return_tensors="pt", max_length=1024, truncation=True)
#     summary_ids = model.generate(inputs, max_length=100, min_length=50, length_penalty=2.0, num_beams=4, early_stopping=True)
#     summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
#     return summary
# # #<-------------------------------------------------------------------------------------------------->

**Calling the above functions and integrating them**

In [38]:
# #<-------------------------------------------------------------------------------------------------->
# audio = capture_voice_input()
# query = convert_voice_to_text(audio)
context = augment_prompt(query)
prompt = generate_prompt(context, query)
response = generate_respons(prompt)
# #<-------------------------------------------------------------------------------------------------->
out = response
print(out)
# #<-------------------------------------------------------------------------------------------------->
# print(summarze_output(out))
# #<-------------------------------------------------------------------------------------------------->      


Here are some tips to help you get started with investing:

1. Educate yourself: Take the time to learn about different investment options, such as stocks, bonds, mutual funds, and real estate. Understand the risks and potential returns associated with each.

2. Set clear financial goals: Determine what you want to achieve with your investments. Are you saving for retirement, a down payment on a house, or a child's education? Setting specific goals will help you make better investment decisions.

3. Start small: Begin with a small amount of money that you can afford to invest. This will allow you to gain experience and learn from any mistakes without risking a significant portion of your savings.

4. Diversify your portfolio: Spread your investments across different asset classes and industries. Diversification helps reduce risk by minimizing the impact of any single investment performing poorly.

5. Consider your risk tolerance: Understand your comfort level with risk and invest accor

In [39]:
from fastapi import FastAPI, Query
from fastapi.middleware.cors import CORSMiddleware
app = FastAPI()
origins = [
    "http://localhost:3000",  # Add other allowed origins as needed
]

app.add_middleware(
    CORSMiddleware,
    allow_origins=origins,
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.get("/generate_response/")
async def generate_response(input_data: str = Query(..., title="input_data", description="Enter your prompt here")):
    context = augment_prompt(input_data)
    prompt = generate_prompt(context, input_data)
    response = generate_respons(prompt)
    # summarze_output(response)
    return {"response": response}
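
The route above is registered with a trailing slash (`/generate_response/`). FastAPI redirects a request without the slash to the registered path by default, which is why the server log further down shows a `307 Temporary Redirect` before each `200 OK`. A client can avoid the extra round trip by including the slash. The sketch below only builds the URL; the helper name `build_request_url` is hypothetical, and the base address is the one printed by uvicorn in this notebook.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://127.0.0.1:8000"  # address printed by uvicorn in the log


def build_request_url(prompt: str) -> str:
    """Build the GET URL for the endpoint, keeping the trailing slash before the query string."""
    query = urlencode({"input_data": prompt}, quote_via=quote)  # spaces become %20
    return f"{BASE_URL}/generate_response/?{query}"


print(build_request_url("tell me some startup ideas"))
# http://127.0.0.1:8000/generate_response/?input_data=tell%20me%20some%20startup%20ideas
```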


In [40]:
import asyncio
import uvicorn
if __name__ == "__main__":
    config = uvicorn.Config(app)
    server = uvicorn.Server(config)
    await server.serve()

INFO:     Started server process [10028]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)


INFO:     127.0.0.1:55509 - "GET /generate_response?input_data= HTTP/1.1" 307 Temporary Redirect
INFO:     127.0.0.1:55509 - "GET /generate_response/?input_data= HTTP/1.1" 200 OK
INFO:     127.0.0.1:55644 - "GET /generate_response?input_data= HTTP/1.1" 307 Temporary Redirect
INFO:     127.0.0.1:55644 - "GET /generate_response/?input_data= HTTP/1.1" 200 OK
INFO:     127.0.0.1:55726 - "GET /generate_response?input_data= HTTP/1.1" 307 Temporary Redirect
INFO:     127.0.0.1:55726 - "GET /generate_response/?input_data= HTTP/1.1" 200 OK
INFO:     127.0.0.1:55759 - "GET /generate_response?input_data= HTTP/1.1" 307 Temporary Redirect
INFO:     127.0.0.1:55759 - "GET /generate_response/?input_data= HTTP/1.1" 200 OK
INFO:     127.0.0.1:55930 - "GET /generate_response?input_data=Give%20me%20some%20bussiness%20ideas HTTP/1.1" 307 Temporary Redirect
INFO:     127.0.0.1:55930 - "GET /generate_response/?input_data=Give%20me%20some%20bussiness%20ideas HTTP/1.1" 200 OK
INFO:     127.0.0.1:56058 - "GET 

ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "c:\Users\Garv Gupta\AppData\Local\Programs\Python\Python311\Lib\site-packages\openai\api_requestor.py", line 753, in _interpret_response_line
    data = json.loads(rbody)
           ^^^^^^^^^^^^^^^^^
  File "c:\Users\Garv Gupta\AppData\Local\Programs\Python\Python311\Lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\Garv Gupta\AppData\Local\Programs\Python\Python311\Lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\Garv Gupta\AppData\Local\Programs\Python\Python311\Lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

The above exception was the direct cause of the following

INFO:     127.0.0.1:56085 - "GET /generate_response?input_data=tell%20me%20some%20startup%20ideas HTTP/1.1" 307 Temporary Redirect
INFO:     127.0.0.1:56085 - "GET /generate_response/?input_data=tell%20me%20some%20startup%20ideas HTTP/1.1" 200 OK
INFO:     127.0.0.1:56152 - "GET /generate_response?input_data=give%20some%20points%20to%20build%20my%20startup HTTP/1.1" 307 Temporary Redirect
INFO:     127.0.0.1:56152 - "GET /generate_response/?input_data=give%20some%20points%20to%20build%20my%20startup HTTP/1.1" 200 OK
INFO:     127.0.0.1:56161 - "GET /generate_response?input_data=give%20some%20points%20to%20build%20my%20startup HTTP/1.1" 307 Temporary Redirect
INFO:     127.0.0.1:56161 - "GET /generate_response/?input_data=give%20some%20points%20to%20build%20my%20startup HTTP/1.1" 200 OK
INFO:     127.0.0.1:56220 - "GET /generate_response?input_data=how%20to%20start%20my%20startup. HTTP/1.1" 307 Temporary Redirect
INFO:     127.0.0.1:56220 - "GET /generate_response/?input_data=how%20to%2