<img src="src/pighat.png" alt="drawing" width="400"/>
<h1 style='text-align:center;'><span style='color:#46aa22;'>Financial Literacy </span><span style='color:#80d162;'>AI </span><span style='color:#46aa22;'>Resource</span></h1>

---

Write Summary Here

Note about running in Google Colab notebook

In [None]:
# Clone the FLAIR repository
!git clone https://github.com/hannahawalsh/FLAIR.git
# import os
# if "LFQA_utils.py" not in os.listdir() and "FLAIR" not in os.listdir():
#     os.chdir("FLAIR")

In [None]:
%%capture
# Install dependencies - this takes a few moments
!pip install transformers
!pip install faiss_gpu
!pip install datasets

In [None]:
# Imports 
import sys
sys.path.insert(0, "./FLAIR")
from LFQA_utils import longFormQA, wrap_print, ask_questions

### The Long Form Question Answering Model
A long-form question answering model is an AI model that provides long, complex answers to questions. Unlike other question answering models, it does not simply pull snippets of text from a database; it synthesizes information into one cohesive answer. This lets it handle more abstract questions, not just trivia-style ones.

Our model consists of two parts. The first is a question-answering (retrieval) model, which pulls passages of text related to the question. The second is a sequence-to-sequence model, which combines those passages into one cohesive answer.
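As a rough illustration of this two-stage flow, the sketch below uses a toy word-overlap retriever in place of the real retrieval model and only builds the text that would be handed to the sequence-to-sequence model. The function names, the sample passages, and the `question: ... context: <P> ...` input layout are assumptions made for illustration; this is not the code in `LFQA_utils.py`.

```python
import math
import re
from collections import Counter

# Hypothetical sample passages; the real project indexes a financial literacy CSV.
PASSAGES = [
    "A credit score is a number that lenders use to judge how likely you are to repay a loan.",
    "A savings account is a bank account that pays interest on the money you keep in it.",
    "A debit card takes money directly out of your checking account when you pay.",
]


def bag_of_words(text):
    """Lowercase the text and count its words (a crude stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, passages, k=2):
    """Stage 1: return the k passages most similar to the question."""
    q = bag_of_words(question)
    ranked = sorted(passages, key=lambda p: cosine(q, bag_of_words(p)), reverse=True)
    return ranked[:k]


def build_generator_input(question, support):
    """Stage 2 input: the question plus the retrieved passages, joined with <P> markers."""
    context = "<P> " + " <P> ".join(support)
    return f"question: {question} context: {context}"


question = "What is a credit score?"
support = retrieve(question, PASSAGES, k=2)
generator_input = build_generator_input(question, support)
print(generator_input)
```

In the real pipeline, a trained generator would turn this combined text into a fluent answer; the sketch stops at building its input.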

Some of the functions used in this project are adapted from the blog post and accompanying GitHub code [Explain _Anything_ Like I'm Five: A Model for Open Domain Long Form Question Answering](https://yjernite.github.io/lfqa.html), which goes into more depth on the technical aspects of this type of model. We adapted its concepts to the needs of our project and applied them to financial literacy.

---

#### Loading the model and the data
Run the cell below to load the model weights and index the financial literacy dataset (if the index doesn't already exist). Indexing can take a little while the first time. You only need to run this cell once per session.
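For intuition about what indexing involves: a dense index stores one embedding vector per passage, so a question's embedding can be matched against all of them by inner product. The sketch below does this by brute force in plain Python with made-up 3-dimensional vectors. The notebook installs `faiss_gpu`, presumably for this purpose; the class name `TinyDenseIndex` and the vectors here are hypothetical.

```python
class TinyDenseIndex:
    """Brute-force maximum-inner-product search over a list of vectors."""

    def __init__(self):
        self.vectors = []
        self.passages = []

    def add(self, vector, passage):
        """Store one embedding vector together with the passage it represents."""
        self.vectors.append(list(vector))
        self.passages.append(passage)

    def search(self, query, k=1):
        """Return the k (passage, score) pairs with the highest inner product."""
        scores = [sum(q * v for q, v in zip(query, vec)) for vec in self.vectors]
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        return [(self.passages[i], scores[i]) for i in order[:k]]


index = TinyDenseIndex()
index.add([0.9, 0.1, 0.0], "passage about credit scores")
index.add([0.1, 0.8, 0.1], "passage about savings accounts")
index.add([0.0, 0.2, 0.9], "passage about retirement plans")

query_vector = [1.0, 0.0, 0.0]  # pretend embedding of a credit-related question
results = index.search(query_vector, k=2)
print(results)
```

A library index does the same job far faster on large datasets, which is why building it once and reusing it saves time.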

Note: this must be run on a GPU to work. It currently doesn't support running on a CPU. If you're running this in Google Colab, don't worry; it is already set up to run on a small GPU.
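If you want to confirm that a GPU is visible before running the loading cell, a check along these lines may help. It assumes PyTorch may or may not be installed in your environment and reports `False` when it is missing; `gpu_available` is a hypothetical helper, not part of `LFQA_utils`.

```python
def gpu_available():
    """Return True if PyTorch is installed and can see a CUDA GPU, else False."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no way to use a GPU from this notebook
        return False
    return bool(torch.cuda.is_available())


has_gpu = gpu_available()
print("GPU available:", has_gpu)
```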

In [None]:
# Note: this cell takes ~2 minutes to run

# Initialize our Long-Form Question-Answering class
data_filename = "https://github.com/hannahawalsh/FLAIR/raw/main/financial_literacy_data.csv"
lfqa = longFormQA(data_filename)

# Load model weights from Hugging Face
qa_model_name = "yjernite/retribert-base-uncased"
s2s_model_name = "yjernite/bart_eli5"
lfqa.load_model_weights(qa_model_name, s2s_model_name)

# Create a dense index of our data if one hasn't been created yet
lfqa.create_dense_index(batch_size=512)

### And we're ready to roll!
You can now ask the model questions! Try it out below.

Change how the answers are formed by changing some of the custom parameters.

Note that because of the limited time and resources of the hackathon, the answers may not always make sense or be accurate. The model also doesn't know anything about non-financial topics, so don't ask!

In [None]:
# Ask it a question (uses the default parameters)
question = "What is a credit score?"
answer = lfqa.ask_a_question(question)
wrap_print(answer)

In [None]:
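# Add some custom parameters
model_kwargs = {
    "max_question_len": 512,
    "min_answer_len": 256,
    "max_answer_len": 512}
question = "What is a credit score?"

# Pass the question and unpack the custom parameters as keyword arguments
answer = lfqa.ask_a_question(question, **model_kwargs)
wrap_print(answer)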

---
Below are a few more examples. Feel free to change the cells and experiment yourself. Note that the dataset lacks information in some areas, so the model may answer those questions poorly or even incorrectly. Also watch how the model parameters affect the answers.
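One way to watch that effect is to run the same question under several parameter sets and compare the results side by side. The sketch below uses a stand-in `fake_answer` function so it runs anywhere; in the notebook you would pass `lfqa.ask_a_question` instead, assuming it accepts these keyword arguments. The helper name `compare_settings` is hypothetical.

```python
def compare_settings(answer_fn, question, settings):
    """Call answer_fn once per parameter set and collect (settings, answer) pairs."""
    results = []
    for kwargs in settings:
        results.append((kwargs, answer_fn(question, **kwargs)))
    return results


def fake_answer(question, max_question_len=512, min_answer_len=32, max_answer_len=256):
    """Stand-in for a real model: emits min_answer_len filler words, capped at max_answer_len."""
    words = ["answer"] * min_answer_len
    return " ".join(words[:max_answer_len])


settings = [
    {"min_answer_len": 8, "max_answer_len": 64},
    {"min_answer_len": 32, "max_answer_len": 256},
]
results = compare_settings(fake_answer, "What is a credit score?", settings)
for kwargs, answer in results:
    print(kwargs, "->", len(answer.split()), "words")
```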

In [None]:
# Here are a few more examples
kwargs = {
    "max_question_len": 1024, 
    "min_answer_len": 64, 
    "max_answer_len": 512} 

questions = [
    "What is the minimum amount I need to open a bank account?",
    "Is there an account for medical costs?",
    "Should I save for retirement if I paid off all my bills?",
    "How do I pay my bills every month?",
    "What's the first thing I should do when I get out of jail?",
    "Are there resources for people leaving prison?"]

ask_questions(questions, kwargs)

In [None]:
# Ask those same questions but with different model parameters
kwargs = {
    "max_question_len": 256, 
    "min_answer_len": 32, 
    "max_answer_len": 256} 
ask_questions(questions, kwargs)

In [None]:
# Try some more questions
questions = [
    "How do I open up a bank account?",
    "What do I need a bank account for?",
    "Why should I open a savings account?",
    "What's the difference between a savings and checking account?",
    "What's the difference between credit and debit?",
]
kwargs = {"max_question_len": 256, "min_answer_len": 32, "max_answer_len": 512} 
ask_questions(questions, kwargs)

In [None]:
# And those same ones with different parameters (a shorter maximum answer length)
kwargs = {"max_question_len": 256, "min_answer_len": 32, "max_answer_len": 256} 
ask_questions(questions, kwargs)

In [None]:
# These answers use a very small min_answer_len, and it is clear that we
# need to increase it: when answers are too short, they aren't helpful
kwargs = {"max_question_len": 1024, "min_answer_len": 8, "max_answer_len": 1024} 
questions = [
    "What do I do if I'm poor?",
    "How do I get out of debt?",
    "Can I get a credit card without a credit score?",
    "How do I improve my credit score?",
    "Are there bank accounts for non-citizens?"]
ask_questions(questions, kwargs)