## Back-End Code
_collapse this section or scroll past it_

In [None]:
!pip install PyPDF2

**Video Tutorials**
1. Setup: [Tutorial_API_Key.mp4](https://csciitd-my.sharepoint.com/:v:/g/personal/ph1230116_iitd_ac_in/EWYUHVBmZ9ZNvo2f9S_6BokBkjJwuZhjdxAeXEIMPPer_A?nav=eyJyZWZlcnJhbEluZm8iOnsicmVmZXJyYWxBcHAiOiJPbmVEcml2ZUZvckJ1c2luZXNzIiwicmVmZXJyYWxBcHBQbGF0Zm9ybSI6IldlYiIsInJlZmVycmFsTW9kZSI6InZpZXciLCJyZWZlcnJhbFZpZXciOiJNeUZpbGVzTGlua0NvcHkifX0&e=D96hbu)
2. File Upload: [Tutorial_File_Upload.mp4](https://csciitd-my.sharepoint.com/:v:/g/personal/ph1230116_iitd_ac_in/EX_DeHmvDl9KtpEC-DzMbZoB057CR4trmJh9LmQ0nNA2Ew?nav=eyJyZWZlcnJhbEluZm8iOnsicmVmZXJyYWxBcHAiOiJPbmVEcml2ZUZvckJ1c2luZXNzIiwicmVmZXJyYWxBcHBQbGF0Zm9ybSI6IldlYiIsInJlZmVycmFsTW9kZSI6InZpZXciLCJyZWZlcnJhbFZpZXciOiJNeUZpbGVzTGlua0NvcHkifX0&e=Zf7Z8d)
3. Usage: [Tutorial_Usage.mp4](https://csciitd-my.sharepoint.com/:v:/g/personal/ph1230116_iitd_ac_in/EbRXjNUpMRlAs5qwgt5DHeMBT-Ssu8SbeiRjlNg3rqOxzA?nav=eyJyZWZlcnJhbEluZm8iOnsicmVmZXJyYWxBcHAiOiJPbmVEcml2ZUZvckJ1c2luZXNzIiwicmVmZXJyYWxBcHBQbGF0Zm9ybSI6IldlYiIsInJlZmVycmFsTW9kZSI6InZpZXciLCJyZWZlcnJhbFZpZXciOiJNeUZpbGVzTGlua0NvcHkifX0&e=6Rx0lF)

---

**Setup**
1. Go to [Google AI Studio](https://aistudio.google.com/app/apikey/) and click the button labelled "Create API key"
2. Select any option in the "Search Google Cloud Projects" box; the API key will then appear on the page in a table
3. Copy the API key (from the API key column); you may close the AI Studio tab now
4. Click the key icon in the left sidebar (labelled "Secrets"), then click "+ Add New Secret"
5. Name it `GOOGLE_API_KEY` (left textbox) and paste your API key into the Value (right) textbox
6. Make sure to check the "Notebook Access" radio button (left of the Name field)

These steps are needed only once; after the first setup, you can skip them on later runs.

---

**File Upload**
1. Click the file folder in the left sidebar (below the key/secrets icon)
2. Click the upload file button (left-most icon in the menu bar of the newly opened "Files" tab) (If you have a URL, save the PDF to your device first)
3. Select & upload the file from the file explorer pop-up menu
4. After the file uploads (may take several seconds), right-click it in the Files tab and rename it to `report.pdf` (if you don't see the PDF, click the refresh button to the right of the Upload File button)
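
Before running the notebook, it can help to confirm that the upload and rename worked. The following is a minimal sketch, assuming the file sits in the notebook's working directory. `looks_like_pdf` is a hypothetical helper, not part of this notebook, and it only checks for the `%PDF-` signature at the start of the file rather than validating the whole document:

```python
from pathlib import Path


def looks_like_pdf(path: str = 'report.pdf') -> bool:
    '''Returns True if `path` exists and starts with the PDF signature.'''
    pdf_file = Path(path)

    # A missing file usually means the upload or the rename step was skipped
    if not pdf_file.is_file():
        return False

    # PDF files begin with the bytes "%PDF-"
    with pdf_file.open('rb') as handle:
        return handle.read(5) == b'%PDF-'


print(looks_like_pdf('report.pdf'))
```

If this prints `False`, re-check the file name in the Files tab before using "Run All".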

---

**Usage**
1. In the menu bar, under the Runtime menu, click "Run All" (or just press Ctrl+F9)
2. Scroll to the bottom; the response text is displayed under the 2nd-to-last cell (may take up to a minute to generate)
3. Ask further questions in the input textbox visible at the end of the last cell
4. End the conversation by inputting `QUIT` in the chat textbox

---


In [None]:
# Imports

# PDF pre-processing
from PyPDF2 import PdfReader as reader
from tqdm   import tqdm

# Gen AI API
import google.generativeai as genai

# Colab stuff
from google.colab    import userdata
from IPython.display import display, Markdown

In [None]:
# API key extraction
api_key=userdata.get("GOOGLE_API_KEY")
genai.configure(api_key=api_key)

In [None]:
# Document reader
def read_docu(pdf_path: str = 'report.pdf', give_token_estm: bool = True):
    '''Converts PDF to python string
    pdf_path : path to the PDF to be scanned
    give_token_estm : whether to print an estimate for the token count'''

    with open(pdf_path, 'rb') as pdf_file:
        reader_obj = reader(pdf_file)
        pages_obj  = reader_obj.pages

        docu_text  = ''

        for page in pages_obj:
            docu_text += page.extract_text().strip()
            docu_text += '\n'

    if give_token_estm:
        word_count_estm = docu_text.count(' ') + docu_text.count('\n') + 10
        print(f"Input token count estimate : {round(1.48*word_count_estm)}\n")

    return docu_text
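
The estimate printed by `read_docu` is a rough heuristic: the counts of spaces and newlines approximate the word count, and the factor 1.48 converts words to tokens. The standalone sketch below reproduces that arithmetic on a toy string. `estimate_tokens` is a hypothetical name used only for illustration, and the count actually reported by the API may differ:

```python
def estimate_tokens(text: str, tokens_per_word: float = 1.48) -> int:
    '''Rough token estimate using the same heuristic as read_docu.'''
    # Spaces and newlines approximate word boundaries;
    # the +10 offset is copied from read_docu
    word_count_estm = text.count(' ') + text.count('\n') + 10
    return round(tokens_per_word * word_count_estm)


sample = 'Revenue grew 12% year over year.\nEBITDA margin was stable.\n'
print(f"Input token count estimate : {estimate_tokens(sample)}")
```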

In [None]:
# API Caller/Summarizer
def summarize(input_text: str, ai_model: genai.GenerativeModel, give_tokens_used: bool = True):
    '''Summarizes input_text by calling ai_model'''

    summary = ai_model.generate_content(input_text)

    if give_tokens_used:
        print(f"Tokens consumed : {summary.usage_metadata.prompt_token_count}")
        print(f"Summary length  : {len(summary.text.split())} words\n")

    return summary.text

In [None]:
# Model Config
model_sys_prompt = '''\
You shall be given the transcript of a company's financial document. The document can be a third-party report, an annual company filing, or a questionnaire.
Start with a title to your summary, followed by one line about the nature of the document.

Your primary task is to summarize it, providing details about the following topics:
1. Market Performance - Any key financial figures (e.g. EBITDA, revenue, CAGR, etc.)
2. Key Charts/Tables - The important tables/graphics. Do not copy-paste the data, simply mention the heading & give a sentence about it.
3. Company Info - Background info about the history or nature of the company. Do not give excessive technical details here.
4. Business Model - A quick summary of the company's business model.
5. Strategic Initiatives - Major moves that the company has made recently, or has planned.

Omit sections when details pertaining to them are unavailable in the document.
Also note that the first and last few pages might be largely irrelevant.

Your secondary task is to add an "Analysis" section to your report with the following topics:
1. Risk Analysis - Point out the risks with the company's strategic initiatives.
2. Competitive Analysis - Point out how you think the company would fare in the current market (give a high level overview of the market if info unavailable).

DO NOT make up information.'''

model_config = genai.GenerationConfig(temperature=0.2, max_output_tokens=5000, response_mime_type='text/plain')

model_name  = 'gemini-1.5-pro'

## Main

This section consists of 2 cells:
- The first generates a summary of your uploaded document
- The second allows you to further question the AI

In [None]:
# Summary generation

# Pre-processing
docu_text = read_docu('report.pdf')
token_count_estm = round(1.48*(docu_text.count(' ') + docu_text.count('\n') + 10))

if token_count_estm >= 32000:
    model_name = 'gemini-1.5-flash'


# Model Setup
model = genai.GenerativeModel(  model_name=f'models/{model_name}',
                                generation_config=model_config,
                                system_instruction=model_sys_prompt )


# Execution
summary = summarize(docu_text, model)

display(Markdown(summary))
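
The model switch in the cell above can be read as a small rule: documents estimated at 32,000 tokens or more go to `gemini-1.5-flash`, and everything else goes to `gemini-1.5-pro`. Here is a minimal sketch of that rule as a pure function. `choose_model` and its parameter names are hypothetical; the threshold and model names are taken from the cell above:

```python
def choose_model(token_count_estm: int,
                 threshold: int = 32000,
                 default_model: str = 'gemini-1.5-pro',
                 large_input_model: str = 'gemini-1.5-flash') -> str:
    '''Picks a model name based on the estimated input size.'''
    # Inputs at or above the threshold are routed to the second model
    if token_count_estm >= threshold:
        return large_input_model
    return default_model


for estimate in (5000, 32000, 120000):
    print(estimate, '->', choose_model(estimate))
```

Keeping the rule in one function makes the threshold easy to adjust without touching the model setup code.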

In [None]:
# Further Prompting

chat_sys_prompt = (
    'In a previous conversation, you were given a big financial document and were asked to produce a summary.'
    ' You came up with this:\n'
    f'{summary}'
    '\nYou will now be asked follow up questions about this summary.'
    '\nDo not use more than 1 emoji per message.'
)

chat_model = genai.GenerativeModel( model_name='models/gemini-1.5-pro',
                                    generation_config=model_config,
                                    system_instruction=chat_sys_prompt  )

chat = chat_model.start_chat()


display(Markdown("Type `QUIT` to end the conversation\n"))
display(Markdown("## Cross-Questioning Regarding The Generated Summary"))

print()
msg = input('Message: ')

while msg.upper() != 'QUIT':

    print("\nAI Response:")

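    try:
        response = chat.send_message(msg)
    except Exception:
        # Most likely the rate limit; `response` is unset or stale here, so do not display it
        print('The Gemini API free quota is 2 API calls/min; kindly wait a bit before sending messages.')
        print('You may try resending your previous message.')
    else:
        # Render the reply only when the call succeeded
        display(Markdown(response.text))

    msg = input('\nMessage: ')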

print("Conversation terminated. Re-run cell to initiate new conversation.")
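
Instead of retyping a message after a quota error, the call could be retried automatically after a pause. The sketch below is one possible approach under stated assumptions: `send_with_retry`, `max_attempts`, and `wait_seconds` are hypothetical names, the 30-second default follows from the 2 calls/min quota mentioned in the cell above, and it catches any `Exception` because this notebook does not say which error type the API raises.

```python
import time
from typing import Callable


def send_with_retry(send: Callable[[str], object], msg: str,
                    max_attempts: int = 3, wait_seconds: float = 30.0):
    '''Calls send(msg), pausing and retrying when it raises.'''
    for attempt in range(1, max_attempts + 1):
        try:
            return send(msg)
        except Exception as error:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            print(f'Attempt {attempt} failed ({error}); retrying in {wait_seconds}s')
            time.sleep(wait_seconds)
```

In the chat loop, this would be called as `send_with_retry(chat.send_message, msg)` in place of the direct `chat.send_message(msg)` call.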