# **Step-1: Installing Dependencies**

1. pdfplumber - This package is used to extract text from PDF documents.
2. google-generativeai - This package provides tools for interacting with Google's Gemini AI models.

In [5]:
!pip install google-generativeai
!pip install pdfplumber



# **Importing Libraries**
* import os: Imports the os module for interacting with the operating system (e.g., reading environment variables).
* import google.generativeai as genai: Imports the Google Gemini AI package and gives it a shorter alias (genai).
* import pdfplumber: Imports the pdfplumber package for PDF text extraction.
* from google.colab import files: Imports the files module from google.colab, used here to upload the PDF in a Google Colab environment.

In [6]:
import os
import google.generativeai as genai
import pdfplumber
from google.colab import files
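
The `google.colab` import only succeeds inside a Colab runtime. As a minimal sketch, assuming you may also want to run the notebook locally, the import can be guarded and the result recorded in a flag. `IN_COLAB` and `choose_pdf_path` are hypothetical names introduced for illustration; they are not part of the original notebook.

```python
# Guarded import: google.colab exists only inside a Colab runtime.
try:
    from google.colab import files  # provides files.upload() in Colab
    IN_COLAB = True
except ImportError:
    files = None
    IN_COLAB = False


def choose_pdf_path(local_path=None):
    """Return a PDF path: upload in Colab, otherwise use the given local path."""
    if IN_COLAB:
        uploaded = files.upload()      # dict mapping file name -> file bytes
        return next(iter(uploaded))    # name of the first uploaded file
    if local_path is None:
        raise ValueError("Outside Colab, pass local_path explicitly.")
    return local_path
```

Outside Colab, `choose_pdf_path("report.pdf")` simply returns the path it was given, so the rest of the pipeline can stay unchanged.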

# **Step-2: Setting Up API Key**
Load the API key securely from an environment variable:
genai.configure(api_key=os.getenv("GOOGLE_GEMINI_API_KEY"))

Set GOOGLE_GEMINI_API_KEY to your own API key, which should be kept private.
* The code below reads a key I created from that variable; never paste the key itself into the notebook.

In [7]:
# Securely Load API Key from Environment Variable
genai.configure(api_key=os.getenv("GOOGLE_GEMINI_API_KEY"))
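
`os.getenv` returns `None` when the variable is unset, which only surfaces later as an authentication error. A minimal sketch, assuming the key is stored in `GOOGLE_GEMINI_API_KEY`, fails early with a clear message instead. `load_api_key` is a hypothetical helper, not part of the original notebook.

```python
import os


def load_api_key(env_var="GOOGLE_GEMINI_API_KEY"):
    """Read the API key from the environment; fail loudly if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running this notebook."
        )
    return key

# Intended use in the notebook (assumes genai is already imported):
# genai.configure(api_key=load_api_key())
```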

# **Step-3: Extracting and Analyzing**
* Extracting Text from PDF
* Analyzing Text with Gemini AI
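
Between extraction and analysis, the notebook picks a prompt by guessing the document type from keyword counts. The sketch below shows that scoring idea on a toy input. `score_categories`, `sample_keywords`, and `sample_text` are illustrative names and data, not taken from the notebook. It lowercases both the text and the keywords so that mixed-case terms such as "EBITDA" can match.

```python
def score_categories(text, keywords):
    """Count case-insensitive keyword occurrences per category."""
    lowered = text.lower()
    return {
        category: sum(lowered.count(word.lower()) for word in words)
        for category, words in keywords.items()
    }


# Toy data for illustration only
sample_keywords = {
    "earnings_call": ["revenue", "EBITDA", "guidance"],
    "legal_document": ["contract", "liability"],
}
sample_text = "Revenue grew 12% and EBITDA margins improved; guidance is unchanged."

scores = score_categories(sample_text, sample_keywords)
best = max(scores, key=scores.get)  # category with the highest count
print(scores, best)
```

Note that `str.count` matches substrings, so a keyword like "contract" would also be counted inside "contractor". For a rough classifier this is usually acceptable, but it is a known limitation of this approach.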

In [8]:
# Function to Extract Text from PDF
def extract_text_from_pdf(pdf_path):
    """
    Extracts text from a given PDF file.

    Args:
        pdf_path (str): The file path of the uploaded PDF.

    Returns:
        str: Extracted text from the PDF.
    """
    text = ""
    try:
        with pdfplumber.open(pdf_path) as pdf:
            for page in pdf.pages:
                page_text = page.extract_text()
                if page_text:
                    text += page_text + "\n"
        return text.strip() if text else "No readable text found."
    except Exception as e:
        print(f"❌ Error extracting text: {e}")
        return ""

# Function to Automatically Detect Document Type
def detect_document_type(text):
    """
    Determines the type of financial document based on keyword matching.

    Args:
        text (str): Extracted text from the PDF.

    Returns:
        str: Detected document type.
    """
    keywords = {
        "earnings_call": ["revenue", "EBITDA", "YoY growth", "guidance", "margins"],
        "financial_report": ["balance sheet", "net profit", "cash flow", "assets", "liabilities"],
        "market_research": ["market trends", "CAGR", "consumer behavior", "growth potential"],
        "legal_document": ["contract", "obligation", "regulation", "liability", "compliance"]
    }

    # Lowercase the text once and the keywords too, so "EBITDA" or "CAGR" can match
    lowered = text.lower()
    scores = {category: sum(lowered.count(word.lower()) for word in words) for category, words in keywords.items()}
    best_match = max(scores, key=scores.get)

    return best_match if scores[best_match] > 1 else "unknown"  # Require at least two keyword hits

# Function to Analyze Extracted Text with Gemini AI
def analyze_pdf_text(text, document_type):
    """
    Generates AI-powered insights based on the document type.

    Args:
        text (str): Extracted text from the PDF.
        document_type (str): Type of document detected.

    Returns:
        str: AI-generated financial insights.
    """
    prompts = {
        "earnings_call": "Extract key business updates, future growth insights, major financial triggers, and material impacts on next year's earnings.",
        "financial_report": "Summarize revenue trends, risk factors, and market analysis in the financial report.",
        "market_research": "Highlight consumer trends, competitor strategies, and key market insights from this document.",
        "legal_document": "Extract key legal clauses, obligations, and compliance points."
    }

    prompt_text = prompts.get(document_type, "Summarize key insights from this document.")

    # Skip the API call when extraction failed (empty string) or found no text
    if not text or text == "No readable text found.":
        return "❌ No valid text extracted from the document."

    model = genai.GenerativeModel("gemini-pro")
    response = model.generate_content(f"{prompt_text}\n\n{text}")
    return response.text.strip()

# Upload PDF File
print("📂 Please upload a PDF file.")
uploaded = files.upload()
pdf_path = list(uploaded.keys())[0]  # Get uploaded file name

# Extract Text
extracted_text = extract_text_from_pdf(pdf_path)

# Detect Document Type
detected_type = detect_document_type(extracted_text)
print(f"📄 Detected Document Type: {detected_type}")

# Analyze Text with Gemini AI
analysis_result = analyze_pdf_text(extracted_text, detected_type)

# Display Results
print("\n🔍 AI-Generated Analysis:")
print(analysis_result)

📂 Please upload a PDF file.


Saving SJS Transcript Call.pdf to SJS Transcript Call (1).pdf
📄 Detected Document Type: earnings_call

🔍 AI-Generated Analysis:
**Key Business Updates:**

* Acquisition of Walter Pack India (WPI), a high-margin manufacturer of plastic parts for passenger vehicles and consumer appliances.
* WPI's addition has reduced dependence on two-wheeler segment and enhanced SJS's portfolio.
* Strategic objectives achieved through WPI acquisition include addition of new technologies, customers, manufacturing capabilities, and management bandwidth.

**Future Growth Insights:**

* Inorganic growth strategy to continue as a core focus.
* WPI's cross-selling opportunities, synergies with Exotech, and export growth potential will drive growth.
* Organic growth target of 20-25% for SJS and Exotech combined.
* New product introductions, such as optical plastics and cover glass, will expand SJS's addressable market.

**Major Financial Triggers:**

* Consolidated revenue growth of 13.6% YoY, driven by stron