
In [56]:
!pip install google-generativeai




In [57]:
# Code for a single PDF
import google.generativeai as genai
import PyPDF2
import re
import os
from dotenv import load_dotenv

# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

# Load environment variables from Google Drive
env_path = "/content/drive/My Drive/apikey.env"
load_dotenv(env_path)

# Retrieve API key securely
api_key = os.getenv("GOOGLE_API_KEY")

# Configure Google Gemini AI; stop early if the key is missing so later API calls don't fail obscurely
if not api_key:
    raise ValueError("❌ Error: API Key Not Found! Make sure to store it in Google Drive.")
genai.configure(api_key=api_key)
print("✅ API Key Loaded Securely from Google Drive")

# Function to extract text from PDF
def extract_text_from_pdf(pdf_path):
    text = ""
    with open(pdf_path, "rb") as file:
        reader = PyPDF2.PdfReader(file)
        for page in reader.pages:
            # extract_text() may return nothing for image-only pages
            text += (page.extract_text() or "") + "\n"
    return text

# Function to clean extracted text
def clean_text(text):
    text = re.sub(r'\s+', ' ', text)  # Remove extra spaces and newlines
    return text.strip()

# Function to extract insights using Google Gemini AI
def extract_insights(text):
    model = genai.GenerativeModel("gemini-pro")  # Use Google's Gemini Pro model
    # Limit text size to avoid errors: only the first 8000 characters are sent.
    # (This note must stay outside the f-string, or it becomes part of the prompt.)
    response = model.generate_content(f'''
    Extract key financial and business insights from the following earnings call transcript:

    {text[:8000]}

    Identify:
    - Future growth prospects
    - Key changes in business strategy
    - Revenue triggers
    - Material effects on next year's earnings
    - Important financial figures (Revenue, EBITDA, PAT, CAPEX, etc.)
    - Risks and challenges the company might face
    - Sentiment analysis (Positive, Neutral, or Negative)

    Provide a structured summary with bullet points.
    ''')
    return response.text

# Define the PDF file path
pdf_path = "/content/drive/My Drive/SJS Transcript Call.pdf"  # Update if needed

# Run the functions
extracted_text = extract_text_from_pdf(pdf_path)
cleaned_text = clean_text(extracted_text)
insights = extract_insights(cleaned_text)

# Save the output to a file
insights_file = "/content/drive/My Drive/Investor_Insights.txt"
with open(insights_file, "w", encoding="utf-8") as file:
    file.write(insights)

print("Extracted Key Insights:")
print(insights)
print(f"Insights saved permanently in Google Drive: {insights_file}")


Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
✅ API Key Loaded Securely from Google Drive
Extracted Key Insights:
**Future growth prospects:**

* SJS Enterprises plans to continue focusing on inorganic growth through acquisitions to strengthen its market leadership in the aesthetics business.

**Key changes in business strategy:**

* Increasing focus on passenger vehicles and consumer appliances, and reducing reliance on two-wheelers.

**Revenue triggers:**

* Cross-selling opportunities and synergies between SJS, Exotech, and Walter Pack India.
* Expansion into new and emerging technologies and customer segments.

**Material effects on next year's earnings:**

* Enhanced revenue growth and margin performance due to the acquisition of Walter Pack India.
* Increased manufacturing capabilities and increased management bandwidth.

**Important financial figures:**

* **Revenue:** Consolidated revenues grew b
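
The cell above sends only the first 8000 characters of the transcript to the model (`text[:8000]`), so anything later in a long call is never analysed. One possible workaround, sketched here under assumptions, is to split the cleaned text into chunks at whitespace boundaries and run the extraction once per chunk. `split_into_chunks` and its `max_chars` parameter are hypothetical names that do not appear in the notebook, and the default of 8000 simply mirrors the slice used above.

```python
def split_into_chunks(text, max_chars=8000):
    """Split text into pieces of at most max_chars, breaking at spaces where possible.

    Hypothetical helper (not part of the original notebook).
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            # Prefer to break at the last space inside the current window
            space = text.rfind(" ", start, end)
            if space > start:
                end = space
        chunk = text[start:end].strip()
        if chunk:
            chunks.append(chunk)
        start = end
    return chunks
```

Each chunk could then be passed to `extract_insights` and the per-chunk summaries concatenated. Whether that produces a better summary than the single truncated call has not been checked here.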

In [58]:
# Code for multiple files: handles multiple PDFs
import google.generativeai as genai
import PyPDF2
import re
import os
from dotenv import load_dotenv
from google.colab import drive

def mount_drive():
    drive.mount('/content/drive')

def load_api_key():
    env_path = "/content/drive/My Drive/apikey.env"
    load_dotenv(env_path)
    api_key = os.getenv("GOOGLE_API_KEY")
    if not api_key:
        raise ValueError("❌ API Key is missing! Ensure it is stored in Google Drive.")
    genai.configure(api_key=api_key)
    print("✅ API Key Loaded Securely from Google Drive")

def extract_text_from_pdf(pdf_path):
    try:
        text = ""
        with open(pdf_path, "rb") as file:
            reader = PyPDF2.PdfReader(file)
            for page in reader.pages:
                page_text = page.extract_text()
                if page_text:
                    text += page_text + "\n"
        if not text.strip():
            print(f"⚠️ No text extracted from {pdf_path}. It may be an image-based PDF.")
        return text.strip()
    except Exception as e:
        print(f"❌ Error processing {pdf_path}: {e}")
        return ""

def clean_text(text):
    return re.sub(r'\s+', ' ', text).strip()

def extract_insights(text):
    if not text:
        return "⚠️ No insights generated. The input text was empty."

    model = genai.GenerativeModel("gemini-pro")
    prompt = f'''
    Extract key financial and business insights from the following earnings call transcript:
    {text[:8000]}

    Identify:
    - Future growth prospects
    - Key changes in business strategy
    - Revenue triggers
    - Material effects on next year's earnings
    - Important financial figures (Revenue, EBITDA, PAT, CAPEX, etc.)
    - Risks and challenges the company might face (if not explicitly mentioned, infer possible risks)
    - Sentiment analysis (Positive, Neutral, or Negative)

    Provide a structured summary with bullet points.
    '''

    try:
        response = model.generate_content(prompt)
        return response.text.strip() if response else "⚠️ No insights received from Gemini API."
    except Exception as e:
        return f"❌ Error generating insights: {e}"

def process_pdfs(pdf_folder):
    insights_folder = "/content/drive/My Drive/Investor_Insights"
    os.makedirs(insights_folder, exist_ok=True)

    for filename in os.listdir(pdf_folder):
        # Match the extension case-insensitively so ".PDF" files are not skipped
        if filename.lower().endswith(".pdf"):
            pdf_path = os.path.join(pdf_folder, filename)
            print(f"📄 Processing {filename}...")
            extracted_text = extract_text_from_pdf(pdf_path)
            cleaned_text = clean_text(extracted_text)
            insights = extract_insights(cleaned_text)

            # Build "<name>_Insights.txt" from the PDF's base name
            base_name = os.path.splitext(filename)[0]
            insights_file = os.path.join(insights_folder, f"{base_name}_Insights.txt")
            with open(insights_file, "w", encoding="utf-8") as file:
                file.write(insights)
            print(f"✅ Insights saved: {insights_file}\n")

def main():
    mount_drive()
    load_api_key()
    pdf_folder = "/content/drive/My Drive/Investor_PDFs"
    process_pdfs(pdf_folder)

if __name__ == "__main__":
    main()


Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
✅ API Key Loaded Securely from Google Drive
📄 Processing SJS Transcript Call.pdf...
✅ Insights saved: /content/drive/My Drive/Investor_Insights/SJS Transcript Call_Insights.txt
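
Before spending API calls on a whole folder, it can help to preview which PDFs would be picked up and where each result would be written. The following is a minimal sketch of such a dry run; `plan_outputs` is a hypothetical helper that is not part of the notebook, and it assumes the same `<name>_Insights.txt` naming seen in the output above.

```python
import os

def plan_outputs(pdf_folder, insights_folder):
    """Return (pdf_path, insights_path) pairs for every PDF in pdf_folder.

    Hypothetical helper: it only lists files and builds paths, it calls no API.
    """
    pairs = []
    for filename in sorted(os.listdir(pdf_folder)):
        base_name, extension = os.path.splitext(filename)
        if extension.lower() != ".pdf":
            continue  # skip anything that is not a PDF
        pdf_path = os.path.join(pdf_folder, filename)
        insights_path = os.path.join(insights_folder, f"{base_name}_Insights.txt")
        pairs.append((pdf_path, insights_path))
    return pairs
```

Printing the returned pairs for `/content/drive/My Drive/Investor_PDFs` would show the planned input and output paths without reading any PDF or contacting Gemini.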



In [None]:
!pip install python-dotenv




In [None]:
!pip install PyPDF2



In [None]:
from google.colab import drive
drive.mount('/content/drive')


Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
