In [2]:
import fitz  # PyMuPDF
import requests
import json

In [3]:

# Step 1: Extract text from PDF
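def extract_text_from_pdf(pdf_path):
    # Open the PDF as a context manager so the file is closed when we are done
    with fitz.open(pdf_path) as doc:
        # Concatenate the plain text of every page in reading order
        return "".join(page.get_text() for page in doc)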

In [4]:
# Step 2: Send text to LM Studio's local server (Phi-3.1-mini)
def summarize_text(text, max_chunk_length=3000):
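    # Only the first max_chunk_length characters are sent to the model;
    # the rest of the document is dropped, so a long PDF is summarized from its opening pages only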
    prompt = f"Summarize the following document:\n\n{text[:max_chunk_length]}"
    
    payload = {
        "model": "phi-3.1-mini-128k-instruct",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt}
        ],
        "temperature": 0.3,
        "max_tokens": 500
    }

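    # requests serializes the payload and sets the JSON Content-Type header itself
    response = requests.post(
        "http://localhost:1234/v1/chat/completions",
        json=payload,
        timeout=300,  # assumed limit in seconds; without one a stalled server blocks forever
    )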

    if response.status_code == 200:
        return response.json()['choices'][0]['message']['content']
    else:
        return f"Error: {response.status_code}\n{response.text}"
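
Because `summarize_text` only looks at the first `max_chunk_length` characters, most of a long report never reaches the model. One possible way around this is to split the text into overlapping chunks, summarize each chunk, and then summarize the joined partial summaries. The sketch below is an illustration under stated assumptions, not part of the original notebook: `chunk_text`, `summarize_long_text`, and the `overlap` parameter are hypothetical names, and the summarizer is passed in as `summarize_fn` so the helpers can be tried without a running server.

```python
def chunk_text(text, max_chunk_length=3000, overlap=200):
    """Split text into character windows that overlap by `overlap` characters."""
    if not 0 <= overlap < max_chunk_length:
        raise ValueError("overlap must be >= 0 and smaller than max_chunk_length")
    step = max_chunk_length - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chunk_length])
        # Stop once this window has reached the end of the text
        if start + max_chunk_length >= len(text):
            break
    return chunks


def summarize_long_text(text, summarize_fn, max_chunk_length=3000, overlap=200):
    """Summarize each chunk, then summarize the joined partial summaries."""
    chunks = chunk_text(text, max_chunk_length, overlap)
    if not chunks:
        return ""
    if len(chunks) == 1:
        return summarize_fn(chunks[0])
    # First pass: one summary per chunk
    partial_summaries = [summarize_fn(chunk) for chunk in chunks]
    # Second pass: condense the partial summaries into a single summary
    return summarize_fn("\n\n".join(partial_summaries))
```

With the functions defined in this notebook, a call could look like `summarize_long_text(extracted_text, summarize_text)`. Two caveats apply: `summarize_text` truncates its input again, so joined partial summaries longer than `max_chunk_length` characters would be cut in the second pass, and because it returns error messages as ordinary strings, a failed request would be folded into the final summary unnoticed.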


In [5]:

# Step 3: Run the summarization pipeline
if __name__ == "__main__":
    pdf_path = "annual-report-2023-2024.pdf"  # Replace with your PDF file path
    print("Extracting text...")
    extracted_text = extract_text_from_pdf(pdf_path)

    print("Sending to Phi-3.1 for summarization...")
    summary = summarize_text(extracted_text)

    print("\nSummary:\n", summary)

Extracting text...
Sending to Phi-3.1 for summarization...

Summary:
 The Integrated Annual Report for FY 2023-24 by Tata Consultancy Services (TCS) reflects on two decades of value creation since its IPO in 2004. As an IT services, consulting and business solutions organization with over half a million well-trained consultants globally, the company has consistently delivered innovative technological transformations to various industries such as banking, retail, manufacturing, healthcare, and utilities. TCS's unique Location Independent Agile™ delivery model is recognized for its excellence in software development.

The report emphasizes how technology acts as an enabler for businesses seeking competitive advantage, strategy alignment, growth opportunities, improved customer experience, and employee engagement. It highlights the synergistic relationship between Cloud and AI/GenAI technologies that is driving significant shifts in industry approaches to innovation and efficiency. With o