# Corporate Earnings Reports Summarizer with PDF Summary Download

## Overview
This notebook extracts key figures from corporate earnings reports for three periods, compares them, and produces a concise PDF summary. It is aimed at financial analysts, investors, and fintech professionals who need to digest and compare lengthy financial documents quickly.

## Features
- Accepts PDF inputs of corporate earnings reports for three time periods:
  1. Current quarter
  2. Previous quarter
  3. Same quarter from the previous year
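- Extracts key financial metrics stated in billions of dollars: net sales, operating income, net income, and North America, International, and AWS segment sales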
- Generates a comparative analysis across the three time periods
- Calculates quarter-over-quarter (QoQ) and year-over-year (YoY) changes
- Extracts up to five bullet points (lines introduced by `•`) from the current report
- Creates a downloadable PDF summary with tables and bullet points
- Detects the company name from the report text (falling back to "Unknown Company") and includes it in the output
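
The QoQ and YoY figures listed above are ordinary percentage changes against two different baselines. The following is a minimal sketch of that arithmetic; `pct_change` is a hypothetical helper name and the figures are made up for illustration.

```python
from typing import Optional


def pct_change(current: float, baseline: float) -> Optional[float]:
    """Return the percentage change from baseline to current, or None if undefined."""
    if baseline == 0:
        return None  # division by zero: the change is not defined
    return (current - baseline) / baseline * 100


# Illustrative figures in billions of dollars (made up for this example).
current, previous, last_year = 150.0, 120.0, 100.0

qoq = pct_change(current, previous)   # baseline: the previous quarter
yoy = pct_change(current, last_year)  # baseline: the same quarter last year
print(f"QoQ: {qoq:.2f}%  YoY: {yoy:.2f}%")  # QoQ: 25.00%  YoY: 50.00%
```

Note that when the baseline is negative (for example, a net loss), the sign of the result is hard to interpret, so such rows deserve a manual check.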

## How It Works
1. Upload PDF files of earnings reports for the three specified time periods
2. The script extracts text and key financial data from each PDF
3. It then compares the data, calculating percentage changes
4. Key points from the current quarter's report are extracted
5. A comprehensive PDF is generated with:
   - A comparison table of financial metrics
   - QoQ and YoY percentage changes
   - Key points from the current report
6. The summary PDF is automatically downloaded upon completion
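
Steps 3 and 5 can be sketched without any PDF handling. The snippet below is an illustration under stated assumptions, not the notebook's own code: `format_amount`, `format_change`, and `build_rows` are hypothetical helper names, and the figures are made up. It reports a metric that is missing from an older report as `N/A`, whereas defaulting a missing value to 0 would display it as `$0.00B`.

```python
def format_amount(value):
    """Format a value in billions, or 'N/A' when the metric was not found."""
    return "N/A" if value is None else f"${value:.2f}B"


def format_change(current, baseline):
    """Format the percentage change, or 'N/A' when it cannot be computed."""
    if current is None or not baseline:
        return "N/A"  # missing value or zero baseline
    return f"{(current - baseline) / baseline * 100:+.2f}%"


def build_rows(current, previous, last_year):
    """Build table rows: one header row, then one row per metric in `current`."""
    rows = [["Metric", "Current", "Previous", "Year ago", "QoQ", "YoY"]]
    for metric, value in current.items():
        prev, year_ago = previous.get(metric), last_year.get(metric)
        rows.append([
            metric,
            format_amount(value),
            format_amount(prev),
            format_amount(year_ago),
            format_change(value, prev),
            format_change(value, year_ago),
        ])
    return rows


# Made-up figures; "Net income" is deliberately missing from the year-ago data.
current = {"Net sales": 150.0, "Net income": 12.0}
previous = {"Net sales": 120.0, "Net income": 10.0}
last_year = {"Net sales": 100.0}

for row in build_rows(current, previous, last_year):
    print(row)
```

A list of lists in this shape is also what reportlab's `Table` accepts as its data argument, so the rows could be passed on to the PDF step unchanged.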

## Usage
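Run the cells in order in Google Colab (the script relies on `google.colab.files` for upload and download, and needs the `PyPDF2` and `reportlab` packages installed) and follow the prompts to upload the required PDF files. Once processed, a summary PDF is generated and downloaded automatically.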

## Applications in Fintech
This tool supports financial analysis by:
- Automating the comparison of quarterly financial performance
- Highlighting key metrics and their changes over time
- Enabling quick identification of trends and anomalies
- Facilitating data-driven decision-making for investment strategies
- Streamlining the creation of financial reports and presentations

## Technical Highlights
- Uses PyPDF2 to extract text from each PDF
- Uses regular expressions to locate labeled dollar amounts stated in billions
- Computes QoQ and YoY percentage changes from the extracted values
- Uses reportlab to generate a PDF report with a styled table and formatted text
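
To make the regular-expression step concrete, here is a small sketch that finds the first dollar amount following a label and normalizes it to billions. It is a variation on the idea rather than the notebook's exact patterns: `extract_amount` and `UNIT_DIVISORS` are hypothetical names, the sample sentences are made up, and support for thousands separators and "million" is an added assumption.

```python
import re

# Divisors that convert a reported unit into billions of dollars.
UNIT_DIVISORS = {"million": 1000.0, "billion": 1.0}


def extract_amount(label, text):
    """Return the first dollar amount after `label`, in billions, or None."""
    pattern = re.compile(
        # Stay on the same line and stop at the first "$" after the label.
        re.escape(label) + r"[^$\n]*\$\s*([\d,]+(?:\.\d+)?)\s+(million|billion)",
        re.IGNORECASE,
    )
    match = pattern.search(text)
    if match is None:
        return None
    number = float(match.group(1).replace(",", ""))  # drop thousands separators
    return number / UNIT_DIVISORS[match.group(2).lower()]


# Made-up sentences in the style of an earnings release.
sample = (
    "Net sales increased 10% to $150.0 billion in the second quarter.\n"
    "Operating income was $1,250 million, compared with $900 million.\n"
)

print(extract_amount("Net sales", sample))         # 150.0
print(extract_amount("Operating income", sample))  # 1.25
print(extract_amount("Free cash flow", sample))    # None
```

Returning `None` for a label that is not found lets the caller decide how to display the gap instead of silently treating it as zero.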

## Note
While this tool provides a comprehensive summary and comparison, it's always recommended to refer to the full reports for detailed analysis and to comply with financial regulations.

## Future Enhancements
- Integration with financial APIs for real-time data verification
- Machine learning-based sentiment analysis of earnings call transcripts
- Customizable metrics and report formats
- Historical data tracking and trend analysis over multiple years
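
As one possible direction for customizable metrics, the patterns could be generated from a user-supplied mapping instead of being hardcoded. This is only a sketch: `build_patterns`, `extract_metrics`, the metric labels, and the sample text are all hypothetical, and the pattern shape (a label followed by a dollar amount in billions) is assumed to stay the same as in the script.

```python
import re


def build_patterns(metric_labels):
    """Map each display name to a compiled pattern for its label in the report."""
    return {
        name: re.compile(re.escape(label) + r".*?\$(\d+\.?\d*)\s+billion")
        for name, label in metric_labels.items()
    }


def extract_metrics(text, patterns):
    """Return {display name: value in billions} for every pattern that matches."""
    data = {}
    for name, pattern in patterns.items():
        match = pattern.search(text)
        if match:
            data[name] = float(match.group(1))
    return data


# Hypothetical configuration: display name -> label as worded in the report.
metric_labels = {
    "Revenue": "Total revenue",
    "Cloud revenue": "Cloud segment revenue",
    "Free cash flow": "Free cash flow",
}

# Made-up report text; "Free cash flow" is absent on purpose.
sample = "Total revenue was $80.5 billion. Cloud segment revenue grew to $12.3 billion."

print(extract_metrics(sample, build_patterns(metric_labels)))
# {'Revenue': 80.5, 'Cloud revenue': 12.3}
```

Metrics that are not found are simply left out of the result, so a report for a different company only needs a different mapping.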

This project demonstrates proficiency in financial data analysis, PDF processing, and automated report generation, key skills in the fintech industry.

In [11]:
import io
import os
import re
import PyPDF2
from google.colab import files
from reportlab.lib.pagesizes import letter
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, Table, TableStyle
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.lib import colors

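def extract_text_from_pdf(pdf_file):
    """Return the text of every page in a PDF supplied as raw bytes."""
    reader = PyPDF2.PdfReader(io.BytesIO(pdf_file))
    # Fall back to "" if a page yields no text; join pages with a newline
    # so the last line of one page does not run into the next page.
    return "\n".join(page.extract_text() or "" for page in reader.pages)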

def extract_company_name(text):
    match = re.search(r'([A-Z][A-Z\s&.]+)(?:\s+ANNOUNCES|,\s*INC\.)', text)
    if match:
        return match.group(1).strip()
    return "Unknown Company"

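def extract_financial_data(text):
    """Return {metric: value in billions of dollars} for each metric found."""
    # The labels match the Amazon releases used in the sample run; other
    # reports may need different patterns. Only "billion" amounts are matched.
    patterns = {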
        'Net sales': r'Net sales.*?\$(\d+\.?\d*)\s+billion',
        'Operating income': r'Operating income.*?\$(\d+\.?\d*)\s+billion',
        'Net income': r'Net income.*?\$(\d+\.?\d*)\s+billion',
        'North America sales': r'North America segment sales.*?\$(\d+\.?\d*)\s+billion',
        'International sales': r'International segment sales.*?\$(\d+\.?\d*)\s+billion',
        'AWS sales': r'AWS segment sales.*?\$(\d+\.?\d*)\s+billion',
    }
    data = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, text)
        if match:
            data[key] = float(match.group(1))
    return data

def extract_key_points(text):
    """Return up to five bullet points (lines introduced by "•") from text."""
    points = re.findall(r'•\s*(.*?)(?:\n|$)', text)
    return points[:5]  # Keep only the first five bullets found

def create_comparison_table(current_data, previous_data, last_year_data):
    # The fourth column holds the value for the same quarter last year, not a change.
    table_data = [['Metric', 'Current Quarter', 'Previous Quarter', 'Same Quarter Last Year', 'QoQ Change', 'YoY Change']]
    for key in current_data.keys():
        current = current_data.get(key, 0)
        previous = previous_data.get(key, 0)
        last_year = last_year_data.get(key, 0)
        qoq_change = f"{((current - previous) / previous * 100):.2f}%" if previous else "N/A"
        yoy_change = f"{((current - last_year) / last_year * 100):.2f}%" if last_year else "N/A"
        table_data.append([
            key,
            f"${current:.2f}B",
            f"${previous:.2f}B",
            f"${last_year:.2f}B",
            qoq_change,
            yoy_change
        ])
    return table_data

def create_pdf_summary(company_name, comparison_table, key_points, filename):
    doc = SimpleDocTemplate(filename, pagesize=letter)
    styles = getSampleStyleSheet()
    story = []

    # Title
    story.append(Paragraph(f"{company_name} Quarterly Earnings Comparison", styles['Title']))
    story.append(Spacer(1, 12))

    # Comparison Table
    story.append(Paragraph("Financial Comparison", styles['Heading2']))
    t = Table(comparison_table)
    t.setStyle(TableStyle([
        ('BACKGROUND', (0, 0), (-1, 0), colors.grey),
        ('TEXTCOLOR', (0, 0), (-1, 0), colors.whitesmoke),
        ('ALIGN', (0, 0), (-1, -1), 'CENTER'),
        ('FONTNAME', (0, 0), (-1, 0), 'Helvetica-Bold'),
        ('FONTSIZE', (0, 0), (-1, 0), 12),
        ('BOTTOMPADDING', (0, 0), (-1, 0), 12),
        ('BACKGROUND', (0, 1), (-1, -1), colors.beige),
        ('GRID', (0, 0), (-1, -1), 1, colors.black)
    ]))
    story.append(t)
    story.append(Spacer(1, 12))

    # Key Points
    story.append(Paragraph("Key Points", styles['Heading2']))
    for point in key_points:
        story.append(Paragraph(f"• {point}", styles['Normal']))
        story.append(Spacer(1, 6))

    doc.build(story)

# Main execution
print("Upload the current quarter's earnings report PDF:")
current_quarter = files.upload()
current_quarter_text = extract_text_from_pdf(list(current_quarter.values())[0])
company_name = extract_company_name(current_quarter_text)
current_quarter_data = extract_financial_data(current_quarter_text)
current_quarter_points = extract_key_points(current_quarter_text)

print("\nUpload the previous quarter's earnings report PDF:")
previous_quarter = files.upload()
previous_quarter_text = extract_text_from_pdf(list(previous_quarter.values())[0])
previous_quarter_data = extract_financial_data(previous_quarter_text)

print("\nUpload the same quarter from last year's earnings report PDF:")
last_year_quarter = files.upload()
last_year_quarter_text = extract_text_from_pdf(list(last_year_quarter.values())[0])
last_year_quarter_data = extract_financial_data(last_year_quarter_text)

comparison_table = create_comparison_table(current_quarter_data, previous_quarter_data, last_year_quarter_data)

output_filename = f"{company_name.replace(' ', '_')}_quarterly_earnings_comparison.pdf"
create_pdf_summary(company_name, comparison_table, current_quarter_points, output_filename)
files.download(output_filename)

print(f"\nComparison PDF file '{output_filename}' created and downloaded successfully.")
print("Current working directory:", os.getcwd())
print("Files in directory:", os.listdir())

Upload the current quarter's earnings report PDF:


Saving AMZN-Q2-2024-Earnings-Release.pdf to AMZN-Q2-2024-Earnings-Release.pdf

Upload the previous quarter's earnings report PDF:


Saving AMZN-Q1-2024-Earnings-Release.pdf to AMZN-Q1-2024-Earnings-Release.pdf

Upload the same quarter from last year's earnings report PDF:


Saving Q2-2023-Amazon-Earnings-Release.pdf to Q2-2023-Amazon-Earnings-Release.pdf




Comparison PDF file 'AMAZON.COM_quarterly_earnings_comparison.pdf' created and downloaded successfully.
Current working directory: /content
Files in directory: ['.config', 'AMZN-Q2-2024-Earnings-Release.pdf', 'AMAZON.COM_quarterly_earnings_comparison.pdf', 'Q2-2023-Amazon-Earnings-Release.pdf', 'AMZN-Q1-2024-Earnings-Release.pdf', '.ipynb_checkpoints', 'sample_data']
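
The output file name in the log above is derived from the detected company name with spaces replaced by underscores. Because the detection pattern also allows characters such as `&` and line breaks, a stricter clean-up may be worth considering. The following is a minimal sketch; `safe_filename` is a hypothetical helper and the example names are made up apart from the one in the log.

```python
import re


def safe_filename(company_name, suffix="_quarterly_earnings_comparison.pdf"):
    """Build a file name using only letters, digits, dots, hyphens and underscores."""
    # Collapse every run of other characters (spaces, newlines, "&", "/") into "_".
    stem = re.sub(r"[^A-Za-z0-9.-]+", "_", company_name).strip("_.")
    return (stem or "Unknown_Company") + suffix


print(safe_filename("AMAZON.COM"))     # AMAZON.COM_quarterly_earnings_comparison.pdf
print(safe_filename("AT&T  INC"))      # AT_T_INC_quarterly_earnings_comparison.pdf
print(safe_filename("A/B\nHOLDINGS"))  # A_B_HOLDINGS_quarterly_earnings_comparison.pdf
```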
