# Creating a Three-Column PDF Document with FPDF

This notebook demonstrates how to create a PDF document with a three-column layout using the FPDF library. Each column will contain different elements:

- **Column 1**: Text and an image
- **Column 2**: Text and a table from a pandas DataFrame
- **Column 3**: An image and a matplotlib line chart

In [71]:
# Import necessary libraries
from fpdf import FPDF
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import os
from datetime import datetime, timedelta

## Step 1: Create a Custom PDF Class for Three-Column Layout

We'll create a custom class that inherits from FPDF to implement our three-column layout. The key is to calculate the column widths and manage positioning within each column.

In [72]:
class ThreeColumnPDF(FPDF):
    def __init__(self):
        super().__init__(orientation='P', unit='mm', format='A4')
        # A4 dimensions: 210mm x 297mm
        self.column_width = 60  # Width of each column
        self.column_spacing = 10  # Space between columns
        self.left_margin = 10  # Left margin
        self.top_margin = 10  # Top margin
        self.set_margins(self.left_margin, self.top_margin, self.left_margin)
        self.column_positions = [
            self.left_margin,  # Column 1 starting position
            self.left_margin + self.column_width + self.column_spacing,  # Column 2 starting position
            self.left_margin + 2 * (self.column_width + self.column_spacing)  # Column 3 starting position
        ]
        self.col_height = 0
        self.current_column = 0
        
    def set_column(self, column):
        """Set the current column (0, 1, or 2)"""
        self.current_column = column
        # Set X position to the start of the specified column
        self.set_x(self.column_positions[column])
        
    def reset_column_height(self):
        """Reset the column height tracking"""
        self.col_height = self.get_y()
        
    def accept_page_break(self):
        """Handle page breaks with our column layout"""
        if self.current_column < 2:
            # Go to the top of the next column. set_y() resets X to the
            # left margin, so set Y first and select the column afterwards.
            self.set_y(self.top_margin)
            self.set_column(self.current_column + 1)
            # Don't actually break the page
            return False
        else:
            # We're at the end of the third column, create a new page
            self.add_page()
            self.set_y(self.top_margin)
            self.set_column(0)
            return False
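
As a quick check on the layout arithmetic, the sketch below derives the column start positions from the column width, spacing, and left margin, and reports where the last column ends. The helper name `column_layout` and its parameters are illustrative and not part of FPDF.

```python
def column_layout(column_width, column_spacing, left_margin, n_columns=3):
    """Return (start positions, right edge of the last column), all in mm.

    Hypothetical helper for illustration; it mirrors the arithmetic used
    in ThreeColumnPDF.__init__ but is not part of FPDF.
    """
    positions = [
        left_margin + i * (column_width + column_spacing)
        for i in range(n_columns)
    ]
    right_edge = positions[-1] + column_width
    return positions, right_edge


positions, right_edge = column_layout(60, 10, 10)
print(positions)   # [10, 80, 150]
print(right_edge)  # 210
```

With the values used in the class, the third column ends at 210 mm, which is the full width of an A4 page, so it leaves no right margin. Reducing the spacing to 5 mm would give starts of 10, 75, and 140 mm and a right edge of 200 mm.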

## Step 2: Prepare Sample Data for Our Document

Let's create sample data for our document, including:
1. Sample text
2. A pandas DataFrame for the table
3. Stock price data for the matplotlib chart

In [73]:
# 1. Sample text
sample_text_1 = """
The Three-Column Layout

Using multiple columns in document design allows for efficient use of space while improving readability. This technique is common in newspapers, magazines, and academic journals.

Benefits of multi-column layouts include:
- Improved readability with shorter line lengths
- More efficient use of page space
- Better organization of different content types
- Enhanced visual presentation

This example demonstrates how to implement a three-column layout using the FPDF library in Python, with each column containing different types of content.
"""

sample_text_2 = """
Data Visualization in PDFs

Integrating data visualizations into PDF documents allows for comprehensive reporting and analysis. Tables and charts can be combined with explanatory text to create informative documents.

Below is a sample table showing quarterly sales data for different product categories over the past year. The data demonstrates seasonal variations in sales performance across product lines.
"""

# 2. Create a pandas DataFrame for the table
data = {
    'Quarter': ['Q1', 'Q2', 'Q3', 'Q4'],
    'Electronics': [42500, 38700, 52300, 61800],
    'Clothing': [31200, 36800, 29500, 45600],
    'Home Goods': [18500, 22300, 24100, 31500]
}
df = pd.DataFrame(data)

# 3. Generate sample stock price data
np.random.seed(42)  # For reproducibility
days = 120
base_price = 100
# Generate random walk for stock price
daily_returns = np.random.normal(0.001, 0.02, days)
price_series = base_price * (1 + np.cumsum(daily_returns))

# Create dates for the stock data
end_date = datetime.now()
start_date = end_date - timedelta(days=days)
date_range = pd.date_range(start=start_date, end=end_date, periods=days)

# Create a DataFrame with the stock data
stock_df = pd.DataFrame({
    'Date': date_range,
    'Price': price_series
})
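
The price series above sums the daily returns, which approximates a random walk. A common alternative is to compound the returns instead. The sketch below compares both forms on the same returns; the variable names are illustrative, and `np.random.default_rng` is an assumption used here in place of the notebook's global seed.

```python
import numpy as np

# Local generator (assumption: the notebook itself uses np.random.seed)
rng = np.random.default_rng(42)
daily_returns = rng.normal(0.001, 0.02, 120)
base_price = 100

# Additive form, as in the notebook: base * (1 + r1 + r2 + ...)
additive = base_price * (1 + np.cumsum(daily_returns))

# Compounded form: base * (1 + r1) * (1 + r2) * ...
compounded = base_price * np.cumprod(1 + daily_returns)

print(additive[:3])
print(compounded[:3])
```

The two series agree on the first day and drift apart as returns accumulate. Unlike the additive form, the compounded series stays positive as long as every daily return is above -100%.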

## Step 3: Create Sample Images

We'll generate sample images for our document:
1. A placeholder image for column 1
2. A matplotlib chart of stock prices for column 3
3. A smaller placeholder image (`icon.png`), also for column 3

In [74]:
# Create a placeholder image
def create_placeholder_image(filename='placeholder.jpg', width=500, height=300, color=(100, 150, 200)):
    """Create a simple placeholder image"""
    img = Image.new('RGB', (width, height), color)
    img.save(filename)
    return filename

# Create a stock price chart using matplotlib
def create_stock_chart(stock_data, filename='stock_chart.png'):
    """Create a line chart of stock prices"""
    plt.figure(figsize=(6, 4))
    plt.plot(stock_data['Date'], stock_data['Price'], linewidth=2)
    plt.title('Sample Stock Price')
    plt.ylabel('Price ($)')
    plt.grid(linestyle='--', alpha=0.7)
    plt.xticks(rotation=45)
    plt.tight_layout()
    plt.savefig(filename, dpi=150)
    plt.close()
    return filename

# Generate our images
placeholder_image = create_placeholder_image()
chart_image = create_stock_chart(stock_df)

# Create another image for the third column
icon_image = create_placeholder_image('icon.png', 250, 150, (180, 120, 80))
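
When only `w` is passed to `pdf.image()`, the height is scaled to keep the aspect ratio, so the rendered height can be computed from the pixel dimensions. A minimal sketch follows; `scaled_height` is a hypothetical helper, not an FPDF function.

```python
def scaled_height(pixel_width, pixel_height, target_width):
    """Height (in the PDF's unit) of an image drawn at target_width,
    preserving the aspect ratio. Hypothetical helper for illustration."""
    return pixel_height * target_width / pixel_width


# placeholder.jpg (500x300 px) at the full 60 mm column width
print(scaled_height(500, 300, 60))        # 36.0
# icon.png (250x150 px) at 80% of the column width
print(scaled_height(250, 150, 60 * 0.8))
# 6x4 inch chart saved at dpi=150 (900x600 px) at the full column width
print(scaled_height(900, 600, 60))        # 40.0
```

These values can be used to advance the cursor below an image instead of hard-coding the pixel sizes at each call site.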

## Step 4: Helper Functions for Document Content

Let's create helper functions to add the various elements to our PDF:

In [75]:
def add_dataframe_table(pdf, df, x, y, col_width=30):
    """Add a pandas DataFrame as a table to the PDF"""
    # Move to the table's top-left corner
    pdf.set_xy(x, y)

    # Expand a single width into one width per DataFrame column
    if isinstance(col_width, (int, float)):
        col_width = [col_width] * len(df.columns)

    # Table header
    pdf.set_font('Arial', 'B', 10)
    pdf.set_fill_color(200, 220, 255)  # Light blue background
    for i, col_name in enumerate(df.columns):
        pdf.cell(col_width[i], 7, str(col_name), 1, 0, 'C', 1)
    pdf.ln()
    pdf.set_x(x)  # ln() returns X to the left margin, so go back to the table's left edge

    # Table data
    pdf.set_font('Arial', '', 9)
    for _, row in df.iterrows():
        for i, col_name in enumerate(df.columns):
            value = row[col_name]
            # Format numbers with commas for thousands
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                cell_text = f"{value:,}"
            else:
                cell_text = str(value)
            pdf.cell(col_width[i], 6, cell_text, 1, 0, 'C')
        pdf.ln()
        pdf.set_x(x)  # Start the next row at the table's left edge

    # Return the new Y position after the table
    return pdf.get_y()
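
The number formatting in the table helper relies on the `,` format specifier and on excluding `bool`, which is a subclass of `int` in Python. The standalone sketch below isolates that logic; `format_cell` is a name introduced here for illustration.

```python
def format_cell(value):
    """Format a table value the same way add_dataframe_table does.
    Hypothetical standalone helper for illustration."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return f"{value:,}"
    return str(value)


for v in [42500, 1234.5, "Q1", True]:
    print(repr(format_cell(v)))
# '42,500'
# '1,234.5'
# 'Q1'
# 'True'
```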

## Step 5: Create the Three-Column PDF Document

In [76]:
# Create the PDF
pdf = ThreeColumnPDF()
pdf.add_page()
pdf.set_auto_page_break(False)

# Add a title that spans all columns
pdf.set_font('Arial', 'B', 16)
pdf.cell(0, 10, 'Three-Column PDF Document Example', 0, 1, 'C')
pdf.ln(5)

# Reset position to start of columns
pdf.set_y(pdf.top_margin + 15)
print(pdf.column_positions)

# ---- Column 1: Text and Image ----
pdf.set_column(0)
pdf.reset_column_height()

# Add heading
pdf.set_font('Arial', 'B', 12)
pdf.cell(pdf.column_width, 10, 'Column One', 0, 1, 'L')

# Add text
pdf.set_font('Arial', '', 10)
pdf.multi_cell(pdf.column_width, 5, sample_text_1)
pdf.ln(5)

# Add image
pdf.image(placeholder_image, x=pdf.get_x(), y=pdf.get_y(), w=pdf.column_width)
image_height = 300 * (pdf.column_width / 500)  # Scale height proportionally
pdf.ln(image_height + 5)

# Add some more text
pdf.set_font('Arial', 'I', 9)
pdf.multi_cell(pdf.column_width, 5, "Figure 1: Placeholder image demonstrating visual content in the first column.")

# ---- Column 2: Text and DataFrame Table ----
col2_x = pdf.column_positions[1]  # Left edge of column 2 (80 mm)
pdf.set_y(pdf.top_margin + 15)  # Reset Y position for column 2 (this also resets X)
pdf.set_column(1)  # Move X to the start of column 2

# Add heading
pdf.set_font('Arial', 'B', 12)
pdf.cell(pdf.column_width, 10, 'Column Two', 0, 1, 'L')
pdf.set_x(col2_x)

# Add text
pdf.set_font('Arial', '', 10)
pdf.multi_cell(pdf.column_width, 5, sample_text_2)
pdf.ln(5)
pdf.set_x(col2_x)

# Add table from DataFrame
table_y = pdf.get_y()
col_widths = [15, 15, 15, 15]  # Four 15 mm columns fill the 60 mm column width
new_y = add_dataframe_table(pdf, df, col2_x, table_y, col_widths)
pdf.set_y(new_y + 5)
pdf.set_x(col2_x)

# Add caption for the table
pdf.set_font('Arial', 'I', 9)
pdf.multi_cell(pdf.column_width, 5, "Table 1: Quarterly sales data by product category (in USD).")
pdf.ln(5)
pdf.set_x(col2_x)

# Add additional text
pdf.set_font('Arial', '', 10)
pdf.multi_cell(pdf.column_width, 5, "The table above shows strong Q4 performance across all product categories, with Home Goods the only category that grew in every quarter.")

# ---- Column 3: Image and Chart ----
col3_x = pdf.column_positions[2]  # Left edge of column 3 (150 mm)
pdf.set_y(pdf.top_margin + 15)  # Reset Y position for column 3 (this also resets X)
pdf.set_column(2)  # Move X to the start of column 3

# Add heading
pdf.set_font('Arial', 'B', 12)
pdf.cell(pdf.column_width, 10, 'Column Three', 0, 1, 'L')
pdf.set_x(col3_x)

# Add introductory text
pdf.set_font('Arial', '', 10)
pdf.multi_cell(pdf.column_width, 5, "This column demonstrates how to include multiple images in a PDF document, including a data visualization created with matplotlib.")
pdf.ln(5)
pdf.set_x(col3_x)

# Add icon image
pdf.image(icon_image, x=pdf.get_x(), y=pdf.get_y(), w=pdf.column_width * 0.8)
icon_height = 150 * (pdf.column_width * 0.8 / 250)  # Scale height proportionally (icon is 250x150 px)
pdf.ln(icon_height + 5)
pdf.set_x(col3_x)

# Add caption for the icon
pdf.set_font('Arial', 'I', 9)
pdf.multi_cell(pdf.column_width, 5, "Figure 2: Another placeholder image.")
pdf.ln(5)
pdf.set_x(col3_x)

# Add text before the chart
pdf.set_font('Arial', '', 10)
pdf.multi_cell(pdf.column_width, 5, "Below is a line chart showing sample stock price data over the past 120 days. The chart demonstrates how to incorporate matplotlib visualizations into PDF documents.")
pdf.ln(5)
pdf.set_x(col3_x)

# Add stock price chart
pdf.image(chart_image, x=pdf.get_x(), y=pdf.get_y(), w=pdf.column_width)
chart_height = pdf.column_width * 4 / 6  # figsize=(6, 4), so height/width is 4/6
pdf.ln(chart_height + 5)
pdf.set_x(col3_x)

# Add caption for the chart
pdf.set_font('Arial', 'I', 9)
pdf.multi_cell(pdf.column_width, 5, "Figure 3: Sample stock price over time, showing typical market fluctuations.")

# Save the PDF
pdf_file = 'three_column_document.pdf'
pdf.output(pdf_file)

# Clean up temporary image files
try:
    for file in [placeholder_image, chart_image, icon_image]:
        if os.path.exists(file):
            os.remove(file)
except Exception as e:
    print(f"Error removing temporary files: {e}")

[10, 80, 150]


## Explanation of the Three-Column Layout

### How the Three-Column Layout Works

1. **Custom PDF Class**: We created a `ThreeColumnPDF` class that extends the base `FPDF` class to implement our column layout.

2. **Column Management**:
   - We defined the width of each column and the spacing between them
   - We calculated the starting X position for each column
   - We implemented a `set_column()` method to switch between columns

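3. **Page Break Handling**:
   - We overrode the `accept_page_break()` method to implement column-based flow
   - When FPDF triggers a page break in column 1 or 2, the cursor moves to the top of the next column instead
   - When the break is triggered in the third column, a new page is created and the flow restarts in column 1
   - The example document does not rely on this flow: it calls `set_auto_page_break(False)` and positions each column manually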
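
The flow rule described above can be modelled without FPDF. The sketch below is a pure-Python illustration of the column and page bookkeeping only; the function name `next_position` and its return shape are assumptions made for this example.

```python
def next_position(column, page, n_columns=3):
    """Return (column, page, new_page_needed) after the current column fills up.

    Pure-Python model of the flow; illustrative names, not FPDF API.
    """
    if column < n_columns - 1:
        # Still room on this page: move one column to the right
        return column + 1, page, False
    # Last column is full: start again in the first column of a new page
    return 0, page + 1, True


column, page = 0, 1
for _ in range(4):
    column, page, new_page = next_position(column, page)
    print(column, page, new_page)
# 1 1 False
# 2 1 False
# 0 2 True
# 1 2 False
```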

### Content Organization

**Column 1: Text and Image**
- Heading
- Explanatory text
- Placeholder image with caption

**Column 2: Text and Table**
- Heading
- Introductory text
- Table created from pandas DataFrame
- Table caption and analysis

**Column 3: Image and Chart**
- Heading
- Introductory text
- Small icon image with caption
- Explanatory text
- Matplotlib chart with caption

### Key Techniques Used

1. **Position Management**: Tracking X and Y coordinates explicitly; because `ln()` and `set_y()` return X to the left margin, X is reset to the column start after each of them
2. **DataFrame Conversion**: Converting pandas DataFrame to a formatted table
3. **Image Integration**: Including both static images and matplotlib-generated charts
4. **Text Wrapping**: Using `multi_cell()` to enable text wrapping within column boundaries
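
One way to reduce the repeated `set_x()` calls is to move the left margin to the start of the active column, since `ln()` and `set_y()` return X to the left margin. The sketch below assumes a `pdf` object that exposes FPDF's `set_left_margin()` and `set_xy()`; the helper `enter_column` is hypothetical and is not used in the notebook above.

```python
def enter_column(pdf, column_x, top_y):
    """Make column_x the left margin and move the cursor to the column's top.

    Hypothetical helper: `pdf` is assumed to be an FPDF instance (or a
    subclass such as ThreeColumnPDF) with a page already added.
    """
    pdf.set_left_margin(column_x)  # ln() and set_y() now return to column_x
    pdf.set_xy(column_x, top_y)    # place the cursor at the top of the column
```

For the second column this could be called as `enter_column(pdf, pdf.column_positions[1], pdf.top_margin + 15)`. The original left margin would need to be restored before drawing a page-wide element such as the title.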