## PDF Page Rotation Angle Detection Task

Objective:
Implement the `determine_rotation_angle` function within the given code structure to detect the rotation angle of each page in a PDF file.

Code Structure:
The main function `rotate_all_pages_upright` is already implemented, but if necessary you are allowed to change its implementation. Your task is to complete the `determine_rotation_angle` function.

Input:
- A PDF file path (the function should be able to handle various PDF files)

Output:
- A list of integers, where each integer represents the rotation angle needed for a page in the PDF

Rotation Angle:
- The rotation angle should be in degrees, normalized to the range [0, 359].
- 0 means the page is already upright
- 90 means the page needs to be rotated 90 degrees clockwise to be upright
- and so on...

Task:
1. Implement the `determine_rotation_angle` function:
   - Input: A single page object (PdfReader.PageObject)
   - Output: An integer representing the rotation angle in degrees

2. The function should analyze the content of the page and determine the angle needed to make the page upright.

Requirements:
1. The function should work with different PDF files, not just a specific one.
2. Implement robust methods to determine the correct rotation angle.
3. Handle potential exceptions or edge cases (e.g., pages with mixed orientations, complex layouts).
4. Optimize for both accuracy and processing speed, as the function will be called for each page in the PDF.

Additional Considerations:
- You are allowed to use up to 40GB of GPU VRAM if necessary for your implementation.
- You may create as many additional functions as needed to support your implementation.
- You may use additional libraries if required, but ensure they are imported properly.
- Provide clear comments in your code to explain your rotation detection logic.

Testing:
- Test your implementation with various types of PDFs to ensure its robustness and generalizability.
- The main script provides a way to test your implementation on a file named "grouped_documents.pdf".

Note:
The task involves determining the rotation angle only. The actual rotation of the pages is not required in this implementation.

In [3]:
# make sure you have installed all reqired modules:
# run pip install -r requirements.txt to do so 

from typing import List
import fitz # use fitz aka PyMuPDF for better text extraction capabilities instead of PdfReader
import math

def rotate_all_pages_upright(input_pdf: str) -> List[int]: 
    """
    Analyze all pages in the input PDF and determine the rotation angle needed for each page.

    Args:
    input_pdf (str): The file path of the input PDF.

    Returns:
    List[int]: A list of rotation angles (in degrees) for each page. 
               The angles are normalized to be in the range [0, 359].
               0 means no rotation needed, 90 means 90 degrees clockwise, etc.
    """
    # document reading with fitz
    document = fitz.open(input_pdf)
    angles = []

    for page_number in range(len(document)):
        page = document[page_number]

        # Use determine_rotation_angle to get the correct angle for each page
        rotation_angle = determine_rotation_angle(page)
        angles.append(rotation_angle)

    return angles

def determine_rotation_angle(page: 'fitz.Page') -> int: # take 'fitz.Page' input (rather than PdfReader.page input)
    """
    Determine the rotation angle needed to make the page upright by analyzing the main text blocks.

    Args:
    page (fitz.Page): A single page from a PDF opened with PyMuPDF.

    Returns:
    int:  The rotation angle in degrees (e.g. 0, 90, 180, 270).
          The rotation angle is normalized to be in the range [0, 359].
          0 means the page is already upright, 90 means 90 degrees clockwise, etc.
    """

    # define margins to crop footer/header from scanned PDFs 
    top_margin = 100
    bottom_margin = 100
    

    # extract text and get orienation through "dict" functionality
    text_blocks = page.get_text("dict")["blocks"]
    page_height = page.rect.height
    main_text_angles = []

    for block in text_blocks:
        block_y0 = block["bbox"][1]  # top of the text block
        block_y1 = block["bbox"][3]  # bottom of the text block

        # Check if the text block is outside the header and footer margins
        if block_y0 > top_margin and block_y1 < (page_height - bottom_margin):
            for line in block.get("lines", []):
                for span in line.get("spans", []):
                    bbox = span["bbox"]
                    x0, y0, x1, y1 = bbox

                    # Calculate angle based on the bounding box coordinates
                    dx = x1 - x0
                    dy = y1 - y0
                    angle = math.degrees(math.atan2(dy, dx))
                        
                    main_text_angles.append(angle)

        # Handle cases where no text is found in the main body (excluding headers/footers)
    if not main_text_angles:
            # Set the page angle to 0 if no text is detected (assuming that no rotation is required if no text is found) 
        angle = [0]
    else:
        angle = main_text_angles
    return angle


# Usage
input_pdf: str = "grouped_documents.pdf"
rotation_angles: List[int] = rotate_all_pages_upright(input_pdf)
print(f"Rotation angles for each page: {rotation_angles}")

Rotation angles for each page: [[0], [47.42746649772392], [31.109134981116533], [45.0], [45.0], [45.0], [17.286767812376056], [21.727441021720246], [75.44282361276886], [80.11366229419129], [22.60070030146316], [12.707786661468065], [17.286765088044667], [40.14044782186539], [55.57812122791439], [45.0], [45.0], [45.0]]
