PdfConverter does not extract PDF metadata (title, author, creation date) #1664

## Feature request

The current `PdfConverter` extracts only the text body of a PDF. PDF documents can also carry structured metadata in their document info dictionary: title, author, subject, keywords, creator, creation date, and modification date.

For research, legal, and document management workflows, this metadata is often the primary information a user is after. It is also useful context when an LLM processes the converted markdown: knowing the author and date helps with citation and provenance.

## Proposed output

When PDF metadata is present, prepend a metadata block to the converted markdown:

```markdown
# Document Metadata

**Title:** Annual Report 2025
**Author:** Jane Smith
**Subject:** Financial Results
**Keywords:** annual report, financials, 2025
**Created:** 2025-03-15
**Modified:** 2025-03-20

---

[body text follows]
```

## Implementation notes

`pdfminer.six` (already a dependency) exposes document info via `PDFDocument` and `resolve1`, so using it adds no new dependency. Alternatively, `pypdf` exposes it via `reader.metadata`, though that only avoids a new dependency if `pypdf` is already installed.
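
A minimal sketch of the `pdfminer.six` route, to make the idea concrete. The helper names `decode_pdf_value` and `read_pdf_info` are hypothetical and not part of the existing converter. The decoding is an approximation: strings with a UTF-16BE byte order mark are decoded as UTF-16, and everything else is treated as Latin-1, which is close to but not identical to PDFDocEncoding.

```python
from typing import Any, Dict


def decode_pdf_value(value: Any) -> str:
    """Turn a raw info-dictionary value into a stripped string (hypothetical helper)."""
    if value is None:
        return ""
    if isinstance(value, bytes):
        # PDF text strings that start with a BOM are UTF-16BE.
        if value.startswith(b"\xfe\xff"):
            return value[2:].decode("utf-16-be", errors="replace").strip()
        # Assumption: Latin-1 as a stand-in for PDFDocEncoding.
        return value.decode("latin-1").strip()
    return str(value).strip()


def read_pdf_info(path: str) -> Dict[str, str]:
    """Collect non-empty document info entries from a PDF (hypothetical helper)."""
    # Imported lazily so decode_pdf_value can be used without pdfminer.six.
    from pdfminer.pdfdocument import PDFDocument
    from pdfminer.pdfparser import PDFParser
    from pdfminer.pdftypes import resolve1

    info: Dict[str, str] = {}
    with open(path, "rb") as fh:
        document = PDFDocument(PDFParser(fh))
        # document.info is a list of dictionaries, one per trailer that has /Info.
        for raw in document.info:
            for key, value in raw.items():
                text = decode_pdf_value(resolve1(value))
                if text:
                    info[key] = text
    return info
```

Calling `read_pdf_info("report.pdf")` would return a dictionary keyed by the PDF's own field names, such as `Title`, `Author`, `CreationDate`, and `ModDate`, containing only the entries that are present and non-empty.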

Fields with empty or `None` values should be skipped. The metadata block should only appear when at least one metadata field is non-empty.
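
The skipping rule could look roughly like the sketch below. It assumes the metadata has already been read into a plain dictionary keyed by the PDF info field names (`Title`, `Author`, `CreationDate`, and so on). `build_metadata_block` and `format_pdf_date` are hypothetical names, and the date handling only keeps the calendar date from the PDF `D:YYYYMMDD...` form.

```python
import re
from typing import Mapping, Optional

# (PDF info key, label used in the markdown block), in output order.
FIELDS = [
    ("Title", "Title"),
    ("Author", "Author"),
    ("Subject", "Subject"),
    ("Keywords", "Keywords"),
    ("CreationDate", "Created"),
    ("ModDate", "Modified"),
]
DATE_KEYS = {"CreationDate", "ModDate"}


def format_pdf_date(raw: str) -> str:
    """Convert a PDF date such as 'D:20250315120000Z' to '2025-03-15'."""
    match = re.match(r"(?:D:)?(\d{4})(\d{2})?(\d{2})?", raw)
    if not match:
        return raw  # Leave unrecognized values untouched.
    year = match.group(1)
    month = match.group(2) or "01"  # Month and day are optional in PDF dates.
    day = match.group(3) or "01"
    return f"{year}-{month}-{day}"


def build_metadata_block(info: Mapping[str, Optional[str]]) -> str:
    """Return the markdown metadata block, or '' when every field is empty."""
    lines = []
    for key, label in FIELDS:
        value = (info.get(key) or "").strip()
        if not value:
            continue  # Skip empty or None values.
        if key in DATE_KEYS:
            value = format_pdf_date(value)
        lines.append(f"**{label}:** {value}")
    if not lines:
        return ""  # No block at all when nothing is set.
    return "# Document Metadata\n\n" + "\n".join(lines) + "\n\n---\n\n"
```

The converter would then prepend the returned string to the body text. Because the function returns an empty string when nothing is set, PDFs without metadata produce the same output as today.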

## Why this matters

- Researchers converting batches of papers get author/title/year without parsing the body
- The `DocumentConverterResult.title` field can be populated from the PDF title metadata automatically
- Consistent with how `EmlConverter` surfaces email headers as structured metadata
