extractPages retains all original PDF objects, resulting in no size reduction #60

## Summary

`extractPages()` does not reduce file size because all objects from the original PDF are retained in the new document's context. For PDFs with large embedded resources (e.g., full CJK font files), extracting a few pages produces output nearly identical in size to the original.

## Reproduction

```typescript
import { PDF } from '@libpdf/core';
import { readFileSync } from 'fs';

const pdfBytes = readFileSync('large.pdf'); // 80MB, 128 pages
const doc = await PDF.load(pdfBytes);

// Extract only 5 pages
const extracted = await doc.extractPages([0, 1, 2, 3, 4]);
const bytes = await extracted.save({ incremental: false, subsetFonts: true });

console.log(bytes.length); // ~80MB — same as original
```

### Individual page analysis

| Pages | Size |
|-------|------|
| Original (128 pages) | 79.7 MB |
| extractPages([0,1,2,3,4]) | 79.7 MB |
| extractPages([0]) — page 1 only | 79.7 MB |
| extractPages([1]) — page 2 only | 0.6 MB |

Page 1 references an embedded CJK font (~79 MB of stream data across 641 objects). Extracting **any** set of pages that includes page 1 produces a file the same size as the original, even though only the glyphs used on that page should be needed.
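For anyone who wants to repeat this per-page measurement on their own file, a minimal sketch follows. It assumes only the calls already used in the reproduction (`extractPages()` and `save()` with those options). The `PdfLike` interface, `PageSize`, and `measurePerPage` are hypothetical names introduced here, not part of `@libpdf/core`, and the page count is passed in by the caller because no page-count accessor is shown in this report.

```typescript
// Structural stand-in for the two calls used in the reproduction above.
// This is an assumption for illustration, not the library's real type definitions.
interface PdfLike {
  extractPages(indices: number[]): Promise<PdfLike>;
  save(options?: { incremental?: boolean; subsetFonts?: boolean }): Promise<Uint8Array>;
}

interface PageSize {
  pageIndex: number;
  bytes: number;
}

// Extract each page on its own, save it, and record the output size.
// Results are sorted largest first so the heavy pages stand out.
async function measurePerPage(doc: PdfLike, pageCount: number): Promise<PageSize[]> {
  const results: PageSize[] = [];
  for (let i = 0; i < pageCount; i++) {
    const single = await doc.extractPages([i]);
    const bytes = await single.save({ incremental: false, subsetFonts: true });
    results.push({ pageIndex: i, bytes: bytes.length });
  }
  return results.sort((a, b) => b.bytes - a.bytes);
}
```

Passing the document returned by `PDF.load()` together with its known page count (128 for the file above) should yield one entry per page, which is how a single outlier such as page 1 can be isolated.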

### Attempted workarounds (none reduced size)

- `save({ subsetFonts: true, incremental: false })`
- `flattenAnnotations()` / `flattenAll()` / `flattenLayers()`
- Reload: `PDF.load(await extracted.save())` then save again
- `copyPagesFrom()` to a new `PDF.create()` document
- Setting `pdf.ctx = null` before saving the extracted document

### Comparison with pdf-lib

For reference, `pdf-lib` (which `@libpdf/core` is forked from) handles this correctly:

```typescript
import { PDFDocument } from 'pdf-lib';

const srcDoc = await PDFDocument.load(pdfBytes); // 80MB, 128 pages
const newDoc = await PDFDocument.create();
const pages = await newDoc.copyPages(srcDoc, [0, 1, 2, 3, 4]);
for (const p of pages) newDoc.addPage(p);

const bytes = await newDoc.save();
console.log(bytes.length); // ~2.6MB ✅
```

| Method | 5 pages | Page 1 only |
|--------|---------|-------------|
| `@libpdf/core` extractPages | 79.7 MB | 79.7 MB |
| `@libpdf/core` copyPagesFrom | 79.7 MB | 79.7 MB |
| `pdf-lib` copyPages | 2.6 MB | 0.3 MB |

## Expected Behavior

`extractPages()` (and `copyPagesFrom()`) should include only the objects reachable from the extracted pages, and `save()` should garbage-collect unreachable objects.
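To make the expected behavior concrete, here is a minimal mark-and-sweep sketch over a simplified object graph. The `PdfObject` shape and the `collectReachable` and `sweep` functions are hypothetical and written only for illustration. They do not reflect how `@libpdf/core` represents objects internally, and this is not a proposed patch.

```typescript
// Hypothetical, simplified object model (NOT @libpdf/core's internal types).
type ObjId = number;

interface PdfObject {
  id: ObjId;
  byteLength: number;
  refs: ObjId[]; // indirect references held by this object
}

// Mark phase: collect every object id reachable from the given roots.
function collectReachable(objects: Map<ObjId, PdfObject>, roots: ObjId[]): Set<ObjId> {
  const reachable = new Set<ObjId>();
  const stack: ObjId[] = [...roots];
  while (stack.length > 0) {
    const id = stack.pop()!;
    if (reachable.has(id)) continue; // already visited; also guards against cycles
    const obj = objects.get(id);
    if (!obj) continue; // dangling reference: nothing to keep
    reachable.add(id);
    for (const ref of obj.refs) stack.push(ref);
  }
  return reachable;
}

// Sweep phase: keep only the objects that were marked.
function sweep(objects: Map<ObjId, PdfObject>, reachable: Set<ObjId>): Map<ObjId, PdfObject> {
  const kept = new Map<ObjId, PdfObject>();
  for (const [id, obj] of objects) {
    if (reachable.has(id)) kept.set(id, obj);
  }
  return kept;
}
```

In a real PDF the traversal would need to start from the new document's catalog with a rebuilt page tree rather than from the page objects themselves, because each page's `/Parent` entry points back to the original page tree and would otherwise pull in every sibling page.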

## Environment

- `@libpdf/core`: 0.3.4
- Node.js: 24.14.1
- OS: macOS (also reproduced on Linux via Docker)
