pdfrest · datalogics-cgreen · Dec 18, 2025
diff --git a/AGENTS.md b/AGENTS.md
@@ -39,11 +39,25 @@
 - When calling pdfRest, supply the API key via the `Api-Key` header (not
   `Authorization: Bearer`); keep tests and client defaults in sync with this
   convention.
+- Avoid `@field_validator` on payload models. Prefer existing `BeforeValidator`
+  helpers (e.g., `_allowed_mime_types`) so validation remains declarative and
+  consistent across schemas.
 - Treat `PdfRestClient` and `AsyncPdfRestClient` as context managers in both
   production code and tests so transports are disposed deterministically.
 - When uploading content, always send the multipart field name `file`; when
   uploading by URL, send a JSON payload using the `url` key with a list of
   http/https addresses (single values are promoted to lists internally).
+- Always upload local assets before invoking an endpoint helper. Public client
+  APIs must accept `PdfRestFile` objects (or sequences of them) rather than raw
+  paths or IDs, including for optional resources such as compression profiles.
+  Never expose `PdfRestFileID` in the interface: callers should upload the
+  profile JSON, get the resulting `PdfRestFile`, then pass that object into
+  helpers like `compress_pdf`.
+- When an endpoint supports both an inline upload parameter and an `*_id`
+  variant, omit the inline form and expose only the base parameter (without
+  `_id`) typed as `PdfRestFile`. Serialize via `_serialize_as_first_file_id`
+  with `serialization_alias` pointing to the server’s `*_id` field so requests
+  always reference already-uploaded resources.
 - `prepare_request` rejects mixed multipart (`files`) and JSON payloads; only
   URL uploads (`create_from_urls`) should combine JSON bodies with the request.
 - Replicate server-side safeguards when porting validation logic: the output
@@ -111,30 +125,48 @@
 
 ## Testing Guidelines
 
+- **Live Test Requirement (Do Not Skip):** Every new endpoint or service must
+  ship with a matching live pytest module under `tests/live/` before the work is
+  considered complete. Mirror the naming/structure used by the graphic
+  conversion suites: one module per endpoint, parameterized success cases that
+  enumerate all accepted literals, at least one invalid input that hits the
+  server, and coverage for any request options surfaced on the client. If an
+  endpoint cannot be exercised live, call that out explicitly in the PR
+  description with the reason and the follow-up plan; otherwise reviewers should
+  block the change. Treat this as a release gate on par with unit tests.
+
 - Write pytest tests: files named `test_*.py`, test functions `test_*`, fixtures
   in `conftest.py` where shared.
+
 - Ensure high-value coverage of public functions and edge cases; document intent
   in test docstrings when non-obvious.
+
 - Use `uvx nox -s tests` to exercise the full interpreter matrix locally when
   validating compatibility.
+
 - When writing live tests for URL uploads, first create the remote resources via
   `create_from_paths`, then reuse the returned URLs in `create_from_urls` to
   avoid relying on third-party availability.
+
 - For parameterized tests prefer `pytest.param(..., id="short-label")` so test
   IDs stay readable; make assertions for every relevant response attribute (name
   prefix, MIME type, size, URLs, warnings).
+
 - Avoid manual loops over test parameters; prefer `@pytest.mark.parametrize`
   with explicit `id=` values so each combination is visible and reproducible.
+
 - Always couple `pytest.raises` with an explicit `match=` regex that reflects
   the intended validation error wording—mirror the human-readable text rather
   than relying on default exception formatting.
+
 - Mirror PNG’s request/response scenarios for each graphic conversion endpoint:
   maintain per-endpoint test modules (`test_convert_to_png.py`,
   `test_convert_to_bmp.py`, etc.) covering success, parameter customization,
   validation errors, multi-file guards, and async flows. Keep shared payload
   validation (output prefix and page-range cases) in a dedicated suite (e.g.,
   `tests/test_graphic_payload_validation.py`) that exercises every payload
   model.
+
 - When introducing additional pdfRest endpoints, follow the same pattern used
   for graphic conversions: encapsulate shared request validation in a typed
   payload model, expose fully named client methods, and create a dedicated test
@@ -143,15 +175,20 @@
   checks (e.g., common field requirements, payload serialization) in shared
   helper tests so new services inherit consistent coverage with minimal
   duplication.
+
 - Prefer `pytest.mark.parametrize` (with `pytest.param(..., id="...")`) over
-  explicit loops inside tests; nest parametrization for multi-dimensional
-  coverage so each case appears as an individual test item.
+  explicit loops or copy/paste blocks—if only the input value or expected error
+  changes, parameterize it so failures point to the exact case and reviewers
+  don’t have to diff almost-identical code. Nest parametrization for
+  multi-dimensional coverage so each combination appears as its own test item.
+
 - Live tests should verify that literal enumerations match pdfRest’s accepted
   values. Exercise format-specific options (e.g., each image format’s
   `color_model`) individually, and run smoothing enumerations through every
   enabled endpoint to confirm consistent server behaviour. Include “wildly”
   invalid values (e.g., bogus literals or mixed lists) alongside boundary
   failures so the server-side error messaging is exercised.
+
 - Provide live integration tests under `tests/live/` (with an `__init__.py` so
   pytest discovers the package) that introspect payload models to enumerate
   valid/invalid literal values and numeric boundaries. These tests should vary a
@@ -162,11 +199,13 @@
   exception surfaced by the client). When test fixtures produce deterministic
   results (e.g., `tests/resources/report.pdf`), assert the concrete values
   returned by pdfRest rather than only checking for presence or type.
+
 - Use `tests/resources/20-pages.pdf` for high-page-count scenarios such as split
   and merge endpoints so boundary coverage (multi-output splits, staggered page
   selections) remains reproducible. Parameterize live split/merge tests to cover
   multiple page-group patterns, and pair each success case with an invalid input
   that reaches the server by overriding the JSON body via `extra_body`.
+
 - Developers can load a pdfRest API key from `.env` during ad-hoc exploration.
   The repo includes `python-dotenv`; call `load_dotenv()` (optionally pointing
   to `.env`) in temporary scripts to drive the in-flight client against live

diff --git a/tests/live/test_live_convert_to_pdfx.py b/tests/live/test_live_convert_to_pdfx.py
@@ -0,0 +1,78 @@
+from __future__ import annotations
+
+from typing import cast, get_args
+
+import pytest
+
+from pdfrest import PdfRestApiError, PdfRestClient
+from pdfrest.models import PdfRestFile
+from pdfrest.types import PdfXType
+
+from ..resources import get_test_resource_path
+
+PDFX_TYPES: tuple[PdfXType, ...] = cast(tuple[PdfXType, ...], get_args(PdfXType))
+
+
+@pytest.fixture(scope="module")
+def uploaded_pdf_for_pdfx(
+    pdfrest_api_key: str,
+    pdfrest_live_base_url: str,
+) -> PdfRestFile:
+    resource = get_test_resource_path("report.pdf")
+    with PdfRestClient(
+        api_key=pdfrest_api_key,
+        base_url=pdfrest_live_base_url,
+    ) as client:
+        return client.files.create_from_paths([resource])[0]
+
+
+@pytest.mark.parametrize("output_type", PDFX_TYPES, ids=list(PDFX_TYPES))
+def test_live_convert_to_pdfx_success(
+    pdfrest_api_key: str,
+    pdfrest_live_base_url: str,
+    uploaded_pdf_for_pdfx: PdfRestFile,
+    output_type: PdfXType,
+) -> None:
+    with PdfRestClient(
+        api_key=pdfrest_api_key,
+        base_url=pdfrest_live_base_url,
+    ) as client:
+        response = client.convert_to_pdfx(
+            uploaded_pdf_for_pdfx,
+            output_type=output_type,
+            output="pdfx-live",
+        )
+
+    assert response.output_files
+    output_file = response.output_file
+    assert output_file.type == "application/pdf"
+    assert str(response.input_id) == str(uploaded_pdf_for_pdfx.id)
+    assert output_file.name.startswith("pdfx-live")
+
+
+@pytest.mark.parametrize(
+    "invalid_output_type",
+    [
+        pytest.param("PDF/X-0", id="pdfx-0"),
+        pytest.param("PDF/X-99", id="pdfx-99"),
+        pytest.param("pdf/x-4", id="lowercase"),
+    ],
+)
+def test_live_convert_to_pdfx_invalid_output_type(
+    pdfrest_api_key: str,
+    pdfrest_live_base_url: str,
+    uploaded_pdf_for_pdfx: PdfRestFile,
+    invalid_output_type: str,
+) -> None:
+    with (
+        PdfRestClient(
+            api_key=pdfrest_api_key,
+            base_url=pdfrest_live_base_url,
+        ) as client,
+        pytest.raises(PdfRestApiError),
+    ):
+        client.convert_to_pdfx(
+            uploaded_pdf_for_pdfx,
+            output_type="PDF/X-1a",
+            extra_body={"output_type": invalid_output_type},
+        )
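
The module above derives its success cases from `get_args(PdfXType)`, so the parametrized cases follow the literal type rather than a hand-maintained list. A minimal, standard-library-only sketch of that pattern follows. `DemoPdfXType`, its members, and both helper functions are hypothetical stand-ins for illustration; the actual values live in `pdfrest.types` and may differ.

```python
from __future__ import annotations

from typing import Literal, get_args

# Hypothetical stand-in for pdfrest.types.PdfXType; the real members may differ.
DemoPdfXType = Literal["PDF/X-1a", "PDF/X-3", "PDF/X-4"]

# get_args returns the literal members as a tuple, in declaration order.
DEMO_PDFX_TYPES: tuple[str, ...] = get_args(DemoPdfXType)


def is_accepted(value: str) -> bool:
    """Return True only for an exact, case-sensitive member of the literal."""
    return value in DEMO_PDFX_TYPES


def invalid_candidates(values: tuple[str, ...]) -> list[str]:
    """Derive near-miss inputs (lowercased variants) that are not valid members."""
    return [v.lower() for v in values if v.lower() not in values]
```

Deriving both the valid and the near-miss values from one literal definition is one way to keep live tests aligned with the type as it evolves; whether the server rejects each near-miss still has to be confirmed against pdfRest itself.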
diff --git a/tests/live/test_live_convert_to_word.py b/tests/live/test_live_convert_to_word.py
@@ -0,0 +1,75 @@
+from __future__ import annotations
+
+import pytest
+
+from pdfrest import PdfRestApiError, PdfRestClient
+from pdfrest.models import PdfRestFile
+
+from ..resources import get_test_resource_path
+
+
+@pytest.fixture(scope="module")
+def uploaded_pdf_for_word(
+    pdfrest_api_key: str,
+    pdfrest_live_base_url: str,
+) -> PdfRestFile:
+    resource = get_test_resource_path("report.pdf")
+    with PdfRestClient(
+        api_key=pdfrest_api_key,
+        base_url=pdfrest_live_base_url,
+    ) as client:
+        return client.files.create_from_paths([resource])[0]
+
+
+@pytest.mark.parametrize(
+    "output_name",
+    [
+        pytest.param(None, id="default-output"),
+        pytest.param("live-word", id="custom-output"),
+    ],
+)
+def test_live_convert_to_word_success(
+    pdfrest_api_key: str,
+    pdfrest_live_base_url: str,
+    uploaded_pdf_for_word: PdfRestFile,
+    output_name: str | None,
+) -> None:
+    kwargs: dict[str, str] = {}
+    if output_name is not None:
+        kwargs["output"] = output_name
+
+    with PdfRestClient(
+        api_key=pdfrest_api_key,
+        base_url=pdfrest_live_base_url,
+    ) as client:
+        response = client.convert_to_word(uploaded_pdf_for_word, **kwargs)
+
+    assert response.output_files
+    output_file = response.output_file
+    assert (
+        output_file.type
+        == "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
+    )
+    assert str(response.input_id) == str(uploaded_pdf_for_word.id)
+    if output_name is not None:
+        assert output_file.name.startswith(output_name)
+    else:
+        assert output_file.name.endswith(".docx")
+
+
+def test_live_convert_to_word_invalid_file_id(
+    pdfrest_api_key: str,
+    pdfrest_live_base_url: str,
+    uploaded_pdf_for_word: PdfRestFile,
+) -> None:
+    with (
+        PdfRestClient(
+            api_key=pdfrest_api_key,
+            base_url=pdfrest_live_base_url,
+        ) as client,
+        pytest.raises(PdfRestApiError),
+    ):
+        client.convert_to_word(
+            uploaded_pdf_for_word,
+            extra_body={"id": "00000000-0000-0000-0000-000000000000"},
+        )
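
The invalid-input tests in these modules pass a valid argument to the client and then use `extra_body` to replace it with a value the server should reject. A rough sketch of that idea, assuming `extra_body` entries are overlaid on the serialized payload just before the request is sent. `build_request_body` is a hypothetical helper written for this illustration, not pdfrest's implementation, and the payload keys are placeholders.

```python
from __future__ import annotations

from typing import Any


def build_request_body(
    payload: dict[str, Any],
    extra_body: dict[str, Any] | None = None,
) -> dict[str, Any]:
    """Hypothetical helper: overlay extra_body on an already-validated payload."""
    body = dict(payload)  # copy so the validated payload is left untouched
    if extra_body:
        body.update(extra_body)  # extra_body wins on key collisions
    return body


# The payload carries a placeholder file id; extra_body swaps in a bogus one so
# the server, not client-side validation, is what rejects the request.
validated = {"id": "real-file-id", "output": "live-word"}
sent = build_request_body(
    validated, extra_body={"id": "00000000-0000-0000-0000-000000000000"}
)
```

Under that assumption, the typed client arguments only need to satisfy local validation, while the overridden key determines what the server actually receives.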
diff --git a/tests/live/test_live_flatten_pdf_forms.py b/tests/live/test_live_flatten_pdf_forms.py
@@ -0,0 +1,72 @@
+from __future__ import annotations
+
+import pytest
+
+from pdfrest import PdfRestApiError, PdfRestClient
+from pdfrest.models import PdfRestFile
+
+from ..resources import get_test_resource_path
+
+
+@pytest.fixture(scope="module")
+def uploaded_pdf_with_forms(
+    pdfrest_api_key: str,
+    pdfrest_live_base_url: str,
+) -> PdfRestFile:
+    resource = get_test_resource_path("form_with_data.pdf")
+    with PdfRestClient(
+        api_key=pdfrest_api_key,
+        base_url=pdfrest_live_base_url,
+    ) as client:
+        return client.files.create_from_paths([resource])[0]
+
+
+@pytest.mark.parametrize(
+    "output_name",
+    [
+        pytest.param(None, id="default-output"),
+        pytest.param("flattened-live", id="custom-output"),
+    ],
+)
+def test_live_flatten_pdf_forms(
+    pdfrest_api_key: str,
+    pdfrest_live_base_url: str,
+    uploaded_pdf_with_forms: PdfRestFile,
+    output_name: str | None,
+) -> None:
+    kwargs: dict[str, str] = {}
+    if output_name is not None:
+        kwargs["output"] = output_name
+
+    with PdfRestClient(
+        api_key=pdfrest_api_key,
+        base_url=pdfrest_live_base_url,
+    ) as client:
+        response = client.flatten_pdf_forms(uploaded_pdf_with_forms, **kwargs)
+
+    assert response.output_files
+    output_file = response.output_file
+    assert output_file.type == "application/pdf"
+    assert str(response.input_id) == str(uploaded_pdf_with_forms.id)
+    if output_name is not None:
+        assert output_file.name.startswith(output_name)
+    else:
+        assert output_file.name.endswith(".pdf")
+
+
+def test_live_flatten_pdf_forms_invalid_file_id(
+    pdfrest_api_key: str,
+    pdfrest_live_base_url: str,
+    uploaded_pdf_with_forms: PdfRestFile,
+) -> None:
+    with (
+        PdfRestClient(
+            api_key=pdfrest_api_key,
+            base_url=pdfrest_live_base_url,
+        ) as client,
+        pytest.raises(PdfRestApiError),
+    ):
+        client.flatten_pdf_forms(
+            uploaded_pdf_with_forms,
+            extra_body={"id": "ffffffff-ffff-ffff-ffff-ffffffffffff"},
+        )
diff --git a/tests/resources/form_with_data.pdf b/tests/resources/form_with_data.pdf