From 578528b0be22c6b7d9125a64ea60d61ea7355215 Mon Sep 17 00:00:00 2001
From: Vlada Dusek
Date: Tue, 5 May 2026 10:34:15 +0200
Subject: [PATCH 1/2] docs: polish and modernize README

The README was outdated and missing structure expected from a modern OSS project. Rewrite it with a TOC, sync/async quick-start examples, a feature list with deep links into the docs (concepts, guides, upgrading), a documentation table, related projects, and support/contributing sections so users can discover the platform-side resources without leaving the repo.
---
 README.md | 200 ++++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 172 insertions(+), 28 deletions(-)

diff --git a/README.md b/README.md
index 210f4eab..638ee67b 100644
--- a/README.md
+++ b/README.md
@@ -1,58 +1,202 @@
-

Apify API client for Python

+

Apify API client for Python

- PyPI package version - PyPI package downloads - Codecov report - PyPI Python version - Chat on Discord + The official Python client for the Apify REST API.

-The Apify API Client for Python is the official library to access the [Apify API](https://docs.apify.com/api/v2) from your Python applications. It provides useful features like automatic retries and convenience functions to improve your experience with the Apify API. +

+ PyPI version + PyPI downloads + Python versions + Coverage + License + Chat on Discord +

+ +`apify-client` lets you talk to the [Apify platform](https://apify.com) from Python — run [Actors](https://docs.apify.com/platform/actors), manage [storages](https://docs.apify.com/platform/storage) (datasets, key-value stores, request queues), schedule tasks, configure webhooks, and use everything else exposed by the [Apify API](https://docs.apify.com/api/v2). It ships both synchronous and asynchronous clients, fully typed responses, automatic retries with exponential backoff, tiered timeouts, pagination helpers, streaming, and a pluggable HTTP layer. + +> If you want to **build** Apify Actors in Python rather than consume the API, use the [Apify SDK for Python](https://docs.apify.com/sdk/python) instead — it bundles this client and adds Actor-side primitives. -If you want to develop Apify Actors in Python, check out the [Apify SDK for Python](https://docs.apify.com/sdk/python) instead. +## Table of contents + +- [Installation](#installation) +- [Quick start](#quick-start) +- [Features](#features) +- [Usage examples](#usage-examples) +- [Documentation](#documentation) +- [Related projects](#related-projects) +- [Support and community](#support-and-community) +- [Contributing](#contributing) +- [License](#license) ## Installation -Requires Python 3.11+ +`apify-client` requires **Python 3.11 or higher**. It is published on [PyPI](https://pypi.org/project/apify-client/) and can be installed for example with [pip](https://pip.pypa.io/): + +```bash +pip install apify-client +``` + +or with [uv](https://docs.astral.sh/uv/): + +```bash +uv add apify-client +``` -You can install the package from its [PyPI listing](https://pypi.org/project/apify-client). To do that, simply run `pip install apify-client` in your terminal. +or any other Python package manager that consumes PyPI. -## Usage +## Quick start -For usage instructions, check the documentation on [Apify Docs](https://docs.apify.com/api/client/python/). 
+You'll need an Apify API token — find yours in the [Integrations section of Apify Console](https://console.apify.com/account/integrations). Pass it to the client and you're ready to go. -## Quick Start +### Synchronous client ```python from apify_client import ApifyClient -apify_client = ApifyClient('MY-APIFY-TOKEN') +client = ApifyClient('MY-APIFY-TOKEN') + +# Start an Actor and wait for it to finish. +run = client.actor('apify/hello-world').call( + run_input={'message': 'Hello, Apify!'}, +) + +# Iterate items from the run's default dataset. +for item in client.dataset(run.default_dataset_id).iterate_items(): + print(item) +``` + +### Asynchronous client + +```python +import asyncio + +from apify_client import ApifyClientAsync + + +async def main() -> None: + client = ApifyClientAsync('MY-APIFY-TOKEN') + + run = await client.actor('apify/hello-world').call( + run_input={'message': 'Hello, Apify!'}, + ) + + # Iterate items from the run's default dataset. + async for item in client.dataset(run.default_dataset_id).iterate_items(): + print(item) -# Start an Actor and wait for it to finish -actor_call = apify_client.actor('john-doe/my-cool-actor').call() -# Fetch results from the Actor's default dataset -if actor_call is not None: - dataset_items = apify_client.dataset(actor_call.default_dataset_id).list_items().items +asyncio.run(main()) ``` +> **Keep your token secret.** It authorizes requests on your behalf and can incur usage costs. Never commit it to source control or expose it to client-side code. + +For a guided walkthrough — authenticating, running an Actor, and reading its results — see the [Quick start guide](https://docs.apify.com/api/client/python/docs/quick-start). + ## Features -Besides greatly simplifying the process of querying the Apify API, the client provides other useful features. 
+- **Synchronous and asynchronous clients** — pick [`ApifyClient`](https://docs.apify.com/api/client/python/reference/class/ApifyClient) or [`ApifyClientAsync`](https://docs.apify.com/api/client/python/reference/class/ApifyClientAsync) to match your codebase; both expose the same API ([Asyncio support](https://docs.apify.com/api/client/python/docs/concepts/asyncio-support)). +- **Fully typed responses** — every method returns a [Pydantic](https://docs.pydantic.dev/) model generated from the Apify OpenAPI spec, with IDE autocomplete and runtime validation ([Upgrading to v3](https://docs.apify.com/api/client/python/docs/upgrading/upgrading-to-v3)). +- **Automatic retries** — exponential backoff for network errors, HTTP 429, and 5xx responses, configurable per client ([Retries](https://docs.apify.com/api/client/python/docs/concepts/retries)). +- **Tiered timeouts** — short / medium / long tiers picked per endpoint, overridable per call ([Timeouts](https://docs.apify.com/api/client/python/docs/concepts/timeouts)). +- **Pagination and streaming** — iterate datasets, key-value store keys, or live logs without manual paging or buffering ([Pagination](https://docs.apify.com/api/client/python/docs/concepts/pagination), [Streaming](https://docs.apify.com/api/client/python/docs/concepts/streaming-resources)). +- **Convenience methods** — `call()`, `wait_for_finish()`, nested resource access, and other shortcuts that hide platform quirks ([Convenience methods](https://docs.apify.com/api/client/python/docs/concepts/convenience-methods)). +- **Pluggable HTTP layer** — swap the default [Impit](https://github.com/apify/impit)-based HTTP client for `httpx`, `requests`, `aiohttp`, or any custom implementation ([Custom HTTP clients](https://docs.apify.com/api/client/python/docs/concepts/custom-http-clients)). 
+- **Structured errors** — every API error surfaces as an [`ApifyApiError`](https://docs.apify.com/api/client/python/reference/class/ApifyApiError) with HTTP-specific subclasses for precise handling ([Error handling](https://docs.apify.com/api/client/python/docs/concepts/error-handling)). +- **Debug logging** — opt-in structured logging on the `apify_client` logger captures request URLs, status codes, retry attempts, and more ([Logging](https://docs.apify.com/api/client/python/docs/concepts/logging)). + +## Usage examples + +The client mirrors the platform's resource model. Each entry point returns either a **single-resource client** for an individual item or a **collection client** for listing and creating items ([Single and collection clients](https://docs.apify.com/api/client/python/docs/concepts/single-and-collection-clients)). + +### List Actors and create one + +```python +actors = client.actors() +print(actors.list(limit=10).items) + +new_actor = actors.create(name='my-actor') +``` + +### Stream live logs while a run is in progress + +```python +run = client.actor('apify/web-scraper').start(run_input={...}) + +with client.run(run.id).log().stream() as log_stream: + for chunk in log_stream.iter_bytes(): + print(chunk.decode(), end='') +``` + +### Read and write key-value store records + +```python +store = client.key_value_store('STORE-ID') +store.set_record('greeting', {'message': 'Hello!'}) +record = store.get_record('greeting') +``` + +### Iterate dataset items with automatic pagination + +```python +for item in client.dataset('DATASET-ID').iterate_items(fields='title,url'): + process(item) +``` + +### Tune retries and timeouts + +```python +from datetime import timedelta + +from apify_client import ApifyClient + +client = ApifyClient( + token='MY-APIFY-TOKEN', + max_retries=8, + min_delay_between_retries=timedelta(milliseconds=500), + timeout_long=timedelta(minutes=10), +) +``` + +For end-to-end recipes — passing input, managing tasks for reusable input, 
retrieving and merging Actor data, integrating with Pandas, plugging in a custom HTTP client — see the [Guides](https://docs.apify.com/api/client/python/docs/guides/passing-input-to-actor). + +## Documentation + +The full documentation lives at **[docs.apify.com/api/client/python](https://docs.apify.com/api/client/python)**. -### Automatic parsing and error handling +| Section | What you'll find | +|---|---| +| [Introduction](https://docs.apify.com/api/client/python/docs) | Overview, prerequisites, and a tour of the client. | +| [Quick start](https://docs.apify.com/api/client/python/docs/quick-start) | Authenticate, run an Actor, and fetch its results step by step. | +| [Concepts](https://docs.apify.com/api/client/python/docs/concepts/asyncio-support) | Asyncio, single vs. collection clients, nested clients, error handling, retries, logging, convenience methods, pagination, streaming, custom HTTP clients, timeouts. | +| [Guides](https://docs.apify.com/api/client/python/docs/guides/passing-input-to-actor) | Pass input to an Actor, manage tasks for reusable input, retrieve Actor data, integrate with data libraries (e.g. Pandas), use HTTPX as the HTTP client. | +| [Upgrading](https://docs.apify.com/api/client/python/docs/upgrading/upgrading-to-v3) | Migrating between major versions. | +| [API reference](https://docs.apify.com/api/client/python/reference) | Generated reference for every class, method, and model. | +| [Changelog](https://docs.apify.com/api/client/python/docs/changelog) | Release history and breaking changes. | -Based on the endpoint, the client automatically extracts the relevant data and returns it in the expected format. Date strings are automatically converted to `datetime.datetime` objects. For exceptions, we throw an `ApifyApiError`, which wraps the plain JSON errors returned by API and enriches them with other context for easier debugging. 
+## Related projects -### Retries with exponential backoff +- **[Apify SDK for Python](https://docs.apify.com/sdk/python)** — toolkit for **building** Apify Actors in Python (this client is bundled with it). +- **[Crawlee for Python](https://crawlee.dev/python)** — high-level web scraping and browser automation framework that powers many Actors. +- **[Apify API client for JavaScript / TypeScript](https://docs.apify.com/api/client/js)** — equivalent Apify API client for Node.js. +- **[Apify SDK for JavaScript / TypeScript](https://docs.apify.com/sdk/js)** — equivalent Apify SDK for Node.js. +- **[Crawlee for JavaScript / TypeScript](https://crawlee.dev)** — original Node.js implementation of the Crawlee framework. +- **[Apify CLI](https://docs.apify.com/cli)** — command-line tool for interacting with the Apify platform: managing Actors, runs, storages, local development, and deployment. -Network communication sometimes fails. The client will automatically retry requests that failed due to a network error, an internal error of the Apify API (HTTP 500+) or rate limit error (HTTP 429). By default, it will retry up to 4 times. First retry will be attempted after ~500ms, second after ~1000ms and so on. You can configure those parameters using the `max_retries` and `min_delay_between_retries` options of the `ApifyClient` constructor. +## Support and community -### Support for asynchronous usage +- **Discord** — chat with the team and other users on the [Apify Discord server](https://discord.gg/jyEM2PRvMU). +- **GitHub issues** — report a bug or request a feature in the repository's [issue tracker](https://github.com/apify/apify-client-python/issues). -Starting with version 1.0.0, the package offers an asynchronous version of the client, [`ApifyClientAsync`](https://docs.apify.com/api/client/python), which allows you to work with the Apify API in an asynchronous way, using the standard `async`/`await` syntax. 
+## Contributing
+
+Bug reports, fixes, and improvements are welcome! See [CONTRIBUTING.md](./CONTRIBUTING.md) for the development setup, coding standards, testing, and the release process. The repo uses [uv](https://docs.astral.sh/uv/) for project management and [Poe the Poet](https://poethepoet.natn.io/) as a task runner; the typical loop is:
+
+```bash
+uv run poe install-dev # install dev deps and git hooks
+uv run poe check-code # lint, type-check, unit tests, docstring check
+```
-### Convenience functions and options
+## License
-Some actions can't be performed by the API itself, such as indefinite waiting for an Actor run to finish (because of network timeouts). The client provides convenient `call()` and `wait_for_finish()` functions that do that. Key-value store records can be retrieved as objects, buffers or streams via the respective options, dataset items can be fetched as individual objects or serialized data and we plan to add better stream support and async iterators.
+Released under the [Apache License 2.0](./LICENSE).

From 30261fcbc84847d478051d21ac1fd883124aa70b Mon Sep 17 00:00:00 2001
From: Vlada Dusek
Date: Tue, 5 May 2026 11:48:19 +0200
Subject: [PATCH 2/2] docs: link Fully typed responses bullet to Typed models concept
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The new typed-models concept page is the right destination for that bullet — Upgrading to v3 only covers the migration story.
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 638ee67b..cb90ac7a 100644
--- a/README.md
+++ b/README.md
@@ -96,7 +96,7 @@ For a guided walkthrough — authenticating, running an Actor, and reading its r
 ## Features
 
 - **Synchronous and asynchronous clients** — pick [`ApifyClient`](https://docs.apify.com/api/client/python/reference/class/ApifyClient) or [`ApifyClientAsync`](https://docs.apify.com/api/client/python/reference/class/ApifyClientAsync) to match your codebase; both expose the same API ([Asyncio support](https://docs.apify.com/api/client/python/docs/concepts/asyncio-support)).
-- **Fully typed responses** — every method returns a [Pydantic](https://docs.pydantic.dev/) model generated from the Apify OpenAPI spec, with IDE autocomplete and runtime validation ([Upgrading to v3](https://docs.apify.com/api/client/python/docs/upgrading/upgrading-to-v3)).
+- **Fully typed responses** — every method returns a [Pydantic](https://docs.pydantic.dev/) model generated from the Apify OpenAPI spec, with IDE autocomplete and runtime validation ([Typed models](https://docs.apify.com/api/client/python/docs/concepts/typed-models)).
 - **Automatic retries** — exponential backoff for network errors, HTTP 429, and 5xx responses, configurable per client ([Retries](https://docs.apify.com/api/client/python/docs/concepts/retries)).
 - **Tiered timeouts** — short / medium / long tiers picked per endpoint, overridable per call ([Timeouts](https://docs.apify.com/api/client/python/docs/concepts/timeouts)).
 - **Pagination and streaming** — iterate datasets, key-value store keys, or live logs without manual paging or buffering ([Pagination](https://docs.apify.com/api/client/python/docs/concepts/pagination), [Streaming](https://docs.apify.com/api/client/python/docs/concepts/streaming-resources)).
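
The retry paragraph removed by the first patch describes the backoff schedule in words: up to 4 retries by default, the first after ~500 ms, the second after ~1000 ms, and so on. A minimal sketch of that nominal schedule follows, assuming the delay doubles on every attempt and ignoring any jitter or upper cap the client may apply. `retry_delays` and its `factor` parameter are hypothetical names used for illustration only and are not part of `apify-client`.

```python
from datetime import timedelta


def retry_delays(
    max_retries: int = 4,
    min_delay: timedelta = timedelta(milliseconds=500),
    factor: float = 2.0,  # assumption: the delay doubles on each attempt
) -> list[timedelta]:
    """Return the nominal wait before each retry attempt (no jitter)."""
    return [min_delay * factor**attempt for attempt in range(max_retries)]


# Nominal waits, in seconds, for the defaults quoted in the removed README text.
print([delay.total_seconds() for delay in retry_delays()])
# → [0.5, 1.0, 2.0, 4.0]
```

Under the same doubling assumption, the values from the "Tune retries and timeouts" example (`max_retries=8` with a 500 ms minimum delay) would put the last nominal wait at 64 s; the client's real schedule may differ.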