Merged
200 changes: 172 additions & 28 deletions README.md
@@ -1,58 +1,202 @@
<h1 align=center>Apify API client for Python</h1>
<h1 align="center">Apify API client for Python</h1>

<p align="center">
<a href="https://badge.fury.io/py/apify-client" rel="nofollow"><img src="https://badge.fury.io/py/apify-client.svg" alt="PyPI package version"></a>
<a href="https://pypi.org/project/apify-client/" rel="nofollow"><img src="https://img.shields.io/pypi/dm/apify-client" alt="PyPI package downloads"></a>
<a href="https://codecov.io/gh/apify/apify-client-python"><img src="https://codecov.io/gh/apify/apify-client-python/graph/badge.svg?token=TYQQWYYZ7A" alt="Codecov report"></a>
<a href="https://pypi.org/project/apify-client/" rel="nofollow"><img src="https://img.shields.io/pypi/pyversions/apify-client" alt="PyPI Python version"></a>
<a href="https://discord.gg/jyEM2PRvMU" rel="nofollow"><img src="https://img.shields.io/discord/801163717915574323?label=discord" alt="Chat on Discord"></a>
<strong>The official Python client for the <a href="https://docs.apify.com/api/v2">Apify REST API</a>.</strong>
</p>

The Apify API Client for Python is the official library to access the [Apify API](https://docs.apify.com/api/v2) from your Python applications. It provides useful features like automatic retries and convenience functions to improve your experience with the Apify API.
<p align="center">
<a href="https://pypi.org/project/apify-client/"><img src="https://badge.fury.io/py/apify-client.svg" alt="PyPI version"></a>
<a href="https://pypi.org/project/apify-client/"><img src="https://img.shields.io/pypi/dm/apify-client" alt="PyPI downloads"></a>
<a href="https://pypi.org/project/apify-client/"><img src="https://img.shields.io/pypi/pyversions/apify-client" alt="Python versions"></a>
<a href="https://codecov.io/gh/apify/apify-client-python"><img src="https://codecov.io/gh/apify/apify-client-python/graph/badge.svg?token=TYQQWYYZ7A" alt="Coverage"></a>
<a href="https://github.com/apify/apify-client-python/blob/master/LICENSE"><img src="https://img.shields.io/pypi/l/apify-client" alt="License"></a>
<a href="https://discord.gg/jyEM2PRvMU"><img src="https://img.shields.io/discord/801163717915574323?label=discord" alt="Chat on Discord"></a>
</p>

`apify-client` lets you talk to the [Apify platform](https://apify.com) from Python — run [Actors](https://docs.apify.com/platform/actors), manage [storages](https://docs.apify.com/platform/storage) (datasets, key-value stores, request queues), schedule tasks, configure webhooks, and use everything else exposed by the [Apify API](https://docs.apify.com/api/v2). It ships both synchronous and asynchronous clients, fully typed responses, automatic retries with exponential backoff, tiered timeouts, pagination helpers, streaming, and a pluggable HTTP layer.

> If you want to **build** Apify Actors in Python rather than consume the API, use the [Apify SDK for Python](https://docs.apify.com/sdk/python) instead — it bundles this client and adds Actor-side primitives.

If you want to develop Apify Actors in Python, check out the [Apify SDK for Python](https://docs.apify.com/sdk/python) instead.
## Table of contents

- [Installation](#installation)
- [Quick start](#quick-start)
- [Features](#features)
- [Usage examples](#usage-examples)
- [Documentation](#documentation)
- [Related projects](#related-projects)
- [Support and community](#support-and-community)
- [Contributing](#contributing)
- [License](#license)

## Installation

Requires Python 3.11+
`apify-client` requires **Python 3.11 or higher**. It is published on [PyPI](https://pypi.org/project/apify-client/) and can be installed, for example, with [pip](https://pip.pypa.io/):

```bash
pip install apify-client
```

or with [uv](https://docs.astral.sh/uv/):

```bash
uv add apify-client
```

You can install the package from its [PyPI listing](https://pypi.org/project/apify-client). To do that, simply run `pip install apify-client` in your terminal.
or any other Python package manager that consumes PyPI.

## Usage
## Quick start

For usage instructions, check the documentation on [Apify Docs](https://docs.apify.com/api/client/python/).
You'll need an Apify API token — find yours in the [Integrations section of Apify Console](https://console.apify.com/account/integrations). Pass it to the client and you're ready to go.

## Quick Start
### Synchronous client

```python
from apify_client import ApifyClient

apify_client = ApifyClient('MY-APIFY-TOKEN')
client = ApifyClient('MY-APIFY-TOKEN')

# Start an Actor and wait for it to finish.
run = client.actor('apify/hello-world').call(
    run_input={'message': 'Hello, Apify!'},
)

# Iterate items from the run's default dataset.
for item in client.dataset(run.default_dataset_id).iterate_items():
    print(item)
```

### Asynchronous client

```python
import asyncio

from apify_client import ApifyClientAsync


async def main() -> None:
    client = ApifyClientAsync('MY-APIFY-TOKEN')

    run = await client.actor('apify/hello-world').call(
        run_input={'message': 'Hello, Apify!'},
    )

    # Iterate items from the run's default dataset.
    async for item in client.dataset(run.default_dataset_id).iterate_items():
        print(item)

# Start an Actor and wait for it to finish
actor_call = apify_client.actor('john-doe/my-cool-actor').call()

# Fetch results from the Actor's default dataset
if actor_call is not None:
    dataset_items = apify_client.dataset(actor_call.default_dataset_id).list_items().items
asyncio.run(main())
```

> **Keep your token secret.** It authorizes requests on your behalf and can incur usage costs. Never commit it to source control or expose it to client-side code.
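
One common way to keep the token out of source code is to read it from an environment variable. The sketch below is only an illustration: the variable name `APIFY_TOKEN` and the helper `load_token` are assumptions made for this example, not something this README defines.

```python
import os


def load_token(env_var: str = 'APIFY_TOKEN') -> str:
    """Return the API token from the environment, failing loudly if it is missing.

    `env_var` defaults to 'APIFY_TOKEN', which is an assumed name for this sketch.
    """
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f'Environment variable {env_var} is not set.')
    return token
```

The returned value can then be passed to the client constructor in place of the hard-coded string, for example `ApifyClient(load_token())`.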

For a guided walkthrough — authenticating, running an Actor, and reading its results — see the [Quick start guide](https://docs.apify.com/api/client/python/docs/quick-start).

## Features

Besides greatly simplifying the process of querying the Apify API, the client provides other useful features.
- **Synchronous and asynchronous clients** — pick [`ApifyClient`](https://docs.apify.com/api/client/python/reference/class/ApifyClient) or [`ApifyClientAsync`](https://docs.apify.com/api/client/python/reference/class/ApifyClientAsync) to match your codebase; both expose the same API ([Asyncio support](https://docs.apify.com/api/client/python/docs/concepts/asyncio-support)).
- **Fully typed responses** — every method returns a [Pydantic](https://docs.pydantic.dev/) model generated from the Apify OpenAPI spec, with IDE autocomplete and runtime validation ([Typed models](https://docs.apify.com/api/client/python/docs/concepts/typed-models)).
- **Automatic retries** — exponential backoff for network errors, HTTP 429, and 5xx responses, configurable per client ([Retries](https://docs.apify.com/api/client/python/docs/concepts/retries)).
- **Tiered timeouts** — short / medium / long tiers picked per endpoint, overridable per call ([Timeouts](https://docs.apify.com/api/client/python/docs/concepts/timeouts)).
- **Pagination and streaming** — iterate datasets, key-value store keys, or live logs without manual paging or buffering ([Pagination](https://docs.apify.com/api/client/python/docs/concepts/pagination), [Streaming](https://docs.apify.com/api/client/python/docs/concepts/streaming-resources)).
- **Convenience methods** — `call()`, `wait_for_finish()`, nested resource access, and other shortcuts that hide platform quirks ([Convenience methods](https://docs.apify.com/api/client/python/docs/concepts/convenience-methods)).
- **Pluggable HTTP layer** — swap the default [Impit](https://github.com/apify/impit)-based HTTP client for `httpx`, `requests`, `aiohttp`, or any custom implementation ([Custom HTTP clients](https://docs.apify.com/api/client/python/docs/concepts/custom-http-clients)).
- **Structured errors** — every API error surfaces as an [`ApifyApiError`](https://docs.apify.com/api/client/python/reference/class/ApifyApiError) with HTTP-specific subclasses for precise handling ([Error handling](https://docs.apify.com/api/client/python/docs/concepts/error-handling)).
- **Debug logging** — opt-in structured logging on the `apify_client` logger captures request URLs, status codes, retry attempts, and more ([Logging](https://docs.apify.com/api/client/python/docs/concepts/logging)).

## Usage examples

The client mirrors the platform's resource model. Each entry point returns either a **single-resource client** for an individual item or a **collection client** for listing and creating items ([Single and collection clients](https://docs.apify.com/api/client/python/docs/concepts/single-and-collection-clients)).

### List Actors and create one

```python
actors = client.actors()
print(actors.list(limit=10).items)

new_actor = actors.create(name='my-actor')
```

### Stream live logs while a run is in progress

```python
run = client.actor('apify/web-scraper').start(run_input={...})

with client.run(run.id).log().stream() as log_stream:
    for chunk in log_stream.iter_bytes():
        print(chunk.decode(), end='')
```

### Read and write key-value store records

```python
store = client.key_value_store('STORE-ID')
store.set_record('greeting', {'message': 'Hello!'})
record = store.get_record('greeting')
```

### Iterate dataset items with automatic pagination

```python
for item in client.dataset('DATASET-ID').iterate_items(fields='title,url'):
    process(item)
```
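
Conceptually, an iterator like this hides offset/limit paging behind a plain loop. The following is a minimal standalone sketch of that idea, not the client's actual implementation; `iterate_pages`, `fake_fetch`, and the in-memory `DATA` list are hypothetical stand-ins for a paged API endpoint.

```python
from collections.abc import Callable, Iterator


def iterate_pages(
    fetch_page: Callable[[int, int], list[dict]],
    page_size: int = 2,
) -> Iterator[dict]:
    """Yield items one by one, requesting the next page only when needed."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            # An empty page means the collection is exhausted.
            return
        yield from page
        offset += len(page)


# Hypothetical in-memory "dataset" standing in for a remote endpoint.
DATA = [{'title': f'Page {i}'} for i in range(5)]


def fake_fetch(offset: int, limit: int) -> list[dict]:
    """Return one slice of DATA, mimicking an offset/limit API call."""
    return DATA[offset:offset + limit]


titles = [item['title'] for item in iterate_pages(fake_fetch)]
print(titles)
```

With five items and a page size of two, the generator makes four fetches (three non-empty pages and one empty page that ends the loop) while the caller sees a single flat sequence.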

### Tune retries and timeouts

```python
from datetime import timedelta

from apify_client import ApifyClient

client = ApifyClient(
    token='MY-APIFY-TOKEN',
    max_retries=8,
    min_delay_between_retries=timedelta(milliseconds=500),
    timeout_long=timedelta(minutes=10),
)
```
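
To get a feel for what these retry settings imply, here is a rough sketch of an exponential backoff schedule. It assumes the wait simply doubles on each attempt, starting from the minimum delay; the client's real schedule may differ (for instance by adding jitter or a cap), and `backoff_delays` is a hypothetical helper, not part of `apify-client`.

```python
from datetime import timedelta


def backoff_delays(min_delay: timedelta, max_retries: int) -> list[timedelta]:
    """Return the approximate wait before each retry, doubling every attempt."""
    return [min_delay * (2 ** attempt) for attempt in range(max_retries)]


delays = backoff_delays(timedelta(milliseconds=500), max_retries=4)
print([d.total_seconds() for d in delays])  # [0.5, 1.0, 2.0, 4.0]
```

Under the same doubling assumption, `max_retries=8` with a 500 ms minimum delay would put the last wait at 64 seconds, which is worth keeping in mind when choosing a long timeout.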

For end-to-end recipes — passing input, managing tasks for reusable input, retrieving and merging Actor data, integrating with Pandas, plugging in a custom HTTP client — see the [Guides](https://docs.apify.com/api/client/python/docs/guides/passing-input-to-actor).

## Documentation

The full documentation lives at **[docs.apify.com/api/client/python](https://docs.apify.com/api/client/python)**.

### Automatic parsing and error handling
| Section | What you'll find |
|---|---|
| [Introduction](https://docs.apify.com/api/client/python/docs) | Overview, prerequisites, and a tour of the client. |
| [Quick start](https://docs.apify.com/api/client/python/docs/quick-start) | Authenticate, run an Actor, and fetch its results step by step. |
| [Concepts](https://docs.apify.com/api/client/python/docs/concepts/asyncio-support) | Asyncio, single vs. collection clients, nested clients, error handling, retries, logging, convenience methods, pagination, streaming, custom HTTP clients, timeouts. |
| [Guides](https://docs.apify.com/api/client/python/docs/guides/passing-input-to-actor) | Pass input to an Actor, manage tasks for reusable input, retrieve Actor data, integrate with data libraries (e.g. Pandas), use HTTPX as the HTTP client. |
| [Upgrading](https://docs.apify.com/api/client/python/docs/upgrading/upgrading-to-v3) | Migrating between major versions. |
| [API reference](https://docs.apify.com/api/client/python/reference) | Generated reference for every class, method, and model. |
| [Changelog](https://docs.apify.com/api/client/python/docs/changelog) | Release history and breaking changes. |

Based on the endpoint, the client automatically extracts the relevant data and returns it in the expected format. Date strings are automatically converted to `datetime.datetime` objects. For exceptions, we throw an `ApifyApiError`, which wraps the plain JSON errors returned by API and enriches them with other context for easier debugging.
## Related projects

### Retries with exponential backoff
- **[Apify SDK for Python](https://docs.apify.com/sdk/python)** — toolkit for **building** Apify Actors in Python (this client is bundled with it).
- **[Crawlee for Python](https://crawlee.dev/python)** — high-level web scraping and browser automation framework that powers many Actors.
- **[Apify API client for JavaScript / TypeScript](https://docs.apify.com/api/client/js)** — equivalent Apify API client for Node.js.
- **[Apify SDK for JavaScript / TypeScript](https://docs.apify.com/sdk/js)** — equivalent Apify SDK for Node.js.
- **[Crawlee for JavaScript / TypeScript](https://crawlee.dev)** — original Node.js implementation of the Crawlee framework.
- **[Apify CLI](https://docs.apify.com/cli)** — command-line tool for interacting with the Apify platform: managing Actors, runs, storages, local development, and deployment.

Network communication sometimes fails. The client will automatically retry requests that failed due to a network error, an internal error of the Apify API (HTTP 500+) or rate limit error (HTTP 429). By default, it will retry up to 4 times. First retry will be attempted after ~500ms, second after ~1000ms and so on. You can configure those parameters using the `max_retries` and `min_delay_between_retries` options of the `ApifyClient` constructor.
## Support and community

### Support for asynchronous usage
- **Discord** — chat with the team and other users on the [Apify Discord server](https://discord.gg/jyEM2PRvMU).
- **GitHub issues** — report a bug or request a feature in the repository's [issue tracker](https://github.com/apify/apify-client-python/issues).

Starting with version 1.0.0, the package offers an asynchronous version of the client, [`ApifyClientAsync`](https://docs.apify.com/api/client/python), which allows you to work with the Apify API in an asynchronous way, using the standard `async`/`await` syntax.
## Contributing

Bug reports, fixes, and improvements are welcome! See [CONTRIBUTING.md](./CONTRIBUTING.md) for the development setup, coding standards, testing, and the release process. The repo uses [uv](https://docs.astral.sh/uv/) for project management and [Poe the Poet](https://poethepoet.natn.io/) as a task runner; the typical loop is:

```bash
uv run poe install-dev # install dev deps and git hooks
uv run poe check-code # lint, type-check, unit tests, docstring check
```

### Convenience functions and options
## License

Some actions can't be performed by the API itself, such as indefinite waiting for an Actor run to finish (because of network timeouts). The client provides convenient `call()` and `wait_for_finish()` functions that do that. Key-value store records can be retrieved as objects, buffers or streams via the respective options, dataset items can be fetched as individual objects or serialized data and we plan to add better stream support and async iterators.
Released under the [Apache License 2.0](./LICENSE).