Merged
7 changes: 2 additions & 5 deletions .github/workflows/_release_docs.yaml
@@ -50,13 +50,10 @@ jobs:
python-version: ${{ env.PYTHON_VERSION }}

- name: Install Python dependencies
run: make install-dev

- name: Build generated API reference
run: make build-api-reference
run: uv run poe install-dev

- name: Build Docusaurus docs
run: make build-docs
run: uv run poe build-docs
env:
APIFY_SIGNING_TOKEN: ${{ secrets.APIFY_SIGNING_TOKEN }}
SEGMENT_TOKEN: ${{ secrets.SEGMENT_TOKEN }}
4 changes: 2 additions & 2 deletions .github/workflows/on_schedule_tests.yaml
@@ -57,9 +57,9 @@ jobs:

# Sync the project, but no need to install the browsers into the test runner environment.
- name: Install Python dependencies
run: make install-sync
run: uv run poe install-sync

- name: Run templates end-to-end tests
run: make e2e-templates-tests args="-m ${{ matrix.http-client }} and ${{ matrix.crawler-type }} and ${{ matrix.package-manager }}"
run: uv run poe e2e-templates-tests -- -m "${{ matrix.http-client }} and ${{ matrix.crawler-type }} and ${{ matrix.package-manager }}"
env:
APIFY_TEST_USER_API_TOKEN: ${{ secrets.APIFY_TEST_USER_API_TOKEN }}
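
The updated step passes the whole marker expression to pytest as one quoted `-m` argument after `--`. The Python sketch below illustrates why the quoting matters. It is only an illustration: the matrix values and the helpers `build_marker_expression` and `build_command` are hypothetical and not part of the workflow.

```python
import shlex

# Hypothetical matrix values, chosen only for illustration.
MATRIX = {
    "http-client": "httpx",
    "crawler-type": "parsel",
    "package-manager": "uv",
}


def build_marker_expression(values: dict[str, str]) -> str:
    """Join the three matrix values into a single pytest marker expression."""
    keys = ("http-client", "crawler-type", "package-manager")
    return " and ".join(values[key] for key in keys)


def build_command(values: dict[str, str]) -> list[str]:
    """Build the argument list; the marker expression stays one argument."""
    return [
        "uv", "run", "poe", "e2e-templates-tests",
        "--", "-m", build_marker_expression(values),
    ]


if __name__ == "__main__":
    # shlex.join adds the quoting a shell needs to keep the expression intact.
    print(shlex.join(build_command(MATRIX)))
```

Without the quotes, a shell would split the expression on spaces and pytest would receive only the first word as the `-m` value.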
4 changes: 2 additions & 2 deletions .pre-commit-config.yaml
@@ -3,12 +3,12 @@ repos:
hooks:
- id: lint-check
name: Lint check
entry: make lint
entry: uv run poe lint
language: system
pass_filenames: false

- id: type-check
name: Type check
entry: make type-check
entry: uv run poe type-check
language: system
pass_filenames: false
61 changes: 39 additions & 22 deletions CONTRIBUTING.md
@@ -8,20 +8,40 @@ For local development, it is required to have Python 3.10 (or a later version) i

We use [uv](https://docs.astral.sh/uv/) for project management. Install it and set up your IDE accordingly.

We use [Poe the Poet](https://poethepoet.natn.io/) as a task runner, similar to npm scripts in `package.json`.
All tasks are defined in `pyproject.toml` under `[tool.poe.tasks]` and can be run with `uv run poe <task>`.

### Available tasks

| Task | Description |
| ---- | ----------- |
| `install-dev` | Install development dependencies |
| `check-code` | Run lint, type-check, and unit-tests |
| `lint` | Run linter |
| `format` | Fix lint issues and format code |
| `type-check` | Run type checker |
| `unit-tests` | Run unit tests |
| `unit-tests-cov` | Run unit tests with coverage |
| `e2e-templates-tests` | Run end-to-end template tests |
| `build-docs` | Build documentation website |
| `run-docs` | Run documentation website locally |
| `build` | Build package |
| `clean` | Remove build artifacts and clean caches |

## Dependencies

To install this package and its development dependencies, run:

```sh
make install-dev
uv run poe install-dev
```

## Code checking

To execute all code checking tools together, run:

```sh
make check-code
uv run poe check-code
```

### Linting
@@ -31,7 +51,7 @@ We utilize [ruff](https://docs.astral.sh/ruff/) for linting, which analyzes code
To run linting:

```sh
make lint
uv run poe lint
```

### Formatting
@@ -41,7 +61,7 @@ Our automated code formatting also leverages [ruff](https://docs.astral.sh/ruff/
To run formatting:

```sh
make format
uv run poe format
```

### Type checking
@@ -51,51 +71,48 @@ Type checking is handled by [ty](https://docs.astral.sh/ty/), verifying code aga
To run type checking:

```sh
make type-check
uv run poe type-check
```

### Unit tests

We employ pytest as our testing framework, equipped with various plugins. Check pyproject.toml for configuration details and installed plugins.

We use [pytest](https://docs.pytest.org/) as a testing framework with many plugins. Check `pyproject.toml` for configuration details and installed plugins.

To run unit tests:

```sh
make unit-tests
uv run poe unit-tests
```

To run unit tests with HTML coverage report:
To run unit tests with coverage report:

```sh
make unit-tests-cov
uv run poe unit-tests-cov
```

## End-to-end tests

Pre-requisites for running end-to-end tests:
- [apify-cli](https://docs.apify.com/cli/docs/installation) correctly installed
- `apify-cli` available in `PATH` environment variable
- Your [apify token](https://docs.apify.com/platform/integrations/api#api-token) is available in `APIFY_TEST_USER_API_TOKEN` environment variable
Prerequisites:

- [apify-cli](https://docs.apify.com/cli/docs/installation) installed and available in `PATH`
- Set `APIFY_TEST_USER_API_TOKEN` to your [Apify API token](https://docs.apify.com/platform/integrations/api#api-token)

To run end-to-end tests:

```sh
make e2e-templates-tests
uv run poe e2e-templates-tests
```

## Documentation

We follow the [Google docstring format](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html) for code documentation. All user-facing classes and functions must be documented. Documentation standards are enforced using [Ruff](https://docs.astral.sh/ruff/).

Our API documentation is generated from these docstrings using [pydoc-markdown](https://pypi.org/project/pydoc-markdown/) with custom post-processing. Additional content is provided through markdown files in the `docs/` directory. The final documentation is rendered using [Docusaurus](https://docusaurus.io/) and published to GitHub pages.
Our API documentation is generated from these docstrings using [pydoc-markdown](https://pypi.org/project/pydoc-markdown/) with custom post-processing. Additional content is provided through markdown files in the `docs/` directory. The final documentation is rendered using [Docusaurus](https://docusaurus.io/) and published to GitHub Pages.

To run the documentation locally, ensure you have `Node.js` 20+ installed, then run:

```sh
make run-docs
uv run poe run-docs
```

## Release process
@@ -120,14 +137,14 @@ name = "crawlee"
version = "x.z.y"
```

4. Generate the distribution archives for the package:
4. Build the package:

```shell
uv build
```sh
uv run poe build
```

5. Set up the PyPI API token for authentication and upload the package to PyPI:
5. Upload to PyPI:

```shell
```sh
uv publish --token YOUR_API_TOKEN
```
79 changes: 0 additions & 79 deletions Makefile

This file was deleted.

42 changes: 42 additions & 0 deletions pyproject.toml
@@ -102,6 +102,7 @@ dev = [
"build<2.0.0", # For e2e tests.
"dycw-pytest-only<3.0.0",
"fakeredis[probabilistic,json,lua]<3.0.0",
"poethepoet<1.0.0",
"pre-commit<5.0.0",
"proxy-py<3.0.0",
"pydoc-markdown<5.0.0",
@@ -255,3 +256,44 @@ exclude_lines = ["pragma: no cover", "if TYPE_CHECKING:", "assert_never()"]

[tool.ipdb]
context = 7

# Run tasks with: uv run poe <task>
[tool.poe.tasks]
clean = "rm -rf .coverage .pytest_cache .ruff_cache .ty_cache .uv-cache build coverage-unit.xml dist htmlcov website/.docusaurus website/.yarn website/module_shortcuts.json website/node_modules "
install-sync = "uv sync --all-extras"
build = "uv build --verbose"
publish-to-pypi = "uv publish --verbose --token ${APIFY_PYPI_TOKEN_CRAWLEE}"
type-check = "uv run ty check"
check-code = ["lint", "type-check", "unit-tests"]

[tool.poe.tasks.install-dev]
shell = "uv sync --all-extras && uv run pre-commit install && uv run playwright install"

[tool.poe.tasks.lint]
shell = "uv run ruff format --check && uv run ruff check"

[tool.poe.tasks.format]
shell = "uv run ruff check --fix && uv run ruff format"

[tool.poe.tasks.unit-tests]
shell = """
uv run pytest --numprocesses=1 --verbose -m "run_alone" tests/unit && \
uv run pytest --numprocesses=auto --verbose -m "not run_alone" tests/unit
"""

[tool.poe.tasks.unit-tests-cov]
shell = """
uv run pytest --numprocesses=1 --verbose -m "run_alone" --cov=src/crawlee --cov-report=xml:coverage-unit.xml tests/unit && \
uv run pytest --numprocesses=auto --verbose -m "not run_alone" --cov=src/crawlee --cov-report=xml:coverage-unit.xml --cov-append tests/unit
"""

[tool.poe.tasks.e2e-templates-tests]
cmd = "uv run pytest --numprocesses=${E2E_TESTS_CONCURRENCY:-1} --verbose tests/e2e/project_template --timeout=600"

[tool.poe.tasks.build-docs]
shell = "./build_api_reference.sh && corepack enable && yarn && yarn build"
cwd = "website"

[tool.poe.tasks.run-docs]
shell = "./build_api_reference.sh && corepack enable && yarn && yarn start"
cwd = "website"
1 change: 1 addition & 0 deletions tests/unit/crawlers/_playwright/test_playwright_crawler.py
@@ -234,6 +234,7 @@ async def request_handler(context: PlaywrightCrawlingContext) -> None:
assert 'headless' not in headers['user-agent'].lower()


@pytest.mark.flaky(reruns=3, reason='Test is flaky.')
async def test_firefox_headless_headers(header_network: dict, server_url: URL) -> None:
browser_type: BrowserType = 'firefox'
crawler = PlaywrightCrawler(headless=True, browser_type=browser_type)
25 changes: 25 additions & 0 deletions uv.lock
