diff --git a/.github/styles/config/vocabularies/Docs/accept.txt b/.github/styles/config/vocabularies/Docs/accept.txt index 229368396b..7254288409 100644 --- a/.github/styles/config/vocabularies/Docs/accept.txt +++ b/.github/styles/config/vocabularies/Docs/accept.txt @@ -1,9 +1,88 @@ -apify(?=-\w+) +Apify(?=-\w+) +@apify\.com +\bApify\b Actor(s)? +SDK(s) +[Ss]torages +Crawlee +[Aa]utoscaling +CU + booleans -Docusaurus env +npm +serverless +[Bb]oolean +node_modules +[Rr]egex +[Mm]onorepo +[Gg]ist +SDK +Dockerfile +Docker's + +Docusaurus navbar nginx -npm +:::caution +:::note +:::info +:::tip +:::warning + +maxWidth +startUrls + +PDFs +dataset's +gif +Gzip + +API's +APIs +webhook's +idempotency +backoff + +Authy +reCaptcha +OAuth +untrusted +unencrypted +proxied + +LLM +embedder +chatbot +[Ll]angchain + +[Kk]eboola +[Aa]irbyte +[Qq]drant +[Pp]inecone +[Mm]ilvus +[Zz]illiz +llama_index +[Ff]lowise + +exploitability +[Ww]hitepaper +[Cc]ron +scalably +metamorph +hostname +IPs +unscoped +multistep +[Aa]utogenerated +preconfigured +[Dd]atacenter + +[Ww]ikipedia +[Zz]apier +[Tt]rello +[Pp]refill + + +[Mm]ultiselect diff --git a/.github/workflows/typos-check.yaml b/.github/workflows/typos-check.yaml deleted file mode 100644 index e65b3d851b..0000000000 --- a/.github/workflows/typos-check.yaml +++ /dev/null @@ -1,18 +0,0 @@ -name: Typos Check - -on: - pull_request: - branches: [ master ] - -jobs: - run: - name: Spell Check with Typos - runs-on: ubuntu-latest - steps: - - name: Checkout code - uses: actions/checkout@v4 - - - name: Check spelling - uses: crate-ci/typos@master - with: - files: ./sources diff --git a/.github/workflows/vale.yaml b/.github/workflows/vale.yaml index e098be7152..e8cb96a3f9 100644 --- a/.github/workflows/vale.yaml +++ b/.github/workflows/vale.yaml @@ -32,3 +32,4 @@ jobs: fail_on_error: true vale_flags: '--minAlertLevel=error' reporter: github-pr-annotations + diff --git a/vale.ini b/.vale.ini similarity index 52% rename from vale.ini rename to 
.vale.ini index 3b8beef84a..3ecece96f7 100644 --- a/vale.ini +++ b/.vale.ini @@ -2,7 +2,7 @@ StylesPath = .github/styles MinAlertLevel = warning IgnoredScopes = code, tt, table, tr, td -vocabularies = Docs +Vocab = Docs Packages = write-good, Microsoft @@ -10,7 +10,11 @@ Packages = write-good, Microsoft mdx = md [*.md] -BasedOnStyles = Apify, write-good, Microsoft +BasedOnStyles = Vale, Apify, write-good, Microsoft +# Ignore URLs, HTML/XML tags starting with capital letter, lines containing = sign, http & https URL ending with ] or ) & email addresses +TokenIgnores = (<\/?[A-Z].+>), ([^\n]+=[^\n]*), (\[[^\]]+\]\([^\)]+\)), ([^\n]+@[^\n]+\.[^\n]), ({[^}]*}), (`[^`]*`), (`\w+`) +Vale.Spelling = YES + # Disabling rules (NO) Microsoft.Contractions = NO diff --git a/_typos.toml b/_typos.toml deleted file mode 100644 index c6d20dc2ca..0000000000 --- a/_typos.toml +++ /dev/null @@ -1,12 +0,0 @@ -[default] -extend-ignore-re = [ - '`[^`\n]+`', # skip inline code - '```[\s\S]*?```', # skip code blocks - 'Bún bò Nam Bô', # otherwise "Nam" is considered as a typo of "Name" -] - -[default.extend-words] -SER = "SER" - -[files] -extend-exclude = ['sources/api/*.mdx'] diff --git a/sources/academy/webscraping/anti_scraping/techniques/fingerprinting.md b/sources/academy/webscraping/anti_scraping/techniques/fingerprinting.md index 1fadca91fd..df882f4e17 100644 --- a/sources/academy/webscraping/anti_scraping/techniques/fingerprinting.md +++ b/sources/academy/webscraping/anti_scraping/techniques/fingerprinting.md @@ -86,9 +86,9 @@ navigator.permissions.query('some_permission'); ``` ### With canvases {#with-canvases} - + This technique is based on rendering [WebGL](https://developer.mozilla.org/en-US/docs/Web/API/WebGL_API) scenes to a canvas element and observing the pixels rendered. WebGL rendering is tightly connected with the hardware, and therefore provides high entropy. Here's a quick breakdown of how it works: - + 1. 
A JavaScript script creates a [`<canvas>` element](https://developer.mozilla.org/en-US/docs/Web/API/Canvas_API) and renders some font or a custom shape. 2. The script then gets the pixel-map from the `<canvas>` element. 3. The collected pixel-map is stored in a cryptographic hash specific to the device's hardware. diff --git a/sources/academy/webscraping/api_scraping/general_api_scraping/cookies_headers_tokens.md b/sources/academy/webscraping/api_scraping/general_api_scraping/cookies_headers_tokens.md index 8c96eb343e..a639d45512 100644 --- a/sources/academy/webscraping/api_scraping/general_api_scraping/cookies_headers_tokens.md +++ b/sources/academy/webscraping/api_scraping/general_api_scraping/cookies_headers_tokens.md @@ -16,10 +16,10 @@ Unfortunately, most APIs will require a valid cookie to be included in the `cook Luckily, there are ways to retrieve and set cookies for requests prior to sending them, which will be covered more in-depth within future Scraping Academy modules. The most important things to know at the moment are: ## Cookies {#cookies} - + 1. For sites that heavily rely on cookies for user-verification and request authorization, certain generic requests (such as to the website's main page, or to the target page) will return back a (or multiple) `set-cookie` header(s). 2. The `set-cookie` response header(s) can be parsed and used as the `cookie` header in the headers of a request. A great package for parsing these values from a response's headers is [`set-cookie-parser`](https://www.npmjs.com/package/set-cookie-parser). 
With this package, cookies can be parsed from headers like so: - + ```js import axios from 'axios'; diff --git a/sources/platform/actors/development/actor_definition/actor_json.md b/sources/platform/actors/development/actor_definition/actor_json.md index f5fa9ad4d4..de65df4036 100644 --- a/sources/platform/actors/development/actor_definition/actor_json.md +++ b/sources/platform/actors/development/actor_definition/actor_json.md @@ -5,7 +5,7 @@ slug: /actors/development/actor-definition/actor-json sidebar_position: 1 --- -**Learn how to write the main Actor config in the `.actor/actor.json` file.** +**Learn how to write the main Actor configuration in the `.actor/actor.json` file.** --- diff --git a/sources/platform/actors/development/actor_definition/input_schema/specification.md b/sources/platform/actors/development/actor_definition/input_schema/specification.md index 7f24b75552..b264755264 100644 --- a/sources/platform/actors/development/actor_definition/input_schema/specification.md +++ b/sources/platform/actors/development/actor_definition/input_schema/specification.md @@ -18,7 +18,7 @@ The Actor input schema serves three main purposes: - It simplifies invoking your Actors from external systems by generating calling code and connectors for integrations. To define an input schema for an Actor, set `input` field in the `.actor/actor.json` file to an input schema object (described below), or path to a JSON file containing the input schema object. -For backwards compatibility, if the `input` field is omitted, the system looks for an `INPUT_SCHEMA.json` file either in the `.actor` directory or the Actor's top-level directory—but note that this functionality is deprececated and might be removed in the future. The maximum allowed size for the input schema file is 500 kB. 
+For backwards compatibility, if the `input` field is omitted, the system looks for an `INPUT_SCHEMA.json` file either in the `.actor` directory or the Actor's top-level directory—but note that this functionality is deprecated and might be removed in the future. The maximum allowed size for the input schema file is 500 kB. When you provide an input schema, the system will validate the input data passed to the Actor on start (via the API or Apify Console) against the specified schema to ensure compliance before starting the Actor. If the input object doesn't conform the schema, the caller receives an error and the Actor is not started. @@ -343,7 +343,7 @@ The object where the proxy configuration is stored has the following structure: } ``` -Example of a blackbox object: +Example of a black box object: ```json { diff --git a/sources/platform/actors/development/builds_and_runs/builds.md b/sources/platform/actors/development/builds_and_runs/builds.md index bab0d19d3a..873aa31515 100644 --- a/sources/platform/actors/development/builds_and_runs/builds.md +++ b/sources/platform/actors/development/builds_and_runs/builds.md @@ -11,7 +11,7 @@ slug: /actors/development/builds-and-runs/builds ## Understand Actor builds -Before an Actor can be run, it needs to be built. The build process creates a snapshot of a specific version of the Actor's settings, including its [source code](../actor_definition/source_code.md) and [environment variables](../programming_interface/environment_variables.md). This snapshot is then used to create a Docker image containing everything the Actor needs for its run, such as NPM packages, web browsers, etc. +Before an Actor can be run, it needs to be built. The build process creates a snapshot of a specific version of the Actor's settings, including its [source code](../actor_definition/source_code.md) and [environment variables](../programming_interface/environment_variables.md). 
This snapshot is then used to create a Docker image containing everything the Actor needs for its run, such as `npm` packages, web browsers, etc. ### Build numbers diff --git a/sources/platform/actors/development/builds_and_runs/state_persistence.md b/sources/platform/actors/development/builds_and_runs/state_persistence.md index c9889483f5..42886161b5 100644 --- a/sources/platform/actors/development/builds_and_runs/state_persistence.md +++ b/sources/platform/actors/development/builds_and_runs/state_persistence.md @@ -18,7 +18,7 @@ Long-running [Actor](../../index.mdx) jobs may need to migrate between servers. To prevent data loss, long-running Actors should: - Periodically save (persist) their state. -- Listem for [migration events](/sdk/js/api/apify/class/PlatformEventManager) +- Listen for [migration events](/sdk/js/api/apify/class/PlatformEventManager) - Check for persisted state when starting, allowing them to resume from where they left off. For short-running Actors, the risk of restarts and the cost of repeated runs are low, so you can typically ignore state persistence. diff --git a/sources/platform/actors/development/deployment/continuous_integration.md b/sources/platform/actors/development/deployment/continuous_integration.md index 85bbe723c6..beda6543fc 100644 --- a/sources/platform/actors/development/deployment/continuous_integration.md +++ b/sources/platform/actors/development/deployment/continuous_integration.md @@ -30,7 +30,7 @@ To set up automated builds and tests for your Actors you need to: ![Apify token in app](./images/ci-token.png) 1. Add your Apify token to GitHub secrets - 1. Go to your repo > Settings > Secrets > New repository secret + 1. Go to your repository > Settings > Secrets > New repository secret 1. Name the secret & paste in your token 1. Add the Builds Actor API endpoint URL to GitHub secrets 1. Use this format: @@ -43,7 +43,7 @@ To set up automated builds and tests for your Actors you need to: 1. Name the secret 1. 
Create GitHub Actions workflow files: - 1. In your repo, create the `.github/workflows` directory + 1. In your repository, create the `.github/workflows` directory 2. Add `latest.yml` and `beta.yml` files with the following content diff --git a/sources/platform/actors/development/deployment/index.md b/sources/platform/actors/development/deployment/index.md index 21569b0187..ccbd537a3e 100644 --- a/sources/platform/actors/development/deployment/index.md +++ b/sources/platform/actors/development/deployment/index.md @@ -13,7 +13,7 @@ Deploying an Actor involves uploading your [source code](/platform/actors/develo ## Deploy using Apify CLI -The fastest way to deploy and build your Actor is by uising the [Apify CLI](/cli). If you've completed one of the tutorials from the [academy](/academy), you should have already have it installed. If not, follow the [Apify CLI installation instructions](/cli/docs/installation). +The fastest way to deploy and build your Actor is by using the [Apify CLI](/cli). If you've completed one of the tutorials from the [academy](/academy), you should have already have it installed. If not, follow the [Apify CLI installation instructions](/cli/docs/installation). To deploy your Actor using Apify CLI: @@ -49,7 +49,7 @@ You can also pull an existing Actor from the Apify platform to your local machin apify pull [ACTORID] ``` -This command fetches the Actor's files to your current directory. If the Actor is defined as a Git repository, it will be cloned, for Actors defined in the Web IDE, the command will fetch the files diresctly. +This command fetches the Actor's files to your current directory. If the Actor is defined as a Git repository, it will be cloned, for Actors defined in the Web IDE, the command will fetch the files directly. 
You can specify a particular version of the Actor to pull by using the `--version` flag: diff --git a/sources/platform/actors/development/deployment/source_types.md b/sources/platform/actors/development/deployment/source_types.md index 248b16a64b..f747fcc628 100644 --- a/sources/platform/actors/development/deployment/source_types.md +++ b/sources/platform/actors/development/deployment/source_types.md @@ -9,10 +9,13 @@ sidebar_position: 1 --- -This section explains the various sources types available for Apify Actors and how to deploy an Actor from Github using CLI or Gist. Apify Actors supporst four source types: +This section explains the various sources types available for Apify Actors and how to deploy an Actor from GitHub using CLI or Gist. Apify Actors supports four source types: - [Web IDE](#web-ide) - [Git repository](#git-repository) + - [Private repositories](#private-repositories) + - [How to configure deployment keys](#how-to-configure-deployment-keys) + - [Actor monorepos](#actor-monorepos) - [Zip file](#zip-file) - [GitHub Gist](#github-gist) @@ -22,7 +25,7 @@ This is the default option when your Actor's source code is hosted on the Apify A `Dockerfile` is mandatory for all Actors. When using the default NodeJS Dockerfile, you'll typically need `main.js` for your source code and `package.json` for [NPM](https://www.npmjs.com/) package configurations. -For more information on creating custom Dockersfiles or using Apify's base images, refer to the [Dockerfile](/platform/actors/development/actor-definition/dockerfile#custom-dockerfile) and [base Docker images](/platform/actors/development/actor-definition/dockerfile#base-docker-images) documentation. +For more information on creating custom Dockerfiles or using Apify's base images, refer to the [Dockerfile](/platform/actors/development/actor-definition/dockerfile#custom-dockerfile) and [base Docker images](/platform/actors/development/actor-definition/dockerfile#base-docker-images) documentation. 
## Git repository @@ -30,7 +33,7 @@ For more information on creating custom Dockersfiles or using Apify's base image Hosting your Actor's source code in a Git repository allows for multiple files and directories, a custom `Dockerfile` for build process control, and a user description fetched from `README.md`. Specify the repository location using the **Git URL** setting with `https`, `git`, or `ssh` protocols. -To deploy an Actor from GitHub, set the **Source Type** to **Git repository** and enter the GitHub repository URL in the **Git URL** field. You can optionally specify a branch or tag by adding a URL fragmend (e.g., `#develop`). +To deploy an Actor from GitHub, set the **Source Type** to **Git repository** and enter the GitHub repository URL in the **Git URL** field. You can optionally specify a branch or tag by adding a URL fragment (e.g., `#develop`). To use a specific directory, add it after the branch/tag, separated by a colon (e.g., `#develop:some/dir`) @@ -72,14 +75,14 @@ Remember that each key can only be used once per Git hosting service (GitHub, Bi To manage multiple Actors in a single repository, use the `dockerContextDix` property in the [Actor definition](/platform/actors/development/actor-definition/actor-json) to set the Docker context directory (if not provided then the repository root is used). In the Dockerfile, copy both the Actor's source and any shared code into the Docker image. -To enable sharing Dockerfiles between multiple Actors, the Actor build process passes the `ACTOR_PATH_IN_DOCKER_CONTEXT` build arg to the Docker build. +To enable sharing Dockerfiles between multiple Actors, the Actor build process passes the `ACTOR_PATH_IN_DOCKER_CONTEXT` build argument to the Docker build. It contains the relative path from `dockerContextDir` to the directory selected as the root of the Actor in the Apify Console (the "directory" part of the Actor's git URL). 
For an example, see the [`apify/actor-monorepo-example`](https://github.com/apify/actor-monorepo-example) repository. To build Actors from this monorepo, you would set the source URL (including branch name and folder) as `https://github.com/apify/actor-monorepo-example#main:actors/javascript-actor` and `https://github.com/apify/actor-monorepo-example#main:actors/typescript-actor` respectively. ## Zip file -Actors can also use source code from a Zip archive hosted on an external URL. This option supports multiple files and directories, allows for custom `Dockerfile`, and uses `README.md` for the Actor description. If not using a [custom Dockerfile](../actor_definition/docker.md#custom-dockerfile), ensure your main applicat file is named `main.js`. +Actors can also use source code from a Zip archive hosted on an external URL. This option supports multiple files and directories, allows for custom `Dockerfile`, and uses `README.md` for the Actor description. If not using a [custom Dockerfile](../actor_definition/docker.md#custom-dockerfile), ensure your main file is named `main.js`. :::note Automatic use of ZIP file @@ -91,6 +94,6 @@ This source type is used automatically when you are using Apify-CLI and the sour For smaller projects, GitHub Gist offers a simpler alternative to full Git repositories or hosted Zip files. To use a GitHub Gist, create your Gist at [https://gist.github.com/](https://gist.github.com/), set the **Source type** to **GitHub Gist**, and paste the Gist URL in the provided field. -Like other source types, Gists can include multiple files, directories, and a custom Dockersfile. The Actor description is taken from `README.md`. +Like other source types, Gists can include multiple files, directories, and a custom Dockerfile. The Actor description is taken from `README.md`. By understanding these source types, you can choose the most appropriate option for hosting and deploying your Apify Actors. 
Each type offers unique advantages, allowing you to select the best fit for your project's size, complexity, and collaboration needs. diff --git a/sources/platform/actors/development/performance.md b/sources/platform/actors/development/performance.md index ab2726eb47..70d70666d1 100644 --- a/sources/platform/actors/development/performance.md +++ b/sources/platform/actors/development/performance.md @@ -11,7 +11,7 @@ slug: /actors/development/performance ## Optimization Tips -This guide provides tips to help you maximize the poerformance of your Actors, minimize costs, and achieve optimal results. +This guide provides tips to help you maximize the performance of your Actors, minimize costs, and achieve optimal results. ### Run batch jobs instead of single jobs diff --git a/sources/platform/actors/development/programming_interface/environment_variables.md b/sources/platform/actors/development/programming_interface/environment_variables.md index a25fb9ce98..c1f3910715 100644 --- a/sources/platform/actors/development/programming_interface/environment_variables.md +++ b/sources/platform/actors/development/programming_interface/environment_variables.md @@ -53,7 +53,7 @@ Here's a table of key system environment variables: | `APIFY_DISABLE_OUTDATED_WARNING` | Controls the display of outdated version warnings. Set to `1` to suppress notifications about updates. | | `APIFY_WORKFLOW_KEY` | Identifier used for grouping related runs and API calls together. | | `APIFY_META_ORIGIN` | Specifies how an Actor run was started. Possible values are [here](/platform/actors/running/runs-and-builds#origin) | -| `APIFY_SDK_LATEST_VERSION` | Specifies the most recent release version of the Apify SDK for Javascript. Used for checking for updates. | +| `APIFY_SDK_LATEST_VERSION` | Specifies the most recent release version of the Apify SDK for JavaScript. Used for checking for updates. 
| | `APIFY_INPUT_SECRETS_KEY_FILE` | Path to the secret key used to decrypt [Secret inputs](/platform/actors/development/actor-definition/input-schema/secret-input). | | `APIFY_INPUT_SECRETS_KEY_PASSPHRASE` | Passphrase for the input secret key specified in `APIFY_INPUT_SECRETS_KEY_FILE`. | diff --git a/sources/platform/actors/development/programming_interface/system_events.md b/sources/platform/actors/development/programming_interface/system_events.md index efe79c1768..f5dfb15916 100644 --- a/sources/platform/actors/development/programming_interface/system_events.md +++ b/sources/platform/actors/development/programming_interface/system_events.md @@ -22,7 +22,7 @@ Apify's system notifies Actors about various events, such as: - Abort operations triggered by another Actor - CPU overload -These events help you manage your Actor's behavior and resources effecetively. +These events help you manage your Actor's behavior and resources effectively. ## System events diff --git a/sources/platform/actors/development/quick_start/start_locally.md b/sources/platform/actors/development/quick_start/start_locally.md index e367d1bc19..f7cc5a8de1 100644 --- a/sources/platform/actors/development/quick_start/start_locally.md +++ b/sources/platform/actors/development/quick_start/start_locally.md @@ -11,7 +11,7 @@ slug: /actors/development/quick-start/locally :::info Prerequisites -You need to have [Node.js](https://nodejs.org/en/) version 16 or higher with NPM installed on your computer. +You need to have [Node.js](https://nodejs.org/en/) version 16 or higher with `npm` installed on your computer. 
::: diff --git a/sources/platform/actors/publishing/badge.mdx b/sources/platform/actors/publishing/badge.mdx index a074344d12..e3eef7599b 100644 --- a/sources/platform/actors/publishing/badge.mdx +++ b/sources/platform/actors/publishing/badge.mdx @@ -31,7 +31,7 @@ https://apify.com/actor-badge?actor=/ In order to embed the badge in the HTML documentation, just use it as an image wrapped in a link as shown in the example below. Don't froget to use the `username` and `actor-name` of your Actor. #### Example - + ```html @@ -40,13 +40,13 @@ In order to embed the badge in the HTML documentation, just use it as an image w ``` - + ```markdown [![Website Content Crawler Actor](https://apify.com/actor-badge?actor=apify/website-content-crawler)](https://apify.com/apify/website-content-crawler) ``` - + ### Supported Actor states The badge indicates the state of the Actor in the Apify platform as the result of the [automated testing](../development/automated_tests.md). diff --git a/sources/platform/actors/running/usage_and_resources.md b/sources/platform/actors/running/usage_and_resources.md index 84c1664ec1..b75a002f7f 100644 --- a/sources/platform/actors/running/usage_and_resources.md +++ b/sources/platform/actors/running/usage_and_resources.md @@ -63,9 +63,9 @@ A good middle ground is `4096MB`. If you need the results faster, increase the m Autoscaling only applies to solutions that run multiple tasks (URLs) for at least 30 seconds. If you need to scrape just one URL or use Actors like [Google Sheets](https://apify.com/lukaskrivka/google-sheets) that do just a single isolated job, we recommend you lower the memory. [//]: # (TODO: It's pretty outdated, we now have platform credits in pricing) - + [//]: # (If you read that you can scrape 1000 pages of data for 1 CU and you want to scrape approximately 2 million of them monthly, that means you need 2000 CUs monthly and should [subscribe to the Business plan](https://console.apify.com/billing-new#/subscription).) 
- + If the Actor doesn't have this information, or you want to use your own solution, just run your solution like you want to use it long term. Let's say that you want to scrape the data **every hour for the whole month**. You set up a reasonable memory allocation like `4096MB`, and the whole run takes 15 minutes. That should consume 1 CU (4 \* 0.25 = 1). Now, you just need to multiply that by the number of hours in the day and by the number of days in the month, and you get an estimated usage of 720 (1 \* 24 \* 30) CUs monthly. diff --git a/sources/platform/integrations/actors/index.md b/sources/platform/integrations/actors/index.md index 390341434b..e387eb5685 100644 --- a/sources/platform/integrations/actors/index.md +++ b/sources/platform/integrations/actors/index.md @@ -13,7 +13,7 @@ slug: /integrations/actors :::note Integration Actors -You can check out a catalogue of our Integaration Actors within [Apify Store](https://apify.com/store/categories/integrations). +You can check out a catalogue of our Integration Actors within [Apify Store](https://apify.com/store/categories/integrations). ::: diff --git a/sources/platform/integrations/actors/integrating_actors_via_api.md b/sources/platform/integrations/actors/integrating_actors_via_api.md index 45820a9114..49f8f48b7c 100644 --- a/sources/platform/integrations/actors/integrating_actors_via_api.md +++ b/sources/platform/integrations/actors/integrating_actors_via_api.md @@ -16,8 +16,8 @@ import TabItem from '@theme/TabItem'; You can integrate Actors via API using the [Create webhook](/api/v2#/reference/webhooks/webhook-collection/create-webhook) endpoint. It's the same as any other webhook, but to make sure you see it in Apify Console, you need to make sure of a few things. -* The `requestUrl` field needs to point to the **Run Actor** or **Run task** endpoints and needs to use their IDs as identifiers (ie. not their technical names). -* The `payloadTemplate` field should be valid JSON - ie. 
it should only use variables enclosed in strings. You will also need to make sure that it contains a `payload` field. +* The `requestUrl` field needs to point to the **Run Actor** or **Run task** endpoints and needs to use their IDs as identifiers (i.e. not their technical names). +* The `payloadTemplate` field should be valid JSON - i.e. it should only use variables enclosed in strings. You will also need to make sure that it contains a `payload` field. * The `shouldInterpolateStrings` field needs to be set to `true`, otherwise the variables won't work. * Add `isApifyIntegration` field with the value `true`. This is a helper that turns on the Actor integration UI, if the above conditions are met. diff --git a/sources/platform/integrations/ai/milvus.md b/sources/platform/integrations/ai/milvus.md index 3222890e98..5630466776 100644 --- a/sources/platform/integrations/ai/milvus.md +++ b/sources/platform/integrations/ai/milvus.md @@ -86,12 +86,14 @@ Another way to interact with Milvus is through the [Apify Python SDK](https://do 1. Call the [Website Content Crawler](https://apify.com/apify/website-content-crawler) Actor to crawl the Milvus documentation and Zilliz website and extract text content from the web pages: + ```python actor_call = client.actor("apify/website-content-crawler").call( run_input={"maxCrawlPages": 10, "startUrls": [{"url": "https://milvus.io/"}, {"url": "https://zilliz.com/"}]} ) ``` + 1. Call Apify's Milvus integration and store all data in the Milvus Vector Database: ```python diff --git a/sources/platform/integrations/data-storage/drive.md b/sources/platform/integrations/data-storage/drive.md index 3622634ee3..9a5f41dd77 100644 --- a/sources/platform/integrations/data-storage/drive.md +++ b/sources/platform/integrations/data-storage/drive.md @@ -10,7 +10,7 @@ slug: /integrations/drive --- -Completementary to the following guide we've created a detailed video, that will guide you through the process of setting up your Google Drive integration. 
+Complementary to the following guide we've created a detailed video, that will guide you through the process of setting up your Google Drive integration. diff --git a/sources/platform/integrations/programming/api.md b/sources/platform/integrations/programming/api.md index e8cec54061..f59683b174 100644 --- a/sources/platform/integrations/programming/api.md +++ b/sources/platform/integrations/programming/api.md @@ -13,7 +13,7 @@ slug: /integrations/api All aspects of the Apify platform can be controlled via a REST API, which is described in detail in the [**API Reference**](/api/v2). If you want to use the Apify API from JavaScript/Node.js or Python, we strongly recommend to use one of our API clients: -- [**apify-client**](/api/client/js/) NPM package for JavaScript, supporting both browser and server +- [**apify-client**](/api/client/js/) `npm` package for JavaScript, supporting both browser and server - [**apify-client**](/api/client/python/) PyPI package for Python. You are not required to those packages—the REST API works with any HTTP client—but the official API clients implement best practices such as exponential backoff and rate limiting. @@ -155,7 +155,7 @@ You can use scoped tokens to schedule Actor and Tasks. Each schedule invocation However, **this token is always unscoped, which means that the scheduled Actor has access to all your account data**, regardless of the scope of the token that scheduled it. -### Webhoooks configuration +### Webhooks configuration If you allow a token to run an Actor, it'll also be able to manage the Actor's webhooks (similarly for tasks). 
diff --git a/sources/platform/integrations/programming/webhooks/actions.md b/sources/platform/integrations/programming/webhooks/actions.md
index 8b4f41400e..d6a71dc095 100644
--- a/sources/platform/integrations/programming/webhooks/actions.md
+++ b/sources/platform/integrations/programming/webhooks/actions.md
@@ -50,7 +50,7 @@ If the URL of your request points toward Apify, you don't need to add a token, s
 ## Payload template
-The payload template is a JSON-like string, that allows you to define a custom payload structure and inject dynamic data known only at the time of the webhook's invocation. Apart from the variables, the string must be a valid JSON.
+The payload template is a JSON-like string that allows you to define a custom payload structure and inject dynamic data known only at the time of the webhook's invocation. Apart from the variables, the string must be a valid JSON.
 Variables must be enclosed in double curly braces and can only use the pre-defined variables listed in the [Available variables](#available-variables) section. Using any other variable will result in a validation error.
@@ -132,7 +132,7 @@ Note that the `eventData` and `resource` properties contain redundant data for b
 ## Headers template
-The headers template is a JSON-like string where you can add additional information to the default header of the webhook request. You can pass the variables in the same way as in [payload template](#payload-template) (including the use of string interpolation and available variables). The resulting headers need to be a valid json object and values can be strings only.
+The headers template is a JSON-like string where you can add additional information to the default header of the webhook request. You can pass the variables in the same way as in [payload template](#payload-template) (including the use of string interpolation and available variables). The resulting headers need to be a valid `json` object and values can be strings only.
 Note that the following keys are hard-coded and will be always be rewritten:
diff --git a/sources/platform/integrations/workflows-and-notifications/gmail.md b/sources/platform/integrations/workflows-and-notifications/gmail.md
index 56c1c18378..623c0e03ca 100644
--- a/sources/platform/integrations/workflows-and-notifications/gmail.md
+++ b/sources/platform/integrations/workflows-and-notifications/gmail.md
@@ -10,7 +10,7 @@ slug: /integrations/gmail
 ---
-Completementary to the following guide we've created a detailed video, that will guide you through the process of setting up your Gmail integration.
+Complementary to the following guide we've created a detailed video, that will guide you through the process of setting up your Gmail integration.
diff --git a/sources/platform/integrations/workflows-and-notifications/telegram.md b/sources/platform/integrations/workflows-and-notifications/telegram.md
index f835596a5e..ad92fc4232 100644
--- a/sources/platform/integrations/workflows-and-notifications/telegram.md
+++ b/sources/platform/integrations/workflows-and-notifications/telegram.md
@@ -16,7 +16,7 @@ Your Zapier workflows can start Apify Actors or tasks, fetch items from a datase
 You can use the Zapier integration to trigger a workflow whenever an Actor or a task finishes.
-Completementary to the following guide we've created a detailed video, that will guide you through the process of setting up your Telegram integration through Zapier.
+Complementary to the following guide we've created a detailed video, that will guide you through the process of setting up your Telegram integration through Zapier.
diff --git a/sources/platform/monitoring/index.md b/sources/platform/monitoring/index.md
index 5d208cc7cf..4cb75ef4f0 100644
--- a/sources/platform/monitoring/index.md
+++ b/sources/platform/monitoring/index.md
@@ -49,7 +49,7 @@ When you set up an alert, you have four choices for how you want the metrics to
 3. **Alert, when run status is one of following** - This type of alert is checked only after the run finishes. It makes possible to track the status of your finished runs and send an alert if the run finishes in a state you do not expect. If your Actor runs very often and suddenly starts failing, you will receive a single alert after the first failed run in 1 minute, and then aggregated alert every 15 minutes.
-4. **Alert for dataset field statistics** - If you have a [dataset schema](../actors/development/actor_definition/dataset_schema/validation.md) set up, then you can use the field statistics to set up an alert. You can use field statistics for example to track if some field is filled in in all records, if some numeric value is too low/high (for example when tracking the price of a product over multiple sources), if the number of items in an array is too low/high (for example alert on Instagram Actor if post has a lot of comments) and many other tasks like these.
+4. **Alert for dataset field statistics** - If you have a [dataset schema](../actors/development/actor_definition/dataset_schema/validation.md) set up, then you can use the field statistics to set up an alert. You can use field statistics for example to track if some field is filled in all records, if some numeric value is too low/high (for example when tracking the price of a product over multiple sources), if the number of items in an array is too low/high (for example alert on Instagram Actor if post has a lot of comments) and many other tasks like these.
 :::important
diff --git a/sources/platform/proxy/datacenter_proxy.md b/sources/platform/proxy/datacenter_proxy.md
index 33727e9db8..a2b2f9b6d8 100644
--- a/sources/platform/proxy/datacenter_proxy.md
+++ b/sources/platform/proxy/datacenter_proxy.md
@@ -8,7 +8,7 @@ slug: /proxy/datacenter-proxy
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';
-# Datacenter proxy {#datacenter-proxy}
+# Datacenter proxy
 **Learn how to reduce blocking when web scraping using IP address rotation. See proxy parameters and learn to implement Apify Proxy in an application.**
diff --git a/sources/platform/proxy/index.md b/sources/platform/proxy/index.md
index bfede81346..787659a405 100644
--- a/sources/platform/proxy/index.md
+++ b/sources/platform/proxy/index.md
@@ -85,7 +85,7 @@ Several types of proxy servers exist, each offering distinct advantages, disadva