[Docs] Small typo repairs (#1846)
george0st committed Mar 28, 2022
1 parent c92b0b2 commit f1f6bd8
Showing 22 changed files with 31 additions and 31 deletions.
@@ -1213,7 +1213,7 @@
"When dealing with real-time aggregation, it's important to be able to update these aggregations in real-time.\n",
"For this purpose, we will create live serving functions that will update the online feature store of the `transactions` FeatureSet and `Events` FeatureSet.\n",
"\n",
"Using MLRun's `serving` runtime, craetes a nuclio function loaded with our feature set's computational graph definition\n",
"Using MLRun's `serving` runtime, creates a nuclio function loaded with our feature set's computational graph definition\n",
"and an `HttpSource` to define the HTTP trigger.\n",
"\n",
"Notice that the implementation below does not require any rewrite of the pipeline logic."
@@ -7,7 +7,7 @@
"# Part 3: Serving\n",
"\n",
"In this part we will user MLRun's **serving runtime** to deploy our trained models from the previous stage a `Voting Ensemble` using **max vote** logic. \n",
"We will also use MLRun's **Feature store** to receive the latest tag of the online **Feature Vector** we defined in the preveious stage.\n",
"We will also use MLRun's **Feature store** to receive the latest tag of the online **Feature Vector** we defined in the previous stage.\n",
"\n",
"By the end of this tutorial you’ll learn how to:\n",
"- Define a model class to load our models, run preprocessing and predict on the data\n",
2 changes: 1 addition & 1 deletion docs/feature-store/end-to-end-demo/04-pipeline.ipynb
@@ -76,7 +76,7 @@
"source": [
"## Step 2: Updating Project and Function Definitions\n",
"\n",
"We need to save the definitions for the function we use in the projects so it is possible to automatically convert code to functions or import external functions whenever we load new versions of our code or when we run automated CI/CD workflows. In addition we may want to set other project attributes such as global parameters, secrets, and data.\n",
"We need to save the definitions for the function we use in the projects so it is possible to automatically convert code to functions or import external functions whenever we load new versions of our code or when we run automated CI/CD workflows. In addition, we may want to set other project attributes such as global parameters, secrets, and data.\n",
"\n",
"Our code maybe stored in Python files, notebooks, external repositories, packaged containers, etc. We use the `project.set_function()` method to register our code in the project, the definitions will be saved to the project object as well as in a YAML file in the root of our project.\n",
"Functions can also be imported from MLRun marketplace (using the `hub://` schema).\n",
2 changes: 1 addition & 1 deletion docs/feature-store/feature-sets.md
@@ -54,7 +54,7 @@ The MLRun feature store supports three processing engines (storey, pandas, spark
(e.g. Notebook) for interactive development or in elastic serverless functions for production and scale.

The data pipeline is defined using MLRun graph (DAG) language. Graph steps can be pre-defined operators
-(such as aggregate, filter, encode, map, join, impute, etc) or custom python classes/functions.
+(such as aggregate, filter, encode, map, join, impute, etc.) or custom python classes/functions.
Read more about the graph in [**Serving and Data Pipelines**](../serving/serving-graph.md).

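A short, hedged sketch of mixing built-in operators and custom steps in a feature set graph (entity, column, and step names are assumptions, and signatures differ slightly between MLRun versions):

```python
import mlrun.feature_store as fstore

# a small feature set keyed by an account id (names are illustrative)
transactions_set = fstore.FeatureSet("transactions",
                                     entities=[fstore.Entity("account_id")])

# built-in sliding-window aggregation operator
transactions_set.add_aggregation(column="amount",
                                 operations=["sum", "count"],
                                 windows=["1h", "1d"],
                                 period="10m")

# a custom step appended to the graph (here a simple storey filter expression)
transactions_set.graph.to("storey.Filter", "filter_small",
                          _fn="(event['amount'] > 1)")
```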
The `pandas` and `spark` engines are good for simple batch transformations, while the `storey` stream processing engine (the default engine)
2 changes: 1 addition & 1 deletion docs/install.md
@@ -18,7 +18,7 @@ See also important details about [MLRun client backward compatibility](#mlrun-cl
on Windows or Mac, [Docker Desktop](https://www.docker.com/products/docker-desktop) is recommended. MLRun fully supports k8s releases up to, and including, 1.21.
2. The Kubernetes command-line tool (kubectl) compatible with your Kubernetes cluster is installed. Refer to the [kubectl installation
instructions](https://kubernetes.io/docs/tasks/tools/install-kubectl/) for more information.
-3. Helm CLI is installed. Refer to the [Helm installation instructions](https://helm.sh/docs/intro/install/) for more information.
+3. Helm CLI is installed. Refer to the [Helm installation instructions](https://helm.sh/docs/intro/install/) and [sources](https://github.com/helm/helm/releases/) for more information.
4. An accessible docker-registry (such as [Docker Hub](https://hub.docker.com)). The registry's URL and credentials are consumed by the applications via a pre-created secret.

> **Note:**
6 changes: 3 additions & 3 deletions docs/model_monitoring/model-monitoring-deployment.ipynb
@@ -51,7 +51,7 @@
"The Model Monitoring feature provides drift analysis monitoring.\n",
"Model Drift in machine learning is a situation where the statistical properties of the target variable (what the model is trying to predict) change over time.\n",
"In other words, the production data has changed significantly over the course of time and no longer matches the input data used to train the model.\n",
"So, for this new data, accuracy of the model predictions is low. Drfit analysis statistics are computed once an hour.\n",
"So, for this new data, accuracy of the model predictions is low. Drift analysis statistics are computed once an hour.\n",
"For more information see <a href=\"https://www.iguazio.com/glossary/concept-drift/\" target=\"_blank\">Concept Drift</a>.\n",
"\n",
"### Common Terminology\n",
@@ -71,7 +71,7 @@
"* [Model Features Analysis](#model-features-analysis)\n",
"\n",
"Select a project from the project tiles screen.\n",
"From the project dashboard, press the **Models** tile to view the models currently deployed .\n",
"From the project dashboard, press the **Models** tile to view the models currently deployed.\n",
"Click **Model Endpoints** from the menu to display a list of monitored endpoints.\n",
"If the Model Monitoring feature is not enabled, the endpoints list will be empty.\n",
"\n",
@@ -175,7 +175,7 @@
"* **Drift Status**&mdash;no drift (green), possible drift (yellow), drift detected (red)\n",
"\n",
"At the bottom of the dashboard are heat maps for the Predictions per second, Average Latency and Errors. The heat maps display data based on 15 minute intervals.\n",
"See [How to Read a Heat Map](#how-to-read-a-heat-map)for more details.\n",
"See [How to Read a Heat Map](#how-to-read-a-heat-map) for more details.\n",
"\n",
"Click an endpoint ID to drill down the performance details of that model.\n",
"\n",
2 changes: 1 addition & 1 deletion docs/projects/ci-integration.md
@@ -11,7 +11,7 @@ local code (from the repository) with MLRun marketplace functions to build an au
* deploy the model into a cluster
* test the deployed model

-MLRun workflows can run inside the CI system, we will ususlly use the `mlrun project` CLI command to load the project
+MLRun workflows can run inside the CI system, we will usually use the `mlrun project` CLI command to load the project
and run a workflow as part of a code update (e.g. pull request, etc.). The pipeline tasks will be executed on the Kubernetes cluster which is orchestrated by MLRun.

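As an illustration, the Python equivalent of that CLI flow might look roughly like this (the repository URL, workflow name, and arguments are placeholders):

```python
import mlrun

# load the project from the CI checkout; the git URL is a placeholder
project = mlrun.load_project(context="./",
                             url="git://github.com/<org>/<repo>.git")

# run the named workflow and wait, so the CI step fails if the pipeline fails
run = project.run("main", arguments={"model_name": "my-model"}, watch=True)
```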
See details:
2 changes: 1 addition & 1 deletion docs/projects/project.md
@@ -254,7 +254,7 @@ Examples:
project.set_function(func_object)
```

-once functions are registered or saved in the project we can get their function object using `project.get_function(key)`.
+once functions are registered or saved in the project, we can get their function object using `project.get_function(key)`.

example:

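A minimal sketch, assuming a function was registered earlier under the key `train` (key, resource values, and parameters are placeholders):

```python
# retrieve the registered function object by its key
trainer = project.get_function("train")

# the returned object is a regular MLRun function, so it can be tuned and run
trainer.with_limits(mem="2G")
run = project.run_function("train", params={"label_column": "label"})
```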
2 changes: 1 addition & 1 deletion docs/projects/workflows.md
@@ -138,7 +138,7 @@ Workflows are asynchronous by default, we can set the `watch` flag to True and t
completion and print out the workflow progress, alternatively you can use `.wait_for_completion()` on the run object.

The default workflow engine is `kfp`, we can override it by specifying the `engine` in the `run()` or `set_workflow()` methods,
-using the `local` engine will execute the workflow state machine loaclly (its functions will still run as cluster jobs).
+using the `local` engine will execute the workflow state machine locally (its functions will still run as cluster jobs).
if we set the `local` flag to True the workflow will use the `local` engine AND the functions will will run as local process,
this mode is used for local debugging of workflows.

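Putting those flags together, a hedged sketch (the workflow name `main` is illustrative):

```python
# asynchronous run (default), then explicit wait on the returned run object
run_status = project.run("main")
run_status.wait_for_completion()

# local engine: the workflow state machine runs locally,
# while its functions still run as cluster jobs
project.run("main", engine="local")

# fully local debugging: local engine AND functions run as local processes
project.run("main", local=True, watch=True)
```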
2 changes: 1 addition & 1 deletion docs/runtimes/dask-mlrun.ipynb
@@ -383,7 +383,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Use code_to_function to convert the code to MLRun and specify the configuration for the dask process (e.g. replicas, memory etc) <br>\n",
"Use code_to_function to convert the code to MLRun and specify the configuration for the dask process (e.g. replicas, memory etc.) <br>\n",
"Note that the resource configurations are per worker"
]
},
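A rough sketch of that conversion, assuming the code lives in a file named `dask_job.py` (attribute names and values are illustrative and can differ slightly between MLRun versions):

```python
import mlrun

# convert code into an MLRun function that uses the dask runtime
dask_fn = mlrun.code_to_function("my-dask", filename="dask_job.py",
                                 kind="dask", image="mlrun/mlrun")

# per-worker resource configuration (values are illustrative)
dask_fn.spec.replicas = 4
dask_fn.with_requests(mem="2G", cpu=1)
dask_fn.with_limits(mem="4G", cpu=2)
```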
2 changes: 1 addition & 1 deletion docs/runtimes/horovod.ipynb
@@ -35,7 +35,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## How can we distribute our training\n",
"## How can we distribute our training?\n",
"There are two different cluster configurations (which can be combined) we need to take into account. \n",
"- **Multi Node** &mdash; GPUs are distributed over multiple nodes in the cluster. \n",
"- **Multi GPU** &mdash; GPUs are within a single Node. \n",
4 changes: 2 additions & 2 deletions docs/serving/custom-model-serving-class.md
@@ -125,7 +125,7 @@ To specify the topology, router class and class args use `.set_topology()` with

## Creating a model serving function (service)

-To provision a serving function you need to create an MLRun function of type `serving`.
+To provision a serving function, you need to create an MLRun function of type `serving`.
This can be done by using the `code_to_function()` call from a notebook. You can also import
an existing serving function/template from the marketplace.

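For example, a hedged sketch assuming the model class lives in `model_class.py` and is called `ClassifierModel` (file name, class name, and model path are placeholders):

```python
import mlrun

# build a serving function from the file that holds the model class
serving_fn = mlrun.code_to_function("my-serving", filename="model_class.py",
                                    kind="serving", image="mlrun/mlrun")

# attach a model to the (default router) topology
serving_fn.add_model("my-model",
                     model_path="store://models/my-project/my-model",
                     class_name="ClassifierModel")
```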
@@ -151,7 +151,7 @@ model that is served by another function (can be used for ensembles).
The function object(fn) accepts many options. You can specify replicas range (auto-scaling), cpu/gpu/mem resources, add shared
volume mounts, secrets, and any other Kubernetes resource through the `fn.spec` object or fn methods.

-For example, `fn.gpu(1)` means each replica uses one GPU.
+For example, `fn.gpu(1)` means each replica uses one GPU.

To deploy a model, simply call:

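A minimal sketch of setting resources and deploying, assuming `fn` is the serving function object (for example, the `serving_fn` created above); the replica and resource values are illustrative:

```python
# illustrative resource settings on the function object
fn.spec.min_replicas = 1
fn.spec.max_replicas = 4
fn.gpu(1)                        # one GPU per replica, as noted above
fn.with_limits(mem="2G", cpu=1)

# deploy the function as a real-time nuclio service
fn.deploy()
```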
2 changes: 1 addition & 1 deletion docs/serving/distributed-graph.ipynb
@@ -576,7 +576,7 @@
"source": [
"**Listen on the output stream**\n",
"\n",
"You can use the SDK or CLI to listen on the output stream. LIstening should be done in a separate console/notebook. Run:\n",
"You can use the SDK or CLI to listen on the output stream. Listening should be done in a separate console/notebook. Run:\n",
"\n",
" mlrun watch-stream v3io:///users/admin/out-stream -j\n",
"\n",
4 changes: 2 additions & 2 deletions docs/serving/model-api.md
@@ -2,8 +2,8 @@

MLRun Serving follows the same REST API defined by Triton and [KFServing v2](https://github.com/kubeflow/kfserving/blob/master/docs/predict-api/v2/required_api.md).

-Nuclio also supports streaming protocols (Kafka, kinesis, MQTT, etc.). When streaming,
-the `model` name and `operation` can be encoded inside the message body.
+Nuclio also supports streaming protocols (Kafka, kinesis, MQTT, etc.). When streaming, the
+`model` name and `operation` can be encoded inside the message body.

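As an illustration only, invoking a deployed endpoint over HTTP might look like this (the URL, model name, and input values are placeholders; the path follows the v2 predict protocol referenced above):

```python
import requests

# placeholder endpoint of the deployed serving function and model name
url = "http://<function-endpoint>/v2/models/my-model/infer"

# minimal inference request body
resp = requests.post(url, json={"inputs": [[5.1, 3.5, 1.4, 0.2]]})
print(resp.json())
```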
The APIs are:
* [explain](#explain)
4 changes: 2 additions & 2 deletions docs/serving/realtime-pipelines.ipynb
@@ -180,7 +180,7 @@
"\n",
"Using the `flow` topology, you can specify tasks, which typically manipulate the data. The most common scenario is pre-processing of data prior to the model execution.\n",
"\n",
"```{note} Once the topology is set, you cannot change an existing function toplogy.\n",
"```{note} Once the topology is set, you cannot change an existing function topology.\n",
"```\n",
"\n",
"In this topology, you build and connect the graph (DAG) by adding steps using the `step.to()` method, or by using the \n",
@@ -405,7 +405,7 @@
"Additional steps can follow the `catcher` step.\n",
"```\n",
"\n",
"Using the example in [Getting started with model serving](./model-serving-get-started.html#flow), you can add a error handler as follows:"
"Using the example in [Getting started with model serving](./model-serving-get-started.html#flow), you can add an error handler as follows:"
]
},
{
2 changes: 1 addition & 1 deletion docs/serving/serving-graph.md
@@ -14,7 +14,7 @@ MLRun graph capabilities include:
- Easy to build and deploy distributed real-time computation graphs
- Use the real-time serverless engine (Nuclio) for auto-scaling and optimized resource utilization
- Built-in operators to handle data manipulation, IO, machine learning, deep-learning, NLP, etc.
-- Built-in monitoring for performance, resources, errors, data, model behaviour, and custom metrics
+- Built-in monitoring for performance, resources, errors, data, model behavior, and custom metrics
- Debug in the IDE/Notebook, deploy to production using a single command

The serving graphs are used by [MLRun's Feature Store](../feature-store/feature-store.md) to build real-time feature engineering pipelines,
2 changes: 1 addition & 1 deletion docs/serving/use-cases.md
@@ -29,7 +29,7 @@ Read more in the [Feature Store Overview](../feature-store/feature-store.md), an

Graphs are used for serving models with different transformations.

-To deploy a serving function you need to import or create the serving function,
+To deploy a serving function, you need to import or create the serving function,
add models to it, and then deploy it.

```python
8 changes: 4 additions & 4 deletions docs/store/artifacts.md
@@ -54,7 +54,7 @@ predefined “projects” data container — /v3io/projects/<project name>/artif
(for example, /v3io/projects/myproject/artifacts for a “myproject” project).
```

-When you use use `{{run.uid}}`, the artifacts for each job are stored in a dedicated directory for the executed job.
+When you use `{{run.uid}}`, the artifacts for each job are stored in a dedicated directory for the executed job.
Otherwise, the same artifacts directory is used in all runs, so the artifacts for newer runs override those from the previous runs.

As previously explained, `set_environment` returns a tuple with the project name and artifacts path.
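For instance, a hedged sketch (the project name and path are placeholders):

```python
import mlrun

# returns the active project name and the resolved artifacts path
project_name, artifact_path = mlrun.set_environment(
    project="my-project", artifact_path="./artifacts/{{run.uid}}"
)
```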
@@ -103,15 +103,15 @@ example artifact URLs:

## Datasets

-Storing datasets is important in order to have a record of the data that was used to train the model, as well as storing any processed data. MLRun comes with built-in support for DataFrame format, and can not just store the DataFrame, but also provide the user information regarding the data, such as statistics.
+Storing datasets is important in order to have a record of the data that was used to train the model, as well as storing any processed data. MLRun comes with built-in support for DataFrame format, and cannot just store the DataFrame, but also provide the user information regarding the data, such as statistics.

The simplest way to store a dataset is with the following code:

``` python
context.log_dataset(key='my_data', df=df)
```

-Where `key` is the the name of the artifact and `df` is the DataFrame. By default, MLRun will store a short preview of 20 lines. You can change the number of lines by using the `preview` parameter and setting it to a different value.
+Where `key` is the name of the artifact and `df` is the DataFrame. By default, MLRun will store a short preview of 20 lines. You can change the number of lines by using the `preview` parameter and setting it to a different value.

MLRun will also calculate statistics on the DataFrame on all numeric fields. You can enable statistics regardless to the DataFrame size by setting the `stats` parameter to `True`.

@@ -138,7 +138,7 @@ def get_data(context: MLClientCtx, source_url: DataItem, format: str = 'csv'):
index=False, artifact_path=target_path)
```

-We can run this function locally or as a job. For example if we run it locally:
+We can run this function locally or as a job. For example, if we run it locally:

``` python
from os import path
2 changes: 1 addition & 1 deletion docs/store/datastore.md
@@ -1,7 +1,7 @@
(datastore)=
# Data Stores & Data Items

-One of the biggest challenge in distributed systems is handling data given the
+One of the biggest challenges in distributed systems is handling data given the
different access methods, APIs, and authentication mechanisms across types and providers.

MLRun provides 3 main abstractions to access structured and unstructured data:
4 changes: 2 additions & 2 deletions mlrun/config.py
@@ -70,7 +70,7 @@
"igz_version": "", # the version of the iguazio system the API is running on
"iguazio_api_url": "", # the url to iguazio api
"spark_app_image": "", # image to use for spark operator app runtime
"spark_app_image_tag": "", # image tag to use for spark opeartor app runtime
"spark_app_image_tag": "", # image tag to use for spark operator app runtime
"spark_history_server_path": "", # spark logs directory for spark history server
"spark_operator_version": "spark-2", # the version of the spark operator in use
"builder_alpine_image": "alpine:3.13.1", # builder alpine image (as kaniko's initContainer)
@@ -422,7 +422,7 @@ def decode_base64_config_and_load_to_dict(attribute_path: str) -> dict:
raise mlrun.errors.MLRunNotFoundError(
"Attribute does not exist in config"
)
-# There is a bug in the installer component in iguazio system that causes the configrued value to be base64 of
+# There is a bug in the installer component in iguazio system that causes the configured value to be base64 of
# null (without conditioning it we will end up returning None instead of empty dict)
if raw_attribute_value and raw_attribute_value != "bnVsbA==":
try:
2 changes: 1 addition & 1 deletion mlrun/kfpops.py
@@ -214,7 +214,7 @@ def mlrun_op(
:param out_path: default output path/url (prefix) for artifacts
:param rundb: path for rundb (or use 'MLRUN_DBPATH' env instead)
:param mode: run mode, e.g. 'pass' for using the command without mlrun wrapper
-:param handler code entry-point/hanfler name
+:param handler code entry-point/handler name
:param job_image name of the image user for the job
:param verbose: add verbose prints/logs
:param scrape_metrics: whether to add the `mlrun/scrape-metrics` label to this run's resources
2 changes: 1 addition & 1 deletion mlrun/runtimes/function.py
@@ -1082,7 +1082,7 @@ def _resolve_invocation_url(self, path, force_external_address):

# internal / external invocation urls is a nuclio >= 1.6.x feature
# try to infer the invocation url from the internal and if not exists, use external.
-# $$$$ we do not want to use the external invocation url (e.g.: ingress, nodePort, etc)
+# $$$$ we do not want to use the external invocation url (e.g.: ingress, nodePort, etc.)
if (
not force_external_address
and self.status.internal_invocation_urls
