[Docs] Fix typos (#1016)
gilad-shaham committed Jun 14, 2021
1 parent 617ff28 commit 9acf47a
Showing 39 changed files with 71 additions and 70 deletions.
2 changes: 1 addition & 1 deletion docs/feature-store/basic-demo.ipynb
@@ -1707,7 +1707,7 @@
"metadata": {},
"source": [
"## Get an Offline Feature Vector for Training\n",
"Example of combining features from 3 sources with time travel join of 3 tabels with **time travel**\n",
"Example of combining features from 3 sources with time travel join of 3 tables with **time travel**\n",
"\n",
"Specify a set of features and request the feature vector offline result as a dataframe"
]
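A hedged sketch of such an offline request (the vector and feature names below are assumptions, not the demo's):

```python
import mlrun.feature_store as fstore

# join features from several feature sets into one vector (time-travel aware)
vector = fstore.FeatureVector(
    "patient-vector",
    features=["early_sense.*", "labs.avg_bp", "measurements.hr_avg"],
)
resp = fstore.get_offline_features(vector)
df = resp.to_dataframe()  # the offline result as a pandas DataFrame
```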
@@ -313,7 +313,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"You can plot the graph to visalize the pipeline:"
"You can plot the graph to visualize the pipeline:"
]
},
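The call is roughly of this shape, assuming `fset` is the feature set under construction:

```python
# render the ingestion/transformation graph left-to-right, including targets
fset.plot(rankdir="LR", with_targets=True)
```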
{
@@ -1339,7 +1339,7 @@
"source": [
"### Define data validation & quality policy\n",
"\n",
"We can define validations on the feature level. For example, define here validation to check if the heart-rate value is between 0 and 220 and respitory rate is between 0 and 25."
"We can define validations on the feature level. For example, define here validation to check if the heart-rate value is between 0 and 220 and respiratory rate is between 0 and 25."
]
},
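A minimal sketch of such per-feature validation, assuming a feature set `fset` with features named `hr` and `rr`:

```python
from mlrun.features import MinMaxValidator

# flag out-of-range values during ingestion; severity controls how violations are reported
fset["hr"].validator = MinMaxValidator(min=0, max=220, severity="info")
fset["rr"].validator = MinMaxValidator(min=0, max=25, severity="info")
```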
{
@@ -1446,7 +1446,7 @@
"source": [
"### Define the Real-Time Pipeline\n",
"\n",
"Define the transoformation pipeline below. This is done just like the previous sections."
"Define the transformation pipeline below. This is done just like the previous sections."
]
},
{
@@ -2196,7 +2196,7 @@
"source": [
"## Done!\n",
"\n",
"You've completed the data ingestion & prepration. Proceed to [Part 2](02-create-training-model.ipynb) to train a model using these features."
"You've completed the data ingestion & preparation. Proceed to [Part 2](02-create-training-model.ipynb) to train a model using these features."
]
}
],
@@ -45,7 +45,7 @@
"## Create Feature Vector \n",
"In this section we will create our Feature Vector. \n",
"The Feature vector will have a `name` so we can reference to it later via the UI or our serving function, and a list of `features` from the available FeatureSets. We can add a feature from a feature set by adding `<FeatureSet>.<Feature>` to the list, or add `<FeatureSet>.*` to add all the FeatureSet's available features. \n",
"The `Label` is added explicitely from the available features so we will not look for it when serving in real-time (since it won't be available).\n",
"The `Label` is added explicitly from the available features so we will not look for it when serving in real-time (since it won't be available).\n",
"\n",
"By default, the first FeatureSet in the feature list will act as the spine. meaning that all the other features will be joined to it. \n",
"So for example, in this instance we use the early_sense sensor data as our spine, so for each early_sense event we will create produce a row in the resulted Feature Vector."
@@ -6,7 +6,7 @@
"source": [
"# Part 3: Serving\n",
"In this part we will user MLRun's **serving runtime** to deploy our trained models from the previous stage a `Voting Ensemble` using **max vote** logic. \n",
"We will also use MLRun's **Feature store** to receive the online **Feature Vector** we define in the preveious stage.\n",
"We will also use MLRun's **Feature store** to receive the online **Feature Vector** we define in the previous stage.\n",
"\n",
"We will:\n",
"- Define a model class to load our models, run preprocessing and predict on the data\n",
6 changes: 3 additions & 3 deletions docs/feature-store/feature-sets.md
@@ -94,13 +94,13 @@ print(quotes_set.get_stats_table())
## Ingest Data Into The Feature Store

Data can be ingested as a batch process either by running the ingest command on demand or as a scheduled job.
-The data source could be a DataFrame or files (e.g. csv, parquet). Files can be either local files residing on a volume (e.g. v3io) or remote (e.g. S3, Azure blob). If the user defines a transfomration graph then when running an ingestion process it runs the graph transformations, infers metadata and stats and writes the results to a target data store.
+The data source could be a DataFrame or files (e.g. csv, parquet). Files can be either local files residing on a volume (e.g. v3io) or remote (e.g. S3, Azure blob). If the user defines a transformation graph then when running an ingestion process it runs the graph transformations, infers metadata and stats and writes the results to a target data store.
When targets are not specified data is stored in the configured default targets (i.e. NoSQL for real-time and Parquet for offline).
Batch ingestion can be done locally (i.e. running as a python process in the Jupyter pod) or as an MLRun job.

### Ingest data (locally)

-Use FeatureSet to create the basic feature set definition and then the ingest method to run a simple ingestion "localy" in the jupyter notebook pod.
+Use FeatureSet to create the basic feature set definition and then the ingest method to run a simple ingestion "locally" in the jupyter notebook pod.


```python
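# A minimal local-ingest sketch, not the page's original snippet;
# assumes a pandas DataFrame `stocks_df` with a "ticker" column
import mlrun.feature_store as fstore

stocks_set = fstore.FeatureSet("stocks", entities=[fstore.Entity("ticker")])
df = fstore.ingest(stocks_set, stocks_df)  # runs in the notebook process
```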
@@ -165,5 +165,5 @@ By default the feature sets are stored as both parquet file for training and as
The parquet file is ideal for fetching large set of data for training while the key value is ideal for an online application as it supports low latency data retrieval based on key access. <br>

> **Note:** When working with Iguazio platform the default feature set storage location is under "Projects" container --> <project name>/fs/.. folder.
-the default location can be modified in mlrun config or specified per injest operation. the parquet/csv files can be stored in NFS, S3, Azure blob storage and on Iguazio DB/FS.
+the default location can be modified in mlrun config or specified per ingest operation. the parquet/csv files can be stored in NFS, S3, Azure blob storage and on Iguazio DB/FS.

2 changes: 1 addition & 1 deletion docs/feature-store/transformations.md
@@ -109,7 +109,7 @@ The following is an example for adding a simple `filter` to the graph, that will
quotes_set.graph.to("storey.Filter", "filter", _fn="(event['bid'] > 50)")
```

-In the example above, the parameter `_fn` denotes a callable expression that will be passed ot the `storey.Filter`
+In the example above, the parameter `_fn` denotes a callable expression that will be passed to the `storey.Filter`
class as the parameter `fn`. The callable parameter may also be a Python function, in which case there's no need for
parentheses around it. This call generates a step in the graph called `filter` which will call the expression provided
with the event being propagated through the graph as data is fed to the feature-set.
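For instance, a hedged sketch of the function form:

```python
# the same filter expressed as a Python callable instead of an expression string
def above_min_bid(event):
    return event["bid"] > 50

quotes_set.graph.to("storey.Filter", "filter", _fn=above_min_bid)
```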
2 changes: 1 addition & 1 deletion docs/howto/sklearn-project.ipynb
@@ -472,7 +472,7 @@
"source": [
"<b>Step 2:</b> Run the describe function as a Kubernetes job with specified parameters.\n",
"\n",
"> `mount_v3io()` vonnect our function to v3io shared file system and allow us to pass the data and get back the results (plots) directly to our notebook, we can choose other mount options to use NFS or object storage"
"> `mount_v3io()` connect our function to v3io shared file system and allow us to pass the data and get back the results (plots) directly to our notebook, we can choose other mount options to use NFS or object storage"
]
},
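A hedged sketch of such a run; the marketplace function and input names are assumptions:

```python
from mlrun import import_function, mount_v3io

describe = import_function("hub://describe").apply(mount_v3io())
describe.run(name="describe", inputs={"table": DATA_PATH})  # DATA_PATH is hypothetical
```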
{
4 changes: 2 additions & 2 deletions docs/hyper-params.ipynb
@@ -1072,7 +1072,7 @@
"source": [
"## Parallel Execution Over Containers\n",
"\n",
"When working with compute intensive or long running tasks we would like to run our iterations over a cluster of containers, on the same time we dont want to bring up too many containers and rather limit the number of parallel tasks.\n",
"When working with compute intensive or long running tasks we would like to run our iterations over a cluster of containers, on the same time we don't want to bring up too many containers and rather limit the number of parallel tasks.\n",
"\n",
"MLRun support distribution of the child runs over a Dask cluster, this is handled automatically by MLRun, the user only need to specify the Dask configuration and the level of parallelism. The execution can be controlled from the client/notebook, or can have a job (immediate or scheduled) which control the execution.\n",
"\n",
@@ -1157,7 +1157,7 @@
"source": [
"### Define the Parallel Work\n",
"\n",
"We set the `parallel_runs` attribute to indicate how many child tasks to run in parallel, and set the `dask_cluster_uri` to point to our dask cluster (if we dont set the cluster uri it will use dask local), we can also set the `teardown_dask` flag to indicate we want to free up all the dask resources after completion."
"We set the `parallel_runs` attribute to indicate how many child tasks to run in parallel, and set the `dask_cluster_uri` to point to our dask cluster (if we don't set the cluster uri it will use dask local), we can also set the `teardown_dask` flag to indicate we want to free up all the dask resources after completion."
]
},
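A sketch under stated assumptions (the selector metric and cluster URI are illustrative, and `fn` is an existing MLRun job function):

```python
from mlrun import new_task

grid = {"p1": [2, 4, 8], "p2": [10, 20]}
task = new_task(handler="handler").with_hyper_params(
    grid,
    selector="max.accuracy",                       # pick the child run with the best accuracy
    parallel_runs=4,                               # at most 4 children at a time
    dask_cluster_uri="db://default/dask-cluster",  # omit to fall back to local dask
    teardown_dask=True,                            # free the dask resources on completion
)
run = fn.run(task)
```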
{
4 changes: 2 additions & 2 deletions docs/runtimes/dask-mlrun.ipynb
@@ -22,7 +22,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set up the enviroment"
"## Set up the environment"
]
},
{
@@ -119,7 +119,7 @@
"source": [
"### Initialize the Dask Cluster\n",
"\n",
"When we request the dask cluster `client` attibute it will verify the cluster is up and running"
"When we request the dask cluster `client` attribute it will verify the cluster is up and running"
]
},
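Creating the cluster function and touching its `client` looks roughly like the following (names are assumptions):

```python
import mlrun

dask_cluster = mlrun.new_function("my-dask", kind="dask", image="mlrun/ml-models")
dask_cluster.spec.remote = True               # run scheduler/workers on the cluster
dask_cluster.with_requests(mem="2G", cpu="2")
client = dask_cluster.client                  # verifies the cluster is up, returns a dask client
```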
{
4 changes: 2 additions & 2 deletions docs/runtimes/functions.md
@@ -54,7 +54,7 @@ hyper-parameter or AutoML jobs.
Many of the runtimes support horizontal scaling, you can specify the number of `replicas` or the
min - max value range (for auto scaling in Dask or Nuclio). When scaling functions we use some high speed
messaging protocol and shared storage (volumes, objects, databases, or streams). MLRun runtimes
-handle the orchestration and monitoring of the distribured task.
+handle the orchestration and monitoring of the distributed task.

<img src="../_static/images/runtime-scaling.png" alt="runtime-scaling" width="400"/>

@@ -147,7 +147,7 @@ Run object has the following methods/properties:

In the function code signature we can add the `context` attribute (first), this provides us access to the
job metadata, parameters, inputs, secrets, and API for logging and monitoring our results.
-Alternatively if we dont run inside a function handler (e.g. in Python main or Notebook) we can obtain the `context`
+Alternatively if we don't run inside a function handler (e.g. in Python main or Notebook) we can obtain the `context`
object from the environment using the {py:func}`~mlrun.run.get_or_create_ctx` function.

example function and usage of the context object:
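A hedged sketch of both patterns:

```python
from mlrun import get_or_create_ctx

def handler(context, p1: int = 1):
    # context exposes params, inputs, secrets, and result logging
    context.logger.info("started training")
    context.log_result("accuracy", p1 * 0.1)

# outside a handler (plain Python or a notebook), fetch a context explicitly
ctx = get_or_create_ctx("my-local-run")
ctx.log_result("accuracy", 0.9)
```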
2 changes: 1 addition & 1 deletion docs/runtimes/horovod.ipynb
@@ -57,7 +57,7 @@
"Horovod Supports TensorFlow, Keras, PyTorch, and Apache MXNet.\n",
"\n",
"in MLRun we use Horovod with MPI in order to create cluster resources and allow for optimized networking. \n",
"**Note:** Horovd and MPI may use [NCCL](https://developer.nvidia.com/nccl) when applicable which may require some specific configuration arguments to run optimally.\n",
"**Note:** Horovod and MPI may use [NCCL](https://developer.nvidia.com/nccl) when applicable which may require some specific configuration arguments to run optimally.\n",
"\n",
"Horovod uses this MPI and NCCL concepts for distributed computation and messaging to quickly and easily synchronize between the different nodes or GPUs.\n",
"\n",
4 changes: 2 additions & 2 deletions docs/runtimes/mlrun_jobs.ipynb
@@ -226,7 +226,7 @@
"source": [
"#### _**Option 1: Using file volumes for artifacts**_\n",
"If your are using [Iguazio data science platform](https://www.iguazio.com/) use the `mount_v3io()` auto-mount modifier.<br>\n",
"if you use other k8s PVC volumes you can use the `mlrun.platforms.mount_pvc(..)` modifier with the requiered params.\n",
"if you use other k8s PVC volumes you can use the `mlrun.platforms.mount_pvc(..)` modifier with the required params.\n",
"\n",
"We will use the `auto_mount()` modifier which auto selects between k8s PVC volume or Iguazio data fabric, you can set the PVC volume config via env var below or via the auto_mount params:\n",
"```\n",
@@ -906,7 +906,7 @@
"\n",
"KubeFlow pipelines are used for workflow automation--we compose a graph of functions and specify parameters, inputs and outputs.\n",
"\n",
"As ilustrated below, we can chain the outputs and inputs of the pipeline steps."
"As illustrated below, we can chain the outputs and inputs of the pipeline steps."
]
},
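A hedged sketch of such chaining; the functions (`prep_fn`, `train_fn`, `serve_fn`) and artifact names are assumptions:

```python
from kfp import dsl

@dsl.pipeline(name="demo-pipeline")
def kfpipeline(data_url: str = ""):
    # each MLRun function becomes a step; one step's outputs feed later inputs
    prep = prep_fn.as_step(name="prep", inputs={"raw": data_url}, outputs=["cleaned"])
    train = train_fn.as_step(name="train", inputs={"dataset": prep.outputs["cleaned"]}, outputs=["model"])
    serve_fn.deploy_step(models={"classifier": train.outputs["model"]})
```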
{
2 changes: 1 addition & 1 deletion docs/runtimes/spark-operator.ipynb
@@ -14,7 +14,7 @@
"\n",
"Kubernetes takes this request and starts the Spark driver in a Kubernetes pod (a k8s abstraction, just a docker container in this case). The Spark driver can then directly talk back to the Kubernetes master to request executor pods, scaling them up and down at runtime according to the load if dynamic allocation is enabled. Kubernetes takes care of the bin-packing of the pods onto Kubernetes nodes (the physical VMs), and will dynamically scale the various node pools to meet the requirements.\n",
"\n",
"When using Spark operator the resources will be allocated per task, means scale down to zero when the tesk is done.\n"
"When using Spark operator the resources will be allocated per task, means scale down to zero when the task is done.\n"
]
},
{
2 changes: 1 addition & 1 deletion docs/serving/distributed-graph.ipynb
@@ -579,7 +579,7 @@
"source": [
"**listen on the output stream**\n",
"\n",
"users can use the SDK or CLI to listen on the output stream (should be done in a seperate console/notebook), run:\n",
"users can use the SDK or CLI to listen on the output stream (should be done in a separate console/notebook), run:\n",
"\n",
" mlrun watch-stream v3io:///users/admin/out-stream -j\n",
"\n",
4 changes: 2 additions & 2 deletions docs/serving/graph-example.ipynb
@@ -131,8 +131,8 @@
"\n",
"We use the `graph.error_handler()` (apply to all states) or `state.error_handler()` (apply to a specific state) if we want the error from the graph or the state to be fed into a specific state (catcher)\n",
"\n",
"We can specify which state is the responder (returns the HTTP response) using the `state.respond()` method,\n",
"if we dont specify the responder the graph will be non-blocking."
"We can specify which state is the responder (returns the HTTP response) using the `state.respond()` method.\n",
"If we don't specify the responder the graph will be non-blocking."
]
},
{
6 changes: 3 additions & 3 deletions docs/serving/model-api.md
@@ -3,7 +3,7 @@
## Creating Custom Model Serving Class

Model serving classes implement the full model serving functionality which include
-loading models, pre and post processing, prediction, explanability, and model monitoring.
+loading models, pre and post processing, prediction, explainability, and model monitoring.

Model serving classes must inherit from `mlrun.serving.V2ModelServer`, and at the minimum
implement the `load()` (download the model file(s) and load the model into memory)
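At minimum that looks roughly like the following (class name and file suffix are assumptions):

```python
from cloudpickle import load
import mlrun.serving

class MyClassifier(mlrun.serving.V2ModelServer):
    def load(self):
        # fetch the model artifact and deserialize it into memory
        model_file, extra_data = self.get_model(".pkl")
        self.model = load(open(model_file, "rb"))

    def predict(self, body: dict) -> list:
        # body["inputs"] carries the request's feature rows
        return self.model.predict(body["inputs"]).tolist()
```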
@@ -67,7 +67,7 @@ and should return the specified response object.

### explain() method

-the explain method provides a hook for model explanability, and is accessed using the `/explain` operation. .
+the explain method provides a hook for model explainability, and is accessed using the `/explain` operation. .

### pre/post and validate hooks

@@ -115,7 +115,7 @@ see `.add_model()` docstring for help and parameters

> See the full [Model Server example](https://github.com/mlrun/functions/blob/master/v2_model_server/v2_model_server.ipynb).
-If we want to use multiple versions for the same model, we use `:` to seperate the name from the version,
+If we want to use multiple versions for the same model, we use `:` to separate the name from the version,
e.g. if the name is `mymodel:v2` it means model name `mymodel` version `v2`.

User should specify the `model_path` (url of the model artifact/dir) and the `class_name` name
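For example (paths and class name are hypothetical):

```python
# two versions of the same model, served side by side
serving_fn.add_model("mymodel:v1", model_path="models/v1/", class_name="MyClassifier")
serving_fn.add_model("mymodel:v2", model_path="models/v2/", class_name="MyClassifier")
```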
9 changes: 4 additions & 5 deletions docs/serving/serving-graph.md
@@ -150,8 +150,7 @@ We use the `graph.error_handler()` (apply to all steps) or `step.error_handler()
(apply to a specific step) if we want the error from the graph or the step to be
fed into a specific step (catcher)

-We can specify which step is the responder (returns the HTTP response) using the `step.respond()` method,
-if we dont specify the responder the graph will be non-blocking.
+We can specify which step is the responder (returns the HTTP response) using the `step.respond()` method.If we don't specify the responder the graph will be non-blocking.

```python
# use built-in storey class or our custom Echo class to create and link Task steps
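# a hedged sketch (Echo stands in for a custom step class defined elsewhere)
graph = fn.set_topology("flow", engine="async")
graph.to("storey.Extend", "enrich", _fn="({'tag': 'demo'})") \
     .to(name="echo", class_name="Echo").respond()  # respond() marks the HTTP responder
```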
@@ -352,15 +351,15 @@ if the class init args contain `context` or `name`, those will be initialize wit
[graph context](#graph-context-and-event-objects) and the step name.

the class_name and handler specify a class/function name in the `globals()` (i.e. this module) by default
-or those can be full paths to the class (mudule.submodul.class), e.g. `storey.WriteToParquet`.
+or those can be full paths to the class (module.submodule.class), e.g. `storey.WriteToParquet`.
users can also pass the module as an argument to functions such as `function.to_mock_server(namespace=module)`,
in this case the class or handler names will also be searched in the provided module.

when using classes the class event handler will be invoked on every event with the `event.body`
if the Task step `full_event` parameter is set to `True` the handler will be invoked and return
-the full `event` object. if we dont specify the class event handler it will invoke the class `do()` method.
+the full `event` object. If we don't specify the class event handler it will invoke the class `do()` method.

-if you need to implement async behaviour you should subclass `storey.MapClass`.
+if you need to implement async behavior you should subclass `storey.MapClass`.
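For instance (the module name is an assumption):

```python
import my_steps  # hypothetical module holding the step classes/handlers

server = fn.to_mock_server(namespace=my_steps)
resp = server.test("/", body={"x": 1})  # exercise the graph locally
```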


### Building distributed graphs
5 changes: 3 additions & 2 deletions docs/store/artifacts.md
@@ -2,9 +2,10 @@
# Artifacts and Versioning <!-- omit in toc -->

- [Overview](#overview)
- [Artifact Path](#artifact-path)
- [Artifact URIs, Metadata and Versioning](#artifact-uris-metadata-and-versioning)
- [Datasets](#datasets)
- [Logging a Dataset From a Job](#logging-a-dataset-from-a-job)
- [Models](./models.md)
- [Plots](#plots)

## Overview
Expand Down Expand Up @@ -48,7 +49,7 @@ artifacts directory in the current active directory (./artifacts)
set_environment(project=project_name, artifact_path='./artifacts')

```{admonition} For Iguazio Platform Users
-In the Iguazio Data Science Patform, the default artifacts path is a <project name>/artifacts directory in the
+In the Iguazio Data Science Platform, the default artifacts path is a <project name>/artifacts directory in the
predefined “projects” data container — /v3io/projects/<project name>/artifacts
(for example, /v3io/projects/myproject/artifacts for a “myproject” project).
```
2 changes: 1 addition & 1 deletion docs/store/datastore.md
@@ -31,7 +31,7 @@ or project/job context secrets
## DataItem Object

When we run jobs or pipelines we pass data using the {py:class}`~mlrun.datastore.DataItem` objects, think of them as smart
-data pointers which abstract away the data store specific behaviour.
+data pointers which abstract away the data store specific behavior.

Example function:
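A minimal sketch of such a function (not the page's exact example):

```python
import mlrun

def handler(context, table: mlrun.DataItem):
    # DataItem abstracts the underlying store; read it into a DataFrame
    df = table.as_df()
    context.log_result("rows", len(df))
```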

2 changes: 1 addition & 1 deletion docs/tutorial/03-model-serving.ipynb
@@ -470,7 +470,7 @@
"source": [
"## Step 5: Viewing the Nuclio Serving Function on the Dashboard\n",
"\n",
"On the **Projects** dashboad page, select the project and then select \"Real-time functions (Nuclio)\"."
"On the **Projects** dashboard page, select the project and then select \"Real-time functions (Nuclio)\"."
]
},
{
4 changes: 2 additions & 2 deletions examples/mlrun_dask.ipynb
@@ -137,7 +137,7 @@
"metadata": {},
"source": [
"## Build the function with extra packages\n",
"We can skip the build section if we dont add packages (instead need to specify the image e.g. `dsf.spec.image='mlrun/ml-models'` which contains most of the packages you may need) "
"We can skip the build section if we don't add packages (instead need to specify the image e.g. `dsf.spec.image='mlrun/ml-models'` which contains most of the packages you may need) "
]
},
{
@@ -684,4 +684,4 @@
},
"nbformat": 4,
"nbformat_minor": 4
-}
+}
