[Docs] Fix typos (#1016)
gilad-shaham committed Jun 14, 2021
1 parent 617ff28 commit 9acf47a
Showing 39 changed files with 71 additions and 70 deletions.
2 changes: 1 addition & 1 deletion docs/feature-store/basic-demo.ipynb
@@ -1707,7 +1707,7 @@
"metadata": {},
"source": [
"## Get an Offline Feature Vector for Training\n",
"Example of combining features from 3 sources with time travel join of 3 tabels with **time travel**\n",
"Example of combining features from 3 sources with time travel join of 3 tables with **time travel**\n",
"\n",
"Specify a set of features and request the feature vector offline result as a dataframe"
]
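A hedged sketch of such an offline request (the vector and feature names below are assumptions, not the demo's):

```python
import mlrun.feature_store as fstore

# join features from several feature sets into one vector (time-travel aware)
vector = fstore.FeatureVector(
    "patient-vector",
    features=["early_sense.*", "labs.avg_bp", "measurements.hr_avg"],
)
resp = fstore.get_offline_features(vector)
df = resp.to_dataframe()  # the offline result as a pandas DataFrame
```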
@@ -313,7 +313,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"You can plot the graph to visalize the pipeline:"
"You can plot the graph to visualize the pipeline:"
]
},
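The call is roughly of this shape, assuming `fset` is the feature set under construction:

```python
# render the ingestion/transformation graph left-to-right, including targets
fset.plot(rankdir="LR", with_targets=True)
```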
{
@@ -1339,7 +1339,7 @@
"source": [
"### Define data validation & quality policy\n",
"\n",
"We can define validations on the feature level. For example, define here validation to check if the heart-rate value is between 0 and 220 and respitory rate is between 0 and 25."
"We can define validations on the feature level. For example, define here validation to check if the heart-rate value is between 0 and 220 and respiratory rate is between 0 and 25."
]
},
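A minimal sketch of such per-feature validation, assuming a feature set `fset` with features named `hr` and `rr`:

```python
from mlrun.features import MinMaxValidator

# flag out-of-range values during ingestion; severity controls how violations are reported
fset["hr"].validator = MinMaxValidator(min=0, max=220, severity="info")
fset["rr"].validator = MinMaxValidator(min=0, max=25, severity="info")
```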
{
@@ -1446,7 +1446,7 @@
"source": [
"### Define the Real-Time Pipeline\n",
"\n",
"Define the transoformation pipeline below. This is done just like the previous sections."
"Define the transformation pipeline below. This is done just like the previous sections."
]
},
{
@@ -2196,7 +2196,7 @@
"source": [
"## Done!\n",
"\n",
"You've completed the data ingestion & prepration. Proceed to [Part 2](02-create-training-model.ipynb) to train a model using these features."
"You've completed the data ingestion & preparation. Proceed to [Part 2](02-create-training-model.ipynb) to train a model using these features."
]
}
],
@@ -45,7 +45,7 @@
"## Create Feature Vector \n",
"In this section we will create our Feature Vector. \n",
"The Feature vector will have a `name` so we can reference to it later via the UI or our serving function, and a list of `features` from the available FeatureSets. We can add a feature from a feature set by adding `<FeatureSet>.<Feature>` to the list, or add `<FeatureSet>.*` to add all the FeatureSet's available features. \n",
"The `Label` is added explicitely from the available features so we will not look for it when serving in real-time (since it won't be available).\n",
"The `Label` is added explicitly from the available features so we will not look for it when serving in real-time (since it won't be available).\n",
"\n",
"By default, the first FeatureSet in the feature list will act as the spine. meaning that all the other features will be joined to it. \n",
"So for example, in this instance we use the early_sense sensor data as our spine, so for each early_sense event we will create produce a row in the resulted Feature Vector."
@@ -6,7 +6,7 @@
"source": [
"# Part 3: Serving\n",
"In this part we will user MLRun's **serving runtime** to deploy our trained models from the previous stage a `Voting Ensemble` using **max vote** logic. \n",
"We will also use MLRun's **Feature store** to receive the online **Feature Vector** we define in the preveious stage.\n",
"We will also use MLRun's **Feature store** to receive the online **Feature Vector** we define in the previous stage.\n",
"\n",
"We will:\n",
"- Define a model class to load our models, run preprocessing and predict on the data\n",
6 changes: 3 additions & 3 deletions docs/feature-store/feature-sets.md
@@ -94,13 +94,13 @@ print(quotes_set.get_stats_table())
## Ingest Data Into The Feature Store

Data can be ingested as a batch process either by running the ingest command on demand or as a scheduled job.
-The data source could be a DataFrame or files (e.g. csv, parquet). Files can be either local files residing on a volume (e.g. v3io) or remote (e.g. S3, Azure blob). If the user defines a transfomration graph then when running an ingestion process it runs the graph transformations, infers metadata and stats and writes the results to a target data store.
+The data source could be a DataFrame or files (e.g. csv, parquet). Files can be either local files residing on a volume (e.g. v3io) or remote (e.g. S3, Azure blob). If the user defines a transformation graph then when running an ingestion process it runs the graph transformations, infers metadata and stats and writes the results to a target data store.
When targets are not specified data is stored in the configured default targets (i.e. NoSQL for real-time and Parquet for offline).
Batch ingestion can be done locally (i.e. running as a python process in the Jupyter pod) or as an MLRun job.

### Ingest data (locally)

-Use FeatureSet to create the basic feature set definition and then the ingest method to run a simple ingestion "localy" in the jupyter notebook pod.
+Use FeatureSet to create the basic feature set definition and then the ingest method to run a simple ingestion "locally" in the jupyter notebook pod.


```python
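# A minimal local-ingest sketch, not the page's original snippet;
# assumes a pandas DataFrame `stocks_df` with a "ticker" column
import mlrun.feature_store as fstore

stocks_set = fstore.FeatureSet("stocks", entities=[fstore.Entity("ticker")])
df = fstore.ingest(stocks_set, stocks_df)  # runs in the notebook process
```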
@@ -165,5 +165,5 @@ By default the feature sets are stored as both parquet file for training and as
The parquet file is ideal for fetching large set of data for training while the key value is ideal for an online application as it supports low latency data retrieval based on key access. <br>

> **Note:** When working with Iguazio platform the default feature set storage location is under "Projects" container --> <project name>/fs/.. folder.
-the default location can be modified in mlrun config or specified per injest operation. the parquet/csv files can be stored in NFS, S3, Azure blob storage and on Iguazio DB/FS.
+the default location can be modified in mlrun config or specified per ingest operation. the parquet/csv files can be stored in NFS, S3, Azure blob storage and on Iguazio DB/FS.

2 changes: 1 addition & 1 deletion docs/feature-store/transformations.md
@@ -109,7 +109,7 @@ The following is an example for adding a simple `filter` to the graph, that will
quotes_set.graph.to("storey.Filter", "filter", _fn="(event['bid'] > 50)")
```

-In the example above, the parameter `_fn` denotes a callable expression that will be passed ot the `storey.Filter`
+In the example above, the parameter `_fn` denotes a callable expression that will be passed to the `storey.Filter`
class as the parameter `fn`. The callable parameter may also be a Python function, in which case there's no need for
parentheses around it. This call generates a step in the graph called `filter` which will call the expression provided
with the event being propagated through the graph as data is fed to the feature-set.
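For instance, a hedged sketch of the function form:

```python
# the same filter expressed as a Python callable instead of an expression string
def above_min_bid(event):
    return event["bid"] > 50

quotes_set.graph.to("storey.Filter", "filter", _fn=above_min_bid)
```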
2 changes: 1 addition & 1 deletion docs/howto/sklearn-project.ipynb
@@ -472,7 +472,7 @@
"source": [
"<b>Step 2:</b> Run the describe function as a Kubernetes job with specified parameters.\n",
"\n",
"> `mount_v3io()` vonnect our function to v3io shared file system and allow us to pass the data and get back the results (plots) directly to our notebook, we can choose other mount options to use NFS or object storage"
"> `mount_v3io()` connect our function to v3io shared file system and allow us to pass the data and get back the results (plots) directly to our notebook, we can choose other mount options to use NFS or object storage"
]
},
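A hedged sketch of such a run; the marketplace function and input names are assumptions:

```python
from mlrun import import_function, mount_v3io

describe = import_function("hub://describe").apply(mount_v3io())
describe.run(name="describe", inputs={"table": DATA_PATH})  # DATA_PATH is hypothetical
```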
{
4 changes: 2 additions & 2 deletions docs/hyper-params.ipynb
@@ -1072,7 +1072,7 @@
"source": [
"## Parallel Execution Over Containers\n",
"\n",
"When working with compute intensive or long running tasks we would like to run our iterations over a cluster of containers, on the same time we dont want to bring up too many containers and rather limit the number of parallel tasks.\n",
"When working with compute intensive or long running tasks we would like to run our iterations over a cluster of containers, on the same time we don't want to bring up too many containers and rather limit the number of parallel tasks.\n",
"\n",
"MLRun support distribution of the child runs over a Dask cluster, this is handled automatically by MLRun, the user only need to specify the Dask configuration and the level of parallelism. The execution can be controlled from the client/notebook, or can have a job (immediate or scheduled) which control the execution.\n",
"\n",
@@ -1157,7 +1157,7 @@
"source": [
"### Define the Parallel Work\n",
"\n",
"We set the `parallel_runs` attribute to indicate how many child tasks to run in parallel, and set the `dask_cluster_uri` to point to our dask cluster (if we dont set the cluster uri it will use dask local), we can also set the `teardown_dask` flag to indicate we want to free up all the dask resources after completion."
"We set the `parallel_runs` attribute to indicate how many child tasks to run in parallel, and set the `dask_cluster_uri` to point to our dask cluster (if we don't set the cluster uri it will use dask local), we can also set the `teardown_dask` flag to indicate we want to free up all the dask resources after completion."
]
},
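A sketch under stated assumptions (the selector metric and cluster URI are illustrative, and `fn` is an existing MLRun job function):

```python
from mlrun import new_task

grid = {"p1": [2, 4, 8], "p2": [10, 20]}
task = new_task(handler="handler").with_hyper_params(
    grid,
    selector="max.accuracy",                       # pick the child run with the best accuracy
    parallel_runs=4,                               # at most 4 children at a time
    dask_cluster_uri="db://default/dask-cluster",  # omit to fall back to local dask
    teardown_dask=True,                            # free the dask resources on completion
)
run = fn.run(task)
```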
{
4 changes: 2 additions & 2 deletions docs/runtimes/dask-mlrun.ipynb
@@ -22,7 +22,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set up the enviroment"
"## Set up the environment"
]
},
{
@@ -119,7 +119,7 @@
"source": [
"### Initialize the Dask Cluster\n",
"\n",
"When we request the dask cluster `client` attibute it will verify the cluster is up and running"
"When we request the dask cluster `client` attribute it will verify the cluster is up and running"
]
},
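Creating the cluster function and touching its `client` looks roughly like the following (names are assumptions):

```python
import mlrun

dask_cluster = mlrun.new_function("my-dask", kind="dask", image="mlrun/ml-models")
dask_cluster.spec.remote = True               # run scheduler/workers on the cluster
dask_cluster.with_requests(mem="2G", cpu="2")
client = dask_cluster.client                  # verifies the cluster is up, returns a dask client
```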
{
4 changes: 2 additions & 2 deletions docs/runtimes/functions.md
@@ -54,7 +54,7 @@ hyper-parameter or AutoML jobs.
Many of the runtimes support horizontal scaling, you can specify the number of `replicas` or the
min - max value range (for auto scaling in Dask or Nuclio). When scaling functions we use some high speed
messaging protocol and shared storage (volumes, objects, databases, or streams). MLRun runtimes
-handle the orchestration and monitoring of the distribured task.
+handle the orchestration and monitoring of the distributed task.

<img src="../_static/images/runtime-scaling.png" alt="runtime-scaling" width="400"/>

@@ -147,7 +147,7 @@ Run object has the following methods/properties:

In the function code signature we can add the `context` attribute (first), this provides us access to the
job metadata, parameters, inputs, secrets, and API for logging and monitoring our results.
-Alternatively if we dont run inside a function handler (e.g. in Python main or Notebook) we can obtain the `context`
+Alternatively if we don't run inside a function handler (e.g. in Python main or Notebook) we can obtain the `context`
object from the environment using the {py:func}`~mlrun.run.get_or_create_ctx` function.

example function and usage of the context object:
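A hedged sketch of both patterns:

```python
from mlrun import get_or_create_ctx

def handler(context, p1: int = 1):
    # context exposes params, inputs, secrets, and result logging
    context.logger.info("started training")
    context.log_result("accuracy", p1 * 0.1)

# outside a handler (plain Python or a notebook), fetch a context explicitly
ctx = get_or_create_ctx("my-local-run")
ctx.log_result("accuracy", 0.9)
```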
2 changes: 1 addition & 1 deletion docs/runtimes/horovod.ipynb
@@ -57,7 +57,7 @@
"Horovod Supports TensorFlow, Keras, PyTorch, and Apache MXNet.\n",
"\n",
"in MLRun we use Horovod with MPI in order to create cluster resources and allow for optimized networking. \n",
"**Note:** Horovd and MPI may use [NCCL](https://developer.nvidia.com/nccl) when applicable which may require some specific configuration arguments to run optimally.\n",
"**Note:** Horovod and MPI may use [NCCL](https://developer.nvidia.com/nccl) when applicable which may require some specific configuration arguments to run optimally.\n",
"\n",
"Horovod uses this MPI and NCCL concepts for distributed computation and messaging to quickly and easily synchronize between the different nodes or GPUs.\n",
"\n",
4 changes: 2 additions & 2 deletions docs/runtimes/mlrun_jobs.ipynb
@@ -226,7 +226,7 @@
"source": [
"#### _**Option 1: Using file volumes for artifacts**_\n",
"If your are using [Iguazio data science platform](https://www.iguazio.com/) use the `mount_v3io()` auto-mount modifier.<br>\n",
"if you use other k8s PVC volumes you can use the `mlrun.platforms.mount_pvc(..)` modifier with the requiered params.\n",
"if you use other k8s PVC volumes you can use the `mlrun.platforms.mount_pvc(..)` modifier with the required params.\n",
"\n",
"We will use the `auto_mount()` modifier which auto selects between k8s PVC volume or Iguazio data fabric, you can set the PVC volume config via env var below or via the auto_mount params:\n",
"```\n",
@@ -906,7 +906,7 @@
"\n",
"KubeFlow pipelines are used for workflow automation--we compose a graph of functions and specify parameters, inputs and outputs.\n",
"\n",
"As ilustrated below, we can chain the outputs and inputs of the pipeline steps."
"As illustrated below, we can chain the outputs and inputs of the pipeline steps."
]
},
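A hedged sketch of such chaining; the functions (`prep_fn`, `train_fn`, `serve_fn`) and artifact names are assumptions:

```python
from kfp import dsl

@dsl.pipeline(name="demo-pipeline")
def kfpipeline(data_url: str = ""):
    # each MLRun function becomes a step; one step's outputs feed later inputs
    prep = prep_fn.as_step(name="prep", inputs={"raw": data_url}, outputs=["cleaned"])
    train = train_fn.as_step(name="train", inputs={"dataset": prep.outputs["cleaned"]}, outputs=["model"])
    serve_fn.deploy_step(models={"classifier": train.outputs["model"]})
```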
{
2 changes: 1 addition & 1 deletion docs/runtimes/spark-operator.ipynb
@@ -14,7 +14,7 @@
"\n",
"Kubernetes takes this request and starts the Spark driver in a Kubernetes pod (a k8s abstraction, just a docker container in this case). The Spark driver can then directly talk back to the Kubernetes master to request executor pods, scaling them up and down at runtime according to the load if dynamic allocation is enabled. Kubernetes takes care of the bin-packing of the pods onto Kubernetes nodes (the physical VMs), and will dynamically scale the various node pools to meet the requirements.\n",
"\n",
"When using Spark operator the resources will be allocated per task, means scale down to zero when the tesk is done.\n"
"When using Spark operator the resources will be allocated per task, means scale down to zero when the task is done.\n"
]
},
{
2 changes: 1 addition & 1 deletion docs/serving/distributed-graph.ipynb
@@ -579,7 +579,7 @@
"source": [
"**listen on the output stream**\n",
"\n",
"users can use the SDK or CLI to listen on the output stream (should be done in a seperate console/notebook), run:\n",
"users can use the SDK or CLI to listen on the output stream (should be done in a separate console/notebook), run:\n",
"\n",
" mlrun watch-stream v3io:///users/admin/out-stream -j\n",
"\n",
4 changes: 2 additions & 2 deletions docs/serving/graph-example.ipynb
@@ -131,8 +131,8 @@
"\n",
"We use the `graph.error_handler()` (apply to all states) or `state.error_handler()` (apply to a specific state) if we want the error from the graph or the state to be fed into a specific state (catcher)\n",
"\n",
"We can specify which state is the responder (returns the HTTP response) using the `state.respond()` method,\n",
"if we dont specify the responder the graph will be non-blocking."
"We can specify which state is the responder (returns the HTTP response) using the `state.respond()` method.\n",
"If we don't specify the responder the graph will be non-blocking."
]
},
{
6 changes: 3 additions & 3 deletions docs/serving/model-api.md
@@ -3,7 +3,7 @@
## Creating Custom Model Serving Class

Model serving classes implement the full model serving functionality which include
-loading models, pre and post processing, prediction, explanability, and model monitoring.
+loading models, pre and post processing, prediction, explainability, and model monitoring.

Model serving classes must inherit from `mlrun.serving.V2ModelServer`, and at the minimum
implement the `load()` (download the model file(s) and load the model into memory)
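At minimum that looks roughly like the following (class name and file suffix are assumptions):

```python
from cloudpickle import load
import mlrun.serving

class MyClassifier(mlrun.serving.V2ModelServer):
    def load(self):
        # fetch the model artifact and deserialize it into memory
        model_file, extra_data = self.get_model(".pkl")
        self.model = load(open(model_file, "rb"))

    def predict(self, body: dict) -> list:
        # body["inputs"] carries the request's feature rows
        return self.model.predict(body["inputs"]).tolist()
```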
@@ -67,7 +67,7 @@ and should return the specified response object.

### explain() method

-the explain method provides a hook for model explanability, and is accessed using the `/explain` operation. .
+the explain method provides a hook for model explainability, and is accessed using the `/explain` operation. .

### pre/post and validate hooks

@@ -115,7 +115,7 @@ see `.add_model()` docstring for help and parameters

> See the full [Model Server example](https://github.com/mlrun/functions/blob/master/v2_model_server/v2_model_server.ipynb).
-If we want to use multiple versions for the same model, we use `:` to seperate the name from the version,
+If we want to use multiple versions for the same model, we use `:` to separate the name from the version,
e.g. if the name is `mymodel:v2` it means model name `mymodel` version `v2`.

User should specify the `model_path` (url of the model artifact/dir) and the `class_name` name
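For example (paths and class name are hypothetical):

```python
# two versions of the same model, served side by side
serving_fn.add_model("mymodel:v1", model_path="models/v1/", class_name="MyClassifier")
serving_fn.add_model("mymodel:v2", model_path="models/v2/", class_name="MyClassifier")
```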
9 changes: 4 additions & 5 deletions docs/serving/serving-graph.md
@@ -150,8 +150,7 @@ We use the `graph.error_handler()` (apply to all steps) or `step.error_handler()
(apply to a specific step) if we want the error from the graph or the step to be
fed into a specific step (catcher)

-We can specify which step is the responder (returns the HTTP response) using the `step.respond()` method,
-if we dont specify the responder the graph will be non-blocking.
+We can specify which step is the responder (returns the HTTP response) using the `step.respond()` method.If we don't specify the responder the graph will be non-blocking.

```python
# use built-in storey class or our custom Echo class to create and link Task steps
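# a hedged sketch (Echo stands in for a custom step class defined elsewhere)
graph = fn.set_topology("flow", engine="async")
graph.to("storey.Extend", "enrich", _fn="({'tag': 'demo'})") \
     .to(name="echo", class_name="Echo").respond()  # respond() marks the HTTP responder
```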
@@ -352,15 +351,15 @@ if the class init args contain `context` or `name`, those will be initialize wit
[graph context](#graph-context-and-event-objects) and the step name.

the class_name and handler specify a class/function name in the `globals()` (i.e. this module) by default
-or those can be full paths to the class (mudule.submodul.class), e.g. `storey.WriteToParquet`.
+or those can be full paths to the class (module.submodule.class), e.g. `storey.WriteToParquet`.
users can also pass the module as an argument to functions such as `function.to_mock_server(namespace=module)`,
in this case the class or handler names will also be searched in the provided module.

when using classes the class event handler will be invoked on every event with the `event.body`
if the Task step `full_event` parameter is set to `True` the handler will be invoked and return
-the full `event` object. if we dont specify the class event handler it will invoke the class `do()` method.
+the full `event` object. If we don't specify the class event handler it will invoke the class `do()` method.

-if you need to implement async behaviour you should subclass `storey.MapClass`.
+if you need to implement async behavior you should subclass `storey.MapClass`.
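For instance (the module name is an assumption):

```python
import my_steps  # hypothetical module holding the step classes/handlers

server = fn.to_mock_server(namespace=my_steps)
resp = server.test("/", body={"x": 1})  # exercise the graph locally
```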


### Building distributed graphs
5 changes: 3 additions & 2 deletions docs/store/artifacts.md
@@ -2,9 +2,10 @@
# Artifacts and Versioning <!-- omit in toc -->

- [Overview](#overview)
- [Artifact Path](#artifact-path)
- [Artifact URIs, Metadata and Versioning](#artifact-uris-metadata-and-versioning)
- [Datasets](#datasets)
- [Logging a Dataset From a Job](#logging-a-dataset-from-a-job)
- [Models](./models.md)
- [Plots](#plots)

## Overview
Expand Down Expand Up @@ -48,7 +49,7 @@ artifacts directory in the current active directory (./artifacts)
set_environment(project=project_name, artifact_path='./artifacts')

```{admonition} For Iguazio Platform Users
-In the Iguazio Data Science Patform, the default artifacts path is a <project name>/artifacts directory in the
+In the Iguazio Data Science Platform, the default artifacts path is a <project name>/artifacts directory in the
predefined “projects” data container — /v3io/projects/<project name>/artifacts
(for example, /v3io/projects/myproject/artifacts for a “myproject” project).
```
2 changes: 1 addition & 1 deletion docs/store/datastore.md
@@ -31,7 +31,7 @@ or project/job context secrets
## DataItem Object

When we run jobs or pipelines we pass data using the {py:class}`~mlrun.datastore.DataItem` objects, think of them as smart
-data pointers which abstract away the data store specific behaviour.
+data pointers which abstract away the data store specific behavior.

Example function:
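A minimal sketch of such a function (not the page's exact example):

```python
import mlrun

def handler(context, table: mlrun.DataItem):
    # DataItem abstracts the underlying store; read it into a DataFrame
    df = table.as_df()
    context.log_result("rows", len(df))
```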

2 changes: 1 addition & 1 deletion docs/tutorial/03-model-serving.ipynb
@@ -470,7 +470,7 @@
"source": [
"## Step 5: Viewing the Nuclio Serving Function on the Dashboard\n",
"\n",
"On the **Projects** dashboad page, select the project and then select \"Real-time functions (Nuclio)\"."
"On the **Projects** dashboard page, select the project and then select \"Real-time functions (Nuclio)\"."
]
},
{
4 changes: 2 additions & 2 deletions examples/mlrun_dask.ipynb
@@ -137,7 +137,7 @@
"metadata": {},
"source": [
"## Build the function with extra packages\n",
"We can skip the build section if we dont add packages (instead need to specify the image e.g. `dsf.spec.image='mlrun/ml-models'` which contains most of the packages you may need) "
"We can skip the build section if we don't add packages (instead need to specify the image e.g. `dsf.spec.image='mlrun/ml-models'` which contains most of the packages you may need) "
]
},
{
@@ -684,4 +684,4 @@
},
"nbformat": 4,
"nbformat_minor": 4
-}
+}
