[DOCS] Updates ML/anomaly detection terms in the Kibana guide (#41965)
szabosteve committed Jul 30, 2019
1 parent 1965dfa commit 7481c2c
Showing 4 changed files with 54 additions and 51 deletions.
1 change: 1 addition & 0 deletions docs/ml/creating-df-kib.asciidoc
@@ -1,3 +1,4 @@
[role="xpack"]
[[creating-df-kib]]
== Creating {dataframe-transforms}

14 changes: 7 additions & 7 deletions docs/ml/creating-jobs.asciidoc
@@ -1,8 +1,8 @@
[role="xpack"]
[[ml-jobs]]
== Creating machine learning jobs
== Creating {anomaly-jobs}

Machine learning jobs contain the configuration information and metadata
{anomaly-jobs-cap} contain the configuration information and metadata
necessary to perform an analytics task.

{kib} provides the following wizards to make it easier to create jobs:
@@ -33,7 +33,7 @@ appears:
[role="screenshot"]
image::ml/images/ml-data-recognizer-sample.jpg[A screenshot of the {kib} sample data web log job creation wizard]

TIP: Alternatively, after you load a sample data set on the {kib} home page, you can click *View data* > *ML jobs*. There are {ml} jobs for both the sample eCommerce orders data set and the sample web logs data set.
TIP: Alternatively, after you load a sample data set on the {kib} home page, you can click *View data* > *ML jobs*. There are {anomaly-jobs} for both the sample eCommerce orders data set and the sample web logs data set.

If you use {filebeat-ref}/index.html[{filebeat}]
to ship access logs from your
@@ -57,17 +57,17 @@ wizards appear:
[role="screenshot"]
image::ml/images/ml-data-recognizer-metricbeat.jpg[A screenshot of the {metricbeat} job creation wizards]

These wizards create {ml} jobs, dashboards, searches, and visualizations that
are customized to help you analyze your {auditbeat}, {filebeat}, and
These wizards create {anomaly-jobs}, dashboards, searches, and visualizations
that are customized to help you analyze your {auditbeat}, {filebeat}, and
{metricbeat} data.

[NOTE]
===============================
If your data is located outside of {es}, you cannot use {kib} to create
your jobs and you cannot use {dfeeds} to retrieve your data in real time.
Machine learning analysis is still possible, however, by using APIs to
{anomaly-detect-cap} is still possible, however, by using APIs to
create and manage jobs and post data to them. For more information, see
{ref}/ml-apis.html[Machine Learning APIs].
{ref}/ml-apis.html[{ml-cap} {anomaly-detect} APIs].
===============================
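
For illustration, a minimal workflow against those APIs might look like the
following sketch. The job ID, the `responsetime` and `timestamp` field names,
and the sample document are placeholders, not values from the sample data sets:

[source,console]
----
PUT _ml/anomaly_detectors/example-response-times
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      { "function": "mean", "field_name": "responsetime" }
    ]
  },
  "data_description": { "time_field": "timestamp" }
}

POST _ml/anomaly_detectors/example-response-times/_open

POST _ml/anomaly_detectors/example-response-times/_data
{"timestamp": "2019-07-30T00:00:00Z", "responsetime": 123.4}
----

The first request creates the job, the second opens it, and the third posts a
single document to it; in practice you would send data in larger batches.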

////
47 changes: 24 additions & 23 deletions docs/ml/index.asciidoc
@@ -1,35 +1,36 @@
[role="xpack"]
[[xpack-ml]]
= Machine Learning
= {ml-cap}

[partintro]
--

As datasets increase in size and complexity, the human effort required to
inspect dashboards or maintain rules for spotting infrastructure problems,
cyber attacks, or business issues becomes impractical. The Elastic {ml-features}
automatically model the normal behavior of your time series data — learning
trends, periodicity, and more — in real time to identify anomalies, streamline
root cause analysis, and reduce false positives.
cyber attacks, or business issues becomes impractical. The Elastic {ml}
{anomaly-detect} feature automatically models the normal behavior of your time
series data — learning trends, periodicity, and more — in real time to identify
anomalies, streamline root cause analysis, and reduce false positives.

The {ml-features} run in and scale with {es}, and include an
intuitive UI on the {kib} *Machine Learning* page for creating anomaly detection
jobs and understanding results.
{anomaly-detect-cap} runs in and scales with {es}, and includes an
intuitive UI on the {kib} *Machine Learning* page for creating {anomaly-jobs}
and understanding results.

If you have a basic license, you can use the *Data Visualizer* to learn more
about your data. In particular, if your data is stored in {es} and contains a
time field, you can use the *Data Visualizer* to identify possible fields for
{ml} analysis:
{anomaly-detect}:

[role="screenshot"]
image::ml/images/ml-data-visualizer-sample.jpg[Data Visualizer for sample flight data]

experimental[] You can also upload a CSV, NDJSON, or log file (up to 100 MB in size).
The {ml-features} identify the file format and field mappings. You can then
optionally import that data into an {es} index.
experimental[] You can also upload a CSV, NDJSON, or log file (up to 100 MB in
size). The *Data Visualizer* identifies the file format and field mappings. You
can then optionally import that data into an {es} index.

If you have a trial or platinum license, you can <<ml-jobs,create {ml} jobs>>
and manage jobs and {dfeeds} from the *Job Management* pane:
If you have a trial or platinum license, you can
<<ml-jobs,create {anomaly-jobs}>> and manage jobs and {dfeeds} from the *Job
Management* pane:

[role="screenshot"]
image::ml/images/ml-job-management.jpg[Job Management]
@@ -42,7 +43,7 @@ You can use the *Settings* pane to create and edit
image::ml/images/ml-settings.jpg[Calendar Management]

The *Anomaly Explorer* and *Single Metric Viewer* display the results of your
{ml} jobs. For example:
{anomaly-jobs}. For example:

[role="screenshot"]
image::ml/images/ml-single-metric-viewer.jpg[Single Metric Viewer]
@@ -56,17 +57,17 @@ occurring in your operational environment at that time:
image::ml/images/ml-annotations-list.jpg[Single Metric Viewer with annotations]

In some circumstances, annotations are also added automatically. For example, if
the {ml} analytics detect that there is missing data, it annotates the affected
the {anomaly-job} detects that there is missing data, it annotates the affected
time period. For more information, see
{stack-ov}/ml-delayed-data-detection.html[Handling delayed data].
The *Job Management* pane shows the full list of annotations for each job.
{stack-ov}/ml-delayed-data-detection.html[Handling delayed data]. The
*Job Management* pane shows the full list of annotations for each job.

NOTE: The {kib} {ml-features} use pop-ups. You must configure your
web browser so that it does not block pop-up windows or create an exception for
your {kib} URL.
NOTE: The {kib} {ml-features} use pop-ups. You must configure your web
browser so that it does not block pop-up windows or create an exception for your
{kib} URL.

For more information about {ml}, see
{stack-ov}/xpack-ml.html[Machine learning in the {stack}].
For more information about the {anomaly-detect} feature, see
{stack-ov}/xpack-ml.html[{ml-cap} {anomaly-detect}].

--

43 changes: 22 additions & 21 deletions docs/ml/job-tips.asciidoc
@@ -5,24 +5,25 @@
<titleabbrev>Job tips</titleabbrev>
++++

When you are creating a job in {kib}, the job creation wizards can provide
advice based on the characteristics of your data. By heeding these suggestions,
you can create jobs that are more likely to produce insightful {ml} results.
When you create an {anomaly-job} in {kib}, the job creation wizards can
provide advice based on the characteristics of your data. By heeding these
suggestions, you can create jobs that are more likely to produce insightful {ml}
results.

[[bucket-span]]
==== Bucket span

The bucket span is the time interval that {ml} analytics use to summarize and
model data for your job. When you create a job in {kib}, you can choose to
estimate a bucket span value based on your data characteristics.
model data for your job. When you create an {anomaly-job} in {kib}, you can
choose to estimate a bucket span value based on your data characteristics.

NOTE: The bucket span must contain a valid time interval. For more information,
see {ref}/ml-job-resource.html#ml-analysisconfig[Analysis configuration objects].

If you choose a value that is larger than one day or is significantly different
than the estimated value, you receive an informational message. For more
information about choosing an appropriate bucket span, see
{xpack-ref}/ml-buckets.html[Buckets].
{stack-ov}/ml-buckets.html[Buckets].
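
For reference, the bucket span is the `bucket_span` property in the job's
`analysis_config`. A minimal sketch, with an illustrative job ID and a
30 minute interval:

[source,console]
----
PUT _ml/anomaly_detectors/example-bucket-span
{
  "analysis_config": {
    "bucket_span": "30m",
    "detectors": [ { "function": "count" } ]
  },
  "data_description": { "time_field": "timestamp" }
}
----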

[[cardinality]]
==== Cardinality
@@ -40,14 +41,14 @@ job uses more memory resources. In particular, if the cardinality of the
Likewise if you are performing population analysis and the cardinality of the
`over_field_name` is below 10, you are advised that this might not be a suitable
field to use. For more information, see
{xpack-ref}/ml-configuring-pop.html[Performing Population Analysis].
{stack-ov}/ml-configuring-pop.html[Performing Population Analysis].
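
As a sketch of the population case, a detector that models entities against
the rest of the population sets `over_field_name`; the job ID and the
`clientip` field below are illustrative:

[source,console]
----
PUT _ml/anomaly_detectors/example-population
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      { "function": "high_count", "over_field_name": "clientip" }
    ]
  },
  "data_description": { "time_field": "timestamp" }
}
----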

[[detectors]]
==== Detectors

Each job must have one or more _detectors_. A detector applies an analytical
function to specific fields in your data. If your job does not contain a
detector or the detector does not contain a
Each {anomaly-job} must have one or more _detectors_. A detector applies an
analytical function to specific fields in your data. If your job does not
contain a detector or the detector does not contain a
{stack-ov}/ml-functions.html[valid function], you receive an error.
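
For example, a detector that applies the `sum` function to an illustrative
`bytes` field, split by an illustrative `host` field, might be configured as
in the following sketch:

[source,console]
----
PUT _ml/anomaly_detectors/example-detectors
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      { "function": "sum", "field_name": "bytes", "by_field_name": "host" }
    ]
  },
  "data_description": { "time_field": "timestamp" }
}
----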

If a job contains duplicate detectors, you also receive an error. Detectors are
@@ -57,9 +58,9 @@ duplicates if they have the same `function`, `field_name`, `by_field_name`,
[[influencers]]
==== Influencers

When you create a job, you can specify _influencers_, which are also sometimes
referred to as _key fields_. Picking an influencer is strongly recommended for
the following reasons:
When you create an {anomaly-job}, you can specify _influencers_, which are also
sometimes referred to as _key fields_. Picking an influencer is strongly
recommended for the following reasons:

* It allows you to more easily assign blame for the anomaly
* It simplifies and aggregates the results
@@ -78,11 +79,11 @@ The job creation wizards in {kib} can suggest which fields to use as influencers
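
As a sketch, influencers are declared as a list of field names in the
`analysis_config`; the job ID and the `clientip` field below are illustrative:

[source,console]
----
PUT _ml/anomaly_detectors/example-influencers
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      { "function": "count", "partition_field_name": "clientip" }
    ],
    "influencers": [ "clientip" ]
  },
  "data_description": { "time_field": "timestamp" }
}
----
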
[[model-memory-limits]]
==== Model memory limits

For each job, you can optionally specify a `model_memory_limit`, which is the
approximate maximum amount of memory resources that are required for analytical
processing. The default value is 1 GB. Once this limit is approached, data
pruning becomes more aggressive. Upon exceeding this limit, new entities are not
modeled.
For each {anomaly-job}, you can optionally specify a `model_memory_limit`, which
is the approximate maximum amount of memory resources that are required for
analytical processing. The default value is 1 GB. Once this limit is approached,
data pruning becomes more aggressive. Upon exceeding this limit, new entities
are not modeled.
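
A sketch of setting this limit explicitly when you create a job; the job ID,
field name, and the `512mb` value are illustrative:

[source,console]
----
PUT _ml/anomaly_detectors/example-memory-limit
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [ { "function": "mean", "field_name": "responsetime" } ]
  },
  "analysis_limits": { "model_memory_limit": "512mb" },
  "data_description": { "time_field": "timestamp" }
}
----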

You can also optionally specify the `xpack.ml.max_model_memory_limit` setting.
By default, it's not set, which means there is no upper bound on the acceptable
@@ -92,9 +93,9 @@ TIP: If you set the `model_memory_limit` too high, it will be impossible to open
the job; jobs cannot be allocated to nodes that have insufficient memory to run
them.

If the estimated model memory limit for a job is greater than the model memory
limit for the job or the maximum model memory limit for the cluster, the job
creation wizards in {kib} generate a warning. If the estimated memory
If the estimated model memory limit for an {anomaly-job} is greater than the
model memory limit for the job or the maximum model memory limit for the cluster,
the job creation wizards in {kib} generate a warning. If the estimated memory
requirement is only a little higher than the `model_memory_limit`, the job will
probably produce useful results. Otherwise, the actions you take to address
these warnings vary depending on the resources available in your cluster:
