From 72c7207adb07e4108fb377b27c832ba55ced44d5 Mon Sep 17 00:00:00 2001 From: Benjamin Ironside Goldstein Date: Mon, 4 Nov 2024 17:53:58 -0800 Subject: [PATCH 1/4] Updates automatic import guide --- docs/getting-started/automatic-import.asciidoc | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/docs/getting-started/automatic-import.asciidoc b/docs/getting-started/automatic-import.asciidoc index d44f6df768..b7701f73ad 100644 --- a/docs/getting-started/automatic-import.asciidoc +++ b/docs/getting-started/automatic-import.asciidoc @@ -19,9 +19,9 @@ TIP: Click https://elastic.navattic.com/automatic-import[here] to access an inte .Requirements [sidebar] -- -- A working <>. Automatic Import currently works with all variants of Claude 3. Other models are not supported in this technical preview, but will be supported in future versions. +- A working <>. Recommended models: `Claude 3.5 Sonnet`; `GPT-4o`; `Gemini-1.5-pro-002`. - An https://www.elastic.co/pricing[Enterprise] subscription. -- A sample of the data you want to import, in JSON or NDJSON format. +- A sample of the data you want to import, in a structured or unstructured format (including JSON, NDJSON, and Syslog). -- IMPORTANT: Using Automatic Import allows users to create new third-party data integrations through the use of third-party generative AI models (“GAI models”). Any third-party GAI models that you choose to use are owned and operated by their respective providers. Elastic does not own or control these third-party GAI models, nor does it influence their design, training, or data-handling practices. Using third-party GAI models with Elastic solutions, and using your data with third-party GAI models is at your discretion. Elastic bears no responsibility or liability for the content, operation, or use of these third-party GAI models, nor for any potential loss or damage arising from their use. 
Users are advised to exercise caution when using GAI models with personal, sensitive, or confidential information, as data submitted may be used to train the models or for other purposes. Elastic recommends familiarizing yourself with the development practices and terms of use of any third-party GAI models before use. You are responsible for ensuring that your use of Automatic Import complies with the terms and conditions of any third-party platform you connect with. @@ -40,15 +40,14 @@ image::images/auto-import-create-new-integration-button.png[The Integrations pag 6. Define your integration's package name, which will prefix the imported event fields. 7. Define your **Data stream title**, **Data stream description**, and **Data stream name**. These fields appear on the integration's configuration page to help identify the data stream it writes to. 8. Select your {filebeat-ref}/configuration-filebeat-options.html[**Data collection method**]. This determines how your new integration will ingest the data (for example, from an S3 bucket, an HTTP endpoint, or a file stream). -9. Upload a sample of your data in JSON or NDJSON format. Make sure to include all the types of events that you want the new integration to handle. +9. Upload a sample of your data. Make sure to include all the types of events that you want the new integration to handle. + .Best practices for sample data [sidebar] -- -- The file extension (`.JSON` or `.NDJSON`) must match the file format. -- Only the first 10 events in the sample are analyzed. In this technical preview, additional data is truncated. -- Ensure each JSON or NDJSON object represents an event, and avoid deeply nested object structures. -- The more variety in your sample, the more accurate the pipeline will be (for example, include 10 unique log entries instead of the same type of entry 10 times). +- The file extension should match the file format. 
+- Each object in your sample should represent an event, and you should avoid deeply nested object structures. +- The more variety in your sample, the more accurate the pipeline will be. Include a wide range of unique log entries instead of just repeating the same type of entry. - Ideally, each field name should describe what the field does. -- + From e31c750ec9e4ab47e635942e058c81a603bbbf44 Mon Sep 17 00:00:00 2001 From: Benjamin Ironside Goldstein Date: Mon, 4 Nov 2024 17:57:33 -0800 Subject: [PATCH 2/4] adds info about 100 event limit for analysis --- docs/getting-started/automatic-import.asciidoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/getting-started/automatic-import.asciidoc b/docs/getting-started/automatic-import.asciidoc index b7701f73ad..aa9595d478 100644 --- a/docs/getting-started/automatic-import.asciidoc +++ b/docs/getting-started/automatic-import.asciidoc @@ -47,7 +47,7 @@ image::images/auto-import-create-new-integration-button.png[The Integrations pag -- - The file extension should match the file format. - Each object in your sample should represent an event, and you should avoid deeply nested object structures. -- The more variety in your sample, the more accurate the pipeline will be. Include a wide range of unique log entries instead of just repeating the same type of entry. +- The more variety in your sample, the more accurate the pipeline will be. Include a wide range of unique log entries instead of just repeating the same type of entry. Automatic Import will select up to 100 different events from your sample to use as the basis for the new integration. - Ideally, each field name should describe what the field does. 
-- + From cec9f777efd8ad5f98a58a2cf5559b6d77f063df Mon Sep 17 00:00:00 2001 From: Benjamin Ironside Goldstein Date: Mon, 4 Nov 2024 21:03:48 -0800 Subject: [PATCH 3/4] Replace reference to Amazon Bedrock with generic LLM --- docs/getting-started/automatic-import.asciidoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/getting-started/automatic-import.asciidoc b/docs/getting-started/automatic-import.asciidoc index aa9595d478..4ddfaa2a45 100644 --- a/docs/getting-started/automatic-import.asciidoc +++ b/docs/getting-started/automatic-import.asciidoc @@ -35,7 +35,7 @@ IMPORTANT: Using Automatic Import allows users to create new third-party data in image::images/auto-import-create-new-integration-button.png[The Integrations page with the Create new integration button highlighted] + 3. Click **Create integration**. -4. Select an <>. +4. Select an <>. 5. Define how your new integration will appear on the Integrations page by providing a **Title**, **Description**, and **Logo**. Click **Next**. 6. Define your integration's package name, which will prefix the imported event fields. 7. Define your **Data stream title**, **Data stream description**, and **Data stream name**. These fields appear on the integration's configuration page to help identify the data stream it writes to. From 9ab083f48e84c4c6a71db64c9f3e28d2ce4b2f4a Mon Sep 17 00:00:00 2001 From: Benjamin Ironside Goldstein <91905639+benironside@users.noreply.github.com> Date: Wed, 6 Nov 2024 13:24:25 -0800 Subject: [PATCH 4/4] Apply suggestions from code review Starts incorporating technical reviews! 
Co-authored-by: Eric Beahan --- docs/getting-started/automatic-import.asciidoc | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/docs/getting-started/automatic-import.asciidoc b/docs/getting-started/automatic-import.asciidoc index 4ddfaa2a45..9ff23fde5c 100644 --- a/docs/getting-started/automatic-import.asciidoc +++ b/docs/getting-started/automatic-import.asciidoc @@ -45,8 +45,7 @@ image::images/auto-import-create-new-integration-button.png[The Integrations pag .Best practices for sample data [sidebar] -- -- The file extension should match the file format. -- Each object in your sample should represent an event, and you should avoid deeply nested object structures. +- For JSON and NDJSON samples, each object in your sample should represent an event, and you should avoid deeply nested object structures. - The more variety in your sample, the more accurate the pipeline will be. Include a wide range of unique log entries instead of just repeating the same type of entry. Automatic Import will select up to 100 different events from your sample to use as the basis for the new integration. - Ideally, each field name should describe what the field does. --
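
The "Best practices for sample data" sidebar these patches revise can be illustrated with a short sketch. The snippet below checks a hypothetical NDJSON sample against the documented guidance: one flat object per line, varied entries rather than repeats, and at most 100 events analyzed (the limit added in PATCH 2/4). The field names in `SAMPLE_NDJSON`, the `MAX_DEPTH` threshold, and the `check_sample` helper are all illustrative assumptions, not part of Automatic Import itself.

```python
import json

# Hypothetical NDJSON sample: one event per line, flat structure, varied
# entries, as the "Best practices for sample data" sidebar recommends.
SAMPLE_NDJSON = """\
{"timestamp": "2024-11-04T17:53:58Z", "event_action": "login", "source_ip": "10.0.0.1", "outcome": "success"}
{"timestamp": "2024-11-04T17:55:12Z", "event_action": "logout", "source_ip": "10.0.0.1", "outcome": "success"}
{"timestamp": "2024-11-04T18:01:03Z", "event_action": "login", "source_ip": "10.0.0.7", "outcome": "failure"}
"""

MAX_ANALYZED_EVENTS = 100  # Automatic Import analyzes up to 100 distinct events
MAX_DEPTH = 2              # illustrative cutoff for "deeply nested" structures

def nesting_depth(value, depth=0):
    """Return the maximum nesting depth of dicts/lists in a parsed event."""
    if isinstance(value, dict):
        return max((nesting_depth(v, depth + 1) for v in value.values()),
                   default=depth + 1)
    if isinstance(value, list):
        return max((nesting_depth(v, depth + 1) for v in value),
                   default=depth + 1)
    return depth

def check_sample(ndjson_text):
    """Parse an NDJSON sample and report how it measures up to the guidance."""
    events = [json.loads(line) for line in ndjson_text.splitlines()
              if line.strip()]
    unique = {json.dumps(e, sort_keys=True) for e in events}
    return {
        "events": len(events),
        "unique_events": len(unique),
        "within_analysis_limit": len(events) <= MAX_ANALYZED_EVENTS,
        "max_depth": max(nesting_depth(e) for e in events),
        "deeply_nested": any(nesting_depth(e) > MAX_DEPTH for e in events),
    }
```

A sample that reports many unique events, a shallow `max_depth`, and a count inside the analysis limit follows the sidebar's recommendations; repeated identical lines would show up as a low `unique_events` count.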