From e7ac58918039a005da67fb3ef4b5560a181ad791 Mon Sep 17 00:00:00 2001
From: Benjamin Ironside Goldstein
Date: Fri, 21 Mar 2025 08:58:15 -0700
Subject: [PATCH 1/7] updates LLM performance matrix

---
 .../llm-performance-matrix.asciidoc | 34 ++++++++++++-------
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/docs/AI-for-security/llm-performance-matrix.asciidoc b/docs/AI-for-security/llm-performance-matrix.asciidoc
index b4f3940bf6..58b7466bed 100644
--- a/docs/AI-for-security/llm-performance-matrix.asciidoc
+++ b/docs/AI-for-security/llm-performance-matrix.asciidoc
@@ -3,26 +3,36 @@
 
 This page describes the performance of various large language models (LLMs) for different use cases in {elastic-sec}, based on our internal testing. To learn more about these use cases, refer to <> or <>.
 
-NOTE: `Excellent` is the best rating, followed by `Great`, then by `Good`, and finally by `Poor`.
-
+IMPORTANT: `Excellent` is the best rating, followed by `Great`, then by `Good`, and finally by `Poor`. Models rated `Excellent` or `Great` should produce quality results. Models rated `Good` or `Poor` are not recommended for that use case.
 [discrete]
 == Proprietary models
 Models from third-party LLM providers.
 
 [cols="1,1,1,1,1,1,1", options="header"]
 |===
-| *Feature* | | *Assistant - General* | *Assistant - {esql} generation* | *Assistant - Alert questions* | *Assistant - Knowledge retrieval* | *Attack Discovery*
-| *Model* |*Claude 3: Opus* | Excellent | Excellent | Excellent | Good | Great
-| |*Claude 3.5: Sonnet v2*| Excellent | Excellent | Excellent | Excellent | Great
-| |*Claude 3.5: Sonnet* | Excellent| Excellent | Excellent | Excellent | Excellent
-| |*Claude 3.5: Haiku* | Excellent| Excellent | Excellent | Excellent | Poor
-| |*Claude 3: Haiku* | Excellent| Excellent | Excellent | Excellent | Poor
-| |*GPT-4o* | Excellent| Excellent | Excellent | Excellent | Great
-| |*GPT-4o-mini* | Excellent| Great | Great | Great | Poor
-| |**Gemini 1.5 Pro 002** | Excellent| Excellent | Excellent | Excellent | Excellent
-| |**Gemini 1.5 Flash 002**|Excellent| Poor | Good | Excellent | Poor
+| *Feature* | | *Assistant - General* | *Assistant - {esql} generation* | *Assistant - Alert questions* | *Assistant - Knowledge retrieval* | *Attack Discovery* | *AI-powered SIEM migration*
+| *Model* |*Claude 3: Opus* | Excellent | Excellent | Excellent | Good | Great | Good
+| |*Claude 3.7: Sonnet* | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent
+| |*Claude 3.5: Sonnet v2*| Excellent | Excellent | Excellent | Excellent | Great | Excellent
+| |*Claude 3.5: Sonnet* | Excellent| Excellent | Excellent | Excellent | Excellent | Excellent
+| |*Claude 3.5: Haiku* | Excellent| Excellent | Excellent | Excellent | Poor | Poor
+| |*Claude 3: Haiku* | Excellent| Excellent | Excellent | Excellent | Poor |Poor
+| |*GPT-4o* | Excellent| Excellent | Excellent | Excellent | Great |Great
+| |*GPT-4o-mini* | Excellent| Great | Great | Great | Poor |Good
+| |**Gemini 1.5 Pro 002** | Excellent| Excellent | Excellent | Excellent | Excellent | Great
+| |**Gemini 1.5 Flash 002**|Excellent| Poor | Good | Excellent | Poor | Excellent
 |===
 
+| Good
+| Excellent
+| Excellent
+| Excellent
+| Poor
+| Poor
+| Great
+| Good
+| Great
+| Excellent
 [discrete]
 == Open-source models
 Models you can <>.

From 6180579dc746e8da2a465653288d34a03a1c064f Mon Sep 17 00:00:00 2001
From: Benjamin Ironside Goldstein
Date: Fri, 21 Mar 2025 15:12:29 -0700
Subject: [PATCH 2/7] update

---
 docs/AI-for-security/llm-performance-matrix.asciidoc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/AI-for-security/llm-performance-matrix.asciidoc b/docs/AI-for-security/llm-performance-matrix.asciidoc
index 58b7466bed..8569430a94 100644
--- a/docs/AI-for-security/llm-performance-matrix.asciidoc
+++ b/docs/AI-for-security/llm-performance-matrix.asciidoc
@@ -4,6 +4,7 @@
 This page describes the performance of various large language models (LLMs) for different use cases in {elastic-sec}, based on our internal testing. To learn more about these use cases, refer to <> or <>.
 
 IMPORTANT: `Excellent` is the best rating, followed by `Great`, then by `Good`, and finally by `Poor`. Models rated `Excellent` or `Great` should produce quality results. Models rated `Good` or `Poor` are not recommended for that use case.
+
 [discrete]
 == Proprietary models
 Models from third-party LLM providers.
@@ -21,6 +22,7 @@ Models from third-party LLM providers.
 | |*GPT-4o-mini* | Excellent| Great | Great | Great | Poor |Good
 | |**Gemini 1.5 Pro 002** | Excellent| Excellent | Excellent | Excellent | Excellent | Great
 | |**Gemini 1.5 Flash 002**|Excellent| Poor | Good | Excellent | Poor | Excellent
+| |**Gemini 2.0 Flash 001**|Excellent| Excellent | Excellent | Excellent | Excellent | Excellent
 |===
 
 | Good

From f0f0df5b9f1fec04851cf9849d476112e9b7e43e Mon Sep 17 00:00:00 2001
From: Benjamin Ironside Goldstein
Date: Fri, 21 Mar 2025 15:28:59 -0700
Subject: [PATCH 3/7] fixes table format

---
 docs/AI-for-security/llm-performance-matrix.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/AI-for-security/llm-performance-matrix.asciidoc b/docs/AI-for-security/llm-performance-matrix.asciidoc
index 8569430a94..ad6714d1ea 100644
--- a/docs/AI-for-security/llm-performance-matrix.asciidoc
+++ b/docs/AI-for-security/llm-performance-matrix.asciidoc
@@ -9,7 +9,7 @@ IMPORTANT: `Excellent` is the best rating, followed by `Great`, then by `Good`,
 == Proprietary models
 Models from third-party LLM providers.
 
-[cols="1,1,1,1,1,1,1", options="header"]
+[cols="1,1,1,1,1,1,1,1", options="header"]
 |===
 | *Feature* | | *Assistant - General* | *Assistant - {esql} generation* | *Assistant - Alert questions* | *Assistant - Knowledge retrieval* | *Attack Discovery* | *AI-powered SIEM migration*
 | *Model* |*Claude 3: Opus* | Excellent | Excellent | Excellent | Good | Great | Good

From b1a01b1a380e855a18bacfaf8a38d21825e1738b Mon Sep 17 00:00:00 2001
From: Benjamin Ironside Goldstein
Date: Fri, 21 Mar 2025 16:19:45 -0700
Subject: [PATCH 4/7] fix

---
 docs/AI-for-security/llm-performance-matrix.asciidoc | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/docs/AI-for-security/llm-performance-matrix.asciidoc b/docs/AI-for-security/llm-performance-matrix.asciidoc
index ad6714d1ea..4eb1a49a6f 100644
--- a/docs/AI-for-security/llm-performance-matrix.asciidoc
+++ b/docs/AI-for-security/llm-performance-matrix.asciidoc
@@ -25,16 +25,6 @@ Models from third-party LLM providers.
 | |**Gemini 2.0 Flash 001**|Excellent| Excellent | Excellent | Excellent | Excellent | Excellent
 |===
 
-| Good
-| Excellent
-| Excellent
-| Excellent
-| Poor
-| Poor
-| Great
-| Good
-| Great
-| Excellent
 [discrete]
 == Open-source models
 Models you can <>.

From 6b9511fcc60adf555d94665b4f7238d661be47bb Mon Sep 17 00:00:00 2001
From: Benjamin Ironside Goldstein
Date: Tue, 1 Apr 2025 15:56:33 -0700
Subject: [PATCH 5/7] updates LLM support matrix

---
 docs/AI-for-security/llm-performance-matrix.asciidoc | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/AI-for-security/llm-performance-matrix.asciidoc b/docs/AI-for-security/llm-performance-matrix.asciidoc
index 4eb1a49a6f..e2ff9d0000 100644
--- a/docs/AI-for-security/llm-performance-matrix.asciidoc
+++ b/docs/AI-for-security/llm-performance-matrix.asciidoc
@@ -31,9 +31,9 @@ Models you can <>.
 
 [cols="1,1,1,1,1,1,1", options="header"]
 |===
-| *Feature* | | *Assistant - General* | *Assistant - {esql} generation* | *Assistant - Alert questions* | *Assistant - Knowledge retrieval* | *Attack Discovery*
-| *Model* | *Mistral Nemo* | Good | Good | Great | Good | Poor
-| | *LLama 3.2* | Good | Poor | Good | Poor | Poor
-| | *LLama 3.1 405b* | Good | Great | Good | Good| Poor
-| | *LLama 3.1 70b* | Good | Good | Poor | Poor | Poor
+| *Feature* | | *Assistant - General* | *Assistant - {esql} generation* | *Assistant - Alert questions*| *Assistant - Knowledge retrieval*| *Attack Discovery* | *AI-powered SIEKM migration* |
+| *Model* | *Mistral Nemo* | Good | Good | Great | Good | Poor | Poor
+| | *LLama 3.2* | Good | Poor | Good | Poor | Poor | Good
+| | *LLama 3.1 405b* | Good | Great | Good | Good| Poor | Poor
+| | *LLama 3.1 70b* | Good | Good | Poor | Poor | Poor | Good
 |===
\ No newline at end of file

From 3d297ce98bed1ec06eed6f27af3e31ec9221a708 Mon Sep 17 00:00:00 2001
From: Benjamin Ironside Goldstein
Date: Tue, 1 Apr 2025 16:50:56 -0700
Subject: [PATCH 6/7] fixes table

---
 docs/AI-for-security/llm-performance-matrix.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/AI-for-security/llm-performance-matrix.asciidoc b/docs/AI-for-security/llm-performance-matrix.asciidoc
index e2ff9d0000..48331299d6 100644
--- a/docs/AI-for-security/llm-performance-matrix.asciidoc
+++ b/docs/AI-for-security/llm-performance-matrix.asciidoc
@@ -29,7 +29,7 @@ Models from third-party LLM providers.
 == Open-source models
 Models you can <>.
 
-[cols="1,1,1,1,1,1,1", options="header"]
+[cols="1,1,1,1,1,1,1,1", options="header"]
 |===
 | *Feature* | | *Assistant - General* | *Assistant - {esql} generation* | *Assistant - Alert questions*| *Assistant - Knowledge retrieval*| *Attack Discovery* | *AI-powered SIEKM migration* |
 | *Model* | *Mistral Nemo* | Good | Good | Great | Good | Poor | Poor

From 11cdee33d7628f4994059c9e42bbe3e04518d416 Mon Sep 17 00:00:00 2001
From: Benjamin Ironside Goldstein
Date: Tue, 1 Apr 2025 17:45:55 -0700
Subject: [PATCH 7/7] fixes table

---
 docs/AI-for-security/llm-performance-matrix.asciidoc | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/AI-for-security/llm-performance-matrix.asciidoc b/docs/AI-for-security/llm-performance-matrix.asciidoc
index 48331299d6..65ce172e44 100644
--- a/docs/AI-for-security/llm-performance-matrix.asciidoc
+++ b/docs/AI-for-security/llm-performance-matrix.asciidoc
@@ -11,7 +11,7 @@ Models from third-party LLM providers.
 
 [cols="1,1,1,1,1,1,1,1", options="header"]
 |===
-| *Feature* | | *Assistant - General* | *Assistant - {esql} generation* | *Assistant - Alert questions* | *Assistant - Knowledge retrieval* | *Attack Discovery* | *AI-powered SIEM migration*
+| *Feature* | | *Assistant - General* | *Assistant - {esql} generation* | *Assistant - Alert questions* | *Assistant - Knowledge retrieval* | *Attack Discovery* | *Automatic Migration*
 | *Model* |*Claude 3: Opus* | Excellent | Excellent | Excellent | Good | Great | Good
 | |*Claude 3.7: Sonnet* | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent
 | |*Claude 3.5: Sonnet v2*| Excellent | Excellent | Excellent | Excellent | Great | Excellent
@@ -31,9 +31,9 @@ Models you can <>.
 
 [cols="1,1,1,1,1,1,1,1", options="header"]
 |===
-| *Feature* | | *Assistant - General* | *Assistant - {esql} generation* | *Assistant - Alert questions*| *Assistant - Knowledge retrieval*| *Attack Discovery* | *AI-powered SIEKM migration* |
-| *Model* | *Mistral Nemo* | Good | Good | Great | Good | Poor | Poor
-| | *LLama 3.2* | Good | Poor | Good | Poor | Poor | Good
-| | *LLama 3.1 405b* | Good | Great | Good | Good| Poor | Poor
-| | *LLama 3.1 70b* | Good | Good | Poor | Poor | Poor | Good
+| *Feature* | | *Assistant - General* | *Assistant - {esql} generation* | *Assistant - Alert questions*| *Assistant - Knowledge retrieval*| *Attack Discovery* | *Automatic Migration*
+| *Model* | *Mistral Nemo* | Good | Good | Great | Good | Poor | Poor
+| | *LLama 3.2* | Good | Poor | Good | Poor | Poor | Good
+| | *LLama 3.1 405b* | Good | Great | Good | Good | Poor | Poor
+| | *LLama 3.1 70b* | Good | Good | Poor | Poor | Poor | Good
 |===
\ No newline at end of file