From 2fa4f612996b39839d2b2d71a7bb731e3c82dcb6 Mon Sep 17 00:00:00 2001
From: Benjamin Ironside Goldstein
Date: Tue, 3 Dec 2024 14:19:45 -0800
Subject: [PATCH 1/4] Updates LLM performance matrix

---
 .../AI-for-security/llm-performance-matrix.asciidoc | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/docs/AI-for-security/llm-performance-matrix.asciidoc b/docs/AI-for-security/llm-performance-matrix.asciidoc
index 9cf6998a87..ca4ceefced 100644
--- a/docs/AI-for-security/llm-performance-matrix.asciidoc
+++ b/docs/AI-for-security/llm-performance-matrix.asciidoc
@@ -5,11 +5,12 @@ This table describes the performance of various large language models (LLMs) for

 [cols="1,1,1,1,1,1,1,1", options="header"]
 |===
-| *Feature* | *Model* | | | | | |
-| | *Claude 3: Opus* | *Claude 3.5: Sonnet* | *Claude 3: Haiku* | *GPT-4o* | *GPT-4 Turbo* | **Gemini 1.5 Pro ** | **Gemini 1.5 Flash**
-| *Assistant - General* | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent
-| *Assistant - {esql} generation*| Great | Great | Poor | Excellent | Poor | Good | Poor
-| *Assistant - Alert questions* | Excellent | Excellent | Excellent | Excellent | Poor | Excellent | Good
-| *Attack discovery* | Excellent | Excellent | Poor | Poor | Good | Great | Poor
+| *Feature* | *Model* | | | | | | | |
+| | *Claude 3: Opus*| *Claude 3.5: Sonnet v2* | *Claude 3.5: Sonnet* | *Claude 3.5: Haiku*| *Claude 3: Haiku* | *GPT-4o* | *GPT-4o-mini* | **Gemini 1.5 Pro 002** | **Gemini 1.5 Flash 002**
+| *Assistant - General* | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent
+| *Assistant - {esql} generation*| Excellent | Great | Excellent | Excellent | Excellent | Excellent | Great | Excellent | Poor
+| *Assistant - Alert questions* | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent | Great | Excellent | Good
+| *Assistant - Knowledge retrieval* | Good | Excellent | Excellent | Excellent | Excellent | Excellent | Great | Excellent | Excellent
+| *Attack Discovery* | Great | Great | Excellent | Poor | Poor | Great | Poor | Excellent | Poor
 |===

\ No newline at end of file

From 2e55834e5720a7756cf02696d6b39a9f4c9fd438 Mon Sep 17 00:00:00 2001
From: Benjamin Ironside Goldstein
Date: Tue, 3 Dec 2024 15:06:08 -0800
Subject: [PATCH 2/4] fixes format

---
 docs/AI-for-security/llm-performance-matrix.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/AI-for-security/llm-performance-matrix.asciidoc b/docs/AI-for-security/llm-performance-matrix.asciidoc
index ca4ceefced..174aa91d2c 100644
--- a/docs/AI-for-security/llm-performance-matrix.asciidoc
+++ b/docs/AI-for-security/llm-performance-matrix.asciidoc
@@ -3,7 +3,7 @@

 This table describes the performance of various large language models (LLMs) for different use cases in {elastic-sec}, based on our internal testing. To learn more about these use cases, refer to <> or <>.

-[cols="1,1,1,1,1,1,1,1", options="header"]
+[cols="1,1,1,1,1,1,1,1,1,1", options="header"]
 |===
 | *Feature* | *Model* | | | | | | | |
 | | *Claude 3: Opus*| *Claude 3.5: Sonnet v2* | *Claude 3.5: Sonnet* | *Claude 3.5: Haiku*| *Claude 3: Haiku* | *GPT-4o* | *GPT-4o-mini* | **Gemini 1.5 Pro 002** | **Gemini 1.5 Flash 002**

From 435e91fec2a0e847883d57e8854f8bc5e155c979 Mon Sep 17 00:00:00 2001
From: Benjamin Ironside Goldstein
Date: Tue, 3 Dec 2024 15:14:14 -0800
Subject: [PATCH 3/4] Updates serverless version

---
 .../llm-performance-matrix.asciidoc | 56 ++++---------------
 1 file changed, 10 insertions(+), 46 deletions(-)

diff --git a/docs/serverless/AI-for-security/llm-performance-matrix.asciidoc b/docs/serverless/AI-for-security/llm-performance-matrix.asciidoc
index 3dafe9f8c1..a5803cb12b 100644
--- a/docs/serverless/AI-for-security/llm-performance-matrix.asciidoc
+++ b/docs/serverless/AI-for-security/llm-performance-matrix.asciidoc
@@ -6,51 +6,15 @@

 This table describes the performance of various large language models (LLMs) for different use cases in {elastic-sec}, based on our internal testing. To learn more about these use cases, refer to <> or <>.

-|===
-| **Feature**| **Model**| | | | | |
-
-|
-| **Claude 3: Opus**
-| **Claude 3.5: Sonnet**
-| **Claude 3: Haiku**
-| **GPT-4o**
-| **GPT-4 Turbo**
-| **Gemini 1.5 Pro**
-| **Gemini 1.5 Flash**
-
-| **Assistant: general**
-| Excellent
-| Excellent
-| Excellent
-| Excellent
-| Excellent
-| Excellent
-| Excellent

-| **Assistant: {esql} generation**
-| Great
-| Great
-| Poor
-| Excellent
-| Poor
-| Good
-| Poor
-
-| **Assistant: alert questions**
-| Excellent
-| Excellent
-| Excellent
-| Excellent
-| Poor
-| Excellent
-| Good
-
-| **Attack discovery**
-| Excellent
-| Excellent
-| Poor
-| Poor
-| Good
-| Great
-| Poor
+[cols="1,1,1,1,1,1,1,1,1,1", options="header"]
+|===
+| *Feature* | *Model* | | | | | | | |
+| | *Claude 3: Opus*| *Claude 3.5: Sonnet v2* | *Claude 3.5: Sonnet* | *Claude 3.5: Haiku*| *Claude 3: Haiku* | *GPT-4o* | *GPT-4o-mini* | **Gemini 1.5 Pro 002** | **Gemini 1.5 Flash 002**
+| *Assistant - General* | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent
+| *Assistant - {esql} generation*| Excellent | Great | Excellent | Excellent | Excellent | Excellent | Great | Excellent | Poor
+| *Assistant - Alert questions* | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent | Great | Excellent | Good
+| *Assistant - Knowledge retrieval* | Good | Excellent | Excellent | Excellent | Excellent | Excellent | Great | Excellent | Excellent
+| *Attack Discovery* | Great | Great | Excellent | Poor | Poor | Great | Poor | Excellent | Poor
 |===
+
\ No newline at end of file

From 3349241c59300d3da3e1bb32467801dcf27ac8f4 Mon Sep 17 00:00:00 2001
From: Benjamin Ironside Goldstein
Date: Thu, 5 Dec 2024 09:05:59 -0800
Subject: [PATCH 4/4] Excellent to great

---
 docs/AI-for-security/llm-performance-matrix.asciidoc | 2 +-
 docs/serverless/AI-for-security/llm-performance-matrix.asciidoc | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/AI-for-security/llm-performance-matrix.asciidoc b/docs/AI-for-security/llm-performance-matrix.asciidoc
index 174aa91d2c..c8f9e845c3 100644
--- a/docs/AI-for-security/llm-performance-matrix.asciidoc
+++ b/docs/AI-for-security/llm-performance-matrix.asciidoc
@@ -8,7 +8,7 @@ This table describes the performance of various large language models (LLMs) for
 | *Feature* | *Model* | | | | | | | |
 | | *Claude 3: Opus*| *Claude 3.5: Sonnet v2* | *Claude 3.5: Sonnet* | *Claude 3.5: Haiku*| *Claude 3: Haiku* | *GPT-4o* | *GPT-4o-mini* | **Gemini 1.5 Pro 002** | **Gemini 1.5 Flash 002**
 | *Assistant - General* | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent
-| *Assistant - {esql} generation*| Excellent | Great | Excellent | Excellent | Excellent | Excellent | Great | Excellent | Poor
+| *Assistant - {esql} generation*| Excellent | Excellent | Excellent | Excellent | Excellent | Excellent | Great | Excellent | Poor
 | *Assistant - Alert questions* | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent | Great | Excellent | Good
 | *Assistant - Knowledge retrieval* | Good | Excellent | Excellent | Excellent | Excellent | Excellent | Great | Excellent | Excellent
 | *Attack Discovery* | Great | Great | Excellent | Poor | Poor | Great | Poor | Excellent | Poor
diff --git a/docs/serverless/AI-for-security/llm-performance-matrix.asciidoc b/docs/serverless/AI-for-security/llm-performance-matrix.asciidoc
index a5803cb12b..193ea061ef 100644
--- a/docs/serverless/AI-for-security/llm-performance-matrix.asciidoc
+++ b/docs/serverless/AI-for-security/llm-performance-matrix.asciidoc
@@ -12,7 +12,7 @@ This table describes the performance of various large language models (LLMs) for
 | *Feature* | *Model* | | | | | | | |
 | | *Claude 3: Opus*| *Claude 3.5: Sonnet v2* | *Claude 3.5: Sonnet* | *Claude 3.5: Haiku*| *Claude 3: Haiku* | *GPT-4o* | *GPT-4o-mini* | **Gemini 1.5 Pro 002** | **Gemini 1.5 Flash 002**
 | *Assistant - General* | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent
-| *Assistant - {esql} generation*| Excellent | Great | Excellent | Excellent | Excellent | Excellent | Great | Excellent | Poor
+| *Assistant - {esql} generation*| Excellent | Excellent | Excellent | Excellent | Excellent | Excellent | Great | Excellent | Poor
 | *Assistant - Alert questions* | Excellent | Excellent | Excellent | Excellent | Excellent | Excellent | Great | Excellent | Good
 | *Assistant - Knowledge retrieval* | Good | Excellent | Excellent | Excellent | Excellent | Excellent | Great | Excellent | Excellent
 | *Attack Discovery* | Great | Great | Excellent | Poor | Poor | Great | Poor | Excellent | Poor