[Obs AI Assistant] Add Llama 4 Maverick model ratings to the LLM performance matrix #3825

yuliia-fryshko · 2025-11-05T18:12:02Z

Closes https://github.com/elastic/obs-ai-assistant-team/issues/373

This PR adds ratings for Llama 4 Maverick to the LLM performance matrix, based on the evaluation results.

I had to run the evaluation a couple of times because the first run hit some rate-limit errors. It turns out that OpenRouter routes requests across multiple providers, each with a different context limit. I added more details here.

In the latest run, I was lucky and didn’t get any errors.

github-actions · 2025-11-05T18:16:20Z

🔍 Preview links for changed docs

solutions/observability/llm-performance-matrix.md

viduni94 · 2025-11-05T18:16:32Z

solutions/observability/llm-performance-matrix.md

 | OpenAI | **gpt-oss-20b** | Poor | Poor | Great | Poor | Good | Poor | Good | Good |
 | OpenAI | **gpt-oss-120b** | Excellent | Poor | Great | Great | Excellent | Good | Good | Excellent |
 | Meta | **Llama-3.3-70B-Instruct** | Excellent | Good | Great | Excellent | Excellent | Good | Good | Excellent |
+| Meta | **Llama-4-Maverick-17B** | Good | Good | Good | Excellent | Excellent | Good | Good | Great |


viduni94 · 2025-11-05T18:42:09Z

solutions/observability/llm-performance-matrix.md

 | OpenAI | **gpt-oss-20b** | Poor | Poor | Great | Poor | Good | Poor | Good | Good |
 | OpenAI | **gpt-oss-120b** | Excellent | Poor | Great | Great | Excellent | Good | Good | Excellent |
 | Meta | **Llama-3.3-70B-Instruct** | Excellent | Good | Great | Excellent | Excellent | Good | Good | Excellent |
+| Meta | **Llama-4-Maverick-17B** | Good | Good | Good | Excellent | Excellent | Good | Good | Great |


Most of the results are very different from Llama 3.3. Did you run into the token limit issue a lot?
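To make the comparison with Llama 3.3 concrete, here is a minimal sketch that diffs the two Meta rows from the hunk above, position by position. The column headers are not shown in this thread, so the rating columns are referred to by 1-based position only, and `parse_row` and `diff_rows` are hypothetical helper names used for illustration.

```python
# Rows copied from the diff hunk in this thread. The column headers are not
# shown there, so rating columns are identified by position only (assumption).
LLAMA_3_3 = "| Meta | **Llama-3.3-70B-Instruct** | Excellent | Good | Great | Excellent | Excellent | Good | Good | Excellent |"
LLAMA_4 = "| Meta | **Llama-4-Maverick-17B** | Good | Good | Good | Excellent | Excellent | Good | Good | Great |"


def parse_row(row: str) -> list[str]:
    """Split a markdown table row into its rating cells (drops provider and model)."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    return cells[2:]


def diff_rows(old: str, new: str) -> list[tuple[int, str, str]]:
    """Return (1-based rating column, old rating, new rating) for each mismatch."""
    pairs = zip(parse_row(old), parse_row(new))
    return [(i, a, b) for i, (a, b) in enumerate(pairs, start=1) if a != b]


if __name__ == "__main__":
    for column, old, new in diff_rows(LLAMA_3_3, LLAMA_4):
        print(f"rating column {column}: {old} -> {new}")
```

Running it prints one line for each rating column where the two models differ.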

viduni94 · 2025-11-05T19:14:29Z

solutions/observability/llm-performance-matrix.md

 | OpenAI | **gpt-oss-20b** | Poor | Poor | Great | Poor | Good | Poor | Good | Good |
 | OpenAI | **gpt-oss-120b** | Excellent | Poor | Great | Great | Excellent | Good | Good | Excellent |
 | Meta | **Llama-3.3-70B-Instruct** | Excellent | Good | Great | Excellent | Excellent | Good | Good | Excellent |
+| Meta | **Llama-4-Maverick-17B** | Good | Good | Good | Excellent | Excellent | Good | Good | Great |


The screenshot says Excellent for contextual insights, but I noticed that it's specified as Good here.

viduni94

LGTM 🎉
Thanks @yuliia-fryshko

pmoust

lgtm - thanks both for the work here and the reviews

added llama 4 Maveric

ced6365

yuliia-fryshko requested a review from a team as a code owner November 5, 2025 18:12

yuliia-fryshko self-assigned this Nov 5, 2025

fixed typo

fab1ea6

viduni94 reviewed Nov 5, 2025

yuliia-fryshko added 2 commits November 6, 2025 16:36

updated the performance of llama 4

65ea634

edited model name for consistency

7cb8bfb

viduni94 approved these changes Nov 6, 2025

viduni94 changed the title ~~[Obs AI Assistant] Add Llama 4 Maveric model ratings to the LLM performance matrix~~ [Obs AI Assistant] Add Llama 4 Maverick model ratings to the LLM performance matrix Nov 6, 2025

Merge branch 'main' into obs-ai-assistant-added-llama-4-maveric-performance

9ecdc7d

mdbirnstiehl approved these changes Nov 6, 2025

pmoust approved these changes Nov 6, 2025

Merge branch 'main' into obs-ai-assistant-added-llama-4-maveric-performance

597506c

yuliia-fryshko enabled auto-merge (squash) November 6, 2025 16:11

yuliia-fryshko merged commit ca595c0 into elastic:main Nov 6, 2025
5 of 6 checks passed

| Meta | Llama-4-Maverick-17B | Good | Good | Good | Excellent | Excellent | Good | Good | Great |
| Meta | Llama-4-Maverick-17B-128E-Instruct | Good | Good | Good | Excellent | Excellent | Good | Good | Great |
