Confidence score improvements for command selection #92
Merged
nishtha489 merged 25 commits into main on Oct 6, 2025
Pull Request Overview
This PR improves command descriptions in the Azure Load Testing MCP tool to enhance model confidence scores during command selection. The changes focus on making command descriptions more specific and detailed to help the model better distinguish between similar operations.
- Updated the main Load Testing service description with detailed capability explanations and usage guidance
- Enhanced individual command descriptions with clearer distinctions between test creation, test run management, and resource operations
- Added explicit examples and parameter information to reduce ambiguity between related commands
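A rough way to see why more specific descriptions help: if the selector scores each tool by how similar its description is to the prompt, descriptions that name their distinguishing parameters pull ahead of near-duplicates. The sketch below is only an illustration. The PR does not say how the scores are computed, so it assumes a text-similarity score and uses bag-of-words cosine similarity as a stand-in. The tool names and both sets of descriptions are hypothetical and are not the text changed in this PR.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count each alphanumeric word."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_tools(prompt, descriptions):
    """Return (score, tool) pairs sorted from best to worst match."""
    p = bag_of_words(prompt)
    scored = [(cosine(p, bag_of_words(d)), name) for name, d in descriptions.items()]
    return sorted(scored, reverse=True)


def margin(prompt, descriptions):
    """Score gap between the best and second-best tool."""
    ranked = rank_tools(prompt, descriptions)
    return ranked[0][0] - ranked[1][0]


# Hypothetical descriptions, written for this sketch only.
vague = {
    "test_create": "Creates a load test.",
    "testrun_create": "Creates a load test run.",
}
specific = {
    "test_create": (
        "Creates a load test configuration: endpoint URL, duration, "
        "virtual users. Does not start a run."
    ),
    "testrun_create": (
        "Starts a run of an existing load test, identified by its test id. "
        "Does not create a test configuration."
    ),
}
prompt = (
    "Create a URL test against an endpoint that runs for 30 minutes "
    "with 45 virtual users"
)

for label, descriptions in (("vague", vague), ("specific", specific)):
    print(label, rank_tools(prompt, descriptions), round(margin(prompt, descriptions), 4))
```

With these toy inputs, the gap between the top two tools is wider for the specific descriptions than for the vague ones. A real evaluator would most likely use an embedding model rather than word counts, so the absolute numbers here mean nothing.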
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| LoadTestingSetup.cs | Updated main service description with comprehensive details about Load Testing capabilities and usage scenarios |
| TestRunUpdateCommand.cs | Added clarification that this only updates test run metadata, not test configuration or resources |
| TestRunListCommand.cs | Enhanced description to clearly specify this lists test runs for a given test ID with examples |
| TestRunGetCommand.cs | Clarified this retrieves details for a specific test run ID, not test configuration |
| TestRunCreateCommand.cs | Simplified description to emphasize this only creates test runs for existing tests |
| TestResourceCreateCommand.cs | Clarified this only creates Azure resources, not test plans or runs |
| TestGetCommand.cs | Enhanced description to distinguish between test configuration retrieval vs test run data |
| TestCreateCommand.cs | Added detailed explanation with examples to clarify this creates test configurations, not runs or resources |
…tRunUpdateCommand.cs Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…eateCommand.cs Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
g2vinay
reviewed
Sep 5, 2025
Contributor
g2vinay
left a comment
Add a changelog entry for the change; the rest looks fine.
added 3 commits
September 6, 2025 01:17
…shtha/confidence-improvements
…b.com/microsoft/mcp into users/nishtha/confidence-improvements
feiskyer
pushed a commit
to feiskyer/microsoft-mcp
that referenced
this pull request
Sep 8, 2025
added 2 commits
September 8, 2025 11:59
…shtha/confidence-improvements
g2vinay
approved these changes
Sep 8, 2025
jongio
requested changes
Sep 8, 2025
added 3 commits
September 22, 2025 21:59
…shtha/confidence-improvements
…shtha/confidence-improvements
…shtha/confidence-improvements
added 2 commits
September 23, 2025 18:35
g2vinay
approved these changes
Sep 23, 2025
Contributor
@jongio requested changes; let's wait on his approval.
jongio
approved these changes
Oct 2, 2025
colbytimm
pushed a commit
to colbytimm/microsoft-mcp
that referenced
this pull request
Dec 8, 2025
* confidence improvements
* Update tools/Azure.Mcp.Tools.LoadTesting/src/Commands/LoadTestRun/TestRunUpdateCommand.cs
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update tools/Azure.Mcp.Tools.LoadTesting/src/Commands/LoadTest/TestCreateCommand.cs
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Nishtha . <nishtha@microsoft.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
What does this PR do?
This PR improves the command descriptions to raise the confidence scores the model assigns during command selection.
GitHub issue number?
Associated issue - https://github.com/Azure/azure-mcp/issues/832
Result scores for confidence
Prompt: Create a basic URL test using the following endpoint URL that runs for 30 minutes with 45 virtual users. The test name is with the test id and the load testing resource is in the resource group in my subscription
Expected tool: azmcp_loadtesting_test_create
0.585388 azmcp_loadtesting_test_create *** EXPECTED ***
0.531331 azmcp_loadtesting_testresource_create
0.508690 azmcp_loadtesting_testrun_create
Prompt: Get the load test with id in the load test resource in resource group
Expected tool: azmcp_loadtesting_test_get
0.642258 azmcp_loadtesting_test_get *** EXPECTED ***
0.608693 azmcp_loadtesting_testresource_list
0.574354 azmcp_loadtesting_testresource_create
Prompt: Create a load test resource in the resource group in my subscription
Expected tool: azmcp_loadtesting_testresource_create
0.717674 azmcp_loadtesting_testresource_create *** EXPECTED ***
0.596680 azmcp_loadtesting_testresource_list
0.514720 azmcp_loadtesting_test_create
Prompt: List all load testing resources in the resource group in my subscription
Expected tool: azmcp_loadtesting_testresource_list
0.738027 azmcp_loadtesting_testresource_list *** EXPECTED ***
0.591857 azmcp_loadtesting_testresource_create
0.577408 azmcp_group_list
Prompt: Create a test run using the id for test in the load testing resource in resource group . Use the name of test run and description as
Expected tool: azmcp_loadtesting_testrun_create
0.621803 azmcp_loadtesting_testrun_create *** EXPECTED ***
0.592748 azmcp_loadtesting_testresource_create
0.540789 azmcp_loadtesting_test_create
Prompt: Get the load test run with id in the load test resource in resource group
Expected tool: azmcp_loadtesting_testrun_get
0.625461 azmcp_loadtesting_test_get
0.603773 azmcp_loadtesting_testrun_get *** EXPECTED ***
0.568474 azmcp_loadtesting_testresource_list
Prompt: Get all the load test runs for the test with id in the load test resource in resource group
Expected tool: azmcp_loadtesting_testrun_list
0.615977 azmcp_loadtesting_testrun_list *** EXPECTED ***
0.606058 azmcp_loadtesting_test_get
0.569145 azmcp_loadtesting_testrun_get
Prompt: Update a test run display name as for the id for test in the load testing resource in resource group .
Expected tool: azmcp_loadtesting_testrun_update
0.659812 azmcp_loadtesting_testrun_update *** EXPECTED ***
0.509199 azmcp_loadtesting_testrun_create
0.454745 azmcp_loadtesting_testrun_get
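The results above can be checked mechanically against the bar stated in the pre-merge checklist (a score of 0.4 or more and a top 3 ranking). The following is a minimal sketch: the scores are copied from the listing above, while the `evaluate` helper and its return shape are hypothetical and are not part of ToolDescriptionEvaluator.

```python
# Top-3 scores copied from the results listed above, keyed by expected tool.
RESULTS = {
    "azmcp_loadtesting_test_create": [
        (0.585388, "azmcp_loadtesting_test_create"),
        (0.531331, "azmcp_loadtesting_testresource_create"),
        (0.508690, "azmcp_loadtesting_testrun_create"),
    ],
    "azmcp_loadtesting_test_get": [
        (0.642258, "azmcp_loadtesting_test_get"),
        (0.608693, "azmcp_loadtesting_testresource_list"),
        (0.574354, "azmcp_loadtesting_testresource_create"),
    ],
    "azmcp_loadtesting_testresource_create": [
        (0.717674, "azmcp_loadtesting_testresource_create"),
        (0.596680, "azmcp_loadtesting_testresource_list"),
        (0.514720, "azmcp_loadtesting_test_create"),
    ],
    "azmcp_loadtesting_testresource_list": [
        (0.738027, "azmcp_loadtesting_testresource_list"),
        (0.591857, "azmcp_loadtesting_testresource_create"),
        (0.577408, "azmcp_group_list"),
    ],
    "azmcp_loadtesting_testrun_create": [
        (0.621803, "azmcp_loadtesting_testrun_create"),
        (0.592748, "azmcp_loadtesting_testresource_create"),
        (0.540789, "azmcp_loadtesting_test_create"),
    ],
    "azmcp_loadtesting_testrun_get": [
        (0.625461, "azmcp_loadtesting_test_get"),
        (0.603773, "azmcp_loadtesting_testrun_get"),
        (0.568474, "azmcp_loadtesting_testresource_list"),
    ],
    "azmcp_loadtesting_testrun_list": [
        (0.615977, "azmcp_loadtesting_testrun_list"),
        (0.606058, "azmcp_loadtesting_test_get"),
        (0.569145, "azmcp_loadtesting_testrun_get"),
    ],
    "azmcp_loadtesting_testrun_update": [
        (0.659812, "azmcp_loadtesting_testrun_update"),
        (0.509199, "azmcp_loadtesting_testrun_create"),
        (0.454745, "azmcp_loadtesting_testrun_get"),
    ],
}


def evaluate(expected, ranked, min_score=0.4, top_n=3):
    """Hypothetical helper: find the expected tool's rank and apply the checklist bar."""
    ordered = sorted(ranked, reverse=True)  # highest score first
    for position, (score, tool) in enumerate(ordered[:top_n], start=1):
        if tool == expected:
            return {"rank": position, "score": score, "passes": score >= min_score}
    return {"rank": None, "score": None, "passes": False}


report = {tool: evaluate(tool, ranked) for tool, ranked in RESULTS.items()}
not_first = [tool for tool, result in report.items() if result["rank"] != 1]

for tool, result in report.items():
    print(f"{tool}: rank={result['rank']} score={result['score']} passes={result['passes']}")
print("expected tool not ranked first:", not_first)
```

Every prompt clears the checklist bar. The one case where the expected tool is not ranked first is azmcp_loadtesting_testrun_get, which scores below azmcp_loadtesting_test_get for its prompt.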
Pre-merge Checklist
- CHANGELOG.md for product changes (features, bug fixes, UI/UX, updated dependencies)
- .\eng\common\spelling\Invoke-Cspell.ps1
- README.md documentation
- /docs/azmcp-commands.md
- /docs/e2eTestPrompts.md
- ToolDescriptionEvaluator and obtained a score of 0.4 or more and a top 3 ranking for all related test prompts
- crypto mining, spam, data exfiltration, etc.)
- /azp run azure - mcp to run Live Test Pipeline