
Conversation

@MilesHolland MilesHolland commented Nov 15, 2024

Initial PR to add 3 tests (5 with parameterization) that account for ALL e2e runs of evaluators. It currently disables 3 evaluators due to consistent problems with playback tests.

Next steps will be to disable the tests made redundant by these, and to re-enable the 3 currently untouched evaluators.

Also contains an unrelated docstring nit fix.
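To make the grouping idea concrete, here is a minimal, hypothetical sketch (not the test code added by this PR) of how a single parameterized test can exercise several evaluators in one e2e-style pass; the evaluator classes and keyword arguments are assumed to match the azure-ai-evaluation 1.x public API:

```python
# Hypothetical sketch, not the test code added by this PR: one parameterized
# test covering several evaluators in a single e2e-style pass.
import pytest

# Evaluator classes assumed to match the azure-ai-evaluation 1.x public API.
from azure.ai.evaluation import (
    BleuScoreEvaluator,
    F1ScoreEvaluator,
    RougeScoreEvaluator,
    RougeType,
)


@pytest.mark.parametrize(
    "evaluator",
    [
        F1ScoreEvaluator(),
        BleuScoreEvaluator(),
        RougeScoreEvaluator(rouge_type=RougeType.ROUGE_L),
    ],
)
def test_e2e_evaluator_run(evaluator):
    # Keyword names (response / ground_truth) are assumed from the 1.x API;
    # the real e2e tests would run through evaluate() against recorded
    # (playback) or live service responses.
    result = evaluator(
        response="Tokyo is the capital of Japan.",
        ground_truth="The capital of Japan is Tokyo.",
    )
    assert isinstance(result, dict) and len(result) > 0
```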

@MilesHolland MilesHolland requested a review from a team as a code owner November 15, 2024 18:42
@github-actions github-actions bot added the Evaluation label (Issues related to the client library for Azure AI Evaluation) Nov 15, 2024
@azure-sdk

API change check

API changes are not detected in this pull request.

@MilesHolland MilesHolland merged commit 0a9b02d into main Dec 2, 2024
20 checks passed
@MilesHolland MilesHolland deleted the eval/testing/grouped-eval-testing branch December 2, 2024 17:58
def test_evaluate_multimodal(self, multi_modal_input_type, multimodal_input_selector, azure_cred, project_scope):
# Content safety is removed due to being unstable in playback mode
evaluators = {
# "content_safety" : ContentSafetyMultimodalEvaluator(credential=azure_cred, azure_ai_project=project_scope),

What happens when you enable it? Does the test fail or get skipped?
Do you have any theory or known root cause for why this might be happening?
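
As context for the question above, here is a minimal sketch of the skip-in-playback option, assuming pytest and a locally defined is_live() helper keyed off the AZURE_TEST_RUN_LIVE environment variable:

```python
# Hypothetical sketch: skip the content-safety case only in playback mode
# instead of letting unstable recordings fail the grouped test.
import os

import pytest


def is_live() -> bool:
    # Assumed convention: AZURE_TEST_RUN_LIVE=true means tests hit the live
    # service; anything else means recorded (playback) mode.
    return os.environ.get("AZURE_TEST_RUN_LIVE", "false").lower() == "true"


@pytest.mark.skipif(
    not is_live(),
    reason="Content safety evaluator is unstable in playback mode",
)
def test_content_safety_multimodal(azure_cred, project_scope):
    # azure_cred and project_scope are fixtures assumed from the test suite.
    ...
```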

l0lawrence pushed a commit to l0lawrence/azure-sdk-for-python that referenced this pull request Feb 19, 2025
* new tests
* remove gleu
* run black
* skip
* test ci: restore multimodal test
* test ci: restore conversation test
* change skip type
* remove skips because they don't work
* test only convo in CI
* only skip convo test
* just run multi
* 2 tests
* fix param placement
* skip multi
* just singlton
* remove cs from first test
* remove rai evals
* remove prompty evals
* all but 1 eval
* update recordings
* 2 evals
* just cs
* 2 rai service
* prompty only
* disable pf proxy
* delay
* more fixture tweaks
* more os setting
* skip windows in CI
* re-enabled all tests
* remove env setting
* re-reacord
* disable 1 test
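
The "skip windows in CI" step in the commit log above corresponds to a platform-conditional skip; a minimal sketch of what that can look like, assuming pytest and the TF_BUILD variable that Azure Pipelines sets:

```python
# Hypothetical sketch of the kind of platform-conditional skip the commit
# log mentions ("skip windows in CI").
import os
import sys

import pytest

# TF_BUILD is set by Azure Pipelines; used here as the "running in CI" signal.
IN_CI = bool(os.environ.get("TF_BUILD"))


@pytest.mark.skipif(
    IN_CI and sys.platform.startswith("win"),
    reason="Flaky on Windows CI agents",
)
def test_evaluate_multimodal_grouped():
    ...
```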