
New Sanitizer not being properly applied to AML SDK e2e tests #35930

Open
MilesHolland opened this issue Jun 5, 2024 · 3 comments
Labels
test-proxy Anything relating to test-proxy requests or issues.

Comments

@MilesHolland
Member

  • Package: azure-ai-ml
  • Version/branch: dev
  • Operating system: Windows 11
  • Python version: 3.10.8

Describe the bug

There seem to be two separate sets of sanitizers in use for AML SDK tests: one applied when recording tests and one applied when playing them back. These sanitizers do not align, which causes failures in playback mode when the recordings do not match expected values. As an example, I ran a test today (6/5/2024), watched it run successfully, pushed the recording, then re-ran the test in playback mode and got the following failure:

Unable to find a record for the request PUT https://Sanitized.azure.com/subscriptions/00000000-0000-0000-0000-000000000/resourcegroups/00000/providers/Microsoft.Resources/deployments/e2etest_test_863132474581-7664666?api-version=2020-06-01 Uri doesn't match: request <https://Sanitized.azure.com/subscriptions/00000000-0000-0000-0000-000000000/resourcegroups/00000/providers/Microsoft.Resources/deployments/e2etest_test_863132474581-7664666?api-version=2020-06-01> record <https://Sanitized.azure.com/subscriptions/<actual subscription id>/resourcegroups/<actual RG name>/providers/Microsoft.Resources/deployments/e2etest_test_863132474581-7664666?api-version=2020-06-01> Header differences: Body differences: Request and record bodies do not match at index 31568: request: "{"value":"00000"},"description" record: "{"value":"<actual RG name>"},"
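The mismatch in the error above can be boiled down to the following sketch (illustrative Python only; the resource group name and URI shape are made up, not the proxy's actual implementation): the recording retains a real value where playback requests carry the placeholder, so the two URIs can never match.

```python
# Illustrative sketch of the record/playback mismatch (hypothetical values).
PLACEHOLDER_RG = "00000"
REAL_RG = "my-actual-rg"  # stands in for the unsanitized value left in the recording

def build_uri(rg: str) -> str:
    return (
        "https://Sanitized.azure.com/subscriptions/00000000-0000-0000-0000-000000000"
        f"/resourcegroups/{rg}/providers/Microsoft.Resources/deployments/x"
        "?api-version=2020-06-01"
    )

recorded_uri = build_uri(REAL_RG)        # sanitizer was not applied when recording
playback_uri = build_uri(PLACEHOLDER_RG) # playback request uses the placeholder

# The proxy compares the two and finds no matching record.
assert recorded_uri != playback_uri
```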

To Reproduce
Steps to reproduce the behavior:

  1. Set up the AML SDK test infra, making sure that a default subscription ID and workspace are set.
  2. CD to the following directory within the azure-sdk-for-python repo: sdk/ml/azure-ai-ml/
  3. Set e2e tests to live mode and run the following test: pytest tests/component/ -k test_automl_component
  4. CD to the top directory of the repo and push the new recording of this test with: python ./scripts/manage_recordings.py push -p sdk/ml/azure-ai-ml/assets.json
  5. Set e2e tests to playback mode and rerun the test.
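For reference, a minimal sketch of the live/playback toggle used in steps 3 and 5, assuming the azure-sdk-for-python convention of controlling test mode through the AZURE_TEST_RUN_LIVE environment variable (verify the exact variable name against the repo's test documentation):

```python
import os

# Step 3: run the test live (hits real services and records traffic).
os.environ["AZURE_TEST_RUN_LIVE"] = "true"
# then run: pytest tests/component/ -k test_automl_component

# Step 5: switch back to playback so the test replays the pushed recording.
os.environ["AZURE_TEST_RUN_LIVE"] = "false"
# then rerun the same pytest command
```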

Expected behavior
I expect the test, when run in playback mode with the new recording, to simply pass, assuming the live test passed. I do not expect failures caused by the sanitizers seemingly refusing to perform their primary purpose.

Screenshots
Screenshots not provided to avoid including private info.

Additional context
n/a

@github-actions github-actions bot added the needs-triage This is a new issue that needs to be triaged to the appropriate team. label Jun 5, 2024
@xiangyan99 xiangyan99 added test-proxy Anything relating to test-proxy requests or issues. and removed needs-triage This is a new issue that needs to be triaged to the appropriate team. labels Jun 6, 2024
@mccoyp
Member

mccoyp commented Jun 11, 2024

Hi @MilesHolland, thank you for the report. My first thought is that the tests could be missing a sanitizer that scrubs these values during recording, since you're seeing false values in playback requests but real values in recordings. This usually indicates that false values are correctly being provided to tests in playback, but that a sanitizer needs to be added that aligns with these values.
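To illustrate what an aligned sanitizer needs to do, here is a pure-Python sketch of the substitution semantics (in the real setup, sanitizers are registered with the test proxy via devtools_testutils helpers rather than implemented inline; the values below are hypothetical):

```python
def apply_string_sanitizer(text: str, target: str, replacement: str) -> str:
    """Replace every occurrence of a real value with the placeholder used in playback."""
    return text.replace(target, replacement)

uri = "https://Sanitized.azure.com/subscriptions/1a2b3c/resourcegroups/my-rg/x"
uri = apply_string_sanitizer(uri, "1a2b3c", "00000000-0000-0000-0000-000000000")
uri = apply_string_sanitizer(uri, "my-rg", "00000")
# If this runs at record time, the stored URI matches what playback requests produce.
```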

It's also worth ensuring that your branch is up to date with the latest main branch so that your tests use the latest version of the test proxy.

Since there have been central changes to sanitizing recently, it's also possible that this is caused by something outside of ML's setup. I'll try to reproduce today and get back to you as soon as possible.

@mccoyp
Member

mccoyp commented Jun 11, 2024

I was able to reproduce this issue, and it does look like test proxy behavior is causing the failure. As you may have noticed, this test was already identified as problematic while adapting ML tests to the recent sanitizer overhaul; the test is marked as live-only in main to avoid this playback error:

@pytest.mark.live_test_only("Needs re-recording to work with new test proxy sanitizers")
def test_automl_component(self, client: MLClient, registry_client: MLClient, randstr: Callable[[str], str]) -> None:

As for what's actually causing the failure, it looks like the test proxy is failing to apply some sanitizers during recording. I added a breakpoint at the start of the test and confirmed that sanitizers are being registered for the subscription ID, resource group, etc. that are still showing up in recordings. I confirmed at https://localhost:5001/Info/Active that the test proxy had string sanitizers registered for each of these values as expected, but after recording, the sanitizers evidently weren't being applied despite this.

The only sanitizer that seems to be applying to URIs in this test's recording is AZSDK4001, a common sanitizer that the test proxy applies universally to replace host names with Sanitized. My thought was that having multiple sanitizers targeting the same URI could be causing a conflict, but removing AZSDK4001 still didn't cause the ML sanitizers to apply correctly.
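A quick sanity check of that hypothesis in plain Python (illustrative only; the hostname and resource group are made up): string sanitizers with disjoint targets compose in either order, so a host-name sanitizer like AZSDK4001 should not, by itself, prevent a resource-group sanitizer from applying.

```python
def sanitize_host(uri: str) -> str:
    # Mimics a host-name sanitizer: replace the authority with "Sanitized".
    scheme, rest = uri.split("://", 1)
    _host, path = rest.split("/", 1)
    return f"{scheme}://Sanitized/{path}"

def sanitize_rg(uri: str) -> str:
    # Mimics a string sanitizer targeting the resource group name.
    return uri.replace("my-rg", "00000")

uri = "https://eastus.api.azureml.ms/resourcegroups/my-rg/workspaces/ws"
# Disjoint targets: applying the two sanitizers in either order gives the same result.
assert sanitize_rg(sanitize_host(uri)) == sanitize_host(sanitize_rg(uri))
```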

I'll have to continue investigating with @scbedd to sort out how this is happening, but for the time being I would recommend keeping the test as live-only so it doesn't cause playback failures.

@scbedd
Member

scbedd commented Jul 11, 2024

@mccoyp Agreed. I'm going to send you a meeting invite to look at this for 30 minutes in a proxy debugging session.
