[AUTOCUT] Integration Test failed for flow-framework: 2.12.0 tar distribution #469

Closed
opensearch-ci-bot opened this issue Jan 30, 2024 · 12 comments · Fixed by #477
Labels
autocut CI CI related issues integ-test-failure Issues related to integration test failure v2.12.0

Comments

@opensearch-ci-bot
Collaborator

The integration test failed at distribution level for component flow-framework
Version: 2.12.0
Distribution: tar
Architecture: x64
Platform: linux

Please check the logs: https://build.ci.opensearch.org/job/integ-test/7355/display/redirect

Test-report manifest:
- https://ci.opensearch.org/ci/dbc/integ-test/2.12.0/9282/linux/x64/tar/test-results/7355/integ-test/test-report.yml

Note: Steps to reproduce, additional logs and other files can be found within the above test-report manifest.
Instructions of this test-report manifest can be found here.

@opensearch-ci-bot opensearch-ci-bot added autocut integ-test-failure Issues related to integration test failure untriaged v2.12.0 labels Jan 30, 2024
@opensearch-ci-bot
Collaborator Author

The integration test failed at distribution level for component flow-framework
Version: 2.12.0
Distribution: tar
Architecture: arm64
Platform: linux

Please check the logs: https://build.ci.opensearch.org/job/integ-test/7360/display/redirect

Test-report manifest:
- https://ci.opensearch.org/ci/dbc/integ-test/2.12.0/9282/linux/arm64/tar/test-results/7360/integ-test/test-report.yml

Note: Steps to reproduce, additional logs and other files can be found within the above test-report manifest.
Instructions of this test-report manifest can be found here.

@dbwiddis
Member

dbwiddis commented Jan 30, 2024

It's the security enabled one that failed:

2024-01-30 01:02:15 ERROR    | flow-framework       | with-security        | FAIL  |
2024-01-30 01:02:15 INFO     | flow-framework       | without-security     | PASS  |

Complete log here

@dbwiddis
Member

Suite: Test class org.opensearch.flowframework.rest.FlowFrameworkRestApiIT
  2> REPRODUCE WITH: ./gradlew ':integTest' --tests "org.opensearch.flowframework.rest.FlowFrameworkRestApiIT.testCreateAndProvisionLocalModelWorkflow" -Dtests.seed=61E215A886EB23E5 -Dtests.security.manager=false -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=sr-Latn-RS -Dtests.timezone=Africa/Khartoum -Druntime.java=21
  2> org.opensearch.client.ResponseException: method [DELETE], host [https://localhost:9200], URI [/.plugins-ml-model-group], status line [HTTP/1.1 403 Forbidden]
    {"error":{"root_cause":[{"type":"security_exception","reason":"no permissions for [] and User [name=admin, backend_roles=[admin], requestedTenant=null]"}],"type":"security_exception","reason":"no permissions for [] and User [name=admin, backend_roles=[admin], requestedTenant=null]"},"status":403}
        at __randomizedtesting.SeedInfo.seed([61E215A886EB23E5:EAD7DB22C0A426]:0)
        at app//org.opensearch.client.RestClient.convertResponse(RestClient.java:376)
        at app//org.opensearch.client.RestClient.performRequest(RestClient.java:346)
        at app//org.opensearch.client.RestClient.performRequest(RestClient.java:321)
        at app//org.opensearch.flowframework.FlowFrameworkRestTestCase.wipeAllODFEIndices(FlowFrameworkRestTestCase.java:291)

@dbwiddis
Member

Also tons of threads open, all I/O related. Need to verify we're closing the client when done with it:

  2> SEVERE: 204 threads leaked from SUITE scope at org.opensearch.flowframework.rest.FlowFrameworkRestApiIT: 
  2>    1) Thread[id=175, name=I/O dispatcher 116, state=RUNNABLE, group=TGRP-FlowFrameworkRestApiIT]
  2>         at java.base/sun.nio.ch.EPoll.wait(Native Method)
  2>         at java.base/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:121)
  2>         at java.base/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:130)
  2>         at java.base/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:142)
  2>         at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:255)
  2>         at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
  2>         at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
  2>         at java.base/java.lang.Thread.run(Thread.java:1583)

@dbwiddis
Member

dbwiddis commented Jan 30, 2024

Investigation so far:

  • SecureRestApiIT is working fine
  • RestApiIT fails when run under security, in the call to wipeAllODFEIndices()

In our build.gradle it looks like we intended to not run RestApiIT.

Under integTest:

flow-framework/build.gradle

Lines 238 to 244 in a812e51

    // Include only secure integration tests in security enabled clusters
    if (System.getProperty("security.enabled") != null && System.getProperty("security.enabled") == "true") {
        filter {
            includeTestsMatching "org.opensearch.flowframework.rest.FlowFrameworkSecureRestApiIT"
            excludeTestsMatching "org.opensearch.flowframework.rest.FlowFrameworkRestApiIT"
        }
    }

And under integTestRemote:

flow-framework/build.gradle

Lines 400 to 406 in a812e51

    // Include only secure integration tests in security enabled clusters
    if (System.getProperty("https") != null && System.getProperty("https") == "true") {
        filter {
            includeTestsMatching "org.opensearch.flowframework.rest.FlowFrameworkSecureRestApiIT"
            excludeTestsMatching "org.opensearch.flowframework.rest.FlowFrameworkRestApiIT"
        }
    }

But the failing method lives in FlowFrameworkRestTestCase, which is the superclass of both test classes, so it runs in both tests.

@dbwiddis
Member

dbwiddis commented Jan 30, 2024

wipeAllODFEIndices() uses adminClient(), which works without security but not with it. The security-enabled setup has a fullAccessClient().

Possible fix: Use the isHttps() conditional to pick the correct client which is set up in this class.
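
A minimal sketch of the proposal above, assuming a hypothetical TestClient stand-in rather than the real REST client type. The names TestClient, ClientSelectionSketch and cleanupClient are illustrative and do not come from the repository. It only shows the shape of choosing a client from the isHttps() result; nothing in this thread confirms that this change alone resolves the 403.

```java
// Hypothetical stand-in for the REST client type used by the tests.
interface TestClient {
    String name();
}

final class ClientSelectionSketch {
    // Pick the client used for index cleanup based on the cluster mode.
    // https == true stands for a security-enabled cluster.
    static TestClient cleanupClient(boolean https, TestClient adminClient, TestClient fullAccessClient) {
        return https ? fullAccessClient : adminClient;
    }

    public static void main(String[] args) {
        TestClient admin = () -> "adminClient";
        TestClient fullAccess = () -> "fullAccessClient";
        // Show the selection for both cluster modes.
        System.out.println("with security:    " + cleanupClient(true, admin, fullAccess).name());
        System.out.println("without security: " + cleanupClient(false, admin, fullAccess).name());
    }
}
```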

Also double check that all clients are closed at the end of the tests. Currently I see the read-only and full-access clients closed, but none of the others. It's odd that we create the clients in the superclass but tear them down in the subclass. Let's move setup/teardown pairs into the same classes.
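
One way to keep creation and teardown together can be sketched as follows, assuming each client is an AutoCloseable. ClientLifecycleSketch, track and closeAll are hypothetical names for illustration, not the project's code: whatever class creates a client also registers it, and the same class closes everything it registered.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper: the class that creates clients also owns closing them.
final class ClientLifecycleSketch {
    private final Deque<AutoCloseable> opened = new ArrayDeque<>();

    // Register a client at creation time so teardown cannot miss it.
    void track(AutoCloseable client) {
        opened.push(client);
    }

    // Close in reverse creation order; keep going if one close fails.
    // Returns the number of clients closed successfully.
    int closeAll() {
        int closed = 0;
        Exception first = null;
        while (!opened.isEmpty()) {
            try {
                opened.pop().close();
                closed++;
            } catch (Exception e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw new IllegalStateException("failed to close a client", first);
        }
        return closed;
    }

    public static void main(String[] args) {
        ClientLifecycleSketch lifecycle = new ClientLifecycleSketch();
        AutoCloseable readOnly = () -> System.out.println("closed read-only client");
        AutoCloseable fullAccess = () -> System.out.println("closed full-access client");
        lifecycle.track(readOnly);
        lifecycle.track(fullAccess);
        System.out.println("clients closed: " + lifecycle.closeAll());
    }
}
```

In a test base class, closeAll() would be called from the teardown method of the same class whose setup calls track().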

FYI/WDYT? @joshpalis @owaiskazi19

@dbwiddis dbwiddis added CI CI related issues and removed untriaged labels Jan 30, 2024
@joshpalis
Member

The condition if (System.getProperty("security.enabled") != null && System.getProperty("security.enabled") == "true") should be changed to if (System.getProperty("https") != null && System.getProperty("https") == "true"). Non-security-enabled integration tests should be excluded when running with security. I suspect the https flag is being set when the distribution build runs.
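
To illustrate why the property name matters, here is a small sketch under the assumption (suspected above, not verified) that the distribution run sets only https=true. FilterFlagSketch and flagEnabled are hypothetical names; the check mirrors the build.gradle condition of "present and equal to true".

```java
import java.util.Properties;

final class FilterFlagSketch {
    // Mirrors the build.gradle condition: the property must be set and equal to "true".
    static boolean flagEnabled(Properties props, String key) {
        return "true".equals(props.getProperty(key));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Assumption: the distribution-level run passes only this flag.
        props.setProperty("https", "true");
        // A filter guarded by security.enabled would not be applied in that case,
        // while one guarded by https would.
        System.out.println("security.enabled check: " + flagEnabled(props, "security.enabled"));
        System.out.println("https check:            " + flagEnabled(props, "https"));
    }
}
```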

@dbwiddis
Member

Sounds good. Looks like the code uses isHttps(), which is:

    protected boolean isHttps() {
        return Optional.ofNullable(System.getProperty("https")).map("true"::equalsIgnoreCase).orElse(false);
    }
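
For reference, a standalone sketch of how that expression behaves. The property value is passed in as a parameter here (an adaptation for illustration; the original reads System.getProperty("https") directly), and IsHttpsSketch is a hypothetical class name.

```java
import java.util.Optional;

final class IsHttpsSketch {
    // Same expression as isHttps(), with the property value supplied by the caller.
    static boolean isHttps(String value) {
        return Optional.ofNullable(value).map("true"::equalsIgnoreCase).orElse(false);
    }

    public static void main(String[] args) {
        System.out.println(isHttps(null));   // property not set: false
        System.out.println(isHttps("TRUE")); // match is case-insensitive: true
        System.out.println(isHttps("yes"));  // any other value: false
    }
}
```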

@opensearch-ci-bot
Collaborator Author

The integration test failed at distribution level for component flow-framework
Version: 2.12.0
Distribution: tar
Architecture: x64
Platform: linux

Please check the logs: https://build.ci.opensearch.org/job/integ-test/7384/display/redirect

Test-report manifest:
- https://ci.opensearch.org/ci/dbc/integ-test/2.12.0/9294/linux/x64/tar/test-results/7384/integ-test/test-report.yml

Note: Steps to reproduce, additional logs and other files can be found within the above test-report manifest.
Instructions of this test-report manifest can be found here.

@opensearch-ci-bot
Collaborator Author

The integration test failed at distribution level for component flow-framework
Version: 2.12.0
Distribution: tar
Architecture: arm64
Platform: linux

Please check the logs: https://build.ci.opensearch.org/job/integ-test/7385/display/redirect

Test-report manifest:
- https://ci.opensearch.org/ci/dbc/integ-test/2.12.0/9294/linux/arm64/tar/test-results/7385/integ-test/test-report.yml

Note: Steps to reproduce, additional logs and other files can be found within the above test-report manifest.
Instructions of this test-report manifest can be found here.

@dbwiddis
Member

dbwiddis commented Jan 31, 2024

Still failing the security tests

2024-01-31 02:27:01 ERROR    | flow-framework       | with-security        | FAIL  |
2024-01-31 02:27:01 INFO     | flow-framework       | without-security     | PASS  |

Same spot

        at __randomizedtesting.SeedInfo.seed([F7B1E9009ADEFEF9:D60B266E0556BFAF]:0)
        at app//org.opensearch.client.RestClient.convertResponse(RestClient.java:376)
        at app//org.opensearch.client.RestClient.performRequest(RestClient.java:346)
        at app//org.opensearch.client.RestClient.performRequest(RestClient.java:321)
        at app//org.opensearch.flowframework.FlowFrameworkRestTestCase.wipeAllODFEIndices(FlowFrameworkRestTestCase.java:291)

@jpalis I think you need to replace adminClient() with fullAccessClient() in that method when isHttps() is true.

My proposed solution doesn't work, but clearly adminClient() is wrong here.

@opensearch-ci-bot
Collaborator Author

Closing the issue as the Integration Test passed for flow-framework
Version: 2.12.0
Distribution: tar
Architecture: x64
Platform: linux

Please check the logs: https://build.ci.opensearch.org/job/integ-test/7420/display/redirect
