Make the test stable #5346

Merged
merged 1 commit into from
Nov 1, 2022

@style95 style95 commented Nov 1, 2022

Description

This is to make the test stable.

Related issue and scope

  • I opened an issue to propose and discuss this change (#????)

My changes affect the following components

  • API
  • Controller
  • Message Bus (e.g., Kafka)
  • Loadbalancer
  • Scheduler
  • Invoker
  • Intrinsic actions (e.g., sequences, conductors)
  • Data stores (e.g., CouchDB)
  • Tests
  • Deployment
  • CLI
  • General tooling
  • Documentation

Types of changes

  • Bug fix (generally a non-breaking change which closes an issue).
  • Enhancement or new feature (adds new functionality).
  • Breaking change (a bug fix or enhancement which changes existing behavior).

Checklist:

  • I signed an Apache CLA.
  • I reviewed the style guides and followed the recommendations (Travis CI will check :).
  • I added tests to cover my changes.
  • My changes require further changes to the documentation.
  • I updated the documentation where necessary.

@@ -1575,6 +1575,9 @@ class FunctionPullingContainerProxyTests

machine ! Initialize(invocationNamespace.asString, fqn, action, schedulerHost, rpcPort, messageTransId)
probe.expectMsg(Transition(machine, Uninitialized, CreatingClient))
awaitAssert {
machine.stateData shouldBe a[ContainerCreatedData]
Member Author

When ClientCreationCompleted is sent to the proxy before the data is updated, it causes a kind of cycle and the test keeps running.
https://github.com/apache/openwhisk/blob/master/core/invoker/src/main/scala/org/apache/openwhisk/core/containerpool/v2/FunctionPullingContainerProxy.scala#L342
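The `awaitAssert` added in this hunk makes the test poll until the proxy's state data is `ContainerCreatedData` before it continues. A minimal sketch of that polling idea in plain Scala is shown here; `AwaitSketch` and `awaitCondition` are hypothetical names for illustration, not the Akka TestKit implementation.

```scala
object AwaitSketch {
  /** Re-evaluates `cond` every `intervalMs` until it holds or `maxMs` elapses.
    * Returns the last observed value of `cond`.
    * Hypothetical helper: it only illustrates the polling idea.
    */
  def awaitCondition(maxMs: Long, intervalMs: Long)(cond: => Boolean): Boolean = {
    val deadline = System.nanoTime() + maxMs * 1000000L
    var ok = cond
    while (!ok && System.nanoTime() < deadline) {
      Thread.sleep(intervalMs)
      ok = cond
    }
    ok
  }
}
```

In the test, the condition would be "the state data is `ContainerCreatedData`", so `ClientCreationCompleted` is sent only after that state has been reached.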

Contributor

Was this test instability introduced recently? I know this is just a test, but I'm wondering if there could be an issue with this #5333 since I'm seeing weird orphaned non-existent containers in etcd

Member Author

Let me explain the details.

    case Event(ClientCreationCompleted(proxy), _: NonexistentData) =>
      self ! ClientCreationCompleted(proxy.orElse(Some(sender())))
      stay()

https://github.com/apache/openwhisk/blob/master/core/invoker/src/main/scala/org/apache/openwhisk/core/containerpool/v2/FunctionPullingContainerProxy.scala#L342

According to this logic, if the proxy receives ClientCreationCompleted before its data becomes the ContainerCreatedData that comes from here, it repeatedly sends the message to itself.
This creates an Akka message cycle.

Generally, it takes some time to create and initialize the activation client proxy.
That usually takes longer than it takes the container proxy to receive the ContainerCreatedData.
So I believe this has not been an issue in practice.

In the test code, however, there is no client proxy initialization, so it sends ClientCreationCompleted as soon as the proxy state changes. Depending on the timing, that produced the cycle.
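The cycle described above can be illustrated with a small single-threaded mailbox model in plain Scala. This is only a sketch under stated assumptions: `Msg`, `ContainerCreated`, `CycleSketch`, and the `arrivesAfter` timing knob are hypothetical stand-ins, and real Akka dispatching is not modeled.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for the real messages; not OpenWhisk code.
sealed trait Msg
case object ClientCreationCompleted extends Msg
case object ContainerCreated extends Msg

object CycleSketch {
  /** Runs a FIFO mailbox and returns how many times ClientCreationCompleted
    * was re-enqueued before the container state was ready.
    * `arrivesAfter` is the number of handled messages after which
    * ContainerCreated shows up (an assumed timing knob).
    */
  def requeues(arrivesAfter: Int): Int = {
    val mailbox = mutable.Queue[Msg](ClientCreationCompleted)
    var containerReady = false
    var handled = 0
    var requeued = 0
    var done = false
    while (!done) {
      if (handled == arrivesAfter) mailbox.enqueue(ContainerCreated)
      mailbox.dequeue() match {
        case ContainerCreated => containerReady = true
        case ClientCreationCompleted if containerReady => done = true
        case ClientCreationCompleted =>
          // Mirrors `self ! ClientCreationCompleted(...); stay()`
          mailbox.enqueue(ClientCreationCompleted)
          requeued += 1
      }
      handled += 1
    }
    requeued
  }
}
```

In this model the number of self-sends grows linearly with how late the container data arrives: `CycleSketch.requeues(0)` returns 1, while `CycleSketch.requeues(1000)` returns 1001.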

@bdoyle0182
Contributor

LGTM thanks for fixing this, seems like this is why I've been having trouble with the scheduler tests.

@@ -124,7 +124,9 @@ class ActivationClientProxy(
stay()

case Event(e: RescheduleActivation, client: Client) =>
logging.info(this, s"[${containerId.asString}] got a reschedule message ${e.msg.activationId} for action: ${e.msg.action}")
logging.info(
Member Author

I have no idea how it passed the scalaFmt in the previous PR.

@@ -339,7 +339,11 @@ class FunctionPullingContainerProxy(

// wait for container creation when cold start
case Event(ClientCreationCompleted(proxy), _: NonexistentData) =>
self ! ClientCreationCompleted(proxy.orElse(Some(sender())))
akka.pattern.after(3.milliseconds, actorSystem.scheduler) {
Member Author

@style95 style95 Nov 1, 2022

Even if it generally does not happen in production, it would be great to add a small delay to avoid the cycle as a last resort.
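As a rough illustration of why a small delay helps, the following plain-Scala model counts retries in virtual time. It is a sketch only: `DelayedRetrySketch`, `readyAtMs`, and `delayMs` are hypothetical names, and it does not reproduce the code in this diff.

```scala
object DelayedRetrySketch {
  /** Counts how many delayed retries happen before the container state
    * becomes ready at virtual time `readyAtMs`, when each retry waits `delayMs`.
    * Hypothetical model; no real scheduler or actor is involved.
    */
  def retries(readyAtMs: Long, delayMs: Long): Long = {
    require(delayMs > 0, "delayMs must be positive")
    var now = 0L
    var count = 0L
    while (now < readyAtMs) {
      now += delayMs // each retry is postponed by the fixed delay
      count += 1
    }
    count
  }
}
```

With a 3 ms delay and a container that becomes ready after 100 ms, the model performs 34 retries, instead of re-sending as fast as the mailbox can be drained.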

@codecov-commenter

codecov-commenter commented Nov 1, 2022

Codecov Report

Merging #5346 (88887d5) into master (07c9202) will increase coverage by 37.48%.
The diff coverage is 55.55%.

@@             Coverage Diff             @@
##           master    #5346       +/-   ##
===========================================
+ Coverage   38.96%   76.45%   +37.48%     
===========================================
  Files         240      240               
  Lines       14378    14383        +5     
  Branches      614      614               
===========================================
+ Hits         5602    10996     +5394     
+ Misses       8776     3387     -5389     
| Impacted Files | Coverage Δ | |
|---|---|---|
| .../core/containerpool/v2/ActivationClientProxy.scala | 78.08% <33.33%> (+78.08%) | ⬆️ |
| ...ntainerpool/v2/FunctionPullingContainerProxy.scala | 78.67% <100.00%> (+78.67%) | ⬆️ |
| ...pache/openwhisk/core/invoker/InvokerReactive.scala | 53.90% <0.00%> (-17.97%) | ⬇️ |
| .../apache/openwhisk/core/controller/Controller.scala | 83.47% <0.00%> (+0.82%) | ⬆️ |
| .../org/apache/openwhisk/core/entity/EntityPath.scala | 100.00% <0.00%> (+1.88%) | ⬆️ |
| ...apache/openwhisk/core/entitlement/Collection.scala | 88.37% <0.00%> (+2.32%) | ⬆️ |
| ...la/org/apache/openwhisk/core/invoker/Invoker.scala | 72.50% <0.00%> (+2.50%) | ⬆️ |
| .../org/apache/openwhisk/http/PoolingRestClient.scala | 90.90% <0.00%> (+3.03%) | ⬆️ |
| ...hisk/core/controller/actions/SequenceActions.scala | 91.86% <0.00%> (+3.25%) | ⬆️ |
| ...sk/core/containerpool/docker/DockerContainer.scala | 95.60% <0.00%> (+3.29%) | ⬆️ |

... and 154 more

@style95 style95 merged commit 0f4b0c2 into apache:master Nov 1, 2022
msciabarra pushed a commit to nuvolaris/openwhisk that referenced this pull request Nov 23, 2022