When running Invoker on "Bare-Metal" it cannot read log files #3195

Closed
mcdan opened this issue Jan 17, 2018 · 9 comments

@mcdan
Member

mcdan commented Jan 17, 2018

  • Using docker-compose as deployment
  • Running the invoker directly in the OS (Ubuntu in this case)

Steps to reproduce the issue:

  1. Deploy OpenWhisk with docker-compose
  2. Build the invoker
  3. Stop the invoker in compose
  4. Start the built invoker as the root user (needed to actually read the log files)

OpenWhisk cannot read the log files because the /containers path is hardcoded in the invoker's Docker container pool:
https://github.com/apache/incubator-openwhisk/blob/a6a782f7d3a409c542c157e4be46074ef69a67ce/core/invoker/src/main/scala/whisk/core/containerpool/docker/DockerContainerFactory.scala#L43
and
https://github.com/apache/incubator-openwhisk/blob/a6a782f7d3a409c542c157e4be46074ef69a67ce/core/invoker/src/main/scala/whisk/core/containerpool/docker/DockerClientWithFileAccess.scala#L41-L42
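
One way to picture the change being asked for is a lookup that falls back to the hardcoded value. The following is only a sketch, not the invoker's actual code: the object name and the property key invoker.containers.dir are hypothetical, and the log file layout assumed is the one Docker's json-file log driver uses (<dir>/<id>/<id>-json.log).

```scala
import java.nio.file.{Path, Paths}

object ContainersDir {
  // Hypothetical property key, chosen for illustration only.
  val PropertyName = "invoker.containers.dir"

  // The path the issue reports as hardcoded in the invoker.
  val Default = "/containers"

  // Use the configured directory if present, otherwise keep today's behaviour.
  def resolve(props: Map[String, String]): Path =
    Paths.get(props.getOrElse(PropertyName, Default))

  // Location of a container's log file under the json-file log driver layout.
  def logFile(dir: Path, containerId: String): Path =
    dir.resolve(containerId).resolve(s"$containerId-json.log")
}
```

With no property set this resolves to /containers, so the containerized invoker would behave as before; a bare-metal invoker could point it at the host's Docker containers directory instead.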

Additional information you deem important:

  • There is also a swallowed exception that happens if you run the Invoker as a non-root user; I am still trying to track that one down.
mcdan added a commit to adobe-apiplatform/incubator-openwhisk that referenced this issue Jan 17, 2018
@markusthoemmes
Contributor

Right, /containers is hardcoded, but the only thing that matters is that you mount the proper local containers directory at that path inside the container. Why would you need to configure that?

@mcdan
Member Author

mcdan commented Jan 17, 2018

What if you're not running the invoker as a container? Why can't you just run it on the server directly?

@markusthoemmes
Contributor

Ahhhhh, sorry. I misread your intro, my bad. Right, in that case you'll need to tweak that value. Out of curiosity (and not necessarily related to your issue): if you're using docker-compose in general and also install Docker on the machine itself, why do you choose to run the invoker on the machine "natively"?

@mcdan
Member Author

mcdan commented Jan 17, 2018 via email

@chetanmeh
Member

chetanmeh commented Jan 18, 2018

There is also a swallowed exception that happens if you run the Invoker as a non-root user, still trying to track that one down.

@mcdan I tried to track that down, and it appears to come from the call to collectLogs itself. The following rough patch demonstrates and fixes that.

The issue appears to occur while the source is materialized in the call chain collectLogs -> -> DockerToActivationFileLogStore#collectLogs -> logs.runWith. When the Source is materialized there, it throws an exception that is not wrapped in a Future, so the current failure handling is bypassed.
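
The failure mode can be reproduced in isolation. The sketch below is not OpenWhisk code: collect() is a hypothetical stand-in for a log collector that throws before it returns a Future, and the two protected variants show the Try-based approach plus one possible alternative.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

object SyncThrowDemo {
  // Hypothetical collector: throws synchronously instead of returning a failed Future.
  def collect(): Future[String] =
    throw new IllegalArgumentException("Path does not exist")

  // Unprotected: the exception escapes before .recover is ever attached.
  def unprotected(): Future[String] =
    collect().recover { case _ => "fallback" }

  // Try captures the synchronous throw and maps it to a fallback value.
  def viaTry(): Future[String] =
    Try(collect().recover { case _ => "fallback" }) match {
      case Success(f) => f
      case Failure(_) => Future.successful("fallback")
    }

  // Alternative: evaluate the call inside a Future so the throw becomes a failed Future.
  def viaFlatMap(): Future[String] =
    Future(collect()).flatMap(f => f).recover { case _ => "fallback" }
}
```

Calling unprotected() throws on the caller's thread, while both protected variants complete with the fallback value.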

Index: core/invoker/src/main/scala/whisk/core/containerpool/ContainerProxy.scala
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- core/invoker/src/main/scala/whisk/core/containerpool/ContainerProxy.scala	(revision 925500cf8d34bb75bebd077ea057af236eba0b01)
+++ core/invoker/src/main/scala/whisk/core/containerpool/ContainerProxy.scala	(date 1516298930000)
@@ -38,6 +38,8 @@
 import whisk.core.entity.ExecManifest.ImageName
 import whisk.http.Messages
 
+import scala.util.Try
+
 // States
 sealed trait ContainerState
 case object Uninitialized extends ContainerState
@@ -380,18 +382,25 @@
     val activationWithLogs: Future[Either[ActivationLogReadingError, WhiskActivation]] = activation
       .flatMap { activation =>
         val start = tid.started(this, LoggingMarkers.INVOKER_COLLECT_LOGS)
-        collectLogs(tid, job.msg.user, activation, container, job.action)
-          .andThen {
-            case Success(_) => tid.finished(this, start)
-            case Failure(t) => tid.failed(this, start, s"reading logs failed: $t")
-          }
-          .map(logs => Right(activation.withLogs(logs)))
-          .recover {
-            case LogCollectingException(logs) =>
-              Left(ActivationLogReadingError(activation.withLogs(logs)))
-            case _ =>
-              Left(ActivationLogReadingError(activation.withLogs(ActivationLogs(Vector(Messages.logFailure)))))
-          }
+        Try(
+          collectLogs(tid, job.msg.user, activation, container, job.action)
+            .andThen {
+              case Success(_) => tid.finished(this, start)
+              case Failure(t) => tid.failed(this, start, s"reading logs failed: $t")
+            }
+            .map(logs => Right(activation.withLogs(logs)))
+            .recover {
+              case LogCollectingException(logs) =>
+                Left(ActivationLogReadingError(activation.withLogs(logs)))
+              case _ =>
+                Left(ActivationLogReadingError(activation.withLogs(ActivationLogs(Vector(Messages.logFailure)))))
+            }) match {
+          case Success(e) => e
+          case Failure(t) =>
+            tid.failed(this, start, s"reading logs failed: $t")
+            Future.successful(
+              Left(ActivationLogReadingError(activation.withLogs(ActivationLogs(Vector(Messages.logFailure))))))
+        }
       }
 
     // Storing the record. Entirely asynchronous and not waited upon.
Index: tests/src/test/scala/whisk/core/containerpool/test/ContainerProxyTests.scala
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- tests/src/test/scala/whisk/core/containerpool/test/ContainerProxyTests.scala	(revision 925500cf8d34bb75bebd077ea057af236eba0b01)
+++ tests/src/test/scala/whisk/core/containerpool/test/ContainerProxyTests.scala	(date 1516299060000)
@@ -507,6 +507,41 @@
     }
   }
 
+  it should "complete the transaction and destroy the container if log reading failed terminally - v2" in {
+    val container = new TestContainer
+    val factory = createFactory(Future.successful(container))
+    val acker = createAcker
+    val store = createStore
+    val collector = LoggedFunction {
+      (transid: TransactionId,
+       user: Identity,
+       activation: WhiskActivation,
+       container: Container,
+       action: ExecutableWhiskAction) =>
+        throw new Exception
+    }
+
+    val machine =
+      childActorOf(ContainerProxy.props(factory, acker, store, collector, InstanceId(0), pauseGrace = timeout))
+    registerCallback(machine)
+    machine ! Run(action, message)
+    expectMsg(Transition(machine, Uninitialized, Running))
+    expectMsg(ContainerRemoved) // The message is sent as soon as the container decides to destroy itself
+    expectMsg(Transition(machine, Running, Removing))
+
+    awaitAssert {
+      factory.calls should have size 1
+      container.initializeCount shouldBe 1
+      container.runCount shouldBe 1
+      collector.calls should have size 1
+      container.destroyCount shouldBe 1
+      acker.calls should have size 1
+      acker.calls(0)._2.response shouldBe ActivationResponse.success()
+      store.calls should have size 1
+      store.calls(0)._2.logs shouldBe ActivationLogs(Vector(Messages.logFailure))
+    }
+  }
+
   it should "resend the job to the parent if resuming a container fails" in within(timeout) {
     val container = new TestContainer {
       override def resume()(implicit transid: TransactionId) = {

With this patch, the following log entries are seen:

[2018-01-18T23:40:19.846Z] INFO  [pool-2-thread-13] akka.actor.ActorSystemImpl  - [#tid_49] [ContainerProxy]  [marker:invoker_collectLogs_start:669] 
[2018-01-18T23:40:19.848Z] WARN  [pool-2-thread-13] akka.actor.ActorSystemImpl  - [#tid_49] [ContainerProxy] reading logs failed: java.lang.IllegalArgumentException: Path '/project/workdir//data/openwhisk/invoker/containers/afc36bff93783c2dcd125743e7349b16e754b8d3fb8ff87e525eba94fb03dc32/afc36bff93783c2dcd125743e7349b16e754b8d3fb8ff87e525eba94fb03dc32-json.log' does not exist [marker:invoker_collectLogs_error:670:1] 
[2018-01-18T23:40:19.849Z] INFO  [pool-2-thread-13] whisk.core.invoker.Invoker$  - [#tid_49] [InvokerReactive] recording the activation result to the data store 

mcdan added a commit to adobe-apiplatform/incubator-openwhisk that referenced this issue Jan 22, 2018
mcdan added a commit to adobe-apiplatform/incubator-openwhisk that referenced this issue Feb 26, 2018
 * Simplify error handling code
 * Change how data location is calculated to avoid performance issue
@style95
Member

style95 commented Sep 20, 2018

@mcdan This is because the invoker tries to access the containers directory under /var/lib/docker while collecting logs.
But the containers directory is generally owned by root.
Since your IDE runs as a normal user, it does not have permission to access that directory.

So if you run your IntelliJ IDE as the root user, you can run the invoker as well.
You might also need to create a symlink for containers in the project root directory.
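
To fail fast rather than hit this during log collection, a start-up check along these lines might help. It is only a sketch; the object and method names are hypothetical and nothing like it is confirmed to exist in the invoker.

```scala
import java.nio.file.{Files, Path, Paths}

object ContainersDirCheck {
  // Returns Right(dir) if the current user can list and traverse the directory,
  // otherwise Left with a human-readable reason.
  def check(dir: Path): Either[String, Path] =
    if (!Files.isDirectory(dir))
      Left(s"$dir does not exist, is not a directory, or is not accessible")
    else if (!Files.isReadable(dir) || !Files.isExecutable(dir))
      Left(s"$dir is not readable by user ${System.getProperty("user.name")}")
    else
      Right(dir)
}
```

For example, ContainersDirCheck.check(Paths.get("/var/lib/docker/containers")) would be expected to return a Left for a normal user on a typical Docker host, where that directory is owned by root.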

I hope this would be helpful.

@ddragosd
Contributor

ddragosd commented Nov 29, 2018

I'm taking another stab at this, and I'm stuck at getting the invoker to communicate with action containers. More details here: docker/for-mac#171.

My current thinking is to provide a simple ContainerFactoryProvider that exposes the action container's port 8080 to the host (-p 0:8080).

The logging issue would still be there, unless ... -Dwhisk.log-limit.max=0, which skips the log collection.

@ddragosd
Contributor

I've placed my changes in #4142.
Now both Controller and Invoker can be started and debugged from IntelliJ directly.

@rabbah
Member

rabbah commented Dec 18, 2019

Closing as stale.

rabbah closed this as completed Dec 18, 2019