[WIP] Storm improvements #65

Merged: 19 commits merged on Nov 17, 2015

Conversation

brndnmtthws
Member

  • Added additional configuration parameters for controlling scheduler
    behaviour:
    • mesos.offer.filter.seconds
    • mesos.offer.expiry.multiplier
    • mesos.prefer.reserved.resources
  • Implemented combining of reserved & unreserved resources
  • Improved Docker support, added support for running Storm inside
    containers. This also introduces the mesos.container.docker.image
    config param
  • Added unit tests
  • Improved port handling, especially with regard to logviewer
  • Code cleanup/style fixes
  • Implemented filters & reviveOffers() (see the sketch below)

Also from @maverick2202:

  • Upgrade Storm
  • Add worker name prefix
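
For context, here's a minimal, hypothetical sketch (not code from this PR; class and method names are illustrative) of the mechanism behind the offer-handling parameters above: unusable offers are declined with a refuse filter (the idea behind mesos.offer.filter.seconds), and reviveOffers() clears outstanding filters once the scheduler needs resources again. Only SchedulerDriver, Filters, and OfferID are standard Mesos APIs here.

    import org.apache.mesos.Protos.Filters;
    import org.apache.mesos.Protos.OfferID;
    import org.apache.mesos.SchedulerDriver;

    // Illustrative sketch only; not the scheduler code from this PR.
    class OfferFilterSketch {
      // Decline an offer we can't use, asking the master not to re-offer
      // these resources for filterSeconds (cf. mesos.offer.filter.seconds).
      static void declineWithFilter(SchedulerDriver driver, OfferID offerId, double filterSeconds) {
        Filters filters = Filters.newBuilder().setRefuseSeconds(filterSeconds).build();
        driver.declineOffer(offerId, filters);
      }

      // When new work arrives (e.g. a topology needs workers), clear all
      // outstanding filters so previously declined resources are offered again.
      static void onResourcesNeeded(SchedulerDriver driver) {
        driver.reviveOffers();
      }
    }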

NOTE: this isn't quite ready to merge yet.

Let's merge this instead of #62 and #63.

cc @sargun @erikdw

Ankur Choksi added 3 commits November 11, 2015 14:01
 - Use Storm 0.9.5
 - Add prefix to the worker mesos id
 - Use proper directory name for HTTP server
}

private final Protos.TaskInfo task;
private final Protos.Offer offer;

These should be at the top of the class.

@sargun

sargun commented Nov 11, 2015

Please add findbugs to the pom:

    <reporting>
      <plugins>
        <plugin>
          <groupId>org.codehaus.mojo</groupId>
          <artifactId>findbugs-maven-plugin</artifactId>
          <version>3.0.3</version>
          <configuration>
            <effort>Max</effort>
            <threshold>Default</threshold>
            <xmlOutput>true</xmlOutput>
          </configuration>
        </plugin>
      </plugins>
    </reporting>

@brndnmtthws
Member Author

Thanks for the review @sargun. Updated the PR as per your comments.

@brndnmtthws force-pushed the wip-improvements-0.9.6 branch 3 times, most recently from 7a49c40 to a437ce3 on November 12, 2015 16:35
@@ -9,7 +9,7 @@ STORM_CMD = STORM_PATH + "/storm"
def nimbus():

Please PEP8 this file:

3c075477e55e:bin sdhillon$ pep8 storm-mesos 
storm-mesos:9:1: E302 expected 2 blank lines, found 1
storm-mesos:10:3: E111 indentation is not a multiple of four
storm-mesos:11:3: E111 indentation is not a multiple of four

public static String hostFromAssignmentId(String assignmentId, String delimiter) {
final int last = assignmentId.lastIndexOf(delimiter);
String host = assignmentId.substring(last + delimiter.length());
LOG.debug("AssignMentId: " + assignmentId + " Host: " + host);

AssignMentId? Why is the M capitalized?

Member Author


Not sure. I'll fix it though.

@sargun

sargun commented Nov 13, 2015

All the places you're doing:

    if (X == null)
      X = $default_value

Why not use Optionals across the board?
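
Purely for illustration (not this PR's code), the suggested pattern might look roughly like this, assuming Java 8's java.util.Optional is available (Guava's Optional is a similar alternative on older JDKs); the config key and default value below are placeholders:

    import java.util.Map;
    import java.util.Optional;

    class ConfigLookupSketch {
      // Hypothetical default; not the PR's actual value.
      static final Number DEFAULT_OFFER_FILTER_SECONDS = 120;

      // Instead of:  if (value == null) value = DEFAULT_...;
      // wrap the possibly-null lookup and supply the default in one expression.
      static Number offerFilterSeconds(Map<String, Object> conf) {
        return Optional.ofNullable((Number) conf.get("mesos.offer.filter.seconds"))
                       .orElse(DEFAULT_OFFER_FILTER_SECONDS);
      }
    }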

}
}

public void taskLost(final TaskID taskId) {

If you're not doing reconciliation, how will this callback ever get triggered on disconnection?

Member Author


Hmm, I'm not sure why you're asking. This could be called any time there's a status update with TASK_LOST.


Yeah - but you're not doing task reconciliation. So, if you're dependent on task statuses working at all, things won't work well.

Member Author


Fair enough. Storm does its own out-of-band reconciliation-like thing (using ZK), so I'm not too worried about it.
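
For context on the reconciliation point, here's a minimal sketch (not part of this PR) of explicit reconciliation via the Mesos driver API. Passing an empty collection requests "implicit" reconciliation, after which the master re-sends the latest status, including TASK_LOST, for every task it knows about for this framework:

    import java.util.Collections;
    import org.apache.mesos.Protos.TaskStatus;
    import org.apache.mesos.SchedulerDriver;

    class ReconciliationSketch {
      // An empty list means implicit reconciliation: the master answers with a
      // statusUpdate() callback for every task it currently tracks for this framework.
      static void reconcileAllTasks(SchedulerDriver driver) {
        driver.reconcileTasks(Collections.<TaskStatus>emptyList());
      }
    }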

@brndnmtthws
Member Author

Think I got all the == null's.

@sargun

sargun commented Nov 13, 2015

I was unable to test fault recovery, but it looks pretty good. Please reorder / squash commits.

## Optional configuration

* `mesos.supervisor.suicide.inactive.timeout.secs`: Seconds the supervisor waits before suiciding when it has no tasks to run. Defaults to "120".
* `mesos.master.failover.timeout.secs`: Framework failover timeout in seconds. Defaults to "3600".
* `mesos.master.failover.timeout.secs`: Framework failover timeout in seconds. Defaults to "24*7*3600".
Collaborator


@brndnmtthws What's the motivation for setting such a long timeout?

Member Author


To prevent accidental framework removal. It's just a default, so you can always specify something else :)

@erikdw
Collaborator

erikdw commented Nov 17, 2015

DOOOOOOOOOD!!!

@erikdw
Collaborator

erikdw commented Nov 17, 2015

I said we had other comments man. This is a HUGE change.

@brndnmtthws
Member Author

Happy to address them quickly. I just don't want this PR to keep growing over time, because we'll never get it merged.

@erikdw
Collaborator

erikdw commented Nov 17, 2015

Well... let's all make an effort to keep changes small and modular in the future then. Then they are reviewable without herculean effort, less likely to break things, etc.

-->
<property name="tokens" value="ASSIGN, BAND, BAND_ASSIGN, BOR,
Collaborator


The differences between this and just accepting the default are the following values:

  • DO_WHILE (the while keyword in a do-while)
  • LCURLY (left curly)
  • LITERAL_SWITCH (the switch keyword)
  • RCURLY (right curly)
  • SLIST (a statement list)
  • LITERAL_ASSERT (the assert keyword)
  • TYPE_EXTENSION_AND (the & symbol when used in a generic upper or lower bounds constraint)

If there isn't a particular reason for excluding these values, this entire block can be simplified to <module name="WhitespaceAround"/>

<property name="tokens" value="COMMA, SEMI, TYPECAST"/>
</module>

<module name="NoWhitespaceAfter">
Collaborator


The differences between this and just accepting the default are the following values:

  • ARRAY_INIT (an array initialization)
  • ARRAY_DECLARATOR (an array declaration)
  • INDEX_OP (the array index operator)

If there isn't a particular reason for excluding these values, this entire block can be simplified to <module name="NoWhitespaceAfter"/>

@erikdw
Collaborator

erikdw commented Dec 4, 2015

I'm not sure yet what the cause is, but when I'm testing the post-#65 version of this project, it is unable to launch multiple worker tasks in the same executor at the same time. I'm suspicious of either the assignmentId changes or the declined-offer-filtering change, but cannot say for sure yet what the cause really is.

As an example, I have a small topology that needs 3 workers, all 3 of which get assigned to the same host (I only have 1 mesos-slave host) per the MesosNimbus logs, but only 1 comes up every 2 minutes. i.e., first port 31000 comes up, but for the other 2 there are no logs in the supervisor, then a bit over 2 minutes later it's 31001 that comes up, then a bit over 2 minutes after that it's 31002 that comes up. Will update once I figure out more about what's happening, but figured I'd mention it as soon as I validated that it's happening.

(Of course it's also entirely possible that this is some artifact of the vagrant setup I'm using and not a real problem in the MesosNimbus / MesosSupervisor code -- TBD! 🔍 )

Ah... so I found the proximate cause in the mesos-master logs. The problem is related to something that I feel is a bug (or at least an odd design decision) within mesos proper. Specifically, the ExecutorInfo field must be identical between the Executor and all tasks within a given Executor, or the master rejects tasks with a mismatching ExecutorInfo.

Error log:
I1204 07:54:43.277238  6970 master.cpp:4449] Sending status update TASK_ERROR (UUID: e06501dc-357f-0000-400b-2aee357f0000) for task master-31001 of framework 8602f882-7716-44f2-9b98-5de10437ad61-0000 'Task has invalid ExecutorInfo (existing ExecutorInfo with same ExecutorID is not compatible).
Existing ExecutorInfo:
executor_id {
  value: "smoketest-f-2-1449215681"
}
data: "{\"supervisorid\":\"master-smoketest-f-2-1449215681\",\"assignmentid\":\"master\"}"
resources {
  name: "cpus"
  type: SCALAR
  scalar {
    value: 0.1
  }
  role: "*"
}
resources {
  name: "mem"
  type: SCALAR
  scalar {
    value: 500
  }
  role: "*"
}
command {
  uris {
    value: "file:///usr/local/storm/storm-mesos-0.9.6.tgz"
  }
  uris {
    value: "http://master:53877/generated-conf/storm.yaml"
  }
  value: "cp storm.yaml storm-mesos*/conf && cd storm-mesos* && python bin/storm supervisor storm.mesos.MesosSupervisor"
}
framework_id {
  value: "8602f882-7716-44f2-9b98-5de10437ad61-0000"
}
name: "storm-supervisor | smoketest-f-2-1449215681 | master"
Rejected Task's ExecutorInfo:
executor_id {
  value: "smoketest-f-2-1449215681"
}
data: "{\"supervisorid\":\"master-smoketest-f-2-1449215681\",\"assignmentid\":\"master\"}"
command {
  uris {
    value: "file:///usr/local/storm/storm-mesos-0.9.6.tgz"
  }
  uris {
    value: "http://master:53877/generated-conf/storm.yaml"
  }
  value: "cp storm.yaml storm-mesos*/conf && cd storm-mesos* && python bin/storm supervisor storm.mesos.MesosSupervisor"
}
framework_id {
  value: "8602f882-7716-44f2-9b98-5de10437ad61-0000"
}
name: "storm-supervisor | smoketest-f-2-1449215681 | master"
Notably, the rejected Task's ExecutorInfo lacks the 2 resources sections.

As for why that is... need to look further.

executorInfoBuilder
.setExecutorId(ExecutorID.newBuilder().setValue(details.getId()))
.setData(ByteString.copyFromUtf8(executorDataStr));
if (!subtractedExecutorResources) {
Collaborator


This is the logic causing the problem I described here. I'm sending a PR in a second.
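
To illustrate the direction of the fix described in the comments above (the actual change lands in erikdw's follow-up PR, not here), a hedged sketch that attaches the executor's resources to the ExecutorInfo unconditionally, so every task sharing the ExecutorID carries an identical ExecutorInfo; the helper names and the cpu/mem parameters are placeholders:

    import com.google.protobuf.ByteString;
    import org.apache.mesos.Protos.ExecutorID;
    import org.apache.mesos.Protos.ExecutorInfo;
    import org.apache.mesos.Protos.Resource;
    import org.apache.mesos.Protos.Value;

    class ExecutorInfoSketch {
      // Hypothetical helper: a scalar resource such as "cpus" or "mem" in the default "*" role.
      static Resource scalar(String name, double amount) {
        return Resource.newBuilder()
            .setName(name)
            .setType(Value.Type.SCALAR)
            .setScalar(Value.Scalar.newBuilder().setValue(amount))
            .build();
      }

      // Always attach the executor's own cpu/mem, even when the executor is
      // already running, so the master sees an identical ExecutorInfo for all
      // tasks under the same ExecutorID and does not reject later tasks.
      static ExecutorInfo.Builder withExecutorResources(ExecutorInfo.Builder builder,
                                                        String executorId,
                                                        String executorDataStr,
                                                        double executorCpu,
                                                        double executorMem) {
        return builder
            .setExecutorId(ExecutorID.newBuilder().setValue(executorId))
            .setData(ByteString.copyFromUtf8(executorDataStr))
            .addResources(scalar("cpus", executorCpu))
            .addResources(scalar("mem", executorMem));
      }
    }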

erikdw added a commit to erikdw/storm-mesos that referenced this pull request Dec 4, 2015
One of the logic changes in PR mesos#65 broke the ability to simultaneously launch
more than 1 worker process for a given topology.  The cause of the breakage
was intentionally avoiding inclusion of the executor's resources into the
ExecutorInfo structure associated with the mesos tasks (storm workers).
Notably, the avoidance is only triggered for tasks other than the 1st one
that potentially launches the executor.

This is problematic because the mesos-master rejects tasks whose ExecutorInfo
isn't identical to other tasks under the same executor.  Notably, since the
executor is already running for subsequent tasks, the resources that are
added to these subsequent tasks' ExecutorInfo aren't actually used, so there
is no advantage in attempting to avoid their inclusion.

FFR, this is the commit with that change:
* af8c49b

After this fix I was able to instantly launch 3 workers for a topology
on the same mesos-slave host.
DarinJ pushed a commit to DarinJ/storm that referenced this pull request Dec 17, 2015
resources.memSlots = (int) Math.floor((offerMem - executorMem) / mem);
if (r.hasReservation()) {
// skip resources with reservations
continue;
Collaborator


@brndnmtthws Why are we skipping reserved resources?

Member Author


With the way dynamic reservations are implemented in Mesos, you may receive reserved offers for other frameworks. Since the Storm framework doesn't implement dynamic reservations, we just decline all of them.
