
Poll API, health management for recorders, backend daemon for reporting load, aggregation window management, work assignment scheduling #47

Merged
merged 94 commits into master from work_assignment on Mar 15, 2017

Conversation

@anvinjain (author):

  1. Poll API: update recorder info and return work assignment in response
  2. Health management for recorders on poll API request
  3. Backend daemon for reporting load
  4. Managing process group associations on backend
  5. Setting up and expiry of aggregation window
  6. Building schedule of work assignments on the basis of work reported by leader

gauravAshok and others added 30 commits December 22, 2016 13:36
workid -> profileworkinfo map in aggregated profile
app,cluster,proc fields in window
fleshed out default root nodes in stack tree
running checksum using underlying byte arr
fixed bugs in usage of codedinputstream parsing
profile indexes with update when wse is processed
… profiles for same work id, finalizable entities, aggregation entity pojos added in aggregation module, other pull request related fixes

@janmejay janmejay left a comment


Haven't looked at tests at all (assuming they are good and cover the feature well).

Didn't see any integration tests; assuming you want them done as part of the e2e work. If not, please cover basic integration flows (e.g. the backend-defunct flow with a real backend and a real leader has nothing to do with a running broker, so it can perhaps be tested at the integration level in isolation).

Other comments are all inline.

Other than the core logic issues (which need to be fixed; the GC overhead ones are more work of course, but the rest are fairly direct in terms of analysis and fix), the general theme of soft issues coming across is about:

  • naming
  • more than necessary indirection

I have called them out in some places, but haven't been thorough in calling out every single instance. I have called out at least one example of each pattern, though, so after going through the whole thing, maybe take a cursory look at the code changes again (only with readability and maintainability in mind) and see if you find anything worth fixing.

@@ -59,7 +64,7 @@ protected Header buildHeaderProto(int version, WorkType workType) {
     return Header.newBuilder()
             .setFormatVersion(version)
             .setWorkType(workType)
-            .setAggregationEndTime(endWithTolerance.atOffset(ZoneOffset.UTC).format(DateTimeFormatter.ISO_ZONED_DATE_TIME))
+            .setAggregationEndTime(endedAt == null ? null : endedAt.atOffset(ZoneOffset.UTC).format(DateTimeFormatter.ISO_ZONED_DATE_TIME))
@janmejay:

For all DTOs, we should consider re-using the object for serialization.

They expose a "clear" mechanism to do this. One can clear the DTO and start afresh without any additional GC pressure.

Something worth thinking about.
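A minimal sketch of the reuse pattern being suggested, assuming a single-threaded caller and a hypothetical headerBuilder field kept alongside the method shown in the diff; whether the protobuf API actually permits this is exactly what the follow-up comments question.

    // Hypothetical sketch: keep one builder around and clear() it before each use,
    // instead of allocating a fresh builder per header. Only safe where concurrent
    // access cannot happen.
    private final Header.Builder headerBuilder = Header.newBuilder();

    protected Header buildHeaderProto(int version, WorkType workType) {
        headerBuilder.clear();          // reset all previously set fields
        return headerBuilder
                .setFormatVersion(version)
                .setWorkType(workType)
                .build();
    }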

@anvinjain (author):

You mean reusing the builder object, right? Also, clearing the same builder object can work only where concurrent access cannot happen. The above is a candidate, agreed.

@janmejay:

No, I meant both builder and DTO, but it looks like neither is possible. The API is extremely stupid, it seems; read: https://groups.google.com/forum/#!topic/protobuf/b0gS4wpjuIo and https://groups.google.com/forum/#!topic/protobuf/No9bBRh3Wp0

It seems they wanted to support some "optimizations" by being GC unfriendly, which they didn't get right either. Dumbness rules!

SimultaneousWorkAssignmentCounter simultaneousWorkAssignmentCounter = new SimultaneousWorkAssignmentCounterImpl(configManager.getMaxSimultaneousProfiles());

VerticleDeployer backendHttpVerticleDeployer = new BackendHttpVerticleDeployer(vertx, configManager, leaderStore, aggregationWindowLookupStore, processGroupAssociationStore);
VerticleDeployer backendDaemonVerticleDeployer = new BackendDaemonVerticleDeployer(vertx, configManager, leaderStore, processGroupAssociationStore, aggregationWindowLookupStore, simultaneousWorkAssignmentCounter);
@janmejay:

We should have a validation that ensures only one thread will ever run the backend daemon. Basically 1 verticle.
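A sketch of the kind of guard being asked for, using Vert.x DeploymentOptions; the method name is hypothetical (the actual fix is in 668398a, inside the daemon's deployer).

    import io.vertx.core.DeploymentOptions;

    // Hypothetical validation inside BackendDaemonVerticleDeployer: refuse anything
    // other than a single instance so only one thread ever runs the backend daemon.
    static void ensureSingleDaemonInstance(DeploymentOptions deploymentOptions) {
        if (deploymentOptions.getInstances() != 1) {
            throw new IllegalStateException(
                    "Backend daemon verticle must be deployed with exactly 1 instance, got "
                            + deploymentOptions.getInstances());
        }
    }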

@anvinjain (author), Mar 14, 2017:

Validation is done inside the deployer for backend daemon. Fixed in commit 668398a


//Add process group associations which are returned by leader
for(Recorder.ProcessGroup processGroup: processGroups.getProcessGroupList()) {
ProcessGroupDetail existingValue = this.processGroupLookup.putIfAbsent(processGroup, new ProcessGroupDetail(processGroup, thresholdForDefunctRecorderInSecs));
@janmejay:

This is a little confusing, because there is just one thread calling it.

Calling put as opposed to putIfAbsent makes the intent clear.

@anvinjain (author):

putIfAbsent is required here. The leader always returns all the process groups associated with a backend. Process groups which are already present in the lookup should not be associated with a new ProcessGroupDetail instance, only the ones which are new. Hence the putIfAbsent semantics.
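A small illustration of the semantics being described, reusing the names from the snippet above: put would replace an existing ProcessGroupDetail and the state it carries, while putIfAbsent keeps it.

    // The leader re-reports every process group it has associated with this backend.
    // For a process group that is already known, putIfAbsent returns the existing
    // detail object and leaves it in place; only genuinely new process groups get a
    // fresh ProcessGroupDetail.
    ProcessGroupDetail existingValue = processGroupLookup.putIfAbsent(
            processGroup, new ProcessGroupDetail(processGroup, thresholdForDefunctRecorderInSecs));
    if (existingValue == null) {
        // new association: the freshly created detail is now stored
    } else {
        // already associated: existingValue (and the recorder state it tracks) is retained
    }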

private void abortOngoingProfiles() {
ensureEntityIsWriteable();
try {
for (Map.Entry<Long, ProfileWorkInfo> entry : workInfoLookup.entrySet()) {
@janmejay:

This appears racy at first, until one realizes that put really only happens in the constructor.

We should make this an immutable map, I guess, to make the invariants absolutely obvious?

It needn't even be a concurrent data structure, because the volatile bool orders things anyway. Just a HashMap will do.
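A sketch of the immutable-map shape being suggested, assuming the entries are all known at construction time (the ProfileWorkInfo constructor shown is hypothetical); the actual change is in a525979.

    // Hypothetical constructor snippet: populate a plain HashMap once, then wrap it
    // so the read-only invariant is explicit. The volatile state flag already gives
    // the necessary happens-before ordering, so no concurrent map is needed.
    Map<Long, ProfileWorkInfo> lookup = new HashMap<>();
    for (long workId : workIds) {
        lookup.put(workId, new ProfileWorkInfo(workId));   // hypothetical constructor
    }
    this.workInfoLookup = Collections.unmodifiableMap(lookup);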

@anvinjain (author):

Fixed in commit a525979

abortOngoingProfiles();
long[] workIds = this.workInfoLookup.keySet().stream().mapToLong(Long::longValue).toArray();
aggregationWindowLookupStore.deAssociateAggregationWindow(workIds);
this.endedAt = LocalDateTime.now(Clock.systemUTC());
@janmejay:

This means we can have profiles that end first and start later (in the worst cases).

Maybe we should completely move to stable clocks here? Or maybe it's worth recording both: we record a start and end for convenience, but also keep a stable-clock based duration?

@anvinjain (author):

start is assigned in the constructor itself (maybe I should not accept it as a param and instead generate the value in the constructor). Unless the clock goes backward, end-before-start cannot happen, right?

@janmejay:

Yes, clock rewind scenarios. On Varadhi machines we have seen up to 4 minutes of delta across ~70 odd brokers.

@anvinjain (author):

But clock rewind happening on the same machine itself?

@anvinjain (author):

Vert.x delegates timer stuff to the schedule methods exposed by Netty's event loop. Netty converts this into a ScheduledFutureTask, and the scheduling queue is handled by AbstractScheduledEventExecutor.
Netty uses System.nanoTime internally (a stable clock), so wall-clock skew on the backend should not affect us. Of course, the end time can still be recorded incorrectly, so I will additionally store the duration in the object as a separate field.
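A sketch of recording both the convenience wall-clock timestamps and a stable-clock duration, as agreed above; field names are hypothetical (the real change went into the aggregation window in a525979).

    // Hypothetical fields in the aggregation window: wall-clock start/end kept for
    // convenience, plus a monotonic tick so the duration cannot be distorted by
    // clock skew or rewind.
    private final LocalDateTime start = LocalDateTime.now(Clock.systemUTC());
    private final long startTick = System.nanoTime();
    private LocalDateTime endedAt;
    private int durationInSecs;

    private void markEnded() {
        this.endedAt = LocalDateTime.now(Clock.systemUTC());   // may be skewed, for convenience only
        this.durationInSecs = (int) TimeUnit.NANOSECONDS.toSeconds(System.nanoTime() - startTick);
    }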

@anvinjain (author), Mar 14, 2017:

Fixed in commit a525979 by adding a duration field in the aggregation window. Opened an issue to add the same in the proto for aggregated profiles and the associated serde: #62


public class RecorderProtoUtil {

public static Recorder.ProcessGroup mapRecorderInfoToProcessGroup(Recorder.RecorderInfo recorderInfo) {
@janmejay:

All functions in this class are GC-provoking. Maybe return closable-wrapped objects to eventually allow pooling without too much trouble?
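A rough sketch of the closable-wrapper idea (all names hypothetical): callers use try-with-resources and close() hands the wrapped object back to a pool instead of leaving it for the GC. As the replies note, the protobuf objects themselves may not be reusable, so what actually goes in the pool is an open question.

    // Hypothetical closable wrapper around some pooled, reusable payload.
    final class Pooled<T> implements AutoCloseable {
        private final T value;
        private final java.util.function.Consumer<T> returnToPool;

        Pooled(T value, java.util.function.Consumer<T> returnToPool) {
            this.value = value;
            this.returnToPool = returnToPool;
        }

        T get() {
            return value;
        }

        @Override
        public void close() {
            returnToPool.accept(value);   // hand the instance back instead of dropping it
        }
    }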

@anvinjain (author):

Will discuss offline

@anvinjain (author), Mar 15, 2017:

As you mentioned earlier, proto builders are single-use only and cannot be used to construct other proto objects once build() has been called on them.


private ProfHttpClient buildHttpClient() {
JsonObject httpClientConfig = configManager.getHttpClientConfig();
ProfHttpClient httpClient = ProfHttpClient.newBuilder()
@janmejay:

In cases like this the builder is not useful. It takes 5 params and we are providing all 5, which defeats the point.

I'd rather have a convenience constructor that takes httpClientConfig and funnels it to the 5-arg constructor.
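A sketch of the convenience constructor being suggested, funnelling the Vert.x JsonObject into the existing explicit constructor; the config keys and parameter list here are hypothetical (what actually landed is the "build httpclient from json config" method mentioned in the commit list below).

    // Hypothetical convenience constructor: pull the values out of the JSON config
    // once and delegate to the existing 5-arg constructor, so call sites stay short.
    public ProfHttpClient(JsonObject httpClientConfig) {
        this(httpClientConfig.getInteger("connect.timeout.ms", 5000),
             httpClientConfig.getInteger("idle.timeout.secs", 10),
             httpClientConfig.getInteger("max.retries", 2),
             httpClientConfig.getBoolean("keepalive", true),
             httpClientConfig.getBoolean("compression", true));
    }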

@anvinjain (author), Mar 10, 2017:

A bloated param list in a constructor is such a readability pain, in my view. A builder gives the convenience of self-documenting code: I don't have to keep matching a param's index against the param name in the constructor signature to understand what gets mapped where, and it reduces the risk of accidentally mixing up argument order (for multiple variables of the same type, like int, there is no visual cue).

Regardless, we have had this discussion earlier; I will fix such instances.

@janmejay:

Let us not remove it unless we really agree. Let us discuss this offline.

@anvinjain (author):

todo: Default behavior for builder

@anvinjain (author):

Fixed in commit eaa57eb

if(ar.result().getStatusCode() == 200) {
try {
Recorder.ProcessGroups assignedProcessGroups = ProtoUtil.buildProtoFromBuffer(Recorder.ProcessGroups.parser(), ar.result().getResponse());
processGroupAssociationStore.updateProcessGroupAssociations(assignedProcessGroups, (processGroupDetail, processGroupAssociationResult) -> {
@janmejay:

This is keeping the mapping fresh, but we ignore failures.

If we are unable to talk to the leader, the leader may assign the same PG to another backend.

We should perhaps configure a threshold after which, if we remain partitioned, we drop ownership. Of course, this threshold should be fairly high so that we survive intermittent network partitions; DC-level issues are usually resolved in the first few hours, so maybe we should go on for an hour or two and then drop the ball.

Of course, we should namespace the serialized files then, because it's easy to end up in a situation where two backends (one of which is not able to talk to the leader) have synchronized aggregation windows and are stepping on each other's toes.

@anvinjain (author), Mar 10, 2017:

Assignment of a PG on the leader is lazy and happens only in response to an /association request made by some recorder. For multiple aggregation windows to be synchronized, the leader has to assign the same PG to both backends (B1 and B2). These events are ordered from the leader's perspective: once a PG is assigned to B1, the leader will not assign it to B2 unless B1 goes defunct (a delay of d seconds on the leader's nanotime clock has to be observed). Since an aggregation window determines its start time (wall clock) when it is initialized, which is when the association is received by the backend, two windows having the same start time would mean the clock at B2 is exactly d seconds behind B1 and two or more competing /association requests were made exactly so.

Let's also consider the scenario where an assignment has already been made and an aggregation window is in progress: a network partition occurs and some recorders (call that set S1) can talk to the previous backend (B1) while others (S2) cannot. Recorders in S2 then call /association and get a new backend (B2) assigned to the PG if the leader determines B1 is defunct. If B1 had recovered before that, or was healthy all along according to the leader, reassignment does not happen and the leader still returns B1.
If the leader returns B2 because it considers B1 defunct, recorders in S1 are still talking to B1. That is a problem, yes, but only for the current aggregation window. I can add a check in the /leader/work API to return a work assignment only if B1 is still assigned to the PG. In that case two files are written to disk, but as I said with different start times, and we can live with that: they show up as different profiles, and coverage is distributed in proportion to the machines observed by the respective backends. If the network recovers before the aggregation window ends, the backend is able to report load to the leader, finds out it is no longer associated with the PG, and expires its own aggregation window.

So we can tweak the /leader/work API such that the backend sends its own details in the request and the leader checks the association before handing out work. Makes sense?

@janmejay:

Yes, that /leader/work proposal does seem to work.
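A minimal sketch of what that check might look like on the leader side (handler, store, and helper names are all hypothetical; the actual change landed in 4082bc7):

    // Hypothetical /leader/work handler guard: the backend identifies itself in the
    // request, and work is handed out only if that backend is still the one the
    // leader has associated with the process group.
    void handleGetWork(RoutingContext context) {
        Recorder.ProcessGroup processGroup = parseProcessGroup(context);                 // hypothetical helper
        String requestingBackend = context.request().getParam("backend");                // hypothetical param
        String associatedBackend = associationStore.getAssociatedBackend(processGroup);  // hypothetical lookup
        if (associatedBackend == null || !associatedBackend.equals(requestingBackend)) {
            context.response().setStatusCode(400).end();   // requester no longer owns this PG, no work handed out
            return;
        }
        context.response().end(Buffer.buffer(buildWorkAssignment(processGroup).toByteArray()));  // hypothetical builder
    }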

@anvinjain (author):

Fixed in commit 4082bc7

} else {
logger.error("Error when reporting load to leader", ar.cause());
}
setupTimerForReportingLoad();
@janmejay:

I'd do this in a finally for safety. If anything goes wrong here we lose our heartbeat, which is pretty scary.
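A sketch of the finally-shaped handler being suggested for this spot, matching the snippet above (the response-processing call is a hypothetical placeholder):

    // Whatever happens while handling the load-report response, the next report is
    // always scheduled, so an unexpected exception cannot silently kill the heartbeat.
    try {
        if (ar.succeeded() && ar.result().getStatusCode() == 200) {
            processLoadReportResponse(ar.result());   // hypothetical
        } else {
            logger.error("Error when reporting load to leader", ar.cause());
        }
    } finally {
        setupTimerForReportingLoad();
    }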

@janmejay:

applies to the one below too.

@anvinjain (author):

This one should be moved to a finally, but the ones below are in the catch clause and else clause for a reason: otherwise competing load-report timers will be set.

@anvinjain (author):

Fixed in commit eaa57eb

String leaderIPAddress;
if((leaderIPAddress = leaderReadContext.getLeaderIPAddress()) != null) {
try {
String requestPath = new StringBuilder(ApiPathConstants.LEADER_GET_WORK)
@janmejay:

A method for this path generation and a class for a leader-bound http client would help reduce this repetitive code. E.g. for the http client, host, port etc. can be constructor params. Maybe separate wrapper objects over httpClient that encapsulate path and verb too?
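A sketch of the path-generation helper being asked for (names other than ApiPathConstants.LEADER_GET_WORK are hypothetical); the reply below notes the actual refactor went into url utils in 4082bc7.

    // Hypothetical helper: centralizes leader-bound path construction so callers
    // only supply the variable path segments.
    public final class LeaderUrlUtil {
        private LeaderUrlUtil() {
        }

        public static String leaderGetWorkPath(String appId, String clusterId, String procName) {
            return ApiPathConstants.LEADER_GET_WORK + "/" + appId + "/" + clusterId + "/" + procName;
        }
    }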

@anvinjain (author):

Path generation abstracted out into url utils. Fixed in commit 4082bc7

…gic for load reporting and backend defunctness

Work assignment schedule does not take max concurrency as ceiling
anymore and so does not prevent building schedule
Work slot pool deals with slot pojo instead of integers
Retired work assignment factory and instead work assignment schedule
accepts bootstrap config in constructor
Backend health updated if newer tick is received by leader, backend
does not send newer ticks if load report fails
…egation window

Making work info lookup non-concurrent and immutable in aggregation
window
Adding duration as member in aggregation window
Rename of process group association store
Adding method to build httpclient from json config and associated
refactor
Stale check for aggregation window when fetching recording policy
Variable renaming across window assignment schedule
Better error checks in load report from daemon
Rename of aggregation window store
@anvinjain anvinjain merged commit 4082bc7 into master Mar 15, 2017
@anvinjain anvinjain deleted the work_assignment branch May 19, 2017 07:51