
Add support to filter task / job counters #160

Merged

merged 7 commits into twitter:master from piyushnarang:task_counter_filters on Apr 10, 2017

Conversation

@piyushnarang

We noticed that when the number of tasks is very large (10K+), requesting task counters results in a very large payload. This PR adds a filter parameter (includeCounter) to the job / task endpoints so users can specify the set of counters they're interested in. It works like the existing include parameter. The response payload then contains only the requested counters plus the fields requested via include.
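To make the semantics concrete, here's a minimal sketch of the filtering idea in Java (not hRaven's actual implementation; the nested-map representation and the class/method names are hypothetical). Each includeCounter value is a fully qualified group.COUNTER_NAME, and since the group itself contains dots, the split point is the last '.':

import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: filters a group -> (counterName -> value) map down to the
// counters named by includeCounter parameters. hRaven's real code operates
// on its own counter types; this nested-map shape is an assumption.
public class CounterFilter {

  static Map<String, Map<String, Long>> filter(
      Map<String, Map<String, Long>> counters, List<String> includeCounters) {
    Map<String, Map<String, Long>> filtered = new HashMap<>();
    for (String qualified : includeCounters) {
      // e.g. "org.apache.hadoop.mapreduce.FileSystemCounter.HDFS_BYTES_WRITTEN"
      int lastDot = qualified.lastIndexOf('.');
      String group = qualified.substring(0, lastDot);
      String name = qualified.substring(lastDot + 1);
      Map<String, Long> groupCounters = counters.get(group);
      if (groupCounters != null && groupCounters.containsKey(name)) {
        filtered.computeIfAbsent(group, g -> new HashMap<>())
            .put(name, groupCounters.get(name));
      }
    }
    return filtered;
  }
}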

Here are a couple of examples:

Job:
/api/v1/job/mycluster/my_job_id?include=jobKey&includeCounter=org.apache.hadoop.mapreduce.FileSystemCounter.HDFS_BYTES_WRITTEN

Response:

{
  "jobKey" : {
    "cluster" : "procrev@atla",
    "userName" : "ads",
    "appId" : "ad_shard_log_joiner_procrev_atla_even_hourly",
    "runId" : 1486598274736,
    "jobId" : {
      "cluster" : "procrev@atla",
      "jobEpoch" : 1486504431708,
      "jobSequence" : 30155,
      "jobIdString" : "job_1486504431708_30155"
    },
    "qualifiedJobId" : {
      "cluster" : "procrev@atla",
      "jobEpoch" : 1486504431708,
      "jobSequence" : 30155,
      "jobIdString" : "job_1486504431708_30155"
    },
    "encodedRunId" : 9223370550256501071
  },
  "counters" : {
    "org.apache.hadoop.mapreduce.FileSystemCounter" : {
      "HDFS_BYTES_WRITTEN" : 3269002525990
    }
  },
  "mapCounters" : {
    "org.apache.hadoop.mapreduce.FileSystemCounter" : {
      "HDFS_BYTES_WRITTEN" : 0
    }
  },
  "reduceCounters" : {
    "org.apache.hadoop.mapreduce.FileSystemCounter" : {
      "HDFS_BYTES_WRITTEN" : 3269002525990
    }
  }
}

Task:
/api/v1/tasks/mycluster/my_job_id?include=taskType&include=taskId&includeCounter=org.apache.hadoop.mapreduce.TaskCounter.COMMITTED_HEAP_BYTES&includeCounter=org.apache.hadoop.mapreduce.TaskCounter.PHYSICAL_MEMORY_BYTES&includeCounter=org.apache.hadoop.mapreduce.TaskCounter.GC_TIME_MILLIS&includeCounter=org.apache.hadoop.mapreduce.TaskCounter.CPU_MILLISECONDS

Response:

[  
   {  
      "taskId":"task_1486504431708_30155_m_000000",
      "taskType":"MAP",
      "counters":{  
         "org.apache.hadoop.mapreduce.TaskCounter":{  
            "PHYSICAL_MEMORY_BYTES":722796544,
            "GC_TIME_MILLIS":632,
            "CPU_MILLISECONDS":144470,
            "COMMITTED_HEAP_BYTES":798347264
         }
      }
   },
   {  
      "taskId":"task_1486504431708_30155_m_000000",
      "taskType":"MAP",
      "counters":{  
         "org.apache.hadoop.mapreduce.TaskCounter":{  
            "PHYSICAL_MEMORY_BYTES":722796544,
            "GC_TIME_MILLIS":632,
            "CPU_MILLISECONDS":144470,
            "COMMITTED_HEAP_BYTES":798347264
         }
      }
   }
]
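For completeness, a small client-side sketch of calling the tasks endpoint with repeated includeCounter parameters (Java 11+ HttpClient; the host hraven.example.com:8080 is a placeholder, not a real endpoint):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class TaskCounterQuery {
  public static void main(String[] args) throws Exception {
    // Placeholder host/port; point this at your hRaven REST service.
    String url = "http://hraven.example.com:8080/api/v1/tasks/mycluster/my_job_id"
        + "?include=taskType&include=taskId"
        + "&includeCounter=org.apache.hadoop.mapreduce.TaskCounter.CPU_MILLISECONDS"
        + "&includeCounter=org.apache.hadoop.mapreduce.TaskCounter.GC_TIME_MILLIS";
    HttpClient client = HttpClient.newHttpClient();
    HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
    HttpResponse<String> response =
        client.send(request, HttpResponse.BodyHandlers.ofString());
    System.out.println(response.body()); // JSON array of filtered task records
  }
}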

@coveralls

Coverage Status

Coverage decreased (-0.02%) to 2.127% when pulling 75b8123 on piyushnarang:task_counter_filters into e17ec79 on twitter:master.

@piyushnarang (Author)

cc @vrushalivc / @dieu

@vrushalivc merged commit 2089d69 into twitter:master on Apr 10, 2017
@vrushalivc (Contributor)

Thanks @piyushnarang and @dieu
