Skip to content
This repository has been archived by the owner on Jan 15, 2022. It is now read-only.

Calculate job cost based on hadoop2 counters of mbmillis #132

Merged

Conversation

vrushalic
Copy link
Collaborator

hRaven should now start calculating job cost based on the hadoop2 counters of mb millis instead of slot millis.

Since hadoop2 did not ship with these counters (MB_MILLIS_MAPS and MB_MILLIS_REDUCES in job counters) , on the hRaven side, hRaven would itself calculate the megabytemillis of a job and then calculate the job cost. But now that hadoop2 is emitting these counters, we should use those values.

If those counters are not present (for instance in files generated by older versions of hadoop2), hRaven should continue to use the slot millis counters.

Also, hRaven was correcting the slot millis emitted by hadoop due to a bug in hadoop. Now that fix is irrelevant and we should deprecate those methods.

Relevant open source hadoop jira https://issues.apache.org/jira/browse/MAPREDUCE-5464

@coveralls
Copy link

Coverage Status

Coverage increased (+0.05%) when pulling 8d7e1a5 on vrushalic:calculate_cost_hadoop2_mbmillis into 533094d on twitter:twitter_only.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.05%) when pulling 8d7e1a5 on vrushalic:calculate_cost_hadoop2_mbmillis into 533094d on twitter:twitter_only.

jrottinghuis added a commit that referenced this pull request Jan 26, 2015
Calculate job cost based on hadoop2 counters of mbmillis
@jrottinghuis jrottinghuis merged commit 93938a6 into twitter:twitter_only Jan 26, 2015
@sjlee
Copy link
Collaborator

sjlee commented Jan 27, 2015

This should also be merged to master, right?

@vrushalic
Copy link
Collaborator Author

Yes.

On Tue, Jan 27, 2015 at 1:26 PM, Sangjin Lee notifications@github.com
wrote:

This should also be merged to master, right?


Reply to this email directly or view it on GitHub
#132 (comment).

@vrushalic
Copy link
Collaborator Author

Any help in making the master current will be highly appreciated!

On Tue, Jan 27, 2015 at 1:38 PM, Vrushali Channapattan <vrushali@twitter.com

wrote:

Yes.

On Tue, Jan 27, 2015 at 1:26 PM, Sangjin Lee notifications@github.com
wrote:

This should also be merged to master, right?


Reply to this email directly or view it on GitHub
#132 (comment).

@sjlee
Copy link
Collaborator

sjlee commented Jan 28, 2015

We can clone the PR to merge the identical patch to master. Let me know if you want me to do it. I can help.

@vrushalic
Copy link
Collaborator Author

Yes, that would be great, please go ahead!

Thanks
Vrushali

On Jan 27, 2015, at 4:32 PM, Sangjin Lee notifications@github.com wrote:

We can clone the PR to merge the identical patch to master. Let me know if
you want me to do it. I can help.


Reply to this email directly or view it on GitHub
#132 (comment).

@sjlee
Copy link
Collaborator

sjlee commented Jan 28, 2015

Hmm. The issue turns out to be bit bigger than I initially thought. It appears that there is some divergence between master and twitter_only branches. There are some changes/commits that were done on master that are not present on twitter_only.

We need a one-time PR to re-apply any missing changes from master to twitter_only. Also going forward, we need to make sure to apply any commits that are done over to twitter_only, or the branches will start drifting apart.

@vrushalic
Copy link
Collaborator Author

Yes, we didn't miss those; those were being done as part of
#131

On Wed, Jan 28, 2015 at 9:28 AM, Sangjin Lee notifications@github.com
wrote:

Hmm. The issue turns out to be bit bigger than I initially thought. It
appears that there is some divergence between master and twitter_only
branches. There are some changes/commits that were done on master that are
not present on twitter_only.

We need a one-time PR to re-apply any missing changes from master to
twitter_only. Also going forward, we need to make sure to apply any commits
that are done over to twitter_only, or the branches will start drifting
apart.


Reply to this email directly or view it on GitHub
#132 (comment).

@sjlee
Copy link
Collaborator

sjlee commented Jan 28, 2015

Thanks for the reminder. My apologies for not realizing it.

Going forward, it would be good to establish the best practice. IMO, for any generic issues, master should be the first place to commit. Then for each issue, we would promptly merge it over to twitter_only. That way, the merge stays sane, and the branches stay reasonably in sync. Thoughts?

sjlee pushed a commit to sjlee/hraven that referenced this pull request Jan 29, 2015
vrushalic pushed a commit that referenced this pull request Jan 29, 2015
Merged #132 from twitter_only to master (#134).
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants