Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Record per-target compile workflow stats when using RscCompile #8092

Merged
merged 3 commits into from Jul 25, 2019

Conversation

@wiwa
Copy link
Contributor

commented Jul 22, 2019

Problem

We would like to have metrics about how much of a build uses Rsc.

Solution

Record the compile workflow a target uses under RscCompile

Result

We will have solved the problem.

@stuhood
Copy link
Member

left a comment

Thanks!

@stuhood stuhood requested a review from ity Jul 23, 2019

@wiwa wiwa force-pushed the wiwa:f/record-workflow branch from 61d1342 to d47df76 Jul 24, 2019

self.options_scope,
compile_target,
['compile', 'workflow'],
workflow.value

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Jul 24, 2019

Contributor

I think I spent some time making run tracker JSON reporting support enums such as the target workflow directly, without converting to string via .value. I think you can do:

Suggested change
workflow.value
workflow

and it would work, and I think that would be more consistent with the other ways we've begun to record these enum-style options.

This comment has been minimized.

Copy link
@wiwa

wiwa Jul 24, 2019

Author Contributor

Could you give an example of it working with an enum somewhere in the codebase? Basically I want to go from "think" to "know".

This comment has been minimized.

Copy link
@stuhood

stuhood Jul 24, 2019

Member

This doesn't seem like a blocking change, even if we were sure that the replacement worked. Calling workflow.value is definitely more explicit here: he wants the string value, and this is the most explicit way to get it.

This comment has been minimized.

Copy link
@stuhood

stuhood Jul 24, 2019

Member

Danny pointed out that enum discourages accessing the value (although we should probably do more to discourage it, because it is a very common practice in other implementations to rely on the value of an enum for serialization).

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Jul 24, 2019

Contributor

@wiwa when I say think I meant "I was writing this on my phone and couldn't find the link". I wrote the enum support and I definitely wrote the json encoding support: see

class RunTrackerOptionEncoder(CoercingOptionEncoder):
"""Use the json encoder we use for making options hashable to support datatypes.
This encoder also explicitly allows OrderedDict inputs, as we accept more than just option values
when encoding stats to json.
"""
def default(self, o):
if isinstance(o, OrderedDict):
return o
return super().default(o)

As the person who took the time to make json encoding enum options work, I would ask you to please make use of it and record the enum instance without extracting the private .value field.

As I look at the enum docstring:

def enum(all_values):
I'm realizing the fact that .value is intended to be private isn't mentioned anywhere at all. I initially wanted to name it specifically .private_value_field, precisely so this confusion wouldn't occur. I'm going to do that in a followup PR.

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Jul 24, 2019

Contributor

Also @stuhood:

it is a very common practice in other implementations to rely on the value of an enum for serialization

That's actually what we're doing here and what I'm trying to ensure by asking for options to be recorded without unwrapping the enum (because that's what we already do elsewhere in pants). I'm relying on the enum serialization logic that is spread across the CoercingEncoder, CoercingOptionEncoder, and RunTrackerOptionEncoder class (in particular, see pants.base.hash_utils.CoercingEncoder for the logic regarding datatype and enum serialization I'm referring to). We rely on the json serialization of the enum itself for caching purposes, as CoercingEncoder is used in stable_json_sha1(), which is used to implement caching in pants. If the workflow is a fingerprintable option, we are relying on the stable_json_sha1() of its serialization, and that serialization right now doesn't pull out the .value, it uses the CoercingEncoder to ensure it always gets serialized the same way. The point of an enum (or one of them) is to attach some strictness / validation to raw data. In my view, one of the jobs of the enum wrapper is to serialize itself reliably, and we have that, just by removing the .value here.

This comment has been minimized.

Copy link
@wiwa

wiwa Jul 24, 2019

Author Contributor

Okay, thanks, will change. I like the philosophy of enum responsibilities. Maybe value should be _value?

This comment has been minimized.

Copy link
@stuhood

stuhood Jul 24, 2019

Member

It doesn't look like this works: https://gist.github.com/stuhood/9b733f7006e08e94bd3bed350d3ed3d2

Sorry for the trouble @wiwa : can switch it back to using .value.

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Jul 24, 2019

Contributor

I've just pushed a branch at dc9d100...b0571f8 which tries to use the more stable repr() for enums, to see if that passes travis. It's difficult to investigate failures in https://travis-ci.org/pantsbuild/pants/jobs/563261042 since the error is in the ast.literal_eval() method, which processes the target_dict after everything writes to it, so the stack trace is less useful. The branch is going through travis here.

@ity

ity approved these changes Jul 24, 2019

self.context.run_tracker.report_target_info(
self.options_scope,
compile_target,
['compile', 'workflow'],

This comment has been minimized.

Copy link
@ity

ity Jul 24, 2019

Contributor

could you also add execution_strategy to track in this change?

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Jul 24, 2019

Contributor

(self.execution_strategy_enum would also work! if you check NailgunTask you'll see self.execution_strategy just accesses the .value (there's a TODO to fix it). i would probably suggest using self.execution_strategy_enum here)

This comment has been minimized.

Copy link
@wiwa

wiwa Jul 24, 2019

Author Contributor

Added


workflow = rsc_compile_context.workflow

# Replica of JvmCompile's _record_target_stats logic

This comment has been minimized.

Copy link
@ity

ity Jul 24, 2019

Contributor

you can remove this comment, in fact in a subsequent change you could just make record_target_stats public and make it take args and then use it here, since RscCompile inherits from JvmCompile.

but this is good as is, and not a blocking change.

wiwa added some commits Jul 24, 2019

@stuhood stuhood merged commit 1a1238e into pantsbuild:master Jul 25, 2019

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.