Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: use scala sculpt to automatically break targets #892

Closed
wants to merge 6 commits into from

Conversation

johnynek
Copy link
Member

Description

A demonstration of a solution to #265 : automatically splitting targets into smaller targets.

Motivation

The goal of this is to add a new optional output to all scala_library targets such as target_split.BUILD which will emit a fine grained segment of a build that has broken a large target into the smallest DAG we can. It will have one section of external_dependencies, all the deps of the current rule, which in this first version will just be forwarded to each node in the dag, then the dag nodes, finally a single export target that exports the entire dag with the same name as the original target to that downstream consumers aren't broken.

The code is not yet complete (and of course should not be merged yet) but I plan to have a basic version of this workable by this friday so people can start experimenting with it.

It is my opinion that onboarding becomes much easier: you build one big globs target, then use this to output a dag for your repo and rewrite it.

My hope is that it can get more people onto using strict reproducible compilations and help people move away from zinc (which still has a number of reproducibility issues and fixing them does not seem to be a priority of that team, but I would love to be corrected and learn it is a priority for them).

@johnynek
Copy link
Member Author

cc @ittaiz @andyscott @ianoc

Copy link
Member

@ittaiz ittaiz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!
If we want to merge it in I think it would be a good idea to toggle it on/off in the toolchain.
The idea is that people will copy post build the generated BUILD files and commit them on, right?

)

java_binary(
name = "scalac_runner",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a thought maybe this should be part of sculpt external repository since it’s a runner for that?

sculpt_json = rule(
implementation = sculpt_json_impl,
attrs = {
"srcs": attr.label_list(allow_files = [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can sculpt accept java files?

@andyscott
Copy link
Contributor

I'd explore incorporating these changes as an optional phase after #865 lands. That way we don't have to invoke scalac multiple times. The new phase can:

  • add the scalac plugin
  • augment the normal compilation run's regular outputs with the sculpt json file
  • introduce new actions that produce optional outputs for the split BUILD files

As for toggling this on/off, if this is encapsulated as a phase there are a few reasonable ways. A toolchain paired with the default configuration of scala rules (that conditionally enable the phase) probably works well.

@ittaiz
Copy link
Member

ittaiz commented Dec 17, 2019 via email

@johnynek
Copy link
Member Author

Okay, it basically works to print out the dag now:

bazel build src/scala/io/bazel/rules_scala/target_split:example

bazel run src/scala/io/bazel/rules_scala/target_split:sculpt_processor_runner $(pwd)/bazel-bin/src/scala/io/bazel/rules_scala/target_split/example_sculpt.json

Will print out a DAG of items...

Next step is to format it into a BUILD.

The goal of this work is a path-breaking exercise to show a demo of this. There has been a lot of FUD about how hard this is, I am simply trying to show it is only a few hours of work and to remove any risk that it is a massive time sink.

@johnynek
Copy link
Member Author

okay, running this command:

bazel run src/scala/io/bazel/rules_scala/target_split:sculpt_processor_runner -- --sculpt_json $(pwd)/bazel-bin/src/scala/io/bazel/rules_scala/target_split/example_sculpt.json --package_root $(pwd)/src/scala/io/bazel/rules_scala/target_split/ --target_name example

gives:

scala_library(
    name = "example0",
    srcs = [
        "File1.scala",
        ],
    deps = [],
    visibility = [],
    exports = [],
    )


scala_library(
    name = "example1",
    srcs = [
        "File2.scala",
        ],
    deps = [
        ":example0",
        ],
    visibility = [],
    exports = [],
    )

Next step will be to make all the args passed in by bazel and trying it on targets with dependencies (and plumbing those through).

For those watching the clock, Friday was the deadline I set for myself to demonstrate the full system. I didn't make as much progress today because I spent half the day at the pool with the kids.

@johnynek
Copy link
Member Author

johnynek commented Dec 18, 2019

okay, here is a fully usable tool:

bazel build src/scala/io/bazel/rules_scala/target_split:example_scala --aspects @io_bazel_rules_scala//src/scala/io/bazel/rules_scala/target_split:sculpt.bzl%sculpt_aspect --output_groups=build_files

cat  bazel-bin/src/scala/io/bazel/rules_scala/target_split/example_scala_split.BUILD

The aspect can run on any target. It splits a target, writes the dag, and emits an export-only target with the same name so it doesn't break the build.

@johnynek
Copy link
Member Author

johnynek commented Dec 20, 2019

Okay, I used the aspect to make this PR:

johnynek/bosatsu#461

Here is how you use it:

  1. set the git sha to the latest git sha on this PR.
  2. add:
load("@io_bazel_rules_scala//src/scala/io/bazel/rules_scala/target_split:sculpt.bzl", "make_sculpt") 
make_sculpt()

to your WORKSPACE
3. run the aspect on the target you want, for the above PR:

bazel build core/src/main/scala/org/bykn/bosatsu:bosatsu --aspects @io_bazel_rules_scala//src/scala/io/bazel/rules_scala/target_split:sculpt.bzl%sculpt_aspect --output_groups=build_files

This isn't perfect. Sculpt doesn't see all dependencies needed for compilation (although I didn't see any violations of the DAG), so I had to spend about 5 minutes adding some dependencies to get things to compile.

This is MUCH MUCH less work than trying to do this by hand (note it broke into 70-some targets). I think this is already a huge win for people looking to split huge targets.

Also note this is a real world project: it had to deal with real dependencies and even resources needed in macros that run at compilation time.

Surely there will be more work needed, but I think we can safely say we have a tool to split large targets and that tool is considerably easier than doing by hand.

I'd love a report from others interested in splitting big targets and see how well it works for them.

@johnynek
Copy link
Member Author

I should mention, we could include the transitive dependencies, or deps + 1 into the dag automatically (maybe even as options you can pass to the aspect) which would probably have made it "just work".

@ittaiz
Copy link
Member

ittaiz commented Dec 21, 2019

Thanks @johnynek! Nice work.
@jjudd @borkaehw @SrodriguezO do you think you can take this out for a spin on one of your Targets? This was something that you guys have been wishing for for a long time, right?

@long-stripe
Copy link
Contributor

@johnynek I've been trying this out when I have free time and wanted to give you a quick status/bug report:

ERROR: [snip]/BUILD:1:1: in @io_bazel_rules_scala//src/scala/io/bazel/rules_scala/target_split:sculpt.bzl%sculpt_aspect aspect on java_library rule //[snip]/resources:resources: 
Traceback (most recent call last):
	File "[snip]/BUILD", line 1
		@io_bazel_rules_scala//src/scala/io/bazel/rules_scala/target_split:sculpt.bzl%sculpt_aspect(...)
	File "/private/var/tmp/_bazel_long/a417e293fd8aad17d9bcfce14ecf93ae/external/io_bazel_rules_scala/src/scala/io/bazel/rules_scala/target_split/sculpt.bzl", line 152, in sculpt_aspect_impl
		sculpt_json_for(target, ctx)
	File "/private/var/tmp/_bazel_long/a417e293fd8aad17d9bcfce14ecf93ae/external/io_bazel_rules_scala/src/scala/io/bazel/rules_scala/target_split/sculpt.bzl", line 99, in sculpt_json_for
		run_sculpt(ctx.actions, outjson, ctx.files._scu..., <6 more arguments>)
	File "/private/var/tmp/_bazel_long/a417e293fd8aad17d9bcfce14ecf93ae/external/io_bazel_rules_scala/src/scala/io/bazel/rules_scala/target_split/sculpt.bzl", line 104, in run_sculpt
		ctx.rule.attr.scalacopts
No attribute 'scalacopts' in attr. Make sure you declared a rule attribute with this name.
Available attributes: _action_listener, _config_dependencies, _host_jdk, _java_plugins, _java_toolchain, _jvm, _proguard_whitelister, compatible_with, data, deprecation, deps, exec_compatible_with, exec_properties, exported_plugins, exports, features, generator_function, generator_location, generator_name, javacopts, licenses, name, neverlink, plugins, proguard_specs, resource_jars, resource_strip_prefix, resources, restricted_to, runtime_deps, srcs, tags, testonly, transitive_configs, visibility
ERROR: [snip]/BUILD:3:1: in @io_bazel_rules_scala//src/scala/io/bazel/rules_scala/target_split:sculpt.bzl%sculpt_aspect aspect on proto_library rule //[snip]/protobuf:proto_lib: 
Traceback (most recent call last):
	File "[snip]/BUILD", line 3
		@io_bazel_rules_scala//src/scala/io/bazel/rules_scala/target_split:sculpt.bzl%sculpt_aspect(...)
	File "/private/var/tmp/_bazel_long/a417e293fd8aad17d9bcfce14ecf93ae/external/io_bazel_rules_scala/src/scala/io/bazel/rules_scala/target_split/sculpt.bzl", line 152, in sculpt_aspect_impl
		sculpt_json_for(target, ctx)
	File "/private/var/tmp/_bazel_long/a417e293fd8aad17d9bcfce14ecf93ae/external/io_bazel_rules_scala/src/scala/io/bazel/rules_scala/target_split/sculpt.bzl", line 99, in sculpt_json_for
		run_sculpt(ctx.actions, outjson, ctx.files._scu..., <6 more arguments>)
	File "/private/var/tmp/_bazel_long/a417e293fd8aad17d9bcfce14ecf93ae/external/io_bazel_rules_scala/src/scala/io/bazel/rules_scala/target_split/sculpt.bzl", line 104, in run_sculpt
		ctx.rule.attr.scalacopts
No attribute 'scalacopts' in attr. Make sure you declared a rule attribute with this name.
Available attributes: _action_listener, _config_dependencies, _proto_compiler, compatible_with, data, deprecation, deps, exec_compatible_with, exec_properties, exports, features, generator_function, generator_location, generator_name, import_prefix, licenses, name, restricted_to, srcs, strip_import_prefix, tags, testonly, transitive_configs, visibility

It looks like calling the aspect via command line means it propagates to any deps, which can erroneously include proto_library and java_library targets, as well as top-level targets in the target pattern.

This was my call, edited for the public:

bazel build "//path/to/package/..." --aspects @io_bazel_rules_scala//src/scala/io/bazel/rules_scala/target_split:sculpt.bzl%sculpt_aspect --output_groups=build_files

I'm sorry I don't have a bugfix on hand to accompany this, but I wanted to at least leave this feedback here!

@johnynek
Copy link
Member Author

@long-stripe I think you need to just change the part where it gets the scalacopts to be a checked call: if there is no attribute, then don’t add any non-default scalac opts.

@long-stripe
Copy link
Contributor

long-stripe commented Jan 14, 2020

@johnynek Thanks for the tip -- I got past analysis but I suspect my dirty hacking has jacked up things: https://github.com/long-stripe/rules_scala/tree/long/sculpt-wip

There's probably a cleaner way to do all this (I have some ideas steeping in the back of my mind) and I'll try to get around to it when I can, but I'm out of my timebox now.

@liucijus
Copy link
Collaborator

closing as stale, feel free to reopen

@liucijus liucijus closed this Jan 31, 2024
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants