Make `Target` type abstract to allow overriding by different concrete implementations #2402

lihaoyi · 2023-04-02T08:18:03Z

This PR makes all of the T.{apply, input, source, sources, persistent} functions return the same Target type, so that you can easily override one with another. T.{command, worker} are still distinct types.

Motivation

This allows us to override one with another, e.g. override a T.sources with a T.apply if we want to replace source files with computed sources.

trait Foo extends Module{
  def thing = T.source(millSourcePath / "foo.txt")
}
trait Bar extends Foo{
  override def thing = T{
    os.write(T.dest / "foo.txt", "hello")
    PathRef(T.dest / "foo.txt") 
  }
}

Currently, the workaround is to make T.source do the computation, as follows:

trait Foo extends Module{
  def thing = T.source(millSourcePath / "foo.txt")
}
trait Bar extends Foo{
  override def thing = T.source{
    os.write(T.dest / "foo.txt", "hello")
    PathRef(T.dest / "foo.txt") 
  }
}

But the status quo is wasteful: it begins hashing foo.txt at the start of every evaluation, watches foo.txt for changes, etc., even though we know it can never change since it's a generated file. This is because we are not allowed to change the method's type from Source to T[PathRef] during the override, and thus we have to preserve all the Source characteristics even though we know they no longer apply.

With this PR, we can simply replace the T.sources with a T{...}, and Mill makes use of that information to avoid hashing/watching the PathRef unnecessarily

Implementation

NamedTask and NamedTaskImpl were merged
Target's logic was mostly hoisted into NamedTask, leaving Target an empty marker trait
TargetImpl is unchanged
Input, Source and Sources have been renamed InputImpl, SourceImpl, and SourcesImpl
All the functions that used to return Source/Sources/Persistent/etc. now return the same type Target, meaning that we can easily override one with the other.
I added stubs in mill/define/package.scala to make existing type annotations : Sources, : Input, etc. continue to work as type aliases to Target[T], and our large codebase and test suite required relatively few changes
Command/T.command and Worker/T.worker continue to return their specific type Command[T]/Worker[T], since they are not sub-types of Target[T].
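
The effect of the shared return type and the compatibility aliases can be illustrated with a minimal sketch. It uses simplified stand-ins rather than Mill's real definitions: the `evaluate` method, the `source` and `target` factory functions, and the use of `String` in place of `PathRef` are all assumptions made for illustration.

```scala
// Simplified stand-ins for Mill's types; NOT the real definitions.
object TargetOverrideSketch {
  // The abstract type that every factory function returns.
  trait Target[+T] { def evaluate(): T }

  // Hypothetical concrete implementations hidden behind Target.
  final class SourceImpl(path: String) extends Target[String] {
    def evaluate(): String = s"source:$path"
  }
  final class TargetImpl[T](body: () => T) extends Target[T] {
    def evaluate(): T = body()
  }

  // Compatibility alias, so an existing `: Source` annotation keeps compiling.
  type Source = Target[String]

  // Both factories declare the same abstract return type.
  def source(path: String): Target[String] = new SourceImpl(path)
  def target[T](body: => T): Target[T] = new TargetImpl(() => body)

  trait Foo {
    def thing: Source = source("foo.txt")
  }
  trait Bar extends Foo {
    // Allowed: the overriding definition also has the type Target[String].
    override def thing: Target[String] = target("generated:foo.txt")
  }
}
```

If `source` instead returned the concrete `SourceImpl`, and `Foo.thing` were declared with that type, the override in `Bar` would be rejected by the compiler, which is the situation described for the old `Source` type.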

The Task type hierarchy is considerably flatter and simpler:

Before:

Task
- Task.Sequence, Task.TraverseCtx, Task.Mapped, Task.Zipped, T.task, etc.
- NamedTask
  - NamedTaskImpl
    - Command
    - Worker
    - Input
      - Source
      - Sources
- Target
  - TargetImpl (also inherits from NamedTaskImpl)
    - Persistent

After:

Task
- Task.Sequence, Task.TraverseCtx, Task.Mapped, Task.Zipped, T.task, etc.
- NamedTask
  - Command
  - Worker
  - Target
    - TargetImpl
      - PersistentImpl
    - InputImpl
      - SourceImpl
      - SourcesImpl
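
As a rough sketch, the "After" hierarchy could be written as the following Scala declarations. Only the names and the parent/child relationships come from the listing above; the constructor parameters, the `name` member, the `String` element types, and the `describe` helper are hypothetical.

```scala
// Skeleton of the "After" hierarchy; members are illustrative assumptions.
object HierarchySketch {
  abstract class Task[+T]
  trait NamedTask[+T] extends Task[T] { def name: String }

  // Commands and workers are named, but are not targets.
  class Command[+T](val name: String) extends NamedTask[T]
  class Worker[+T](val name: String) extends NamedTask[T]

  // Target is an empty marker trait; the logic lives in NamedTask.
  trait Target[+T] extends NamedTask[T]
  class TargetImpl[+T](val name: String) extends Target[T]
  class PersistentImpl[+T](name: String) extends TargetImpl[T](name)
  class InputImpl[+T](val name: String) extends Target[T]
  class SourceImpl(name: String) extends InputImpl[String](name)
  class SourcesImpl(name: String) extends InputImpl[Seq[String]](name)

  // Hypothetical helper: classify a task by its place in the hierarchy.
  def describe(task: Task[Any]): String = task match {
    case _: Target[_]  => "target"
    case _: Command[_] => "command"
    case _: Worker[_]  => "worker"
    case _             => "anonymous task"
  }
}
```

In this sketch every `T.{apply, input, source, sources, persistent}` counterpart lands under `Target`, while `Command` and `Worker` branch off at `NamedTask`.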

Testing

Added some tests to TaskTests.scala to demonstrate and validate the behavior when people override with different types. This was previously a compile error

I had to update the error message for the wrong number of param lists, from T{...} definitions must have 0 parameter lists to Target definitions must have 0 parameter lists, since we no longer know which target sub-type a method returns based on its signature

Notes

Despite the overhaul of the type hierarchy, this should mostly be a transparent change. Hopefully it would be minimally source-incompatible with existing code, so even though there's a huge bin-compat breakage here, people upgrading shouldn't be too affected.

lihaoyi · 2023-04-02T12:28:52Z

The correct-number-of-parameter-lists checks no longer work, since they used to inspect the return types of the methods, which they can no longer do since they all have the same return type. I'll see if I can find a different way to implement them

lefou · 2023-04-02T13:16:28Z

Does this also affect T.command? If not, no variant of Target should ever be allowed an argument list, as only Tasks and Commands should accept parameter lists.

lefou

This is a great change. In my early Mill days, it took me a while to understand that all these T-macros return different types and that these are not compatible. In the mental model, they are all interchangeable, and this PR is just correcting it. Great!

lihaoyi · 2023-04-02T13:19:48Z

@lefou currently, this PR does make Command a sub-type of Target, since it used to be a NamedTask. I adjusted the error-reporting code to make it work now that Command and Target are no longer disjoint types, so that's fine I think.

lihaoyi · 2023-04-02T13:24:38Z

@lefou maybe two questions I would like to discuss, which concern exactly what we want a Target (as an abstract class) to really mean:

Should Command be a sub-type of Target?

If we define Target as "Task with name/segments", then Command should be a sub-type since it has a name
If we define Target as "Task that is uniquely addressed by name/segments", then Command should not be a sub-type since it requires not only a name/segments but also arguments to fully define what it will run

Should Workers be Targets?

They can be uniquely addressed by a name/segments, but they cannot be run from the CLI (I think?), and do not return any JSON-serializable output (since their whole thing is about keeping stuff in memory for performance and not serializing it)

I think regardless of what definition we choose I can probably make the tests pass, so this is more from a philosophical perspective rather than from any practical constraints. What do we think makes the most sense, to us and to other people?

lihaoyi · 2023-04-02T13:30:35Z

I guess if we define Target as "things that are mostly interchangeable and can over-ride each other", then that rules out Command, since they generally take parameters while other things don't. The incompatible signatures would prohibit over-riding a T{...} or T.input with a T.command no matter what we think.

What about Workers? Should we be able to override a T.worker with a T{...}, or vice versa? My gut feeling is that we shouldn't allow that, but it's not totally clear what the actual reason is.

lihaoyi · 2023-04-02T13:38:31Z

Another issue with the naming is that it is confusing to have "Target" refer both to a concrete implementation and to an abstract class that other implementations also use.

We could rename the abstract class from Target to Task, rename the current Task to something like BaseTask, and then leave Target as the name of the concrete implementation. Like what I did for Input and Source and Sources, I can leave an alias type Target[T] = Task[T] for source compatibility. Then we would have separate names for the interface Task vs the concrete implementation Target, which may be less confusing in future

lefou · 2023-04-02T13:58:57Z

@lihaoyi as you said, Commands are not interchangeable with Targets, so they should not extend Target. They are handled specially almost everywhere, in the router and in the evaluator, exactly because they also depend on their parameters. Commands are some kind of target factories, so to say.

Workers are some utility to work with state or processes and mostly an implementation detail. They aren't cacheable and can't be invoked from CLI, so they should also not inherit Target.

About the names. I think Task should stay Task, they are anonymous and don't need to produce JSON-serializable values. Target should also stay, they produce JSON-able output. It's IMHO ok from a user perspective, that there are more specialized variants of a Target (input, persistent, ...). If we call them consistently, e.g. "input target" or "persistent target", it's probably ok.

Commands are mostly for use from the CLI, but can also be useful as an API, though only if their parameters accept Tasks; otherwise users see misleading error messages. This caused a lot of issues for our users, especially when they tried to override a command. Commands currently need to produce JSON-able output, but we don't use that property in any way besides the fact that the user or external tools can easily inspect the output.

lihaoyi · 2023-04-02T14:04:18Z

Hmm ok. If we do wish to make Commands and Workers not inherit from Target, then I'll need to resurrect the NamedTask trait to put them under, just so they have a place they can branch off from the others in the type hierarchy. Should be easy to do, and if the NamedTask isn't really going to be very user-visible it probably won't add much confusion

lihaoyi · 2023-04-03T02:51:42Z

I managed to merge a bunch more stuff together, so now the type hierarchy is somewhat flatter and simpler:

Before:

Task
- Task.Sequence, Task.TraverseCtx, Task.Mapped, Task.Zipped, T.task, etc.
- NamedTask
  - NamedTaskImpl
    - Command
    - Worker
    - Input
      - Source
      - Sources
- Target
  - TargetImpl (also inherits from NamedTaskImpl)
    - Persistent

After:

Task
- Task.Sequence, Task.TraverseCtx, Task.Mapped, Task.Zipped, T.task, etc.
- NamedTask
  - Command
  - Worker
  - Target
    - TargetImpl
      - PersistentImpl
    - InputImpl
      - SourceImpl
      - SourcesImpl

Target is now an empty marker-trait, with all the necessary logic merged into NamedTask. There's probably more cleanups we can do, but this is probably a reasonably good state for now.

lihaoyi · 2023-04-03T06:09:03Z

main/test/src/mill/util/TestUtil.scala

-      with Target[Int] {
-    val ctx = ctx0.withSegments(ctx0.segments ++ Seq(ctx0.segment))
-    val readWrite = upickle.default.readwriter[Int]
+  class TestTarget(taskInputs: Seq[Task[Int]], val pure: Boolean)(implicit ctx0: mill.define.Ctx)


This TestTarget was always a bit weird; unlike normal TargetImpls, it didn't wrap a Task[T], and instead implemented the Task[T] behavior itself, to keep the task graph simple for testing purposes.

Now that we've simplified the Task hierarchy, we need to be a bit more explicit about implementing the task behavior ourselves, hence all these getter/setter forwarders

lihaoyi · 2023-04-03T06:15:20Z

main/core/src/mill/define/Task.scala

@@ -40,22 +40,74 @@ abstract class Task[+T] extends Task.Ops[T] with Applyable[Task, T] {
  def self: Task[T] = this
 }

+object Task {


This is just moved up from the bottom of the file, so it is next to the companion class

lefou

Looks good to me. Great change!

lihaoyi added 6 commits April 2, 2023 14:33

main tests pass

74400b5

fix

7849a7f

.

3da2dcb

.

88876c0

.

58269ae

.

21d2a61

lihaoyi marked this pull request as draft April 2, 2023 09:27

.

2014d51

lihaoyi changed the title ~~[WIP] Make Target type abstract to allow overriding by different concrete implementations~~ Make Target type abstract to allow overriding by different concrete implementations Apr 2, 2023

lihaoyi requested review from lefou and lolgab April 2, 2023 12:24

lihaoyi marked this pull request as ready for review April 2, 2023 12:29

.

a1a3f48

lefou reviewed Apr 2, 2023


lihaoyi added 4 commits April 3, 2023 10:20

kill CachedTarget

f05a297

cleanup

1a9686e

merge

a534e90

simplify Labelled

21ea325

lihaoyi added 3 commits April 3, 2023 11:06

privatize

905011b

resurrect NamedTask

e8902a7

.

3fbb315

lihaoyi added 3 commits April 3, 2023 11:23

.

33a87a6

Merge branch 'main' into abstract-target

3ebaed3

.

dcb7d9f

lihaoyi commented Apr 3, 2023


lefou approved these changes Apr 3, 2023


lefou merged commit 2a7d77d into com-lihaoyi:main Apr 3, 2023

lefou added this to the 0.11.0-M8 milestone Apr 3, 2023

lihaoyi mentioned this pull request Apr 21, 2023

Consolidate Docsite/Examples/Scaladoc/Giter8 #2448

Merged

19 tasks

lihaoyi mentioned this pull request May 1, 2023

Remove the ability for T.inputs and T.sources to take input tasks #2488

Closed
