Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eliminate fallback reduce_empty #41885

Merged
merged 2 commits into from
Aug 25, 2021
Merged

Eliminate fallback reduce_empty #41885

merged 2 commits into from
Aug 25, 2021

Conversation

timholy
Copy link
Sponsor Member

@timholy timholy commented Aug 13, 2021

The fallback definitions of reduce_empty and mapreduce_empty are big targets for invalidation, and the benefit they bring is questionable. Instead of having a dedicated error path, instead this prints a custom MethodError hint that is more informative than the current message.

This fixes ~500 invalidations from DiffEqBase, or ~1500 from DifferentialEquations, CC @ChrisRackauckas.

It is, however, potentially breaking, and I can't remember if changing the type of an error is considered one of the "OK" forms of breakage. Perhaps we want to get a bit away from testing the exact error type and instead just test the contents of the printed message?

For good measure, the second commit removes a no-longer-needed @nospecialize for better inferrability in the "sister" method to this one. That gets a few more invalidations.

@KristofferC
Copy link
Sponsor Member

and I can't remember if changing the type of an error is considered one of the "OK" forms of breakage

We have done that many times so I think that is fair game

@timholy
Copy link
Sponsor Member Author

timholy commented Aug 14, 2021

I'll hold off on this until #41888---it seemed silly to implement type-agnostic error messages specifically for this one test, compared to implementing it in a way that it becomes easy to use for everyone.

@tkf
Copy link
Member

tkf commented Aug 14, 2021

and I can't remember if changing the type of an error is considered one of the "OK" forms of breakage

We have done that many times so I think that is fair game

Ultimately I agree with this decision but I feel it requires more nuanced reasoning.

This is how Julia project interprets SemVer:

### How does Julia define its public API?
The only interfaces that are stable with respect to [SemVer](https://semver.org/) of `julia`
version are the Julia `Base` and standard libraries interfaces described in
[the documentation](https://docs.julialang.org/) and not marked as unstable (e.g.,
experimental and internal). Functions, types, and constants are not part of the public
API if they are not included in the documentation, _even if they have docstrings_.

We do mention ArgumentError in the docstring:

julia/base/reduce.jl

Lines 303 to 311 in 8cad32b

"""
Base.reduce_empty(op, T)
The value to be returned when calling [`reduce`](@ref), [`foldl`](@ref) or [`foldr`](@ref)
with reduction `op` over an empty array with element type of `T`.
If not defined, this will throw an `ArgumentError`.
"""
reduce_empty(op, ::Type{T}) where {T} = _empty_reduce_error()

However, since Base.reduce_empty and Base.mapreduce_empty are not included in the documentation, we can conclude that they are not public APIs. Since the behavior of reduce on empty input with unknown semigroup is not specified, we can dodge the discussion if the patch is worth a breaking change (since it's not).

(I do not enjoy playing the role of tight-ass programmer. But, considering that some users may search for the reasoning behind the change in observable behaviors, I think it's ideal that we record that we do follow the rules.)

@codecov
Copy link

codecov bot commented Aug 15, 2021

Codecov Report

Merging #41885 (8cad32b) into master (e8faf9d) will increase coverage by 0.67%.
The diff coverage is n/a.

❗ Current head 8cad32b differs from pull request most recent head eb2435f. Consider uploading reports for the commit eb2435f to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master   #41885      +/-   ##
==========================================
+ Coverage   77.36%   78.04%   +0.67%     
==========================================
  Files         351      351              
  Lines       74134    74091      -43     
==========================================
+ Hits        57356    57824     +468     
+ Misses      16778    16267     -511     
Impacted Files Coverage Δ
stdlib/Dates/src/rounding.jl 0.00% <0.00%> (-100.00%) ⬇️
stdlib/Dates/src/adjusters.jl 0.00% <0.00%> (-89.16%) ⬇️
stdlib/Dates/src/io.jl 28.49% <0.00%> (-61.20%) ⬇️
base/missing.jl 35.20% <0.00%> (-58.86%) ⬇️
base/binaryplatforms.jl 33.45% <0.00%> (-49.28%) ⬇️
stdlib/Dates/src/types.jl 52.63% <0.00%> (-45.13%) ⬇️
base/libdl.jl 32.35% <0.00%> (-25.62%) ⬇️
stdlib/Dates/src/parse.jl 74.72% <0.00%> (-19.75%) ⬇️
base/compiler/stmtinfo.jl 58.33% <0.00%> (-16.67%) ⬇️
stdlib/Dates/src/query.jl 61.81% <0.00%> (-14.11%) ⬇️
... and 52 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a03392a...eb2435f. Read the comment docs.

@timholy
Copy link
Sponsor Member Author

timholy commented Aug 15, 2021

Thanks, @tkf. Your point raises an interesting question: this PR (FYI I have more improvements locally, waiting in #41888) now advertises the option of extending Base.reduce_empty. Advisable, or not? Base.reduce_empty is not part of the stable API given the policy about appearing in documentation, but if the error message specifically directs people to use it, presumably that blurs the boundary.

Again, I don't think it's a big deal for this PR (it hasn't historically appeared in anything considered public API), but given the points you raise it seems to make sense to consider these implications before going ahead in this manner. The alternative is to avoid mentioning Base.reduce_empty and only urge people to supply an init value.

@tkf
Copy link
Member

tkf commented Aug 15, 2021

Yes, I agree that it's not nice to pointing users to overload a not-really-public API via the error messages. I think mentioning init is a good idea in any case, since that's probably the easiest solution for the user to try.

Arguably, it's also nice to point solutions for library authors. So, promoting reduce_empty and mapreduce_empty to the public API may be a straightforward solution. But, I don't think they (in particular mapreduce_empty) are well-behaving for generic cases other than sum [1]. So, I think it'd be great if we can transition to an alternative API if possible.

I think the easiest way forward is to just not explicitly encourage overloading reduce_empty or mapreduce_empty. Experienced Julia programmers probably will immediately realize what method to overload anyway. OK, it may sound like a double-standard 😅, but we also mention that it's fine to explore the use of internal "API" in the manual.

[1] Very tangential to this PR, but here's a part of why: Many mapreduce_empty definitions take the form of f(reduce_empty(op, T)). However, in the usual execution, the output of f is passed to op, not the other way around. The "right way" to do it is reduce_empty(op, return_type(f, Tuple{T})) but, alas, there's no such thing as return_type (in the Julia language proper).

The fallback definitions of `reduce_empty` and `mapreduce_empty` are
big targets for invalidation, and the benefit they bring is
questionable. Instead of having a dedicated error path, instead we
print a custom `MethodError` hint that is more informative than the
current message.

This fixes ~500 invalidations from DiffEqBase
`arg_types_param` is known to be a `Core.SimpleVector`
@timholy timholy changed the title RFC: eliminate fallback reduce_empty Eliminate fallback reduce_empty Aug 17, 2021
@timholy
Copy link
Sponsor Member Author

timholy commented Aug 18, 2021

I know of one package, StaticArrays, that will break due to this change. (The fix: change @test_throws ArgumentError to @test_throws Union{ArgumentError,MethodError}.) Can we determine how widespread the breakage would be?

@KristofferC
Copy link
Sponsor Member

Can we determine how widespread the breakage would be?

Run a full PkgEval maybe?

@timholy
Copy link
Sponsor Member Author

timholy commented Aug 18, 2021

Sounds good. Remind me, can I trigger that?

@KristofferC
Copy link
Sponsor Member

Yes, https://github.com/JuliaCI/Nanosoldier.jl#quick-start (runtests etc).

@timholy
Copy link
Sponsor Member Author

timholy commented Aug 18, 2021

@nanosoldier runtests(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

@KristofferC
Copy link
Sponsor Member

@nanosoldier runtests(["ASE", "AprilTags", "ArchGDAL", "BlackBoxOptim", "CropRootBox", "DirectionalStatistics", "FileTrees", "FunctionOperators", "InitialValues", "JET", "Joseki", "LinearCovarianceModels", "Missings", "MolecularGraph", "PowerSimulations", "Publish", "Reactive", "Robotlib", "Run", "SentinelArrays", "SphericalFourierBesselDecompositions", "StaticArrays", "StatsBase", "Tectonic", "ThreadedIterables", "YAActL"], vs = ":master")

@nanosoldier
Copy link
Collaborator

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

timholy added a commit to timholy/ArchGDAL.jl that referenced this pull request Aug 19, 2021
timholy added a commit to timholy/FileTrees.jl that referenced this pull request Aug 19, 2021
timholy added a commit to timholy/InitialValues.jl that referenced this pull request Aug 19, 2021
timholy added a commit to timholy/MolecularGraph.jl that referenced this pull request Aug 19, 2021
timholy added a commit to timholy/LinearCovarianceModels.jl that referenced this pull request Aug 20, 2021
timholy added a commit to timholy/Missings.jl that referenced this pull request Aug 20, 2021
timholy added a commit to timholy/StatsBase.jl that referenced this pull request Aug 20, 2021
quinnj pushed a commit to JuliaData/SentinelArrays.jl that referenced this pull request Aug 20, 2021
quinnj pushed a commit to JuliaData/Missings.jl that referenced this pull request Aug 20, 2021
@timholy
Copy link
Sponsor Member Author

timholy commented Aug 20, 2021

PRs (or in the case of one GitLab repo, a discourse message) submitted for all double-failures. From my perspective this is good to go, but do let me know of any other concerns.

Copy link
Member

@tkf tkf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for going through all the packages! I didn't think I didn't have to fix my package :)

LGTM!

c42f pushed a commit to JuliaArrays/StaticArrays.jl that referenced this pull request Aug 21, 2021
nalimilan pushed a commit to JuliaStats/StatsBase.jl that referenced this pull request Aug 21, 2021
@timholy timholy merged commit 6b365ac into master Aug 25, 2021
@timholy timholy deleted the teh/reduce_empty branch August 25, 2021 06:37
@bkamins
Copy link
Member

bkamins commented Aug 27, 2021

@KristofferC - do you know why nanosoldier did not catch that this change breaks DataFrames.jl tests? (I will fix the tests on our side, but I am just curious why it has not been detected). Thank you!

@KristofferC
Copy link
Sponsor Member

do you know why nanosoldier did not catch that this change breaks DataFrames.jl tests? (I will fix the tests on our side, but I am just curious why it has not been detected). Thank you!

Likely because the DataFrames tests failed on master as well:

Test colors and non-standard values: missing and nothing: Test Failed at /home/pkgeval/.julia/packages/DataFrames/vuMM8/test/show.jl:219
  Expression: sprint(show, df, context = :color => true) == "\e[1m2×2 DataFrame\e[0m\n\e[1m Row \e[0m│\e[1m Fish   \e[0m\e[1m Mass      \e[0m\n\e[1m     \e[0m│\e[90m String \e[0m\e[90m Float64?  \e[0m\n─────┼───────────────────\n   1 │ Suzy          1.5\n   2 │ Amir   \e[90m missing   \e[0m"
   Evaluated: "2×2 DataFrame\n Row │ Fish    Mass\n     │ String  Float64?\n─────┼───────────────────\n   1 │ Suzy          1.5\n   2 │ Amir    missing" == "\e[1m2×2 DataFrame\e[0m\n\e[1m Row \e[0m│\e[1m Fish   \e[0m\e[1m Mass      \e[0m\n\e[1m     \e[0m│\e[90m String \e[0m\e[90m Float64?  \e[0m\n─────┼───────────────────\n   1 │ Suzy          1.5\n   2 │ Amir   \e[90m missing   \e[0m"
Stacktrace:
 [1] macro expansion
   @ /workspace/srcdir/usr/share/julia/stdlib/v1.8/Test/src/Test.jl:460 [inlined]
 [2] macro expansion
   @ ~/.julia/packages/DataFrames/vuMM8/test/show.jl:219 [inlined]
 [3] macro expansion
   @ /workspace/srcdir/usr/share/julia/stdlib/v1.8/Test/src/Test.jl:1321 [inlined]
 [4] top-level scope
   @ ~/.julia/packages/DataFrames/vuMM8/test/show.jl:218

@bkamins
Copy link
Member

bkamins commented Aug 27, 2021

Thanks. It means that nanosoldier environment does not properly handle context = :color => true. Not sure if this is relevant, but this is the reason.

@KristofferC
Copy link
Sponsor Member

Interesting. Perhaps you can open an issue on the PkgEval repo and it can be further discussed there.

@timholy
Copy link
Sponsor Member Author

timholy commented Aug 27, 2021

Sorry I broke DataFrames @bkamins, let me know if you need any help resolving. In most cases it's just @test_throws ArgumentError ... -> @test_throws Union{ArgumentError,MethodError} .... If you're testing the specific message, then JuliaArrays/StaticArrays.jl#951 is the better model.

@bkamins
Copy link
Member

bkamins commented Aug 27, 2021

No problem - I have fixed it. Thank you for reducing invalidations.

tkf pushed a commit to JuliaFolds/InitialValues.jl that referenced this pull request Sep 29, 2021
aviatesk added a commit to aviatesk/JET.jl that referenced this pull request May 18, 2023
In the `:basic` mode, downgrade `MethodErrorReport` on call of
`reduce_empty` or `mapreduce_empty` to `UncaughtExceptionReport`
so that it can be filtered out in common cases.

xref: <JuliaLang/julia#41885>
aviatesk added a commit to aviatesk/JET.jl that referenced this pull request May 18, 2023
In the `:basic` mode, downgrade `MethodErrorReport` on call of
`reduce_empty` or `mapreduce_empty` to `UncaughtExceptionReport`
so that it can be filtered out in common cases.

xref: <JuliaLang/julia#41885>
aviatesk added a commit to aviatesk/JET.jl that referenced this pull request May 18, 2023
In the `:basic` mode, downgrade `MethodErrorReport` on call of
`reduce_empty` or `mapreduce_empty` to `UncaughtExceptionReport`
so that it can be filtered out in common cases.

xref: <JuliaLang/julia#41885>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:latency Compiler latency
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants