Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ProcessExitedError rather than using `error` #27900

Merged
merged 18 commits into from Apr 3, 2019

Conversation

@oxinabox
Copy link
Contributor

commented Jul 2, 2018

This has been bugging me for years.
I thought it was solved during 0.7-Dev.

As someone calling external processes,
when they fail I want a unique exception type to be thrown,
so I can handle it appropriately.

I think this will require tests being updated
I wrote it as a PR because it is more clear than writing an issue I think.
But this is basically an issue.

base/process.jl Outdated Show resolved Hide resolved
base/process.jl Outdated Show resolved Hide resolved
base/process.jl Outdated Show resolved Hide resolved
base/process.jl Outdated Show resolved Hide resolved
base/process.jl Outdated Show resolved Hide resolved
@oxinabox

This comment has been minimized.

Copy link
Contributor Author

commented Jul 9, 2018

I think I need to rebase this as I think it is based off a version of master that doesn't build

@oxinabox oxinabox force-pushed the oxinabox:patch-10 branch from 8872566 to c98c2f0 Jul 11, 2018

@oxinabox

This comment has been minimized.

Copy link
Contributor Author

commented Jul 11, 2018

Ok testing this locally it all passes except one.
And I can't workout how I broke that one.
It doesn't seem to have anything to do with processes.

I get the error:

      From worker 7:    WARNING: attempting hard kill of repl test after exceeding timeout

Executing tests that run on node 1 only:
precompile                          (1) |    86.67 |   0.88 |  1.0 |    1431.78 |   278.88
SharedArrays                        (1) |    52.23 |   2.06 |  4.0 |    2085.67 |   309.06
Worker 7 failed running test REPL:
Some tests did not pass: 2 passed, 0 failed, 1 errored, 0 broken.REPL: Error During Test at /usr/src/julia/julia$
master/test/testdefs.jl:19
  Got exception LoadError("/usr/src/julia/julia-master/usr/share/julia/stdlib/v0.7/REPL/test/runtests.jl", 4, Loa
dError("/usr/src/julia/julia-master/usr/share/julia/stdlib/v0.7/REPL/test/repl.jl", 74, "hard kill repl test")) o
utside of a @test
  LoadError: LoadError: "hard kill repl test"
  Stacktrace:
   [1] try_yieldto(::typeof(Base.ensure_rescheduled), ::Base.RefValue{Task}) at ./event.jl:196
   [2] wait() at ./event.jl:255
   [3] wait(::Condition) at ./event.jl:46
   [4] wait_readnb(::Base.PipeEndpoint, ::Int64) at ./stream.jl:301
   [5] eof at ./stream.jl:61 [inlined]
   [6] readuntil_vector!(::Base.PipeEndpoint, ::Base.CodeUnits{UInt8,String}, ::Bool, ::Array{UInt8,1}) at ./io.j
l:695
   [7] #readuntil#283 at ./io.jl:773 [inlined]
   [8] #readuntil at ./none:0 [inlined]
   [9] #readuntil#282(::Bool, ::Function, ::Base.PipeEndpoint, ::String) at ./io.jl:768
   [10] readuntil at ./io.jl:758 [inlined]
   [11] (::getfield(Main.Test48Main_REPL.REPLTests, Symbol("##10#15")){Base.PipeEndpoint,Base.PipeEndpoint,String
})(::String) at /usr/src/julia/julia-master/usr/share/julia/stdlib/v0.7/REPL/test/repl.jl:132
   [12] mktempdir(::getfield(Main.Test48Main_REPL.REPLTests, Symbol("##10#15")){Base.PipeEndpoint,Base.PipeEndpoi
nt,String}, ::String) at ./file.jl:569
   [13] mktempdir at ./file.jl:567 [inlined]
   [14] (::getfield(Main.Test48Main_REPL.REPLTests, Symbol("##8#13")))(::Base.PipeEndpoint, ::Base.PipeEndpoint,:
:REPL.LineEditREPL) at /usr/src/julia/julia-master/usr/share/julia/stdlib/v0.7/REPL/test/repl.jl:107
   [15] #fake_repl#5(::REPL.Options, ::Function, ::Any) at /usr/src/julia/julia-master/usr/share/julia/stdlib/v0.
7/REPL/test/repl.jl:53                                                                                              [16] fake_repl(::Any) at /usr/src/julia/julia-master/usr/share/julia/stdlib/v0.7/REPL/test/repl.jl:42            [17] top-level scope at none:0                                                                                   [18] include at ./boot.jl:317 [inlined]                                                                          [19] include_relative(::Module, ::String) at ./loading.jl:1034                                                   [20] include at ./sysimg.jl:29 [inlined]                                                                         [21] include(::String) at /usr/src/julia/julia-master/usr/share/julia/stdlib/v0.7/REPL/test/runtests.jl:3        [22] top-level scope at none:0                                                                                   [23] include at ./boot.jl:317 [inlined]
   [24] include_relative(::Module, ::String) at ./loading.jl:1034
   [25] include at ./sysimg.jl:29 [inlined]
   [26] include(::String) at /usr/src/julia/julia-master/test/testdefs.jl:13
   [27] macro expansion at /usr/src/julia/julia-master/test/testdefs.jl:292 [inlined]
   [28] macro expansion at /usr/src/julia/julia-master/usr/share/julia/stdlib/v0.7/Test/src/Test.jl:1080 [inline$
]
   [29] macro expansion at /usr/src/julia/julia-master/test/testdefs.jl:21 [inlined]
   [30] macro expansion at ./util.jl:289 [inlined]
   [31] top-level scope at /usr/src/julia/julia-master/test/testdefs.jl:19 [inlined]
   [32] top-level scope at ./<missing>:0
   [33] eval at ./boot.jl:319 [inlined]
   [34] #runtests#3(::UInt128, ::Function, ::String, ::String, ::Bool) at /usr/src/julia/julia-master/test/testd$
fs.jl:25
   [35] #runtests at ./none:0 [inlined] (repeats 2 times)
   [36] (::getfield(Distributed, Symbol("##112#114")){Distributed.CallMsg{:call_fetch}})() at /usr/src/julia/jul$
a-master/usr/share/julia/stdlib/v0.7/Distributed/src/process_messages.jl:269
   [37] run_work_thunk(::getfield(Distributed, Symbol("##112#114")){Distributed.CallMsg{:call_fetch}}, ::Bool) a$
 /usr/src/julia/julia-master/usr/share/julia/stdlib/v0.7/Distributed/src/process_messages.jl:56
   [38] macro expansion at /usr/src/julia/julia-master/usr/share/julia/stdlib/v0.7/Distributed/src/process_messa$
es.jl:270 [inlined]
   [39] (::getfield(Distributed, Symbol("##111#113")){Distributed.CallMsg{:call_fetch},Distributed.MsgHeader,Soc$
ets.TCPSocket})() at ./task.jl:257
  in expression starting at /usr/src/julia/julia-master/usr/share/julia/stdlib/v0.7/REPL/test/repl.jl:74
ets.TCPSocket})() at ./task.jl:257                                                                      [219/532]
  in expression starting at /usr/src/julia/julia-master/usr/share/julia/stdlib/v0.7/REPL/test/repl.jl:74
  in expression starting at /usr/src/julia/julia-master/usr/share/julia/stdlib/v0.7/REPL/test/runtests.jl:4

@oxinabox

This comment has been minimized.

Copy link
Contributor Author

commented Nov 7, 2018

So according to the ruling that changing errors to non-errors is a nonbreaking change,
changing the type of exception is also allowed.

So I should rebase this.

@StefanKarpinski

This comment has been minimized.

Copy link
Member

commented Nov 7, 2018

Yes, that would be great!

@oxinabox oxinabox force-pushed the oxinabox:patch-10 branch from c98c2f0 to 005d60c Nov 24, 2018

@oxinabox

This comment has been minimized.

Copy link
Contributor Author

commented Nov 24, 2018

@ararslan did I fix everything you wanted fixing?

base/process.jl Outdated Show resolved Hide resolved
base/exports.jl Outdated Show resolved Hide resolved
@oxinabox

This comment has been minimized.

Copy link
Contributor Author

commented Nov 26, 2018

Problem.
The Distributed standard library exports a ProcessExitedException too.
I am going to rename that one to WorkerProcessExitedException

@oxinabox

This comment has been minimized.

Copy link
Contributor Author

commented Nov 26, 2018

Ok, looks like tests all pass.
It is something after the testing stage that fails.

@ararslan

This comment has been minimized.

Copy link
Member

commented Nov 26, 2018

The Distributed standard library exports a ProcessExitedException too.

Could it not just use the one now defined in Base? That would be non-breaking.

@fredrikekre

This comment has been minimized.

Copy link
Member

commented Nov 26, 2018

Download tests etc are run after the main test suite. This definitely looks related:

Some tests did not pass: 8 passed, 2 failed, 0 errored, 0 broken.download: Test Failed at /tmp/julia/share/julia/test/download.jl:19
  Expression: download("http://httpbin.org/status/404", missing_file)
    Expected: ErrorException
      Thrown: ProcessExitedException
@oxinabox

This comment has been minimized.

Copy link
Contributor Author

commented Nov 26, 2018

Could it not just use the one now defined in Base? That would be non-breaking.

Not trivially, since the one now defined in base holds a Process but the process object AFAIK is not available when that error is thrown in Distributed

@oxinabox

This comment has been minimized.

Copy link
Contributor Author

commented Nov 26, 2018

Download tests etc are run after the main test suite.

Did not know that.
Do they run with make test?
Or do i need to do make testall or some such for them?

@KristofferC

This comment has been minimized.

Copy link
Contributor

commented Nov 26, 2018

make test download I think

@oxinabox

This comment has been minimized.

Copy link
Contributor Author

commented Nov 27, 2018

Travis MacOS failure is unrelated, I think.

So I think this is done. does it [NeedsNEWS]?

@ararslan ararslan added the needs news label Nov 27, 2018

@ararslan

This comment has been minimized.

Copy link
Member

commented Nov 27, 2018

Using WorkerProcessExitedException in Distributed is potentially breaking, since code could be checking for ProcessExitedException in a catch.

If you consolidate ProcessExitedException and WorkerProcessExitedException, you could make the type's field a Union{Vector{Process},Nothing}, and define

ProcessExitedException() = ProcessExitedException(nothing)

for use in Distributed.

@oxinabox

This comment has been minimized.

Copy link
Contributor Author

commented Nov 27, 2018

This is already breaking, since code could be checking for ErrorException in a catch.

The consolidation could be done. If you think that that kinda hackyness is preferable.
Though even then it would be breaking as code could be checking for Distributions.ProcessExcitedException in a catch.

(I would as an aside say there is an argument for Union{Vector{Process}, Missing}, the Process that exitted did exist, we just don't know what it was. But OTOH some people get ansty about Missing in a nonstatistical context.)

@ararslan

This comment has been minimized.

Copy link
Member

commented Nov 27, 2018

This is already breaking, since code could be checking for ErrorException in a catch.

Fair point, though I would guess (based on no actual data) that specifically looking for an ErrorException is pretty uncommon, except maybe in a @test_throws. We could check with a PkgEval run.

If you think that that kinda hackyness is preferable.

I don't think it's hacky at all.

We could alternatively make Distributed fill in the field with the process information from the spawned Julia process, but that seems unnecessary since the information isn't used anywhere.

Though even then it would be breaking as code could be checking for Distributions.ProcessExcitedException in a catch.

If ProcessExitedException is defined in Base, it is defined in all modules that load Base, including Distributed. For example, using Distributed; Distributed.sin works. So one can still use the Distributed qualification without issue.

@oxinabox

This comment has been minimized.

Copy link
Contributor Author

commented Nov 27, 2018

If ProcessExitedException is defined in Base, it is defined in all modules that load Base, including Distributed. For example, using Distributed; Distributed.sin works. So one can still use the Distributed qualification without issue.

I forgot about that. My bad.

If you think that that kinda hackyness is preferable.

I don't think it's hacky at all.

I think it is, a bit, based on the fact that this errror type is specifically for printing about Processes with status codes, etc.
But with that said, this can be made progressively less hacky, if we later tweak Distributed.jl to capture some of that info.

Ok I'll make the change

base/process.jl Outdated Show resolved Hide resolved
base/process.jl Outdated Show resolved Hide resolved
test/spawn.jl Outdated Show resolved Hide resolved
test/spawn.jl Outdated Show resolved Hide resolved

@oxinabox oxinabox force-pushed the oxinabox:patch-10 branch from d832fe0 to e2d07dc Apr 1, 2019

@oxinabox

This comment has been minimized.

Copy link
Contributor Author

commented Apr 1, 2019

Ok, I think I found it.
Extra error handling had been added to the download to give some extra information.
Which was set to throw with error, and thus changed the type of the exception from what was prevously thrown.
To keep it so that wether or not we have extra information to add, the same Exception type is thrown,
I changed it to instead log that extra info to @error.

@oxinabox

This comment has been minimized.

Copy link
Contributor Author

commented Apr 2, 2019

The Travis and AppVeyor failures are unrelated. And according to #infastructure generally broken right now.
I can kick it a few more times, or if we are satisfied we can merge?

@oxinabox

This comment has been minimized.

Copy link
Contributor Author

commented Apr 3, 2019

Can this go in now? Before the feature freeze.
I've rebased through multiple releases now and it is annoying to do, and the code changes under me.

Assuming noone has a dread hatred of using the @error logger for reporting supplementary error information.
Then this is done and approved.

@StefanKarpinski

This comment has been minimized.

Copy link
Member

commented Apr 3, 2019

I'm going to merge as-is tomorrow if this doesn't get a review...

@JeffBezanson

This comment has been minimized.

Copy link
Member

commented Apr 3, 2019

This has had multiple approved reviews for a long time.

@StefanKarpinski StefanKarpinski merged commit b3b6d03 into JuliaLang:master Apr 3, 2019

12 of 14 checks passed

continuous-integration/appveyor/pr AppVeyor build failed
Details
continuous-integration/travis-ci/pr The Travis CI build failed
Details
buildbot/package_freebsd64 Run complete
Details
buildbot/package_linux32 Run complete
Details
buildbot/package_linux64 Run complete
Details
buildbot/package_macos64 Run complete
Details
buildbot/package_win32 Run complete
Details
buildbot/package_win64 Run complete
Details
buildbot/tester_freebsd64 Run complete
Details
buildbot/tester_linux32 Run complete
Details
buildbot/tester_linux64 Run complete
Details
buildbot/tester_macos64 Run complete
Details
buildbot/tester_win32 Run complete
Details
buildbot/tester_win64 Run complete
Details
stderr = readline(err)
error(stderr)
error_msg = readline(err)
@error "Download Failed" error_msg

This comment has been minimized.

Copy link
@musm

musm Apr 5, 2019

Contributor

this should've been a string interpolation, fixed in #31620

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.