Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove unsafe use of dictionary in multithreading #49

Merged
merged 18 commits into from Jun 10, 2020

Conversation

OkonSamuel
Copy link
Member

@OkonSamuel OkonSamuel commented Jun 1, 2020

closes #48
I also reverted to using channels for Multithreading Threading reason being julia Ctrl + C cancellation doesn't yet work well for multithreading so i have to use channels for now to work around this.

@OkonSamuel
Copy link
Member Author

OkonSamuel commented Jun 1, 2020

Due to chunks iterator introduced and _process_accel_settings method, this PR requires a new tagged release of MLJBase.jl

@ablaom
Copy link
Member

ablaom commented Jun 5, 2020

@OkonSamuel

Not had a chance to review but I see CI is failing on julia 1.0 where there is no multi-threading. Something to do with reversion to Channels?

@ablaom
Copy link
Member

ablaom commented Jun 5, 2020

Due to chunks iterator introduced and _process_accel_settings method, this PR requires a new tagged release of MLJBase.jl

This should be live now but will also require you to wait for #50 because the new MLJBase requires MLJModelInterface 0.3. Aaarrrgghhh....

@ablaom
Copy link
Member

ablaom commented Jun 5, 2020

Okay dev now has extended MLJModelInterface [compat]. Retriggering CI:

@ablaom ablaom closed this Jun 5, 2020
@ablaom ablaom reopened this Jun 5, 2020
@ablaom
Copy link
Member

ablaom commented Jun 5, 2020

@OkonSamuel Just a note that it is easier for me if you push your PR branches directly to github rather than basing the PR on a fork. You should have permission to do that.

@OkonSamuel
Copy link
Member Author

Just a note that it is easier for me if you push your PR branches directly to github rather than basing the PR on a fork. You should have permission to do that.

Noted

Copy link
Member

@ablaom ablaom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@OkonSamuel Many thanks for this!

Merge when ready.

We can tag a release but the previous PR to General is stalled for some reason.

@OkonSamuel
Copy link
Member Author

@ablaom before i merge let me drawn your attention to an issue i have been researching for some time now. This unfortunately showed up in the tests


      From worker 2:	error in running finalizer: ErrorException("concurrency violation detected")

      From worker 2:	error in running finalizer: ErrorException("concurrency violation detected")

      From worker 2:	error in running finalizer: ErrorException("concurrency violation detected")

      From worker 2:	error in running finalizer: ErrorException("concurrency violation detected")

      From worker 2:	error in running finalizer: ErrorException("concurrency violation detected")

      From worker 2:	error in running finalizer: ErrorException("concurrency violation detected")

      From worker 2:	error in running finalizer: ErrorException("concurrency violation detected")

      From worker 2:	error in running finalizer: ErrorException("concurrency violation detected")

see it here.
I don't know if youve come across this while using MLJ on your Mac.??
I have been able to reproduce it on my PC(This was done using a Linux PC). But still this rarely occur. It only shows in some cases when i run evaluate using CPUThreads for the first time after MLJ is loaded(Maybe it has something to do with compilation??)
Subsequent runs doesn't show this warning no matter what i do.
The weird thing is that it never halts the program or affects any results in any way.

@ablaom
Copy link
Member

ablaom commented Jun 7, 2020

@OkonSamuel Thank you very much indeed for looking into this. Unfortunately, I'm not sure what to add except that, as far as I know, I have not observed this and cannot reproduce the CI "failure"

julia> versioninfo()
Julia Version 1.3.0
Commit 46ce4d7933 (2019-11-26 06:09 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin19.0.0)
  CPU: Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
Environment:
  JULIA_PATH = /Applications/Julia-1.3.app/Contents/Resources/julia/bin/julia
  JULIA_NUM_THREADS = 5

Had you observed this prior in commits prior to this PR?

@OkonSamuel
Copy link
Member Author

OkonSamuel commented Jun 7, 2020

Had you observed this prior in commits prior to this PR?

@ablaom. No. I Observed this sometime ago on my PC.
The following code reproduces it.

julia> using Distributed

julia> addprocs(2)

julia> @everywhere using MLJ
#The module must be precompiled for error to occur below
julia> @everywhere logic = @load LogisticClassifier pkg=MLJLinearModels
julia> X = table(rand(10000,3));

julia> y = categorical(repeat(["up","no","yes","down"],outer=2500));

julia> mach = machine(logic, X, y)

julia> evaluate!(mach, acceleration = CPUThreads(), repeats=10, verbosity=1);
#At this point the warning should popup but won't stop the evaluation

To get the warning to show up is tricky and you have to edit MLJBase locally at your .julia folder. (You can simple add blank spaces. This is to force MLJ to precompile again)
PS
If you can't reproduce this on Mac then it means it only occurs on a Linux

@OkonSamuel
Copy link
Member Author

@ablaom. I think it's coming from get_measurements method in evaluate method from MLJBase. ( It might be due to precompilation??. Since it only occurs once after compilation and never shows up again).
I may have a fix (This basically makes a single threaded call to get_measurements once before calling it in other threads.) for this but am still testing to see if it's fixes it.(So far i haven't seen the error)

@ablaom
Copy link
Member

ablaom commented Jun 8, 2020

Happy to wait for your experiements. Many thanks!

@OkonSamuel
Copy link
Member Author

OkonSamuel commented Jun 9, 2020

@ablaom. I've implemented a fix here.
This has the consequence of being slower than the current implementation for computationally intensive algorithms (still faster that CPU1 option though).
If you ask me i prefer the current implementation( since these errors only pops up sometimes when MLJ is precompiled).
What's your take on this?

@ablaom
Copy link
Member

ablaom commented Jun 10, 2020

I'm still happy for you to merge when ready. See my comment here: JuliaAI/MLJBase.jl#338 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants