-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
update to julia-0.7/1.0 #54
Conversation
@alyst Could you please push a version with an updated |
@andreasnoack Currently testing it locally, then will update the changes. BTW, do you know the fate of johnmileswhite/MNIST.jl? It wasn't touched for years, PRs ignored. It doesn't have much functionality, but it contains a medium-sized MNIST dataset that is used for testing/benchmarking TSne.jl. Copying these data to TSne.jl would make it too heavy. Maybe there's another Julia datasets package that already contains it? Or, if somebody can merge the PR, I can update MNIST.jl to 0.7/1.0. Or, the data could be moved to RDatasets (but we might not want to feed it into a mammoth). |
MNIST is in https://github.com/JuliaML/MLDatasets.jl which is actively maintained so I'd recommend using that version |
Ah, good to now. Will switch to it. Maybe somebody could update MNIST.jl/README.md to redirect people to MLDatasets.jl? So far it's in the registry, so people will keep discovering it. |
It seems there's a considerable CSV-related memory leak. When testing locally, my 24Gb RAM is not enough. When I skip CSV datasets, everything is much faster and memory usage is normal. |
Could you elaborate? |
While previously (v0.6) RDatasets tests were also running pretty slow, I don't recall I had memory issues. I suspect Julia 1.0 Travis build failed because of the same issue (the other builds suffered from network problems). The stacktrace is strange:
|
I reported about the CSV issue with a 2-row file, and I just updated about it there. Talking about csv memory leak, I have been battling for quite some time to find out the issue. Recently I inquired at the forum here. I can't pinpoint about the leak, but it seems very possibly related to csv related utilities. |
@quinnj Do you have any idea what is happening here? |
Some more detailed stats for the first few datasets (also a check that the tests are not just stuck in some loop):
|
@JeffBezanson I think this might hang in subtyping. When interrupting the process I got ds = convert(String, r[:Dataset]) = "mexico"
^C^C^C^C^C^CWARNING: Force throwing a SIGINT
Internal error: encountered unexpected error in runtime:
InterruptException()
has_free_typevars at /Users/osx/buildbot/slave/package_osx64/build/src/jltypes.c:150
subtype_tuple at /Users/osx/buildbot/slave/package_osx64/build/src/subtype.c:851 [inlined]
subtype at /Users/osx/buildbot/slave/package_osx64/build/src/subtype.c:997
exists_subtype at /Users/osx/buildbot/slave/package_osx64/build/src/subtype.c:1078 [inlined]
forall_exists_subtype at /Users/osx/buildbot/slave/package_osx64/build/src/subtype.c:1106
subtype_ccheck at /Users/osx/buildbot/slave/package_osx64/build/src/subtype.c:461
var_gt at /Users/osx/buildbot/slave/package_osx64/build/src/subtype.c:545
exists_subtype at /Users/osx/buildbot/slave/package_osx64/build/src/subtype.c:1078 [inlined]
forall_exists_subtype at /Users/osx/buildbot/slave/package_osx64/build/src/subtype.c:1106
forall_exists_equal at /Users/osx/buildbot/slave/package_osx64/build/src/subtype.c:1045
subtype at /Users/osx/buildbot/slave/package_osx64/build/src/subtype.c:1017
subtype_unionall at /Users/osx/buildbot/slave/package_osx64/build/src/subtype.c:638
subtype_unionall at /Users/osx/buildbot/slave/package_osx64/build/src/subtype.c:638
exists_subtype at /Users/osx/buildbot/slave/package_osx64/build/src/subtype.c:1078 [inlined]
... and I get something similar when taking a sample during execution. It seems to be consistently happening in the |
I'll work on it. I think the problem is that we have to wait until we leave the UnionAll for |
I suspect that this might be the same as reported in JuliaData/CSV.jl#236 |
This memory leak aside, is there any reason else not to merge this PR? It doesn't appear the memory leak is related to the PR itself, and the memory leak can be explored elsewhere. |
@randyzwitch Thanks for the reminder! Now it passes on Julia nightly, so I think it's safe to merge (cc @andreasnoack) |
@alyst I'm not sure passing only on nightly is a reasonable jump for this package, since 0.7 and 1.0 is the original goal |
@randyzwitch CI doesn't pass on 0.7 and 1.0 because of (at least) JuliaLang/julia#28677, RDatasets (and CSV) itself is fine. |
Thanks for the explanation @alyst. Can you add this note to the README, so that users coming in can understand that the package is expected to be working, just that it depends on a fix in master? Then I'll merge it in. |
@randyzwitch Good idea, I've added a note to README. |
Merged |
@randyzwitch Could you please also tag a new minor release? |
It looks like Attobot isn't enabled so I've opened JuliaLang/METADATA.jl#17366 |
This PR requires JuliaData/RData.jl#44 to be merged and minimal RData version to be bumped.