collect_results! problem on AWS external drive #409
Comments
Thanks for raising the issue. I have never used AWS so unfortunately I have no idea how to help here... Hopefully a good soul will see this issue and shed some light!
Thanks for the response, George. As noted above, I was able now to add the error messages. I'll start searching for something about them, and I'm going to share them with a couple of people who know more about AWS than I. -- Denis
Can you try passing
Although I may encounter the problem again, for now I'm bypassing it. The problem does indeed lie with my use of AWS S3, which, although designed "for virtually any use case", is cheapest, and correspondingly slowest, when used for long-term storage, which is what I primarily have it for. My expert colleague, i.e., the one with knowledge who set me up on AWS originally, and I did play with the options for mounting the S3 drive but could not make it fast enough to be open for Julia when it wants to load files. Instead, we increased the amount of storage I have in my standard AWS account, which should just allow this first planned scale-up to work. After it does, I'll copy the Dr Watson results file into S3. If the next scale-up requires more space, I'm not sure what I'll do (spend money?!), so it would be great to have a Julia solution. I'll leave it up to George to comment on @JonasIsensee's suggestion of considering modifying the
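The "write on ordinary storage first, copy to S3 afterwards" approach described above could be sketched as follows. This uses only Julia's standard library; `write_then_copy` is a hypothetical helper name, not part of DrWatson, and whether a plain `cp` onto a particular S3 mount is reliable is an assumption that would need checking on the actual setup.

```julia
# Hypothetical helper: let `writer` produce the file on local temporary
# storage, then copy the finished file to `dest` (e.g. a path on the S3 mount).
function write_then_copy(writer, dest::AbstractString)
    mktempdir() do tmp
        local_path = joinpath(tmp, basename(dest))
        writer(local_path)                  # all writing happens on local disk
        mkpath(dirname(dest))               # make sure the target folder exists
        cp(local_path, dest; force = true)  # a single whole-file copy to the target
    end
    return dest
end
```

Because the function takes the writer as its first argument, it can be called with a `do` block whose body performs the actual save (for example a `wsave` call) on the local path it receives.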
The rest of the saving/loading functions have been updated to allow passing arbitrary keywords to the save/load command, but apparently
(note: whether specifying IOtype would work or not I have no idea; I have no idea about AWS in general and I am beyond capacity to learn it now...)
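For reference, specifying the I/O type directly in JLD2 might look like the sketch below. It assumes JLD2's `jldopen` accepts an `iotype` keyword and that `IOStream` is a valid alternative to the default memory-mapped I/O; `save_without_mmap` is a hypothetical helper name. The trace in the original post passes through JLD2's `mmapio.jl`, which suggests, but does not prove, that memory-mapped writes are what fails on the S3 mount, so this is something to try rather than a confirmed fix.

```julia
using JLD2

# Hypothetical helper: write name => value pairs to a JLD2 file using plain
# stream I/O (assumption: `iotype = IOStream` bypasses the memory-mapped writer).
function save_without_mmap(path::AbstractString, data::AbstractDict{<:AbstractString})
    jldopen(path, "w"; iotype = IOStream) do file
        for (name, value) in data
            file[name] = value   # one dataset per dictionary entry
        end
    end
    return path
end
```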
@Datseris Often I feel that "beyond capacity" could be my name ... . But for future reference, I did want to note that I'm having no difficulty using jldopen in the REPL to read and work with a jld2 file that I (just) created (via a Pluto notebook) in that same S3 drive by opening and working with a single large hdf5 file also on S3. Maybe something to do with BSON files? We'll find out one day, but there's no rush.
https://juliacloud.github.io/AWSS3.jl/stable/ Haven't played with it yet.
I played with the above package for a short time, but I did not get collect_results! to work with S3. I have a couple of ideas, but instead I have managed to work around my problem, so I'm no longer investigating it, and I'm closing the issue.
I love Dr Watson and have been using it pretty much every day since I found it last year. Ready to scale up some code I've been running on AWS, I left the code where it is but --- only! --- changed the output directory to an AWS S3 account (because the final output will overflow my 55GB regular account) by replacing the first line below (which was working perfectly) with the second:
const bsonFileDir::String = datadir("bsonOut")
changed to
const bsonFileDir::String = "/mnt/illData/bsonOut"
I ran the code and was pleased to see the 222 output files arrive successfully in /mnt/illData/bsonOut via, e.g.,
wsave(joinpath(bsonFileDir, bsonName), resultD)
but
collect_results!(bsonFileDir, update = true, black_list = [:Class])
just hung, unable to read the files that the same script had put there seconds before. A quick search of the web suggested that for an external drive one can use @load for an individual file, but that doesn't get collect_results! working. Interestingly, readdir works, so Julia has no problem seeing the files.

I'm running Julia Version 1.10.2 (2024-03-01) and DrWatson v2.14.1. I could, of course, replicate the project on the S3 drive, and perhaps the problem will go away, but I would prefer to have only the one version sitting where it is. As I'm sure others have used external drives, I'm wondering if this problem is perhaps unique to AWS or is there a work-around or some error that I've missed?
Thanks for any insight that can be provided.
-- denfc
P.S. Running it in a (VSCode) "process" instead of the REPL allowed me to see the error messages:
[ Info: Starting a new result collection...
[ Info: Scanning folder /mnt/illData/bsonOut for result files.
[ Info: Added 222 entries. Updated 0 entries. Deleted 0 entries.
[17083] signal (7.2): Bus error
in expression starting at /home/ubuntu/... Script.jl:54
unsafe_store! at ./pointer.jl:146 [inlined]
unsafe_store! at ./pointer.jl:146 [inlined]
jlunsafe_store! at /home/ubuntu/.julia/packages/JLD2/VWinU/src/JLD2.jl:51 [inlined]
jlunsafe_store! at /home/ubuntu/.julia/packages/JLD2/VWinU/src/misc.jl:15 [inlined]
_write at /home/ubuntu/.julia/packages/JLD2/VWinU/src/mmapio.jl:190 [inlined]
jlwrite at /home/ubuntu/.julia/packages/JLD2/VWinU/src/misc.jl:27 [inlined]
commit at /home/ubuntu/.julia/packages/JLD2/VWinU/src/datatypes.jl:348
h5fieldtype at /home/ubuntu/.julia/packages/JLD2/VWinU/src/data/writing_datatypes.jl:378
h5type at /home/ubuntu/.julia/packages/JLD2/VWinU/src/data/writing_datatypes.jl:384
commit at /home/ubuntu/.julia/packages/JLD2/VWinU/src/data/writing_datatypes.jl:200
commit_compound at /home/ubuntu/.julia/packages/JLD2/VWinU/src/data/writing_datatypes.jl:185
unknown function (ip: 0x7ff1747688f9)
_jl_invoke at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:2894 [inlined]
ijl_apply_generic at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:3076
h5fieldtype at /home/ubuntu/.julia/packages/JLD2/VWinU/src/data/writing_datatypes.jl:105
unknown function (ip: 0x7ff174768d35)
_jl_invoke at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:2894 [inlined]
ijl_apply_generic at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:3076
commit_compound at /home/ubuntu/.julia/packages/JLD2/VWinU/src/data/writing_datatypes.jl:159
unknown function (ip: 0x7ff1747688f9)
_jl_invoke at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:2894 [inlined]
ijl_apply_generic at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:3076
h5type at /home/ubuntu/.julia/packages/JLD2/VWinU/src/data/writing_datatypes.jl:137
unknown function (ip: 0x7ff174765b45)
_jl_invoke at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:2894 [inlined]
ijl_apply_generic at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:3076
h5type at /home/ubuntu/.julia/packages/JLD2/VWinU/src/data/writing_datatypes.jl:142
write_dataset at /home/ubuntu/.julia/packages/JLD2/VWinU/src/datasets.jl:653
#write#110 at /home/ubuntu/.julia/packages/JLD2/VWinU/src/compression.jl:137
write at /home/ubuntu/.julia/packages/JLD2/VWinU/src/compression.jl:125 [inlined]
#write#109 at /home/ubuntu/.julia/packages/JLD2/VWinU/src/compression.jl:121 [inlined]
write at /home/ubuntu/.julia/packages/JLD2/VWinU/src/compression.jl:121
_jl_invoke at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:2894 [inlined]
ijl_apply_generic at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:3076
#89 at /home/ubuntu/.julia/packages/JLD2/VWinU/src/fileio.jl:14
unknown function (ip: 0x7ff174760a15)
_jl_invoke at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:2894 [inlined]
ijl_apply_generic at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:3076
#jldopen#69 at /home/ubuntu/.julia/packages/JLD2/VWinU/src/loadsave.jl:4
jldopen at /home/ubuntu/.julia/packages/JLD2/VWinU/src/loadsave.jl:1 [inlined]
#fileio_save#88 at /home/ubuntu/.julia/packages/JLD2/VWinU/src/fileio.jl:6 [inlined]
fileio_save at /home/ubuntu/.julia/packages/JLD2/VWinU/src/fileio.jl:5
unknown function (ip: 0x7ff174760319)
_jl_invoke at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:2894 [inlined]
ijl_apply_generic at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:3076
jl_apply at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined]
jl_f__call_latest at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/builtins.c:812
#invokelatest#2 at ./essentials.jl:892 [inlined]
invokelatest at ./essentials.jl:889 [inlined]
#action#33 at /home/ubuntu/.julia/packages/FileIO/xOKyx/src/loadsave.jl:219
action at /home/ubuntu/.julia/packages/FileIO/xOKyx/src/loadsave.jl:196 [inlined]
#action#32 at /home/ubuntu/.julia/packages/FileIO/xOKyx/src/loadsave.jl:185 [inlined]
action at /home/ubuntu/.julia/packages/FileIO/xOKyx/src/loadsave.jl:185 [inlined]
#save#20 at /home/ubuntu/.julia/packages/FileIO/xOKyx/src/loadsave.jl:129
save at /home/ubuntu/.julia/packages/FileIO/xOKyx/src/loadsave.jl:125 [inlined]
#_wsave#34 at /home/ubuntu/.julia/packages/DrWatson/rXaRB/src/DrWatson.jl:33 [inlined]
_wsave at /home/ubuntu/.julia/packages/DrWatson/rXaRB/src/DrWatson.jl:33 [inlined]
#wsave#35 at /home/ubuntu/.julia/packages/DrWatson/rXaRB/src/DrWatson.jl:44 [inlined]
wsave at /home/ubuntu/.julia/packages/DrWatson/rXaRB/src/DrWatson.jl:42 [inlined]
#collect_results!#89 at /home/ubuntu/.julia/packages/DrWatson/rXaRB/src/result_collection.jl:200
collect_results! at /home/ubuntu/.julia/packages/DrWatson/rXaRB/src/result_collection.jl:84
#collect_results!#88 at /home/ubuntu/.julia/packages/DrWatson/rXaRB/src/result_collection.jl:74
collect_results! at /home/ubuntu/.julia/packages/DrWatson/rXaRB/src/result_collection.jl:74
unknown function (ip: 0x7ff174733079)
_jl_invoke at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:2894 [inlined]
ijl_apply_generic at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:3076
jl_apply at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined]
do_call at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/interpreter.c:126
eval_value at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/interpreter.c:223
eval_stmt_value at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/interpreter.c:174 [inlined]
eval_body at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/interpreter.c:617
jl_interpret_toplevel_thunk at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/interpreter.c:775
jl_toplevel_eval_flex at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/toplevel.c:934
jl_toplevel_eval_flex at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/toplevel.c:877
ijl_toplevel_eval_in at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/toplevel.c:985
eval at ./boot.jl:385 [inlined]
include_string at ./loading.jl:2076
_jl_invoke at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:2894 [inlined]
ijl_apply_generic at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:3076
_include at ./loading.jl:2136
include at ./Base.jl:495
jfptr_include_46403.1 at /home/ubuntu/julia-1.10.2/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:2894 [inlined]
ijl_apply_generic at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:3076
exec_options at ./client.jl:318
_start at ./client.jl:552
jfptr__start_82738.1 at /home/ubuntu/julia-1.10.2/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:2894 [inlined]
ijl_apply_generic at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/gf.c:3076
jl_apply at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined]
true_main at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/jlapi.c:582
jl_repl_entrypoint at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/src/jlapi.c:731
main at /cache/build/builder-amdci5-1/julialang/julia-release-1-dot-10/cli/loader_exe.c:58
unknown function (ip: 0x7ff175a29d8f)
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 419945137 (Pool: 419904699; Big: 40438); GC: 147
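Reading the trace above: the log reports "Added 222 entries" before the crash, and the failing frames run from `collect_results!` through `wsave` into JLD2's `mmapio.jl`, so the bus error appears to occur while the collected results file is being written, not while the individual result files are being read. One thing that might be worth trying is to keep reading from the S3 folder but give `collect_results!` an explicit results file on local disk via its optional first argument. A minimal sketch; `collect_locally` is a hypothetical wrapper, and the paths in the usage note are illustrative.

```julia
using DrWatson
using DataFrames  # collect_results! needs DataFrames to be loaded

# Hypothetical wrapper: read result files from `folder` (which may be on the
# S3 mount) but store the collected table in `results_file` on local disk.
function collect_locally(folder::AbstractString, results_file::AbstractString; kwargs...)
    mkpath(dirname(results_file))   # make sure the local target folder exists
    # The two-argument form collect_results!(filename, folder; kwargs...) writes
    # the collection to `filename` instead of a default path next to `folder`.
    return collect_results!(results_file, folder; kwargs...)
end
```

With the setup from the post this would be called as `collect_locally("/mnt/illData/bsonOut", datadir("results_bsonOut.jld2"); update = true, black_list = [:Class])`, after which the single results file could be copied to S3 separately.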