Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Scratch.jl for default data dir? #73

Closed
willow-ahrens opened this issue Feb 18, 2021 · 14 comments
Closed

Use Scratch.jl for default data dir? #73

willow-ahrens opened this issue Feb 18, 2021 · 14 comments

Comments

@willow-ahrens
Copy link
Contributor

willow-ahrens commented Feb 18, 2021

I almost always use a custom data directory for MatrixDepot (my supercomputer filesystem for scratch data) because I don't want to redownload the matrices when the package updates, and I need to ensure the data path is writable. However, there is now a more automatic solution for this kind of problem, Scratch.jl. I think that Scratch.jl may provide a more useful default data directory which persists across updates to the package.

Currently, the default is in the package itself set on line 140 of "data.jl"

const DATA_DIR = abspath(dirname(@__FILE__),"..", "data")

I think it would make a lot of sense to change that line to

const DATA_DIR = @get_scratch!("data")
@willow-ahrens
Copy link
Contributor Author

willow-ahrens commented Feb 18, 2021

I'll also add that I do NOT think MATRIXDEPOT_MYDEPOT should default to a Scratch.jl directory. From what I understand, MATRIXDEPOT_MYDEPOT might be better off without a default, because defaulting to a location inside the package itself encourages users to write files to a location managed by the package manager. Paths inside the package directory might not be writable when user code runs, or might get deleted.

For a more permanent and extensible solution, custom depot code may be better handled by simply telling users to call "include_generator" from outside of the package at runtime, and by deprecating the current code-copying solution. When the user wants to define a package of custom matrix generators called MyFancyMatrices, they would of course need to call include_generator from the __init__ function in the MyFancyMatrices module, to ensure that updates to usermatrixclass occur at runtime and don't get precompiled. Such a solution has two benefits:

  1. Adding new generators would be done by including packages. All the user needs to do is using MyFancyMatrices
  2. Another user can come along and define an ExtraFancyMatrices package, and the two could work together.

As far as I can tell, the solution I have just described already works, and all that needs to change is documentation and a deprecation to the existing code loading system.

@KlausC
Copy link
Collaborator

KlausC commented Feb 19, 2021

I almost always use a custom data directory for MatrixDepot

You can use the environment variable "MATRIXDEPOT_DATA" for that purpose.
"Scatch.jl" requires juliav1.5 afaik; will be an option for the future.

For a more permanent and extensible solution,

Please feel free to provide a PR with your proposed changes wrt MYDEPOT.

@willow-ahrens
Copy link
Contributor Author

Thanks for your quick response, and for your maintenance of a very useful package!

My goal in filing this issue was to contribute a PR which did not default to writing files inside the MatrixDepot package directory, for both MATRIXDEPOT_MYDEPOT and MATRIXDEPOT_DATA. I understand wanting to avoid a dependency on Scratch for now. In the meantime, what do you think about setting a default similar to that of Scratch? This would confer the same benefits regarding persistence across MatrixDepot versions and Julia versions. Internally, Scratch defaults the scratchspace to a subdirectory of Base.DEPOT_PATH, an environment variable that was introduced before 1.0. Thus, the default could be something like

const DATA_DIR = abspath(first(Base.DEPOT_PATH), "matrix_depot_data")

Alternatively, what do you think about defaulting to @getscratch! only when Scratch is loaded using Requires.jl?

@KlausC
Copy link
Collaborator

KlausC commented Feb 20, 2021

Both is possible in combination. I am waiting for your PR.

Some issues:
We must provide a smooth migration path for people relying on the current defaults. (+ Docu)
Don't forget to create the new directory in __init__, if required.
const DATA_DIR = Ref(--default value if no Scatrch available--) to be overrun in __init__ if Scratch is available.

@willow-ahrens
Copy link
Contributor Author

willow-ahrens commented Feb 21, 2021

I've implemented a rough draft of the proposed changes, but ran into a hiccup using Requires to include Scratch. When Scratch is loaded after MatrixDepot, we need to re-run __init__ using the new data directory. This is messy, and since we cannot control load order, it seems unavoidable with a Requires-based approach.

Unless we can explicitly depend on Scratch or initialize lazily, I think it would be better to exclusively use the Base.DEPOT_PATH default.

In the interest of avoiding putting MatrixDepot users through two changes to the defaults (to Base.DEPOT_PATH and then eventually to @get_scratch!), I asked the Scratch developers whether they think it might be possible to support an earlier version of Julia. In my own tests, I think it may at least be possible to support Julia 1.3. For how long would you like MatrixDepot to support Julia 1.0?

@willow-ahrens
Copy link
Contributor Author

Because of the complications and effort involved in any other solution, I decided it would be easiest and least confusing to users if we simply wait until MatrixDepot.jl is comfortable with a dependency on Julia 1.5 or 1.6, which is likely to be the next LTS, and then add Scratch as a normal dependency. Note the discussion in JuliaPackaging/Scratch.jl#22.

@andreasnoack
Copy link
Member

I'm in favor of bumping the requirement to Julia 1.5. Bumping the bound doesn't mean that users of Julia <1.5 cannot use MatrixDepot. It just means that they cannot use the new features.

@willow-ahrens
Copy link
Contributor Author

willow-ahrens commented Jun 9, 2021

It looks like Scratch is now compatible with Julia 1.0, so these changes no longer require a Julia version requirement bump. JuliaPackaging/Scratch.jl#24

@KlausC
Copy link
Collaborator

KlausC commented Jun 9, 2021

Good news! Ass soon as Scratch publishes the new release, I would be happy to see you adapt your PR #75 .

@andreasnoack
Copy link
Member

andreasnoack commented Jun 9, 2021

It has already been released JuliaRegistries/General#38498

@willow-ahrens
Copy link
Contributor Author

I relaxed the compat entry for julia back to "1" in my pull requests. I'm awaiting feedback on #75 and willow-ahrens#1

@KlausC
Copy link
Collaborator

KlausC commented Jun 10, 2021

#75 has been merged just now.

@KlausC KlausC closed this as completed Jun 10, 2021
@willow-ahrens
Copy link
Contributor Author

Since I wasn't able to merge willow-ahrens#1 into #75 before #75 was merged, I have opened it as a separate PR on master as #79

@KlausC
Copy link
Collaborator

KlausC commented Jun 11, 2021

That is as I had expected. #75 should treat the Scratch subject, and #79 the user defined generators.
As these are unrelated, each of these should be applicable independent of the status of the other one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants