Stack does not follow the XDG Base Directory Specification #4243

Closed
rszibele opened this issue Aug 19, 2018 · 17 comments

Comments

@rszibele
Contributor

rszibele commented Aug 19, 2018

Stack is one of my favorite build tools and unfortunately it currently does not follow the XDG Base Directory Specification: https://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html

My $HOME/.stack folder is currently sitting at around 12GiB (most of which is cache), which requires me to manually check which files I have to back up.

From a quick glance the following files are configuration which would belong in $XDG_CONFIG_HOME/stack:

  • $HOME/.stack/config.yaml
  • $HOME/.stack/global-project/stack.yaml

The files in $HOME/.stack/templates/ probably belong in $XDG_DATA_HOME/stack/templates/, as users may have their own templates there, which makes them essential data.

The other directories seem to only contain cache data which are non-essential and would belong in $XDG_CACHE_HOME/stack.
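
The proposed split could be sketched as a small shell function that maps a path relative to $HOME/.stack to its suggested XDG location. The function name `xdg_target_for` and the example template name are hypothetical, the classification simply follows the lists above, and the fallback directories are the defaults given in the specification. This illustrates the proposal and is not how Stack behaves.

```sh
#!/bin/sh
# Hypothetical helper (not part of Stack): print the proposed XDG location
# for a path given relative to $HOME/.stack.
xdg_target_for() {
    case "$1" in
        config.yaml|global-project/stack.yaml)
            # Configuration files
            printf '%s\n' "${XDG_CONFIG_HOME:-$HOME/.config}/stack/$1" ;;
        templates|templates/*)
            # User templates are treated as essential data
            printf '%s\n' "${XDG_DATA_HOME:-$HOME/.local/share}/stack/$1" ;;
        *)
            # Everything else is assumed to be re-creatable cache
            printf '%s\n' "${XDG_CACHE_HOME:-$HOME/.cache}/stack/$1" ;;
    esac
}

xdg_target_for config.yaml
xdg_target_for templates/my-template.hsfiles
xdg_target_for snapshots
```

With this layout, a backup only needs the configuration and data directories, and the cache directory can be deleted without losing anything that cannot be rebuilt.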

@ghost

ghost commented Aug 19, 2018

If you can add an option to change the directory structure, that would be great

@borsboom
Contributor

Just thinking about how this should behave, maybe if $XDG_CACHE_HOME is set then use that for the cached items unless the cached items are already in $HOME/.stack. This way new installations would do the right thing, but existing installations would still keep using their existing cache. And something similar for $XDG_CONFIG_HOME.
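
A minimal sketch of that fallback, assuming the only signal for an "existing installation" is that $HOME/.stack is already a directory. The function name `stack_cache_dir` is hypothetical and not part of Stack. The fallback $HOME/.cache is the default the specification defines for an unset or empty $XDG_CACHE_HOME.

```sh
#!/bin/sh
# Hypothetical helper (not part of Stack): choose the directory for cached items.
# An existing legacy directory wins; otherwise the XDG cache location is used.
stack_cache_dir() {
    legacy="$HOME/.stack"
    if [ -d "$legacy" ]; then
        # Existing installation: keep using the old location, nothing is copied
        printf '%s\n' "$legacy"
    else
        # New installation: follow the XDG Base Directory Specification
        printf '%s\n' "${XDG_CACHE_HOME:-$HOME/.cache}/stack"
    fi
}

stack_cache_dir
```

The same check could be applied to configuration files with $XDG_CONFIG_HOME and its default $HOME/.config.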

@dbaynard
Contributor

I would find this helpful — I want to keep down the size of my backups, and so I currently have to manually delete caches.

The global-project example seems fairly complicated — the working directory is the global-project/.stack-work directory. This would have to change, here, too.

@rszibele
Contributor Author

rszibele commented Aug 22, 2018

@borsboom Yes, that sounds sensible. We wouldn't want to copy the cache automatically when invoking stack, as it may be very large, so it should keep using the available $HOME/.stack directory if it exists.

@dbaynard Correct me if I'm wrong, but the global-project/.stack-work also falls under the cache category, as it can be re-created given a list of packages that are "globally" installed through stack. The logs aren't really important either, as they only become relevant when something goes wrong.

@dbaynard
Contributor

dbaynard commented Aug 22, 2018

It does fall in that category, and should therefore be in $XDG_CACHE_HOME. However, it is currently maintained separately to the $HOME/.stack caches, and so I believe it uses the logic from per-project .stack-work directories (though I'm not familiar with that part of the code). I meant to say: that part of the code would have to change, possibly significantly so.

I'm trying to get a handle on how much work would be involved in a PR, and which other parts of the code would be impacted. @rszibele would you be able to take a look?

@dbaynard
Contributor

Also, you've listed backups as a reason for this change. Are there others?


I've reproduced the justification from the base directory specification you linked, here.

The XDG Base Directory Specification is based on the following concepts:

There is a single base directory relative to which user-specific data files should be written. This directory is defined by the environment variable $XDG_DATA_HOME.

There is a single base directory relative to which user-specific configuration files should be written. This directory is defined by the environment variable $XDG_CONFIG_HOME.

There is a set of preference ordered base directories relative to which data files should be searched. This set of directories is defined by the environment variable $XDG_DATA_DIRS.

There is a set of preference ordered base directories relative to which configuration files should be searched. This set of directories is defined by the environment variable $XDG_CONFIG_DIRS.

There is a single base directory relative to which user-specific non-essential (cached) data should be written. This directory is defined by the environment variable $XDG_CACHE_HOME.

There is a single base directory relative to which user-specific runtime files and other file objects should be placed. This directory is defined by the environment variable $XDG_RUNTIME_DIR.

@rszibele
Contributor Author

rszibele commented Aug 22, 2018

@dbaynard I'll have to take an in-depth look at how much has to be changed and how the global project works. From a quick overall glance at the code base we have a few options:

1. The easy way with as few modifications as possible:

  • modify stack root to point to XDG_CACHE_HOME as the default (as most of it is cache data)
  • modify the config paths named above
  • modify the templates to point to XDG_DATA_HOME
  • add support for the XDG environment variables

2. The clean solution with a lot more modifications:

  • get rid of the stack root in the config entirely
  • add new configuration variables representing each of the environment variables (XDG_DATA_HOME, XDG_CONFIG_HOME, XDG_CACHE_HOME, XDG_RUNTIME_DIR)
  • find every instance of the stack root and replace with the appropriate one
  • add support for the XDG environment variables

Stack already uses the path-io module, so no new dependencies need to be added, as the following function can be used:
http://hackage.haskell.org/package/path-io-1.4.0/docs/Path-IO.html#v:getXdgDir

I'd prefer the clean solution from a code perspective, but the first one would work equally well from a practical standpoint, and it should also make the global project work as-is without any extra modifications.
NB: I haven't yet looked into how the global project works. It may or may not be much more work with the second solution.

I'm currently working on an experimental tool to generate Flatpak manifests from stack projects to allow easy distribution of Haskell binaries on GNU/Linux, so it could take a bit before I can look into this in more depth.

The main reason is backups and the ability to easily delete cache without knowing the internals of a program or excluding directories manually from backup scripts. Thankfully a lot of projects (new and old: KDE, GNOME, Chromium, Blender, GIMP 2.10, and many more) are supporting the XDG Base Directory specification.

I really hope XDG becomes the gold standard on GNU/Linux instead of the old $HOME/.myprogram, so I am also willing to work towards it whenever I have the capacity.

@dbaynard
Contributor

dbaynard commented Aug 23, 2018

That looks great @rszibele! I agree on XDG; it's good to have these reasons explicit, here.

Do note that there are some major changes to stack's caching behaviour (e.g. #4254, #3922) in progress. It seems like this change should be orthogonal. It would be very good to make this change at the same time. @snoyberg is driving those changes.

@severen

severen commented Aug 30, 2018

I would also love to see this change, just to throw my 10c (and support) into the conversation. The "standard" of throwing everything into $HOME might be the easy fire-and-forget choice, but it's terrible if you want to easily back up important files such as configuration, periodically wipe caches, and not have ls -a ~ look like a dumpster fire.

@damienflament

Just thinking about how this should behave, maybe if $XDG_CACHE_HOME is set then use that for the cached items unless the cached items are already in $HOME/.stack. This way new installations would do the right thing, but existing installations would still keep using their existing cache. And something similar for $XDG_CONFIG_HOME.

I just want to point out a misunderstanding about the XDG_* environment variables.

As specified in the XDG Base Directory Specification, the XDG_* environment variables don't have to be set. Setting an XDG_* environment variable is just a way to override its default value:

$XDG_DATA_HOME defines the base directory relative to which user specific data files should be stored. If $XDG_DATA_HOME is either not set or empty, a default equal to $HOME/.local/share should be used.

$XDG_CONFIG_HOME defines the base directory relative to which user specific configuration files should be stored. If $XDG_CONFIG_HOME is either not set or empty, a default equal to $HOME/.config should be used.
[...]

$XDG_CACHE_HOME defines the base directory relative to which user specific non-essential data files should be stored. If $XDG_CACHE_HOME is either not set or empty, a default equal to $HOME/.cache should be used.

It's really important to understand that you MUST NOT rely on the XDG_* environment variables being set: when one is unset or empty, the default location from the spec has to be used.

This may be obvious to many of you, but I wanted to point it out because a lot of software that claims to support the XDG Base Directory Specification expects those environment variables to be set. Neither Linux distributions nor shells set them (they don't have to), so such software still clutters the home directory, which makes the spec useless.

To briefly show the expected behavior, look at the shell script example below:

readonly config_dir="${XDG_CONFIG_HOME:-$HOME/.config}/stack"

To fully understand this spec, please read it.
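
The one-liner above can be extended to all three user directories. The sketch below also ignores relative values, on the assumption that the current version of the specification requires these paths to be absolute and asks implementations to treat relative ones as invalid. The helper name `xdg_dir` is hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: resolve one XDG base directory.
# $1 = value of the XDG_* variable (may be empty), $2 = default from the spec
xdg_dir() {
    case "$1" in
        /*) printf '%s\n' "$1" ;;   # absolute path: use it as given
        *)  printf '%s\n' "$2" ;;   # unset, empty, or relative: use the default
    esac
}

config_dir="$(xdg_dir "${XDG_CONFIG_HOME:-}" "$HOME/.config")/stack"
data_dir="$(xdg_dir "${XDG_DATA_HOME:-}" "$HOME/.local/share")/stack"
cache_dir="$(xdg_dir "${XDG_CACHE_HOME:-}" "$HOME/.cache")/stack"

printf '%s\n' "$config_dir" "$data_dir" "$cache_dir"
```

Passing the variable's value and its default as arguments keeps the unset, empty, and relative cases in one place instead of repeating the fallback for every variable.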

@damienflament

Just thinking about something while writing my previous comment.

Stack has a configuration file for each project and another one for the global project.
But is there a need for a system-wide configuration file? It could set default behavior for all users on the system (think of a system administrator who wants to configure the default behavior of all the computers in a university lab).

This file could be located at /etc/stack.yaml by default on Unix systems, but its location should be configurable at compile time by the package maintainer.

I'm not saying there is a need for it, but while the configuration file loading mechanism is being rewritten, it may be useful to add this feature.

What do you think?

@dbaynard
Contributor

dbaynard commented Nov 7, 2018

But is there a need for a system-wide configuration file?

Do you mean, a stack.yaml in addition to the global project file? Or an equivalent to the config.yaml file, but in /etc/?

So,

  • Project stack.yaml
  • Global project stack.yaml
  • System stack.yaml / system config.yaml

@damienflament

@dbaynard

Arf. I just saw that there is already a global non-project configuration file located at /etc/stack/config.yaml (see the documentation).

@dbaynard
Contributor

dbaynard commented Jan 17, 2019

Two things:

  1. @rszibele Are you still working on this?

  2. I've just encountered another reason to do this. I’d installed pandoc using stack, then later deleted my ~/.stack directory. As a result pandoc stopped working, giving me an error about data files.

    This is because data files are stored in ~/.stack/snapshots/$compiler/$snapshot/$ghc-version/share/$compiler/$package/data/. Pandoc has quite a few data files.

    It would be nice if this sort of thing could be separated from build caches — though for data-files I'm sure there's a workaround.

@rszibele
Contributor Author

@dbaynard Unfortunately, I'm unable to allocate any time to this issue at the moment. I'm currently caught up in a commercial project that I'm expecting to ship this June/July (if all goes well).

@dbaynard
Contributor

dbaynard commented Mar 3, 2019

Pandoc has recently done this (see jgm/pandoc#3582). It would be nice to get this for the next major release.

@dbaynard
Contributor

Closing, having added to the wishlist. PRs welcome!
