Should rss2email.json be stored in XDG_DATA_HOME? #164

Closed
hartwork opened this issue Feb 2, 2021 · 23 comments · Fixed by #169

Comments

@hartwork

hartwork commented Feb 2, 2021

Hi!

The current location ~/.local/share/ is a weird place to keep state with regard to the Filesystem Hierarchy Standard (FHS). It's the .local equivalent of /usr/share, which is intended for "Architecture-independent" (see /usr/share) "read-only" (see /usr) data. Maybe ~/.local/var/lib/ ("State information. Persistent data modified by programs as they run", see /var/lib) is a better place for the current usage?

Best, Sebastian

@Ekleog
Member

Ekleog commented Feb 3, 2021

AFAIK XDG_DATA_HOME is the place where rss2email.json should live, and the linked XDG specification shows that it should default to $HOME/.local/share.
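
The defaulting rule mentioned here can be sketched in a few lines of Python. This is only an illustration, not rss2email's code: the function name `xdg_data_home` and its `environ` parameter are hypothetical. It treats an empty variable like an unset one, which is how the specification words the fallback.

```python
import os


def xdg_data_home(environ=os.environ):
    """Return the base directory for user-specific data files.

    An unset *or empty* XDG_DATA_HOME falls back to ~/.local/share.
    """
    value = environ.get("XDG_DATA_HOME", "")
    if value:
        return value
    # Default from the XDG Base Directory Specification.
    return os.path.join(os.path.expanduser("~"), ".local", "share")
```

Passing a plain dict instead of `os.environ` makes the behaviour easy to check, for example `xdg_data_home({"XDG_DATA_HOME": ""})`.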

To the best of my knowledge, XDG overall governs what goes in $HOME, and FHS overall governs what goes in /.

Does what I'm saying make sense?

@hartwork
Author

hartwork commented Feb 3, 2021

Hi Léo, thanks for your interest in this ticket!

I checked that link that you provided in more detail. To my understanding, the combination of statement…

If $XDG_DATA_DIRS is either not set or empty, a value equal to /usr/local/share/:/usr/share/ should be used.

…and statement…

$XDG_DATA_DIRS defines the preference-ordered set of base directories to search for data files in addition to the $XDG_DATA_HOME base directory.

…in particular means that $XDG_DATA_HOME is a user-specific version of /usr/share. Would you agree?
/usr/share is read-only (or write-once) data i.e. not state with regard to FHS.
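
To make the two quoted statements concrete, here is a minimal sketch of how the search order they describe could be built. The names `data_search_path` and `DEFAULT_DATA_DIRS` are made up for this illustration; only the environment variables and the default values come from the spec text quoted above.

```python
import os

# Default quoted from the spec for an unset or empty XDG_DATA_DIRS.
DEFAULT_DATA_DIRS = "/usr/local/share/:/usr/share/"


def data_search_path(environ=os.environ):
    """Return base directories to search for data files, most preferred first."""
    home = environ.get("XDG_DATA_HOME") or os.path.join(
        os.path.expanduser("~"), ".local", "share")
    dirs = environ.get("XDG_DATA_DIRS") or DEFAULT_DATA_DIRS
    # The user-specific directory comes first; empty entries are dropped
    # so that a stray ':' does not produce an empty path.
    return [home] + [d for d in dirs.split(":") if d]
```

With nothing set, the result is the user's ~/.local/share followed by /usr/local/share/ and /usr/share/, which is the "user-specific version of /usr/share" reading discussed here.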

I find support for that interpretation at direnv/direnv#406 (comment), but I'd agree with anyone saying that the spec is not explicit enough about state versus non-state: there should be a single sentence in that spec for us to quote, and then we'd have a clear answer.

To me personally it is clear that a user-specific version of /var/lib/ is just what we need here but given the XDG text, I understand it's a difficult sell.

It seems to be a hole in XDG; https://wiki.debian.org/XDGBaseDirectorySpecification#Proposal:_STATE_directory reads as if the problem has not been fully addressed yet, either.

What do you think?

@Ekleog
Member

Ekleog commented Feb 3, 2021

My pleasure! I know you have opened other ones, but this one seemed quick enough to answer, while others will have to wait for a longer span of consecutive free time… sorry about it!

I agree with you that XDG sounds like XDG_DATA_HOME is a user-specific version of XDG_DATA_DIRS, but user-specific also means that what is read-only becomes read-write — this is also, I think, the reason why XDG_DATA_HOME is not just one more element added to the default XDG_DATA_DIRS list, but a variable by itself.

Off the top of my head, I'd say that we care about supporting XDG as it currently exists more than hypothetical extensions of it that have not yet been standardized. Especially seeing the table at the bottom of your link, I'm not sure I see a meaningful difference between DATA and STATE.

I also see that on my local machine, .local/share appears to have a significant number of state directories similar to the data handled by rss2email.json.

So I'd say, yes the situation is imperfect, and it'd be nicer if the XDG spec could explicitly state that XDG_DATA_HOME is made for mutable data too, but as things are, there is no de jure standard for a good location for these data, and the de facto standard is to put it in XDG_DATA_HOME. Which makes me not see any better solution than what is currently implemented, at least until a STATE directory is accepted into XDG (and then we'd have to think about a migration plan).

Does that make sense?

@hartwork
Author

hartwork commented Feb 3, 2021

The more I think about it, even considering read-only /usr/share and /usr/local/share for the location of the state file at

def _get_datafile_path(self):
    """Get the data file path
    Following the XDG Base Directory Specification.
    """
    data_home = _os.environ.get(
        'XDG_DATA_HOME',
        _os.path.expanduser(_os.path.join('~', '.local', 'share')))
    data_dirs = [data_home]
    data_dirs.extend(
        _os.environ.get(
            'XDG_DATA_DIRS',
            ':'.join([
                _os.path.join(ROOT_PATH, 'usr', 'local', 'share'),
                _os.path.join(ROOT_PATH, 'usr', 'share'),
            ]),
        ).split(':'))
    datafiles = [_os.path.join(data_dir, 'rss2email.json')
                 for data_dir in data_dirs]
    for datafile in datafiles:
        if _os.path.isfile(datafile):
            return datafile
    return datafiles[0]
is, to me, an indicator that neither XDG_DATA_HOME nor XDG_DATA_DIRS should have any say about the location of that file (because /usr/share is read-only according to FHS).

I'm not sure if I have a realistic chance of fixing the XDG spec. If you want to keep the status quo until someone fixes the spec, that's not what I was hoping for, but it's okay with me.

@Ekleog Ekleog changed the title from "Location ~/.local/share/rss2email.json is a weird place to keep state with regard to FHS" to "rss2email.json should not be looked for in XDG_DATA_DIRS, only in XDG_DATA_HOME" Feb 4, 2021
@Ekleog
Member

Ekleog commented Feb 4, 2021

Uhhh yes we definitely should not be looking through XDG_DATA_DIRS for rss2email.json, only in XDG_DATA_HOME as things currently stand — as you say, it makes no sense to look for a mutable file at globally read-only locations. Thank you for pointing to this code!

I'd be glad to accept a PR with a changelog update that fixes that to only look at XDG_DATA_HOME, which currently is the least bad place where we could put this data that I know of :)
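
For illustration only, a lookup restricted to XDG_DATA_HOME might look like the sketch below. It is not the change that was later merged; the standalone function and its `environ` parameter are assumptions made to keep the example self-contained.

```python
import os


def get_datafile_path(environ=os.environ):
    """Return the path of rss2email.json under XDG_DATA_HOME only.

    XDG_DATA_DIRS is deliberately not consulted: its default entries
    (/usr/local/share, /usr/share) are read-only locations, so a mutable
    file would never be written there.
    """
    data_home = environ.get("XDG_DATA_HOME") or os.path.join(
        os.path.expanduser("~"), ".local", "share")
    return os.path.join(data_home, "rss2email.json")
```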

@hartwork
Author

hartwork commented Feb 6, 2021

Uhhh yes we definitely should not be looking through XDG_DATA_DIRS for rss2email.json

Good.

, only in XDG_DATA_HOME as things currently stand — as you say, it makes no sense to look for a mutable file at globally read-only locations.

I don't follow why you consider XDG_DATA_HOME still in the game if XDG_DATA_DIRS is out. XDG_DATA_HOME is defaulting to ~/.local/share/ with emphasis on share because it's the counterpart of /usr/share. It's write-once, it's not for mutable state.

I'd be glad to accept a PR with a changelog update that fixes that to only look at XDG_DATA_HOME, which currently is the least bad place where we could put this data that I know of :)

I'm happy to help out with a PR, but if I do the work, XDG_DATA_HOME cannot be the future write location: I don't consider it a good choice, so I would be in conflict with myself. If I am to make a pull request, it will:

  • Use the current code in _get_datafile_path to determine the old supported location and, if the file exists there, read it from there for backwards compatibility
  • Add new code to write to ~/.local/var/lib/.
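
The two bullets above could be sketched roughly as follows. This is a hypothetical helper, not code from any pull request: `choose_paths` and its parameters are invented names, and preferring an already existing file at the new location is an assumption of this sketch.

```python
import os


def choose_paths(legacy_dirs, new_dir, filename="rss2email.json"):
    """Return (read_path, write_path) for the state file.

    Reads come from the new location if the file already exists there,
    otherwise from the first legacy directory that has it. Writes always
    go to the new location.
    """
    write_path = os.path.join(new_dir, filename)
    if os.path.isfile(write_path):
        return write_path, write_path
    for directory in legacy_dirs:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            # Backwards compatibility: read the old file once, then the
            # next save lands in the new location.
            return candidate, write_path
    return write_path, write_path
```

A caller would pass the directories the current code searches as `legacy_dirs` and the expanded `~/.local/var/lib/` as `new_dir`.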

If that's interesting, we have a deal.

@auouymous
Contributor

And which XDG variable do I set if I don't want it in ~/.local/var/lib/? Do we make one up and hope a future update to the specification uses it? The specification is flawed and XDG_DATA_HOME is the best we currently have; it is also what many other projects use for similar files.

@hartwork
Author

hartwork commented Feb 7, 2021

Hi @auouymous ,

And which XDG variable do I set if I don't want it in ~/.local/var/lib/?

is that needed? Maybe as root in /var/lib, okay. Anything else?

Do we make one up and hope a future update to the specification uses it?

That's a good question.

The specification is flawed

I agree.

and XDG_DATA_HOME is the best we currently have [..].

I do not agree. I think XDG_DATA_HOME is "terrible", that's the whole point of the ticket.

@auouymous
Contributor

The variables are there to let the user choose, so yes, it is needed.

XDG_CACHE_HOME, XDG_CONFIG_HOME, XDG_DATA_HOME and non-compliance are the only options. It is clearly not a config file the user should be modifying. Cache implies the data can be recovered, and while this is partially valid, you won't end up in the same state if the file is deleted (seen entries will be resent). Violating the spec only makes it easier for other projects to do the same, but in their own ways. The end result is a directory full of garbage and no standard hierarchy.

@hartwork
Author

hartwork commented Feb 7, 2021

The variables are there to let the user choose, so yes, it is needed.

Personally, I don't see it. For a user other than root, I don't see why a non-default and variable XDG_DATA_HOME would be needed. EDIT: Maybe it's there just because of root.

XDG_CACHE_HOME, XDG_CONFIG_HOME, XDG_DATA_HOME and non-compliance are the only options.

The list is missing XDG_RUNTIME_DIR. That may even be the closest match, less "wrong" than XDG_DATA_HOME in my book at least.

It is clearly not a config file the user should be modifying. Cache implies the data can be recovered, and while this is partially valid, you won't end up in the same state if the file is deleted (seen entries will be resent).

Full ack.

@hartwork
Author

hartwork commented Feb 7, 2021

@auouymous @Ekleog I'm preparing a mail to the xdg mailing list now…

@auouymous
Contributor

XDG_RUNTIME_DIR
May be subject to periodic cleanup.
Should not store large files as it may be mounted as a tmpfs.

Yup, a directory that is cleaned or doesn't persist sounds like the best choice. ;)

@hartwork
Author

hartwork commented Feb 7, 2021

XDG_RUNTIME_DIR
May be subject to periodic cleanup.
Should not store large files as it may be mounted as a tmpfs.

Yup, a directory that is cleaned or doesn't persist sounds like the best choice. ;)

Well, it has its own can of worms: there is no clear default, the app needs to warn if it's unset, residing in memory may make it go away on every reboot, "should not place larger files" is vague both in general and for our very case, and "should use this directory for communication and synchronization purposes" does not fit our case either. So not exactly a great match either 😞
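
The "no clear default, warn if unset" part can be illustrated with a small sketch. The function name `runtime_dir` and the choice of `tempfile.mkdtemp` as the replacement directory are assumptions for this example, not anything rss2email does.

```python
import os
import sys
import tempfile


def runtime_dir(environ=os.environ):
    """Return XDG_RUNTIME_DIR, or a private temporary directory plus a warning."""
    value = environ.get("XDG_RUNTIME_DIR")
    if value:
        return value
    # There is no default for this variable, so the caller has to pick a
    # replacement and tell the user. mkdtemp creates a directory that is
    # only accessible to the current user.
    fallback = tempfile.mkdtemp(prefix="rss2email-runtime-")
    print(f"warning: XDG_RUNTIME_DIR is not set; using {fallback}",
          file=sys.stderr)
    return fallback
```

Either way the directory may not survive a reboot, which is why it is a poor fit for a file whose loss causes already seen entries to be resent.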

@hartwork
Author

hartwork commented Feb 7, 2021

@auouymous @Ekleog I'm preparing a mail to the xdg mailing list now…

Mail sent. The thread is up here in the mail archive. Wish me replies 🙏 😃

@hartwork hartwork changed the title from "rss2email.json should not be looked for in XDG_DATA_DIRS, only in XDG_DATA_HOME" to "Should rss2email.json be stored in XDG_DATA_HOME?" Feb 7, 2021
@hartwork
Author

So that e-mail thread on the xdg mailing list uncovered that there is a new variable XDG_STATE_HOME in the related Git repository, but it's not in any released version of the spec, so there is no guarantee that it will be released like that, and hence there is some risk in building upon unreleased things. Other than that, XDG_STATE_HOME looks like a fit to me. What do you think?

@Ekleog
Member

Ekleog commented Feb 21, 2021

On your spec link, I read:

The $XDG_STATE_HOME contains state data that should persist between (application) restarts, but that is not important or portable enough to the user that it should be stored in $XDG_DATA_HOME.

This, to me, reads like the correct directory is XDG_DATA_HOME even with the updated spec, because if we lose the data stored there, you'll potentially receive thousands of new mails. By contrast, the examples listed just below (“actions history (logs, history, recently used files, …)” and “current state of the application that can be reused on a restart (view, layout, open files, undo history, …)”) don't match the way rss2email currently uses XDG_DATA_HOME.

But then, I may have missed something?

@hartwork
Author

@Ekleog I think your point is about short-term state versus long-term state. My understanding is that:

  • short-term state should go to XDG_RUNTIME_DIR,
  • long-term state should go to XDG_STATE_HOME, and
  • XDG_DATA_HOME should not be home to any state, long or short.

But that's my interpretation: The spec does a terrible job at drawing precise lines between things and I understand if we have three different interpretations with three people.

@Ekleog
Member

Ekleog commented Feb 22, 2021

Hmm? My interpretation of this sentence:

The $XDG_STATE_HOME contains state data that should persist between (application) restarts, but that is not important or portable enough to the user that it should be stored in $XDG_DATA_HOME.

is that by inverting the sentence, I get:

The $XDG_DATA_HOME contains [things that include] state data that should persist between (application) restarts, and that is important and portable enough to the user that it should not be stored in $XDG_STATE_HOME

@hartwork
Author

I'm not sure I understand. What do you mean?

@Ekleog
Member

Ekleog commented Feb 22, 2021

What I mean is that if I take the contents of the first quote, and just reword it so the subject is now $XDG_DATA_HOME, it gives a sentence that, to me, sounds like XDG_DATA_HOME should be used to store long-term important state, while XDG_STATE_HOME should be used to store long-term unimportant state (like undo history, etc.)
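
This reading can be written down as a tiny decision helper. It is a sketch of the interpretation given in this comment, not of any agreed behaviour: the name `base_dir_for_state` is hypothetical, and the `~/.local/state` default for XDG_STATE_HOME is an assumption taken from the then-unreleased spec text.

```python
import os


def base_dir_for_state(important, environ=os.environ):
    """Pick a base directory for long-term state.

    important=True  -> XDG_DATA_HOME  (losing it hurts, e.g. mails are resent)
    important=False -> XDG_STATE_HOME (undo history, recently used files, ...)
    """
    home = os.path.expanduser("~")
    if important:
        return environ.get("XDG_DATA_HOME") or os.path.join(
            home, ".local", "share")
    # Assumed default for the draft variable.
    return environ.get("XDG_STATE_HOME") or os.path.join(
        home, ".local", "state")
```

Under this reading, rss2email.json would be an `important=True` case.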

@hartwork
Author

Hi @Ekleog, there's an emphasis on "unimportant" there, true, you have a point, yes.

The more I think about the spec, the more I'm tearing myself apart about it. It's probably best if I take myself out of the equation here and just let you pick something; whatever I would suggest will be in conflict either with part of what the spec says verbatim or with what I believe the spec was meant to say. I still believe that XDG_DATA_HOME was never about any state initially (but about resource files like fonts or artwork), and as long as I still have that belief, I'll probably be of limited use here.

If taking me out of the equation means we're closing the ticket and sticking with the status quo, okay. If it means we're sticking with XDG_DATA_HOME but taking XDG_DATA_DIRS (or just its read-only defaults) out, okay. If we move to the pre-release XDG_STATE_HOME and consider rss2email's data closer to the 0% end of the importance spectrum than to the 100% end, so that it does qualify as "unimportant enough data", okay. How would you like to go forward?

@auouymous
Contributor

XDG_STATE_HOME should not even be up for consideration until it is part of a released spec. I do agree that the XDG_DATA_DIRS and related code should be removed.

Ekleog added a commit to Ekleog/rss2email that referenced this issue Mar 19, 2021
Ekleog added a commit that referenced this issue Mar 19, 2021
@Ekleog
Member

Ekleog commented Mar 19, 2021

Well, I've just pushed and landed #169. IMO this solves this issue, so let's close for now; if other people think the issue was not well solved and have arguments that were not already raised, let's re-discuss it! Thank you everyone for your input :)
