Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce and use more robust worktree-list wrappers #4926

Closed
wants to merge 2 commits into from

Conversation

vermiculus
Copy link
Contributor

The function for this purpose, magit-list-worktrees, is limited to providing a fairly sparse set of information about each worktree: the worktree path, its current HEAD and checked-out branch, and whether or not it's bare. This isn't sufficient for other tooling (or future Magit functionality) that wants to use other properties of the worktree (e.g., whether or not it's prunable).

Introduce magit-list-worktrees-raw to generically capture every property reported by Git as a string-valued hashmap. The post-processing that had been present in magit-list-worktrees is moved to magit-list-worktrees-fix-worktree so that it can be mapped over magit-list-worktrees-raw.

magit-list-worktrees is re-implemented in terms of these two functions to keep the existing interface for legacy code.

Care is taken in the implementation of magit-list-worktrees-raw to avoid unnecessary string allocations by using string-match-p and substring instead of string-split.


I'm not sure of a better name for magit-list-worktrees-raw than what I chose, but it certainly seems lacking. Of course use whatever name you see fit. Perhaps there are similar functions in the codebase that you're aware which can be modeled after here.

@tarsius
Copy link
Member

tarsius commented Apr 23, 2023

Remove kludge for ancient versions of Git

We still support even older versions, so please restore that.

@tarsius
Copy link
Member

tarsius commented Apr 23, 2023

Why use a hashtable? When there are as few elements as here, then that is slower than an alist, and it makes the code harder to read.

@vermiculus
Copy link
Contributor Author

When there are as few elements as here, then that is slower than an alist

Do you happen have a reference for this? I'm having trouble finding actual benchmarks – just various vague remarks. It makes intuitive sense that hashmaps would have some constant overhead that could be worse than linear lookup here, but I'm very curious to the actual characteristics here.

At any rate, what's hard to read about the hashmap implementation? Is it the string-valued keys or something else? I'm trying to determine if I should keep the keys as strings or intern them into symbols. I don't see a meaningful distinction between the options, so I'm not sure what direction you want this to go:

(gethash "worktree" wt)        (puthash "worktree" val wt)
(cdr (assoc "worktree" wt))    (setcdr (assoc "worktree" wt) val)
(cdr (assq 'worktree wt))      (setcdr (assq 'worktree wt) val)
(alist-get 'worktree wt)       (setf (alist-get 'worktree wt) val)
(plist-get 'worktree wt)       (plist-put 'worktree wt val)

If we're going alist, I would expect either the second or forth, yeah?

The function for this purpose, `magit-list-worktrees', is limited to
providing a fairly sparse set of information about each worktree: the
worktree path, its current HEAD and checked-out branch, and whether or
not it's bare. This isn't sufficient for other tooling (or future
Magit functionality) that wants to use other properties of the
worktree (e.g., whether or not it's prunable).

Introduce `magit-list-worktrees-raw' to generically capture every
property reported by Git as a string-valued hashmap. The
post-processing that had been present in `magit-list-worktrees' is
moved to `magit-list-worktrees-fix-worktree' so that it can be mapped
over `magit-list-worktrees-raw'.

`magit-list-worktrees' is re-implemented in terms of these two
functions to keep the existing interface for legacy code.

Care is taken in the implementation of `magit-list-worktrees-raw' to
avoid unnecessary string allocations by using `string-match-p' and
`substring' instead of `string-split'.
@vermiculus
Copy link
Contributor Author

We still support even older versions, so please restore that.

Kludge restored 😃

@vermiculus
Copy link
Contributor Author

I've adopted the alist-get approach, but left it as a separate commit in case you wanted me to implement something else.

@tarsius
Copy link
Member

tarsius commented Apr 24, 2023

Funny enough I looked at this function a few days ago (because the byte-compiler complained about some dead code) and wondered whether to extend it. I decided to not do that until there is a need for it. ;P

While your implementation may be more future-proof, I don't think it is more robust. Using -z isn't necessary because the values cannot contain any newlines. Even if the user specified a --reason containing a newline. That is encoded like locked "abc\nxyz". But your implementation is more complex, so there is more room for errors.

What do you think about the extended implementation in the worktree-info branch?

@vermiculus
Copy link
Contributor Author

vermiculus commented Apr 24, 2023

I've been working on Windows more lately (sobs) and we've seen Git tooling choke on filepaths with special characters, so the use of -z in our own tooling has been pretty embedded in my brain. I suggest the same best practice be followed here -- Git's behavior of quoting special characters in the worktree path can lead to unusable paths -- ultimately, RET on the section would fail. (It's worth pointing out, just for laughs, that linux paths can contain newlines IIRC. If you do that intentionally, you're of course asking for trouble 😃) For a more reasonable situation, if locked does contain a reason with a newline, downstream code has to de-encode Git's output here to display it intelligently. Using -z avoids that mess.

The direction you take with magit--insert-worktree is interesting; I had not considered using advice instead of public customization. That does leave the door open to accidentally breaking config if this private function is renamed/removed in the future, though. (For context, I'm likely looking at a pool of 30-50 users who will be using my extension package at $DAYJOB. If it were just me, I wouldn't be so skittish.) I could easily work with your approach (and close #4925), but I'd be more comfortable if it were marked public unless you already as a rule don't rename/remove private functions without a deprecation period.

With your advice-centered approach, I would recommend feeding in the worktree list/alist/map object itself in magit--insert-worktree rather than just the path; that would make my life slightly easier, at least, so I assume the same would be true for others :-) I wrote a naive patch implementing this in vermiculus@17d7fa0 – and a follow-up commit using the alist approach in the next commit (vermiculus:worktree-info -- vermiculus@d537196).

@tarsius
Copy link
Member

tarsius commented Apr 24, 2023

I've added -z.

That does leave the door open to accidentally breaking config if this private function is renamed/removed in the future, though. (For context, I'm likely looking at a pool of 30-50 users who will be using my extension package at $DAYJOB. If it were just me, I wouldn't be so skittish.)

I've added a comment to ping you before making a breaking change.

Also, since 30-50 colleagues use Magit, could you maybe ask your employer to give me some money?

With your advice-centered approach, I would recommend feeding in the worktree list/alist/map object itself in magit--insert-worktree

Makes sense. What do you think of the new signature?

@vermiculus
Copy link
Contributor Author

Also, since 30-50 colleagues use Magit, could you maybe ask your employer to give me some money?

It's certainly in the plan for me to ask if I do get the adoption I'm expecting :-) but as it stands, I know of only three people (including myself) who use it – with a fourth 'on the way' as it were. Keep in mind though that even if 30-50 devs adopt Magit, we're still only talking ~1% of our ~3k developers – even with the customized tooling I'm building into workflows (via transients and status-/revision-buffers). I'm going to push for it, but it'll be a tough sell. Emacs is still perceived as too obscure – and performance on Windows is just not there yet. (I'm personally putting as much time as I can spare into libgit2/libgit2#6202, which should at least help a little bit with libegit integration.) Given my increased usage of it the past couple months during our migration (and that I'm using it for at $DAYJOB), I was able to start sponsoring personally at least! I'll also make a point to talk to my current teammates using Magit to consider sponsoring.

So my priorities right now with making it as easy as possible for our devs to use Magit are:

  • tight integrations / linking with our development-tracking system
  • more worktree functionality (as our workflows – coming from SVN – are primarily worktree-driven)
  • performance improvements (I'm aware of selective refresh and async section-population as two angles you're wanting to tackle; let me know if you have a maintained collection of other ideas)
  • explore hooking up Forge to our bug-tracker
  • explore packaging Magit as a standalone git client (there's a tracking issue for this somewhere – I know I've seen it) to get folks who won't buy into emacs as a system

One of the advantages I have in my current role is an explicit focus on developer tooling and productivity (in a 'you are paid to work on this' sense), so I'm hoping to work on these things at $DAYJOB as priorities permit.


I think your new signature is good. I like the improvement of passing the propertized and padded label – that certainly makes implementing replacement functions simpler (not having to worry about align at all).

I don't understand the advantage of keeping everything in a flat list that requires referencing documentation to use when more structured options are available that are more self-documenting.

@tarsius
Copy link
Member

tarsius commented Apr 25, 2023

I was able to start sponsoring personally at least! I'll also make a point to talk to my current teammates using Magit to consider sponsoring.

Thanks!

I don't understand the advantage of keeping everything in a flat list that requires referencing documentation to use when more structured options are available that are more self-documenting.

I agree that using an alist would result in a better interface.

The advantage of not doing that, is that it is less work for me. I don't have to think about how to restructure the code, and I don't have to make any design decisions regarding the interface.

E.g., should it return ALIST or (PATH . ALIST)? Only the former would be fully self-documenting, but for 5 out of 6 callers that would complicate matters. So should I optimize for how the function is actually being used or for what is theoretically the best interface?

Since this function, like all helper functions (even if not explicitly marked as private) are primarily intended for internal purposes, I choose to optimize for that. Sometimes it makes sense to spend time getting an interface just right, but Magit is large and has few maintainers, so I do have to cut corners sometimes.

For Magit's own purposes this function doesn't have to be changed at all. What I am doing now seems like a good compromise; you get the extra data you need and I don't have to invest too much time making changes, that don't actually, at this time, benefit Magit itself.

So this is a bit inconvenient for you, but this is probably something you can just deal with once and then mostly forget about.

@tarsius tarsius closed this Apr 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants