Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not trim dots from usernames #6657

Merged
merged 2 commits into from Dec 14, 2023
Merged

Conversation

georgthegreat
Copy link
Contributor

@georgthegreat georgthegreat commented Nov 15, 2023

Sammy Davis Jr. Jr. is a valid name and the dot shoud not be trimmed.

This makes libgit2 a bit more compatible with the git parsing.
See: https://github.com/git/git/blob/master/ident.c#L275

@ethomson
Copy link
Member

Hmm. I think that we need to be a little bit more thoughtful about this than just its removal - this is a breaking change for consumers, and it's a state that we're currently testing for.

In addition, it's not clear to me what we should do here, since git does trim trailing dots when generating new signatures for a commit:

GIT_AUTHOR_NAME="Sammy Davis Jr. Jr." GIT_AUTHOR_EMAIL="sammy.foo@foo.bar." git commit -mtest
[main 3faeab5] yo
 Author: Sammy Davis Jr. Jr <sammy.foo@foo.bar>
 1 file changed, 9 insertions(+)
 create mode 100644 test

So I would expect that if we want to pursue this that it should be an opt-in behavior. But I'm curious about the use case that you're seeing here? Is it that you want trailing dots in a name because - yes - it should be possible? If so, we should consider a mechanism for git and libgit2 to unify here, if there isn't prior art for git itself. Or are you trying to round-trip existing data? Those feel like two different use-cases.

@georgthegreat
Copy link
Contributor Author

georgthegreat commented Nov 20, 2023

I want to keep dots inside user names (keeping them inside email is more like a side-effect: trimming from names only required a sligthly more complicated code).

We do git commit objects conversion into our internal format and back.
As signature is a part of git hash, we need signature to be kept intact.

At the time re-creating git commit with dots inside names is not possible.
We have similar patch applied locally, I am just upstreaming it.

@georgthegreat
Copy link
Contributor Author

georgthegreat commented Nov 20, 2023

This is the git commit that removed dot from the list of cruds.

@ethomson
Copy link
Member

We do git commit objects conversion into our internal format and back.
As signature is a part of git hash, we need signature to be kept intact.

OK, thanks, that makes sense. I think that there's two problems here:

  1. We no longer match the git behavior w/r/t dots. (Thanks for the pointer to that commit, helpful information.) We should probably snap to that.

  2. You don't have a way to round-trip - or otherwise intentionally - create "broken" commits. Even if you "shouldn't" have dots in a signature (or whatever, semicolons, the details don't really matter here), these exist in the wild and you need to be able to re-create them. So we need a mechanism to turn these sorts of checks or transformations off.

    I think that there's actually two pieces here - the commit creation and the commit parsing. What I've been thinking about on the commit parsing side is that we should be able to reimplement git fsck, which is to say, a totally pedantic mode that will report any issue about a commit. But by default we parse (if possible) the commit. Probably commit creation needs to behave similarly.

@georgthegreat
Copy link
Contributor Author

So we need a mechanism to turn these sorts of checks or transformations off.

I do not think there is a need for such option.
Having dots is not wrong.

Getting fsck warning is not very helpful either: we can not rewrite the original commit to fix it as we do not control repositories we are converting from, and we are unable to rewrite them.

@ethomson
Copy link
Member

I do not think there is a need for such option.
Having dots is not wrong.

That's only "not wrong" because git changed the definition of wrongness in August. Before that, it was wrong.

But ignore dots for a moment. What about all the other things that we elide from a signature? Semicolons for example.

Imagine a repository that was able to have a commit that has a signature with a trailing semicolon? How do you round trip that? You cannot today, hence you need a mechanism to do so.

Getting fsck warning is not very helpful either: we can not rewrite the original commit to fix it as we do not control repositories we are converting from, and we are unable to rewrite them.

I'm not suggesting it's useful for your use case. If I were trying to fsck a repository, though; it would be very helpful, indeed.

@georgthegreat
Copy link
Contributor Author

I understand your point here.
Maybe we will come with a PR introducing git2_recreate_commit_without_validation function.

However, this is not the case for the current PR — it just synchronizes libgit2 and git behavior.

@ethomson
Copy link
Member

I understand your point here.
Maybe we will come with a PR introducing git2_recreate_commit_without_validation function.

Yes, I'm speaking in the abstract. I'm not expecting you to solve this problem in this PR. Apologies if I was unclear.

However, this is not the case for the current PR — it just synchronizes libgit2 and git behavior.

I think that's great. Many unit tests are asserting the current behavior though, and need to be updated to understand how dot trimming should work.

@georgthegreat
Copy link
Contributor Author

georgthegreat commented Nov 21, 2023

@ethomson, I have fixed the tests.

@georgthegreat
Copy link
Contributor Author

@ethomson, let's merge it.

@ethomson ethomson merged commit a8cca4d into libgit2:main Dec 14, 2023
26 checks passed
@ethomson
Copy link
Member

Thanks for the fix!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants