Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

git_repository_config does not invalidate a cached git_config based on last date modified #1022

Closed
jspahrsummers opened this issue Oct 25, 2012 · 8 comments
Labels

Comments

@jspahrsummers
Copy link
Contributor

git_repository_config and git_repository_config__weakptr should re-read the config file from disk if it's been modified since it was last read.

Otherwise, git may create branches which, e.g., git_branch_tracking may not know about.

@arrbee
Copy link
Member

arrbee commented Oct 25, 2012

The git_config is a composite view of up to four different config files that may be spread around the user's system. We'll need to check all four places, and we can embed that automatically into some APIs, but I'm wondering if it would be equally satisfactory to give you a git_config_refresh() API that would explicitly reload files that had been modified, instead of doing it implicitly inside some APIs...

@jspahrsummers
Copy link
Contributor Author

As mentioned in Campfire: that helps the native client case, but the user could just as easily do similar things with git on the command line. I'd rather it be safe/correct first and performant second.

@arrbee
Copy link
Member

arrbee commented Oct 25, 2012

I think one way or another, this starts with a git_config_refresh() call and we might as well put that in the public API. The next question is when libgit2 should automatically call that on the users behalf.

Do we do it every time git_repository_config__weakptr() is called internally? That is problematic because the library is written with the assumption that it is cheap to call (after the first time) and we haven't tried to prevent calling it multiple times. In some of the higher level APIs that call through to low level APIs, the deeper functions end up calling git_repository_config__weakptr() repeatedly.

So, as a particularly bad example, git_status_foreach() which internally performs a couple of diffs (thereby examining attributes and ignore rules on all files across the working directory) ends up calling git_repository_config__weakptr() 10-20 times or more (depending on how deep the directory is). That could be 40 - 80 file-up-to-date checks.

More importantly, status is hardly instantaneous; if the config is changed in mid operation by an external program, I don't think it would be "more correct" to reload the config settings. It's actually much more important that it have a consistent view of the config settings across the entire operation. The user of the libgit2 is more likely to know what is an "atomic" operation that needs a consistent view of config / attributes / etc. than the library, which is fundamentally fairly low level, can know.

@jspahrsummers
Copy link
Contributor Author

Ensuring a consistent view of the configuration across operations basically sounds like locking — perhaps an API to prevent automatic config invalidation during critical sections (for performance and/or correctness)?

Your points are all valid, but, ultimately, we need to do something to reconcile external actions upon the repository. Short of watching the filesystem (which is extremely non-portable), I'm not sure what other options there are.

@arrbee
Copy link
Member

arrbee commented Oct 25, 2012

I don't disagree. Let's get git_config_refresh() in place so there is a tool in your hands.

@arrbee
Copy link
Member

arrbee commented Nov 6, 2012

Okay, so, we have a refresh API now. @vmg Do you want to use this issue to have a discussion about the merits of automatically refreshing the config under certain circumstances? I think we're both wary about it, but it might be worth flushing out the pros and cons at some point.

One option is to write a "quick refresh" that just refreshes the repo-level config and not the other tiers, since that is the level that is most likely to be programmatically altered in a way that would be worth responding to dynamically. It could also be a fairly cheap operation if written well (although the confusion over "changes in file A are dynamically picked up and changes in B are not" might cause more confusion).

@jspahrsummers
Copy link
Contributor Author

I think that would be completely reasonable. AFAIK, git only modifies the repo config (when not using the "config" command, of course), and I think users of client apps would not be surprised if a global config change required a restart.

@carlosmn
Copy link
Member

With #2188 merged we now provide a git_repository_config() which will try to update itself on every read, and git_repository_config_snapshot() which will update before returning and then provide a snapshot, so things are up to date but won't change from under you.

So I'm going to close this as finally resolved. If there's an issue with the new behaviour, let's open an issue about that specifically.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants