New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Locking / API package config usage refactor #1736
Conversation
This removes another implicit dependency on global config from API pkg
@@ -21,28 +21,23 @@ var ( | |||
ErrNoOperationGiven = errors.New("lfs/api: no operation provided in schema") | |||
) | |||
|
|||
// EndpointSource is an interface which encapsulates the behavior of returning | |||
// `config.Endpoint`s based on a particular operation. | |||
type EndpointSource interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I deleted this because it's a subset of Configuration
, and in fact this code needs the whole of that structure so the abstraction wasn't very useful and was hiding that the code internally was accessing the global config.
@technoweenie this should address your approval condition on #1723 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking 💪 ! I left a few comments, mostly minor. The majority of my thoughts were surrounding the usage and exporting of the locking.Client
type, which I think can be avoided.
} | ||
|
||
var _ Lifecycle = new(HttpLifecycle) | ||
|
||
// NewHttpLifecycle initializes a new instance of the *HttpLifecycle type with a | ||
// new *http.Client, and the given root (see above). | ||
func NewHttpLifecycle(endpoints EndpointSource) *HttpLifecycle { | ||
func NewHttpLifecycle(cfg *config.Configuration) *HttpLifecycle { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it'd be worth documenting the default behavior if nil
is given here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call.
var ( | ||
source = &NopEndpointSource{"https://example.com"} | ||
testConfig = NewTestConfig() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func TestHttpLifecycleMakesRequestsAgainstAbsolutePath(t *testing.T) { | ||
SetupTestCredentialsFunc() | ||
defer RestoreCredentialsFunc() | ||
|
||
l := api.NewHttpLifecycle(source) | ||
l := api.NewHttpLifecycle(testConfig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Continuing my 🚋 of 💭 from the above comment, I'd be 👍 with keeping the NewTestConfig()
function and just inlining a call to it here:
l := api.NewHttpLifecycle(NewTestConfig())
// credentials as would be used to author a commit. In particular, the | ||
// "user.name" and "user.email" configuration values are used from the | ||
// config.Config singleton. | ||
func CurrentCommitter() Committer { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with keeping a dependency on the config
here. What do you think about:
func CommitterFromConfiguration(c *config.Configuration) Committer {
return Committer{
Name: c.Git.Get("user.name"),
Email: c.Git.Get("user.email")
}
}
or... using the config unmarshalling stuff like:
type Committer struct {
Name string `git:"user.name"`
Email string `git:"user.email"`
}
func NewCommitter(cfg *config.Configuration) Committer {
var c Committer
if err := cfg.Unmarshal(&c); err != nil {
// Should never happen, a `panic` is appropriate
// here.
panic(err.Error())
}
return c
}
As an added benefit, this allows us to get rid of some usage of the api.CurrentCommitter()
func.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of using Unmarshal
here. I want to get away from importing config
on every package. There's obviously a lot of work to do in our codebase, but I think the new locking
package should start off on the right foot.
package main
import (
"fmt"
"github.com/git-lfs/git-lfs/config"
"github.com/git-lfs/git-lfs/locking"
)
func main() {
cfg := config.New()
lockclient := locking.New()
cfg.Unmarshal(lockclient)
cfg.Unmarshal(&lockclient.CurrentCommitter)
fmt.Printf("%+v\n", lockclient)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. While I like the Unmarshal
approach to pure data structure population it feels a bit too "magic" and implicit to be using directly on objects like Client
. Accidental naming clashes could cause all sorts of hard to trace issues. While I agree that we should move away from accessing the global config everywhere, passing an explicit configuration structure to other packages seems a lot easier to understand & trace.
However, in order to remove the dependency on a "fat" config.Configuration
I'd be totally happy to have separate package-specific configuration structures, which themselves are populated using Unmarshal
, because that's a much more straight forward data encapsulation case, rather than having a more general purpose struct like Client
being populated directly. I'm concerned that because Client
is general purpose you could add members to it for other reasons and accidentally cause a name clash which bleeds into the config exchange. That won't happen if the package-specific config structure is only ever used for config fields.
The problem with doing this right now is that to do it on locking
I'd also have to do it on api
since that still needs the full config, which in turn would have to do it on httputil
and auth
and I'm not sure if that's even the end of it 😱 It needs doing but can't be done in isolation in locking
, which makes me think it should be a separate refactor.
@@ -22,74 +22,88 @@ var ( | |||
ErrLockAmbiguous = errors.New("lfs: multiple locks found; ambiguous") | |||
) | |||
|
|||
// LockFile attempts to lock a file on the given remote name | |||
// Client is the main interface object for the locking package | |||
type Client struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few thoughts. I really dig the idea of wrapping state up like this, but I don't think that we should expose it. Exposing a client is one more step that downstream consumers of this package have to be aware of. Instead of just saying locking.Lock(obj)
they have to first instantiate a client, maintain it somewhere, and call Lock
on that client instead of directly on the package.
What do you think about using a client, but hiding it? I'm thinking about something like:
package locking
var (
c *client
initOnce sync.Once
)
func getClient() {
initOnce.Do(func() {
c = NewClient(config.Config)
})
return c
}
func LockFile(path string) (id string, err error) {
return getClient().LockFile(path)
}
type client struct { ... }
func (c *client) LockFile(path string) (id string, err error) { ... }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would drastically simplify the calling code for this package (which I think is a 👍 ), but introduces a problem with using the global config.Config
instance by default, which is a 👎.
I think a compromise could be changing the getClient
function to take a *config.Configuration
instance, like so:
var (
cmu sync.Mutex
clients map[*config.Configuration]*client
)
func getClient(cfg *config.Configuration) *client {
cmu.Lock()
defer cmu.Unlock()
if _, ok := clients[cfg]; !ok {
clients[cfg] = newClient(cfg)
}
return clients[cfg]
}
and then changing the functions to accept a config argument first, as such:
package locking
func LockFile(cfg *config.Configuration, path string) (id string, err error) {
return getClient(cfg).LockFile(path)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exposing client objects is pretty common in API libraries. I definitely do not want to keep a global map of config objects to internal client objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this approach was specifically to address @technoweenie's comments about static variables, which we'd have to go back to to move away from the client object approach & simplify the API.
I still think that if we want to add easy external use of the lfs APIs, an additional facade package would probably go a long way. An interface that's simpler for non-core devs without compromising the internals and is easier to keep stable versus the inner implementations.
apiClient := api.NewClient(api.NewHttpLifecycle(cfg)) | ||
|
||
lockDir := filepath.Join(config.LocalGitStorageDir, "lfs") | ||
os.MkdirAll(lockDir, 0755) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we should let callers know about this error (even if it's non-fatal)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, and actually it is fatal 👍
// Client is the main interface object for the locking package | ||
type Client struct { | ||
cfg *config.Configuration | ||
apiClient *api.Client |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: let's call this api
to avoid stuttering the word "client".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't do that without making it ambiguous with the api
package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right. Bummer!
Next incremental PR for bringing locking work back:
locking.Client
as primary facade (with state)Configuration
instance not global configapi
package functions to use passed inConfiguration
instance directly instead of globalconfig.Config
(except v1 api since that will be phased out)CurrentCommitter
toConfiguration
and just keeps JSON packaging inapi
Seemed like a good place to stop & submit since this is going into
locking-master
only. Cache refactor will use this new encapsulatedClient
instead of statics.