Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Locking / API package config usage refactor #1736

Merged
merged 7 commits into from Dec 8, 2016

Conversation

sinbad
Copy link
Contributor

@sinbad sinbad commented Dec 5, 2016

Next incremental PR for bringing locking work back:

  • Introduce locking.Client as primary facade (with state)
  • Explicit use of Configuration instance not global config
  • Change remaining api package functions to use passed in Configuration instance directly instead of global config.Config (except v1 api since that will be phased out)
  • Promotes CurrentCommitter to Configuration and just keeps JSON packaging in api

Seemed like a good place to stop & submit since this is going into locking-master only. Cache refactor will use this new encapsulated Client instead of statics.

@sinbad sinbad added the review label Dec 5, 2016
@@ -21,28 +21,23 @@ var (
ErrNoOperationGiven = errors.New("lfs/api: no operation provided in schema")
)

// EndpointSource is an interface which encapsulates the behavior of returning
// `config.Endpoint`s based on a particular operation.
type EndpointSource interface {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I deleted this because it's a subset of Configuration, and in fact this code needs the whole of that structure so the abstraction wasn't very useful and was hiding that the code internally was accessing the global config.

@sinbad
Copy link
Contributor Author

sinbad commented Dec 5, 2016

@technoweenie this should address your approval condition on #1723

Copy link
Contributor

@ttaylorr ttaylorr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking 💪 ! I left a few comments, mostly minor. The majority of my thoughts were surrounding the usage and exporting of the locking.Client type, which I think can be avoided.

}

var _ Lifecycle = new(HttpLifecycle)

// NewHttpLifecycle initializes a new instance of the *HttpLifecycle type with a
// new *http.Client, and the given root (see above).
func NewHttpLifecycle(endpoints EndpointSource) *HttpLifecycle {
func NewHttpLifecycle(cfg *config.Configuration) *HttpLifecycle {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it'd be worth documenting the default behavior if nil is given here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call.

var (
source = &NopEndpointSource{"https://example.com"}
testConfig = NewTestConfig()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm... I'd rather not have this be a value shared between tests, since there are still some mutative functions on the *config.Configuration type, which could introduce implicit order-based failures.

func TestHttpLifecycleMakesRequestsAgainstAbsolutePath(t *testing.T) {
SetupTestCredentialsFunc()
defer RestoreCredentialsFunc()

l := api.NewHttpLifecycle(source)
l := api.NewHttpLifecycle(testConfig)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Continuing my 🚋 of 💭 from the above comment, I'd be 👍 with keeping the NewTestConfig() function and just inlining a call to it here:

l := api.NewHttpLifecycle(NewTestConfig())

// credentials as would be used to author a commit. In particular, the
// "user.name" and "user.email" configuration values are used from the
// config.Config singleton.
func CurrentCommitter() Committer {
Copy link
Contributor

@ttaylorr ttaylorr Dec 5, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with keeping a dependency on the config here. What do you think about:

func CommitterFromConfiguration(c *config.Configuration) Committer {
        return Committer{
                Name: c.Git.Get("user.name"),
                Email: c.Git.Get("user.email")
        }
}

or... using the config unmarshalling stuff like:

type Committer struct {
        Name  string `git:"user.name"`
        Email string `git:"user.email"`
}

func NewCommitter(cfg *config.Configuration) Committer {
        var c Committer
        if err := cfg.Unmarshal(&c); err != nil {
                // Should never happen, a `panic` is appropriate
                // here.
                panic(err.Error())
        }

       return c
}

As an added benefit, this allows us to get rid of some usage of the api.CurrentCommitter() func.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of using Unmarshal here. I want to get away from importing config on every package. There's obviously a lot of work to do in our codebase, but I think the new locking package should start off on the right foot.

package main

import (
  "fmt"

  "github.com/git-lfs/git-lfs/config"
  "github.com/git-lfs/git-lfs/locking"
)

func main() {
  cfg := config.New()
  lockclient := locking.New()
  cfg.Unmarshal(lockclient)
  cfg.Unmarshal(&lockclient.CurrentCommitter)

  fmt.Printf("%+v\n", lockclient)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. While I like the Unmarshal approach to pure data structure population it feels a bit too "magic" and implicit to be using directly on objects like Client. Accidental naming clashes could cause all sorts of hard to trace issues. While I agree that we should move away from accessing the global config everywhere, passing an explicit configuration structure to other packages seems a lot easier to understand & trace.

However, in order to remove the dependency on a "fat" config.Configuration I'd be totally happy to have separate package-specific configuration structures, which themselves are populated using Unmarshal, because that's a much more straight forward data encapsulation case, rather than having a more general purpose struct like Client being populated directly. I'm concerned that because Client is general purpose you could add members to it for other reasons and accidentally cause a name clash which bleeds into the config exchange. That won't happen if the package-specific config structure is only ever used for config fields.

The problem with doing this right now is that to do it on locking I'd also have to do it on api since that still needs the full config, which in turn would have to do it on httputil and auth and I'm not sure if that's even the end of it 😱 It needs doing but can't be done in isolation in locking, which makes me think it should be a separate refactor.

@@ -22,74 +22,88 @@ var (
ErrLockAmbiguous = errors.New("lfs: multiple locks found; ambiguous")
)

// LockFile attempts to lock a file on the given remote name
// Client is the main interface object for the locking package
type Client struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few thoughts. I really dig the idea of wrapping state up like this, but I don't think that we should expose it. Exposing a client is one more step that downstream consumers of this package have to be aware of. Instead of just saying locking.Lock(obj) they have to first instantiate a client, maintain it somewhere, and call Lock on that client instead of directly on the package.

What do you think about using a client, but hiding it? I'm thinking about something like:

package locking


var (
        c        *client
        initOnce sync.Once
)

func getClient() {
        initOnce.Do(func() {
                c = NewClient(config.Config)
        })

        return c
}

func LockFile(path string) (id string, err error) {
        return getClient().LockFile(path)
}

type client struct { ... }

func (c *client) LockFile(path string) (id string, err error) { ... }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would drastically simplify the calling code for this package (which I think is a 👍 ), but introduces a problem with using the global config.Config instance by default, which is a 👎.

I think a compromise could be changing the getClient function to take a *config.Configuration instance, like so:

var (
        cmu     sync.Mutex
        clients map[*config.Configuration]*client
)

func getClient(cfg *config.Configuration) *client {
        cmu.Lock()
        defer cmu.Unlock()

        if _, ok := clients[cfg]; !ok {
                clients[cfg] = newClient(cfg)
        }

        return clients[cfg]
}

and then changing the functions to accept a config argument first, as such:

package locking

func LockFile(cfg *config.Configuration, path string) (id string, err error) {
        return getClient(cfg).LockFile(path)
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exposing client objects is pretty common in API libraries. I definitely do not want to keep a global map of config objects to internal client objects.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this approach was specifically to address @technoweenie's comments about static variables, which we'd have to go back to to move away from the client object approach & simplify the API.

I still think that if we want to add easy external use of the lfs APIs, an additional facade package would probably go a long way. An interface that's simpler for non-core devs without compromising the internals and is easier to keep stable versus the inner implementations.

apiClient := api.NewClient(api.NewHttpLifecycle(cfg))

lockDir := filepath.Join(config.LocalGitStorageDir, "lfs")
os.MkdirAll(lockDir, 0755)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we should let callers know about this error (even if it's non-fatal)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, and actually it is fatal 👍

// Client is the main interface object for the locking package
type Client struct {
cfg *config.Configuration
apiClient *api.Client
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: let's call this api to avoid stuttering the word "client".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't do that without making it ambiguous with the api package.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. Bummer!

@sinbad sinbad merged commit 9136c8d into locking-master Dec 8, 2016
@sinbad sinbad deleted the locking-wip/config-refactor branch December 8, 2016 14:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants