Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modularize Managers #57

Closed
wants to merge 15 commits into from
Closed

Modularize Managers #57

wants to merge 15 commits into from

Conversation

blitzrk
Copy link

@blitzrk blitzrk commented Mar 21, 2017

This pull request was originally meant to just modularize the manager system so that you could import ladon without also having to depend on rethink/redis/sqlx drivers if you weren't using them. (My company is considering building an internal system based on Ladon and we would like this feature.) However, it got somewhat more complex. It's certainly possible that this giant PR can be simplified into a few smaller ones, so consider it a work in progress and let me know which ideas/changes you like.

Big things this PR does:

  • Moves the manager tests/interface into a manager subpackage
  • Moves each manager implementation into a subpackage of that
  • Registers each manager "constructor" in manager.DefaultManagers so you can invoke them as if they were drivers (after underscore-importing them).
    • This is probably premature optimization, but it means that a CLI could easily switch managers using a flag.
  • Moves the warden/conditions into an access subpackage
  • Moves the policies into a policy subpackage (this could be moved into access)
  • Creates a new log subpackage to make logging pluggable
    • I get that this will probably be unpopular, but for my use case logrus is a bad fit. I figure that I'll probably end up maintaining my own fork with this change.
  • It currently breaks pre-Go 1.8 compatibility due to using sqlx.ConnectContext, but this can be fixed. FIXED
  • It generalizes options for connecting to a backend for a manager.
    • There is plenty more work to do to make sure that each option is actually respected.
  • It currently uses a local version of ory/common/integration to get around some issues.

@coveralls
Copy link

Coverage Status

Coverage decreased (-9.05%) to 66.791% when pulling 0302524 on blitzrk:manager-plugin into 0f3e57d on ory:master.

@blitzrk
Copy link
Author

blitzrk commented Mar 21, 2017

I think that the coverage decrease isn't a true decrease, but an artifact of splitting up the code. However, I'll work on that. There are definitely some important cases not being covered (now or before).

@coveralls
Copy link

Coverage Status

Coverage decreased (-9.05%) to 66.791% when pulling b114660 on blitzrk:manager-plugin into 0f3e57d on ory:master.

@aeneasr
Copy link
Member

aeneasr commented Mar 21, 2017

Thank you for your work on this. As you already noticed, this is a quite complex PR with a lot of code changes that will introduce some major breaks in the API. There are some really good ideas, I need some time to review this and decide what to do next.

@aeneasr
Copy link
Member

aeneasr commented Mar 21, 2017

What's the reasoning of introducing packages like policies and access? It is best practice to have as few packages as possible in Go so you need less import statements and fewer exports.

@aeneasr
Copy link
Member

aeneasr commented Mar 21, 2017

What is the purpose of moving the different managers to subpackages? go get is still going to fetch all of the dependencies, right? The only way to avoid this is to move them to a completely different package

@blitzrk
Copy link
Author

blitzrk commented Mar 21, 2017

@arekkas Thanks for any time you have to take a look at what I've done. (Certainly anything that gets merged upstream make it easier to maintain my fork.)

I started off with the goal of minimizing dependencies. Yes, go get will still grab them all, but if manager implementations (RethinkDB, redis, SQL, and any future ones) are separated into subpackages, then using a dependency management tool like Glide or the upcoming official dep tool, you can prune unused subpackages and therefore some potentially large dependency trees.

For example, if you were to just use Redis, you could avoid having to require sql-migrate, gorp, goose, gorethink, and many more. As the number of manager implementations grows, the savings grow. And when running tests in docker, this can mean saving a lot of time on a glide install step.

As far as separating out policy and access, well, those two could be combined into just access, honestly. But why access? Because of circular dependencies.

ladon.Ladon contains a Manager and Managers are implemented using Policies, Conditions, etc. So at a bare minimum you need two subpackages. I agree that the number of subpackages generally should be minimized in idiomatic Go. While I got some of my inspiration from https://github.com/micro/go-micro, I didn't take the subpackaging nearly so far. And as the core Go team recommends, I tried to make the subpackage names useful and non-reptitive wherever possible.

P.S. I'm currently writing tests for MySQL that use the manager.Options.

@blitzrk
Copy link
Author

blitzrk commented Mar 21, 2017

You can see in the glide manifest file documentation (http://glide.readthedocs.io/en/latest/glide.yaml/) the description of subpackages: A record of packages being used within a repository. This does not include all packages within a repository but rather those being used.

As implied, unused subpackages (and therefore their dependencies) won't get downloaded on glide install.

@aeneasr
Copy link
Member

aeneasr commented Mar 22, 2017

I see, I didn't know that the core team is working on something that mimics the behaviour of glide. In that case, moving the managers to their own packages makes more sense. In general, I don't have a problem with fetching a few extra dependencies in my code. go get is fast, and even with large projects such as hydra the build binaries are ridicolously small. Go isn't what java, ruby or nodejs are doing to you. Striping away those dependecies might save you a few kb in the end, but nothing more.

From quickly looking over your branch, here are my take aways:

  • I think moving the managers into their own packages or even into their own repositories could make sense. Of course, this will break BC but since we aren't on stable yet this is a change I feel comfortable(ish) with, as updating the code for that change isn't hard. Because (as far as I remember) the main package has no direct dependencies to the specific manager implementations, this shouldn't be a problem.
  • I like the idea of either stripping away logrus, as it is only used here, and either replace it with the logger interface, or simply return good error messages and don't do any logging in the library.
  • I really don't like moving stuff to policy or access subpackages, it bloats your import header and makes the library harder to use because you need to know all the subpackages. Additionally, I can't see any gain as those packages will usually be used by most code that uses ladon, and they don't have any big dependencies themselves. Additionally, this will break more code and will require substantial changes to existing code bases.
  • I don't like handling connections in this library, primarily because we shouldn't be opening multiple pools per codebase to e.g. sql, but rather use one pool of connections for all. While the attempt to unify this is nice, I think this should be done in a different library instead.

In general, I want to support SQL/memory officially and all other adapters like rethink or redis as a community effort. One way to go along with fewer breaks could be to replace sqlx with the standard database/sql and keep the SQL managers in the main package as they won't depend on any additional dependencies, and only move redis/rethink to their own package(s) (e.g. ladon/community). The store integration tests could be moved to a package like ladon/integration.

One more thing as a takeaway, because ladon used to have a similar structure you are suggesting, but was unified later on to be more usable:

  1. I'm pretty sure you can move the managers to their own packages without refactoring the whole code base
  2. Importing ladon/sql and database/sql sucks, suddenly you have to name the imports which also tricks stuff like gomock or similar. ladon/sql isn't really doing SQL, it's a manager that understands SQL - that's a difference. I learned this the hard way because I used to use this pattern quite a lot and it was very annoying to use.

@blitzrk
Copy link
Author

blitzrk commented Mar 22, 2017

Thanks for all the feedback!

Okay, so first big thing: Yeah, I think I can pretty easily put access and policy back into the ladon package. Looking back at it now, I'm pretty sure the import loop problem really only stemmed from ladon_test.go using package ladon instead of package ladon_test. I personally don't mind separating out into a few subpackages, but that's just my own preference. I'll put this back after finishing up some more integration tests.

Speaking of integration tests, I can start changing out sqlx for plain sql. Does this mean you'd also like to not depend on sql-migrate?

Agreed on logging. Returning errors is always the best decision for a library. I haven't thought about this much yet, but what if we were to add one more method to the Manager interface for some sort of HealthCheck? Obviously messing with interfaces makes for very annoying breaking changes, but other than that, the options are to log or drop errors when database connections are lost or reconnected. This is valuable debugging info. I'll see if integration testing brings up any other ideas.

So I'm not sure what you're referring to with the connection pools. I haven't introduced any new pools. DefaultManagers was just a map of names to standardized manager constructors, if you meant that. It doesn't cache or pool anything.

Also, going back to the top of your comment, in regards to fetching dependencies, you're absolutely right that grabbing a few extra is relatively fast. But here's something to consider: because packages are hosted in a distributed manner and git history is mutable, just having a lock file (such as glide provides) isn't really enough to guarantee you'll always be able to build. Someone could easily take their project down or screw up a few commits (maybe squashing them). That's why my company and some others I know clone all dependencies and host them privately. Glide even makes it easy to use them:

- package: github.com/BurntSushi/toml
  version: ^0.2.0
  repo: https://my.company.io/ext/github.com-BurntSushi-toml.git

But now you can see how dependencies suddenly become more of a hassle.

So anyway, I'll make these changes (might not get to them until the weekend) and hopefully they'll reduce the complexity of this PR a bunch. Then let me know what you think, and hopefully it's closer to something you'd like to integrate. :) At the very least, it should make it easier to create an ory/ladon-community which you can move the redis and rethink managers to.

@aeneasr
Copy link
Member

aeneasr commented Mar 22, 2017

Okay, so first big thing: Yeah, I think I can pretty easily put access and policy back into the ladon package. Looking back at it now, I'm pretty sure the import loop problem really only stemmed from ladon_test.go using package ladon instead of package ladon_test.

Yes, please use ladon_test if possible - that makes a lot of sense.

Speaking of integration tests, I can start changing out sqlx for plain sql. Does this mean you'd also like to not depend on sql-migrate?

No, sql-migrate is very important. I also want to walk back on the statement. Let's keep sqlx and try moving the sql manager to a different package (not sure about the naming though, I don't like sql as the package name).

Agreed on logging. Returning errors is always the best decision for a library. I haven't thought about this much yet, but what if we were to add one more method to the Manager interface for some sort of HealthCheck?

I don't think that's neccessary - let's keep this simple and remove logrus!

Also, going back to the top of your comment, in regards to fetching dependencies, you're absolutely right that grabbing a few extra is relatively fast. But here's something to consider: because packages are hosted in a distributed manner and git history is mutable, just having a lock file (such as glide provides) isn't really enough to guarantee you'll always be able to build.

Yes, this is true. With Go there's always the trade-off between lock files and commiting deps for guaranteed reproducible builds. It happened to me once that someone deleted a library from github, so the lock file didn't help. In that sense keeping deps to a minimum is good practice. What has to be understood though is that there is a very serious trade-off between moving forward with fewer dependencies and breaking compatibility - which moving the managers to a new location is. Ladon is seeing a very steady increase amount of clones (currently 300 uniques per week) and while not all of those are going to use ladon in a production system, it will break stuff.

So I'm not sure what you're referring to with the connection pools. I haven't introduced any new pools.

Let's consider this (I only briefly checked the code, so it's possible that I overlooked something):

ladon.NewManager(/* options */) // this creates a db connection to mysql

// later in code

db, err := sqlx.Open("...")
db.DoSomething()

We now have two connetion (pools), one encapsulated in NewManager and one that is used by our other code. Instead we should do something like this:

db := dbmanager.NewConnection(/* options */) // abstracts database connection instantiation

ladon.NewSQLManager(db) // doesn't do a lot, basically returns the manager

// later in code
db.DoSomething()

Now we have one connection pool that we need to take care of, like closing it, handling errors, listening on events, ...

Does that make it clearer?

@aeneasr
Copy link
Member

aeneasr commented Mar 22, 2017

Before you start implementing something, please lay out your plan in writing (bullet points are enough) here so I have something that I can think about. Otherwise we'll probably end up with another discussion and a lot of code being done, that maybe needs changes again :)

@blitzrk
Copy link
Author

blitzrk commented Mar 22, 2017

Got it on the sql-migrate. Got it on removing logrus. Got it on the minimizing breaking changes.

I understand now what you mean by connections. You're correct that this does make them harder to manage. I hadn't thought of that. Currently there is no way to tell from the errors a manager can return if that means the backend is down (and whether/how long to wait for it to reconnect). Therefore you're going to need the database connection to ping to diagnose errors.

So looking back at all my changes (and some yet uncommitted), a lot of lessons. Most of those lessons are that my main goal can be achieved with very few, localized breaking changes. This PR should only be about those. Many of the other breaks were caused by trying to generalize manager instantiation. Now, this would be very cool if the goal were to create a ladon as a CLI instead of library (in fact I'm very nearly there in uncommitted code), however that was really accidental, as that's not a goal for either of us.

Here's where I think I'd like to go next on this PR:

  1. Start over from master
  2. Pull in my integration test fixes, temporarily
  3. Make sure all the tests are passing and test only the public interface (aka package ladon_test)
  4. Move managers into subpackages (import loops shouldn't be a problem now) with good names (not sql)
  5. Mock the database in ladon_test files
  6. Copy manager tests from ladon to each corresponding package and put behind an integration build flag (i.e. // +build integration)

Then as a separate PRs:

  • Fix errors in RethinkDB manager. Currently, you could keep Getting stale data indefinitely if all you do is Get and the connection drops. The only visibility into this is the logs. Since you can't handle logs, the errors should instead be stored and returned on the next get unless a successful Get happened more recently.
  • Fix ory/common/integration. I think there's already an issue about this and the Rethink client version. (Man, it'll be nice to move rethink and redis out into a separate repo, won't it?)

And separate issues/discussions:

  • I know you prefer to have simple New funcs for creating managers and then documenting that other funcs like CreateSchema or Watch may need to be run, but I like the reverse. Once there's a place for community Manager implementations, I may create one for SQL that does some extra things, such as running migrations automatically, passing a context in for cancellable calls with sqlx, etc.

@ligustah
Copy link

I'd just like to mention that I would very much appreciate not having to vendor redis/rethink/logrus etc, when I don't really need either of those. I also think that minimizing direct dependencies is a security win as well.

I wouldn't mind breaking changes to get there and could also offer some limited help in testing.
Thanks a lot for the efforts on this! 👍

@aeneasr
Copy link
Member

aeneasr commented Mar 23, 2017

So looking back at all my changes (and some yet uncommitted), a lot of lessons. Most of those lessons are that my main goal can be achieved with very few, localized breaking changes. This PR should only be about those. Many of the other breaks were caused by trying to generalize manager instantiation.

There are not a lot of reflective people on GitHub. Kudos for that :)

Now, this would be very cool if the goal were to create a ladon as a CLI instead of library

I'd love to have a ladon-cli package, or alternative ladon/cmd!

Here's where I think I'd like to go next on this PR:

  1. Start over from master
  2. Pull in my integration test fixes, temporarily
  3. Make sure all the tests are passing and test only the public interface (aka package ladon_test)
  4. Move managers into subpackages (import loops shouldn't be a problem now) with good names (not sql)
  5. Mock the database in ladon_test files

Sounds good

  1. Copy manager tests from ladon to each corresponding package and put behind an integration build flag (i.e. // +build integration)

I like the way this is currently handled. How about having an ladon/integration or ladon/test package instead? This would also reduce code duplication.

  • Fix errors in RethinkDB manager. Currently, you could keep Getting stale data indefinitely if all you do is Get and the connection drops. The only visibility into this is the logs. Since you can't handle logs, the errors should instead be stored and returned on the next get unless a successful Get happened more recently.

Sounds very good!

  • Fix ory/common/integration. I think there's already an issue about this and the Rethink client version. (Man, it'll be nice to move rethink and redis out into a separate repo, won't it?)

Ladon rethink or rethink in integration package? I merged the new rethink repo location in the common/integration package recently.

@aeneasr
Copy link
Member

aeneasr commented Mar 27, 2017

Just checking in (no pressure) if you're still up to this? I'll release a larger feature soon ( #58 ) and thought of including the new manager setup too and just do one breaking release instead of two.

@blitzrk
Copy link
Author

blitzrk commented Mar 27, 2017

Thanks for checking in. Yeah, I caught a cold, which slowed me down a bit. How immediately are you looking to release? This new plan shouldn't take very long to implement. Let me know when you want it this week and I'll do my best to have it ready.

Also, thank you for taking the time to help me contribute. I really appreciate it. I've seen so many different opinions from maintainers on how they like to receive contributions - from PRs only (code shows dedication and issues are too big of a hassle) to issues first (PRs are too aggressive and maintainers feel bad rejecting them if they don't fit the direction they want to take the project). So thanks for rolling with my crazy first draft.

I'll do those 6 steps as laid out, except with integration tests as a subpackage, as quickly as possible. I'll push commits as I go so you can see the progress.

@aeneasr
Copy link
Member

aeneasr commented Mar 27, 2017

Don't worry, I'll ping you 3-4 days prior to release

@aeneasr
Copy link
Member

aeneasr commented Mar 28, 2017

Oh and get well soon! :)

@aeneasr
Copy link
Member

aeneasr commented Apr 3, 2017

I went ahead and implemented this in #58

@aeneasr aeneasr closed this Apr 3, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants