New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clustering-related refactoring #3802

Merged
merged 85 commits into from Nov 9, 2017

Conversation

2 participants
@freeekanayaka
Member

freeekanayaka commented Sep 15, 2017

This branch contains refactoring work in preparation of the LXD clustering feature.

High-level goals are:

  • Remove the use of global variables when they hinder testing. In particular it should be possible to create and start more than one Daemon object in-process, to allow Go-level testing of clustering-related behavior (this is overall easier than using bash, and makes it possible to cover subtle scenarios that would be otherwise hard to setup).

  • Abstract way higher-level code from low-level database logic, to make it possible to more easily transition to a dqlite-based database. Try to embrace a more traditional one-transaction-per-request model wherever possible, since distributed transactions will be more sensitive to failures and could hence leave the database in an inconsistent state (e.g. part of a request's workload was committed and part wasn't).

  • Increase testing coverage of code that will need to be changed to implement clustering. This reduces the risk of regressions of existing logic and provides faster feedback than bash integration tests when iterating on specific code areas.

  • Improve code organization by making use of sub-packages. These create strong boundaries that make dependencies more clear, contribute in reducing coupling between components and layers, help in reasoning about code, and provide faster testing feedback since only the code area that has been changed will need to be re-compiled (as opposed to the whole code base, as it mostly happens right now in master).

@freeekanayaka

This comment has been minimized.

Show comment
Hide comment
@freeekanayaka

freeekanayaka Sep 15, 2017

Member

jenkins: test this please

Member

freeekanayaka commented Sep 15, 2017

jenkins: test this please

@freeekanayaka freeekanayaka changed the title from WIP: clustering-related refactoring to clustering-related refactoring Oct 23, 2017

@freeekanayaka

This comment has been minimized.

Show comment
Hide comment
@freeekanayaka

freeekanayaka Oct 23, 2017

Member

@stgraber as discussed during our 1-on-1, I gave the final touches to this branch, which I think is now ready. There might be some minor tweaks to do down the road, but that can be done as follow-up.

Member

freeekanayaka commented Oct 23, 2017

@stgraber as discussed during our 1-on-1, I gave the final touches to this branch, which I think is now ready. There might be some minor tweaks to do down the road, but that can be done as follow-up.

@stgraber

This comment has been minimized.

Show comment
Hide comment
@stgraber

stgraber Oct 23, 2017

Member

got a conflict to fix :)

Member

stgraber commented Oct 23, 2017

got a conflict to fix :)

@freeekanayaka

This comment has been minimized.

Show comment
Hide comment
@freeekanayaka

freeekanayaka Oct 24, 2017

Member

Conflict solved, but tests seem hung.

Member

freeekanayaka commented Oct 24, 2017

Conflict solved, but tests seem hung.

@freeekanayaka

This comment has been minimized.

Show comment
Hide comment
@freeekanayaka

freeekanayaka Oct 25, 2017

Member

jenkins: test this please

Member

freeekanayaka commented Oct 25, 2017

jenkins: test this please

@freeekanayaka

This comment has been minimized.

Show comment
Hide comment
@freeekanayaka

freeekanayaka Oct 25, 2017

Member

jenkins: test this please

Member

freeekanayaka commented Oct 25, 2017

jenkins: test this please

@stgraber

This comment has been minimized.

Show comment
Hide comment
@stgraber

stgraber Oct 25, 2017

Member

Note that @brauner and I aren't ignoring this, we do intend to review it but will need some nice quiet time to do so, with the two of us being crazy busy at a conference, this has been difficult :)

Member

stgraber commented Oct 25, 2017

Note that @brauner and I aren't ignoring this, we do intend to review it but will need some nice quiet time to do so, with the two of us being crazy busy at a conference, this has been difficult :)

@freeekanayaka

This comment has been minimized.

Show comment
Hide comment
@freeekanayaka

freeekanayaka Oct 25, 2017

Member

@stgraber hehe, sure I figured that, thanks. No hurry, it's a huge diff.

Member

freeekanayaka commented Oct 25, 2017

@stgraber hehe, sure I figured that, thanks. No hurry, it's a huge diff.

Show outdated Hide outdated lxd/apparmor.go
Show outdated Hide outdated lxd/sys/cgroup.go
Show outdated Hide outdated lxd/daemon.go
Show outdated Hide outdated lxd/daemon.go
// FIXME: We should probably rather just try a regular unix socket
// connection without using the client. However this is the way
// this logic has historically behaved, so let's keep it like it
// was.

This comment has been minimized.

@stgraber

stgraber Nov 8, 2017

Member

Yeah, I wouldn't mind changing that, this should make things slightly faster too.

@stgraber

stgraber Nov 8, 2017

Member

Yeah, I wouldn't mind changing that, this should make things slightly faster too.

This comment has been minimized.

@freeekanayaka

freeekanayaka Nov 8, 2017

Member

Considering that a) the common case is that the socket file is not there at all and b) if it's there it should be stale and not associated with a process, there is no meaningful gain in performance to be made here. I'd keep it like this for now just to reduce risk, we could clean it up in a second round.

@freeekanayaka

freeekanayaka Nov 8, 2017

Member

Considering that a) the common case is that the socket file is not there at all and b) if it's there it should be stale and not associated with a process, there is no meaningful gain in performance to be made here. I'd keep it like this for now just to reduce risk, we could clean it up in a second round.

Show outdated Hide outdated lxd/task/group.go
Show outdated Hide outdated lxd/task/task.go
Show outdated Hide outdated lxd/daemon.go
schedule := func() (time.Duration, error) {
interval := daemonConfig["images.auto_update_interval"].GetInt64()
return time.Duration(interval) * time.Hour, nil
}

This comment has been minimized.

@stgraber

stgraber Nov 8, 2017

Member

How does that deal with the case where images.auto_update_interval can be set to 0, meaning not to do any updating?

@stgraber

stgraber Nov 8, 2017

Member

How does that deal with the case where images.auto_update_interval can be set to 0, meaning not to do any updating?

This comment has been minimized.

@freeekanayaka

freeekanayaka Nov 8, 2017

Member

In that case this schedule function would return 0 as interval and the task function would not be run. See also the docstring of task.Schedule:

// Schedule captures the signature of a schedule function.
//
// It should return the amount of time to wait before triggering the next
// execution of a task function.
//
// If it returns zero, the function does not get run at all.

See also the TestTask_ZeroInterval test function for see the behavior in action.

@freeekanayaka

freeekanayaka Nov 8, 2017

Member

In that case this schedule function would return 0 as interval and the task function would not be run. See also the docstring of task.Schedule:

// Schedule captures the signature of a schedule function.
//
// It should return the amount of time to wait before triggering the next
// execution of a task function.
//
// If it returns zero, the function does not get run at all.

See also the TestTask_ZeroInterval test function for see the behavior in action.

Show outdated Hide outdated lxd/images.go
Show outdated Hide outdated shared/cert.go

freeekanayaka added some commits Oct 3, 2017

Move devices db APIs to the db.Node facade
Mechanical move of devices-related APIs from accepting a low-level
sql.DB handle to being methods of the high-level Node struct.

Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Add a shared.KeyPairAndCA function to get coventionally named certs
The new KeyPairAndCA function plays well with the certificate naming
conventions and PKI support used in LXD and can be used as a
convenience to load/generate client or server certificates.

For easier testing of server code needing a server cert, two new
TestingKeyPair and TestingAltKeyPair functions, each one returning a
pre-canned CertInfo, have been added.

The initTLSConfig private function was exported, so we can easily
re-use our default TLS parameters wherever needed (basically once in
the client and once in the server, which at the moment duplicate those
settings).

Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Extract initialization of the REST and /dev/lxd http Server
This is a mechanical move of the REST-API-related setup code in
daemon.go to a new api.go file. It makes the code in the Daemon a bit
more terse and allows for easy re-use of the REST http.Server in tests.

Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Add support for gracefully aborting schema.Ensure
This commits adds a new Check() method to the public API of
schema.Schema.

It can be used to perform some checks before actually triggering the
logic that ensures the schema is at the expected version. If the Check
function returns an error, then Ensure() will abort early. If the
error is the special value ErrGracefulAbort, the abort will be
graceful and any change performed by the Check function will be
committed.

The Check() method will be used for the cluster database, to check
that all nodes are at the same version, and possibly to update the
information of a specific node and say "I'm at this version, but I'll
won't update the schema since I see other nodes are behind".

Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Add query.Count utility
This is a convenience for getting the rows count of a certain table,
possibly with WHERE filters applied.

Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Add error messages to lxdTestSuite setup and teardown
When the suite fails it now prints more information about what went
wrong exactly.

Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Extract the APIExtensions list from api10Get
The list of API extensions can now be accessed from other call
sites. This will be needed when figuring the "version" of an LXD
cluster node, which will be a combination of schema version and number
of supported API extensions (both of which increase monotonically).

The declaration has been put into the shared/version package for
consistency with the APIVersion variable already present.

Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Fix spurious tx.Exec argument in lxd/db/schema/query.go
This commit fixes the createSchemaTable which was passing an
unnecessary query parameter to tx.Exec.

Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Gracefully cancel tasks on daemon shutdown
The various periodic tasks that the Daemon spawns have been changed so
they do their best to gracefully terminate as soon as possible when
the given context gets cancelled (which happens upn daemon
shutdown). Due to some of underlying APIs not supporting cancellation
yet, this should be considered just a starting point.

Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Drop the containerLXC.OS() convenience
It wasn't gaining much and increased indirection.

See #3802 (comment)

Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Rename container.StateObject() to container.DaemonObject()
Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Drop the storageShared.OS() convenience
Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Move util.AppArmorCanStack to a private appArmorCanStack in lxd/sys
Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Drop trailing slash from cgroup paths definitions
Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Drop pointless comments about function calls being no-op
Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Rename variable "code" to "kind" for consistency
Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Drop logging message when retrying to listen to a network port
Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Fixed typos in the task sub-package
Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Fix docstring in shared.KeyPairAndCA
Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Drop unused import in lxd/db/certificates.go
Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Rename Cert to Certificate in API names of lxd/db/certificates.go
Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Fix import formatting in lxd/db/patches.go
Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Split version declarations in shared/version into several files
Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Fixed wording in comment
Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Tweak schedule function for pruning images
Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Expose task.Task instead of returning an integer handle
Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>

@stgraber stgraber merged commit fbbaa76 into lxc:master Nov 9, 2017

4 of 5 checks passed

continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
Branch target Branch target is correct
Details
Signed-off-by All commits signed-off
Details
Testsuite Testsuite passed
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details

@freeekanayaka freeekanayaka deleted the freeekanayaka:clustering-refactor branch Nov 10, 2017

stgraber added a commit that referenced this pull request Nov 24, 2017

Drop the containerLXC.OS() convenience
It wasn't gaining much and increased indirection.

See #3802 (comment)

Signed-off-by: Free Ekanayaka <free.ekanayaka@canonical.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment