Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support LXD as a backend driver #1533

Merged
merged 47 commits into from
Jun 1, 2020
Merged

Support LXD as a backend driver #1533

merged 47 commits into from
Jun 1, 2020

Conversation

townsend2010
Copy link
Contributor

@townsend2010 townsend2010 commented May 13, 2020

This allows Multipass to communicate with the LXD daemon in order to create and mange instances.


List of TODO items needed for the next iteration:

  • Additional test coverage.
  • See if it's possible to mock NetworkAccessManager and LocalSocketReply instead of using MockLocalSocketServer.
  • Figure out a way to use a single NetworkAccessManager across LXDVirtualMachineFactory and LXDVMImageVault.
  • Overhaul LXDVirtualMachineFactory::hypervisor_health_check() including detecting if LXD socket is not available, etc.
  • Determine how best to handle the cloud-init YAML config. Do we store it somewhere and re-populate upon daemon start so VirtualMachineDesription contains the values? Do we just create it once and let LXDVirtualMachineFactory configure it and then forget about it since LXD has it stored?
  • Re-add back in lxd-request() the ability to wait on operations to complete.
  • Fix LXDVirtualMachine to wait on an instance to be created instead of the clunky while loop w/ sleep().
  • Determine if LXD itself can properly update and prune source images and potentially refactor the VMImageVault's update_images() and prune_expired_images and associated timer.
  • Determine if the source part of the LXD initialization can be modified to not explicitly call out the remote since the image should already be in the local repository.
  • Allow importing images from the "custom" image host and http/file based launches.

@townsend2010 townsend2010 marked this pull request as draft May 13, 2020 20:23
@multipass-ci-bot

This comment has been minimized.

@Saviq
Copy link
Collaborator

Saviq commented May 14, 2020

In file included from /root/parts/multipass/src/src/platform/backends/lxd/lxd_vm_image_vault.cpp:18:
/root/parts/multipass/src/src/platform/backends/lxd/lxd_vm_image_vault.h:51:26: error: private field 'url_downloader' is not used [-Werror,-Wunused-private-field]
    URLDownloader* const url_downloader;
                         ^
1 error generated.

@townsend2010
Copy link
Contributor Author

@Saviq, yep, saw that yesterday and will fix today along with some other things 😀

@townsend2010 townsend2010 force-pushed the add-lxd-driver-support branch 2 times, most recently from 06dd5bb to 8ae5c74 Compare May 14, 2020 13:00
@multipass-ci-bot

This comment has been minimized.

@codecov
Copy link

codecov bot commented May 14, 2020

Codecov Report

Merging #1533 into master will decrease coverage by 0.47%.
The diff coverage is 65.32%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1533      +/-   ##
==========================================
- Coverage   74.47%   73.99%   -0.48%     
==========================================
  Files         215      222       +7     
  Lines        7785     8138     +353     
==========================================
+ Hits         5798     6022     +224     
- Misses       1987     2116     +129     
Impacted Files Coverage Δ
include/multipass/simple_streams_manifest.h 100.00% <ø> (ø)
include/multipass/utils.h 69.23% <ø> (+7.69%) ⬆️
include/multipass/virtual_machine_description.h 100.00% <ø> (ø)
include/multipass/vm_image.h 100.00% <ø> (ø)
include/multipass/vm_image_info.h 100.00% <ø> (ø)
src/daemon/daemon_config.cpp 60.27% <0.00%> (+0.81%) ⬆️
src/platform/backends/lxd/lxd_vm_image_vault.cpp 0.00% <0.00%> (ø)
src/platform/backends/lxd/lxd_vm_image_vault.h 0.00% <0.00%> (ø)
src/platform/platform_linux.cpp 59.67% <40.00%> (-6.99%) ⬇️
src/daemon/daemon.cpp 31.25% <57.14%> (-0.31%) ⬇️
... and 21 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6f7a41e...686d2cd. Read the comment docs.

@multipass-ci-bot

This comment has been minimized.

@multipass-ci-bot

This comment has been minimized.

@multipass-ci-bot

This comment has been minimized.

@multipass-ci-bot

This comment has been minimized.

@multipass-ci-bot

This comment has been minimized.

@multipass-ci-bot

This comment has been minimized.

@multipass-ci-bot

This comment has been minimized.

Copy link
Collaborator

@Saviq Saviq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a lxd plug, don't we?

include/multipass/fetch_type.h Outdated Show resolved Hide resolved
Comment on lines +645 to +646
{},
{},
{}};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should store and populate this…

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, but as you just mentioned, perhaps the image vault should configure it? I'll need to think about it some.

src/daemon/daemon.cpp Outdated Show resolved Hide resolved
src/daemon/default_vm_image_vault.cpp Outdated Show resolved Hide resolved
src/daemon/default_vm_image_vault.cpp Outdated Show resolved Hide resolved
src/platform/backends/lxd/lxd_request.cpp Outdated Show resolved Hide resolved
src/platform/backends/lxd/lxd_virtual_machine.cpp Outdated Show resolved Hide resolved
Comment on lines 164 to 199
if (json_reply["metadata"].toObject()["class"] == QStringLiteral("task") &&
json_reply["status_code"].toInt(-1) == 100)
{
QUrl task_url(QString("%1/operations/%2")
.arg(base_url.toString())
.arg(json_reply["metadata"].toObject()["id"].toString()));

// Instead of polling, need to use websockets to get events
while (true)
{
try
{
auto task_reply = lxd_request(manager, "GET", task_url);

if (task_reply["error_code"].toInt(-1) != 0)
{
break;
}

auto status_code = task_reply["metadata"].toObject()["status_code"].toInt(-1);
if (status_code == 200)
{
break;
}
else
{
std::this_thread::sleep_for(5s);
}
}
// Implies the task is finished
catch (const LXDNotFoundException& e)
{
break;
}
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this go into lxd_request?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of think the caller of lxd_request() should determine how it waits on an operation to finish. Some callers may use the operations/blah/wait method while others may use a polling (and later websockets) method.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, it would have to be the caller's decision whether to let lxd_request() handle the async operation or not. I'm kinda thinking ::waitForFinished() style. Except it's not an object, I know ;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, still, the caller should determine how it waits on an operation to complete. I suppose we could have a default wait logic in lxd_request and if the caller doesn't want or need to use that, then it can pass in 0 for the timeout and lxd_request will just return the json response from LXD. WDYT?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's roughly what I had in mind. Maybe not inferred from timeout == 0 but rather something explicit like bool wait. Ideally something that resembles something already existing :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, ok, I'll basically put it back to how you originally had it 😁

src/platform/backends/lxd/lxd_virtual_machine.cpp Outdated Show resolved Hide resolved
QUrl task_url(QString("%1/operations/%2/wait")
.arg(base_url.toString())
.arg(state_task["metadata"].toObject()["id"].toString()));
lxd_request(manager, "GET", task_url);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't use that instead of the above poll? Or will it time out?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if you're referring to using wait here or the polling method.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, using wait, why not use it above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, right, I think using a wait with no or a long-ish timeout above would be just as effective. I observed that LXD can take quite a long time uncompressing an image for the first time.

Copy link
Collaborator

@Saviq Saviq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to prune multipass find.

src/platform/backends/lxd/lxd_virtual_machine.cpp Outdated Show resolved Hide resolved
src/platform/backends/lxd/lxd_virtual_machine_factory.cpp Outdated Show resolved Hide resolved
src/platform/backends/lxd/lxd_virtual_machine_factory.cpp Outdated Show resolved Hide resolved
Comment on lines 68 to 108
void mp::LXDVirtualMachineFactory::configure(const std::string& /* name */, YAML::Node& /* meta_config */,
YAML::Node& /* user_config */)
{
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm should we actually configure cloud init through here rather than shipping it with the vm description?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'll look into that.

src/platform/backends/lxd/lxd_vm_image_vault.cpp Outdated Show resolved Hide resolved
} // namespace

mp::LXDVMImageVault::LXDVMImageVault(std::vector<VMImageHost*> image_hosts, const QUrl& base_url)
: image_hosts{image_hosts}, base_url{base_url}, manager{std::make_unique<mp::NetworkAccessManager>()}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{C,Sh}ould the manager not be shared?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only way to make that work and still be generic enough for all image vault & backend driver cases would be to refactor/expand URLDownloader to use the new NetworkManager class.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just mean that, ideally, we'd have a shared manager for everything LXD… Not saying that URLDownloader should, too.

My rationale being that we could then set it up once (especially with SSL and authentication) and share it through all LXD components. Also saving memories.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem I'm seeing here is that DaemonConfig is responsible for setting up the image vault and setting up the backend driver and does that as two separate steps that are only coupled by URLDownloader. In order for LXDVMImageVault and LXDVirtualMachineFactory to share a NetworkAccessManager object would be to somehow relate those two, but do it in a way that DaemonConfig does not have to account for backend and image host differences. I think we want DaemonConfig to be a generic as possible and we achieve that via the platform calls, but still, those platform calls do not couple the image vault and the backed driver.

If we don't go the URLDownloader way, then we'd need to refactor the VirtualMachineFactory to accept the VMImageVault object (at least a pointer to it) and then each factory and image vault can determine their relationship to one another. Some would just ignore each other like now and we'd basically be forcing it hold at least a pointer to the vault and others like LXD would have a fairly coupled relationship like sharing the NetworkAcessManager, etc.

I'll think about it some more, but I don't think it will be trivial.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bear in mind none of the comments I made here are my requirement, or stuff to be fixed in this PR. It's just driving my thinking about the architecture, since this is pushing what we had in place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, and I totally agree with the idea of being able to couple the image vault and vm factory when needed.

@townsend2010
Copy link
Contributor Author

@Saviq,

Thanks for the first review.

Need a lxd plug, don't we?

Yep

Need to prune multipass find.

I'm pretty sure I already did that for cloud images. I know I have not accounted for our custom repository yet. Is this what you are seeing?

Copy link
Collaborator

@Saviq Saviq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's it for now, it does work fine, though :)

src/platform/backends/lxd/lxd_vm_image_vault.cpp Outdated Show resolved Hide resolved
src/platform/backends/lxd/lxd_vm_image_vault.cpp Outdated Show resolved Hide resolved
Comment on lines +180 to +233
void mp::LXDVMImageVault::prune_expired_images()
{
}

void mp::LXDVMImageVault::update_images(const FetchType& fetch_type, const PrepareAction& prepare,
const ProgressMonitor& monitor)
{
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are TODO, right? We should definitely prune old images, and {c,sh}ould we drive the update some way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They might be TODO. LXD can also handle this in and of itself, so should we config LXD to do this for us or we drive it?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can make it DTRT without our involvement, probably best.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, then I will look into that and failing being able to have LXD control all of that, I'll implement it here.

src/platform/platform_linux.cpp Outdated Show resolved Hide resolved
src/simplestreams/simple_streams_manifest.cpp Show resolved Hide resolved
src/simplestreams/simple_streams_manifest.cpp Show resolved Hide resolved
{
LXDBackend()
: socket_path{QString("%1/test_socket").arg(data_dir.path())},
test_server{mpt::MockLocalSocketServer(socket_path)},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we not test this on the access manager level?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already unit test the access manager itself in other tests, so I think mocking it as high up as possible is the best way to go for testing the backend. I sense you disagree with this approach, so am I missing some benefits if we use the access manger as well?

Copy link
Collaborator

@Saviq Saviq May 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean to mock the access manager, rather than talking through a test socket (which was the correct thing to do for testing the manager itself).

Unless I'm misunderstanding what MockLocalSocketServer does?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duh on my end. I will do that to simplify this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this a bit more, what will need to happen is to make LocalAccessManager a non-final class and then create a mock LocalAccessManager class and have it respond how we want. Then we'll need to modify LXDVirtualMachine and LXDVMImageVault to accept a LocalAccessManager object (ie, the mocked object) and not create one in the ctor.

Then we can get rid of the MockLocalSocketServer and just put that logic back into the test_local_network_access_manager.cpp file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even some more diving into this and really, doing it the current way where we talk over a local socket to a "mocked" server is the most straightforward way to go. If we went the route of mocking NetworkAccessManager itself, we'll basically need to reimplement that along with LocalSocketReply and that doesn't really make much sense. I think leaving it the way it is is the right thing to do in order not to introduce much more complexity in testing for the sake of not using a temporary local socket.

@multipass-ci-bot

This comment has been minimized.

Chris Townsend and others added 11 commits May 29, 2020 12:19
- Don't return const variable.
- Use current_state() for getting and saving state upon initialization.
Co-authored-by: Michał Sawicz <michal.sawicz@canonical.com>
- Add `lxd` snap plug.
- Remove "None" fetch type since there is now the LXDVMImageVault implementation.
- Make some logging messages trace level.
- This will create a "multipass" project and define a default profile that uses a
  Multipass specific network bridge.
@multipass-ci-bot

This comment has been minimized.

@multipass-ci-bot

This comment has been minimized.

@Saviq
Copy link
Collaborator

Saviq commented Jun 1, 2020

We'll need to handle this on startup:

cze 01 12:48:46 michal-laptop multipassd[767819]: Requesting LXD: GET unix:///var/snap/lxd/common/lxd/unix.socket@1.0/projects/multipass
cze 01 12:48:46 michal-laptop multipassd[767819]: Caught an unhandled exception: Cannot connect to /var/snap/lxd/common/lxd/unix.socket: 0
cze 01 12:48:46 michal-laptop systemd[1]: snap.multipass.multipassd.service: Main process exited, code=exited, status=1/FAILURE

@townsend2010
Copy link
Contributor Author

@Saviq,

We'll need to handle this on startup:

Yes, the health check needs a complete overhaul. Is that something that needs to be in a alpha/preview/experimental release of this?

@Saviq
Copy link
Collaborator

Saviq commented Jun 1, 2020

Is that something that needs to be in a alpha/preview/experimental release of this?

It's probably ok for now.

@townsend2010
Copy link
Contributor Author

@Saviq,

Ack, then I'll put this on my list of TODO's for the next release of this 😁

@multipass-ci-bot
Copy link
Collaborator

multipass-ci-bot commented Jun 1, 2020

macOS build available: multipass-1.3.0-dev.269.pr1533+g056f1874.mac-Darwin.pkg
Snap build available: snap refresh multipass --channel edge/pr1533

@Saviq
Copy link
Collaborator

Saviq commented Jun 1, 2020

I say:

bors merge

@bors
Copy link
Contributor

bors bot commented Jun 1, 2020

Build failed:

@Saviq Saviq merged commit 3c66cfd into master Jun 1, 2020
@bors bors bot deleted the add-lxd-driver-support branch June 1, 2020 19:05
Saviq added a commit that referenced this pull request Jun 2, 2020
1533: Support LXD as a backend driver r=Saviq a=townsend2010

This allows Multipass to communicate with the LXD daemon in order to create and mange instances.



Co-authored-by: Michał Sawicz <michal@sawicz.net>
Co-authored-by: Chris Townsend <christopher.townsend@canonical.com>
@Saviq Saviq added this to the v1.3.0 milestone Jun 5, 2020
@Saviq Saviq mentioned this pull request Jun 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants