Storage Subsystem Design R4 #1842
Comments
kalkin changed the title from Storage Subsystem Design R4 to WIP: Storage Subsystem Design R4 — Mar 16, 2016
kalkin changed the title from WIP: Storage Subsystem Design R4 to Storage Subsystem Design R4 — Mar 17, 2016
andrewdavidwong added the enhancement label — Apr 6, 2016
andrewdavidwong added this to the Release 4.0 milestone — Apr 6, 2016
kalkin referenced this issue in woju/qubes-core-admin — Apr 9, 2016: Save pool in config xml instead of storage.conf #2 (closed)
marmarek (Member) commented Apr 10, 2016
Could you remind me why `Storage.get_pool(volume)`, and not simply `Volume.pool` (an object reference, not its name)?

What does the volume life cycle look like? For example `qvm-create` + `qvm-start`:

- During `qvm-create` - first instantiation of the QubesVM object - what state is in `vm.storage`? And what is in `vm.volume_config` (is there such a thing?)
- After `vm.create_on_disk` - the same question.
- After `app.save()` - how will storage-related information look in `qubes.xml`?
- After loading of `qubes.xml` - how does each of those objects get loaded? (I assume the final state should be exactly the same as in step 2.)
- During domain startup - what functions get called and by whom?
kalkin (Member) commented Apr 10, 2016
> Could you remind me why `Storage.get_pool(volume)`, and not simply `Volume.pool` (an object reference, not its name)?

Yes, an object reference is better. I should switch to that.
> What does the volume life cycle look like?

- `AppVM` and `TemplateVM` have a default `volume_config` set.
- `QubesVM.__init__(self, app, xml, volume_config={}, **kwargs)` updates the volume config:
  - If XML is passed, parse XPath `domain/volume-config/volume` and update `self.volume_config`.
  - If a `volume_config` parameter was passed, update `self.volume_config`.
- On `QubesVM.on_domain_init_loaded()` the storage is initialized. `Storage.__init__(vm)` initializes the volumes from the `volume_config`:

```python
if hasattr(vm, 'volume_config'):
    for name, conf in self.vm.volume_config.items():
        pool = self.get_pool(conf['pool'])
        vm.volumes[name] = pool.init_volume(conf)
```
For example, `qvm-create`:

`QubesVM.create_on_disk(source_template)` calls `Storage.create_on_disk(source_template)` (see design above). At this point image files are written to disk.

Example `qvm-start`:

`QubesVM.start()` calls `Storage.start()`, which iterates through all volumes of the VM and calls `*Pool.start()`:

```python
for volume in self.vm.volumes.values():
    self.get_pool(volume).start(volume)
```
> 1. During `qvm-create` - first instantiation of the QubesVM object - what state is in `vm.storage`?

No state at all. It is just nicer not to have all this logic in `QubesVM`. As I already mentioned, maybe `Storage` should be merged with `QubesVM`.

> And what is in `vm.volume_config` (is there such a thing?)

Yes, there is such a thing. See the default values for `AppVM` and for `TemplateVM`. These are updated as I explained above.
> 2. After `vm.create_on_disk` - the same question.

Nothing changes.
> 3. After `app.save()` - how will storage-related information look in `qubes.xml`?

```xml
<domain>
  ...
  <volume-config>
    <volume name="root" pool="default" volume_type="snapshot"/>
    <volume name="volatile" pool="default" volume_type="volatile"/>
    <volume name="private" pool="mypool" volume_type="read-write"/>
    <volume name="kernel" pool="linux-kernel" volume_type="read-only"/>
  </volume-config>
  ...
</domain>
```

> 4. After loading of `qubes.xml` - how does each of those objects get loaded? (I assume the final state should be exactly the same as in step 2.)
`QubesVM.__init__` parses the `volume_config` from the XML and updates the default `volume_config`. This is also how an AppVM could have additional volumes assigned to it, e.g. via `qvm-block`. `QubesVM.on_domain_init_loaded()` initializes the storage and the pools (only currently, because of the need to pass a VM), and initializes the volumes.
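As a minimal sketch of that parsing step (assuming lxml and the `volume-config` layout from the example XML above; `parse_volume_config` is a hypothetical helper, not the actual method name):

```python
# Hypothetical helper illustrating the merge of <volume-config> entries
# into the per-class defaults; not the actual core-admin code.
import lxml.etree

def parse_volume_config(domain_xml, defaults):
    # domain_xml is assumed to be the lxml element for the <domain> node.
    # Start from a copy of the class defaults (AppVM/TemplateVM).
    volume_config = {name: dict(conf) for name, conf in defaults.items()}
    for node in domain_xml.xpath('volume-config/volume'):
        name = node.get('name')
        # Every XML attribute (pool, volume_type, ...) becomes a config key.
        volume_config.setdefault(name, {}).update(node.attrib)
    return volume_config
```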
> 5. During domain startup - what functions get called and by whom?

`QubesVM.start()` calls `Storage.start()`, which calls `*Pool.start(volume)` for each volume. At this point the volatile volume is reset, the LVM pool would do a snapshot from the origin for the root of an AppVM, or any other pool-implementation-specific thing happens. The `QubesVM.block_devices()` property is called by the jinja2 template. It returns `[v.block_device() for v in self.volumes.values()]`. `BlockDevice` contains all the data that needs to be written to the libvirt XML.
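For illustration, a sketch of what `BlockDevice` and that property could look like (the exact field names are assumptions, chosen to match what a libvirt `<disk>` element needs):

```python
import collections

# Assumed fields; the real attribute set may differ.
BlockDevice = collections.namedtuple(
    'BlockDevice',
    ['path',      # dom0 path (or pool-specific identifier) of the device
     'name',      # volume name, e.g. 'root', used as the target name
     'rw',        # True if the device is writable
     'devtype'])  # 'disk' or 'cdrom'

class QubesVM:
    @property
    def block_devices(self):
        # Rendered by the jinja2 libvirt template, one <disk> per entry.
        return [v.block_device() for v in self.volumes.values()]
```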
marmarek (Member) commented Apr 10, 2016
OK, now it is much clearer to me. A few more questions:

1. How does `Volume.vid` get initialized and then saved? Is it simply missing in the above example XML? Or maybe it is calculated dynamically by `vm.volumes[name] = pool.init_volume(conf)`?
2. How does the root volume of an AppVM know which template's root image should be used? Is it done by `pool.init_volume(conf)` too (how?), or maybe by some other method? The same question for the kernel volume (how does it get `vm.kernel`)?
3. Will `vm1.volumes['root']` be different from `vm2.volumes['root']` (having the same template)? Will it be different from `template.volumes['root']` (I guess so)? How about its `vid`? Will it be the same, but differ in the `volume_type` attribute?
4. I assume that for a template switch, `vm.volumes['root']` will be replaced by an appropriate event handler (`property-set:template` event).
kalkin (Member) commented Apr 10, 2016
> How does `Volume.vid` get initialized and then saved? Is it simply missing in the above example XML?

This depends on the pool implementation. `XenPool` does not need it: it only needs the VM type, the VM name, and the volume name to locate the volume. Maybe in the future, but currently we don't need it.

> Or maybe it is calculated dynamically by `vm.volumes[name] = pool.init_volume(conf)`?

Yes, it is. In the Xen pool implementation `Volume.vid` is set to `Volume.path`, but it is not saved, because it is not needed.

> How does the root volume of an AppVM know which template's root image should be used?

This is implementation specific. Here is how it looks for `XenPool`: it instantiates a `XenPool` for the TemplateVM of the AppVM to get the directory where the original image is. Now that I have explained it, this sounds really hacky.

> The same question for the kernel volume (how does it get `vm.kernel`)?

Currently, because we always pass the VM object to the pool, it just picks it up from `self.vm.kernel`.

> Will `vm1.volumes['root']` be different from `vm2.volumes['root']` (having the same template)?

Will it be the same `Volume` instance? - No.
Will it be the same device/file? - That is implementation specific.

> Will it be different from `template.volumes['root']` (I guess so)? How about its `vid`? Will it be the same, but differ in the `volume_type` attribute?

Yes: an AppVM would have a volume named root of volume_type `snapshot`, while the template would have a volume named root of volume_type `origin`.

Generally a `vid` would be something unique for the pool. As I explained above, you rarely need the `vid` if you get a VM object passed.

> I assume that for a template switch, `vm.volumes['root']` will be replaced by an appropriate event handler (`property-set:template` event).

I must admit I have not thought this through, because if you get passed the VM you can always grab the template and do "autodiscovery", but if we make `Pool` independent from the VM we can use an event handler.
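A rough sketch of what such a handler could look like (the handler name, its signature, and the re-initialization step are assumptions, not a settled API):

```python
import qubes.events

class QubesVM:
    @qubes.events.handler('property-set:template')
    def on_template_set(self, event, name, newvalue, oldvalue=None):
        # Re-derive the root volume from the new template; assumes the
        # pool can resolve the new origin from the updated config.
        conf = self.volume_config['root']
        pool = self.storage.get_pool(conf['pool'])
        self.volumes['root'] = pool.init_volume(conf)
```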
kalkin (Member) commented Apr 10, 2016
> Generally a `vid` would be something unique for the pool.

I'm sorry, not unique! A `vid` is just something that makes sense for the pool. E.g. an `LvmPoolSnapshotVolume` would have the `vid` set to the name of the `LvmOriginalVolume` volume, so multiple AppVMs would have a root config with the same `vid`. This doesn't matter, because the pool logic will make sure that a 'snapshot' volume_type is always mounted read-only and will not change.
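For example (hypothetical LVM volume names, purely to illustrate the shared `vid`):

```python
# Two AppVMs based on the same template: both root snapshot configs
# point at the same origin volume, so they carry the same vid.
vm1_root_config = {'pool': 'lvm', 'volume_type': 'snapshot',
                   'vid': 'qubes_dom0/vm-fedora-23-root'}
vm2_root_config = {'pool': 'lvm', 'volume_type': 'snapshot',
                   'vid': 'qubes_dom0/vm-fedora-23-root'}  # same vid
```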
marmarek (Member) commented Apr 10, 2016
> This is implementation specific. Here is how it looks for `XenPool`: it instantiates a `XenPool` for the TemplateVM of the AppVM to get the directory where the original image is. Now that I have explained it, this sounds really hacky.

😟

IMO `Pool` should not hold a VM reference; instead, the VM should be passed to `Pool.init_volume`. Then, depending on the implementation, the `Volume` instance may keep the VM reference or may drop it.
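A sketch of the proposed change (illustrative names, not final code):

```python
class Pool:
    def init_volume(self, vm, volume_config):
        # The Pool itself stays vm-agnostic; only the resulting Volume
        # may (or may not) keep a reference to the vm.
        raise NotImplementedError

# The call site in Storage.__init__ would then become:
#     vm.volumes[name] = pool.init_volume(vm, conf)
```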
kalkin (Member) commented Apr 10, 2016
> IMO `Pool` should not hold a VM reference; instead, the VM should be passed to `Pool.init_volume`. Then, depending on the implementation, the `Volume` instance may keep the VM reference or may drop it.

Sounds reasonable.
This was referenced Apr 18, 2016
added a commit to marmarek/old-qubes-core-admin that referenced this issue — May 20, 2016
added a commit to marmarek/old-qubes-core-admin that referenced this issue — May 21, 2016
marmarek (Member) commented May 21, 2016
Design update proposal:

Add `Volume.verify`, then call it from `Storage.verify_files`, and do not check file/image presence anywhere else (for example at volume instantiation).

Rationale:

It should be possible to have a full `QubesVM` object without the actual files being in place. Of course such a VM can't be started, etc. The use case I care about most here is restoring from backup: it consists of loading `qubes.xml` from the backup, then picking some of the VMs, and only then restoring the files, possibly changing some properties in the meantime (for example `kernel`). So full `qubes.xml` loading must succeed without the VM files being in place.

Another use case is a crash (or even user error) resulting in missing VM files. A missing file of one VM should not block all of them (which is what happens when an exception is raised during `qubes.xml` loading).

Right now I've tripped over the kernel pool: https://github.com/woju/qubes-core-admin/blob/core3-devel/qubes/storage/kernels.py#L61-L63
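A minimal sketch of the proposed split (method bodies and the error type are assumptions):

```python
import os

class StorageError(Exception):
    """Placeholder error type for this sketch."""

class Volume:
    def verify(self):
        # Presence check happens only here, never during qubes.xml loading,
        # so a missing image cannot break loading of the whole qubes.xml.
        if not os.path.exists(self.path):
            raise StorageError('volume {!r} is missing'.format(self.vid))

class Storage:
    def verify_files(self):
        # Called before domain startup, not at volume instantiation.
        for volume in self.vm.volumes.values():
            volume.verify()
```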
@marmarek Sounds reasonable.
@marmarek Something like this: kalkin/qubes-core-admin@7f4e71e ?
marmarek (Member) commented May 22, 2016
Yes, exactly :)
andrewdavidwong added the C: core and P: major labels — May 23, 2016
added a commit that referenced this issue — May 31, 2016
added a commit to woju/qubes-core-admin that referenced this issue — Jun 2, 2016
added a commit to woju/qubes-core-admin that referenced this issue — Jun 2, 2016
andrewdavidwong (Member) commented Jun 28, 2016
Does per-VM encryption fall under this issue? (https://github.com/QubesOS/qubes-issues/issues/1293#issuecomment-229028321)
kalkin (Member) commented Jun 30, 2016
@andrewdavidwong Theoretically it should be possible with a custom storage implementation, but the possibility to encrypt the volatile image could be added even to current storage implementations.
na-- referenced this issue — Oct 27, 2017: Difficult to determine free and used disk space with LVM thin provisioning #3240 (closed)
marmarek added the C: doc label — Jan 22, 2018
marmarek (Member) commented Mar 30, 2018
The last part, documentation, is done here: https://dev.qubes-os.org/projects/core-admin/en/latest/qubes-storage.html
kalkin commented Mar 16, 2016 (edited by kalkin, most recently Jul 13, 2016)
## Design
This is a design proposal for the storage subsystem for QubesOS R4.0. For previous discussions see also the PR: Allow defining and using custom Storage types.
## Requirements
## Storage Volumes

### Volume
Encapsulates all data about a volume for serialization to `qubes.xml` and the libvirt config.
#### Interface
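The interface listing did not survive in this copy of the issue; the following is a hedged reconstruction from the attributes and methods mentioned in the thread (`vid`, `volume_type`, `block_device()`, `verify()`, `resize()`) — treat anything beyond those names as an assumption:

```python
class Volume:
    def __init__(self, name, pool, volume_type, vid=None, size=0):
        self.name = name                # 'root', 'private', 'volatile', ...
        self.pool = pool                # pool name or object reference
        self.volume_type = volume_type  # 'origin', 'snapshot', 'read-write',
                                        # 'read-only', 'volatile'
        self.vid = vid                  # pool-specific volume identifier
        self.size = size

    def block_device(self):
        """Return a BlockDevice describing this volume for the libvirt XML."""
        raise NotImplementedError

    def verify(self):
        """Check that the backing file/device is present (see the
        discussion of Volume.verify above)."""
        raise NotImplementedError

    def resize(self, size):
        """See the open question below: should this also allow shrinking?"""
        raise NotImplementedError
```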
### Pool

A `Pool` is used to manage different kinds of volumes (file-based/LVM/Btrfs/...).
#### Interface

Third parties providing their own storage implementations will need to implement the following interface.
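Again a hedged reconstruction: only `init_volume()` and `start()` are named in the thread, so the constructor shape is an assumption:

```python
class Pool:
    def __init__(self, name, **config):
        self.name = name
        self.config = config

    def init_volume(self, volume_config):
        """Instantiate a Volume from a volume_config dict."""
        raise NotImplementedError

    def start(self, volume):
        """Prepare a volume for domain startup: reset a volatile volume,
        snapshot from the origin for a snapshot volume, and so on."""
        raise NotImplementedError
```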
### Storage

The `Storage` class provides management methods for a domain's volumes. The methods are called by the VM at the appropriate time. Currently it lives in `qubes/storage/__init__.py`, but I'm considering moving it somewhere else, or making it a part of `QubesVM`, because most of the methods just iterate over `self.vm.volumes` and execute a method. See also my current `Storage` version.

#### Interface
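A sketch assembled from the methods named in this thread (`get_pool`, `create_on_disk`, `start`, `verify_files`); the bodies follow the "iterate over `self.vm.volumes`" pattern described above, and the `app.pools` lookup is an assumption:

```python
class Storage:
    def __init__(self, vm):
        self.vm = vm
        if hasattr(vm, 'volume_config'):
            for name, conf in vm.volume_config.items():
                pool = self.get_pool(conf['pool'])
                vm.volumes[name] = pool.init_volume(conf)

    def get_pool(self, volume_or_name):
        # Accepts a Volume or a plain pool name; resolves via Qubes.pools
        # (see Further Details below).
        name = getattr(volume_or_name, 'pool', volume_or_name)
        return self.vm.app.pools[name]

    def create_on_disk(self, source_template=None):
        for volume in self.vm.volumes.values():
            self.get_pool(volume).create_on_disk(volume, source_template)

    def start(self):
        for volume in self.vm.volumes.values():
            self.get_pool(volume).start(volume)

    def verify_files(self):
        for volume in self.vm.volumes.values():
            volume.verify()
```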
## Further Details

- `QubesVM.volume_config` would contain a dict `{'volume_name': {config}}` from the XML configuration for the current domain.
- `QubesVM.volumes` is a dict containing `{'root_img': Volume, 'private_img': Volume, ...}`.
- `Qubes.pool_config` contains the pool config parsed from `qubes.xml`. `Qubes.pool_config` will be replaced with `Qubes.pools` containing `*Pool` objects.

## Open Questions
- `Volume.resize()`? There might be volume implementations which have strategies for shrinking. Should `resize()` also accept sizes smaller than the current one? Or should we even have `extend()` and `shrink()`?

EDITS:

- `Storage.import()`