Default AZ for instance if cross_az_attach=False and checking from API
If we're booting from an existing volume with cross_az_attach=False, but the
instance is not being created in a requested availability zone, we fail with
a 400 because by default the volume is in the 'nova' AZ while the instance
has no AZ set - because one wasn't requested and because the instance is not
in a host aggregate yet.
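
For illustration, the kind of request that hits this bug looks roughly like
the following (a hypothetical sketch; field names follow the
block_device_mapping_v2 API, identifiers are placeholders):

    # Boot from a pre-existing volume with no availability_zone requested.
    server_create_body = {
        "server": {
            "name": "bfv-server",
            "flavorRef": "<flavor-id>",  # placeholder
            # No "availability_zone" key, so the instance AZ is unset.
            "block_device_mapping_v2": [{
                "boot_index": 0,
                "source_type": "volume",
                "destination_type": "volume",
                # Pre-existing volume, in the 'nova' AZ by default.
                "uuid": "<volume-id>",  # placeholder
            }],
        }
    }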

This refactors the AZ validation during server create in the API so that it
happens before calling _validate_bdm: the pre-existing volumes are retrieved
early and, if cross_az_attach=False, the volume zone(s) are validated against
the instance AZ. If [DEFAULT]/default_schedule_zone (for instances) is not
set and the volume AZ does not match [DEFAULT]/default_availability_zone,
the volume AZ is put in the request spec as if the user had requested that
AZ when creating the server.
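
In simplified form, the new behavior boils down to something like the
following (a sketch of the _validate_vol_az_for_create helper added below;
the function name here is hypothetical and message details are elided):

    def pick_request_spec_az(requested_az, volume_az):
        # requested_az is the user-requested AZ, else
        # [DEFAULT]/default_schedule_zone (which defaults to None).
        if CONF.cinder.cross_az_attach:
            return None  # default: no AZ coupling between instance and volume
        if requested_az is None:
            if volume_az != CONF.default_availability_zone:
                return volume_az  # pin the request spec to the volume's zone
            return None  # both sides use the default 'nova' zone; nothing to do
        if requested_az != volume_az:
            raise exception.MismatchVolumeAZException(
                reason='server and volumes are not in the same availability zone')
        return None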

Since this is a change in how cross_az_attach is used and how the instance
default AZ works when using BDMs for pre-existing volumes, the docs are
updated and a release note is added.

Note that not all of the API code paths are unit tested because the
functional tests do most of the heavy lifting for coverage. Given the number
of unit tests impacted by this change, it is pretty obvious that (1) many
unit tests are mocking at too low a level and (2) functional tests are
better for validating these flows.

Closes-Bug: #1694844

Change-Id: Ib31ba2cbff0ebb22503172d8801b6e0c3d2aa68a
mriedem committed Oct 31, 2019
1 parent 2d84a56 commit 07a24dc
Showing 10 changed files with 361 additions and 148 deletions.
15 changes: 14 additions & 1 deletion doc/source/admin/availability-zones.rst
@@ -133,6 +133,14 @@ With respect to availability zones, a server is restricted to a zone if:
``availability_zone`` with the ``POST /servers/{server_id}/action`` request
using microversion 2.77 or greater.

4. :oslo.config:option:`cinder.cross_az_attach` is False,
:oslo.config:option:`default_schedule_zone` is None, and the server is
created without an explicit zone but with pre-existing volume
block device mappings. In that case the server will be created in the same
zone as the volume(s) if the volume zone is not the same as
:oslo.config:option:`default_availability_zone`. See `Resource affinity`_
for details.

If the server was not created in a specific zone then it is free to be moved
to other zones, i.e. the :ref:`AvailabilityZoneFilter <AvailabilityZoneFilter>`
is a no-op.
@@ -174,7 +182,12 @@ a server to another zone could also break affinity with attached volumes.
``cross_az_attach=False`` is not widely used nor tested extensively and
thus suffers from some known issues:

* `Bug 1694844 <https://bugs.launchpad.net/nova/+bug/1694844>`_
* `Bug 1694844 <https://bugs.launchpad.net/nova/+bug/1694844>`_. This is
fixed in the 21.0.0 (Ussuri) release by using the volume zone for the
server being created if the server is created without an explicit zone,
:oslo.config:option:`default_schedule_zone` is None, and the volume zone
does not match the value of
:oslo.config:option:`default_availability_zone`.
* `Bug 1781421 <https://bugs.launchpad.net/nova/+bug/1781421>`_


1 change: 1 addition & 0 deletions nova/api/openstack/compute/servers.py
@@ -737,6 +737,7 @@ def create(self, req, body):
exception.FlavorMemoryTooSmall,
exception.InvalidMetadata,
exception.InvalidVolume,
exception.MismatchVolumeAZException,
exception.MultiplePortsNotApplicable,
exception.InvalidFixedIpAndMaxCountRequest,
exception.InstanceUserDataMalformed,
150 changes: 140 additions & 10 deletions nova/compute/api.py
@@ -1032,6 +1032,104 @@ def _validate_host_or_node(self, context, host, hypervisor_hostname):
except exception.ResourceProviderNotFound:
raise exception.ComputeHostNotFound(host=hypervisor_hostname)

    def _get_volumes_for_bdms(self, context, bdms):
        """Get the pre-existing volumes from cinder for the list of BDMs.

        :param context: nova auth RequestContext
        :param bdms: BlockDeviceMappingList which has zero or more BDMs with
            a pre-existing volume_id specified.
        :return: dict, keyed by volume id, of volume dicts
        :raises: VolumeNotFound - if a given volume does not exist
        :raises: CinderConnectionFailed - if there are problems communicating
            with the cinder API
        :raises: Forbidden - if the user token does not have authority to see
            a volume
        """
        volumes = {}
        for bdm in bdms:
            if bdm.volume_id:
                volumes[bdm.volume_id] = self.volume_api.get(
                    context, bdm.volume_id)
        return volumes

    @staticmethod
    def _validate_vol_az_for_create(instance_az, volumes):
        """Performs cross_az_attach validation for the instance and volumes.

        If [cinder]/cross_az_attach=True (default) this method is a no-op.

        If [cinder]/cross_az_attach=False, this method will validate that:

        1. All volumes are in the same availability zone.
        2. The volume AZ matches the instance AZ. If the instance is being
           created without a specific AZ (either via the user request or the
           [DEFAULT]/default_schedule_zone option), and the volume AZ does not
           match [DEFAULT]/default_availability_zone for compute services,
           then the method returns the volume AZ so it can be set in the
           RequestSpec as if the user requested the zone explicitly.

        :param instance_az: Availability zone for the instance. In this case
            the host is not yet selected so the instance AZ value should come
            from one of the following cases:

            * The user requested availability zone.
            * [DEFAULT]/default_schedule_zone (defaults to None) if the request
              does not specify an AZ (see parse_availability_zone).
        :param volumes: iterable of dicts of cinder volumes to be attached to
            the server being created
        :returns: None or volume AZ to set in the RequestSpec for the instance
        :raises: MismatchVolumeAZException if the instance and volume AZ do
            not match
        """
        if CONF.cinder.cross_az_attach:
            return

        if not volumes:
            return

        # First make sure that all of the volumes are in the same zone.
        vol_zones = [vol['availability_zone'] for vol in volumes]
        if len(set(vol_zones)) > 1:
            msg = (_("Volumes are in different availability zones: %s")
                   % ','.join(vol_zones))
            raise exception.MismatchVolumeAZException(reason=msg)

        volume_az = vol_zones[0]
        # In this case the instance.host should not be set so the instance AZ
        # value should come from instance.availability_zone which will be one
        # of the following cases:
        # * The user requested availability zone.
        # * [DEFAULT]/default_schedule_zone (defaults to None) if the request
        #   does not specify an AZ (see parse_availability_zone).

        # If the instance is not being created with a specific AZ (the AZ is
        # input via the API create request *or* [DEFAULT]/default_schedule_zone
        # is not None), then check to see if we should use the default AZ
        # (which by default matches the default AZ in Cinder, i.e. 'nova').
        if instance_az is None:
            # Check if the volume AZ is the same as our default AZ for compute
            # hosts (nova) and if so, assume we are OK because the user did not
            # request an AZ and will get the same default. If the volume AZ is
            # not the same as our default, return the volume AZ so the caller
            # can put it into the request spec so the instance is scheduled
            # to the same zone as the volume. Note that we are paranoid about
            # the default here since both nova and cinder's default backend AZ
            # is "nova" and we do not want to pin the server to that AZ since
            # it's special, i.e. just like we tell users in the docs to not
            # specify availability_zone='nova' when creating a server since we
            # might not be able to migrate it later.
            if volume_az != CONF.default_availability_zone:
                return volume_az  # indication to set in request spec
            # The volume AZ is the same as the default nova AZ so we will be OK
            return

        if instance_az != volume_az:
            msg = _("Server and volumes are not in the same availability "
                    "zone. Server is in: %(instance_az)s. Volumes are in: "
                    "%(volume_az)s") % {
                'instance_az': instance_az, 'volume_az': volume_az}
            raise exception.MismatchVolumeAZException(reason=msg)
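
    # Illustrative aside, not part of the patch: with
    # [cinder]/cross_az_attach=False and [DEFAULT]/default_availability_zone
    # left at 'nova', the helper above yields:
    #
    #   _validate_vol_az_for_create(None, [{'availability_zone': 'az1'}])
    #       -> 'az1' (the caller pins the request spec to the volume zone)
    #   _validate_vol_az_for_create(None, [{'availability_zone': 'nova'}])
    #       -> None (both sides use the default; nothing to pin)
    #   _validate_vol_az_for_create('az2', [{'availability_zone': 'az1'}])
    #       -> raises MismatchVolumeAZException (a 400 at the API)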

def _provision_instances(self, context, instance_type, min_count,
max_count, base_options, boot_meta, security_groups,
block_device_mapping, shutdown_terminate,
@@ -1057,12 +1155,23 @@ def _provision_instances(self, context, instance_type, min_count,
security_groups)
self.security_group_api.ensure_default(context)
port_resource_requests = base_options.pop('port_resource_requests')
LOG.debug("Going to run %s instances...", num_instances)
instances_to_build = []
# We could be iterating over several instances with several BDMs per
# instance and those BDMs could be using a lot of the same images so
# we want to cache the image API GET results for performance.
image_cache = {} # dict of image dicts keyed by image id
# Before processing the list of instances get all of the requested
# pre-existing volumes so we can do some validation here rather than
# down in the bowels of _validate_bdm.
volumes = self._get_volumes_for_bdms(context, block_device_mapping)
volume_az = self._validate_vol_az_for_create(
base_options['availability_zone'], volumes.values())
if volume_az:
# This means the instance is not being created in a specific zone
# but needs to match the zone that the volumes are in so update
# base_options to match the volume zone.
base_options['availability_zone'] = volume_az
LOG.debug("Going to run %s instances...", num_instances)
try:
for i in range(num_instances):
# Create a uuid for the instance so we can store the
@@ -1116,7 +1225,7 @@ def _provision_instances(self, context, instance_type, min_count,
block_device_mapping = (
self._bdm_validate_set_size_and_instance(context,
instance, instance_type, block_device_mapping,
image_cache, supports_multiattach))
image_cache, volumes, supports_multiattach))
instance_tags = self._transform_tags(tags, instance.uuid)

build_request = objects.BuildRequest(context,
@@ -1472,7 +1581,7 @@ def _prepare_image_mapping(self, instance_type, mappings):
def _bdm_validate_set_size_and_instance(self, context, instance,
instance_type,
block_device_mapping,
image_cache,
image_cache, volumes,
supports_multiattach=False):
"""Ensure the bdms are valid, then set size and associate with instance
@@ -1486,14 +1595,15 @@ def _bdm_validate_set_size_and_instance(self, context, instance,
:param image_cache: dict of image dicts keyed by id which is used as a
cache in case there are multiple BDMs in the same request using
the same image to avoid redundant GET calls to the image service
:param volumes: dict, keyed by volume id, of volume dicts from cinder
:param supports_multiattach: True if the request supports multiattach
volumes, False otherwise
"""
LOG.debug("block_device_mapping %s", list(block_device_mapping),
instance_uuid=instance.uuid)
self._validate_bdm(
context, instance, instance_type, block_device_mapping,
image_cache, supports_multiattach)
image_cache, volumes, supports_multiattach)
instance_block_device_mapping = block_device_mapping.obj_clone()
for bdm in instance_block_device_mapping:
bdm.volume_size = self._volume_size(instance_type, bdm)
@@ -1531,7 +1641,7 @@ def _check_compute_supports_volume_type(context):
raise exception.VolumeTypeSupportNotYetAvailable()

def _validate_bdm(self, context, instance, instance_type,
block_device_mappings, image_cache,
block_device_mappings, image_cache, volumes,
supports_multiattach=False):
"""Validate requested block device mappings.
@@ -1542,6 +1652,7 @@ def _validate_bdm(self, context, instance, instance_type,
:param image_cache: dict of image dicts keyed by id which is used as a
cache in case there are multiple BDMs in the same request using
the same image to avoid redundant GET calls to the image service
:param volumes: dict, keyed by volume id, of volume dicts from cinder
:param supports_multiattach: True if the request supports multiattach
volumes, False otherwise
"""
@@ -1614,9 +1725,12 @@ def _validate_bdm(self, context, instance, instance_type,
"size specified"))
elif volume_id is not None:
try:
volume = self.volume_api.get(context, volume_id)
volume = volumes[volume_id]
# We do not validate the instance and volume AZ here
# because that is done earlier by _provision_instances.
self._check_attach_and_reserve_volume(
context, volume, instance, bdm, supports_multiattach)
context, volume, instance, bdm, supports_multiattach,
validate_az=False)
bdm.volume_size = volume.get('size')

# NOTE(mnaser): If we end up reserving the volume, it will
@@ -4107,10 +4221,26 @@ def _check_volume_already_attached_to_instance(self, context, instance,
pass

def _check_attach_and_reserve_volume(self, context, volume, instance,
bdm, supports_multiattach=False):
bdm, supports_multiattach=False,
validate_az=True):
"""Perform checks against the instance and volume before attaching.
If validation succeeds, the bdm is updated with an attachment_id which
effectively reserves it during the attach process in cinder.
:param context: nova auth RequestContext
:param volume: volume dict from cinder
:param instance: Instance object
:param bdm: BlockDeviceMapping object
:param supports_multiattach: True if the request supports multiattach
volumes, i.e. microversion >= 2.60, False otherwise
:param validate_az: True if the instance and volume availability zones
should be validated for cross_az_attach, False to not validate AZ
"""
volume_id = volume['id']
self.volume_api.check_availability_zone(context, volume,
instance=instance)
if validate_az:
self.volume_api.check_availability_zone(context, volume,
instance=instance)
# If volume.multiattach=True and the microversion to
# support multiattach is not used, fail the request.
if volume['multiattach'] and not supports_multiattach:
4 changes: 4 additions & 0 deletions nova/conf/availability_zone.py
@@ -55,6 +55,10 @@
* Any string representing an existing availability zone name.
* None, which means that the instance can move from one availability zone to
another during its lifetime if it is moved from one compute node to another.

Related options:

* ``[cinder]/cross_az_attach``
"""),
]

9 changes: 8 additions & 1 deletion nova/conf/cinder.py
@@ -89,13 +89,20 @@
If False, volumes attached to an instance must be in the same availability
zone in Cinder as the instance availability zone in Nova.
This also means care should be taken when booting an instance from a volume
where source is not "volume" because Nova will attempt to create a volume using
the same availability zone as what is assigned to the instance.
If that AZ is not in Cinder (or allow_availability_zone_fallback=False in
If that AZ is not in Cinder (or ``allow_availability_zone_fallback=False`` in
cinder.conf), the volume create request will fail and the instance will fail
the build request.
By default there is no availability zone restriction on volume attach.

Related options:

* ``[DEFAULT]/default_schedule_zone``
"""),
]
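
# Illustrative aside, not part of the patch: the three options that interact
# here can be toggled via oslo.config, for example in a test or dev session.
# CONF is assumed to be nova's global ConfigOpts instance (nova.conf.CONF).
import nova.conf

CONF = nova.conf.CONF
# Require instance and volume AZs to agree (the subject of this change).
CONF.set_override('cross_az_attach', False, group='cinder')
# Leave the instance default zone unset so the volume zone can take effect.
CONF.set_override('default_schedule_zone', None)
# Compute default AZ; a volume in any other zone pins the server to that zone.
CONF.set_override('default_availability_zone', 'nova')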

