Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rbd-mirror A/A: track images in policy map #15788

Merged
merged 4 commits into from Nov 16, 2017

Conversation

vshankar
Copy link
Contributor

@vshankar vshankar added the rbd label Jun 20, 2017
@vshankar
Copy link
Contributor Author

#15691

@vshankar vshankar requested a review from dillaman June 20, 2017 18:16
@vshankar
Copy link
Contributor Author

items that need to be addressed:

  • persist map changes to disk
  • shuffle/unmap policy APIs

@@ -3097,6 +3097,7 @@ static const std::string IMAGE_KEY_PREFIX("image_");
static const std::string GLOBAL_KEY_PREFIX("global_");
static const std::string STATUS_GLOBAL_KEY_PREFIX("status_global_");
static const std::string INSTANCE_KEY_PREFIX("instance_");
static const std::string MIRROR_IMAGE_MAP_KEY_PREFIX("map_image_");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: map_image_ -> image_map_ to match constant's naming

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't that result in these entries being picked by when IMAGE_KEY_PREFIX is used as the key? I had faced this earlier...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are they used in the same object? I thought one was in rbd_mirroring and the other was in rbd_leader? Could also name it "instance_image_"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the sense, I used instance_map earlier and that got these keys to be listed out where INSTANCE_KEY_PREFIX was being used (leader object). So I stuck to keeping it distinct. image_map would be safe to use here.

@@ -3654,6 +3659,84 @@ int instances_remove(cls_method_context_t hctx, const string &instance_id) {
return 0;
}

int mirror_image_map_list(
cls_method_context_t hctx,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: hanging indent should be 4 spaces to differentiate the params from the function body

std::string &start_after, uint64_t max_return,
std::map<std::string, cls::rbd::MirrorImageMap> *image_map) {

int max_read = RBD_MAX_KEYS_READ;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: move into the while loop and properly initialize max_read to the min of RBD_MAX_KEYS_READ and the number of remaining items from max_return

@@ -3654,6 +3659,84 @@ int instances_remove(cls_method_context_t hctx, const string &instance_id) {
return 0;
}

int mirror_image_map_list(
cls_method_context_t hctx,
std::string &start_after, uint64_t max_return,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: const std::string &start_after

librados::ObjectWriteOperation *op,
const std::string &global_id, cls::rbd::MirrorImageMap &image_map);
void mirror_image_map_remove(
librados::ObjectWriteOperation *op, std::string &global_id);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: const std::string&


struct MappedImage {
std::string id;
std::string global_id;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: global_image_id

};

struct MappedImage {
std::string id;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this id used for? It's not obvious w/o tracing back through the code. As I mentioned in the (obsolete) ImageSpec class, I think you really need to track a set of peer UUIDs (black for local pool, populated UUID for remote peer pools). Once you drop to zero UUIDs, you know you can unmap the image from an instance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The id here is not for tracking. This structure is used by PoolReplayer::handle_update() and id gets assigned to whatever we get in image_id in PoolWatcher (although as you mentioned that passing id to start image replayers is not really required as we already pass the global_id). This can be taken away if required.

}
};

struct RemappedImage {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just track this all within the same data structure above? It would have a state of associating, unassociating, associated, and rebalancing. Probably should also track "next instance id" for the case of rebalancing so that your policy can re-execute while a rebalance is in-progress and make decisions based on the expected state, not current state. Also should add a "last mapped" timestamp to ensure you don't ping-pong images back-and-force (i.e. a given image that is already mapped cannot be re-mapped for X minutes -- config option).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These structures are used to pass a bunch of images to PoolReplayer::handle_update(). Tracking next instance id (plus other things) should mostly be a part of some other "long lived" data structure such as the map itself (or a part of the map which tracks peer uuids).

Regarding the RemappedImage structure, yeh, we have just stick to having just one structure for representing mapped, removed and remapped images.

std::ostream &, const MapSpec &mapspec);

// in-memory map of tracked images
struct InImage {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you would only need this data structure on a transient basis when loading the map from disk and then the policy's in-memory map would be initialized from it. Of course, the global image id -> instance id map is so simple it could just be represented by a std::map

@@ -364,6 +364,34 @@ struct TrashImageSpec {
};
WRITE_CLASS_ENCODER(TrashImageSpec);

enum MirrorImageMapState {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's drop this for now -- right now, the existence of the struct indicates the image is associated (or dissociating or associating).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I've taken out the state tracking part and just maintain instance_id on disk.

@vshankar vshankar force-pushed the mirror-ha-poolwatcher-policy branch 3 times, most recently from 73599ae to 2f7f293 Compare June 29, 2017 10:44
@vshankar
Copy link
Contributor Author

updated pr with a minor chage in tracking async operations in policy.

@vshankar vshankar force-pushed the mirror-ha-poolwatcher-policy branch 2 times, most recently from 138e1c6 to da35ebd Compare June 29, 2017 13:03
Copy link

@dillaman dillaman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cls_rbd changes also need updates to ceph_test_cls_rbd


max_read = MIN(RBD_MAX_KEYS_READ, max_return - image_map->size());
if (!vals.empty()) {
last_read = mirror_image_map_key(image_map->rbegin()->first);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The key should already be good-to-go -- no need to prefix an additional prefix onto it.

const std::string& global_image_id =
it->first.substr(MIRROR_IMAGE_MAP_KEY_PREFIX.size());

cls::rbd::MirrorImageMap imagemap;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: image_map

int max_read = MIN(RBD_MAX_KEYS_READ, max_return);
std::string last_read = mirror_image_map_key(start_after);

while (max_read) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: max_read > 0

int mirror_image_map_list(cls_method_context_t hctx,
const std::string &start_after,
uint64_t max_return,
std::map<std::string, cls::rbd::MirrorImageMap> *image_map) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: image_maps or image_mapping

@dillaman
Copy link

@vshankar ignore -- apparently I was looking at the wrong PR

@dillaman dillaman changed the title rbd-mirror A/A: track images in policy map [DNM] rbd-mirror A/A: track images in policy map Jul 19, 2017
@vshankar vshankar force-pushed the mirror-ha-poolwatcher-policy branch 4 times, most recently from bf0c12e to 37eacbf Compare July 31, 2017 09:50
const std::string &start_after,
uint64_t max_return,
std::map<std::string, cls::rbd::MirrorImageMap> *image_map) {
int max_read = MIN(RBD_MAX_KEYS_READ, max_return);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: move this inside the while loop and pick the min of max keys read or max_return - image_map->size()


/**
* Input:
* @param std::map<std::string, cls::rbd::MirrorImageMap>: image map
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: for the set and remove methods, you should just take a global id and an image map struct. You can combine multiple class method calls within the same librados operation. This will keep it consistent w/ all other RBD class methods.

@@ -0,0 +1,62 @@
// -*- mode:C++; tab-width:8; c-basic-offset:2; indent-tabs-mode:t -*-
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still would like to see the RemoveRequest folded into UpdateRequest where you give it a collection of global image id that need updated instance ids / data + a collection of global image ids to remove.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It kind of felt a bit odd to accept a map of images -> data to be updated and a set of image ids to remove, so I made them into separate state machines (earlier I did have them clubbed up as they were accepting a map of image -> instance ids with empty instance_id hinting removal).

* @endverbatim
*/
LoadRequest(librados::IoCtx &ioctx,
std::map<std::string, cls::rbd::MirrorImageMap> *image_map,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: see previous nit about swapping these to be something like image_maps or image_mapping


if (!image_map.empty()) {
m_start_after = image_map.rbegin()->first;
image_map_list();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: wrap w/ if (image_map.size() == MAX_RETURN) to avoid the extra empty call

template <typename I>
void Policy<I>::update_images_removed(const std::string &mirror_uuid,
const std::set<std::string> &global_image_ids) {
dout(20) << dendl;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only once the global image id has been removed from all peers (local and remote) should the remove remap logic execute

}

void finish(int r) override {
policy->finish_async_op();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: cannot finish the async op until after last dereference of policy

enum ShuffleType {
SHUFFLE_TYPE_INSTANCES_ADDED = 0, // rebalance due to addition of instances
SHUFFLE_TYPE_INSTANCES_REMOVED, // rebalance due to removal of instances
SHUFFLE_TYPE_IMAGES_ADDED_REMOVED, // rebalance due to adding/removing images
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: case doesn't appear to be used (of course, not sure the ShuffleType would need to live either)

}

uint64_t images_per_instance = nr_images / nr_instances;
if (nr_images % nr_instances) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: not a boolean result so append w/ != 0


bool instance_present(const std::vector<std::string> &instance_ids,
const std::string &instance) {
return !(std::find(instance_ids.begin(),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: swap == for != and drop the negation

@vshankar vshankar force-pushed the mirror-ha-poolwatcher-policy branch 2 times, most recently from ccdbf46 to 741e3ae Compare August 17, 2017 17:20
@vshankar
Copy link
Contributor Author

minor update: added encoder test for rbd::mirror::image_map::MirrorPolicyData

@vshankar vshankar force-pushed the mirror-ha-poolwatcher-policy branch 3 times, most recently from a5a6231 to 5d92b41 Compare August 28, 2017 07:38
@vshankar vshankar changed the title [DNM] rbd-mirror A/A: track images in policy map rbd-mirror A/A: track images in policy map Aug 28, 2017

Option("rbd_mirror_image_policy_update_check_interval", Option::TYPE_INT, Option::LEVEL_ADVANCED)
.set_default(1)
.set_description("interval (in seconds) to check images for mirror daemon peer updations"),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: "updations" -> "updates"

@@ -5276,6 +5276,14 @@ static std::vector<Option> get_rbd_mirror_options() {
Option("rbd_mirror_leader_max_acquire_attempts_before_break", Option::TYPE_INT, Option::LEVEL_ADVANCED)
.set_default(3)
.set_description("number of failed attempts to acquire lock after missing heartbeats before breaking lock"),

Option("rbd_mirror_image_policy_update_check_interval", Option::TYPE_INT, Option::LEVEL_ADVANCED)
.set_default(1)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: add a .set_min(1) as well

@@ -5276,6 +5276,14 @@ static std::vector<Option> get_rbd_mirror_options() {
Option("rbd_mirror_leader_max_acquire_attempts_before_break", Option::TYPE_INT, Option::LEVEL_ADVANCED)
.set_default(3)
.set_description("number of failed attempts to acquire lock after missing heartbeats before breaking lock"),

Option("rbd_mirror_image_policy_update_check_interval", Option::TYPE_INT, Option::LEVEL_ADVANCED)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: check -> throttle?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: maybe a double instead of int so that you can advance the state machine faster?

@@ -1,6 +1,7 @@
add_library(rbd_mirror_types STATIC
instance_watcher/Types.cc
leader_watcher/Types.cc)
leader_watcher/Types.cc
image_map/Types.cc)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: alphabetical order

@@ -21,6 +22,12 @@ set(rbd_mirror_internal
types.cc
image_map/LoadRequest.cc
image_map/UpdateRequest.cc
image_map/ImageAction.cc

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: alphabetical order


derr << ": invalid transition (action_type=" << action_type << ", state="
<< state << ")" << dendl;
return {SUB_ACTION_TYPE_INVALID, STATE_UNASSIGNED};

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: assert?

}
};

Mutex m_lock;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the lock required? Looks like the Policy::m_map_lock is always locked when using the ImageAction instance

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, its not required. But I'm thinking if at all locking in ImageAction is required in future. Also, this lock is never contented as of now (due to all access is under Policy::m_map_lock), so its pretty much lightweight.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally -- I'd yank it since ImageAction should always be used under the image map's lock

RWLock::WLocker map_lock(m_map_lock);

for (auto const &global_image_id : global_image_ids) {
ImageAction::SubActionType sub_action_type = m_image_action.sub_action_start(global_image_id);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Invoking this method will invoke one of the subaction callbacks -- might be nice if the switch below could be eliminated and instead have those callbacks queue themselves in the set/remove update lists.

if (r == 0) {
policy->notify_listener(to_map, to_unmap, to_purge);
} else {
// on failure, resume pending actions (if any) for this

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On failure I don't think it would be safe to proceed to the action. If you failed to notify an instance, try again until that instance is removed (marked dead) so that you know it's safe to proceed.

// on an ack, finish the current sub-action and schedule the
// next if available or start a next action if finished.
// on a nack: directly finish the current action (not that
// this will invoke the context callback if any with a -ve

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: -ve -> negative

@vshankar
Copy link
Contributor Author

@dillaman couldn't write up a list of changes in this current version (from the last version) of the PR.

// retry friendly callback contexts for sub actions. note that
// callback context when an action is completed need not be
// of this type as its just invoked once, upon completion.
struct SubActionContext : Context {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel there could be a better way of implementing this abstract class and how its used by the derived classes below (SubActionMapContext etc..). Thoughts?

@vshankar
Copy link
Contributor Author

changes in the latest update (w/ comments incorporated)

  • split the original image_map::Policy class into ImageMap and image_map::Policy. ImageMap schedules image updates, persists on-disk image map, notifies listener and makes use of image_map::Policy.
  • image_map::Policy is an abstract class which maintains the master instance map (instance_id -> images) and provides public APIs to lookup, map, unmap and shuffle (to move images mapped to an instance to another instance). Implementations of image_map::Policy need to provide custom implementations of a handful of pure virtual functions. These are invoke by image_map::Policy as per the need. image_map::SimplePolicy is such an implementation of this abstract class which does a simple mapping of M images to N instances.


// NOTE: this class assumes callers take care of serialization when
// invoking class APIs
class ImageAction {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: since this actually is a per-image container of actions, perhaps it should be named ImageActions (plural)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done -- this and moved to image_map.

@@ -0,0 +1,124 @@
// -*- mode:C++; tab-width:8; c-basic-offset:2; indent-tabs-mode:t -*-

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: move to image_map subdirectory / namespace since it's only used by ImageMap

@@ -20,6 +23,9 @@ set(rbd_mirror_internal
Threads.cc
types.cc
image_map/LoadRequest.cc
image_map/Policy.cc
image_map/SimplePolicy.cc
image_map/Types.cc

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: duplicate -- already in rbd_mirror_types above

@@ -0,0 +1,26 @@
// -*- mode:C++; tab-width:8; c-basic-offset:2; indent-tabs-mode:t -*-

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: move to image_map directory / namespace (and drop the ImageMap prefix since it would be duplicated by the namespace. Could also add it to the existing image_map/Types.h file.


std::unique_ptr<image_map::Policy<ImageCtxT> > m_policy; // our mapping policy

Context *m_timer_task;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: default initialize to nullptr here instead of the constructor


for (auto const &global_image_id : global_image_ids) {
std::string instance_id = m_policy->lookup(global_image_id);
assert(instance_id != Policy<I>::UNMAPPED_INSTANCE_ID);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: probably should just continue if the instance is already unmapped. It might be possible for a duplicate remove event to fire.


// removal notification will be notified instantly. this is safe
// even after scheduling action for images as we still hold m_lock
notify_listener_remove_images(mirror_uuid, to_remove);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: wrap w/ if (!mirror_uuid.empty())

<< peer_uuid << dendl;

auto rm = m_image_info[global_image_id].registered_peers.erase(peer_uuid);
assert(rm);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: might be possible that peer sends duplicate removal event, so just ignore.


void complete(int r) override {
finish(r);
image_map->set_retry_instance_id(global_image_id, instance_id);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if ImageAction and Policy are better integrated, the policy would already know where the image is currently mapped and where the desired next mapping is located. Therefore, you wouldn't need to separately track where to re-send notifications.

instance_id(instance_id) {
}

void update_map_sub_action() {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: why have these "helper" methods instead of just directly invoking the methods from the appropriate finish override?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No strong reason to have it this way... SubActionContext invokes set_retry_instance_id() which requires it to record global_image_id and instance_id. Alternatively, each derived class could have accessed the base class member variable (via SubActionContext::) and invoked the appropriate methods in finish(). Just found this a bit cleaner.

I'll relook at this w.r.t. your comment on integrating Policy and ImageActions.

@vshankar vshankar force-pushed the mirror-ha-poolwatcher-policy branch 2 times, most recently from f36c439 to 00b25b3 Compare October 16, 2017 12:13
@vshankar
Copy link
Contributor Author

@dillaman PR updated with changes.

@vshankar vshankar force-pushed the mirror-ha-poolwatcher-policy branch 2 times, most recently from a9fbd19 to 6d88a0b Compare October 23, 2017 04:50
std::map<std::string, ActionState> m_actions;
std::map<std::string, utime_t> m_shuffled_timestamp;

std::set<std::string> m_dead_instances;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to retain this as policy keeps the instance around till all images mapping to an instance are unmapped (migrated) when an instance is removed.

RWLock::WLocker map_lock(m_map_lock);

for (auto const &it : image_mapping) {
map(it.first, it.second.instance_id, utime_t(0, 0), m_map_lock);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should persist the "mapped" timestamp within the on-disk image map so a failure of primary still honors the migration timeout upon recovery. This could be added in a add-on commit to this PR so I don't need to keep re-reviewing large sections.

virtual ~Policy() {
}

struct Action {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: does this need to be public? It also doesn't seem to be tied to Policy so perhaps it could be moved to its own stand-alone .h/.cc files?

std::set<std::string> *remap_global_image_ids);

// state machine related methods
void init_state(const std::string &global_image_id);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK -- so here is my proposal to simplify this interface. The ImageMap should never construct actions. Instead, provide add_image(global_image_id, Context*, ...), remove_image(global_image_id, Context*, ...), and shuffle_image(global_image_id, Context, ...). The add_image method would replace init_state, the internal action state machine would automatically handle releasing the image's state if no further actions are scheduled (removing the need for release_state and actions_pending).

When combined with the suggestions to eliminate the subactions, you would probably then only need to have start_next_action(global_image_id) and finish_action(global_image_id, int r).

Context *on_purge = new SubActionPurgeContext(this, global_image_id);

Context *on_finish = new FunctionContext(
[this, global_image_id, on_unmap, on_purge](int r) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: break this out into its own handle_remove_action method instead of using a lambda. It has the added bonus that the dout at the top of the handle method helps state tracking

Context *on_map = new SubActionMapContext(this, global_image_id);

Context *on_finish = new FunctionContext([this, global_image_id, on_map](int r) {
assert(m_lock.is_locked());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: break this out into its own handle_add_action method instead of using a lambda. It has the added bonus that the dout at the top of the handle method helps state tracking

ActionType action_type;
State current_state;
Transition transition;
};

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be nice to have an error_transition to automatically handle the failure path depending on the current action+state. For example, I think we should assume that acquire/release notifications to instances should only fail after the peer has been blacklisted. The only retry-able case would be when updating the on-disk map.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for clarity -- retries with peer would be handled by the listener (of ImageMap class)?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes -- InstanceWatcher will retry the peer until successful or until the peer has been blacklisted (this last part needs to be handled).

STATE_UNASSIGNED = 0,
STATE_ASSOCIATED,
STATE_DISASSOCIATED,
STATE_PURGING,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: STATE_UPDATING_MAP

ACTION_TYPE_SHUFFLE,
};

enum SubActionType {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I almost wonder if we can just eliminate the subactions and promote the few transitory states to first-level states (i.e. associating, dissociating, updating_map). There should be two idle states (associated and unassociated) where you can start new actions and transition to new transitory states.

}

template <typename I>
bool ImageMap<I>::start_next_action(const std::string &global_image_id,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: it appears that start_next_action is always associated w/ scheduling an action if possible, so perhaps instead of returning a bool here, just schedule the action if possible.


// shuffle images when instances are added/removed
void add_instances(const std::vector<std::string> &instance_ids,
std::set<std::string> *remap_global_image_ids, bool shuffle);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: is shuffle ever false besides in the constructor? If not, I think it can be dropped since at initialization, there isn't anything to shuffle.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually this can be removed, but not due to the reason of it always being false but because the map is empty when Policy is instantiated -- so do_shuffle_add_instances() would not do anything.

@vshankar vshankar force-pushed the mirror-ha-poolwatcher-policy branch 2 times, most recently from 876e273 to 9a47114 Compare November 12, 2017 15:58
Also, a "simple" policy implementation that maps M images to
N instances (M/N per rbd mirror daemon instance).

Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Signed-off-by: Venky Shankar <vshankar@redhat.com>
Fixes: http://tracker.ceph.com/issues/18786
Signed-off-by: Venky Shankar <vshankar@redhat.com>
@vshankar
Copy link
Contributor Author

vshankar commented Nov 13, 2017

changes in this update:

  • addressed comments
  • context callback for on-disk map update, acquire and release are StateTransition::State based (removed subactions from everywhere)
  • operation retry or abort is now handled by Policy (was handled by ImageMap earlier). also, only on-disk map updates are retried on failure.
  • Policy::remove_instances() is not a virtual function anymore (and not policy implementation dependent) -- it should always remap images to other (live) instances.

@ceph ceph deleted a comment from vshankar Nov 16, 2017
Copy link

@dillaman dillaman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@dillaman dillaman merged commit 87c1dc5 into ceph:master Nov 16, 2017
@tchaikov
Copy link
Contributor

this change breaks the "make check" on aarch64 consistently. see https://jenkins.ceph.com/job/ceph-pull-requests-arm64/9051/consoleFull#-2069175742d63714d2-c8d8-41fc-a9d4-8dee30be4c32

[ RUN      ] TestMockImageMap.SetLocalImages
seed 46228
/home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/tools/rbd_mirror/image_map/StateTransition.cc: In function 'static const rbd::mirror::image_map::StateTransition::Transition& rbd::mirror::image_map::StateTransition::transit(rbd::mirror::image_map::StateTransition::ActionType, rbd::mirror::image_map::StateTransition::State)' thread ffff666d4f40 time 2017-11-16 02:04:30.983796
/home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/tools/rbd_mirror/image_map/StateTransition.cc: 78: FAILED assert(false)
 ceph version 13.0.0-3210-g503ce71 (503ce7158d23a8f2da60e79690b5780bad56a9b7) mimic (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0xfc) [0xffff9915364c]
 2: (()+0x7cede8) [0xaaaad9fe6de8]
 3: (rbd::mirror::image_map::Policy::start_next_action(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xb4) [0xaaaad9fe185c]
 4: (()+0x4b9848) [0xaaaad9cd1848]
 5: (FunctionContext::finish(int)+0x3c) [0xaaaad9ce99e4]
 6: (Context::complete(int)+0x1c) [0xaaaad9ce131c]
 7: (()+0x4b4ec8) [0xaaaad9cccec8]
 8: (FunctionContext::finish(int)+0x3c) [0xaaaad9ce99e4]
 9: (Context::complete(int)+0x1c) [0xaaaad9ce131c]
 10: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa44) [0xffff9915ac04]
 11: (ThreadPool::WorkThread::entry()+0x14) [0xffff9915bb74]
 12: (()+0x6fc4) [0xffffa1a6afc4]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

@vshankar @dillaman

@vshankar
Copy link
Contributor Author

I'll look into it.

@tchaikov
Copy link
Contributor

tchaikov commented Nov 16, 2017

@vshankar my wild guess is the typecast in EncodeVisitor. i am pulling together a quick fix. #18964


struct PolicyMetaUnknown {
static const PolicyMetaType TYPE = static_cast<PolicyMetaType>(-1);

Copy link
Contributor

@tchaikov tchaikov Nov 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vshankar i think this is suspicious. as the underlying type of enum is unsigned by default. and casting it to and from uint32_t does not always work.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point here to to assign it to an unused ID (i.e. 0xFFFFFFFF)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my concern was that the MSBs of the number could be removed when being casted. but that's not the case.

@vshankar vshankar deleted the mirror-ha-poolwatcher-policy branch November 16, 2017 13:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants