Fixes #50 - Clean all Qpid queues to zero #52

Closed
wants to merge 1 commit

Conversation

kgaikwad
Member

No description provided.

    queues_cleared << qname
  end
ensure
  queues_cleared
Member

why the ensure block here?

Member Author

Oh, my bad.
I was thinking of adding a rescue..ensure block here, since in the clear(qname) method I am triggering a command.
I missed adding the rescue block.
I will modify the code and update the PR.

Member

The ensure block probably doesn't work as you expect; try:

def foo
  raise "Error"
rescue => e
  "rescue"
ensure
  "ensure"
end

foo returns "rescue", not "ensure" - the ensure block has no effect on the return value

Member Author

Yes, you're right.
Removed the ensure block and added the return value in the rescue itself.
A return statement in the ensure block was giving a rubocop warning - 'Do not return from an ensure block'.
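
A minimal sketch of the described change (clear_all, clear and available_qpid_queues are names assumed from the diff context in this thread): the collected list is returned from both the happy path and the rescue branch, so no ensure block is needed.

def clear_all
  queues_cleared = []
  available_qpid_queues.each do |qname|
    clear(qname)
    queues_cleared << qname
  end
  queues_cleared
rescue StandardError
  # the command triggered by clear(qname) may fail; return what was cleared so far
  queues_cleared
end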

@kgaikwad kgaikwad force-pushed the 50_clear_qpid_queues branch 2 times, most recently from 6496532 to 6b01b9a Compare April 19, 2017 16:25
@iNecas
Member

iNecas commented Apr 20, 2017

After real-world testing, I would suggest to:

  1. limit the list of queues we delete to the persisted ones only (a sketch of this filter follows below); deleting the non-persisted ones fails with:
 Failed: Exception: Exception from Agent: {u'error_code': 7, u'error_text': 'not-found: Delete failed. No such queue: 3dfb669f-ba51-4093-9d47-19efc686166d:0.0 (/builddir/build/BUILD/qpid-cpp-0.30/src/qpid/broker/Broker.cpp:1485)'}

It seems we should delete only the persisted ones:

qpid-config --ssl-certificate=/etc/pki/katello/qpid_client_striped.crt -b amqps://localhost:5671  list queue
Objects of type 'queue'
  name                                           durable  autoDelete
  ====================================================================
  85cff499-be17-49b5-a8dd-c41ee06a8e9f:0.0       False    True
  907a50d7-5e79-4eba-a507-e0a810bef063:1.0       False    True
  cd3941be-d52e-4980-8243-62b2bf0318e8:1.0       False    True
  celeryev.5c2d60ec-8586-412f-9e86-3d4d92902283  False    True
  katello_event_queue                            True     False
  pulp.task                                      True     False
  sat61-rhel6.example.com:event                  True     False
  2. the queue can't be deleted while the services are using it: we should stop the services first, katello-service stop --except qpidd

Since this is more a migration step rather than a pre-upgrade check, I suggest doing the updates above, but still not merging until we distinguish between pre_upgrade checks and upgrade steps.
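
A rough sketch of point 1, assuming the qpid_config helper used elsewhere in this PR: keep only the rows of the listing whose durable column is True.

def durable_queues
  output = qpid_config('list queue --show-property=name --show-property=durable')
  output.to_s.lines.map(&:split)
        .select { |columns| columns[1] == 'True' } # header rows never match
        .map(&:first)
end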

def run
  with_spinner('clear qpid queues') do |spinner|
    total_queues_cleared = feature(:qpid_queues).clear_all
    spinner.update "[#{total_queues_cleared.length}] queues cleared.\
Member

the [14] queues cleared would look better without [], so it would be 14 queues cleared

@kgaikwad kgaikwad force-pushed the 50_clear_qpid_queues branch 2 times, most recently from 71db180 to 9731a97 Compare May 22, 2017 07:38
@kgaikwad
Member Author

@iNecas ,
Done with the changes you suggested in the previous comments.
Added a katello-service feature to start and stop the services.

Because of this, the services recreate the queues on start, which makes the check run again.
I guess this will get resolved after adding pre/post migration steps for it.

def available_qpid_queues
  output = qpid_config("list queue --show-property=name \
--show-property=autoDelete | awk '$2 ~ /False/{ print $1 }'")
  output ? output.split(' ') : []
Member

qpid_config(...).to_s.split(' ')
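
Applied to the method above, the suggestion would look roughly like this (command string copied from the diff); to_s turns a nil command output into an empty string, so the ternary guard is no longer needed:

def available_qpid_queues
  qpid_config("list queue --show-property=name --show-property=autoDelete" \
              " | awk '$2 ~ /False/{ print $1 }'").to_s.split(' ')
end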


def run
  with_spinner('clear qpid queues') do |spinner|
    feature(:katello_service).make_stop(spinner, '--exclude qpidd')
Member

iNecas commented Jun 7, 2017

We should remember the original state of the services, and at the end start only the ones that were running before. What about having something like this:

def make_stop(spinner, args)
  originally_running_services = running_services
  execute!("katello-service stop #{args}")
  yield
ensure
  execute!("katello-service start --only #{originally_running_services}")
end

@@ -0,0 +1,47 @@
class Features::QpidQueues < ForemanMaintain::Feature
Member

I think we could generalize this to a qpid feature, and collect multiple functionalities around it in the future

@@ -92,6 +92,10 @@ def execute(command, options = {})
def shellescape(string)
Shellwords.escape(string)
end

def qpid_tools_installed?
Member

Any reason not to put it into the qpid feature? The SystemHelpers should contain only the more generic commands, usable across the definitions.

Member Author

No specific reason.
I named the feature QpidQueues, which is why I thought this method might be needed by other qpid-specific features.
As per your comment above, I will rename the feature to qpid and move this method into the feature itself.

@iNecas
Member

iNecas commented Jun 7, 2017

Yes, after https://github.com/iNecas/foreman_maintain/pull/67, this should go into migration steps, as we rely on the installer to fill in the deleted queues, and therefore it should happen in the same phase.

def make_stop(spinner, args = '')
  originally_running_services = running_services
  spinner.update 'Stopping katello running services..'
  execute!("katello-service stop #{args}")
Member

the args could be called services, so that we can work with it a bit better. After doing so, we can compare the running_services with services and:

  1. stop only running_services & services
  2. skip this completely, if the services to be stopped are actually running.

Member Author

@iNecas ,
Could you please elaborate on the 2nd point, as I am a bit confused here?
As you mentioned, first I need to rename args to services, then:

  • Instead of passing an exclude list of services, I have to pass the services which need to be stopped.
  • After that I need to compare running_services with services,
    and stop only the services which are common to both lists.

Please correct me if I'm wrong.

Member

Oh, I missed that the except word is used there. In that case, what about having options with the possible keys :only and :exclude? Since we would have the list of currently running services, we would use only the --only option for the actual katello-service call, and only in case we don't have an empty list.
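
A minimal sketch of this suggestion, with helper and option names assumed (and a comma-separated list assumed for --only): resolve :only / :exclude against the currently running services and skip the katello-service call entirely when the resulting list is empty.

def services_to_stop(options = {})
  services = options[:only] || running_services
  services -= Array(options[:exclude])
  services & running_services
end

def make_stop(spinner, options = {})
  services = services_to_stop(options)
  if services.empty?
    spinner.update 'no katello service running'
  else
    execute!("katello-service stop --only #{services.join(',')}")
  end
end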

Member Author

@iNecas ,
Yes, that will be better. As we have the running_services list, we can support the :only and :exclude option keys.
I will modify and update the PR.

@kgaikwad
Member Author

@iNecas ,
Done with the changes. Ready for review.
One small query:
On service restart a few persistent queues get created again, which causes the check to re-run,
and at the end of the check run it shows the message - {queue_count} queues cleared. These queues will be recreated using installer
If queues are created again on service restart, should I change the success message? And how can we avoid the check re-run?

@kgaikwad kgaikwad force-pushed the 50_clear_qpid_queues branch 2 times, most recently from 3827558 to ae6d1bf Compare June 27, 2017 12:33
@kgaikwad
Member Author

Modified the success-info message to - {queue_count} queues cleared. These queues are recreated by restarting katello-services.

@iNecas
Member

iNecas commented Aug 14, 2017

I've raised some questions on the mailing list about whether we want to make this step part of the upgrade scenario, as it can negatively affect the clients connected to the system. Perhaps we will need to do some additional queue restore, as described in https://access.redhat.com/solutions/3148641 in the "ONLY if you end up with broken content under /var/lib/qpidd and dont have a working backup" scenario.

def make_stop(spinner, options = {})
  services = find_services_for_only_filter(running_services, options)
  if services.empty?
    spinner.update 'No any running katello service..'
Member

nitpick: no katello service running

def make_start(spinner, options = {})
  services = find_services_for_only_filter(stopped_services, options)
  if services.empty?
    spinner.update 'No any katello service to start.'
Member

ditto

@iNecas
Member

iNecas commented Aug 14, 2017

@kgaikwad until we get the answers, could you extract the katello_service into a separate PR, as there are other use-cases that might use it before we merge this one, such as #68.

@kgaikwad
Member Author

@iNecas ,
Yes, sure.
Created new PR #83 for the Katello-Service feature, along with a new issue in Redmine.

Please let me know if any additional changes are required in it.
I will update this Qpid PR later, once the katello-service feature PR gets merged.

@iNecas
Member

iNecas commented Aug 21, 2017

#83 got merged. Please rebase. Based on the scrum call discussion, it would be good to look into the queue re-creation part of https://access.redhat.com/solutions/3148641.

@iNecas
Member

iNecas commented Aug 21, 2017

On top of https://access.redhat.com/solutions/3148641, we could perhaps just dump the list of queues before the cleanup and re-create them from that list after deleting the queues. Thoughts, @pmoravec?
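
A hypothetical sketch of the dump-and-recreate idea (clear and available_qpid_queues are assumed from this PR; the qpid-config add queue invocation with --durable is an assumption to verify against the installed qpid-tools): remember the queue names before clearing and add them back once the cleanup is done.

def clear_and_recreate_all
  queues = available_qpid_queues
  queues.each { |qname| clear(qname) }
  queues.each { |qname| qpid_config("add queue #{qname} --durable") }
  queues
end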

@pmoravec
Contributor

Yes, it makes sense - though IMHO only for upgrades where the qpid-cpp-server* package was bumped to 0.34 (and where the directory structure has changed). Doing so during every upgrade would redundantly slow down the upgrade - creating one such queue for one Content Host takes some time (up to 1s), and there can be thousands of Content Hosts.

Two possible ways to identify the queues (only the pulp.agent.* ones need to be identified):

  1. choose from the candlepin consumers or katello_systems just the systems with the katello-agent package installed - to skip generating the queue for goferd-less systems.

  2. identify what queues were there before (and after) the upgrade (see the sketch after this list):

    ls /var/lib/qpidd/.qpidd/qls/jrnl /var/lib/qpidd/.qpidd/qls/jrnl2 /var/lib/qpidd/qls/jrnl /var/lib/qpidd/qls/jrnl2 2> /dev/null | sort -u | grep pulp.agent

  • jrnl is the old dir from before the upgrade, jrnl2 the one after the upgrade (relevant if e.g. a system registered after yum update but before satellite-installer)
  • the other pair of dirs covers the different RHEL6/RHEL7 paths (if applicable to upstream)
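
A sketch of option 2 in Ruby, assuming the journal paths from the command above: collect the pulp.agent.* queue directory names across the old and new qpid layouts and both RHEL path variants.

JOURNAL_DIRS = %w[/var/lib/qpidd/.qpidd/qls/jrnl /var/lib/qpidd/.qpidd/qls/jrnl2
                  /var/lib/qpidd/qls/jrnl /var/lib/qpidd/qls/jrnl2].freeze

def pulp_agent_queues
  JOURNAL_DIRS.flat_map { |dir| Dir.glob(File.join(dir, '*')) }
              .map { |path| File.basename(path) }
              .grep(/\Apulp\.agent/)
              .uniq.sort
end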

@kgaikwad
Member Author

@upadhyeammit,
As you are already working on the backup & restore stuff, please review these changes.
If these changes are not required, we can close this PR.

@upadhyeammit
Contributor

Closing this PR as the changes are not required now
