Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Host Verify Credentials connects directly to the host #6625

Open
agrare opened this issue Jan 20, 2020 · 27 comments
Open

Host Verify Credentials connects directly to the host #6625

agrare opened this issue Jan 20, 2020 · 27 comments

Comments

@agrare
Copy link
Member

agrare commented Jan 20, 2020

When verifying host credentials the verify operation is run from the controller directly not over the queue: https://github.com/ManageIQ/manageiq-ui-classic/blob/master/app/controllers/host_controller.rb#L229

@himdel
Copy link
Contributor

himdel commented Jan 21, 2020

@agrare So, what's the alternative?

I understand we need to run validate over the queue, I have no idea how to achieve that, or where to find any docs we have on that?

(I mean, I assume there's a preferred way of queuing a verify operation that's documented somewhere?)

@agrare
Copy link
Member Author

agrare commented Jan 21, 2020

I understand we need to run validate over the queue, I have no idea how to achieve that

Same as the rest of the operations we run over the queue, we need a verify_credentials_task method that calls verify_credentials. I can add that for you if you want.

@himdel
Copy link
Contributor

himdel commented Jan 21, 2020

OK, I can see there is a validate_credentials_task that we're using for provider validation.
(And that those 2 are the same method, not sure which of the names is deprecated.)

And it looks like right now only ExtManagementSystem has it.

So. if you wouldn't mind handling the backend bits, I'd be grateful :).


Also, looks like the current code is relying on being able to catch exceptions, but it seems validate_credentials_task only retuns the error message, not anything to match against MiqException::MiqSshUtilHostKeyMismatch, can we extend the method to also return the exception object?

@himdel
Copy link
Contributor

himdel commented Jan 21, 2020

And one more thing, I'm worried about it using MiqTask.wait_for_taskid, instead of UI's initiate_wait_for_task code.

I don't think we can have our initial request wait 30 seconds...

EDIT: separate issue though, providers do the same thing now

@agrare
Copy link
Member Author

agrare commented Jan 28, 2020

@himdel ManageIQ/manageiq#19775

@himdel
Copy link
Contributor

himdel commented Jan 29, 2020

Thanks! taking it from there :)

@agrare
Copy link
Member Author

agrare commented Feb 14, 2020

@himdel looks like ems edit verification also doesn't go over the queue, #6667

@himdel
Copy link
Contributor

himdel commented Feb 17, 2020

Thanks! taking it from there :)

Actually no, I can't.

I still need to be able to access the exception on failure, ManageIQ/manageiq#19775 doesn't seem to address that.

(+ we may need a change to account for #6667 (comment) #6667 (comment) ... that's a feature not a bug, and if validation on edit is supposed to work, we need an api that does not require the UI to know the password, and ensures we're validating the right thing (not reusing saved password for different server, for example).)

@agrare
Copy link
Member Author

agrare commented Feb 17, 2020

I still need to be able to access the exception on failure, ManageIQ/manageiq#19775 doesn't seem to address that.

@himdel ManageIQ/manageiq#19775 matches what we are doing with the new ems verify_credentials method not expected to be a drop-in replacement for what is currently there.

You should be able to get the exception message from the task.

@himdel
Copy link
Contributor

himdel commented Feb 17, 2020

@agrare Aah I think I see it now, thanks :)

verify_credentials_task returns a task id, so, we can initate_wait_for_task,

it takes the same arguments as verify_task, except for an initial userid .. which seems to not be an id, but a string - presumably the login name (User.curent_user.userid, I think).

after some experimentation with the queue, seems like on error, status is Error, message is the exception message, state is Finished, and the exception class does NOT get saved anywhere.

So, seems like we only need to fix the queue runner to save the exception class somewhere, otherwise we have no way to handle MiqException::MiqSshUtilHostKeyMismatch. Can I give you that task @agrare ?

(BTW as least simulate_queue_worker behaves differently when the error is NotImplementedError - that one doesn't end up changing the task, but errors the whole simulate_queue_worker run)

BTW is there any wiki/doc document for these things? I think I'll start one if not, too many undocumented core things to remember :).

@agrare
Copy link
Member Author

agrare commented Feb 17, 2020

it takes the same arguments as verify_task, except for an initial userid .. which seems to not be an id, but a string - presumably the login name (User.curent_user.userid, I think).

This is true for all MiqTasks, the schema for "miq_tasks" has t.string "userid". Confusing name, yes but like you said it is based off of the Users#userid attribute

@agrare
Copy link
Member Author

agrare commented Feb 17, 2020

So, seems like we only need to fix the queue runner to save the exception class somewhere, otherwise we have no way to handle MiqException::MiqSshUtilHostKeyMismatch

Hm that's going to be tough because MiqQueue#deliver doesn't even return the exception class and changing the return signature of that would be........crazy

Is there another way to do this? Can we catch that exception and return a known string that you can key off of? Or just drop the weird exception handling from the UI completely?

@agrare
Copy link
Member Author

agrare commented Feb 17, 2020

(BTW as least simulate_queue_worker behaves differently when the error is NotImplementedError - that one doesn't end up changing the task, but errors the whole simulate_queue_worker run)

Really? That's not what I'm seeing

>> MiqTask.generic_action_with_callback({}, {:class_name => "Host", :method_name => "foo", :args => []})
>> simulate_queue_worker

** Delivering Message id: [10000016640747],  id: [], Zone: [default], Role: [], Server: [], MiqTask id: [10000000053281], Ident: [generic], Target id: [], Instance id: [], Task id: [], Command: [Host.foo], Timeout: [600], Priority: [100], State: [ready], Deliver On: [], Data: [], Args: []

  MiqTask Load (0.9ms)  SELECT  "miq_tasks".* FROM "miq_tasks" WHERE "miq_tasks"."id" = $1 LIMIT $2  [["id", 10000000053281], ["LIMIT", 1]]
  MiqTask Inst Including Associations (0.2ms - 1rows)
   (0.4ms)  BEGIN
  SQL (1.0ms)  UPDATE "miq_tasks" SET "message" = $1, "state" = $2, "miq_server_id" = $3, "started_on" = $4, "updated_on" = $5 WHERE "miq_tasks"."id" = $6  [["message", "Task starting"], ["state", "Active"], ["miq_server_id", 10000000000012], ["started_on", "2020-02-17 15:36:47.028952"], ["updated_on", "2020-02-17 15:36:47.029356"], ["id", 10000000053281]]
   (7.2ms)  COMMIT
  MiqTask Load (0.3ms)  SELECT  "miq_tasks".* FROM "miq_tasks" WHERE "miq_tasks"."id" = $1 LIMIT $2  [["id", 10000000053281], ["LIMIT", 1]]
  MiqTask Inst Including Associations (0.1ms - 1rows)
  MiqServer Load (0.3ms)  SELECT  "miq_servers".* FROM "miq_servers" WHERE "miq_servers"."id" = $1 LIMIT $2  [["id", 10000000000012], ["LIMIT", 1]]
  MiqServer Inst Including Associations (0.2ms - 1rows)
   (0.1ms)  BEGIN
  SQL (0.3ms)  UPDATE "miq_tasks" SET "status" = $1, "message" = $2, "state" = $3, "updated_on" = $4 WHERE "miq_tasks"."id" = $5  [["status", "Error"], ["message", "undefined method `foo' for #<Class:0x00005555eb7837c8>\nDid you mean?  fork"], ["state", "Finished"], ["updated_on", "2020-02-17 15:36:47.079288"], ["id", 10000000053281]]
   (0.8ms)  COMMIT
   (0.1ms)  BEGIN
  SQL (0.4ms)  DELETE FROM "miq_queue" WHERE "miq_queue"."id" = $1 AND "miq_queue"."lock_version" = $2  [["id", 10000016640747], ["lock_version", 0]]
   (0.8ms)  COMMIT
^C
>> MiqTask.last
  MiqTask Load (0.8ms)  SELECT  "miq_tasks".* FROM "miq_tasks" ORDER BY "miq_tasks"."id" DESC LIMIT $1  [["LIMIT", 1]]
  MiqTask Inst Including Associations (0.2ms - 1rows)
=> #<MiqTask id: 10000000053281, name: nil, state: "Finished", status: "Error", message: "undefined method `foo' for #<Class:0x00005555eb783...", userid: "system", created_on: "2020-02-17 15:36:44", updated_on: "2020-02-17 15:36:47", pct_complete: nil, context_data: nil, results: nil, miq_server_id: 10000000000012, identifier: nil, started_on: "2020-02-17 15:36:47", zone: nil>

@agrare
Copy link
Member Author

agrare commented Feb 17, 2020

BTW is there any wiki/doc document for these things? I think I'll start one if not, too many undocumented core things to remember :).

For which things? A lot of this is insider developer stuff which unless it is really obscure I think we expect people to be able to figure out. If there are things you think would benefit from documentation go for it though!

@himdel
Copy link
Contributor

himdel commented Feb 18, 2020

Hm that's going to be tough because MiqQueue#deliver doesn't even return the exception class and changing the return signature of that would be........crazy
Is there another way to do this? Can we catch that exception and return a known string that you can key off of? Or just drop the weird exception handling from the UI completely?

I mean.. yes and no. If using the queue means we can't handle exceptions, then we can't use the queue in general. That's really it :). I would not treat exception handling using the exception class as "weird" ;), that should be the default.

As for the specific case ... maybe? But sounds like a lot of work in the wrong direction.
IMO we should just extend MiqTask with an exception_klass field, this will be necesary in other places that need to run tasks.

But, we can also catch the exception inside the verify_credentials? (https://github.com/ManageIQ/manageiq/pull/19775/files#diff-cae72f3c7d5b9ebc549497c6d8e9834aR759) method, and save {:message => e.message, :klass => e.klass}.to_json in the message field, if you really want to go the way of fixing one specific use case.

@himdel
Copy link
Contributor

himdel commented Feb 18, 2020

Really? That's not what I'm seeing...

I was testing with...

diff --git a/app/models/miq_task.rb b/app/models/miq_task.rb
index 3afbfd1cf6..5f607f4aa7 100644
--- a/app/models/miq_task.rb
+++ b/app/models/miq_task.rb
@@ -64,6 +64,10 @@ class MiqTask < ApplicationRecord
   scope :no_status_selected,      ->           { running.where.not(:status => %(Ok Error Warn)) }
   scope :with_status_in,          ->(s, *rest) { rest.reduce(MiqTask.send(s)) { |chain, r| chain.or(MiqTask.send(r)) } }
 
+  def magic
+    raise NoMethodError, "foo"
+  end
+
   def ensure_started
     self.started_on ||= Time.now.utc if state == STATE_ACTIVE
   end
MiqTask.generic_action_with_callback({:userid=>'admin', :action=>'Foo'}, {:args=>[], :class_name=>'MiqTask', :instance_id => MiqTask.first.id, :method_name => 'magic', :queue_name =>'generic', :role => 'ems_operations', :zone => MiqServer.my_zone})
simulate_queue_worker

but testing again on a different machine yields the result you're seeing ¯\_(ツ)_/¯. I probably missed an unrealted thing, sorry :).

@himdel
Copy link
Contributor

himdel commented Feb 18, 2020

For which things? A lot of this is insider developer stuff which unless it is really obscure I think we expect people to be able to figure out. If there are things you think would benefit from documentation go for it though!

Well, honestly any of them :)

In the ideal case, I would say the UI should never use any backend method that's not documented in developer docs. That's the only way to end up with a clean, documented interface down the road.

But yeah, we're not there yet, so I'll start writing down notes in https://github.com/ManageIQ/manageiq-ui-classic/wiki/Backend-knowledge-tidbits :)

@agrare
Copy link
Member Author

agrare commented Feb 21, 2020

I mean.. yes and no. If using the queue means we can't handle exceptions, then we can't use the queue in general. That's really it :).

We have to use the queue for ems_operations, that isn't an option this is a bug. I'm happy to add in the exception class to the error string if that is what you need but not using the queue isn't an option.

@Fryguy
Copy link
Member

Fryguy commented Feb 21, 2020

Reading back, I think we need to consider the long term case where everything UI goes over the REST API. If we were to implement this via API, you wouldn't get exceptions raised. Instead I guess it would be part of the miq_task payload. So, I would lean towards a design more in that vein, and not use exception handling.

@himdel
Copy link
Contributor

himdel commented Feb 21, 2020

OK, so we need to move the SSH key logic to the side that is executing the task on the queue, and figure out a way of sending a flash message contents.

My problem is that right now as it is, we're simply losing the information. We have no way to implement the same behaviour without knowing the exception class, the message is not useful.

@agrare
Copy link
Member Author

agrare commented Feb 21, 2020

I can add an option to the verify task to accept pubkey changes if you add a checkbox for it on the UI, the flash message can display the error which should mention the error was that the public key of the server has changed

@himdel
Copy link
Contributor

himdel commented Feb 21, 2020

@agrare I'm not sure how that would work, if the checkbox is off, all we could do is display a generic "Too bad" message, because we have no way of knowing that was the issue, right?

If we want to preserve the existing behaviour, the task has to fail in a way that can tell us what happened and allow us to queue again.

If we're OK with a "Failed" message and a "force" checkbox, I can do that :).

@agrare
Copy link
Member Author

agrare commented Feb 21, 2020

all we could do is display a generic "Too bad" message

Won't you be able to show the exception message from the task?

@himdel
Copy link
Contributor

himdel commented Feb 21, 2020

Oh, if you change the task to use the right exception message, sure :).

But then that task needs to catch MiqException::MiqSshUtilHostKeyMismatch and return the right message.

I still think the UI should be outputting the message, based on the task data (such as the exception, or a more explicit way of passing the information, like a result hash with a :host_key_changed => true), but, sure, that could work too.

@himdel
Copy link
Contributor

himdel commented Feb 21, 2020

((Note that if the backend is outputting the message, it needs to figure out the right language, not sure we have i18n set up in the queue?)) ((Or use the N_ + __ combo, but the message needs to be fully static then.))

@himdel himdel removed their assignment Apr 24, 2020
@chessbyte
Copy link
Member

@agrare @kavyanekkalapu can someone from the UI team bring this across the finish line. Obviously, there is no rush, but would be good for more maintainable code

@chessbyte chessbyte moved this from Backlog to To do in Roadmap Sep 14, 2021
@miq-bot
Copy link
Member

miq-bot commented Feb 27, 2023

This issue has been automatically marked as stale because it has not been updated for at least 3 months.

If you can still reproduce this issue on the current release or on master, please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions! More information about the ManageIQ triage process can be found in the triage process documentation.

@miq-bot miq-bot added the stale label Feb 27, 2023
@Fryguy Fryguy removed the stale label Mar 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Roadmap
  
To do
Development

No branches or pull requests

7 participants