This repository has been archived by the owner on Nov 15, 2019. It is now read-only.

Use inventory v4 #75

Closed · wants to merge 34 commits

Conversation

josejulio
Member

@josejulio josejulio commented Oct 12, 2017

Switching from inventory v3 to v4.

Pending:

Depends on ManageIQ/manageiq-providers-kubernetes#173

@josejulio
Member Author

@miq-bot add_label wip, refactoring

@miq-bot miq-bot changed the title WIP - Use inventory v4 [WIP] WIP - Use inventory v4 Oct 12, 2017
@josejulio josejulio changed the title [WIP] WIP - Use inventory v4 [WIP] Use inventory v4 Oct 12, 2017
@josejulio
Member Author

The fetching part is mostly done; I still need to finish fetch_availability and metrics.

I plan to squash all this fetching work into a single commit, clean up a bit (remove no-longer-used methods) and update the tests.

Inventory no longer uses Canonical Paths. Resources do have an id (which is opaque), but the name attribute is used instead of parsing the CPs.
The inventory API allows us to get a whole tree (e.g. the whole server for parsing), and calls no longer require specifying the feed.
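
A minimal sketch of that difference, assuming the hawkular-client-ruby v4 inventory API; root_resources and children_by_type are the calls used elsewhere in this PR, while the lookup by name is purely illustrative:

    # given an already-connected client, e.g. the handle returned by connect(options)
    def sample_v4_lookup(connection)
      # v3 identity was a canonical path that had to be parsed,
      # e.g. '/t;hawkular/f;f1/r;resource1' => tenant 'hawkular', feed 'f1', resource 'resource1'.
      # v4 resources carry an opaque id and a plain name instead.
      server = connection.inventory.root_resources.find { |r| r.name == 'Local' } # name is illustrative
      return if server.nil?

      puts server.id                        # opaque and unique; what ends up stored as ems_ref
      puts server.name                      # displayed directly, no path parsing
      server.children_by_type('Deployment') # whole subtree available, no feed argument needed
    end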

@miq-bot
Member

miq-bot commented Oct 20, 2017

This pull request is not mergeable. Please rebase and repush.

@miq-bot
Member

miq-bot commented Oct 23, 2017

This pull request is not mergeable. Please rebase and repush.

@miq-bot
Member

miq-bot commented Oct 24, 2017

This pull request is not mergeable. Please rebase and repush.

@israel-hdez
Member

@miq-bot add_label gaprindashvili/no

@josejulio
Member Author


 - Linter/Yaml - missing config files

I don't really know what it means.
Any idea?

@lucasponce
Contributor

Thx @josejulio. As for HAWKULAR-1275, I'm going to base my changes on top of your PR.

Member

@israel-hdez israel-hdez left a comment

Uff... Dropped some comments.

server.properties[feature] = 'true' if agent_config.try(:[], feature) == true
end
if server.properties['In Container'] == 'true'
container_id = collector.container_id(eap.feed)['Container Id']
Member

How is this working without the container_id method in the collector?

Member Author

I'm sure it's not working :)
I didn't notice that; not sure why it's not failing. I suppose I should use server.config instead.

Member Author

Updated to fix that.

def raw_availability_data(*args)
connection.metrics.avail.raw_data(*args)
def server_groups_from_host_controller(host_controller)
host_controller.children_by_type('Domain Server Group')
Member

As an improvement... probably we could monkey patch the Resources class of the client gem. Although I don't know if that would really be an "improvement". I mean, this isn't making any API calls, so it doesn't make much sense for it to be here in the collector.

Member Author

I agree, we could monkey patch the Resource class and remove these methods from the collector.
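
A possible shape for that monkey patch; Hawkular::Inventory::Resource as the class name is an assumption about the client gem, and children_by_type comes from the diff above:

    module Hawkular
      module Inventory
        class Resource
          # purely in-memory traversal, so it can live on the resource itself
          # instead of the collector
          def server_groups
            children_by_type('Domain Server Group')
          end
        end
      end
    end

Callers would then use host_controller.server_groups instead of collector.server_groups_from_host_controller(host_controller).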

collection = persister.middleware_deployments
fetch_availabilities_for(feeds, collection, collection.model_class::AVAIL_TYPE_ID) do |deployment, availability|
fetch_availabilities_for(collector.deployments, collection, collection.model_class::AVAIL_TYPE_ID) do |deployment, availability|
Member

And this is very similar, regarding hitting HServices twice to get the deployments again.

Member Author

See my other comment, I think we should refactor it to fetch each top resource only once.

deployment.status = process_deployment_availability(availability.try(:[], 'data').try(:first))
end
subdeployments_by_deployment_id = collector.subdeployments.group_by(&:parent_id)
Member

I find this a little inefficient in terms of performance. It is hitting Hservices for each available deployment, always fetching the same data and then filtering it.

Member Author

I think it is only hitting Hservices once, grouping by parent_id after fetching:

  • Fetch all the deployments, process them all (see end on line 195)
  • Fetch all the subdeployments, group them by their parent_id and process them (sketch below)
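
A sketch of that flow, using the names from the diff context (the block bodies are placeholders):

    # single fetch of all deployments, processed in one pass
    collector.deployments.each do |deployment|
      # process each deployment
    end

    # subdeployments are also fetched once, then grouped in memory by parent
    collector.subdeployments.group_by(&:parent_id).each do |parent_id, subdeployments|
      # process the subdeployments of each parent together
    end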

Member

Yes, you are right. My eyes saw it inside the fetch_availabilities_for block

fetch_domain_availabilities(feeds_of_interest)
fetch_deployment_availabilities
fetch_server_availabilities(collector.eaps)
fetch_server_availabilities(collector.domain_servers)
Member

These two lines are hitting HServices, but the servers were previously obtained in fetch_middleware_servers and fetch_domain_servers. Perhaps you could store the servers in instance variables to save another hit to HServices.

Member Author

I could do that, but I don't know if it is OK (memory-wise) to store all the servers.
Maybe we could refactor this to process each top resource in a single pass? e.g. (a sketch follows this list):

  • Fetch server A
    • Process server A
    • Process deployments of server A
    • Process datasources of server A
  • etc.
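
One possible shape of that single-pass refactor; apart from collector.eaps and children_by_type, every name here is hypothetical:

    collector.eaps.each do |server|
      parse_middleware_server(server)                   # process server A

      server.children_by_type('Deployment').each do |deployment|
        parse_deployment(server, deployment)            # ...its deployments
      end

      server.children_by_type('Datasource').each do |datasource|
        parse_datasource(server, datasource)            # ...its datasources
      end
    end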

Member

You already have them in memory. I was thinking of just not discarding the data, by using instance variables :) (I think it's easier than the refactor)

Member Author

They are lazily fetched from the server, see the resource method [1]. So I shouldn't have them all at once, unless I'm using a non-lazy enumeration method, which is also possible.

That said, I can do as you say and we can revisit it later if there is any performance issue.

[1] https://github.com/hawkular/hawkular-client-ruby/blob/master/lib/hawkular/inventory/inventory_api.rb#L75
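
A minimal version of the instance-variable caching, inside the collector; resources_for and the type names are assumptions, the point is only the memoization, so the lazy enumeration is forced once and the result is reused by the availability fetching:

    def eaps
      @eaps ||= resources_for('WildFly Server').to_a
    end

    def domain_servers
      @domain_servers ||= resources_for('Domain WildFly Server').to_a
    end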

Member

Oh! So there is some underlying magic!
Well... In that case, I'll leave the decision to you :)

end
end

def parse_deployment_name(name)
Member

Great! No more parsing of names :)

Member Author

Praise the new inventory!

'Running Mode' => 'NORMAL',
'Version' => '9.0.2.Final',
'Product Name' => 'WildFly Full',
'Host State' => 'running',
'Is Domain Controller' => 'true',
'Name' => 'master',
}
type_id = 'Host Controller'
feed = 'master.Unnamed%20Domain'
Member

Is this the right feed? In other places you replaced this with wf-domain.

Member Author

@josejulio josejulio Nov 17, 2017

I suppose the feed doesn't matter here, as we are using the id and never the feed.
I'm only using the feed to know which agent (and operating system) a server belongs to.
I'm going to update it to wf-domain, to remain consistent.

@jotak Is this a good way to know that? (using the feed id to link a server with its agent and operating system)


I'm not sure I see where you link servers with agents, but to answer: yes, the feed id is more or less the agent id and could be used for that purpose.
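
Illustrative only: using the feed id as the link between a server and the agent / operating system reported under the same feed (the collector accessors are assumptions):

    os_by_feed = collector.operating_systems.index_by(&:feed)

    collector.eaps.each do |server|
      os = os_by_feed[server.feed] # same feed => same agent, hence its OS resource
      # ... attach the OS properties to the parsed server
    end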


@@ -125,7 +198,7 @@
let!(:db_server) do
ems_hawkular.middleware_servers.create(
:feed => 'f1',
:ems_ref => '/t;hawkular/f;f1/r;resource1',
Member

ems_ref is the primary key in MiQ (isolated to each provider)... What will happen if two independent servers (i.e. two agents) happen to have the same server id? Having the feed in the ems_ref worked, to some extent, to avoid collisions. But now, is it just a bare ID?

Previously, potential collisions could happen in the domain case, where ems_ref included the server name instead of a random GUID as in the case of standalone servers. So, if you have two domains, each one with a server named server-one, a collision in MiQ was avoided by having the feed id in ems_ref. How would that be handled now? Or can collisions like this never happen?

I haven't seen what the IDs look like, so I'm unsure.


IDs are opaque (kind of) and unique, so it must be fine. In the inventory server, the id is the primary key for resources; there's no composed key. And in its implementation it contains the feed id.

Member Author

You can assume that IDs are unique. The current implementation uses the old "CanonicalPath" as the "id".

Member Author

@josejulio josejulio Nov 17, 2017

edit: Whoops, wrong thread

Member

Ok, I'm happy if IDs are unique :)

@@ -97,10 +96,10 @@ def assert_specific_domain_server
it 'will perform a full refresh on 127.0.0.1 even though the os type is not there yet' do
# using different cassette that represents the hawkular inventory without the operating system resource type
# TODO: Make this work with live tests
# Perhaps having a separate MW without OS configuration on its config
Member

Did you make it work? (asking because you removed :record => :none)

Member Author

Whoops, I didn't make it work for live tests. I only updated the YAML to remove all domain and OS interaction from the server. Will add back the :record => :none.

@lucasponce
Contributor

@josejulio minor bug from my side (as I am using this branch for hawkular-1275):

I locally changed list_feeds to root_resources in verify_credentials:

    def verify_credentials(_auth_type = nil, options = {})
      begin
        # As the connect will only give a handle
        # we verify the credentials via an actual operation
        connect(options).inventory.root_resources

I noticed it while setting up my environment, but I think you'd like to fix it in this PR.
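
The snippet above is cut off; a plausible complete shape, where everything except the root_resources call is an assumption about the surrounding code:

    def verify_credentials(_auth_type = nil, options = {})
      # connect only returns a handle, so verify the credentials with a real call
      connect(options).inventory.root_resources
      true
    rescue => err
      raise MiqException::MiqInvalidCredentialsError, err.message
    end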

@josejulio
Member Author

@miq-bot remove_label wip

@miq-bot miq-bot changed the title [WIP] Use inventory v4 Use inventory v4 Nov 21, 2017
@miq-bot miq-bot removed the wip label Nov 21, 2017
@josejulio
Member Author

@israel-hdez
Updated the code, can you take a second look?

Member

@israel-hdez israel-hdez left a comment

I think it looks good.

@@ -58,15 +57,15 @@ def fetch_availabilities
def fetch_server_availabilities(parser)
Member

We'll have to reconsider how these methods work.
Didn't you add some events regarding the availabilities? I guess we should now be using them. But that's for another PR.

Member Author

There are some events regarding availability, but they aren't enabled for every resource.

@israel-hdez
Member

I tried these changes and I'm curious why we can no longer identify which server group a server belongs to when it's not running.
With the WF example domain config, there is server-three, which is created in a stopped state. It belongs, IIRC, to other-server-group, but apparently its Server Group property comes back unset from Hawkular and, thus, MiQ is unable to identify and set the owning group.

@josejulio
Member Author

I tried these changes and I'm curious why we can no longer identify which server group a server belongs to when it's not running.
With the WF example domain config, there is server-three, which is created in a stopped state. It belongs, IIRC, to other-server-group, but apparently its Server Group property comes back unset from Hawkular and, thus, MiQ is unable to identify and set the owning group.

I'm not sure, but I suppose Hawkular is no longer giving us that information.
I'll check, maybe I'm missing something.

@josejulio
Member Author

So, after checking: we don't have the server group in the config when the server is not running:

"config": {
    "Server State": "STOPPED"
  }

If this worked before, maybe we should report a bug against the agent :)

@israel-hdez
Member

@josejulio I remember that it worked fine previously. So, yes! A bug on the agent should be reported :)

* The Refresher class now inherits from ManagerRefresher. This forces the inclusion of a builder class, which creates the required parser, collector and persister objects when the refresh happens.
* A new property is added in the settings file. Because the refresher now inherits from ManagerRefresher, targeted refresh must be explicitly enabled.
* The parser has been broken into multiple parsers, in preparation for implementing targeted refresh.
…files

The specs changed their form. The specs no longer use protected or private methods to check behaviours.

The old parser/middleware_manager_spec.rb file is removed, since it is replaced by the new files.
@israel-hdez
Member

@josejulio BTW... I saw some specs failing, even with the operations fixes. Are you also getting errors in the specs?

@josejulio
Member Author

All specs are passing on my side.

Break parser into entity focused parsers
@josejulio
Member Author

@lucasponce @jshaughn
Heads up!
@israel-hdez refactored some of the code and is already on this PR.
josejulio#1

israel-hdez and others added 2 commits November 29, 2017 10:06
Secondary references in inventory collection are experimental for now. Stop using them and use an index in the persister to link servers with server groups.
Fix rubocop warnings and stop using inventory collection secondary refs
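
An illustrative sketch of the "index in the persister" approach from that commit; the collection and attribute names are assumptions:

    # index the already-built server group records by name...
    groups_by_name = persister.middleware_server_groups.data.index_by(&:name)

    # ...and look the group up when building a domain server
    persister.middleware_servers.build(
      :ems_ref                 => server.id,
      :name                    => server.name,
      :middleware_server_group => groups_by_name[server.config['Server Group']]
    )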
@lucasponce
Contributor

@lucasponce @jshaughn
Heads up!
@israel-hdez refactored some of the code and is already on this PR.
josejulio#1

Thanks for the update.
I wonder, given the status we are in, if it makes sense to push the whole hawkular-1275 work instead of splitting it into inventory v4 only and then our work on top.
I don't know :-) I'm still finishing the changes locally here:
https://github.com/lucasponce/manageiq-providers-hawkular/tree/hawkular-1275
where @jshaughn is also PRing his changes.
I don't know, there are a lot of pieces moving on the same code, but yes, I guess first @josejulio needs to consolidate his changes, then mine, and last Jay's, as alerting is the user of most of the changes.

@jshaughn
Contributor

I wonder, given the status we are in, if it makes sense to push the whole hawkular-1275 work instead of splitting it into inventory v4 only and then our work on top.

+1, at this point we need to consolidate into one branch (per repo). For hawkular-client-ruby and manageiq-providers-hawkular I'd suggest hawkular-1275, hosted by lucas or josejulio. The 1275 branches would later be merged to master. But for manageiq and manageiq-ui-classic I'd maybe suggest hawkular-ga, which would likely be held from master until a later date?

So, I think @josejulio, @israel-hdez and @lucasponce should probably coordinate to decide who will own the consolidated branch, and then how best to consolidate. Let's meet if that would help.

@miq-bot
Member

miq-bot commented Nov 29, 2017

Checked commits josejulio/manageiq-providers-hawkular@4ee81f8~...4035d76 with ruby 2.3.3, rubocop 0.47.1, haml-lint 0.20.0, and yamllint 1.10.0
42 files checked, 1 offense detected


  • 💣 💥 🔥 🚒 - Linter/Yaml - missing config files

@josejulio josejulio closed this Nov 30, 2017