Replace the hardware intake system. The current one depends on XML output from LLDP and LSHW. Unfortunately, this has three problems: the schema changes fairly often, which makes the intake brittle; LSHW/LLDP may not be available on the host platform; and, most importantly, we have to add support for every hardware type in collins. The goal is an intake system flexible enough to support hardware that collins doesn't know about right now. The current thinking is that we'll introduce a JSON format to replace the XML formats in use, and provide converters from LLDP/LSHW to the JSON format. Since collins supports API versioning, we'll likely peg the old endpoints at version 1.1 and the new ones at 1.2. We'll open up discussion on the list about the JSON format before we start coding.
Notes from Dan, from an internal ticket:
The current flat key/value store for assets and meta tags appears to be inadequate.
Example: Meta tags can't properly store VLan information for an asset. An asset may have multiple VLans, and each VLan has an id and a name. Group id is used to distinguish the VLan tags from other tags, but the correlation between ids and names is lost.
Allow the creation of assets which serve as the values for asset meta tags. The meta value itself will be a pointer to the id of the sub-asset.
For example, to solve the issue with VLans, each VLan would be a sub-asset. The same principle could be applied to most other groups of tags, such as disks, CPUs and NICs.
The asset/sub-asset tree would be limited to a depth of 1, i.e. sub-assets could not have sub-assets of their own.
In order to handle searching these values, we would use Lucene to flatten and index all top-level assets. Values of sub-assets could be squashed into multi-value keys.
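A minimal sketch of the squashing step, assuming an asset is represented as a plain dictionary with a `meta` map and a `sub_assets` map. Both names, the `flatten_for_index` function, and the sample values are hypothetical, not existing collins structures. It shows how the values of depth-1 sub-assets could end up in multi-value keys on the top-level document handed to the indexer.

```python
from collections import defaultdict


def flatten_for_index(asset):
    """Squash one level of sub-assets into multi-valued keys.

    `asset` is a hypothetical dict with a "meta" map of plain tags and a
    "sub_assets" map from tag name to a list of sub-asset dicts.
    """
    doc = defaultdict(list)
    # Plain tags on the top-level asset become single-element lists.
    for key, value in asset["meta"].items():
        doc[key].append(value)
    # Each sub-asset contributes one value per key, in sub-asset order.
    for tag, sub_assets in asset.get("sub_assets", {}).items():
        for sub in sub_assets:
            for key, value in sub.items():
                doc[f"{tag}_{key}"].append(value)
    return dict(doc)


asset = {
    "meta": {"HOSTNAME": "web-01"},
    "sub_assets": {
        "VLAN": [
            {"ID": 10, "NAME": "frontend"},
            {"ID": 20, "NAME": "backend"},
        ],
    },
}
flat = flatten_for_index(asset)
```

In this sketch the flattened document is only good for searching. The pairing of each id with its name still lives in the sub-assets themselves, so the index does not need to preserve it.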
Currently group_id is used to add a dimension to asset meta values to group related values together. We could simply add another subgroup_id to address situations like the VLan issue.
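As a rough illustration of the extra dimension, the sketch below models meta values as rows carrying both a group_id and the proposed subgroup_id, then regroups them to recover each VLan's id/name pair. The `MetaValue` type, the `regroup` helper, and the sample rows are assumptions made for illustration, not the actual collins schema.

```python
from collections import defaultdict
from typing import NamedTuple


class MetaValue(NamedTuple):
    name: str
    value: str
    group_id: int      # distinguishes the VLan tags from other tags
    subgroup_id: int   # hypothetical new column: one value per VLan


rows = [
    MetaValue("VLAN_ID", "10", 1, 0),
    MetaValue("VLAN_NAME", "frontend", 1, 0),
    MetaValue("VLAN_ID", "20", 1, 1),
    MetaValue("VLAN_NAME", "backend", 1, 1),
]


def regroup(rows):
    """Collect rows that share (group_id, subgroup_id) into one record."""
    groups = defaultdict(dict)
    for row in rows:
        groups[(row.group_id, row.subgroup_id)][row.name] = row.value
    return [groups[key] for key in sorted(groups)]


vlans = regroup(rows)
```

Without subgroup_id, all four rows would share one key and the second VLan's values would be indistinguishable from the first's.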
Make no DB-level changes, address the issues in the app.
Now that we have Solr in place, allowing the import of arbitrary JSON should be OK, since we can pretty easily index whatever data we want into Solr without having to worry about normalizing the data for MySQL table insertion.
In order to keep asset hardware data exposed as key/values, we'd adopt a translation scheme from keys to XPath-like traversals of JSON objects. For example, "hardware/cpu_speed_ghz" or "network_interface/vlan/0/name".
Outputting the stored JSON in the API is trivial, but we'd need a little refactoring in the web app to neatly display the data.
We definitely need to keep the data normalized, since search isn't the only thing that uses the values. That being said, I think we can accept a format that will work with the underlying schema.
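One way such a translation could look, sketched under the assumption that path segments are separated by "/" and that a numeric segment indexes into a JSON array. The helper names `lookup` and `flatten_paths` and the sample document are hypothetical.

```python
def lookup(doc, path):
    """Resolve a slash-separated key against a parsed JSON document."""
    node = doc
    for part in path.split("/"):
        if isinstance(node, list):
            node = node[int(part)]  # numeric segment indexes into an array
        else:
            node = node[part]
    return node


def flatten_paths(node, prefix=""):
    """Expose every leaf of a JSON document as a path -> value pair."""
    if isinstance(node, dict):
        items = node.items()
    elif isinstance(node, list):
        items = enumerate(node)
    else:
        return {prefix: node}
    out = {}
    for key, child in items:
        child_prefix = f"{prefix}/{key}" if prefix else str(key)
        out.update(flatten_paths(child, child_prefix))
    return out


doc = {
    "hardware": {"cpu_speed_ghz": 2.4},
    "network_interface": {"vlan": [{"id": 10, "name": "frontend"}]},
}
speed = lookup(doc, "hardware/cpu_speed_ghz")
vlan_name = lookup(doc, "network_interface/vlan/0/name")
pairs = flatten_paths(doc)
```

Here `lookup` serves reads of a single key, while `flatten_paths` produces the full key/value view that the API or the indexer could consume.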
I don't think we've fully figured this out, but I'm going to start prototyping two features that I think will help solve this issue:
For example, if a JSON document looks something like
{
  "DISK" : [
    {
      "SIZE_BYTES" : 1234567890
    }
  ]
}
This will get translated into an asset meta with a name of "DISK_SIZE_BYTES" and a value of "1234567890", using the asset_meta_value.group_id to handle multiple disks. We still have that dimensionality issue, but however that gets solved, I think these two features will still be useful.
I will also re-investigate my "sub-asset" idea above and see how easy it would be to implement now that we have Solr searching fully in place.
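The translation described here could be prototyped roughly as follows, assuming each top-level key maps to an array of objects and that the array position becomes the group_id. The `to_asset_meta` function and the second disk in the sample are made up for illustration.

```python
import json


def to_asset_meta(document):
    """Translate {"SECTION": [{"KEY": value}, ...]} into meta rows.

    Each row is (name, value, group_id): the name joins section and key
    with an underscore, the value is stored as a string, and the position
    in the array is used as the group_id.
    """
    rows = []
    for section, entries in document.items():
        for group_id, entry in enumerate(entries):
            for key, value in entry.items():
                rows.append((f"{section}_{key}", str(value), group_id))
    return rows


# The second disk is an invented value, added to show group_id in use.
document = json.loads(
    '{"DISK": [{"SIZE_BYTES": 1234567890}, {"SIZE_BYTES": 987654321}]}'
)
rows = to_asset_meta(document)
```

Each disk gets its own group_id, so values belonging to the same disk stay together. Anything nested one level deeper would hit the dimensionality issue noted above.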