Merged
16 changes: 7 additions & 9 deletions central/sensor/service/pipeline/nodeinventory/pipeline.go
Contributor:

You might have nodes with the same name in an environment with multiple clusters. Considering OCP, I am unsure whether customers can do this; I would have to check. Regardless, (a) a standardized representation of a node object based on nodeDatastore.NodeString(node), and (b) ensuring the cluster is associated with it, will make these log lines survive once we start handling secured clusters that are not OCP. Also, elsewhere I've seen nodes referenced as "cluster/node", although that is not standardized everywhere. Given the above, what improvement do we get from removing the cluster name from the logs?

Contributor Author:

Using the standard node string is a good idea! The problem is that it requires having the node object, but at the place in code under discussion (central/sensor/service/pipeline/nodeinventory/pipeline.go) we do not have it yet. Only later do we query the DB and fetch the node object for the given node inventory.

I will add the IDs to the node names in places where we could not get the cluster name. Would that be okay?
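To illustrate the convention being discussed, here is a minimal sketch of a node-reference helper that falls back to the node ID when the cluster name is unavailable. The function name `nodeRef` and the "cluster/node (id: …)" layout are illustrative assumptions, not the actual `nodeDatastore.NodeString` API.

```go
package main

import "fmt"

// nodeRef renders a node reference that survives multi-cluster setups:
// it carries the cluster name when known, and always carries the node ID.
func nodeRef(clusterName, nodeName, nodeID string) string {
	if clusterName == "" {
		// Cluster unknown at this point in the pipeline: fall back to the ID.
		return fmt.Sprintf("%s (id: %s)", nodeName, nodeID)
	}
	return fmt.Sprintf("%s/%s (id: %s)", clusterName, nodeName, nodeID)
}

func main() {
	fmt.Println(nodeRef("prod", "worker-1", "abc-123"))
	fmt.Println(nodeRef("", "worker-1", "abc-123"))
}
```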

Contributor:

I will add the IDs to the node names in places where we could not get the cluster name. Would that be okay?

Yes, that was the original approach, if I remember correctly. Thanks for improving it.

@@ -63,36 +63,34 @@ func (p *pipelineImpl) Run(ctx context.Context, _ string, msg *central.MsgFromSe
if ninv == nil {
return errors.Errorf("unexpected resource type %T for node inventory", event.GetResource())
}
invStr := fmt.Sprintf("for node %s (id: %s)", ninv.GetNodeName(), ninv.GetNodeId())
log.Infof("received node inventory %s", invStr)
Contributor:

I think this INFO is helpful. Removing it reduces the visibility of a critical piece of information. The rate of inventories doesn't justify removing it. What is the rationale? Is there another log or pointer that would give people the same information?

Contributor Author:

Yes, there is a replacement for it:

  • if all goes well, we will see scanned inventory from node %q with %d components
  • in case of errors, we will see one of the errors

I moved this to debug, as otherwise the info would be printed twice.

log.Debugf("node inventory %s contains %d packages to scan from %d content sets", invStr,
nodeStr := fmt.Sprintf("(node name: %q, node id: %q)", ninv.GetNodeName(), ninv.GetNodeId())
log.Debugf("received inventory %s contains %d packages to scan from %d content sets", nodeStr,
len(ninv.GetComponents().GetRhelComponents()), len(ninv.GetComponents().GetRhelContentSets()))
if event.GetAction() != central.ResourceAction_UNSET_ACTION_RESOURCE {
log.Errorf("node inventory %s with unsupported action: %s", invStr, event.GetAction())
log.Errorf("inventory %s has unsupported action: %q", nodeStr, event.GetAction())
return nil
}
ninv = ninv.Clone()

// Read the node from the database, if not found we fail.
node, found, err := p.nodeDatastore.GetNode(ctx, ninv.GetNodeId())
if err != nil {
log.Errorf("fetching node (id: %q) from the database: %v", ninv.GetNodeId(), err)
log.Errorf("fetching node %s from the database: %v", nodeStr, err)
return errors.WithMessagef(err, "fetching node: %s", ninv.GetNodeId())
}
if !found {
log.Errorf("fetching node (id: %q) from the database: node does not exist", ninv.GetNodeId())
log.Errorf("fetching node %s from the database: node does not exist", nodeStr)
return errors.WithMessagef(err, "node does not exist: %s", ninv.GetNodeId())
}
log.Debugf("node %s found, enriching with node inventory", nodeDatastore.NodeString(node))

// Call Scanner to enrich the node inventory and attach the results to the node object.
err = p.enricher.EnrichNodeWithInventory(node, ninv)
if err != nil {
log.Errorf("enriching node %s: %v", nodeDatastore.NodeString(node), err)
return errors.WithMessagef(err, "enriching node %s", nodeDatastore.NodeString(node))
}
log.Debugf("node inventory for node %s has been scanned and contains %d results",
nodeDatastore.NodeString(node), len(node.GetScan().GetComponents()))
log.Infof("scanned inventory from node %s with %d components", nodeDatastore.NodeString(node),
len(node.GetScan().GetComponents()))

// Update the whole node in the database with the new and previous information.
err = p.riskManager.CalculateRiskAndUpsertNode(node)
16 changes: 9 additions & 7 deletions compliance/collection/intervals/intervals.go
@@ -42,23 +42,25 @@ func NewNodeScanIntervalFromEnv() NodeScanIntervals {
i := NodeScanIntervals{}
i.base = env.NodeScanningInterval.DurationSetting()
i.deviation = 0.0
if d := env.NodeScanningIntervalDeviation.DurationSetting(); d > 0 {
if d >= i.base {
absDeviation := env.NodeScanningIntervalDeviation.DurationSetting()
if absDeviation > 0 {
if absDeviation >= i.base {
i.deviation = 1
absDeviation = i.base
} else {
i.deviation = d.Seconds() / i.base.Seconds()
i.deviation = absDeviation.Seconds() / i.base.Seconds()
}
}
i.initialMax = env.NodeScanningMaxInitialWait.DurationSetting()
log.Infof("Scanning intervals: base interval: %s, maximum absolute deviation from base: %.2f%%, maximum first scan interval: %s",
i.base, i.deviation*100.0, i.initialMax)
log.Infof("Scanning intervals: base interval: %s, maximum absolute deviation from base: %s, first scan starts no later than: %s",
i.base, absDeviation, i.initialMax)
return i
}

// Initial returns the initial node scanning interval.
func (i *NodeScanIntervals) Initial() time.Duration {
interval := multiplyDuration(i.initialMax, randFloat64())
log.Infof("initial scanning in %s", interval)
log.Infof("Initial scanning in %s", interval)
return interval
}

@@ -68,6 +70,6 @@ func (i *NodeScanIntervals) Next() time.Duration {
if i.deviation > 0 {
interval = deviateDuration(interval, i.deviation)
}
log.Infof("next node scan in %s", interval)
log.Infof("Next node scan in %s", interval)
return interval
}
10 changes: 5 additions & 5 deletions compliance/collection/main.go
@@ -168,10 +168,10 @@ func manageNodeScanLoop(ctx context.Context, i intervals.NodeScanIntervals, scan
case <-ctx.Done():
return
case <-t.C:
log.Infof("starting a node scan for node %q", nodeName)
log.Infof("Scanning node %q", nodeName)
msg, err := scanNode(scanner)
if err != nil {
log.Errorf("error running scanNode: %v", err)
log.Errorf("error running node scan: %v", err)
} else {
nodeInventoriesC <- msg
}
@@ -269,10 +269,10 @@ func main() {
// Set up Compliance <-> NodeInventory connection
niConn, err := clientconn.AuthenticatedGRPCConnection(env.NodeScanningEndpoint.Setting(), mtls.Subject{}, clientconn.UseInsecureNoTLS(true))
if err != nil {
log.Errorf("Could not initialize connection to NodeInventory service. Node Scanning will be unavailable: %v", err)
log.Errorf("Disabling node scanning for this node: could not initialize connection to node-inventory container: %v", err)
}
if niConn != nil {
log.Info("Initialized NodeInventory gRPC connection")
log.Info("Initialized gRPC connection to node-inventory container")
nodeInventoryClient = scannerV1.NewNodeInventoryServiceClient(niConn)
}
}
@@ -282,7 +282,7 @@
if err != nil {
log.Fatal(err)
}
log.Info("Initialized Sensor gRPC stream connection")
log.Info("Initialized gRPC stream connection to Sensor")
defer func() {
if err := conn.Close(); err != nil {
log.Errorf("Failed to close connection: %v", err)
8 changes: 4 additions & 4 deletions sensor/common/compliance/node_inventory_handler_impl.go
@@ -89,19 +89,19 @@ func (c *nodeInventoryHandlerImpl) run() <-chan *central.MsgFromSensor {
}
if !c.centralReady.IsDone() {
// TODO(ROX-13164): Reply with NACK to compliance
log.Warnf("Received NodeInventory but Central is not reachable. Requesting Compliance to resend NodeInventory later")
log.Warnf("Received node inventory but Central is unavailable")
continue
}
if inventory == nil {
log.Warnf("Received nil NodeInventory - not sending node inventory to Central")
log.Warnf("Received nil node inventory: not sending to Central")
break
}
if nodeID, err := c.nodeMatcher.GetNodeID(inventory.GetNodeName()); err != nil {
log.Warnf("Node '%s' unknown to sensor - not sending node inventory to Central", inventory.GetNodeName())
log.Warnf("Node '%s' unknown to sensor: not sending node inventory to Central", inventory.GetNodeName())
} else {
inventory.NodeId = nodeID
metrics.ObserveReceivedNodeInventory(inventory)
log.Infof("Mapping NodeInventory name '%s' to Node ID '%s'", inventory.GetNodeName(), nodeID)
log.Debugf("Mapping node inventory name '%s' to Node ID '%s'", inventory.GetNodeName(), nodeID)
c.sendNodeInventory(toC, inventory)
}
}
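The handler change above guards the forward path: a nil inventory is dropped, an unknown node name is logged and dropped, and a known node gets its ID attached before the message goes to Central. A simplified stand-in for that guard logic (the `inventory` struct and `resolveNodeID` helper are illustrative, not the real Sensor interfaces):

```go
package main

import (
	"errors"
	"fmt"
)

type inventory struct {
	NodeName string
	NodeID   string
}

// resolveNodeID attaches the sensor-known node ID to the inventory before
// it is forwarded to Central; nil or unknown inventories are rejected.
func resolveNodeID(inv *inventory, lookup func(name string) (string, error)) error {
	if inv == nil {
		return errors.New("nil node inventory: not sending to Central")
	}
	id, err := lookup(inv.NodeName)
	if err != nil {
		return fmt.Errorf("node %q unknown to sensor: %w", inv.NodeName, err)
	}
	inv.NodeID = id
	return nil
}

func main() {
	inv := &inventory{NodeName: "worker-1"}
	err := resolveNodeID(inv, func(string) (string, error) { return "id-42", nil })
	fmt.Println(err, inv.NodeID)
}
```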