Issue 1009 calculate diff software #1305

Merged
chiiph merged 11 commits into main from issue-1009-calculate-diff-software
Jul 8, 2021

Contributor

@chiiph chiiph commented Jul 5, 2021

No description provided.

@chiiph chiiph marked this pull request as ready for review July 6, 2021 14:18
@chiiph chiiph requested a review from zwass July 6, 2021 14:18
@chiiph chiiph linked an issue Jul 6, 2021 that may be closed by this pull request
Member

@zwass zwass left a comment

Overall I think this solution is headed in the right direction, but I have a nontrivial change to propose.

Some of my assumptions about the common case:

  • There may be quite a lot of unique software throughout a fleet (potentially making loading all software quite expensive)
  • A host will typically (on most update intervals) have minor changes, if any, to the installed software.

Currently, this method loads all software for the host, and all software for all hosts, before performing the diff. What if instead in our first step we just loaded the software for the host?

Then we can diff the software received from the host with what we have stored and then look up the IDs (inserting if necessary) for any that are not yet stored for the host.

Does that make sense? What do you think of the proposal?

Comment thread server/datastore/mysql/software.go Outdated
Comment on lines +137 to +139
func (d *Datastore) generateChangesForNewSoftware(host *fleet.Host) (
	softwareChanges, error,
) {
Member

Should this also take the transaction for consistency?

Contributor Author

In order to do the selects within the tx as well, you mean?

Member

Yes.

Comment thread server/datastore/mysql/software.go Outdated
for currentKey := range current {
	if _, ok := incoming[currentKey]; !ok {
		deletesHostSoftware = append(deletesHostSoftware, allSoftware[currentKey])
		// TODO: delete from software if no host has it
Member

I think it would be fine to do this as an out of band cleanup job.

Contributor Author

chiiph commented Jul 7, 2021

> What if instead in our first step we just loaded the software for the host?
>
> Then we can diff the software received from the host with what we have stored and then look up the IDs (inserting if necessary) for any that are not yet stored for the host.
>
> Does that make sense? What do you think of the proposal?

So we would check the differences for the host's software only. If nothing changed (assumed to be the most common case), we exit early. Otherwise, we delete what we have to delete (we should already have the IDs). Then we get the IDs for what needs adding. Whatever is not there, we insert into software, and then insert into host_software.

Did I understand your idea correctly? I think it makes sense.

Member

zwass commented Jul 7, 2021

Yep, that's correct!
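
The flow agreed on above can be sketched as a pure diff step that runs before any database write. This is a minimal illustration and not the code that was merged: the type hostSoftwareDiff and the functions diffHostSoftware and unchanged are hypothetical names, and the sketch assumes the stored software is available as a map from a unique string key to a software ID, and the incoming software as a set of the same keys.

```go
package main

// hostSoftwareDiff is a hypothetical result type: what must change for one
// host after comparing stored software with what the host just reported.
type hostSoftwareDiff struct {
	DeleteIDs  []uint   // software IDs to remove from host_software
	InsertKeys []string // unique keys whose software IDs still need a lookup
}

// diffHostSoftware compares the software stored for one host (unique key ->
// software ID) with the set of keys reported by the host. Only this host's
// rows are needed; nothing fleet-wide is loaded.
func diffHostSoftware(current map[string]uint, incoming map[string]bool) hostSoftwareDiff {
	var d hostSoftwareDiff
	// Stored but no longer reported: delete. The IDs are already known.
	for key, id := range current {
		if _, ok := incoming[key]; !ok {
			d.DeleteIDs = append(d.DeleteIDs, id)
		}
	}
	// Reported but not stored: these keys need an ID lookup, and an insert
	// into software when the lookup finds nothing.
	for key := range incoming {
		if _, ok := current[key]; !ok {
			d.InsertKeys = append(d.InsertKeys, key)
		}
	}
	return d
}

// unchanged reports whether the early exit applies (the assumed common case).
func (d hostSoftwareDiff) unchanged() bool {
	return len(d.DeleteIDs) == 0 && len(d.InsertKeys) == 0
}
```

A caller would return immediately when unchanged() is true and otherwise run the delete, lookup, and insert steps inside one transaction.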

@chiiph chiiph requested a review from zwass July 8, 2021 14:20
Member

@zwass zwass left a comment

Strategy looks good! I requested some clarifications -- please make those before merging if you agree.

Comment on lines +131 to +149
var deletesHostSoftware []interface{}
deletesHostSoftware = append(deletesHostSoftware, hostID)

for currentKey := range currentIdmap {
	if _, ok := incomingBitmap[currentKey]; !ok {
		deletesHostSoftware = append(deletesHostSoftware, currentIdmap[currentKey])
		// TODO: delete from software if no host has it
	}
}
if len(deletesHostSoftware) > 1 {
	sql := fmt.Sprintf(
		`DELETE FROM host_software WHERE host_id = ? AND software_id IN (%s)`,
		strings.TrimSuffix(strings.Repeat("?,", len(deletesHostSoftware)-1), ","),
	)
	if _, err := tx.Exec(sql, deletesHostSoftware...); err != nil {
		return errors.Wrap(err, "delete host software")
	}
}
return nil
Member

What do you think of this clarification?

Suggested change
-var deletesHostSoftware []interface{}
-deletesHostSoftware = append(deletesHostSoftware, hostID)
-for currentKey := range currentIdmap {
-	if _, ok := incomingBitmap[currentKey]; !ok {
-		deletesHostSoftware = append(deletesHostSoftware, currentIdmap[currentKey])
-		// TODO: delete from software if no host has it
-	}
-}
-if len(deletesHostSoftware) > 1 {
-	sql := fmt.Sprintf(
-		`DELETE FROM host_software WHERE host_id = ? AND software_id IN (%s)`,
-		strings.TrimSuffix(strings.Repeat("?,", len(deletesHostSoftware)-1), ","),
-	)
-	if _, err := tx.Exec(sql, deletesHostSoftware...); err != nil {
-		return errors.Wrap(err, "delete host software")
-	}
-}
-return nil
+var deletesHostSoftware []uint
+for currentKey := range currentIdmap {
+	if _, ok := incomingBitmap[currentKey]; !ok {
+		deletesHostSoftware = append(deletesHostSoftware, currentIdmap[currentKey])
+		// TODO: delete from software if no host has it
+	}
+}
+if len(deletesHostSoftware) == 0 {
+	return nil
+}
+sql := fmt.Sprintf(
+	`DELETE FROM host_software WHERE host_id = ? AND software_id IN (%s)`,
+	strings.TrimSuffix(strings.Repeat("?,", len(deletesHostSoftware)), ","),
+)
+if _, err := tx.Exec(sql, hostID, deletesHostSoftware...); err != nil {
+	return errors.Wrap(err, "delete host software")
+}
+return nil

Contributor Author

+1 to the early exit of if len(deletesHostSoftware) == 0 {

The rest doesn't seem to work: deletesHostSoftware needs to be of type []interface{}, because otherwise each element needs its own conversion. And tx.Exec(sql, hostID, deletesHostSoftware...) doesn't seem to work either, sadly.

The fun fact here is that I did try those things, so we are very much aligned :)
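
Both obstacles come from Go's rules for variadic calls: a []uint cannot be passed as ...interface{} without converting each element, and a call cannot mix individually listed variadic arguments with a spread slice, so hostID has to be inside the slice. One way to keep a typed []uint at the call site and still satisfy Exec is to build the argument list in a small helper. The sketch below is an illustration only; buildDeleteHostSoftware is a hypothetical name and not part of the pull request.

```go
package main

import "strings"

// buildDeleteHostSoftware (hypothetical helper) returns the DELETE statement
// and its arguments for removing softwareIDs from one host. It returns an
// empty query and nil arguments when there is nothing to delete, so the
// caller can exit early.
func buildDeleteHostSoftware(hostID uint, softwareIDs []uint) (string, []interface{}) {
	if len(softwareIDs) == 0 {
		return "", nil
	}
	// The host ID goes first, followed by one converted element per ID.
	// This loop is the per-element conversion discussed in the thread.
	args := make([]interface{}, 0, len(softwareIDs)+1)
	args = append(args, hostID)
	for _, id := range softwareIDs {
		args = append(args, id)
	}
	// One "?" placeholder per software ID, comma separated.
	placeholders := strings.TrimSuffix(strings.Repeat("?,", len(softwareIDs)), ",")
	query := "DELETE FROM host_software WHERE host_id = ? AND software_id IN (" + placeholders + ")"
	return query, args
}
```

The returned slice can then be spread in a single step, for example tx.Exec(query, args...), which is the call shape Go accepts.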

	currentIdmap map[string]uint,
	incomingBitmap map[string]bool,
) error {
	var insertsHostSoftware []interface{}
Member

Suggested change
-var insertsHostSoftware []interface{}
+var insertsHostSoftware []uint

Contributor Author

I agree in principle, but we can't do that: when we use it later on we would have to convert it to []interface{}, which means going through the items and converting each one individually.

Comment on lines +207 to +210
selectFunc := d.db.Select
if tx != nil {
	selectFunc = tx.Select
}
Member

Nice, I like that :)
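
The pattern in that snippet works because Go method values are ordinary function values: db.Select and tx.Select share a signature, so either can be stored in one variable and the rest of the function does not need to know whether it runs inside a transaction. The sketch below reproduces the idea with stand-in types so it runs without a database. The names selecter, selectFunc, chooseSelect, and recorder are hypothetical; in the pull request the two sides are the datastore's db handle and the transaction.

```go
package main

// selectFunc is the shared shape of the two Select methods.
type selectFunc func(dest interface{}, query string, args ...interface{}) error

// selecter is a hypothetical interface covering both the plain handle and
// the transaction in this sketch.
type selecter interface {
	Select(dest interface{}, query string, args ...interface{}) error
}

// chooseSelect returns tx.Select when a transaction is supplied and falls
// back to db.Select otherwise. Pass an untyped nil for "no transaction": a
// nil pointer wrapped in the interface would not compare equal to nil.
func chooseSelect(db selecter, tx selecter) selectFunc {
	if tx != nil {
		return tx.Select
	}
	return db.Select
}

// recorder is a stand-in for a real handle; it only records which side
// received each query.
type recorder struct {
	name  string
	calls *[]string
}

func (r recorder) Select(dest interface{}, query string, args ...interface{}) error {
	*r.calls = append(*r.calls, r.name+": "+query)
	return nil
}
```

Running the selects through the transaction when one is present keeps the reads consistent with the writes that follow, which is what the earlier review comment asked for.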

Comment thread server/datastore/mysql/software.go Outdated

currentBitmap := make(map[string]bool)
for _, s := range current {
	currentBitmap[softwareToUniqueString(s)] = false
Member

I find it a bit easier to follow when true is used for sets (even though it doesn't change the behavior)

Suggested change
-currentBitmap[softwareToUniqueString(s)] = false
+currentBitmap[softwareToUniqueString(s)] = true
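
A short illustration of why true reads better for a map used as a set: a lookup on a missing key returns the zero value false, so when members are stored as true the plain index expression doubles as the membership test. With false as the stored value, only the two-value form (_, ok := m[key]) can distinguish members from non-members. The function names below are hypothetical and the software strings are made-up sample data.

```go
package main

// toSet builds a membership set, storing true for every member.
func toSet(keys []string) map[string]bool {
	set := make(map[string]bool, len(keys))
	for _, k := range keys {
		set[k] = true
	}
	return set
}

// missingFrom returns the entries of want that are not members of have.
func missingFrom(want []string, have map[string]bool) []string {
	var missing []string
	for _, k := range want {
		// have[k] is false for absent keys, so this works only because
		// members were stored as true.
		if !have[k] {
			missing = append(missing, k)
		}
	}
	return missing
}
```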

@chiiph chiiph merged commit 18fa2f6 into main Jul 8, 2021
@chiiph chiiph deleted the issue-1009-calculate-diff-software branch July 8, 2021 16:57

Successfully merging this pull request may close these issues.

Software inventory issues at larger scale