Collaborator

@shrouti1995 shrouti1995 commented Aug 13, 2025

Adding support to export conntrack metrics per zone.
Details:

  1. Implemented proper metric collection through Prometheus collector interface

@shrouti1995 shrouti1995 requested a review from Copilot August 13, 2025 10:39

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@shrouti1995 shrouti1995 marked this pull request as draft August 13, 2025 10:51
// ConntrackCollectorWithAggAccessor wraps the existing collector with access to the aggregator snapshot
type ConntrackCollectorWithAggAccessor struct {
	*conntrackCollector
	SnapshotFunc func() map[uint16]map[uint32]int

Is this used anywhere?

Collaborator Author

removed

* trying to fix the static check issue

* adding test cases

* adding mock tests

* Context-Based Cancellation Refactoring

* add centralised error propagation

* solve lint error

* modelling test cases in table format

* graceful shut down

* centralised config

* solve lint error
@shrouti1995 shrouti1995 requested a review from jcooperdo October 28, 2025 15:20
@shrouti1995 shrouti1995 self-assigned this Oct 28, 2025
Collaborator Author

@shrouti1995 shrouti1995 left a comment

@anitgandhi @jcooperdo when you get time

}

// NewZoneMarkAggregator creates a mock aggregator for testing
func NewZoneMarkAggregator() (*MockZoneMarkAggregator, error) {

this should be named NewMockZoneMarkAggregator()

Collaborator Author

done

}

// LoadConfig loads conntrack configuration from environment variables
func LoadConfig() *Config {

it's confusing to use env vars directly for config here, when the top-level exporter config is driven by CLI flags
we should be consistent and use CLI flags for everything.
this package (internal/conntrack) shouldn't really care about flags vs env vars; it should only care about its own Config type/struct and associated defaults

it should be left up to the caller to define how to change the config (env vars, flags, whatever)

Collaborator Author

applied
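
A minimal sketch of the split suggested above: the package owns a Config struct and its defaults, and the caller decides how values are overridden. The field names, default values, and flag names are hypothetical, since the PR's real Config is not shown in this thread.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// Config holds conntrack settings. Field names are hypothetical stand-ins
// for whatever internal/conntrack actually defines.
type Config struct {
	EventBufferSize int
	FlushInterval   time.Duration
}

// DefaultConfig returns the package defaults. Nothing here knows about
// flags or environment variables.
func DefaultConfig() Config {
	return Config{EventBufferSize: 4096, FlushInterval: 5 * time.Second}
}

// bindFlags belongs to the caller (for example main.go). It starts from the
// defaults already stored in cfg and lets CLI flags override them.
// The flag names are assumptions made for this sketch.
func bindFlags(fs *flag.FlagSet, cfg *Config) {
	fs.IntVar(&cfg.EventBufferSize, "conntrack.event-buffer-size",
		cfg.EventBufferSize, "size of the conntrack event buffer")
	fs.DurationVar(&cfg.FlushInterval, "conntrack.flush-interval",
		cfg.FlushInterval, "how often aggregated counts are flushed")
}

// parseConfig builds a Config from defaults plus the given arguments.
func parseConfig(args []string) (Config, error) {
	cfg := DefaultConfig()
	fs := flag.NewFlagSet("exporter", flag.ContinueOnError)
	bindFlags(fs, &cfg)
	if err := fs.Parse(args); err != nil {
		return Config{}, err
	}
	return cfg, nil
}

func main() {
	cfg, err := parseConfig([]string{"-conntrack.flush-interval=10s"})
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```

With this shape, tests can construct a Config literal directly, and switching the override mechanism later only touches the caller.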

Comment on lines 119 to 134
// Test concurrent snapshot access
done := make(chan bool, 10)
for i := 0; i < 10; i++ {
	go func() {
		snapshot := agg.Snapshot()
		if snapshot == nil {
			t.Error("Concurrent snapshot returned nil")
		}
		done <- true
	}()
}

// Wait for all goroutines
for i := 0; i < 10; i++ {
	<-done
}

could simplify this with a waitgroup

Collaborator Author

applied
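
For reference, a sketch of what the WaitGroup version could look like. The aggregator type below is a hypothetical stand-in, and its Snapshot signature is assumed from the `map[uint16]map[uint32]int` type quoted earlier in this thread; the actual test in the PR may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// aggregator is a hypothetical stand-in for the PR's ZoneMarkAggregator.
type aggregator struct {
	mu     sync.RWMutex
	counts map[uint16]map[uint32]int // zone -> mark -> count
}

// Snapshot returns a deep copy so callers never share the internal maps.
func (a *aggregator) Snapshot() map[uint16]map[uint32]int {
	a.mu.RLock()
	defer a.mu.RUnlock()
	out := make(map[uint16]map[uint32]int, len(a.counts))
	for zone, marks := range a.counts {
		cp := make(map[uint32]int, len(marks))
		for mark, n := range marks {
			cp[mark] = n
		}
		out[zone] = cp
	}
	return out
}

// concurrentSnapshots runs n Snapshot calls in parallel and reports how many
// returned nil. The WaitGroup replaces the buffered "done" channel and the
// second counting loop.
func concurrentSnapshots(a *aggregator, n int) int {
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		nils int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if a.Snapshot() == nil {
				mu.Lock()
				nils++
				mu.Unlock()
			}
		}()
	}
	wg.Wait() // blocks until every goroutine has called Done
	return nils
}

func main() {
	a := &aggregator{counts: map[uint16]map[uint32]int{1: {0x10: 3}}}
	fmt.Println("nil snapshots:", concurrentSnapshots(a, 10))
}
```

In a real test the goroutine body would call `t.Error` directly, which is safe from other goroutines as long as the test function waits for them before returning.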

cs []prometheus.Collector
mu sync.Mutex
cs []prometheus.Collector
conntrackEnabled bool

is there ever a situation where this is false but aggregator is non-nil?
i don't believe there would be, in which case this extra bool doesn't provide any value

Collaborator Author

removed
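
A small sketch of the point made above: if the bool can only be true when the aggregator is non-nil, the state can be derived from the pointer. The type and method names here are hypothetical and only illustrate the idea.

```go
package main

import "fmt"

// zoneMarkAggregator is a hypothetical stand-in for the PR's aggregator type.
type zoneMarkAggregator struct{}

// exporter mirrors the struct under review, minus the conntrackEnabled field.
type exporter struct {
	aggregator *zoneMarkAggregator // nil when conntrack metrics are disabled
}

// conntrackEnabled derives the state from the pointer, so the flag and the
// aggregator can never disagree.
func (e *exporter) conntrackEnabled() bool {
	return e.aggregator != nil
}

func main() {
	off := &exporter{}
	on := &exporter{aggregator: &zoneMarkAggregator{}}
	fmt.Println(off.conntrackEnabled(), on.conntrackEnabled())
}
```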

shrouti1995 and others added 2 commits October 29, 2025 15:57
Co-authored-by: Anit Gandhi <anitgandhi@gmail.com>
}
}

// LoadConfig loads conntrack configuration from environment variables

remove this

Collaborator Author

removed

aggregator.go would be a better name

Collaborator Author

changed

this feels outdated at this point, seeing as we don't have env vars anymore and this PR didn't have the hardcoded consts either
i'd recommend getting rid of this doc and instead making the godoc in internal/conntrack/config.go more clear by adding these details

Collaborator Author

updated

newDatapathCollector(c.Datapath.List),
}

// Create the aggregator

i don't see why we should put this in internal/ovsexporter at all
we should instead put the conntrack prometheus logic in internal/conntrack/exporter.go, and then just import it from main.go (if the additional exporter is enabled, which should be behind a new config flag bool in main.go)

Collaborator Author

my bad. added it in this way.

}

// StopWithTimeout cancels listening and closes the connection with a configurable timeout.
func (a *ZoneMarkAggregator) StopWithTimeout(timeout time.Duration) error {

does this need to be exported?

Collaborator Author

changed it

Comment on lines 345 to 349
func (a *ZoneMarkAggregator) GetError() error {
	// This is a non-blocking way to check if there are any errors
	// The actual error handling happens in Stop()
	return nil
}

looks like a no-op, remove if it's not going to be useful

// Additional generic netlink family collectors can be added here.
newDatapathCollector(c.Datapath.List),
},
collectors := []prometheus.Collector{

can revert this

Collaborator Author

done

this file should be in internal/conntrack right?

Collaborator Author

cleaned it up.

Comment on lines 88 to 92
case syscall.SIGHUP:
	log.Printf("Received SIGHUP, reloading config...")
	// TODO: Add config reload logic here
	log.Printf("Config reloaded")
	return

if we're not going to support this now (which i don't think we really need to), let's just not handle SIGHUP

	// TODO: Add config reload logic here
	log.Printf("Config reloaded")
	return
case syscall.SIGQUIT:

doesn't seem like we're actually treating SIGQUIT any differently than the other signals in practice

also, generally handling SIGQUIT gracefully is a bit weird. The Go runtime already does the "correct" thing, which is to dump a stack trace and exit.

let's remove handling for this, and just leave the standard SIGINT and SIGTERM for graceful termination
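
A sketch of the reduced signal handling described above, using `signal.NotifyContext` from the standard library for SIGINT and SIGTERM only. The `shutdown` helper, the grace period, and the cleanup step are assumptions made for illustration; the exporter's real shutdown path is not shown in this thread.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/signal"
	"syscall"
	"time"
)

// shutdown runs stop and waits at most grace for it to return.
// It is a hypothetical helper, not a function from the PR.
func shutdown(stop func() error, grace time.Duration) error {
	errc := make(chan error, 1) // buffered so the goroutine never blocks on send
	go func() { errc <- stop() }()
	select {
	case err := <-errc:
		return err
	case <-time.After(grace):
		return errors.New("shutdown timed out")
	}
}

func main() {
	// A real exporter would use context.Background() as the parent. The short
	// timeout only lets this sketch exit by itself when no signal arrives.
	parent, cancel := context.WithTimeout(context.Background(), 200*time.Millisecond)
	defer cancel()

	// Only SIGINT and SIGTERM are intercepted. SIGQUIT keeps the Go runtime's
	// default behaviour (stack dump and exit), and SIGHUP is left alone too.
	ctx, stop := signal.NotifyContext(parent, syscall.SIGINT, syscall.SIGTERM)
	defer stop()

	<-ctx.Done() // blocks until a signal arrives or the demo timeout fires

	err := shutdown(func() error {
		fmt.Println("stopping conntrack aggregator") // hypothetical cleanup step
		return nil
	}, time.Second)
	fmt.Println("shutdown error:", err)
}
```

Both signals cancel the same context, so there is a single graceful path and no per-signal switch to maintain.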

shrouti1995 and others added 2 commits November 3, 2025 18:40
Co-authored-by: Anit Gandhi <anitgandhi@gmail.com>

4 participants