-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
*: support auto-compaction with finer granularity #8563
*: support auto-compaction with finer granularity #8563
Conversation
Some manual test result: test 5s interval:
test no compaction:
test auto-compaction doesn't compact at every 1
|
if the approach seems fine, I'll refactor the tests for compactor as it is currently tightly couple with compactor's implementation. |
compactor/periodic.go
Outdated
if err == nil || err == mvcc.ErrCompacted { | ||
plog.Noticef("Finished auto-compaction at revision %d", rev) | ||
} else { | ||
plog.Noticef("Failed auto-compaction at revision %d (%v)", err, rev) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
, rev, err
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch
b0e83d6
to
ce4d657
Compare
compactor/periodic.go
Outdated
continue | ||
} | ||
|
||
rev, remaining := t.getRev(t.periodInHour) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not necessary to remove this to support the feature and removing it completely changes the behavior of the periodic compactor
compactor/periodic.go
Outdated
plog.Noticef("Failed auto-compaction at revision %d (%v)", err, rev) | ||
plog.Noticef("Retry after %v", checkCompactionInterval) | ||
t.mu.RUnlock() | ||
rev := t.rg.Rev() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is wrong. we should keep the previous logic to compact the revision of the past last x duration.
ce4d657
to
a3364ce
Compare
@xiang90 I changed the compactor logic with following:
|
why do we change the previous logic? i feel all we need to do there is to change the wait time from a fix x hours to some user given timeout, no? |
@xiang90 the previous logic has a |
There are a few issues with your approach, which the old approach addressed.
I would suggest you understand how the previous code handles failures before simplifying it. The rev recording part can be simplified though. |
@xiang90 i see. let me rethink about the error handling part. |
a3364ce
to
0d3bea6
Compare
@xiang90 i add fast retry behavior into the code. |
this is getting more complicated. i am wondering why not keep the exact previous logic and change the checkInterval to min(user-specified-timeout/10, 1*second) or something. |
@xiang90 checking every |
99% people will compact > 1 minute interval. that is max 0.16 read request / second, which is almost nothing. |
ff350b5
to
da7f6d7
Compare
@xiang90 redone the pr with previous approach. PTAL edit: Thing to notice. the original code compacts every hour after T > retentionPeriod. That must be change for <1 hour compaction retention. So I change the compaction frequency to every |
da7f6d7
to
e88267f
Compare
compactor/periodic.go
Outdated
} | ||
} | ||
|
||
// N divides Periodic.period in into checkCompactionInterval duration | ||
var N = 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not a const? why make this public?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably name this to something more descriptive too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, i'll come up with a better name.
compactor/periodic.go
Outdated
@@ -66,31 +70,27 @@ func (t *Periodic) Run() { | |||
case <-t.ctx.Done(): | |||
return | |||
case <-clock.After(checkCompactionInterval): | |||
t.mu.Lock() | |||
t.mu.RLock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why change to use rlock here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i noticed that we are just reading t.paused
. so I used rlock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the critical area is super tight and not accessed frequently. it probably not worths changing it.
if you want to make it more efficient, do it in another PR. Let us keep one PR for one thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alright then
@@ -113,8 +113,8 @@ func (t *Periodic) Resume() { | |||
t.paused = false | |||
} | |||
|
|||
func (t *Periodic) getRev(h int) (int64, []int64) { | |||
i := len(t.revs) - int(time.Duration(h)*time.Hour/checkCompactionInterval) | |||
func (t *Periodic) getRev() (int64, []int64) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pass in n? probably we do not need the global defined const n
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the n
can be used in testing. that's why I had a const n
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah. ok. makes sense.
e88267f
to
e8b35de
Compare
@fanminshi Fix Travis CI failures? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good in general. Thanks for handling the backward compatibility of the flags carefully.
Just a couple comments on code structure and documentation.
etcdmain/config.go
Outdated
fs.IntVar(&cfg.AutoCompactionRetention, "auto-compaction-retention", 0, "Auto compaction retention for mvcc key value store. 0 means disable auto compaction.") | ||
fs.StringVar(&cfg.AutoCompactionMode, "auto-compaction-mode", "periodic", "Interpret 'auto-compaction-retention' as hours when 'periodic', as revision numbers when 'revision'.") | ||
fs.StringVar(&cfg.AutoCompactionRetention, "auto-compaction-retention", "0", "Auto compaction retention for mvcc key value store. 0 means disable auto compaction.") | ||
fs.StringVar(&cfg.AutoCompactionMode, "auto-compaction-mode", "periodic", "'periodic' means hours if an integer or a duration string otherwise, 'revision' means revision numbers to retain by auto compaction") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found the Interpret 'auto-compaction-retention' as..
preamble helpful. It directly relates the two flags. Keep it? Maybe:
Interpret 'auto-compaction-retention' one of: periodic|revision. 'periodic' for duration based retention, defaulting to hours if no time unit is provided (e.g. '5m'). 'revision' for revision number based retention.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
your suggestion seems clearer. I'll change to that.
@@ -73,24 +77,20 @@ func (t *Periodic) Run() { | |||
continue | |||
} | |||
} | |||
|
|||
if clock.Now().Sub(last) < executeCompactionInterval { | |||
if clock.Now().Sub(last) < t.period { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we simplify the timer logic a bit here? I found the logic a bit difficult to follow with the case <-clock.After(checkCompactionInterval):
select case and the periodDivisor
combined together.
Could we instead do something like
select {
...
case <-clock.After(t.period - clock.Now().Sub(last)):
...
and then eliminate this if clock.Now().Sub(last) < t.period {
check and the periodDivisor
?
(let me know if I'm missing an important subtlety to how the timers are being use here...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jpbetz I am trying to minimize any change. we can refactor the code in a future pr if you want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.
@@ -73,24 +77,20 @@ func (t *Periodic) Run() { | |||
continue | |||
} | |||
} | |||
|
|||
if clock.Now().Sub(last) < executeCompactionInterval { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we're improving this function. Can move the remaining code in the for loop into the case <-clock.After
for readability? It's only ever run as part of that case
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we refactor the code in a future pr?
e8b35de
to
8cc9995
Compare
made the |
@@ -29,8 +29,7 @@ var ( | |||
) | |||
|
|||
const ( | |||
checkCompactionInterval = 5 * time.Minute | |||
executeCompactionInterval = time.Hour | |||
checkCompactionInterval = 5 * time.Minute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this const can also be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checkCompactionInterval
is used Revision
compactor.
https://github.com/coreos/etcd/blob/master/compactor/revision.go#L64
I don't think it should be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
8cc9995
to
c38b923
Compare
compactor/periodic.go
Outdated
if rev < 0 { | ||
continue | ||
} | ||
|
||
plog.Noticef("Starting auto-compaction at revision %d (retention: %d hours)", rev, t.periodInHour) | ||
plog.Noticef("Starting auto-compaction at revision %d (retention: %v )", rev, t.period) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/(retention: %v )/(retention: %v)/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
c38b923
to
b4297d4
Compare
etcdmain/help.go
Outdated
@@ -99,7 +99,7 @@ clustering flags: | |||
--auto-compaction-retention '0' | |||
auto compaction retention length. 0 means disable auto compaction. | |||
--auto-compaction-mode 'periodic' | |||
'periodic' means hours, 'revision' means revision numbers to retain by auto compaction | |||
Interpret 'auto-compaction-retention' one of: periodic|revision. 'periodic' for duration based retention, defaulting to hours if no time unit is provided (e.g. '5m'). 'revision' for revision number based retention. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interpret
to make casing consistent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sgtm
b4297d4
to
2532594
Compare
h int | ||
) | ||
// AutoCompactionRetention defaults to "0" if not set. | ||
if len(cfg.AutoCompactionRetention) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added a check see if AutoCompactionRetention is not set.
merge when green. |
_, err := t.c.Compact(t.ctx, &pb.CompactionRequest{Revision: rev}) | ||
if err == nil || err == mvcc.ErrCompacted { | ||
t.revs = remaining | ||
last = clock.Now() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why we delete this line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good a catch. I don't think this is meant to be deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, it actually causes a bug as #9443
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I dig into on why last = clock.Now()
is deleted again. It is explained:
I change the compaction frequency to every checkCompactionInterval when T > retentionPeriod.
#8563 (comment)
I think that's cause.
fixes #8503