
Broker truncation #2276

Merged: 9 commits, Apr 15, 2015

Conversation

@otoolep (Contributor) commented Apr 14, 2015

This change implements basic topic truncation on Brokers. It does this by periodically checking each topic's size; if a topic is larger than a configured threshold and its data has been replicated, the oldest segments are deleted until the topic is back under the limit.

The change also exposes the maximum topic and segment sizes via configuration, since some deployments may want to tune these numbers.

I decided to implement this via a goroutine, so that the function checking topic sizes can simply walk the filesystem and not worry too much about efficiency vis-à-vis caching sizes in RAM. I think a design whereby truncation is performed every few minutes is fine (it's how Apache Kafka works), and this approach seems simpler and more robust.

Remaining:

  • letting data nodes know when they must redirect to other nodes for topic data.

@otoolep force-pushed the broker_truncation branch 2 times, most recently from a725955 to dc896ce on April 14, 2015 02:02
@@ -24,6 +24,12 @@ import (
// only occurs when the reader is at the end of all the data.
const DefaultPollInterval = 100 * time.Millisecond

// DefaultMaxTopicSize is the largest a topic can get before truncation.
const DefaultMaxTopicSize = 1024 * 1024 * 1024 // 10MB
@jwilder (Contributor) commented on the diff:

//1GB

@otoolep (Contributor, Author) commented Apr 14, 2015

Thanks @jwilder -- comment was a copy 'n' paste error.

@otoolep (Contributor, Author) commented Apr 14, 2015

Thanks for feedback @jwilder -- updated.

@otoolep force-pushed the broker_truncation branch 3 times, most recently from 207b900 to 13dfdc6 on April 14, 2015 03:23
@jwilder (Contributor) commented Apr 14, 2015

LGTM. I'm a little concerned about how frequently ReadSegments is called, since it does a Readdir(0) each time. It's called to get the topic Size() and then, in a loop, to get each segment's size for every topic. If there are a lot of topics to truncate, this might cause sporadic performance issues. I don't think you need to change it now, but we might want to see how it performs with a large number of topics.

@otoolep force-pushed the broker_truncation branch 3 times, most recently from f5f78f9 to 24196e7 on April 14, 2015 04:44
@otoolep (Contributor, Author) commented Apr 14, 2015

@jwilder -- I don't think it would make a huge impact, but it's not great, I agree. Take a look at the modified implementation. I think it's safe, since the topic is locked.

@otoolep force-pushed the broker_truncation branch 2 times, most recently from 0309647 to 2413ad6 on April 14, 2015 04:53
return
}
nBytesDeleted += size
segments = segments[1 : len(segments)-1]
@jwilder (Contributor) commented on the diff:

I think this is slicing an item off on both ends. Shouldn't it be segments[1:]?

@otoolep (Contributor, Author) replied:

Yeah, it should be. :-) That's what you get when you don't unit-test. :-)
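The bug in question comes down to Go slice-expression semantics; a standalone illustration:

```go
// Illustrates the two slice expressions discussed in the review above.
package main

import "fmt"

func main() {
	segments := []string{"s0", "s1", "s2", "s3"}

	// segments[1 : len(segments)-1] drops an element from BOTH ends.
	both := segments[1 : len(segments)-1]
	fmt.Println(both) // [s1 s2]

	// segments[1:] drops only the first (oldest) element, which is
	// what the truncation loop needs.
	oldestDropped := segments[1:]
	fmt.Println(oldestDropped) // [s1 s2 s3]
}
```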

@jwilder (Contributor) commented Apr 14, 2015

Yeah, that looks better.

@otoolep (Contributor, Author) commented Apr 14, 2015

OK, updated so there is only a single read of the topic directory. Because the topic is locked during truncation and writes also take the lock, the segment list can't change underneath it.

@otoolep force-pushed the broker_truncation branch 9 times, most recently from d021b6f to b6d68bb on April 14, 2015 22:12
@@ -691,6 +744,13 @@ func (t *Topic) IndexForURL(u url.URL) uint64 {
return t.indexByURL[u]
}

// SetIndexForURL sets the replicated index for a given data URL.
func (t *Topic) SetIndexForURL(index uint64, u url.URL) {
@otoolep (Contributor, Author) commented on the diff:

Exported this so I could easily unit-test.

If a data node requests a topic index that is earlier than is present for a topic, a tombstone lets the broker know that the data node should be redirected to another node that already has the topic's data replicated. If no tombstone exists, the broker can simply restart replaying the topic data it has.
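The tombstone check described in that commit message might look roughly like this sketch. `Tombstone` and `canReplay` are illustrative names for the idea, not the PR's actual API:

```go
// Hypothetical sketch of the tombstone redirect decision.
package main

import "fmt"

// Tombstone records the lowest index still available for a topic after
// truncation (an illustrative type, not the PR's actual one).
type Tombstone struct {
	LowestIndex uint64
}

// canReplay reports whether the broker still holds data from the
// requested index. If not, the data node must be redirected to a peer
// that has the topic fully replicated.
func canReplay(ts *Tombstone, requested uint64) bool {
	if ts == nil {
		return true // no truncation has occurred; replay local data
	}
	return requested >= ts.LowestIndex
}

func main() {
	ts := &Tombstone{LowestIndex: 100}
	fmt.Println(canReplay(ts, 150)) // true: broker can replay
	fmt.Println(canReplay(ts, 50))  // false: redirect to another node
}
```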
return
}
}
}()
A reviewer (Contributor) commented on the diff:

Can you move this to a named function so it's isolated? Also, the Open() is already kind of big.

@otoolep (Contributor, Author) replied:

Sure, was on the fence about this.

@benbjohnson (Contributor) commented:

Overall LGTM, mostly minor nits.

@otoolep (Contributor, Author) commented Apr 14, 2015

Thanks @benbjohnson -- maybe I'll just scrap the named returns. Might be clearer.

@otoolep (Contributor, Author) commented Apr 14, 2015

Named returns scrapped, simpler is better. Green build.

otoolep added a commit that referenced this pull request Apr 15, 2015
@otoolep merged commit ba90815 into master Apr 15, 2015
@otoolep deleted the broker_truncation branch April 15, 2015 00:01
3 participants