ACCUMULO-4752 Create documentation on improving performance #46

Merged
merged 6 commits into from Dec 8, 2017

mikewalch
Member

No description provided.

@mikewalch mikewalch changed the title ACCUMULO-4752 Create documentation on improving peformance ACCUMULO-4752 Create documentation on improving performance Dec 6, 2017

1. Decrease the [major compaction ratio][compaction] of a table to decrease the number of
files per tablet. Fewer files reduce read latency.

Contributor

Adjusting table.file.compress.blocksize and table.file.compress.blocksize.index can also be helpful. Lowering table.file.compress.blocksize can result in better random seek performance. However, it also increases the size of the index in each file. If the indexes are too large to fit in cache, this can hinder performance. Also, as the index grows, the depth of the index tree in each file may increase. Increasing table.file.compress.blocksize.index can reduce the depth of the tree.
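To make the trade-off in this comment concrete, here is a rough back-of-the-envelope sketch. It is not how RFile actually lays out its index: it assumes one index entry per data block, a fixed hypothetical size per index entry (`bytesPerEntry`), and a uniform fan-out per index block. The class and method names, the 1 GiB file size, and the block sizes are all made up for illustration.

```java
public class RFileIndexSketch {

    // Assumption: one index entry per data block, so entries ~= ceil(file / block).
    public static long indexEntries(long fileBytes, long dataBlockBytes) {
        return (fileBytes + dataBlockBytes - 1) / dataBlockBytes;
    }

    // Assumption: every index block holds indexBlockBytes / bytesPerEntry entries.
    // Depth is the number of index levels needed to address all entries.
    public static int indexDepth(long entries, long indexBlockBytes, long bytesPerEntry) {
        long fanout = Math.max(2, indexBlockBytes / bytesPerEntry);
        int depth = 1;
        long capacity = fanout;
        while (capacity < entries) {
            capacity *= fanout;
            depth++;
        }
        return depth;
    }

    public static void main(String[] args) {
        long file = 1L << 30;     // hypothetical 1 GiB file
        long bytesPerEntry = 64;  // hypothetical average index entry size

        long coarse = indexEntries(file, 64 * 1024); // larger data blocks
        long fine = indexEntries(file, 8 * 1024);    // smaller data blocks, better seeks

        System.out.println("entries with 64K data blocks: " + coarse);
        System.out.println("entries with 8K data blocks:  " + fine);

        // Same small index block size: the finer index needs an extra level.
        System.out.println("depth, 8K index blocks, coarse: "
                + indexDepth(coarse, 8 * 1024, bytesPerEntry));
        System.out.println("depth, 8K index blocks, fine:   "
                + indexDepth(fine, 8 * 1024, bytesPerEntry));

        // Larger index blocks raise the fan-out and flatten the tree again.
        System.out.println("depth, 64K index blocks, fine:  "
                + indexDepth(fine, 64 * 1024, bytesPerEntry));
    }
}
```

Under these assumptions, shrinking the data block size by 8x grows the index by 8x, which is the cache-pressure concern raised above, and a larger index block size is what brings the tree depth back down.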


1. Increase the [major compaction ratio][compaction] of a table to limit the number of major compactions,
which improves ingest performance.
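For readers wondering why the ratio cuts both ways, the sketch below models the selection rule as I understand it from the Accumulo user manual: a set of files is compacted when the largest file's size times the ratio is at most the total size of the set; otherwise the largest file is dropped and the remaining files are reconsidered. This is a simplified illustration, not Accumulo's actual compaction code, and the class name, method name, and file sizes are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class CompactionRatioSketch {

    // Returns the file sizes that would be compacted together, or an empty
    // list when no subset satisfies: largest * ratio <= sum of the set.
    public static List<Long> selectFiles(List<Long> fileSizes, double ratio) {
        List<Long> candidates = new ArrayList<>(fileSizes);
        candidates.sort(Collections.reverseOrder()); // largest file first
        while (candidates.size() > 1) {
            long largest = candidates.get(0);
            long sum = 0;
            for (long size : candidates) {
                sum += size;
            }
            if (largest * ratio <= sum) {
                return candidates;
            }
            candidates.remove(0); // drop the largest file and try the rest
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        // Hypothetical file sizes for one tablet, in MB.
        List<Long> sizes = Arrays.asList(100L, 40L, 30L, 20L);

        // Lower ratio: compaction triggers readily, leaving fewer files to read.
        System.out.println("ratio 1.5 compacts: " + selectFiles(sizes, 1.5));

        // Higher ratio: compaction is deferred, leaving more I/O for ingest.
        System.out.println("ratio 3.0 compacts: " + selectFiles(sizes, 3.0));
    }
}
```

With these sizes, a ratio of 1.5 selects all four files, while a ratio of 3.0 selects none, so the tablet keeps four files but does no compaction work.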

Contributor

Setting table.file.compress.type=snappy can increase write speed at the expense of using more disk space.

@keith-turner
Contributor

@mikewalch the picture looks great. A metablock contains all of the root nodes for each locality group, so the picture is slightly off on this point.

@mikewalch
Member Author

@keith-turner, I updated the picture. Let me know if that works.

@mikewalch mikewalch merged commit 0a525d5 into apache:master Dec 8, 2017
@mikewalch mikewalch deleted the accumulo-4752 branch December 8, 2017 21:58
asfgit pushed a commit that referenced this pull request Dec 8, 2017
ACCUMULO-4752 Create documentation on improving performance (#46)

* Also, created documentation on RFile along with diagram
2 participants