HBASE-24455 Correct the doc of "On the number of column families" (#1799)

Signed-off-by: Wellington Ramos Chevreuil <wchevreuil@apache.org>
Signed-off-by: Viraj Jasani <vjasani@apache.org>
bsglz committed Jun 1, 2020
1 parent f5b90fc commit 716702a
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions src/main/asciidoc/_chapters/schema_design.adoc
@@ -127,8 +127,9 @@ ____
 == On the number of column families
 HBase currently does not do well with anything above two or three column families so keep the number of column families in your schema low.
-Currently, flushing and compactions are done on a per Region basis so if one column family is carrying the bulk of the data bringing on flushes, the adjacent families will also be flushed even though the amount of data they carry is small.
-When many column families exist the flushing and compaction interaction can make for a bunch of needless i/o (To be addressed by changing flushing and compaction to work on a per column family basis). For more information on compactions, see <<compaction>>.
+Currently, flushing is done on a per Region basis so if one column family is carrying the bulk of the data bringing on flushes, the adjacent families will also be flushed even though the amount of data they carry is small.
+When many column families exist the flushing interaction can make for a bunch of needless i/o (To be addressed by changing flushing to work on a per column family basis).
+In addition, compactions triggered at table/region level will happen per store too.
 Try to make do with one column family if you can in your schemas.
 Only introduce a second and third column family in the case where data access is usually column scoped; i.e.
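The per-Region flushing described in the changed text can be illustrated with a toy model. This is only a sketch of the behaviour as the doc words it, not HBase code: the function name `simulate_region_flushes`, the family names, the write sizes and the threshold of 100 are all hypothetical, and real HBase flush policies are configurable and more involved.

```python
def simulate_region_flushes(writes, flush_threshold):
    """Toy model (assumption, not HBase's implementation): when the region's
    total in-memory size reaches the threshold, every column family is
    flushed, producing one file per family that holds any data."""
    memstore = {}        # family name -> bytes currently held in memory
    flushed_files = []   # (family, bytes) for each file written by a flush
    for family, size in writes:
        memstore[family] = memstore.get(family, 0) + size
        if sum(memstore.values()) >= flush_threshold:
            # Region-level flush: the small family is written out as well.
            for fam, held in memstore.items():
                if held > 0:
                    flushed_files.append((fam, held))
            memstore = {fam: 0 for fam in memstore}
    return flushed_files


# One family carries almost all of the data; the other receives tiny writes.
writes = [("big", 60), ("small", 1)] * 4
files = simulate_region_flushes(writes, flush_threshold=100)
for fam, held in files:
    print(fam, held)
```

In this model every flush caused by the `big` family also writes a tiny file for `small`, which is the needless I/O the paragraph refers to.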
