table_prefix assumes underscore separator #3

Closed
johnsto opened this Issue Jun 4, 2012 · 3 comments

@johnsto
johnsto commented Jun 4, 2012

The table_prefix parameter on Connection is incredibly useful, except it mandates that you use the underscore character as a separator. This is a problem in situations where you don't want any character after your prefix (e.g. because the prefix has a set length), or indeed want to use a different separator.

I propose a change that requires users to specify the separator of their choice as part of the prefix. The effort required for users to update their code would be minimal, and the code would benefit from not having a "table_prefix_separator"!

-d

@wbolster
Owner
wbolster commented Jun 4, 2012

Your request sounds reasonable. However, I think the common case is the one where you do want a separator for legibility reasons, so having a table_prefix_separator doesn't look that bad to me. Additionally, it wouldn't break existing code. (Although I doubt that there is much stable production code using HappyBase, since it's still a really young project.)

If the separator is part of the prefix, users end up with ugly strings like "foo_" in their code in the common case. HappyBase was designed to hide ugly code (Thrift!) after all!

@wbolster wbolster added a commit that referenced this issue Jun 4, 2012
@wbolster Add table_prefix_separator argument to Connection()
Also cleanup error handling in the constructor. Closes issue #3.
45db2b5
@wbolster wbolster closed this Jun 4, 2012
@johnsto
johnsto commented Jun 5, 2012

Cheers for fixing this! I still think Connection(table_prefix="cleese_") reads better than Connection(table_prefix="cleese", table_prefix_separator="_"), but I guess that's personal taste :)

@wbolster
Owner
wbolster commented Jun 5, 2012

Your example is not realistic, since the table_prefix_separator argument is optional. Your example would rather look like this:

c = Connection(table_prefix="cleese")

…which I think is rather clean. The code only becomes a bit more verbose if you want a different separator.
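
To make the naming behaviour discussed here concrete, below is a minimal sketch of how a prefix, a separator, and a table name could be combined. The helper `prefixed_table_name` and the table name `sketches` are hypothetical and only for illustration; this is not HappyBase's actual code. It assumes the separator defaults to an underscore, as described in this thread.

```python
def prefixed_table_name(name, table_prefix=None, table_prefix_separator="_"):
    """Hypothetical helper: build the full table name from an optional prefix.

    Assumes the default separator is an underscore, as discussed in this issue.
    """
    if table_prefix is None:
        # No prefix configured: the table name is used unchanged.
        return name
    return table_prefix + table_prefix_separator + name


# Common case: only the prefix is given, the underscore is implied.
print(prefixed_table_name("sketches", table_prefix="cleese"))

# Uncommon cases: no separator at all, or a different one.
print(prefixed_table_name("sketches", table_prefix="cleese", table_prefix_separator=""))
print(prefixed_table_name("sketches", table_prefix="cleese", table_prefix_separator="-"))
```

With this shape, the common case stays short, while an empty string covers the fixed-length prefix case from the original request.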

Btw, I don't see why one would not want an underscore separator between the prefix and the table name itself. Having an underscore in the name makes the web status reports for master and region nodes easier to read, and also the output of commands like list in the HBase shell. If one is forced to work with existing tables, things may be different, but friendly code should make the common case easy, and the uncommon case possible. Anyway, I agree with PEP20 that beautiful is better than ugly. :)

Note that the HBase storage model does not penalise long table names, in contrast to long column names. Column names (both family and qualifier) are written to disk for each KeyValue instance, whereas the table names are only used as directory names and contained in keys in some of the metadata tables.

@wbolster wbolster was assigned Jun 5, 2012