Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cli: Redefine Stores Flag #4577

Closed
BramGruneir opened this issue Feb 22, 2016 · 20 comments
Closed

cli: Redefine Stores Flag #4577

BramGruneir opened this issue Feb 22, 2016 · 20 comments
Assignees
Milestone

Comments

@BramGruneir
Copy link
Member

Split the --stores flag into having multiple --store flags.

The following is old, see Ben's comment below to see the format we plan to use.

So the new flag will be used once per store, and contain the following values, in order:

  • attributes always optional, separated by commas
  • size optional unless using an in memory store
  • location optional if defining and in memory store

So some examples:
// attributes, size and location
--store=abc1,def2,20%=/mnt/hda1

// attribute, size and location
--store=abc1,200000=/mnt/hda1

// only size and location
--store=50%=/mnt/hda1

// only attribute and size (in memory)
--store=abc1,200000

// only size (in memory)
--store=34%

// attributes with no size
--store=abc1,def2=/mnt/ssd1

So one limitation of this format is that you can't have an all numeric attribute without defining a size. But this format doesn't preclude them, just define the size as 100%.
--store=12345,100%=/mnt/ssd2

This format does mean that you can't use commas as attributes, but I think that's reasonable. Currently, the colon is used instead.

This format is not set in stone, if anyone can think of a better way to define this flag, please reply to this issue.

The implementation of this new format is being tracked in issue #4095.

@BramGruneir BramGruneir added this to the Beta milestone Feb 22, 2016
@petermattis petermattis changed the title CLI: Redefine Stores Flag cli: Redefine Stores Flag Feb 22, 2016
@tbg
Copy link
Member

tbg commented Feb 22, 2016

I think it should also be possible to specify sizes in human-readable formats (mb, gb) and I would avoid the percentage symbol (bad for scripting). Percentages can be specified as .5 (for 50%) and I think this should be self-explanatory if you see it.

I don't like muddling up the size with the attributes and special-casing based on its being numeric, but somehow an extra separator doesn't make it better.

--store abc,def,.8=ssd
--store abc,def,50gb=ssd
--store 50gb=/tmp/data
--store abc,def=/tmp/data
--store /tmp/data

Looks good to me.

@petermattis
Copy link
Collaborator

It feels awkward for the size and labels to be intermixed like that. Perhaps the size should be distinguished with different syntax:

--store abc=/tmp/data:50%
--store def=/tmp/data2:1gb

@BramGruneir
Copy link
Member Author

@petermattis By putting the amount after the directory, it limits the directory syntax.
What about the special cases?
In memory with attribute?
In memory without attribute?

@petermattis
Copy link
Collaborator

Is limiting the directory syntax a problem? We could use a rarer delimiter like @ or [:

--store /tmp/data@50%
--store /tmp/data[50%]

@tbg
Copy link
Member

tbg commented Feb 22, 2016

We discussed that it's a bad idea to have anything after the path since pretty much anything is a legit path name.

@tbg
Copy link
Member

tbg commented Feb 22, 2016

I like attrs=/path=asdasd@size though. Maybe we can just limit the allowed paths.

@bdarnell
Copy link
Contributor

We could also allow the delimiter to be escaped so that it wouldn't be completely disallowed.

My preference would be to demote attributes since they're something of an advanced feature, so you'd specify attributes with attr=abc,def and the size could be size=50%. I'd probably even specify the path with a path= prefix when combined with any other syntax (so there would just be a special case for the case where the entire --store flag is a single path)

The examples become:

--store=attr=abc1,def2;size=20%;path=/mnt/hda1
--store=attr=abc1;size=200000;path=/mnt/hda1
--store=size=50%;path=/mnt/hda1
--store=attr=abc1;size=200000;type=mem  # in-memory usage is rare enough that it should be requested explicitly instead of defaulted
--store=size=34%;type=mem
--store=attr=abc1,def2;path=/mnt/ssd1

As long as we keep the simple case of --store /mnt/sda1 working, I don't think we need to push too hard for conciseness in the more complex cases that use attributes or size limits. It's better to have a simple, regular syntax.

@petermattis
Copy link
Collaborator

Ben's proposal is fine by me.

@BramGruneir
Copy link
Member Author

I'm leaning toward's Ben's proposal as well. I like the explicit nature of it.
Also, shorthand -s instead of always having to write --store.

@jseldess
Copy link
Contributor

I looked through the mongo, cassandra, and hbase docs to see how they handle such settings. Really hard to track it all down! I'll look more tomorrow, but I find Ben's proposal a lot more intuitive than what I've been seeing.

@tbg
Copy link
Member

tbg commented Feb 23, 2016

I like @bdarnell's proposal as well, but I'm still in favor of supporting human-readable sizes (at least to some extent: 100gb, 55mb, not necessarily 5.1251mb).

@petermattis
Copy link
Collaborator

@tschottdorf Agreed about human-readable input. See #4348.

@BramGruneir
Copy link
Member Author

I like human readable as well. I found a library https://github.com/dustin/go-humanize that I'm going to be referencing to do the parsing.

@tbg
Copy link
Member

tbg commented Feb 23, 2016

Go-humanize does parsing?

On Tue, Feb 23, 2016, 00:44 Bram Gruneir notifications@github.com wrote:

I like human readable as well. I found a library
https://github.com/dustin/go-humanize that I'm going to be referencing to
do the parsing.


Reply to this email directly or view it on GitHub
#4577 (comment)
.

@petermattis
Copy link
Collaborator

@BramGruneir
Copy link
Member Author

Any final comments?
The plan right now is to implement Ben's style with human-readable values.

@petermattis
Copy link
Collaborator

Do it.

@JackKrupansky
Copy link

For reference, Cassandra supports the JBOD model for storage - Just a Bunch Of Disks. The data_file_directories parameter in the cassandra.yaml configuration file can specify a list if directories, typically one per storage device, with no other settings for each directory/device.

Then it gets messy...

Prior to release 3.2, on each memtable/sstable flush it will select the directory (device) which has the most available space as a percentage of the total available space on all of the listed directories/devices. Actually, it uses a random weighted value so it won't always pick the directory/device with the most space, one goal being to balance writes for performance.

As of release 3.2, Cassandra simply splits the range of tokens (the partition key subset of the primary key is hashed to produce a token) to even cover each listed storage device.

Since 3.2, there currently isn't any compensation for differences in capacity or free space on the individual devices.

The important thing since 3.2 was to assure that the sstables for a given token would not be split across devices since that would open the door to zombie data - deleted data on one device which could come back to life if the device containing its tombstone were to be lost and the node (partially) repaired from another node.

There are probably also still some cases where the pre-3.2 behavior is still used (if the configured partitioner does not support a token range splitter, for example, but that would not be common.)

3.2 was released on January 12, 2016.

@BramGruneir
Copy link
Member Author

So I need to make a small change to the format. Going with a semicolon to separate the parameters looks good on paper... but semicolons don't work well in bash.
--store=attr=abc1,def2;size=200000;path=/mnt/hda1

So, slight change, commas to separate parameters and colons to separate attributes.
--store=attr=abc1:def2,size=200000,path=/mnt/hda1

@BramGruneir BramGruneir assigned BramGruneir and unassigned jseldess Feb 25, 2016
BramGruneir added a commit to BramGruneir/cockroach that referenced this issue Feb 25, 2016
BramGruneir added a commit to BramGruneir/cockroach that referenced this issue Feb 27, 2016
Fixes cockroachdb#4577

This changes the --stores flag in which all stores are specified in a single
command line flag into the --store flag.
This new flag adds in the option to specify the size for non in memory stores.
@jseldess
Copy link
Contributor

Covered in docs for cockroach start.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants