Host regex volume chooser for WALs #1607
Comments
FWIW, the Would such a chooser be useful for other scopes, other than the |
Yes, such a chooser would be useful for other scopes as well. If we have a set of tservers running on top of a particular HDFS cluster, then it might be beneficial to choose only that cluster for storing RFiles. The implementation would work for both table and "non-table" volume choosing. In the table case, the configuration is similar to that of the PreferredVolumeChooser in that it could be set per table. I cannot decide whether what constitutes a "hostgroup" should be definable on a table-by-table basis. Also, I updated the description per your observation above.
After thinking a little more about this, I think this chooser should extend the PreferredVolumeChooser and hence the properties and defaults should follow the same scheme. Here is the class javadoc I am starting with. Please tell me if this makes sense:
I was thinking more about the use case for configuring different servers to choose different volumes and realized that we don't need a new class to achieve this. This can already be easily accomplished by setting There is some convenience in deploying the same config globally, but still behaving differently from server to server, but I think that convenience might be marginal. What do you think? |
I was a little scared to do that because of the configuration comparison mechanism that tries to ensure configurations are consistent across hosts. Are you saying that the volume chooser configurations do not get factored into that comparison? I will try to find that code. In any case, when handling large systems, having inconsistent configurations on different sets of nodes is usually asking for trouble.
Only the properties that begin with |
Well, then we will try doing this with local configurations and see how that goes. If that goes well, then I will not need to create this chooser unless you feel it is still worthwhile.
The experiment worked well. We simply have the PerTableVolumeChooser used at the top level, and then the PreferredVolumeChooser for the logger. The preferred volumes for the logger are then set differently in the accumulo-site.xml depending on the host. This has the desired effect. That being said, I still think this might be a useful chooser. It will require making the host and port available through the parameters, but that does not look too hard. This is, however, lower priority now, given that our system is working as desired.
There is a need to tie sets of tservers to specific volumes. This is especially useful if one wants to tie the WALs to a specific volume for performance or space purposes. I suggest that we create a HostRegexVolumeChooser that can choose a volume based on the tserver hostname. So given the following configuration:
We might create the following configuration
This will tie the WALs for host group A to volume1 and the WALs for host group B to volume2.
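To make the idea concrete, here is a minimal, self-contained sketch of the matching logic only. It does not use the Accumulo VolumeChooser API. The class name `HostRegexVolumeSketch`, the `addRule` and `choose` methods, and the hostname patterns are all hypothetical and chosen for illustration. It assumes rules are checked in order and that the first matching rule whose volume is among the candidates wins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical sketch: not part of Accumulo, and not the real VolumeChooser interface.
final class HostRegexVolumeSketch {

  // One "host group": a hostname regex and the volume its members should use.
  private static final class Rule {
    final Pattern hostPattern;
    final String volume;

    Rule(String hostRegex, String volume) {
      this.hostPattern = Pattern.compile(hostRegex);
      this.volume = volume;
    }
  }

  private final List<Rule> rules = new ArrayList<>();

  // Rules are evaluated in the order they are added.
  HostRegexVolumeSketch addRule(String hostRegex, String volume) {
    rules.add(new Rule(hostRegex, volume));
    return this;
  }

  // Returns the volume of the first rule whose regex matches the whole hostname
  // and whose volume is one of the candidates; empty if no rule applies.
  Optional<String> choose(String hostname, List<String> candidates) {
    for (Rule rule : rules) {
      if (rule.hostPattern.matcher(hostname).matches() && candidates.contains(rule.volume)) {
        return Optional.of(rule.volume);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    // Assumed host naming scheme; real deployments will differ.
    HostRegexVolumeSketch chooser = new HostRegexVolumeSketch()
        .addRule("tserver-a\\d+\\..*", "volume1")  // host group A
        .addRule("tserver-b\\d+\\..*", "volume2"); // host group B

    List<String> volumes = Arrays.asList("volume1", "volume2");
    System.out.println(chooser.choose("tserver-a01.example.com", volumes));
    System.out.println(chooser.choose("tserver-b07.example.com", volumes));
    System.out.println(chooser.choose("other.example.com", volumes));
  }
}
```

Returning an empty result for unmatched hosts leaves the fallback decision (for example, delegating to the preferred or random chooser) to the caller, which is one possible design and not something this issue has settled.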