container auto create - turn on and off #102

Closed
gilv opened this issue Dec 6, 2016 · 5 comments

Comments

@gilv
Contributor

gilv commented Dec 6, 2016

If Spark persists an object in a container that does not exist, the current code automatically creates the container. To make this happen, the Stocator init method always checks whether the container exists and, if it does not, creates it. All of this happens in the init method of SwiftAPIClient.

In certain cases we would like to disable automatic container creation and skip the container existence check in the init method. This saves the cost of the HEAD Container operation during init.

We need to introduce a new property, "fs.stocator.dataroot.autocreate", that defaults to true when it is not provided in the configuration. If it is provided and its value is false, the init method should skip both the container existence check and the automatic container creation.
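
A minimal sketch of the proposed behavior, not the actual SwiftAPIClient code. `FakeStore`, `AutoCreateSketch`, and the `Map`-based configuration are hypothetical stand-ins for the real object store client and the Hadoop configuration; only the property name comes from this issue.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory stand-in for the object store; not Stocator's real client. */
class FakeStore {
    final Set<String> containers = new HashSet<>();
    int headCalls = 0;

    /** Models the HEAD Container request and counts how often it is issued. */
    boolean containerExists(String name) {
        headCalls++;
        return containers.contains(name);
    }

    /** Models the PUT Container request. */
    void createContainer(String name) {
        containers.add(name);
    }
}

class AutoCreateSketch {
    static final String AUTOCREATE_KEY = "fs.stocator.dataroot.autocreate";

    /** Sketch of an init step: the property defaults to true when it is absent. */
    static void init(Map<String, String> conf, FakeStore store, String container) {
        boolean autoCreate =
            Boolean.parseBoolean(conf.getOrDefault(AUTOCREATE_KEY, "true"));
        if (!autoCreate) {
            return; // no HEAD Container and no automatic creation
        }
        if (!store.containerExists(container)) {
            store.createContainer(container);
        }
    }
}
```

With the property set to false, init issues no HEAD Container request, so a missing container only surfaces on the first real operation against it.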

@gilv
Contributor Author

gilv commented Dec 6, 2016

@djalova Can you check it please?

@michaelfactor
Collaborator

@gilv In addition to the above, instead of doing the HEAD at all, would it make sense to try the PUT of the object, catch the exception for the non-existent container, create the container, and then retry the object PUT? The check-first approach typically involves two calls but can take three. The try-and-recover approach is typically one call but can take three. The check-first approach is simpler code.
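
A rough sketch of the try-and-recover idea with call counting, to make the one-versus-three comparison concrete. `CountingStore`, `ContainerNotFoundException`, and `TryAndRecoverSketch` are hypothetical names for illustration, not Stocator or Swift client APIs.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical error raised when a PUT targets a missing container. */
class ContainerNotFoundException extends RuntimeException {
    ContainerNotFoundException(String container) {
        super("container not found: " + container);
    }
}

/** Hypothetical in-memory store that counts every simulated remote call. */
class CountingStore {
    final Map<String, Map<String, String>> containers = new HashMap<>();
    int calls = 0;

    /** Models PUT Object; fails if the container does not exist. */
    void putObject(String container, String key, String data) {
        calls++;
        Map<String, String> objects = containers.get(container);
        if (objects == null) {
            throw new ContainerNotFoundException(container);
        }
        objects.put(key, data);
    }

    /** Models PUT Container. */
    void createContainer(String container) {
        calls++;
        containers.putIfAbsent(container, new HashMap<>());
    }
}

class TryAndRecoverSketch {
    /** Try the PUT first; create the container and retry only on failure. */
    static void put(CountingStore store, String container, String key, String data) {
        try {
            store.putObject(container, key, data);   // common case: one call
        } catch (ContainerNotFoundException e) {
            store.createContainer(container);        // second call
            store.putObject(container, key, data);   // third call
        }
    }
}
```

In this sketch the first PUT to a missing container costs three calls, and every later PUT to the same container costs one.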

@gilv
Contributor Author

gilv commented Dec 6, 2016

@michaelfactor It depends on what the first operation is. If it's a PUT, then your suggestion is fine. But it may be a HEAD, a GET, etc., so we would need to try-catch many operations in Stocator. It would be a complicated change, since there is no single point where the HEAD or GET happens.

@michaelfactor
Collaborator

@gilv Agreed completely -- this is a performance-simplicity tradeoff, and the recovery on a HEAD or GET will differ from the recovery on a PUT. I'm not sure you need to autocreate on a HEAD or GET of an object, since if the container doesn't exist, the object obviously doesn't either. But this is a performance optimization, and we need to understand its value in practice before rushing to make the change.

@djalova
Contributor

djalova commented Dec 13, 2016

@gilv I think all operations start out by calling FileSystem.exists(), whether it's a PUT or a GET operation. In the case where auto create is off, should we just throw a new exception when a container is not present?

@gilv gilv added the on-hold label Jun 21, 2017
@gilv gilv closed this as completed Aug 29, 2017