Switch to pull model #13
Comments
Implementation Details - First, the client will select an agent that has the complete requested data set. After choosing an agent, it will create the transferRequest and pass it to the request manager of site B on the `/manager` endpoint. Instead of designing a new endpoint to pull the data, the request manager will pass the transferRequest to the selected agent on the `/request` endpoint. The request manager will approve the request based on two parameters: time and data size. The manager will approve the request only if the site has enough storage capacity.
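The approval rule described above could be sketched roughly as follows. This is only an illustration, not the project's code: the `TransferRequest` fields, the `SiteState` type, and the `approve` function are hypothetical names, and reading "time" as "the site is not currently busy with its own issues" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class TransferRequest:
    # All field names here are hypothetical, chosen for illustration.
    dataset: str      # e.g. "/a/b/c"
    src_agent: str    # agent that holds the complete data set
    dst_site: str     # site that will receive the data
    size_bytes: int   # total size of the requested data set


@dataclass
class SiteState:
    free_bytes: int    # storage currently available at the site
    busy_until: float  # epoch seconds; assumed meaning of the "time" parameter


def approve(req: TransferRequest, site: SiteState, now: float) -> bool:
    """Approve only if the site is not busy and has room for the data."""
    if now < site.busy_until:
        # The site still needs time to handle its own issues.
        return False
    # Storage check: the data set must fit into the free space.
    return req.size_bytes <= site.free_bytes
```

Keeping the decision in one pure function makes it easy to test the time and size conditions separately from any HTTP handling on `/manager`.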
Rishi,
Good, but please outline where things can break :)
E.g. what if the admin of the site that is supposed to push the data decides at
that moment to shut down the site?
I don't think we need such complexity for delegation.
Best,
Valentin.
…On 0, Rishi ***@***.***> wrote:
![screen shot 2017-04-12 at 1 33 15 am](https://cloud.githubusercontent.com/assets/10094679/24928429/09d9f9dc-1f20-11e7-9ec5-2516f0b15630.png)
First, the client will select the agent that has the complete requested data set. Then it will create the transferRequest and send it to the request manager of site B on the `/manager` endpoint. The request manager will store the request in the pool. It will approve requests from the pool if site conditions are good (no disk issues) and may reject a request if the site needs time to handle its own issues. Instead of designing a new endpoint to pull the data, the request manager will pass the transferRequest to the selected agent on the `/request` endpoint. That agent will then receive the transfer request and push the data.
The request manager will approve the request based on two parameters: time and data size. The manager will approve the request only if the site has enough storage capacity.
If the transfer fails, we will create a new instance of the request and send it to the selected agent again. If the same request fails more than three times, we will throw an error and stop the process.
Correct, but don't make 3 a hard-coded number; it should be configurable.
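A minimal sketch of a retry loop with a configurable limit, under stated assumptions: `transfer_with_retries`, `make_request`, `send`, and `TransferFailed` are hypothetical names, and `max_attempts` is taken here to mean the total number of attempts before giving up.

```python
from typing import Any, Callable


class TransferFailed(Exception):
    """Raised when every attempt to transfer has failed (hypothetical type)."""


def transfer_with_retries(
    make_request: Callable[[], Any],
    send: Callable[[Any], bool],
    max_attempts: int = 3,
) -> int:
    """Send a fresh request instance on each attempt.

    Returns the 1-based number of the attempt that succeeded and raises
    TransferFailed once max_attempts attempts have all failed.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        request = make_request()  # new instance of the request per attempt
        if send(request):         # send() reports success as True
            return attempt
    raise TransferFailed(f"transfer failed after {max_attempts} attempts")
```

In practice the default would come from the agent's configuration rather than from the function signature.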
Currently we have implemented the push approach:
The way it's done now in my prototype is the following: a client sends a request to an agent to transfer dataset /a/b/c to site X. The agent first checks whether it has this dataset; if so, it initiates the transfer by pushing the data from itself to site X. If that agent does not have the dataset, it broadcasts the request to all known agents. The agent that has it replies, and the request is delegated to that agent, which then pushes the data from itself to site X.
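The push flow with delegation might look roughly like this in-memory sketch. It is not the prototype's code: the `Agent` class, its methods, and the string returned by `push` are hypothetical, and the broadcast is modelled as a simple loop over known peers.

```python
from typing import Iterable, List


class Agent:
    """Hypothetical in-memory stand-in for a transfer agent."""

    def __init__(self, name: str, datasets: Iterable[str]):
        self.name = name
        self.datasets = set(datasets)
        self.peers: List["Agent"] = []  # all other known agents

    def has(self, dataset: str) -> bool:
        return dataset in self.datasets

    def push(self, dataset: str, dst_site: str) -> str:
        # A real agent would stream the files; here we only report the action.
        return f"{self.name} pushes {dataset} to {dst_site}"

    def handle(self, dataset: str, dst_site: str) -> str:
        # Case 1: this agent holds the dataset and pushes it itself.
        if self.has(dataset):
            return self.push(dataset, dst_site)
        # Case 2: ask all known agents and delegate to the first one
        # that reports having the dataset.
        for peer in self.peers:
            if peer.has(dataset):
                return peer.push(dataset, dst_site)
        raise LookupError(f"no known agent has {dataset}")
```

Note that site X has no say in this flow: whichever agent holds the data decides when to push.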
This push approach has flaws: e.g. the destination site can go down, undergo maintenance, or run out of disk space. Therefore we need to explore, develop, and eventually switch to a pull model.
Sites today have complete control over the agent that puts data into their site. This is a design choice that was made in order to put the responsibility for transfers onto the site ops team. E.g. the site can turn off its agent when it has problems with storage. It can throttle the agent if there are issues. It can stop the agent if it loses a disk and thus runs out of space, or runs out of space for some other reason. In the pull model, the request will land at the site that requests the data, and that site will fetch it from the original site. In the example above, we'll redirect the request to the agent sitting on site X, and it will download dataset /a/b/c from whichever site holds a copy.
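For contrast, a pull-side sketch under the same caveats: `SiteAgent`, `pull`, and all field names are hypothetical, and the transfer itself is reduced to bookkeeping. The point it illustrates is that the destination site keeps control and can refuse when its agent is off or it lacks space.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SiteAgent:
    # Hypothetical model of an agent running at one site.
    site: str
    free_bytes: int
    enabled: bool = True  # site ops can switch the agent off
    datasets: Dict[str, int] = field(default_factory=dict)  # name -> size


def pull(dst: SiteAgent, dataset: str, sources: List[SiteAgent]) -> str:
    """Destination agent fetches the dataset from whichever site holds it.

    Returns the name of the source site that was used.
    """
    if not dst.enabled:
        raise RuntimeError(f"agent at {dst.site} is turned off")
    for src in sources:
        if dataset in src.datasets:
            size = src.datasets[dataset]
            if size > dst.free_bytes:
                raise RuntimeError(f"not enough space at {dst.site}")
            # Stand-in for the actual download.
            dst.datasets[dataset] = size
            dst.free_bytes -= size
            return src.site
    raise LookupError(f"no site holds {dataset}")
```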