Switch to pull model #13

vkuznet · 2017-04-07T12:07:40Z

Currently we have implemented the push approach:

The way it is done now in my prototype is the following: a client sends a request to an agent to transfer dataset /a/b/c to site X. The agent first checks whether it has this dataset; if so, it initiates the transfer by pushing the data from itself to site X. If that agent does not have the dataset, it broadcasts the request to all known agents. The agent that has it replies, and the request is delegated to that agent, which then pushes the data from itself to site X.
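The push flow described above can be sketched roughly as follows. This is only an illustration, not the prototype's code: the `Agent` class, its fields, and the `handle_push_request` and `push` methods are hypothetical names, and the actual data transfer is replaced by a placeholder that returns the name of the pushing site.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical agent: knows its site, its datasets, and its peer agents."""
    site: str
    datasets: set = field(default_factory=set)
    peers: list = field(default_factory=list)

    def has(self, dataset: str) -> bool:
        return dataset in self.datasets

    def handle_push_request(self, dataset: str, dest_site: str) -> str:
        """Return the name of the site that ends up pushing the data."""
        if self.has(dataset):
            return self.push(dataset, dest_site)
        # Broadcast to all known agents and delegate to the first one that has it.
        for peer in self.peers:
            if peer.has(dataset):
                return peer.push(dataset, dest_site)
        raise LookupError(f"no known agent holds {dataset}")

    def push(self, dataset: str, dest_site: str) -> str:
        # Placeholder for the real transfer from this site to dest_site.
        return self.site


agent_a = Agent("A")
agent_b = Agent("B", datasets={"/a/b/c"})
agent_a.peers.append(agent_b)
pusher = agent_a.handle_push_request("/a/b/c", "X")
```

Here the request arrives at agent A, which does not hold the dataset, so the push is delegated to agent B. Note that the destination site X has no say in when the data arrives.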

This approach has flaws: for example, a site can go down, undergo maintenance, or run out of disk space. Therefore we need to explore, develop, and eventually switch to a pull model.

Sites today have complete control over the agent that puts data into their site. This is a design choice made in order to put the responsibility for transfers onto the site ops team. For example, a site can turn off its agent when it has problems with storage. It can throttle the agent if there are issues. It can stop the agent if it loses disk and thus runs out of space, or runs out of space for some other reason. In the pull model, the request lands at the site that requests the data, and that site fetches it from the original site. In terms of the description above, we will redirect the request to the agent sitting on site X, and it will download dataset /a/b/c from whatever site holds a copy.
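A minimal sketch of the pull side, assuming a hypothetical `SiteAgent` class: the agent on the destination site decides whether to fetch, so site ops can switch it off or let it refuse when space is short. The `enabled` flag, the `free_bytes` counter, and the `pull` method are assumptions made for illustration, and the download itself is reduced to bookkeeping.

```python
from dataclasses import dataclass, field


@dataclass
class SiteAgent:
    """Hypothetical agent running on one site."""
    site: str
    datasets: dict = field(default_factory=dict)  # dataset name -> size in bytes
    enabled: bool = True                          # site ops can switch the agent off
    free_bytes: int = 0                           # space left on local storage

    def pull(self, dataset, sources):
        """Fetch dataset from the first source that holds it.

        Returns the source site name, or None if nothing was transferred.
        """
        if not self.enabled:
            return None  # site is down or in maintenance
        for src in sources:
            size = src.datasets.get(dataset)
            if size is None:
                continue  # this source does not hold the dataset
            if size > self.free_bytes:
                return None  # not enough local disk space
            # Placeholder for the real download from src to this site.
            self.datasets[dataset] = size
            self.free_bytes -= size
            return src.site
        return None


site_x = SiteAgent("X", free_bytes=100)
site_y = SiteAgent("Y", datasets={"/a/b/c": 60})
fetched_from = site_x.pull("/a/b/c", [site_y])
```

With these numbers site X pulls the dataset from site Y and is left with 40 bytes of free space. If site ops set `enabled = False`, the same call transfers nothing.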

rishiloyola · 2017-04-11T20:16:03Z

Implementation details: First, the client will select an agent that has the complete requested dataset. After choosing an agent, it will create the transferRequest and pass it to the request manager of site B on the /manager endpoint. The request manager will store the request in the pool. It will approve requests from the pool if site conditions are good (no disk issues) and may disapprove a request if the site needs time to handle its own issues.

Instead of designing a new endpoint to pull the data, the request manager will pass the transferRequest to the selected agent on the /request endpoint. After getting the request from site B, that agent will push the data to the upload endpoint.

The request manager will approve the request based on two parameters: time and data size. The manager will approve the request only if the site has enough storage capacity.
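The pool-and-approve behaviour could look roughly like the sketch below. The thread does not specify how the time parameter is used, so this example assumes it is a timestamp (`busy_until`) before which the site approves nothing; that field, the `RequestManager` class, and the shape of `TransferRequest` are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class TransferRequest:
    dataset: str
    size_bytes: int
    source_agent: str


class RequestManager:
    """Hypothetical request manager of the destination site."""

    def __init__(self, free_bytes, busy_until=0.0):
        self.free_bytes = free_bytes
        # Assumption: the site approves nothing before this timestamp.
        self.busy_until = busy_until
        self.pool = deque()

    def submit(self, request):
        self.pool.append(request)

    def approve_pending(self, now):
        """Return the approved requests; the rest stay in the pool."""
        approved = []
        if now < self.busy_until:
            return approved  # site still needs time to handle its own issues
        remaining = deque()
        while self.pool:
            req = self.pool.popleft()
            if req.size_bytes <= self.free_bytes:
                self.free_bytes -= req.size_bytes  # reserve the space
                approved.append(req)
            else:
                remaining.append(req)
        self.pool = remaining
        return approved


manager = RequestManager(free_bytes=100, busy_until=10.0)
manager.submit(TransferRequest("/a/b/c", 60, "agentA"))
manager.submit(TransferRequest("/d/e/f", 70, "agentA"))
too_early = manager.approve_pending(now=5.0)
approved = manager.approve_pending(now=10.0)
```

In this run nothing is approved at time 5.0, and at time 10.0 only the first request fits into the free space; the second one stays in the pool until space is available. Each approved request would then be forwarded to its selected agent on the /request endpoint.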

vkuznet · 2017-04-11T20:25:06Z

Rishi, good, but please outline where things can break :) E.g. what if the admin of the site that is supposed to push the data decides at that moment to shut down the site? I don't think we need such complexity for delegation. Best, Valentin.

Rishi wrote:

> ![screen shot 2017-04-12 at 1 33 15 am](https://cloud.githubusercontent.com/assets/10094679/24928429/09d9f9dc-1f20-11e7-9ec5-2516f0b15630.png)
>
> First, the client will select the agent that has the complete requested dataset. Then it will create the transferRequest and send it to the request manager of site B on the `/manager` endpoint. The request manager will store the request in the pool. It will approve requests from the pool if site conditions are good (no disk issues) and may disapprove a request if the site needs time to handle its own issues.
>
> Instead of designing a new endpoint to pull the data, the request manager will pass the transferRequest to the selected agent on the `/request` endpoint. That agent will then get the transfer request and push the data.
>
> The request manager will approve the request based on two parameters: time and data size. The manager will approve the request only if the site has enough storage capacity.

rishiloyola · 2017-06-02T13:40:30Z

If the transfer fails, we will create a new instance of the request and send it to the same agent again. If the same request fails more than three times, we will throw an error and stop the process.

vkuznet · 2017-06-02T14:01:03Z

Correct, but don't hard-code 3; the limit should be configurable.
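A small sketch of the retry logic with a configurable limit. The function name `transfer_with_retries`, the `send` callable, and the `max_failures` parameter are hypothetical; `send` stands in for whatever delivers the request to the agent and is assumed to raise an exception on failure.

```python
import copy


class TransferFailed(Exception):
    """Raised when a request has failed more often than the configured limit."""


def transfer_with_retries(send, request, max_failures=3):
    """Send request via send(); retry with a fresh copy after each failure.

    Gives up once the number of failures exceeds max_failures.
    """
    failures = 0
    while True:
        attempt = copy.deepcopy(request)  # new instance of the request per attempt
        try:
            return send(attempt)
        except Exception as err:  # broad on purpose: this is only a sketch
            failures += 1
            if failures > max_failures:
                raise TransferFailed(
                    f"gave up after {failures} failed attempts"
                ) from err


calls = []


def flaky_send(req):
    """Toy sender that fails twice and then succeeds."""
    calls.append(req)
    if len(calls) < 3:
        raise ConnectionError("agent unreachable")
    return "ok"


result = transfer_with_retries(flaky_send, {"dataset": "/a/b/c"}, max_failures=3)
```

With the default of 3 the request is tried at most four times, which matches "fails more than three times". In practice the limit would come from the agent's configuration rather than from a default argument.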


vkuznet added the api-development label May 10, 2017

vkuznet added this to the June development milestone May 10, 2017

rishiloyola closed this as completed Jul 13, 2017
