
methods changes for sic #46

Closed · 16 tasks done

gkiar opened this issue Dec 22, 2016 · 2 comments

gkiar commented Dec 22, 2016

Each bullet is a piece of feedback taken directly from a reviewer; the indented line beneath each is my response. Once every item has a response, I will fold the question and answer into the relevant paragraphs of the methods section.

Methods

  • Data Storage

    Point to emphasize: once data are de-identified, they can be stored in any publicly accessible way you like.

    • What kind of protocols should be considered? Only HTTP?

      Either; any protocol that makes the data publicly accessible is fine.

    • If the machines were virtualized, users might want different access points, for instance mounting the storage via NFS or CIFS.

      Sure.

    • Moreover, could another API be used, for instance mounting the storage as a volume?

      Sure.
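
The storage points above can be sketched concretely. Below is a minimal, hypothetical example (the URL and digest in any real call would come from the data provider; nothing here is a real project endpoint) of fetching publicly hosted, de-identified data over HTTP(S) with only the Python standard library, and verifying integrity with a checksum:

```python
import hashlib
import urllib.request


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def fetch_and_verify(url: str, expected_sha256: str) -> bytes:
    """Download a publicly hosted file over HTTP(S) and check its digest.

    The access pattern is identical whether the file sits behind a plain
    web server, an S3 bucket URL, or a gateway in front of an NFS/CIFS
    share -- any publicly accessible protocol works.
    """
    with urllib.request.urlopen(url) as resp:
        data = resp.read()
    if sha256_of(data) != expected_sha256:
        raise ValueError("checksum mismatch: corrupted or wrong file")
    return data
```
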

  • Cloud environments

    Point to emphasize: middleware provides flexibility for deployment across varied compute resources.

    • Do you consider using API middleware to solve the problem of different providers? There are libraries that allow machines to be run across multiple clouds.

      Middleware can definitely solve the problem of multiple providers. Within a single cloud (i.e. Amazon or Google, but not both), such middleware can be used if one chooses, but it is not necessary.
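
As a sketch of what such middleware does, the toy function below translates one generic launch call into provider-specific requests. The provider and field names are purely illustrative and not any real library's API; Apache Libcloud is one real example of this translation pattern:

```python
def launch_request(provider: str, image: str, count: int) -> dict:
    """Build a provider-specific launch request from one generic call.

    Real middleware libraries hide exactly this kind of per-provider
    translation behind a single interface; the field names below are
    hypothetical.
    """
    templates = {
        "aws":    {"service": "ec2",            "ami": image,   "instances": count},
        "google": {"service": "compute-engine", "image": image, "vm_count": count},
    }
    if provider not in templates:
        raise ValueError(f"unsupported provider: {provider}")
    return templates[provider]
```
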

  • Docker

    Point to emphasize: the cloud and Docker together enable scalable resources and consistent performance across them; prebuilt images and packages make such deployment relatively easy compared to managing a local cluster or compute resource.

    • The case study proposes running on AWS EC2. But what are the differences compared with running in a local datacenter?

      In the cloud, compute is "infinitely" scalable, machines are isolated, and hardware is consistent; local datacenters guarantee none of these.

    • Moreover, AWS already has a service dedicated to Docker containers. Could you consider using this kind of tool in your approach?

      Yup, ECS is awesome and we will update our deployment strategy to use it.

    • On the other hand, there are already tools like Tutum that may facilitate the deployment of Docker containers. Could a pre-installed machine help to deploy new containers?

      Sure; pick a machine image with Docker preinstalled or install Docker yourself, it makes no difference.
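
To illustrate the scalability point, here is a sketch of fanning one Docker image out over many subjects, one identical container per unit of work. The image name, mount path, and flags are hypothetical, not the paper's actual deployment:

```python
from typing import List


def docker_commands(image: str, subjects: List[str]) -> List[List[str]]:
    """Build one `docker run` invocation per subject.

    Because every container starts from the same image, each machine
    (EC2 instance, ECS task, or local box) provides an identical
    environment, so adding capacity is just adding machines.
    """
    return [
        ["docker", "run", "--rm",
         "-v", "/data:/data",        # hypothetical shared input/output mount
         image, "--subject", subj]
        for subj in subjects
    ]
```

Handing each command list to a different cloud machine is all the "scheduler" a small study needs; larger studies would let a service such as ECS do this dispatch.
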

  • Open standards for data

    Point to emphasize: data standards make tools interoperable; data should be anonymized or equivalently de-identified so that security is never an issue.

    • What are the standards and how are they used? This should be clarified in the manuscript.

      This question doesn't entirely make sense to me, but my best answer: standards are documented, community-accepted schemas for organizing data. When one's data comply with a standard, general tools can be applied out-of-the-box to a wider range of datasets.

    • Did you consider several levels of security? For instance, only allowing reviewers to access the container while it is available online?

      Again, I don't quite follow this sentence. Our general security policy is that data should be anonymized or de-identified, after which there is nothing to worry about.
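
As a toy illustration of how a shared schema lets tools work out-of-the-box, the parser below handles BIDS-style neuroimaging filenames. The pattern covers only a sliver of the real BIDS specification and is purely illustrative:

```python
import re

# BIDS-style filename: sub-<label>[_ses-<label>]_<suffix>.<ext>
# This toy pattern is a small subset of the actual BIDS specification.
_BIDS_RE = re.compile(
    r"sub-(?P<subject>[A-Za-z0-9]+)"
    r"(?:_ses-(?P<session>[A-Za-z0-9]+))?"
    r"_(?P<suffix>[A-Za-z0-9]+)"
    r"\.(?P<ext>nii(?:\.gz)?)$"
)


def parse_bids_name(filename: str) -> dict:
    """Extract subject/session/suffix from a BIDS-style filename.

    Any tool that understands the schema can consume any compliant
    dataset without per-dataset glue code -- the interoperability
    the standard buys.
    """
    m = _BIDS_RE.match(filename)
    if m is None:
        raise ValueError(f"not a recognized BIDS-style name: {filename}")
    return m.groupdict()
```
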

  • What are the differences between this architecture and simply publishing a README with instructions? The former is easy for the end user but complex for the developer/researcher.

    Creating a Docker container is not significantly harder for the developer/researcher: they already had to install all of the dependencies for their tool to run, and to write them down in a README for documentation. A Dockerfile simply records those same steps in a script, which a virtualization engine interprets to perform the installation for you.
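
To make the comparison concrete, here is a hypothetical README's install instructions ("install Python 3 and numpy, then run pipeline.py") translated into a Dockerfile; the base image, package names, and entrypoint are illustrative only:

```dockerfile
# Each RUN line is exactly what a README would tell the user to type;
# Docker executes them once at build time to produce a reusable image.
FROM ubuntu:16.04
RUN apt-get update && apt-get install -y python3 python3-pip
RUN pip3 install numpy
COPY pipeline.py /opt/pipeline.py
ENTRYPOINT ["python3", "/opt/pipeline.py"]
```

Building once (`docker build -t pipeline .`) then replaces the manual install steps for every subsequent user.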

  • Docker vs Vagrant?

    This one belongs in the discussion rather than the methods. Vagrant is a layer on top of virtualization, and can even sit on top of Docker. The two are not really comparable in execution; they are alike only in that both document a set of installation requirements.

    • Could a virtual machine do the same? What are the differences for the proposed pipeline? These technical details should be addressed in the discussion, because the manuscript is ultimately positioned as a technical research paper.

      This one belongs in the discussion rather than the methods. Virtual machines could do the same, but carry considerably more overhead, including "hard-drive" image files that can bloat the system. The benefit of Docker is that running a pipeline ultimately means running a set of scripts and then exiting the environment; all else being equal, less overhead leaves more resources available to the pipeline.

gkiar self-assigned this Dec 22, 2016

gkiar commented Dec 22, 2016

methods section of #44

gkiar mentioned this issue Dec 22, 2016

gkiar commented Jan 4, 2017

changes addressed in recent push and tag to overleaf. :)

gkiar closed this as completed Jan 4, 2017