Each bullet is a piece of feedback directly from a reviewer; underneath each, in comments, is the response. Once I have a response for each, I will bake the question and answer into the relevant paragraphs of the methods section.
Methods
Data Storage
point to emphasize: we have de-identified data, so store it in any publicly accessible way that makes you happy.
what kind of protocols should be considered? Only HTTP?
either
If the machines were virtualized, users might want different access points and to mount the storage, for instance via NFS or CIFS.
sure
Moreover, could another API be used, for instance to mount the storage as a volume?
sure
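As a sketch of the "storage as a volume" idea: a de-identified dataset exposed over NFS (or sitting on the host) can be bind-mounted into a container at run time. The server name, export path, image name, and script below are all hypothetical, for illustration only.

```shell
# Mount a hypothetical NFS export on the host machine
sudo mount -t nfs storage.example.org:/export/dataset /mnt/dataset

# Bind-mount the same directory into a container as a read-only volume;
# the pipeline sees it as an ordinary local path (/data)
docker run --rm -v /mnt/dataset:/data:ro example/pipeline /run_pipeline.sh /data
```

Because the data are de-identified, the same directory could just as well be served over HTTP and fetched with curl; the mount is a convenience, not a requirement.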
Cloud environments
point to emphasize: middleware provides flexibility for deployment across varied compute resources
Did you consider using API middleware to solve the problem of different providers? There are libraries that allow machines to be launched across multiple clouds.
middleware can definitely solve the problem of multiple providers; within a single "cloud" (i.e. Amazon or Google, but not both), such middleware can be used if one chooses, but it is not necessary
Docker
point to emphasize: the cloud and docker enables scalability in resources and consistent performance across resources. prebuilt images and packages make such deployment relatively easy (as compared to managing a local cluster/compute resource)
The case study proposes running in AWS EC2. But what are the differences compared with running in a local datacenter?
in the cloud, compute is "infinitely" scalable, machines are isolated, and hardware is consistent; local data centers offer none of these.
Moreover, AWS already has a service dedicated to Docker containers. Could you consider using this kind of tool in your approach?
Yup, ECS is awesome and we will update our deployment strategy to use it
On the other hand, there are already tools like Totum that may facilitate the deployment of Docker containers. Could a pre-installed machine help to deploy new containers?
Sure; pick a machine with Docker pre-installed or install Docker yourself, it makes no difference
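A minimal sketch of the "install Docker yourself" path on a fresh cloud machine (an Ubuntu instance is assumed; the image name is hypothetical). A pre-installed machine, or a managed service such as ECS, simply skips the first step.

```shell
# Install Docker from the distribution repositories (Ubuntu assumed)
sudo apt-get update && sudo apt-get install -y docker.io

# Pull the published pipeline image and run it; no other setup is needed
sudo docker pull example/pipeline:latest
sudo docker run --rm example/pipeline:latest
```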
Open standards for data
point to emphasize: data standards make tools interoperable and general; data should be anonymized or equivalent so that security is never an issue.
What are the standards and how are they used? This should be clarified in the manuscript.
this doesn't really make sense to me, but my best guess at an answer: standards are documented, community-accepted schemas for organizing data, and when one's data is compliant with a standard, general-purpose tools can be applied out-of-the-box to a wider range of datasets.
Did you consider several levels of security? For instance, only allowing the reviewers to access the container while it is available online?
again, I don't really get this sentence... our general policy on security is that data should be anonymized or de-identified, so there is nothing to worry about.
What are the differences between this architecture and simply publishing a README with instructions? The latter is easy for the end user but complex for the developer/researcher.
Creating a Docker container is not significantly harder for the developer/researcher: they already had to install all of the dependencies for their tool to run, and write them down in a README to document them. Docker simply means writing them down in a script that a virtualization engine interprets to do the installation for you.
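To make the README-vs-Dockerfile comparison concrete, here is a hypothetical README install section ("install Python and NumPy, then run pipeline.py") rewritten as the equivalent Dockerfile, written inline via a heredoc; every package, file, and image name is illustrative.

```shell
# The same steps a README would list, expressed as a Dockerfile that the
# engine executes for you instead of the reader executing by hand.
cat > Dockerfile <<'EOF'
FROM ubuntu:14.04
RUN apt-get update && apt-get install -y python python-numpy
COPY pipeline.py /opt/pipeline.py
CMD ["python", "/opt/pipeline.py"]
EOF

# One command turns the documented install steps into a runnable image
docker build -t example/pipeline .
```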
Docker vs Vagrant?
answer this one is discussion not methods
Vagrant is a layer on top of virtualization, and can even sit on top of Docker. They are not really comparable in terms of execution, only in that both document a set of installation requirements.
Could be a virtual machine do the same? What are the differences for the proposed pipeline? This kind of technical details should be addressed in the discussion, because in the end, the manuscript is placed as a technical research paper.
answer this one is discussion not methods
virtual machines could do the same, but they carry considerably more overhead and "hard-drive" image files that can bloat the system. The benefit of Docker is that, ultimately, running a pipeline means running a set of scripts and then exiting the environment; all else being equal, less overhead leaves more resources available to the pipeline.