Docker pilot run #43
…ng all scripts; Stata unsupported
@gentzkow I've written a Dockerfile that produces a minimal working Docker image that could be used to replicate results. Right now, it builds an image that sets up the dependencies and environment required for running The below steps set up and run a Docker container that recreates all
|
Thanks @szahedian! Very cool. I've installed Docker and played around with their tutorial a bit. On the build step above I get an error when installing R: A couple of questions about how this would work:
|
Hmm that's interesting. I've experimented some more locally and that section of the
Yes it is possible to take control of a running container through shell commands as if you had ssh-ed into it. The simplest way is to specify interactive mode. You can try this out with something like
Yes that's right. What I've written imagines the following usage pattern:
If we put the current image on Docker Hub, the usage pattern would be the same but we could eliminate (ii). If we continue with Docker, we should probably discuss the usage pattern we want to support. For instance, in this current version, none of the repo code is actually housed in the container; the user's copy of the repo is mounted into the Docker container, and the container is unaware that the files it operates on are "external". We could make edits to But there could be other usage patterns where we, say, do copy the repo into the container, and allow users to execute CLI commands inside a full Docker sandbox. This seems closer to the world where users just |
Thanks @szahedian. Still getting an error, this time on the LyX step. I'm a bit confused by this:
It sounds at first glance like "the repo is not in the container; the repo is in the container." Can you clarify? An ideal usage pattern to support would be
|
Darn! I've pushed some further edits that will hopefully allow it to build and be run interactively (details below). @gentzkow if this doesn't fix it we can go through a more deliberate debugging process.
I can see how that may have been unclear. Let me try to do better. The cloned repo exists as bits on your computer's disk, which your OS is able to address. When you mount the repo into the Docker container, the container is able to address those same bits, only to the container they appear at a different location in the filesystem — concretely,
Sounds good! This doesn't seem too far from the path we were on. Along with the (hopeful) bug fixes, I've edited the Dockerfile so that the container can be run interactively. The steps are similar.
Then you should see a bash prompt. You'll finally need to run |
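As an illustration of the mount-and-run-interactively pattern discussed above, here is a minimal Python sketch that assembles a `docker run` invocation. It is not the command from this thread: the image name `example/template:latest`, the container path `/repo`, and the helper `docker_run_command` are all hypothetical placeholders. The flags themselves (`--rm`, `-it`, `-v`, `-w`) are standard `docker run` options.

```python
import os
import shlex


def docker_run_command(repo_dir, image, mount_point="/repo", interactive=True):
    """Build a `docker run` argument list that bind-mounts repo_dir.

    The image name and mount_point are illustrative assumptions,
    not values taken from the project's Dockerfile.
    """
    host_path = os.path.abspath(repo_dir)
    cmd = ["docker", "run", "--rm"]  # --rm: remove the container on exit
    if interactive:
        cmd.append("-it")  # keep stdin open and allocate a terminal
    # -v host:container exposes the same files at a different path
    # inside the container; -w makes that path the working directory.
    cmd += ["-v", f"{host_path}:{mount_point}", "-w", mount_point, image]
    if interactive:
        cmd.append("bash")  # drop into a shell inside the container
    return cmd


if __name__ == "__main__":
    # Print the command rather than running it, so it can be inspected.
    print(shlex.join(docker_run_command(".", "example/template:latest")))
```

Because the repo is bind-mounted rather than copied, edits made on the host are immediately visible inside the container, and vice versa.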
Your response to the "in the container" question is super clear. Thanks! Still getting errors. I tried commenting out the LyX install but I get an error on the Conda install step as well. Shucks. Rather than spend a bunch of time debugging this, though, what if you go ahead and push the image to Docker Hub and then I try pulling the image from there? In some ways that's actually a better test of what we want to do. Another question: Is it possible to interact with applications running in the container through a GUI as well? E.g., could I open up a Jupyter notebook in the container and work with it? Or could I open the LyX application and make changes to a file in the GUI? |
I pushed the build to Docker Hub. It should be accessible by running:
The answer to this question depends on the application you'd like to run graphically. A Jupyter notebook started in Docker can be accessed graphically. You can test this for yourself by running For general applications like LyX that require the OS to render the interface, it becomes more difficult to access these through the container. The procedure is simplest on Linux, and seems more complicated on macOS. Even if the steps aren't too difficult, they may become technically opaque and outweigh the setup cost savings of using Docker in the first place. In the case of LyX, though, this may not be necessary: a user could install LyX on their local machine, edit files, and then compile them using the CLI inside Docker. Because of the file mount, changes made on the local machine would be visible to the Docker container. |
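To make the Jupyter case concrete, the sketch below builds a `docker run` command that publishes the notebook port to the host, so the notebook can be opened in a local browser. This is an assumption-laden illustration, not the command used in this thread: it assumes Jupyter is installed in the image, and the image name, the `/repo` mount point, and the helper `jupyter_command` are hypothetical.

```python
def jupyter_command(repo_dir_abs, image, port=8888, mount_point="/repo"):
    """Build a `docker run` argument list that serves Jupyter on `port`.

    Assumes (hypothetically) that the image has Jupyter installed and
    that repo_dir_abs is an absolute path on the host.
    """
    return [
        "docker", "run", "--rm", "-it",
        # Publish the container port on the same host port.
        "-p", f"{port}:{port}",
        # Mount the host repo so notebooks are saved outside the container.
        "-v", f"{repo_dir_abs}:{mount_point}",
        "-w", mount_point,
        image,
        # Listen on all interfaces so the published port is reachable;
        # --allow-root is needed when the container user is root.
        "jupyter", "notebook",
        "--ip=0.0.0.0", f"--port={port}", "--no-browser", "--allow-root",
    ]


if __name__ == "__main__":
    print(" ".join(jupyter_command("/path/to/repo", "example/template:latest")))
```

With a command of this shape, Jupyter prints a URL containing an access token, which can be opened at `localhost` on the published port.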
Success! This works great. I successfully ran run_all.py from the container and also played around with running some other scripts a la carte. I think this would be an excellent way to ship replication code. A couple of things I think we might want to do when we create an actual replication archive:
Let me know if you agree with those and we can open separate issues to implement. I guess the remaining question is whether we can make it work with Stata. Do you want to wrap this issue and open a new one to investigate that? |
Awesome! Glad to hear it works.
Sounds good! How should I proceed with respect to branches? I'm thinking I close this issue, but continue using this same branch to integrate Stata. That way all updates to code and documentation explaining Docker can be made in one go. |
Great! I don't think we want to commit gslab_make directly to the template. We just made the switch to submodules in #38 and I think that will work great for our internal work. What I was thinking of was flipping over to direct commit when we release replication packages. We can decide that down the line. For (2), I'm thinking we might just want to add an option to config.yaml called suppress_git_warnings or something like that. We could then flip that switch when we release a replication archive. Unless that would be super easy to implement, though, I'd be fine setting it aside for now. On Stata, yes: let's open a new issue but continue work in this branch. And obviously we do not merge the branch back to master. |
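One way the proposed switch could behave is sketched below. This is purely illustrative: the key name `suppress_git_warnings` comes from the comment above, but the helper `emit_git_warning` and the idea of passing the parsed config as a plain dict are assumptions, not part of gslab_make or the template.

```python
import warnings


def emit_git_warning(message, config):
    """Emit a git-related warning unless the config switches it off.

    `config` stands in for the parsed contents of config.yaml; the
    `suppress_git_warnings` key is the hypothetical option proposed above.
    Returns True if a warning was emitted, False if it was suppressed.
    """
    if config.get("suppress_git_warnings", False):
        return False
    warnings.warn(message, UserWarning, stacklevel=2)
    return True


if __name__ == "__main__":
    # Internal work: warnings stay on by default.
    emit_git_warning("Repository has uncommitted changes.", {})
    # Replication archive: the switch is flipped and nothing is printed.
    emit_git_warning("Repository has uncommitted changes.",
                     {"suppress_git_warnings": True})
```

Defaulting the option to off keeps the current behavior for internal work, so only a released replication archive would need to set it.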
Concerning (2), those warnings are caused by A more considered approach may be to suppress those warnings at the level of |
Thanks. We could have It occurs to me that we'd probably always want to disable those warnings when someone runs |
I'll keep thinking about this. I propose we pause on "suppress git warnings" until we close #45, because how we signal to suppress warnings may depend on changes we make to the structure of |
In this issue, we will demonstrate replication of a simple project within a Docker container.