
ckemere/CloudShuffles


Python Jupyter notebooks exploring how to do neuroscience shuffle analyses in the cloud

Notes on deploying a Jupyter container to Google Cloud.

I created a Jupyter Docker container with nelpy pre-installed and a different default password by forking the docker-stacks repository. My fork is pushed to GitHub as ckemere/docker-stacks.

Option 1 - Start up a virtual machine with Docker and then manually start the container

  1. Create an instance using either the web console interface or the command line. Here's an example command line:
gcloud compute instances create [instance-name] --image=cos-stable-61-9765-79-0 --image-project=cos-cloud \
   --machine-type=n1-standard-64 

This will create an n1-standard-64 instance using the Container-Optimized OS. Note that cos-stable-61-9765-79-0 was the most recent stable image as of October 2017, but it will change over time; also note that I had to be logged in to see the current image name. There is a magic option, --metadata-from-file user-data=[file name], that I've explored for automatically downloading and starting the Docker container at boot, but it has generally not worked reliably for me (a sketch of what such a file might look like is below).
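For reference, a user-data file for the Container-Optimized OS follows the cloud-init format, typically defining a systemd unit that runs the container at boot. The sketch below is untested, and the service name and file path are only illustrative:

    #cloud-config
    # Untested sketch: define a systemd service that starts the jupyter container at boot
    write_files:
    - path: /etc/systemd/system/jupyter.service
      permissions: 0644
      owner: root
      content: |
        [Unit]
        Description=Run the ckemere/jupyter container
        After=docker.service
        Requires=docker.service

        [Service]
        ExecStart=/usr/bin/docker run --rm --name=jupyter --net=host ckemere/jupyter
        ExecStop=/usr/bin/docker stop jupyter

    runcmd:
    - systemctl daemon-reload
    - systemctl start jupyter.service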

  2. You need to open ports 8888 (and 8787 if you are using Dask) for your instance, or ssh-tunnel to access them. To do this manually from a terminal, you can use the commands:

    gcloud compute firewall-rules create jupyter-notebook --allow tcp:8888
    gcloud compute firewall-rules create dask-webinterface --allow tcp:8787

    Note that these create firewall rules for your entire Google Cloud project. If you want to be more selective, you can restrict a rule to tagged instances (gcloud compute firewall-rules create jupyter-notebook --allow tcp:8888 --target-tags=[instance-name]).

Further note: despite the firewall rules, I haven't been successful in exposing the dask scheduler dashboard (port 8787) to the world. To access the scheduler, I've needed to tunnel the port using something like gcloud compute ssh [instance-name] -- -L8787:localhost:8787.

  3. ssh into your instance. You can use the browser-based shell or the gcloud command, gcloud compute ssh [instance-name].

  4. Docker is already installed, so to start the container, you need to run the command:

docker run --net=host ckemere/jupyter

The --net=host flag shares the instance's network with the container, exposing all of the container's ports directly on the instance.

  5. Now find your instance's external IP address in the web console, point your browser at https://[instance-ip]:8888, and you should see a Jupyter notebook.

  6. In my typical workflow, I then use the Jupyter terminal interface to clone whatever repository holds my analysis scripts into the container. The example notebooks also show how to use the gcsfs package to load data stored in Google Cloud Storage buckets; you can upload data to a bucket using the drag-and-drop web interface in the cloud console.
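As a minimal sketch of how gcsfs can be used from a notebook (the project, bucket, and file names below are placeholders; the example notebooks show the actual workflow):

    import gcsfs

    # On a Compute Engine instance, credentials are typically picked up
    # automatically from the instance's service account.
    fs = gcsfs.GCSFileSystem(project='my-gcloud-project')

    # List a bucket and read a file directly from Google Cloud Storage
    print(fs.ls('my-data-bucket'))
    with fs.open('my-data-bucket/session01/spikes.npz', 'rb') as f:
        raw_bytes = f.read()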

Option 2 - Start up a container using the alpha direct-container option in Google Cloud

You have to register for access to this feature, but it lets you start a container directly during instance boot-up. I've mainly done this in the web console. Once you've been granted access, a checkbox labeled "Deploy a container image to this VM instance." appears; check it and enter ckemere/jupyter as the container image. With this option, the container's ports are automatically exposed on the main network, so you don't need to worry about the -p option. You will, however, need to open your firewall if you want to access your notebooks directly (vs. an ssh tunnel).
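For reference, a rough command-line equivalent of this option uses the create-with-container command that later became available in the gcloud SDK; the exact command group and flags under the alpha program may differ:

    gcloud compute instances create-with-container [instance-name] \
        --machine-type=n1-standard-64 \
        --container-image=ckemere/jupyter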

Option 3 - Start up a container and a dask scheduler using Container Engine rather than Compute Engine

See the dask-kubernetes repository for more information about this. Note that I've also created a version of the dask-kubernetes Docker image that has nelpy installed; it's called ckemere/dask-kubernetes. You can see those Dockerfiles in my fork of the repository.
