This is an examle of how to setup instance on floydhub to run lesson 1 (currently with the sample dataset) of the awesome deep learning course available at http://course.fast.ai/.
First, you'll need a floydhub account and have the floyd CLI installed. Follow their online instruction at https://www.floydhub.com/welcome.
Next, you'll want to create a working directory in your local computer. All files in this directory will be uploaded to the floydhub cloud instance when you do floyd run
(explained in detail later).
mkdir ~/Projects/
cd ~/Projects/
If you would like to skip all the details and just to get things going, you may choose to clone all files in this repo.
git clone https://github.com/YuelongGuo/floydhub.fast.ai.git
cd floydhub.fast.ai/
Now, to fill in with a bit more details:
The fast.ai github repository (https://github.com/fastai/courses/tree/master/deeplearning1/nbs) includes files for all the phase I lessons. For the purpose of this example, and to keep it simple, I copied only the bare minimul necessary files to the working directory. This includes:
- lesson1.ipynb
- utils.py
- vgg16.py
- vgg16bn.py
- some data to analyze
I only copied the dogscats sample dataset (dogscats/sample if you unzip the dogscats.zip at http://www.platform.ai/files/dogscats.zip). However, this is considered very bad practice, because I am creating a copy of the data every time I start a new cloud instance. Floydhub has some instruction on how to manage data seperately, which I tried but haven't had it all figured out. I'll keep trying because for the full dataset this is necessary.
The scripts written in lesson 1 runs with python 2 + Keras + Theano, so you need to specify --env theano:py2
when you start the cloud instance (more details below). However, Floydnet theano:py2 environment is still missing one package - bcolz. For this, you need to add it to file "floyd_requirements.txt". This is the floydnet default for installing dependencies.
echo "bcolz" > floyd_requirements.txt
We will use Floyd CLI to initiate a cloud instance. Initiation:
floyd init your_favorate_task_name_e.g._neural_networks
To start a GPU instance with Jupyter notebook that is compatable to lesson 1 scripts, you need to specify the following parameters:
floyd run \
--mode jupyter \
--env theano:py2 \
--gpu
Wait for a few minutes, then you should get a website address in your console for the running Jupyter notebook.
We are almost there, but there is one more thing you need to configure - the keras.json file
Start a console session in jupyter notebook, then
mkdir ~/.keras
echo '{
"image_dim_ordering": "th",
"epsilon": 1e-07,
"floatx": "float32",
"backend": "theano"
}' > ~/.keras/keras.json
Please note the home directory on Floydhub instance is actually /root/ directory.
Now lesson 1 should run with the sample dataset.
For the full dataset, it is recommended that you build a data object in the floydnet. Refer to http://docs.floydhub.com/commands/data/ Then, specify the data id when you start a cloud instance:
floyd run \
--mode jupyter \
--data data_ID_e.g._jq4ZXUCSVer4t65rWeyieG
--env theano:py2 \
--gpu
Then the data should be available in /input directory.
After I try this out, I'll come back and share my experience.