This repository has been archived by the owner on Jun 14, 2023. It is now read-only.

fixes
ignacio committed Jan 11, 2019
1 parent 512ff7b commit b8ba626
Showing 12 changed files with 289 additions and 250 deletions.
10 changes: 5 additions & 5 deletions source/user/howto/develop-model.rst
@@ -35,7 +35,7 @@ Do ``git push origin master`` in both created directories. This puts your initia
----------------------------------------

The structure of ``your_project`` created using
`DEEP UC template <https://github.com/indigo-dc/cookiecutter-data-science>`__ contains
`DEEP UC template <https://github.com/indigo-dc/cookiecutter-data-science>`_ contains
the following core items needed to develop a DEEP UC model:
::

@@ -60,9 +60,9 @@ accordingly into the directory structure.


2.2 Make datasets
==================
=================

Source files in this directory aim to manipulate with raw dataset(s).
Source files in this directory aim to manipulate raw datasets.
The output of this step is also raw data, but cleaned and/or pre-formatted.
::

@@ -73,9 +73,9 @@ The output of this step is also raw data, but cleaned and/or pre-formatted.
2.3 Build features
===================

This step takes the output from the previous step ``Make datasets`` and
This step takes the output from the previous step `Make datasets` and
creates train, test as well as validation ML data from raw but cleaned and pre-formatted data.
The realisation of this step depends on concrete UC, the aim of the application as well as
The implementation of this step depends on the concrete Use Case, the aim of the application, and the
available technological background (e.g. high-performance support for data processing).
::

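
The split into train, validation and test data described in ``Build features`` could look roughly like the sketch
below. It is only an illustration using the Python standard library: the function name ``split_dataset``, the
fractions and the seed are assumptions and not part of the DEEP UC template.

```python
import random


def split_dataset(items, val_fraction=0.1, test_fraction=0.1, seed=42):
    """Shuffle items reproducibly and split them into train/validation/test lists.

    Hypothetical helper: name, fractions and seed are illustrative assumptions.
    """
    items = list(items)
    # A dedicated Random instance keeps the shuffle reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(items)
    n_test = int(len(items) * test_fraction)
    n_val = int(len(items) * val_fraction)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test
```

For example, ``train, val, test = split_dataset(list_of_cleaned_files)`` keeps the three subsets disjoint, so no
sample is used both for training and for evaluation.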
208 changes: 117 additions & 91 deletions source/user/howto/rclone.rst
@@ -1,12 +1,18 @@
.. include:: <isonum.txt>
.. include:: <isonum.txt>
.. highlight:: console


How to use rclone
=================


Installation of rclone in Docker image (pro)
--------------------------------------------
All the applications in the `DEEP Open Catalog <https://deephdc.github.io/>`_ are packed within a Docker containing rclone installed by default. If you want to create a Docker containing your own application, you should install rclone in the container to be able to access the data stored remotely. The following lines are an example of what has to be added in the Dockerfile when installation is based on Ubuntu. For other Linux flavors, please, refer to the `rclone official site <https://rclone.org/downloads/>`_ ::
All the applications in the `DEEP Open Catalog <https://deephdc.github.io/>`_ are packaged in a Docker image with
rclone installed by default. If you want to create a Docker image containing your own application, you should install rclone
in the container to be able to access the data stored remotely. The following lines are an example of what has to be
added to the Dockerfile when the installation is based on Ubuntu. For other Linux flavors, please refer to
the `rclone official site <https://rclone.org/downloads/>`_ ::

# install rclone
RUN wget https://downloads.rclone.org/rclone-current-linux-amd64.deb && \
@@ -23,13 +29,17 @@ Nextcloud configuration for rclone
----------------------------------
.. image:: ../../_static/nc-access.png

After login into `DEEP-Nextcloud <https://nc.deep-hybrid-datacloud.eu/login>`_ with your DEEP-IAM credentials, go to (1) **Settings (top right corner)** |rarr| (2) **Security** |rarr| (3) **Devices & sessions**. Set a name for you application and clik on **Create new app password**. That user and password is what one needs to include in the rclone config file (rclone.conf).
After logging into `DEEP-Nextcloud <https://nc.deep-hybrid-datacloud.eu/login>`_ with your DEEP-IAM credentials, go to
(1) **Settings (top right corner)** |rarr| (2) **Security** |rarr| (3) **Devices & sessions**. Set a name for your
application and click on **Create new app password**. The resulting user and password are what you need to include in the rclone
config file (``rclone.conf``).


Creating rclone.conf
--------------------

You can install rclone at your host or run Docker image with rclone installed (see installation steps of rclone above). In order to create the cofiguration file (rclone.conf) for rclone::
You can install rclone on your host or run a Docker image with rclone installed (see installation steps of rclone above).
In order to create the configuration file (``rclone.conf``) for rclone::
$ rclone config
choose "n" for "New remote"
@@ -41,115 +51,131 @@ You can install rclone at your host or run Docker image with rclone installed (s
specify password (see "Nextcloud configuration for rclone" above).
by default rclone.conf is created in your $HOME/.config/rclone/rclone.conf
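
Once ``rclone config`` has finished, one may want to verify that the remote really ended up in the configuration file.
The sketch below assumes that ``rclone.conf`` is an INI-style file with one section per remote, so that Python's
``configparser`` can read it. The helper name ``remote_is_configured`` is hypothetical, and ``deep-nextcloud`` is the
remote name used in the examples of this page.

```python
import configparser
import os


def remote_is_configured(remote, conf_path=None):
    """Return True if a section named `remote` exists in rclone.conf.

    Hypothetical helper: assumes rclone.conf is INI-formatted.
    """
    if conf_path is None:
        # Default location mentioned above: $HOME/.config/rclone/rclone.conf
        conf_path = os.path.join(os.path.expanduser('~'),
                                 '.config', 'rclone', 'rclone.conf')
    # Interpolation is disabled so that '%' characters in values are harmless.
    parser = configparser.ConfigParser(interpolation=None)
    # read() silently skips files that do not exist.
    parser.read(conf_path)
    return parser.has_section(remote)
```

For example, ``remote_is_configured('deep-nextcloud')`` checks the default location on the host, while
``remote_is_configured('deep-nextcloud', '/rclone/rclone.conf')`` checks an explicitly given file.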

The rclone.conf file should be in your host, i.e. outside of container. **DO NOT STORE IT IN THE CONTAINER** (e.g. if you use uDocker, it will be stored in your filesystem, even being in the container).

.. important::
   The ``rclone.conf`` file should be kept on your host, i.e. outside of the container. **DO NOT STORE IT IN THE CONTAINER** (e.g.
   if you use uDocker, it will be stored in your filesystem even though it is inside the container).

Then one has two options:

If your know under what user your run your application in the container (e.g. if docker or nvidia-docker is used, most probably this is 'root') you can mount your host rclone.conf into the container as::
If you know under which user you run your application in the container (e.g. if docker or nvidia-docker is used, most
probably this is 'root') you can mount your host ``rclone.conf`` into the container as::
$ docker run -ti -v $HOSTDIR_WITH_RCLONE_CONF/rclone.conf:/root/.config/rclone/rclone.conf <your-docker-image>

i.e. you mount rclone.conf file itself directly as a volume.
i.e. you mount the ``rclone.conf`` file itself directly as a volume.

One can also mount rclone directory with the rclone.conf file::
One can also mount the rclone directory containing the ``rclone.conf`` file::

$ docker run -ti -v $HOSTDIR_WITH_RCLONE_CONF:/root/.config/rclone <your-docker-image>

A more reliable way can be to mount either rclone directory or directly rclone.conf file into a pre-defined location and not (container) user-dependent place::
A more reliable approach is to mount either the rclone directory or the ``rclone.conf`` file itself into a pre-defined
location that does not depend on the (container) user::

$ docker run -ti -v $HOSTDIR_WITH_RCLONE_CONF:/rclone <your-docker-image>
$ docker run -ti -v $HOSTDIR_WITH_RCLONE_CONF:/rclone <your-docker-image>

One has, however, to call rclone with "--config" option to point to the rclone.conf file, e.g::
One then has to call rclone with the ``--config`` option pointing to the ``rclone.conf`` file, e.g.::

$ rclone --config /rclone/rclone.conf ls deep-nextcloud:/Datasets/dogs_breed/models
$ rclone --config /rclone/rclone.conf ls deep-nextcloud:/Datasets/dogs_breed/models
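
When rclone is called from Python, the same ``--config`` option can be added to the argument list. The following is a
minimal sketch, not part of the original examples: the helper name ``rclone_command`` is hypothetical and it only
builds the argument list (it does not run rclone).

```python
def rclone_command(subcommand, *paths, config=None):
    """Build an rclone argument list, optionally pointing to a config file.

    Hypothetical helper: it only assembles the arguments, e.g. for subprocess.
    """
    command = ['rclone']
    if config is not None:
        # Same form as on the command line: rclone --config <file> <subcommand> ...
        command += ['--config', config]
    command.append(subcommand)
    command.extend(paths)
    return command
```

The resulting list can be passed to ``subprocess.Popen()`` in the same way as in the Python examples on this page.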

Example code on usage of rclone from python
-------------------------------------------

**Simple example**

A simple call of rclone from python is via subprocess.Popen()::
A simple way to call rclone from Python is via ``subprocess.Popen()``:

import subprocess
# from deep-nextcloud into the container
command = (['rclone', 'copy', 'deep-nextcloud:/Datasets/dogs_breed/data', '/srv/dogs_breed_det/data'])
result = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
output, error = result.communicate()
.. code-block:: python

   import subprocess

   # from deep-nextcloud into the container
   command = (['rclone', 'copy', 'deep-nextcloud:/Datasets/dogs_breed/data', '/srv/dogs_breed_det/data'])
   result = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
   output, error = result.communicate()

**Advanced examples**

More advanced usage includes calling rclone with various options (ls, copy, check) in order to check file existence at Source, check if after copying two versions match exactly.

* rclone_call::

def rclone_call(src_path, dest_dir, cmd = 'copy', get_output=False):
""" Function
rclone calls
"""
if cmd == 'copy':
command = (['rclone', 'copy', '--progress', src_path, dest_dir])
elif cmd == 'ls':
command = (['rclone', 'ls', src_path])
elif cmd == 'check':
command = (['rclone', 'check', src_path, dest_dir])
if get_output:
result = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
else:
result = subprocess.Popen(command, stderr=subprocess.PIPE)
output, error = result.communicate()
return output, error

* rclone_copy::

def rclone_copy(src_path, dest_dir, src_type='file'):
""" Function for rclone call to copy data (sync?)
:param src_path: full path to source (file or directory)
:param dest_dir: full path to destination directory (not file!)
:param src_type: if source is file (default) or directory
:return: if destination was downloaded, and possible error
"""
error_out = None
if src_type == 'file':
src_dir = os.path.dirname(src_path)
dest_file = src_path.split('/')[-1]
dest_path = os.path.join(dest_dir, dest_file)
else:
src_dir = src_path
dest_path = dest_dir
# check first if we find src_path
output, error = rclone_call(src_path, dest_dir, cmd='ls')
if error:
print('[ERROR] %s (src):\n%s' % (src_path, error))
error_out = error
dest_exist = False
else:
# if src_path exists, copy it
output, error = rclone_call(src_path, dest_dir, cmd='copy')
if not error:
# compare two directories, if copied file appears in output
# as not found or not matching -> Error
print('[INFO] File %s copied. Check if (src) and (dest) really match..' % (dest_file))
output, error = rclone_call(src_dir, dest_dir, cmd='check')
if 'ERROR : ' + dest_file in error:
print('[ERROR] %s (src) and %s (dest) do not match!' % (src_path, dest_path))
error_out = 'Copy failed: ' + src_path + ' (src) and ' + \
dest_path + ' (dest) do not match'
dest_exist = False
else:
output, error = rclone_call(dest_path, dest_dir,
cmd='ls', get_output = True)
file_size = [ elem for elem in output.split(' ') if elem.isdigit() ][0]
print('[INFO] Checked: Successfully copied to %s %s bytes' % (dest_path, file_size))
dest_exist = True
else:
print('[ERROR] %s (src):\n%s' % (dest_path, error))
error_out = error
More advanced usage includes calling rclone with various commands (``ls``, ``copy``, ``check``) in order to check
that a file exists at the source and that, after copying, the two versions match exactly.

* rclone_call

.. code-block:: python

   import subprocess


   def rclone_call(src_path, dest_dir, cmd = 'copy', get_output=False):
       """ Function
       rclone calls
       """
       if cmd == 'copy':
           command = (['rclone', 'copy', '--progress', src_path, dest_dir])
       elif cmd == 'ls':
           command = (['rclone', 'ls', src_path])
       elif cmd == 'check':
           command = (['rclone', 'check', src_path, dest_dir])
       else:
           raise ValueError('Unsupported rclone command: %s' % cmd)

       # universal_newlines=True returns text instead of bytes, so that the
       # output can be compared with ordinary strings
       if get_output:
           result = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                                     universal_newlines=True)
       else:
           result = subprocess.Popen(command, stderr=subprocess.PIPE, universal_newlines=True)
       output, error = result.communicate()
       return output, error

.. todo:: This Python code should be correctly indented

* rclone_copy

.. code-block:: python

   import os


   def rclone_copy(src_path, dest_dir, src_type='file'):
       """ Function for rclone call to copy data (sync?)
       :param src_path: full path to source (file or directory)
       :param dest_dir: full path to destination directory (not file!)
       :param src_type: if source is file (default) or directory
       :return: if destination was downloaded, and possible error
       """
       error_out = None

       if src_type == 'file':
           src_dir = os.path.dirname(src_path)
           dest_file = src_path.split('/')[-1]
           dest_path = os.path.join(dest_dir, dest_file)
       else:
           src_dir = src_path
           dest_path = dest_dir

       # check first if we find src_path
       output, error = rclone_call(src_path, dest_dir, cmd='ls')
       if error:
           print('[ERROR] %s (src):\n%s' % (src_path, error))
           error_out = error
           dest_exist = False
       else:
           # if src_path exists, copy it
           output, error = rclone_call(src_path, dest_dir, cmd='copy')
           if not error:
               # compare two directories, if copied file appears in output
               # as not found or not matching -> Error
               print('[INFO] File %s copied. Check if (src) and (dest) really match..' % (dest_file))
               output, error = rclone_call(src_dir, dest_dir, cmd='check')
               if 'ERROR : ' + dest_file in error:
                   print('[ERROR] %s (src) and %s (dest) do not match!' % (src_path, dest_path))
                   error_out = 'Copy failed: ' + src_path + ' (src) and ' + \
                               dest_path + ' (dest) do not match'
                   dest_exist = False
               else:
                   output, error = rclone_call(dest_path, dest_dir,
                                               cmd='ls', get_output = True)
                   file_size = [ elem for elem in output.split(' ') if elem.isdigit() ][0]
                   print('[INFO] Checked: Successfully copied to %s %s bytes' % (dest_path, file_size))
                   dest_exist = True
           else:
               print('[ERROR] %s (src):\n%s' % (dest_path, error))
               error_out = error
               dest_exist = False

       return dest_exist, error_out

.. todo:: This Python code should be correctly indented
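
The functions above treat any text on standard error as a failure. As an alternative, one could look at the exit
status of the process instead. The sketch below assumes that rclone, like most command-line tools, exits with a
non-zero status when a command fails. The helper name ``run_command`` is hypothetical and it works for any command,
not only rclone.

```python
import subprocess


def run_command(command):
    """Run a command and return (ok, stdout, stderr) with the output as text.

    Hypothetical helper: `ok` is True only if the exit status is zero.
    """
    result = subprocess.run(command,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE,
                            universal_newlines=True)
    return result.returncode == 0, result.stdout, result.stderr
```

For example, ``ok, output, error = run_command(['rclone', 'ls', 'deep-nextcloud:/Datasets/dogs_breed/models'])`` would
report ``ok`` as ``False`` when the listing fails, regardless of what was printed on standard error.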
9 changes: 5 additions & 4 deletions source/user/howto/train-model-locally.rst
@@ -26,7 +26,8 @@ Possible option include image classifiers, etc.

.. todo:: Check that names of the docker containers are correct for the image classifier example.

You will find that your model has an associate Docker container in DockerHub. Please download and run the container with:
You will find that your model has an associated Docker container in `DockerHub <https://hub.docker.com/u/deephdc/>`_.
Please download and run the container with:

.. code-block:: console
@@ -38,7 +39,7 @@ For example if you wanted to download the image classifier model you would have
.. code-block:: console
$ docker pull deephdc/deep-oc-image-classification-tf
$ docker run -p 5000:5000 -p 6006:6006 -ti imgclas-tf-normal /bin/bash
$ docker run -p 5000:5000 -p 6006:6006 -ti deephdc/deep-oc-image-classification-tf /bin/bash
We are using the port ``5000`` to deploy the API and the port ``6006`` to monitor the training (for example using
`Tensorboard <https://www.tensorflow.org/guide/summaries_and_tensorboard>`_).
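
To check from the host whether the two published ports are reachable once the container is running, a small sketch
like the following could be used. It relies only on the Python standard library. The helper name ``port_is_open`` and
the use of ``127.0.0.1`` as the host address are assumptions.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be established.

    Hypothetical helper for a quick reachability check.
    """
    try:
        # create_connection raises OSError if the connection is refused or times out
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, ``port_is_open('127.0.0.1', 5000)`` would be expected to return ``True`` once the API is up, and
``port_is_open('127.0.0.1', 6006)`` once a monitoring tool such as Tensorboard is listening.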
@@ -53,7 +54,7 @@ We are using the port ``5000`` to deploy the API and the port ``6006`` to monito
5. Train the model
==================

Now comes the fun! Go to `<http://0.0.0.0:5000>`_ and look for the train mehod. Modify the training parameters you wish to
change and execute. If some kind of monitorization tool is available for this model you will be able to folllow the training
Now comes the fun! Go to `<http://0.0.0.0:5000>`_ and look for the ``train`` method. Modify the training parameters you wish to
change and execute. If a monitoring tool is available for this model, you will be able to follow the training
progress from `<http://0.0.0.0:6006>`_.
