tensorflow-cpu-data-science-project

Repository containing scaffolding for a Python 3-based data science project using the TensorFlow ecosystem.

Creating a new project from this template

Simply follow GitHub's instructions for creating a repository from a template to start a new project repository based on this scaffolding.

Project organization

Project organization is based on ideas from Good Enough Practices for Scientific Computing.

  1. Put each project in its own directory, which is named after the project.
  2. Put external scripts or compiled programs in the bin directory.
  3. Put raw data and metadata in a data directory.
  4. Put text documents associated with the project in the doc directory.
  5. Put all Docker related files in the docker directory.
  6. Install the Conda environment into an env directory.
  7. Put all notebooks in the notebooks directory.
  8. Put files generated during cleanup and analysis in a results directory.
  9. Put project source code in the src directory.
  10. Name all files to reflect their content or function.
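Taken together, these conventions give a project layout along the following lines (the project name and the individual file names shown are placeholders):

    my-project/
        bin/             # external scripts and compiled programs
        data/            # raw data and metadata
        doc/             # text documents associated with the project
        docker/          # Docker related files
        env/             # Conda environment (not under version control)
        notebooks/       # notebooks
        results/         # files generated during cleanup and analysis
        src/             # project source code
        environment.yml
        requirements.txt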

Using Conda

Creating the Conda environment

After adding any necessary dependencies that should be downloaded via conda to the environment.yml file, and any dependencies that should be downloaded via pip to the requirements.txt file, you create the Conda environment in a sub-directory ./env of your project directory by running the following command.

$ conda env create --prefix ./env --file environment.yml
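For reference, an environment.yml for this kind of project might look roughly like the following; the package names and versions are purely illustrative, and the pip dependencies are delegated to the requirements.txt file.

    name: null
    channels:
      - conda-forge
      - defaults
    dependencies:
      - python=3.8      # illustrative version pin
      - pip
      - tensorflow      # CPU-only build from conda-forge
      - pip:
        - -r requirements.txt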

Once the new environment has been created you can activate the environment with the following command.

$ conda activate ./env
(/path/to/env) $
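When you are finished working in the environment you can deactivate it as usual.

(/path/to/env) $ conda deactivate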

Note that the ./env directory is not under version control, as it can always be re-created by running the ./bin/create-conda-environment.sh script as necessary.

Updating the Conda environment

If you add or remove dependencies in the environment.yml file after the environment has been created, you can update the environment with the following command.

$ conda env update --prefix ./env --file environment.yml --prune
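If the environment ever gets into an inconsistent state, one option is simply to remove it and re-create it from scratch.

$ conda env remove --prefix ./env
$ conda env create --prefix ./env --file environment.yml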

Verifying the Conda environment

After building the Conda environment you can check that Horovod has been built with support for TensorFlow and MPI with the following command.

$ conda activate ./env # optional if environment already active
(/path/to/env) $ horovodrun --check-build
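As an additional sanity check, you can confirm that TensorFlow itself imports correctly from inside the environment, for example:

(/path/to/env) $ python -c "import tensorflow as tf; print(tf.__version__)"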

Listing the full contents of the Conda environment

The explicit dependencies for the project are listed in the environment.yml file. To see the full list of packages installed into the environment, run the following command.

$ conda list --prefix ./env
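If you want to capture the exact set of resolved packages (for example, as a reproducible snapshot to share with collaborators), you can export the full environment to a file; the output file name below is only a suggestion.

$ conda env export --prefix ./env --file environment.lock.yml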

Using Docker

In order to build Docker images for your project and run containers, you will need to install Docker and Docker Compose.

Detailed instructions for using Docker to build an image and launch containers can be found in the docker/README.md.
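As a rough sketch only (the authoritative steps are in docker/README.md, and these commands assume they are run from the directory containing the docker-compose.yml file), a typical Docker Compose workflow looks something like the following.

$ docker-compose build    # build the project image
$ docker-compose up -d    # launch the container(s) in the background
$ docker-compose down     # stop and remove the container(s)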
