Skip to content

Latest commit

 

History

History
153 lines (106 loc) · 5.9 KB

ch02_installation.asciidoc

File metadata and controls

153 lines (106 loc) · 5.9 KB

Chapter 2: Create an environment for chemoinformatics

Build the environment required for this document.

About Anaconda

Anaconda is a package for easy environment creation and management for doing machine learning. You can also easily install packages, like RDKit, which will be explained later.

Q&A

Why use Anaconda?

The programming language Python has a relatively large number of standard libraries, but you need to install the libraries for chemoinformatics yourself. This is not a big deal if you get used to it, but it will be troublesome for beginners. Anaconda comes into play in order to reduce this effort.

There are two major versions of Python: 2.x and 3.x.

Support for 2.x will end in 2020, so new learners do not need to use 2.x.

How to install Anaconda

Let’s install Anaconda. Visit the official website and download the Python 3 installer for your environment. If the OS is Linux / Mac, you can select the installer of GUI / CUI, so download Python 3.7 64-bit command line installer.

APX+RVX
$ bash ~/Downloads/Anaconda3-2019.03-MacOSX-x86_64.Sh # Please change the installer name accordingly

Press Enter

Welcome to Anaconda3 2018.12

In order to continue the installation process, please review the license
agreement.
Please, press ENTER to continue
>>>

Continue to press Enter and enter yes or no

Do you accept the license terms? [yes|no]
[no] >>>

I am asked where to install, but the default location is usually fine. Press Return.

Anaconda3 will now be installed into this location:
/Users/kzfm/anaconda3

  - Press ENTER to confirm the location
  - Press CTRL-C to abort the installation
  - Or specify a different location below

After installation you will be asked if you want to install VSCode, press No.

Thank you for installing Anaconda3!

===========================================================================

Anaconda is partnered with Microsoft! Microsoft VSCode is a streamlined
code editor with support for development operations like debugging, task
running and version control.

To install Visual Studio Code, you will need:
  - Internet connectivity

Visual Studio Code License: https://code.visualstudio.com/license

Do you wish to proceed with the installation of Microsoft VSCode? [yes|no]
>>> Please answer 'yes' or 'no':
>>>

Once the Anaconda installation is complete, you will be able to use the 'conda' command from a command prompt or terminal.

Build a Virtual Environment and Install packages

The command after "-n" is "py4chemoinformatics", but you can use any name you like. After building the virtual environment, install the packages used in this chapter and later.

$ conda create -n py4chemoinformatics python3.6
$ source activate py4chemoinformatics # Mac/Linux
$ activate py4chemoinformatics # Windows

# install packages
$ conda install -c conda-forge rdkit
$ conda install -c conda-forge seaborn
$ conda install -c conda-forge ggplot
$ conda install -c conda-forge git

If your jupyter notebook does not work in a virtual environment, you will need to install it again.

$ conda install notebook ipykernel

Description of the installed package

RDKit

RDKit is one of the most frequently used toolkits in the field of chemoinformatics. It is one of the so-called open source software (OSS) and can be used free of charge. For more information please refer to Introduction.

seaborn

It is one of the packages for visualizing statistical data.

ggplot

One of the graph drawing packages is that it can draw rationally with a consistent grammar. Originally developed for the statistical analysis language R, it was ported to Python by the company yhat .

Git

It is a version control system. I will not explain Git in this book, but if you do not know Git at all, take a look at Git Primer, which can be understood by anyone.

As explained in "Introduction", all data including pdf will be downloaded by the following command, so please download it as necessary.

$ git clone https://github.com/joofio/py4chemoinformatics.git

More about Conda

Why create a virtual environment

Some systems use Python internally to provide various features, so changing the Python version for a particular package can cause problems. Virtual environments solve these problems. Even if the package requires different library versions, you can set up a virtual Python environment for trial and error. If it becomes unnecessary, the virtual environment can be easily deleted without causing any problems in the original environment. So, by being able to create separate development environments in one system, you will not be bothered by library dependencies problems and Python version differences that often occur during development.

In this book, only one virtual environment is prepared for this book, but in reality, many virtual environments are created and developed. For this reason, I will list frequently used conda subcommands.

$ conda install <package name> # install package
$ conda create -n <Name-of-virtual-environment> python = <version> # Create virtual environment.
$ conda info -e  # Display virtual environment list created
$ conda remove -n <environment-name> # Virtual environment deletion
$ source activate <environment-name> # Using virtual environment ( Mac/Linux)
$ activate <environment-name> # Using virtual environment (Windows)
$ source deactivate # leaving virtual environment
$ conda list # Display a list of libraries installed in the virtual environment you are using now