<div style="border: 2px solid #8A9AD0; margin: 1em 0.2em; padding: 0.5em;">

# Virtual Environments For Software Development

by [The Carpentries](https://training.galaxyproject.org/hall-of-fame/carpentries/), [Helena Rasche](https://training.galaxyproject.org/hall-of-fame/hexylena/)

CC-BY licensed content from the [Galaxy Training Network](https://training.galaxyproject.org/)

**Questions**

- What are virtual environments in software development and why should you use them?
- How can we manage Python virtual environments and external (third-party) libraries?

**Objectives**

- Set up a Python virtual environment for our software project using <code style="color: inherit">venv</code> and <code style="color: inherit">pip</code>.
- Run our software from the command line.

**Time Estimation: 30M**
</div>


<p>“Virtual Environments” allow you to easily manage your installed Python packages and prevent conflicts between different projects’ dependencies. In general, most modern projects should use <code style="color: inherit">conda</code> for dependency management, but <code style="color: inherit">venv</code> can be convenient for Python-only projects.</p>
<blockquote class="comment" style="border: 2px solid #ffecc1; margin: 1em 0.2em">
<div class="box-title comment-title" id="comment"><i class="far fa-comment-dots" aria-hidden="true" ></i> Comment</div>
<p>This tutorial is significantly based on <a href="https://carpentries.org">the Carpentries</a> lesson <a href="https://carpentries-incubator.github.io/python-intermediate-development/">“Intermediate Research Software Development”</a>.</p>
</blockquote>
<p>If you are working with a Python project, you will often see something like the
following two lines somewhere near the top.</p>
<div class="language-python highlighter-rouge"><div><pre style="color: inherit; background: transparent"><code style="color: inherit"><span class="kn">from</span> <span class="n">matplotlib</span> <span class="kn">import</span> <span class="n">pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">import</span> <span class="n">numpy</span> <span class="k">as</span> <span class="n">np</span>
</code></pre></div></div>
<p>This means that our code requires two <em>external libraries</em> (also called third-party packages or dependencies) -
<code style="color: inherit">numpy</code> and <code style="color: inherit">matplotlib</code>.
Python applications often use external libraries that don’t come as part of the standard Python distribution. This means
that you will have to use a <em>package manager</em> tool to install them on your system.
Applications will also sometimes need a
specific version of an external library (e.g. because they require that a particular
bug has been fixed in a newer version of the library), or a specific version of the Python interpreter.
This means that each Python application you work with may require a different setup and a different set of dependencies, so it
is important to be able to keep these configurations separate to avoid confusion between projects.
The solution to this problem is to create a self-contained <em>virtual
environment</em> per project, which contains a particular version of Python plus a number of
additional external libraries.</p>
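
<p>As a rough illustration of what a missing dependency looks like from Python’s side, the sketch below checks whether the modules a script imports can be found by the current interpreter. The helper name <code style="color: inherit">missing_modules</code> is made up for this example; <code style="color: inherit">importlib.util.find_spec</code> is part of the standard library.</p>

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names the current interpreter cannot find."""
    # find_spec returns None when no installed module matches the name
    return [name for name in names if find_spec(name) is None]


# numpy and matplotlib are the two external libraries imported above;
# an empty list means both are available to this interpreter.
print(missing_modules(["numpy", "matplotlib"]))
```

<p>Running the same script with two different interpreters (or inside two different virtual environments) may well give two different answers, which is exactly the situation virtual environments are meant to keep under control.</p>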
<p>Virtual environments are not just a feature of Python - many modern programming languages offer similar mechanisms to isolate the code
of a specific project and make it easier to develop, run, test and share code with others. In this tutorial, we learn how
to set up a virtual environment to develop our code and manage our external dependencies.</p>
<blockquote class="agenda" style="border: 2px solid #86D486;display: none; margin: 1em 0.2em">
<div class="box-title agenda-title" id="agenda">Agenda</div>
<p>In this tutorial, we will cover:</p>
<ol id="markdown-toc">
<li><a href="#virtual-environments" id="markdown-toc-virtual-environments">Virtual Environments</a></li>
</ol>
</blockquote>
<h2 id="virtual-environments">Virtual Environments</h2>
<p>So what exactly are virtual environments, and why use them?</p>
<p>A Python virtual environment is an <strong>isolated working copy</strong> of a specific version of
the Python interpreter together with specific versions of a number of external libraries installed into that
virtual environment. A virtual environment is simply a <em>directory with a particular
structure</em>. It links to a Python interpreter and holds its own set of external libraries, so
different Python interpreters, or different versions of the same external library, can coexist side by side on your machine, with exactly one selected for each of your projects. This allows you to work on a particular
project without worrying about affecting other projects on your machine.</p>
<p>As more external libraries are added to your Python project over time, you can add them to
its specific virtual environment and avoid a great deal of confusion by having separate (smaller) virtual environments
for each project rather than one huge global environment with potential package version clashes. Another big motivator
for using virtual environments is that they make sharing your code with others much easier (as we will see shortly).
Here are some typical scenarios where the usage of virtual environments is highly recommended (almost unavoidable):</p>
<ul>
<li>You have an older project that only works under Python 2. You do not have the time to migrate the project to Python 3
or it may not even be possible as some of the third party dependencies are not available under Python 3. You have to
start another project under Python 3. The best way to do this on a single machine is to set up two separate Python virtual
environments.</li>
<li>One of your Python 3 projects is locked to use a particular older version of a third party dependency. You cannot use the
latest version of the
dependency as it breaks things in your project. In a separate branch of your project, you want to try and fix problems
introduced by the new version of the dependency without affecting the working version of your project. You need to set up
a separate virtual environment for your branch to ‘isolate’ your code while testing the new feature.</li>
</ul>
<p>Most of the time, you do not have to worry too much about specific versions of the external libraries that your project depends on.
By default, installing a package into a virtual environment gets you the latest available version without specifying it explicitly.
Virtual environments also enable you to use a specific older version of a package for your project, should you need to.</p>
<blockquote class="tip" style="border: 2px solid #FFE19E; margin: 1em 0.2em">
<div class="box-title tip-title" id="tip-a-specific-python-or-package-version-is-only-ever-installed-once"><button class="gtn-boxify-button tip" type="button" aria-controls="tip-a-specific-python-or-package-version-is-only-ever-installed-once" aria-expanded="true"><i class="far fa-lightbulb" aria-hidden="true" ></i> Tip: A Specific Python or Package Version is Only Ever Installed Once<span class="fold-unfold fa fa-minus-square"></span></button></div>
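<p>Note that a <code style="color: inherit">venv</code> environment does not contain its own copy of the Python installation - the interpreter is installed once on your system
and referenced from each virtual environment. Packages installed with <code style="color: inherit">pip</code>, by contrast, are placed in each environment’s own <code style="color: inherit">site-packages</code> directory; some tools, such as <code style="color: inherit">conda</code>, can share a single copy of a package between environments.</p>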
</blockquote>
<h3 id="managing-python-virtual-environments">Managing Python Virtual Environments</h3>
<p>There are several commonly used command line tools for managing Python virtual environments:</p>
<ul>
<li><code style="color: inherit">venv</code>, available by default from the standard <code style="color: inherit">Python</code> distribution from <code style="color: inherit">Python 3.3+</code></li>
<li><code style="color: inherit">virtualenv</code>, needs to be installed separately but supports both <code style="color: inherit">Python 2.7+</code> and <code style="color: inherit">Python 3.3+</code> versions</li>
<li><code style="color: inherit">pipenv</code>, created to fix certain shortcomings of <code style="color: inherit">virtualenv</code></li>
<li><code style="color: inherit">conda</code>, package and environment management system (also included as part of the Anaconda Python distribution often used by the scientific community)</li>
<li><code style="color: inherit">poetry</code>, a modern Python packaging tool which handles virtual environments automatically</li>
</ul>
<p>While there are pros and cons for using each of the above, all will do the job of managing Python
virtual environments for you and it may be a matter of personal preference which one you go for.
In this course, we will use <code style="color: inherit">venv</code> to create and manage our
virtual environment (which is the preferred way for Python 3.3+).</p>
<p>Until a project’s needs go beyond what is available
in the Python ecosystem, e.g. when you depend on external packages like htslib
or bioinformatics tools that are simply not distributed via PyPI,
<code style="color: inherit">venv</code> is a good choice to get started with.</p>
<h3 id="managing-python-packages">Managing Python Packages</h3>
<p>Part of managing your (virtual) working environment involves installing, updating and removing external packages
on your system. The Python package manager tool <code style="color: inherit">pip</code> is most commonly used for this - it interacts
with and obtains packages from the central repository called the <a href="https://pypi.org/">Python Package Index (PyPI)</a>.
<code style="color: inherit">pip</code> can now be used with all Python distributions (including Anaconda).</p>
<blockquote class="tip" style="border: 2px solid #FFE19E; margin: 1em 0.2em">
<div class="box-title tip-title" id="tip-a-note-on-anaconda-and-code-style-quot-color-inherit-quot-conda-code"><button class="gtn-boxify-button tip" type="button" aria-controls="tip-a-note-on-anaconda-and-code-style-quot-color-inherit-quot-conda-code" aria-expanded="true"><i class="far fa-lightbulb" aria-hidden="true" ></i> Tip: A Note on Anaconda and <code style="color: inherit">conda</code><span class="fold-unfold fa fa-minus-square"></span></button></div>
<p>Anaconda is an open source Python
distribution commonly used for scientific programming - it conveniently installs Python, the package and environment manager <code style="color: inherit">conda</code>, and a
number of commonly used scientific computing packages so you do not have to obtain them separately.
<code style="color: inherit">conda</code> is an independent command line tool (available separately from the Anaconda distribution too) with dual functionality: (1) it is a package manager that helps you find Python packages from
remote package repositories and install them on your system, and (2) it is also a virtual environment manager. So, you can use <code style="color: inherit">conda</code> for both tasks instead of using <code style="color: inherit">venv</code> and <code style="color: inherit">pip</code>.</p>
</blockquote>
<h3 id="many-tools-for-the-job">Many Tools for the Job</h3>
<p>Installing and managing Python distributions, external libraries and virtual environments is, well,
complex. There is an abundance of tools for each task, each with its advantages and disadvantages, and there are different
ways to achieve the same effect (and even different ways to install the same tool!).
Note that each Python distribution comes with its own version of
<code style="color: inherit">pip</code> - and if you have several Python versions installed you have to be extra careful to use the correct <code style="color: inherit">pip</code> to
manage external packages for that Python version.</p>
<p><code style="color: inherit">venv</code> and <code style="color: inherit">pip</code> are considered the <em>de facto</em> standards for virtual environment and package management for Python 3.
However, the advantages of using Anaconda and <code style="color: inherit">conda</code> are that you get (most of the) packages needed for
scientific code development included with the distribution. If you are only collaborating with others who are also using
Anaconda, you may find that <code style="color: inherit">conda</code> satisfies all your needs. It is good, however, to be aware of all these tools,
and use them accordingly. As you become more familiar with them, you will realise that equivalent tools work in a similar
way even though the command syntax may differ (and that there are equivalent tools for other programming languages
to which your knowledge can be transferred).</p>
<figure id="figure-1" style="max-width: 90%; margin:auto;"><img src="../../images/xkcd/python_environment.png" alt="Python environment hell XKCD comic  showing boxes like pip, easy_install, homebrew 2.7, anaconda, homebrew 3.6, /usr/local/Cellar, ~/python/, and a chaotic mess of arrows moving between them all. At the bottom is the text: My python environment has become so degraded that my laptop has been declared a superfund site. (A superfund site is generally an environmental disaster area.). " width="492" height="487" loading="lazy" /><figcaption><span class="figcaption-prefix"><strong>Figure 1</strong>:</span> Python Environment Hell from XKCD 1987 (CC-BY-NC 2.5)</figcaption></figure>
<p>Let us have a look at how we can create and manage virtual environments from the command line using <code style="color: inherit">venv</code> and manage packages using <code style="color: inherit">pip</code>.</p>
<h3 id="creating-a-venv-environment">Creating a <code style="color: inherit">venv</code> Environment</h3>
<p>Creating a virtual environment with <code style="color: inherit">venv</code> is done by executing the following command:</p>


In [None]:
python3 -m venv /path/to/new/virtual/environment

<p>where <code style="color: inherit">/path/to/new/virtual/environment</code> is the path to the directory where you want the environment to live - conventionally within
your software project so they are co-located.
This will create the target directory for the virtual environment (and any parent directories that don’t exist already).</p>
<p>For our project, let’s create a virtual environment called <code style="color: inherit">venv</code> off the project root:</p>


In [None]:
python3 -m venv venv

<p>If you list the contents of the newly created <code style="color: inherit">venv</code> directory, on a Mac or Linux system
(slightly different on Windows as explained below) you should see something like:</p>


In [None]:
ls -l venv

<p>So, running the <code style="color: inherit">python3 -m venv venv</code> command created the target directory called <code style="color: inherit">venv</code>
containing:</p>
<ul>
<li><code style="color: inherit">pyvenv.cfg</code> configuration file with a home key pointing to the Python installation from which the command was run,</li>
<li><code style="color: inherit">bin</code> subdirectory (called <code style="color: inherit">Scripts</code> on Windows) containing a symlink to the Python interpreter binary used to create the
environment, together with the <code style="color: inherit">activate</code> scripts,</li>
<li><code style="color: inherit">lib/pythonX.Y/site-packages</code> subdirectory (called <code style="color: inherit">Lib\site-packages</code> on Windows) to contain its own independent set of installed Python packages isolated from other projects,</li>
<li>various other configuration and supporting files and subdirectories.</li>
</ul>
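
<p>The same layout can be inspected from Python. The sketch below uses the standard library’s <code style="color: inherit">venv.EnvBuilder</code> class (the programmatic counterpart of <code style="color: inherit">python3 -m venv</code>) to create a throwaway environment in a temporary directory and read its <code style="color: inherit">pyvenv.cfg</code>. The helper name <code style="color: inherit">create_env</code> and the use of <code style="color: inherit">with_pip=False</code> (to keep the example quick) are choices made for this illustration; the command line tool installs <code style="color: inherit">pip</code> by default.</p>

```python
import os
import tempfile
import venv
from pathlib import Path


def create_env(env_dir):
    """Create a virtual environment in env_dir and return its pyvenv.cfg as a dict."""
    # Mirror the command line default: symlinks everywhere except Windows.
    # with_pip=False is an assumption of this sketch to keep it fast.
    builder = venv.EnvBuilder(with_pip=False, symlinks=(os.name != "nt"))
    builder.create(env_dir)

    config = {}
    for line in (Path(env_dir) / "pyvenv.cfg").read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep:  # only keep "key = value" lines
            config[key.strip()] = value.strip()
    return config


with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "venv"
    config = create_env(env_dir)
    # Top-level entries of the new environment (bin or Scripts, lib, pyvenv.cfg, ...)
    entries = sorted(path.name for path in env_dir.iterdir())
    print(entries)
    # The "home" key points at the Python installation used to create the environment
    print(config["home"])
```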
<blockquote class="tip" style="border: 2px solid #FFE19E; margin: 1em 0.2em">
<div class="box-title tip-title" id="tip-naming-virtual-environments"><button class="gtn-boxify-button tip" type="button" aria-controls="tip-naming-virtual-environments" aria-expanded="true"><i class="far fa-lightbulb" aria-hidden="true" ></i> Tip: Naming Virtual Environments<span class="fold-unfold fa fa-minus-square"></span></button></div>
<p>What is a good name to use for a virtual environment? Using “venv” or “.venv” as the
name for an environment and storing it within the project’s directory seems to be the recommended way -
this way when you come across such a subdirectory within a software project,
by convention you know it contains its virtual environment details.
A slight downside is that all different virtual environments
on your machine then use the same name and the current one is determined by the context of the path
you are currently located in. A (non-conventional) alternative is to
use your project name for the name of the virtual environment, with the downside that there is nothing to indicate
that such a directory contains a virtual environment. In our case, we have settled on the name “venv” since it is
not a hidden directory and we want it to be displayed by the command line when listing directory contents (hence,
no need for the “.” in its name that would, by convention, make it hidden). In the future,
you can decide which naming convention works best for you. Here are some references for each of the naming conventions:</p>
<ul>
<li><a href="https://docs.python-guide.org/dev/virtualenvs/">The Hitchhiker’s Guide to Python</a> notes that “venv” is the general convention used globally</li>
<li><a href="https://docs.python.org/3/library/venv.html">The Python Documentation</a> indicates that “.venv” is common</li>
<li><a href="https://discuss.python.org/t/trying-to-come-up-with-a-default-directory-name-for-virtual-environments/3750">“venv” vs “.venv” discussion</a></li>
</ul>
</blockquote>
<p>Once you’ve created a virtual environment, you will need to activate it:</p>


In [None]:
source venv/bin/activate

<p>Activating the virtual environment will change your command line’s prompt to show what virtual environment
you are currently using (indicated by its name in round brackets at the start of the prompt),
and modify the environment so that running Python will get you the particular
version of Python configured in your virtual environment.</p>
<p>You can verify you are using your virtual environment’s version of Python by checking the path using <code style="color: inherit">which</code>:</p>


In [None]:
which python3

<p>When you’re done working on your project, you can exit the environment with:</p>


In [None]:
deactivate

<p>If you have just run <code style="color: inherit">deactivate</code>, reactivate the environment so it is ready for the next part:</p>


In [None]:
source venv/bin/activate

<blockquote class="tip" style="border: 2px solid #FFE19E; margin: 1em 0.2em">
<div class="box-title tip-title" id="tip-python-within-a-virtual-environment"><button class="gtn-boxify-button tip" type="button" aria-controls="tip-python-within-a-virtual-environment" aria-expanded="true"><i class="far fa-lightbulb" aria-hidden="true" ></i> Tip: Python Within A Virtual Environment<span class="fold-unfold fa fa-minus-square"></span></button></div>
<p>Within a virtual environment, the commands <code style="color: inherit">python</code> and <code style="color: inherit">pip</code> will refer to the version of Python you created the environment with. If you create a virtual environment with <code style="color: inherit">python3 -m venv venv</code>, <code style="color: inherit">python</code> will refer to <code style="color: inherit">python3</code> and <code style="color: inherit">pip</code> will refer to <code style="color: inherit">pip3</code>.</p>
<p>On some machines with Python 2 installed, the <code style="color: inherit">python</code> command may instead refer to the copy of Python 2 installed outside of the virtual environment, which can cause confusion. You can always check which version of Python you are using in your virtual environment with the command <code style="color: inherit">which python</code> to be absolutely sure. We continue using <code style="color: inherit">python3</code> and <code style="color: inherit">pip3</code> in this material to avoid confusion for those users, but the commands <code style="color: inherit">python</code> and <code style="color: inherit">pip</code> may work for you as expected.</p>
</blockquote>
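
<p>If you would rather ask Python itself, a minimal sketch along these lines reports which interpreter is running and whether it belongs to a virtual environment. The function name <code style="color: inherit">in_virtual_environment</code> is made up for this example; the comparison of <code style="color: inherit">sys.prefix</code> with <code style="color: inherit">sys.base_prefix</code> is the check described in the <code style="color: inherit">venv</code> documentation.</p>

```python
import os
import sys


def in_virtual_environment():
    """Return True when this interpreter is running from a virtual environment."""
    # In a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("In a virtual environment:", in_virtual_environment())
# The activate script sets VIRTUAL_ENV; it stays unset (None here) when the
# environment's interpreter is run directly without activating it first.
print("VIRTUAL_ENV:", os.environ.get("VIRTUAL_ENV"))
```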
<h3 id="installing-external-libraries-in-an-environment-with-pip">Installing External Libraries in an Environment with <code style="color: inherit">pip</code></h3>
<p>We noticed earlier that our code depends on two <em>external libraries</em> - <code style="color: inherit">numpy</code> and <code style="color: inherit">matplotlib</code>. In order
for the code to run on your machine, you need to
install these two dependencies into your virtual environment.</p>
<p>To install the latest version of a package with <code style="color: inherit">pip</code> you use pip’s <code style="color: inherit">install</code> command and specify the package’s name, e.g.:</p>


In [None]:
pip3 install numpy
pip3 install matplotlib

<p>or, for short, install multiple packages at once:</p>


In [None]:
pip3 install numpy matplotlib

<blockquote class="tip" style="border: 2px solid #FFE19E; margin: 1em 0.2em">
<div class="box-title tip-title" id="tip-how-about-code-style-quot-color-inherit-quot-python3-m-pip-install-code"><button class="gtn-boxify-button tip" type="button" aria-controls="tip-how-about-code-style-quot-color-inherit-quot-python3-m-pip-install-code" aria-expanded="true"><i class="far fa-lightbulb" aria-hidden="true" ></i> Tip: How About <code style="color: inherit">python3 -m pip install</code>?<span class="fold-unfold fa fa-minus-square"></span></button></div>
<p>Why are we not using <code style="color: inherit">pip</code> as an argument to <code style="color: inherit">python3</code> command, in the same way we did with <code style="color: inherit">venv</code>
(i.e. <code style="color: inherit">python3 -m venv</code>)? <code style="color: inherit">python3 -m pip install</code> should be used according to the
<a href="https://pip.pypa.io/en/stable/user_guide/#running-pip">official Pip documentation</a>; other official documentation
still seems to have a mixture of usages. Core Python developer Brett Cannon offers a
<a href="https://snarky.ca/why-you-should-use-python-m-pip/">more detailed explanation</a> of edge cases when the two options may produce
different results and recommends <code style="color: inherit">python3 -m pip install</code>. We kept the old-style command (<code class="language-plaintext highlighter-rouge">pip3 install</code>) as it seems more
prevalent among developers at the moment - but it may be a convention that will soon change and certainly something you should consider.</p>
</blockquote>
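
<p>One way to see why the <code style="color: inherit">-m pip</code> form is less ambiguous: it names the interpreter explicitly, so the packages end up with that interpreter and no other. The sketch below builds such a command line from within Python; <code style="color: inherit">pip_command</code> and <code style="color: inherit">run_pip</code> are hypothetical helper names, not part of <code style="color: inherit">pip</code>.</p>

```python
import subprocess
import sys


def pip_command(*args):
    """Build a command line that runs pip with the interpreter executing this script."""
    # sys.executable is the full path of the running interpreter, so pip
    # installs into that interpreter's environment rather than another one.
    return [sys.executable, "-m", "pip", *args]


def run_pip(*args):
    """Run pip with the current interpreter (requires pip to be installed for it)."""
    return subprocess.run(pip_command(*args), check=True)


print(pip_command("install", "numpy"))
```

<p>Calling <code style="color: inherit">run_pip("install", "numpy")</code> inside an activated environment would then be roughly equivalent to typing <code style="color: inherit">python3 -m pip install numpy</code> there.</p>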
<p>If you run the <code style="color: inherit">pip3 install</code> command on a package that is already installed, <code style="color: inherit">pip</code> will notice this and do nothing.</p>
<p>To install a specific version of a Python package give the package name followed by <code style="color: inherit">==</code> and the version number, e.g.
<code class="language-plaintext highlighter-rouge">pip3 install numpy==1.21.1</code>.</p>
<p>To specify a minimum version of a Python package, you can
do <code style="color: inherit">pip3 install "numpy&gt;=1.20"</code> (the quotes stop the shell from treating <code style="color: inherit">&gt;</code> as output redirection).</p>
<p>To upgrade a package to the latest version, use e.g. <code style="color: inherit">pip3 install --upgrade numpy</code>.</p>
<p>To display information about a particular installed package do:</p>


In [None]:
pip3 show numpy

<p>To list all packages installed with <code style="color: inherit">pip</code> (in your current virtual environment):</p>


In [None]:
pip3 list

<p>To uninstall a package installed in the virtual environment do: <code style="color: inherit">pip3 uninstall package-name</code>.
You can also supply a list of packages to uninstall at the same time.</p>
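
<p>The information behind <code style="color: inherit">pip3 list</code> and <code style="color: inherit">pip3 show</code> can also be read from within Python. The following is a minimal sketch, assuming Python 3.8 or newer (where <code style="color: inherit">importlib.metadata</code> is part of the standard library); the function name <code style="color: inherit">installed_packages</code> is made up for this example.</p>

```python
from importlib import metadata


def installed_packages():
    """Map each installed distribution name to its version, similar to `pip3 list`."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries with broken or missing metadata
            packages[name] = dist.version
    return packages


# Print one "name==version" line per package visible to this interpreter
for name, version in sorted(installed_packages().items()):
    print(f"{name}=={version}")
```

<p>Inside an activated virtual environment, this lists the packages installed into that environment.</p>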
<h3 id="exportingimporting-an-environment-with-pip">Exporting/Importing an Environment with <code style="color: inherit">pip</code></h3>
<p>You are collaborating on a project with a team so, naturally, you will want to share your environment with your
collaborators so they can easily ‘clone’ your software project with all of its dependencies and everyone
can replicate equivalent virtual environments on their machines. <code style="color: inherit">pip</code> has a handy way of exporting,
saving and sharing virtual environments.</p>
<p>To export your active environment, use the <code style="color: inherit">pip freeze</code> command to
produce a list of packages installed in the virtual environment.
A common convention is to put this list in a <code style="color: inherit">requirements.txt</code> file:</p>


In [None]:
pip3 freeze > requirements.txt
cat requirements.txt

<p>The first of the above commands will create a <code style="color: inherit">requirements.txt</code> file in your current directory.
The <code style="color: inherit">requirements.txt</code> file can then be committed to a version control system and
get shipped as part of your software and shared with collaborators and/or users. They can then replicate your environment and
install all the necessary packages from the project root as follows:</p>


In [None]:
pip3 install -r requirements.txt

<p>As your project grows - you may need to update your environment for a variety of reasons. For example, one of your project’s dependencies has
just released a new version (dependency version number update), you need an additional package for data analysis
(adding a new dependency) or you have found a better package and no longer need the older package (adding a new and
removing an old dependency). In this case, apart from installing the new packages into your virtual environment and removing the
ones that are no longer needed, you need to update the contents of the <code style="color: inherit">requirements.txt</code> file
accordingly by re-running the <code style="color: inherit">pip freeze</code> command, and propagate the updated <code style="color: inherit">requirements.txt</code> file to your collaborators
via your code sharing platform (e.g. GitHub).</p>
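
<p>To check whether an environment has drifted from its <code style="color: inherit">requirements.txt</code>, something like the sketch below could be used. It only understands the simple <code style="color: inherit">name==version</code> lines that <code style="color: inherit">pip freeze</code> typically writes (other forms, such as editable installs or URLs, are skipped), and the helper names <code style="color: inherit">parse_pins</code> and <code style="color: inherit">check_pins</code>, as well as the sample version numbers, are purely illustrative.</p>

```python
from importlib import metadata


def parse_pins(text):
    """Extract {name: version} from `name==version` lines; other lines are ignored."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def check_pins(pins):
    """Return {name: (wanted, found)} for pins the current environment does not satisfy."""
    problems = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            problems[name] = (wanted, found)
    return problems


# Illustrative file contents; in practice read requirements.txt from disk
sample = """\
numpy==1.21.1
matplotlib==3.4.2  # plotting
# a comment line
"""
pins = parse_pins(sample)
print(pins)
print(check_pins(pins))
```

<p>An empty result from <code style="color: inherit">check_pins</code> means every pinned package is installed at exactly the pinned version.</p>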
<blockquote class="tip" style="border: 2px solid #FFE19E; margin: 1em 0.2em">
<div class="box-title tip-title" id="tip-official-documentation"><button class="gtn-boxify-button tip" type="button" aria-controls="tip-official-documentation" aria-expanded="true"><i class="far fa-lightbulb" aria-hidden="true" ></i> Tip: Official Documentation<span class="fold-unfold fa fa-minus-square"></span></button></div>
<p>For a full list of options and commands, consult the <a href="https://docs.python.org/3/library/venv.html">official <code style="color: inherit">venv</code> documentation</a>
and the <a href="https://docs.python.org/3/installing/index.html#installing-index">Installing Python Modules with <code style="color: inherit">pip</code> guide</a>. Also check out the guide <a href="https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/#installing-packages-using-pip-and-virtual-environments">“Installing packages using <code style="color: inherit">pip</code> and virtual environments”</a>.</p>
</blockquote>
<h2 id="running-python-scripts-from-command-line">Running Python Scripts From Command Line</h2>
<p>Congratulations! Your environment is now activated and set up to run your script
from the command line.</p>


# Key Points

- Virtual environments keep Python versions and dependencies required by different projects separate.
- A virtual environment is itself a directory structure.
- Use `venv` to create and manage Python virtual environments.
- Use `pip` to install and manage Python external (third-party) libraries.
- `pip` allows you to declare all dependencies for a project in a separate file (by convention called `requirements.txt`) which can be shared with collaborators/users and used to replicate a virtual environment.
- Use `pip3 freeze > requirements.txt` to take a snapshot of your project's dependencies.
- Use `pip3 install -r requirements.txt` to replicate someone else's virtual environment on your machine from the `requirements.txt` file.

# Congratulations on successfully completing this tutorial!

Please [fill out the feedback on the GTN website](https://training.galaxyproject.org/training-material/topics/data-science/tutorials/python-venv/tutorial.html#feedback) and check there for further resources!
