# The Python Ecosystem - Packages, tools, distributions

## The Python Package Index
* https://pypi.org
* Online repository of packages submitted by the community
* Buyer beware! Anyone can publish a package, so check what you are installing
* Organizations can set up their own internal mirror of the index: https://packaging.python.org/en/latest/guides/hosting-your-own-index/

## Installing Python Packages
The `pip` tool is the most common way to install Python packages.  If it did not come with your Python installation, it is easy to add, and it is easy to use.  It can install from many different locations, but by default it installs from the Python Package Index.

In [None]:
%%bash
# Install the "requests" package (latest version)
pip install requests

### Pip Defaults to:
- Installing packages from the Python Package Index
- Installing packages to your system-wide Python directory (if root)
- Installing packages for the version of Python pip was installed with

In [None]:
%%bash
# Install the "requests" package into your own home directory (in a .local directory by default)
pip install --user requests

In [None]:
%%bash
# "Downgrade" requests by asking for a specific, older version
pip install --user requests==2.24.0

In [None]:
%%bash
# Remove the requests package (-y skips the confirmation prompt)
pip uninstall -y requests

In [None]:
%%bash
# Ensure you're installing for a specific version of Python (3.7 in this case)
python3.7 -m pip install --user requests==2.25.1

### Pip can specify versions in a lot of ways
* Minimums:  requests>=2.25
* Maximums:  requests<3.0
* Ranges:    requests>=2.0,<3.0
* Specific version:  requests==2.25.1

These forms can be combined.  pip follows the version specifier rules defined in PEP 440, found here:  https://www.python.org/dev/peps/pep-0440/#version-specifiers
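
To make the operators above concrete, here is a minimal sketch of how a comma-separated specifier could be checked against a version. It is a simplification, not pip's implementation: it assumes plain dotted numeric versions such as `2.25.1` and ignores pre-releases, wildcards, and the other operators PEP 440 defines. The helper names `parse_release` and `satisfies` are invented for this illustration.

```python
import operator

# Comparison operators covered by this simplified sketch.
# Two-character operators come first so ">=" is not mistaken for ">".
_OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_release(version):
    """Turn a plain dotted version such as '2.25.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def _pad(left, right):
    """Pad the shorter tuple with zeros so that 2.25 compares equal to 2.25.0."""
    width = max(len(left), len(right))
    return left + (0,) * (width - len(left)), right + (0,) * (width - len(right))


def satisfies(version, specifier):
    """Return True if `version` meets every comma-separated clause."""
    candidate = parse_release(version)
    for clause in specifier.split(","):
        clause = clause.strip()
        for symbol, compare in _OPERATORS.items():
            if clause.startswith(symbol):
                bound = parse_release(clause[len(symbol):].strip())
                left, right = _pad(candidate, bound)
                if not compare(left, right):
                    return False
                break
        else:
            # No known operator matched this clause.
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


print(satisfies("2.25.1", ">=2.0,<3.0"))  # True
print(satisfies("3.0", ">=2.0,<3.0"))     # False
print(satisfies("2.25", "==2.25.0"))      # True
```

For real projects, the `packaging` library (`packaging.specifiers.SpecifierSet`) implements the full PEP 440 rules.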

### What if you need multiple packages?  With specific versions of some?
The `pip install` command accepts multiple packages at once (as many as you like).

In [None]:
%%bash
# Quote the specifiers so the shell does not treat > or < as redirection
pip install --user "requests>=2.25" "pytest==6.2.3" "pytest-mock==3.5.1" python-dateutil
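
In a shell, characters such as `>` and `<` in a version specifier need quoting, because the shell would otherwise read them as redirection. When pip is driven from Python instead, one way to sidestep the issue is to pass each requirement as its own list element. The sketch below only builds the command and does not run it; the function name `build_pip_install_command` is hypothetical.

```python
import sys


def build_pip_install_command(requirements, user=False):
    """Build an argument list for `python -m pip install` without running it."""
    # sys.executable ties the command to the interpreter running this code,
    # so packages are installed for that Python version.
    command = [sys.executable, "-m", "pip", "install"]
    if user:
        command.append("--user")
    # Each requirement is its own list element, so ">" and "<" reach pip
    # untouched instead of being interpreted by a shell.
    command.extend(requirements)
    return command


command = build_pip_install_command(
    ["requests>=2.25", "pytest==6.2.3", "python-dateutil"], user=True
)
print(command[1:])
```

To actually perform the install, the list could be handed to `subprocess.run(command, check=True)`.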

### That works well, but imagine your project now has 10 dependencies... or 20... or 30
Specifying and installing packages manually for every project, or on every new machine where you want to use a project, is tedious and error-prone.  There's got to be a better way!

### requirements.txt - The Better Way
The requirements.txt file is a simple way to specify your project's requirements so they can be reproduced by whoever uses your package, on any machine where it needs to run.
The file is simply a list of packages to install, one per line, with version specifiers written the same way pip accepts them on the command line.

In [None]:
!cat requirements.txt

In [None]:
%%bash
# Install all requirements from a requirements.txt file 
# NOTE: It doesn't actually have to be named "requirements.txt"
pip install --user -r requirements.txt

In [None]:
%%bash
# Didn't start with a requirements.txt, but you need one now?  No problem!
# List off packages, with versions, that have been installed.
pip freeze

In [None]:
%%bash
# Use it to generate your requirements.txt
pip freeze > requirements-captured.txt
cat requirements-captured.txt
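
A similar listing can be produced from inside Python with the standard library's `importlib.metadata` module (Python 3.8 and later). This is a rough sketch rather than a reimplementation of `pip freeze`: it assumes every distribution was installed from an index with a plain version number, and the function name `installed_packages` is invented for the example.

```python
from importlib import metadata


def installed_packages():
    """Return sorted 'name==version' strings for the current environment."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries whose metadata is missing or broken
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


for line in installed_packages():
    print(line)
```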

### A note about pip freeze...
`pip freeze` pins every version and also lists sub-dependencies, so you may need to edit and clean up the output a bit!  Try to capture just your project's DIRECT dependencies and let pip resolve the rest.
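
One possible way to do that clean-up is to filter the captured output down to the packages you know your project uses directly. The sketch below assumes every line has the simple `name==version` form; the function name `keep_direct` and the sample text are made up for illustration.

```python
def keep_direct(freeze_text, direct_names):
    """Keep only the pinned lines whose package name is a direct dependency."""
    # Compare names case-insensitively and treat "_" and "-" as equivalent.
    wanted = {name.lower().replace("_", "-") for name in direct_names}
    kept = []
    for line in freeze_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name = line.split("==", 1)[0].strip().lower().replace("_", "-")
        if name in wanted:
            kept.append(line)
    return kept


# Hypothetical `pip freeze` output, invented for illustration.
sample_freeze = """\
certifi==2020.12.5
chardet==4.0.0
idna==2.10
pytest==6.2.3
requests==2.25.1
urllib3==1.26.4
"""

print(keep_direct(sample_freeze, ["requests", "pytest"]))
# ['pytest==6.2.3', 'requests==2.25.1']
```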

### Why pin versions at all?
- Ensure different developers are working with the same package versions
- No surprises if a package drops support for your Python version or removes a feature
- Deterministic Builds
- Also, Deterministic Builds
- Did I mention Deterministic Builds?

### Versioning and Packages
- **Semantic Versioning**
    - Versions specified as MAJOR.MINOR.PATCH (and occasionally a fourth value for a build number)
    - Each "segment" is a separate number; when one is incremented, the segments to its right reset to 0
    - **Major** version increment indicates large, possibly backward-incompatible changes (Python 2.7 -> 3.6)
    - **Minor** version increment indicates smaller changes, usually backward compatible. (Python 3.6 -> Python 3.7)
    - **Patch** version increment indicates bug fixes, security updates, etc. - no new features or breaking changes
    - Many Python packages are versioned this way, at least loosely
    - https://semver.org/
- **Calendar Versioning**
    - Version numbers are based on a release schedule (usually year and month)
    - Ex. Ubuntu 20.04 ->  Released in April, 2020
    - No indication of what is contained in the version (but nothing misleading either!)
    - https://calver.org/
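
The Semantic Versioning increments described above can be sketched as a small helper. It assumes a plain `MAJOR.MINOR.PATCH` string with no pre-release or build suffix, and the function name `bump` is hypothetical.

```python
def bump(version, part):
    """Return `version` with the named segment incremented.

    Segments to the right of the incremented one reset to zero,
    as the Semantic Versioning specification requires.
    """
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(bump("2.25.1", "patch"))  # 2.25.2
print(bump("2.25.1", "minor"))  # 2.26.0
print(bump("2.25.1", "major"))  # 3.0.0
```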

## Alternatives to pip
- **pipenv** - Attempts to remove the problem of dependency conflicts in pip
- **poetry** - Designed primarily to handle dependencies for publishing packages
- **conda** - The package and environment manager for the Anaconda Python Distribution; fills a similar role to pip, and can also install non-Python dependencies

### So now you've got dependencies handled for your project...
### But what happens when you set up the next one?

## Dependency Hell
![Tech Loops](images/tech_loops_2x.png)

## Enter Virtualenv - isolated dependencies for Python projects
- Virtualenv ("virtual environment") is a third-party Python package
- Since Python 3.3, a subset of its functionality ships in the standard library as the `venv` module
- Uses environment settings, paths, and a few other "tricks" to isolate projects from the system, and from each other!
- Despite the name, NOT a virtual machine or emulation layer
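
One of those "tricks" is visible from inside Python: in an environment created by `venv`, `sys.prefix` points at the environment directory while `sys.base_prefix` still points at the original Python installation. A small check built on that, with the hypothetical name `in_virtual_environment`, might look like this:

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Outside a virtual environment the two prefixes are identical.
    return sys.prefix != sys.base_prefix


print("prefix:     ", sys.prefix)
print("base prefix:", sys.base_prefix)
print("virtualenv? ", in_virtual_environment())
```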

In [None]:
%%bash
# Create a new virtualenv
python3 -m venv myproject

In [None]:
%%bash
# Take a peek in the virtualenv
ls -al myproject

In [None]:
%%bash
# What was installed that we can now use?
ls -al myproject/bin

In [None]:
%%bash
# Activate the virtualenv and then check which Python we're using
source myproject/bin/activate
echo "Show Python Path"
which python
python --version
deactivate

In [None]:
%%bash
# Install things in JUST the virtual environment (not system wide)
source myproject/bin/activate
python -m pip install requests pytest python-dateutil
echo "Inside the virtualenv"
python -m pip freeze
deactivate

In [None]:
%%bash
echo "Outside the virtualenv"
python -m pip freeze

### Virtualenv has a lot of options
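- Specify the Python version: with `venv`, the environment uses the interpreter that runs the module (e.g. `python3.7 -m venv ...`); the standalone `virtualenv` tool also accepts a `--python` option
- Specify the directory in which to create the virtualenv
- Choose whether or not to install pip (`--without-pip` skips it)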
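
The same options are available from Python through the standard library's `venv.EnvBuilder` class. The sketch below creates a throwaway environment without pip in a temporary directory and prints the `pyvenv.cfg` file that records its settings. The function name `create_bare_environment` and the directory name `demo-env` are placeholders chosen for this example.

```python
import tempfile
import venv
from pathlib import Path


def create_bare_environment(target):
    """Create a virtual environment at `target` without installing pip."""
    builder = venv.EnvBuilder(
        system_site_packages=False,  # do not expose system-wide packages
        clear=True,                  # empty the target directory if it exists
        with_pip=False,              # same effect as the --without-pip flag
    )
    builder.create(target)
    return Path(target) / "pyvenv.cfg"


# The temporary directory (and the environment in it) is removed afterwards.
with tempfile.TemporaryDirectory() as scratch:
    config = create_bare_environment(Path(scratch) / "demo-env")
    config_text = config.read_text()

print(config_text)
```

The environment uses whichever interpreter runs this code, which is why there is no separate "Python version" argument here.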

In [None]:
!python3 -m venv --help

In [None]:
# If you don't need it anymore, just delete it!
# Virtualenvs are just directories, nothing magical
!rm -rf ./myproject

## Lab: Setting up a new Python project
Create a new Python project and install some dependencies to get started.
* Create a new directory for your project (choose any name you like) and create a virtualenv named `venv` inside it.  Make sure you set it up with Python 3!
* Activate and install the following packages in your virtualenv:  
    * `requests`
    * `black`
    * `pytest`
* Use pip to generate a `requirements.txt` file for your project
* BONUS:  Figure out what you should add to your `.gitignore` file if you wanted to put this new project into git, but exclude the virtualenv.

## Digging Deeper - More tools to explore

### setuptools - Python tools for building and distributing packages
- Provides standard configuration files to specify project names, version, dependencies, etc.
- Allows scripts to run at installation time if your package needs to be set up
- Well established and supported by the Python community
- https://setuptools.readthedocs.io/en/latest/

### virtualenvwrapper - A set of scripts for managing multiple virtual environments on a system
- Easily set up new virtualenvs for projects in a single, well-defined directory
- Tools to easily switch projects by changing directories and activating virtualenvs for you
- List off existing virtualenvs, remove them, etc.
- https://virtualenvwrapper.readthedocs.io/en/latest/index.html

## Best Practices for avoiding Dependency Hell
- When you start a new project, or clone an existing one for the first time, set up a virtualenv
- Install system-wide only the tools that will be used everywhere (like virtualenv itself, or pip)
- Always run your project in a virtualenv, even if it's the only tool on the system
- Use tools to upgrade your dependencies in development, but always pin for deployments!
- Be explicit in your project about which versions of Python you support, and which versions of packages, so everyone can set up their environment correctly.
- Need different dependencies for dev/test vs. prod?  Make multiple requirements.txt files and multiple virtualenvs!
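
Being explicit about supported Python versions can also be enforced in code. As a rough sketch, a project could check the interpreter at start-up; the minimum of 3.7 and the names `MINIMUM_PYTHON` and `is_supported` are assumptions made for this example, not a recommendation.

```python
import sys

# Hypothetical minimum supported version, chosen for illustration.
MINIMUM_PYTHON = (3, 7)


def is_supported(minimum=MINIMUM_PYTHON, current=None):
    """Return True if `current` (default: this interpreter) meets `minimum`."""
    if current is None:
        current = sys.version_info[:2]
    return tuple(current) >= tuple(minimum)


if not is_supported():
    wanted = ".".join(str(number) for number in MINIMUM_PYTHON)
    raise SystemExit(f"This project requires Python {wanted} or newer")
```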