# Introduction

In [None]:
%%html
<marquee style='width: 30%; color: blue;'><b>Welcome!</b></marquee>

## Installation and Practical Considerations

Installing Python and the suite of libraries that enable scientific computing is straightforward whether you use Windows, Linux, or Mac OS X. This section will outline some of the considerations when setting up your computer.

### Python 2 vs Python 3

This course uses the syntax of Python 3, which contains language enhancements that are not compatible with the *2.x* series of Python. Though Python 3.0 was first released in 2008, adoption has been relatively slow, particularly in the scientific and web development communities. This is primarily because it took some time for many of the essential packages and toolkits to be made compatible with the new language internals.

Python 2.7 reached its end of life on January 1, 2020 and is no longer maintained.

In [None]:
import sys
sys.version

'3.10.10 (tags/v3.10.10:aad5f6a, Feb  7 2023, 17:20:36) [MSC v.1929 64 bit (AMD64)]'
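
If a script depends on Python 3 syntax, it can check the running interpreter itself rather than relying on someone reading ``sys.version``. The sketch below is one hedged way to do that; the helper name `require_python` and its default minimum are illustrative assumptions, not part of the course material.

```python
import sys


def require_python(minimum=(3, 0)):
    """Return the running version as a tuple, or raise if it is older than `minimum`.

    `minimum` is assumed to be a (major, minor) pair.
    """
    current = tuple(sys.version_info[:3])
    if current < tuple(minimum):
        raise RuntimeError(
            "Python %d.%d or newer is required, found %d.%d.%d"
            % (tuple(minimum)[:2] + current)
        )
    return current


print(require_python())
```

Comparing version tuples avoids the pitfalls of comparing version strings, where `'3.10'` would sort before `'3.9'`.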

## Pip


pip is the standard package installer for Python. Once it is installed, you can download,
install, and uninstall any compliant Python package with a single command. It also enables
you to add this network installation capability to your own Python software with very little work.

Python 2.7.9 and later (on the python2 series), and Python 3.4 and later include
pip by default.

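To see if pip is installed, open a command prompt and run the following (this form works in Unix-like shells; in the Windows Command Prompt, use `where pip` instead):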



    command -v pip

If pip is installed, you should see something like this: 

``/usr/local/bin/pip``
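
The same check can be made from inside Python, which works regardless of the shell in use. This is a minimal sketch using only the standard library; the function name `pip_status` and the shape of the dictionary it returns are assumptions made for illustration.

```python
import importlib.util
import shutil
import sys


def pip_status():
    """Report whether this interpreter can import pip and where a `pip` command lives."""
    return {
        # True when the pip package is importable by the running interpreter.
        "importable": importlib.util.find_spec("pip") is not None,
        # Full path of the first `pip` executable on PATH, or None if there is none.
        "command": shutil.which("pip"),
        # The interpreter that was used to make this check.
        "python": sys.executable,
    }


status = pip_status()
print(status)
```

The two answers can differ: a `pip` command on the `PATH` may belong to a different Python installation than the interpreter you are running, in which case `python -m pip` targets the running interpreter explicitly.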



## The Zen of Python

Python aficionados are often quick to point out how "intuitive", "beautiful", or "fun" Python is.
While I tend to agree, I also recognize that beauty, intuition, and fun often go hand in hand with familiarity, and so for those familiar with other languages such florid sentiments can come across as a bit smug.
Nevertheless, I hope that if you give Python a chance, you'll see where such impressions might come from.
And if you *really* want to dig into the programming philosophy that drives much of the coding practice of Python power-users, a nice little Easter egg exists in the Python interpreter: simply close your eyes, meditate for a few minutes, and ``import this``:

In [None]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


# How to Run Python Code

Python is a flexible language, and there are several ways to use it depending on your particular task.
One thing that distinguishes Python from other programming languages is that it is *interpreted* rather than *compiled*.
This means that it is executed line by line, which allows programming to be interactive in a way that is not directly possible with compiled languages like Fortran, C, or Java. This section will describe four primary ways you can run Python code: the *Python interpreter*, the *IPython interpreter*, via *Self-contained Scripts*, or in the *Jupyter notebook*.

### The Python Interpreter

The most basic way to execute Python code is line by line within the *Python interpreter*.
The Python interpreter can be started by installing the Python language (see the previous section) and typing ``python`` at the command prompt (look for the Terminal on Mac OS X and Unix/Linux systems, or the Command Prompt application in Windows):
```
$ python
Python 3.7.2 (default, Dec 29 2018, 00:00:04) 
[Clang 4.0.1 (tags/RELEASE_401/final)] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
```
With the interpreter running, you can begin to type and execute code snippets.
Here we'll use the interpreter as a simple calculator, performing calculations and assigning values to variables:
``` python
>>> 1 + 1
2
>>> x = 5
>>> x * 3
15
```

The interpreter makes it very convenient to try out small snippets of Python code and to experiment with short sequences of operations.

### The IPython interpreter

If you spend much time with the basic Python interpreter, you'll find that it lacks many of the features of a full-fledged interactive development environment.
An alternative interpreter called *IPython* (for Interactive Python) is bundled with the Anaconda distribution, and includes a host of convenient enhancements to the basic Python interpreter.
It can be started by typing ``ipython`` at the command prompt:
```
$ ipython
Python 3.7.2 (default, Dec 29 2018, 00:00:04) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.2.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: 
```
The main aesthetic difference between the Python interpreter and the enhanced IPython interpreter lies in the command prompt: Python uses ``>>>`` by default, while IPython uses numbered commands (e.g. ``In [1]:``).
Regardless, we can execute code line by line just as we did before:
``` ipython
In [1]: 1 + 1
Out[1]: 2

In [2]: x = 5

In [3]: x * 3
Out[3]: 15
```
Note that just as the input is numbered, the output of each command is numbered as well.

### Self-contained Python scripts

Running Python snippets line by line is useful in some cases, but for more complicated programs it is more convenient to save code to a file, and execute it all at once.
By convention, Python scripts are saved in files with a *.py* extension.
For example, let's create a script called *test.py* which contains the following:
``` python
# file: test.py
print("Running test.py")
x = 5
print("Result is", 3 * x)
```
To run this file, we make sure it is in the current directory and type ``python`` *``filename``* at the command prompt:
```
$ python test.py
Running test.py
Result is 15
```
For more complicated programs, creating self-contained scripts like this one is a must.
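
As scripts grow, a common convention is to put the work in functions and call them only when the file is run directly. The sketch below restructures the *test.py* example that way; the file name *test_main.py* and the function names are hypothetical, chosen only for illustration.

```python
# file: test_main.py (hypothetical name)


def triple(x):
    """Return three times x."""
    return 3 * x


def main():
    print("Running test_main.py")
    print("Result is", triple(5))


# This block runs for `python test_main.py`, but not when the file is imported.
if __name__ == "__main__":
    main()
```

With this layout, another script can `import test_main` and reuse `triple` without triggering the printed output.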

### The Jupyter notebook

A useful hybrid of the interactive terminal and the self-contained script is the *Jupyter notebook*, a document format that allows executable code, formatted text, graphics, and even interactive features to be combined into a single document.
Though the notebook began as a Python-only format, it has since been made compatible with a large number of programming languages, and is now an essential part of the [*Jupyter Project*](https://jupyter.org/).
The notebook is useful both as a development environment, and as a means of sharing work via rich computational and data-driven narratives that mix together code, figures, data, and text.

### Launching the Jupyter Notebook

The Jupyter notebook is a browser-based graphical interface to the IPython shell, and builds on it a rich set of dynamic display capabilities.
As well as executing Python/IPython statements, the notebook allows the user to include formatted text, static and dynamic visualizations, mathematical equations, JavaScript widgets, and much more.
Furthermore, these documents can be saved in a way that lets other people open them and execute the code on their own systems.

Though the Jupyter notebook is viewed and edited through your web browser window, it must connect to a running Python process in order to execute code.
This process (known as a "kernel") can be started by running the following command in your system shell:

```
$ jupyter notebook
```

This command will launch a local web server that will be visible to your browser.
It immediately spits out a log showing what it is doing; that log will look something like this:

```
$ jupyter notebook
[NotebookApp] Serving notebooks from local directory: /Users/jakevdp/PythonDataScienceHandbook
[NotebookApp] 0 active kernels 
[NotebookApp] The IPython Notebook is running at: http://localhost:8888/
[NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
```

Upon issuing the command, your default browser should automatically open and navigate to the listed local URL;
the exact address will depend on your system.
If the browser does not open automatically, you can open a window and manually open this address (*http://localhost:8888/* in this example).

If you have JupyterLab installed, you can launch it by running the following command:

```
$ jupyter lab
```

JupyterLab is very similar to the Jupyter notebook, with more flexibility and extra functionality. You can install JupyterLab using the conda installer:

```
$ conda install -c conda-forge jupyterlab
```

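## Help and Documentation

Every Python object can carry a documentation string, or *docstring*, and the built-in ``help`` function displays it. For example, here is the documentation for the built-in ``len`` function: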

In [1]:
help(len)

Help on built-in function len in module builtins:

len(obj, /)
    Return the number of items in a container.



Depending on your interpreter, this information may be displayed as inline text, or in some separate pop-up window.

Because finding help on an object is so common and useful, IPython introduces the ``?`` character as a shorthand for accessing this documentation and other relevant information:

In [2]:
len?

Signature: len(obj, /)
Docstring: Return the number of items in a container.
Type:      builtin_function_or_method

This notation works for just about anything, including object methods:

In [3]:
L = [1, 2, 3]
L

[1, 2, 3]

In [4]:
L.insert

<function list.insert(index, object, /)>

or even objects themselves, with the documentation from their type:

In [5]:
L?

Type:        list
String form: [1, 2, 3]
Length:      3
Docstring:
Built-in mutable sequence.

If no argument is given, the constructor creates a new empty list.
The argument must be an iterable if specified.

Importantly, this will even work for functions or other objects you create yourself!
Here we'll define a small function with a docstring:

In [None]:
def square(a):
    """Return the square of a."""
    return a ** 2

Note that to create a docstring for our function, we simply placed a string literal as the first statement of the function body.
Because docstrings are usually multiple lines, by convention we used Python's triple-quote notation for multi-line strings.

Now we'll use the ``?`` mark to find this docstring:

In [None]:
square?

Signature: square(a)
Docstring: Return the square of a.
File:      ~/Drive/projects/Business-Analytics/01-Python-Overview/<ipython-input-7-c96e82bfafc5>
Type:      function


This quick access to documentation via docstrings is one reason you should get in the habit of always adding such inline documentation to the code you write!
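
The ``?`` shorthand is specific to IPython, but the docstring it displays is stored on the object and can be read from any Python program. A minimal sketch, redefining ``square`` so that the block is self-contained:

```python
import inspect


def square(a):
    """Return the square of a."""
    return a ** 2


# The raw docstring is stored on the function object itself.
print(square.__doc__)

# inspect.getdoc returns the same text with indentation cleaned up,
# which matters for multi-line docstrings.
print(inspect.getdoc(square))
```

Both calls print `Return the square of a.` for this function.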

## Accessing Source Code with ``??``
Because the Python language is so easily readable, another level of insight can usually be gained by reading the source code of the object you're curious about.
IPython provides a shortcut to the source code with the double question mark (``??``):

In [None]:
square??

Signature: square(a)
Source:   
def square(a):
    """Return the square of a."""
    return a ** 2
File:      ~/Drive/projects/Business-Analytics/01-Python-Overview/<ipython-input-7-c96e82bfafc5>
Type:      function


For simple functions like this, the double question-mark can give quick insight into the under-the-hood details.

If you play with this much, you'll notice that sometimes the ``??`` suffix doesn't display any source code: this is generally because the object in question is not implemented in Python, but in C or some other compiled extension language.
If this is the case, the ``??`` suffix gives the same output as the ``?`` suffix.
You'll find this particularly with many of Python's built-in objects and types, for example ``len`` from above:

In [6]:
len??

Signature: len(obj, /)
Docstring: Return the number of items in a container.
Type:      builtin_function_or_method

Using ``?`` and/or ``??`` gives a powerful and quick interface for finding information about what any Python function or module does.
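
Outside IPython, the standard library's ``inspect`` module offers a rough equivalent of ``??``. The sketch below is an approximation under stated assumptions, not a description of how IPython implements the feature; the helper name `source_or_none` is hypothetical.

```python
import inspect
import textwrap


def source_or_none(obj):
    """Return the Python source of obj, or None when no source is available."""
    try:
        return inspect.getsource(obj)
    except (TypeError, OSError):
        # TypeError: the object is implemented in C, like the built-in len.
        # OSError: the source file cannot be located.
        return None


# A function written in Python normally has retrievable source.
print(source_or_none(textwrap.dedent) is not None)

# The built-in len is implemented in C, so there is no Python source to show.
print(source_or_none(len))
```

This mirrors the behaviour described above: for objects implemented in a compiled language there is simply no Python source to display.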

## Exploring Modules with Tab-Completion

IPython's other useful interface is tab completion: pressing the Tab key auto-completes names and lets you explore the contents of objects, modules, and namespaces.
In the examples that follow, we'll use ``<TAB>`` to indicate when the Tab key should be pressed.

### Tab-completion of object contents

Every Python object has various attributes and methods associated with it.
Like with the ``help`` function discussed before, Python has a built-in ``dir`` function that returns a list of these, but the tab-completion interface is much easier to use in practice.
To see a list of all available attributes of an object, you can type the name of the object followed by a period ("``.``") character and the Tab key:

```ipython
In [10]: L.<TAB>
L.append   L.copy     L.extend   L.insert   L.remove   L.sort     
L.clear    L.count    L.index    L.pop      L.reverse  
```

To narrow down the list, you can type the first character or several characters of the name, and the Tab key will find the matching attributes and methods:

```ipython
In [10]: L.c<TAB>
L.clear  L.copy   L.count  

In [10]: L.co<TAB>
L.copy   L.count 
```

If there is only a single option, pressing the Tab key will complete the line for you.
For example, the following will instantly be replaced with ``L.count``:

```ipython
In [10]: L.cou<TAB>

```

Though Python has no strictly-enforced distinction between public/external attributes and private/internal attributes, by convention a preceding underscore is used to denote private ones.
For clarity, these private methods and special methods are omitted from the list by default, but it's possible to list them by explicitly typing the underscore:

```ipython
In [10]: L._<TAB>
L.__add__           L.__gt__            L.__reduce__
L.__class__         L.__hash__          L.__reduce_ex__
```

For brevity, we've only shown the first couple of lines of the output.
Most of these are Python's special double-underscore methods (often nicknamed "dunder" methods).
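
The listings above can be approximated in plain Python with the built-in ``dir`` function. This is only a sketch of the filtering behaviour described in this section, not IPython's actual completion machinery; the helper name `complete` is an assumption.

```python
L = [1, 2, 3]


def complete(obj, prefix=""):
    """List attribute names of obj that start with prefix.

    Names beginning with an underscore are hidden unless the prefix
    itself starts with one, mirroring the behaviour described above.
    """
    show_private = prefix.startswith("_")
    return [
        name
        for name in dir(obj)
        if name.startswith(prefix) and (show_private or not name.startswith("_"))
    ]


print(complete(L, "c"))    # ['clear', 'copy', 'count']
print(complete(L, "co"))   # ['copy', 'count']
print(complete(L, "cou"))  # ['count']
print(complete(L, "_")[:3])
```

The last call prints the first few double-underscore names; the exact set depends on the Python version, so no output is shown for it.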

## Overview of Python Virtual Environments
*This guide is targeted at intermediate or expert users who want low-level control over their Python environments.*

When you're working on multiple coding projects, you might want a couple of different versions of Python and/or modules installed. This helps keep each workflow in its own sandbox instead of trying to juggle multiple projects (each with different dependencies) on your system's version of Python. The guide here covers one way to handle multiple Python versions and Python environments on your own (i.e., without a package manager like `conda`). See the [Using the workflow](https://gist.github.com/wronk/a902185f5f8ed018263d828e1027009b#using-the-workflow) section to view the end result.

<p align="center">
  <img width="350" src="https://imgs.xkcd.com/comics/python_environment_2x.png">
  <br>
  h/t @sharkinsspatial for linking me to the perfect cartoon
</p>

### Use cases
1. Working on 2+ python projects that each have their own dependencies; e.g., a Python 3.6 project and a Python 3.8 project, or developing/testing a module that needs to work across multiple versions of Python. It's not reasonable to uninstall/reinstall Python modules every time you want to switch projects.
2. If you want to execute code on the cloud, you can set up a Python environment that mirrors the relevant cloud instance. For example, your favorite Amazon EC2 deep learning instance may run Python 3.6, and you could hit obstacles if you developed locally with Python 3.8.
3. You might have some working Python code and want to make sure everything stays frozen so that it'll still work in the future. Without virtual environments, upgrading Python modules could unintentionally break that year-old project. Going back to determine the correct version for each dependency would be a huge pain.

This guide shows how to solve these issues with pyenv and virtualenv (along with virtualenvwrapper). It illustrates how to obtain lower-level control of your development environment (compared to Anaconda/`conda`, for example). It's tedious to set up, but after that it's very easy to exert a high level of control over your Python environments. This is intended for macOS, but all the tools work on Unix-like systems -- you'll just have to make use of `apt-get` instead of `brew` and detour through the original installation guides in some spots.

For a comparison with Anaconda, see the [note below](#other-notes).
 
## Instructions
1. **[pyenv](https://github.com/pyenv/pyenv)**: Short for "Python environment." Pyenv manages which version of Python is visible to your computer (and temporarily hides other versions). With pyenv, you can install multiple versions of Python and quickly switch between the "activated" version (i.e., the version your computer will use to execute code).

    **Installation/use**: From [pyenv's install instructions](https://github.com/pyenv/pyenv#homebrew-on-macos), run `brew install pyenv` on Mac. See the docs for installation via `git clone` on other systems.
    
    Then you can list and install new Python versions like:
    ```
    pyenv install 3.7.7  # Install Python version
    pyenv install 3.6.3
    pyenv versions       # List Python versions
    
    # Later, we will switch versions with something like `pyenv global 3.7.7`, but don't do this yet
    ```
    
    Also install [pyenv-virtualenv](https://github.com/pyenv/pyenv-virtualenv) like `brew install pyenv-virtualenv`, which we'll need later.
    
    **Technical details**: When you execute a Python script or use pip, pyenv intercepts that command and sends it to the Python environment that is activated. It does this using shims on the `PATH` environment variable, which allow Python-related commands to be dynamically rerouted. We'll set the `PATH` shims later in this guide.
2. **Confirm Python version**

    Make sure you have an up-to-date version of Python **at the system level** (and not from pyenv). You can check and, if required, fix this using the commands below.

    ```
    python --version  # Should be a python 3 version 
    
    # If the above gives python 2 and not python 3:
    brew install python
    brew info python # See where the unversioned symlinks live. Likely `/usr/local/opt/python/libexec/bin`
    
    # Update your PATH so the unversioned python/pip aliases are used. Run the below line to accomplish this.
    # Update the command if the unversioned symlinks live in a different location or if you use .bashrc/.profile 
    # instead of a ~/.zshrc
    echo 'export PATH=/usr/local/opt/python/libexec/bin:$PATH' >> ~/.zshrc
    ```

3. **[virtualenv](https://virtualenv.pypa.io/en/stable/)**: Short for "virtual environment." This tool manages separate directories for each environment so you can install modules (e.g., with `pip`) to each environment individually.

    **Installation**: 
    `pip install virtualenv` in your terminal
    
    **Use:** It's possible to use virtualenv directly ([as described here](https://virtualenv.pypa.io/en/stable/userguide/)), but we'll use virtualenvwrapper instead.

    **Technical details**: `virtualenv` keeps each environment (and its installed modules) in separate folders; therefore, each is like a silo that doesn't interact with any other virtual environment. By default, the exact file location is defined by the user, but we will use virtualenvwrapper to manage these locations for us.

4. **[virtualenvwrapper](https://virtualenvwrapper.readthedocs.io/en/latest/)**: This helps `pyenv` and `virtualenv` gel like PB&J. With it, you switch between environments using a single command (where each environment has its own version of Python and its own installed modules).

    **Installation**: `pip install virtualenvwrapper` and then `brew install pyenv-virtualenvwrapper` to extend pyenv. Then you'll need to do some one-time setup; in your .zshrc/.bashrc/.bash_profile, add the following:
    ```
    # Setup virtualenv home
    export WORKON_HOME=$HOME/.virtualenvs
    source /usr/local/bin/virtualenvwrapper.sh
    
    # Tell pyenv-virtualenvwrapper to use pyenv when creating new Python environments
    export PYENV_VIRTUALENVWRAPPER_PREFER_PYVENV="true"
    
    # Set the pyenv shims to initialize
    if command -v pyenv 1>/dev/null 2>&1; then
     eval "$(pyenv init -)"
    fi
    ```
    Make sure that the directory you define for `WORKON_HOME` actually exists (or create it with `mkdir ~/.virtualenvs`), and then restart your terminal.
    
    See [Troubleshooting](https://gist.github.com/wronk/a902185f5f8ed018263d828e1027009b#troubleshooting) if your system has issues finding `virtualenvwrapper.sh`. Full virtualenvwrapper [installation instructions here](https://virtualenvwrapper.readthedocs.io/en/latest/index.html#introduction).
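
The shim mechanism described in step 1 relies on a general rule: when a command is looked up, directories earlier in `PATH` win. The sketch below demonstrates only that lookup rule with two throwaway directories; it does not call pyenv, the command name `demo-python` is made up, and it assumes a Unix-like system where executables need no file extension.

```python
import os
import shutil
import stat
import tempfile


def shim_takes_precedence(command="demo-python"):
    """Create two fake executables and report whether the first PATH entry wins."""
    with tempfile.TemporaryDirectory() as shim_dir, \
            tempfile.TemporaryDirectory() as real_dir:
        for directory in (shim_dir, real_dir):
            target = os.path.join(directory, command)
            with open(target, "w") as handle:
                handle.write("#!/bin/sh\necho placeholder\n")
            # Mark the file as executable so the PATH lookup accepts it.
            os.chmod(target, os.stat(target).st_mode | stat.S_IXUSR)

        # The stand-in "shim" directory is listed first, as pyenv arranges for its shims.
        search_path = os.pathsep.join([shim_dir, real_dir])
        found = shutil.which(command, path=search_path)
        return found is not None and os.path.samefile(
            os.path.dirname(found), shim_dir
        )


print(shim_takes_precedence())
```

Because the lookup stops at the first match, a shim placed at the front of `PATH` gets the chance to decide which real interpreter should handle the command.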


## Using the workflow
We're all ready to use this in the terminal! As shown below, we'll first set the Python environment with `pyenv`, and then make a couple virtual environments with `virtualenvwrapper`. Then we'll use the `workon` command to switch between them.
```
pyenv global 3.6.3           # Set your system's Python version with pyenv
mkvirtualenv my_legacy_proj  # Create a new virtual environment using virtualenvwrapper; it'll be tied to Python 3.6.3
pip install numpy scipy      # Install the packages you want in this environment

pyenv global 3.8.2         # Set your system's Python version with pyenv
mkvirtualenv new_web_proj  # Create and switch to a new virtual environment with a newer version of python
pip install flask boto

workon                 # List the environments available
workon my_legacy_proj  # Use virtualenvwrapper to switch back to the original project
```
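
After switching with `workon`, it can be reassuring to confirm from inside Python which interpreter and environment are actually in use. This is a minimal sketch based on standard `sys` attributes; the function name is hypothetical, and the `real_prefix` check is an assumption that covers older virtualenv releases.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # In a virtual environment, sys.prefix points at the environment while
    # sys.base_prefix points at the Python installation it was created from.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases set sys.real_prefix instead.
    return sys.prefix != base_prefix or hasattr(sys, "real_prefix")


print("Interpreter:", sys.executable)
print("Version:", ".".join(str(part) for part in sys.version_info[:3]))
print("In a virtual environment:", in_virtual_environment())
```

Inside `my_legacy_proj`, the reported version should match the one set with `pyenv global` when the environment was created.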

## Troubleshooting
1. If you're on MacOS and have issues with pyenv like:
    ```
    zipimport.ZipImportError: can't decompress data; zlib not available
    make: *** [install] Error 1
    pyenv: version `3.5.0' is not installed
    ```
    
    Make sure you have the newest version of the Xcode command line tools installed by running: `xcode-select --install`

1. If you have file not found issues with pyenv's `virtualenvwrapper.sh`, you should be able to check where it lives with `pyenv which virtualenvwrapper.sh`. Substitute in this path in your .zshrc/.bashrc/.bash_profile.
1. If on MacOS you're having issues with pip installs and getting an error like:
    ```
    Error in sitecustomize; set PYTHONVERBOSE for traceback:
    KeyError: 'PYTHONPATH'
    ```

    try deleting homebrew's link to python by deleting the `~/.local` folder.
1. If you're upgrading to a new version of python and having issues using `mkvirtualenv`, getting an error like:
    ```
    pyenv: virtualenv: command not found

    The `virtualenv' command exists in these Python versions:
    2.7.14
    3.6.3
    ```
    
    make sure you've set the desired version of python, then run `pyenv virtualenvwrapper` on the command line before trying to create a new virtual environment with the `mkvirtualenv` command.
    
1. If you're upgrading to a new version of python and having issues with `virtualenvwrapper`, getting an error like:
    ```
    /usr/local/opt/python/bin/python3.7: Error while finding module specification for 'virtualenvwrapper.hook_loader' (ModuleNotFoundError: No module named 'virtualenvwrapper')
    virtualenvwrapper.sh: There was a problem running the initialization hooks.

    If Python could not import the module virtualenvwrapper.hook_loader,
    check that virtualenvwrapper has been installed for
    VIRTUALENVWRAPPER_PYTHON=/usr/local/opt/python/libexec/bin/python and that PATH is
    set properly.
    ```
    
    First, make sure the underlying tools are installed with `pip install virtualenv virtualenvwrapper`. If that still doesn't work, `pip` might be referring to a default version of python when you want to install it for a different version. You can explicitly call the version of python to refer to with something like `/usr/local/bin/pip3.7 install virtualenv virtualenvwrapper`. If that still doesn't work, try executing `pyenv virtualenvwrapper` as in Troubleshooting item 4.
    
## Other notes
1. **Anaconda** does have functionality to handle some of the problems outlined above. Generally, it provides a lower bar for entry to Python development because the Anaconda software distribution contains both the `conda` package manager as well as many useful python modules, which is great for new Python users. Anaconda is also a good choice for Windows users as getting all Python packages to play nicely together is a challenge on Windows. However, there are some downsides to Anaconda/`conda`:
    * Any package that can't be installed via the `conda` package manager must be installed with pip or some other method; at that point, you're managing two install streams. For more advanced development, this can get messy.
    * You aren't always going to be using the most up-to-date version of modules because Continuum must repackage each one into their own system before calling `conda update` will actually provide the newest version of that module's code.
    * Different versions of Python will not always have the latest module updates -- Continuum focuses its resources on certain versions of python, and you're relying on their team to incorporate all package updates to those versions as well as the less-popular versions (like 3.4 and 3.5).
    * Because the Anaconda software distribution is a large self-contained install, it'll install many packages that you might not need. Miniconda solves this to some degree as it only contains the `conda` package manager and its dependencies.

    That aside, a good discussion of Anaconda's benefits, along with some counter-arguments, is [here](http://jakevdp.github.io/blog/2016/08/25/conda-myths-and-misconceptions/#Myth-#5:-conda-doesn't-work-with-virtualenv,-so-it's-useless-for-my-workflow).
  
1. **Virtual environment prefix on source prompt**

    If you want your command prompt to show the virtual environment you're currently working with, add this to your .bashrc/.bash_profile:
    
    ```
    # Prefix source prompt with virtualenvwrapper environment
    if which pyenv > /dev/null; then eval "$(pyenv init -)"; fi
    eval "$(pyenv virtualenv-init -)"
    ```
    
    Your terminal command prompt will now look something like `(my_project_py3) Mark@Marks-MBP:~/Builds/ $`
    
1. **Directory scheme**
    
    This is my own personal preference, but when setting up my Python environment, I also tend to store modules I'm developing in a `Builds` directory (i.e., `/Users/wronk/Builds`). Similarly, I put data in `/Users/wronk/Data`. Then, I'll define environment variables in my .bashrc/.bash_profile (e.g., named `BUILDS_DIR` and `DATA_DIR`) so that writing scripts/Python code is more agnostic of the exact machine I'm using.

    For example, any shell scripts can traverse directories from the `BUILDS_DIR` environment variable instead of a hard-coded path, and I'll use something like `my_data_dir = os.environ['DATA_DIR']` in Python code so it'll work on any machine that mirrors my directory scheme. That tends to look cleaner in code and is easier for getting the same code running locally and on the cloud (or another computer).
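
The directory scheme above can be wrapped in a small helper so that scripts fail less abruptly on a machine where the variable is not set. This is a sketch, not the author's code: the function name `data_dir` and the `~/Data` fallback are assumptions, and the `env` parameter exists only to make the lookup easy to exercise.

```python
import os
from pathlib import Path


def data_dir(env=os.environ, default="~/Data"):
    """Return the data directory named by DATA_DIR, or a fallback location.

    The fallback is an assumption for illustration; indexing
    os.environ['DATA_DIR'] directly raises KeyError when the variable is unset.
    """
    return Path(env.get("DATA_DIR", default)).expanduser()


# Passing a plain dictionary shows the lookup without touching the real environment.
print(data_dir({"DATA_DIR": "/tmp/example-data"}))

# With no arguments, the value comes from the actual environment.
print(data_dir())
```

Returning a `Path` lets callers build file names with the `/` operator, e.g. `data_dir() / "raw" / "sample.csv"`, without hard-coding a machine-specific prefix.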