# What is Python?

Python is an *interpreted* language that is popular for many programming tasks. 

Python is:
* Easy to learn -- you don't need to know what the computer is doing under the hood, as you do in C or C++
* Pre-installed on most Linux distributions
* Syntactically simple
* Widely supported -- the user community is large, and Stack Overflow has answers to most problems
* Ranked among the top five most popular languages, depending on the metric and the survey



# What can I use Python for?

* Machine learning and deep learning (PyTorch, TensorFlow)
* Back-end web development (Django, Flask)
* Data science / data mining (scikit-learn)
* Data visualization 
* Plotting (matplotlib)
* Scripting 
* Interacting with web resources, such as site scrapers
* Programming with databases (sqlite3)
* Traditional computer vision (OpenCV)
* Embedded programming (MicroPython)
* Multithreaded applications (threading)

And so much more!

XKCD's take on Python 

![title](python_xkcd.png)

* Bonus fact - ```import antigravity``` in Python brings up this comic.

# Packages: Installing and Importing

To install a package, we use pip (stands for *pip installs packages*), the Python package manager. 

## Installing packages

To install a package that you know the name of (for instance, numpy), simply go to the command line and type:

```bash 
pip install numpy
```

Sometimes we don't know the name of the package, so we need to search for it. pip used to include a search command for this:

```bash
pip search opencv
```
but PyPI has since disabled the search interface that this command relies on, so it now fails with an error. Instead, search on https://pypi.org (or use Google). Say I want to install OpenCV, the computer vision library, but I don't know the exact package name: searching for "opencv" turns up ```opencv-python```, which is the package.

Installation of packages is straightforward. Uninstalling a package asks for confirmation (type 'y') and can be done with
```bash
pip uninstall opencv-python
```

If you want a certain version, you can specify it like this:
```bash
pip install numpy==1.16.2
```

To get a list of packages you have installed in your system or virtual environment, use:
```bash
pip freeze > requirements.txt
```
which produces the file **requirements.txt**. You can also install a bunch of packages from a requirements file with
```bash
pip install -r requirements.txt
```
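
From inside Python, a rough equivalent of `pip freeze` can be sketched with the standard library's `importlib.metadata` module (Python 3.8 and later). This is only an illustration, not how pip itself works; the helper name `installed_packages` is made up for this example.

```python
from importlib import metadata


def installed_packages():
    """Return sorted 'name==version' strings, similar in spirit to pip freeze."""
    lines = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if name:  # skip entries whose metadata is missing or broken
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


packages = installed_packages()
print(f"{len(packages)} packages found")
for line in packages[:5]:  # show only the first few
    print(line)
```

The list reflects whichever environment the running interpreter belongs to, so the output differs between your system Python and a virtual environment.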

## Importing packages in your script
Python scripts include the functionality of packages using the **import** keyword. Assuming that a package is in your path (most of the time, don't worry about this; pip will take care of it), you can import packages using the following. 
```python
import numpy
import tensorflow
import matplotlib
```
You can also reference long package names in your script with shorter aliases. Some common ones include:
```python
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
```
Occasionally, you may want to go down in the hierarchy of the package to just include one part of it. For instance, you can import just the 'array' functionality of numpy with
```python
from numpy import array
```
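
If you want to try the three import forms without installing anything, here is a small sketch that uses only standard library modules. The alias `osp` is just a shorthand chosen for this example.

```python
import math              # import the whole module; use it as math.<name>
import os.path as osp    # import under a shorter alias
from math import sqrt    # import a single name from a module

print(math.floor(2.7))                 # full module name
print(osp.join("data", "file.txt"))    # alias
print(sqrt(16))                        # bare name, no prefix needed
```

Note that `sqrt` and `math.sqrt` are the same function here; the import form only changes the name you use to reach it.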

You may occasionally see code from other people that **import**s everything from a package using the **\*** wildcard, e.g.
```python
from numpy import *
```
**PLEASE -- don't do this.** It can cause bugs that are hard to track down. For example, consider the following code:


```python
from numpy import *

print('Numpy mod is: %d' % mod(5,2))

def mod(a, b):
    # Yes, I know this is *not* how you do mod, but this is illustrative rather 
    # than mathematically correct
    return a * b

print('My implementation of mod is: %d' % mod(5, 2))
```

```
Numpy mod is: 1
My implementation of mod is: 10
```


As you can see, it's easy for the two implementations of the function to collide, and Python won't even give you an error -- it simply uses whichever definition came last. If two packages define the same function name, you're going to get results you may not expect. Instead, we should do:

```python
import numpy as np

# Remove any earlier definition of mod (for example, from the previous cell).
# Unlike a bare 'del mod', this does not fail if mod was never defined.
globals().pop('mod', None)

# Try to run the mod function when it's not implemented. I put this in a try-except 
# block so that the error is caught. Since 'mod' is not implemented yet, it will
# print the string in the 'except' block.
try:
    mod(5, 2)
except NameError:
    print("Mod function not implemented yet!")

# This is now unambiguous with numpy.mod
def mod(a, b):
    return a * b

print('Numpy mod is: %d' % np.mod(5, 2))
print('My implementation of mod is: %d' % mod(5, 2))
```

```
Mod function not implemented yet!
Numpy mod is: 1
My implementation of mod is: 10
```


In this case, it's very clear that the first **mod** is from numpy, and the second is your implementation. Even if you do ```from numpy import mod```, it's a lot more transparent because you can see that the name is explicitly taken from numpy.

# Popular Python Packages

* NumPy (numpy) - Python's numerical package, used a lot for scientific computing
* TensorFlow (tensorflow) or PyTorch (torch) - standards for machine learning
* OpenCV (cv2) - traditional computer vision
* SciPy (scipy) - scientific computing routines built on NumPy (optimization, integration, signal processing, and more)
* scikit-learn (sklearn) - implements lots of traditional machine learning functionality (nearest neighbors, support vector machines, clustering, and more)

Several packages are available as part of the Python standard library, with no installation necessary:
* os - operating system functions. Useful for building paths to files
* re - regular expressions. These are powerful tools for finding strings matching a pattern.
* sys - useful for getting command line arguments and many other things
* shutil - high-level file system operations, such as copying, moving, and deleting files
* random - random number generation
* math - mathematical functions
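
As a quick tour, the sketch below touches each of the modules listed above once. The file names are invented for the example, and `tempfile` (also part of the standard library, though not in the list) is used only so that the example cleans up after itself.

```python
import math
import os
import random
import re
import shutil
import sys
import tempfile

# os: build a file path that works on any operating system
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "notes.txt")
with open(src, "w") as f:
    f.write("run_01 run_02 final")

# shutil: copy the file
dst = shutil.copy(src, os.path.join(workdir, "notes_backup.txt"))

# re: find every string that looks like run_<digits>
with open(dst) as f:
    runs = re.findall(r"run_\d+", f.read())
print(runs)

# sys: command line arguments (element 0 is the script name)
print(sys.argv[:1])

# random: pick one of the matches (seeded so the choice is repeatable)
random.seed(0)
pick = random.choice(runs)

# math: length of the hypotenuse of a 3-4-5 triangle
hyp = math.hypot(3, 4)
print(pick, hyp)

# shutil again: remove the temporary directory and everything in it
shutil.rmtree(workdir)
```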

# Things to Know

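* No explicit type declarations -- a variable's type is determined by the value assigned to it
* Whitespace (indentation) defines the structure of the script
* Scripts are interpreted 
  * No compilation for a given architecture/computer needed
  * A script runs until it hits an error
  * The script must still be syntactically correct before any of it runs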
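
The points above can be seen in a few lines. This is only a sketch: the two tiny "scripts" are stored as strings and run with `compile` and `exec` so that everything fits in one file, which is not something you would normally do in real code.

```python
# No explicit types: the same name can hold an int, then a string.
value = 3
first_type = type(value).__name__
value = "three"
second_type = type(value).__name__
print(first_type, second_type)

# A syntax error (missing colon on line 2) is caught before anything runs,
# so the print on line 1 never executes.
broken = "print('line 1 ran')\nif True print('oops')\n"
try:
    compile(broken, "<broken-script>", "exec")
except SyntaxError as err:
    syntax_line = err.lineno
    print("Syntax error on line", syntax_line)

# A runtime error is different: the script is valid, so it runs until it breaks.
events = []
runs_then_breaks = "events.append('line 1 ran')\nevents.append(1 / 0)\n"
try:
    exec(runs_then_breaks, {"events": events})
except ZeroDivisionError:
    print("Broke on line 2, after:", events)
```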

# Virtual environments

A virtual environment is a way of isolating one Python project from other projects. For example, let's assume that we had one project that depended on version 1.8 of NumPy (the numerical Python package) and couldn't be upgraded. What if we want to use version 1.12 of NumPy in another project? Installing it system-wide would break project #1. 

Virtual environments solve this problem. In effect, they make separate spaces for each project you want to make, so that you can use two different versions of the same package. 

Importantly, you can use one virtual environment for multiple scripts and projects. For example, I have an environment on my computer for TensorFlow, one for PyTorch, and so on. I use the PyTorch environment for lots of machine learning projects with PyTorch, but can exit out and start up the TF environment easily to use that framework. 

Popular tools for creating virtual environments are:
* Anaconda -- this includes an instance of Jupyter notebooks, which lets you make rich scripts in a web browser
* venv -- a module built into Python, which I'm not as familiar with
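
For the curious, here is a minimal sketch of creating an environment with the built-in `venv` module from Python itself; it does roughly what `python -m venv demo-env` does on the command line. The directory name `demo-env` is made up, and `with_pip=False` is chosen only to keep the example fast (you would normally want pip inside the environment).

```python
import os
import shutil
import tempfile
import venv

# Create the environment inside a throwaway directory
parent = tempfile.mkdtemp()
env_dir = os.path.join(parent, "demo-env")
venv.create(env_dir, with_pip=False)

# Every environment made by venv has a pyvenv.cfg file at its top level
created = os.path.isfile(os.path.join(env_dir, "pyvenv.cfg"))
print("Environment created:", created)

# Clean up the throwaway directory
shutil.rmtree(parent)
```

On Linux and macOS, an environment created this way is typically activated from the shell with `source demo-env/bin/activate`.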

We won't really cover these in more detail here, because we will be using *containers* (e.g., Singularity, Docker) this summer, which solve many of these problems.

# Python syntax

```python
#! /usr/bin/env python
# ^^ This is often called the 'shebang'. On Unix systems, it allows you to make 
# the script executable (chmod +x), then run it as a program with './script.py' rather
# than calling python explicitly (python script.py)

import numpy as np

a = 1
# This is going to throw an error -- the spacing is wrong. Try fixing this!
  b = 2
```

```python
a = 1
# Indentation is only allowed after a statement that opens a block, like 'if'
if True:
    b = 1
```



I can't put tabs in a Jupyter notebook easily, but tabs and spaces are *not* the same in Python. Best practice is to set your editor to automatically expand tabs to a given number of spaces (4 is the usual choice). This a) keeps your script from failing on mixed indentation, and b) keeps it looking the same in any editor: spaces have a constant width, while a tab may display as 4, 6, 8, or some user-defined number of spaces. 
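
To see the problem without fighting your editor, the sketch below builds a tiny script as a string in which one line is indented with a tab (`\t`) and the next with eight spaces, then asks Python 3 to compile it. The two lines may look aligned in an editor, but Python refuses to guess.

```python
# Line 2 is indented with one tab, line 3 with eight spaces.
source = "if True:\n\tx = 1\n        y = 2\n"

try:
    compile(source, "<tabs-vs-spaces>", "exec")
    result = "compiled"
except TabError as err:
    result = type(err).__name__
    print(result + ":", err.msg)

# Expanding tabs to spaces consistently makes the same script valid.
fixed = source.expandtabs(8)
compile(fixed, "<spaces-only>", "exec")
print("Spaces-only version compiles")
```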

## Flow statements
Flow control statements decide which parts of a program run and how many times. In Python, a control statement ends with a colon : and the block it controls is indented one more level.

```python
# 'if' statement
# Needs an expression that evaluates to True or False (which are
# reserved Python keywords)
if 1 > 3:
    print("Math doesn't work")
else:
    print("Whew, we saved math.")

# For loops need something to iterate over. We can go with a list:
for i in ['we', 'are', 'writing', 'python']:
    print(i)

# or use 'range' to get a sequence of numbers:
for j in range(5):
    print(j)

# While loops execute as long as the condition is true
w = 0
q = 0
while w < 4:
    q += w
    w += 1
```
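
These statements can be nested, with each level indented one step further. The short sketch below puts an `if` inside a `for` loop (using `elif`, which adds a second condition to an `if`) and then repeats the `while` loop from above with its result printed.

```python
# Label each number from 0 to 4
labels = []
for n in range(5):
    if n == 0:
        labels.append("zero")
    elif n % 2 == 0:
        labels.append("even")
    else:
        labels.append("odd")
print(labels)  # ['zero', 'odd', 'even', 'odd', 'even']

# The while loop adds up 0 + 1 + 2 + 3
w = 0
q = 0
while w < 4:
    q += w
    w += 1
print(q)  # 6
```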
    