## CS102-4 - Further Computing

Mark Howard<br>
School of Mathematical & Statistical Sciences<br>
NUI Galway<br>


(Notebooks adapted from Prof. Götz Pfeiffer)


In this part of the module, we will study certain packages that extend and enhance the functionality of the `python` programming language and its standard implementation.

The topics to be covered are:
    
  1. **Aspects of Scientific Computing**
    
  1. **Aspects of Data Wrangling**
    
  1. **Aspects of Data Visualization**
    
  1. **Aspects of Machine Learning**


Specifically, we will use and look at the following popular and powerful *libraries*:

* [`numpy`](https://numpy.org): scientific computing

* [`pandas`](https://pandas.pydata.org): data analysis

* [`matplotlib`](https://matplotlib.org): visualization

* [`scikit-learn`](https://scikit-learn.org): machine learning

Occasionally, we might borrow some support from

* [`scipy`](https://scipy.org): mathematics, algorithms

* [`seaborn`](https://seaborn.pydata.org): statistical data

Once installed on the machine, these packages can be imported into any `python` session.
By popular convention, this is done as follows:

* `numpy` is usually imported as `np`:

In [None]:
import numpy as np
np.__version__

* `pandas` is usually imported as `pd`:

In [None]:
import pandas as pd
pd.__version__

* The drawing routines in `matplotlib.pyplot` are usually imported as `plt`;
  the version number is attached to the top-level package `matplotlib`:

In [None]:
import matplotlib.pyplot as plt
import matplotlib
matplotlib.__version__

* `seaborn` is usually imported as `sns`:

In [None]:
import seaborn as sns
sns.__version__

* Don't do this

![Screenshot%201-10-22%20at%2015.52.05.png](attachment:Screenshot%201-10-22%20at%2015.52.05.png)

* The libraries `scipy` and `scikit-learn` consist of several packages that need to be imported individually.
We'll see which and how later when we need them.  For now, let's just check the version numbers.

In [None]:
import scipy
scipy.__version__

In [None]:
import sklearn
sklearn.__version__
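
Before importing anything, it can be handy to check which of these packages are actually installed. The following is a minimal sketch using the standard library module `importlib.metadata` (available from `python` 3.8); the helper name `installed_versions` is made up for this illustration. Note that `sklearn` is distributed under the name `scikit-learn`.

```python
from importlib import metadata

# Distribution names as used by the installer (not always the import name).
PACKAGES = ["numpy", "pandas", "matplotlib", "scikit-learn", "scipy", "seaborn"]

def installed_versions(names):
    """Map each distribution name to its version string, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

versions = installed_versions(PACKAGES)
for name, version in versions.items():
    print(f"{name:14s} {version if version is not None else 'not installed'}")
```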

## Lecture Notes on `github`

* Lecture notes for this part of the course come in the form of `jupyter`  notebooks.

* This allows us to include interactive `python` code
  with the text.
  
* The notebooks will be uploaded (and updated) on `github`
  at
  
>  https://github.com/gpfeiffer/cs102-4

* If `jupyter` is installed on your computer, you can download
  the notebooks and execute them there.
  
* Alternatively, you can follow the links 
[![Open in Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/gpfeiffer/cs102-4/main)
or
[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/gpfeiffer/cs102-4)
to execute the notebooks on `binder`, or Google's `colab` site ...

## Book/Notebook
**Python Data Science Handbook**
<br/>Jake VanderPlas
<br/>O'Reilly 2016
<br/>Jupyter [notebook](https://jakevdp.github.io/PythonDataScienceHandbook/) edition on `github`

## IPython

* An enhanced interactive `python` interpreter

* Started in 2001

* Provides a number of useful syntactic additions to the `python` language, most of which are built into `jupyter`

### Single Key Extensions

We'll briefly discuss the effects of the following symbols.

  1. `?`: Documentation

  1. `<tab>`:  Auto-Completion

  1. `%`: Magic Commands

  1. `!`: Shell Commands

In [None]:
?

###  Help and Documentation
* using `python`'s own `help` system

In [None]:
help(len)

* `ipython`'s `?` operator ...

In [None]:
len?

In [None]:
?len

* ... works on objects as well as on functions and methods ...

In [None]:
L = [1, 2, 3]
?L

In [None]:
?L.append

In [None]:
np?

* ... even on `python` functions you create yourself

In [None]:
def square(x):
    """returns the square of x"""
    return x * x

In [None]:
?square
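
Much of what `?` displays (the signature, the docstring and the type) can also be retrieved in plain `python`, where the `?` syntax is not available. A minimal sketch, using the standard `inspect` module and repeating the definition of `square` so that the block is self-contained:

```python
import inspect

def square(x):
    """returns the square of x"""
    return x * x

print(inspect.signature(square))   # (x)
print(inspect.getdoc(square))      # returns the square of x
print(type(square).__name__)       # function
```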

* `ipython`'s `??` operator provides even more information, e.g. the function's source code, if that's available

In [None]:
??square

### Tab Completion

Type in the following commands and press `<tab>` ...

```python
    L.

    L._

    from sklearn import 

    import ma
```

In [None]:
L._
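
The names offered after `L.` and `L._` are essentially the attributes of the object `L`. As a rough sketch of where they come from (not of how the completer itself is implemented), the built-in function `dir` lists them, and they can be split by their leading underscore:

```python
L = [1, 2, 3]

# Names without a leading underscore: roughly what `L.<tab>` offers.
public = [name for name in dir(L) if not name.startswith("_")]

# Names with a leading underscore: roughly what `L._<tab>` offers.
private = [name for name in dir(L) if name.startswith("_")]

print(public)
print(private[:5])
```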

### Magic Commands

* start with the `%` symbol (line magic), or with `%%` (cell magic).

In [None]:
%timeit L = [n ** 2 for n in range(999)]

In [None]:
%timeit L = [n * n for n in range(999)]

In [None]:
%%timeit
L = []
for n in range(999):
    L.append(n*n)
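
Outside of `ipython`, the same comparison can be made with the standard `timeit` module. The sketch below is an approximation of what the magic command does, not a reimplementation: the helper `best_time_per_loop` and the values of `number` and `repeat` are choices made for this illustration, whereas `%timeit` picks the number of loops automatically.

```python
import timeit

# The three statements timed above, as source strings.
SNIPPETS = {
    "n ** 2": "L = [n ** 2 for n in range(999)]",
    "n * n": "L = [n * n for n in range(999)]",
    "append loop": "L = []\nfor n in range(999):\n    L.append(n * n)",
}

def best_time_per_loop(stmt, number=200, repeat=3):
    """Run stmt `number` times, `repeat` times over; return best seconds per loop."""
    runs = timeit.repeat(stmt, number=number, repeat=repeat)
    return min(runs) / number

timings = {label: best_time_per_loop(stmt) for label, stmt in SNIPPETS.items()}
for label, seconds in timings.items():
    print(f"{label:12s} {seconds * 1e6:8.1f} µs per loop")
```

The actual numbers depend on the machine and the `python` version, so no expected output is shown here.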

In [None]:
%timeit?

In [None]:
%magic

In [None]:
%lsmagic

### Input and Output History

In [None]:
%history -n 1-4

In [None]:
%history?

* Code cells in this notebook are labelled with prompts such as `In [1]` and `Out[1]`.
* These are actual `python` objects that can be accessed as a whole or in parts.
* `In` is a `python` list.
* `Out` is a `python` dictionary, storing only the non-empty return values.

In [None]:
print(In)

In [None]:
print(In[2])

In [None]:
print(Out)

In [None]:
print(Out[4])
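
To see why `In` is a list while `Out` is a dictionary, here is a plain `python` imitation of the bookkeeping. It is only a sketch: the function `run_cells` and the names `history_in` and `history_out` are invented for this illustration, and the real `ipython` machinery is considerably more elaborate.

```python
def run_cells(cells):
    """Run source strings in order, recording inputs and non-empty results."""
    history_in = [""]      # position 0 is a placeholder, so cell n sits at index n
    history_out = {}       # only cells that produce a value get an entry
    namespace = {}
    for number, source in enumerate(cells, start=1):
        history_in.append(source)
        try:
            value = eval(source, namespace)   # an expression yields a value
        except SyntaxError:
            exec(source, namespace)           # a statement (e.g. assignment) does not
            value = None
        if value is not None:
            history_out[number] = value
    return history_in, history_out

history_in, history_out = run_cells(["1 + 1", "x = 5", "x * 2"])
print(history_in[2])   # x = 5
print(history_out)     # {1: 2, 3: 10}
```

Every cell has an input, so a list indexed by cell number is enough; only some cells have a value, hence the dictionary with gaps in its keys.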

### Shell Commands

In [None]:
# On Linux/macOS, use instead: path = !pwd
# On Windows, `cd` without arguments prints the current directory
path = !cd
print(path)
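
The `!` syntax only works inside `ipython` and `jupyter`, and the shell command itself depends on the operating system. A portable plain `python` sketch of the same idea uses `os.getcwd` for the current directory and `subprocess.run` to capture the output of an external command as a list of lines; the child command here (another `python` process printing two words) is chosen purely for illustration.

```python
import os
import subprocess
import sys

# Portable replacement for `!pwd` / `!cd`.
path = os.getcwd()
print(path)

# Run an external command and capture what it prints.
result = subprocess.run(
    [sys.executable, "-c", "print('hello'); print('world')"],
    capture_output=True,
    text=True,
    check=True,
)
lines = result.stdout.splitlines()
print(lines)   # ['hello', 'world']
```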

So much for now.  There is more on the [`ipython`](http://ipython.org/) website.

## References
### `python`

* The `import` statement [[doc]](https://docs.python.org/3/reference/simple_stmts.html#import)
  provides access to packages, or their parts and elements.

In [None]:
import math
math.pi

In [None]:
from math import pi
pi

In [None]:
import numpy as np
np.array([1,2,3])

### `IPython`

* Built-in Magic Commands [[doc]](https://ipython.readthedocs.io/en/stable/interactive/magics.html)

### `Jupyter`

* Notebook Documentation [[doc]](https://jupyter-notebook.readthedocs.io/en/stable/)

## Exercises

1. Which `python` command provides access to additional packages?

2. In `IPython` and in `jupyter` notebooks, which operator provides access to the online documentation?

3. What is the difference between `?` and `??`?

4. Which key can you press for auto-completion of partial `python` code?

5. Which symbol starts a "magic" command?

6. What is the difference between "line magic" and "cell magic"?

7. In a `jupyter` notebook, what are the names of the variables that
provide access to historical input and output?

8. Is there a magic command for displaying the input history?  How can you see its documentation?

9. What are the names of the `python` variables that store the input and output history of a `jupyter` notebook?

10. Which symbol starts a shell command, issued from within a `jupyter` notebook?