# A1. Tools 

<img src="img/anaconda.png" width="450px">

<div class="alert alert-success">
Anaconda is an open-source distribution of Python, designed for scientific computing, data science and advanced computing.
</div>

<div class="alert alert-info">
The Anaconda website is 
<a href="https://www.anaconda.com" class="alert-link">here</a>.
</div>

Anaconda is a distribution, that is, a collection of packages that are curated and maintained together. Anaconda also comes with conda, which is a package manager, allowing you to download, install, and manage other packages. 

You can check which version of Python you are using at any time. Once you have installed Anaconda, make sure you are using Python version 3.6. To check, run `python --version` on the command line.

Here's what it looked like when I did it just now:

<img src="img/check-python-version.PNG" width="800px">
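
The same check can also be made from inside Python, for example in a notebook cell. The following is a minimal sketch; the helper names `python_version_string` and `is_at_least` are illustrative, and the 3.6 threshold is simply the version mentioned above.

```python
import sys


def python_version_string():
    """Return the running interpreter's version as 'major.minor.micro'."""
    info = sys.version_info
    return f"{info.major}.{info.minor}.{info.micro}"


def is_at_least(major, minor):
    """Return True if the running interpreter is at least major.minor."""
    return sys.version_info >= (major, minor)


print("Running Python", python_version_string())
print("At least Python 3.6:", is_at_least(3, 6))
```

Checking from inside the notebook tells you which interpreter the kernel is actually using, which may differ from the one your terminal finds first.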

<img src="img/jupyter.png" width="300px">

<div class="alert alert-success">
Jupyter notebooks are a way to combine executable code, code outputs, and text into one connected file (.ipynb). They run in a web browser and connect to a kernel to be able to execute code. 
</div>

<div class="alert alert-success">
The official Jupyter website is available 
<a href="http://jupyter.org" class="alert-link">here</a>.
</div>


You do not need to download Jupyter separately, as it comes packaged with Anaconda.

<img src="img/NBViewer.png" width="800px">

<div class="alert alert-success">
Notebooks can be rendered on webpages, and shared with others. NBViewer is a tool to host and render notebooks.
</div>

<div class="alert alert-info">
NBViewer is available 
<a href="https://nbviewer.jupyter.org/" class="alert-link">here</a>.
</div>

NBViewer is not a tool that you need to download, or necessarily use at all. It is simply a useful online tool for viewing notebooks.

<img src="img/git.png" width="400px">

<img src="img/github.png" width="400px">

<div class="alert alert-success">
Git is a version control system: a tool to track changes in files, across multiple locations.
</div>

<div class="alert alert-success">
GitHub is a web-based hosting service for Git repositories. It's a place to put code that is tracked with git.</div>

<div class="alert alert-info">
Install 
<a href="https://git-scm.com/book/en/v2/Getting-Started-Installing-Git" class="alert-link">git</a>,
if you don't already have it, and create an account on 
<a href="https://github.com/" class="alert-link">GitHub</a>.
</div>

Git and GitHub are not the same thing, though in practice they are commonly used together: git is the tool that version-controls your code and manages the copies on your computer, while GitHub hosts the remote repositories that those copies are synced with.

You can check that you have git installed (it doesn't matter which version for now) by running `git --version` on the command line.

Here's what it looked like when I did it just now:

<img src="img/check-git-version.PNG" width="800px">
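
If you would rather check from a notebook cell, the sketch below looks for git on the PATH and asks it for its version. The function name `git_version` is hypothetical, and the code assumes only the standard-library `shutil` and `subprocess` modules.

```python
import shutil
import subprocess


def git_version():
    """Return the output of `git --version`, or None if git is unavailable."""
    git_path = shutil.which("git")
    if git_path is None:
        # git is not installed, or not on the PATH.
        return None
    result = subprocess.run(
        [git_path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


version = git_version()
print(version if version is not None else "git was not found on the PATH")
```

Returning `None` instead of raising an error keeps the check safe to run on a machine where git has not been installed yet.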

## Git Cheatsheet

The most common git commands are:

- `git status`
    - Check the status of a git repository
- `git add 'file'`
    - Add a file to the staging area
- `git commit -m 'message'`
    - Log a 'save point' of all changes in the staging area
- `git push`
    - Copy commits to the remote
- `git diff 'file'`
    - Show changes in a file that have not yet been staged
- `git clone 'repo'`
    - Create a local copy of a git repository
- `git pull`
    - Update your local copy of a git repository from the remote

# Environments

<div class="alert alert-success">
Environments are isolated, independent installations of a programming language and groups of packages, that don't interfere with each other. 
</div>

<div class="alert alert-info">
Anaconda has detailed instructions on using environments available 
<a href="https://conda.io/docs/using/envs.html" class="alert-link">here</a>.
</div>

You do not need to use environments. However, you may find them useful if you want or need to maintain multiple versions of Python.

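If you want to use an environment, create one from the command line with a command such as `conda create -n envname python=3.6 anaconda`, replacing `envname` with a name for this environment. Running this will install a new env with Python 3.6 and the Anaconda distribution.

You will then need to activate this env every time you want to use it.

To activate your env, run `conda activate envname` (older conda versions use `source activate envname`, or `activate envname` on Windows).

To deactivate your env, run `conda deactivate` (older conda versions use `source deactivate`, or `deactivate` on Windows).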
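
To confirm which environment a notebook or script is actually running in, you can inspect the interpreter from Python. This is a rough sketch: the function name `active_environment` is hypothetical, and it relies on the `CONDA_DEFAULT_ENV` variable that conda sets when an environment is activated, so the value is `None` outside conda.

```python
import os
import sys


def active_environment():
    """Describe the Python environment this interpreter is running in."""
    return {
        # Set by conda when an environment is activated; None otherwise.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # Root directory of the running Python installation.
        "prefix": sys.prefix,
        # Path of the interpreter executable itself.
        "executable": sys.executable,
    }


for key, value in active_environment().items():
    print(f"{key}: {value}")
```

If the printed prefix does not point inside the environment you expected, the env was probably not activated before Jupyter was launched.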

# Useful parts of the standard (python) library


The full list of packages in the standard library is available [here](https://docs.python.org/3.6/library/index.html).

### Basic Utilities

- [os](https://docs.python.org/3.6/library/os.html) - miscellaneous operating system operations.
- [sys](https://docs.python.org/3.6/library/sys.html) - system operations.
- [datetime](https://docs.python.org/3.6/library/datetime.html) - manipulating dates & times.
- [glob](https://docs.python.org/3.6/library/glob.html) - searching path names.

### Useful Functions

- [math](https://docs.python.org/3.6/library/math.html) - mathematical functions.
- [random](https://docs.python.org/3.6/library/random.html) - (pseudo) random number generators.
- [re](https://docs.python.org/3.6/library/re.html) - regular expressions.

### File Formats

- [json](https://docs.python.org/3.6/library/json.html) - support for working with JSON files.
- [csv](https://docs.python.org/3.6/library/csv.html) - support for working with CSV files.

### Data Objects

- [collections](https://docs.python.org/3.6/library/collections.html) - container data types.
- [pickle](https://docs.python.org/3.6/library/pickle.html) - serializing & de-serializing (saving and loading complex objects).
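
As a small illustration of how a few of these modules fit together, the sketch below counts words with `re` and `collections`, round-trips the result through `json`, and does simple arithmetic with `datetime` and `math`. The sample sentence, the dates, and the helper name `word_counts` are made up for the example.

```python
import collections
import datetime
import json
import math
import re


def word_counts(text):
    """Count lowercase words in text using re and collections.Counter."""
    words = re.findall(r"[a-z']+", text.lower())
    return collections.Counter(words)


text = "The quick brown fox jumps over the lazy dog. The dog sleeps."
counts = word_counts(text)

# json turns plain Python data into a string and back again.
encoded = json.dumps(counts.most_common(2))
decoded = json.loads(encoded)

# datetime supports arithmetic on dates.
elapsed = datetime.date(2018, 1, 31) - datetime.date(2018, 1, 1)

print(decoded)         # [['the', 3], ['dog', 2]]
print(elapsed.days)    # 30
print(math.sqrt(16))   # 4.0
```

Note that JSON has no tuple type, so the `(word, count)` pairs come back from `json.loads` as two-element lists.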

# Useful Python Packages

Many of these packages are included in the Anaconda distribution; the rest can be installed with conda or pip.

### Core Packages

- [scipy](https://www.scipy.org) - mathematics, science, and engineering.
- [numpy](http://www.numpy.org) - numerical computing with arrays & array operations.
- [pandas](https://pandas.pydata.org) - data structures and data analysis.
- [scikit-learn](http://scikit-learn.org/stable/) - machine learning and data analysis.
    
### Text Mining

- [nltk](http://www.nltk.org) - natural language processing.
- [gensim](https://radimrehurek.com/gensim/) - topic modelling.

### Mathematics & Statistics

- [sympy](http://www.sympy.org/en/index.html) - symbolic mathematics.
- [statsmodels](http://www.statsmodels.org/stable/index.html) - statistical modelling.

### Web Scraping

- [requests](http://docs.python-requests.org/en/master/) - HTTP requests.
- [scrapy](https://scrapy.org) - web scraping.
    
### Plotting / Visualization Libraries
- [matplotlib](https://matplotlib.org) - 2D plotting library.
- [seaborn](https://seaborn.pydata.org/) - visualization (based on matplotlib).
- [bokeh](http://bokeh.pydata.org/en/latest/) - interactive visualizations.
    
### Graph Theory / Networks
- [networkx](https://networkx.github.io/) - network analysis.
- [graph-tool](https://graph-tool.skewed.de/) - manipulation and analysis of graphs.
    
### Deep Learning

- [theano](http://deeplearning.net/software/theano/) - mathematical operations on multi-dimensional arrays.
- [tensorflow](https://www.tensorflow.org/) - numerical computation using data flow graphs.
- [keras](https://keras.io) - a high-level neural network library.

# Typesetting in $\LaTeX$

#### Declaring $\LaTeX$

To specify that a chunk of text should be rendered with $\LaTeX$, you must enclose it in dollar signs (`$`), like so: `$chunk of text$`. The result should be: $chunk of text$. 

Note that $\LaTeX$ doesn't care about spaces, so if you want to have spaces, you have to add them in yourself, using backslashes: `$text\ with\ spaces$` for $text\ with\ spaces$.

Also, if you see any interesting $\LaTeX$ in a Markdown cell, you can double click the cell to check out the formatting. The same applies for cool Markdown syntax, like the table of contents above.

#### Basic equation formatting in $\LaTeX$

 Let's start with the equation for a line, `y = mx + b`. We can simply type `$y = mx + b$` to get:
 
 $y = mx + b$
 
 Equations often use superscripts and subscripts. You can add single-character super- and subscripts using `^` and `_`, respectively:
 
 $y_1 = m_1x_1^2 + b_1$
 
 If you want to put multiple characters in a super- or subscript, just enclose them in curly brackets: `^{your ad here}`.

$This^{works^{pretty^{well}}}_{but_{is_{hard_{to_{read}}}}}$!

#### Greek letters and weird mathy symbols $\LaTeX$

This [cheat sheet](http://web.ift.uib.no/Teori/KURS/WRK/TeX/symALL.html) is a nice reference point for $\LaTeX$'s non-Latin characters, including Greek letters, like $\alpha$ and $\beta$, and fancy mathematical symbols like $\therefore$ (pronounced "therefore" or "_ergo_") and $\forall$ (pronounced "for all").

Greek letters are so common that they have a special system: to produce upper and lower case versions of a Greek letter, you simply type `$\Letter$` or `$\letter$`, where `letter` is the letter's name, e.g. `\Gamma` or `\gamma` for $\Gamma$ or $\gamma$.

#### Fractions and derivatives 

To produce a fraction, type `$\frac{numerator}{denominator}$` to get $\frac{numerator}{denominator}$.

You can use the same command to produce (partial) derivative symbols, in combination with either a `d` or the `partial` command:

$\frac{d x}{d t}$

$\frac{\partial x}{\partial t}$

### Sums and Integrals

Both sums and integrals use a similar syntax: `$\sum_{start}^{finish}{stuff}$` and `$\int_{start}^{finish}{stuff}$` for 

$\sum_{start}^{finish}{stuff}$ and $\int_{start}^{finish}{stuff}$.

You may have noticed that the expressions are starting to get crowded. If you give $\LaTeX$ a bit more space, it can do a better job of displaying the math. You can do this by using double dollar signs (`$$`) instead of single dollar signs, which tells $\LaTeX$ to render the math on a new line: `$$\sum_{start}^{finish}{stuff}$$` gives you:

$$\sum_{start}^{finish}{stuff}$$

whether you put it on a new line or not! To make the raw text easier to read, we usually put the text on a different line anyway.

### Multiple Lines and Alignment

Double-dollars are also good for setting off a block of equations, like:

$$
    f'\ =\ f \\
    f \ =\ \mathrm{e}^x\ + C
$$

If we want to make this block prettier, we have to specify an `align` segment: we wrap the text in `\begin{align}` and `\end{align}`. We also have to specify, using ampersands (`&`), which points we wish to line up, usually the equals signs:

$$
\begin{align}
    f'\ &=\ f \\
    f \ &=\ \mathrm{e}^x\ + C
\end{align}
$$

### Vectors and Matrices

Vectors are often indicated by the addition of an arrow over a lowercase letter : $\vec{x}$, or `$\vec{x}$`. At other times, they are indicated with **boldface**, as in $\mathbf{x}$, or `$\mathbf{x}$`. Some people are paranoid enough to wear both belts and suspenders. They frequently indicate their vectors as $\vec{\mathbf{x}}$ (`$\vec{\mathbf{x}}$`) or $\mathbf{\vec{x}}$ (`$\mathbf{\vec{x}}$`).

Matrices are usually represented with boldface capital letters: $\mathbf{M}$. When this is more of a burden than an aid, they can be represented with plain capital letters: $M$.

When we multiply matrices with matrices or vectors, we usually use the convention adopted for multiplication of scalar values: we simply write the symbols next to each other, as in $\mathbf{M}\vec{x}$ or $A\vec{b}$ (the fancy-pants name for this is juxtaposition). Note the alignment mismatch due to the `\vec` arrow -- this is a peculiarity of setting $\LaTeX$ in a Jupyter notebook, and so boldface is preferred for indicating vectors. 

If we wish to be explicit about vector multiplication, we use the symbol $\cdot$ (`$\cdot$`), as in $x\cdot x^T$. Transposes can be indicated with a superscripted $\intercal$ (`$\intercal$`) or a superscripted T: $^T$
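
The conventions above name matrices and vectors but do not show how to write one out entry by entry. One way to do it is sketched below. The `bmatrix` environment and the entries $a$, $b$, $c$, $d$, $x_1$, $x_2$ are illustrative choices, not something fixed by the text above.

```latex
$$
\mathbf{M}\mathbf{x} =
\begin{bmatrix}
    a & b \\
    c & d
\end{bmatrix}
\begin{bmatrix}
    x_1 \\
    x_2
\end{bmatrix}
=
\begin{bmatrix}
    a x_1 + b x_2 \\
    c x_1 + d x_2
\end{bmatrix}
$$
```

As in the `align` block, `&` separates columns and `\\` ends a row.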

# Launch an .ipynb 

I launch Jupyter Notebook directly from my command line by running `jupyter notebook`. It looks like this:

<img src="img/1-open-jupyternb-2.png" width="800px">

A browser should pop up. It looks like this:

<img src="img/jupyter-launched.png" width="800px">

There are a number of keyboard shortcuts that make editing a Jupyter notebook easier. 

|          *Action*         |   *Shortcut*  |
|:-------------------------:|:-------------:|
| Run Cell                  | `shift+enter` |
| Run Cell and Insert Below |  `alt+enter`  |
| Insert Cell Above         |    `esc, a`   |
| Insert Cell Below         |    `esc, b`   |
| Convert Cell to Markdown  |    `esc, m`   |
| Convert Cell to Code      |    `esc, y`   |

Note that for the last four commands, you need to press `enter` afterwards to resume editing.

### Command Mode (press `Esc` to enable)

- ↩  : enter edit mode

- ⇧↩  : run cell, select below

- ⌃↩  : run cell

- ⌥↩  : run cell, insert below

- Y  : to code

- M  : to markdown

- R  : to raw

- 1  : to heading 1

- 2  : to heading 2

- 3  : to heading 3

- 4  : to heading 4

- 5  : to heading 5

- 6  : to heading 6

- ↑  : select cell above

- K  : select cell above

- ↓  : select cell below

- J  : select cell below

- A  : insert cell above

- B  : insert cell below

- X  : cut selected cell

- C  : copy selected cell

- ⇧V  : paste cell above

- V  : paste cell below

- Z  : undo last cell deletion

- D,D  : delete selected cell

- ⇧M  : merge cell below

- S  : Save and Checkpoint

- ⌘S  : Save and Checkpoint

- L  : toggle line numbers

- O  : toggle output

- ⇧O  : toggle output scrolling

- Esc  : close pager

- Q  : close pager

- H  : show keyboard shortcut help dialog

- I,I  : interrupt kernel

- 0,0  : restart kernel

- ␣  : scroll down

- ⇧␣  : scroll up

- ⇧  : ignore

### Edit Mode (press `Enter` to enable)

- ⇥  : code completion or indent

- ⇧⇥  : tooltip

- ⌘]  : indent

- ⌘[  : dedent

- ⌘A  : select all

- ⌘Z  : undo

- ⌘⇧Z  : redo

- ⌘Y  : redo

- ⌘↑  : go to cell start

- ⌘↓  : go to cell end

- ⌥←  : go one word left

- ⌥→  : go one word right

- ⌥⌫  : delete word before

- ⌥⌦  : delete word after

- Esc  : command mode

- ⌃M  : command mode

- ⇧↩  : run cell, select below

- ⌃↩  : run cell

- ⌥↩  : run cell, insert below

- ⌃⇧subtract  : split cell

- ⌃⇧-  : split cell

- ⌘S  : Save and Checkpoint

- ↑  : move cursor up or previous cell

- ↓  : move cursor down or next cell

- ⇧  : ignore

### Proper formatting of cells

You should keep your cells as simple and as coherent as possible. You should define one function, or maybe a handful of related functions, in a single cell, and that's about it.

A quick note: One of the major differences between coding in Python and coding in MATLAB is the importance of user-defined functions. In Python, functions are like any other variable -- they can be passed to functions, renamed, and even returned by other functions. We can also define them inline very easily, as we do below. 

This leaves us free to define small functions without all the overhead of extra files that happens in languages like MATLAB. Defining small functions that each handle one small part of a task helps us think about our problem in manageable pieces, reuse code as much as possible, and keep the logic of our code as simple as possible.

If we wanted to write code to bake cookies, we might define the following functions: `readRecipe`, `measureIngredient`, `mixIngredients`, `preheatOven`, and `timedBake`. That way, if we later need to bake a cake or make a salad, we already have most of the things we need!
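
To make the point about functions being ordinary values concrete, here is a minimal sketch. The names `cube`, `apply_twice`, and `make_scaler` are invented for this illustration.

```python
def cube(x):
    """Return x raised to the third power."""
    return x * x * x


def apply_twice(func, value):
    """Apply func to value, then apply func again to the result."""
    return func(func(value))


def make_scaler(factor):
    """Return a new function that multiplies its input by factor."""
    def scale(x):
        return factor * x
    return scale


# A function can be bound to a second name like any other value.
third_power = cube

# A function can be passed as an argument to another function.
ninth_power = apply_twice(cube, 2)

# A function can be returned by another function.
double = make_scaler(2)

# A small function can be defined inline with lambda.
add_one = lambda x: x + 1

print(third_power(3), ninth_power, double(5), add_one(1))  # 27 512 10 2
```

Each helper does one small job, which is the same idea as splitting the cookie recipe into `readRecipe`, `mixIngredients`, and so on.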

To make sure that somebody coming along later can use our function without having to read and understand the code, we begin with a "doc string", a brief explanation of what the function does. The formatting used below is standard.

You can view the doc string of any function, including the ones you write, by applying the function `help` to your function: `help(functionName)`. We'll do that below for our function.

    def square(x):
        """
        Take in a numeric variable x and return its square, y

        Parameters
        ----------
        x        : numeric variable that supports multiplication, number to square

        Returns
        -------
        y        : same type as x, square of the input
        """

        y = x*x

        return y

    help(square)

Output:

    Help on function square in module __main__:

    square(x)
        Take in a numeric variable x and return its square, y

        Parameters
        ----------
        x        : numeric variable that supports multiplication, number to square

        Returns
        -------
        y        : same type as x, square of the input


### Best practices for code cells

Here are some pieces of advice for when you're writing code cells. At the start of the course, you won't be writing your own cells, but this will change as the course goes on.

1. Keep your code cells short. If you find yourself having one massive code cell, break it up. If you can't break the cell up, try breaking your code into smaller pieces.
1. Always properly comment your code. Provide complete doc strings for any functions you define.
1. Do all of your imports in the first code cell at the top of the notebook. Import one module per line.

# MARKDOWN AND MARKDOWN CELLS 

Markdown cells contain text and equations.  The text is written in **markdown**, a very simple formatting language used for things like blogs. You can check out [this list of markdown basics](http://daringfireball.net/projects/markdown/syntax) to get most of what you need to know.  If you'd like to see the markdown used in this document, you can also "un-render" the text cells by double-clicking on them.

## MARKDOWN EMPHASIS 

Markdown treats asterisks (`*`) and underscores (`_`) as indicators of emphasis. Text wrapped with one `*` or `_` will be wrapped with an HTML `<em>` tag; double `*`’s or `_`’s will be wrapped with an HTML `<strong>` tag. E.g., this input:

*single asterisks*
<em>single asterisks</em>

_single underscores_
<em>single underscores</em>

**double asterisks**
<strong>double asterisks</strong>

__double underscores__
<strong>double underscores</strong>

<div class="alert alert-success">
  <strong>Success!</strong> Indicates a successful or positive action.
</div>

<div class="alert alert-info">
  <strong>Info!</strong> Indicates a neutral informative change or action.
</div>

<div class="alert alert-warning">
  <strong>Warning!</strong> Indicates a warning that might need attention.
</div>

<div class="alert alert-danger">
  <strong>Danger!</strong> Indicates a dangerous or potentially negative action.
</div>

# MARKDOWN HEADERS 

Markdown supports two styles of headers, underlined or closed. 

Setext-style headers are “underlined” using equal signs (for first-level headers) and dashes (for second-level headers). For example:

This is an H1
=============

This is an H2
-------------

Any number of underlining =’s or -’s will work.

> This is a blockquote with two paragraphs. Lorem ipsum dolor sit amet,
> consectetuer adipiscing elit. Aliquam hendrerit mi posuere lectus.
> Vestibulum enim wisi, viverra nec, fringilla in, laoreet vitae, risus.
> 
> Donec sit amet nisl. Aliquam semper ipsum sit amet velit. Suspendisse
> id sem consectetuer libero luctus adipiscing.

> This is a blockquote with two paragraphs. Lorem ipsum dolor sit amet,
consectetuer adipiscing elit. Aliquam hendrerit mi posuere lectus.
Vestibulum enim wisi, viverra nec, fringilla in, laoreet vitae, risus.

> Donec sit amet nisl. Aliquam semper ipsum sit amet velit. Suspendisse
id sem consectetuer libero luctus adipiscing.

> This is the first level of quoting.
>
> > This is nested blockquote.
>
> Back to the first level.

> ## This is a header.
> 
> 1.   This is the first list item.
> 2.   This is the second list item.
> 
> Here's some example code:
> 
>     return shell_exec("echo $input | $markdown_script");

- [ ] step one
- [ ] step two
- [ ] step three
- [ ] step four 
- [x] none

This is [an example](http://example.com/ "Title") inline link.

[This link](http://example.net/) has no title attribute.

<p>This is <a href="http://example.com/" title="Title">
an example</a> inline link.</p>

<p><a href="http://example.net/">This link</a> has no
title attribute.</p>