## Contents

- [The very basics](#the-very-basics)
	- [Variables and data types](#variables-and-data-types)
	- [Arithmetic](#arithmetic)
- [Basics of doing things](#basics-of-doing-things)
	- [Libraries](#libraries)
	- [Functions](#functions)
		- [Defining functions](#defining-functions)
		- [Testing functions](#testing-functions)
	- [Loops](#loops)
- [String processing](#string-processing)
	- [Regex](#regex)
- [Running external programs](#running-external-programs)
- [Functions by library](#functions-by-library)
	- [Names & objects by library](#names--objects-by-library)

----

## The very basics

- `help(function_name)`, or in IPython `?function_name` or `?variable.method_name`: get help
- `module.<TAB>`, `variable.<TAB>`, `dir(variable)`: get list of available functions, methods, etc.
- ``#``: Everything in a line after the octothorpe is a comment.

### Variables and [data types](https://docs.python.org/3/reference/datamodel.html)

- `m = 4`: set variable `m` to 4
- `%whos`/`%who`: display a table of all variables currently set (IPython only).
- `del m`: delete variable `m` from memory
- `print(x)`: display the value of `x`.
- Basic data types: `int`, `float`, `str`, `bool`. Use, say, `int(x)` to convert between them.
	- Convert numbers into strings with text replacement: `y = "%s" % 5`
- `type(x)`: return the data type of `x`
- `(10, 20, 30)` is a tuple of 3. Tuples are immutable.
- Lists: `m = ["A", "B", "C", "D", "E"]` is a list with 5 elements. Lists are mutable.
	- `m[0]` is "A"
		- `m[1:3]` is ["B", "C"] -- elements 1 *to* 3, but not *through* 3
		- `m[3:]` is ["D", "E"] -- elements 3 through the end
		- `m[:2]` is ["A", "B"] -- elements up to, but not including, 2
		- `m[-1]` is "E" -- the last one
	- `n = [ ["A", "B", "C"], ["D", "E", "F"] ]` is a list of 2 elements. Both elements
	  are lists, each with 3 elements.
		- `n[0][1]` is "B"
	- Modifying lists in place:
		- `m.append("F")`: add the value "F" to the end
		- `del m[0]`: delete the first element
		- `m.reverse()`: reverse the order of elements
		- Be careful with this. After `y = x`, if `x` is mutable, `y` and `x` are two names for
		  *the same data* -- modify `x` and you change `y`, and vice versa. To avoid
		  that, use `y = list(x)` (a shallow copy), or see `copy.deepcopy()`.
	- `len(m)`: number of elements in `m`
- Arrays:
	- `data[0, 0]`: first (top-left) value. (The first value is the row (Y-coord), the second 
	  value is the column (X-coord). They work like matrices, not Cartesian coordinates.)
		- ``data[0:4, 1:10]``: rows 0-3 (the 1st-4th) of columns 1-9 (the 2nd-10th)
		- ``data[:3, 10:]``: rows up to 3 (i.e. 0-2) of columns 10 onward
		- ``data[0:3, :]``: rows up to 3 (i.e. 0-2) of all columns
	- Array attributes:
		- ``data.shape``: the dimensions of the array ``data`` (rows, cols)
		- ``data.dtype``: the data type of elements in ``data``
- Dictionaries:
    - `dict([["green", 1], ["blue", 4], ["red", 143]])`: generates a dictionary with three 
      key/value pairs.
    - Iterate over a list `x` with `[x[i] for i in range(len(x)) if i>3]` (a list
      comprehension; this one keeps the elements at indices above 3)
    - `dict( [[k, v*2] for k,v in h.items()] )`: Apply a function to every value in a dictionary `h` ("hash")
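
The list and dictionary notes above can be tried out with a short sketch like the one below. The variable names (`alias`, `shallow`, `deep`, `doubled`) are illustrative only, and the dictionary comprehension is an equivalent spelling of the `dict([...])` form shown in the list.

```python
import copy

m = ["A", "B", "C", "D", "E"]

# Slicing never includes the stop index.
first_two = m[:2]        # ["A", "B"]
middle = m[1:3]          # ["B", "C"]
last = m[-1]             # "E"

# Plain assignment creates a second name for the same list.
alias = m
alias.append("F")        # m now ends with "F" as well

# list(x) makes a shallow copy: a new outer list, same inner lists.
n = [["A", "B", "C"], ["D", "E", "F"]]
shallow = list(n)
deep = copy.deepcopy(n)
n[0][1] = "changed"
# shallow[0][1] is "changed" too; deep[0][1] is still "B"

# Build a dictionary from key/value pairs, then double every value.
h = dict([["green", 1], ["blue", 4], ["red", 143]])
doubled = {k: v * 2 for k, v in h.items()}
```

For a flat list of strings or numbers, `list(x)` is usually enough; `copy.deepcopy()` matters once the list contains other mutable objects.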

-----

### Arithmetic

- Basic arithmetic follows the usual order of operations: `` 3 + 5 * 4`` returns ``23`` (multiplication happens before addition).
- Some array operations with ``numpy``:
	- ``numpy.mean(data, axis=0)``: returns a vector of the averages of each column in ``data`` (averaging down the rows)
	- ``numpy.mean(data, axis=1)``: returns a vector of the averages of each row in ``data`` (averaging across the columns)
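
The `axis` argument is easy to get backwards, so here is a small sketch on a made-up 3-by-4 array (the values are hypothetical, chosen only so the means are easy to check by hand). It also exercises the array indexing and attributes described earlier.

```python
import numpy

# A small array with 3 rows and 4 columns (hypothetical data).
data = numpy.array([[1.0, 2.0, 3.0, 4.0],
                    [5.0, 6.0, 7.0, 8.0],
                    [9.0, 10.0, 11.0, 12.0]])

print(data.shape)        # (3, 4): (rows, cols)
print(data[0, 0])        # 1.0, the top-left value
block = data[0:2, 1:3]   # rows 0-1 of columns 1-2

# axis=0 collapses the rows, leaving one mean per column.
col_means = numpy.mean(data, axis=0)   # [5., 6., 7., 8.]
# axis=1 collapses the columns, leaving one mean per row.
row_means = numpy.mean(data, axis=1)   # [2.5, 6.5, 10.5]
```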
    
    
----

## Basics of doing things

### Libraries

- ``import numpy``
	- ``import matplotlib.pyplot``: imports the ``pyplot`` module from the ``matplotlib``
	   library
	- `import numpy as np`: imports `numpy`, but lets you refer to it using just `np`
	  (e.g. `np.mean()`).
- ``numpy.mean()``: the function ``mean()`` drawn from library ``numpy``.
- ``numpy.<tab>``: In IPython and Jupyter Notebook, you can use tab completion to get a
   list of all functions in a library or attributes in an object.

### Functions

- Not all functions take arguments.

#### Defining functions

- `def`: keyword for function definition
- `assert condition, "message"`: if `condition` is false, raise an `AssertionError` carrying `message`

```python
def add1scalar(x):
    """adds 1 to scalar input"""
    # x is rebound locally; the caller's variable is not changed
    x += 1
    print("after add1scalar: ", x)
```

```python
def foo(arg1, arg2=0):
    """
    Return arg1 - 1 + arg2.
    arg2 is optional, 0 by default.
    good practice: include examples.
    Examples:

    >>> foo(5)
    4
    >>> foo(5, 8)
    12
    >>> foo(5, arg2=2)
    6
    """
    assert isinstance(arg1, int), "error message: here arg1 should be an integer"
    res = arg1 - 1 + arg2
    return res
```
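
As a further sketch of default arguments and `assert`, the hypothetical function `scale` below (not part of the notes above) shows what happens when the assertion fails: Python raises an `AssertionError` whose text is the message given after the comma.

```python
def scale(value, factor=2):
    """Return value multiplied by factor (factor defaults to 2)."""
    assert isinstance(value, (int, float)), "value should be a number"
    return value * factor

print(scale(5))             # 10, using the default factor
print(scale(5, factor=3))   # 15, overriding the default by name

# A failed assertion raises AssertionError with the given message.
try:
    scale("five")
except AssertionError as err:
    message = str(err)
    print("assertion failed:", message)
```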

#### Testing functions
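
One lightweight way to test a function is to run the examples already written in its docstring. The sketch below uses the standard-library `doctest` module on a shortened copy of `foo`; it is one possible approach, not the only one.

```python
import doctest

def foo(arg1, arg2=0):
    """
    Return arg1 - 1 + arg2.

    >>> foo(5)
    4
    >>> foo(5, 8)
    12
    """
    return arg1 - 1 + arg2

# Run the examples in foo's docstring. Nothing is printed when they
# all pass; a mismatch prints the expected and actual output.
doctest.run_docstring_examples(foo, {"foo": foo}, name="foo")
```

In a script or module, `doctest.testmod()` does the same for every docstring in the file at once.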

----

### Loops

[for, if/elif/else, while, try/finally]
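
A minimal sketch of the constructs named above, using made-up values; the variable names are illustrative only.

```python
values = [3, -1, 0, 7]

# for + if/elif/else: classify each value.
labels = []
for v in values:
    if v > 0:
        labels.append("positive")
    elif v < 0:
        labels.append("negative")
    else:
        labels.append("zero")

# while: repeat until the condition becomes false.
countdown = []
n = 3
while n > 0:
    countdown.append(n)
    n -= 1

# try/except/finally: the finally block runs whether or not an error occurs.
log = []
try:
    result = 10 / values[2]      # values[2] is 0, so this raises ZeroDivisionError
except ZeroDivisionError:
    result = None
    log.append("division by zero")
finally:
    log.append("done")
```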

----

## String processing

- Given `b = "Ursus arctos horribilis"`:
	- `b.split(" ")` returns ["Ursus", "arctos", "horribilis"]
	- `b. [???]  ` returns... **what?**
	- `b.strip()` removes leading & trailing newlines, tabs, & spaces
- Insert into strings:
	- The simplest way: `print("Grizzly bear is", b)` prints "Grizzly bear is Ursus arctos horribilis" (`print` puts a space between its arguments)
	- Better: `print("Grizzlies (%s) are large" % b)` prints "Grizzlies (Ursus arctos horribilis) are large"
	- `%s` inserts as strings, `%d` as integers
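
The string methods and `%` formatting above can be checked with a few lines. The padded string passed to `strip()` and the `counted` sentence are made up for illustration.

```python
b = "Ursus arctos horribilis"

words = b.split(" ")                    # ["Ursus", "arctos", "horribilis"]
clean = "  \tUrsus arctos\n".strip()    # "Ursus arctos"

# %-formatting: %s inserts a string, %d an integer.
sentence = "Grizzlies (%s) are large" % b
# Several values go in a tuple, in the order the placeholders appear.
counted = "%s has %d words" % (b, len(words))

print(sentence)
print(counted)
```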

### Regex

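- `re.search(r'pattern', item)`: Basic regex search, finding `pattern` in `item`. Returns a match object, or `None` if nothing matches.
- `re.findall(r'pattern', item)`: Returns a list of all non-overlapping matches of `pattern` in `item`.
- `re.sub()`: String substitution based on regex, e.g. `taxon="Homo sapiens" ; re.sub(r'^(\S).* ([^\s]+)', r'\1_\2', taxon)` returns "H_sapiens"
- `re.split()`: String slicing based on regex, e.g. `re.split(r'\s+', "Ursus  arctos")` returns ["Ursus", "arctos"]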
- The results of a regex are stored in a "match object", e.g. `mo`:
    - `mo.group()`: what string matched the regex pattern
    - `mo.start()`: the position where the match starts (zero-indexed)
    - `mo.end()`: the position just after the last matched character, so `item[mo.start():mo.end()]` is the matched string
    - `mo.group(1)`, `mo.group(2)`, etc.: groups captured with parentheses within the pattern
    - `mo.start(1)`: the position where the match for group 1 starts
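
The following sketch runs each of the `re` functions above on short example strings. The patterns and the `"wolf, bear;lynx"` input are illustrative choices, not taken from the notes.

```python
import re

taxon = "Homo sapiens"

# re.search returns a match object (or None when nothing matches).
mo = re.search(r"(\S+) (\S+)", taxon)
whole = mo.group()       # "Homo sapiens"
genus = mo.group(1)      # "Homo"
species = mo.group(2)    # "sapiens"
span = (mo.start(), mo.end())   # (0, 12): end is one past the last character
species_start = mo.start(2)     # 5

all_words = re.findall(r"\S+", taxon)                  # ["Homo", "sapiens"]
short = re.sub(r"^(\S).* ([^\s]+)", r"\1_\2", taxon)   # "H_sapiens"
pieces = re.split(r"[,;]\s*", "wolf, bear;lynx")       # ["wolf", "bear", "lynx"]
```

Because `re.search()` can return `None`, real code usually checks the result before calling `.group()`.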

----


## Running external programs

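* Module `subprocess`
    * `subprocess.call("program")`: Runs `program`, waits for it to finish, and returns its
       exit status.
    * `subprocess.check_output("program")`: Runs `program` and returns its output; raises
       `subprocess.CalledProcessError` if the exit status is non-zero.
    * `subprocess.check_output("program & its arguments", shell=True)`: Run the included 
       arguments as if in the shell.
    * `subprocess.run()`: The general-purpose interface (Python 3.5+); returns a
       `CompletedProcess` object holding the exit status and, if captured, the output.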
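
A hedged sketch of the three calls side by side. To avoid depending on any particular shell command, it runs the current Python interpreter (`sys.executable`) as the "external program"; that choice is an assumption made for portability. The `text=True` and `capture_output=True` arguments need Python 3.7 or later.

```python
import subprocess
import sys

# The external program: this same Python interpreter, printing one line.
cmd = [sys.executable, "-c", "print('hello from a child process')"]

# call(): run the program and get its exit status back (0 means success).
exit_code = subprocess.call(cmd)

# check_output(): get what the program printed; raises CalledProcessError
# if the program exits with a non-zero status.
output = subprocess.check_output(cmd, text=True)

# run(): one CompletedProcess object with both the status and the output.
done = subprocess.run(cmd, capture_output=True, text=True, check=True)
print(done.returncode, done.stdout.strip())
```

Passing the command as a list avoids `shell=True`, which matters whenever part of the command comes from user input.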


## Functions by library

- Built-in functions
	- `print(x)`: Show the value of `x`
	- `sorted(x)`: Returns a new list with the elements of `x` in sorted order; `x` itself is unchanged.
- `copy`
	- `copy.copy(x)`: Shallow copy operation
	- `copy.deepcopy(x)`: Deep copy operation
- `doctest`
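    - `doctest.testmod(x)`: Runs the examples found in the docstrings of module `x` (the
      current module if `x` is omitted) and reports any whose output differs.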
- `glob`
	- `glob.glob('pattern')`: Returns a list of all filenames in the current directory that
	  match `pattern` (shell-style wildcards such as `*.csv`)
- `matplotlib.pyplot`: Basic graphing library.
	- `%matplotlib inline`: Show graphs in the notebook, when using IPython or Jupyter
	- `matplotlib.pyplot.plot(list)`: Plots values in `list` as a line graph.
	- `matplotlib.pyplot.show()`: Shows whichever graph was just generated.
	- `fig = matplotlib.pyplot.figure()`: Creates a space into which you can tile multiple plots.
      **See the Software Carpentry example, below.**
		- `graph1 = fig.add_subplot(x, y, z)`: Puts the plot `graph1` into position `z` 
          (counting L-R & top-bottom) of an `x`-by-`y` grid of subplots
		- `graph1.set_ylim(0,6)`: Sets y-limits for `graph1`
		- `graph1.set_xlabel("time")`: Labels the x-axis of `graph1` as "time"
		- `fig.tight_layout()`: Adjusts spacing so the subplots and their labels do not overlap

```python
import numpy
import matplotlib.pyplot

data = numpy.loadtxt(fname='data/inflammation-01.csv', delimiter=',')

fig = matplotlib.pyplot.figure(figsize=(10.0, 3.0))

axes1 = fig.add_subplot(1, 3, 1)
axes2 = fig.add_subplot(1, 3, 2)
axes3 = fig.add_subplot(1, 3, 3)

axes1.set_ylabel('average')
axes1.plot(numpy.mean(data, axis=0))

axes2.set_ylabel('max')
axes2.plot(numpy.max(data, axis=0))

axes3.set_ylabel('min')
axes3.plot(numpy.min(data, axis=0))

fig.tight_layout()

matplotlib.pyplot.show()
```

- `numpy`: various math and array functions.
	- `numpy.loadtxt(fname="filegoeshere.csv", delimiter=",")`: Loads the contents 
	  of `filegoeshere.csv` as an array, assuming comma-delimitation.
	- `numpy.absolute(x)`: element-wise absolute value, like `abs(x)`
	- `A = numpy.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])`: define a NumPy array
		- `numpy.hstack([A, A])`: sticks a copy of `A` on the horizontal end of `A`
		- `numpy.vstack([A, A])`: same but vertically
	- `numpy.max()`: returns the largest value in an array. `numpy.max(data[2, :])`: returns
      the largest value for all columns in row 2 of array `data`
	- `numpy.diff(data, axis=1)`: returns an array where each value is the difference between
      2 adjacent values from `data`, going along each row (i.e. differences between days for a
      patient, not patients on one day)
	- `numpy.mean(data, axis=0)`: gets a list of averages for each column in `data` (`axis=1`
      gets row averages)
	- `numpy.std(data)`: gets the standard deviation for all data in the array
- `os`: [Miscellaneous operating system interfaces](https://docs.python.org/3/library/os.html).
	- `os.getcwd()`: Returns the working directory path (like `pwd` in bash)
	- `os.chdir(x)`: Changes the current working directory (like `cd` in bash) to `x`.
	- `os.path.realpath(x)`: Gets the true path (no symbolic links) to the specified
	  filename.
	- `os.path.relpath(path, start=os.curdir)`: Returns `path` expressed relative to `start`
	  (the current directory by default).
	- `os.path.dirname(os.path.realpath(__file__))`: Returns the path to the directory
	  containing this Python file. Only works if you haven't used `os.chdir()` yet.
	- `os.mkdir(x)`: Creates the folder `x` (see 
      <https://docs.python.org/3/library/os.html#os.mkdir>)
- `subprocess`: see [Running external programs](#running-external-programs).
- `time`: [Time-related functions](https://docs.python.org/3/library/time.html).
	- `time.ctime()`: current date & time, as a string
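
To make the `numpy` entries in the list above concrete, the sketch below applies them to the small array `A` defined there. The result names (`wide`, `tall`, `steps`, `spread`) are illustrative only.

```python
import numpy

A = numpy.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

wide = numpy.hstack([A, A])      # shape (3, 6): two copies side by side
tall = numpy.vstack([A, A])      # shape (6, 3): two copies stacked

row_max = numpy.max(A[2, :])     # 9, the largest value in row 2

# Differences between neighbouring values along each row.
steps = numpy.diff(A, axis=1)    # [[1, 1], [1, 1], [1, 1]]

# Standard deviation over every element of the array.
spread = numpy.std(A)
print(wide.shape, tall.shape, row_max, spread)
```

Note that `numpy.diff` along `axis=1` returns one fewer column than its input, since each output value sits between two input values.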


### Names & objects by library

- `os`: [Miscellaneous operating system interfaces](https://docs.python.org/3/library/os.html).
	- `os.curdir`: The string the operating system uses to refer to the current directory (`"."` on Linux, macOS, and Windows)
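
A short sketch tying `os.curdir` to the `os` functions listed earlier. It works inside a temporary directory (via the standard-library `tempfile` module, an addition for safety) so nothing is left behind; the folder name `results` is hypothetical.

```python
import os
import tempfile

start_dir = os.getcwd()                 # like pwd in bash

with tempfile.TemporaryDirectory() as tmp:
    new_dir = os.path.join(tmp, "results")   # hypothetical folder name
    os.mkdir(new_dir)

    # Path to new_dir relative to tmp: just "results".
    rel = os.path.relpath(new_dir, start=tmp)

    os.chdir(new_dir)                   # like cd in bash
    here = os.path.realpath(os.getcwd())

    # os.curdir is the string ".", which refers to wherever we are now.
    same = os.path.realpath(os.curdir) == here

    os.chdir(start_dir)                 # go back before tmp is removed

print(rel, os.curdir, same)
```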

---