# Welcome to QBB2020!

Please use this Jupyter Notebook to complete the Python preparatory work. These exercises are based on the excellent [Programming with Python](https://swcarpentry.github.io/python-novice-inflammation/) materials created by Software Carpentry.

# Analyzing Patient Data Overview [[ref](https://swcarpentry.github.io/python-novice-inflammation/02-numpy/index.html)]

- Questions: 
    - How can I process tabular data files in Python?
- Objectives:
    - Explain what a library is and what libraries are used for.
    - Import a Python library and use the functions it contains.
    - Read tabular data from a file into a program.
    - Select individual values and subsections from data.
    - Perform operations on arrays of data.
- Estimated Time: 60 min

Words are useful, but what's more useful are the sentences and stories we build with them.
Similarly, while a lot of powerful, general tools are built into Python,
specialized tools built up from these basic units live in
[libraries](https://swcarpentry.github.io/python-novice-inflammation/reference/#library)
that can be called upon when needed.


# Loading data into Python

To begin processing inflammation data, we need to load it into Python.
We can do that using a library called
[NumPy](http://docs.scipy.org/doc/numpy/ "NumPy Documentation"), which stands for Numerical Python.
In general, you should use this library when you want to do fancy things with lots of numbers,
especially if you have matrices or arrays. To tell Python that we'd like to start using NumPy,
we need to [import](https://swcarpentry.github.io/python-novice-inflammation/reference/#import) it:

In [None]:
import numpy

Importing a library is like getting a piece of lab equipment out of a storage locker and setting it
up on the bench. Libraries provide additional functionality to the basic Python package, much like
a new piece of equipment adds functionality to a lab space. Just like in the lab, importing too
many libraries can sometimes complicate and slow down your programs, so we only import what we
need for each program.

Once we've imported the library, we can ask the library to read our data file for us:


In [None]:
numpy.loadtxt(fname='data/inflammation-01.csv', delimiter=',')

The expression `numpy.loadtxt(...)` is a [function call](https://swcarpentry.github.io/python-novice-inflammation/reference/#function-call)
that asks Python to run the [function](https://swcarpentry.github.io/python-novice-inflammation/reference/#function) `loadtxt` which
belongs to the `numpy` library. This [dotted notation](https://swcarpentry.github.io/python-novice-inflammation/reference/#dotted-notation)
is used everywhere in Python: the thing that appears before the dot contains the thing that
appears after.

As an example, John Smith is the John that belongs to the Smith family.
We could use the dot notation to write his name `smith.john`,
just as `loadtxt` is a function that belongs to the `numpy` library.
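As a small illustration of dotted notation, the sketch below reaches into the `numpy` library for one function and one constant. The names `numpy.sqrt` and `numpy.pi` are not used elsewhere in this lesson; they are picked here only because they are short.

```python
import numpy

# `numpy.sqrt` is the function `sqrt` that lives inside the `numpy` library.
root = numpy.sqrt(16.0)
print(root)

# The same notation reaches values as well as functions: `numpy.pi` is a constant.
print(numpy.pi)
```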

In this call we give `numpy.loadtxt` two [parameters](https://swcarpentry.github.io/python-novice-inflammation/reference/#parameter): `fname`, the name of the file
we want to read, and `delimiter`, the [delimiter](https://swcarpentry.github.io/python-novice-inflammation/reference/#delimiter) that separates values on
a line. These both need to be character strings (or [strings](https://swcarpentry.github.io/python-novice-inflammation/reference/#string)
for short), so we put them in quotes.
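If the data file is not at hand, the same call can be tried on a few lines of made-up text. This is only a sketch: `fake_file` and its contents are invented for illustration and are not part of the inflammation data set. It relies on `numpy.loadtxt` accepting a file-like object in place of a file name.

```python
import io
import numpy

# Made-up stand-in for a CSV file: two rows of three comma-separated values.
fake_file = io.StringIO("0,1,2\n3,4,5\n")

# Same parameters as before, but reading from in-memory text instead of a path.
small = numpy.loadtxt(fname=fake_file, delimiter=',')
print(small)
print(small.shape)
```

Changing `delimiter` to a character that does not appear in the text should make the call fail, which is a quick way to see why the parameter matters.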

Since we haven't told it to do anything else with the function's output,
the [notebook](https://swcarpentry.github.io/python-novice-inflammation/reference/#notebook) displays it.
In this case,
that output is the data we just loaded.
By default,
only a few rows and columns are shown
(with `...` to omit elements when displaying big arrays).
Note that, to save space when displaying NumPy arrays, Python does not show us trailing zeros, so `1.0` becomes `1.`.
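Both display behaviours can be seen on small made-up arrays. The sketch below assumes NumPy's default print settings; the arrays `short` and `long_array` are invented for illustration.

```python
import numpy

# Whole-number floats are shown without trailing zeros.
short = numpy.array([1.0, 2.0, 3.0])
print(short)

# With default settings, an array this large is abbreviated with `...`.
long_array = numpy.arange(2000)
print(long_array)
```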


## Importing libraries with shortcuts

## Data Type

## In the Corner

# Slicing data

# Analyzing data

# Exercise 2.1: Slicing Strings

In [None]:
element = 'oxygen'
print('first three characters:', element[0:3])
print('last three characters:', element[3:6])

What is the value of `element[:4]`? What about `element[4:]`? Or `element[:]`?

What is `element[-1]`? What is `element[-2]`?

Given those answers, explain what `element[1:-1]` does.

How can we rewrite the slice for getting the last three characters of `element`, so that it works even if we assign a different string to `element`? Test your solution with the following strings: `carpentry`, `clone`, `hi`.

# Exercise 2.2: Thin Slices

The expression `element[3:3]` produces an empty string, i.e., a string that contains no characters. If `data` holds our array of patient data, what does `data[3:3, 4:4]` produce? What about `data[3:3, :]`?

# Exercise 2.3: Stacking Arrays

In [None]:
import numpy

A = numpy.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print('A = ')
print(A)

B = numpy.hstack([A, A])
print('B = ')
print(B)

C = numpy.vstack([A, A])
print('C = ')
print(C)

Write some additional code that slices the first and last columns of `A`, and stacks them into a 3x2 array. Make sure to `print` the results to verify your solution.

# Exercise 2.4: Change in Inflammation

# Key Points

- Import a library into a program using `import libraryname`.
- Use the `numpy` library to work with arrays in Python.
- The expression `array.shape` gives the shape of an array.
- Use `array[x, y]` to select a single element from a 2D array.
- Array indices start at 0, not 1.
- Use `low:high` to specify a `slice` that includes the indices from `low` to `high-1`.
- Use `# some kind of explanation` to add comments to programs.
- Use `numpy.mean(array)`, `numpy.max(array)`, and `numpy.min(array)` to calculate simple statistics.
- Use `numpy.mean(array, axis=0)` or `numpy.mean(array, axis=1)` to calculate statistics across the specified axis.
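
The key points above can be tried together on a small array. This is a minimal sketch: the 3x4 array `data` below is made up for illustration and stands in for the patient data.

```python
import numpy

# Made-up 3x4 array standing in for the patient data.
data = numpy.array([[0, 1, 2, 3],
                    [4, 5, 6, 7],
                    [8, 9, 10, 11]])

print(data.shape)                # (number of rows, number of columns)
print(data[0, 0])                # first row, first column: indices start at 0
print(data[0:2, 1:3])            # rows 0 and 1, columns 1 and 2; upper bounds are excluded
print(numpy.mean(data))          # mean of every value in the array
print(numpy.mean(data, axis=0))  # one mean per column
print(numpy.mean(data, axis=1))  # one mean per row
```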