Welcome to BME231! In this class we will use Python as a tool to solve problems in Biomedical Engineering.  In particular, we will explore applications that require handling and analysis of data sets.

This is a Jupyter notebook.  It is a great way for me to give you text descriptions and testable, executable code.

The gray boxes with In [number]: are Cells.  You can put code snippets in a cell and run it using the icons in the toolbar, the dropdown menu, or just hitting shift-return from inside the cell.  Output or product from the cell shows up below the cell.

# Intro to Python Language Syntax

## Objects

An *object* is an entity that contains data, along with associated metadata and/or functionality. 

* In Python *everything* (even a function) is an object, which means every entity has some metadata (called attributes) and associated functionality (called methods). 

* These attributes and methods are accessed via the dot syntax.
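
Here is a minimal sketch of the dot syntax, using an ordinary float. The variable name `x` is just for illustration; `real`, `imag`, and `is_integer` are attributes and a method that every Python float already has.

```python
x = 4.5

# Attributes are data attached to the object: dot syntax, no parentheses
print(x.real, x.imag)         # 4.5 0.0

# Methods are functions attached to the object: dot syntax plus parentheses
print(x.is_integer())         # False, because 4.5 is not a whole number

# Even the method is itself an object with its own attributes
print(x.is_integer.__name__)  # is_integer
```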

Syntax refers to the structure of the language (i.e., what constitutes a correctly-formed program).

You'll see some similarities in Python to other programming languages.  Practice will help you become fluent in Python syntax.


### Comments


Comments in Python are indicated by a pound sign (``#``), and anything on the line following the pound sign is ignored by the interpreter.
This means that you can have stand-alone comments as well as inline comments that follow a statement. For example:


In [None]:
# Starting with the # means the whole line is a comment
x = 3
x += 2  # shorthand for x = x + 2
print(x)

### Think of variables as POINTERS

(from Whirlwind Tour of Python)

Assigning variables in Python is as easy as putting a variable name to the left of the equals (``=``) sign:

```python
# assign 4 to the variable x
x = 4
```

This may seem straightforward, but if you have the wrong mental model of what this operation does, the way Python works may seem confusing.
We'll briefly dig into that here.

In many programming languages, variables are best thought of as containers or buckets into which you put data.
So in C, for example, when you write

```C
// C code
int x = 4;
```

you are essentially defining an integer "memory bucket" named ``x``, and putting the value ``4`` into it.
In Python, by contrast, variable names are best thought of not as containers but as pointers.
So in Python, when you write

```python
x = 4
```

you are essentially defining a *pointer* named ``x`` that points to some other bucket containing the value ``4``.
Note one consequence of this: because Python variables just point to various objects, there is no need to "declare" the variable, or even require the variable to always point to information of the same type!
This is the sense in which people say Python is *dynamically-typed*: variable names can point to objects of any type.
So in Python, you can do things like this:

In [None]:
x = 1         # x is an integer
print(x)
x = 'hello'   # now x is a string
print(x)
x = [1, 2, 3] # now x is a list
print(x)
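
One way to see the "pointer" idea in action is to let two names point to the same list. This is a small sketch; the names `first` and `second` are made up for the example, and lists and `append` are covered in more detail later in this notebook.

```python
x = 1
print(type(x))    # the name x points to an int object

x = 'hello'
print(type(x))    # the same name now points to a str object

first = [1, 2, 3]
second = first     # no copy is made; both names point to one list
second.append(4)   # change the list through one name...
print(first)       # ...and the change shows up through the other: [1, 2, 3, 4]
```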

### Using Objects with functions and methods

In [None]:
# Example of using objects, with the method 'append'

my_list = [10, 3.14159, 'tiger']  # the square brackets identify this group as a list
# notice that the list contains an integer, a floating point number ('float'), and a word ('string')
print(my_list)  # the print function displays what's inside the parentheses as output

my_list.append(['phone',5])  # append is a method of list objects; here it adds the list ['phone', 5] as ONE new element
print(my_list)

## Types of Objects: Scalar types

Remember that a scalar is a single element as opposed to a vector, matrix, list, etc.  In Python, scalars can be numeric or non-numeric.

**<center>Python Scalar Types</center>**

| Type        | Example        | Description                                                  |
|-------------|----------------|--------------------------------------------------------------|
| ``int``     | ``x = 1``      | integers (i.e., whole numbers)                               |
| ``float``   | ``x = 1.0``    | floating-point numbers (i.e., real numbers)                  |
| ``complex`` | ``x = 1 + 2j`` | Complex numbers (i.e., numbers with real and imaginary part) |
| ``bool``    | ``x = True``   | Boolean: True/False values                                   |
| ``str``     | ``x = 'abc'``  | String: characters or text                                   |
| ``NoneType``| ``x = None``   | Special object indicating nulls                              |

Let's talk about a few of these in more detail.

### Integers

In [None]:
# Python integers are "variable" precision, meaning that the number of bits is not fixed
# This means that very large numbers indeed can be calculated and returned without overflow.
# The ** operator means "to the power of"

2 ** 200
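
As a quick check on the "no overflow" claim, the sketch below inspects the result. The name `big` is only for illustration; `bit_length` is a built-in method of Python integers.

```python
big = 2 ** 200

print(type(big))         # still an ordinary int
print(big.bit_length())  # 201 bits are needed to store this value
print(big + 1 > big)     # True: arithmetic stays exact at this size
```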

In [None]:
# dividing two integers with a single / gives you a float
#5 / 2
6 / 2

In [None]:
# if you want integer (floor) division, use //  (the fractional part is dropped)
5 // 2
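
The cells above can be summarized side by side. This is just a sketch with illustrative names `a` and `b`; note that `//` rounds down, which matters for negative numbers.

```python
a, b = 7, 2

print(a / b)    # 3.5  (true division always gives a float)
print(a // b)   # 3    (floor division)
print(a % b)    # 1    (remainder)
print(-7 // 2)  # -4   (rounds down, not toward zero)

# floor division and remainder fit together like this
print((a // b) * b + (a % b) == a)  # True
```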

### Floats

A 'float' (short for 'floating-point number') is a number stored with a decimal point, so it can hold fractional values.  A whole number written with a decimal point, such as ``1.0``, is also a float.

In [None]:
# scientific notation
x = 0.000005
y = 5e-6 # 5 x 10 ** -6
print(x)
print(y)
print(x == y)  # single = means assignment; double == means "are these two things equal"
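
One caution when comparing floats with `==`: floats are stored in binary with limited precision, so results that look equal on paper can differ slightly. The sketch below shows the usual workaround, `math.isclose` from the standard library; the variable name `total` is only for illustration.

```python
import math

total = 0.1 + 0.2

print(total == 0.3)              # False: tiny rounding error in the sum
print(math.isclose(total, 0.3))  # True: equal within a small tolerance
```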

### Strings

A string is a sequence of characters.  Most of the time strings hold text, but they can hold any characters, including digits: ``'42'`` is a string and is not treated as a number.

Strings can be marked with single or double quotes.

In [None]:
question = 'what are we having for supper?'
answer = "macaroni and cheese."
print(question)
print(answer)

There are many built-in functions that can be performed on strings; there are also many built-in **methods** that can be utilized using the **dot syntax**. Notice that all functions and methods require ( ), even if there are no arguments.

In [None]:
# len tells you the length of a string: the number of characters, counting spaces and punctuation
len(answer)

In [None]:
# Let's make our answer in all upper case using the dot syntax
print(answer.upper())


In [None]:
# Now let's capitalize the first letter of our question and also the answer (capitalize also makes the rest lower case)
# notice the () after the built-in method capitalize
print(question.capitalize())
print(answer.capitalize())

In [None]:
# We can concatenate strings, meaning attach one to the end of another, using +
question + answer

In [None]:
# can we multiply strings? yes, it's multiple concatenation
5 * answer

Did you notice that the upper and capitalize built-in methods did not change the original objects?

In [None]:
# check the contents of the variables question and answer
print(question)
#question
print(answer)

How can we save or capture changes made by built-in methods?

Save the result to a variable.

In [None]:
Q = question.capitalize()
A = answer.upper()
print(Q)
print(question)
print(A)
print(answer)

### Bools

The Boolean type is a simple type with two possible values: ``True`` and ``False``, and is returned by comparison operators.  Notice that you must use capital T and capital F for True and False in Booleans.

In [None]:
x = 4
y = 5
outcome = (x < y); # this structure creates the Boolean variable outcome, with answer True or False
print(outcome)
x<y

### NoneType

NoneType is the type for the None object, which is an object that indicates no value. None is the return value of functions that "don't return anything", like print. (from Stack Exchange)

In [1]:
# print does not explicitly return a value
# more precisely, it returns the value None, whose type is NoneType
return_value = print('abc')
# the print(f"....") structure gives a formatted output
print(f"return value = {return_value}")

abc
return value = None


## Types of Objects: Sequence Types

### Lists

* *ordered*, *mutable*, heterogeneous (data types can be mixed)
* Defined with comma-separated values between square brackets

What does it mean to be mutable or immutable?

*Mutable* objects can be changed after they are created
* `list`, `dict`, `set`, `DataFrame` 

*Immutable* objects cannot be changed after they are created
* "Simple" (aka "scalar") types: `int`, `float`, `complex`, `str`, `bool`, `NoneType`
* `tuple`
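
A short sketch of the difference, using a list (mutable) and a string (immutable). The names `samples` and `label` are invented for the example.

```python
# A list is mutable: we can change an element in place
samples = [1, 2, 3]
samples[0] = 99
print(samples)            # [99, 2, 3]

# A string is immutable: changing a character in place is an error
label = 'abc'
try:
    label[0] = 'X'
except TypeError as err:
    print('TypeError:', err)

# Instead, build a new string and point the name at it
label = 'X' + label[1:]
print(label)              # Xbc
```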

So in a list, the order matters, it can be changed after it's created, and it can be made up of different data types, like integers, strings, etc.  

You know it's a list if you see square brackets.  [   ]  

In [3]:
my_list = [3, 5.6, 'water', 4E3]
print(my_list)

[3, 5.6, 'water', 4000.0]


Since a list is ordered, you can refer to elements in the list by their index values.  Indexing in Python starts at zero, and a slice (a range of index locations) does NOT include the stop value.  This means that x[1:3] would give you the items at index 1 and index 2, which are the second and third items in list x.
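
Here is a small sketch of indexing and slicing on a list of letters, so the positions are easy to follow. The list `letters` is made up for the example.

```python
letters = ['a', 'b', 'c', 'd', 'e']

print(letters[0])    # a           (first item has index 0)
print(letters[1:3])  # ['b', 'c']  (index 1 and 2; index 3 is excluded)
print(letters[:2])   # ['a', 'b']  (a missing start means "from the beginning")
print(letters[3:])   # ['d', 'e']  (a missing stop means "through the end")
print(letters[-1])   # e           (negative indices count from the end)
```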

In [None]:
print(my_list[2])
print(my_list[0:2])
print(my_list[:3])


In [4]:
# adding concatenates a list
new_list = my_list + [56, 8.7, 'bird']
print(new_list)

[3, 5.6, 'water', 4000.0, 56, 8.7, 'bird']


In [None]:
# If the items in your list can be compared with each other (e.g., all numbers), you can call the method sort to sort in place
listy = [4, 6, 8, 9.2, 9.1, 0]
print(listy)
listy.sort()
print(listy)
print(listy[2])


In [None]:
# You can change the value of items in a list
print(new_list)
new_list[2]= 'birdbath'
print(new_list)

### Tuples

Tuples are immutable, meaning that they can't be changed after creation.  Other than that, you can perform the same kinds of operations on tuples as on lists (meaning all methods except those that change values).

Tuples are defined by commas, and usually enclosed in regular parentheses. (   )

Tuples are often used in Python programs; a particularly common case is in functions that have multiple return values.
For example, the ``as_integer_ratio()`` method of floating-point objects returns a numerator and a denominator; this dual return value comes in the form of a tuple.

In [None]:
# here's a tuple
t = (2, 3, 7)
print(t)

In [None]:
# can't append because that would change the tuple (this raises an AttributeError)
t.append(4)

In [None]:
# here's an example of a method that returns an unchangeable tuple
x = 0.125
x.as_integer_ratio()
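
A common follow-up is to "unpack" the returned tuple into separate names. This is a minimal sketch; `ratio`, `numerator`, and `denominator` are names chosen for the example.

```python
x = 0.125
ratio = x.as_integer_ratio()
print(ratio)                    # (1, 8)

# unpack the two values of the tuple into two names
numerator, denominator = ratio
print(numerator / denominator)  # 0.125

# indexing works the same way as for lists
print(ratio[0])                 # 1
```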

### Dictionaries

Dictionaries map keys to values. They can be created via a comma-separated list of ``key:value`` pairs within curly braces.  Items are not accessed by a numeric index; instead, they're accessed by key.  (Since Python 3.7, dictionaries do remember the order in which items were inserted, but you still look items up by key rather than by position.)
You can add new key:value pairs to a dictionary.  Keys must be unique, though values can be repeated.

In [None]:
months = {'January':1, 'February':2, 'April':4}
# find the value associated with key 'April'
print(months['April'])
# add a new pair
months['June']=6
print(months)
months['Other June']=6
print(months)
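
To see what "you can't duplicate keys" means in practice, here is a short sketch. Assigning to an existing key replaces its value instead of adding a second entry. The `get` method and the `in` test are standard dictionary features; the values are arbitrary.

```python
months = {'January': 1, 'February': 2, 'April': 4}

months['April'] = 40      # existing key: the value is replaced
print(len(months))        # 3, no duplicate key was created

print('April' in months)  # True: "in" checks the keys
print('May' in months)    # False

print(months.get('May'))  # None: get avoids a KeyError for a missing key
print(list(months))       # ['January', 'February', 'April']
```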

### Sets

A set is unordered and unindexed.  You cannot have duplicates in a set.  Sets are defined by comma-separated elements inside curly brackets.  

You can do comparisons of sets, such as find their union, intersection, etc.

In [None]:
fibo = {1, 2, 3, 5, 8, 13, 21}
evens = {2, 4, 6, 8, 10, 12, 14}

# union: items appearing in either
print(fibo | evens)      # with an operator
print(fibo.union(evens)) # equivalently with a method

In [None]:
# intersection: items appearing in both
print(fibo & evens)             # with an operator
print(fibo.intersection(evens)) # equivalently with a method

In [None]:
# difference: items in fibo but not in evens
print(fibo - evens)           # with an operator
print(fibo.difference(evens)) # equivalently with a method

In [None]:
# symmetric difference: items appearing in only one set
# i.e., the union minus the intersection
print(fibo ^ evens)                     # with an operator
print(fibo.symmetric_difference(evens)) # equivalently with a method
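
As a sanity check on how these operations relate, the sketch below confirms that the symmetric difference is the union with the intersection removed, and that duplicates collapse when a set is created. The names `sym` and `readings` are invented for the example.

```python
fibo = {1, 2, 3, 5, 8, 13, 21}
evens = {2, 4, 6, 8, 10, 12, 14}

sym = fibo ^ evens
print(sym == (fibo | evens) - (fibo & evens))  # True

# duplicates are dropped when the set is built
readings = {3, 3, 5, 5, 5}
print(len(readings))                           # 2
```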

## Operations on variables

### Arithmetic Operations

Python implements seven basic binary arithmetic operators, two of which can double as unary operators (only having one operand).
They are summarized in the following table:

| Operator     | Name           | Description                                            |
|--------------|----------------|--------------------------------------------------------|
| ``a + b``    | Addition       | Sum of ``a`` and ``b``                                 |
| ``a - b``    | Subtraction    | Difference of ``a`` and ``b``                          |
| ``a * b``    | Multiplication | Product of ``a`` and ``b``                             |
| ``a / b``    | True division  | Quotient of ``a`` and ``b``                            |
| ``a // b``   | Floor division | Quotient of ``a`` and ``b``, removing fractional parts |
| ``a % b``    | Modulus        | Integer remainder after division of ``a`` by ``b``     |
| ``a ** b``   | Exponentiation | ``a`` raised to the power of ``b``                     |
| ``-a``       | Negation       | The negative of ``a``                                  |
| ``+a``       | Unary plus     | ``a`` unchanged (rarely used)                          |

These operators can be used and combined in intuitive ways, using standard parentheses to group operations.
(from Whirlwind Tour)
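
A few combined expressions, as a sketch of how parentheses change the order of operations. The numbers are arbitrary.

```python
print((4 + 8) * (6.5 - 3))  # 42.0  (parentheses are evaluated first)
print(4 + 8 * 6.5 - 3)      # 53.0  (multiplication happens before + and -)

print(11 // 2, 11 % 2)      # 5 1   (floor division and remainder)

# exponentiation binds tighter than the unary minus
print(-3 ** 2)              # -9
print((-3) ** 2)            # 9
```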

### Comparison Operations

Another type of operation which can be very useful is comparison of different values.
For this, Python implements standard comparison operators, which return Boolean values ``True`` and ``False``.
The comparison operations are listed in the following table:

| Operation     | Description                       || Operation     | Description                          |
|---------------|-----------------------------------||---------------|--------------------------------------|
| ``a == b``    | ``a`` equal to ``b``              || ``a != b``    | ``a`` not equal to ``b``             |
| ``a < b``     | ``a`` less than ``b``             || ``a > b``     | ``a`` greater than ``b``             |
| ``a <= b``    | ``a`` less than or equal to ``b`` || ``a >= b``    | ``a`` greater than or equal to ``b`` |

These comparison operators can be combined with the arithmetic operators to express a virtually limitless range of tests for the numbers.
For example, we can check if a number is odd by checking that the modulus with 2 returns 1:

In [None]:
# check for odd
84 % 2 == 1

In [None]:
# check for odd
99 % 2 == 1

In [None]:
# more complex comparison
a = 20
4 < a < 22
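
The chained comparison in the cell above is shorthand for two comparisons joined by `and`. A quick sketch to confirm that, with `chained` and `spelled_out` as illustrative names:

```python
a = 20

chained = 4 < a < 22
spelled_out = (4 < a) and (a < 22)

print(chained)                 # True
print(chained == spelled_out)  # True: the two forms agree

# a value outside the range fails the same test
print(4 < 30 < 22)             # False
```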


### Boolean operators: and, or, not

When working with Boolean values, Python provides operators to combine the values using the standard concepts of "and", "or", and "not".
Predictably, these operators are expressed using the words ``and``, ``or``, and ``not``:

In [None]:
x = 4
(x < 3) and (x > 2)

In [None]:
(x > 10) or (x % 2 == 0)

In [None]:
not (x < 6)

###  Identity and Membership Operators

Like and, or, and not, Python also contains prose-like operators to check for identity and membership. They are the following:

| Operator      | Description                                       |
|---------------|---------------------------------------------------|
| ``a is b``    | True if ``a`` and ``b`` are identical objects     |
| ``a is not b``| True if ``a`` and ``b`` are not identical objects |
| ``a in b``    | True if ``a`` is a member of ``b``                |
| ``a not in b``| True if ``a`` is not a member of ``b``            |

In [None]:
# define two variables a and b as 'the same' thing
a = [1, 2, 3]
b = [1, 2, 3]


In [None]:
# Let's do some boolean checking
a == b

In [None]:
a is b

In [None]:
a is not b

What is this?  We see that ``a == b`` is True, but ``a is b`` is False.  The key to this puzzle is remembering that variables are *pointers*.  Variables have to point to the *same bucket* for ``is`` to be True, not just have two buckets with the same contents.

In [None]:
# try it this way
a = [1,2,3]
b = a
a is b

The difference between the two cases here is that in the first, ``a`` and ``b`` point to *different objects*, while in the second they point to the *same object*.

With this in mind, in most cases that a beginner is tempted to use "``is``" what they really mean is ``==``.
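
The membership operators from the table work on lists, strings, and other containers. Here is a brief sketch, along with the one place where `is` is the usual choice: checking for `None`. The name `result` is just for illustration.

```python
a = [1, 2, 3]

print(2 in a)            # True
print(7 not in a)        # True
print('py' in 'python')  # True: for strings, "in" checks for a substring

# checking for None is the common use of "is"
result = None
print(result is None)    # True
```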

## Randomness

We'll often need to generate random numbers in data science.  The *random* module has commands for this. Of course, a computer can only produce pseudo-random numbers: they come from a deterministic algorithm.  In fact, we can call *random.seed* with a fixed value if we want to get the same "random" numbers every time we run our script.

In [None]:
import random
random.seed(566)

# the random.random() function produces numbers uniformly between 0 and 1

five_randoms = [random.random() for _ in range(5)] # the _ is shorthand for "a variable we're not actually going to use"
print(five_randoms)
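
To convince yourself that the seed really controls the sequence, reset it and draw again. This is a minimal sketch; the names `first_run` and `second_run` are made up for the example.

```python
import random

random.seed(566)
first_run = [random.random() for _ in range(3)]

random.seed(566)   # reset the generator to the same starting point
second_run = [random.random() for _ in range(3)]

print(first_run == second_run)  # True: same seed, same "random" numbers
```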

In [None]:
# random.randrange(x,y) chooses an integer from x up to and including (y-1)
# if only one term is given, it's between 0 and one less than that term

print(random.randrange(10))
print(random.randrange(4,7))

In [None]:
# a few more useful methods with random
up_to_ten = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
random.shuffle(up_to_ten)
print(up_to_ten)

In [None]:
lucky_number = random.choice(up_to_ten)
print(lucky_number)

In [None]:
# suppose you want more than one number
# you make multiple calls to random.choice
four_lucky_numbers = [random.choice(up_to_ten) for i in range(4)]
print(four_lucky_numbers)

In [None]:
# suppose you want more than one number with no repeats
four_lucky_numbers = random.sample(up_to_ten,4)
print(four_lucky_numbers)
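
The difference between the last two cells can be checked directly: `random.sample` never repeats an item, while repeated calls to `random.choice` may. In this sketch the seed value 0 is arbitrary, and `with_repeats` and `no_repeats` are illustrative names.

```python
import random

random.seed(0)   # arbitrary seed so the run is repeatable
up_to_ten = list(range(1, 11))

with_repeats = [random.choice(up_to_ten) for _ in range(4)]  # repeats are possible
no_repeats = random.sample(up_to_ten, 4)                     # repeats are impossible

print(with_repeats)
print(no_repeats)
print(len(set(no_repeats)) == 4)  # True: all four sampled values are distinct
```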

## Generating data of a particular size or content

### range

Sometimes you want to generate lists of a particular size or content.  

The *range* command can be useful for creating a sequence of integers.  Format:   range(startValue, endValue, stepSize).  The sequence stops just before endValue, so endValue itself is never included.


In [None]:
testList = range(1,20,2)
print(testList)

Notice that when I tried to print testList, it just gave me the start, stop and step values.  To see the values, I have to *cast* it into a list or other sequence object.

In [None]:
testListForReal = list(testList)
print(testListForReal)
#
testListForReal = tuple(testList)
print(testListForReal)
#
testListForReal = set(testList)
print(testListForReal)

This only works for integers.  To generate lists of floats, we'll need the numpy library, which we will do in a few lectures.
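
A few more forms of `range`, plus what happens if you try a float step. This is just a sketch with arbitrary numbers.

```python
print(list(range(5)))          # [0, 1, 2, 3, 4]      (start defaults to 0)
print(list(range(2, 8)))       # [2, 3, 4, 5, 6, 7]   (step defaults to 1)
print(list(range(10, 0, -3)))  # [10, 7, 4, 1]        (a negative step counts down)

# range refuses floats
try:
    range(0, 1, 0.1)
except TypeError as err:
    print('TypeError:', err)
```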

## A few other syntax notes

In [None]:
# set the midpoint
midpoint = 5

# make two empty lists
lower = []; upper = []

# split the numbers into lower and upper
for i in range(10):
    if (i < midpoint):
        lower.append(i)
    else:
        upper.append(i)
        
print("lower:", lower)
print("upper:", upper)

This script isn't particularly useful, but it compactly illustrates several of the important aspects of Python syntax.
Let's look at it line by line and discuss some of the syntactical features of Python.

### End-of-Line Terminates a Statement
The first code line in the script is
``` python
midpoint = 5
```
This is an assignment operation, where we've created a variable named ``midpoint`` and assigned it the value ``5``.
Notice that the end of this statement is simply marked by the end of the line.
This is in contrast to languages like C and C++, where every statement must end with a semicolon (``;``).  I'm likely to end lines with a semicolon myself, out of habit.  I'm most familiar with MATLAB, where one ends lines with a semicolon to suppress the output of interim commands.

As an aside, in Python, if you'd like a statement to continue to the next line, it is possible to use the "``\``" marker to indicate this:

In [None]:
x = 1 + 2 + 3 + 4 +\
    5 + 6 + 7 + 8
x

It is also possible to continue expressions on the next line within parentheses, without using the "``\``" marker:

In [None]:
x = (1 + 2 + 3 + 4 +
     5 + 6 + 7 + 8)
x

Most Python style guides recommend the second version of line continuation (within parentheses) over the first (use of the "``\``" marker).

### Semicolon Can Optionally Terminate a Statement
Sometimes it can be useful to put multiple statements on a single line.
The next portion of the script is
``` python
lower = []; upper = []
```
This shows how the semicolon (``;``) familiar in C can be used optionally in Python to put two statements on a single line.
Functionally, this is entirely equivalent to writing
``` python
lower = []
upper = []
```
Using a semicolon to put multiple statements on a single line is generally discouraged by most Python style guides, though occasionally it proves convenient.

### Indentation: Whitespace Matters!
Next, we get to the main block of code:
``` Python
for i in range(10):
    if i < midpoint:
        lower.append(i)
    else:
        upper.append(i)
```
This is a compound control-flow statement including a loop and a conditional – we'll look at these types of statements in a moment.
For now, consider that this demonstrates what is perhaps the most controversial feature of Python's syntax: whitespace is meaningful!  This was one of the first things I had to get used to with Python.

In Python, code blocks are denoted by *indentation*:

```python
total = 0
for i in range(100):
    # indentation indicates code block
    total += i
```
In Python, indented code blocks are always preceded by a colon (``:``) on the previous line.

The use of indentation helps to enforce the uniform, readable style that many find appealing in Python code.
But it might be confusing to the uninitiated; for example, the following two snippets will produce different results:
```python
>>> if x < 4:         >>> if x < 4:
...     y = x * 2     ...     y = x * 2
...     print(x)      ... print(x)
```
In the snippet on the left, ``print(x)`` is in the indented block, and will be executed only if ``x`` is less than ``4``.
In the snippet on the right ``print(x)`` is outside the block, and will be executed regardless of the value of ``x``!
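
Here is a runnable version of the same comparison. It is a sketch: the value `x = 10` is chosen so that the condition is False, and the list `executed` is a made-up helper that records which lines actually ran.

```python
x = 10         # chosen so that x < 4 is False
executed = []  # records which version reached its last line

# left-hand version: the last line is inside the if block
if x < 4:
    y = x * 2
    executed.append('left')

# right-hand version: the last line is outside the if block
if x < 4:
    y = x * 2
executed.append('right')

print(executed)  # ['right']: only the un-indented line ran
```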

Python's use of meaningful whitespace often is surprising to programmers who are accustomed to other languages, but in practice it can lead to much more consistent and readable code than languages that do not enforce indentation of code blocks.
If you find Python's use of whitespace disagreeable, I'd encourage you to give it a try: as I did, you may find that you come to appreciate it.

Finally, you should be aware that the *amount* of whitespace used for indenting code blocks is up to the user, as long as it is consistent throughout the script.
By convention, most style guides recommend indenting code blocks by four spaces, and that is the convention we will follow.
Note that many text editors contain Python modes that do four-space indentation automatically.

### Whitespace *Within* Lines Does Not Matter
While the mantra of *meaningful whitespace* holds true for whitespace *before* lines (which indicates a code block), whitespace *within* lines of Python code does not matter.
For example, all three of these expressions are equivalent:

In [None]:
x=1+2
print(x)
x = 1 + 2
print(x)
x             =        1    +                2
x

Abusing this flexibility can lead to issues with code readability – in fact, abusing whitespace is often one of the primary means of intentionally obfuscating code (which some people do for sport).
Using whitespace effectively can lead to much more readable code, 
especially in cases where operators follow each other – compare the following two expressions for exponentiating by a negative number:
``` python
x=10**-2
```
to
``` python
x = 10 ** -2
```
I find the second version with spaces much more easily readable at a single glance.
Most Python style guides recommend using a single space around binary operators, and no space around unary operators.


### Parentheses Are for Grouping or Calling

In the previous code snippet, we see two uses of parentheses.
First, they can be used in the typical way to group statements or mathematical operations:

In [None]:
(2 * 3) + 4

They can also be used to indicate that a *function* is being called.
In the next snippet, the ``print()`` function is used to display the contents of a variable.
The function call is indicated by a pair of opening and closing parentheses, with the *arguments* to the function contained within:

In [None]:
print('first value:', x)

In [None]:
print('second value:', 5)

Some functions can be called with no arguments at all, in which case the opening and closing parentheses still must be used to indicate a function evaluation.
An example of this is the ``sort`` method of lists:

In [None]:
L = [4,2,3,1]
print(L)
L.sort()
print(L)

The "``()``" after ``sort`` indicates that the function should be executed, and is required even if no arguments are necessary.
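
To see what happens when the parentheses are left off, compare the two lines in this sketch. The names `before`, `method`, and `result` are invented for the example.

```python
L = [4, 2, 3, 1]

method = L.sort    # no parentheses: this only refers to the method, nothing runs
before = list(L)   # copy of the list at this point
print(before)      # [4, 2, 3, 1]: still unsorted

result = L.sort()  # parentheses: the method is called and sorts in place
print(L)           # [1, 2, 3, 4]
print(result)      # None: sort changes the list and returns nothing
```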