## Notebook 2.0: Python objects

This notebook corresponds to chapters 1 and 3 of the official Python tutorial: https://docs.python.org/3/tutorial/. You are welcome to read chapter 2 as well, but it is mostly about how to install, open, and run Python. Since we already have Python running interactively in Jupyter, the details of starting a Python interpreter from chapter 2 are not so important here. The challenges in this notebook are meant to reinforce the material from the readings. Feel free to also use this notebook as a scratch pad in which to write and test code from the readings.

### Learning objectives: 

By the end of this exercise you should:

1. Understand the use of variables in Python to store values.
2. Understand the difference between returning and printing a value.
3. Become familiar with int, float, str, and list objects. 

### Python as a calculator

One of the simplest and most common uses of Python is to perform mathematical operations on numeric values. This is also a good first approach for learning how Python stores values in variables, and how to reuse those variables.

In [1]:
# you can perform math operations in Python
(3 / 3) + (3 * 5) + (3 ** 2)

25.0

In [2]:
# create a new variable named x with the integer value 3
x = 3

In [3]:
# substitute named variables to represent a value or object
(x / 3) + (x * 5) + (x ** 2)

25.0

### Creating and accessing variables

In an interactive Python session you can return the value of a variable by executing its name in a cell without assigning it to a new variable. For example, in one of the cells above we stored the value 3 in `x` and nothing was returned as output. In contrast, the other operations did not store their result in a variable, and thus the value was *returned*.

In [4]:
# the value of x will be returned (shown)
x

3

Similarly, the code below returns the value 6 as a result of x + 3. If we want to store the result then we need to assign it to a new variable, or, if we wanted, we could overwrite the variable x by assigning it this result as a new value. 

In [5]:
# return x + 3
x + 3

6
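
The two options described above can be sketched as follows. The names `n` and `total` are hypothetical and are used here only so that the example does not change `x`.

```python
n = 3

# store the result in a new variable; n itself is unchanged
total = n + 3
print(total)   # 6
print(n)       # 3

# or overwrite n by assigning the result back to the same name
n = n + 3
print(n)       # 6
```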

### The print function
Notice that above, when we *return* a value, it is shown in the output cell next to the red signature <span style='color:red; font-family:monospace;'>Out[N]:</span>. This indicates that a value was returned. In contrast, when you call the `print()` function you are printing a value to stdout, but it is not a *returned value*. The distinction is a bit subtle. The print function is typically used to display a message to a user, to report the progress of a script, or, when debugging, to check the value of a variable at a particular point in the code.

In [6]:
print(x)

3
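
One way to see the distinction is to try to store what `print()` gives back. In this minimal sketch (the name `result` is hypothetical), the value is displayed, but the function itself returns the special value `None`.

```python
x = 3

# print() displays 3 on stdout, but it returns None
result = print(x)

# confirm that nothing useful was returned
print(result is None)   # True
```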


<div class="alert alert-success">
    <b>Action:</b> In a code cell below write three lines of Python code. On line 1 create a new variable called 'y' with the value 30. On line 2 create another new variable 'z' with the value 5.5. On line 3 use the print function to print the value of y / z. (See Chapter 3 if you need help).
</div>

In [7]:
y = 30
z = 5.5
print(y / z)

5.454545454545454

## Objects and Types
In the example above you have already used two different types of objects in Python, an Integer and a Float. We'll learn about object Types in more detail now.

### Integers and Floats
For everyday calculations there is little practical difference between integers and floats (particularly in Python 3, where dividing two integers with `/` returns a float, as opposed to Python 2). The differences lie mostly in how the values are stored in memory: integers are exact whole numbers, while floats store floating-point decimal values with limited precision. Both types can be compared or combined in mathematical operations.

In [8]:
# assigns a float value
y1 = float(5.0)

# assigns an integer value
y2 = int(5)

# return whether the two variables are equal
y1 == y2

True
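
The built-in `type()` function reports which kind of object a variable holds. The short sketch below uses hypothetical names `a` and `b` and also shows a few cases where integers and floats behave differently.

```python
a = 5      # an integer
b = 5.0    # a float

print(type(a))   # <class 'int'>
print(type(b))   # <class 'float'>

# true division always returns a float; floor division of ints returns an int
print(7 / 2)     # 3.5
print(7 // 2)    # 3

# floats have limited precision, so some decimal sums are not exact
print(0.1 + 0.2 == 0.3)   # False

# integers are exact, even when they are very large
print(2 ** 100)  # 1267650600228229401496703205376
```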

### Boolean type

A boolean is a value that is either `True` or `False`. For example, you just saw above that the comparison we performed returned the value `True`. That's a boolean. This type is used when comparing objects or values. Binary statements of this type are very common in programming, so expect to see booleans very often.

In [9]:
# True compares equal to the integer 1
x = True
y = 1
x == y

True

In [10]:
# False compares equal to the integer 0
x = False
y = 0
x == y

True
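
The reason these comparisons return `True` is that booleans in Python behave like the integers 1 and 0. A brief sketch, using a hypothetical variable named `flag`:

```python
flag = True

print(type(flag))             # <class 'bool'>
print(isinstance(flag, int))  # True

# because True acts like 1, booleans can be added up
print(flag + flag)                 # 2
print(sum([True, False, True]))    # 2
```

Summing booleans like this is a common way to count how many items pass a test.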

### Comparisons

As you can see above, we used the `=` operator to assign a value to a variable and the `==` operator to ask whether two variables are equal. There are several other comparison operators available in addition to `==`.


In [11]:
x = 10
y = 3
z = "orange"

In [12]:
print(x > y)
print(x >= y)
print(y < x)
print(x == z)
print(z != y)

True
True
True
False
True


Not everything can be compared, though. For example, asking whether "orange" is greater than 3 does not make any sense. When you do this Python will raise an error. It is important to be aware of the type of each of your variables. *We expect the code below to raise an error*; go ahead and run it anyway.

In [13]:
print(z > y)

TypeError: '>' not supported between instances of 'str' and 'int'
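
If you are unsure whether two values can be compared, checking their types first can help. The sketch below reuses the values of `y` and `z` from above. The `try`/`except` statement it uses to catch the error is not covered in the assigned readings and is shown here only as an illustration.

```python
y = 3
z = "orange"

# check the type of each variable before comparing
print(type(y))   # <class 'int'>
print(type(z))   # <class 'str'>

# catch the error instead of letting it stop the cell
try:
    print(z > y)
except TypeError as err:
    print("cannot compare:", err)

# two strings can be compared with each other (alphabetical order)
print(z > "apple")   # True
```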

## Strings

A "string" is the name used in Python for words, sentences, or paragraphs of text, which are stored as a sequence of characters joined together. It is one of the most basic data types and one that Python is very good at dealing with. In fact, the ease with which Python can be used to manipulate text is one of the primary reasons it has become such a popular language for both scientific programming and web development.

### Strings as variables

Let's work with a string representation of a sequence of DNA. A string is created by wrapping any text in single or double quotes. 

In [14]:
dna = "ACGCAGACGATTTGATGATGAGCATCGACTAGCTACACAAAGACTCAGGGCATATA"

### Print versus return on strings
Another difference between printing a string and returning it is that when you use `print()`, special characters in the text are rendered. This is particularly apparent for *newline* characters, which represent line breaks, and for other special characters such as tabs. See the example below.

In [15]:
# return the string 
mystring = "hello\tworld\nhello world"
mystring

'hello\tworld\nhello world'

In [16]:
# print the string
print(mystring)

hello	world
hello world
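
The returned form of a string can also be produced explicitly with the built-in `repr()` function. In this small sketch, note that each special character such as `\t` or `\n` counts as a single character in the string.

```python
mystring = "hello\tworld\nhello world"

# repr() gives the same unrendered form that is shown when a string is returned
print(repr(mystring))   # 'hello\tworld\nhello world'

# the tab and the newline each count as one character
print(len(mystring))    # 23
```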



### Indexing and slicing

A string is an indexed datatype that is immutable. This means that we can select portions of the text using indexed numbering, but we cannot change/mutate individual elements of it.

In [17]:
# return an indexed portion of the dna string
dna[5:15]

'GACGATTTGA'
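
A few more indexing patterns, as a brief sketch: counting starts at 0, negative indices count from the end, and a slice `[start:stop]` includes `start` but excludes `stop`.

```python
dna = "ACGCAGACGATTTGATGATGAGCATCGACTAGCTACACAAAGACTCAGGGCATATA"

print(len(dna))    # 56
print(dna[0])      # A  (first character)
print(dna[2])      # G  (third character)
print(dna[-1])     # A  (last character)

# the stop index is excluded, so this slice has 15 - 5 = 10 characters
print(len(dna[5:15]))   # 10

# leaving out start or stop means "from the beginning" or "to the end"
print(dna[:3] + dna[3:] == dna)   # True
```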

<div class="alert alert-success">
    <b>Action:</b> Use indexing to return only the first 10 characters of dna. See Chapter 3.1.2 if you need help. 
</div>

In [18]:
dna[:10]

'ACGCAGACGA'

<div class="alert alert-success">
    <b>Action:</b> Use indexing to return only the last 5 characters of dna. See Chapter 3.1.2 if you need help. 
</div>

In [19]:
dna[-5:]

'ATATA'

### Strings as Objects

Python is called an *object-oriented* programming language, and this is because *everything* in Python is an object. What does that mean? Well, it means that everything you interact with has a hidden structure within it that it uses to store its values, and that the object typically has built-in *functions* available that can be used to manipulate its data. We'll learn more about functions soon. 

One of the most exciting things about working with IPython interactively in Jupyter is that it is really easy to access and see all of the attributes and functions associated with an object. This can be done by typing a variable name followed by a dot and then, while your cursor is still sitting after the dot, pressing the `<tab>` key on your keyboard. Try it on the `dna` variable below. You should see a popup that displays the names of many functions. Select the one called `lower` and type parentheses after it to execute it as a function: `dna.lower()`.

In [20]:
dna.lower()

'acgcagacgatttgatgatgagcatcgactagctacacaaagactcagggcatata'
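
Other string functions work the same way. The sketch below tries two more of them, `.count()` and `.replace()`. The variable name `rna` is hypothetical. Because strings are immutable, these functions return a new value and leave `dna` unchanged.

```python
dna = "ACGCAGACGATTTGATGATGAGCATCGACTAGCTACACAAAGACTCAGGGCATATA"

# count how many times "A" occurs in the string
print(dna.count("A"))   # 20

# replace every T with U (as in an RNA sequence); returns a new string
rna = dna.replace("T", "U")
print(rna[:13])         # ACGCAGACGAUUU

# the original string is unchanged
print(dna[:13])         # ACGCAGACGATTT
```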

### How do we learn how to use all of these functions?
Again, the interactivity of Python is useful here. A function call always ends with parentheses, which is where you can enter *arguments* to modify what the function does. In addition, you can find more information about a function by placing your cursor inside the parentheses, holding the shift key down, and then pressing tab. Try it on the `.lower()` function above. A popup should come up explaining what the function does.

In the example below, we'll use a function that takes an argument. When we provide a pattern to the `.split()` function it separates the string into separate objects everywhere that pattern is found. Try it below. 

In [21]:
dna.split("TTT")

['ACGCAGACGA', 'GATGATGAGCATCGACTAGCTACACAAAGACTCAGGGCATATA']
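
Note that the pattern itself is removed from the pieces. As a short sketch, the string function `.join()` does the reverse of `.split()`: it glues the pieces back together with a separator between them. The names `parts` and `rejoined` are hypothetical.

```python
dna = "ACGCAGACGATTTGATGATGAGCATCGACTAGCTACACAAAGACTCAGGGCATATA"

# split on the pattern; "TTT" occurs once, so we get two pieces
parts = dna.split("TTT")
print(len(parts))   # 2

# joining the pieces with the same pattern restores the original string
rejoined = "TTT".join(parts)
print(rejoined == dna)   # True
```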

<div class="alert alert-success">
    <b>Action:</b> Use the split() function to split the dna variable on the characters "CG". Store the return values to a new variable called dnalist. Then use print on that variable to show its values. 
</div>

In [22]:
dnalist = dna.split("CG")
print(dnalist)

['A', 'CAGA', 'ATTTGATGATGAGCAT', 'ACTAGCTACACAAAGACTCAGGGCATATA']


## List objects

One of the most flexible and useful data objects in Python is the **list**. Lists are containers that can store any other type of data object; they can even store other lists. Lists are represented by values inside square brackets. Seem familiar? That's right, we created a list above when we split the string into multiple objects. The returned value was a list containing multiple strings.

In [23]:
# create a list 
letters1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

# create a list faster
letters2 = list("abcdefg")

# show that the two are identical
letters2 == letters1

True
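
To illustrate that a list can hold different types of objects, including other lists, here is a small sketch with hypothetical variables `mixed` and `nested`.

```python
# a list can mix integers, floats, strings, and booleans
mixed = [1, 2.5, "three", True]
print(len(mixed))     # 4

# a list can also contain other lists
nested = [["a", "b"], ["c", "d"]]
print(nested[1])      # ['c', 'd']

# index twice to reach an element of an inner list
print(nested[1][0])   # c
```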

### Indexing a list
A list can be indexed just like a string. A big difference, however, is that lists are *mutable*, meaning that we can replace individual elements of a list without having to create a new variable. This is shown below (we expect an error to be raised in the example that tries to change a string).

In [24]:
# index a string
dna[5:15]

'GACGATTTGA'

In [25]:
# *try* to mutate part of a string (this won't work)
dna[5] = "T"

TypeError: 'str' object does not support item assignment

In [26]:
# make a list of DNA
dnalist = list(dna)


In [27]:
# index the dna list
dnalist[5:15]

['G', 'A', 'C', 'G', 'A', 'T', 'T', 'T', 'G', 'A']

In [28]:
# mutate part of the list and then return to show it changed
dnalist[5] = "T"
dnalist[5:15]

['T', 'A', 'C', 'G', 'A', 'T', 'T', 'T', 'G', 'A']
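
Once a list of characters has been edited, it can be turned back into a string by joining its elements with an empty string. This is a minimal sketch in which `newdna` is a hypothetical variable name.

```python
dna = "ACGCAGACGATTTGATGATGAGCATCGACTAGCTACACAAAGACTCAGGGCATATA"

# convert to a list, change one element, and join back into a string
dnalist = list(dna)
dnalist[5] = "T"
newdna = "".join(dnalist)

print(newdna[5:15])   # TACGATTTGA
print(dna[5:15])      # GACGATTTGA  (the original string is unchanged)
```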

### List functions

Just like strings, lists are objects in Python, and as such they have functions that can be used to operate on them. You can see all of these listed using tab-completion after a dot, like before.

In [29]:
# count how many "A" are in the list
dnalist.count("A")

20
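
A few other list functions, sketched on a small hypothetical list named `bases`. Unlike the string functions shown earlier, `.append()` and `.sort()` change the list itself rather than returning a new one.

```python
bases = ["G", "A", "T"]

# add an element to the end of the list
bases.append("C")
print(bases)              # ['G', 'A', 'T', 'C']

# find the position of the first matching element
print(bases.index("T"))   # 2

# sort the list in place
bases.sort()
print(bases)              # ['A', 'C', 'G', 'T']
print(len(bases))         # 4
```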

<div class="alert alert-success">
    <b>Action:</b> In the cell below create two new variables, one called fiveprime that contains the first 10 elements in dnalist, and another called threeprime that contains the last 10 elements in dnalist.
</div>

In [30]:
fiveprime = dnalist[:10]
threeprime = dnalist[-10:]


<div class="alert alert-success">
    <b>Action:</b> Save this notebook and download as HTML to upload to courseworks when all of your notebooks are finished.
</div>