<figure>
  <IMG SRC="https://www.colorado.edu/cs/profiles/express/themes/cuspirit/logo.png" WIDTH=50 ALIGN="right">
</figure>
# Beginning Python
*CSCI 3022 - Dirk Grunwald*

We need to know "just enough Python" to manipulate statistical data and do basic statistical computations.


We'll introduce additional concepts throughout the semester, but first we need some background that we will use throughout the course.

## Python Assignment, Variables and Comments

Variables are names for data of varying types. We assign values to those variables using an assignment statement with a new variable on the left-hand side of the assignment. Here are a couple of examples:

In [None]:
x = 10
y = 20.0

Once variables are defined, we can use them anywhere we would use a value.

In [None]:
print(x+y)

There are more complex rules about variables, but we're leaving more complex programming and software development issues to other classes.

*Comments* start with a ```#``` and continue to the end of the line.

In [None]:
print(10.0)
# this is a comment that will be ignored
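
As a small sketch of the ideas in this section, the following cell assigns, reuses, and reassigns variables. The names `count`, `price`, and `total` are illustrative only and are not used elsewhere in this notebook.

```python
# Assign two variables, then use them in an expression.
count = 3        # an integer
price = 2.5      # a floating point number
total = count * price
print('total is', total)

# Assigning again replaces the old value.
count = 4
total = count * price  # total must be recomputed; it does not update by itself
print('total is now', total)
```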

## Seeing output

Python notebooks capture and display the output of your Python program statements. Some Python statements produce no output (_e.g._ the assignment statement below).

In [None]:
z = 20

When the last line of a cell is just an expression or a variable name, the notebook displays its value. Only _the last line_ is displayed this way. For example, the following input will evaluate two statements but produce a single line of output.

In [None]:
x
y+z

You can explicitly produce output using **print** statements. If you need to see the value of variables, you should explicitly print them rather than using the "last line of input" method.

In [None]:
print('x is', x)
print('y+z is', y+z)

## Order of Cell Evaluation in Notebooks

Python notebooks "read" from top to bottom like a narrative or web page. However, the code cells are _evaluated in the order they are executed_. For example, execute the following cell and then go back to the ```print``` cell above and re-evaluate it.

In [None]:
z = 9999

This can be useful because it allows us to change values and "play with" a piece of code to see the effect, but it can also cause misunderstandings because the notebook doesn't capture the order in which things were executed.

To make certain others will see the results you expect them to see, you should use the **Kernel -> Restart & Run All** menu item to execute all cells in the order they appear in the notebook.
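
The effect of execution order can be sketched in plain Python. This is only an analogy, not how the notebook is implemented: each hypothetical function below stands in for one cell, and the dictionary `state` plays the role of the variables the notebook remembers between cells.

```python
# 'state' stands in for the variables remembered between cell executions.
state = {'y': 20.0}

def cell_assign():
    state['z'] = 20

def cell_print():
    return state['y'] + state['z']

def cell_reassign():
    state['z'] = 9999

cell_assign()
first = cell_print()     # 40.0
cell_reassign()
second = cell_print()    # the same "cell", re-executed after z changed
print(first, second)
```

The "print cell" is identical both times; only the order of execution differs, which is why a fresh top-to-bottom run gives the most reproducible results.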

## Python Datatypes

Computers distinguish different kinds of data, called *datatypes*.

The Python datatypes we'll use are:
* Floating point - numbers like 3.1415 or -0.5
* Integers - numbers like 2 or -10
* Booleans - True or False
* Strings - character strings "in double quotes" or 'single'

In Python, a variable can hold any kind of data type. Operations on data types largely make sense based on your high school algebra class:

In [None]:
print(11.0/2)

In Python 3, division with ```/``` produces a floating point value, even when both operands are integers. Try changing the '11.0' to '10' in the previous cell to see this. You can force integer division by using ```//``` for division, like this:

In [None]:
print(11//2)
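
A few more cases, as a quick sketch, show how the two division operators behave. Note that `//` rounds down (toward negative infinity), which matters for negative numbers.

```python
print(10 / 2)     # division of two integers still gives a float: 5.0
print(11 // 2)    # integer (floor) division: 5
print(-11 // 2)   # rounds down, not toward zero: -6
print(11.0 // 2)  # with a float operand the result is a float: 5.0
```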

The only odd notation is ```**``` for exponents. For example:

In [None]:
print('2 to the third power is', 2**3)
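
Exponents need not be positive integers. The sketch below shows a few cases that come up in statistical formulas, such as using a power of 0.5 as a square root.

```python
print(2 ** 10)    # 1024
print(9 ** 0.5)   # a power of 0.5 is a square root: 3.0
print(2 ** -1)    # a negative exponent gives a reciprocal: 0.5
print(-2 ** 2)    # ** is applied before the minus sign: -4
print((-2) ** 2)  # parentheses change the order: 4
```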

Comparison operations produce Booleans as a value. 

In [None]:
print(5 < 10)
print(10 < 5)
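
Besides `<`, Python has the other usual comparison operators, and the result of a comparison can be stored like any other value. The variable names below are illustrative.

```python
a = 5
b = 10
is_smaller = a < b       # the result is an ordinary Boolean value
print(is_smaller, type(is_smaller))
print(a == b)            # equality uses ==, because = means assignment
print(a != b)            # "not equal"
print(a <= 5, a >= b)    # less-or-equal, greater-or-equal
```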

Addition combines strings and multiplication duplicates them. There are no string subtraction or division operations.

In [None]:
print('dog' + 'cat')
print('dog' * 2)
#print('dog' - 'g')
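
To see what happens when an unsupported string operation is attempted, the sketch below catches the resulting error rather than letting it stop the cell. The `try`/`except` construct is not covered in this notebook; it is used here only so the example runs to completion.

```python
repeated = 'ab' * 3
print(repeated)           # ababab

try:
    result = 'dog' - 'g'  # strings do not support subtraction
except TypeError as err:
    result = None
    print('Subtraction failed:', err)
```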

You can convert one type of data to another using *conversion functions*.

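The conversion functions are listed below, along with ```type```, which reports a value's type rather than converting it: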
* float(x) - convert to a floating point number
* int(x) - convert to an integer
* str(x) - convert to a string
* type(x) - identify the type of a variable

In [None]:
s = "123"
print(s * 2)
print(int(s) * 2)
print('s is a', type(s))
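
The following sketch shows a few more conversions, including two details that are easy to miss: `int` truncates toward zero rather than rounding, and converting a string that is not a valid number raises a `ValueError`. The variable names are illustrative.

```python
as_float = float('3.5') + 1
print(as_float)            # 4.5

truncated = int(3.9)       # int() drops the fractional part
print(truncated)           # 3
print(int(-3.9))           # -3

as_text = str(10) + '!'
print(as_text)             # 10!

try:
    bad = int('12a')       # not a valid integer
except ValueError as err:
    bad = None
    print('Cannot convert:', err)
```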

## Collections

In statistics and data science, we usually manipulate collections of data. We'll use three of Python's built-in *collection types*:
* tuples - multiple values that you can't modify
* lists - multiple values that you *can* modify
* dictionaries - a method to access a *value* given a *key*

Later, we will examine *vectors* and *arrays* provided by the NumPy extension to Python and the Pandas *series* and *data frame*. We will use a combination of all of these collections in this course.

### Tuples

Tuples are simply comma-separated values enclosed in parentheses. You can access the elements or parts of a tuple using the *subscript* or *index* notation; indexing starts at 0, so the first element has index 0. Tuples can contain values of any type.

In [None]:
t1 = ('dog', 47, 'cat', 9.2)
print('t1 is', t1, 'and has', len(t1), 'elements')
print('First element of t1 is', t1[0])
print('Fourth element of t1 is', t1[3])

You can't change the parts of a tuple.

In [None]:
#t1[0] = 'foo'
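
Attempting the assignment raises a `TypeError`. The sketch below catches the error so the cell keeps running, then builds a new tuple instead of changing the old one. The names `t` and `t_new` are illustrative.

```python
t = ('dog', 47, 'cat', 9.2)

try:
    t[0] = 'foo'              # tuples cannot be modified
except TypeError as err:
    print('Cannot modify a tuple:', err)

# Build a new tuple from the parts of the old one.
t_new = ('foo', t[1], t[2], t[3])
print(t_new)
print(t)                      # the original is unchanged
```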

### Lists

Lists are basically tuples you can modify.

In [None]:
l1 = [ 'dog', 47, 'cat', 9.2]
print('l1 is', l1, 'and has', len(l1), 'elements')
print('First element of l1 is', l1[0])
l1[0] = 'dragon'
print('l1 is now', l1)

You can extract *slices* from lists. We'll use slices with NumPy arrays as well.

In [None]:
print('Middle part', l1[1:3])
print('Front part', l1[:2])
print('Back part', l1[2:])
print('Every other', l1[::2])
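
Slices are easier to follow on a list of numbers. In the sketch below, a slice `[start:stop]` includes the element at `start` but excludes the one at `stop`, and negative indices count from the end. The list `data` is made up for illustration.

```python
data = [10, 20, 30, 40, 50]
print(data[1:3])    # [20, 30]
print(data[:2])     # [10, 20]
print(data[2:])     # [30, 40, 50]
print(data[::2])    # [10, 30, 50]
print(data[-1])     # last element: 50
print(data[-2:])    # last two elements: [40, 50]
```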

You can combine lists and append single items or lists to lists. The following illustrates that appending doesn't always do what you think (you may want ```extend```) and also that assigning lists *does not* copy them -- both l1 and l3 refer to the same list after this code.

In [None]:
l2 = [ 'hamster', 19, 'shrew', 0.5]
print(l1 + l2)
l3 = l1
l3.append(l2)
print('l3 is now ', l3)
print('l1 is now ', l1, 'and has length', len(l1))
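
The same behavior is easier to see with short lists. This sketch contrasts `append` with `extend` and shows one way, the list method `copy`, to get an independent list. The names `a` through `d` are illustrative.

```python
a = [1, 2]
b = a                 # b is another name for the same list
b.append([3, 4])      # adds ONE element, which is itself a list
print(a)              # [1, 2, [3, 4]]

c = [1, 2]
c.extend([3, 4])      # adds each element of the argument
print(c)              # [1, 2, 3, 4]

d = c.copy()          # a separate list with the same elements
d.append(5)
print(c, d)           # c is unchanged
```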

Some built-in Python functions, such as ```sum``` and ```len```, operate on lists.

In [None]:
height = [ 10, 20, 15, 12, 7]
print('The mean height is', sum(height) / len(height))
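
Other built-in functions that work on lists include `min`, `max`, and `sorted`. As a rough sketch, they are enough to compute a few summary statistics by hand. The median calculation below assumes the list has an odd number of elements.

```python
height = [10, 20, 15, 12, 7]
n = len(height)
mean = sum(height) / n

ordered = sorted(height)   # returns a new sorted list; height is unchanged
median = ordered[n // 2]   # middle element; valid here because n is odd

print('min', min(height), 'max', max(height))
print('mean', mean, 'median', median)
```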

In this course, we mainly deal with lists of numbers and, rather than append to or manipulate lists, we use *list comprehensions* to operate on list elements. This is easiest to see in an example:

In [None]:
evens = [ 0, 2, 4, 6, 8, 10]
squares = [ x**2 for x in evens ]
print('squares is', squares)
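
List comprehensions can also filter elements with an `if` clause, and they combine naturally with `sum` and `len`. The sketch below computes deviations from the mean and a variance; dividing by the number of elements (the population form) is an assumption made here for illustration.

```python
height = [10, 20, 15, 12, 7]
mean = sum(height) / len(height)

# Transform every element.
deviations = [h - mean for h in height]

# Keep only the elements that satisfy a condition.
tall = [h for h in height if h > mean]
print('above the mean:', tall)

# Average squared deviation (population variance, dividing by n).
variance = sum([d**2 for d in deviations]) / len(height)
print('variance is', variance)
```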

### Dictionaries

We make little use of dictionaries in this class, but they may appear in some examples you encounter.

In [None]:
opposites = { 'cat' : 'dog', 'tall' : 'short'}
print('There are', len(opposites), 'entries in the dictionary')
print('The opposite of cat is', opposites['cat'])

In [None]:
print('The keys are', opposites.keys())
print('The values are', opposites.values())

In [None]:
print('Is cat in the dictionary?', 'cat' in opposites)
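
A few more dictionary operations you may run into, as a brief sketch: adding an entry, looping over key and value pairs with `items`, and looking up a key that may be missing with `get`. The extra entries are made up for illustration.

```python
opposites = {'cat': 'dog', 'tall': 'short'}

# Assigning to a new key adds an entry.
opposites['hot'] = 'cold'

# Loop over key/value pairs.
for key, value in opposites.items():
    print(key, '->', value)

# get() returns None, or a default you supply, instead of raising an error.
print(opposites.get('up'))
missing = opposites.get('up', 'unknown')
print(missing)
```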