# <span style="color:red"> Introduction to Python, Part 2 </span>

### <span style="color:blue"> Call Expressions </span>

A call expression consists of a function name followed by zero or more arguments, separated by commas and enclosed in parentheses.

When the Python interpreter sees a call expression, it evaluates the arguments and then applies the function.

In [None]:
# "abs" is a built-in function that takes a single number as its argument and returns
# its absolute value

abs(-12)

In [None]:
# Here's how you get information about a function:

abs?

In [None]:
fifi= -12
abs(fifi)

In [None]:
round?

In [None]:
round(5 - 1.3)

In [None]:
round(100 / 3, 0)
round(100 / 3, 1)
round(-1/3)

In [None]:
max?

# If the documentation is cryptic, Google is your friend

In [None]:
max(2, 2 + 3, 4)

In [None]:
fifi = -20

In [None]:
gaga = -25
max(abs(gaga), 2 + 3, 4)

# Step by step, what's happening?
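One way to answer the question is to spell out the evaluation by hand. The sketch below uses helper names (`first_arg`, `second_arg`, `third_arg`) that are made up for illustration; the interpreter does the same work without naming the intermediate values.

```python
# Evaluate max(abs(gaga), 2 + 3, 4) one step at a time.
gaga = -25

first_arg = abs(gaga)   # the inner call expression is evaluated first: 25
second_arg = 2 + 3      # the arithmetic expression is evaluated: 5
third_arg = 4           # a literal evaluates to itself: 4

# Only after all arguments have values is max applied to them
step_by_step = max(first_arg, second_arg, third_arg)
one_line = max(abs(gaga), 2 + 3, 4)

print(step_by_step, one_line)
```

Both results are the same, because the one-line version goes through exactly these steps.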

### <span style="color:blue"> Modules </span>

A lot of the functionality of Python is provided by "modules".
A module is essentially a library of functions.
If you want to use a function from the library you first have to "import" it.
Let's look at the module "math".

In [None]:
# Import the math module and get a list of all the functions in the module

import math
dir(math)

In [None]:
# To refer to a function in the math module you have to prefix the 
# function name with "math." as in

math.sin(1)

In [None]:
math.sqrt(9)

In [None]:
9 ** 0.5

In [None]:
(3 ** 2 + 4 ** 2) ** 0.5

In [None]:
math.sqrt(3 ** 2 + 4 ** 2)

In [None]:
# Modules can also define variables

math.pi

In [None]:
math.sin(math.pi / 2)

**Why module prefixes?**

There are hundreds of Python modules written by different developers. Suppose Jim and Joe wrote modules named "jim" and "joe", and both (each unaware of the other) defined a function named "foo". Suppose Ellen imported both modules. Without prefixes, the name "foo" could refer to only one of the two functions: whichever definition was imported last would silently replace the other. Prefixes disambiguate the situation: Ellen would refer to the functions as "jim.foo" and "joe.foo".
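The modules "jim" and "joe" are hypothetical, but the standard library has a real pair with the same problem: both `math` and `cmath` (the complex-number counterpart of `math`) define a function named `sqrt`. A small sketch of what happens with and without prefixes:

```python
import math
import cmath  # also defines sqrt, but for complex numbers

# With prefixes, both functions stay available side by side
real_root = math.sqrt(4)        # 2.0
complex_root = cmath.sqrt(-4)   # 2j

# Without prefixes, the later import silently replaces the earlier one
from math import sqrt
from cmath import sqrt          # "sqrt" now refers to cmath.sqrt

print(real_root, complex_root, sqrt is cmath.sqrt)
```

After the last two lines, a plain `sqrt` no longer refers to `math.sqrt`, and no warning is given.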

Later in this lecture we will need two modules, "numpy" and "datascience".

NumPy is the fundamental package for scientific computing with Python.

Datascience is a module developed at UC Berkeley specifically for their version of "Introduction to Data Science", which is called "Data8". For future reference, there is a [cheat sheet](https://github.com/wstuetzle/STAT180/blob/master/Computing/data8_sp17_midterm_ref_sheet.pdf) and a [tutorial](http://data8.org/datascience/tutorial.html).

In [2]:
import numpy as np

# By importing numpy as np we can refer to a function foo in the module numpy
# as np.foo, rather than numpy.foo.

# The following line is necessary if the notebook is running
# on the Azure notebook server
# !pip install datascience

from datascience import *

# This way of importing a module lets us refer to the functions in the
# module without a prefix. That is in general NOT A GOOD IDEA because
# it can lead to name collisions.


###  <span style="color:blue"> Types </span>

Every object in Python has a type. The type determines what operations can be applied to the object.

In [None]:
type(3)

In [None]:
type(3.0)

In [None]:
type(abs)

# We may get to the meaning of "_or_method" later

In [None]:
# The type of an expression is the type of its value

type(5 + 6)

In [None]:
# The type of a symbol is the type of its value

x = 5 + 6.0
type(x)

In [None]:
# Another type: bool
# Possible values: True, False

3 > 2

In [None]:
3 < 2

In [None]:
type(3 < 2)

### <span style="color:blue"> Strings </span>

A string is a sequence of characters enclosed in single or double quotation marks (' or ").

In [None]:
# Here is an example of a string

'Introduction to Data Science'

In [None]:
# A string that contains single quotation marks should be enclosed in
# double quotation marks, and vice versa. Otherwise the interpreter
# reports a syntax error, as it does here:

'It's a nice day'

In [None]:
"It's a nice day"

In [None]:
type("It's a nice day")

In [None]:
# We can glue strings together

foo = "Introduction to Data Science"
bar = " (IDS)"
foo + bar

In [None]:
# We can count the number of occurrences of a character in a string

'data'.count('a')

In [None]:
# 'data'.count('a') is a shorthand for

str.count('data', 'a')

# Terminology: "count" is a "method" for type string

In [None]:
# What happens if we try to apply the method "count" to
# other types?

foo = 5
foo.count('a')

# The interpreter complains because the method "count" is not defined
# for integers

In [None]:
type(str.count)

In [None]:
# Get documentation 

str.count?

**Are there other types with a "count" method?** <br>
Remember: google is your friend. Try googling "python count". <br>
Looks like there is a "count" method for lists. ("list" is a type we have not covered yet)
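A quick check of that search result, using only built-in types. Lists (and tuples) have their own `count` method, which counts how many elements are equal to the argument. The variable names are made up for this sketch.

```python
# count on a string counts occurrences of a substring
string_count = 'data'.count('a')

# count on a list counts elements equal to the argument
letters = ['d', 'a', 't', 'a']
list_count = letters.count('a')

# tuples have a count method as well
tuple_count = (1, 2, 2, 3).count(2)

print(string_count, list_count, tuple_count)
```

The method has the same name for all three types, but each type supplies its own definition.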

**There are many functions and methods for operating on strings**  <br>
To find out about them, google "python string operations" <br>
[This seems to be a good source](https://www.digitalocean.com/community/tutorials/an-introduction-to-string-functions-in-python-3) <br>
Let's try some


In [None]:
foo = "balloon"
bar = "big "
bar.join(foo)

In [None]:
foo.replace("on", "")

In [None]:
# The new string is returned but foo is not changed

foo
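The same holds for string methods in general: they return a new string and leave the original untouched. A minimal sketch (the names `shorter` and `shouted` are made up here) that also shows how to keep the result:

```python
foo = "balloon"

shorter = foo.replace("on", "")   # new string with "on" removed
shouted = foo.upper()             # new string in upper case

print(foo, shorter, shouted)      # foo is still "balloon"

# To "change" foo, assign the returned string back to the name
foo = foo.replace("on", "")
print(foo)
```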

### <span style="color:blue"> More on types and type conversion </span>

In [None]:
type("three")

In [None]:
type("3")

In [None]:
3 + "3"

# Addition of a str and an int is undefined

In [None]:
"3" + "3"

In [None]:
three = "3"
three + three

In [None]:
# The function int() converts a string of digits into an int

3 + int("577")

In [None]:
# Let's see if it also works for strings made up of digits and a decimal point

int("3.14159")

# Doesn't like that

In [None]:
# The string "3.14159" represents a float

float("3.14159")

In [None]:
# The function int() applied to an argument of type float converts
# the argument to an int by truncation

int(-3.1419)
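Truncation is not the same as rounding or rounding down, which matters for negative numbers. The sketch below compares the three and also shows one way to convert a string like "3.14159" to an int by going through float first. The variable names are made up for illustration.

```python
import math

x = -3.7

truncated = int(x)          # drops the fractional part (moves toward zero)
rounded = round(x)          # nearest integer
floored = math.floor(x)     # largest integer not greater than x

# int("3.14159") fails, but converting to float first works
from_string = int(float("3.14159"))

print(truncated, rounded, floored, from_string)
```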

### <span style="color:blue"> Arrays </span>

Arrays let us group objects of the same type into collections, which allows programmers to organize those values and refer to all of them with a single name. By grouping values together, we can write code that performs a computation on many pieces of data at once.

Below, we collect four different temperatures into an array called highs. These are the estimated average daily high temperatures over all land on Earth (in degrees Celsius) for the decades surrounding 1850, 1900, 1950, and 2000, respectively, expressed as deviations from the average absolute high temperature between 1951 and 1980, which was 14.48 degrees.

In [3]:
baseline_high = 14.48
highs = make_array(baseline_high - 0.880, 
                   baseline_high - 0.093,
                   baseline_high + 0.105, 
                   baseline_high + 0.684)
highs

array([ 13.6  ,  14.387,  14.585,  15.164])

In [4]:
# The function len() returns the length of an array

len(highs)

4

In [None]:
highs.size

# Note: size is not a method - it is an "instance variable"

In [5]:
# We can access array elements using their position (index)

highs[1]

# Valid indices are integers between 0 and len(highs) - 1

14.387

In [6]:
# When doing arithmetic, numbers are automatically expanded into arrays
# Convert temperatures to degrees Fahrenheit

(9/5) * highs + 32

array([ 56.48  ,  57.8966,  58.253 ,  59.2952])

In [7]:
highs.repeat(2)

array([ 13.6  ,  13.6  ,  14.387,  14.387,  14.585,  14.585,  15.164,
        15.164])

In [8]:
highs.sum()

57.736000000000004

In [9]:
highs.sum() / highs.size

14.434000000000001

In [10]:
highs.mean()

14.434000000000001

### <span style="color:blue"> Array Arithmetic </span>

In [11]:
baseline_low = 3.00
lows = make_array(baseline_low - 0.872, 
                  baseline_low - 0.629,
                  baseline_low - 0.126, 
                  baseline_low + 0.728)
lows

array([ 2.128,  2.371,  2.874,  3.728])

In [12]:
make_array(
    highs.item(0) - lows.item(0),
    highs.item(1) - lows.item(1),
    highs.item(2) - lows.item(2),
    highs.item(3) - lows.item(3)
)

array([ 11.472,  12.016,  11.711,  11.436])

In [13]:
highs - lows

array([ 11.472,  12.016,  11.711,  11.436])

In [14]:
(9/5*highs + 32 - (9/5*lows + 32))

array([ 20.6496,  21.6288,  21.0798,  20.5848])

In [None]:
9/5 * (highs - lows)
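The last two cells compute the same thing: the "+ 32" offsets cancel when the two Fahrenheit arrays are subtracted. The sketch below checks this numerically. It uses `np.array` as a stand-in for `make_array` (an assumption, so that it runs without the datascience module) and copies the values from the outputs above.

```python
import numpy as np

# Values copied from the outputs of highs and lows above
highs = np.array([13.6, 14.387, 14.585, 15.164])
lows = np.array([2.128, 2.371, 2.874, 3.728])

# Convert each array to Fahrenheit, then subtract
diff_of_conversions = (9/5 * highs + 32) - (9/5 * lows + 32)

# Subtract first, then scale; no offset is needed for a difference
conversion_of_diff = 9/5 * (highs - lows)

# Compare with a tolerance, because floating-point results may differ
# in the last digits
print(np.allclose(diff_of_conversions, conversion_of_diff))
```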

### <span style="color:blue"> Array functions </span>

The numpy module provides many functions and methods for working with arrays. There are books and online tutorials.

In [None]:
# Let's see what's there

dir(np)

# There's a lot!

In [16]:
# Just for illustration, pick a function and find out what it does

help(np.full)

Help on function full in module numpy.core.numeric:

full(shape, fill_value, dtype=None, order='C')
    Return a new array of given shape and type, filled with `fill_value`.
    
    Parameters
    ----------
    shape : int or sequence of ints
        Shape of the new array, e.g., ``(2, 3)`` or ``2``.
    fill_value : scalar
        Fill value.
    dtype : data-type, optional
        The desired data-type for the array  The default, `None`, means
         `np.array(fill_value).dtype`.
    order : {'C', 'F'}, optional
        Whether to store multidimensional data in C- or Fortran-contiguous
        (row- or column-wise) order in memory.
    
    Returns
    -------
    out : ndarray
        Array of `fill_value` with the given shape, dtype, and order.
    
    See Also
    --------
    zeros_like : Return an array of zeros with shape and type of input.
    ones_like : Return an array of ones with shape and type of input.
    empty_like : Return an empty array with shape and type of in

In [17]:
# Try it out

np.full(5, -1)

array([-1, -1, -1, -1, -1])

In [18]:
temps = make_array(
0.209,
0.253,
0.035,
0.226,
0.062,
0.045,
0.092,
0.231,
0.279,
0.158,
0.346,
0.332,
0.155,
0.184,
0.238,
0.385,
0.284,
0.427,
0.576,
0.344,
0.358,
0.500,
0.581,
0.568,
0.468,
0.652,
0.600,
0.604,
0.469,
0.587,
0.652,
0.535,
0.564,
0.587,
0.652,
0.791,
0.938
)

temps

In [19]:
temps

array([ 0.209,  0.253,  0.035,  0.226,  0.062,  0.045,  0.092,  0.231,
        0.279,  0.158,  0.346,  0.332,  0.155,  0.184,  0.238,  0.385,
        0.284,  0.427,  0.576,  0.344,  0.358,  0.5  ,  0.581,  0.568,
        0.468,  0.652,  0.6  ,  0.604,  0.469,  0.587,  0.652,  0.535,
        0.564,  0.587,  0.652,  0.791,  0.938])

In [20]:
max(temps)

0.93799999999999994

In [21]:
np.sqrt(temps)

array([ 0.45716518,  0.50299105,  0.18708287,  0.47539457,  0.24899799,
        0.21213203,  0.30331502,  0.48062459,  0.52820451,  0.39749214,
        0.58821765,  0.57619441,  0.39370039,  0.42895221,  0.48785244,
        0.62048368,  0.5329165 ,  0.65345237,  0.75894664,  0.58651513,
        0.59833101,  0.70710678,  0.76223356,  0.75365775,  0.68410526,
        0.80746517,  0.77459667,  0.77717437,  0.68483575,  0.76615925,
        0.80746517,  0.73143694,  0.75099933,  0.76615925,  0.80746517,
        0.88938181,  0.968504  ])

In [22]:
np.sort(temps)

array([ 0.035,  0.045,  0.062,  0.092,  0.155,  0.158,  0.184,  0.209,
        0.226,  0.231,  0.238,  0.253,  0.279,  0.284,  0.332,  0.344,
        0.346,  0.358,  0.385,  0.427,  0.468,  0.469,  0.5  ,  0.535,
        0.564,  0.568,  0.576,  0.581,  0.587,  0.587,  0.6  ,  0.604,
        0.652,  0.652,  0.652,  0.791,  0.938])

In [23]:
np.diff(temps)

array([ 0.044, -0.218,  0.191, -0.164, -0.017,  0.047,  0.139,  0.048,
       -0.121,  0.188, -0.014, -0.177,  0.029,  0.054,  0.147, -0.101,
        0.143,  0.149, -0.232,  0.014,  0.142,  0.081, -0.013, -0.1  ,
        0.184, -0.052,  0.004, -0.135,  0.118,  0.065, -0.117,  0.029,
        0.023,  0.065,  0.139,  0.147])

In [None]:
np.sort(np.diff(temps))

In [24]:
np.diff(temps) > 0

array([ True, False,  True, False, False,  True,  True,  True, False,
        True, False, False,  True,  True,  True, False,  True,  True,
       False,  True,  True,  True, False, False,  True, False,  True,
       False,  True,  True, False,  True,  True,  True,  True,  True], dtype=bool)
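A Boolean array like this one can be summarized by counting its True values. The sketch below does that for the first five values of temps; the names `readings`, `changes`, `increased`, and `n_increases` are made up for illustration.

```python
import numpy as np

# First five values of temps, copied from above
readings = np.array([0.209, 0.253, 0.035, 0.226, 0.062])

changes = np.diff(readings)   # one element fewer than readings
increased = changes > 0       # elementwise comparison gives a bool array

# Count how many consecutive pairs went up
n_increases = np.count_nonzero(increased)

print(increased, n_increases)
```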

### <span style="color:blue"> Ranges </span>

A range is an evenly spaced sequence of numbers.

In [None]:
np.arange?

In [25]:
np.arange(5)

# arange(foo) produces an array of length foo containing integers 0...(foo - 1)

array([0, 1, 2, 3, 4])

In [26]:
np.arange(3, 9)

array([3, 4, 5, 6, 7, 8])

In [27]:
np.arange(3, 30, 5)

array([ 3,  8, 13, 18, 23, 28])

In [28]:
np.arange(1.5, -2, -0.5)

array([ 1.5,  1. ,  0.5,  0. , -0.5, -1. , -1.5])
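In all of these examples the start value is included and the stop value is not. Python's built-in `range` follows the same convention, but works only with integers. A small sketch comparing the two (the variable names are made up here):

```python
import numpy as np

evens = np.arange(0, 10, 2)          # start 0, stop 10 (excluded), step 2
same_as_range = list(range(0, 10, 2))

print(evens, same_as_range)

# The stop value is excluded even when the step would land on it exactly
print(10 in evens)
```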

**Illustration: computing $\pi$ using Leibniz' formula**
$$\pi = 4 \cdot \left(1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \frac{1}{11} + \dots\right)$$

In [29]:
a = np.arange(1, 100000, 4)

In [30]:
a[np.arange(10)]

array([ 1,  5,  9, 13, 17, 21, 25, 29, 33, 37])

In [31]:
(a + 2)[np.arange(10)]

array([ 3,  7, 11, 15, 19, 23, 27, 31, 35, 39])

In [33]:
pi_approx = 4 * sum(1/a - 1/(a+2))

In [35]:
import math

In [36]:
abs(math.pi - pi_approx)

1.9999999985031991e-05
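To see how the error depends on the number of terms, the computation can be wrapped in a function. This is a sketch: `leibniz_pi` and its parameter `upper` are hypothetical names, with `upper` playing the role of the 100000 used above.

```python
import math
import numpy as np

def leibniz_pi(upper):
    """Approximate pi with the pairs 1/a - 1/(a + 2) for a = 1, 5, 9, ... below upper."""
    a = np.arange(1, upper, 4)
    return 4 * np.sum(1 / a - 1 / (a + 2))

# The error shrinks as more terms are included, but only slowly
errors = [abs(math.pi - leibniz_pi(upper)) for upper in (100, 10_000, 1_000_000)]
print(errors)
```

Each hundredfold increase in `upper` reduces the error by roughly a factor of one hundred, so the series is a poor choice for computing many digits.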