# Python Review

A short review of some core concepts in Python, exemplified by objects in the NumPy library.

Goals:
- recall basic Python vocabulary
- practice markdown syntax


## Libraries and packages
A **library** is a collection of code that we can use to perform specific tasks in our programs. It can be a single file or multiple files.

**NumPy:**
- core library for numerical computing in Python
- many libraries use NumPy arrays as their building blocks
- computations on NumPy objects are optimized for speed and memory usage

Let's import NumPy with its **standard abbreviation** np:

In [1]:
import numpy as np
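
As a rough illustration of the point about optimized computations, the sketch below computes a sum of squares two ways: with a plain Python loop over a list, and with a single NumPy expression. The variable names are made up for this example and no timing is claimed; the point is only that both give the same result and the NumPy version needs no explicit loop.

```python
import numpy as np

# Hypothetical example data: the integers 0 through 9
values = list(range(10))

# Plain Python: visit the elements one by one
total_loop = 0
for v in values:
    total_loop += v ** 2

# NumPy: square every element at once, then sum
arr = np.array(values)
total_numpy = int((arr ** 2).sum())

print(total_loop, total_numpy)
```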

## Variables

**variable:** a name that we assign to a particular object in Python

Example:

In [2]:
# Assign a small array to variable a
a = np.array([ [1,1,2], [3,5,8] ])

To view a variable's value from our Jupyter notebook:

In [3]:
# Run cell with variable name to show value
a

array([[1, 1, 2],
       [3, 5, 8]])

In [4]:
# Use 'print' function to print the value
print(a)

[[1 1 2]
 [3 5 8]]
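
The two outputs above look different because a bare variable name at the end of a cell shows the object's `repr` (a more detailed, code-like form), while `print` shows its `str` form. A small sketch that makes this explicit, re-creating `a` so that it runs by itself:

```python
import numpy as np

a = np.array([[1, 1, 2], [3, 5, 8]])

# What Jupyter displays when a cell ends with `a`
detailed = repr(a)

# What print(a) writes
plain = str(a)

print(detailed)
print(plain)
```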


## Convention: Use 'snake_case' for naming variables

This is the convention we will use in the course. Why not 'my-variable', 'MyVariable', or 'myVariable'?

PEP 8 - Style Guide for Python Code recommends snake_case.

**Remember that variable names should be both descriptive and concise.**
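
One practical reason 'my-variable' is not an option: Python reads the hyphen as a minus sign, so it is not a valid name at all. The other two spellings are valid names but do not follow the PEP 8 recommendation for variables. A quick check, using the built-in string method `isidentifier()`:

```python
# Candidate names from the text above
candidates = ["my_variable", "my-variable", "MyVariable", "myVariable"]

# str.isidentifier() is True when Python would accept the string as a name
validity = {name: name.isidentifier() for name in candidates}

for name, is_valid in validity.items():
    print(name, is_valid)
```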


## Objects
**object:** (informally speaking) a bundle of *properties* and *actions* about something specific.

Example:

- Object: data frame
- Properties: number of rows, names of columns, and date created
- Actions: selecting a specific row or adding a new column

A variable is the name we give a specific object, and the same object can be referenced by different variables.

In practice, we can often use the word variable and object interchangeably.
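
The idea that one object can be referenced by several variables can be seen directly with a NumPy array. This is a minimal sketch; the names `b` and `c` are introduced only for this example.

```python
import numpy as np

a = np.array([[1, 1, 2], [3, 5, 8]])

# b is a second name for the same array, not a copy
b = a
same_object = b is a
print(same_object)

# Changing the array through b is visible through a
b[0, 0] = 100
print(a[0, 0])

# copy() creates a separate object with the same values
c = a.copy()
print(c is a)
```

Because `b` and `a` name the same object, the change made through `b` shows up in `a`, while `c` is unaffected by later changes to `a`.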

## Types
Every object in Python has a **type**, which tells us what kind of object we have. We can also call the type of an object the **class** of the object. So class and type both mean what kind of object we have.

In [5]:
# see the type/class of a variable/object by using the 'type' function
type(a)

numpy.ndarray

`numpy.ndarray` is the core object/data type of the NumPy package.

In [7]:
print(a[0,0])

type(a[0,0])

1


numpy.int64

`numpy.int64` is not the standard Python integer type `int`.
`numpy.int64` is a special data type in NumPy telling us that 1 is an integer stored using 64 bits.
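
A short sketch of the difference between `numpy.int64` and `int`. The `dtype` is set explicitly here, which the original cell does not do, so that the example does not depend on the platform's default integer size.

```python
import numpy as np

# dtype given explicitly (an addition for this sketch)
a = np.array([[1, 1, 2], [3, 5, 8]], dtype=np.int64)
element = a[0, 0]

# The element is a NumPy integer, not a built-in int
print(isinstance(element, int))
print(isinstance(element, np.integer))

# Convert to a built-in int when one is needed
as_python_int = int(element)
print(type(as_python_int))
```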

Check-In: access the value 5 in array 'a'

In [10]:
a[1,1]

5
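
The same bracket notation reaches other parts of the array as well. The sketch below is an optional extension of the check-in; the variable names are made up for illustration.

```python
import numpy as np

a = np.array([[1, 1, 2], [3, 5, 8]])

# Row 1, column 1 (counting starts at 0)
value = a[1, 1]

# A whole row, and a whole column
second_row = a[1]
second_col = a[:, 1]

# Negative indices count from the end
last = a[-1, -1]

print(value, second_row, second_col, last)
```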

## Functions

`print` was our first example of a Python **function**.
Functions take in a set of **arguments**, separated by commas, and use those arguments to create an **output**.

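In the course, we'll be using argument and parameter interchangeably. But they do have slightly different meanings: a parameter is the name used in the function's definition, and an argument is the value passed in when the function is called.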
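
A tiny sketch of that distinction, using a hypothetical function `greet` that is not part of the course material:

```python
# `name` and `punctuation` are parameters: names listed in the definition
def greet(name, punctuation):
    return "Hello, " + name + punctuation

# "Ada" and "!" are arguments: values passed in when calling the function
message = greet("Ada", "!")
print(message)
```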

We can ask for info about what a function does by executing `?` followed by the function name.

In [11]:
?print

Docstring:
print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

Prints the values to a stream, or to sys.stdout by default.
Optional keyword arguments:
file:  a file-like object (stream); defaults to the current sys.stdout.
sep:   string inserted between values, default a space.
end:   string appended after the last value, default a newline.
flush: whether to forcibly flush the stream.
Type:      builtin_function_or_method


What we obtain is a **docstring**, a special type of comment that is used to document how a function (or a class or a module) works.
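
Functions we write ourselves can carry a docstring too. In the sketch below, `rectangle_area` is a made-up function; the text in triple quotes right after the `def` line becomes its docstring, and Python stores it in the function's `__doc__` attribute.

```python
def rectangle_area(width, height):
    """Return the area of a rectangle with the given width and height."""
    return width * height

# The docstring is stored on the function itself
print(rectangle_area.__doc__)

# The function still works as usual
area = rectangle_area(2, 3)
print(area)
```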

Notice that there are different types of arguments inside the function's parentheses.

Roughly speaking, a function has two types of arguments:

- **non-optional arguments**: arguments *you* have to specify for the function to work
- **optional arguments**: arguments that are pre-filled with a default value by the function, but you can override them. Optional arguments appear inside the parentheses () in the form `optional_argument = default_value`.
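
The two kinds of arguments can be seen in a function of our own. This is a minimal sketch: `repeat` is a hypothetical function, and the standard-library `inspect` module is used only to list which parameters have a default value.

```python
import inspect

# Hypothetical function: `text` is non-optional, `times` and `sep` are optional
def repeat(text, times=2, sep=" "):
    return sep.join([text] * times)

# Only the non-optional argument is given, so both defaults are used
with_defaults = repeat("hi")

# Both defaults are overridden
overridden = repeat("hi", times=3, sep="-")

print(with_defaults)
print(overridden)

# A parameter without a default value is non-optional
kinds = {}
for name, param in inspect.signature(repeat).parameters.items():
    has_default = param.default is not inspect.Parameter.empty
    kinds[name] = "optional" if has_default else "non-optional"
print(kinds)
```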

Example:
`end` is a parameter in `print` whose default value is a newline (`'\n'`).
We can pass the value `' :-)'` to this parameter so that the line finishes with `:-)` instead.

In [13]:
print('Change the end parameter', end=' :-)')

Change the end parameter :-)

## Attributes and methods
An object in Python has attributes and methods.

- **attribute**: a property of the object, some piece of information about it
- **method**: a procedure associated with an object, so it is an action where the main ingredient is the object itself.

## Check-in
Make a diagram like the cat one, for a class `fish`

attributes: scale, fresh-water/saltwater, color, weight, length
methods: swim(), eat(), breathe()
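
One way such a diagram could translate into code is sketched below. The class `Fish`, the attribute values, and the units in the comments are all made up for illustration; `breathe()` and some attributes are left out to keep the sketch short.

```python
class Fish:
    def __init__(self, color, weight, length, saltwater):
        # Attributes: pieces of information stored on each fish
        self.color = color
        self.weight = weight        # grams (assumed unit)
        self.length = length        # centimeters (assumed unit)
        self.saltwater = saltwater  # True for saltwater, False for freshwater

    # Methods: actions where the main ingredient is the fish itself
    def swim(self):
        return "The " + self.color + " fish swims."

    def eat(self, food_weight):
        # Eating updates one of the fish's own attributes
        self.weight = self.weight + food_weight
        return self.weight


my_fish = Fish(color="orange", weight=250, length=8, saltwater=True)

print(my_fish.color)     # attribute: no parentheses
print(my_fish.swim())    # method: called with parentheses
print(my_fish.eat(50))
```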


Example
NumPy arrays have many methods and attributes. For example:


In [20]:
a

array([[1, 1, 2],
       [3, 5, 8]])

In [15]:
# T is an example of an attribute, it returns the transpose of the array
print(a.T)

[[1 3]
 [1 5]
 [2 8]]


In [16]:
type(a.T)

numpy.ndarray

In [19]:
# shape is another attribute, it tells us the shape of the array
print(a.shape)
type(a.shape)

(2, 3)


tuple

In [24]:
# ndim is an attribute holding the number of array dimensions
print('dim:', a.ndim, '| type:', type(a.ndim))

dim: 2 | type: <class 'int'>


Attributes can have many different data types.

Some examples of methods:


In [26]:
# The min method returns the minimum value in the array along a specified axis
print(a)
a.min(axis=0)

[[1 1 2]
 [3 5 8]]


array([1, 1, 2])

In [27]:
# Run min method without axis: returns the minimum over the whole array
a.min()

1
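
To make the effect of the `axis` argument concrete, the sketch below puts the three variants side by side on the same array. The variable names are chosen for this example only.

```python
import numpy as np

a = np.array([[1, 1, 2], [3, 5, 8]])

# axis=0 collapses the rows: one minimum per column
col_mins = a.min(axis=0)

# axis=1 collapses the columns: one minimum per row
row_mins = a.min(axis=1)

# No axis: a single minimum over the whole array
overall = a.min()

print(col_mins, row_mins, overall)
```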

Remember, methods are **functions** associated with an object. We can confirm this!

In [29]:
# The tolist() method transforms the array into a list
a.tolist()

[[1, 1, 2], [3, 5, 8]]

In [30]:
type(a.tolist)

builtin_function_or_method
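
Another quick way to tell a method from an attribute is the built-in `callable()` function, which reports whether something can be called with parentheses. A minimal sketch, re-creating `a` so that it runs by itself:

```python
import numpy as np

a = np.array([[1, 1, 2], [3, 5, 8]])

# A method can be called; without parentheses we get the function itself
method_is_callable = callable(a.tolist)

# An attribute such as shape is stored information, not a function
attribute_is_callable = callable(a.shape)

# Calling the method produces nested Python lists
nested = a.tolist()

print(method_is_callable, attribute_is_callable, nested)
```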

## Exercise
1. Read the print function help. What is the type of the argument `sep`? Is this an optional or non-optional argument? Why?
2. Create two new variables, one with the integer value 77 and another one with the string 99.
3. Use your variables to print 77%99%77 by changing the value of one of the optional arguments.

In [31]:
?print

Docstring:
print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

Prints the values to a stream, or to sys.stdout by default.
Optional keyword arguments:
file:  a file-like object (stream); defaults to the current sys.stdout.
sep:   string inserted between values, default a space.
end:   string appended after the last value, default a newline.
flush: whether to forcibly flush the stream.
Type:      builtin_function_or_method


In [39]:
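# x holds the integer 77, y holds the string '99'
x = 77
y = '99'

# sep sets the string printed between the values
print(x, y, x, sep="%")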

77%99%77


HW Check-In:
The integer number -999 is often used to represent missing values. Create a pandas.Series named s with four integer values, two of which are -999. The index of this series should be the letters A through D.

In the pandas.Series documentation, look for the method mask(). Use this method to update the series s so that the -999 values are replaced by NA values. HINT: check the first example in the method’s documentation.


Check-In 2
We can access the data frame's column names via the columns attribute. Update the column names to C1 and C2 by updating this attribute.

In [44]:
import pandas as pd
s = pd.Series([1, 5, -999, -999], index=['A', 'B', 'C', 'D'])
s = s.mask(s == -999)
s

A    1.0
B    5.0
C    NaN
D    NaN
dtype: float64
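
Note that the result has dtype `float64`: NaN is a floating-point value, so the integers are converted. The sketch below repeats the solution and compares it with the sibling method `where()`, which keeps values where a condition is True instead of replacing them. The names `masked` and `kept` are made up for this example.

```python
import pandas as pd

s = pd.Series([1, 5, -999, -999], index=["A", "B", "C", "D"])

# mask() replaces values where the condition is True
masked = s.mask(s == -999)

# where() keeps values where the condition is True, so the condition is flipped
kept = s.where(s != -999)

# Both approaches give the same series
same_result = masked.equals(kept)
missing_count = int(masked.isna().sum())

print(same_result, missing_count)
```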

In [50]:
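import numpy as np
# Build a data frame from a dictionary of columns
data = {'col_name_1' : pd.Series(np.arange(3)),
        'col_name_2' : pd.Series([3.1, 3.2, 3.3])}

df = pd.DataFrame(data)

# rename returns a new data frame, so we re-assign the variable
df = df.rename(columns={'col_name_1' : 'C1',
                        'col_name_2' : 'C2'})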


In [52]:
# Another way, where you don't have to re-assign the variable
df.columns = ['C1', 'C2']
df

   C1   C2
0   0  3.1
1   1  3.2
2   2  3.3
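
The difference between the two approaches is easier to see when the result of `rename` is stored under a new name. This is a sketch with the same column names as above; `renamed` and `original_names` are introduced only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"col_name_1": [0, 1, 2],
                   "col_name_2": [3.1, 3.2, 3.3]})

# rename() returns a new data frame and leaves df untouched
renamed = df.rename(columns={"col_name_1": "C1", "col_name_2": "C2"})
original_names = list(df.columns)
print(original_names)
print(list(renamed.columns))

# Assigning to the columns attribute changes df itself
df.columns = ["C1", "C2"]
print(list(df.columns))
```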
