In [None]:
!python --version

Python 3.9.1


<img src="UP Data Science Society Logo 2.png" width=700>

# [0.2] Python Fundamentals

**Prepared by:**

- Joshua Castillo
- Lanz Lagman

**Topics to cover:** 

- Functions and packages 
- Numpy

**References:**
- [(Ivezic, Connolly, Vanderplas, Gray) Statistics, Data Mining, and Machine Learning in Astronomy](https://press.princeton.edu/books/hardcover/9780691198309/statistics-data-mining-and-machine-learning-in-astronomy)
- [(Vanderplas) Python Data Science Handbook](https://jakevdp.github.io/PythonDataScienceHandbook/)

# I. Functions and Packages

## A. Basics of Building a Function

- A function is a set of instructions that you ask your computer to carry out. Functions are essential because they let us apply the same operations to our data again and again in order to extract meaningful insights.
- It is highly recommended to include a docstring whenever you create a function.
  - A docstring is documentation of what a function does.
  - See the code below for examples using the functions `do_nothing`, `sqr`, and `introduce_yourself`. Make sure to run the cells that define these functions so that they can be used for reference later.

In [None]:
def do_nothing(x):
    """
    This is a sample docstring. This function does not do anything at all, lol.
    """
    return x

In [None]:
def sqr(x):
    """
    Squares any input x of type float/int.
    """
    return x**2

In [None]:
def introduce_yourself(x):
    """
    Politely introduces an input name.
    """
    return f"Hi!, my name is {x}."

- The docstring of a function can be accessed with the built-in `help()` function or, in IPython/Jupyter, by typing `?` after the function name. Let's apply this to our three functions from earlier. Try running the code below.

In [None]:
help(do_nothing)

Help on function do_nothing in module __main__:

do_nothing(x)
    This is a sample docstring. This function does not do anything at all, lol.



In [None]:
help(sqr)

Help on function sqr in module __main__:

sqr(x)
    Squares any input x of type float/int.



In [None]:
help(introduce_yourself)

Help on function introduce_yourself in module __main__:

introduce_yourself(x)
    Politely introduces an input name.



In [None]:
do_nothing?

In [None]:
sqr?

In [None]:
introduce_yourself?
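Outside IPython/Jupyter, the `?` shortcut is not available. As a minimal sketch, the docstring can also be read from the function's `__doc__` attribute or with `inspect.getdoc` from the standard library, which strips the indentation. The function `cube` below is a hypothetical example and not one of the tutorial's functions.

```python
import inspect

def cube(x):
    """
    Cubes any input x of type float/int.
    """
    return x**3

# The raw docstring exactly as stored on the function
raw_doc = cube.__doc__

# inspect.getdoc removes the indentation and the surrounding blank lines
clean_doc = inspect.getdoc(cube)

print(repr(raw_doc))
print(clean_doc)
```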

- In IPython/Jupyter, the source code of a function written in Python can be accessed by typing `??` after the function name.

In [None]:
do_nothing??

In [None]:
sqr??

In [None]:
introduce_yourself??

- Try to run the functions with their respective values below.
  - Run the `do_nothing` function with the input `'inflation'`. It should return `'inflation'`.
  - Run the `sqr` function with the input `100`. It should return `10000`. `%%time` is an example of a *magic command*; it prints the CPU time and wall time it takes to run the entire cell.
  - Run the `introduce_yourself` function with the input `'Jhepoy Dizon'`. It should return `'Hi!, my name is Jhepoy Dizon.'`.

In [None]:
do_nothing('inflation')

'inflation'

In [None]:
%%time
sqr(100)

CPU times: total: 0 ns
Wall time: 0 ns


10000

In [None]:
introduce_yourself('Jhepoy Dizon')

'Hi!, my name is Jhepoy Dizon.'
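The `%%time` magic command only works inside IPython/Jupyter. In a plain Python script, a rough equivalent is to read a timer before and after the call. The sketch below uses `time.perf_counter` from the standard library; the variable names `start`, `result`, and `elapsed` are chosen here for illustration. For very fast calls such as `sqr(100)`, the measured time can be reported as 0, as in the output above, probably because the call finishes faster than the timer can resolve.

```python
import time

def sqr(x):
    """
    Squares any input x of type float/int.
    """
    return x**2

start = time.perf_counter()            # high-resolution timer, in seconds
result = sqr(100)
elapsed = time.perf_counter() - start  # wall time spent in the call

print(result)
print(f"Elapsed: {elapsed:.2e} s")
```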

# II. `NumPy`

## A. Basics

- `NumPy` is a Python package that lets us create and operate on arrays efficiently.
- First, let's import the `numpy` package as `np`.
- Also, take a look at the description of the `np.ones` function in the `numpy` package.

In [None]:
import numpy as np

In [None]:
help(np.ones)

Help on function ones in module numpy:

ones(shape, dtype=None, order='C', *, like=None)
    Return a new array of given shape and type, filled with ones.
    
    Parameters
    ----------
    shape : int or sequence of ints
        Shape of the new array, e.g., ``(2, 3)`` or ``2``.
    dtype : data-type, optional
        The desired data-type for the array, e.g., `numpy.int8`.  Default is
        `numpy.float64`.
    order : {'C', 'F'}, optional, default: C
        Whether to store multi-dimensional data in row-major
        (C-style) or column-major (Fortran-style) order in
        memory.
    like : array_like
        Reference object to allow the creation of arrays which are not
        NumPy arrays. If an array-like passed in as ``like`` supports
        the ``__array_function__`` protocol, the result will be defined
        by it. In this case, it ensures the creation of an array object
        compatible with that passed in via this argument.
    
        .. versionadded:: 1.20

### Constructing Arrays

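- Constructing arrays gives us a more convenient way to store and view our data for any problem.
- Let's construct a NumPy 1D array with 2 elements using the command below. Note that `(2)` is simply the integer `2`; a one-element tuple would be written `(2,)`. `np.ones` accepts either form.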

In [None]:
np.ones(shape=(2))

array([1., 1.])

- Now, construct a 2D array with 2 rows and 3 columns. Notice how, compared to the 1D array, we input numbers in the form $(r,c)$. Here, $r$ indicates the number of rows and $c$ indicates the number of columns. This gives us a 2D $r\times c = 2\times 3$ array.

In [None]:
np.ones(shape=(2,3))

array([[1., 1., 1.],
       [1., 1., 1.]])

- Let's apply this to a 3D array. Here, we construct a 3D array with four 2D arrays stacked on each other, wherein each 2D array has 2 rows and 3 columns.
- Notice how for this, we give the inputs $(n,r,c)$, where $n$ is the number of 2D $r\times c$ arrays we want to output.

In [None]:
np.ones(shape=(4,2,3))

array([[[1., 1., 1.],
        [1., 1., 1.]],

       [[1., 1., 1.],
        [1., 1., 1.]],

       [[1., 1., 1.],
        [1., 1., 1.]],

       [[1., 1., 1.],
        [1., 1., 1.]]])
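As a quick check of the three constructions above, the sketch below prints the number of dimensions (`.ndim`) and the shape (`.shape`) of each array. The names `a1`, `a2`, and `a3` are chosen here for illustration only.

```python
import numpy as np

a1 = np.ones(shape=2)          # 1D: 2 elements
a2 = np.ones(shape=(2, 3))     # 2D: 2 rows, 3 columns
a3 = np.ones(shape=(4, 2, 3))  # 3D: four stacked 2x3 arrays

for a in (a1, a2, a3):
    print(a.ndim, a.shape)

# Indexing the first axis of the 3D array gives one of the 2D arrays
print(a3[0].shape)
```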

- Let's construct `x1`, a custom 1D array that contains the integers from 0 to 6.

In [None]:
x1 = np.array([0,1,2,3,4,5,6])
x1

array([0, 1, 2, 3, 4, 5, 6])

- Now, let's construct a custom 2D array `x2` that contains counting numbers from 0 to 15, where each row only has 4 numbers before continuing to the next row.

In [None]:
x2 = np.array([[0,1,2,3], 
               [4,5,6,7],
               [8,9,10,11],
               [12,13,14,15]])
x2

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

### Constructing Arrays with `arange` and `linspace`

- The next two functions we'll study allow us to generate an array without manually entering the values.
- Let's study the `arange` function in NumPy. Here we construct a 1D array of numbers from 0 to 1 in steps of 0.1. Notice how the input follows the format $(a, b, x)$, where $a$ is the starting value of the series, $b$ is the stopping value (which is itself excluded from the result), and $x$ is the step added to each term to get the next one. Terms are generated only while they are less than $b$, which is why we pass $b = 1.1$ to include 1 in the output.

In [None]:
np.arange(0, 1.1, 0.1)

array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])
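The sketch below illustrates that the stop value of `arange` is excluded, using integer steps so that the results do not depend on floating-point rounding. Building the tenths from integers and using `reshape` to lay out a 4×4 array are choices made here for illustration; they are not part of the tutorial's own code.

```python
import numpy as np

evens = np.arange(0, 10, 2)   # stop value 10 is excluded
first_five = np.arange(5)     # start defaults to 0 and step to 1

# Assumption for illustration: build 0, 0.1, ..., 1.0 from integers
# instead of relying on a floating-point step
tenths = np.arange(11) / 10

# Same values as the 4x4 array x2 above, without typing them by hand
x2_alt = np.arange(16).reshape(4, 4)

print(evens)
print(first_five)
print(tenths)
print(x2_alt)
```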

- For `linspace`, let's construct a 1D array of 101 evenly spaced numbers from 0 to 1.
- Notice how the inputs of `linspace` follow the format $(a, c, y)$, where $a$ is still the starting point of the series, $c$ is the final element of the series (included in the result by default), and $y$ is the total number of equally spaced elements generated.

In [None]:
np.linspace(0, 1, 101)

array([0.  , 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1 ,
       0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.2 , 0.21,
       0.22, 0.23, 0.24, 0.25, 0.26, 0.27, 0.28, 0.29, 0.3 , 0.31, 0.32,
       0.33, 0.34, 0.35, 0.36, 0.37, 0.38, 0.39, 0.4 , 0.41, 0.42, 0.43,
       0.44, 0.45, 0.46, 0.47, 0.48, 0.49, 0.5 , 0.51, 0.52, 0.53, 0.54,
       0.55, 0.56, 0.57, 0.58, 0.59, 0.6 , 0.61, 0.62, 0.63, 0.64, 0.65,
       0.66, 0.67, 0.68, 0.69, 0.7 , 0.71, 0.72, 0.73, 0.74, 0.75, 0.76,
       0.77, 0.78, 0.79, 0.8 , 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87,
       0.88, 0.89, 0.9 , 0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.97, 0.98,
       0.99, 1.  ])
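To make the difference from `arange` concrete, the sketch below uses a smaller `linspace` call with 5 elements. The `endpoint` and `retstep` keyword arguments are part of `np.linspace`; the variable names are chosen here for illustration.

```python
import numpy as np

with_end = np.linspace(0, 1, 5)                     # endpoint 1 is included
without_end = np.linspace(0, 1, 5, endpoint=False)  # endpoint 1 is excluded
values, step = np.linspace(0, 1, 5, retstep=True)   # also return the spacing

print(with_end)
print(without_end)
print(step)
```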

### Accessing Array Properties

- Accessing array properties is important because it gives us an idea of what type of data we're dealing with, as well as its general size.
- We can get the data type of the elements of an array by adding `.dtype` after the name of the array. For this example, let us refer back to array `x2`, which contains the counting numbers from 0 to 15. The output below shows `int32`; on other platforms the default integer type may be `int64`.
- For the attributes `dtype`, `size`, and `shape` shown next, verify the results against the array `x2` constructed earlier.

In [None]:
x2.dtype

dtype('int32')

- We can get the total number of elements in an array by adding `.size` after its name.

In [None]:
x2.size

16

- To get the dimensions/shape of our array, add `.shape` after the name of the array.

In [None]:
x2.shape

(4, 4)
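The three properties are related: `size` is the product of the entries of `shape`, and the number of entries in `shape` is available as `.ndim`. The sketch below checks this for `x2` and also shows that the element type can be set explicitly with the `dtype` argument of `np.array`. The array `x2_float` is a hypothetical example added here.

```python
import numpy as np

x2 = np.array([[0, 1, 2, 3],
               [4, 5, 6, 7],
               [8, 9, 10, 11],
               [12, 13, 14, 15]])

n_rows, n_cols = x2.shape
print(x2.ndim)                     # number of dimensions
print(x2.size == n_rows * n_cols)  # size is rows times columns

# Requesting a specific element type instead of the default
x2_float = np.array([[0, 1], [2, 3]], dtype=np.float64)
print(x2_float.dtype)
```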

### 1. Array Indexing

- Array indexing is important for us to look into specific elements in our data. This will be important for future functions that we will use to examine our data.
- First, let us access the first element of our 1D array `x1`. Because NumPy indices start at 0, the first element has index 0.
- For indexing, we put square brackets `[]` after the array name and enter the index `i` of the element we want to select.

In [None]:
x1

array([0, 1, 2, 3, 4, 5, 6])

In [None]:
x1[0]

0

- Next, let's access the last element of our 1D array. Notice how this is possible even if you do not know the size of your array: the index `-1` always gives you the last element.

In [None]:
x1[-1]

6

- We can also access any element between the first and the last. Here, we access the element at index 3 of our 1D array.

In [None]:
x1[3]

3
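As a small sketch of how negative indices relate to positive ones, the code below compares `x1[-1]` with the index computed from the array's length, and shows that an index past the end raises an `IndexError`. The variable names are chosen here for illustration.

```python
import numpy as np

x1 = np.array([0, 1, 2, 3, 4, 5, 6])

last_by_negative = x1[-1]
last_by_position = x1[len(x1) - 1]  # same element, counted from the front

# Valid indices run from 0 to 6, so index 7 is out of bounds
try:
    x1[7]
    out_of_bounds = False
except IndexError:
    out_of_bounds = True

print(last_by_negative, last_by_position, out_of_bounds)
```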

- Let's apply what we did earlier to 2D arrays. The input follows the same $(r,c)$ format as before, except that it now refers to the element at row index $r$ and column index $c$, both counted from 0.
- Here, we access the first element of a 2D array.

In [None]:
x2

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [None]:
x2[0,0]

0

- Now, we access the last element of a 2D array.

In [None]:
x2[-1,-1]

15

- Finally, we access the element at row index $r=2$ and column index $c=3$, i.e. the third row and fourth column.

In [None]:
x2[2,3]

11
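Giving only one index to a 2D array selects a whole row. The sketch below shows this, and that indexing the row afterwards reaches the same element as the $(r,c)$ form. The variable names are chosen here for illustration.

```python
import numpy as np

x2 = np.array([[0, 1, 2, 3],
               [4, 5, 6, 7],
               [8, 9, 10, 11],
               [12, 13, 14, 15]])

row = x2[2]         # the entire row at index 2
chained = x2[2][3]  # take the row first, then the element at index 3
direct = x2[2, 3]   # row and column index in one step

print(row)
print(chained, direct)
```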

### 2. Array Slicing: Accessing Subarrays

#### 1D Arrays

- Array slicing is essentially array indexing applied to sets of elements (also called subarrays). It serves the same purpose as indexing, but for groups of data. It can also be used in sampling.
- Let's access the first three elements of our 1D array `x1`. Notice how the input has changed from a single index `x` to a range `n:m`, which accesses the elements from index `n` up to, but not including, index `m`.
- If we are accessing from the first element up to a given index `m`, it is sufficient to input `:m` to obtain the resulting subarray.
- You can compare the results of `:m` and `n:m` where `n=0` below.

In [None]:
x1[0:3]

array([0, 1, 2])

In [None]:
x1[:3]

array([0, 1, 2])

- Applying the same concept as `:m`, we can get the elements from index `n` all the way to the last element by using the input `n:`. Here, we get the elements of our array from index 3 onward.

In [None]:
x1[3:]

array([3, 4, 5, 6])

- Suppose that, instead of choosing which elements to display, we want a subarray with certain trailing elements removed. This can be done with a negative number as the end of the range: `:-w` drops the last `w` elements.
- This is `x1` with its last three elements excluded.

In [None]:
x1[:-3]

array([0, 1, 2, 3])

- To get the last three elements of an array, we use the same idea as `-1` in array indexing, but with `-w:`, where `w` is the number of elements to include, counting back from the last element.

In [None]:
x1[-3:]

array([4, 5, 6])

- Just as `x1[0:3]` gave us the 1st to 3rd elements, we can get the subarray from the 5th to the 7th element by slicing from index 4 up to (but not including) index 7.

In [None]:
x1[4:7]

array([4, 5, 6])

- To get your array in reversed order, simply input `::-1` in your square brackets.

In [None]:
x1[::-1]

array([6, 5, 4, 3, 2, 1, 0])

- To access elements at equal intervals `i`, input `::i` inside the square brackets.
- Let's access every other element in our array `x1` (i.e. `i=2`).

In [None]:
x1[::2]

array([0, 2, 4, 6])

- We can access elements at equal intervals in reverse as well.
- For every other element, reversed, we use the code below which combines our knowledge on reversed arrays and accessing every other element.

In [None]:
x1[::-2]

array([6, 4, 2, 0])

- You can also give a starting point when accessing elements at equal intervals. This is done with the input `j::i`, where `j` is the index of the starting element and `i` is the selection interval.
- Starting with index 1, let's access every other element afterwards.

In [None]:
x1[1::2]

array([1, 3, 5])

- Let's do the same thing but starting with index 5, getting every other element, reversed.

In [None]:
x1[5::-2]

array([5, 3, 1])
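All of the slices above are special cases of the general form `start:stop:step`, where any of the three parts can be left out. The sketch below combines all three parts at once; the variable names are chosen here for illustration.

```python
import numpy as np

x1 = np.array([0, 1, 2, 3, 4, 5, 6])

middle = x1[1:6:2]     # indices 1, 3, 5
backward = x1[5:1:-1]  # indices 5, 4, 3, 2 (stop index 1 is excluded)

# The last three elements, written in two equivalent ways
same_tail = np.array_equal(x1[-3:], x1[4:7])

print(middle)
print(backward)
print(same_tail)
```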

#### Multidimensional Arrays

In [None]:
x2

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

- To get subarrays of 2D arrays, we still follow the $(r,c)$ format. However, we now give a range for the rows and a range for the columns that we want to select.
- We can get the elements in the first 2 rows and first 3 columns of our array `x2` with the following code.

In [None]:
x2[:2,:3]

array([[0, 1, 2],
       [4, 5, 6]])

- Now let's combine our new knowledge of 2D array slicing with accessing elements at equal intervals $i$, this time for rows and columns.
- To get the first three rows of our array, keeping only every other column, we use the code below.

In [None]:
x2[:3,::2]

array([[ 0,  2],
       [ 4,  6],
       [ 8, 10]])

- To get every third row, starting from row 0, and only the first two columns of the array, we use the following code.

In [None]:
x2[::3,:2]

array([[ 0,  1],
       [12, 13]])
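The same slicing rules can be used to pick out a single row or column, or to reverse both axes. The sketch below applies them to `x2`, where a bare `:` means "everything along this axis". The variable names are chosen here for illustration.

```python
import numpy as np

x2 = np.array([[0, 1, 2, 3],
               [4, 5, 6, 7],
               [8, 9, 10, 11],
               [12, 13, 14, 15]])

first_row = x2[0, :]      # row 0, all columns
first_col = x2[:, 0]      # all rows, column 0
flipped = x2[::-1, ::-1]  # rows and columns both reversed
corners = x2[::3, ::3]    # every third row and every third column

print(first_row)
print(first_col)
print(flipped)
print(corners)
```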

### End of tutorial.

---

# Coding Space

- Try to implement your own code based on this week's lesson in the cells below.
- Feel free to add more code and Markdown cells as you see fit.