<div class='heading'>
    <div style='float:left;'><h1>CPSC 4300/6300: Applied Data Science</h1></div>
     <img style="float: right; padding-right: 10px" width="100" src="https://raw.githubusercontent.com/bsethwalker/clemson-cs4300/main/images/clemson_paw.png"> </div>

**Clemson University**<br>
**Instructor(s):** Aaron Masino <br>

## Lab 2: Introduction to Python Fundamentals & Python Data Science Libraries

## Programming Expectations
All assignments for this class will use Python and the browser-based iPython notebook format you are currently viewing.  Programming at the level of CPSC 2120 is a prerequisite for this course.   If you have concerns about this, come speak with any of the instructors.

We will refer to the Python 3 [documentation](https://docs.python.org/3/) in this lab and throughout the course.  

## Learning Goals
This introductory lab is a condensed introduction to Python numerical programming.  By the end of this lab, you should:

- Be able to write short Python scripts using variables, operators, containers, and functions
- Understand [numpy](https://numpy.org/) arrays and broadcast operations.
- Be aware of and able to reference some of the common Python numerical libraries including: [numpy](https://numpy.org/), [scipy](https://scipy.org/), [statsmodels](https://www.statsmodels.org/stable/index.html)

# Part 1: Python Programming Fundamentals

Let's begin by reviewing some fundamental programming concepts in Python:
1. Variable assignment
2. Operators: numerical, boolean
3. Containers: tuples, lists, sets, dictionaries
4. Control flow
5. Functions
6. Classes

For more information and as a handy reference, check out this [Python Control Flow Cheatsheet](https://www.pythoncheatsheet.org/cheatsheet/control-flow).

## Python Variable Assignment
* Variable names are labels for a datum held in computer memory
* Assignment of a variable name to the variable value (datum) is done with the `=` operator, as in `name = value`
* (__Good Coding Practice__) Variable names should:
    - Be descriptive enough to help you or other readers of your code understand the meaning and use of the variable
    - Be concise so that the variable name does not clutter the code
    - Follow a chosen [variable naming convention](https://peps.python.org/pep-0008/#descriptive-naming-styles). There are many options; most importantly, be consistent. Most of the material in this course will tend to follow this convention:
        * UPPERCASE NAMES - refer to "constants" - Python doesn't enforce immutability, so constants can be changed, but by convention programmers avoid doing so.
        * CamelCaseNames - refer to variables whose values are expected to change in the code (e.g., `myName`); single-word variable names are all lowercase
        * functions_use_underscores - function / method names will be all lowercase with underscores separating words
        * prefixed variables
            * f_FileName - file names will usually start with f_
            * dir_DirName - directory names will usually start with dir_


### Examples of variable assignment

In [None]:
# Some examples
# Numerical variables
j = 1 # integer variable
x = 3.14 # float variable

# String variables
a = 'a' # single character
s = 'Hello, World!' # string
myName = 'Aaron' # string
helloMyName=f'Hello, {myName}' # f-string # for more on f-strings, see https://realpython.com/python-f-strings/

# Boolean variables
t = True
f = False

# The None object - a special object that represents the absence of a value
n = None

# print some variables
print(j)
print(x)
print(n)
print(helloMyName)
s # notice, if the last line of a cell is a variable, it will be printed

### Reassignment
Variable names can be reassigned to new values

In [None]:
x = 3.14
print(x)
x = 2
print(x)

# no restriction on the type when reassigning a variable
x = 'Hello, World!'
print(x)

### Assigning variable names to other variable names
Variable names can be assigned (or reassigned) by setting them equal to other variable names. **Warning**: Remember the variable name points to the datum. If the datum changes, it will change the value for all variables to which it is assigned. This doesn't affect atomic types as the underlying datum cannot be changed (more on this below).

In [None]:
x = 3.14
y = x
print('x=',x)
print('y=',y)

# x and y both refer to the same immutable float object
# Reassigning x binds the name x to a new object; it does not affect y
x = 2
print('x=',x)
print('y=',y)

### Variable type
All Python variables have a type. It is often useful to check the type with the `type` function

In [None]:
x = 3.14
print('x is of type: ', type(x))

s = 'Hello, World!'
print('s is of type: ', type(s))
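
As a small sketch of how type checks are typically used, `isinstance` tests whether a value belongs to a type (or to any of several types), and the built-in `int`, `float`, and `str` functions convert between types. The variable names here are illustrative only.

```python
count = 3
ratio = 3.0

# type() returns the exact type; isinstance() answers a yes/no question
print(type(count) is int)                # True
print(isinstance(ratio, float))          # True
print(isinstance(count, (int, float)))   # True: matches any type in the tuple

# Explicit conversion between types
as_text = str(count)      # '3'
as_float = float(count)   # 3.0
as_int = int('42')        # 42
print(as_text, as_float, as_int)
```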

## Operators
Now that we have variables with assigned values, what can we do with them? Python defines several operators. Some (like division) can only be applied to numerical values. Others can be applied to any atomic (and many composite) data types.

<img style="width: 600px" src="https://raw.githubusercontent.com/bsethwalker/clemson-cs4300/main/images/ops1_v2.png">

<img style="width: 650px" src="https://raw.githubusercontent.com/bsethwalker/clemson-cs4300/main/images/ops2_v2.png">

In [None]:
# Some numerical operations
print('Some numerical operations')
a = 7
b = 3
print('a+b=',a+b)
print('a-b=',a-b)
print('a*b=',a*b)
print('a/b=',a/b)
print('a//b=',a//b) # integer division
print('a%b=',a%b) # remainder

# Some string operations
print('Some string operations')
s1 = 'Hello, '
s2 = 'World!'
print('s1+s2=',s1+s2)
print('s1*3=',s1*3)

# The +=, -=, *=, /=, //=, %= operators update the value of a variable (e.g., a += b is shorthand for a = a + b)
print('The +=, -=, *=, /=, //=, %= operators')
a = 7
b = 3
a += b
print('a+=b:',a)
print('a=',a)
a-=b
print('a-=b:',a)
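
The integer division (`//`) and remainder (`%`) operators shown above are tied together by a simple identity. The sketch below checks it, and also shows that `//` rounds toward negative infinity, which can be surprising for negative numbers. The variable names are illustrative.

```python
a, b = 7, 3
quotient = a // b    # 2
remainder = a % b    # 1

# Identity linking the two operators
print(quotient * b + remainder == a)   # True

# Floor division rounds toward negative infinity, so the remainder
# takes the sign of the divisor
neg_quotient = -7 // 3    # -3
neg_remainder = -7 % 3    # 2
print(neg_quotient, neg_remainder)

# Exponentiation uses **
power = 2 ** 10           # 1024
print(power)
```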


In [None]:
# Some boolean operations
print('Some boolean operations')
a = True
b = False
print('a and b =',a and b)
print('a or b =',a or b)
print('not a =',not a)
print('a and not b =',a and not b)

# Some comparison operations
print('\nSome comparison operations')
a = 7
b = 3
print('a == b:',a == b)
print('a != b:',a != b)
print('a > b:',a > b)
print('a < b:',a < b)
print('a >= b:',a >= b)
print('a <= b:',a <= b)




## Containers
Python containers can be used to assign multiple data to a single variable. There are four built-in container types in Python:
* tuple
* list
* set
* dictionary

__Complete documentation for built-in containers:__ [Python Data Structures](https://docs.python.org/3/tutorial/datastructures.html#)

Each container type has slightly different characteristics that are useful for different tasks. However, there are some common features such as:
* they can hold any number of items
* they can hold any type of item - including other containers
* the items in a container can be of different types
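* Most, but not all, are __mutable__: their contents can be changed after creation. A container also holds references to its items, so if a mutable object is put in a container, changing that object through its variable also changes what the container holds (more on this below)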
* they are iterable

For more containers used in special circumstances, see the [collections library](https://docs.python.org/3/library/collections.html)

### Tuples
This is the most basic container type. It is used by default in many situations that may not be obvious. Let's look at an example of:
* creating a tuple variable
* accessing tuple elements
* tuples are immutable - sort of
* supports slicing
* some non-obvious code that creates tuples

In [None]:
# explicitly creating a tuple with ()
t = ('a', 1, False)
print(t)

# accessing elements of a tuple
print(t[0])
print(t[1])

# tuples are immutable, you cannot change the elements of a tuple
# t[1] = 2 # this will raise an error
# however, if the element is mutable, you can change the element. We'll see this later

# tuples can be empty
empty = ()
print('empty: ', empty)
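
The comment above notes that a mutable element inside a tuple can still be changed. Here is a minimal sketch of that "immutable, sort of" behavior; `record` is an illustrative name.

```python
# A tuple that holds a string and a (mutable) list
record = ('alice', [90, 85])

# The list inside the tuple can be modified in place
record[1].append(77)
print(record)   # ('alice', [90, 85, 77])

# But the tuple's own slots cannot be reassigned
try:
    record[1] = [0]
except TypeError as err:
    print('TypeError:', err)
```

The tuple fixes *which* objects it refers to, not the contents of those objects.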


In [None]:
# assigning comma separated values to a variable creates a tuple. We'll see this again in function return values
x = 'a', 1, False
print(x)
print(type(x))
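
The reverse of the comma-separated assignment above is *unpacking*: assigning the items of a tuple to several variable names at once. A short sketch, with illustrative names:

```python
point = 3, 4          # creates the tuple (3, 4)

# Unpack the tuple into two variables
px, py = point
print(px, py)         # 3 4

# Swap two variables: the right-hand side builds a tuple, then it is unpacked
px, py = py, px
print(px, py)         # 4 3

# A starred name collects the remaining items into a list
head, *rest = (0, 1, 2, 3)
print(head, rest)     # 0 [1, 2, 3]
```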

In [None]:
# explicitly creating a tuple with ()
t = ('a', 1, False)

# check if an element is in a tuple
print('a' in t)
print('x' in t)

In [None]:
# create tuple
t = (0, 1, 2, 3, 'a', 'b', 'c', 'd')
print('t: ', t)

# slicing a tuple
# t[i:j] returns a tuple with elements i to j-1
print('t[0:2]: ', t[0:2])
print('t[2:4]: ', t[2:4])

# t[i:] returns a tuple with elements i to the end
print('t[2:]: ', t[2:])

# the length of t can be obtained using len
print('len(t): ', len(t))

# t[-i:] returns a tuple with the last i elements
print('t[-2:]: ', t[-2:])
# t[:-i] returns a tuple with all elements except the last i elements
print('t[:-2]: ', t[:-2])
# t[-i:-i+j] returns a tuple with elements -i to -i+j-1 # this is not often used as it is equivalent to t[len(t)-i:len(t)-i+j]
print('for i=6, j=4')
print('t[-6:-2]: ', t[-6:-2])


### Lists
Lists are very similar to tuples. The main difference is that they are mutable. Let's see some examples

In [None]:
# create a list
l = ['a', 1, False]
print(l)

# accessing elements of a list
print(l[0])
print(l[1])

# lists can be empty
empty = []
print('empty: ', empty)

In [None]:
# create a list
l = ['a', 1, False]
print(l)

# lists are mutable, you can change the elements of a list
l[1] = 2
print(l)

In [None]:
# create a list
l = ['a', 1, False]

# you can add elements to the end of a list using append
l.append('b')
print(l)

# you can add elements to the front of a list using insert
l.insert(0, 'z')
print(l)

# you can remove elements from a list using remove
l.remove('a') # removes the first occurrence of 'a' only
print(l)

# you can remove elements from a list using pop and assign it to a variable
x = l.pop() # removes the last element
print('x=', x)
print('l=',l)

x= l.pop(1) # removes the element at index 1
print('x=', x)
print('l=',l)


In [None]:
# create a list
l = ['a', 1, False]
print(l)

# if you assign a list to another variable, the new variable will point to the same list
l = ['a', 1, False]
m = l
print('l=',l)
print('m=',m)

# if you change an element of the list using one variable, the change will be reflected in the other variable
print("\nchanging an element of the list using one variable")
l[1] = 2
print('l=',l)
print('m=',m)
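
If you want an independent list rather than a second name for the same list, make a copy. The sketch below contrasts an alias with a shallow copy; the names are illustrative.

```python
original = ['a', 1, False]
alias = original             # second name for the SAME list
shallow = original.copy()    # new list with the same items (original[:] also works)

original[1] = 2
print(alias)     # ['a', 2, False]  (changed along with original)
print(shallow)   # ['a', 1, False]  (unaffected)

# The `is` operator checks whether two names refer to the same object
print(alias is original)     # True
print(shallow is original)   # False
```

Note that `copy()` is shallow: nested mutable items (e.g., a list inside the list) are still shared. The standard library's `copy.deepcopy` copies nested items as well.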

In [None]:
# create a list
l = ['a', 1, False]
print(l)

# check if an element is in a list
print('a' in l)
print('x' in l)

In [None]:
# list slicing is similar to tuple slicing
l = [0, 1, 2, 3, 'a', 'b', 'c', 'd']
print('l: ', l)

# slicing a list
# l[i:j] returns a list with elements i to j-1
print('l[0:2]: ', l[0:2])
print('l[2:4]: ', l[2:4])

# l[i:] returns a list with elements i to the end
print('l[2:]: ', l[2:])

# the length of l can be obtained using len
print('len(l): ', len(l))

# l[-i:] returns a list with the last i elements
print('l[-2:]: ', l[-2:])
# l[:-i] returns a list with all elements except the last i elements
print('l[:-2]: ', l[:-2])
# l[-i:-i+j] returns a list with elements -i to -i+j-1 # this is not often used as it is equivalent to l[len(l)-i:len(l)-i+j]
print('for i=6, j=4')
print('l[-6:-2]: ', l[-6:-2])

### Sets
A set is a container that contains distinct items, i.e., values are not repeated. Some special considerations for sets:
* sets are not subscriptable (i.e., cannot access set items by index)
* sets are mutable - items can be added and removed
* sets are often transformed to lists and vice versa.

Let's see some examples

In [None]:
# explicit creation of a set
s = set([1,2,4,4,4,4,4])
print(s) # notice that the set has only unique elements

# accessing elements of a set
# s[0] # this will raise an error
# sets are unordered, so you cannot access elements by index. You must iterate over the set to access elements
for item in s:
    print(item)

In [None]:
# explicit creation of a set
s = set([1,2,4,4,4,4,4])
print(s) # notice that the set has only unique elements

# sets are mutable
s.add(5)
s.remove(2)
print(s)

In [None]:
# a list can be converted to a set - often a convenient way to remove duplicate items from a list
l = [1,2,4,4,4,4,4]
s = set(l)
print(s)

# a set can be converted to a list
l = list(s)
print(l)

In [None]:
# explicit creation of a set
s = set([1,2,4,4,4,4,4])
print(s) # notice that the set has only unique elements

# check if an element is in a set
print(1 in s)
print(3 in s)
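
Sets also support the usual mathematical set operations. A brief sketch follows; the set names are illustrative, and `sorted` is used only to make the printed order predictable, since sets are unordered.

```python
evens = {0, 2, 4, 6}
small = {0, 1, 2, 3}

union = evens | small          # items in either set
intersection = evens & small   # items in both sets
difference = evens - small     # items in evens but not in small
symmetric = evens ^ small      # items in exactly one of the sets

print(sorted(union))          # [0, 1, 2, 3, 4, 6]
print(sorted(intersection))   # [0, 2]
print(sorted(difference))     # [4, 6]
print(sorted(symmetric))      # [1, 3, 4, 6]
```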

### Dictionaries
A dictionary contains `key:value` pairs. Some important properties of dictionaries include:
* `keys` are any hashable type, though we typically use integers or strings
* `keys` may be of different types though this is usually not recommended
* dictionary items are accessed with the `key`
* dictionaries are mutable - items can be changed, added, removed
* no slicing in dictionaries

In [None]:
# create a dictionary
d = {'a': 1, 'b': 2, 'c': 3}

# accessing elements of a dictionary
print(d['a'])
print(d['b'])
# print(d['x']) # this will raise an error because 'x' is not a key in the dictionary

# using get to access elements of a dictionary
print(d.get('a'))
print(d.get('b'))
print(d.get('x')) # this will not raise an error, it will return None


In [None]:
# create a dictionary
d = {'a': 1, 'b': 2, 'c': 3}

# dictionaries are mutable, you can change the elements of a dictionary
d['b'] = 4
print(d)

# add a new element to a dictionary
d['d'] = 5
print(d)

# remove an element from a dictionary
del d['b']
print(d)

In [None]:
# create a dictionary
d = {'a': 1, 'b': 2, 'c': 3}

# check if a key is in a dictionary - notice that this checks for the key, not the value
print('a' in d)
print('x' in d)

# using get to access elements of a dictionary with a default value
print(d.get('a', -1))
print(d.get('x', -1))
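
To illustrate what "hashable" means for keys: immutable objects such as strings, numbers, and tuples can be used as keys, while mutable containers such as lists cannot. A minimal sketch with illustrative names:

```python
# Tuples are hashable, so they can serve as dictionary keys
labels = {(0, 0): 'origin', (1, 2): 'point'}
print(labels[(1, 2)])   # point

# Lists are mutable and therefore not hashable
try:
    bad = {[0, 0]: 'origin'}
except TypeError as err:
    print('TypeError:', err)
```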


## Control Flow
Our analysis scripts will involve many steps that require control over how they should proceed. Python has several _control flow_ options including
1. for loops
2. if statements
3. while statements

For more information on additional control flow options, see [Python Control Flow](https://www.pythoncheatsheet.org/cheatsheet/control-flow).


In [None]:
# Let's see some examples of control structures
# if-else
x = 3
print('x=',x)
if x > 0:
    print('x is positive')
else:
    print('x is non-positive')

# if-elif-else
x = 0
print('x=',x)
if x > 0:
    print('x is positive')
elif x < 0:
    print('x is negative')
else:
    print('x is zero')

In [None]:
# Now let's see some examples of loops
# for loop
k = 5
print(f'for loop with range({k})')
for i in range(k): # the range(k) function returns a sequence of numbers from 0 to k-1
    print(i)

# for loop with a list
l = ['a', 1, False]
print(f'\nfor loop with list, {l}')
for i in l:
    print(i)

# can we combine range and len to iterate over the indices of a list?
l = ['a', 1, False]
print(f'\nfor loop with range and list, {l}')
for i in range(len(l)):
    print(i, l[i])

# while loop
x = 3
print(f'\nwhile loop with x={x}')
while x > 0:
    print(x)
    x -= 1
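
Combining `range` and `len` works, but the built-in `enumerate` and `zip` functions are usually a more readable way to loop with indices or over several containers at once. The sketch below also shows `continue` and `break`; all names are illustrative.

```python
letters = ['a', 'b', 'c']
numbers = [10, 20, 30]

# enumerate yields (index, item) pairs
for index, letter in enumerate(letters):
    print(index, letter)

# zip pairs up items from several containers
pairs = list(zip(letters, numbers))
print(pairs)   # [('a', 10), ('b', 20), ('c', 30)]

# continue skips to the next iteration; break exits the loop
found = []
for n in range(10):
    if n % 2 == 1:
        continue   # skip odd numbers
    if n > 6:
        break      # stop at the first even number above 6
    found.append(n)
print(found)   # [0, 2, 4, 6]
```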


In [None]:
# for loops on dictionaries often iterate over the keys, the values, or the key-value pairs (items)
d = {'a': 1, 'b': 2, 'c': 3}
print(f'\nfor loop with dictionary using keys, {d}')
for k in d:
    print(k, d[k])

print(f'\nfor loop with dictionary using values, {d}')
for v in d.values():
    print(v)

print(f'\nfor loop with dictionary items, {d}')
for k,v in d.items():
    print(k, v)

In [None]:
# control structures can be nested
d = {'a': 1, 'b': 2, 'c': 3}
print(f'\nNested control structures with dictionary items, {d}')
for k,v in d.items():
    if v % 2 == 0:
        print(f'{k} has an even value')
    else:
        print(f'{k} has an odd value')

### Control Flow with Container Comprehension
Python includes the notion of container comprehension that allows for more concise code for situations where you want to iterate over the items in a container. The most common use is on lists, though comprehension can be used on all built-in container types.

The list comprehension syntax
`[expression for item in list]`
is equivalent to
```
for item in list:
    expression
```
where `expression` is a valid operation on `item` (i.e., application of an operator or function). The comprehension collects the value of `expression` for each `item` into a new list.

List comprehension can also be combined with conditions
`[expression for item in list if condition]`
is equivalent to
```
for item in list:
    if condition:
        expression
```

More on list comprehensions in the [Python list comprehension documentation](https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions).


In [None]:
# Some examples of comprehensions
# list comprehension
l = [1, 2, 3, 4, 5]
squared = [x**2 for x in l]
print(squared)

# This is equivalent to the following for loop
squared = []
for x in l:
    squared.append(x**2)
print(squared)

# list comprehension with condition
l = [1, 2, 3, 4, 5]
# only square the even numbers
squared = [x**2 for x in l if x % 2 == 0]
print(squared)

# dictionary comprehension
l = ['a', 'b', 'c', 'd']
d = {x: i for i, x in enumerate(l)}
print(d)
vSquared = [x**2 for x in d.values()]
print(vSquared)
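
Comprehensions for the other built-in containers follow the same pattern. A short sketch with illustrative names; note that there is no "tuple comprehension" syntax, so a generator expression is passed to `tuple` instead.

```python
words = ['apple', 'bob', 'cat', 'apple']

# set comprehension: the distinct word lengths
lengths = {len(w) for w in words}
print(sorted(lengths))   # [3, 5]

# dictionary comprehension with a condition: keep only long words
long_words = {w: len(w) for w in words if len(w) > 3}
print(long_words)        # {'apple': 5}

# generator expression converted to a tuple
squares = tuple(n**2 for n in range(4))
print(squares)           # (0, 1, 4, 9)
```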


## Functions
A *function* object is a reusable block of code that does a specific task.  Functions are commonplace in Python, either on their own or as they belong to other objects. To invoke a function `func`, you call it as `func(arguments)`.

We've seen __built-in__ Python functions and methods. For example, `len()` and `print()` are built-in Python functions.

As we will see, Python libraries, also called modules, will have many functions. To access these functions, we will need to _import_ the modules (or specific parts of modules) but more on that later. For now, let's look at functions we can create with built-in Python capabilities.

### User-defined functions

We'll now learn to write our own user-defined functions.  Below is the syntax for defining a basic function with one input argument and one output. You can also define functions with no input or output arguments, or multiple input or output arguments.

```
def name_of_function(arg):
    ...
    return output
```

We can write functions with one input and one output argument.  Here are two such functions.

In [None]:
def square(x):
    x_sqr = x*x
    return x_sqr

def cube(x):
    x_cub = x*x*x
    return x_cub

square(5),cube(5)

A function can return one, multiple, or no items. To return multiple items, you can use a container. If no `return` statement is included, the `None` object is returned. The `None` object can also be explicitly returned if needed.

In [None]:
# custom divide function that returns the quotient and remainder OR None if the divisor is 0
def divide(a, b):
    if b == 0:
        return None # you could also raise an exception here
    else:
        quotient = a//b
        remainder = a%b
        return quotient, remainder

print(divide(7,3))
print(divide(7,0))
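
When a function returns several values as a tuple, the caller can unpack them directly into separate variables. Functions can also declare default values for arguments. The functions `power` and `min_max` below are hypothetical helpers written for this sketch.

```python
def power(base, exponent=2):
    """Raise base to exponent; exponent defaults to 2 if not given."""
    return base ** exponent

def min_max(values):
    """Return the smallest and largest value as a tuple."""
    return min(values), max(values)

print(power(3))               # 9   (uses the default exponent)
print(power(3, exponent=3))   # 27  (keyword argument overrides the default)

# Unpack the returned tuple into two variables
lowest, highest = min_max([4, 1, 7])
print(lowest, highest)        # 1 7
```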

### Lambda functions
Often we want to define an _anonymous_ function for concise code, usually with just one line. *lambda* functions make this possible. Lambda functions enable us to write functions without having to name them, i.e., they're *anonymous*.  
No return statement is needed.

In [None]:
# The apply function takes a function and a variable and applies the function to the variable
def apply(f, x):
    return f(x) # f is a function

# we can use apply to square a number by passing a lambda function that squares the input
print(apply(lambda x: x*x, 5))
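
A common practical use of lambda functions is as the `key` argument of the built-in `sorted` function, which decides what value each item is sorted by. A minimal sketch with made-up data:

```python
records = [('carol', 31), ('alice', 25), ('bob', 28)]

# Sort by the second item of each tuple (the number)
by_number = sorted(records, key=lambda r: r[1])
print(by_number)   # [('alice', 25), ('bob', 28), ('carol', 31)]

# Sort by the first item of each tuple (the name)
by_name = sorted(records, key=lambda r: r[0])
print(by_name)     # [('alice', 25), ('bob', 28), ('carol', 31)]

# map applies a function to every item of a container
doubled = list(map(lambda v: 2 * v, [1, 2, 3]))
print(doubled)     # [2, 4, 6]
```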

## Classes
Recall the four atomic data types in Python are: integers, floats, strings, and booleans (Python has no separate character type; a single character is a string of length one). The containers we've seen are composite data types that hold one or more items that can be atomic or composite. Additionally, we can define Python _Classes_ which combine data types (atomic and composite) with methods (same as functions). Classes are useful when you need to create multiple _instances_ of a variable that share the same data structure and have the same methods.  For more information see [Python Class Definitions](https://docs.python.org/3/tutorial/classes.html)

In [None]:
# Let's look at a simple example
class Complex():
    def __init__(self, real, imag):
        self.real = real
        self.imag = imag
    def __str__(self):
        return str(self.real) + '+' + str(self.imag) + 'i'
    def magnitude(self):
        return (self.real**2 + self.imag**2)**0.5
    def __add__(self, other):
        return Complex(self.real + other.real, self.imag + other.imag)
    def __sub__(self, other):
        return Complex(self.real - other.real, self.imag - other.imag)

c1 = Complex(1, 2)
c2 = Complex(3, 4)
print(c1)
print(c2)
print(c1.magnitude())
print(c2.magnitude())
print(c1 + c2)

### Is it a method or a function?
Strictly speaking, a _function_ is code defined outside of a Class and can be accessed directly using the function name.

A function that belongs to an object is called a *method*. By "object," we mean an "instance" of a class (e.g., list, integer, or floating point variable).

For example, when we invoke `magnitude` on an existing Complex, `magnitude` is a method.

In other words, a *method* is a function on a specific *instance* of a class (i.e., *object*). In the example above, our class is Complex. `c1` is an instance of `Complex` (thus, an object), and the `magnitude` function is technically a *method* since it pertains to the specific instance `c1`. Indeed, there is no way to invoke the `magnitude` method independently of an instance of the `Complex` class.

We will see examples of libraries that provide both functions and Classes with methods.
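
The built-in list type offers a compact illustration of the distinction: `sorted` is a function that takes a list as an argument, while `sort` is a method invoked on a specific list instance. The variable names below are illustrative.

```python
values = [3, 1, 2]

# Function: called by name, takes the list as an argument, returns a NEW list
ordered = sorted(values)
print(ordered)   # [1, 2, 3]
print(values)    # [3, 1, 2]  (unchanged)

# Method: called on the instance with dot notation, modifies the list in place
result = values.sort()
print(values)    # [1, 2, 3]
print(result)    # None  (sort does not return the list)

# A method can also be reached through its class, but an instance must still be supplied
text = 'hello'
print(text.upper() == str.upper(text))   # True
```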

# Part 2: Python Numerical Libraries

We will now briefly introduce three libraries commonly used for numerical analysis in Python.
1. [numpy](https://numpy.org/)
2. [scipy](https://scipy.org/)
3. [statsmodels](https://www.statsmodels.org/stable/index.html)

You are _not_ expected to know everything possible about these libraries after this introduction. In fact, these libraries are quite extensive and most practitioners of data science find it necessary to refer to the documentation regularly. We will introduce more concepts from these libraries throughout the course.

## Numpy Introduction
NumPy is a Python library that supports efficient mathematical operations over large, multi-dimensional arrays and matrices. It also includes numerous built-in mathematical and statistical functions to operate on the arrays. Many other Python libraries (e.g., Pandas) extensively use numpy.

Let's start by importing the library:

__Importing modules__
Before we can use Numpy we need to _import_ it into our workspace using the `import` statement, which has the following forms:<br/><br/>
`import MODULE_NAME as ALIAS` - this imports the module and assigns it to the variable `ALIAS`. <br/>
`import MODULE_NAME` - this imports the module and assigns it to the variable `MODULE_NAME`. This is equivalent to `import MODULE_NAME as MODULE_NAME`<br/>
`from MODULE_NAME import X as ALIAS` - this imports only `X` from `MODULE_NAME` and assigns it to the variable `ALIAS`<br/><br/>

_(Good Coding Practice)_ It is best to include _ALL_ import statements at the top of your Jupyter notebooks and other Python (.py) files. In this notebook, they are placed later only for convenience of introducing the import concept.
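
The three import forms can be tried out with Python's standard `math` module, which requires no installation. This is only a sketch; the aliases `m` and `square_root` are arbitrary choices.

```python
import math                            # bound to the name `math`
import math as m                       # the same module, bound to the alias `m`
from math import sqrt as square_root   # only the sqrt function, bound to `square_root`

print(math.sqrt(16))     # 4.0
print(m.sqrt(16))        # 4.0
print(square_root(16))   # 4.0

# Both names refer to the same module object
print(m is math)         # True
```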

In [None]:
import numpy as np

Now let's create an array. The `array` class is a type of container in the numpy library. It is very similar to a _list_ but has additional functions and its implementation is much more efficient for numerical operations.

In [None]:
my_array = np.array([1, 2, 3, 4])
my_array

In [None]:
# len works as it would with a standard list
print(len(my_array))

# elements can be accessed by index
print(my_array[0])

# elements can be accessed by slicing
print(my_array[1:3])

# elements can be modified
my_array[0] = 5
print(my_array)

The `shape` attribute of an array is very useful (we'll see more of it later when we talk about 2D arrays -- matrices -- and higher-dimensional arrays).

In [None]:
my_array = np.array([1, 2, 3, 4])
my_array.shape

Numpy arrays are **typed**. This means that all the elements of an array share the same type (e.g., integer, float, string). This is important because when all elements have the same type, numerical operations can be made much faster. Note, however, that if you build an array from values of mixed types (see below), numpy converts them to a single common type, which may be a string type or the generic `object` type, in which case the speed advantage is usually lost.

In [None]:
my_array = np.array([1, 2, 3, 4])
print(my_array.dtype)

myOtherArray = np.array(['a', 1, 'b'])
print(myOtherArray.dtype) # numpy automatically converts all elements to a common type (here, a string type)
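
The sketch below explores how numpy picks a common type for mixed numeric input, and how a type can be requested or converted explicitly. The array names are illustrative; the exact integer type (e.g., `int64`) can depend on the platform, so the code checks only that it is some integer type.

```python
import numpy as np

ints = np.array([1, 2, 3])
print(np.issubdtype(ints.dtype, np.integer))   # True

# Mixing integers and floats promotes everything to float
promoted = np.array([1, 2.5, 3])
print(promoted.dtype)    # float64

# astype returns a converted copy; the original array is unchanged
as_float = ints.astype(float)
print(as_float.dtype)    # float64

# Requesting dtype=object keeps the original Python objects, at the cost of speed
objects = np.array(['a', 1, 'b'], dtype=object)
print(objects.dtype)     # object
```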

There are two ways to manipulate numpy arrays: a) by calling the array's own methods (e.g., `my_array.mean()`) or b) by applying a numpy module function (e.g., `np.mean()`) with the numpy array as an argument.

In [None]:
my_array = np.array([1, 2, 3, 4])
print(my_array.mean())
print(np.mean(my_array))

A ``constructor`` is a general programming term that refers to the mechanism for creating a new object (e.g., list, array, String).

There are many other efficient ways to construct numpy arrays. Here are some commonly used numpy array constructors. Read more details in the numpy documentation.

In [None]:
np.ones(10) # generates 10 floating point ones

Numpy gains a lot of its efficiency from being typed. That is, all elements in the array have the same type, such as integer or floating point. The default type, as can be seen above, is a float. (NumPy's default float is `float64`, which uses 64 bits, i.e., 8 bytes, per element regardless of whether the code is running on a 32-bit or 64-bit machine.)

In [None]:
np.dtype(float).itemsize # in bytes (remember, 1 byte = 8 bits)

We can tell numpy what type we want the elements to be:

In [None]:
np.ones(10, dtype='int') # generates 10 integer ones

An array of zeros is often useful:

In [None]:
np.zeros(10)
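
A few more constructors in the same family, shown as a quick sketch. Passing a tuple as the shape creates a multi-dimensional array; the variable names are illustrative.

```python
import numpy as np

zeros_2d = np.zeros((2, 3))       # 2 rows, 3 columns of zeros
sevens = np.full(4, 7.0)          # four copies of 7.0
steps = np.arange(0, 10, 2)       # start, stop (excluded), step
spaced = np.linspace(0, 1, 5)     # 5 evenly spaced values, endpoints included

print(zeros_2d.shape)   # (2, 3)
print(sevens)           # [7. 7. 7. 7.]
print(steps)            # [0 2 4 6 8]
print(spaced)           # [0.   0.25 0.5  0.75 1.  ]
```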

Often, you will want random numbers. Use the `random` constructor!

In [None]:
np.random.random(10) # uniform on the half-open interval [0, 1)

You can generate random numbers from a normal distribution with mean 0 and variance 1:

In [None]:
number_of_samples = 1000
normal_array = np.random.randn(number_of_samples) # standard normal
print(f"The sample mean and standard deviation are {np.mean(normal_array)} and {np.std(normal_array)}, respectively.")
print(len(normal_array))
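
Random results change every time the cell runs. When you need reproducible numbers (e.g., for debugging or grading), one option is numpy's `Generator` interface created with `np.random.default_rng` and a fixed seed. This is a sketch of an alternative to the `np.random.randn` call above, not a replacement the course requires; the seed value 42 is arbitrary.

```python
import numpy as np

# Two generators created with the same seed produce the same sequence
rng = np.random.default_rng(seed=42)
samples = rng.normal(loc=0.0, scale=1.0, size=1000)   # standard normal

rng_again = np.random.default_rng(seed=42)
samples_again = rng_again.normal(loc=0.0, scale=1.0, size=1000)

print(np.array_equal(samples, samples_again))   # True
print(samples.mean(), samples.std())            # close to 0 and 1
```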

You can sample with and without replacement from an array. Let's first construct an array of evenly-spaced values and then sample from it.

In [None]:
grid = np.arange(0., 1.01, 0.1)
print(grid)

number_of_samples = 5
sample = np.random.choice(grid, number_of_samples, replace=False)
print(sample)

number_of_samples = 20
try:
    np.random.choice(grid, number_of_samples, replace=False)
except ValueError as e:
    print("What happened?")
# The following will raise an error. Why?
# np.random.choice(grid, number_of_samples, replace=False)

Let's try it with replacement

In [None]:
np.random.choice(grid, 20, replace=True)

## Tensors

We can think of tensors as a general name for multidimensional arrays of numerical values. While tensors originated in mathematics and physics, they have since been applied to numerous other disciplines, including machine learning. In this class you will only be using **scalars**, **vectors**, and **2D arrays**, so you do not need to worry about the name 'tensor'.

We will use the following naming conventions:

- scalar = just a number = rank 0 tensor ( $a \in F$ )
<BR><BR>
- vector = 1D array = rank 1 tensor ( $x = (x_1, \ldots, x_n)^\top \in F^n$ )
<BR><BR>
- matrix = 2D array = rank 2 tensor ( $\mathbf{X} = [a_{ij}] \in F^{m \times n}$ )
<BR><BR>
- 3D array = rank 3 tensor ( $\mathscr{X} = [t_{i,j,k}] \in F^{m \times n \times l}$ )
<BR><BR>


### Slicing a 2D array

<img src="https://raw.githubusercontent.com/bsethwalker/clemson-cs4300/main/images/slicing_2D_oreilly.png" alt="Drawing" style="width: 400px;"/>

[source:oreilly](https://www.oreilly.com/library/view/python-for-data/9781449323592/ch04.html)
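
The slicing expressions in the figure can be tried on a small 3x3 array. In a 2D index, the part before the comma selects rows and the part after selects columns. The array name `matrix` is illustrative.

```python
import numpy as np

matrix = np.arange(9).reshape(3, 3)
print(matrix)
# [[0 1 2]
#  [3 4 5]
#  [6 7 8]]

top_right = matrix[:2, 1:]     # first two rows, columns 1 onward
last_row = matrix[2]           # the row at index 2 (same as matrix[2, :])
left_cols = matrix[:, :2]      # all rows, first two columns
partial_row = matrix[1, :2]    # row 1, first two columns

print(top_right)     # [[1 2] [4 5]]
print(last_row)      # [6 7 8]
print(left_cols)     # [[0 1] [3 4] [6 7]]
print(partial_row)   # [3 4]
```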

#### Numpy supports vector operations

What does this mean? It means that instead of adding two arrays, element by element, you can just say: add the two arrays.

In [None]:
first = np.ones(5)
second = np.ones(5)
first + second # element-wise addition; returns a new array (first and second are unchanged)

Note that this behavior is very different from python lists where concatenation happens.

In [None]:
first_list = [1., 1., 1., 1., 1.]
second_list = [1., 1., 1., 1., 1.]
first_list + second_list # concatenation

On some computer chips, this numpy addition actually happens in parallel and can yield significant increases in speed. But even on regular chips, the advantage of greater readability is important.

#### Broadcasting

Numpy supports a concept known as *broadcasting*, which dictates how arrays of different sizes are combined together. There are too many rules to list here, but importantly, multiplying an array by a number multiplies each element by the number. Adding a number adds the number to each element.

In [None]:
first_list = np.array([1., 1., 1., 1., 1.])
first_list + 1

In [None]:
first*5
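
Broadcasting also applies between arrays of different shapes, not just an array and a number. As a rough sketch of the main rule: shapes are compared from the last dimension backward, and two dimensions are compatible when they are equal or one of them is 1. The array names below are illustrative.

```python
import numpy as np

table = np.arange(6).reshape(2, 3)    # shape (2, 3): [[0 1 2] [3 4 5]]
row = np.array([10, 20, 30])          # shape (3,)
column = np.array([[100], [200]])     # shape (2, 1)

# The row is added to every row of the table
print(table + row)      # [[10 21 32] [13 24 35]]

# The column is added to every column of the table
print(table + column)   # [[100 101 102] [203 204 205]]

# A (3,) array and a (2, 1) array broadcast to shape (2, 3)
combined = row + column
print(combined.shape)   # (2, 3)

# Incompatible shapes raise an error: (2, 3) and (2,) disagree in the last dimension
try:
    table + np.array([1, 2])
except ValueError as err:
    print('ValueError:', err)
```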

This means that if you wanted samples from a normal distribution with mean 5 and standard deviation 7, i.e., $N(5, 7^2)$, you could do:

In [None]:
normal_5_7 = 5 + 7*normal_array
np.mean(normal_5_7), np.std(normal_5_7)

Multiplying two arrays multiplies them element-by-element

In [None]:
(first +1) * (first*5)

You might have wanted to compute the dot product instead:

In [None]:
np.dot((first +1) , (first*5))

## A quick look at `scipy.stats` and `statsmodels`

Two useful statistics libraries in python are `scipy` and `statsmodels`. Here, some examples are introduced. We will see much more of these libraries during the course.

Let's first import the [proportions_ztest](https://www.statsmodels.org/stable/generated/statsmodels.stats.proportion.proportions_ztest.html) function from the `statsmodels.stats.proportion` module:

In [None]:
from statsmodels.stats.proportion import proportions_ztest as pztest

In [None]:
x = np.array([74,100])
n = np.array([152,266])

zstat, pvalue = pztest(x, n)
print("Two-sided z-test for proportions: \n","z =",zstat,", pvalue =",pvalue)
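
To see roughly what the test computes, here is a hand calculation using only the standard `math` module. It is a sketch that assumes the usual pooled-proportion form of the two-sample z-test, which should agree with the library result when `proportions_ztest` is called with its default settings. The variable names are illustrative.

```python
import math

# Counts and sample sizes copied from the z-test example above
successes = (74, 100)
trials = (152, 266)

p1 = successes[0] / trials[0]
p2 = successes[1] / trials[1]

# Pooled proportion under the null hypothesis that both proportions are equal
pooled = sum(successes) / sum(trials)
std_error = math.sqrt(pooled * (1 - pooled) * (1 / trials[0] + 1 / trials[1]))

z_manual = (p1 - p2) / std_error

# Two-sided p-value from the standard normal tails:
# 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2))
p_manual = math.erfc(abs(z_manual) / math.sqrt(2))

print('z =', z_manual, ', pvalue =', p_manual)
```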

Now let's look at the [normal distribution](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html) from `scipy.stats`.

In [None]:
from scipy.stats import norm
from matplotlib import pyplot as plt

Let's create 1,000 points between -10 and 10

In [None]:
x = np.linspace(-10, 10, 1000) # linspace() returns evenly-spaced numbers over a specified interval
x[0:10], x[-10:]

Let's get the pdf of a normal distribution with a mean of 1 and standard deviation 3, and plot it using the grid points computed before:

In [None]:
pdf_x = norm.pdf(x, 1, 3)
plt.plot(x, pdf_x)
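
For reference, the normal density can also be written out directly with numpy using the formula $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$. The sketch below uses the same mean (1) and standard deviation (3) as above and checks that the area under the curve on the grid is close to 1. The variable names are illustrative.

```python
import numpy as np

mean, std_dev = 1.0, 3.0
points = np.linspace(-10, 10, 1000)

# Normal density evaluated at every grid point (vectorized, no loop needed)
pdf_manual = np.exp(-(points - mean) ** 2 / (2 * std_dev ** 2)) / (std_dev * np.sqrt(2 * np.pi))

# Riemann-sum approximation of the area under the curve on [-10, 10]
spacing = points[1] - points[0]
area = np.sum(pdf_manual) * spacing
print(area)   # slightly below 1, since the tails beyond the grid are cut off

# The density peaks at the grid point nearest the mean
peak_location = points[np.argmax(pdf_manual)]
print(peak_location)
```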