# Python Basics

### MSRI-UP 2023

Authors:  Ilani Sai Axelrod-Freed, Ariel Cintron-Arias and  Jose Perea 

Date: 06/12/2023

---

**To run a cell**, press Control+Enter (Command+Enter on a Mac) or click Run at the top of the page. To run a cell and advance to the next cell, press Shift+Enter.

**To add a new cell**, click Insert at the top of the page and then add a cell above or below your current one.

---

### Markdown

Markdown is used to display text that will not run as code. It can display headers, plain text and simple LaTeX: $$ \tan(\theta) = \frac{\sin(\theta)}{\cos(\theta)} $$

To put a cell in Markdown mode, click the box at the top that says "Code" (circled in red in the picture below). This opens a drop-down menu of cell types, from which you should select "Markdown".

![1E8C9181-FD75-41B3-99E0-FC3BBF9925B5.jpeg](attachment:1E8C9181-FD75-41B3-99E0-FC3BBF9925B5.jpeg)

To make headers, put # signs at the beginning of the line in markdown mode. The more #'s you use, the smaller the header will be. Make sure to put a space between the last # and your first word.


### Print

The `print` command displays its input. Here the input is a string, which is specified by characters
enclosed between single or double quotation marks.

In [None]:
string = 'Hello World!'
print(string)

The `upper()` method converts all lowercase characters in a string into uppercase characters and returns it.

In [None]:
print(string.upper())

The `lower()` method converts all uppercase characters in a string into lowercase characters and returns it.

In [None]:
print(string.lower())

### Comments

The hash symbol, `#`, is reserved in Python for comments: everything from `#` to the end of the line is ignored when the code runs.

In [None]:
# The following code prints out the phrase 'hello world!'
print('hello world!')
print('# is not a comment in this case, since it is part of a string (specified by quotations).')

### Numbers and Operations

Here are some of the types of numbers handled in Python.
* Integers indicated by int
* Floating point numbers indicated by float
* Complex numbers `a + 1j*b`, for floats `a` and `b`
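As a small illustration of the three number types listed above, the sketch below builds one value of each kind and inspects it with `type`. The variable names `n`, `x` and `z` are made up for this example.

```python
# One value of each number type (names are illustrative only)
n = 7            # int
x = 7.5          # float
z = 3 + 1j*4     # complex number with real part 3 and imaginary part 4

print(type(n), type(x), type(z))
print(z.real, z.imag)   # the real and imaginary parts are stored as floats
print(abs(z))           # modulus of z, here 5.0
```
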


In [2]:
# Examples of number types
a = 5 + 8
print("Sum of int numbers is", a, 'and its type is', type(a))

b = 5 + 2.3
print("Sum of int and float is {} and its type is {}".format(b, type(b)))

Sum of int numbers is 13 and its type is <class 'int'>
Sum of int and float is 7.3 and its type is <class 'float'>


In [None]:
# Some useful operations:
a = 3*2     # multiplication
b = 6/4     # division (the result is always a float, here 1.5)
c = 4**3    # exponent, 4 cubed
d = 17%4    # mod (remainder), i.e., 17 mod 4 = 1

In [None]:
print(b)   # Prints the value we assigned to the variable b above

A good source for looking up many basic Python operations is https://www.tutorialspoint.com/python/python_basic_operators.htm

**Important**: The `print(b)` cell above uses a variable defined in an earlier cell. Make sure the cell that defines `b` has run before running the `print(b)` command.
**If you have code in a cell that calls on information from a previous cell, make sure the previous cell has run first, or it will not work.**

### Lists, Tuples and Dictionaries

Python has several built-in data types for collections, including lists, tuples, and dictionaries. Arrays are provided by the NumPy module, covered later in this notebook.

#### Lists

A list is created by enclosing the elements within square brackets, separated by commas. A list can have any number of elements, which may have different types (integer, float, string, etc.).

In [None]:
my_list = []
first_list = [3,5,7,10]
second_list = [1, 'python', 3.5]

In [None]:
# Nest multiple lists
nested_list = [first_list,second_list, my_list]
nested_list

In [None]:
# Combine multiple lists
combined_list = first_list + second_list # + operation on lists is concatenation
combined_list

In [None]:
# The elements of a list are indexed starting at 0
combined_list[3]  # Gives the element with index 3 (the 4th element in the list)

In [None]:
# You can slice a list
combined_list[1:4]   # A new list with the elements at indices 1 through 3 (index 4 is excluded)

In [None]:
combined_list[:6]    # List with the elements at indices 0 through 5 of the original list

In [None]:
combined_list[2:]    #list with elements 2 to end of original list
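A slice `my_list[start:stop]` includes the element at index `start` and excludes the one at index `stop`, so it contains `stop - start` elements. The sketch below checks this on a small list; `letters` is a made-up list used only for illustration.

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']   # illustrative list, not from the cells above

print(letters[1:4])        # indices 1, 2, 3 -> ['b', 'c', 'd']
print(letters[:2])         # indices 0, 1    -> ['a', 'b']
print(letters[-1])         # negative indices count from the end -> 'f'
print(letters[::2])        # every second element -> ['a', 'c', 'e']
print(len(letters[1:4]))   # stop - start = 3
```
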

In [None]:
# Append a new entry to the list
combined_list.append(600)
combined_list

In [None]:
# Remove the last entry from the list
combined_list.pop()

In [None]:
combined_list

In [None]:
# Length of list
len(combined_list)  #gives number of elements in the list

In [None]:
# Iterate the list by deploying a for-loop
for item in combined_list:
    print(item)

#### Tuples

A tuple is similar to a list, but it uses parentheses `( )` instead of square brackets. The main difference is that a tuple is immutable (its elements cannot be changed, added or removed after it is created), while a list is mutable.

In [None]:
this_tuple = ('v','w','x','y','z') # a 5-tuple of strings

print(this_tuple)
print(this_tuple[1:4])  
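To see what immutability means in practice, the sketch below tries the same assignment on a list and on a tuple. The names `numbers` and `point` are made up for this example.

```python
numbers = [1, 2, 3]   # a list (mutable)
point = (1, 2, 3)     # a tuple (immutable)

numbers[0] = 100      # allowed: lists can be changed in place

try:
    point[0] = 100    # not allowed: raises a TypeError
except TypeError as err:
    print("Tuples cannot be modified:", err)

print(numbers)        # the list changed
print(point)          # the tuple did not
```
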

#### Dictionaries

A dictionary consists of a collection of key-value pairs. Each key-value pair maps the key to its associated value.

In [None]:
desk_location = {'jack': 123, 'joe': 234, 'harry': 543}
desk_location['jack']
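A few other common dictionary operations are sketched below: adding and updating pairs, looking up a key that may be missing, and looping over the pairs. The dictionary `office` and its contents are made up for illustration.

```python
# Illustrative dictionary; the names and numbers are made up
office = {'ana': 101, 'ben': 202}

office['cam'] = 303          # add a new key-value pair
office['ana'] = 111          # update the value of an existing key

print(office.get('dan'))     # .get returns None for a missing key instead of raising KeyError
print('ben' in office)       # membership test on the keys

for name, room in office.items():   # loop over the key-value pairs
    print(name, room)
```
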

### Functions

A function is a block of code that runs when it is called. You can pass data, or parameters, into the function. A function can return data as a result. 

In Python, a function is defined with the keyword `def`, followed by the function name (one word without blank spaces), a pair of parentheses containing any parameters, and then a colon `:`. The body of the function must be indented below the `def` line.

In [None]:
# Defining a function
def new_funct():
    print("A simple function")

# Calling the function
new_funct()

In [None]:
# Sample function with parameters

def param_funct(first_name):
    print("Employee name is {}.".format(first_name))
    
param_funct("Harry")
param_funct("Meghan")
param_funct("Luz")
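The functions above only print. A function can also hand a value back to the caller with `return`, so that the result can be stored or used in further calculations. The sketch below assumes a made-up function `rectangle_area`.

```python
def rectangle_area(width, height):
    """Return the area of a rectangle (illustrative function)."""
    return width * height

area = rectangle_area(3, 4)        # the returned value is stored in a variable
print(area)                        # 12
print(rectangle_area(2.5, 2) + 1)  # the returned value can be used in an expression: 6.0
```
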

### Python Modules

A module is a file consisting of Python code, such as function and variable definitions, that other programs can load (similar to a library in R). Collections of modules are commonly referred to as packages or libraries.

Some Python modules must be installed before they can be used.

After installation, a module is loaded with the `import` statement, and an alias (nickname) can be defined with `as`.
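As a brief sketch, here are three common forms of `import`, shown with the standard-library `math` module (which needs no installation). The alias `m` is an arbitrary choice for this example.

```python
import math              # import the whole module; use names as math.<name>
import math as m         # same module, under the alias m
from math import sqrt    # import a single name directly

print(math.sqrt(16))     # 4.0
print(m.sqrt(16))        # 4.0
print(sqrt(16))          # 4.0
```
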

### Math Module

The math library module comes with many useful mathematical constants and functions.

In [None]:
import math  #Math comes installed already, so we just need to import it to use it

In [None]:
e = math.e           # the number e
pi= math.pi          # the number pi
a = math.sin(pi/3)   # the function sin (in radians)
b = math.sqrt(5)     # square root

a

More functions and constants offered by the math module can be found at https://docs.python.org/3/library/math.html

### NumPy

NumPy stands for "Numerical Python". This module contains several tools for scientific computing, including:
* N-dimensional array objects
* Sophisticated (broadcasting) functions
* Tools for integrating C/C++ and Fortran code
* Useful linear algebra, Fourier transform, and random number capabilities

Documentation
* https://numpy.org/doc/stable/reference/index.html
* https://www.w3schools.com/python/numpy/default.asp

In [None]:
# Import NumPy module. 
# Comes installed with Anaconda distribution, otherwise run !pip install numpy 

import numpy as np # here np is an alias or nickname

**Array creation** (Useful for data plots later)

In [None]:
#np.arange
a= 5
b = 15
step = 2
x  = np.arange(a,b,step)
#np.arange gives an array of increasing values with lowest value a (inclusive), highest value b (exclusive)
#separated by the step value

x

In [None]:
#np.linspace
a= 5
b = 15
s = 8
x = np.linspace(a,b,s)
#np.linspace gives an array of evenly spaced numbers of increasing value 
#with lowest value a (inclusive), highest value b (also inclusive)
#with s giving the number of terms in the array

x
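The difference between the two functions is easiest to see side by side: `np.arange` is given a step and excludes the endpoint, while `np.linspace` is given the number of points and includes it. The values below are chosen only for illustration.

```python
import numpy as np

by_step = np.arange(0, 1, 0.25)    # step of 0.25, endpoint 1 excluded
by_count = np.linspace(0, 1, 5)    # 5 evenly spaced points, endpoint 1 included

print(by_step)    # [0.   0.25 0.5  0.75]
print(by_count)   # [0.   0.25 0.5  0.75 1.  ]
```
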

In [None]:
x = np.random.rand(2,4)
## gives an array of the specified dimensions filled with random numbers drawn uniformly from [0, 1)

x

In [None]:
# Create arrays
a = np.array([[1,2,4],[7,2,5]]) # Create a 2-by-3 array (matrix)
b = np.arange(15).reshape(3, 5) # Create array with range 0-14 in 3 by 5 dimension
c = np.zeros((3,5)) # Create array with zeroes
d = np.ones( (2,3,4), dtype=np.int16 ) # Create array with ones and defining data types

In [None]:
a

In [None]:
b.shape # Array dimension

In [None]:
len(c) # Length of the first dimension of the array (number of rows)


In [None]:
d.ndim # Number of array dimensions


In [None]:
b.size # Number of array elements


In [None]:
c.dtype # Data type of array elements


In [None]:
d.dtype.name # Name of data type


In [None]:
d.astype(float) # Convert an array type to a different type

#### Basic Operations with Arrays

In [None]:
# Create array
a = np.array([[1,2,4],[7,2,5]])
b = np.arange(15).reshape(3, 5) # Create array with range 0-14 in 3 by 5 dimension
c = np.zeros((3,5)) # Create array with zeroes
d = np.ones( (2,3,4), dtype=np.int16 ) # Create array with ones and defining data types
e = np.ones((3,5))

In [None]:
np.add(b,c) # Addition, the same as doing b + c, try it!

In [None]:
np.subtract(b,c) # Subtraction, the same as doing b - c, try it!

In [None]:
np.divide(b,e) # Division - elementwise, same as doing b/e, try it!

In [None]:
np.multiply(b,e) # Multiplication - elementwise, same as b*e, try it!

In [None]:
np.matmul(a,b) #matrix multiplication, same as a@b, try it!

In [None]:
np.array_equal(b,c) # Comparison - arraywise (returns true if they are the same, false if not)
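The elementwise operations above also work when the two operands have different but compatible shapes. NumPy then "broadcasts" the smaller operand across the larger one. A minimal sketch, with made-up arrays `m` and `row`:

```python
import numpy as np

m = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
row = np.array([10, 20, 30])     # one row with 3 entries

print(m + 5)     # the scalar 5 is added to every element
print(m + row)   # row is added to each row of m
```

Here `m + 5` gives `[[5, 6, 7], [8, 9, 10]]` and `m + row` gives `[[10, 21, 32], [13, 24, 35]]`.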

#### Aggregate Functions

In [None]:
# Create array
a = np.array([[1,2,4],[7,2,5]])
a

In [None]:
a.sum() # Array-wise sum

In [None]:
a.min() # Array-wise min value

In [None]:
a.mean() # Array-wise mean

In [None]:
a.max(axis=0) # Max value of each column (axis=0 runs down the rows)

In [None]:
np.std(a) # Standard deviation
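Most aggregate functions accept an `axis` argument. With `axis=0` the operation runs down the rows and returns one value per column, and with `axis=1` it runs across the columns and returns one value per row. A short sketch using the same 2-by-3 array:

```python
import numpy as np

a = np.array([[1, 2, 4], [7, 2, 5]])

print(a.max(axis=0))   # one value per column: [7 2 5]
print(a.max(axis=1))   # one value per row:    [4 7]
print(a.sum(axis=0))   # column sums: [8 4 9]
print(a.sum(axis=1))   # row sums:    [ 7 14]
```
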

#### Subsetting, Slicing, and Indexing

In [None]:
# Create array
b = np.arange(15).reshape(3, 5) # Create array with range 0-14 in 3 by 5 dimension
b

In [None]:
b[1,2] # Select element of row 1 and column 2

In [None]:
b[0:2] # Select the rows at index 0 and 1

In [None]:
b[:1] # Select all items at row 0

In [None]:
b[-1:] # Select all items from last row

In [None]:
b[b<2] # Select elements from 'b' that are less than 2

#### Array manipulation

In [None]:
# Create array
a = np.array([[1,2,4],[7,2,5]])
b = np.arange(15).reshape(3, 5) # Create array with range 0-14 in 3 by 5 dimension
c = np.zeros((3,5)) # Create array with zeroes
d = np.ones( (2,3,4), dtype=np.int16 ) # Create array with ones and defining data types
e = np.ones((3,5))


In [None]:
np.transpose(a) # Transpose array 'a'

In [None]:
b.ravel() # Flatten the array

In [None]:
b.reshape(5,-1) # Reshape to 5 rows without changing the data; -1 lets NumPy infer the number of columns

In [None]:
np.append(b,c) # Append items to the array; with no axis given, both arrays are flattened first

In [None]:
np.concatenate((b,e), axis=0) # Concatenate arrays

In [None]:
np.vsplit(b,3) # Split the array vertically (by rows) into 3 equal sub-arrays

In [None]:
np.hsplit(b,5) # Split the array horizontally (by columns) into 5 equal sub-arrays

### Basic Plots

In [None]:
# Import the pyplot module from the matplotlib plotting library
# matplotlib comes included with Anaconda, if not, run !pip install matplotlib

from matplotlib import pyplot as plt

In [None]:
%matplotlib inline
# Used to make the plots appear inside the notebook, directly below the cell that creates them, instead of in a separate window

In [None]:
# Creating synthetic data
sample_data = np.random.normal(0, 0.1, 1000) # np is the alias for numpy
plt.hist(sample_data);   # remove the ; and see what happens

In [None]:
x=[1,2,4,5,6]
y=[1,4,16,25,36]
plt.plot(x, y);
# the first list gives the x coordinates and the second list gives the y coordinates of each point

In [None]:
n_data = 200  # number of data points
x = 2*np.pi*np.random.rand(n_data)  # Makes a random array with n_data elements, scaled by 2pi
y = np.sin(x)                       # a new array which takes sin of each element of the array x
plt.scatter(x, y);     


### Pandas (Optional)

Pandas is a Python module for data manipulation and analysis. 

Documentation:
* https://pandas.pydata.org/docs/
* https://www.w3schools.com/python/pandas/default.asp


In [None]:
## Install pandas
## You only need to run pip install once. Uncomment, run and then comment.
#!pip install pandas


In [None]:
# Import NumPy and Pandas modules
import numpy as np
import pandas as pd

In [None]:
# Sample dataframe df
df = pd.DataFrame({'num_legs': [2, 4, np.nan, 0],
                   'num_wings': [2, 0, 0, 0],
                   'num_specimen_seen': [10, np.nan, 1, 8]},
                   index=['falcon', 'dog', 'spider', 'fish'])


In [None]:
df # Display dataframe df

In [None]:
df.head(2) # View top data

In [None]:
df.tail(2) # View bottom data

In [None]:
df.index # Display index column

In [None]:
df.dtypes # Inspect datatypes

In [None]:
df.describe() # Display quick statistics summary of data

In [None]:
df.info()

In [None]:
pd.isna(df)  # To get boolean mask where data is missing

In [None]:
df

Pandas primarily uses the value `np.nan` to represent missing data. It is not included in computations by default.
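The sketch below illustrates what "not included in computations" means, using a small made-up Series `s`. By default, `sum` and `mean` skip the missing entry; passing `skipna=False` makes the missing value propagate instead.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])   # illustrative data with one missing value

print(s.sum())               # 4.0, the NaN is skipped
print(s.mean())              # 2.0, the average of the two non-missing values
print(s.sum(skipna=False))   # nan, the missing value propagates
print(s.count())             # 2, the number of non-missing entries
```
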

In [None]:
df.dropna(how='any') # Drop any rows that have missing data

In [None]:
df.dropna(how='any', axis=1) # Drop any columns that have missing data

In [None]:
df5 = df.fillna(value=5) # Fill missing data with value 5
df5

In [None]:
pd.isna(df5) # To get boolean mask where data is missing