# Python Review
Welcome! This document is meant to complement what you learned from the Khan Academy videos, not to replace it. Although the notebook is written to be as clear as possible, some previous programming experience will make it easier to follow. Before starting with the introduction to Python, we'll go over some of the basic functionality of Jupyter, which will help us during our programming journey this week.

## What is Python?
Python is an interpreted, high-level general-purpose programming language.

Interpreted: Python code is run directly by the interpreter rather than being compiled into machine code first and then executed.

High-level: This describes how far the syntax is abstracted from machine language, or how close it is to natural language (our language). A high-level language has user-friendly syntax, and even among high-level languages Python is known for being especially readable.

General-purpose: It can be used in many applications, from website development to mobile applications (although being general-purpose doesn't mean it SHOULD be used for everything).

In short, Python is a programming language that is easy to learn (due to its friendly syntax), simple to run (we have an interpreter) and versatile enough to be used in many fields and applications (general purpose). Currently, this language is among the top 3 most popular in the world. It owes this position to its high-level syntax and its open-source status, meaning that everyone can contribute to Python by making new packages to tackle different applications. These range from packages for solving differential equations and plotting all the way to packages designed specifically for stock analysis or the simulation of chaotic systems.

## What is Jupyter?

Usually when writing code we use an IDE (Integrated Development Environment) or a text editor. These range from dedicated Python IDEs like Spyder and PyCharm to general-purpose editors like Visual Studio Code, Atom and Notepad. These are generally good options for creating projects and for coding in general; however, they don't lend themselves to visualization and long explanations of the code, which is where Jupyter comes in.

Jupyter Notebook is an open-source project (the plan is for it to remain free) built around visualizing, interacting with and distributing code in a simple manner. The name Jupyter comes both from the first three languages to be supported (JUlia, PYThon, R) and from the notebooks Galileo wrote about Jupiter. It gives programmers a way to run code sequentially in cells and add text in between to explain said code. In short, it's an environment that runs Python (or any kernel of our choosing) continuously, so variables defined in one cell remain available in the next.

Jupyter works through the creation and modification of cells. What a cell does depends on its type, which can be Markdown, Raw Text, Code or Heading (Heading cells have since been discontinued and are now part of Markdown). If a cell is in the Code format, its content is treated just like code in an IDE (the language is determined by the kernel that's currently running). Cells can be inserted or removed using the buttons in the upper part of the notebook, and to run the selected cell we just need to press Ctrl+Enter.

If a cell is in the Markdown format, we're free to write text as we please. Markdown also allows for basic text formatting along with an assortment of other neat and useful features (headings, bullet lists, numbered lists, hyperlinks, LaTeX support and image embedding). This allows us not only to write explanations of certain code snippets, but also to format them nicely and make the notebook more engaging for the reader.

To save a notebook, we first give it a name in the upper left corner of the notebook and then press Ctrl+S.

## Now we start the review. Variables and basic arithmetic.
A variable is a way to store and reference data in a program. This data can vary greatly from variable to variable, and it's usually essential for the functioning of the program. For the user of the program that's pretty much it, but for the programmer a variable also gives the data a label, which makes the code more readable (this is crucial not just in Python but in any language).

In [1]:
# Anything written after the hash sign (#) is a comment.
'''
Text inside triple quotes is a multi-line string;
when it is not assigned to anything, it acts as a multi-line comment.
'''

# Define and set a variable in one line: name, equals sign, then value.
number = 5

# The print command displays the value of its argument after the cell runs.
print(number)

# If we reassign the value of the variable, it "forgets" its previous value
number = 4
print(number)

# Python variables can hold values of any type. Confusingly, we can set number to a value that is not a number.
number = 'not a number!'
print(number)

# Generally it's good to give variables clearer names.
# Strings can be defined with single quotes (as above) or double quotes (like below).
text = "Hello!"
print(text)

5
4
not a number!
Hello!


### Whitespacing
Python, like many programming languages, has an official [style guide](https://www.python.org/dev/peps/pep-0008/) on how code should be structured, how variables should be named and how whitespace should be used. This style guide sets the standard for how code should be written and allows for consistency within the Python community. Whitespace is any character that shows up as empty space, such as spaces, tabs and line breaks. Well-placed whitespace makes code much more readable; too much of it, however, makes code harder to follow. Therefore, we should focus on being consistent and following the style guide.
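As a small illustration of the whitespace advice, the sketch below writes the same calculation twice, once cramped and once spaced in the way the style guide suggests. The variable names and values (`price`, `tax_rate`, `total_cramped`, `total_spaced`) are made up for this example.

```python
# Hypothetical values, chosen only to illustrate spacing.
price = 20.0
tax_rate = 0.25

# This works, but it is hard to read: no spaces around operators or after commas.
total_cramped=price+price*tax_rate
print('total =',total_cramped)

# Closer to the style guide: one space around = and the arithmetic operators,
# and one space after each comma.
total_spaced = price + price * tax_rate
print('total =', total_spaced)
```

Both versions compute the same value; the only difference is how easy they are to read.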

In [2]:
### Basic Arithmetic operations and printing along the way. 

# Save some variables to do basic math with. 
a = 21.4
b = 10.1
print('a =', a)
print('b =', b)

# Empty print command prints a new line
print() 

# Addition
c = a + b
print('a + b =', c)

# Subtraction
c = a - b
print('a - b =', c)

# Subtraction + rounding to 3 decimal places
# Notice that you can combine an arithmetic operation (subtraction) with a function call (rounding)
# in one line of code. 
c = round(a - b, 3)
print('a - b =', c)

# Multiplication
c = a*b
print('a * b =', c)

# Division
c = a/b
print('a / b =', c)

# Exponents
print('a^2 =', a**2)

# Modulus
print('a % 6 =', a % 6)

a = 21.4
b = 10.1

a + b = 31.5
a - b = 11.299999999999999
a - b = 11.3
a * b = 216.14
a / b = 2.118811881188119
a^2 = 457.9599999999999
a % 6 = 3.3999999999999986
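The long decimals in the output above (for example `11.299999999999999`) come from the way floats are stored in binary, not from a mistake in the code. A minimal sketch of two common ways to deal with this, using the built-in `round` and the standard-library function `math.isclose`:

```python
import math

a = 21.4
b = 10.1
difference = a - b

# Floats are stored in binary, so some decimal values are only approximated.
print(difference)

# Option 1: round the result before displaying it.
print(round(difference, 3))

# Option 2: compare floats with a tolerance instead of using ==.
print(difference == 11.3)
print(math.isclose(difference, 11.3))
```

The `==` comparison fails here even though the two numbers agree to many decimal places, which is why a tolerance-based comparison is usually the safer choice for floats.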


### Exercises

Calculate the solution to these problems using Python and verify the answers (round each answer to one decimal place where needed).
\begin{align}
(5 \times (10+2) \div 30)^2 &= 4 \\
(5-25)^2 + 3 - 2 \times 9 &= 385 \\
(42.3 \times 17.2)^{1.5} - 66.5 &\approx 19558.2
\end{align}

In [45]:
print((5 * (10 + 2) / 30)**2)
print((5 - 25)**2 + 3 - 2*9)
print((42.3 * 17.2)**1.5 - 66.5)

4.0
385
19558.20880948851


## A more complete perspective on variables.

We previously defined a variable as something used to store and reference information. While that is true, it's often useful to think of a variable as something more familiar, like an object (if you have seen object-oriented programming, this is the pre-pre-introduction and you can skip this section).

An object, in the simplest sense, is something that is labeled, has certain properties and can be manipulated in certain ways. This definition falls short for many objects, but it suffices for the "objects" we see in programming. Every variable carries a label that depends on the type of data it contains. To see this label we use Python's type function, here with 3 variables of different types.

In [38]:
# Let's start by defining 3 different variables, a value is assigned to a variable using the equal sign.
x = 3
y = 4.3
z = "Hello"

# The "type" command will tell you what the type of a variable is. Usually you will know already because we set it. 
print('type(x) =', type(x))
print('type(y) =', type(y))
print('type(z) =', type(z))

type(x) = <class 'int'>
type(y) = <class 'float'>
type(z) = <class 'str'>


In [66]:
# Notice that the types of objects affect what we can do with them
# Can add x + y even though they have different types - both are numbers. That makes sense.
my_val = x + y
print('x + y =', my_val)

#What happens if we add two strings together?
print("Add" + "me")

#What if we multiply a string and an integer?
print("Add"*2)

# Can't add x + z because it is not clear what it means to add a string to an integer. 
my_val = x + z

# You will often find that operations have different effects on different data types! 

x + y = 7.3
Addme
AddAdd


TypeError: unsupported operand type(s) for +: 'int' and 'str'
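One way around the error above is to convert one of the values explicitly, so that Python knows which operation we mean. A minimal sketch using the built-in `str` and `int` functions (the names `message`, `text_number` and `total` are made up for this example):

```python
x = 3
z = "Hello"

# Convert the integer to a string when we want to join text.
message = z + " " + str(x)
print(message)

# Convert a numeric string to an integer when we want to do arithmetic.
text_number = "4"   # hypothetical value, e.g. something typed by a user
total = x + int(text_number)
print(total)
```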

In [71]:
#Strings have another interesting property which allows us to access certain sections of the string via indexing.
S = "My string."

#Indexing is done using square brackets beside the string of interest.
print(S[0])

#We can also choose a range of indexes 
print(S[0:7])

#Choose a range with steps of a certain size
print(S[0:7:2])

#And even use negative indexes
print(S[-3:])

M
My stri
M ti
ng.
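The slices above all follow the same pattern, `S[start:stop:step]`, where the start index is included and the stop index is not. A short sketch of a few more cases (the names `first_two`, `last_char` and `reversed_S` are made up for this example):

```python
S = "My string."

# General form: S[start:stop:step]; start is included, stop is excluded.
first_two = S[0:2]
print(first_two)

# Leaving out start or stop means "from the beginning" or "to the end".
print(S[:2])
print(S[3:])

# A negative index counts from the end; -1 is the last character.
last_char = S[-1]
print(last_char)

# A negative step walks backwards, so this reverses the string.
reversed_S = S[::-1]
print(reversed_S)
```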


Variables often have commands ("functions" or "methods") associated with them. Which methods are available depends on the class a variable belongs to; the list and string classes are clear examples of this. The methods of an object can be listed with the following function.

In [40]:
# You can call dir() on your object to list all the available methods
print(dir(z))
print()

# z is a string and has a function "endswith" - lets try it.
print('z =', z)
print('z.endswith("bye") =', z.endswith("bye"))
print('z.endswith("llo") =', z.endswith("llo"))

# Side note: you can see above that it is possible to embed double quotes in single quotes, 
# e.g. 'my nickname is "mike"'. Single quotes inside single quotes only work when escaped
# with a backslash, e.g. 'it\'s'.

['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isascii', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']

z = Hello
z.endswith("bye") = False
z.endswith("llo") = True


In [41]:
# You can use "tab-completion" on your object to find out what it can do.
# Try typing z.s and then hitting the "tab" key. You will find a "startswith" method.
print('z =', z)
print('z.startswith("Goo") =', z.startswith("Goo"))
print('z.startswith("He") =', z.startswith("He"))

z = Hello
z.startswith("Goo") = False
z.startswith("He") = True


In [67]:
# If you want to know more about what a particular function does, you can use the 
# question mark to ask for documentation. 
names = "alejandro, avi, ada, adrian"

# This line will bring up the documentation for the "split" method of a string.
# It will show up in a little window in the bottom and show you how to call the function
names.split?

# It turns out split() breaks a string apart at each occurrence of "sep" and gives us a list.
split_names = names.split(', ')
print("names.split(', ') =", split_names)

# Notice that the result is a new data structure that holds the resulting strings separately.

names.split(', ') = ['alejandro', 'avi', 'ada', 'adrian']
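The string method `join` goes in the opposite direction to `split`: it glues a list of strings back into one string. A minimal sketch (the names `name_list` and `rejoined` are made up for this example):

```python
names = "alejandro, avi, ada, adrian"

# split() turns one string into a list of strings.
name_list = names.split(', ')

# join() does the reverse: the string it is called on becomes the separator.
rejoined = ', '.join(name_list)
print(rejoined)

# A different separator gives a different string from the same list.
print(' | '.join(name_list))
```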


### Lists and For loops.
Lists are one of the most useful data types in basic Python. A list is a data structure that can contain many elements of different data types, including integers, floats, strings and even other lists. This gives us an easy and reliable way to store and access multiple values using only one variable (thus avoiding getting lost among many separate variables).

In [36]:
# l is a list
l = [3, 4.0, "Marco"]
print("l =", l)

print('type(l) =', type(l))

# We can access the elements of a list using square brackets.
print("The first element of this list is {}".format(l[0]))
print("This element is {}".format(type(l[0])))

# Although l as a whole is a list, its elements still preserve their own types.
# What if we try to add two lists of numbers though?
n_list1 = [1, 2, 3]
n_list2 = [4, 5, 6]
l2 = n_list1 + n_list2
print(l2)

l = [3, 4.0, 'Marco']
type(l) = <class 'list'>
The first element of this list is 3
This element is <class 'int'>
[1, 2, 3, 4, 5, 6]
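Since `+` joins two lists instead of adding their numbers, adding element by element needs a loop. A minimal sketch using a `for` loop and the list method `append` (the name `sums` is made up for this example):

```python
n_list1 = [1, 2, 3]
n_list2 = [4, 5, 6]

# Start with an empty list and fill it one element at a time.
sums = []
for i in range(len(n_list1)):
    sums.append(n_list1[i] + n_list2[i])
print(sums)

# Lists can be changed in place: here we replace the first element.
sums[0] = 100
print(sums)

# len() gives the number of elements in the list.
print(len(sums))
```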


## Logic and control-flow
The real value of variables is that they can take different values each time you run your program. For example, you might read a variable's value from an Excel file, or it might even be a random number that we generate. We often want to do different things depending on the value of a variable.

In [None]:
# Python has a "module" called random that generates random numbers for us. 
import random 

# x is a random integer between 0 and 99, inclusive.
# Notice that there are 100 such numbers since we are including 0.
x = random.randint(0, 99)

# One in 10 times, x should be greater than or equal to 90 
if x >= 90: 
    print('x >= 90, you win!')
# Similarly, x should be less than 20 one in 5 times. 
elif x < 20: 
    print('x < 20, you lose!')
else: 
    print('Who cares?')
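
The "one in 10" and "one in 5" claims in the comments can be checked without any randomness by running the same conditions over every value that `random.randint(0, 99)` can return. A small sketch (the counters `wins`, `losses` and `neither` are made up for this example):

```python
wins = 0
losses = 0
neither = 0

# Go through every value that random.randint(0, 99) can produce.
for x in range(100):
    if x >= 90:
        wins = wins + 1
    elif x < 20:
        losses = losses + 1
    else:
        neither = neither + 1

print('win:', wins, 'lose:', losses, 'neither:', neither)
```

Out of the 100 possible values, 10 fall in the first branch and 20 in the second, which matches the comments.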

In [None]:
# We can do a similar logical computation, but take input from the user.
x = input("Give me a positive integer.")

# The input() function always returns a string, so we must convert to an integer.
x = int(x)

if x > 0:
    print("Great! A positive integer.")
else:
    print(x, "is not a positive integer")
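
The conversion with `int()` fails with a `ValueError` if the text is not a whole number. A hedged sketch of one way to handle that with `try`/`except`, which is not covered elsewhere in these notes; `raw_text` is a made-up stand-in for whatever `input()` would return:

```python
# Hypothetical text, standing in for what input() would return.
raw_text = "twelve"

try:
    value = int(raw_text)
except ValueError:
    # int() raises ValueError when the text is not a whole number.
    value = None

if value is None:
    print(raw_text, "is not an integer")
elif value > 0:
    print("Great! A positive integer.")
else:
    print(value, "is not a positive integer")
```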

In [None]:
# Here's a simple example with control flow AND looping.
# We have a string that is a list of names of people in our group separated by commas.
# We want to print them all on separate lines after capitalizing them.

group_names_string = "alejandro, avi, ada, adrian"

# Won't work if we forget commas!
# group_names_string = "alejandro avi ada adrian"

# Same problem if there is only one name
# group_names_string = "alejandro"

# Use "split()" to convert the names to a list.
group_names_list = group_names_string.split(', ')
if len(group_names_list) <= 1:
    # One person is not a group  
    print('We need 2 or more names to make a group!')
else: 
    # We have enough names, so let's proceed. 
    print('Our group members are:')
    # This is a simple loop. 
    # In each round, the variable name will be set to the next name in group_names_list
    for name in group_names_list:
        capitalized_name = name.title()
        print(capitalized_name)

In [None]:
### Practice Exercise solutions
x = input("Write a word! ")
if len(x) >= 10:
    print("{} is a long word!".format(x))
else:
    print("{} is a regular word.".format(x))

# input() returns strings, so convert to integers before comparing;
# otherwise "10" < "9" because strings are compared character by character.
x1 = int(input("Write the first integer. "))
x2 = int(input("Write the second integer. "))
x3 = int(input("Write the third integer. "))
if x1 > x2:
    print("{} is greater than {}".format(x1, x2))
elif x1 < x2:
    print("{} is greater than {}".format(x2, x1))
else:
    print("{} and {} are equal".format(x2, x1))
### Continue the solution once Avi checks this.

## Functions

In many cases we can write code as a script and decide the input parameters beforehand; however, this usually ends up working against us when we want to share the code with other people or reuse it in other projects. In that case it's very useful to define a function that can be used at any point of the document. By definition, a function is an object made of several operations that takes inputs and returns outputs; the number of outputs doesn't necessarily match the number of inputs. In Python, this corresponds to a reusable piece of code that performs one specific action.

Why do functions matter? They matter because they help make the code easier to read and to share.

In [7]:
### Let's start off by making a very simple function.
def printnum(num1):
    """
    This function prints a number
    """
    print(num1)

In [8]:
printnum(3)

3


So what can we say about this function? First of all, we notice that def is the keyword used to define functions, the name of the function is shown in blue and the arguments go inside parentheses. Just as we use : in for loops and conditional statements, we also use it at the end of the line where we declare the function, to start the body of the function. The triple-quoted string (called a docstring) does serve a purpose in this case: we can now use the ? command to see that text and obtain a detailed explanation of the function.
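The ? command only works inside Jupyter or IPython. As a small sketch of what happens behind it, the docstring is stored on the function itself in the `__doc__` attribute, which plain Python can read as well:

```python
def printnum(num1):
    """
    This function prints a number
    """
    print(num1)

# The docstring is stored on the function itself.
print(printnum.__doc__)

# The function still works as before.
printnum(3)
```

The built-in `help(printnum)` shows the same text together with the function's signature.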

In [11]:
import math
printnum?

In [17]:
'''
The next step in making functions is using the return statement.
It allows the function to return a value that can be assigned to a variable and thus stored for later use.
'''

def squaring(num):
    square = num**2
    return square

x = squaring(2)

print(x)

# We can also take multiple parameters as inputs and return multiple values.

def right_triangle(a, o):
    '''
    The function is given the lengths of the adjacent and opposite sides; it returns the hypotenuse and the
    angles of the triangle in degrees.
    '''
    hypotenuse = (a**2 + o**2)**0.5
    # math.degrees converts radians to degrees exactly, instead of approximating pi as 3.14.
    angle = math.degrees(math.atan(o / a))
    angles = [angle, 90 - angle, 90]
    return hypotenuse, angles
right_triangle(3, 4)

# And we can even have preset values and optional parameters!
def velocity(dist,time=1):
    v=dist/time
    return v
velocity(dist=3)

4


3.0
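When a function returns several values, they come back together and can be unpacked into separate variables; default parameters can also be overridden when needed. The sketch below restates the two functions so that it runs on its own; it uses the standard-library helpers `math.hypot` and `math.degrees` in place of the handwritten formulas, and the names `hyp` and `angles` are made up for this example.

```python
import math

def right_triangle(a, o):
    """Return the hypotenuse and the three angles (in degrees)."""
    hypotenuse = math.hypot(a, o)
    angle = math.degrees(math.atan(o / a))
    return hypotenuse, [angle, 90 - angle, 90]

# Multiple return values come back as a tuple and can be unpacked.
hyp, angles = right_triangle(3, 4)
print(hyp)
print(angles)

def velocity(dist, time=1):
    return dist / time

# The default (time=1) is used unless we pass our own value.
print(velocity(dist=3))
print(velocity(dist=3, time=2))
```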

## Exercises!
1. Define a function that takes a list of integers and a number as inputs and returns every element of the list multiplied by that number.
2. Define a function that takes a list of strings as an input and returns a list with all of the words in lowercase.
3. Define a function that can calculate the standard deviation of a list of numbers.
4. Define a function that takes as an input a list of numbers and returns a list stating whether they are odd or even (a list of strings).
5. Define a function that can calculate both the arithmetic average of a given set of numbers and the weighted average of a given set of numbers and weights.
6. Define a function that takes as an input a list of strings and a number, then have it return whether the length of each string is longer than, shorter than or equal to that number.


## Extra topic: Lambda functions
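A lambda function is a small, unnamed function written as a single expression. The following is only a minimal sketch of the idea, with names (`square`, `words`, `by_length`) made up for illustration:

```python
# A lambda is a one-expression function; this one squares its input.
square = lambda num: num**2
print(square(4))

# Lambdas are often passed directly to other functions, e.g. as a sort key.
words = ["banana", "fig", "cherry"]
by_length = sorted(words, key=lambda word: len(word))
print(by_length)
```

For anything longer than one expression, a regular function defined with def is usually easier to read.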

### Packages and Modules
The basic implementation of Python is very powerful by itself and allows us to tackle many types of applications. However, if the task is too complex, solving it might involve creating new modules and objects, which would take quite some time. Fortunately, given that Python isn't a brand new language, people have tackled such problems before and have made their solutions available to the public via packages. A package is a collection of modules (they can be built into vanilla Python or developed by others) which together aim to solve a problem or series of problems in a given field. In this club we'll be using many packages to help us analyze images; however, not all of them are dedicated purely to imaging. Some have a broader scope and can be used in a wide array of applications.
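As a quick sketch of how modules are brought into a program, the example below shows the three common forms of the import statement. It uses only modules that ship with Python (`math`, `statistics`, `random`), so nothing extra needs to be installed; the alias `stats` and the name `roll` are made up for this example.

```python
# Import a whole module and use its contents with the module name as a prefix.
import math
print(math.sqrt(16))

# Import a module under a shorter name with "as".
import statistics as stats
print(stats.mean([1, 2, 3, 4]))

# Import only one name from a module.
from random import randint
roll = randint(1, 6)
print(roll)
```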

SciPy ecosystem: An ecosystem of open-source Python packages used in mathematics, engineering and the sciences. In short, if you want to work with Python in a scientific or engineering field you'll most likely use these packages. We're going to be using 4 of them (although we only need to import 3 of them).
1. Numpy: Numerical Python; allows for the manipulation of arrays and n-dimensional objects.
2. Pandas: Provides data structures for data analysis.
3. Matplotlib: Plotting library modeled on Matlab's plotting capabilities.
4. IPython: Interactive Python; if you're using a Jupyter Notebook, you're using IPython.

Besides these packages, we'll be using two others that are very important:
1. Sklearn: scikit-learn is one of the most commonly used packages in machine learning applications. Data splitting, model creation, model evaluation, etc.; it's all here.
2. Seaborn: A statistical plotting library built on top of Matplotlib.

### Numpy
Numpy (Numerical Python) is a package that makes numerical manipulation in Python much easier. In pure Python, multiplying every element of a list by a number means looping over the list and multiplying the elements one by one. Numpy defines new objects called arrays (which work similarly to lists) that can be added and multiplied using the same operators we use in normal arithmetic expressions. Not only that, but numpy also includes functions and constants that simplify the evaluation of mathematical functions.

In [21]:
#The import command brings the package to our workspace, the "as" keyword allows us to rename it.
import numpy as np
#We try to multiply a list
my_list = [0,1,2,3,4,5,6]
print(my_list*2)
my_array = np.array(my_list)
print(my_array*2)

[0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6]
[ 0  2  4  6  8 10 12]
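To make the comparison concrete, the sketch below doubles the same numbers twice: once with a pure-Python loop and once with a numpy array. It assumes numpy is installed; the names `doubled_list`, `doubled_array` and `roots` are made up for this example.

```python
import numpy as np

my_list = [0, 1, 2, 3, 4, 5, 6]

# Pure Python: loop over the list and multiply the elements one by one.
doubled_list = []
for value in my_list:
    doubled_list.append(value * 2)
print(doubled_list)

# Numpy: arithmetic applies to every element of the array at once.
my_array = np.array(my_list)
doubled_array = my_array * 2
print(doubled_array)

# Two arrays of the same length are combined element by element.
print(my_array + doubled_array)

# Numpy functions also work on whole arrays.
roots = np.sqrt(my_array)
print(roots)
```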


In [None]:
#We can create numpy arrays 

In [29]:
#We can also create an array using either arange or linspace
x1=np.arange(0,10,0.1)
x2=np.linspace(0,10,101)
print(x1)
print(x2)

[0.  0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.  1.1 1.2 1.3 1.4 1.5 1.6 1.7
 1.8 1.9 2.  2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 3.  3.1 3.2 3.3 3.4 3.5
 3.6 3.7 3.8 3.9 4.  4.1 4.2 4.3 4.4 4.5 4.6 4.7 4.8 4.9 5.  5.1 5.2 5.3
 5.4 5.5 5.6 5.7 5.8 5.9 6.  6.1 6.2 6.3 6.4 6.5 6.6 6.7 6.8 6.9 7.  7.1
 7.2 7.3 7.4 7.5 7.6 7.7 7.8 7.9 8.  8.1 8.2 8.3 8.4 8.5 8.6 8.7 8.8 8.9
 9.  9.1 9.2 9.3 9.4 9.5 9.6 9.7 9.8 9.9]
[ 0.   0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1.   1.1  1.2  1.3
  1.4  1.5  1.6  1.7  1.8  1.9  2.   2.1  2.2  2.3  2.4  2.5  2.6  2.7
  2.8  2.9  3.   3.1  3.2  3.3  3.4  3.5  3.6  3.7  3.8  3.9  4.   4.1
  4.2  4.3  4.4  4.5  4.6  4.7  4.8  4.9  5.   5.1  5.2  5.3  5.4  5.5
  5.6  5.7  5.8  5.9  6.   6.1  6.2  6.3  6.4  6.5  6.6  6.7  6.8  6.9
  7.   7.1  7.2  7.3  7.4  7.5  7.6  7.7  7.8  7.9  8.   8.1  8.2  8.3
  8.4  8.5  8.6  8.7  8.8  8.9  9.   9.1  9.2  9.3  9.4  9.5  9.6  9.7
  9.8  9.9 10. ]
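The two outputs above look almost identical, but the functions treat the end point differently: `np.arange` takes a step size and stops before the end value, while `np.linspace` takes the number of points and includes the end value by default. A short sketch that checks this, assuming numpy is installed:

```python
import numpy as np

# arange(start, stop, step): the stop value is NOT included.
x1 = np.arange(0, 10, 0.1)

# linspace(start, stop, num): the stop value IS included by default.
x2 = np.linspace(0, 10, 101)

# Compare the number of points and the last value of each array.
print(len(x1), x1[-1])
print(len(x2), x2[-1])
```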


In [31]:
#Why would we want that? Because numpy lets us evaluate a function at every point of the array in a single step.

y1=x1**2
y2=np.exp(-x2)



In [37]:
# Exercise: let's try to verify that the random number generator is really working. 
# We want to count how many times we see each value. 

# The random module must be imported in this session before we can call random.randint.
import random

# To keep track of how many times we see each value, we will use a numpy array.
# We import numpy as "np" since it is shorter, which is nice
import numpy as np

# To see how good the random number generator is, we will make a very simple plot, similar to a "histogram"
from matplotlib import pyplot as plt

# Each position in the array contains the count for that number.
# We want to keep track of 100 positions. When we are done with this computation, the value at position 10 
# will be the number of times the random number generator chose the number 10, and so on. 
counts = np.zeros(100)  # All the values start as 0.

# We will pick a random number 10000 times!
total_random_numbers = 10000

# This is a simple way of looping in python: the range() command will go through all the numbers 
# from 0 up to total_random_numbers - 1. This is 10000 total numbers!
# In each run of the loop, the variable i will be set to the next number until we get to the last one.
for i in range(total_random_numbers):
    # Get the random number for this round. 
    x = random.randint(0, 99)

    # Add one to the count for that number.
    counts[x] = counts[x] + 1

# Now we are done with the loop!

# Let's look at how many times we saw each value. 
# The easiest way to do this is to make a plot. We want to plot the number
# against the number of times it was chosen. Remember that there are 100 total
# numbers and we ran the loop 10000 times, so we should see each number about 100 times.
plt.scatter(np.arange(100), counts)

# We can make a horizontal line at y=100 to show where we expect the average to be.
plt.axhline(100)

# We can change the Y axis to show a wider range
plt.ylim(0, 200)

# We can label our axes for clarity and even set the fontsize
plt.ylabel('Number of Times Chosen', fontsize=16)
plt.xlabel('Numeric Value', fontsize=16)

# Show the pretty picture! Each count should land close to the line at 100.
plt.show()
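
The same counting experiment can also be checked without a plot. The sketch below is an alternative that uses `collections.Counter` from the standard library instead of a numpy array, and fixes the seed so that the run is repeatable; the seed value is arbitrary and chosen only for this example.

```python
import random
from collections import Counter

# Fix the seed so the run is repeatable (the seed value is arbitrary).
random.seed(0)

total_random_numbers = 10000

# A Counter starts every count at 0 and adds keys as they appear.
counts = Counter()
for i in range(total_random_numbers):
    counts[random.randint(0, 99)] += 1

# Every count should be near 10000 / 100 = 100.
print('smallest count:', min(counts.values()))
print('largest count:', max(counts.values()))
print('total:', sum(counts.values()))
```

If the smallest and largest counts are both reasonably close to 100, the generator is spreading its choices evenly across the 100 values.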