# Python Tutorial

## Standard syntax
- Python is dynamically typed, so you do not need to declare a variable's data type.
- Blocks are delimited by indentation rather than curly braces, so methods, loops, etc. have no braces around their bodies (curly braces are used for the dictionary data type).
- Lines do not need a trailing ```;```.

In [1]:
x = 5
print(x)

my_string = "Hello, world"
print(my_string)

# Reassigning a name to a different type is legal, but it gets confusing when the value no longer matches the name:
my_string = 3908274213408
print(f"Type of my string: {type(my_string)}")

5
Hello, world
Type of my string: <class 'int'>
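
To make the whitespace rule concrete, here is a minimal sketch. The function name `describe_sign` is a hypothetical helper for illustration only. Indented lines belong to the enclosing `def` or `if`, and dedenting ends the block.

```python
def describe_sign(number):
    # Everything indented under "def" is the function body.
    if number < 0:
        # Indented one level further: runs only when the condition is true.
        return "negative"
    # Back at the function's indentation level: the "if" block has ended.
    return "zero or positive"


print(describe_sign(-3))
print(describe_sign(7))
```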


## Data Types

### Numbers

- There are three built-in numeric types:
    - Integers (```int```; arbitrary precision, so there is no separate long type)
    - Floats (```float```; usually a C double under the hood, so there is no separate double type)
    - Complex numbers (```complex```)

In [1]:
integer_var = 3
float_var = 3.1
complex_var = 3+2j

print(f"Type of integer_var: {type(integer_var)}")
print(f"Type of float_var: {type(float_var)}")
print(f"Type of complex_var: {type(complex_var)}")

Type of integer_var: <class 'int'>
Type of float_var: <class 'float'>
Type of complex_var: <class 'complex'>
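
As a rough illustration of how these types behave (variable names are made up for this sketch), integers grow as large as needed, the two division operators return different types, and complex numbers expose their parts as attributes:

```python
big_int = 2 ** 100              # ints grow as needed; there is no overflow to a "long"
print(big_int)

ratio = 7 / 2                   # "/" always returns a float
floored = 7 // 2                # "//" is floor division and stays an int here
print(ratio, floored)

z = 3 + 4j
print(z.real, z.imag, abs(z))   # real part, imaginary part, magnitude
```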


### Strings

- Python does not have a ```char``` datatype; a single character is just a string of length one. You can use either ```''``` or ```""``` to make strings.

In [3]:
print("You can make a string using double quotes")
print('You can also make a string using single quotes')

You can make a string using double quotes
You can also make a string using single quotes
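
A small sketch of what "no char type" means in practice (the variable names are illustrative): indexing a string gives back another string of length one.

```python
word = "Python"
first = word[0]

print(first, type(first))      # a one-character str, not a separate char type
print(len(first))
print(word[-1])                # negative indices count from the end
print('single' == "single")    # both quote styles build the same str
```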


- There are many additional string methods available

In [4]:
statement = "My whole name is Bob Builder"
print(statement.lower())
print(statement.upper())
print(statement.replace('Bob', 'Bran, The'))

my whole name is bob builder
MY WHOLE NAME IS BOB BUILDER
My whole name is Bran, The Builder
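
A few more commonly used string methods, sketched on a padded variant of the sentence above. Strings are immutable, so each method returns a new string rather than changing the original.

```python
statement = "  My whole name is Bob Builder  "

trimmed = statement.strip()     # drop leading/trailing whitespace
words = trimmed.split()         # split on runs of whitespace
rejoined = "-".join(words)      # join the pieces with a separator

print(trimmed)
print(words)
print(rejoined)
print(trimmed.startswith("My"), "Bob" in trimmed)
```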


- Additionally, as you have seen before, *formatted strings* (denoted by ```f""``` or ```f''```) make it easy to use variables within strings.
    - Put the variable, object, or one-line expression that you want converted to a string inside ```{}```.

In [5]:
for number in range(6):
    print(f"The number is: {number} and it is {'even' if number % 2 == 0 else 'odd'}")

The number is: 0 and it is even
The number is: 1 and it is odd
The number is: 2 and it is even
The number is: 3 and it is odd
The number is: 4 and it is even
The number is: 5 and it is odd
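
Formatted strings also accept a format specification after a colon inside the braces. The values below are arbitrary sample data for this sketch:

```python
price = 3.14159
count = 42
name = "Dax"

print(f"{price:.2f}")    # two digits after the decimal point
print(f"{count:>6}")     # right-aligned in a field six characters wide
print(f"{count:06d}")    # zero-padded to six digits
print(f"{name!r}")       # repr() of the value, quotes included
```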


### Lists

- Core Python has no built-in array type like the arrays in C or Java; the general-purpose sequence is the list. (The standard library has a typed ```array``` module, and libraries such as numpy provide n-dimensional arrays.)
- Due to dynamic typing, lists also don't have a fixed element type (they are like object arrays), so a single list can hold objects of different types.
- List indices start at 0.

In [6]:
my_list = ["hello", 0, {"name": "Dax"}]

print(my_list) # Prints fine, no exception
print([type(item) for item in my_list])

['hello', 0, {'name': 'Dax'}]
[<class 'str'>, <class 'int'>, <class 'dict'>]
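
Since indices start at 0, it may help to see indexing and slicing side by side. This is a small sketch with a made-up list:

```python
letters = ["a", "b", "c", "d", "e"]

print(letters[0])      # first element
print(letters[-1])     # last element
print(letters[1:3])    # elements at indices 1 and 2 (the stop index is exclusive)
print(letters[:2])     # from the start up to, but not including, index 2
print(letters[::2])    # every second element
print(len(letters))    # number of elements
```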


- Adding, removing, and popping elements is pretty easy

In [7]:
number_list = [0, 15, 32, 99, 852, 16]

number_list.append(82) # Adds 82 to the end of the list
print(number_list)

number_list.remove(15) # Removes the value 15
print(number_list)

del number_list[3] # Removes object at index 3
print(number_list)

number = number_list.pop(2) # Removes the value at index 2 and returns it
print(number, number_list)

number_list.reverse() # Reverses the list in place
print(number_list)

number_list.sort() # Sorts the list
print(number_list)

minimum = min(number_list)
maximum = max(number_list)
print(f"Minimum number: {minimum}, maximum number: {maximum}")

[0, 15, 32, 99, 852, 16, 82]
[0, 32, 99, 852, 16, 82]
[0, 32, 99, 16, 82]
99 [0, 32, 16, 82]
[82, 16, 32, 0]
[0, 16, 32, 82]
Minimum number: 0, maximum number: 82
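
One detail worth a quick sketch: `list.sort()` changes the list in place and returns `None`, while the built-in `sorted()` leaves the original alone and returns a new list. The list below is sample data.

```python
scores = [32, 0, 82, 16]

ordered = sorted(scores)                     # returns a new list
descending = sorted(scores, reverse=True)
print(scores, ordered, descending)           # the original order is untouched

result = scores.sort()                       # sorts in place and returns None
print(result, scores)

print(16 in scores, scores.index(16))        # membership test and position
```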


- Because lists do not have fixed datatypes, making multi-dimensional arrays is pretty simple: just nest lists inside a list.
    - This is not recommended for large or high-dimensional arrays, however, because libraries made for ndarrays (such as numpy) wrap C code that is *much* more efficient.

In [8]:
my_2d_array = [[1, 2], [2, 5], ["cat", "dog"]]
print(my_2d_array)

[[1, 2], [2, 5], ['cat', 'dog']]
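
Nested lists are indexed one level at a time. The sketch below (variable names are illustrative) also shows a common pitfall when pre-filling a grid: multiplying the outer list repeats references to the same inner list.

```python
grid = [[1, 2], [2, 5], ["cat", "dog"]]
print(grid[2][0])          # row 2, column 0
grid[0][1] = 99            # assign through both indices
print(grid)

# Pitfall: multiplying the outer list repeats the SAME inner list object.
shared = [[0] * 3] * 2
shared[0][0] = 1
print(shared)              # both rows change

# A comprehension builds an independent inner list per row.
independent = [[0] * 3 for _ in range(2)]
independent[0][0] = 1
print(independent)
```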


### Dictionaries

- Initialized by ```var = {}``` or ```var = {'hello': 0, 'one': 1}```
- Like lists, dictionaries don't have a set type. You can add different types of objects within the same dictionary

In [9]:
my_age_dictionary = {'Dax': 22, 'Liang': 'professor age', 'Tucker': ['zoomer', 'age']}
print(f"Dax's age is: {my_age_dictionary['Dax']}")
print(f"Dr. Liang's age is: {my_age_dictionary['Liang']}")
print(f"Tucker's age is: {my_age_dictionary['Tucker']}")

Dax's age is: 22
Dr. Liang's age is: professor age
Tucker's age is: ['zoomer', 'age']


- You can add or update entries by simply assigning to the dictionary at a certain key. Any hashable object, such as a string or a number, can be a key.

In [10]:
my_dictionary = {}
my_dictionary['something'] = 'exists'
my_dictionary[3+2j] = 'its complicated'

print(my_dictionary)
print(my_dictionary[3+2j])

{'something': 'exists', (3+2j): 'its complicated'}
its complicated
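
A few other everyday dictionary operations, sketched with the small dictionary from the initialization example. The missing key `"two"` is deliberate.

```python
counts = {"hello": 0, "one": 1}

counts["hello"] = 10                  # assigning to an existing key overwrites it
print("two" in counts)                # membership tests look at keys
print(counts.get("two", "missing"))   # .get() returns a default instead of raising KeyError

for key, value in counts.items():     # iterate over key/value pairs
    print(key, value)

removed = counts.pop("one")           # remove a key and return its value
print(removed, counts)

try:
    counts[["a", "list"]] = 2         # lists are mutable, hence unhashable
except TypeError as error:
    print(f"Lists cannot be keys: {error}")
```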


### Loops

- A Python ```for``` loop always iterates directly over the items of a collection, like the "for-each" shorthand found in other programming languages. Therefore, things like this can be done:

In [11]:
my_2d_array = [[1, 2], [2, 5], ["cat", "dog"]]

for row in my_2d_array:
    for value in row:
        print(value)

1
2
2
5
cat
dog


- More standard looking for loops can be made using ```range()```

In [12]:
# A more standard for loop analogous to i = 0; i < len(list); i++
for i in range(0, len(my_2d_array), 1):
    print(my_2d_array[i])

# Shorthand exists for range as well:
for i in range(len(my_2d_array)): # With one argument, range starts at 0, steps by 1, and stops before the argument
    print(my_2d_array[i])


[1, 2]
[2, 5]
['cat', 'dog']
[1, 2]
[2, 5]
['cat', 'dog']
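
When you need both the index and the item, `enumerate()` is usually tidier than `range(len(...))`, and `zip()` walks several lists in step. A brief sketch (the `labels` list is made up for illustration):

```python
rows = [[1, 2], [2, 5], ["cat", "dog"]]

# enumerate() yields (index, item) pairs
for index, row in enumerate(rows):
    print(index, row)

# zip() pairs up items from several iterables, stopping at the shortest
labels = ["first", "second", "third"]
for label, row in zip(labels, rows):
    print(label, row)
```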


- For loops with else statements (for-else loops) also exist and can be quite helpful, say for simple search algorithms
    - The else block runs only if the loop finishes without hitting ```break```. If you break out of the loop yourself, the else block is skipped.

In [13]:
names = ['Tucker', 'Dax', 'Dr. Liang']

for name in names:
    if name == "Dax":
        print("Found Dax!")
        
        # This should break the for loop and not execute the else statement
        break
else:
    print("Did not find Dax!")

for name in names:
    if name == "Evan":
        print("Found Evan!")
        
        # "Evan" is not in the list, so this break never runs and the else statement executes
        break
else:
    print("Did not find Evan!")



Found Dax!
Did not find Evan!
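
The same search pattern can be wrapped in a function. This is one possible sketch; `find_name` is a hypothetical helper, not something defined elsewhere in this tutorial.

```python
def find_name(names, target):
    for position, name in enumerate(names):
        if name == target:
            message = f"Found {target} at index {position}"
            break                      # skips the else block below
    else:
        # Reached only when the loop ran to completion without a break,
        # including the case where the list is empty.
        message = f"Did not find {target}"
    return message


names = ['Tucker', 'Dax', 'Dr. Liang']
print(find_name(names, "Dax"))
print(find_name(names, "Evan"))
```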


- While loops are as expected

In [14]:
some_int = 0
while True:
    print(some_int)
    
    some_int += 1 # some_int++ does not exist in Python
    if some_int > 5:
        break

0
1
2
3
4
5
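
The loop above can also be written with the exit condition in the `while` header instead of an explicit `break`. A sketch that collects the values in a list (the `printed` name is illustrative):

```python
some_int = 0
printed = []

while some_int <= 5:      # the condition is checked before every iteration
    printed.append(some_int)
    some_int += 1

print(printed)
```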


## Functions

- Defining functions is pretty simple, as you do not need to state what datatype they return

In [1]:
some_object = 0
def my_function(an_object):
    return an_object

print(my_function(some_object))

0


- Function parameters can have default values
- You can pass arguments by name (keyword arguments) when calling a function, in any order

In [5]:
def another_function(number=0, name="No name"):
    return f"{number}, {name}"

# Relying on default value for name parameter
print(another_function(12))

# Explicitly setting parameters, as seen here these do not follow the order that the parameters are in the function definition
print(another_function(name="Dax", number=49))

12, No name
49, Dax
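
Positional and keyword arguments can be mixed as long as the positional ones come first. Because return types are not declared, a function can also hand back several values at once as a tuple. A sketch with hypothetical function names:

```python
def describe(number=0, name="No name"):
    return f"{number}, {name}"


print(describe())                  # both defaults are used
print(describe(7, name="Dax"))     # positional argument first, then keyword
# describe(name="Dax", 7) would be a SyntaxError: positional arguments must come first


def min_and_max(values):
    return min(values), max(values)    # packed into a tuple


low, high = min_and_max([0, 16, 32, 82])   # unpacked on assignment
print(low, high)
```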


## Importing Libraries
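How to import libraries: You can import libraries using ```import```, such as ```import numpy```.
There are variations on how to do this:
1. To import a library under a different name, use the ```as``` keyword, such as ```import numpy as np```.
2. To import only part of a library, use the ```from``` keyword, such as ```from numpy import array```. The dotted form, such as ```import numpy.linalg```, works only for submodules; ```import numpy.array``` fails because ```array``` is a function, not a module.
    - Note that ```from numpy import *``` is not the same as ```import numpy```: it copies numpy's public names into your namespace (so you write ```array``` instead of ```numpy.array```), which can silently shadow names you already defined. ```import numpy.*``` is a syntax error.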
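
The import forms can be tried out with the standard library alone. This sketch uses `math` and `os.path` purely as stand-ins; the file name is a made-up example.

```python
import math                    # whole module, accessed as math.<name>
import math as m               # the same module under a shorter alias
from math import sqrt, pi      # pull specific names into this namespace
import os.path                 # the dotted form works for submodules

print(math.sqrt(16))
print(m.floor(2.7))
print(sqrt(9), pi)
print(os.path.splitext("games.csv"))   # hypothetical file name
print(m is math)                        # an alias refers to the same module object
```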

In [2]:
import numpy as np

a = np.zeros((500, 500))
print(f"{a}\n{a.shape}")

[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]
(500, 500)


In [21]:
import numpy as np

my_list = [1, 2, 3]       # Avoid naming a variable "list": that shadows the built-in type
print(type(my_list))
a = np.array(my_list)     # Create a rank 1 array from the list; with shape (3,) it is neither a row nor a column vector
print(type(a))            # Prints "<class 'numpy.ndarray'>"
print(a.shape)            # Prints "(3,)"
print(a[0], a[1], a[2])   # Prints "1 2 3"
a[0] = 5                  # Change an element of the array
print(a)                  # Prints "[5 2 3]"


<class 'list'>
<class 'numpy.ndarray'>
(3,)
1 2 3
[5 2 3]


In [17]:
#stepping up to a 2 dimensional array

b = np.array([[1,2,3],[4,5,6]])    # Create a rank 2 array
print(b.shape)
print(b[0, 0], b[0, 1], b[1, 0])   # Prints "1 2 4"  

(2, 3)
1 2 4


In [20]:
#3 dimensional array now
'''     Page :        0                  1 
        Row  :    0       1         0         1
     Column  :  0 1 2   0 1 2     0 1 2    0  1  2                  '''
c = np.array([[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]]])
print(c.shape)
# Walk every [page, row, column] index in order
print(c[0,0,0])
print(c[0,0,1])
print(c[0,0,2])
print(c[0,1,0])
print(c[0,1,1])
print(c[0,1,2])
print(c[1,0,0])
print(c[1,0,1])
print(c[1,0,2])
print(c[1,1,0])
print(c[1,1,1])
print(c[1,1,2])

(2, 2, 3)
1
2
3
4
5
6
7
8
9
10
11
12
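
The page/row/column picture can be checked without NumPy by using plain nested lists. This sketch (with illustrative names) visits every index triple in the same order as the output above. Note that nested lists use chained `[page][row][column]` indexing, whereas NumPy also accepts `[page, row, column]`.

```python
pages = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]

visited = []
for page in range(2):
    for row in range(2):
        for column in range(3):
            # Equivalent to c[page, row, column] on the NumPy array
            visited.append(pages[page][row][column])

print(visited)
print(pages[1][0][2])    # page 1, row 0, column 2
```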


In [22]:
#Adding and Subtracting
x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)
# Elementwise sum; both produce the array
# [[ 6.0  8.0]
#  [10.0 12.0]]
print(x + y)
print(np.add(x, y))

# Elementwise difference; both produce the array
# [[-4.0 -4.0]
#  [-4.0 -4.0]]
print(x - y)
print(np.subtract(x, y))


[[ 6.  8.]
 [10. 12.]]
[[ 6.  8.]
 [10. 12.]]
[[-4. -4.]
 [-4. -4.]]
[[-4. -4.]
 [-4. -4.]]


In [23]:
#Element wise Multiplication and Division
# Elementwise product; both produce the array
# [[ 5.0 12.0]
#  [21.0 32.0]]
print(x * y)
print(np.multiply(x, y))

# Elementwise division; both produce the array
# [[ 0.2         0.33333333]
#  [ 0.42857143  0.5       ]]
print(x / y)
print(np.divide(x, y))

# Elementwise square root; produces the array
# [[ 1.          1.41421356]
#  [ 1.73205081  2.        ]]
print(np.sqrt(x))

[[ 5. 12.]
 [21. 32.]]
[[ 5. 12.]
 [21. 32.]]
[[0.2        0.33333333]
 [0.42857143 0.5       ]]
[[0.2        0.33333333]
 [0.42857143 0.5       ]]
[[1.         1.41421356]
 [1.73205081 2.        ]]


In [24]:
import numpy as np

x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])

v = np.array([9,10])
w = np.array([11, 12])

# Inner product of vectors; both produce 219
print(v.dot(w))
print(np.dot(v, w))

# Matrix / vector product; both produce the rank 1 array [29 67]
print(x.dot(v))
print(np.dot(x, v))

# Matrix / matrix product; both produce the rank 2 array
# [[19 22]
#  [43 50]]
print(x.dot(y))
print(np.dot(x, y))

219
219
[29 67]
[29 67]
[[19 22]
 [43 50]]
[[19 22]
 [43 50]]
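
To see what `dot` computes in each case, here is a pure-Python sketch of the same three products. The helper names `dot`, `mat_vec`, and `mat_mat` are hypothetical and only meant to expose the arithmetic; NumPy's own implementation is far faster.

```python
def dot(v, w):
    # Inner product: multiply matching entries and add them up
    return sum(a * b for a, b in zip(v, w))


def mat_vec(matrix, vector):
    # One inner product per row of the matrix
    return [dot(row, vector) for row in matrix]


def mat_mat(left, right):
    # Entry (i, j) is the inner product of row i of left and column j of right
    columns = list(zip(*right))
    return [[dot(row, column) for column in columns] for row in left]


x = [[1, 2], [3, 4]]
y = [[5, 6], [7, 8]]
v = [9, 10]
w = [11, 12]

print(dot(v, w))       # 9*11 + 10*12
print(mat_vec(x, v))
print(mat_mat(x, y))
```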


In [30]:
print("Original array : \n", x)
print("Transposed : \n",x.T)

Original array : 
 [[1 2]
 [3 4]]
Transposed : 
 [[1 3]
 [2 4]]
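
For comparison, a transpose of nested lists can be sketched in plain Python with `zip(*rows)`, which regroups the rows into columns. The second matrix is an extra example to show a non-square case.

```python
x = [[1, 2], [3, 4]]
transposed = [list(column) for column in zip(*x)]
print(transposed)

wide = [[1, 2, 3], [4, 5, 6]]                    # 2 rows, 3 columns
tall = [list(column) for column in zip(*wide)]   # 3 rows, 2 columns
print(tall)
```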


In [34]:
#Pandas
#Biggest difference: a numpy array holds a single data type,
#while a pandas DataFrame can hold a different data type in each column
'''NCAA basketball games from 1985 to 2016. This dataset is in a CSV file, and the function we're going to use to read in the file is called pd.read_csv(). This function returns a dataframe variable.'''
import pandas as pd

df = pd.read_csv("RegularSeasonCompactResults.csv")
print(df.head(10))

   Season  Daynum  Wteam  Wscore  Lteam  Lscore Wloc  Numot
0    1985      20   1228      81   1328      64    N      0
1    1985      25   1106      77   1354      70    H      0
2    1985      25   1112      63   1223      56    H      0
3    1985      25   1165      70   1432      54    H      0
4    1985      25   1192      86   1447      74    H      0
5    1985      25   1218      79   1337      78    H      0
6    1985      25   1228      64   1226      44    N      0
7    1985      25   1242      58   1268      56    N      0
8    1985      25   1260      98   1133      80    H      0
9    1985      25   1305      97   1424      89    H      0


In [35]:
print(df.tail())

        Season  Daynum  Wteam  Wscore  Lteam  Lscore Wloc  Numot
145284    2016     132   1114      70   1419      50    N      0
145285    2016     132   1163      72   1272      58    N      0
145286    2016     132   1246      82   1401      77    N      1
145287    2016     132   1277      66   1345      62    N      0
145288    2016     132   1386      87   1433      74    N      0


In [36]:
#Basic descriptive statistics

print(df.describe())

              Season         Daynum          Wteam         Wscore  \
count  145289.000000  145289.000000  145289.000000  145289.000000   
mean     2001.574834      75.223816    1286.720646      76.600321   
std         9.233342      33.287418     104.570275      12.173033   
min      1985.000000       0.000000    1101.000000      34.000000   
25%      1994.000000      47.000000    1198.000000      68.000000   
50%      2002.000000      78.000000    1284.000000      76.000000   
75%      2010.000000     103.000000    1379.000000      84.000000   
max      2016.000000     132.000000    1464.000000     186.000000   

               Lteam         Lscore          Numot  
count  145289.000000  145289.000000  145289.000000  
mean     1282.864064      64.497009       0.044387  
std       104.829234      11.380625       0.247819  
min      1101.000000      20.000000       0.000000  
25%      1191.000000      57.000000       0.000000  
50%      1280.000000      64.000000       0.000000  
75%      1375.000000 