<a href="https://colab.research.google.com/github/Jaeger47/A.I-Seminar/blob/main/Crash_Course_in_Python_and_SciPy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Crash Course in Python and SciPy
**Python** is a general-purpose interpreted programming language. **SciPy** is an ecosystem of Python libraries for mathematics, science, and engineering. The SciPy ecosystem includes the following core modules relevant to machine learning.


*   **NumPy**: A foundation for SciPy that allows you to efficiently work with data in arrays.
*   **Matplotlib**: Allows you to create 2D charts and plots from data.
*   **Pandas**: Tools and data structures to organize and analyze your data.

<br><br>
After completing this crash course, you will know:


1.   How to navigate Python language syntax.
2.   Enough NumPy, Matplotlib and Pandas to read and write machine learning scripts.
3.   A foundation from which to build a deeper understanding of machine learning tasks in Python.




# 1. Python



When getting started in Python, you need to know a few key details about the language syntax such as:

## 1.1. Assignment
Assignment means assigning values to variables.

### 1.1.1. Strings

In [None]:
data = 'hello world'
print(data[0])     # ?
print(len(data))   # ?
print(data)        # ?

### 1.1.2. Numbers

In [None]:
value = 123.1
print(value)      # ?
value = 10
print(value)      # ?

### 1.1.3. Boolean

In [None]:
a = True
b = False
print(a, b)     # ?

### 1.1.4. Multiple assignment

In [None]:
a, b, c = 1, 2, 3
print(a, b, c)       # ?

### 1.1.5. No value

In [None]:
a = None
print(a)   # ?

## 1.2. Flow Control
There are three main types of flow control that you need to learn:

### 1.2.1. If-Then-Else Conditional

In [None]:
value = 99
if value == 99:
  print ('That is fast')
elif value > 200:
  print ('That is too fast')
else:
  print ('That is safe')

### 1.2.2. For-Loops

In [None]:
for i in range(10):
  print (i)

### 1.2.3. While-Loops

In [None]:
i = 0
while i < 10:
  print (i)
  i += 1

## 1.3. Data Structures
Three data structures in Python are the ones you will use most often. They are:

### 1.3.1. Tuple
- A tuple is a read-only (immutable) collection of items, written with parentheses.

In [None]:
a = (1, 2, 3)
print (a)
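
To make "read-only" concrete, here is a minimal sketch (the name `error_message` is only illustrative): items of a tuple can be read by index, but assigning to an item raises a `TypeError`.

```python
a = (1, 2, 3)
print(a[0])          # tuples can be read by index, just like lists

try:
    a[0] = 10        # attempting to change an item of a tuple
except TypeError as err:
    error_message = str(err)
    print("Cannot modify a tuple:", error_message)
```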

### 1.3.2. List
- A list uses the square bracket notation and can be indexed using array notation.

In [None]:
mylist = [1, 2, 3]

print(("Zeroth Value: %d") % mylist[0])
print(("Last Value: %d") % mylist[-1])
print(("List Length: %d") % len(mylist))

for value in mylist:
  print (value)

In [None]:
mylist.append(4)

print(("Zeroth Value: %d") % mylist[0])
print(("Last Value: %d") % mylist[-1])
print(("List Length: %d") % len(mylist))

for value in mylist:
  print (value)

### 1.3.3. Dictionary
- A dictionary is a mapping of names to values, stored as key-value pairs. Note the use of the curly bracket and colon notations when defining the dictionary.

In [None]:
mydict = {'a': 1, 'b': 2, 'c': 3}        # Assigning a value to a dictionary
print(("A value: %d") % mydict['a'])

mydict['a'] = 11                         # Updating value of a key
print(("A value: %d") % mydict['a'])

print(("Keys: %s") % mydict.keys())       
print(("Values: %s") % mydict.values())

for key in mydict.keys():
  print (mydict[key])
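
As a complement to looping over keys, the sketch below shows three other common dictionary operations: iterating over key-value pairs with `items()`, testing for a key with `in`, and reading a key with a fallback via `get()`. The names `has_a` and `missing` and the key `'z'` are illustrative assumptions only.

```python
mydict = {'a': 1, 'b': 2, 'c': 3}

# items() yields (key, value) pairs, so no second lookup is needed
for key, value in mydict.items():
    print("%s -> %d" % (key, value))

# 'in' tests whether a key exists
has_a = 'a' in mydict

# get() returns the given default instead of raising KeyError for a missing key
missing = mydict.get('z', 0)

print(has_a, missing)
```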

## 1.4. Functions
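The biggest gotcha with Python is the **whitespace**: indentation defines which lines belong to a block, such as a function body, so indent consistently. Leaving an empty line after indented code keeps blocks easy to read, and the interactive interpreter requires it to end a block. The example below defines a new function to calculate the sum of two values and calls the function with two arguments.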

In [1]:
# Sum function
def mysum(x, y):
  return (x + y)

# Test sum function
result = mysum(1, 3)
print(result)

4
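
The role of whitespace can be sketched with a slightly longer function. The function name `describe` and the threshold of 100 are hypothetical, chosen only to show how each indentation level decides which block a line belongs to.

```python
def describe(value):
    # These indented lines form the function body.
    if value > 100:
        # Indented one level further: runs only when the condition holds.
        label = 'large'
    else:
        label = 'small'
    # Back at the function-body level: runs on every call.
    return label

# Not indented, so these lines are outside the function.
print(describe(150))
print(describe(5))
```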


# 2. NumPy

NumPy provides the foundational data structures and operations for SciPy. These are arrays (*ndarrays*) that are efficient to define and manipulate.

In [3]:
import numpy as np     # import NumPy package once

## 2.1. Create Array

In [4]:
# define an array
mylist = [1, 2, 3]
myarray = np.array(mylist)
print(myarray)
print(myarray.shape)  

# Does the list data structure have a 'shape' attribute?

[1 2 3]
(3,)
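
One way to check the question in the comment is with the built-in `hasattr`, as in this small sketch (the variable names `list_has_shape` and `array_has_shape` are illustrative). A plain list only reports its length through `len`, while the NumPy array carries a `shape` attribute.

```python
import numpy as np

mylist = [1, 2, 3]
myarray = np.array(mylist)

list_has_shape = hasattr(mylist, 'shape')     # plain Python list
array_has_shape = hasattr(myarray, 'shape')   # NumPy ndarray

print(list_has_shape)
print(array_has_shape)
print(len(mylist))       # a list still knows its length
```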


## 2.2. Access Data
Array notations and ranges can be used to efficiently access data in a NumPy array.

In [None]:
# access values
mylist = [[1, 2, 3], [3, 4, 5]]      # 2D list -> matrix
myarray = np.array(mylist)

print(myarray)
# print(myarray.shape)       # Note, uncomment line to print 

# print(("First row: %s") % myarray[0])
# print(("Last row: %s") % myarray[-1])
# print(("Specific row and col: %s") % myarray[0, 2])
# print(("Whole col: %s") % myarray[:, 2])
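
Ranges (slices) work along each axis as well. The sketch below reuses the same 2D array and selects several columns at once; the names `first_two_cols` and `last_col` are illustrative.

```python
import numpy as np

myarray = np.array([[1, 2, 3], [3, 4, 5]])

first_two_cols = myarray[:, 0:2]   # every row, columns 0 and 1
last_col = myarray[:, -1]          # every row, last column only

print(first_two_cols)
print(last_col)
print(first_two_cols.shape, last_col.shape)
```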

## 2.3. Arithmetic
NumPy arrays can be used directly in arithmetic.

In [5]:
myarray1 = np.array([2, 2, 2])
myarray2 = np.array([3, 3, 3])

print(("Addition: %s") % (myarray1 + myarray2))
print(("Multiplication: %s") % (myarray1 * myarray2))

Addition: [5 5 5]
Multiplication: [6 6 6]
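
For contrast, this sketch applies `+` to plain Python lists and to NumPy arrays holding the same values. With lists the operator concatenates; with arrays it works element by element. The variable names are illustrative.

```python
import numpy as np

list1 = [2, 2, 2]
list2 = [3, 3, 3]

list_sum = list1 + list2                        # lists: concatenation
array_sum = np.array(list1) + np.array(list2)   # arrays: element-wise addition
scaled = np.array(list1) * 10                   # a scalar is applied to every element

print(list_sum)
print(array_sum)
print(scaled)
```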


# 3. Matplotlib



Matplotlib can be used for creating plots and charts. The library is generally used as follows:


*   Call a plotting function with some data (e.g. **.plot( )** ).
*   Call many functions to set up the properties of the plot (e.g. labels and colors).
*   Make the plot visible (e.g. **.show( )** ).

In [6]:
import matplotlib.pyplot as plt          # import Matplotlib package once
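
The three steps listed above can be sketched in one cell. The data, the color, and the label text here are arbitrary choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array([1, 2, 3])
y = np.array([2, 4, 6])

# Step 1: call a plotting function with some data
lines = plt.plot(x, y, color='red', label='y = 2x')

# Step 2: set up the properties of the plot
plt.title('some title')
plt.xlabel('some x axis')
plt.ylabel('some y axis')
plt.legend()

# Step 3: make the plot visible
plt.show()
```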

## 3.1. Line Plot
The example below creates a simple line plot from one dimensional data.


In [None]:
myarray = np.array([1, 2, 3])     # note, we use NumPy to create an array
plt.plot(myarray)                 # this is how you call basic line plot
plt.xlabel('some x axis')
plt.ylabel('some y axis')
plt.show()

## 3.2. Scatter Plot
Below is a simple example of creating a scatter plot from two dimensional data.

In [None]:
x = np.array([1, 2, 3])
y = np.array([2, 4, 6])

plt.scatter(x,y)              # this is how you call a basic scatter plot
plt.xlabel('some x axis')
plt.ylabel('some y axis')
plt.show()

# 4. Pandas

Pandas provides data structures and functionality to quickly manipulate and analyze data. The key to understanding Pandas for machine learning is understanding the Series and DataFrame data structures.

In [7]:
import pandas as pd        # import Pandas package once

## 4.1. Series
A series is a one-dimensional array where the rows can be labeled.

In [8]:
myarray = np.array([1, 2, 3])
rownames = ['a', 'b', 'c']
myseries = pd.Series(myarray, index=rownames)
print(myseries)

a    1
b    2
c    3
dtype: int64


You can access the data in a series like a NumPy array and like a dictionary, for example:

In [9]:
print(myseries.iloc[0])   # use of POSITION: accessing Pandas Series like a NumPy array (plain myseries[0] is deprecated in recent pandas when the index is labeled)
print(myseries['a'])      # use of KEY: accessing Pandas Series like a dictionary

1
1
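
The two styles of access can also be written with the explicit `.iloc` (by position) and `.loc` (by label) accessors, which avoid any ambiguity between positions and labels. This is a minimal sketch; the names `by_position`, `by_label`, and `label_slice` are illustrative.

```python
import numpy as np
import pandas as pd

myseries = pd.Series(np.array([1, 2, 3]), index=['a', 'b', 'c'])

by_position = myseries.iloc[0]        # first element by integer position
by_label = myseries.loc['a']          # element whose label is 'a'
label_slice = myseries.loc['a':'b']   # label slices include both endpoints

print(by_position)
print(by_label)
print(label_slice)
```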


## 4.2. DataFrame
A data frame is a two-dimensional array where the rows and the columns can be labeled.

In [10]:
myarray = np.array([[1, 2, 3], [4, 5, 6]])
rownames = ['a', 'b']
colnames = ['one', 'two', 'three']
mydataframe = pd.DataFrame(myarray, index=rownames, columns=colnames)
print(mydataframe)

   one  two  three
a    1    2      3
b    4    5      6


Data can be indexed using column names.

In [11]:
print("method 1:")
print(("one column: \n%s") % mydataframe['one'])

# print("\nmethod 2:")                              # uncomment lines to print 
# print(("one column: \n%s") % mydataframe.one)

method 1:
one column: 
a    1
b    4
Name: one, dtype: int64
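
Rows and single values can be selected by label too. The sketch below rebuilds the same data frame and uses `.loc` for row and cell access, plus a list of column names to pick several columns; the names `row_a`, `cell`, and `two_cols` are illustrative.

```python
import numpy as np
import pandas as pd

mydataframe = pd.DataFrame(
    np.array([[1, 2, 3], [4, 5, 6]]),
    index=['a', 'b'],
    columns=['one', 'two', 'three'],
)

row_a = mydataframe.loc['a']                # one row by label, returned as a Series
cell = mydataframe.loc['b', 'three']        # single value by row and column label
two_cols = mydataframe[['one', 'three']]    # list of names returns a DataFrame

print(row_a)
print(cell)
print(two_cols)
```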
