# Python - General Stuff

## Pass statement
The pass statement is used as a placeholder for future code.
When the pass statement is executed, nothing happens, but you avoid getting an error when empty code is not allowed.
Empty code is not allowed in loops, function definitions, class definitions, or in if statements.

In [None]:
def myfunction():
  pass

class Person:
  pass

a = 33
b = 200

if b > a:
  pass

## map() function
The map() function executes a specified function for each item in an iterable. The item is sent to the function as a parameter.


### Syntax
map(function, iterables)

- function | Required. The function to execute for each item
- iterable | Required. A sequence, collection or an iterator object. You can send as many iterables as you like, just make sure the function has one parameter for each iterable.

Returns a map object (which is an iterator) of the results after applying the given function to each item of a given iterable (list, tuple, etc.).

In [None]:
def myfunc(n):
  return len(n)

x = map(myfunc, ('apple', 'banana', 'cherry'))
print (list(x)) # Converts the map object into a list for readability
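
The note about sending several iterables can be sketched as follows. The function name `combine` and the sample tuples are made up for this illustration; the function takes one parameter per iterable, and `map()` stops as soon as the shortest iterable runs out.

```python
# Hypothetical helper: one parameter for each iterable passed to map()
def combine(a, b):
    return a + b

# map() takes one item from each iterable per call
pairs = map(combine, ('apple', 'banana', 'cherry'), ('orange', 'lemon', 'pineapple'))
combined = list(pairs)
print(combined)
```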

## **PENDING** filter() function


## **PENDING** Packing & Unpacking

In [None]:
some_list = [3, 4, 5]
j, k, l  = some_list

print(j, k)

## Arguments |  Parameters
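The terms parameter and argument are often used interchangeably for the information that is passed into a function. Strictly speaking, a parameter is the name listed in the function definition, and an argument is the value sent to the function when it is called.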

### Default arguments
Default arguments are values that are provided while defining functions.
- The assignment operator = is used to assign a default value to the argument.
- Default arguments become optional during the function calls.
- If we provide a value to the default arguments during function calls, it overrides the default value.
- A function can have any number of default arguments.
- Default arguments should follow non-default arguments.

In [None]:
def add(a,b=5,c=10):
    return (a+b+c)

# Only giving the mandatory argument
print(add(3))

# Giving only one of the optional arguments
print(add(3,4))

# Giving all the arguments
print(add(2,3,4))


### Keyword Arguments
Functions can also be called using keyword arguments of the form kwarg=value.

With keyword arguments, the values passed in a function call need not follow the order of the parameters in the function definition. However, every keyword argument must match one of the parameters in the function definition.

In [None]:
def add(a,b=5,c=10):
    return (a+b+c)

# All parameters are given as keyword arguments, so no need to maintain the same order.

print (add(b=10,c=15,a=20))
#Output:45

### Positional Arguments
During a function call, values passed through arguments should be in the order of the parameters in the function definition. These are called positional arguments.

In [None]:
def add(a,b,c):
    return (a+b+c)

The above function can be called in two ways:
1. During the function call, all arguments are given as positional arguments. Values passed through arguments are passed to parameters by their position. 10 is assigned to a, 20 is assigned to b and 30 is assigned to c.



In [None]:
print (add(10,20,30))
#Output:60

2. During the function call, a mix of positional and keyword arguments is given. Keyword arguments must always follow positional arguments.

In [None]:
print (add(10,c=30,b=20))
#Output:60

### Important things to remember





In [None]:
# 1. default arguments should follow non-default arguments (non-default, default arguments)
 
# def add(a=5,b,c):
#     return (a+b+c)

#Output:SyntaxError: non-default argument follows default argument

In [None]:
# 2. keyword arguments should follow positional arguments

# def add(a,b,c):
#     return (a+b+c)

# print (add(a=10,3,4))

#Output:SyntaxError: positional argument follows keyword argument

In [None]:
# 3. All the keyword arguments passed must match one of the arguments accepted by the function and their order is not important.

# def add(a,b,c):
#     return (a+b+c)

# print (add(a=10,x=5,c=12))

#Output:TypeError: add() got an unexpected keyword argument 'x'

In [None]:
# 4. No argument should receive a value more than once

# def add(a,b,c):
#     return (a+b+c)

# print (add(a=10,b=5,b=10,c=12))

#Output:SyntaxError: keyword argument repeated

## Variable-length arguments

Variable-length arguments are also known as arbitrary arguments. If we don’t know in advance how many arguments a function will need, we can use arbitrary arguments.

### Arbitrary positional arguments (*args)

For arbitrary positional arguments, an asterisk (*) is placed before a parameter in the function definition; that parameter can then hold any number of non-keyword arguments. These arguments are wrapped up in a tuple.



In [None]:
def add(*b):
    result=0
    for i in b:
         result=result+i
    return result

print (add(1,2,3,4,5))   # Output:15
print (add(10,20))       # Output:30

### Arbitrary keyword arguments (**kwargs)
For arbitrary keyword arguments, a double asterisk is placed before a parameter in the function definition; that parameter can then hold any number of keyword arguments. These arguments are wrapped up in a dictionary.

In [None]:
def fn(**a):
    for i in a.items():
        print (i)
fn(numbers=5,colors="blue",fruits="apple")
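
Both forms can appear in the same definition. The sketch below uses a hypothetical function `describe` to show a regular parameter followed by extra positional arguments (collected in a tuple) and extra keyword arguments (collected in a dictionary).

```python
# Hypothetical function for illustration: one regular parameter,
# then arbitrary positional arguments, then arbitrary keyword arguments
def describe(title, *args, **kwargs):
    # args is a tuple, kwargs is a dict
    return f"{title}: args={args}, kwargs={kwargs}"

summary = describe("demo", 1, 2, color="blue")
print(summary)
```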


## Modules
A module is a file containing Python statements and definitions.

A file containing Python code, for example 'example.py', is called a module, and its module name would be 'example'.

We use modules to break down large programs into small, manageable, and organized files. Furthermore, modules provide reusability of code.

We can define our most used functions in a module and import it, instead of copying their definitions into different programs.

FMI: https://www.programiz.com/python-programming/modules
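
As a small sketch of importing, the example below uses the standard library module `math` in place of the 'example' module mentioned above, since that file is not part of this notebook.

```python
# Import a whole module and reach its names through the module name
import math
print(math.sqrt(16))

# Import a single name from a module
from math import pi
print(pi)

# Every module knows its own module name
print(math.__name__)
```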

## Namespace and Scope
### Name
**Name**, also called *identifier*, is simply a name given to an object. Everything in Python is an object. A name is a way to access the underlying object.

For example, when we do the assignment a = 2, 2 is an object stored in memory and 'a' is the name we associate with it. We can get the identity of an object (in CPython, its address in memory) through the built-in function id().

In [None]:
a = 2
print('id(a) =', id(a))

a = a+1
print('id(a) =', id(a))

print('id(3) =', id(3))

b = 2
print('id(b) =', id(b))
print('id(2) =', id(2))

![image.png](attachment:a47cd593-69e9-492e-8fdd-974f2f01a497.png)

Initially, an object 2 is created and the name a is associated with it. When we do a = a + 1, a new object 3 is created and now 'a' is associated with this object.

Note that id(a) and id(3) have the same value.

Furthermore, when b = 2 is executed, the new name b gets associated with the previous object 2.

This is efficient, as Python does not have to create a new duplicate object. This dynamic nature of name binding makes Python powerful; a name could refer to any type of object.

In [None]:
a = 5
a = 'Hello World!'
a = [1,2,3]

All these are valid, and 'a' will refer to three different types of objects at different times. Functions are objects too, so a name can refer to them as well.

In [None]:
def printHello():
    print("Hello")

a = printHello

a()

### Namespaces
Now that we understand what names are, we can move on to the concept of namespaces. 

Simply put, a namespace is a collection of names.

In Python, you can imagine a namespace as a mapping of every name you have defined to corresponding objects.

Different namespaces can co-exist at a given time, but are completely isolated.

A namespace containing all the built-in names is created when we start the Python interpreter and exists as long as the interpreter runs.

This is the reason that built-in functions like id(), print(), etc. are always available to us from any part of the program. Each module creates its own global namespace.

These different namespaces are isolated. Hence, the same name that may exist in different modules does not collide. 

Modules can have various functions and classes. A local namespace is created when a function is called, which has all the names defined in it. The same is true of a class. The following diagram may help to clarify this concept.
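
The isolation between namespaces can be sketched as follows. The function name `show_namespaces` is made up for this example; `builtins` is the standard module that holds the built-in names.

```python
import builtins

x = "global x"  # lives in the module's global namespace

def show_namespaces():
    x = "local x"  # a separate binding in the function's local namespace
    return x, sorted(locals())

local_value, local_names = show_namespaces()
print(local_value)                 # the local binding
print(local_names)                 # names defined inside the function call
print(x)                           # the global binding is untouched
print('print' in dir(builtins))    # built-in names live in their own namespace
```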



### Variable Scope



### Scope of Namespaces



In [None]:
# Write a function to compute 3x +1 

def f(x):
    return 3 * x + 1 

print( f(2) ) 


lambda x: 3*x + 1

g = lambda x: 3*x + 1

print( g(2) )


## Decorators

Python's functions are objects.

To understand decorators, you must first understand that functions are objects in Python. This has important consequences. Let's see why with a simple example: 

In [None]:
def shout(word="yes"):
    return word.capitalize() + "!" 

print(shout())
# outputs : 'Yes!' 

# As an object, you can assign the function to a variable like any other object.

scream = shout 

# Notice we don't use parentheses: we are not calling the function, we are
# putting the function "shout" into the variable "scream"
# This means that you can call "shout" from "scream":

print(scream())
# outputs : 'Yes!' 

# More than that, it means you can remove the old name 'shout',
# and the function will still be accessible from 'scream'

del shout
try:
    print(shout())
except NameError as e: 
    print(e)
    # outputs : "name 'shout' is not defined"

    
print(scream())
# outputs: 'Yes!'

Another interesting property of Python functions is that they can be defined inside another function! 

In [None]:
def talk():
    # You can define a function on the fly in "talk" ... 
    def whisper(word="yes"):
        return word.lower() + "..."
    
    # ... And use it right away 
    
    print(whisper()) 
    
# You call "talk", that defines "whisper" EVERY TIME you call it, then
# "whisper" is called in "talk" 
talk()
# outputs: "yes..." 


# But "whisper" DOES NOT EXIST outside of "talk":

try: 
    print(whisper())
except NameError as e: 
    print(e)
    # outputs : "name 'whisper' is not defined"

### Function References
Now the fun part... 

You've seen that functions are objects. Therefore, functions: 
- Can be assigned to a variable
- Can be defined in another function

That means that **a function can return another function.**

In [None]:
def getTalk(kind="shout"):
    # We define functions on the fly
    def shout(word="yes"): 
        return word.capitalize() + "!"
    
    def whisper(word="yes"):
        return word.lower() + "..." 
    
    # then we return one of them
    
    if kind == "shout": 
        # We don't use "()", we are not calling the function,
        # we are returning the function object
        return shout
    else: 
        return whisper
    
# how do you use this strange beast?
# Get the function and assign it to a variable

talk = getTalk()

# You can see that "talk" is here a function object. 
print(talk)
# outputs : <function getTalk.<locals>.shout at 0x00000251B45D8940>


# The object is the one returned by the function:
print(talk())
# outputs : 'Yes!'


# And you can even use it directly if you feel wild: 
print(getTalk("whisper")())
        

There's more! 

If you can return a function, you can pass one as a parameter: 




In [None]:
def doSomethingBefore(func):
    print("I do something before, then I call the function you gave me!")
    return func()

doSomethingBefore(scream)

Well, you just have everything needed to understand decorators. You see, decorators are "wrappers", which means that **they let you execute code before and after the function they decorate** without modifying the function itself. 



### Handcrafted decorators
How you would do it:

In [None]:
# A decorator is a function that expects another function as parameter.

def my_shiny_new_decorator(a_function_to_decorate): 
    
    # Inside, the decorator defines a function on the fly: the wrapper. 
    # This function is going to be wrapped around the original function
    # so it can execute code before and after it. 
    
    def the_wrapper_arround_the_original_function():
        
        # Put here the code you want to be executed BEFORE the original function is called
        print("Before the function runs")
        
        # Call the function here (using parentheses)
        a_function_to_decorate()
        
        # Put here the code you want to be executed AFTER the original function is called.
        print("After the function runs")
        
    # At this point, "a_function_to_decorate" HAS NEVER BEEN EXECUTED.
    # We return the wrapper function we have just created. 
    # The wrapper contains the function and the code to execute before and after.
    
    return the_wrapper_arround_the_original_function

# Now imagine you create a function you don't want to ever touch again
def a_stand_alone_function():
    print("Don't you dare modify me")
    
# a_stand_alone_function()
# outputs: Don't you dare modify me. 


print("_______________________")


# Well, you can decorate it to extend its behavior. 
# Just pass it to the decorator, it will wrap it dynamically in
# any code you want and return you a new function ready to be used

# a_stand_alone_function_decorated = my_shiny_new_decorator(a_stand_alone_function)
# a_stand_alone_function_decorated()
    

Now, you probably want that every time you call a_stand_alone_function, a_stand_alone_function_decorated is called instead. That's easy, just overwrite a_stand_alone_function with the function returned by my_shiny_new_decorator: 

In [None]:
a_stand_alone_function = my_shiny_new_decorator(a_stand_alone_function)
a_stand_alone_function()

# This is EXACTLY what decorators do!

### Decorator demystified
The previous example, using the decorator syntax:

In [None]:
@my_shiny_new_decorator
def another_stand_alone_function():
    print("Leave me alone")
    
another_stand_alone_function()

Yes, that's all, it's that simple. @decorator is just a shortcut to:

another_stand_alone_function = my_shiny_new_decorator(another_stand_alone_function)

Decorators are just a pythonic variant of the decorator design pattern. There are several classic design patterns embedded in Python to ease development (like iterators).

Of course, you can accumulate decorators:

In [None]:
def bread(func):
    def wrapper():
        print(r"</''''''\>")  # raw string: "\>" is not a valid escape sequence
        func()
        print(r"</''''''\>")
    return wrapper 
              
def ingredients(func):
    def wrapper():
        print("tomatoes")
        func()
        print("~salad~")
    return wrapper
              
def sandwich(food = "--ham--"):
    print(food)
    
    
# sandwich()
# outputs: --ham-- 

sandwich = bread(ingredients(sandwich))
sandwich()


              
              

Using the Python decorator syntax:

In [None]:
@bread
@ingredients
def sandwich(food ="--ham--"):
    print(food)
    
sandwich()

To conclude, consider the following example:

In [None]:
# The decorator to make it bold
def makebold(fn):
    def wrapper():
        return "<b>" + fn() + "</b>"
    return wrapper

# The decorator to make it italic
def makeitalic(fn):
    def wrapper():
        return "<i>" + fn() + "</i>"
    return wrapper

@makebold
@makeitalic
def say():
    return "hello"

print(say())
#outputs: <b><i>hello</i></b>

# This is the exact equivalent to 
def say():
    return "hello"

say = makebold(makeitalic(say))

print(say())
#outputs: <b><i>hello</i></b>
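
The wrappers shown so far only work for functions that take no arguments. One common way to lift that restriction, sketched here under the assumption that the wrapper should simply forward everything, is to accept `*args` and `**kwargs` in the wrapper. `functools.wraps` is used so the decorated function keeps its original name. The names `log_call` and `add` are hypothetical.

```python
import functools

def log_call(func):
    # functools.wraps copies the name and docstring of func onto the wrapper
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print(f"Calling {func.__name__}")
        result = func(*args, **kwargs)   # forward whatever arguments were given
        print(f"{func.__name__} finished")
        return result                    # hand the return value back to the caller
    return wrapper

@log_call
def add(a, b=5):
    return a + b

total = add(3, b=4)
print(total)
print(add.__name__)
```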

## **PENDING** Lambdas

## **PENDING** Setters & Getters

## **PENDING** Recursion

## **PENDING** Regular Expressions

## **PENDING** Higher Order Functions

## Generators (yield)

Python generator functions allow you to declare a function that behaves like an iterator.

The main difference between a regular function and a generator function is that a generator function maintains its state through the keyword yield. yield works much like return, but with an important difference: yield saves the state of the function. The next time a value is requested from the generator, execution continues from where it left off, with the same variable values it had before yielding, whereas the return statement terminates the function completely.

Generators are iterators, a kind of iterable you can only iterate over once. Generators do not store all values in memory; they generate the values on the fly.

In [None]:
mygenerator = (x*x for x in range(3))

for i in mygenerator:
    print(i)
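
The "only iterate over once" property can be checked with a short sketch. The variable names are made up for this example.

```python
squares = (x * x for x in range(3))

first_pass = list(squares)    # consumes the generator
second_pass = list(squares)   # nothing is left to produce

print(first_pass)
print(second_pass)
```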


### Yield
yield is a keyword that is used like return, except that the function will return a generator.

In [None]:
def create_generator():
    my_list = range(3)
    for i in my_list:
        yield i*i
            
mygenerator = create_generator()
print(mygenerator)

for i in mygenerator:
    print(i)


To master yield, you must understand that **when you call the function, the code you have written in the function body does not run.** The function only returns the generator object.

The first time the for loop requests a value from the generator object created from your function, it will run the code in your function from the beginning until it hits yield, then it'll return the first value of the loop. Then, each subsequent request will run another iteration of the loop you have written in the function and return the next value. This will continue until the generator is considered empty, which happens when the function runs without hitting yield.
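
The same steps can be followed by hand with the built-in `next()`, which is what a for loop uses behind the scenes. The generator `countdown` is a hypothetical example; when the body finishes without reaching yield, `next()` raises `StopIteration`.

```python
def countdown(n):
    # Hypothetical generator used only for this illustration
    print("body starts running")
    while n > 0:
        yield n
        n -= 1

gen = countdown(2)     # nothing is printed yet: the body has not started
first = next(gen)      # runs the body until the first yield
second = next(gen)     # resumes after the yield, with n preserved

try:
    next(gen)          # the loop ends without reaching yield again
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, exhausted)
```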

In [None]:
import memory_profiler as mem_profile
import random
import time

names = ["John", 'Corey', 'Adam', 'Steve', 'Rick', 'Thomas']
majors = ['Math', 'Engineering', 'CompSci', 'Arts', 'Business']
print('Memory (before): {}Mb'.format(mem_profile.memory_usage()))

def people_list(num_people):
    result = []
    for i in range(num_people):
        person = {
            'id': i,
            'name': random.choice(names),
            'major': random.choice(majors)
        }
        result.append(person)
    return result

def people_generator(num_people):
    for i in range(num_people):
        person = {
            'id': i,
            'name': random.choice(names),
            'major': random.choice(majors)
        }
        yield person
        
t1 = time.perf_counter()
people = people_list(1000000)
t2 = time.perf_counter()

# t1 = time.perf_counter()
# people = people_generator(1000000)
# t2 = time.perf_counter()

print('Memory (after): {}Mb'.format(mem_profile.memory_usage()))
print('Took {} seconds'.format(t2-t1))
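
If `memory_profiler` is not installed, a rough comparison is still possible with the standard library. The sketch below is an approximation only: `sys.getsizeof` reports the size of the container object itself, not of the elements it refers to, so it understates what the list really costs.

```python
import sys

numbers_list = [i * i for i in range(100_000)]   # all values are stored at once
numbers_gen = (i * i for i in range(100_000))    # values are produced on demand

list_size = sys.getsizeof(numbers_list)
gen_size = sys.getsizeof(numbers_gen)
print(list_size, gen_size)
```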



## **PENDING** Docstring
https://www.programiz.com/python-programming/docstrings, https://peps.python.org/pep-0257/

# Python - OOP

## Introduction to OOP in Python
Python is a multi-paradigm programming language. It supports different programming approaches.
One of the popular approaches to solve a programming problem is by creating objects. This is known as Object-Oriented Programming (OOP).
An object has two characteristics:
- Attributes
- Behavior

Let's take an example:
A parrot is an object, as it has the following properties:
- Attributes: name, age, color 
- Behavior: singing, dancing 

### Example 1 - Attributes
Note: The __init__() function is called automatically every time the class is being used to create a new object.

In [None]:
# First we create a class named Parrot

class Parrot:

    # class attribute
    species = "bird"

    # instance attribute
    def __init__(self, name, age):   
        self.name = name
        self.age = age
        
# instantiate the Parrot class
blu = Parrot("Blu", 10)
woo = Parrot("Woo", 15)

# access the class attributes
print("Blu is a", blu.species)
print("Woo is also a", woo.species)

# access the instance attributes
print(f"{blu.name} is {blu.age} years old")

print(f"{woo.name} is {woo.age} years old")

FMI on attributes, see: Class attributes vs Instance attributes.




### Example 2 - Methods (Behavior)
Methods are functions defined inside the body of a class. They are used to define the behaviors of an object.

In [None]:
class Parrot: 
    # Instance attributes
    def __init__(self, name, age):
        self.name = name
        self.age = age
    
    # Instance methods
    def sing(self, song):
        print (f"{self.name} sings {song}")
    
    def dance(self):
        print (f"{self.name} is dancing")
    
pato = Parrot("pato", 3)
pato.sing("macarena")
pato.dance()
    
    

### Example 3 - Inheritance
Inheritance is a way of creating a new class that uses the details of an existing class without modifying it. The newly formed class is a derived class (or child class). Similarly, the existing class is a base class (or parent class).

A polygon is a closed figure with 3 or more sides. Say, we have a class called Polygon defined as follows.

In [None]:
# class Polygon:
#     # Creates a Polygon with the specified number of sides.
#     # Each side is given the default value 0, so
#     # shape = Polygon(4)
#     # gives shape.sides == [0, 0, 0, 0]
#     def __init__(self, no_of_sides):
#         self.n = no_of_sides
#         self.sides = [0 for i in range(no_of_sides)]

        
#     # Reads the length of each side from the user
#     def input_sides(self):
#         self.sides = [float(input("Enter side " + str(i + 1) + ":")) for i in range(self.n)]

#     # Displays the length of each side
#     def disp_sides(self):
#         for i in range(self.n):
#             print("Side", i + 1, "is", self.sides[i])
            
# t = Polygon(3)
# print(t.sides)
# t.input_sides()
# t.disp_sides()

A triangle is a polygon with 3 sides. So, we can create a class called Triangle which inherits from Polygon. This makes all the attributes of Polygon class available to the Triangle class.
We don't need to define them again (code reusability). Triangle can be defined as follows.

In [None]:
# class Polygon:
#     # Creates a Polygon with the specified number of sides.
#     # Each side is given the default value 0, so
#     # shape = Polygon(4)
#     # gives shape.sides == [0, 0, 0, 0]
#     def __init__(self, no_of_sides):
#         self.n = no_of_sides
#         self.sides = [0 for i in range(no_of_sides)]
        
#     # Reads the length of each side from the user and overwrites self.sides
#     def input_sides(self):
#         self.sides = [float(input("Enter side " + str(i + 1) + ":")) for i in range(self.n)]

#     # Displays the length of each side
#     def disp_sides(self):
#         for i in range(self.n):
#             print("Side", i + 1, "is", self.sides[i])


# class Triangle(Polygon):
#     def __init__(self):
#         Polygon.__init__(self, 3)

#     def find_area(self):
#         a, b, c = self.sides  ### FMI: See Packing & Unpacking
        
#         # FMI on the area of a triangle from its three sides, see: https://www.cuemath.com/measurement/area-of-triangle-with-3-sides/
        
#         # Formula: Area = sqrt( s(s-a)(s-b)(s-c) ) 
#         # Where: Semi Perimeter, s = (a + b + c) / 2 
#         # and: a, b, c are the sides of the triangle.
        
#         # calculate the semi-perimeter
#         s = (a + b + c) / 2
#         area = ( s*(s-a)*(s-b)*(s-c) ) ** 0.5 # ** is the exponentiation operator
#         ### FMI on operators, see: https://www.w3schools.com/python/python_operators.asp ###
        
#         print('The area of the triangle is %0.2f' %area)
        
# some_triangle = Triangle() 
# some_triangle.input_sides()
# some_triangle.find_area()
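
The classes above read the side lengths with input(), which waits for the user. The sketch below is a non-interactive variant, assuming the side lengths are passed to the constructor instead (a change from the original design) and using `super()` to reach the parent initializer.

```python
class Polygon:
    def __init__(self, sides):
        # Assumption for this sketch: side lengths are passed in directly
        self.n = len(sides)
        self.sides = list(sides)

    def disp_sides(self):
        for i, side in enumerate(self.sides, start=1):
            print("Side", i, "is", side)


class Triangle(Polygon):
    def __init__(self, a, b, c):
        super().__init__([a, b, c])    # reuse the parent initializer

    def find_area(self):
        a, b, c = self.sides
        s = (a + b + c) / 2            # semi-perimeter
        return (s * (s - a) * (s - b) * (s - c)) ** 0.5   # Heron's formula


t = Triangle(3, 4, 5)
t.disp_sides()                         # method inherited from Polygon
area = t.find_area()
print('The area of the triangle is %0.2f' % area)
```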

        
        

### Example 4 - Encapsulation
Using OOP in Python, we can restrict access to methods and variables. This prevents data from being modified directly, which is called encapsulation. In Python, we denote private attributes using an underscore as the prefix, i.e. a single _ or a double __.

In [None]:
class Computer:

    def __init__(self):
        self.__price = 900

    def show_selling_price(self):
        print(f"Selling Price: {self.__price}")

    def set_price(self, selling_price):
        self.__price = selling_price

        
# We instantiate a Computer object.
c = Computer()
c.show_selling_price()

# Here, we try to modify the value of __price from outside the class.
# Because of name mangling, the class stores it as _Computer__price, so this
# assignment only creates a new, unrelated attribute and the output is unchanged.
c.__price = 1500
c.show_selling_price()

# So the intended way to change the price is to use a setter function
c.set_price(1000)
c.show_selling_price()
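
What happens to a double-underscore attribute can be inspected directly. The class `Account` below is a hypothetical example: Python renames `__balance` to `_Account__balance` inside the class body, which makes accidental access harder but does not make it impossible.

```python
class Account:
    # Hypothetical class, used only to illustrate name mangling
    def __init__(self):
        self.__balance = 100       # stored as _Account__balance

    def balance(self):
        return self.__balance


acct = Account()
has_plain_name = hasattr(acct, "__balance")   # the unmangled name does not exist
mangled_value = acct._Account__balance        # still reachable: a convention, not a lock
print(has_plain_name, mangled_value)
```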

### ***RABBIT HOLE*** Example 5 - Polymorphism

## Python Class
Python is an object-oriented programming language. Unlike procedure-oriented programming, where the main emphasis is on functions, object-oriented programming puts the emphasis on objects.

An object is simply a collection of data (variables) and methods (functions) that act on those data. Similarly, a class is a blueprint for that object.

We can think of a class as a sketch (prototype) of a house. It contains all the details about the floors, doors, windows, etc. Based on these descriptions we build the house. House is the object.

As many houses can be made from a house's blueprint, we can create many objects from a class. An object is also called an instance of a class and the process of creating this object is called instantiation.

### Defining a Class in Python
Just as function definitions begin with the def keyword in Python, class definitions begin with the class keyword.

The first string inside the class is called a docstring and has a brief description of the class. Although not mandatory, this is highly recommended. FMI: https://www.programiz.com/python-programming/docstrings, https://peps.python.org/pep-0257/

Here is a simple class definition.

In [None]:
class MyNewClass:
    '''This is a docstring. I have created a new class'''
    pass
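
The docstring is stored on the class and can be read back through the `__doc__` attribute. The class name `Greeter` is made up for this sketch.

```python
class Greeter:
    '''Hypothetical class used to show how a docstring is stored.'''
    pass

doc = Greeter.__doc__
print(doc)
```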

## Class Inheritance
Inheritance is a powerful feature in object oriented programming.

It refers to defining a new class with little or no modification to an existing class. The new class is called derived (or child) class, and the one from which it inherits is called the base (or parent) class. 

**Python Inheritance Syntax**

In [None]:
class BaseClass:
    # Body of base class
    pass

class DerivedClass(BaseClass):
    # Body of derived class
    pass

### Method Overriding
To demonstrate the use of inheritance, let us re-use the previous inheritance example.


In [None]:
# class Polygon:
#     # Creates a Polygon with the specified number of sides.
#     # Each side is given the default value 0, so
#     # shape = Polygon(4)
#     # gives shape.sides == [0, 0, 0, 0]
#     def __init__(self, no_of_sides):
#         self.n = no_of_sides
#         self.sides = [0 for i in range(no_of_sides)]
        
#     # Reads the length of each side from the user and overwrites self.sides
#     def input_sides(self):
#         self.sides = [float(input("Enter side " + str(i + 1) + ":")) for i in range(self.n)]

#     # Displays the length of each side
#     def disp_sides(self):
#         for i in range(self.n):
#             print("Side", i + 1, "is", self.sides[i])


# class Triangle(Polygon):
#     def __init__(self):
#         Polygon.__init__(self, 3)

#     def find_area(self):
#         a, b, c = self.sides  ### FMI: See Packing & Unpacking
        
#         # FMI on the area of a triangle from its three sides, see: https://www.cuemath.com/measurement/area-of-triangle-with-3-sides/
        
#         # Formula: Area = sqrt( s(s-a)(s-b)(s-c) ) 
#         # Where: Semi Perimeter, s = (a + b + c) / 2 
#         # and: a, b, c are the sides of the triangle.
        
#         # calculate the semi-perimeter
#         s = (a + b + c) / 2
#         area = ( s*(s-a)*(s-b)*(s-c) ) ** 0.5 # ** is the exponentiation operator
#         ### FMI on operators, see: https://www.w3schools.com/python/python_operators.asp ###
        
#         print('The area of the triangle is %0.2f' %area)
        
# some_triangle = Triangle() 
# some_triangle.input_sides()
# some_triangle.find_area()

In [None]:
# isinstance(some_triangle, Triangle)

# isinstance(some_triangle, Polygon)

# isinstance(some_triangle, object)

# # Similarly, issubclass() is used to check for class inheritance. 

# issubclass(Polygon,Triangle)

# issubclass(Triangle, Polygon)
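
A minimal sketch of overriding an ordinary method is given below. The method name `describe` is hypothetical and not part of the Polygon example above; the child class replaces the parent method and still reuses its result through `super()`.

```python
class Polygon:
    def describe(self):
        return "a closed figure"


class Triangle(Polygon):
    def describe(self):
        # Override the parent method, reusing its result through super()
        return super().describe() + " with 3 sides"


base_text = Polygon().describe()
child_text = Triangle().describe()
print(base_text)
print(child_text)
print(issubclass(Triangle, Polygon), issubclass(Polygon, Triangle))
```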

## Multiple Inheritance

A class can be derived from more than one base class in Python, similar to C++. This is called multiple inheritance.

In multiple inheritance, the features of all the base classes are inherited into the derived class. The syntax for multiple inheritance is similar to single inheritance.

Example:

In [None]:
class Base1:
    pass

class Base2:
    pass

class MultiDerived(Base1, Base2):
    pass

![image.png](attachment:1947fef7-f6f7-4720-926a-d477eba3a52b.png)
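
To see the features of both base classes arrive in the derived class, here is a small sketch with made-up class names (`Flyer`, `Swimmer`, `Duck`).

```python
class Flyer:
    def fly(self):
        return "flying"


class Swimmer:
    def swim(self):
        return "swimming"


class Duck(Flyer, Swimmer):
    pass                 # inherits fly() and swim() without defining anything


d = Duck()
print(d.fly(), d.swim())
```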

### Multilevel Inheritance
We can also inherit from a derived class. This is called multilevel inheritance. It can be of any depth in Python.

In multilevel inheritance, features of the base class and the derived class are inherited into the new derived class.

An example with corresponding visualization is given below. 

In [None]:
class Base:
    pass

class Derived1(Base):
    pass

class Derived2(Derived1):
    pass

Here, the Derived1 class is derived from the Base class, and the Derived2 class is derived from the Derived1 class.
![image.png](attachment:82709f76-5efd-402b-953c-8c722910d630.png)


### Method Resolution Order in Python

Every class in Python is derived from the *object* class. It is the most basic type in Python.

So technically, all other classes, either built-in or user-defined, are derived classes and all objects are instances of the *object* class.

In [None]:
print(issubclass(list,object))

print(isinstance(5.5,object))

print(isinstance("Hello",object))
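
The order in which Python searches classes for an attribute is available through the `__mro__` attribute of a class. The sketch below reuses the class names from the multiple inheritance example and adds a hypothetical method `who` to both bases, to show which one wins.

```python
class Base1:
    def who(self):
        return "Base1"


class Base2:
    def who(self):
        return "Base2"


class MultiDerived(Base1, Base2):
    pass


# The class itself comes first, then the bases from left to right, then object
order = [cls.__name__ for cls in MultiDerived.__mro__]
print(order)
print(MultiDerived().who())
```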

### ***RABBIT HOLE*** - Monotonicity.

## Operator Overloading

You can change the meaning of an operator in Python depending upon the operands used. 

Python operators work for built-in classes. But the same operator behaves differently with different types. For example, the + operator will perform arithmetic addition on two numbers, merge two lists, or concatenate two strings.

This feature in Python that allows the same operator to have different meaning according to the context is called operator overloading.

So what happens when we use them with objects of a user-defined class? Let us consider the following class, which tries to simulate a point in a 2-D coordinate system.



In [None]:
class Point:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y
        
p1 = Point(1, 2)
p2 = Point(3, 4) 

# print(p1 + p2)
# will print: 
# TypeError: unsupported operand type(s) for +: 'Point' and 'Point'

Here, we can see that a TypeError was raised, since Python didn't know how to add two Point objects together.

However, we can achieve this task in Python through operator overloading. But first, let's get a notion about special functions.

### Python Special Functions
Class functions that begin with double underscore are called special functions in Python.

These functions are not the typical functions that we define for a class. The __ init __ () function we defined above is one of them. It gets called every time we create a new object of that class.

FMI on Python Special Functions: 
https://docs.python.org/3/reference/datamodel.html#special-method-names

Using special functions, we can make our class compatible with built-in functions

In [None]:
p1 = Point(2,3)
print(p1)

Suppose we want the print() function to print the coordinates of the Point object instead of what we got. We can define a __ str __ () method in our class that controls how the object gets printed. Let's look at how we can achieve this.

In [None]:
class Point: 
    def __init__(self, x = 0, y = 0):
        self.x = x
        self.y = y
    def __str__(self):
        return f"({self.x}, {self.y})"
    
p1 = Point(2, 3)
print(p1)

Now back to operator Overloading.
### Overloading the + Operator

To overload the + operator, we will need to implement the __ add __ () function.

**With great power comes great responsibility. We can do whatever we like inside this function.**

But, it makes more sense to return a Point object of the coordinate sum.

In [None]:
class Point:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y
    
    def __str__(self):
        return f"({self.x}, {self.y})"
    
    def __add__(self, other):
        x = self.x + other.x
        y = self.y + other.y
        return Point(x, y)
    
p1 = Point(1, 2)
p2 = Point(2, 3)

print(p1+p2)

<div class="table-responsive">
	<table border="0"><tbody><tr><th>Operator</th>
				<th>Expression</th>
				<th>Internally</th>
			</tr><tr><td>Addition</td>
				<td><code>p1 + p2</code></td>
				<td><code>p1.__add__(p2)</code></td>
			</tr><tr><td>Subtraction</td>
				<td><code>p1 - p2</code></td>
				<td><code>p1.__sub__(p2)</code></td>
			</tr><tr><td>Multiplication</td>
				<td><code>p1 * p2</code></td>
				<td><code>p1.__mul__(p2)</code></td>
			</tr><tr><td>Power</td>
				<td><code>p1 ** p2</code></td>
				<td><code>p1.__pow__(p2)</code></td>
			</tr><tr><td>Division</td>
				<td><code>p1 / p2</code></td>
				<td><code>p1.__truediv__(p2)</code></td>
			</tr><tr><td>Floor Division</td>
				<td><code>p1 // p2</code></td>
				<td><code>p1.__floordiv__(p2)</code></td>
			</tr><tr><td>Remainder (modulo)</td>
				<td><code>p1 % p2</code></td>
				<td><code>p1.__mod__(p2)</code></td>
			</tr><tr><td>Bitwise Left Shift</td>
				<td><code>p1 &lt;&lt; p2</code></td>
				<td><code>p1.__lshift__(p2)</code></td>
			</tr><tr><td>Bitwise Right Shift</td>
				<td><code>p1 &gt;&gt; p2</code></td>
				<td><code>p1.__rshift__(p2)</code></td>
			</tr><tr><td>Bitwise AND</td>
				<td><code>p1 &amp; p2</code></td>
				<td><code>p1.__and__(p2)</code></td>
			</tr><tr><td>Bitwise OR</td>
				<td><code>p1 | p2</code></td>
				<td><code>p1.__or__(p2)</code></td>
			</tr><tr><td>Bitwise XOR</td>
				<td><code>p1 ^ p2</code></td>
				<td><code>p1.__xor__(p2)</code></td>
			</tr><tr><td>Bitwise NOT</td>
				<td><code>~p1</code></td>
				<td><code>p1.__invert__()</code></td>
			</tr></tbody></table></div>
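The other rows of the table above follow the same pattern as `__add__()`. As a minimal sketch, the class below adds `__sub__()` and `__mul__()` to Point. Treating `*` as scaling by a number is an assumption made only for illustration, and returning `NotImplemented` for unsupported operands is a common convention, not something the text above prescribes.

```python
class Point:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __str__(self):
        return f"({self.x}, {self.y})"

    def __sub__(self, other):
        # p1 - p2 is translated to p1.__sub__(p2)
        if not isinstance(other, Point):
            return NotImplemented
        return Point(self.x - other.x, self.y - other.y)

    def __mul__(self, factor):
        # p1 * 3 is translated to p1.__mul__(3)
        # Assumption for this sketch: * scales both coordinates by a number
        if not isinstance(factor, (int, float)):
            return NotImplemented
        return Point(self.x * factor, self.y * factor)


p1 = Point(5, 7)
p2 = Point(2, 3)

print(p1 - p2)   # (3, 4)
print(p2 * 3)    # (6, 9)
```

Returning `NotImplemented` lets Python raise a TypeError for operands the class does not know how to handle, instead of failing with an attribute error inside the method.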

## Concepts & Details

### Class attributes vs Instance attributes

- Class attributes are the variables defined directly in the class that are shared by all objects of the class.

- Instance attributes are attributes or properties attached to an instance of a class. Instance attributes are defined in the constructor.

<table >
                <thead>
                    <tr>
                        <th class="w-50">
                            Class Attribute
                        </th>
                        <th class="w-50">
                            Instance Attribute
                        </th>
                    </tr>
                </thead>
                <tbody>
                    <tr>
                        <td>
                            Defined directly inside a class.
                        </td>
                        <td>
                            Defined inside a constructor using the <code>self</code> parameter.
                        </td>
                    </tr>
                    <tr>
                        <td>
                            Shared across all objects.
                        </td>
                        <td>
                            Specific to object.
                        </td>
                    </tr>
                    <tr>
                        <td>
                            Accessed using class name as well as using object with dot notation, e.g. <code>classname.class_attribute</code> or <code>object.class_attribute</code>
                        </td>
                        <td>
                            Accessed using object dot notation e.g. <code>object.instance_attribute</code>
                        </td>
                    </tr>
                    <tr>
                        <td>Changing value by using <code>classname.class_attribute = value</code> will be reflected to all the objects.</td>
                        <td>Changing value of instance attribute will not be reflected to other objects.</td>
                    </tr>
                    </tbody>
                </table>
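The differences listed in the table above can be sketched with a small class. The attribute names `school` and `name` and the values used are hypothetical and chosen only for this illustration.

```python
class Student:
    school = "Unknown"          # class attribute, shared by all objects

    def __init__(self, name):
        self.name = name        # instance attribute, specific to each object


s1 = Student("Ana")
s2 = Student("Luis")

# Changing the class attribute through the class name is visible from every object
Student.school = "Central High"
print(s1.school, s2.school)     # Central High Central High

# Assigning through an object creates an instance attribute on that object only
s1.school = "North High"
print(s1.school, s2.school, Student.school)  # North High Central High Central High
```

Note that assigning through an object does not modify the class attribute; it shadows it for that one object.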

In [None]:
# Example: 
# first: We define a Student class with a class attribute called 'count', which keeps the number of students.
# then: Every time we create a new Student object, the count is increased by one.
class Student:
    count = 0
    def __init__(self):
        Student.count += 1 

std1 = Student()
print(Student.count)

std2 = Student()
print(Student.count)

### Instance Methods

### Static Methods


### Class methods


### Abstract Class


### First Class Objects


## OOP Example 1


# Python - Pandas

## Introduction
Pandas is a Python library used for working with data sets.
It has functions for analyzing, cleaning, exploring, and manipulating data.

The name "Pandas" has a reference to both "Panel Data" and "Python Data Analysis".

Pandas allows us to analyze big data and draw conclusions based on statistical theories.
Pandas can clean messy data sets, and make them readable and relevant.

**Relevant data is very important in data science**

Install pandas:
C:\Users\Your Name>pip install pandas

## Pandas Series
A Pandas Series is like a column in a table.

It is a one-dimensional array holding data of any type. 
Example:

In [None]:
import pandas as pd
a = [1, 7, 2]
myvar = pd.Series(a)
print(myvar)

### Labels
If nothing else is specified, the values are labeled with their index number. First value has index 0, second value has index 1, etc.

This label can be used to access a specified value.

With the _index_ argument, you can name your own labels.

Example:


In [None]:
import pandas as pd 

a = [1, 7, 2]

myvar = pd.Series(a, index = ["x", "y", "z"])

print(myvar)

# When you have created labels, you can access an item by referring to the label. 

print(myvar["y"])



**Key/Value Objects as Series**

You can also use a key/value object, like a dictionary, when creating a Series.


In [None]:
import pandas as pd 

calories = {"day1": 420, "day2": 380, "day3": 390}

myvar = pd.Series(calories)

print(myvar)

# The keys of the dictionary become the labels.

To select only some of the items in the dictionary, use the index argument and specify only the items you want to include in the Series.

In [None]:
import pandas as pd 

calories = {"day1": 420, "day2": 380, "day3": 390}

myvar = pd.Series(calories, index = ["day1","day3"])

print(myvar)

## DataFrames
Data sets in Pandas are usually multi-dimensional tables, called DataFrames. 

A Series is like a column; a DataFrame is the whole table.

Example: 

In [None]:
import pandas as pd

data = {
  "calories": [420, 380, 390],
  "duration": [50, 40, 45]
}

df = pd.DataFrame(data)

print(df)


### Locate Row
As you can see from the result above, the DataFrame is like a table with rows and columns.

Pandas uses the __loc__ attribute to return one or more specified row(s).

Example: 


In [None]:
# refer to the row index:
print(df.loc[0])

print("---------------------")

# Return row 0 and 1:
print(df.loc[[0, 1]])

# When using [], the result is a Pandas DataFrame
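The difference between the two calls above is the type of the result. The sketch below rebuilds the same small DataFrame so that it runs on its own and checks the types directly.

```python
import pandas as pd

df = pd.DataFrame({
    "calories": [420, 380, 390],
    "duration": [50, 40, 45],
})

row = df.loc[0]        # single label: one row, returned as a Series
rows = df.loc[[0]]     # list of labels: a DataFrame, even with a single row

print(type(row).__name__)    # Series
print(type(rows).__name__)   # DataFrame
print(row["calories"])       # 420
```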


### Named Indexes
With the index argument, you can name your own indexes. 

In [None]:
import pandas as pd

data = {
  "calories": [420, 380, 390],
  "duration": [50, 40, 45]
}

df = pd.DataFrame(data, index = ["day1", "day2", "day3"])

print(df) 

### Locate Named Indexes
Use the named index in the loc attribute to return the specified row(s) 

In [None]:
# Return day2 
print(df.loc["day2"])

### Load Files into a Data Frame
If your data sets are stored in a file, Pandas can load them into a DataFrame. 

In [None]:
import pandas as pd 

df = pd.read_csv('Materials/calories_data.csv')

print(df)

## Pandas Read CSV
A simple way to store big data sets is to use CSV files. 

CSV files contain plain text and are a well-known format that can be read by everyone, including Pandas.

In the next examples we will be using a CSV file called 'calories_data.csv'

Example

In [None]:
import pandas as pd 

df = pd.read_csv('Materials/calories_data.csv')

# print(df.to_string())   # This prints the entire DataFrame
print(df.head())          # This prints only the first five rows
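If the CSV file is not at hand, the same call can be tried on text held in memory. This is a sketch under assumptions: the column names and values below are made up and do not come from 'calories_data.csv'.

```python
import io
import pandas as pd

# Made-up CSV text standing in for a file on disk
csv_text = """Duration,Pulse,Calories
60,110,409.1
60,117,479.0
45,109,282.4
"""

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.head(2))   # first two rows
print(df.shape)     # (3, 3)
```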


## Read JSON
Big data sets are often stored, or extracted, as JSON.

JSON is plain text, but has the format of an object, and is well known in the world of programming, including Pandas.

In the next examples we'll be using a JSON file called 'calories_data.json'


In [None]:
import pandas as pd 

df = pd.read_json('Materials/calories_data.json')

print(df.head())

## Analyzing DataFrames
One of the most used methods for getting a quick overview of the DataFrame is the head() method.

The head() method returns the headers and a specified number of rows, starting from the top.

There's also a tail() method for viewing the last rows of the DataFrame. 

The tail() method returns the headers and a specified number of rows, starting from the bottom. 

Examples:

In [None]:
import pandas as pd 
df = pd.read_csv('Materials/calories_data.csv') 

print(df.head())
print(df.tail())

***Info About the Data*** 
The DataFrame object has a method called info(), which gives you more information about the data set.
Example: 

In [None]:
print(df.info())

***Null Values***
The info() method also tells us how many Non-Null values there are in each column, and in our data set it seems like there are 164 of 169 Non-Null values in the "Calories" column.

This means that there are 5 rows with no value at all in the "Calories" column, for whatever reason.

Empty values, or Null values can be bad when analyzing data, and you should consider removing rows with empty values. This is a step towards what is called _cleaning data_.
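Besides reading the info() output, the empty cells can be counted directly. The sketch below uses a small made-up DataFrame and the `isnull()` method, which the text above does not cover.

```python
import numpy as np
import pandas as pd

# Made-up data with two empty cells in the "Calories" column
df = pd.DataFrame({
    "Duration": [60, 45, 30, 60],
    "Calories": [409.1, np.nan, 282.4, np.nan],
})

missing_per_column = df.isnull().sum()            # number of empty cells in each column
print(missing_per_column)

rows_with_missing = df[df.isnull().any(axis=1)]   # rows with at least one empty cell
print(rows_with_missing)
```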

## Data Cleaning

Data cleaning means fixing bad data in your data set.

Bad data could be: 
- Empty cells 
- Data in wrong format
- Wrong data
- Duplicates
- NaN values 

- The data set contains some empty cells ("Date" in row 22, and "Calories" in row 18 and 28).

- The data set contains wrong format ("Date" in row 26).

- The data set contains wrong data ("Duration" in row 7).

- The data set contains duplicates (row 11 and 12).

### Cleaning Empty Cells
#### Remove Rows with dropna()
One way to deal with empty cells is to remove rows that contain empty cells.

This is usually OK, since data sets can be very big, and removing a few rows will not have a big impact on the result.

Example - Return a new DataFrame with no empty cells:

In [None]:
import pandas as pd 

df = pd.read_csv('Materials/unclean_data_example.csv')

new_df = df.dropna()

print(new_df.info())

# By default, the dropna() method returns a new 
# DataFrame, and will not change the original.

# If you wanted to change the original DataFrame, you could
# use the inplace = True argument
#
# df.dropna(inplace = True)
# 
# will NOT return a new DataFrame, but remove rows from the 
# original.

#### Replace Empty Values with fillna()

Another way of dealing with empty cells is to insert a new value instead.

This way you do not have to delete entire rows just because of some empty cells.

The fillna() method allows us to replace empty cells with a value:

Example:

In [None]:
import pandas as pd 

df = pd.read_csv('Materials/unclean_data_example.csv')

df.fillna(130, inplace = True) 
df.info()

#### Replace only for specified columns
To only replace empty values for one column, specify the column name for the DataFrame:


In [None]:
# Replace NULL values in the "Calories" column with the number 130: 
import pandas as pd 

df = pd.read_csv('Materials/unclean_data_example.csv')

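# Assign the result back: calling fillna(inplace=True) on a selected column is
# chained assignment, which recent pandas versions warn about
df["Calories"] = df["Calories"].fillna(130)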

df["Calories"]

#### Replace Using Mean, Median or Mode

A common way to replace empty cells is to calculate the mean, median or mode value of the column.

Pandas uses the mean(), median(), and mode() methods to calculate the respective values for a specified column:

Example:

Calculate the MEAN, and replace any empty values with it:

In [None]:
df = pd.read_csv('Materials/unclean_data_example.csv')

calories_mean = df["Calories"].mean() 

# Assign the result back instead of using inplace=True on a column selection
df["Calories"] = df["Calories"].fillna(calories_mean)

print("Calories mean:", calories_mean)

print(df["Calories"])
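The example above uses the mean. As a sketch of the other two options, the block below fills a made-up column with the median and with the mode. Note that mode() returns a Series, because several values can be equally common, so the first entry is selected.

```python
import numpy as np
import pandas as pd

# Made-up column with one empty cell
df = pd.DataFrame({"Calories": [300.0, 300.0, np.nan, 400.0, 500.0]})

calories_median = df["Calories"].median()    # middle value, ignoring empty cells
calories_mode = df["Calories"].mode()[0]     # most frequent value (first one if tied)

filled_with_median = df["Calories"].fillna(calories_median)
filled_with_mode = df["Calories"].fillna(calories_mode)

print("median:", calories_median)   # median: 350.0
print("mode:", calories_mode)       # mode: 300.0
```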

### Cleaning Data of Wrong Format

Cells with data of wrong format can make it difficult, or even impossible to analyze data.

To fix it, you have two options: remove the rows, or convert all cells in the columns to the same format.

### Convert into a correct format

In our DataFrame, two cells in the "Date" column need attention: row 22 (empty) and row 26 (wrong format).

Pandas has a to_datetime() method for this. 

Example:



In [None]:
import pandas as pd

df = pd.read_csv('Materials/unclean_data_example.csv')

df['Date'] = pd.to_datetime(df['Date'])

print(df.to_string())

This will fix row 26, but row 22 now has a NaT (Not a Time) value, which can be handled as a NULL value, and we can remove the row by using the dropna() method.

In [None]:
df.dropna(subset=['Date'], inplace = True)
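To see how NaT values arise and how they are dropped, here is a sketch on a made-up Series. The `errors="coerce"` argument is not used in the example above; it is shown here as one way to turn values that cannot be parsed into NaT instead of raising an error.

```python
import pandas as pd

# Made-up values: two valid dates, one empty cell, one value that is not a date
dates = pd.Series(["2020/12/01", "2020/12/02", None, "not a date"])

# errors="coerce" turns values that cannot be parsed into NaT
parsed = pd.to_datetime(dates, errors="coerce")

print(parsed)
print(parsed.isna().sum())     # 2 (the empty cell and the unparseable text)

cleaned = parsed.dropna()      # NaT is treated like any other missing value
print(len(cleaned))            # 2
```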

### Fixing wrong data

"Wrong data" does not have to be "empty cells" or "wrong format", it can just be wrong, like if someone registered "199" instead of "1.99".

Sometimes you can spot wrong data by looking at the data set, because you have an expectation of what it should be.

If you take a look at our data set, you can see that in row 7, the duration is 450, but for all the other rows the duration is between 30 and 60.

It doesn't have to be wrong, but taking into consideration that this is the data set of someone's workout sessions, we can conclude that this person did not work out for 450 minutes.

#### Replacing values

One way to fix wrong values is to replace them with something else. 

In our example, it is most likely a typo, and the value should be "45" instead of "450", so we could just insert 45 in row 7:

In [None]:
df.loc[7, 'Duration'] = 45

To replace wrong data in larger data sets, you can create some rules and replace any values that are outside of the boundaries.

Example: 

Loop through all values in the "Duration" column. 

If the value is higher than 120, set it to 120:

In [None]:
for row in df.index: 
    if df.loc[row, "Duration"] > 120: 
        df.loc[row, "Duration"] = 120

#### Removing Rows
Another way of handling wrong data is to remove the rows that contain wrong data.

In [None]:
for x in df.index:
  if df.loc[x, "Duration"] > 120:
    df.drop(x, inplace = True)
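The two loops above can also be written without iterating over the index. This is an alternative sketch on made-up data, not the approach used above: `clip()` caps the values, and a boolean condition keeps only the rows within the limit.

```python
import pandas as pd

# Made-up durations, one of them clearly too high
df = pd.DataFrame({"Duration": [60, 45, 450, 30]})

# Cap values above 120 without an explicit loop
capped = df["Duration"].clip(upper=120)
print(capped.tolist())                  # [60, 45, 120, 30]

# Keep only the rows whose duration is within the limit
filtered = df[df["Duration"] <= 120]
print(filtered["Duration"].tolist())    # [60, 45, 30]
```

Both operations return new objects and leave `df` unchanged, so the result has to be assigned if it should be kept.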

### Finding and Removing Duplicates

To discover duplicates, we can use the duplicated() method.
This method returns a boolean value for each row. 


In [None]:
print(df.duplicated())

To remove duplicates, use the drop_duplicates() method.

In [None]:
df.drop_duplicates(inplace = True)
print(df.duplicated())

## Data Correlations


**Finding Relationships**
A great aspect of the Pandas module is the corr() method.

The corr() method calculates the relationship between each column in your data set.


In [None]:
import pandas as pd
df = pd.read_csv("Materials/correlation_example_data.csv")
df.corr()

**Result Explained**
The result of the corr() method is a table with a lot of numbers that represent how strong the relationship is between two columns.

The number varies from -1 to 1. 

1 means that there is a 1 to 1 relationship, and for this data set, each time a value went up in the first column, the other one went up as well.

0.9 is also a good relationship, and if you increase one value, the other will probably increase as well.

-0.9 would be just as good a relationship as 0.9, but if you increase one value, the other will probably go down.

0.2 means NOT a good relationship, meaning that if one value goes up, it does not mean that the other will.
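The two extremes described above can be reproduced with a tiny made-up data set, where one column rises in step with another and a third one falls. The column names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "hours":     [1, 2, 3, 4, 5],
    "distance":  [2, 4, 6, 8, 10],    # rises in step with hours
    "remaining": [10, 8, 6, 4, 2],    # falls as hours rises
})

corr = df.corr()
print(corr)

print(round(corr.loc["hours", "distance"], 2))    # 1.0
print(round(corr.loc["hours", "remaining"], 2))   # -1.0
```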


## Pandas Plotting
Pandas uses the plot() method to create diagrams. 

We can use Pyplot, a submodule of the Matplotlib library to visualize the diagram on the screen.

Example: 

Import pyplot from Matplotlib and visualize our DataFrame:

In [None]:
import pandas as pd 
import matplotlib.pyplot as plt

df = pd.read_csv('Materials/calories_data.csv') 

df.plot()
plt.show()



### Scatter Plot

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('Materials/calories_data.csv')

df.plot(kind = 'scatter', x = 'Duration', y = 'Calories')

plt.show()

### Histogram

In [None]:
df["Duration"].plot(kind = 'hist')

# Python - NumPy

NumPy is a Python library used for working with arrays.

It also has functions for working in the domains of linear algebra, Fourier transforms, and matrices.

NumPy stands for Numerical Python.

**Why use NumPy?**

In Python we have lists that serve the purpose of arrays, but they are slow to process.

NumPy aims to provide an array object that is up to 50x faster than traditional Python lists.

The array object in NumPy is called ___ndarray___; it provides a lot of supporting functions that make working with ___ndarray___ very easy.

Arrays are very frequently used in data science, where speed and resources are very important.

**Why is NumPy faster than lists?**
NumPy arrays are stored at one continuous place in memory, unlike lists, so processes can access and manipulate them very efficiently.

This behavior is called locality of reference in computer science.

This is the main reason why NumPy is faster than lists. It is also optimized to work with the latest CPU architectures.
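The speed difference can be measured roughly with the standard `timeit` module. This is only a sketch: the array size and repeat count are arbitrary choices, and the timings depend on the machine, so no particular speed-up is claimed here.

```python
import timeit
import numpy as np

n = 100_000
py_list = list(range(n))
np_arr = np.arange(n)

# Double every element: list comprehension versus one vectorized NumPy operation
list_time = timeit.timeit(lambda: [v * 2 for v in py_list], number=20)
numpy_time = timeit.timeit(lambda: np_arr * 2, number=20)

print(f"list:  {list_time:.4f} s")
print(f"numpy: {numpy_time:.4f} s")

# Both approaches produce the same values
same = [v * 2 for v in py_list] == (np_arr * 2).tolist()
print(same)
```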

## Creating Arrays
We can create a NumPy ___ndarray___ object using the array() function. 

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr)
print(type(arr))

To create an ___ndarray___, we can pass a list, tuple, or any array-like object into the array() method, and it will be converted into an ___ndarray___.

### Dimensions in Arrays
A dimension in arrays is one level of array depth (nested arrays). 

#### 0-D Arrays

0-D arrays, or Scalars, are the elements in an array. Each value in an array is a 0-D array.

Example of 0-D array:

In [None]:
import numpy as np

arr = np.array(42)

print(arr)


#### 1-D Arrays
An array that has 0-D arrays as its elements is called uni-dimensional or 1-D array.

These are the most common and basic arrays. 

Example of a 1-D array:

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr)

#### 2-D Arrays
An array that has 1-D arrays as its elements is called a 2-D array.

These are often used to represent a matrix or a 2nd-order tensor.

- **Note:** NumPy has a whole submodule dedicated to linear algebra and matrix operations, called numpy.linalg.

Example of a 2-D array: 

In [None]:
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr)

#### 3-D Arrays
An array that has 2-D arrays (matrices) as its elements is called a 3-D array.

These are often used to represent a 3rd-order tensor.

Example: 

In [None]:
import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(arr)

### Check number of dimensions?
NumPy arrays provide the ndim attribute, which returns an integer that tells us how many dimensions the array has.

Example: 

In [None]:
import numpy as np

a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(a.ndim)
print(b.ndim)
print(c.ndim)
print(d.ndim)

### Higher Dimensional Arrays
An array can have any number of dimensions. 

When the array is created, you can define the number of dimensions by using the ___ndmin___ argument. 

Example: Create an array with 5 dimensions and verify that it has 5 dimensions.

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4], ndmin=5)

print(arr)
print('number of dimensions :', arr.ndim)

## Array Indexing
Array indexing is the same as accessing an array element.

You can access an array element by referring to its index number.

The indexes in NumPy arrays start with 0, meaning that the first element has index 0, the second has index 1, etc.

### Access 1-D Arrays

Example:

In [None]:
import numpy as np 

arr = np.array([1, 2, 3, 4])

print(arr[0])
print(arr[2])
print(arr[1] + arr[3]) 


### Access 2-D Arrays.
To access elements from 2-D arrays we can use comma separated integers representing the dimension and the index of the element. 

Think of 2-D arrays like a table with rows and columns, where the row represents the dimension and the index represents the column. 

Example:

In [None]:
import numpy as np

arr = np.array([
    [1,2,3,4,5],
    [6,7,8,9,10]
    ])



print('2nd element on 1st row:', arr[0, 1])
print('5th element on 2nd row:', arr[1, 4])

### Access 3-D Arrays.
To access elements from 3-D arrays we can use comma separated integers representing the dimensions and the index of the element. 

In [None]:
import numpy as np 

arr = np.array([[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]]])

print(arr[0,1,2])
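The lookup above can be followed one index at a time. The sketch below repeats the same array and shows what each additional index selects.

```python
import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

first_block = arr[0]          # first 2-D array: [[1, 2, 3], [4, 5, 6]]
second_row = arr[0, 1]        # second row of that block: [4, 5, 6]
third_value = arr[0, 1, 2]    # third element of that row: 6

print(first_block)
print(second_row)
print(third_value)
```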


### Negative Indexing
Use negative indexing to access an array from the end.

Example: 

In [None]:
import numpy as np 

arr = np.array([ [1,2,3,4,5], [6,7,8,9,10] ])

print('Last element from 2nd dim: ', arr[1, -1])

## Array Slicing
Slicing in Python means taking elements from one given index to another given index.

We pass a slice instead of an index, like this: ___[start:end]___

We can also define the step, like this: ___[start:end:step]___

If we don't pass start, it is considered 0.

If we don't pass end, it is considered the length of the array in that dimension.

If we don't pass step, it is considered 1.

### Slicing Arrays
Example:

In [None]:
import numpy as np 

arr = np.array([1, 2, 3, 4, 5, 6, 7])

# the result includes the start index, but excludes the end index. #

# Slice elements from index 1 to index 5
print(arr[1:5])

# Slice elements from index 4 to the end of the array
print(arr[4:])

# Slice elements from the beginning to index 4 

print(arr[:4])


### Negative Slicing
Use the minus operator to refer to an index from the end:

Example: 

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

# Slice from the index 3 from the end to index 1 from the end:
print(arr[-3:-1])


### STEP slicing
Use the step value to determine the step of the slicing: 

Example: 

In [None]:
import numpy as np 

arr = np.array([1, 2, 3, 4, 5, 6, 7])


print(arr[1:5:2])

print(arr[::2])

### Slicing 2-D Arrays
Example:


In [None]:
import numpy as np

arr = np.array([ [1, 2, 3, 4, 5], [6, 7, 8, 9, 10] ])

# From the second element, slice elements from index 1 to index 4 (not included)
print(arr[1, 1:4])

# From both elements, return index 2: 
print(arr[0:2, 2])

# From both elements, slice index 1 to index 4 (not included), this will return 
# a 2-D array

print(arr[0:2, 1:4])


## NumPy Data Types
By default, Python has these data types:
- ___strings___: used to represent text data; the text is given in quote marks, e.g. "abCD"
- ___integer___: used to represent integer numbers, e.g. -1, 3, 5
- ___float___: used to represent real numbers, e.g. 1.2, -3.4, 42.42
- ___boolean___: used to represent True or False
- ___complex___: used to represent complex numbers, e.g. 1.0 + 2.0j

NumPy has some extra data types, and refers to data types with one character, like ___i___ for integers, ___u___ for unsigned integers, etc.

Below is a list of all data types in NumPy and the characters used to represent them. 

- ___i___ - integer
- ___b___ - boolean
- ___u___ - unsigned integer
- ___f___ - float
- ___c___ - complex float
- ___m___ - timedelta
- ___M___ - datetime
- ___O___ - object
- ___S___ - string
- ___U___ - unicode string
- ___V___ - fixed chunk of memory for other type ( void )
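The one-character codes from the list above can be read from an existing array through the `kind` attribute of its dtype. The sample arrays below are made up for illustration.

```python
import numpy as np

examples = {
    "integers": np.array([1, 2, 3]),
    "floats": np.array([1.5, 2.5]),
    "booleans": np.array([True, False]),
    "complex": np.array([1 + 2j]),
    "text": np.array(["apple", "banana"]),
}

for name, arr in examples.items():
    # dtype.kind is the one-character code; dtype is the full type name
    print(name, arr.dtype.kind, arr.dtype)
```

The full type name printed next to the code (for example the integer size) can differ between platforms, while the one-character kind stays the same.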

### Checking the Data Type of an Array
The NumPy array object has a property called ___dtype___ that returns the data type of the array:

In [None]:
import numpy as np 

arr = np.array([1, 2, 3, 4])
arr2 = np.array(['apple', 'banana', 'cherry'])

# Get the data type of an array object
print(arr.dtype)
print(arr2.dtype)

### Creating Arrays with a Defined Data Type
We use the array() function to create arrays; this function can take an optional argument, dtype, that allows us to define the expected data type of the array elements:


In [None]:
import numpy as np 

# Create an array with data type string:
arr = np.array([1, 2, 3, 4], dtype='S')

print(arr)
print(arr.dtype)

# For i, u, f, S and U we can define size as well. 

# Create an array with data type 4 bytes integer:
arr = np.array([1, 2, 3, 4], dtype='i4')

print(arr)
print(arr.dtype)

## Array Copy vs View
The main difference between a copy and a view of an array is that the copy is a new array, and the view is just a view of the original array.

The copy _owns_ the data and any changes made to the copy will not affect the original array, and any changes made to the original array will not affect the copy. 

The view _does not own_ the data and any changes made to the view will affect the original array, and any changes made to the original array will affect the view.

### Copy Example
Make a copy, change the original array, and display both arrays:

In [None]:
import numpy as np

arr = np.array( [1, 2, 3, 4, 5] )
x = arr.copy()
arr[0] = 42

print(arr)
print(x)

### View Example
Make a view, change the original array, and display both arrays:

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
x = arr.view()
arr[0] = 42

print(arr)
print(x)

### Check if Array owns its Data
As mentioned above, copies own the data, and views do not own the data, but how can we check this?

Every NumPy array has the attribute ___base___ that returns ___None___ if the array owns the data.

Otherwise, the base attribute refers to the original object.

In [None]:
import numpy as np 

arr = np.array([1, 2, 3, 4, 5])

x = arr.copy()
y = arr.view()

print(x.base)
print(y.base)
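The ___base___ check is also useful for slices, because basic slicing returns a view rather than a copy. The sketch below is an addition to the example above and uses the same kind of array.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

part = arr[1:4]            # basic slicing returns a view, not a copy
print(part.base is arr)    # True

part[0] = 99               # writing through the view changes the original
print(arr.tolist())        # [1, 99, 3, 4, 5]

safe = arr[1:4].copy()     # an explicit copy owns its data
safe[0] = -1               # does not affect arr
print(arr[1])              # 99
print(safe.base)           # None
```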

## NumPy Array Shape
The shape of an array is the number of elements in each dimension.

NumPy arrays have an attribute called ___shape___ that returns a tuple with each index having the number of corresponding elements.

In [None]:
import numpy as np 

arr = np.array([ [1, 2, 3, 4], [5, 6, 7, 8] ])

print(arr.shape)

The example above returns ___(2, 4)___, which means that the array has 2 dimensions, where the first dimension has 2 elements and the second has 4.

Example: 

Create an array with 5 dimensions using ndmin from a vector with the values 1, 2, 3, 4, and verify that the last dimension has 4 elements:

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4], ndmin=5)

print(arr)
print('Shape:', arr.shape)

The integer at each index tells us the number of elements the corresponding dimension has.

In the example above, at index 4 we have the value 4, so we can say that the 5th (4 + 1th) dimension has 4 elements.

## Array Reshaping
Reshaping means changing the shape of an array.

The shape of an array is the number of elements in each dimension.

By reshaping we can add or remove dimensions or change number of elements in each dimension. 


### Reshape from 1-D to 2-D

Example:

Convert the following 1-D array with 12 elements into a 2-D array.

The outermost dimension will have 4 arrays, each with 3 elements. 

In [None]:
import numpy as np 

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

new_arr = arr.reshape(4, 3) 

print(new_arr)

### Reshape from 1-D to 3-D
Convert the following 1-D array with 12 elements into a 3-D array.

The outermost dimension will have 2 arrays that contain 3 arrays, each with 2 elements:

In [None]:
import numpy as np 

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

new_arr = arr.reshape(2, 3, 2)

print(new_arr)

### Can we reshape into any shape?
Yes, as long as the number of elements required for reshaping is equal in both shapes.

We can reshape an 8-element 1-D array into a 2-D array with 2 rows of 4 elements, but we cannot reshape it into a 2-D array with 3 rows of 3 elements, as that would require 3x3 = 9 elements.
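The rule above can be checked directly: a reshape that keeps the element count works, and one that does not raises a ValueError. A minimal sketch:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

ok = arr.reshape(2, 4)     # 2 x 4 = 8 elements, so this works
print(ok.shape)            # (2, 4)

try:
    arr.reshape(3, 3)      # 3 x 3 = 9 elements, one more than the array has
    failed = False
except ValueError as err:
    failed = True
    print("reshape failed:", err)
```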

### Unknown Dimension
You are allowed to have one "unknown" dimension.

Meaning that you do not have to specify an exact number for one of the dimensions in the reshape method.

Pass ___-1___ as the value, and NumPy will calculate this number for you. 

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

new_arr = arr.reshape(2, 2, -1)

print(new_arr)

### Flattening the Arrays
Flattening an array means converting a multidimensional array into a 1D array.

We can use ___reshape(-1)___ to do this.

Example: 

In [None]:
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

newarr = arr.reshape(-1)

print(newarr)

There are a lot of functions for changing the shapes of arrays in NumPy, such as ___flatten___ and ___ravel___, and also for rearranging the elements, such as ___rot90___, ___flip___, ___fliplr___, ___flipud___, etc.
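As a brief sketch of some of the functions named above: flatten() always returns a copy, while ravel() returns a view when it can, which is the case for the small array used here. The flip functions reverse the order of elements along one or all axes.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

flat_copy = arr.flatten()   # always returns a copy
flat_view = arr.ravel()     # returns a view when possible

flat_copy[0] = 100          # does not touch arr
flat_view[1] = 200          # changes arr as well, because it is a view here

print(arr.tolist())                 # [[1, 200, 3], [4, 5, 6]]
print(np.flip(arr).tolist())        # [[6, 5, 4], [3, 200, 1]]
print(np.fliplr(arr).tolist())      # [[3, 200, 1], [6, 5, 4]]
print(np.flipud(arr).tolist())      # [[4, 5, 6], [1, 200, 3]]
```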

## Array Iterating
Iterating means going through elements one by one.

As we deal with multi-dimensional arrays in NumPy, we can do this using the basic ___for___ loop of Python.



### Iterating 1D arrays
If we iterate on a 1D array it will go through each element one by one. 
Example: 

In [None]:
import numpy as np

arr = np.array([1,2,3])

for x in arr:
    print(x)

### Iterating 2D arrays.
In a 2D array it will go through all the rows.

In [None]:
import numpy as np 

arr = np.array([[1, 2, 3], [4, 5, 6]])

for x in arr:
  print(x)

# To return the actual values, the scalars, we have to 
# iterate the arrays in each dimension

for x in arr:
    for y in x: 
        print(y)

### Iterating 3D arrays
In a 3D array it will go through all the 2D arrays 

In [None]:
import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

for x in arr:
  print(x)

# To return the actual values, we have to iterate the arrays in each dimension. 
for x in arr:
  for y in x:
    for z in y:
      print(z)


### Iterating Arrays Using nditer()
The function ___nditer()___ is a helper function that can be used from very basic to very advanced iterations. It solves some basic issues which we face in iteration.

#### Iterating on Each Scalar element.
With basic ___for___ loops, iterating through each scalar of an n-dimensional array requires n nested ___for___ loops, which can be difficult to write for arrays with very high dimensionality.

In [None]:
import numpy as np

arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

for x in np.nditer(arr):
    print(x)

#### Iterating Array with different Data Types
We can use the ___op_dtypes___ argument and pass it the expected data type to change the data type of elements while iterating.

NumPy does not change the data type of the element in-place (where the element is in the array), so it needs some other space to perform this action. That extra space is called a buffer, and in order to enable it in nditer() we pass ___flags=['buffered']___.

In [None]:
import numpy as np 

arr = np.array([1,2,3])

for x in np.nditer(arr, flags=['buffered'], op_dtypes=['S']):
    print(x)

#### Iterating with different Step Size
We can use slicing to filter the array, followed by iteration.

In [None]:
import numpy as np

arr = np.array([ [1, 2, 3, 4], [5, 6, 7, 8] ])

for x in np.nditer(arr[:, ::2]):
    print(x)

#### Enumerated Iteration using ndenumerate()
Enumeration means mentioning the sequence number of things one by one.

Sometimes we require the corresponding index of the element while iterating; the ndenumerate() method can be used for those use cases.

In [None]:
import numpy as np 

# For 1D arrays
arr = np.array([1, 2, 3])
for idx, x in np.ndenumerate(arr):
    print(idx, x)
    
# For 2D arrays
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
for idx, x in np.ndenumerate(arr):
    print(idx, x)

## Joining Arrays
Joining means putting the contents of two or more arrays in a single array.

In SQL we join tables based on a key, whereas in NumPy we join arrays by axis.

We pass a sequence of arrays that we want to join to the ___concatenate()___ function, along with the axis. If axis is not explicitly passed, it is taken as 0.

### using concatenate()

In [None]:
import numpy as np 

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

arr = np.concatenate((arr1, arr2))

print(arr)

In [None]:
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])

arr2 = np.array([[5, 6], [7, 8]])

arr = np.concatenate((arr1, arr2), axis=1)

print(arr)

### Using stack()
Stacking is the same as concatenation; the only difference is that stacking is done along a new axis.

We can concatenate two 1-D arrays along the second axis, which would result in putting them one over the other, i.e. stacking.

We pass a sequence of arrays that we want to join to the stack() method along with the axis. If axis is not explicitly passed it is taken as 0. 

In [None]:
import numpy as np

arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

arr = np.stack((arr1, arr2), axis=1)

print(arr)
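
To make the effect of the axis argument more concrete, here is a small sketch that compares the result shapes of stack() on both axes with concatenate(). The variable names `rows`, `cols` and `flat` are just illustrative.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# axis=0 (the default): each input array becomes a row
rows = np.stack((arr1, arr2))
print(rows.shape)  # (2, 3)

# axis=1: each input array becomes a column
cols = np.stack((arr1, arr2), axis=1)
print(cols.shape)  # (3, 2)

# concatenate() keeps the number of dimensions, stack() adds one
flat = np.concatenate((arr1, arr2))
print(flat.shape)  # (6,)
```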

#### Stacking Along Rows
NumPy provides a helper function, ___hstack()___, to stack along rows.

In [None]:
import numpy as np

arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

arr = np.hstack((arr1, arr2))

print(arr)

#### Stacking along Columns
NumPy provides a helper function, ___vstack()___, to stack along columns.

In [None]:
import numpy as np

arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

arr = np.vstack((arr1, arr2))

print(arr)

#### Stacking along Height (depth)
NumPy provides a helper function, ___dstack()___, to stack along height, which is the same as depth.

In [None]:
import numpy as np

arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

arr = np.dstack((arr1, arr2))

print(arr)
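
As a quick comparison of the three helpers, the sketch below prints the shape each one produces for the same pair of 1D arrays, and checks how hstack() and vstack() relate to concatenate() and stack() for 1D inputs. The names `h`, `v` and `d` are only for illustration.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

h = np.hstack((arr1, arr2))  # one flat array of 6 elements
v = np.vstack((arr1, arr2))  # 2 rows, 3 columns
d = np.dstack((arr1, arr2))  # 1 x 3 x 2: pairs along the third axis

print(h.shape, v.shape, d.shape)

# For 1D inputs, hstack() matches concatenate() and vstack() matches stack() on axis 0
print(np.array_equal(h, np.concatenate((arr1, arr2))))
print(np.array_equal(v, np.stack((arr1, arr2), axis=0)))
```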

## Splitting Arrays
Splitting is the reverse operation of joining.

We use ___array_split()___ for splitting arrays; we pass it the array we want to split and the number of splits.

In [None]:
import numpy as np 

arr = np.array([1, 2, 3, 4, 5, 6])

new_arr = np.array_split(arr, 3)

# The return value is a list containing three arrays.
print(new_arr)

# If the array has fewer elements than required, it will adjust from the end accordingly

new_arr = np.array_split(arr, 4)

print(new_arr)
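
The uneven case is easier to see by looking at the size of each piece. The sketch below also contrasts array_split() with NumPy's stricter split() function, which raises an error when the array cannot be divided equally.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# array_split() tolerates an uneven division: the later pieces are shorter
pieces = np.array_split(arr, 4)
print([len(p) for p in pieces])  # [2, 2, 1, 1]

# split() insists on an equal division and raises ValueError otherwise
try:
    np.split(arr, 4)
except ValueError as err:
    print("split() failed:", err)
```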


### Split Into Arrays
The return value of the ___array_split()___ function is a list containing each of the splits as an array.

If you split an array into 3 arrays, you can access them from the result just like any list element:

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

new_arr = np.array_split(arr, 3)

print(new_arr[0])
print(new_arr[1])
print(new_arr[2])


### Splitting 2-D arrays
Use the same syntax when splitting 2D arrays.

Use the array_split() method, pass in the array you want to split and the number of splits you want to do. 

In [None]:
import numpy as np

arr = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])

newarr = np.array_split(arr, 3)

# Returns three 2D arrays
print(newarr)




# In addition, you can specify which axis you want to do the split around.

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])


newarr = np.array_split(arr, 3, axis=1)

print(newarr)


## Searching Arrays
You can search an array for a certain value, and return the indexes that get a match

To search an array, use the ___where()___ method.

Example:

In [None]:
import numpy as np 

arr = np.array([1, 2, 3, 4, 5, 4, 4])


# Find the indexes where the value is 4
x = np.where(arr == 4)

# Find the indexes where the value is even 
y = np.where(arr%2 == 0)

# Find the indexes where the values are odd
z = np.where(arr%2 == 1)


print(x)
print(y)
print(z)
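
The printed results above are tuples, which can be surprising at first. A minimal sketch of how to unpack them, for both a 1D and a 2D array (the `grid` array is a made-up example):

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 4, 4])

# where() returns a tuple with one index array per dimension
result = np.where(arr == 4)
indexes = result[0]
print(indexes.tolist())       # [3, 5, 6]

# The index array can be used to look the matching values back up
print(arr[indexes].tolist())  # [4, 4, 4]

# For a 2D array we get one index array for rows and one for columns
grid = np.array([[1, 4], [4, 2]])
rows, cols = np.where(grid == 4)
print(list(zip(rows.tolist(), cols.tolist())))  # [(0, 1), (1, 0)]
```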

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])


x = np.where(arr%2 == 0)

print(x)

### Search Sorted

There is a function called ___searchsorted()___ which performs a binary search in a sorted array, and returns the index where the specified value would be inserted to maintain the sort order.

In [None]:
import numpy as np 

arr = np.array([6, 7, 8, 9])

x = np.searchsorted(arr, 7)

print(x)

Example explained: 

The number 7 should be inserted at index 1 to maintain the sort order.

The method starts the search from the left and returns the first index where the number 7 is no longer larger than the next value.

**Search From the Right Side**

By default the leftmost index is returned, but we can pass ___side='right'___ to return the rightmost index instead.

In [None]:
import numpy as np

arr = np.array([6, 7, 8, 9])

x = np.searchsorted(arr, 7, side='right')

print(x)

Example explained:

The number 7 should be inserted at index 2 to maintain the sort order.

The method starts the search from the right and returns the first index where the number 7 is no longer less than the next value.

**Multiple Values**

To search for more than one value, we can use an array with the specified values.

Example - Find the indexes where the values 2, 4, and 6 should be inserted:

In [None]:
import numpy as np

arr = np.array([1, 3, 5, 7])

x = np.searchsorted(arr, [2, 4, 6])

# The return value is an array, containing the three indexes 
# where 2, 4, 6 would be inserted in the original array
print(x)
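
One way to put the returned index to use is to actually insert the value with np.insert(), as sketched below. The second half uses a made-up array with duplicates (`dupes`) to show that the side argument only changes the result when the value is already present.

```python
import numpy as np

arr = np.array([1, 3, 5, 7])

# Find where 4 belongs, then insert it there; the result stays sorted
idx = np.searchsorted(arr, 4)
new_arr = np.insert(arr, idx, 4)
print(idx, new_arr.tolist())  # 2 [1, 3, 4, 5, 7]

# With duplicates, 'left' points before the run of 7s and 'right' after it
dupes = np.array([6, 7, 7, 8])
left = np.searchsorted(dupes, 7)
right = np.searchsorted(dupes, 7, side='right')
print(left, right)  # 1 3
```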

## Sorting Arrays
Sorting means putting elements in an _ordered sequence_.

An _ordered sequence_ is any sequence that has an order corresponding to elements, like numeric or alphabetical, ascending or descending.

NumPy provides a function called ___sort()___ that will sort a specified array. np.sort() returns a sorted copy and leaves the original array unchanged.

In [None]:
import numpy as np

arr = np.array([3, 2, 0, 1])

print(np.sort(arr))

You can also sort arrays of strings, or any other data type: 

In [None]:
import numpy as np

arr = np.array(['banana','cherry','apple'])

print(np.sort(arr))

Sort a boolean array

In [None]:
import numpy as np 

arr = np.array([True, False, True])

print(np.sort(arr))

### Sorting a 2D array

If you use the sort() function on a 2D array, each row (inner array) will be sorted:

In [None]:
import numpy as np 

arr = np.array([ [3, 2, 4], [5, 0, 1] ])

print(np.sort(arr))
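
The row-wise behaviour comes from the default axis (the last one). The sketch below shows sorting along the other axis, and the difference between np.sort(), which returns a copy, and the ndarray's own sort() method, which sorts in place.

```python
import numpy as np

arr = np.array([[3, 2, 4], [5, 0, 1]])

# Default: sort along the last axis, so each row is sorted
by_row = np.sort(arr)
print(by_row.tolist())  # [[2, 3, 4], [0, 1, 5]]

# axis=0: sort down each column instead
by_col = np.sort(arr, axis=0)
print(by_col.tolist())  # [[3, 0, 1], [5, 2, 4]]

# np.sort() did not touch the original array
original = arr.copy()
print(np.array_equal(arr, original))

# The ndarray method sorts in place and returns None
arr.sort()
print(arr.tolist())     # [[2, 3, 4], [0, 1, 5]]
```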

## Filtering Arrays
Getting some elements out of an existing array and creating a new array out of them is called filtering.

In NumPy, you filter an array using a __boolean index list__.

___
A boolean index list is a list of booleans corresponding to indexes in the array.
___

If the value at an index is ___True___, that element is included in the filtered array; if the value at that index is ___False___, that element is excluded from the filtered array.


In [None]:
import numpy as np

arr = np.array([41, 42, 43, 44])

x = [True, False, True, False]

new_arr = arr[x]

print(new_arr)


### Creating the Filter Array
In the example above we hard-coded the True and False values, but the common use is to create a filter array based on conditions.

Example - Create a filter array that will return only values higher than 42:

In [None]:
import numpy as np

arr = np.array([41, 42, 43, 44])

# Create an empty list
filter_arr = []

# Go through each element in arr
for element in arr: 
    if element > 42: 
        filter_arr.append(True)
    else: 
        filter_arr.append(False)

new_arr = arr[filter_arr]

print(arr)
print(filter_arr)
print(new_arr)

Example - Create a filter array that will return only even elements from the original array


In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

filter_arr = []

for element in arr:
    if element % 2 == 0:
        filter_arr.append(True)
    else:
        filter_arr.append(False)

new_arr = arr[filter_arr]

print(arr)
print(filter_arr)
print(new_arr)


### Creating Filter Directly From Array
The above example is quite a common task in NumPy and NumPy provides a nice way to tackle it. 

We can directly substitute the array for the loop variable in our condition, and it will work just as we expect it to.

In [None]:
import numpy as np 

arr = np.array([41, 42, 43, 44])

filter_arr = arr > 42

new_arr = arr[filter_arr]

print(filter_arr)
print(new_arr)
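
Conditions built this way can also be combined. A minimal sketch, assuming we want several conditions at once: NumPy uses `&` (and), `|` (or) and `~` (not) on boolean arrays, and each condition needs its own parentheses.

```python
import numpy as np

arr = np.array([41, 42, 43, 44, 45, 46])

# Both conditions must hold: greater than 42 AND even
mask = (arr > 42) & (arr % 2 == 0)
print(arr[mask].tolist())    # [44, 46]

# Either condition may hold: below 42 OR above 45
either = (arr < 42) | (arr > 45)
print(arr[either].tolist())  # [41, 46]

# ~ inverts a filter array
print(arr[~mask].tolist())   # [41, 42, 43, 45]
```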

## NumPy - ufunc
ufuncs stands for "Universal Functions"; they are NumPy functions that operate on ___ndarray___ objects.

ufuncs are used to implement _vectorization_ in NumPy, which is much faster than iterating over elements.

They also provide broadcasting and additional methods like reduce, accumulate, etc. that are very helpful for computation. 

ufuncs also take additional arguments, like: 

___where___ boolean array or condition defining where the operations should take place.

___dtype___ defining the return type of elements.

___out___ output array where the return value should be copied. 

**What is Vectorization?**

Converting iterative statements into a vector based operation is called vectorization.

It is faster as modern CPUs are optimized for such operations.

In [None]:
# Example without ufunc, we can use Python's built-in zip() method. 

x = [1, 2, 3, 4]
y = [4, 5, 6, 7]
z = []

for i, j in zip(x, y):
    z.append(i + j)
print(z)


NumPy has a ufunc for this, called ___add(x, y)___ that will produce the same result

In [None]:
import numpy as np

x = [1, 2, 3, 4]
y = [4, 5, 6, 7]
z = np.add(x, y)

print(z)

### Simple Arithmetic
You could use arithmetic operators directly between NumPy arrays, but this section discusses an extension of the same idea: functions that can take any array-like objects (e.g. lists, tuples) and perform ***arithmetic conditionally***.

Arithmetic conditionally means that we can define conditions where the arithmetic operation should happen.

All of the discussed arithmetic functions take a ___where___ parameter in which we can specify that condition.

In [None]:
import numpy as np

arr1 = np.array([10, 11, 12, 13, 14, 15])
arr2 = np.array([20, 21, 22, 23, 24, 25])

# Addition
new_arr = np.add(arr1, arr2)
print("Addition: ",new_arr)

# Subtraction 
new_arr = np.subtract(arr1, arr2)
print("Subtraction:", new_arr)

# Multiplication
new_arr = np.multiply(arr1, arr2)
print("Multiplication:", new_arr)

# There's also: Division, Power, Remainder, Quotient and Mod and Absolute Values
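
The ___where___ parameter mentioned above is not used in the cell, so here is a small sketch of it. It assumes we only want to add the elements where arr1 is even; an ___out___ array is passed as well, because positions where the condition is False simply keep whatever the output array already holds.

```python
import numpy as np

arr1 = np.array([10, 11, 12, 13, 14, 15])
arr2 = np.array([20, 21, 22, 23, 24, 25])

# Start from zeros so the skipped positions have a known value
out = np.zeros_like(arr1)

# Add only where arr1 is even; other positions of `out` are left untouched
np.add(arr1, arr2, out=out, where=(arr1 % 2 == 0))

print(out.tolist())  # [30, 0, 34, 0, 38, 0]
```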

### Rounding Decimals
There are primarily five ways of rounding off decimals in NumPy:
- Truncation
- Fix
- Rounding
- Floor
- Ceil

In [None]:
import numpy as np

arr = np.array([-3.1666, 3.6667])
print(arr)

# trunc() | fix() - Remove the decimals, and return the float number closest to zero.
print(np.trunc(arr))
print(np.fix(arr))

# around() rounds to the nearest value; exact halves go to the nearest even value
print(np.around(arr))

# floor() rounds off decimal to nearest lower integer. 
print(np.floor(arr))

# ceil() rounds off decimal to nearest upper integer.
print(np.ceil(arr))
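
Two details of around() are worth a short sketch: exact halves are rounded to the nearest even value, and a second argument sets the number of decimals to keep. The `halves` array is a made-up example.

```python
import numpy as np

# Exact halves go to the nearest even value, not always upwards
halves = np.array([0.5, 1.5, 2.5, 3.5])
rounded_halves = np.around(halves)
print(rounded_halves.tolist())  # [0.0, 2.0, 2.0, 4.0]

# The second argument is the number of decimals to keep
arr = np.array([-3.1666, 3.6667])
two_decimals = np.around(arr, 2)
print(two_decimals)             # approximately -3.17 and 3.67
```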

### Logs
NumPy provides functions to compute logs at base 2, e and 10.

All of the log functions will place -inf or nan in the elements where the log cannot be computed (-inf for 0, nan for negative values).

In [None]:
import numpy as np

arr = np.arange(1, 10) # Returns an array with integers starting from 1 to 10(not included)

# Log base 2
print(np.log2(arr))

# Log base 10
print(np.log10(arr))

# Natural Log
print(np.log(arr))
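
For a base other than 2, e or 10, one common approach is the change-of-base rule, log_b(x) = log(x) / log(b). The sketch below uses base 3 as an arbitrary example, and also shows the special values for inputs where the log cannot be computed (np.errstate only silences the runtime warnings).

```python
import numpy as np

arr = np.array([1, 3, 9, 27])

# Change of base: log base 3 of x is log(x) / log(3)
log3 = np.log(arr) / np.log(3)
print(log3)  # approximately 0, 1, 2, 3

# log(0) gives -inf and the log of a negative number gives nan
with np.errstate(divide='ignore', invalid='ignore'):
    special = np.log(np.array([0.0, -1.0]))
print(special)
```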


# Python - Scipy

SciPy is a scientific computation library that uses NumPy underneath.

SciPy stands for Scientific Python.

It provides more utility functions for optimization, stats and signal processing.

## Constants in SciPy
As SciPy is more focused on scientific implementations, it provides many built-in scientific constants.

These constants can be helpful when you are working with Data Science. 

Example:

In [None]:
from scipy import constants

print(constants.pi)

A list of all units under the constants module can be seen using the ___dir()___ function.

### Constant Units

In [None]:
from scipy import constants

print(dir(constants))

### Unit Categories
The units are placed under these categories:
- Metric
- Binary
- Mass
- Angle
- Time
- Length
- Pressure
- Area
- Volume
- Speed
- Temperature
- Energy
- Power
- Force

### Metric Prefixes
Returns the specified metric prefix as a plain multiplier (for example, kilo is 1000.0).

In [None]:
from scipy import constants

print(constants.yotta)    #1e+24
print(constants.zetta)    #1e+21
print(constants.exa)      #1e+18
print(constants.peta)     #1000000000000000.0
print(constants.tera)     #1000000000000.0
print(constants.giga)     #1000000000.0
print(constants.mega)     #1000000.0
print(constants.kilo)     #1000.0
print(constants.hecto)    #100.0
print(constants.deka)     #10.0
print(constants.deci)     #0.1
print(constants.centi)    #0.01
print(constants.milli)    #0.001
print(constants.micro)    #1e-06
print(constants.nano)     #1e-09
print(constants.pico)     #1e-12
print(constants.femto)    #1e-15
print(constants.atto)     #1e-18
print(constants.zepto)    #1e-21

### Binary Prefixes
Return the specified unit in bytes

In [None]:
from scipy import constants

print(constants.kibi)    #1024
print(constants.mebi)    #1048576
print(constants.gibi)    #1073741824
print(constants.tebi)    #1099511627776
print(constants.pebi)    #1125899906842624
print(constants.exbi)    #1152921504606846976
print(constants.zebi)    #1180591620717411303424
print(constants.yobi)    #1208925819614629174706176

### Mass
Returns the specified unit in kg

In [None]:
from scipy import constants

print(constants.gram)        #0.001
print(constants.metric_ton)  #1000.0
print(constants.grain)       #6.479891e-05
print(constants.lb)          #0.45359236999999997
print(constants.pound)       #0.45359236999999997
print(constants.oz)          #0.028349523124999998
print(constants.ounce)       #0.028349523124999998
print(constants.stone)       #6.3502931799999995
print(constants.long_ton)    #1016.0469088
print(constants.short_ton)   #907.1847399999999
print(constants.troy_ounce)  #0.031103476799999998
print(constants.troy_pound)  #0.37324172159999996
print(constants.carat)       #0.0002
print(constants.atomic_mass) #1.66053904e-27
print(constants.m_u)         #1.66053904e-27
print(constants.u)           #1.66053904e-27

### Angle
Returns the specified unit in radians

In [None]:
from scipy import constants

print(constants.degree)     #0.017453292519943295
print(constants.arcmin)     #0.0002908882086657216
print(constants.arcminute)  #0.0002908882086657216
print(constants.arcsec)     #4.84813681109536e-06
print(constants.arcsecond)  #4.84813681109536e-06

### Time
Returns the specified unit in seconds


In [None]:
from scipy import constants

print(constants.minute)      #60.0
print(constants.hour)        #3600.0
print(constants.day)         #86400.0
print(constants.week)        #604800.0
print(constants.year)        #31536000.0
print(constants.Julian_year) #31557600.0

### Length
Returns the specified unit in meters

In [None]:
from scipy import constants

print(constants.inch)              #0.0254
print(constants.foot)              #0.30479999999999996
print(constants.yard)              #0.9143999999999999
print(constants.mile)              #1609.3439999999998
print(constants.mil)               #2.5399999999999997e-05
print(constants.pt)                #0.00035277777777777776
print(constants.point)             #0.00035277777777777776
print(constants.survey_foot)       #0.3048006096012192
print(constants.survey_mile)       #1609.3472186944373
print(constants.nautical_mile)     #1852.0
print(constants.fermi)             #1e-15
print(constants.angstrom)          #1e-10
print(constants.micron)            #1e-06
print(constants.au)                #149597870691.0
print(constants.astronomical_unit) #149597870691.0
print(constants.light_year)        #9460730472580800.0
print(constants.parsec)            #3.0856775813057292e+16

### Pressure
Returns the specified unit in pascals


In [None]:
from scipy import constants

print(constants.atm)         #101325.0
print(constants.atmosphere)  #101325.0
print(constants.bar)         #100000.0
print(constants.torr)        #133.32236842105263
print(constants.mmHg)        #133.32236842105263
print(constants.psi)         #6894.757293168361

### Area
Returns the specified unit in square meters


In [None]:
from scipy import constants

print(constants.hectare) #10000.0
print(constants.acre)    #4046.8564223999992

### Volume
Return the specified unit in cubic meters


In [None]:
from scipy import constants

print(constants.liter)            #0.001
print(constants.litre)            #0.001
print(constants.gallon)           #0.0037854117839999997
print(constants.gallon_US)        #0.0037854117839999997
print(constants.gallon_imp)       #0.00454609
print(constants.fluid_ounce)      #2.9573529562499998e-05
print(constants.fluid_ounce_US)   #2.9573529562499998e-05
print(constants.fluid_ounce_imp)  #2.84130625e-05
print(constants.barrel)           #0.15898729492799998
print(constants.bbl)              #0.15898729492799998

### Speed
Returns the specified unit in meters per second


In [None]:
from scipy import constants

print(constants.kmh)            #0.2777777777777778
print(constants.mph)            #0.44703999999999994
print(constants.mach)           #340.5
print(constants.speed_of_sound) #340.5
print(constants.knot)           #0.5144444444444445

### Temperature
Return the specified unit in Kelvin


In [None]:
from scipy import constants

print(constants.zero_Celsius)      #273.15
print(constants.degree_Fahrenheit) #0.5555555555555556

### Energy
Return the specified unit in joules

In [None]:
from scipy import constants

print(constants.eV)            #1.6021766208e-19
print(constants.electron_volt) #1.6021766208e-19
print(constants.calorie)       #4.184
print(constants.calorie_th)    #4.184
print(constants.calorie_IT)    #4.1868
print(constants.erg)           #1e-07
print(constants.Btu)           #1055.05585262
print(constants.Btu_IT)        #1055.05585262
print(constants.Btu_th)        #1054.3502644888888
print(constants.ton_TNT)       #4184000000.0

### Power
Returns the specified unit in watts


In [None]:
from scipy import constants

print(constants.hp)         #745.6998715822701
print(constants.horsepower) #745.6998715822701

### Force
Returns the specified unit in newtons


In [None]:
from scipy import constants

print(constants.dyn)             #1e-05
print(constants.dyne)            #1e-05
print(constants.lbf)             #4.4482216152605
print(constants.pound_force)     #4.4482216152605
print(constants.kgf)             #9.80665
print(constants.kilogram_force)  #9.80665
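
Since every constant is just the value of that unit in the base SI unit, converting is a matter of multiplying and dividing. A minimal sketch, with made-up input values (5 miles, 1.5 hours):

```python
from scipy import constants

# Miles to kilometers: multiply by meters per mile, divide by the kilo prefix
miles = 5
km = miles * constants.mile / constants.kilo
print(km)       # about 8.04672

# Hours to seconds: multiply by seconds per hour
hours = 1.5
seconds = hours * constants.hour
print(seconds)  # 5400.0
```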

## Scipy Optimizers
Optimizers are a set of procedures defined in SciPy that either find the minimum value of a function, or the root of an equation.

Essentially, all of the algorithms in Machine Learning are nothing more than a complex equation that needs to be minimized with the help of given data. 

### Roots of an Equation
NumPy is capable of finding roots for polynomials and linear equations, but it can not find roots for non linear equations, like this one:
__x + cos(x)__ 

For that you can use SciPy's ___optimize.root___ function. 

This function takes two required arguments: 

- __fun__ - a function representing an equation. 

- __x0__ - an initial guess for the root

The function returns an object with information regarding the solution.

The actual solution is given under attribute ___x___ of the returned object.

Example - Find the root of the equation ___x + cos(x)___:

In [None]:
from scipy.optimize import root
import numpy as np

# root() passes x as a NumPy array, so use NumPy's cos rather than math.cos
def eqn(x):
    return x + np.cos(x)

myroot = root(eqn, 0) 

print("Root: ", myroot.x)
print("------------------------")
print(myroot)
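
A simple way to sanity-check the result is to plug the returned value back into the equation and read the `success` attribute of the result object. This is only a sketch; the name `sol` is arbitrary, and NumPy's cos is used here because the solver passes x as an array.

```python
import numpy as np
from scipy.optimize import root

def eqn(x):
    return x + np.cos(x)

sol = root(eqn, 0)

# success is True when the solver reports that it converged
print(sol.success)

# x is an array, even for a single unknown
x_root = sol.x[0]
print(x_root)       # roughly -0.739

# Plugging the root back in should give a value very close to 0
print(eqn(x_root))
```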

### Minimizing a Function
A function, in this context, represents a curve, curves have high points and low points.

High points are called __maxima__

Low points are called __minima__ 

The highest point in the whole curve is called __global maxima__, whereas the rest of them are called __local maxima__. 

The lowest point in the whole curve is called __global minima__, whereas the rest of them are called __local minima__ 

### Finding Minima
We can use the ___scipy.optimize.minimize()___ function to minimize a function.

The ___minimize___ function takes the following arguments: 

- ___fun___ - a function representing an equation. 
- ___x0___ - an initial guess for the location of the minimum.
- ___method___ - name of the method to use. Legal values: 
    'CG',
    'BFGS',
    'Newton-CG',
    'L-BFGS-B',
    'TNC',
    'COBYLA',
    'SLSQP'
- ___callback___ - function called after each iteration of optimization.
- ___options___ - a dictionary defining extra params: 

Example - Minimize the function x^2 + x + 2 with BFGS: 


In [None]:
from scipy.optimize import minimize 

def eqn(x):
    return x**2 + x + 2 

mymin = minimize(eqn, 0, method='BFGS')

print(mymin)
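
The printed result object has several fields; the ones usually needed are `x` (where the minimum is) and `fun` (the function value there). For this parabola the answer can be checked by hand: the derivative 2x + 1 is zero at x = -0.5, where the function equals 1.75. A short sketch, with `res` as an arbitrary name:

```python
from scipy.optimize import minimize

def eqn(x):
    return x**2 + x + 2

res = minimize(eqn, 0, method='BFGS')

print(res.success)  # True when the optimizer reports convergence
print(res.x)        # location of the minimum, close to -0.5
print(res.fun)      # function value at the minimum, close to 1.75
```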

# Python - Data Visualization

## Matplotlib

Matplotlib is a low level graph plotting library in python that serves as a visualization utility. 

### Pyplot
Most of the Matplotlib utilities lie under the ___pyplot___ submodule, and are usually imported under the ___plt___ alias.

Example - Draw a line in a diagram from position (0,0) to position (6,250)

In [None]:
import matplotlib.pyplot as plt
import numpy as np 

x_points = np.array([0, 6])
y_points = np.array([0, 250])

plt.plot(x_points, y_points)
plt.show()

### Plotting
The ___plot()___ function is used to draw points in a diagram.

By default, the ___plot()___ function draws a line from point to point. 

The function takes parameters for specifying points in the diagram. 

Parameter 1 is an array containing the points on the x-axis.

Parameter 2 is an array containing the points on the y-axis.

If we need to plot a line from (1,3) to (8,10), we have to pass two arrays [1,8] and [3,10] to the plot function.

Example - Draw a line in a diagram from position (1,3) to position (8,10). 

In [None]:
import matplotlib.pyplot as plt
import numpy as np

xpoints = np.array([1,8])
ypoints = np.array([3,10])

plt.plot(xpoints, ypoints)
plt.show()

#### Plotting points
To plot only the markers, you can use the shortcut string notation parameter 'o', which means 'rings'.

Example - Draw two points in the diagram, one at position (1,3) and one in position (8,10)

In [None]:
import matplotlib.pyplot as plt
import numpy as np

xpoints = np.array([1,8])
ypoints = np.array([3,10])

plt.plot(xpoints, ypoints, 'o')
plt.show()

#### Multiple Points
You can plot as many points as you like, just make sure you have the same number of points on both axes.

Example - Draw a line in a diagram from position (1,3) to (2,8) then to (6,1) and finally to position (8,10):

In [None]:
import matplotlib.pyplot as plt
import numpy as np

xpoints = np.array([1, 2, 6, 8])
ypoints = np.array([3, 8, 1, 10])

plt.plot(xpoints, ypoints)
plt.show()

#### Default X-Points
If we do not specify the points on the x-axis, they will get the default values 0, 1, 2, 3 and so on, depending on the length of the y-points.

So if we leave out the x-points and pass only the y-points, the diagram will look like this:

In [None]:
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8,1,10,5,7])

plt.plot(ypoints)
plt.show()

### Markers
You can use the keyword argument ___marker___ to emphasize each point with a specified marker. 

In [None]:
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, marker = 'o')
plt.show()

plt.plot(ypoints, marker = '*')
plt.show()

#### Format Strings
You can also use the _shortcut string notation_ parameter to specify the marker.

This parameter is also called ___fmt___, and is written with this syntax:

___marker|line|color___

Example: 

In [None]:
import matplotlib.pyplot as plt 
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, 'o:r')
plt.show()

#### Marker Size
You can use the keyword argument ___markersize___ or the shorter version, ___ms___ to set the size of the markers:

In [None]:
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, marker = 'o', ms = 20)
plt.show()

# You can use markeredgecolor or mec to set the color of the edge

plt.plot(ypoints, marker = 'o', ms = 20, mec = 'r')
plt.show()

# You can also use the markerfacecolor or mfc to set the color inside the
# edge of the marker.

plt.plot(ypoints, marker = 'o', ms = 20, mfc = 'r')
plt.show()

# You can also use Hexadecimal color values
plt.plot(ypoints, marker = 'o', ms = 20, mec = '#4CAF50', mfc = '#4CAF50')
plt.show()

### Linestyle
You can use the keyword argument ___linestyle___, or the shorter ___ls___, to change the style of the plotted line:

Example - Use a dotted line:

In [None]:
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, linestyle = 'dotted')
plt.show()

# Use a dashed line

plt.plot(ypoints, linestyle = 'dashed')
plt.show()

#### Line Color
You can use the keyword argument ___color___ or the shorter ___c___ to set the color of the line: 

In [None]:
import matplotlib.pyplot as plt 
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, color = 'r')
plt.show()


#### Line Width
You can use the keyword argument ___linewidth___ or ___lw___ to change the width of the line. 

The value is a floating number, in points. 

Example - Plot with a 20.5pt wide line

In [None]:
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

plt.plot(ypoints, linewidth=20.5)
plt.show()

### Multiple Lines
You can plot as many lines as you like by simply adding more ___plt.plot()___ calls:

In [None]:
import matplotlib.pyplot as plt
import numpy as np

y1 = np.array([3, 8, 1, 10])
y2 = np.array([6, 2, 7, 11])

plt.plot(y1)
plt.plot(y2)

plt.show()

### Create Labels for Plot
With Pyplot, you can use the ___xlabel()___ and ___ylabel()___ functions to set a label for x- and y-axis 

In [None]:
import numpy as np
import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])

plt.plot(x, y)

plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")

plt.show()

#### Create a Title for a plot
You can use the ___title()___ function to set a title for the plot. 

In [None]:
import numpy as np
import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])

plt.plot(x, y)

plt.title("Sports Watch Data")
plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")

plt.show()

#### Set Font properties for title and labels
You can use the ___fontdict___ parameter in ___xlabel(), ylabel()___ and ___title()___ to set font properties for the title and labels.

Example: 

In [None]:
import numpy as np
import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])

font1 = {'family':'serif','color':'blue','size':20}
font2 = {'family':'serif','color':'darkred','size':15}

plt.title("Sports Watch Data", fontdict = font1)
plt.xlabel("Average Pulse", fontdict = font2)
plt.ylabel("Calorie Burnage", fontdict = font2)

plt.plot(x, y)
plt.show()

#### Position the Title
You can use the ___loc___ parameter in ___title()___ to position the title. 

Legal values are 'left', 'right', and 'center'. Default value is 'center'

In [None]:
import numpy as np
import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])

plt.title("Sports Watch Data", loc = 'left')
plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")

plt.plot(x, y)
plt.show()

### Matplotlib Adding Grid Lines
With Pyplot, you can use the ___grid()___ function to add grid lines to the plot.



In [None]:
import numpy as np
import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])

plt.title("Sports Watch Data")
plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")

plt.plot(x, y)
plt.grid()

plt.show()

# You can also specify which grid lines to display
plt.plot(x, y)
plt.grid(axis = 'x')
plt.show()


#### Set Grid line properties

In [None]:
import numpy as np
import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])

plt.title("Sports Watch Data")
plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")

plt.plot(x, y)

plt.grid(color = 'green', linestyle = '--', linewidth = 0.5)

plt.show()

### Matplotlib Subplot
With the ___subplot()___ function you can draw multiple plots in one figure

The ___subplot()___ function takes three arguments that describe the layout of the figure.

The layout is organized in rows and columns, which are represented by the first and second argument. 

The third argument represents the index of the current plot.

plt.subplot(1, 2, 1)
1 row, 2 columns, and this plot is the first plot.

plt.subplot(1, 2, 2)
1 row, 2 columns, and this plot is the second plot

Example - Draw 2 plots on top of each other:

In [None]:
import matplotlib.pyplot as plt
import numpy as np

#plot 1:
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])

plt.subplot(2, 1, 1)
plt.plot(x,y)

#plot 2:
x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])

plt.subplot(2, 1, 2)
plt.plot(x,y)

plt.show()
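
For comparison, the side-by-side layout described earlier (1 row, 2 columns) could look like the sketch below. The titles "Left" and "Right" are only there to tell the two plots apart, and the figure is created explicitly so both plots clearly belong to the same one.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.array([0, 1, 2, 3])

fig = plt.figure()

# 1 row, 2 columns, first plot (left)
plt.subplot(1, 2, 1)
plt.plot(x, np.array([3, 8, 1, 10]))
plt.title("Left")

# 1 row, 2 columns, second plot (right)
plt.subplot(1, 2, 2)
plt.plot(x, np.array([10, 20, 30, 40]))
plt.title("Right")

plt.show()
```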


# Working with Large Datasets


## Introduction to map
Map is a function for transforming sequences of data. 
A simple application of _map_ is to take a sequence of numbers and transform each number into a larger number. 

-1, 0, 1, 2 >>>> map >>>> add_seven(n) >>>> 6, 7, 8, 9  

_map_ depends on the function provided to it. In this case, it will apply add_seven to each input. 
The output of the _map_ function is another series of equal size.

The essence of map is that you have an input of some length, in this case four, and an output of that same length. Each input gets transformed by the same function as all the other inputs. These transformed inputs are then returned as our output.
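
The add_seven diagram above translates almost directly into code. A minimal sketch:

```python
def add_seven(n):
    return n + 7

numbers = [-1, 0, 1, 2]

# map() applies add_seven to every element; list() collects the results
result = list(map(add_seven, numbers))

print(result)                       # [6, 7, 8, 9]
print(len(result) == len(numbers))  # True: same length in, same length out
```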


### First Example

**Scenario:** "You want to generate a call list for your sales team, but the original developers of your customer sign-up form forgot to build data validation checks into the form. As a result, all the phone numbers are formatted differently. For example, some are formatted nicely, (123) 456-7890; some are just numbers, 1234567890; some use dots as separators, 123.456.7890; and others, trying to be helpful, include a country code, +1 123 456-7890".

First, let's tackle this problem in a way that you're probably familiar with: _for_ looping.

In [None]:
import re

phone_numbers = [
    "(123) 456-7890",
    "1234567890",
    "123.456.7890",
    "+1 123 456-7890"
]

new_numbers = [] 

R = re.compile(r"\d")

# Loops through all the phone numbers
for number in phone_numbers:
    digits = R.findall(number) 
    
    # Gathers the numbers into variables
    area_code = "".join(digits[-10:-7])
    first_3 = "".join(digits[-7: -4])
    last_4 = "".join(digits[-4: len(digits)])
    
    pretty_format = "({}) {}-{}".format(area_code, first_3, last_4)
    
    # Appends the numbers in the right format.
    new_numbers.append(pretty_format)


print(new_numbers)


How do we tackle this with _map_? Similarly, but with map, we have to separate this problem into two parts. Let's separate it like this:
- Resolving the formatting of a phone number
- Applying that solution to all the phone numbers we have

In [None]:
import re 

# Creates a class to hold our compiled regular expression
class PhoneFormatter:
    # Creates an initialization method to compile the regular expression.
    def __init__(self):
        self.r = re.compile(r"\d")
        
        
    # Creates a format method to do the formatting.
    def pretty_format(self, phone_number):
        phone_numbers = self.r.findall(phone_number)
        
        # Gathers the numbers from the phone number string
        area_code = "".join(phone_numbers[-10:-7])
        first_3 = "".join(phone_numbers[-7:-4])
        last_4 = "".join(phone_numbers[-4:len(phone_numbers)])
        
        # Returns the numbers in the desired "pretty" format.
        return "({}) {}-{}".format(area_code, first_3, last_4)
        



Now that we're able to turn phone numbers of any format into phone numbers in a pretty format, we can combine our class with map to apply it to a list of phone numbers of any length. To combine the two, we'll instantiate our class and pass the method as the function that map will apply to all the elements of a sequence. We can do that as shown in the following listing.

In [None]:
phone_numbers = [
    "(123) 456-7890",
    "1234567890",
    "123.456.7890",
    "+1 123 456-7890"
]
P = PhoneFormatter()
print(list(map(P.pretty_format, phone_numbers)))

You'll notice that in this example we were set up perfectly to take advantage of map because we were doing a 1-to-1 transformation. That is, we were transforming each element of a sequence. In essence, we've turned this problem into our middle-school algebra example: applying n+7 to a list of numbers.

We can see the similarities between the two problems. For each problem, we're doing three things: taking a sequence of data, transforming it with some function, and getting the outputs.

The key with map is recognizing situations where we can apply this three-step pattern. Once we start looking for it, we'll start to see it everywhere. 
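As a minimal sketch of that three-step pattern, the n+7 example mentioned above might look like the following. The names `numbers` and `add_seven` are illustrative and not taken from the original example.

```python
# Step 1: take a sequence of data (illustrative values).
numbers = [1, 2, 3, 4]

# Step 2: define the transformation for a single element.
def add_seven(n):
    return n + 7

# Step 3: map applies it to every element; list() collects the outputs.
results = list(map(add_seven, numbers))
print(results)  # [8, 9, 10, 11]
```

The phone number example has the same shape: `phone_numbers` is the sequence, `pretty_format` is the transformation, and `list(map(...))` collects the outputs.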

## Concurrency vs Parallelism

Concurrency involves multiple jobs taking turns accessing the same shared resources. Parallelism is about running several tasks side by side, for example on multiple CPU cores.

Python has different mechanisms for implementing concurrency, such as threading and coroutines (async/await).

For parallelism there is multiprocessing, which launches multiple instances of the Python interpreter.

Glossary:
- A program is an executable file which consists of a set of instructions to perform some task and is usually stored on the disk of your computer.
- A process is what we call a program that has been loaded into memory along with all the resources it needs to operate. It has its own memory space.
- A thread is the unit of execution within a process. A process can have multiple threads running as a part of it, where each thread uses the process’s memory space and shares it with other threads.
- Multithreading is a technique where a process spawns multiple threads to do different tasks at about the same time, with the threads taking turns rather than truly overlapping. This gives you the illusion that the threads are running in parallel, but they actually run concurrently. In standard CPython, the Global Interpreter Lock (GIL) allows only one thread at a time to execute Python bytecode.
- Multiprocessing is a technique where parallelism in its truest form is achieved. Multiple processes run across multiple CPU cores and do not share memory with one another. Each process can have many threads running in its own memory space. In Python, each process has its own instance of the Python interpreter executing the instructions.

Further learning: https://www.youtube.com/watch?v=AZnGRKFUU0c

### Thread vs Process
A thread and a process are separate entities in a computer's operating system, but both are used to execute a program.

A process is an instance of a program running on a computer, which has its own memory space, system resources (such as CPU time), and environment variables. Each process runs independently and isolated from other processes, and is treated as a separate program by the operating system.

A thread, on the other hand, is a lightweight and independent unit of execution within a process. Multiple threads can run concurrently within a single process and share the same memory space and system resources.
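A small sketch can make the shared memory space concrete. It uses only the standard library, and the names `shared_log` and `record` are hypothetical. Every thread appends to the same list and reports the same process ID, because all of them live inside one process.

```python
import os
import threading

shared_log = []  # one list object, visible to every thread in this process

def record(label):
    # Each thread writes into the same list and sees the same PID.
    shared_log.append((label, os.getpid()))

threads = [threading.Thread(target=record, args=(f"worker-{i}",)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Collect the distinct PIDs that were recorded: there should be only one.
pids = {pid for _, pid in shared_log}
print(len(shared_log), "entries recorded,", len(pids), "distinct PID")
```

If `record` ran in a separate process instead, it would append to that process's own copy of `shared_log`, and the list in the parent would stay empty.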

Multithreading vs Multiprocessing
<table><tbody><tr><td><strong>Multithreading in Python</strong></td><td><strong>Multiprocessing in Python</strong></td></tr><tr><td>It implements concurrency.</td><td>It implements parallelism.</td></tr><tr><td>Because of the GIL, threads do not run Python code in parallel.</td><td>Processes can run Python code in parallel.</td></tr><tr><td>In multithreading, multiple threads are spawned by a single process.</td><td>In multiprocessing, multiple processes run at the same time across multiple cores.</td></tr><tr><td>Multithreading cannot be classified further.</td><td>Multiprocessing can be classified as symmetric or asymmetric.</td></tr></tbody></table>

### Example of Threading vs Multiprocessing.

In [None]:

import time, os
from threading import Thread, current_thread
from multiprocessing import Process, current_process
 
 
COUNT = 200000000
SLEEP = 10
 
def io_bound(sec):
    """Simulates an I/O-bound task by sleeping for `sec` seconds."""
    pid = os.getpid()
    thread_name = current_thread().name
    process_name = current_process().name

    print(f"{pid} * {process_name} * {thread_name} ---> Start sleeping...")
    time.sleep(sec)
    print(f"{pid} * {process_name} * {thread_name} ---> Finished sleeping...")


def cpu_bound(n):
    """Simulates a CPU-bound task by counting down from `n` to zero."""
    pid = os.getpid()
    thread_name = current_thread().name
    process_name = current_process().name

    print(f"{pid} * {process_name} * {thread_name} ---> Start counting...")

    while n > 0:
        n -= 1

    print(f"{pid} * {process_name} * {thread_name} ---> Finished counting...")
 

start = time.time()

# Code snippet for Part 1
# io_bound(SLEEP)
# io_bound(SLEEP)

# Code snippet for Part 2
# t1 = Thread(target = io_bound, args =(SLEEP, ))
# t2 = Thread(target = io_bound, args =(SLEEP, ))
# t1.start()
# t2.start()
# t1.join()
# t2.join()

# Code snippet for Part 3
# cpu_bound(COUNT)
# cpu_bound(COUNT)

# Code snippet for Part 4
# t1 = Thread(target = cpu_bound, args =(COUNT, ))
# t2 = Thread(target = cpu_bound, args =(COUNT, ))
# t1.start()
# t2.start()
# t1.join()
# t2.join()

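# Code snippet for Part 5
# The __main__ guard keeps child processes from re-running this block on
# platforms that start new processes by re-importing the module.
if __name__ == "__main__":
    p1 = Process(target=cpu_bound, args=(COUNT,))
    p2 = Process(target=cpu_bound, args=(COUNT,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()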


end = time.time()
print('Time taken in seconds -', end - start)

### Threading Examples

In [None]:
import time
import threading

start = time.perf_counter()

# A function to do something. 
def do_something():
    print('Sleeping 1 second...')
    time.sleep(1)
    print('Done sleeping...')

t1 = threading.Thread(target=do_something)
t2 = threading.Thread(target=do_something)

t1.start()
t2.start()

t1.join()
t2.join()




finish = time.perf_counter()
print(f'Finished in {round(finish-start, 2)} second(s)')

In [None]:
import time
import threading

start = time.perf_counter()


# A function to do something. 
def do_something(seconds):
    print(f'Sleeping {seconds} second(s)...')
    time.sleep(seconds)
    return f'Done Sleeping...{seconds}'



threads = []
for _ in range(10):
    t = threading.Thread(target=do_something, args=[1.5])
    t.start()
    threads.append(t)

for thread in threads:
    thread.join()

finish = time.perf_counter()

print(f'Finished in {round(finish-start, 2)} second(s)')

In [None]:
import concurrent.futures
import time

start = time.perf_counter()


def do_something(seconds):
    print(f'Sleeping {seconds} second(s)...')
    time.sleep(seconds)
    return f'Done Sleeping...{seconds}'


with concurrent.futures.ThreadPoolExecutor() as executor:
    secs = [5, 4, 3, 2, 1]
    results = [executor.submit(do_something, sec) for sec in secs]

    for f in concurrent.futures.as_completed(results):
        print(f.result())




finish = time.perf_counter()
print(f'Finished in {round(finish-start, 2)} second(s)')


This snippet uses threading to read data from multiple URLs at once by running several instances of the load_url() function concurrently. Each future is stored in a dictionary that maps it to its URL, so every result or error can be reported next to the address it came from.

Note that the underscore '_' is used for throw-away variables.

In [None]:
import concurrent.futures
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        
        
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))
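The listing above depends on the network. As an offline sketch of the same thread pool, the following uses a hypothetical `fake_fetch` function that sleeps instead of downloading. It contrasts `executor.map`, which yields results in the order of the inputs, with `as_completed`, which yields futures as they finish.

```python
import concurrent.futures
import time

def fake_fetch(delay):
    # Hypothetical stand-in for a network call: wait, then return the delay.
    time.sleep(delay)
    return delay

delays = [0.3, 0.2, 0.1]

with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    # map() returns results in input order, regardless of which finishes first.
    mapped = list(executor.map(fake_fetch, delays))

    # as_completed() yields each future when it finishes, so the shortest
    # delay usually comes out first.
    futures = [executor.submit(fake_fetch, d) for d in delays]
    completed = [f.result() for f in concurrent.futures.as_completed(futures)]

print("map order:         ", mapped)
print("as_completed order:", completed)
```

Choose `map` when the results must line up with the inputs, and `as_completed` when each result should be handled as soon as it is ready.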

### Multiprocessing Examples

In [None]:
import time 
import multiprocessing

start = time.perf_counter()

def do_something():
    print('Sleeping 1 second...')
    time.sleep(1)
    print('Done Sleeping')
    
    
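if __name__ == "__main__":
    # Creating the processes inside the guard keeps child processes
    # from repeating this work if they re-import the module.
    p1 = multiprocessing.Process(target=do_something)
    p2 = multiprocessing.Process(target=do_something)

    p1.start()
    p2.start()

    p1.join()
    p2.join()

    print("Finished in ", time.perf_counter() - start)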

In [None]:
import time 
import multiprocessing

start = time.perf_counter()

def do_something():
    print('Sleeping 1 second...')
    time.sleep(1)
    print('Done Sleeping')
    
    
if __name__ == "__main__":

    processes = [] 
    for _ in range(10):
        p = multiprocessing.Process(target=do_something)
        p.start()
        processes.append(p)
        
    for process in processes:
        process.join()
        
    print("Finished in ",time.perf_counter() - start)

In [None]:
# Passing arguments...

import time 
import multiprocessing

start = time.perf_counter()

def do_something(seconds):
    print(f'Sleeping {seconds} second(s)...')
    time.sleep(seconds)
    print('Done Sleeping')
    
    
if __name__ == "__main__":

    processes = [] 
    for _ in range(10):
        p = multiprocessing.Process(target=do_something, args=[1.5])
        p.start()
        processes.append(p)
        
    for process in processes:
        process.join()
        
    print("Finished in ",time.perf_counter() - start)

In [None]:
# With Process Pool
import time 
import multiprocessing
import concurrent.futures


start = time.perf_counter()

def do_something(seconds):
    print(f'Sleeping {seconds} second(s)...')
    time.sleep(seconds)
    return 'Done Sleeping'
    
    
if __name__ == "__main__":

    with concurrent.futures.ProcessPoolExecutor() as executor: 
        results = [executor.submit(do_something, i) for i in range(5, 0, -1)]
        
        for f in concurrent.futures.as_completed(results):
            print(f.result())
        
    print("Finished in ",time.perf_counter() - start)

In [None]:
# With process pool and map
import time 
import multiprocessing
import concurrent.futures


start = time.perf_counter()

def do_something(seconds):
    print(f'Sleeping {seconds} second(s)...')
    time.sleep(seconds)
    return f'Done Sleeping at {seconds}'
    
    
if __name__ == "__main__":

    with concurrent.futures.ProcessPoolExecutor() as executor: 
        secs = range(5, 0, -1)
        results = executor.map(do_something, secs)
        
        for result in results:
            print(result)
        
    print("Finished in ",time.perf_counter() - start)