### Why Python for Data Science?

Object-oriented programming (OOP) gives us additional control over data types and over how data flows through a program. An object can manage its own properties internally.

The benefit is that we can build classes specifically to control how ML pipelines work, whereas R requires more checks for lowest-common-denominator errors.
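
As a rough sketch of what "managing properties within the object itself" can look like, the class below validates its own data at the point of assignment. `FeatureColumn` and its fields are hypothetical names made up for illustration; they are not part of the original notes or of any library.

```python
class FeatureColumn:
    """Hypothetical container that only accepts numeric values."""

    def __init__(self, name, values):
        self.name = name
        self.values = values  # routed through the property setter below

    @property
    def values(self):
        return self._values

    @values.setter
    def values(self, new_values):
        # Reject bad data when it is assigned, not deep inside a pipeline.
        for v in new_values:
            if isinstance(v, bool) or not isinstance(v, (int, float)):
                raise TypeError(f"{self.name}: expected a number, got {v!r}")
        self._values = list(new_values)


col = FeatureColumn("age", [31, 45.5, 27])
print(col.values)

try:
    col.values = [31, "forty-five"]
except TypeError as err:
    print("rejected:", err)
```

Because the check lives in the setter, every later assignment to `col.values` is validated the same way, and a rejected assignment leaves the previous values in place.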



In [1]:
x = "a b c d e f g".split()
print(x)

y = "abcdef*g".split("*")
print(y)

for i in x:
    print(i)
    

['a', 'b', 'c', 'd', 'e', 'f', 'g']
['abcdef', 'g']
a
b
c
d
e
f
g


The for loop relies on the iterator protocol: it calls `iter(x)`, which invokes `x.__iter__()` to get an iterator.
The loop then keeps calling `__next__()` on that iterator until it raises `StopIteration`, which is the condition that ends the loop.
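
A for loop can be unrolled by hand with the built-ins `iter()` and `next()`, which call the `__iter__()` and `__next__()` methods. This is only a sketch of what the loop does for us, using a shorter list than the cell above:

```python
x = "a b c".split()

it = iter(x)          # calls x.__iter__() and returns an iterator object
print(next(it))       # calls it.__next__()
print(next(it))
print(next(it))

try:
    next(it)          # nothing left, so the iterator raises StopIteration
except StopIteration:
    print("done")     # a for loop catches this signal and ends quietly
```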

Example: a pandas DataFrame iterates over its column labels "for no other reason than that's how the instructions for iteration are defined in the rules for their iterator."
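
To illustrate that the class itself decides what iteration means, here is a small sketch using a hypothetical `Table` class. It is a stand-in written for this example, not pandas, and it assumes columns are stored in a plain dict.

```python
class Table:
    """Hypothetical stand-in for a table with named columns."""

    def __init__(self, columns):
        self.columns = columns  # dict mapping column name -> list of values

    def __iter__(self):
        # The author of the class decides what iteration means.
        # Here it yields column names, not rows.
        return iter(self.columns)


t = Table({"height": [1.7, 1.8], "weight": [65, 80]})
for col in t:
    print(col)
```

Returning `iter(self.columns.values())` instead would make the same for loop walk over the column data; neither choice is more "correct", it is just a rule the class defines.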

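### Boolean Values & Evaluations
 1 == 2 #False
 1 != 2 #True
Parenthesized expressions are evaluated first, and their results are then combined with the remaining operators. Without parentheses, `not` binds tighter than `and`, which binds tighter than `or`.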
Example: 

In [2]:
print((1 <= 2 or 3 > 4) and 1 > 2)

u = x

# `is` asks whether two names refer to the same object (same id()), not whether their values are equal
print(u is x)
print(u is not x)
print('z' in x) # False
print('z' not in u)


False
True
False
False
True
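
The difference between identity (`is`) and equality (`==`) is easier to see with a copy. In this sketch, `v` is a name introduced only for illustration:

```python
x = ["a", "b", "c"]
u = x            # a second name for the same list object
v = list(x)      # a new list with equal contents

print(u is x)    # True: same object
print(v == x)    # True: same values
print(v is x)    # False: different objects
print(id(u) == id(x))  # True: `is` compares identities

u.append("d")    # mutating through u is visible through x, but not v
print(x)
print(v)
```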


In [3]:
for i in [1,2,3,4,5,6,7,8,9,10]:
    if i < 5:
        print("Hello")
    elif 5 <= i <= 7:
        print("wait")
    else:
        print("bye")

Hello
Hello
Hello
Hello
wait
wait
wait
bye
bye
bye


In [4]:
import this


The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


In [5]:
#While loop example

i = 0
while i < 5:
    print(i)
    i += 1 

0
1
2
3
4


In [6]:
def func():
    pass

### Defining Functions

def name(arguments):
    
    code
    
    return result
    
    
The code template above defines a function object and binds it to a name in the current namespace. Calling the function then runs its body with the specific arguments we supply.

Functions serve as a way to package code and make it more modular for complex projects.

In [7]:
def func():
    print("Hello")

func()

def func1(my_list):
    for i in my_list:
        
        if i < 5:
            print("a")
        elif i == 5:
            print("b")
        else:
            print("c")
            
func1(my_list=[1,5,7])
my_new_list = [4,5,10]
func1(my_new_list)

Hello
a
b
c
a
b
c


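Whenever we give a function parameter a default value, we don't want that default to be a mutable object (such as a list or dict). The default is created once, when the function is defined, and is shared across calls, which can create unexpected issues.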
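
A minimal sketch of the mutable-default problem and the usual workaround. The function names `append_bad` and `append_good` are made up for this example:

```python
def append_bad(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # and is shared by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # Use None as a sentinel and build a fresh list on each call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


print(append_bad(1))
print(append_bad(2))   # the earlier 1 is still in the shared list
print(append_good(1))
print(append_good(2))  # starts from a fresh list each time
```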

### Scope

Built-In > Modules > Global > Class > Function

For instance, if a variable b is defined globally before a function is called and is also defined inside that function, the function uses its more local definition of b before it ever looks for the global one.
However, another function that neither defines b nor has b passed to it falls back to the global definition.

We want to make sure that we never reach into the global scope for things that should be defined locally.
Going outside the scope of a given call is the source of a lot of hard-to-track bugs.

Instead of referencing a global variable, pass it in as an argument to the function.
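
The lookup behaviour described above can be sketched as follows. The three function names are hypothetical and exist only to contrast the cases:

```python
b = 10  # global definition


def uses_local():
    b = 1          # local name shadows the global b inside this function
    return b


def uses_global():
    return b       # no local b, so lookup falls back to the global


def takes_argument(b):
    return b * 2   # the dependency is explicit in the signature


print(uses_local())
print(uses_global())
print(takes_argument(b))
print(b)           # the global was never modified
```

`takes_argument` is the easiest of the three to test and reuse, because its result depends only on what is passed in.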

### Docstring

Note:
""" 
As much text as we want rendered as a string
"""

Being "kind to future you" includes documenting what the functions you create do.

Functions are objects and have their own attributes. One attribute we get for free is `__doc__`, which holds the docstring we wrote. The built-in help() displays this documentation.
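
A short sketch of reading a docstring back from a function object. `greet` is a hypothetical function used only for illustration:

```python
def greet(name):
    """Return a short greeting for `name`."""
    return f"Hello, {name}"


print(greet.__doc__)      # the docstring is stored as an attribute
print(greet.__name__)     # functions carry other attributes too
help(greet)               # prints the signature and the docstring
```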

In [8]:
text = "An initial attempt to rescue the group, stranded in mountain refuge for two nights, was abandoned on Monday night because of smoke from the Creek Fire. But helicopters were able to land early on Tuesday and are have begun taking the hikers to safety. Fires in California have burned through a record 2m acres in recent weeks. In total, these blazes span an area larger than the US state of Delaware. California is currently experiencing an unprecedented heatwave."


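def word_count(text=""):

    """
    Count the whitespace-separated words in a text string.

    Argument:
    ------
    text: str
    A text string

    Return:
    ------
    int
    Number of words in text
    """

    # split() with no argument splits on any run of whitespace,
    # so an empty string yields 0 words rather than 1
    tmp = text.split()
    count = len(tmp)
    return count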

word_count(text)

78