# [Introduction to Python](https://www.datacamp.com/completed/statement-of-accomplishment/course/27e7c4ed40996d56d7d48a23cc55337bbc89c0c3)

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adamelliotfields/datacamp/blob/main/notebooks/courses/introduction_to_python/notebook.ipynb)

In [1]:
import this


The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


### Comments and Docstrings

In [2]:
# this is a comment
"""This is a docstring"""  # a true docstring only as the first statement of a module, class, or function


'This is a docstring'

### Math

In [3]:
two = 1 + 1
one = 2 - 1
six = 2 * 3
three = 6 / 2  # (float division)
three_int = 6 // 2  # (floor division)
zero = 6 % 2  # (modulo, the remainder of the division)
eight = 2**3
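
Floor division and modulo behave differently from truncation once negative numbers are involved. A minimal sketch of that behavior (the variable names are made up for illustration):

```python
# floor division rounds toward negative infinity, not toward zero
neg_floor = -7 // 2  # -4

# the remainder takes the sign of the divisor
neg_mod = -7 % 2  # 1

# divmod returns the quotient and the remainder together
quotient, remainder = divmod(7, 2)  # (3, 1)

# quotient and remainder always recombine into the original number
recombined = (-7 // 2) * 2 + (-7 % 2)  # -7
```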


### Variables

In [4]:
s = "hello world"
s_type = type(s)  # <class 'str'>


### Types

In [5]:
t = isinstance(42, int)
t = isinstance(3.14, float)
t = isinstance("hi", str)
t = isinstance(True, bool)
t = isinstance([1, 2], list)
t = isinstance({"a": 1}, dict)
t = isinstance((1, 2), tuple)


#### Casting

In [6]:
three = int(3.14)  # 3
pi = float("3.14")  # 3.14
false = bool(0)  # False
true = bool(1)  # True
# int("foo")  # raises ValueError
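
A few casting edge cases are easy to get wrong, so here is a short sketch of them. The variable names and the `None` fallback are illustrative choices, not part of the course material:

```python
# int() truncates toward zero; it does not round
truncated = int(-3.99)  # -3
rounded = round(3.6)  # 4

# bool() of a string checks emptiness, not the text it contains
string_is_truthy = bool("False")  # True
empty_is_falsy = bool("")  # False

# catch ValueError to handle strings that are not valid integers
try:
    parsed = int("foo")
except ValueError:
    parsed = None
```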


### Lists

In [7]:
# lists can be heterogeneous
l = [False, 1, 2, 3.14, "hello"]

# lists can be multidimensional
l2d = [[1, 2, 3], [4, 5, 6]]

# lists can be subset by index
false = l[0]  # False
hello = l[-1]  # "hello"

# you can retrieve the index of an element
ind = l.index("hello")  # 4

# lists can be sliced
# note that the end index is exclusive
ints = l[1:3]  # [1, 2]
first = l[:3]  # [False, 1, 2]
last = l[3:]  # [3.14, "hello"]

# you can delete elements from a list
del l2d[0][0]  # [[2, 3], [4, 5, 6]]

# you can delete a slice
del l[1:-1]  # everything between `False` and `"hello"`


### Dictionaries

In [8]:
# dictionaries are key-value pairs
d = {"a": 1, "b": 2, "c": 3}

# dictionaries can be subset by key
one = d["a"]  # 1

# you can check for membership
true = "a" in d  # True

# you can get a view of the keys (iterable, but not indexable like a list)
keys = d.keys()  # dict_keys(["a", "b", "c"])

# you can delete a key-value pair
del d["a"]  # {"b": 2, "c": 3}
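
Subsetting with `d[key]` raises `KeyError` when the key is absent. A small sketch of the built-in alternatives that accept a fallback, using a fresh copy of the dictionary from above:

```python
d = {"a": 1, "b": 2, "c": 3}

# get returns None (or a default you supply) instead of raising KeyError
missing = d.get("z")  # None
fallback = d.get("z", 0)  # 0

# items yields key-value pairs in insertion order
pairs = list(d.items())  # [("a", 1), ("b", 2), ("c", 3)]

# pop removes a key and returns its value; a default avoids KeyError
removed = d.pop("a")  # 1
removed_default = d.pop("z", None)  # None
```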


### Sets

In [9]:
# sets are unordered collections of unique elements
s1 = {1, 2, 3}

# they can also be created from lists
s2 = set([3, 4, 5])

# they provide fast membership checking
true = 1 in s1

# you can use set methods
union = s1.union(s2)  # {1, 2, 3, 4, 5}
isect = s1.intersection(s2)  # {3}
diff = s1.difference(s2)  # {1, 2}
sym_diff = s1.symmetric_difference(s2)  # {1, 2, 4, 5} # elements in either set but not both

# you can also use operators
union = s1 | s2  # {1, 2, 3, 4, 5}
isect = s1 & s2  # {3}
diff = s1 - s2  # {1, 2}
sym_diff = s1 ^ s2  # {1, 2, 4, 5}

# you can add, remove, and discard elements
s1.add(4)  # {1, 2, 3, 4}
s1.remove(4)  # {1, 2, 3}
s1.discard(4)  # like remove but doesn't raise KeyError if element is not present

# a frozenset is immutable and hashable (can be used as a key in a dict)
fs1 = frozenset({1, 2, 3})
fs2 = frozenset({3, 4, 5})
true = 1 in fs1
fs = fs1 | fs2  # frozenset({1, 2, 3, 4, 5})
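
To illustrate the hashability point, here is a sketch of a dictionary keyed by frozensets. The `distances` name and its values are invented for the example:

```python
# a frozenset key ignores element order, which suits unordered pairs
distances = {frozenset({"a", "b"}): 5}
d_ab = distances[frozenset({"b", "a"})]  # 5

# a regular set is mutable and unhashable, so it cannot be a key
try:
    bad = {{1, 2}: "x"}
    set_key_allowed = True
except TypeError:
    set_key_allowed = False
```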


### Collections

In [10]:
from collections import Counter, defaultdict, namedtuple

# Counter is a dictionary subclass for counting hashable objects
c = Counter([1, 2, 3])  # Counter({1: 1, 2: 1, 3: 1})
c.update([3])  # Counter({3: 2, 1: 1, 2: 1})
most_common = c.most_common(1)  # [(3, 2)]
c.subtract([3])  # Counter({1: 1, 2: 1, 3: 1})

# defaultdict takes a default factory (a callable such as `list`) that creates the value for a missing key
# useful if you do not know the shape ahead of time
dd = defaultdict(list)  # defaultdict(<class 'list'>, {})
dd["foo"].append("bar")  # we didn't have to `dd['foo'] = []` first

# namedtuples are useful for representing records when you don't need Pandas
# they are immutable
NT = namedtuple("NT", ["foo", "bar"])  # foo and bar fields are required
# nt = NT(1)  # raises TypeError (missing 'bar')
nt = NT(1, 2)  # NT(foo=1, bar=2)
nt = NT(foo=1, bar=2)  # NT(foo=1, bar=2) # fields can be named or positional
# foo = nt['foo']  # raises TypeError (tuple indices must be integers or slices; use nt[0] or dot notation)
foo = nt.foo  # 1
bar = nt.bar  # 2
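
Because a namedtuple is still a tuple, it also supports integer indexing and unpacking. A short sketch of that, plus the `_replace` and `_asdict` helpers (the `NT` type is redefined so the block runs on its own):

```python
from collections import namedtuple

NT = namedtuple("NT", ["foo", "bar"])
nt = NT(foo=1, bar=2)

# integer indexing and unpacking work as with any tuple
first = nt[0]  # 1
foo, bar = nt

# _replace returns a new instance; the original is unchanged
nt2 = nt._replace(foo=3)  # NT(foo=3, bar=2)

# _asdict converts the record to a dictionary
as_dict = nt._asdict()

# assigning to a field raises AttributeError because instances are immutable
try:
    nt.foo = 10
    assignable = True
except AttributeError:
    assignable = False
```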


### Classes

In [None]:
# classes are defined with the `class` keyword and are named in PascalCase by convention
class MyClass:
    pass
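
The empty class above does nothing yet. As a minimal sketch, a class usually defines `__init__` to set instance attributes and methods that use them. `Greeter` is a hypothetical name used only for illustration:

```python
class Greeter:
    """A hypothetical class used only for illustration."""

    def __init__(self, name: str = "world"):
        # attributes are stored on the instance via `self`
        self.name = name

    def greet(self) -> str:
        return f"hello {self.name}"


g = Greeter("python")
message = g.greet()  # "hello python"
default_message = Greeter().greet()  # "hello world"
```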


### Dataclasses

In [11]:
# dataclasses are like more powerful namedtuples
# they are mutable
from dataclasses import dataclass


@dataclass
class DC:
    foo: int
    bar: int = 2


dc = DC(1)  # DC(foo=1, bar=2)
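
To make the mutability difference concrete, here is a sketch contrasting a regular dataclass with a frozen one. `Point` and `FrozenPoint` are hypothetical names; `frozen=True` is the standard `dataclass` option for immutable instances:

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass
class Point:
    x: int
    y: int = 0


@dataclass(frozen=True)
class FrozenPoint:
    x: int
    y: int = 0


# fields on a regular dataclass can be reassigned
p = Point(1)
p.y = 5  # Point(x=1, y=5)

# dataclasses compare by field values
equal = p == Point(1, 5)  # True

# assigning to a frozen instance raises FrozenInstanceError
fp = FrozenPoint(1)
try:
    fp.y = 5
    frozen_assignable = True
except FrozenInstanceError:
    frozen_assignable = False
```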


### Functions

In [12]:
# functions
def add(x: int, y: int) -> int:
    """
    Add two numbers

    Args:
        x (int): first number
        y (int): second number

    Returns:
        int: sum of x and y
    """
    return x + y


# you can access the docstring
# can also use `inspect.getdoc(add)`
docstring = add.__doc__


# you can return a tuple to return multiple values (like Go)
def add_and_subtract(x, y):
    return x + y, x - y  # parens are optional


ten, _ = add_and_subtract(6, 4)  # 10


In [13]:
# function arguments can have default values
def greet(name="world"):
    return f"hello {name}"


print(greet())  # "hello world"
print(greet("🌎"))  # "hello 🌎"


hello world
hello 🌎


In [14]:
# you can have arbitrary arguments (`args` is conventional)
# in a definition, the `*` collects extra positional arguments into a tuple
def greet_all(*args):
    for arg in args:
        print(f"hello {arg}")


greet_all("🌍", "🌏")


# you can have arbitrary keyword arguments (`kwargs` is conventional)
# in a definition, the `**` collects extra keyword arguments into a dictionary
def greet_all_custom(**kwargs):
    for k, v in kwargs.items():
        print(f"{k} {v}")


greet_all_custom(hello="world")
greet_all_custom(**{"hola": "mundo"})  # unpacking a dict is the same as above


hello 🌍
hello 🌏
hello world
hola mundo
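
The `*` and `**` operators play opposite roles in a definition and in a call. A small sketch of both directions (`describe` and `add_pair` are hypothetical helpers):

```python
def describe(*args, **kwargs):
    # in a definition, extra arguments are collected into a tuple and a dict
    return type(args).__name__, type(kwargs).__name__


kinds = describe(1, 2, a=3)  # ("tuple", "dict")


def add_pair(x, y):
    return x + y


# in a call, `*` unpacks a sequence into positional arguments
from_list = add_pair(*[1, 2])  # 3

# in a call, `**` unpacks a dict into keyword arguments
from_dict = add_pair(**{"x": 1, "y": 2})  # 3
```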


#### Pass by Assignment

Arguments in Python are passed to functions _by assignment_: the parameter name is bound to the same object the caller passed, and no copy is made. What differs by type is what the function can do with that object. An immutable object like an `int` cannot be changed in place, so an operation such as `i += 1` creates a new object and rebinds only the local name. A mutable object like a `list` can be changed in place, and the caller sees the change.

If you mutate a mutable argument inside a function, it is altered outside of the function as well, which can lead to unexpected behavior.

A type is mutable if its objects can be changed in place after they are created. For example, `list` has an `append` method that adds items to the list; `int` has no method or operator that changes its value in place.

In [15]:
def list_add(l: list) -> list:
    l.append(1)
    return l


def int_add(i: int) -> int:
    i += 1
    return i


lst = []
num = 0

list_add(lst)
int_add(num)

print(lst)  # [1]
print(num)  # 0


[1]
0
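
The same behavior can be checked with the `is` operator, which compares object identity. This is a minimal sketch; `rebind`, `mutate`, and `append_copy` are hypothetical helpers, and copying the list first is one common way to avoid changing the caller's object:

```python
def rebind(i: int) -> int:
    # `i += 1` builds a new int and rebinds only the local name
    i += 1
    return i


def mutate(items: list) -> list:
    # `append` changes the list object that the caller also holds
    items.append(1)
    return items


def append_copy(items: list) -> list:
    # copying first leaves the caller's list untouched
    result = list(items)
    result.append(1)
    return result


num = 0
lst = []

rebind(num)  # num is still 0
same_object = mutate(lst) is lst  # True, and lst is now [1]
copied = append_copy(lst)  # [1, 1], while lst stays [1]
is_new_object = copied is not lst  # True
```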


### Profiling

In [16]:
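# use getsizeof to get the size of an object in bytes
# (only the container itself is counted, not the objects it references)
from sys import getsizeof

size = getsizeof([i for i in range(1000)])  # 8856 (varies by Python version and platform)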
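
Note that `getsizeof` is shallow: it measures the container but not the objects it refers to. The sketch below shows the difference with a nested list. `deep_size` is a hypothetical helper and only a rough estimate, since it counts shared objects more than once:

```python
from sys import getsizeof

nested = [[0] * 1000]

# the outer list stores a single reference, so it reports a small size
outer_only = getsizeof(nested)
inner_only = getsizeof(nested[0])


def deep_size(obj) -> int:
    """Rough recursive size estimate for built-in containers (illustrative only)."""
    size = getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_size(k) + deep_size(v) for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_size(item) for item in obj)
    return size


total = deep_size(nested)
```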


In [17]:
# you can profile in ipython with the %timeit magic
# (can also be used as a cell (%%) magic)
%timeit list()
%timeit []  # literal faster than factory


55.1 ns ± 4.47 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
26.9 ns ± 0.541 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
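
Outside of IPython, the standard library `timeit` module offers a comparable measurement. A minimal sketch follows; the loop count `n` is an arbitrary choice, and the timings will differ between machines:

```python
from timeit import timeit

n = 100_000

# timeit runs the statement `number` times and returns the total seconds
factory_seconds = timeit("list()", number=n)
literal_seconds = timeit("[]", number=n)

# convert the totals to nanoseconds per call for comparison with %timeit
factory_ns = factory_seconds / n * 1e9
literal_ns = literal_seconds / n * 1e9
```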


In [18]:
# you can also use line_profiler
%load_ext line_profiler


In [19]:
from time import sleep


def fn(n) -> int:
    """A function to profile"""
    x = 0
    for i in range(n):
        x += i
    sleep(1)
    y = 0
    for i in range(n):
        y += i * i  # takes a little longer
    return x + y


In [20]:
%lprun -f fn fn(1000000)


Timer unit: 1e-09 s

Total time: 2.40152 s
File: /var/folders/_0/vzr7w6510yg9930l0ydhc1mm0000gn/T/ipykernel_9860/1427703122.py
Function: fn at line 4

Line #      Hits         Time  Per Hit   % Time  Line Contents
     4                                           def fn(n) -> int:
     5                                               """A function to profile"""
     6         1       1000.0   1000.0      0.0      x = 0
     7   1000001  335777000.0    335.8     14.0      for i in range(n):
     8   1000000  346604000.0    346.6     14.4          x += i
     9         1 1003019000.0    1e+09     41.8      sleep(1)
    10         1       1000.0   1000.0      0.0      y = 0
    11   1000001  332160000.0    332.2     13.8      for i in range(n):
    12   1000000  383955000.0    384.0     16.0          y += i * i  # takes a little longer
    13         1          0.0      0.0      0.0      return x + y

### Help

In [21]:
# get help on a function
help(len)


Help on built-in function len in module builtins:

len(obj, /)
    Return the number of items in a container.



In [22]:
# get help on an instance method
help("".join)


Help on built-in function join:

join(iterable, /) method of builtins.str instance
    Concatenate any number of strings.
    
    The string whose method is called is inserted in between each given string.
    The result is returned as a new string.
    
    Example: '.'.join(['ab', 'pq', 'rs']) -> 'ab.pq.rs'



In [23]:
%%bash
# get help on a method of a type defined in a module, straight from the shell
poetry run python -c 'import numpy; help(numpy.ndarray.reshape)'


Help on method_descriptor:

reshape(...)
    a.reshape(shape, order='C')
    
    Returns an array containing the same data with a new shape.
    
    Refer to `numpy.reshape` for full documentation.
    
    See Also
    --------
    numpy.reshape : equivalent function
    
    Notes
    -----
    Unlike the free function `numpy.reshape`, this method on `ndarray` allows
    the elements of the shape parameter to be passed in as separate arguments.
    For example, ``a.reshape(10, 11)`` is equivalent to
    ``a.reshape((10, 11))``.

