# A brief introduction to Python
[Presented at [Neurohackademy 2020](https://neurohackademy.org/) by [Tal Yarkoni](https://talyarkoni.org)]

This notebook presents a very brief overview of the Python programming language, with a particular slant on tools and applications relevant for data science. It's assumed that the reader has at least a little bit of prior programming experience; the emphasis is primarily on (a) demonstrating how basic things are done in Python, and (b) reviewing the many strengths of Python (and okay, also a few weaknesses).

[**Cautionary note**: while this tutorial is introductory, that doesn't mean it's *easy*. Novice or less experienced programmers may find some of the concepts covered here—particularly the later sections (e.g., object-oriented programming)—difficult. If this is your experience, don't be alarmed! Programming computers is a fundamentally hard thing to do. The fact that this notebook is titled "a brief introduction" shouldn't fool you into thinking that one learns to become a proficient programmer in just a few hours of reading and experimentation. Readers will undoubtedly get more or less out of different parts of this tutorial depending on their prior experience; the hope is just that at least *some* part of the tutorial will be helpful to almost anyone looking to learn Python.]

## What is Python?

Python is a programming language. Specifically, it's an easy-to-learn, high-level, general-purpose, dynamic programming language. That's quite a mouthful, so let's unpack that...

### Easy to learn
Programming is hard, so, in an absolute sense, no programming language is easy to learn unless you already have prior programming experience. But, comparatively speaking, Python's high-level nature (see next section), readable syntax, and use of semantic whitespace make the language easier to pick up than many others. For example, below is a (deliberately uncommented) definition of a simple Python function that converts a string of English words to (crummy) Pig Latin:

In [None]:
def pig_latin(text):
    ''' Takes in a sequence of words and converts it to (imperfect) pig latin. '''
    
    word_list = text.split(' ')
    output_list = []

    for word in word_list:

        word = word.lower()

        if word.isalpha():
            first_char = word[0]
        
            if first_char in 'aeiou':
                word = word + 'ay'
            else:
                word = word[1:] + first_char + 'yay'

            output_list.append(word)
    
    pygged = ' '.join(output_list)
    return pygged

The above function won't actually produce completely valid Pig Latin (assuming that there's such a thing as "valid Pig Latin"), but that's okay. It does something passable:

In [None]:
test1 = pig_latin("let us see if this works")

print(test1)

Pig Latin aside, the code is fairly easy to read ("easy" is relative, of course; I'm not suggesting that a novice programmer with no Python experience should be able to scan the code and immediately understand what's going on at every step!). There are several reasons for this. First, the code is written at a high level of abstraction (more on this below), so that each line of code maps onto a fairly intuitive operation like "take the first character of this word", and not onto a less intuitive lower-level operation like "reserve 1 byte of memory for a character I'm going to hand you in a moment". Second, the control structures (i.e., for-loops, if-then conditionals, etc.) use words like `in`, `and`, and `not`, rather than mysterious-looking operators. Third, Python's strict control of indentation (more on this later) imposes a level of discipline that keeps code readable while also preventing certain very common kinds of errors. And fourth, the Python community's strong emphasis on adhering to style conventions and writing "Pythonic" code means that Python programmers, more so than those working in many other languages, tend to use consistent naming conventions, line lengths, programming idioms, and many other similar features that collectively make it easier to read someone else's code (though admittedly this is more a feature of the community than of the language itself).

### High-level
Python features a high level of abstraction. Many operations that must be invoked explicitly in lower-level languages (e.g., C or C++) are performed implicitly in Python. For example, you almost never have to explicitly allocate memory or collect garbage in Python—it's all done for you. Put simply, Python lets you write code faster than in many other languages.

### Dynamic
Python code is interpreted at run-time: there's no compilation process (well, this isn't entirely true, but close enough), and code is read line-by-line when executed. The upside of this is that it eliminates a common choke-point in development (i.e., waiting for code to compile), and facilitates very fast iteration. It also means variables can be dynamically typed (more on that below). The downside is that, as with other dynamic languages, Python is often considerably slower than compiled languages, at least when performing operations that can't be easily optimized and/or bound to pre-existing code written in a compiled language. (You wouldn't, for example, want to write a 3D game engine in Python.)

### General-purpose
In contrast to many other dynamic programming languages designed to fill specific niches, Python is well suited for a very wide range of applications. It features a comprehensive standard library (i.e., the functionality available out-of-the-box when you install Python) and an enormous ecosystem of third-party packages. It also supports multiple programming paradigms to varying extents (object-oriented, functional, etc.). Consequently, Python is used in many areas of software development (data science, back-end web development, DevOps, scripting engines, etc.).

## Variables and basic types
Now that we've done a bit of evangelizing for Python (we'll do some more at the end!), let's look at the actual mechanics of the language. (If you have a fair bit of experience in other programming languages, you'll probably find the next few sections very basic, and might want to skip ahead.)

### Declaring variables
In Python, we declare a variable by assigning it a value with the `=` sign:

In [None]:
my_favorite_number = 3

Notice that when we initialized the above variable and assigned it a value (`3`), we didn't have to declare its *type* anywhere. In a statically typed language like C++, we'd have to explicitly indicate that the variable holds an integer (e.g., `int my_favorite_number = 3`). In Python, we just assign the value to the variable.

Not having to declare types is known as *dynamic typing*: types belong to values rather than to variable names, and are only checked at run-time. It goes hand in hand with a related idea called *[duck typing](https://en.wikipedia.org/wiki/Duck_typing)*, in reference to the idea that in languages like Python, you don't need to know ahead of time whether something is or isn't a duck: when you see an object that looks like a duck and behaves like a duck, you just assume it's a duck when you interact with it. If something goes wrong, and your interaction fails, then you know the object isn't a duck.
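
To make the duck idea a little more concrete, here's a minimal sketch. The `describe_length()` helper is made up purely for illustration (it isn't part of Python or of this tutorial), and it uses a `try`/`except` block, which we don't otherwise cover here: it simply means "attempt this, and if it fails with this kind of error, do that instead".

```python
# Hypothetical helper, invented for this illustration only.
def describe_length(thing):
    # We never check the type of `thing`; we just try to use it
    # as if it were something that has a length.
    try:
        return len(thing)
    except TypeError:
        # The interaction failed, so this object isn't "a duck".
        return None

# The same variable name can be re-bound to values of different types.
value = 3
value = "three"

print(describe_length(value))      # a string has a length
print(describe_length([1, 2, 3]))  # so does a list
print(describe_length(42))         # an integer does not, so we get None
```

Notice that nothing in the function says what type `thing` has to be; whether the call works depends entirely on how the object behaves.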

### Printing variables
We can examine the contents of a variable at any time using the built-in `print()` function:

In [None]:
print(my_favorite_number)

If we're working in an interactive Python shell (or an environment wrapped around one, like a Jupyter notebook), we may not even need to call `print()`, as we'll automatically get the output of the last line evaluated by the Python interpreter:

In [None]:
# this line won't be printed, because it isn't the last line in the notebook cell to be evaluated
"this line won't be printed"

# but this one will
my_favorite_number

### Built-in types

If you're coming to Python from another language, you're probably used to working with different types of variables—things like strings, booleans, integers, and so on. Python is no different, and provides us with a large number of [built-in types](https://docs.python.org/3/library/stdtypes.html). Let's take a quick look at some of these. We're assuming a little bit of prior programming experience here, so I won't bother to explain what a string or an integer is; the main thing is to just learn to recognize what different types look like in Python, and how they can be used.

#### Integers

In [None]:
# assign an integer to a variable
age_in_years = 30

In [None]:
# arithmetic works as you would expect
# (note that in Python 3, / always returns a float, so this gives 15.0)
age_in_years / 2

#### Floats

In [None]:
# A float
almost_pi = 3.14

In [None]:
# arithmetic on floats also works as you'd expect
almost_pi + 10

In [None]:
# round() is a built-in function that rounds numbers.
# notice that it returns an integer and not a float,
# even if the input was a float.
# how can you tell this at a glance?
round(almost_pi)

#### Strings
Strings come with a lot of useful built-in methods in Python ([see for yourself](https://docs.python.org/3/library/stdtypes.html#string-methods)!). Let's explore just a few...

In [None]:
# A string
country = "Madagascar"

In [None]:
# How long is the string?
len(country)

In [None]:
# Convert to uppercase
# you can also try lower() or capitalize()
country.upper()

In [None]:
# Count the number of occurrences of the passed substring
country.count('a')

In [None]:
# Replace matching substrings with another value
country.replace('car', 'truck')

#### Booleans
Booleans operate pretty much the same in Python as in other languages; the main thing to recognize is that they can only take on the values `True` or `False`. Not `true` or `false`, not `'true'` or `'false'`; not `1` or `0`.

In [None]:
enjoying_tutorial = True

As you probably know, we can perform logical operations that will evaluate to a boolean:

In [None]:
# Is the length of the string 'apple' greater than 2?
len('apple') > 2

In [None]:
# Is the product of the first two numbers equal to the third?
719 * 1.0002 == 2000

#### None
In addition to the usual suspects, Python also has a special value called `None` (the only value of the type `NoneType`). `None` indicates that no value has been assigned to a variable or returned by a function. It's roughly equivalent to many other languages' `null` value.

In [None]:
name = None

Note: `None` is NOT the same thing as `False`!

In [None]:
None == False
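
The distinction matters in practice, so here's a small sketch of how `None` is usually checked. The variable name is just an example. The `is` operator tests whether two names refer to the very same object, and it's the conventional way to test for `None`.

```python
# Illustrative variable; any name would do.
middle_name = None

# The idiomatic check for None uses `is` rather than `==`.
print(middle_name is None)

# None counts as "falsy" when converted to a boolean...
print(bool(middle_name))

# ...but it is still not equal to False.
print(middle_name == False)
```

In other words, `None` behaves like `False` inside a conditional, but the two are different objects of different types.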

## Collections
Most code we're going to want to write in Python will require more than just strings and integers. We're going to need more complex data structures, or *collections*, that can hold other objects (like strings, integers, etc.) and enable us to easily manipulate them in various ways. Python provides built-in support for many common collections, and others can be found in various modules in the standard library (e.g., [collections](https://docs.python.org/3/library/collections.html)).

### Lists
Lists are the most common collection we'll work with in Python. A list is an ordered, heterogeneous collection of objects.

By *ordered* we mean that a list retains a memory of the position each of its elements was inserted in. The order of elements won't change unless we explicitly change it. This allows us to access individual elements in the list directly, by specifying their *index*.

By *heterogeneous*, we mean that a list can contain elements of different types. A list doesn't have to contain all strings or all integers; it can contain a mix of them, as well as all kinds of other types.

#### List initialization
To create a list, we enclose one or more values between square brackets (`[` and `]`). Elements are separated by commas.

In [None]:
# Notice the different types--lists are heterogeneous!
random_stuff = [11, "apple", 7.14, "banana"]

#### List indexing
To access the $i^{th}$ element in a list, we enclose the index $i$ in square brackets. Note that Python uses 0-based indexing (i.e., the first element in the sequence has index 0), rather than the 1-based indexing used in some other data-centric languages (MATLAB, R, etc.).

In [None]:
# Returns the second element in the list
random_stuff[1]

#### List slicing
We can access sub-lists containing multiple contiguous elements using the colon (`:`) operator.

In [None]:
# First number indicates the start position;
# second indicates the end position. Note that
# the start is inclusive and the end is exclusive.
# In this example, we get back the 2nd and 3rd
# elements, but not the 4th.
random_stuff[1:3]
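
A few other slicing and indexing forms come up often enough to be worth a quick sketch. The `letters` list below is a throwaway example, separate from `random_stuff`.

```python
letters = ['a', 'b', 'c', 'd', 'e']

# Omitting the start means "from the beginning"
first_two = letters[:2]

# Omitting the end means "through the last element"
from_third = letters[2:]

# Negative indices count backwards from the end
last_one = letters[-1]
last_two = letters[-2:]

# An optional third number sets the step size
every_other = letters[::2]

print(first_two, from_third, last_one, last_two, every_other)
```

Slicing always returns a new list, so the original `letters` list is left unchanged.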

#### Assigning values to list elements
To overwrite an element at a given index, we just assign a value to it:

In [None]:
print("First element before re-assignment:", random_stuff[0])

random_stuff[0] = 14

print("First element after re-assignment:", random_stuff[0])

#### Appending to a list
We can add a single element to a list via the `.append()` method.

In [None]:
# Append an element
random_stuff.append(88)

# Now our list has changed
random_stuff

### Dictionaries (dict)
Dictionaries are another extremely commonly used data structure in Python. A dictionary (or dict) is a mapping from keys to values; we can think of it as a set of key:value pairs, where the keys have to be unique. Many other languages have structures analogous to Python's dictionaries, though they're usually called something like *associative arrays* or *hashtables*.

#### Dictionary initialization
Dictionary initialization looks like this:

In [None]:
fruit_prices = {
    'apple': 0.65,
    'mango': 1.5,
    'strawberry': '$3/lb',
    'durian': 'unavailable',
    5: 'just to make a point'
}

Note that both the keys and values are heterogeneously typed (observe the last pair, where the key is an integer).

#### Dictionary indexing
Dictionaries are indexed by key. The syntax is identical to that used for list indexing:

In [None]:
# Returns the stored value associated with the key 'mango'
fruit_prices['mango']

However, dictionaries *cannot* be indexed by position. Dictionaries do remember the order in which their keys were inserted (this has been guaranteed since Python 3.7), but the square-bracket syntax always looks up a *key*, never a position. This means you can't ask for, e.g., "the 4th key:value pair in the dictionary" by writing its index. The following example fails, with a `KeyError` telling us there is no such key in the dictionary:

In [None]:
fruit_prices[0]
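
If you'd rather not trigger a `KeyError` when a key might be missing, there are a couple of common ways to look before you leap. Here's a minimal sketch using a small stand-in dictionary (`prices` is just an example name, so the `fruit_prices` dict above is left untouched).

```python
prices = {'apple': 0.65, 'mango': 1.5}

# The `in` operator checks whether a key exists
has_mango = 'mango' in prices
has_kiwi = 'kiwi' in prices

# .get() returns None instead of raising a KeyError for a missing key...
kiwi_price = prices.get('kiwi')

# ...or a fallback value of our choosing, if we pass one
kiwi_label = prices.get('kiwi', 'unavailable')

print(has_mango, has_kiwi, kiwi_price, kiwi_label)
```

Neither approach adds the missing key to the dictionary; they only change what happens when the lookup fails.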

#### Updating a dictionary
Updating a dictionary uses the same indexing syntax, except we now make an explicit assignment:

In [None]:
# Add a new entry for orange
fruit_prices['orange'] = 0.5

# Overwrite the existing value for mango
fruit_prices['mango'] = 2.25

In [None]:
# Let's look at the dict again...
fruit_prices

### Tuples
Tuples are very similar to lists in Python. The two are easy to confuse, and in practice, you can use a list in most places where you can use a tuple (though there are some important exceptions we won't cover here). The main difference between lists and tuples is that lists are *mutable*, meaning, they can change after initialization. Tuples are *immutable*; once a tuple has been created, it can no longer be modified.

We initialize a tuple in much the same way as a list, except we use parentheses instead of square brackets:

In [None]:
# Tuples are initialized with parentheses, not brackets
my_tuple = ('a', 12, 4.4)
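
To see immutability in action, here's a small sketch comparing a tuple with a list. The names are arbitrary, and the `try`/`except` block is only there so the failed assignment doesn't halt the cell.

```python
point = (3, 4)

# Reading from a tuple works just like reading from a list
print(point[0])

# Writing to a tuple raises a TypeError
try:
    point[0] = 10
except TypeError as err:
    print("Can't modify a tuple:", err)

# The equivalent assignment on a list is fine
coords = [3, 4]
coords[0] = 10
print(coords)
```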

## Everything in Python is an object
The discussion so far might give you the impression that some data types in Python are basic or special in some way. It's natural to think, for example, that strings, integers, and booleans are "primitive" data types—i.e., that they're built into the core of the language, behave in special ways, and can't be duplicated, or modified. And this is true in many other programming languages. For example, in Java, there are exactly 8 primitive data types. If you get bored of them, you're out of luck. You can't just create new ones—say, a new type of string that behaves just like the primitive strings, but adds some additional stuff you think would be kind of cool to have.

Python is different: it doesn't *really* have any primitive data types. Python is a deeply object-oriented programming language, and in Python, *everything is an object*. Strings are objects, integers are objects, booleans are objects. So are collections. So are dictionaries. Everything is an object. We'll explore some of the deeper implications of this later. For now, let's focus on what it means for the way we write Python code. 
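
One quick (and admittedly informal) way to convince yourself of this is to ask Python directly. The sketch below uses the built-in `isinstance()` function to check a few values against `object`, the base type that every Python type ultimately derives from.

```python
# Numbers, strings, booleans, and even functions are all objects
checks = [
    isinstance(3, object),
    isinstance("apple", object),
    isinstance(True, object),
    isinstance(len, object),
]
print(checks)

# Because numbers are objects, they carry their own methods too.
# bit_length() reports how many bits are needed to represent an integer.
bits = (10).bit_length()
whole = (0.5).is_integer()
print(bits, whole)
```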

### The dot notation
Let's start with the dot (`.`) notation we use to indicate that we're accessing data or functionality inside an object. You've probably already noticed that there are two kinds of constructions we've been using in our code to do things with variables. There's the functional syntax, where we pass an object as an argument to a function:

In [None]:
len([2, 4, 1, 9])

And then there's the object-oriented syntax that uses the dot notation, which we saw when looking at some of the functionality implemented in strings:

In [None]:
phrase = "aPpLeS ArE delICIous"

phrase.lower()

If you have some experience in another object-oriented programming language, the dot syntax will be old hat to you. But if you've mostly worked in data-centric languages (e.g., R or MATLAB), you might find it puzzling.

What's happening in the above example is that we're calling the method `lower()` *on* the `phrase` object itself. You can think of the `.` as expressing a relationship of belonging, or roughly translating as "look inside of". So, when we write `phrase.lower()`, we're essentially saying, "try to call the `lower()` method that's contained inside of `phrase`". (I'm being a bit sloppy here for the sake of simplicity, but that's the gist of it.)

Note that `lower()` works on strings, but it isn't a built-in function in Python. We can't just call `lower()` on the air around us:

In [None]:
lower()

And neither is `lower()` a method that's available on *all* objects. For example, this won't work:

In [None]:
num = 6

num.lower()

Integers, as it happens, don't contain a method called `lower()`. And neither do most other types. Strings in Python *do* contain a method called `lower()`, and what that method does is return a lower-cased version of the string on which we called the method. But that functionality is a feature of the string type itself, and *not* of the Python language in general.
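
Here's a short sketch that makes the "feature of the string type" point a bit more tangible. It relies on two things that exist in standard Python: the method can be reached through the `str` type itself, and the built-in `hasattr()` function tells us whether an object has an attribute with a given name.

```python
phrase = "aPpLeS ArE delICIous"

# The method lives on the str type; calling it on an instance
# is shorthand for passing that instance to the type's method.
same_result = str.lower(phrase) == phrase.lower()

# hasattr() lets us ask before we call
string_has_lower = hasattr(phrase, 'lower')
integer_has_lower = hasattr(6, 'lower')

print(same_result, string_has_lower, integer_has_lower)
```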

Later, we'll see how we go about defining new types (or classes), and specifying what methods they have. For the moment, the main point to take away is that almost all functionality in Python is going to be accessed via objects. The dot notation is ubiquitous in Python, so you'll need to get used to it quickly if you're used to a purely functional syntax.

#### Inspecting objects
One implication of everything being an object in Python is that we can always find out exactly what data an object contains, and what methods it implements, by inspecting it in various ways.

We won't look very far under the hood of objects in this tutorial, but it's worth knowing about a couple of ways of interrogating objects that can make your life easier.

First, you can always see the type of an object with the built-in `type()` function:

In [None]:
msg = 'Hello World!'

type(msg)

Second, the built-in `dir()` function will show you all of the attributes and methods implemented on an object. Be warned that this will often be a long list, and that some of the attribute names you see (mainly those that start and end with two underscores) will look a little wonky. We'll talk about those briefly later.

In [None]:
dir(msg)

That's a pretty long list! Any name in that list is available to you as an attribute of the object (e.g., `msg.upper()`, `msg.__class__`, etc.). Notice that the list contains all of the string methods we experimented with earlier (including `lower`), as well as many others.
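
If the double-underscore names are getting in the way, one option is to filter them out. This is just a sketch of one way to do it; it uses a for-loop and a conditional, both of which are covered in more detail in the next section.

```python
msg = 'Hello World!'

# Keep only the names that don't start with an underscore
public_names = []
for name in dir(msg):
    if not name.startswith('_'):
        public_names.append(name)

print(public_names)
```

What's left is the list of ordinary string methods, which is usually the part you care about when exploring an unfamiliar object.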

## Control flow
Like nearly every other programming language, Python has a number of core language constructs that allow us to control the flow of our code—i.e., the order in which functions get called and expressions are evaluated. The two most common ones are conditionals (if-then statements) and for-loops (for others, see the [official docs](https://docs.python.org/3/reference/compound_stmts.html)).

### Conditionals
Conditional (or if-then) statements allow our code to branch—meaning, we can execute different chunks of code depending on which of two or more conditions is met. For example:

In [None]:
mango = 0.2 

if mango < 0.5:
    print("Mangoes are super cheap; get a bunch of them!")
elif mango < 1.0:
    print("Get one mango from the store.")
else:
    print("Meh. I don't really even like mangoes.")

The printed statement will vary depending on the value assigned to the `mango` variable. Try changing it and see what happens when you re-run the cell.

Notice that there are actually three statements in the above code: `if`, `elif` (which in Python stands for "else if"), and `else` (the whole if-then-else construction is referred to as a *compound statement*). Only the first of these (i.e., `if`) is strictly necessary. And there can be arbitrarily many `elif` statements. Try adding another one to the code above.

### Loops
For-loops allow us to iterate (or loop) over the elements of a sequence (e.g., a list) and perform the same operation(s) on each one:

In [None]:
# Loop over the random_stuff list we created earlier and print each element
for elem in random_stuff:
    print(elem)

### Nested control flow
We can nest conditionals and for-loops inside one another (as well as inside other compound statements). For example, we can loop over the elements of `random_stuff`, as above, but keeping only the elements that meet some condition—e.g., only those elements that are strings:

In [None]:
# create an empty list to hold the filtered values
strings_only = []

# loop over the random_stuff list
for elem in random_stuff:
    # if the current element is a string...
    if isinstance(elem, str):
        # ...then append the value to strings_only
        strings_only.append(elem)

print("Only the string values:", strings_only)

### List comprehensions
In Python, many for-loops can also be written in a more compact way known as a list comprehension. List comprehensions are essentially *syntactic sugar*: a comprehension does the same work as the equivalent for-loop, with the one addition that it collects the result of each iteration into a new list. Here's the list comprehension version of the first for-loop we wrote above:

In [None]:
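# this prints the same output as the first loop above; it also builds
# a list of None values (the return value of print), which the trailing
# semicolon keeps the notebook from displaying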
[print(elem) for elem in random_stuff];

We can also embed conditional statements inside list comprehensions. Here's a much more compact way of writing the string-filtering snippet we wrote above:

In [None]:
strings_only = [elem for elem in random_stuff if isinstance(elem, str)]

print("Only the string values:", strings_only)

List comprehensions can save you quite a bit of typing once you get used to reading them, and you may eventually even find them clearer to read. It's also possible to nest list comprehensions (equivalent to for-loops within for-loops), though that power should be used sparingly, as nested list comprehensions can be difficult to understand.
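
For reference, here's a minimal sketch of what that nesting looks like, using a made-up list of lists. The two versions below build the same flattened list; note that the `for` clauses in the comprehension appear in the same order as in the nested loops.

```python
rows = [[1, 2, 3], [4, 5, 6]]

# Nested for-loops
flat_loop = []
for row in rows:
    for value in row:
        flat_loop.append(value)

# The equivalent nested list comprehension
flat_comp = [value for row in rows for value in row]

print(flat_loop)
print(flat_comp)
```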

### Syntactically significant whitespace
One thing you might have noticed when reading the conditional statements and for-loops above is that we always seem to indent our code inside these statements. This isn't a matter of choice; Python is a bit of an odd duck among programming languages, in that it imposes strong rules about how whitespace can be used (i.e., whitespace is *syntactically significant*). This can take a bit of getting used to, but once you do, it has important benefits: there's much less variability in coding style across different Python developers, and reading other people's code is often much easier than it is in languages without semantic whitespace.

The main rule you need to be aware of is that whenever you enter a compound statement (which includes for-loops and conditionals, but also function and class definitions), you have to increase the indentation of your code. When you exit the compound statement, you then decrease the indentation by the same amount.

The exact amount you indent each time is technically up to you. But it's strongly recommended that you use the same convention everyone else does (described [here](https://www.python.org/dev/peps/pep-0008/#indentation)), which is to always indent or dedent by 4 spaces. Here's what this looks like in a block with multiple nested conditionals:

In [None]:
num = 800

if num > 500:
    if num < 900:
        if num > 700:
            print("Great number.")
        else:
            print("Terrible number.")

Now try modifying the above snippet so that you (a) consistently use a different amount of indentation, and (b) break Python by using invalid indentation.
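
If you'd like to check your intuitions without repeatedly breaking the cell, here's one possible sketch. It hands two small snippets of source code (stored as strings) to the built-in `compile()` function, which parses code without running it, and reports whether Python accepted each one. The snippets and variable names are made up for this example.

```python
# Two-space indentation: unconventional, but consistent, so it's valid
two_space_source = "if True:\n  x = 1\n"

# No indentation after the colon: invalid
bad_source = "if True:\nprint('not indented')\n"

results = {}
for label, source in [('two spaces', two_space_source), ('missing indent', bad_source)]:
    try:
        compile(source, "<example>", "exec")
        results[label] = "ok"
    except IndentationError as err:
        results[label] = "IndentationError: " + err.msg

print(results)
```

The exact wording of the error message varies between Python versions, but the error type is always `IndentationError`.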

## Namespaces and imports
Python is a high-level, dynamic programming language, which people often associate with flexibility and lack of precision (e.g., you don't have to type your variables when you declare them in Python). But in some ways, Python is actually much more of a stickler than most other dynamic languages about the way Python developers write their code. We just saw that Python is very serious about how you indent your code. Another thing that's characteristic of Python is that it takes *namespacing* very seriously.

If you're used to languages like, say, R or MATLAB, you might expect to have hundreds of different functions available to call as soon as you fire up an interactive prompt. In Python, the *built-in namespace*—i.e., the set of functions you can invoke when you start running Python—is [very small](https://docs.python.org/3/library/functions.html). This is by design: Python expects you to carefully manage the code you use, and it's particularly serious about making sure you maintain orderly namespaces.

In practice, this means that any time you want to use some code that's not available to you in your current [scope](https://en.wikipedia.org/wiki/Scope_(computer_science)), you need to explicitly *import* it from whatever module it's currently in, via an `import` statement. Python's import system often annoys beginners, because it forces them to write additional lines of code that other languages don't. But once you get used to it, you'll find that it substantially increases code clarity and almost completely eliminates naming conflicts and confusion.

### Importing a module
Conventionally, all import statements in a Python file are consolidated at the very top (though there are some niche situations where this isn't possible). Here's what the most basic usage of `import` looks like:

In [None]:
import json

# Create a dict
dummy = {'a': 1, 'b': 5}

# Dump the contents of the dict to a JSON string
json.dumps(dummy)

Here we import the `json` module from the standard library. Once we do that, we can call functions located within that module, using the dot syntax you see above. In this case, we're calling the `dumps()` function that's contained in the `json` module we imported. But we could also call `json.load()` or `json.dump()`, as those are other functions available in the `json` module's namespace.

Note that if we hadn't explicitly imported `json`, the `json.dumps()` call would have failed, because `json` would be undefined in our namespace. You can try it and see what happens.
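
As a small follow-up sketch, here's a round trip through the `json` module: `dumps()` turns a dict into a string, and `loads()` (another function in the same module) turns such a string back into a dict. The variable names are arbitrary.

```python
import json

record = {'a': 1, 'b': 5}

# Serialize the dict to a JSON-formatted string
as_text = json.dumps(record)
print(type(as_text), as_text)

# Parse the string back into a Python dict
restored = json.loads(as_text)
print(restored == record)
```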

### Importing from a module
Importing a module by name gives us access to all of its internal attributes. But sometimes we only need to call a single function inside a module, and we might not want to have to type the full module name every time we use that function. In that case, we can import *from* the module:

In [None]:
# defaultdict is a dictionary that has default values
# for new keys.
from collections import defaultdict

test_dict = defaultdict(int)

# this would fail with a normal dict, but with a defaultdict,
# a new key with the default value is created upon first access.
test_dict['made_up_key']

Here we import `defaultdict()` directly *from* the `collections` module *into* our current namespace. This makes `defaultdict` available for our use. Note that `collections` itself is *not* available to us unless we explicitly import it (i.e., if we `import collections`):

In [None]:
test_dict = collections.defaultdict(int)

### Renaming variables at import time
Sometimes the module or function we want to import has an unwieldy name. Python's import statements allow us to rename the variable we're importing on-the-fly using the `as` keyword:

In [None]:
from collections import defaultdict as dd

b = dd(float)

b['apple']

For many commonly used packages, there are strong conventions about naming abbreviations, and you should make sure to respect these in your code. For example, it's standard to see `import numpy as np` and `import pandas as pd`. Your code will still work fine if you use other variable names, but other programmers will have a slightly more difficult time understanding what you're doing. So be kind to others and respect the conventions.

## Functions
Python would be of limited use to us if we could only run our code linearly from top to bottom. Fortunately, as in almost every other modern programming language, Python has *functions*: blocks of code that only run when explicitly called. Some of these are built into the language itself (or contained in the standard library's many modules we can import from, as we saw above):

In [None]:
animal = 'elephant'

# len() is a built-in function that counts the number of
# elements in an iterable object like a list or string.
len(animal)

But we can also define our own functions, which we can then call just like the built-in ones:

In [None]:
def useless_message():
    print("This function just prints a fairly useless message.")

Notice that nothing happens when we run the above block of code, because all we've done is define the function; we haven't yet *called* or *invoked* it. Let's do that:

In [None]:
useless_message()

### Function arguments and return values
Functions can accept *arguments* (or parameters) that alter their behavior. When we called `len()` above, we passed the variable `animal` (which holds the string `'elephant'`) as an argument. This argument is mandatory in the case of `len()`; if you try calling `len()` with no argument (feel free to attempt that), it will generate an error.

Functions can also explicitly *return* values to the user. (If a function doesn't explicitly end with a `return` statement, it will return the special value `None` we encountered earlier.)
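
The difference between printing a value and returning one trips up a lot of newcomers, so here's a quick sketch. Both function names are invented for this example.

```python
def shout(text):
    # Prints the result, but has no return statement
    print(text.upper())

def shouted(text):
    # Hands the result back to the caller
    return text.upper()

result_1 = shout("hello")
result_2 = shouted("hello")

print(result_1)  # None, because shout() never returned anything
print(result_2)
```

Only the second function gives us a value we can store in a variable and use later.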

Let's illustrate the use of arguments by writing a small function that takes a single float as input, adds gaussian noise from a specified distribution to it, and returns the result.

In [None]:
# We'll need the random module for this
import random

def add_noise(x, mu, sd):
    ''' Adds gaussian noise to the input.
    
    Parameters:
        x (number): The number to add noise to
        mu (float): The mean of the gaussian noise distribution
        sd (float): The standard deviation of the noise distribution
    
    Returns: A float.
    '''
    noise = random.normalvariate(mu, sd)
    return x + noise

In [None]:
# Let's try calling it...
add_noise(7, 0, 10)

The `add_noise()` function has three mandatory parameters (i.e., if you omit any of them, an error will be generated). The first (`x`) is the number we want to add noise to; the second is the mean of the gaussian distribution to sample from; and the third is the distribution's standard deviation. Notice that we've documented the function's behavior inside the function using what's called a *[docstring](https://www.python.org/dev/peps/pep-0257/)*. This is a good habit to get into, as good documentation is essential if you expect other people (including future versions of you) to be able to use the code you write.

### Positional and keyword arguments
Python functions can have two kinds of arguments: *positional* arguments, and *keyword* (or sometimes, *named*) arguments.

#### Positional arguments
Positional arguments, as their name suggests, are defined by position, and they *must* be passed when the function is called. The values passed inside the parentheses are mapped one-to-one onto the arguments, as we saw above for `add_noise()`. That is, inside the `add_noise()` function, the first value is referenced by `x`, the second by `mu`, and so on.

If the caller fails to pass the right number of arguments (either too few or too many), an error will be generated:

In [None]:
# Fails because the function has 3 positional arguments, and we only pass one
add_noise(7)

#### Keyword arguments
Keyword arguments are arguments that are assigned a default value in the function *signature* (i.e., the top line of the function definition, that looks like `def my_function(...)`). Unlike positional arguments, keyword arguments are optional: if the caller doesn't pass a value for the keyword argument, the corresponding variable will still be available inside the function, but it will have whatever value is defined as the default in the signature.

To see how this works, let's rewrite our `add_noise()` function so that the parameters of the gaussian distribution are now optional:

In [None]:
def add_noise_with_defaults(x, mu=0, sd=1):
    ''' Adds gaussian noise to the input.
    
    Parameters:
        x (number): The number to add noise to
        mu (float, optional): The mean of the gaussian noise distribution
        sd (float, optional): The standard deviation of the noise distribution
    
    Returns: A float.
    '''
    noise = random.normalvariate(mu, sd)
    return x + noise

This looks very similar, but we can now call the function without filling in `mu` or `sd`. If we don't pass in those values explicitly, the function will internally use the defaults (i.e., `0` in the case of `mu`, and `1` in the case of `sd`).

In [None]:
# Let's call it again...
add_noise_with_defaults(10)

Note that keyword arguments don't need to be filled in order, as long as we explicitly name them. For example, we can specify a value for `sd` but not for `mu`:

In [None]:
# we specify x and sd, but not mu
add_noise_with_defaults(5, sd=100)

Note that if we didn't specify the name of the argument (i.e., if we called `add_noise_with_defaults(5, 100)`), the function would still work, but the second value we pass would be interpreted as `mu` rather than `sd`.
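
If you'd like to check how Python will map values onto parameters without actually running the function, one option is the standard library's `inspect` module. The sketch below repeats the definition of `add_noise_with_defaults()` so that it runs on its own; the variable names `by_position` and `by_name` are just for illustration.

```python
import inspect
import random


def add_noise_with_defaults(x, mu=0, sd=1):
    ''' Adds gaussian noise to the input (same as the definition above). '''
    noise = random.normalvariate(mu, sd)
    return x + noise


signature = inspect.signature(add_noise_with_defaults)

# Passing 100 by position: it lands in the second slot, which is mu
by_position = signature.bind(5, 100)
by_position.apply_defaults()
print(dict(by_position.arguments))

# Passing 100 by name: it lands in sd, and mu keeps its default
by_name = signature.bind(5, sd=100)
by_name.apply_defaults()
print(dict(by_name.arguments))
```

In the first case `100` ends up bound to `mu`; in the second it ends up bound to `sd`, exactly as described above.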

It's also worth noting that we can always explicitly name *any* of our arguments, including positional ones. This is extremely handy in cases where we're calling functions whose argument names we remember, but where we don't necessarily remember the exact order of the arguments. For example, suppose we remember that `add_noise()` takes the three arguments `x`, `mu`, and `sd`, but we don't remember if `x` comes before or after the distribution parameters. We can guarantee that we get the result we expect by explicitly specifying all the names:

In [None]:
add_noise(mu=1, sd=2, x=100)
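
Because the noise is random, it's hard to tell by eye that two calls are really equivalent. One quick way to convince yourself, sketched below, is to reset the random number generator to the same seed before each call. The seed value `42` is arbitrary, and `add_noise()` is redefined here only so the snippet runs on its own.

```python
import random


def add_noise(x, mu, sd):
    ''' Adds gaussian noise to the input (same as the definition above). '''
    noise = random.normalvariate(mu, sd)
    return x + noise


# Same seed, arguments passed by position
random.seed(42)
positional_result = add_noise(100, 1, 2)

# Same seed, arguments passed by name in a scrambled order
random.seed(42)
named_result = add_noise(mu=1, sd=2, x=100)

print(positional_result == named_result)
```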

### Argument unpacking with \*args and \**kwargs
It sometimes happens that a function needs to be able to accept an unknown number of arguments. A very common scenario like this is where we've written a "wrapper" function that takes some input, does some operation that relies on only some of the arguments the user passed, and then hands off the rest of the arguments to a different function. 

For example, suppose we want to write an `arg_printer()` function that we can use to produce a standardized display of the positional and keyword arguments used when calling some other (arbitrary) function. Python handles this scenario elegantly via special `*args` and `**kwargs` syntax in function signatures, also known as *argument unpacking*.

The best way to understand what `*args` and `**kwargs` do is to see them in action (and note that if you're new to programming, some of the ideas in this section and the next couple may take a *while* to make sense). Here's an example:

In [None]:
def arg_printer(func, *args, **kwargs):
    """
    A wrapper that takes any other function plus arguments to pass
    to that function. The arguments are printed out before the
    function is called and the result returned.
    
    Args:
        func (callable): The function to call with the passed args.
        args (tuple): Optional positional arguments to pass onto func.
        kwargs (dict): Optional dict of keyword arguments to pass
            onto func.

    Returns:
        The result of func() when called with the passed arguments.
    """
    print("Calling function '{}'.".format(func.__name__))
    print("Positional arguments:", args)
    print("Keyword arguments:", kwargs)
    return func(*args, **kwargs)

This may seem a bit mysterious, and there are parts we won't explain right now (e.g., `func.__name__`). But try experimenting a bit with calling this function, and things may start to click. Here's an example to get you rolling:

In [None]:
# feel free to experiment with this. e.g.,
# try replacing add_noise with built-in functions
# like min, len, or list. Hint: you'll probably
# need to change the rest of the arguments too!
arg_printer(add_noise, 17, mu=0, sd=5)

What's happening here is that the first argument to `arg_printer()` is the `add_noise()` function we defined earlier. Remember: everything in Python is secretly an object! You can pass functions as arguments to other functions too; it's no different than passing a string or a list. A key point to note, however, is that what we're passing in is the function *itself* rather than the result of calling it. Notice how we didn't add parentheses to `add_noise` when we passed it to `arg_printer()`? That's because we don't want to actually call `add_noise` yet; we're leaving it to `arg_printer()` to do that internally.

All the other arguments to `arg_printer()` after the first one are arguments that we actually want to pass onto the `add_noise()` function when it's called internally by `arg_printer()`. The first thing `arg_printer()` does is print out the name of the function we just gave it, as well as all of the positional and keyword arguments we passed in. Once it's done that, it calls the function we passed in (`add_noise()`) and passes along all the arguments.
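
Since `add_noise()` returns something different on every call, it can be easier to follow what's going on with functions whose results are predictable. Here's a small sketch that hands two of Python's built-in functions (`max` and `sorted`) to the wrapper. The wrapper is repeated with a shortened docstring so that the snippet runs on its own, and the variable names `largest` and `ordered` are just for illustration.

```python
def arg_printer(func, *args, **kwargs):
    """ Same wrapper as above, minus the long docstring. """
    print("Calling function '{}'.".format(func.__name__))
    print("Positional arguments:", args)
    print("Keyword arguments:", kwargs)
    return func(*args, **kwargs)


# Three positional arguments, no keyword arguments
largest = arg_printer(max, 3, 9, 4)
print(largest)

# One positional argument (a list) and one keyword argument
ordered = arg_printer(sorted, [3, 1, 2], reverse=True)
print(ordered)
```

In the printed output, notice that inside the wrapper the positional arguments arrive bundled up as a tuple, and the keyword arguments as a dictionary.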

If the above doesn't make sense, don't worry! We're moving quickly, and the concepts from here on out start to get quite a bit denser. The point of this tutorial is just to give you an overview of how Python works, and people come to Python with varying degrees of experience in other programming languages. If argument unpacking doesn't make much sense right now, try reading other, longer, tutorials (for example, [this one](https://realpython.com/python-kwargs-and-args/)). And remember, there's no substitute for writing your own code and experimenting with different ideas until things start to make sense.

## Classes
At various points in this tutorial, I've pointed out that Python is a deeply object-oriented programming language, and that, in a sense, everything in Python is an object. We're now in a position to unpack that statement. What does it mean to say that something is an object? How do objects get defined? And how do we specify what an object should do when a certain operation is applied to it? To answer these questions, we need to introduce the notion of *classes* and, more generally, the *object-oriented programming* (OOP) paradigm.

### What is a class?
A class is, in a sense, a kind of template for an object. You can think of it as a specification, or a set of instructions that determine what an object of a given class can do. In a sense, it's very close in meaning to what we've already been referring to as the *type* of an object. There *is* technically a difference between types and classes in Python, but it's quite subtle (especially in Python 3), and in day-to-day usage, you can use the terms interchangeably and nobody is going to yell at you.
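
To see how closely the two ideas line up in practice, here's a short sketch using a couple of built-in types (the variable names are arbitrary):

```python
number = 3
word = "apple"

# type() reports the class an object was created from
print(type(number))
print(type(word))

# Every object also carries a reference to its class
print(number.__class__ is type(number))

# isinstance() checks whether an object belongs to a given class
print(isinstance(word, str))
```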

### Defining classes
So a class is a kind of template; okay, what does it look like? Well, minimally, it looks like very little. Here's a fully functional class definition:

In [None]:
class Circle:
    pass

### Creating instances of a class
That's it! We've defined a new Python class. We can now *instantiate* this class if we like—which is to say, we can create entirely new *instances* of objects, whose behaviors are defined by the class. The syntax for creating an instance in Python is:

In [None]:
circle = Circle()

That's it again! We now have a new variable `circle` on our hands, which is an instance of class `Circle`.

If you don't believe me, I can prove it:

In [None]:
type(circle)

#### A note on nomenclature
You may have already noticed the naming convention we've used throughout this tutorial: our variables are always composed of lower-case characters, with words separated by underscores. This is called *snake_case*, and you should get in the habit of following it. You'll also note that class names are capitalized (technically, they're in *CamelCase*). This too is conventional, and you should follow that convention as well.

If you follow both of these conventions, then you won't be surprised when you see `circle` and `Circle` both show up in your code. You'll know that `Circle` is a class, whereas `circle` is a particular *instance*.

### Making it do things
The `Circle` class definition we wrote above was perfectly valid, but not terribly useful. It didn't define any new behavior, so any instances of the class we created wouldn't do anything more than base objects in Python can do (which isn't very much).

Let's fix that by filling in the class a bit.

In [None]:
# We need pi!
import math

class Circle:
    
    def __init__(self, radius):
        self.r = radius

    def area(self):
        return math.pi * self.r**2

There's not much code to see, but conceptually, a lot is going on. Let's walk through this piece by piece.

First, observe that we've defined what look like two new functions inside the class. Technically, these are *methods* and not functions, because they're *bound* to a particular object. But the principle is the same: they're chunks of code that take arguments, do some stuff, and then (possibly) return something to the caller.

You'll also note that both methods have `self` as their first argument. This is a requirement: all instance methods have to take a reference to the current instance (conventionally named `self`) as their first argument (there are also *class* methods and *static* methods, which behave differently, but we won't cover those in this tutorial). This reference is what will allow us to easily maintain state (i.e., to store information that can vary over time) inside the instance.
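
One way to see that `self` really is just a reference to the instance is to call a method through the class rather than through the instance, handing over the instance by hand. This is a sketch for illustration only (you would rarely write code this way), and it repeats the `Circle` definition from above so that it runs on its own.

```python
import math


class Circle:

    def __init__(self, radius):
        self.r = radius

    def area(self):
        return math.pi * self.r**2


circle = Circle(4)

# The usual way: Python fills in self for us
usual = circle.area()

# The long way: look the method up on the class and pass the instance ourselves
explicit = Circle.area(circle)

print(usual == explicit)
```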

Now let's walk through each of the defined methods. First, there's `__init__()`. The leading and trailing double underscores indicate that this is a special kind of method called a *magic* method; we'll talk about those a bit more later. For the moment, we just have to know that `__init__()` is a special method that gets called whenever we initialize a new instance of the class. So, when we write a line like `circle = Circle()`, what happens is that the `__init__()` method of `Circle` gets called and executed.

Observe that, in this case, `__init__()` takes a single argument (other than `self`, that is): a `radius` argument. And further, the only thing that happens inside `__init__()` is that we store the value of `radius` in an *instance attribute* called `r`. We do this by assigning to `self`. Remember, `self` is a reference to the current instance! So the newly-created instance that's returned by `Circle()` will have that `radius` value set in the `.r` attribute.

This code should make this a bit clearer:

In [None]:
# Initialize a new circle, passing a radius of 4
circle = Circle(4)

# We can now access the stored radius inside our instance
circle.r

Next, let's look at the `area()` method. This one takes no arguments (again, `self` is passed automatically; we don't need to, and shouldn't, pass it ourselves). That means we can just call it and see what happens. Let's do that:

In [None]:
circle.area()

When we call `area()`, what we get back is the area of our circle—based on the radius stored in the instance at that moment. Note that this area is only computed when we actually call `area()`, and isn't computed in advance. This means that if the circle's radius changes, so too will the result of `area()`:

In [None]:
# We're changing the radius stored in the instance
circle.r = 9

# And that should also change the area...
circle.area()

### Magic methods
There's a *lot* more we could say about how classes work in Python, and about object-oriented programming in general, but this is just a brief introduction, so we have to be picky. Let's introduce just one other big concept: *magic methods*. Magic methods of objects, as we've seen a couple of times now, start and end with a double underscore: `__init__`, `__getattr__`, `__new__`, and so on. As their names suggest, these methods are magic—at least in the sense that they appear to add some magic to a class's behavior. We've already talked about `__init__`, which is a magic method that gets called any time we create a new instance of a class. But there are many others.

The key to understanding how magic methods work is to recognize that they're usually called implicitly when a certain operation is applied to an object—even if it doesn't look like the magic method and the operation being applied have anything to do with each other (that's what makes them magic!). Remember how we said earlier that *everything in Python is an object*? Well now we're going to explore one of the deeper implications of that observation, which is that *all operators in Python are actually just cleverly-disguised method calls*. That means that when we write even an expression as seemingly basic as `4 * 3` in Python, it's actually implicitly converted to a call to a magic method on the first operand (4), with the second operand (3) being passed in as an argument.

This is a bit hard to explain abstractly, so let's dive into an example. Start with this naive arithmetic operation:

In [None]:
4 * 3

No surprises there. But here's an equivalent way to write that, which makes clearer what's actually happening under the hood when we multiply one number by another:

In [None]:
# 4 is a number, so we have to wrap it in parentheses to prevent a syntax error.
# but we wouldn't have to do this for variables.
(4).__mul__(3)

Remember the dot notation? Here, `__mul__` is actually a (magic) method implemented in the integer class. When Python evaluates the expression `4 * 3`, it actually calls `__mul__` on the first integer, and hands it the second one as an argument. See, we weren't messing around when we said *everything is an object in Python*. Even something as seemingly basic as the multiplication operator is actually just an alias to a method called on an integer object! Isn't that neat?
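
If you want to poke at this a little more, here's a sketch of a few different spellings of the same multiplication. The `operator` module is part of Python's standard library; the variable names are just for illustration.

```python
import operator

product = 4 * 3

# Calling the magic method on the instance
via_instance = (4).__mul__(3)

# Looking the method up on the int class and passing both operands
via_class = int.__mul__(4, 3)

# The standard library's operator module exposes * as a plain function
via_operator = operator.mul(4, 3)

print(product, via_instance, via_class, via_operator)
```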

### The semantics of *
Once we recognize that Python's `*` operator is just an alias to the `__mul__` magic method, we might start to wonder if this is *always* true. Does every valid use of `*` as an operator in Python code imply that the object just before the `*` must be an instance of a class that implements the `__mul__` method? To a first approximation, the answer is yes! (The main exception is that if the left operand doesn't know how to handle the right one, Python gives the right operand a chance via its `__rmul__` method.) The result of an expression that includes the `*` operator is determined by the receiver object's implementation of `__mul__`, and the same holds for every other operator in Python, including things like `==` and `&`, each of which has its own magic method.
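
A good way to convince yourself of this is to write a class of your own that implements `__mul__`. The `Distance` class below is entirely made up for this purpose; it's a minimal sketch, not something taken from a real library.

```python
class Distance:
    """ A made-up class that stores a length in meters. """

    def __init__(self, meters):
        self.meters = meters

    def __mul__(self, factor):
        # Called whenever a Distance appears on the left of *
        return Distance(self.meters * factor)


walk = Distance(1.5)
there_and_back = walk * 2

print(there_and_back.meters)
```

Note that `2 * walk` would fail with this sketch, because `int` doesn't know how to multiply itself by a `Distance`; supporting that order requires the companion `__rmul__` method, which we won't go into here.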

Just to make it clear how far-reaching the implications of this principle are, let's look at how a couple of other built-in Python types deal with the `*` operator. Let's start with strings. What do you think will happen when we multiply a string object by 2?

In [None]:
# What about a string?
"apple" * 2

There's a good chance this was *not* the behavior you expected. Many people intuitively expect an expression like `"apple" * 2` to produce an error, because we don't normally think of strings as a kind of thing that can be multiplied. But remember: in Python, the multiplication operator is just an alias for a `__mul__` call. And there's no particular reason a string class *shouldn't* implement the `__mul__` method; why not define *some* behavior for it, even if it's counterintuitive? That way users have a super easy way to repeat strings if that's what they want to do.
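
As a small practical illustration of why string repetition is handy, here's a sketch that builds a text divider (the variable names and the length of 20 are arbitrary):

```python
# Build a simple text divider by repeating a single character
divider = "-" * 20

# The explicit method call gives the same result as the operator
same_divider = "-".__mul__(20)

print(divider)
print(divider == same_divider)
```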

What about a list?

In [None]:
# A list?
random_stuff * 3

List multiplication behaves a lot like string multiplication: the result is that the list is repeated $n$ times.

What about dictionary multiplication?

In [None]:
fruit_prices * 2

Finally, we encounter an outright failure! It appears Python dictionaries can't be multiplied. Which presumably means that the `dict` class doesn't implement `__mul__` (you can verify this for yourself by inspecting a dictionary using `dir()`).
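
Here's one way to do that check, as a short sketch. The `prices` dictionary is a stand-in made up for this snippet so that it runs on its own.

```python
# Strings and lists define __mul__; dictionaries do not
print("__mul__" in dir(str))
print("__mul__" in dir(list))
print("__mul__" in dir(dict))

# Attempting the multiplication raises a TypeError
prices = {"apple": 0.5}  # a stand-in dictionary for illustration
try:
    prices * 2
except TypeError as error:
    message = str(error)
    print("TypeError:", message)
```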

### Other magic methods
Most of the magic methods in Python do something very much like what we saw for the multiplication operator. Consider the following operators: `+`, `&`, `/`, `%`, and `<`. These map, respectively, onto the magic methods `__add__`, `__and__`, `__truediv__`, `__mod__`, and `__lt__`. There are many others that follow the same pattern.
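
The sketch below checks each of those pairings on a couple of integers (the values `7` and `2` are arbitrary). For every operator, it computes the result once with the operator and once with the corresponding magic method, and then compares the two.

```python
a, b = 7, 2

# Each pair is (result via operator, result via magic method)
pairs = {
    "+": (a + b, a.__add__(b)),
    "&": (a & b, a.__and__(b)),
    "/": (a / b, a.__truediv__(b)),
    "%": (a % b, a.__mod__(b)),
    "<": (a < b, a.__lt__(b)),
}

for symbol, (via_operator, via_method) in pairs.items():
    print(symbol, via_operator, via_method, via_operator == via_method)
```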

There are also a number of magic methods tied to built-in functions rather than operators (e.g., when you call `len(obj)`, that's equivalent to calling `obj.__len__()`), or that are triggered by certain events (e.g., `__getattr__` is called when a requested attribute isn't found in an object).
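
Here's a minimal sketch of both ideas at once. The `Playlist` class is made up purely for illustration, and returning a string from `__getattr__` is just a convenient way to make the behavior visible; it isn't meant as a recommended design.

```python
class Playlist:
    """ A made-up class used only to illustrate two magic methods. """

    def __init__(self, songs):
        self.songs = songs

    def __len__(self):
        # Called by the built-in len()
        return len(self.songs)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails
        return "no attribute named '{}'".format(name)


playlist = Playlist(["intro", "verse", "outro"])

print(len(playlist))
print(playlist.songs)
print(playlist.genre)
```

Notice that `playlist.songs` never reaches `__getattr__`, because that attribute is found the normal way; only the missing `genre` attribute triggers it.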

For full descriptions of all the magic methods, you can peruse the [official docs](https://docs.python.org/3/reference/datamodel.html). In practice, though, you won't really need to know much about magic methods unless you start writing a lot of classes of your own. We spent a lot of time talking about them mainly because they're a good way to convey some deep insights about the data model at the core of the Python language.

### Hungry Circles
Let's come full circle now (no pun intended) and revisit the `Circle` class we defined earlier. The last thing we'll do in this tutorial is add a magic method to our `Circle` class. This will nicely tie together a lot of different threads we've covered.

What we're going to do, via a clever use of the `__lshift__` magic method (which maps to the `<<` operator), is give instances of class `Circle` the ability to "eat" other circles. When given Python code like this...

In [None]:
c1 = Circle(4)
c2 = Circle(2)
c1 << c2

...we want the first circle to "grow" its radius by exactly the amount required for its new area to equal the sum of the two circles' previous areas.

If you run the snippet above it'll fail, because our current `Circle` implementation has no idea what to do with the `<<` operator. So let's fix that. Here's our updated implementation:

In [None]:
import math

class Circle:
    
    def __init__(self, radius):
        self.r = radius

    def __lshift__(self, prey):
        new_area = self.area() + prey.area()
        self.r = math.sqrt(new_area / math.pi)

    def area(self):
        return math.pi * self.r**2

The only change here is the addition of the `__lshift__` method.

Let's see if the above did what we wanted:

In [None]:
c1 = Circle(4)
print("Radius of c1:", c1.r)
print("Area of c1:", c1.area())

c2 = Circle(2)
print("Radius of c2:", c2.r)
print("Area of c2:", c2.area())

# Now the important part: c1 eats c2!
c1 << c2

Well, we didn't get an error, so that's a good sign. Let's inspect `c1` and see if it's been updated as we expect. Remember: we expect `c1` to have "eaten" `c2`, which means its radius should grow, and its area should be the sum of both previous areas.

In [None]:
print("Radius of c1 after gorging on c2:", c1.r)
print("Area of c1 after gorging on c2:", c1.area())

It worked!
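
If you'd like more than an eyeball check, the expected numbers can be worked out by hand: since each area is $\pi r^2$, the new radius should be $\sqrt{4^2 + 2^2} = \sqrt{20} \approx 4.47$, and the new area should be $20\pi \approx 62.83$. The sketch below repeats the class definition so that it runs on its own, and uses `math.isclose()` for the comparison because floating-point arithmetic is rarely exact.

```python
import math


class Circle:

    def __init__(self, radius):
        self.r = radius

    def __lshift__(self, prey):
        new_area = self.area() + prey.area()
        self.r = math.sqrt(new_area / math.pi)

    def area(self):
        return math.pi * self.r**2


c1 = Circle(4)
c2 = Circle(2)
c1 << c2

# Compare against the values worked out by hand
print(math.isclose(c1.r, math.sqrt(20)))
print(math.isclose(c1.area(), 20 * math.pi))

# c2 is untouched by the meal
print(c2.r)
```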

The only slightly dissatisfying feature of our implementation is that, after `c1` eats `c2`, `c2` is somehow still around to tell the tale. This probably violates some conservation law, but we'll overlook that here. For reasons we won't get into, it's not trivial to delete `c2` from inside `c1`. (There are good reasons for this, and the fact that we can't easily make some of our circles wink out of existence from inside the belly of other circles might lead us to suspect we've architected our code suboptimally. But that's a problem for a different tutorial.)

# Resources/further reading
This tutorial provided a high-level look at some of the main features of the Python language, some basic, some more advanced. To really develop a working familiarity with the language, you will, of course, need to roll up your sleeves and start writing some code. One of the best ways to learn is to pick a small problem that actually interests or matters to you in some way (e.g., parsing some text data you have lying around), and google for help every time you run into problems (there's no shame in consulting the internet! All programmers do it!).

If you prefer to have more structure, there are hundreds of excellent, and mostly free, resources online to help you on your way. A few good ones:

* Codecademy offers interactive programming courses for many languages and tools, including [Python](https://www.codecademy.com/learn/learn-python). (The Python 3 course costs money, but the Python 2 course is free, and the changes to the language aren't huge.)
* [A Whirlwind Tour of Python](http://www.oreilly.com/programming/free/files/a-whirlwind-tour-of-python.pdf) is an excellent intro to Python by [Jake VanderPlas](https://staff.washington.edu/jakevdp/); Jupyter notebooks are available [here](https://github.com/jakevdp/WhirlwindTourOfPython)
* Another excellent and free online book is Allen Downey's ["Think Python"](http://greenteapress.com/wp/think-python-2e/)