# [Hello, Python](https://www.kaggle.com/colinmorris/hello-python)

[**Python Home Page**](https://www.kaggle.com/learn/python)

---


## Intro
This course covers the key Python skills you’ll need so you can start using Python for data science. The course is ideal for someone with some previous coding experience who wants to add Python to their repertoire or level up their basic Python skills. (If you're a first-time coder, you may want to check out [these "Python for Non-Programmers" learning resources](https://wiki.python.org/moin/BeginnersGuide/NonProgrammers).)

We'll start with a brief overview of Python syntax, variable assignment, and arithmetic operators. If you have previous Python experience, you can [skip straight to the hands-on exercise](https://www.kaggle.com/kernels/fork/1275163).

## Hello, Python!
Python was named for the British comedy troupe [Monty Python](https://en.wikipedia.org/wiki/Monty_Python), so we'll make our first Python program an homage to their skit about Spam.

Just for fun, try reading over the code below and predicting what it's going to do when run. (If you have no idea, that's fine!)

Then compare your prediction with the output shown below the code.

In [5]:
spam_amount = 0
print(spam_amount)

# Ordering Spam, egg, Spam, Spam, bacon and Spam (4 more servings of Spam)
spam_amount = spam_amount + 4

if spam_amount > 0:
    print("But I don't want ANY spam!")

viking_song = "Spam " * spam_amount
print(viking_song)

0
But I don't want ANY spam!
Spam Spam Spam Spam 


There's a lot to unpack here! This silly program demonstrates many important aspects of what Python code looks like and how it works. Let's review the code from top to bottom.

In [6]:
spam_amount = 0

**Variable assignment**: Here we create a variable called `spam_amount` and assign it the value of 0 using `=`, which is called the assignment operator.

> **Aside**: If you've programmed in certain other languages (like Java or C++), you might be noticing some things Python doesn't require us to do here:
> 
> * we don't need to "declare" `spam_amount` before assigning to it
> * we don't need to tell Python what type of value `spam_amount` is going to refer to. In fact, we can even go on to reassign `spam_amount` to refer to a different sort of thing like a string or a boolean.

In [7]:
print(spam_amount)

0


**Function calls**: `print` is a Python function that displays the value passed to it on the screen. We call functions by putting parentheses after their name, and putting the inputs (or *arguments*) to the function in those parentheses.

In [8]:
# Ordering Spam, egg, Spam, Spam, bacon and Spam (4 more servings of Spam)
spam_amount = spam_amount + 4

The first line above is a **comment**. In Python, comments begin with the `#` symbol.

Next we see an example of reassignment. Reassigning the value of an existing variable looks just the same as creating a variable - it still uses the `=` assignment operator.

In this case, the value we're assigning to `spam_amount` involves some simple arithmetic on its previous value. When it encounters this line, Python evaluates the expression on the right-hand-side of the `=` (0 + 4 = 4), and then assigns that value to the variable on the left-hand-side.
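
Reassignment can also change the kind of value a name refers to. The sketch below is only an illustration: `thing` is a made-up variable name, and it is rebound to a number, then a string, then a boolean.

```python
# `thing` is a hypothetical name used only for this illustration.
thing = 0
print(thing)

thing = "zero"   # the same name now refers to a string
print(thing)

thing = False    # ...and now to a boolean
print(thing)
```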

In [9]:
if spam_amount > 0:
    print("But I don't want ANY spam!")

viking_song = "Spam Spam Spam"
print(viking_song)

But I don't want ANY spam!
Spam Spam Spam


We won't talk much about "conditionals" until later, but, even if you've never coded before, you can probably guess what this does. Python is prized for its readability and simplicity.

Note how we indicated which code belongs to the `if`. `"But I don't want ANY spam!"` is only supposed to be printed if `spam_amount` is positive. But the later code (like `print(viking_song)`) should be executed no matter what. How do we (and Python) know that?

The colon (`:`) at the end of the `if` line indicates that a new "code block" is starting. Subsequent lines which are **indented** are part of that code block. Some other languages use `{`curly braces`}` to mark the beginning and end of code blocks. Python's use of meaningful whitespace can be surprising to programmers who are accustomed to other languages, but in practice it can lead to more consistent and readable code than languages that do not enforce indentation of code blocks.

The later lines dealing with `viking_song` are not indented with an extra 4 spaces, so they're not a part of the `if`'s code block. We'll see more examples of indented code blocks later when we define functions and use loops.

This code snippet is also our first sighting of a **string** in Python:

In [10]:
"But I don't want ANY spam!"

"But I don't want ANY spam!"

Strings can be marked either by double or single quotation marks. (But because this particular string *contains* a single-quote character, we might confuse Python by trying to surround it with single-quotes, unless we're careful.)
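
As a small sketch of what "being careful" can look like, the same text is written below with both kinds of quotes. The variable names are made up for this example.

```python
double_quoted = "But I don't want ANY spam!"
# Inside single quotes, the apostrophe must be escaped with a backslash.
single_quoted = 'But I don\'t want ANY spam!'

# Both spellings produce the same string.
print(double_quoted == single_quoted)
```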

In [11]:
viking_song = "Spam " * spam_amount
print(viking_song)

Spam Spam Spam Spam 


The `*` operator can be used to multiply two numbers (`3 * 3` evaluates to 9), but amusingly enough, we can also multiply a string by a number, to get a version that's been repeated that many times. Python offers a number of cheeky little time-saving tricks like this where operators like `*` and `+` have a different meaning depending on what kind of thing they're applied to. (The technical term for this is [operator overloading](https://en.wikipedia.org/wiki/Operator_overloading).)
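
A quick sketch of the same idea with `+`, using made-up variable names: between numbers it adds, and between strings it joins them end to end.

```python
number_sum = 2 + 3               # addition of two ints
joined = "Spam" + " and eggs"    # concatenation of two strings
repeated = "Spam " * 3           # repetition of a string

print(number_sum)
print(joined)
print(repeated)
```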

## Numbers and arithmetic in Python
We've already seen an example of a variable containing a number above:

In [12]:
spam_amount = 0

"Number" is a fine informal name for the kind of thing, but if we wanted to be more technical, we could ask Python how it would describe the type of thing that `spam_amount` is:

In [13]:
type(spam_amount)

int

It's an `int` - short for integer. There's another sort of number we commonly encounter in Python:

In [14]:
type(19.95)

float

A `float` is a number with a decimal place - very useful for representing things like weights or proportions.

`type()` is the second built-in function we've seen (after `print()`), and it's another good one to remember. It's very useful to be able to ask Python "what kind of thing is this?".

A natural thing to want to do with numbers is perform arithmetic. We've seen the `+` operator for addition, and the `*` operator for multiplication (of a sort). Python also has us covered for the rest of the basic buttons on your calculator:

| Operator | Name | Description |
|---|---|---|
| `a + b` | Addition | Sum of `a` and `b` |
| `a - b` | Subtraction | Difference of `a` and `b` |
| `a * b` | Multiplication | Product of `a` and `b` |
| `a / b` | True division | Quotient of `a` and `b` |
| `a // b` | Floor division | Quotient of `a` and `b`, removing fractional parts |
| `a % b` | Modulus | Integer remainder after division of `a` by `b` |
| `a ** b` | Exponentiation | `a` raised to the power of `b` |
| `-a` | Negation | The negative of `a` |

One interesting observation here is that, whereas your calculator probably just has one button for division, Python can do two kinds. "True division" is basically what your calculator does:

In [15]:
print(5 / 2)
print(6 / 2)

2.5
3.0


It always gives us a `float`.

The `//` operator gives us a result that's rounded down to the next integer.

In [16]:
print(5 // 2)
print(6 // 2)

2
3


Can you think of where this would be useful? You'll see an example soon in the coding challenges.
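
One possible use, sketched here with a made-up duration: splitting a number of minutes into whole hours and leftover minutes by pairing `//` with `%`. The last line shows that "rounded down" really does mean down, even for negative numbers.

```python
total_minutes = 135   # made-up value for illustration

hours = total_minutes // 60     # whole hours
minutes = total_minutes % 60    # what's left over
print(hours, "h", minutes, "min")

# Floor division rounds down, not toward zero.
negative_floor = -5 // 2
print(negative_floor)
```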

### Order of operations
The arithmetic we learned in primary school has conventions about the order in which operations are evaluated. Some remember these by a mnemonic such as **PEMDAS** - **P**arentheses, **E**xponents, **M**ultiplication/**D**ivision, **A**ddition/**S**ubtraction.

Python follows similar rules about which calculations to perform first. They're mostly pretty intuitive.

In [17]:
8 - 3 + 2

7

In [18]:
-3 + 4 * 2

5

Sometimes the default order of operations isn't what we want:

In [19]:
hat_height_cm = 25
my_height_cm = 190
# How tall am I, in meters, when wearing my hat?
total_height_meters = hat_height_cm + my_height_cm / 100
print("Height in meters =", total_height_meters, "?")

Height in meters = 26.9 ?


Parentheses are useful here. You can add them to force Python to evaluate sub-expressions in whatever order you want.

In [20]:
total_height_meters = (hat_height_cm + my_height_cm) / 100
print("Height in meters =", total_height_meters)

Height in meters = 2.15
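
Another case where parentheses matter, shown as a small sketch with made-up variable names: exponentiation binds more tightly than the negation to its left, so `-3 ** 2` negates the result of `3 ** 2`.

```python
without_parens = -3 ** 2    # evaluated as -(3 ** 2)
with_parens = (-3) ** 2     # the negation happens first

print(without_parens)
print(with_parens)
```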


### Builtin functions for working with numbers
`min` and `max` return the minimum and maximum of their arguments, respectively...

In [21]:
print(min(1, 2, 3))
print(max(1, 2, 3))

1
3


`abs` returns the absolute value of its argument:

In [22]:
print(abs(32))
print(abs(-32))

32
32


In addition to being the names of Python's two main numerical types, `int` and `float` can also be called as functions which convert their arguments to the corresponding type:

In [23]:
print(float(10))
print(int(3.33))
# They can even be called on strings!
print(int('807') + 1)

10.0
3
808
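
One detail worth seeing in a small sketch (the variable names are made up): `int` does not round a float to the nearest whole number. It drops the fractional part, which moves the value toward zero.

```python
positive = int(3.99)       # fractional part is dropped, not rounded up
negative = int(-3.99)      # truncation moves toward zero
from_text = float("2.5")   # float can also parse a string

print(positive)
print(negative)
print(from_text + 1)
```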


## Your Turn
Now is your chance. Try your [**first Python programming exercise**](https://www.kaggle.com/kernels/fork/1275163)



# [Functions and Getting Help](https://www.kaggle.com/colinmorris/functions-and-getting-help)


[**Python Home Page**](https://www.kaggle.com/learn/python)

---

## Intro
You've already seen and used functions such as `print` and `abs`. But Python has many more functions, and defining your own functions is a big part of Python programming.

In this lesson you will learn more about using and defining functions.

## Getting Help
You saw the `abs` function in the previous tutorial, but what if you've forgotten what it does?

The `help()` function is possibly the most important Python function you can learn. If you can remember how to use `help()`, you hold the key to understanding most other functions.

Here is an example:

In [24]:
help(round)

Help on built-in function round in module builtins:

round(...)
    round(number[, ndigits]) -> number
    
    Round a number to a given precision in decimal digits (default 0 digits).
    This returns an int when called with one argument, otherwise the
    same type as the number. ndigits may be negative.



`help()` displays two things:

1. the header of that function `round(number[, ndigits])`. In this case, this tells us that `round()` takes an argument we can describe as `number`. Additionally, we can optionally give a separate argument which could be described as `ndigits`.
2. A brief English description of what the function does.

**Common pitfall**: when you're looking up a function, remember to pass in the name of the function itself, and not the result of calling that function.

What happens if we invoke help on a *call* to the function `round()`? The output of the cell below shows the result.

In [25]:
help(round(-2.01))

Help on int object:

class int(object)
 |  int(x=0) -> integer
 |  int(x, base=10) -> integer
 |  
 |  Convert a number or string to an integer, or return 0 if no arguments
 |  are given.  If x is a number, return x.__int__().  For floating point
 |  numbers, this truncates towards zero.
 |  
 |  If x is not a number or if base is given, then x must be a string,
 |  bytes, or bytearray instance representing an integer literal in the
 |  given base.  The literal can be preceded by '+' or '-' and be surrounded
 |  by whitespace.  The base defaults to 10.  Valid bases are 0 and 2-36.
 |  Base 0 means to interpret the base from the string as an integer literal.
 |  >>> int('0b100', base=0)
 |  4
 |  
 |  Methods defined here:
 |  
 |  __abs__(self, /)
 |      abs(self)
 |  
 |  __add__(self, value, /)
 |      Return self+value.
 |  
 |  __and__(self, value, /)
 |      Return self&value.
 |  
 |  __bool__(self, /)
 |      self != 0
 |  
 |  __ceil__(...)
 |      Ceiling of an Integral retur

Python evaluates an expression like this from the inside out. First it calculates the value of `round(-2.01)`, then it provides help on the output of that expression.

(And it turns out to have a lot to say about integers! After we talk later about objects, methods, and attributes in Python, the voluminous help output above will make more sense.)

`round` is a very simple function with a short docstring. `help` shines even more when dealing with more complex, configurable functions like `print`. Don't worry if the following output looks inscrutable... for now, just see if you can pick anything new out from this help.

In [26]:
help(print)

Help on built-in function print in module builtins:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
    
    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.



If you were looking for it, you might learn that print can take an argument called `sep`, and that this describes what we put between all the other arguments when we print them.
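
Here is a small sketch that exercises three of the keyword arguments listed in the help text at once. It assumes `io.StringIO` from the standard library (not covered in this lesson) as the file-like object, purely so that the printed text can be inspected afterwards.

```python
import io

buffer = io.StringIO()   # an in-memory, file-like object

# sep goes between the values, end goes after the last one,
# and file says where the text should be written.
print("a", "b", "c", sep="-", end="!\n", file=buffer)

captured = buffer.getvalue()
print(repr(captured))
```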

## Defining functions
Builtin functions are great, but we can only get so far with them before we need to start defining our own functions. Below is a simple example.

In [27]:
def least_difference(a, b, c):
    diff1 = abs(a - b)
    diff2 = abs(b - c)
    diff3 = abs(a - c)
    return min(diff1, diff2, diff3)

This creates a function called `least_difference`, which takes three arguments, `a`, `b`, and `c`.

Functions start with a header introduced by the `def` keyword. The indented block of code following the `:` is run when the function is called.

`return` is another keyword uniquely associated with functions. When Python encounters a `return` statement, it exits the function immediately, and passes the value on the right hand side to the calling context.
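
A minimal sketch of "exits the function immediately", using a made-up function: once the first `return` runs, nothing after it in the body is executed.

```python
def first_return_wins():
    """Hypothetical example: only the first return statement is ever reached."""
    return "first"
    return "second"   # never reached


result = first_return_wins()
print(result)
```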

Is it clear what `least_difference()` does from the source code? If we're not sure, we can always try it out on a few examples:

In [28]:
print(
    least_difference(1, 10, 100),
    least_difference(1, 10, 10),
    least_difference(5, 6, 7), # Python allows trailing commas in argument lists. How nice is that?
)

9 0 1


Or maybe the `help()` function can tell us something about it.

In [29]:
help(least_difference)

Help on function least_difference in module __main__:

least_difference(a, b, c)



Python isn't smart enough to read my code and turn it into a nice English description. However, when I write a function, I can provide a description in what's called the **docstring**.

### Docstrings

In [30]:
def least_difference(a, b, c):
    """Return the smallest difference between any two numbers
    among a, b and c.
    
    >>> least_difference(1, 5, -5)
    4
    """
    diff1 = abs(a - b)
    diff2 = abs(b - c)
    diff3 = abs(a - c)
    return min(diff1, diff2, diff3)

The docstring is a triple-quoted string (which may span multiple lines) that comes immediately after the header of a function. When we call `help()` on a function, it shows the docstring.

In [31]:
help(least_difference)

Help on function least_difference in module __main__:

least_difference(a, b, c)
    Return the smallest difference between any two numbers
    among a, b and c.
    
    >>> least_difference(1, 5, -5)
    4



> **Aside (example calls)**: The last two lines of the docstring are an example function call and result. (The `>>>` is a reference to the command prompt used in Python interactive shells.) Python doesn't run the example call - it's just there for the benefit of the reader. The convention of including one or more example calls in a function's docstring is far from universally observed, but it can be very effective at helping someone understand your function. For a real-world example, see [this docstring for the numpy function](https://github.com/numpy/numpy/blob/v1.14.2/numpy/lib/twodim_base.py#L140-L194) `np.eye`.

Good programmers use docstrings unless they expect to throw away the code soon after it's used (which is rare). So, you should start writing docstrings too.

## Functions that don't return
What would happen if we didn't include the `return` keyword in our function?

In [32]:
def least_difference(a, b, c):
    """Return the smallest difference between any two numbers
    among a, b and c.
    """
    diff1 = abs(a - b)
    diff2 = abs(b - c)
    diff3 = abs(a - c)
    min(diff1, diff2, diff3)
    
print(
    least_difference(1, 10, 100),
    least_difference(1, 10, 10),
    least_difference(5, 6, 7),
)

None None None


Python allows us to define such functions. The result of calling them is the special value `None`. (This is similar to the concept of "null" in other languages.)

Without a `return` statement, `least_difference` is completely pointless, but a function with side effects may do something useful without returning anything. We've already seen two examples of this: `print()` and `help()` don't return anything. We only call them for their side effects (putting some text on the screen). Other examples of useful side effects include writing to a file, or modifying an input.

In [33]:
mystery = print()
print(mystery)


None
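
The other kind of side effect mentioned above, modifying an input, might look like the sketch below. The function `add_spam` is made up for this example, and it relies on lists and their `append` method, which are covered in a later lesson.

```python
def add_spam(order):
    """Hypothetical helper: add 'Spam' to the given list in place."""
    order.append("Spam")   # changes the caller's list; nothing is returned


breakfast = ["egg", "bacon"]
result = add_spam(breakfast)

print(result)      # the function has no return statement
print(breakfast)   # but the list it was given has changed
```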


## Default arguments
When we called `help(print)`, we saw that the `print` function has several optional arguments. For example, we can specify a value for `sep` to put some special string in between our printed arguments:

In [34]:
print(1, 2, 3, sep=' < ')

1 < 2 < 3


But if we don't specify a value, `sep` is treated as having a default value of `' '` (a single space).

In [35]:
print(1, 2, 3)

1 2 3


Adding optional arguments with default values to the functions we define turns out to be pretty easy:

In [36]:
def greet(who="Colin"):
    print("Hello,", who)
    
greet()
greet(who="Kaggle")
# (In this case, we don't need to specify the name of the argument, because it's unambiguous.)
greet("world")

Hello, Colin
Hello, Kaggle
Hello, world
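
Required and optional arguments can be mixed. In the sketch below, `greet_with` is a made-up variant of `greet`: `greeting` must always be supplied, while `who` falls back to its default when left out.

```python
def greet_with(greeting, who="Colin"):
    """Hypothetical variant of greet: one required and one optional argument."""
    return greeting + ", " + who


default_greeting = greet_with("Hi")
custom_greeting = greet_with("Hi", who="Kaggle")

print(default_greeting)
print(custom_greeting)
```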


## Functions Applied to Functions
Here's something that's powerful, though it can feel very abstract at first. You can supply functions as arguments to other functions. An example may make this clearer:

In [37]:
def mult_by_five(x):
    return 5 * x

def call(fn, arg):
    """Call fn on arg"""
    return fn(arg)

def squared_call(fn, arg):
    """Call fn on the result of calling fn on arg"""
    return fn(fn(arg))

print(
    call(mult_by_five, 1),
    squared_call(mult_by_five, 1), 
    sep='\n', # '\n' is the newline character - it starts a new line
)

5
25


Functions that operate on other functions are called "higher-order functions." You probably won't write your own for a little while. But there are higher-order functions built into Python that you might find useful to call.

Here's an interesting example using the `max` function.

By default, `max` returns the largest of its arguments. But if we pass in a function using the optional `key` argument, it returns the argument `x` that maximizes `key(x)` (aka the 'argmax').

In [38]:
def mod_5(x):
    """Return the remainder of x after dividing by 5"""
    return x % 5

print(
    'Which number is biggest?',
    max(100, 51, 14),
    'Which number is the biggest modulo 5?',
    max(100, 51, 14, key=mod_5),
    sep='\n',
)

Which number is biggest?
100
Which number is the biggest modulo 5?
14
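
The `key` argument also accepts built-in functions, and `min` supports it too. As a small sketch with made-up numbers, passing `abs` compares the values by their distance from zero while still returning the original value.

```python
closest_to_zero = min(-7, 3, -2, 10, key=abs)
largest_magnitude = max(-7, 3, -2, 10, key=abs)

print(closest_to_zero)
print(largest_magnitude)
```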


## Your Turn
Functions open up a whole new world in Python programming. [**Try using them yourself**](https://www.kaggle.com/kernels/fork/1275158)


# [Booleans and Conditionals](https://www.kaggle.com/colinmorris/booleans-and-conditionals)

## Booleans
Python has a type `bool` which can take on one of two values: `True` and `False`.

In [39]:
x = True
print(x)
print(type(x))

True
<class 'bool'>


Rather than putting `True` or `False` directly in our code, we usually get boolean values from **boolean operators**. These are operators that answer yes/no questions. We'll go through some of these operators below.

## Comparison Operations

| Operation | Description |
|---|---|
| `a == b` | `a` equal to `b` |
| `a < b` | `a` less than `b` |
| `a <= b` | `a` less than or equal to `b` |
| `a != b` | `a` not equal to `b` |
| `a > b` | `a` greater than `b` |
| `a >= b` | `a` greater than or equal to `b` |

In [40]:
def can_run_for_president(age):
    """Can someone of the given age run for president in the US?"""
    # The US Constitution says you must "have attained to the Age of thirty-five Years"
    return age >= 35

print("Can a 19-year-old run for president?", can_run_for_president(19))
print("Can a 45-year-old run for president?", can_run_for_president(45))


Can a 19-year-old run for president? False
Can a 45-year-old run for president? True


Comparisons are a little bit clever...

In [41]:
3.0 == 3

True

But not too clever...

In [42]:
'3' == 3

False

Comparison operators can be combined with the arithmetic operators we've already seen to express a virtually limitless range of mathematical tests. For example, we can check if a number is odd by checking that the modulus with 2 returns 1:

In [43]:
def is_odd(n):
    return (n % 2) == 1

print("Is 100 odd?", is_odd(100))
print("Is -1 odd?", is_odd(-1))

Is 100 odd? False
Is -1 odd? True
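
The result for -1 depends on a detail of Python's `%` operator: with a positive divisor the remainder is never negative, so `-1 % 2` is 1 rather than -1. A short sketch to confirm, with made-up variable names:

```python
def is_odd(n):
    return (n % 2) == 1


remainder_of_minus_one = -1 % 2
remainder_of_minus_three = -3 % 2

print(remainder_of_minus_one)
print(remainder_of_minus_three)
print(is_odd(-3))
```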


Remember to use `==` instead of `=` when making comparisons. If you write `n == 2` you are asking about the value of n. When you write `n = 2` you are changing the value of n.

## Combining Boolean Values
Python provides operators to combine boolean values using the standard concepts of "and", "or", and "not". And in fact, the corresponding Python operators use just those words: `and`, `or`, and `not`.

With these, we can make our `can_run_for_president` function more accurate.

In [44]:
def can_run_for_president(age, is_natural_born_citizen):
    """Can someone of the given age and citizenship status run for president in the US?"""
    # The US Constitution says you must be a natural born citizen *and* at least 35 years old
    return is_natural_born_citizen and (age >= 35)

print(can_run_for_president(19, True))
print(can_run_for_president(55, False))
print(can_run_for_president(55, True))

False
False
True


Quick, can you guess the value of this expression?

In [45]:
True or True and False

True

Python has precedence rules that determine the order in which operations get evaluated in expressions like above. For example, `and` has a higher precedence than `or`, which is why the first expression above is `True`. If we had evaluated it from left to right, we would have calculated `True or True` first (which is `True`), and then taken the `and` of that result with `False`, giving a final value of `False`.
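
The two readings described above can be written out explicitly with parentheses. This is just a sketch with made-up variable names:

```python
default_grouping = True or True and False

and_first = True or (True and False)       # how Python actually groups it
left_to_right = (True or True) and False   # the left-to-right reading

print(default_grouping, and_first, left_to_right)
```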

You could try to [memorize the order of precedence](https://docs.python.org/3/reference/expressions.html#operator-precedence), but a safer bet is to just use liberal parentheses. Not only does this help prevent bugs, it makes your intentions clearer to anyone who reads your code.

For example, consider the following expression:

In [46]:
prepared_for_weather = have_umbrella or rain_level < 5 and have_hood or not rain_level > 0 and is_workday

NameError: name 'have_umbrella' is not defined

I'm trying to say that I'm safe from today's weather....

* if I have an umbrella...
* or if the rain isn't too heavy and I have a hood...
* otherwise, I'm still fine unless it's raining and it's a workday

But not only is my Python code hard to read, it has a bug. We can address both problems by adding some parentheses:

In [None]:
prepared_for_weather = have_umbrella or (rain_level < 5 and have_hood) or not (rain_level > 0 and is_workday)

You can add even more parentheses if you think it helps readability:

In [None]:
prepared_for_weather = have_umbrella or ((rain_level < 5) and have_hood) or (not (rain_level > 0 and is_workday))

We can also split it over multiple lines to emphasize the 3-part structure described above:

In [None]:
prepared_for_weather = (
    have_umbrella 
    or ((rain_level < 5) and have_hood) 
    or (not (rain_level > 0 and is_workday))
)
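
To see the bug in action, here is a sketch with assumed values for the four variables (they are not defined anywhere in the lesson): a dry day off, with no umbrella and no hood. In the unparenthesized version, `not` applies only to `rain_level > 0`, so the last clause wrongly requires a workday.

```python
# Assumed values for illustration: no rain, not a workday, no rain gear.
have_umbrella = False
rain_level = 0
have_hood = False
is_workday = False

# Python groups the last clause as (not rain_level > 0) and is_workday.
buggy = have_umbrella or rain_level < 5 and have_hood or not rain_level > 0 and is_workday

# The intended meaning: fine unless it's raining *and* a workday.
fixed = have_umbrella or (rain_level < 5 and have_hood) or not (rain_level > 0 and is_workday)

print(buggy)
print(fixed)
```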

## Conditionals
While useful enough in their own right, booleans really start to shine when combined with conditional statements, using the keywords `if`, `elif`, and `else`.

Conditional statements, often referred to as if-then statements, allow the programmer to execute certain pieces of code depending on some Boolean condition. A basic example of a Python conditional statement is this:

In [None]:
def inspect(x):
    if x == 0:
        print(x, "is zero")
    elif x > 0:
        print(x, "is positive")
    elif x < 0:
        print(x, "is negative")
    else:
        print(x, "is unlike anything I've ever seen...")

inspect(0)
inspect(-15)

Python adopts the `if` and `else` often used in other languages; its more unique keyword is `elif`, a contraction of "else if". In these conditional clauses, `elif` and `else` blocks are optional; additionally, you can include as many `elif` statements as you would like.

Note especially the use of colons (`:`) and whitespace to denote separate blocks of code. This is similar to what happens when we define a function - the function header ends with `:`, and the following line is indented with 4 spaces. All subsequent indented lines belong to the body of the function, until we encounter an unindented line, ending the function definition.

In [None]:
def f(x):
    if x > 0:
        print("Only printed when x is positive; x =", x)
        print("Also only printed when x is positive; x =", x)
    print("Always printed, regardless of x's value; x =", x)

f(1)
f(0)

## Boolean conversion
We've seen `int()`, which turns things into ints, and `float()`, which turns things into floats, so you might not be surprised to hear that Python has a `bool()` function which turns things into bools.

In [None]:
print(bool(1)) # all numbers are treated as true, except 0
print(bool(0))
print(bool("asf")) # all strings are treated as true, except the empty string ""
print(bool(""))
# Generally empty sequences (strings and other types we've yet to see like lists and tuples)
# are "falsey" and the rest are "truthy"

We can use non-boolean objects in `if` conditions and other places where a boolean would be expected. Python will implicitly treat them as their corresponding boolean value:

In [None]:
if 0:
    print(0)
elif "spam":
    print("spam")
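
A common use of this, sketched with a made-up helper: testing whether a list has anything in it without comparing its length to zero. Lists are covered in the next lesson. The last two lines are a reminder that only the *empty* string is falsey.

```python
def describe(order):
    """Hypothetical helper: report whether a list has anything in it."""
    if order:   # an empty list is falsey, a non-empty one is truthy
        return "has items"
    else:
        return "empty"


print(describe([]))
print(describe(["Spam"]))

# A string containing a space or the character "0" is not empty.
space_is_truthy = bool(" ")
zero_text_is_truthy = bool("0")
print(space_is_truthy, zero_text_is_truthy)
```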

## Conditional expressions (aka 'ternary')
Setting a variable to either of two values depending on some condition is a pretty common pattern.

In [None]:
def quiz_message(grade):
    if grade < 50:
        outcome = 'failed'
    else:
        outcome = 'passed'
    print('You', outcome, 'the quiz with a grade of', grade)
    
quiz_message(80)

Python has a handy single-line 'conditional expression' syntax to simplify these cases:

In [None]:
def quiz_message(grade):
    outcome = 'failed' if grade < 50 else 'passed'
    print('You', outcome, 'the quiz with a grade of', grade)
    
quiz_message(45)

You may recognize this as being similar to the ternary operator that exists in many other languages. For example, in JavaScript, we would write the assignment above as `var outcome = grade < 50 ? 'failed' : 'passed'`. (When it comes to readability, I think Python is the winner here.)
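
Because a conditional expression produces a value, it can also be returned directly. The function below is a made-up example of that pattern:

```python
def larger(a, b):
    """Hypothetical helper: return whichever argument is larger."""
    return a if a >= b else b


print(larger(3, 7))
print(larger(7, 3))
```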

## Your turn!
Head over to [the Exercises notebook](https://www.kaggle.com/kernels/fork/1275165) to get some hands-on practice working with booleans and conditionals.

# [Lists](https://www.kaggle.com/colinmorris/lists)

[**Python Home Page**](https://www.kaggle.com/learn/python)

---

## Lists
Lists in Python represent ordered sequences of values. Here is an example of how to create them:

In [None]:
primes = [2, 3, 5, 7]

We can put other types of things in lists:

In [None]:
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']

We can even make a list of lists:

In [None]:
hands = [
    ['J', 'Q', 'K'],
    ['2', '2', '2'],
    ['6', 'A', 'K'], # (Comma after the last element is optional)
]
# (I could also have written this on one line, but it can get hard to read)
hands = [['J', 'Q', 'K'], ['2', '2', '2'], ['6', 'A', 'K']]

A list can contain a mix of different types of variables:

In [None]:
my_favourite_things = [32, 'raindrops on roses', help]
# (Yes, Python's help function is *definitely* one of my favourite things)

## Indexing
You can access individual list elements with square brackets.

Which planet is closest to the sun? Python uses zero-based indexing, so the first element has index 0.

In [None]:
planets[0]

What's the next closest planet?

In [None]:
planets[1]

Which planet is furthest from the sun?

Elements at the end of the list can be accessed with negative numbers, starting from -1:

In [None]:
planets[-1]

In [None]:
planets[-2]
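
One way to read a negative index is as a shorthand for counting back from the length of the list. The sketch below uses `len`, which is introduced a little later in this lesson, and made-up variable names.

```python
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']

last = planets[-1]
also_last = planets[len(planets) - 1]   # same element, spelled the long way

print(last, also_last)
print(planets[-2], planets[len(planets) - 2])
```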

## Slicing
What are the first three planets? We can answer this question using *slicing*:

In [None]:
planets[0:3]

`planets[0:3]` is our way of asking for the elements of `planets` starting from index 0 and continuing up to *but not including* index 3.

The starting and ending indices are both optional. If I leave out the start index, it's assumed to be 0. So I could rewrite the expression above as:

In [None]:
planets[:3]

If I leave out the end index, it's assumed to be the length of the list.

In [None]:
planets[3:]

i.e. the expression above means "give me all the planets from index 3 onward".

We can also use negative indices when slicing:

In [None]:
# All the planets except the first and last
planets[1:-1]

In [None]:
# The last 3 planets
planets[-3:]
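
Two properties of slices are worth a quick sketch (the variable names are made up). Because the end index is excluded, `planets[:3]` and `planets[3:]` split the list with no overlap. And unlike indexing a single element, a slice that runs past the end of the list is not an error; it simply stops at the last element.

```python
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']

# The two halves fit back together exactly.
rejoined = planets[:3] + planets[3:]
print(rejoined == planets)

# An end index beyond the list length is allowed in a slice.
past_the_end = planets[5:100]
print(past_the_end)
```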

## Changing lists
Lists are "mutable", meaning they can be modified "in place".

One way to modify a list is to assign to an index or slice expression.

For example, let's say we want to rename Mars:

In [None]:
planets[3] = 'Malacandra'
planets

Hm, that's quite a mouthful. Let's compensate by shortening the names of the first 3 planets.

In [None]:
planets[:3] = ['Mur', 'Vee', 'Ur']
print(planets)
# That was silly. Let's give them back their old names
planets[:4] = ['Mercury', 'Venus', 'Earth', 'Mars',]

## List functions
Python has several useful functions for working with lists.

`len` gives the length of a list:

In [None]:
# How many planets are there?
len(planets)

`sorted` returns a sorted version of a list:

In [None]:
# The planets sorted in alphabetical order
sorted(planets)
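
Note that `sorted` builds a new list and leaves the original alone. A short sketch, using a made-up variable name for the result:

```python
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']

alphabetical = sorted(planets)

print(alphabetical[0])   # first planet alphabetically
print(planets[0])        # the original order is unchanged
```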

`sum` does what you might expect:

In [None]:
primes = [2, 3, 5, 7]
sum(primes)

We've previously used `min` and `max` to get the minimum or maximum of several arguments. But we can also pass in a single list argument.

In [None]:
max(primes)

## Interlude: objects
I've used the term 'object' a lot so far - you may have even read that everything in Python is an object. What does that mean?

In short, objects carry some things around with them. You access that stuff using Python's dot syntax.

For example, numbers in Python carry around an associated variable called `imag` representing their imaginary part. (You'll probably never need to use this unless you're doing some very weird math.)

In [None]:
x = 12
# x is a real number, so its imaginary part is 0.
print(x.imag)
# Here's how to make a complex number, in case you've ever been curious:
c = 12 + 3j
print(c.imag)

The things an object carries around can also include functions. A function attached to an object is called a **method**. (Non-function things attached to an object, such as `imag`, are called attributes).

For example, numbers have a method called `bit_length`. Again, we access it using dot syntax:

In [None]:
x.bit_length

To actually call it, we add parentheses:

In [None]:
x.bit_length()

> **Aside**: You've actually been calling methods already if you've been doing the exercises. In the exercise notebooks `q1`, `q2`, `q3`, etc. are all objects which have methods called `check`, `hint`, and `solution`.

In the same way that we can pass functions to the `help` function (e.g. `help(max)`), we can also pass in methods:

In [None]:
help(x.bit_length)

The examples above were utterly obscure. None of the types of objects we've looked at so far (numbers, functions, booleans) have attributes or methods you're likely ever to use.

But it turns out that lists have several methods which you'll use all the time.

## List methods
`list.append` modifies a list by adding an item to the end:

In [None]:
# Pluto is a planet darn it!
planets.append('Pluto')

Why does the cell above have no output? Let's check the documentation by calling `help(planets.append)`.

> **Aside**: `append` is a method carried around by all objects of type list, not just `planets`, so we also could have called `help(list.append)`. However, if we try to call `help(append)`, Python will complain that no variable exists called "append". The "append" name only exists within lists - it doesn't exist as a standalone name like builtin functions such as `max` or `len`.

In [None]:
help(planets.append)

The `-> None` part is telling us that `list.append` doesn't return anything. But if we check the value of `planets`, we can see that the method call modified the value of `planets`:

In [None]:
planets
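
Because `append` returns `None`, a common slip is to assign its result back to a variable. Here is a minimal sketch of that pitfall (the list `moons` is a made-up example, not part of the lesson's data):

```python
# Hypothetical example list, used only for illustration
moons = ['Io', 'Europa']

# append changes the list in place and returns None
result = moons.append('Ganymede')

print(result)  # None
print(moons)   # ['Io', 'Europa', 'Ganymede']
```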

`list.pop` removes and returns the last element of a list:

In [None]:
planets.pop()

In [None]:
planets

### Searching lists
Where does Earth fall in the order of planets? We can get its index using the `list.index` method.

In [None]:
planets.index('Earth')

It comes third (i.e. at index 2 - 0 indexing!).

At what index does Pluto occur?

In [None]:
planets.index('Pluto')

Oh, that's right...

To avoid unpleasant surprises like this, we can use the `in` operator to determine whether a list contains a particular value:

In [None]:
# Is Earth a planet?
"Earth" in planets

In [None]:
# Is Calbefraques a planet?
"Calbefraques" in planets

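One way to combine the two ideas is to check membership with `in` before calling `index`. The helper below, `find_position`, is a hypothetical name used for illustration; it returns `None` when the value is missing instead of raising a `ValueError`.

```python
def find_position(items, value):
    """Return the index of value in items, or None if it isn't there."""
    if value in items:
        return items.index(value)
    return None

planets = ['Mercury', 'Venus', 'Earth', 'Mars',
           'Jupiter', 'Saturn', 'Uranus', 'Neptune']

print(find_position(planets, 'Earth'))  # 2
print(find_position(planets, 'Pluto'))  # None
```
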
There are a few more interesting list methods we haven't covered. If you want to learn about all the methods and attributes attached to a particular object, you can call `help()` on the object itself. For example, `help(planets)` will tell us about all the list methods:

In [None]:
help(planets)

Click the "output" button to see the full help page. Lists have lots of methods with weird-looking names like `__eq__` and `__iadd__`. Don't worry too much about these for now. (You'll probably never call such methods directly. But they get called behind the scenes when we use syntax like indexing or comparison operators.) The most interesting methods are toward the bottom of the list (`append`, `clear`, `copy`, etc.).

## Tuples
Tuples are almost exactly the same as lists. They differ in just two ways.

**1**: The syntax for creating them uses parentheses instead of square brackets

In [None]:
t = (1, 2, 3)

In [None]:
t = 1, 2, 3 # equivalent to above
t

**2**: They cannot be modified (they are *immutable*).

In [None]:
t[0] = 100
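
If you'd like to see the failure without stopping your notebook, one option is to catch the error. This sketch uses only built-in Python, and the name `t2` is arbitrary.

```python
t = (1, 2, 3)

try:
    t[0] = 100
except TypeError as err:
    # Tuples don't support item assignment
    print("TypeError:", err)

# The usual workaround is to build a new tuple instead
t2 = (100,) + t[1:]
print(t2)  # (100, 2, 3)
```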

Tuples are often used for functions that have multiple return values.

For example, the `as_integer_ratio()` method of float objects returns a numerator and a denominator in the form of a tuple:

In [None]:
x = 0.125
x.as_integer_ratio()

These multiple return values can be individually assigned as follows:

In [None]:
numerator, denominator = x.as_integer_ratio()
print(numerator / denominator)
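
You can write your own functions in the same style. The function below, `min_and_max`, is a made-up example that returns two values packed in a tuple, which the caller then unpacks.

```python
def min_and_max(nums):
    """Return the smallest and largest values in nums as a tuple."""
    return min(nums), max(nums)

# The returned tuple is unpacked into two variables
lowest, highest = min_and_max([3, 1, 4, 1, 5])
print(lowest, highest)  # 1 5
```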

Finally we have some insight into the classic Stupid Python Trick™ for swapping two variables!

In [None]:
a = 1
b = 0
a, b = b, a
print(a, b)

## Your Turn
Try the [hands-on exercise](https://www.kaggle.com/kernels/fork/1275173) with lists and tuples.

---
[**Python Micro-Course Home Page**](https://www.kaggle.com/learn/python)

# [Loops and List Comprehensions](https://www.kaggle.com/colinmorris/loops-and-list-comprehensions)


---
## Loops
Loops are a way to repeatedly execute some code. Here's an example:

In [None]:
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']
for planet in planets:
    print(planet, end=' ') # print all on same line

The `for` loop specifies

* the variable name to use (in this case, `planet`)
* the set of values to loop over (in this case, `planets`)

You use the word "`in`" to link them together.

The object to the right of the "`in`" can be any object that supports iteration. Basically, if it can be thought of as a group of things, you can probably loop over it. In addition to lists, we can iterate over the elements of a tuple:

In [None]:
multiplicands = (2, 2, 2, 3, 3, 5)
product = 1
for mult in multiplicands:
    product = product * mult
product

You can even loop through each character in a string:

In [None]:
s = 'steganograpHy is the practicE of conceaLing a file, message, image, or video within another fiLe, message, image, Or video.'
# print all the uppercase letters in s, one at a time
for char in s:
    if char.isupper():
        print(char, end='')  

### range()
`range()` is a function that returns a sequence of numbers. It turns out to be very useful for writing loops.

For example, if we want to repeat some action 5 times:

In [None]:
for i in range(5):
    print("Doing important work. i =", i)

## `while` loops
The other type of loop in Python is a `while` loop, which iterates until some condition is met:

In [None]:
i = 0
while i < 10:
    print(i, end=' ')
    i += 1

The argument of the `while` loop is evaluated as a boolean statement, and the loop is executed until the statement evaluates to False.
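
A `while` loop is most useful when you don't know in advance how many iterations you'll need. As a small sketch (the starting value of 100 is arbitrary), this loop keeps halving a number until it drops below 1:

```python
n = 100
steps = 0
while n >= 1:      # the condition is re-checked before every iteration
    n = n / 2
    steps += 1

print(steps)  # 7
```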

## List comprehensions
List comprehensions are one of Python's most beloved and unique features. The easiest way to understand them is probably to just look at a few examples:

In [None]:
squares = [n**2 for n in range(10)]
squares

Here's how we would do the same thing without a list comprehension:

In [None]:
squares = []
for n in range(10):
    squares.append(n**2)
squares

We can also add an `if` condition:

In [None]:
short_planets = [planet for planet in planets if len(planet) < 6]
short_planets

(If you're familiar with SQL, you might think of this as being like a "WHERE" clause)

Here's an example of filtering with an `if` condition and applying some transformation to the loop variable:

In [None]:
# str.upper() returns an all-caps version of a string
loud_short_planets = [planet.upper() + '!' for planet in planets if len(planet) < 6]
loud_short_planets
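
For comparison, here is a sketch of the same filter-and-transform written as an ordinary loop. The name `loud_short_loop` is only for illustration.

```python
planets = ['Mercury', 'Venus', 'Earth', 'Mars',
           'Jupiter', 'Saturn', 'Uranus', 'Neptune']

loud_short_loop = []
for planet in planets:
    if len(planet) < 6:                               # the filter
        loud_short_loop.append(planet.upper() + '!')  # the transformation

print(loud_short_loop)  # ['VENUS!', 'EARTH!', 'MARS!']
```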

People usually write these on a single line, but you might find the structure clearer when it's split up over 3 lines:

In [None]:
[
    planet.upper() + '!' 
    for planet in planets 
    if len(planet) < 6
]

(Continuing the SQL analogy, you could think of these three lines as SELECT, FROM, and WHERE)

The expression on the left doesn't technically have to involve the loop variable (though it'd be pretty unusual for it not to). What do you think the expression below will evaluate to? Press the 'output' button to check.

In [None]:
[32 for planet in planets]

List comprehensions combined with functions like `min`, `max`, and `sum` can lead to impressive one-line solutions for problems that would otherwise require several lines of code.

For example, compare the following two cells of code that do the same thing.

In [None]:
def count_negatives(nums):
    """Return the number of negative numbers in the given list.
    
    >>> count_negatives([5, -1, -2, 0, 3])
    2
    """
    n_negative = 0
    for num in nums:
        if num < 0:
            n_negative = n_negative + 1
    return n_negative

Here's a solution using a list comprehension:

In [None]:
def count_negatives(nums):
    return len([num for num in nums if num < 0])

Much better, right?

Well, if all we care about is minimizing the length of our code, this third solution is better still!

In [47]:
def count_negatives(nums):
    # Reminder: in the "booleans and conditionals" exercises, we learned about a quirk of 
    # Python where it calculates something like True + True + False + True to be equal to 3.
    return sum([num < 0 for num in nums])
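
To see why the `sum` version works, it can help to look at the intermediate list of booleans. This is just an illustrative breakdown using the example input from the docstring earlier:

```python
nums = [5, -1, -2, 0, 3]

# Each comparison produces True or False
flags = [num < 0 for num in nums]
print(flags)       # [False, True, True, False, False]

# True counts as 1 and False as 0 when summed
print(sum(flags))  # 2
```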

Which of these solutions is the "best" is entirely subjective. Solving a problem with less code is always nice, but it's worth keeping in mind the following lines from [The Zen of Python](https://en.wikipedia.org/wiki/Zen_of_Python):

> Readability counts.

> Explicit is better than implicit.

So, use these tools to make compact readable programs. But when you have to choose, favor code that is easy for others to understand.

## Your Turn
You know the deal at this point. We have some [**fun coding challenges**](https://www.kaggle.com/kernels/fork/1275177) for you. This next set of coding problems is shorter, so try it now.

---
[**Python Home Page**](https://www.kaggle.com/learn/python)

# [Strings and Dictionaries](https://www.kaggle.com/colinmorris/strings-and-dictionaries)

---

This lesson will be a double-shot of essential Python types: **strings** and **dictionaries**.

## Strings
One place where the Python language really shines is in the manipulation of strings. This section will cover some of Python's built-in string methods and formatting operations.

Such string manipulation patterns come up often in data science work, and handling them well is one big perk of Python in this context.

### String syntax
You've already seen plenty of strings in examples during the previous lessons, but just to recap, strings in Python can be defined using either single or double quotations. They are functionally equivalent.

In [48]:
x = 'Pluto is a planet'
y = "Pluto is a planet"
x == y

True

Double quotes are convenient if your string contains a single quote character (e.g. representing an apostrophe).

Similarly, it's easy to create a string that contains double-quotes if you wrap it in single quotes:

In [49]:
print("Pluto's a planet!")
print('My dog is named "Pluto"')

Pluto's a planet!
My dog is named "Pluto"


If we try to put a single quote character inside a single-quoted string, Python gets confused:

In [50]:
'Pluto's a planet!'

SyntaxError: invalid syntax (<ipython-input-50-a43631749f52>, line 1)

We can fix this by "escaping" the single quote with a backslash.

In [None]:
'Pluto\'s a planet!'

The table below summarizes some important uses of the backslash character.

| What you type... | What you get | example | `print(example)` |
|---|---|---|---|
| `\'` | `'` | `'What\'s up?'` | `What's up?` |
| `\"` | `"` | `"That's \"cool\""` | `That's "cool"` |
| `\\` | `\` | `"Look, a mountain: /\\"` | `Look, a mountain: /\` |
| `\n` | (a newline) | `"1\n2 3"` | `1`<br>`2 3` |

The last sequence, `\n`, represents the newline character. It causes Python to start a new line.

In [None]:
hello = "hello\nworld"
print(hello)
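
One detail worth checking for yourself: an escape sequence takes two characters to type, but it produces a single character in the resulting string. A quick sketch:

```python
newline = "\n"
backslash = "\\"

print(len(newline))    # 1
print(len(backslash))  # 1

# "a\nb" is three characters: 'a', a newline, and 'b'
print(len("a\nb"))     # 3
```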

In addition, Python's triple quote syntax for strings lets us include newlines literally (i.e. by just hitting 'Enter' on our keyboard, rather than using the special '\n' sequence). We've already seen this in the docstrings we use to document our functions, but we can use them anywhere we want to define a string.

In [None]:
triplequoted_hello = """hello
world"""
print(triplequoted_hello)
triplequoted_hello == hello

The `print()` function automatically adds a newline character unless we specify a value for the keyword argument `end` other than the default value of `'\n'`:

In [None]:
print("hello")
print("world")
print("hello", end='')
print("pluto", end='')

### Strings are sequences
Strings can be thought of as sequences of characters. Almost everything we've seen that we can do to a list, we can also do to a string.

In [None]:
# Indexing
planet = 'Pluto'
planet[0]

In [None]:
# Slicing
planet[-3:]

In [None]:
# How long is this string?
len(planet)

In [None]:
# Yes, we can even loop over them
[char+'! ' for char in planet]

But a major way in which they differ from lists is that they are immutable. We can't modify them.

In [None]:
planet[0] = 'B'
# planet.append doesn't work either
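
Since strings can't be changed in place, the usual approach is to build a new string from pieces of the old one. A minimal sketch using slicing and concatenation (the name `new_planet` is arbitrary):

```python
planet = 'Pluto'

# Build a new string rather than modifying the old one
new_planet = 'B' + planet[1:]

print(new_planet)  # Bluto
print(planet)      # Pluto (the original is unchanged)
```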

### String methods
Like `list`, the type `str` has lots of very useful methods. I'll show just a few examples here.

In [None]:
# ALL CAPS
claim = "Pluto is a planet!"
claim.upper()

In [None]:
# all lowercase
claim.lower()

In [None]:
# Searching for the first index of a substring
claim.index('plan')

In [None]:
claim.startswith(planet)

In [None]:
claim.endswith('dwarf planet')

#### Going between strings and lists: `.split()` and `.join()`
`str.split()` turns a string into a list of smaller strings, breaking on whitespace by default. This is super useful for taking you from one big string to a list of words.

In [None]:
words = claim.split()
words

Occasionally you'll want to split on something other than whitespace:

In [None]:
datestr = '1956-01-31'
year, month, day = datestr.split('-')

`str.join()` takes us in the other direction, sewing a list of strings up into one long string, using the string it was called on as a separator.

In [None]:
'/'.join([month, day, year])
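
Putting `split` and `join` together gives a compact way to reorder delimited text. The helper `iso_to_us` below is a hypothetical name, and it assumes its input is always in `YYYY-MM-DD` form.

```python
def iso_to_us(datestr):
    """Convert 'YYYY-MM-DD' to 'MM/DD/YYYY' (assumes well-formed input)."""
    year, month, day = datestr.split('-')
    return '/'.join([month, day, year])

print(iso_to_us('1956-01-31'))  # 01/31/1956
```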

In [51]:
# Yes, we can put unicode characters right in our string literals :)
' 👏 '.join([word.upper() for word in words])

'PLUTO 👏 IS 👏 A 👏 PLANET!'

#### Building strings with `.format()`

Python lets us concatenate strings with the `+` operator.

In [None]:
planet + ', we miss you.'

If we want to throw in any non-string objects, we have to be careful to call `str()` on them first. Otherwise Python raises a `TypeError`, as the first cell below shows:

In [None]:
position = 9
planet + ", you'll always be the " + position + "th planet to me."

In [None]:
planet + ", you'll always be the " + str(position) + "th planet to me."

This is getting hard to read and annoying to type. `str.format()` to the rescue.

In [None]:
"{}, you'll always be the {}th planet to me.".format(planet, position)

So much cleaner! We call `.format()` on a "format string", where the Python values we want to insert are represented with `{}` placeholders.

Notice how we didn't even have to call `str()` to convert `position` from an int. `format()` takes care of that for us.

If that was all that `format()` did, it would still be incredibly useful. But as it turns out, it can do a lot more. Here's just a taste:

In [None]:
pluto_mass = 1.303 * 10**22
earth_mass = 5.9722 * 10**24
population = 52910390
#         2 significant digits   3 decimal points, format as percent     separate with commas
"{} weighs about {:.2} kilograms ({:.3%} of Earth's mass). It is home to {:,} Plutonians.".format(
    planet, pluto_mass, pluto_mass / earth_mass, population,
)
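
Each format spec can also be tried on its own, which makes it easier to see what it does. This sketch applies the three specs from the cell above one at a time:

```python
pluto_mass = 1.303 * 10**22
earth_mass = 5.9722 * 10**24

# .2  -> two significant digits
print("{:.2}".format(pluto_mass))                # 1.3e+22
# .3% -> multiply by 100, show three decimal places, add a percent sign
print("{:.3%}".format(pluto_mass / earth_mass))  # 0.218%
# ,   -> insert commas as thousands separators
print("{:,}".format(52910390))                   # 52,910,390
```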

In [None]:
# Referring to format() arguments by index, starting from 0
s = """Pluto's a {0}.
No, it's a {1}.
{0}!
{1}!""".format('planet', 'dwarf planet')
print(s)

You could probably write a short book just on `str.format`, so I'll stop here, and point you to [pyformat.info](https://pyformat.info/) and [the official docs](https://docs.python.org/3/library/string.html#formatstrings) for further reading.

## Dictionaries
Dictionaries are a built-in Python data structure for mapping keys to values.

In [None]:
numbers = {'one':1, 'two':2, 'three':3}

In this case `'one'`, `'two'`, and `'three'` are the **keys**, and 1, 2 and 3 are their corresponding values.

Values are accessed via square bracket syntax similar to indexing into lists and strings.

In [None]:
numbers['one']

We can use the same syntax to add another key, value pair:

In [None]:
numbers['eleven'] = 11
numbers

Or to change the value associated with an existing key:

In [None]:
numbers['one'] = 'Pluto'
numbers

Python has dictionary comprehensions with a syntax similar to the list comprehensions we saw in the previous tutorial.

In [None]:
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']
planet_to_initial = {planet: planet[0] for planet in planets}
planet_to_initial

The `in` operator tells us whether something is a key in the dictionary:

In [None]:
'Saturn' in planet_to_initial

In [None]:
'Betelgeuse' in planet_to_initial
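
Checking with `in` matters because looking up a missing key with square brackets raises a `KeyError`. As a sketch, here are two common ways to handle a key that might be absent. The dictionary `initials` is a made-up example, and `dict.get` is a built-in method not covered above that returns a default value instead of raising.

```python
initials = {'Earth': 'E', 'Mars': 'M'}

# Option 1: test membership before indexing
if 'Betelgeuse' in initials:
    initial = initials['Betelgeuse']
else:
    initial = '?'
print(initial)  # ?

# Option 2: dict.get returns the default when the key is missing
print(initials.get('Betelgeuse', '?'))  # ?
print(initials.get('Mars', '?'))        # M
```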

A `for` loop over a dictionary will loop over its keys:

In [None]:
for k in numbers:
    print("{} = {}".format(k, numbers[k]))

We can access a collection of all the keys or all the values with `dict.keys()` and `dict.values()`, respectively.

In [None]:
# Get all the initials, sort them alphabetically, and put them in a space-separated string.
' '.join(sorted(planet_to_initial.values()))

The very useful `dict.items()` method lets us iterate over the keys and values of a dictionary simultaneously. (In Python jargon, an **item** refers to a key, value pair)

In [None]:
for planet, initial in planet_to_initial.items():
    print("{} begins with \"{}\"".format(planet.rjust(10), initial))
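
`items()` also combines nicely with a dictionary comprehension. As a sketch, this inverts a small mapping so that the values become keys. It assumes the values are unique; otherwise later pairs would overwrite earlier ones. Both dictionary names are made up for illustration.

```python
numbers_by_name = {'one': 1, 'two': 2, 'three': 3}

# Swap each (key, value) pair
names_by_number = {value: key for key, value in numbers_by_name.items()}

print(names_by_number)  # {1: 'one', 2: 'two', 3: 'three'}
```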

To read a full inventory of dictionaries' methods, click the "output" button below to read the full help page, or check out the [official online documentation](https://docs.python.org/3/library/stdtypes.html#dict).

In [52]:
help(dict)

Help on class dict in module builtins:

class dict(object)
 |  dict() -> new empty dictionary
 |  dict(mapping) -> new dictionary initialized from a mapping object's
 |      (key, value) pairs
 |  dict(iterable) -> new dictionary initialized as if via:
 |      d = {}
 |      for k, v in iterable:
 |          d[k] = v
 |  dict(**kwargs) -> new dictionary initialized with the name=value pairs
 |      in the keyword argument list.  For example:  dict(one=1, two=2)
 |  
 |  Methods defined here:
 |  
 |  __contains__(self, key, /)
 |      True if D has a key k, else False.
 |  
 |  __delitem__(self, key, /)
 |      Delete self[key].
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __getitem__(...)
 |      x.__getitem__(y) <==> x[y]
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __init__(self, /, *args, **kwargs)
 |

## Your Turn
You've learned a lot of Python... go [**test your skills**](https://www.kaggle.com/kernels/fork/1275185) with some realistic programming applications.

--- 
[**Python Home Page**](https://www.kaggle.com/learn/python)

# [Working with External Libraries](https://www.kaggle.com/colinmorris/working-with-external-libraries)

--- 

In this lesson, I'll be talking about **imports** in Python, giving some tips for working with unfamiliar libraries (and the objects they return), and digging into the guts of Python just a bit to talk about **operator overloading**.

## Imports
So far we've talked about types and functions which are built-in to the language.

But one of the best things about Python (especially if you're a data scientist) is the vast number of high-quality custom libraries that have been written for it.

Some of these libraries are in the "standard library", meaning you can find them anywhere you run Python. Other libraries can be easily added, even if they aren't always shipped with Python.

Either way, we'll access this code with **imports**.

We'll start our example by importing `math` from the standard library.

In [53]:
import math

print("It's math! It has type {}".format(type(math)))

It's math! It has type <class 'module'>


`math` is a module. A module is just a collection of variables (a namespace, if you like) defined by someone else. We can see all the names in `math` using the built-in function `dir()`.

In [54]:
print(dir(math))

['__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'acos', 'acosh', 'asin', 'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'copysign', 'cos', 'cosh', 'degrees', 'e', 'erf', 'erfc', 'exp', 'expm1', 'fabs', 'factorial', 'floor', 'fmod', 'frexp', 'fsum', 'gamma', 'gcd', 'hypot', 'inf', 'isclose', 'isfinite', 'isinf', 'isnan', 'ldexp', 'lgamma', 'log', 'log10', 'log1p', 'log2', 'modf', 'nan', 'pi', 'pow', 'radians', 'sin', 'sinh', 'sqrt', 'tan', 'tanh', 'tau', 'trunc']


We can access these variables using dot syntax. Some of them refer to simple values, like `math.pi`:

In [55]:
print("pi to 4 significant digits = {:.4}".format(math.pi))

pi to 4 significant digits = 3.142


But most of what we'll find in the module are functions, like `math.log`:

In [56]:
math.log(32, 2)

5.0

Of course, if we don't know what `math.log` does, we can call `help()` on it:

In [57]:
help(math.log)

Help on built-in function log in module math:

log(...)
    log(x[, base])
    
    Return the logarithm of x to the given base.
    If the base not specified, returns the natural logarithm (base e) of x.



We can also call `help()` on the module itself. This will give us the combined documentation for all the functions and values in the module (as well as a high-level description of the module). Click the "output" button to see the whole math help page.

In [58]:
help(math)

Help on module math:

NAME
    math

MODULE REFERENCE
    https://docs.python.org/3.6/library/math
    
    The following documentation is automatically generated from the Python
    source files.  It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations.  When in doubt, consult the module reference at the
    location listed above.

DESCRIPTION
    This module is always available.  It provides access to the
    mathematical functions defined by the C standard.

FUNCTIONS
    acos(...)
        acos(x)
        
        Return the arc cosine (measured in radians) of x.
    
    acosh(...)
        acosh(x)
        
        Return the inverse hyperbolic cosine of x.
    
    asin(...)
        asin(x)
        
        Return the arc sine (measured in radians) of x.
    
    asinh(...)
        asinh(x)
        
        Return the inverse hyperbolic sine of x.
    
    atan(...)
        atan(x)
        
 

### Other import syntax

If we know we'll be using functions in `math` frequently we can import it under a shorter alias to save some typing (though in this case "math" is already pretty short).

In [59]:
import math as mt
mt.pi

3.141592653589793

> You may have seen code that does this with certain popular libraries like Pandas, Numpy, Tensorflow, or Matplotlib. For example, it's a common convention to `import numpy as np` and `import pandas as pd`.

The `as` simply renames the imported module. It's equivalent to doing something like:

In [60]:
import math
mt = math

Wouldn't it be great if we could refer to all the variables in the `math` module by themselves? i.e. if we could just refer to `pi` instead of `math.pi` or `mt.pi`? Good news: we can do that.

In [61]:
from math import *
print(pi, log(32, 2))

3.141592653589793 5.0


`import *` makes all the module's variables directly accessible to you (without any dotted prefix).

Bad news: some purists might grumble at you for doing this.

Worse: they kind of have a point.

In [62]:
from math import *
from numpy import *
print(pi, log(32, 2))

TypeError: return arrays must be of ArrayType

What the what? But it worked before!

These kinds of "star imports" can occasionally lead to weird, difficult-to-debug situations.

The problem in this case is that the `math` and `numpy` modules both have functions called `log`, but they have different semantics. Because we import from `numpy` second, its `log` overwrites (or "shadows") the `log` variable we imported from `math`.
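
The same kind of shadowing can be reproduced with just the standard library. In this sketch, `math` and `cmath` (the standard module for complex-number math) both define `sqrt`, and whichever import runs last wins:

```python
from math import sqrt
print(sqrt(4))     # 2.0

# Importing the same name from another module rebinds it
from cmath import sqrt
print(sqrt(4))     # (2+0j)
print(sqrt(-1))    # 1j
```

With `math.sqrt`, the last line would have raised a `ValueError` instead of returning a complex number.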

A good compromise is to import only the specific things we'll need from each module:

In [None]:
from math import log, pi
from numpy import asarray

### Submodules

We've seen that modules contain variables which can refer to functions or values. Something to be aware of is that they can also have variables referring to other modules.

In [None]:
import numpy
print("numpy.random is a", type(numpy.random))
print("it contains names such as...",
      dir(numpy.random)[-15:]
     )

So if we import `numpy` as above, then calling a function in the `random` "submodule" will require two dots.

In [None]:
# Roll 10 dice. High is not inclusive, so high of 7 lets us roll a 6
rolls = numpy.random.randint(low=1, high=7, size=10)
rolls

## Oh the places you'll go, oh the objects you'll see
So after 6 lessons, you're a pro with ints, floats, bools, lists, strings, and dicts (right?).

Even if that were true, it doesn't end there. As you work with various libraries for specialized tasks, you'll find that they define their own types which you'll have to learn to work with. For example, if you work with the graphing library `matplotlib`, you'll be coming into contact with objects it defines which represent Subplots, Figures, TickMarks, and Annotations. pandas functions will give you DataFrames and Series.

In this section, I want to share with you a quick survival guide for working with strange types.

### Three tools for understanding strange objects
In the cell above, we saw that calling a `numpy` function gave us an "array". We've never seen anything like this before (not in this course anyways). But don't panic: we have three familiar builtin functions to help us here.

**1**: `type()` (what is this thing?)

In [None]:
type(rolls)

**2**: `dir()` (what can I do with it?)

In [None]:
print(dir(rolls))

In [None]:
# What am I trying to do with this dice roll data? Maybe I want the average roll, in which case the "mean"
# method looks promising...
rolls.mean()

In [None]:
# Or maybe I just want to get back on familiar ground, in which case I might want to check out "tolist"
rolls.tolist()

**3**: `help()` (tell me more)

In [None]:
# That "ravel" attribute sounds interesting. I'm a big classical music fan.
help(rolls.ravel)

In [None]:
# Okay, just tell me everything there is to know about numpy.ndarray
# (Click the "output" button to see the novel-length output)
help(rolls)

(Of course, you might also prefer to check out [the online docs](https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.ndarray.html))
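
The same three tools work on any object, not just numpy arrays. Here is a sketch that uses `datetime.date` from the standard library as the "strange" object. Filtering out names that start with an underscore is a common trick for shortening `dir()` output; the name `public_names` is just for illustration.

```python
import datetime

d = datetime.date(2020, 2, 29)

# 1: what is this thing?
print(type(d))  # <class 'datetime.date'>

# 2: what can I do with it? (skip the names that start with an underscore)
public_names = [name for name in dir(d) if not name.startswith('_')]
print(public_names)

# 3: tell me more about one of those names
help(d.isoformat)
```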

### Operator overloading

What's the value of the below expression?

In [None]:
[3, 4, 1, 2, 2, 1] + 10

What a silly question. Of course it's an error.

But what about...

In [None]:
rolls + 10

We might think that Python strictly polices how pieces of its core syntax behave such as `+`, `<`, `in`, `==`, or square brackets for indexing and slicing. But in fact, it takes a very hands-off approach. When you define a new type, you can choose how addition works for it, or what it means for an object of that type to be equal to something else.

The designers of lists decided that adding them to numbers wasn't allowed. The designers of numpy arrays went a different way (adding the number to each element of the array).
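
To make this concrete, here is a sketch of a tiny type that makes the same choice as numpy arrays. `DiceRolls` is a made-up class for illustration only (it is not how numpy is implemented). It defines the special method `__add__`, which Python calls when it sees `+`.

```python
class DiceRolls:
    """A toy wrapper around a list of rolls (hypothetical, for illustration)."""

    def __init__(self, values):
        self.values = list(values)

    def __add__(self, number):
        # Called for `toy_rolls + number`: add the number to every element
        return DiceRolls([v + number for v in self.values])

toy_rolls = DiceRolls([3, 4, 1])
shifted = toy_rolls + 10
print(shifted.values)  # [13, 14, 11]
```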

Here are a few more examples of how `numpy` arrays interact unexpectedly with Python operators (or at least differently from lists).

In [None]:
# At which indices are the dice less than or equal to 3?
rolls <= 3

In [None]:
xlist = [[1,2,3],[2,4,6],]
# Create a 2-dimensional array
x = numpy.asarray(xlist)
print("xlist = {}\nx =\n{}".format(xlist, x))

In [None]:
# Get the last element of the second row of our numpy array
x[1,-1]

In [None]:
# Get the last element of the second sublist of our nested list?
xlist[1,-1]

numpy's `ndarray` type is specialized for working with multi-dimensional data, so it defines its own logic for indexing, allowing us to index by a tuple to specify the index at each dimension.
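
Under the hood, an expression like `x[1, -1]` passes the tuple `(1, -1)` to the object's indexing logic, which is why a plain list rejects it. The sketch below uses a made-up `Grid` class (not numpy's actual implementation) whose `__getitem__` method accepts such a tuple.

```python
class Grid:
    """A toy 2-D container (hypothetical, for illustration)."""

    def __init__(self, rows):
        self.rows = rows

    def __getitem__(self, key):
        # For grid[1, -1], key is the tuple (1, -1)
        row, col = key
        return self.rows[row][col]

grid = Grid([[1, 2, 3], [2, 4, 6]])
print(grid[1, -1])  # 6
```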

#### When does 1 + 1 not equal 2?

Things can get weirder than this. You may have heard of (or even used) tensorflow, a Python library popularly used for deep learning. It makes extensive use of operator overloading.

In [None]:
import tensorflow as tf
# Create two constants, each with value 1
a = tf.constant(1)
b = tf.constant(1)
# Add them together to get...
a + b

In TensorFlow 1.x (the version this lesson was written for), `a + b` isn't 2, it is (to quote tensorflow's documentation)...

> a symbolic handle to one of the outputs of an `Operation`. It does not hold the values of that operation's output, but instead provides a means of computing those values in a TensorFlow `tf.Session`.

It's important just to be aware of the fact that this sort of thing is possible and that libraries will often use operator overloading in non-obvious or magical-seeming ways.

Understanding how Python's operators work when applied to ints, strings, and lists is no guarantee that you'll be able to immediately understand what they do when applied to a tensorflow `Tensor`, or a numpy `ndarray`, or a pandas `DataFrame`.

Once you've had a little taste of DataFrames, for example, an expression like the one below starts to look appealingly intuitive:

In [None]:
# Get the rows with population over 1m in South America
df[(df['population'] > 10**6) & (df['continent'] == 'South America')]

But why does it work? The example above features something like 5 different overloaded operators. What's each of those operations doing? It can help to know the answer when things start going wrong.

#### Curious how it all works?

Have you ever called `help()` or `dir()` on an object and wondered what the heck all those names with the double-underscores were?

In [None]:
print(dir(list))

This turns out to be directly related to operator overloading.

When Python programmers want to define how operators behave on their types, they do so by implementing methods with special names beginning and ending with 2 underscores such as `__lt__`, `__setattr__`, or `__contains__`. Generally, names that follow this double-underscore format have a special meaning to Python.

So, for example, the expression `x in [1, 2, 3]` is actually calling the list method `__contains__` behind-the-scenes. It's equivalent to (the much uglier) `[1, 2, 3].__contains__(x)`.
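
You can check these equivalences yourself. The sketch below calls a few special methods directly, purely to show that they give the same answers as the usual syntax (in normal code you'd stick with the operators).

```python
nums = [1, 2, 3]

print(2 in nums, nums.__contains__(2))            # True True
print(nums == [1, 2, 3], nums.__eq__([1, 2, 3]))  # True True
print(nums[0], nums.__getitem__(0))               # 1 1
```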

If you're curious to learn more, you can check out [Python's official documentation](https://docs.python.org/3.4/reference/datamodel.html#special-method-names), which describes many, many more of these special "underscores" methods.

We won't be defining our own types in these lessons (if only there was time!), but I hope you'll get to experience the joys of defining your own wonderful, weird types later down the road.

## Your turn!
Head over to [**the final coding exercise**](https://www.kaggle.com/kernels/fork/1275190) for one more round of coding questions involving imports, working with unfamiliar objects, and, of course, more gambling.

---
[**Python Home Page**](https://www.kaggle.com/learn/python)