# Lesson 1 - Numeric types, loops, flow control, strings

Working as a data scientist often requires us to put the problem we are trying to solve in terms of instructions a computer can understand. In this course, you will develop a working knowledge of Python, a programming language that has become a *de facto* standard for working on data science problems. By learning to wield this tool effectively, you will be well-equipped for tackling real problems in this space.

For this course, we will be using the popular [Anaconda](https://www.anaconda.com/download/) Python distribution. Anaconda installs easily on Windows, macOS, and Linux, and includes over 100 packages commonly used for data science and numerical computing. Included in this distribution is the [Jupyter](http://jupyter.org/) notebook, which allows us to write code, plot figures, make notes, and more, all in a single place. Have a look at the [user documentation](https://jupyter-notebook.readthedocs.io/en/stable/notebook.html) for an introduction to Jupyter.

Let's get started!

## Types of data

A programming language like Python deals in terms of data (nouns, things) and operations on that data (actions, verbs). We'll start with the types of data we have at our disposal.

First, we have `int`s, or integers, such as:

In [1]:
3

3

`float`s, or floating point numbers, such as:

In [2]:
3.7

3.7

`complex` numbers:

In [19]:
7 + 3j

(7+3j)

`bool`s, or booleans, which have one of two possible values:

In [3]:
True

True

In [4]:
False

False

And `None`, which is a special type with exactly one possible value:

In [6]:
None

Later in this lesson we'll have a look at strings and data types that have more structure, but these are the basic data types Python deals in. We'll often refer to instances of a data type as *objects*, but a point that will be made clearer over time is that *almost everything in Python is an object*. Best not to dwell on this right now; we will return to it.

To check the type of an object, you can use a *built-in function* called `type`:

In [8]:
type(3)

int

In [9]:
type(True)

bool

In [10]:
type(3.4)

float

Python features many such *built-ins*, and we'll introduce these as we go.
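
As a small sketch, `type` works the same way on the other basic types introduced above. We use another built-in, `print`, so that every line produces output; note that `print` shows the type in its longer `<class '...'>` form rather than the short form the notebook displays.

```python
# `type` applied to the remaining basic types from this section
print(type(7 + 3j))   # <class 'complex'>
print(type(None))     # <class 'NoneType'>
```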

## Binary operations

So far we've looked at the most basic objects in Python, but now we'll have a look at operations between them. For numerical data types, we have arithmetic operations that you're probably familiar with, such as:

In [14]:
# addition
2 + 5

7

In [15]:
# subtraction
2 - 4

-2

In [20]:
# multiplication
3 * 4

12

In [21]:
# division
3 / 4

0.75

Notice that dividing one `int` by another gave us back a `float`. This behavior is new in Python 3, and avoids some hard-to-spot bugs. In Python 2, division between `int`s behaved like what Python 3 calls "floor" division:

In [28]:
# "floor" division
3 // 4

0

...which is the same result, but rounded down to the nearest whole number.
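
To make the difference concrete, here is a short sketch comparing the two division operators. The numbers are arbitrary; the last line shows that floor division rounds toward negative infinity, which matters for negative results.

```python
# true division always returns a float
print(7 / 2)     # 3.5

# floor division between ints returns an int, rounded down
print(7 // 2)    # 3

# "rounded down" means toward negative infinity, not toward zero
print(-7 // 2)   # -4
```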

These operators follow the order-of-operations you would expect, so:

In [25]:
3 + 4 / 2

5.0

is the same as:

In [27]:
3 + (4 / 2)

5.0

We can also perform comparisons; these yield booleans:

In [68]:
# less than
3 < 2

False

In [72]:
# greater than or equal to
4 >= 4

True

In [73]:
# equal
7 == 2

False

In [74]:
# not equal
7 != 2

True

### Binary operations for booleans

Further on, we'll see how to use flow control to write code that can make decisions and follow different pathways. Key to this behavior is the use of booleans, and operations between booleans.

First, there's `and`: the booleans on either side of `and` must both be `True` for the result to be a `True`, otherwise the result is `False`:

In [35]:
True and True

True

In [30]:
True and False

False

In [36]:
False and False

False

We can show all the possible results of `and`ing two booleans `s1` and `s2` together in a "truth table":

| s1    | s2    | s1 and s2 |
| ----- | ----- | --------- |
| True  | True  | True      |
| True  | False | False     |
| False | True  | False     |
| False | False | False     |

On the other hand, there is `or`: at least one boolean on either side of `or` must be `True` for the result to be `True`, otherwise the result is `False`:

In [37]:
True or True

True

In [33]:
True or False

True

In [38]:
False or False

False

| s1    | s2    | s1 or s2 |
| ----- | ----- | -------- |
| True  | True  | True     |
| True  | False | True     |
| False | True  | True     |
| False | False | False    |

Finally, there is `not`, which inverts a boolean. This is technically a *unary* operator, since it operates on only one boolean, but we'll include it here anyway:

In [39]:
not True

False

In [40]:
not False

True

| s1    | not s1 |
| ----- | ------ |
| True  | False  |
| False | True   |
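
Since comparisons yield booleans, they can be combined with these operators. The following is a minimal sketch with arbitrary numbers, using `print` so that each line shows its result:

```python
# both comparisons are True, so `and` gives True
print(3 < 5 and 5 < 10)   # True

# the second comparison is False, so `and` gives False
print(3 < 5 and 5 > 10)   # False

# at least one comparison is True, so `or` gives True
print(3 > 5 or 5 < 10)    # True

# `3 > 5` is False, and `not` inverts it
print(not 3 > 5)          # True
```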

## Names and objects

When we talk about Python, we sometimes mean different things. Sometimes we are talking about *the language*, which includes the syntax and what tools are available to tell the computer what to do. Other times, we might be speaking about the *Python interpreter*: this is the program on your computer that *executes* code you have written in the Python language.

When running a session with the Python interpreter, as this notebook is doing, we can divide the world the interpreter sees into two sets of things: *names* and *objects*. We'll start with an example:

I'm going to create a name `my_weight`, and I'm going to *point it* at an integer of value `170`:

In [59]:
my_weight = 170

We can now refer to this value with the name I've attached to it:

In [60]:
my_weight

170

I can create a new name, `my_weight_kg`, giving my weight in kilograms; I'll point it at the value that results from calculating my weight in kilograms from my weight in pounds:

In [61]:
my_weight_kg = my_weight / 2.2

In [62]:
my_weight_kg

77.27272727272727

Now, what if later I weigh myself and find I've gotten lighter? I'll set `my_weight` to point to a new integer, `165`:

In [63]:
my_weight = 165

What will be my weight in kilograms?

In [64]:
my_weight_kg

77.27272727272727

It didn't change! The reason for this is that the name `my_weight_kg` just points to a value in memory; it does not keep track of *how* that value came about. If we want my weight in kilograms, we'll need to update it explicitly:

In [65]:
my_weight_kg = my_weight / 2.2

In [67]:
my_weight_kg

75.0

Thinking in terms of names and objects will be important to make effective use of Python, and to avoid otherwise unexpected behavior. We'll see more examples where this awareness will serve us well.
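
Here is one more small sketch of the same idea, this time with two names pointed at the same object. The names `height_in` and `same_height` and their values are made up for illustration.

```python
# point a name at an integer, then point a second name at the same object
height_in = 70
same_height = height_in

# re-pointing the first name does not affect the second
height_in = 71

print(height_in)     # 71
print(same_height)   # 70
```

Assignment only changes what the name on the left points to; any other name that pointed at the old object still does.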

## `while` loops

What if we wanted to produce a sum of a bunch of numbers, such as:

In [40]:
1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10

55

This is a series of binary operations between two numbers, and could be rewritten to illustrate this as:

In [40]:
((((((((1 + 2) + 3) + 4) + 5) + 6) + 7) + 8) + 9) + 10

55

Computers are very good at taking a task and repeating it over and over again; we can take advantage of this to perform the same calculation with a `while` loop:

In [75]:
# start at 1, with an accumulated total of 0
number = 1
total = 0

# keep executing the body of the loop until
# `number < 11` yields `False`
while number < 11:
    # add the current value of `number` to `total`
    # set the name `total` to the new value
    total = total + number
    
    # increase `number` by 1
    number += 1
    
print(total)

55
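
The payoff of the loop is that changing one number changes how far we sum. As a sketch, here is the same loop with a hypothetical name `limit` for the upper bound, checked against the well-known closed-form result `limit * (limit + 1) / 2` for the sum of the first `limit` integers:

```python
limit = 100   # arbitrary upper bound; try other values
number = 1
total = 0

# keep adding until `number` passes `limit`
while number <= limit:
    total += number   # shorthand for `total = total + number`
    number += 1

print(total)                               # 5050
print(total == limit * (limit + 1) // 2)   # True
```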


### Challenge: calculate the product of the first 20 odd numbers

In [77]:
number = 1
total = 1
count = 1

while count < 21:

    total = total * number
    number += 2
    count += 1
    
print(total)

319830986772877770815625


## Introduction to strings

So far we've dealt in data types that are *non-iterable*; that is, they cannot be stepped through element by element, and they have no length:

In [79]:
len(6)

TypeError: object of type 'int' has no len()

The first *iterable* we'll introduce is the string. For example:

In [81]:
word = 'bird'

In [82]:
word

'bird'

A string has a length:

In [58]:
len(word)

4

And it can be *indexed*. We can select the `'b'` in `'bird'` with:

In [60]:
word[0]

'b'

Or the `'i'`:

In [61]:
word[1]

'i'

Indexing of data structures in Python is by convention zero-based; the first element is at index 0 ("zeroth"), the second at index 1, the third at index 2, etc. This is something you can count on throughout the Python ecosystem, although it takes a little getting used to.

We can also index backward, from the end of the string, with negative integers:

In [83]:
word[-2]

'r'

And we can do *slicing*:

In [65]:
word[0:3]

'bir'

This should be read as "get me the zeroth element, up to but not including the third element". This "...up to and not including..." behavior is also something we'll see more of, and though it also takes some getting used to, it's consistently used in the Python ecosystem except in some very select cases.
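
A few more indexing and slicing cases, as a sketch. Leaving out the start or stop of a slice is standard Python that the lesson has not shown yet: an omitted start means "from the beginning" and an omitted stop means "through the end".

```python
word = 'bird'

# the last valid index is always one less than the length
last_index = len(word) - 1
print(word[last_index])   # d
print(word[-1])           # d (same character, counted from the end)

# slices include the start index and exclude the stop index
print(word[1:3])          # ir
print(len(word[1:3]))     # 2, which is stop - start

# omitted start or stop
print(word[:2])           # bi
print(word[2:])           # rd
```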

Strings can be made in a variety of ways:

In [66]:
'with single quotes'

'with single quotes'

In [67]:
"with double quotes"

'with double quotes'

In [84]:
"single 'scare quotes'"

"single 'scare quotes'"

In [86]:
'double "scare quotes"'

'double "scare quotes"'

The rule is, if you start a string with one kind of quote, you must end it with the same kind. Otherwise, you will get an error.
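
If a string needs to contain the same kind of quote that delimits it, one option is to *escape* the inner quote with a backslash. This is a minimal sketch; the names are made up for illustration.

```python
# a mismatched pair such as 'oops" is a SyntaxError, so it can't be run here

# the backslash tells Python the next quote is part of the string
escaped = 'it\'s'

# using the other kind of quote avoids the need to escape
unescaped = "it's"

print(escaped)                # it's
print(escaped == unescaped)   # True
```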

There are also multi-line strings: 

In [94]:
multiline = """This is
   a multi-line string"""

In [95]:
multiline

'This is\n   a multi-line string'

In [96]:
print(multiline)

This is
   a multi-line string


These are often used for writing documentation for functions, which we will do in a later lesson.

## Using string methods

Objects in Python feature *methods*, which can often be used to manipulate the existing data structure or obtain a new one from an existing one. For example:

In [98]:
word.upper()

'BIRD'

In this case, calling `upper()` on `word` yielded a new string with all the characters converted to uppercase. In the notebook, the methods available on an object can be shown by typing the name followed by a `.`, then hitting the **Tab** key.

In [99]:
morewords = "a splash of mineral water"

Not all methods of strings return other strings, though. For example, `split()` returns a `list` of strings separated by whitespace in the original:

In [100]:
morewords.split()

['a', 'splash', 'of', 'mineral', 'water']

We'll learn more about lists in the next lesson.
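
`split()` can also be given a separator to split on instead of whitespace. A quick sketch, using a made-up string and names:

```python
date_text = "2024-01-15"   # hypothetical example string

# split on the '-' character rather than on whitespace
parts = date_text.split('-')

print(parts)        # ['2024', '01', '15']
print(len(parts))   # 3
print(date_text)    # 2024-01-15
```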

As a brief aside, the `+` operator can be used to concatenate two strings:

In [78]:
"one" + "two"

'onetwo'

And the `*` operator can be used with an integer to concatenate the same string many times:

In [102]:
"six" * 6

'sixsixsixsixsixsix'

Before we move on, it's important to note that a string is an *immutable* data structure. This distinction will become clearer later, but suffice it to say that we cannot change the characters in an existing string:

In [103]:
my_string = "cat"

In [104]:
# change 'cat' to 'hat'?
my_string[0] = 'h'

TypeError: 'str' object does not support item assignment

Methods that appear to modify a string, such as:

In [105]:
my_string.replace('c', 'h')

'hat'

always return a new string, leaving the original intact:

In [106]:
my_string

'cat'
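
If we do want the name to refer to the changed text, we point it at the new string. A minimal sketch of two ways to do that; `other_string` is a name made up for this example:

```python
my_string = "cat"

# point the name at the new string returned by `replace`
my_string = my_string.replace('c', 'h')
print(my_string)      # hat

# or build a new string from a slice and concatenation
other_string = "cat"
other_string = 'h' + other_string[1:]
print(other_string)   # hat
```

In both cases the original `'cat'` object was never modified; the name simply points somewhere new.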

## Using `for` loops for iteration

We already saw how to loop using `while`, but in many cases it is more natural to use an iterable, such as a string, as a way to iterate. We can write a `for` loop like:

In [107]:
for letter in word:
    print(letter)

b
i
r
d


This is equivalent to doing:

In [80]:
print(word[0])
print(word[1])
print(word[2])
print(word[3])

b
i
r
d


But it will work the same no matter what `word` is:

In [109]:
word = 'supercalifragilisticexpialidocious'

In [110]:
for letter in word:
    print(letter)

s
u
p
e
r
c
a
l
i
f
r
a
g
i
l
i
s
t
i
c
e
x
p
i
a
l
i
d
o
c
i
o
u
s


The name `letter` is set to each character in the string, one after the next, and for each iteration the body of the loop executes. The body of the loop, just as with the `while` loop, is indicated with indentation (by convention, four spaces).
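
The body of a `for` loop can do more than print. As a sketch, we can combine it with the accumulation pattern from the `while` loop section; the names `animal`, `count`, and `reversed_animal` are made up for this example.

```python
animal = 'bird'

count = 0              # how many characters we have seen
reversed_animal = ''   # the characters seen so far, in reverse order

for letter in animal:
    count += 1
    # put each new letter in front of what we have built so far
    reversed_animal = letter + reversed_animal

print(count)             # 4
print(reversed_animal)   # drib
```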

## Flow control

So far, we've written code blocks in which every line is executed. But often, we need to write code that makes decisions, and based on those decisions executes different sets of instructions. Enter the `if` statement:

In [112]:
for letter in word:
    if letter != 'l':
        print(letter)

s
u
p
e
r
c
a
i
f
r
a
g
i
i
s
t
i
c
e
x
p
i
a
i
d
o
c
i
o
u
s


Inside the body of this loop, we ask whether or not `letter` is `'l'`. If it is *not* `'l'`, then we go ahead and print it; otherwise, we don't do anything and move on to the next iteration of the loop.

We can also use `else` statements to do something in case the `if` condition is not met:

In [113]:
for letter in word:
    if letter != 'l':
        print(letter)
    else:
        print(None)

s
u
p
e
r
c
a
None
i
f
r
a
g
i
None
i
s
t
i
c
e
x
p
i
a
None
i
d
o
c
i
o
u
s


Or if we need one of many mutually exclusive possibilities to happen, we can add in `elif`, or "else if":

In [114]:
for letter in word:
    if letter == 'c':
        print("I'm a c!")
    elif letter != 'l':
        print(letter)
    else:
        print(None)

s
u
p
e
r
I'm a c!
a
None
i
f
r
a
g
i
None
i
s
t
i
I'm a c!
e
x
p
i
a
None
i
d
o
I'm a c!
i
o
u
s


It's important to remember that the order of the `if`, `elif`, and `else` matters here. The `if` statement is evaluated first, and if it evaluates `True`, then its body gets executed. It does not matter in that case if an `elif` that follows it would have also evaluated to `True`. The `else` body will get executed if none of the prior statements in the block end up evaluating to `True`.
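
To see why the order matters, here is a sketch with a made-up value that satisfies more than one condition. The names `size` and `swapped` are only for illustration.

```python
number = 15   # hypothetical value; greater than both 10 and 5

# the first condition that is True wins
if number > 10:
    size = 'large'
elif number > 5:
    size = 'medium'
else:
    size = 'small'

print(size)      # large

# same conditions, swapped order: `number > 5` is checked first,
# so the `elif` body can never run for any number above 10
if number > 5:
    swapped = 'medium'
elif number > 10:
    swapped = 'large'
else:
    swapped = 'small'

print(swapped)   # medium
```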

We can make conditions for an `if` statement as complicated as needed to achieve our desired result, so long as the result evaluates to a boolean:

In [116]:
for letter in word:
    if letter != 'l' and letter != 'c':
        print(letter)
    else:
        print(None)

s
u
p
e
r
None
a
None
i
f
r
a
g
i
None
i
s
t
i
None
e
x
p
i
a
None
i
d
o
None
i
o
u
s
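
The same condition can often be written more than one way. As a sketch, "not `'l'` and not `'c'`" is equivalent to "not (`'l'` or `'c'`)"; the word and names below are made up for the example.

```python
sample = 'clock'   # hypothetical word

kept_with_and = ''
kept_with_not_or = ''

for letter in sample:
    # keep the letter if it is neither 'l' nor 'c'
    if letter != 'l' and letter != 'c':
        kept_with_and = kept_with_and + letter

    # the same test, written with `not` and `or`
    if not (letter == 'l' or letter == 'c'):
        kept_with_not_or = kept_with_not_or + letter

print(kept_with_and)      # ok
print(kept_with_not_or)   # ok
```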


## The first step is the hardest (and sometimes the most boring)

This lesson covered a lot of nuts and bolts, all of which we'll need to do more exciting things later. Play around and practice with these concepts, and keep pace with the assigned homework; this lesson introduced the individual blocks in a set of Legos$^\text{TM}$, and we need to know each of them to build things effectively.

That being said, don't stress *too* much about learning everything upfront; time, practice, and mistakes are the best teachers.