# Basic control structures and variables I

## Nails for the hammer

As one of the prereqs for this course was some knowledge of MATLAB or a decent understanding of programming, we won't spend a huge amount of time on concepts, assuming you have a fair idea, and focus on the Python context. If you find yourself stuck on a fundamental question, do ask others around you, and do help others around you, especially if I'm on the far side of the room. I'll not feel left out.

These first few tasks are all pretty unapplied and guided - this should get us underway, and we will then start learning Python by using actual science applications. Since we only have a short course, and you already have used some coding, I will move through this introductory session fairly quickly, so you have the tools to explore a wide variety of Python aspects by the end of Days 2 and 3 - please discuss with each other and if you are still wondering about things during the break, do ask.

My recommendation is that the first time I run a cell, you do so too - I will show you how in a mo. This means everything I define and later use, you will have defined and can later use too.

# First experiment with Jupyter

* Find which line number you wrote your interests in within Etherpad (https://beta.etherpad.org/p/qub2019-FavvnJ)
* Click the blank box below saying `In [ ]`
* Change the box below to read `x = LINENUM` (e.g. `x = 15`)
* Press *Ctrl+Return*

In [1]:
x=1

Note that `#` is the comment symbol in Python - all text after it is ignored.

This lets you make notes-to-self in your code. However, above, we want Python to run that line, so don't forget to delete the `#` symbol before pressing _Ctrl+Return_

This just set the variable `x` (globally for this notebook) to the number of your Etherpad line

Click to edit the next `In [ ]` and press *Ctrl+Return* (without changing anything this time)

In [2]:
2 * x

2

Hopefully, you should notice that the output is indeed twice the *x* you input. This should all be pretty familiar, unless you come from statically-typed languages (C/C++/Java) where you would have to declare a variable. Python lets you effectively declare it, and its type, by assigning to it.

Note you can also use _Shift+Return_, which will move you onto the next cell when it runs the current one.

We can refer to this output as `Out[2]`, using it like any variable, without having to re-run everything so far.

Do the same trick in the next `In [ ]` (edit and *Ctrl+Return*)

In [None]:
if Out[2] / 2 == x:
    print("Something else")
print(2/3)

There are a couple of things here to dissect. Many languages need braces `{}`, brackets `[]` or parentheses `()` to draw boundaries around the argument of the `if` statement, or the body. In Python, symbols are reduced in favour of words and spaces, for readability. To achieve the same effect, Python separates the condition from the body with the colon `:` and the indentation (four spaces, by convention)

Indentation is Python's Marmite for other programmers. As they are keen to point out, most whitespace does *not* matter: e.g. I could write

`if Out[2] / 2           ==    x  :`

and it's the same as

`if Out[2] / 2 == x:`

However, indentation (whitespace at the start of a line) does. It is an extremely visually clear way to show where blocks of code, like the body of this `if` statement, start and end. For scientists, it forces a very good programming practice - basic code style - your code doesn't run if it isn't visually clear (well... neatly indented). One of the reasons I got into Python was that my PhD C++ code was so unreadable, dodgily indented and unwieldy, it was easier to rewrite in Python than try and extend it - once I had, modification was easy: I could see how everything was laid out, where functions started and ended, before even reading a character.

Another thing to note is that you can print with `print`. Here we are running Python3, the latest version. If you've played with Python2, you will notice this syntax is a bit different - it has parentheses (aka. parens) around the argument. A whole raft of changes came in with Python3 and, like Windows XP, Python2 has stayed around for a long time through sheer momentum. However it is about to reach end-of-life, so you should make sure to develop new software for Python3. Good porting tools exist for moving old code and, if required, you can write code that will work in both.

To summarize those key points:

* Indentation separates bodies of functions, conditionals, loops from the outside code
* Basic comparison and boolean operators are `==`, `>`, `<`, `or`, `and`, `not`, etc.
* This is Python3 syntax (very slightly different to Python2)
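
As a quick illustration of those points, here is a minimal sketch combining a few comparison and boolean operators in an indented block. The names `reading`, `sensor_ok` and `label` are made up for this example, and `elif` (short for "else if") is a keyword we have not met yet.

```python
# Hypothetical values, purely for illustration
reading = 42
sensor_ok = True

if reading > 100 or not sensor_ok:
    # Runs only if the reading is too big, or the sensor is broken
    label = "suspicious"
elif reading >= 0 and reading == int(reading):
    # Runs only if the first test failed and both of these are true
    label = "whole number in range"
else:
    label = "something else"

print(label)
```

Try changing `reading` or `sensor_ok` and predicting which branch runs before you execute it.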

# Debugging
## Ladder for the hole

Now we can edit and run code, the inevitable consequence is that we get a bug!

Try running the line below (without correcting it):

In [None]:
if (x < 2) or (x > 100):
    print("Do I know you?')
else:
    print('Yes I do')

Great. Pretty colours! What do they mean?

In general, the most immediately useful information is on the very last line: `SyntaxError`. Above it, is the last "frame" called - that is, level of function. Say you call a function, and it calls another, and so forth until the innermost function hits an error. Here, it will list all the functions in that sequence, so you can track exactly how you got to the dodgy line.

In Jupyter, the filename is not very informative - but you can see the input number (e.g. `ipython-input-4` ↔ `In [4]`).

Bonus mark if you spotted the error (not that we actually have marks, but hey) - have a look and see if you can fix it and re-run the command, the same way you ran it the first time. Feel free to discuss, use Etherpad chat, whichever. Note that the `In [4]` changes to `In [5]`, and so forth, each time you run that "cell".

In general, "a string" and 'a string' are equivalent - Python does not mind whether single or double quotes are used, as long as they match at either end.

* Check the error at the bottom
* Check the caret (^) showing where Python sees a problem
* Look at the syntax highlighting in the cell

# Functions

## Painting the Black Box

Like every language, Python has functions... they are defined like so:

In [None]:
def tell_me_if_im_old(name, age):
    print("...checking if", name, "is old...")
    
    if age < 100:
        return False
    else:
        return True

Couple of things to note: (i) this has nested blocks, we just indented twice for the inner ones (`if` and `else` bodies). (ii) this *didn't* specify a type for `age` (or `name`) - all it cares about is that we can compare `age` to an *int*. Similarly, we don't specify a return type - it is up to us to return something intuitive and sensible. Let's use it:

In [None]:
phils_age = 34
name = "Phil"

print("I am", phils_age, "- am I old?")

if tell_me_if_im_old(name, phils_age):
    print("Yup, it's a fact.")
else:
    print("Not at all, spring chicken.")

This one I'm saving for future use. You can see that this is used pretty much like every function in many languages - name, parentheses (aka "parens") containing arguments.
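
To see what "no types specified" means in practice, here is a small sketch that calls a function like the one above with an *int* and then a *float*. The function body is a condensed version I have assumed for brevity (returning the comparison directly rather than via `if`/`else`), and the second name and age are invented.

```python
def tell_me_if_im_old(name, age):
    print("...checking if", name, "is old...")
    # The comparison itself is already True or False, so we can return it
    return age >= 100

# Works with an int...
answer_for_int = tell_me_if_im_old("Phil", 34)

# ...and equally with a float, as a float can also be compared to 100
answer_for_float = tell_me_if_im_old("Methuselah", 969.0)

print(answer_for_int, answer_for_float)
```

Neither call needed the function to be told what type `age` would be.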

There are other types of function - those on objects. Rather than cramming an introduction to object-oriented programming (OOP) now, especially as a number of you have done Computer Science, I have added an Appendix to these notes about my non-existent dogs, "Freddie" and "Nitwit". We will spend some time on that tomorrow.

For the moment though, what matters is that you can add a dot to most things in Python and access a number of properties of that variable or things it can do. A couple of examples to make it clearer:

In [None]:
"just a normal string".upper()

In [None]:
if "another string".islower():
    print("The string was lowercase")

# Lists and dicts
## Conveyor belts and storage units

As in any language, we often need to be able to group things in Python.

We can create a `list`, which is like a one-dimensional C or FORTRAN array, as follows (don't forget to execute it!):

In [None]:
things_to_do = ['Learn Python', 'Finish PhD', 'Publish research',
                'Accept Nobel prize', 'Inspire a new generation']

The syntax here is the same as for the more basic types we saw so far, `variable = value`, and our list is a comma-separated series of *things*, bounded by brackets `[]`.

Once you execute the cell above, you should be able to try this:

In [None]:
sorted(things_to_do)

...and get an alphabetical list of tasks. Note that `sorted` doesn't update the original object, so if you re-run without `sorted`, it's back to the previous order.

To change the original list, you use a slightly different technique (the list's own `sort` method) *or* you just assign: `things_to_do = sorted(things_to_do)`.

`sorted` guesses the ordering you want, based on the type of data. It doesn't have to be strings - numbers work too - but in Python3 the items must be comparable with each other, so a mixture of strings and integers, say, gives a `TypeError` unless you tell `sorted` how to compare them. `sorted` can take optional arguments, letting you force it to do things the way you want, or even provide your own ordering function.
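
As a sketch of those optional arguments, the example below uses the `key` and `reverse` arguments of `sorted`. Using `len` as the ordering function is just my choice for illustration.

```python
things_to_do = ['Learn Python', 'Finish PhD', 'Publish research',
                'Accept Nobel prize', 'Inspire a new generation']

# key= takes a function; items are ordered by what it returns for each one
by_length = sorted(things_to_do, key=len)

# reverse=True flips the usual (alphabetical) order
reverse_alphabetical = sorted(things_to_do, reverse=True)

print(by_length)
print(reverse_alphabetical)

# The original list is untouched by either call
print(things_to_do)
```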

# So how do we get things out of a list?

Lists are ordered, so you can say, "I'd like the fifth item, please". You put it in brackets after the list name:

In [None]:
things_to_do[4]

However, note that...

## PYTHON LISTS COUNT FROM 0
(PYTHON IS ZERO-INDEXED)

Like C, but unlike MATLAB or FORTRAN. The first item in a list is "`things_to_do[0]`"

By the way, did you notice that `In[]` is a list?

In [None]:
type(In)

You can try `len` as well!

# Slicing

## Sounds cool, is cool

Suppose I want a bit of a list, like the middle 3 items, or the first 2...

In [None]:
print(things_to_do)
things_to_do[1:4]

...if you wonder why `things_to_do[4]` (the fifth item) didn't appear, note that that number after the colon means 'up-to-but-not-including'...

In [None]:
things_to_do[:2]

In [None]:
things_to_do[3:]

We can use negative numbers to count back from the end instead...

In [None]:
things_to_do[-2:]

Which is "start two from the end, and give me everything from there onwards"
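
To pull the indexing and slicing rules together, here is a short sketch on the same list. The variable names on the left are only there so we can print the results.

```python
things_to_do = ['Learn Python', 'Finish PhD', 'Publish research',
                'Accept Nobel prize', 'Inspire a new generation']

# A single negative index counts back from the end
last_item = things_to_do[-1]

# Start at index 1, stop before index 4 - so three items
middle_three = things_to_do[1:4]

# Because the end is excluded, [:2] and [2:] never overlap or leave a gap
rejoined = things_to_do[:2] + things_to_do[2:]

print(last_item)
print(middle_three)
print(rejoined == things_to_do)
```

That last line is one reason for the 'up-to-but-not-including' rule: you can split a list at any point and glue it back together without thinking about off-by-one errors.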

# Dictionaries

This is fine - a list is like a conveyer belt, with a load of arbitrarily filled boxes passing by, one after the other. However, sometimes we care which item is which, and want to have a name for it, so we can drop in and grab it wherever it may be. That's where dictionaries (= dicts) come in. Try running this:

In [None]:
my_meetup_dot_com_profile = {
    "first name": "Ignatius",
    "favourite number": 9,
    "favourite programming language": "FORTRAN66",
    3: "is the magic number"
}

The syntax is slightly different. Instead of brackets, we use braces `{}`.

Now each element has a name, followed by a colon and the element itself, which can still be basically anything. Like storage units, we now have a group of things we can address. That address could be a number, a string, or any other basic type. And keys can be different types - check the last entry.

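This is *completely unsorted* - the entries are never rearranged by key (from Python 3.7 on, a dict simply remembers the order you added them in). However, you can get any element back by its name (key):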

In [None]:
print("My favourite number is")
my_meetup_dot_com_profile['favourite Numbr']

Hmm... I'll let you fix that...

And even though it is unsorted, it can have integers or so forth as keys - note the very last entry in the definition...

In [None]:
my_meetup_dot_com_profile[3]

...but not...

In [None]:
my_meetup_dot_com_profile[2]

Because it doesn't exist.
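
If you would rather check than get an error, here is a minimal sketch of two ways to handle a key that might be missing: the `in` operator and the dict's `get` method. The shortened `profile` dict and the fallback text are my own, for illustration.

```python
profile = {
    "first name": "Ignatius",
    "favourite number": 9,
    3: "is the magic number"
}

# `in` asks whether a key exists, without trying to fetch it
has_three = 3 in profile
has_two = 2 in profile
print(has_three, has_two)

# get() returns a fallback of your choice instead of raising an error
fallback = profile.get(2, "no such key")
print(fallback)
```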

# Extending lists and dicts

Extending lists is a little unwieldy, but dicts are more intuitive:

In [None]:
things_to_do.append("Find a nice retirement village in the Galapagos Islands")

my_meetup_dot_com_profile["Interests"] = ["Python2", "Python3", "Scientific Python", "Pottery"]

print("TODO:", things_to_do, "\n\nMEETUP:", my_meetup_dot_com_profile)

Here, we added an `Interests` key to the `my_meetup_dot_com_profile` dictionary.

Note that anything can be an *item* in a dictionary or list - in this case, a list is an item in our dict, whose key is `Interests`. Also note that, as with other languages, `\n` signifies a newline.
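
For adding several things at once, here is a sketch using the list method `extend` and the dict method `update`, alongside the one-at-a-time versions above. The short lists and the `"city"` entry are invented for this example.

```python
todo = ['Learn Python', 'Finish PhD']

# append adds exactly one item to the end
todo.append('Publish research')

# extend adds every item from another list
todo.extend(['Accept Nobel prize', 'Retire'])

profile = {"first name": "Ignatius"}

# One key at a time, by assignment
profile["Interests"] = ["Pottery"]

# Several keys at once, from another dict
profile.update({"favourite number": 9, "city": "Belfast"})

print(todo)
print(profile)
```

A classic slip is `todo.append([...])` with a list, which adds the whole list as a single nested item rather than its contents.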

# Join

## The most useful method in the str object

Different languages have different ways of collapsing a list of strings into one string with a separator - it is something you will need to do again and again. In Python, the approach is a little surprising - we have already seen that strings are objects (of type `str`), but it turns out Python uses this to provide a way of joining iterables (like lists or dicts):

In [None]:
', '.join(things_to_do)

This way around, all the string needs to do is add itself before each item in the iterable (except the first) - as long as it can keep getting items, it doesn't care what the iterable is, a dict or anything (iterable == thing you iterate/step through).

Can you see the benefit of defining this method as part of `str`, being able to pass it any possible iterable to join up, instead of having to define a method on every type of iterable and passing it a string?
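
To make that concrete, here is a sketch passing four different kinds of iterable to the same `join` method. The example values are made up, and note that every item handed to `join` must itself be a string.

```python
# A list
from_list = ', '.join(['one', 'two', 'kumquat'])

# A tuple (like a list, but written with parens)
from_tuple = ' -> '.join(('start', 'middle', 'end'))

# A string is an iterable of characters
from_string = '.'.join('YMCA')

# Iterating over a dict gives you its keys
from_dict = ' | '.join({'first name': 'Ignatius', 'language': 'FORTRAN66'})

print(from_list)
print(from_tuple)
print(from_string)
print(from_dict)
```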

# Quick aside

## Save early, save often

Jupyter does do some automatic saving, but click the disk icon at the top left now to force an update...

# ReDebugging

## Bringing it together

Here's another one to see if you can fix - try running and then spot the issue:

In [None]:
y = 120
for z in range(5):
    y = y / z
    print("When z is", z, "then y is", y)

Output should be:
```
When z is 1 then y is 120.0
When z is 2 then y is 60.0
When z is 3 then y is 20.0
When z is 4 then y is 5.0
```

Couple of bits of useful information:

* `for` is, unsurprisingly, a `for` loop, as in other languages
* `for` doesn't have limit arguments, like in C, say - it takes a set/list/(anything iterable) and goes through each element
* `range` just gives you a sequence of integers
* `range` can take one argument or two arguments (or more, but not so relevant now)
* `range` documentation is here : [Python3 range syntax](https://docs.python.org/3/library/functions.html#func-range) (link opens in new window, so click it!)

The easiest way, IMO, to find info on any Python function or library is to Google "Python3 funcname" and click the first python.org link. If something seems wrong, make sure you are looking at the Python3, not Python2 docs (or v.v.)

If it isn't there already, when you solve this (i.e. get the output above), please make a note in Etherpad. If you are still solving the puzzle - don't look at Etherpad unless you want the answer.

Note that the `in` operator can separately be used to return a boolean: e.g. "`list_item in alist`", "`key in adict`" can be used for `if` clauses

To see what range actually returns, you can run:

In [None]:
type(range(5))

Well, that wasn't very descriptive. It turns out the `range` function returns something with (a confusingly named) type `range` - this is what `for` gets handed. Don't worry about the ins-and-outs of that type just yet - what matters is: things of type `range` look like, act like and sound like a `list`.

As far as `for` is concerned, that's good enough - this is the [**duck test**](https://simple.wikipedia.org/wiki/Duck_typing), and is a key paradigm in Python - when writing code, don't require your input to be of a specific type `float`, or type `int` (for example), just complain if it doesn't *do* what you want (for example, it refuses to be added to something, or printed, by giving an error).

Here, `range` can iterate, like a list, which is all `for` wants - it doesn't care that it isn't _technically_ a list, as it can do what `for` needs. Similarly with, say, the `==` comparison operator - you can compare any objects and as long as they define their comparison behaviour, the operator itself doesn't care what they are.
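
Here is a minimal sketch of the duck test in our own code: a hypothetical function, `count_items`, that never checks what type it was given and only asks that `for` can step through it.

```python
def count_items(iterable):
    # No type checks - if `for` can walk through it, that's good enough
    count = 0
    for item in iterable:
        count += 1
    return count

print(count_items([10, 20, 30]))        # a list
print(count_items("YMCA"))              # a string
print(count_items(range(5)))            # a range
print(count_items({"a": 1, "b": 2}))    # a dict (steps through its keys)
```

Hand it something that cannot be iterated, like the number `5`, and Python will complain with a `TypeError` at the `for` line - which is exactly the "just complain if it doesn't do what you want" behaviour.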

To prove `range(5)` is what we expect it is, let's turn it into a list and see what happens...

In [None]:
list(range(5))

Essentially, `for` sees what we see - a sequence of five numbers.

Bear in mind that you can cast like this to various types. `str` will make something a string...

In [None]:
str(range(5))

In [None]:
3.14 + " is almost pi"

However, unlike some languages, Python does care about type, so you must concatenate strings with strings or add numerical types to numerical types. To check the type of any number or variable, you can use the built-in, `type`:

In [None]:
type(str(3.14))
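
So one way to make the failing line work is to convert explicitly, in whichever direction you need. A small sketch, with variable names of my own choosing:

```python
# Number -> string, so it can be concatenated with another string
message = str(3.14) + " is almost pi"
print(message)

# String -> number, so it can be used in arithmetic
total = int("3") + 4
print(total)

print(type(message), type(total))
```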

A string is, well, kind of like a list of characters, right? Certainly true in C and FORTRAN. So can "`for`" iterate over that too?

In [None]:
d = ""
for i in range(2):
    d += "Let me hear you say "
    j = 4
    for c in "YMCA":
        d += c + '.'
    d += "  "
print(d)

Yes, yes it can: check out the '`for c in "YMCA"`'. We also snuck in a nested loop - you just keep indenting each time you nest - no more complicated than that.

And, admittedly, a new operator has appeared: "`+=`". Familiar from most languages, `d += blah` is equivalent to `d = d + blah`. Note that we therefore have to have `d` defined before the first `+=`, as "`d = d + anything`" doesn't make sense if `d` does not already exist. This is the reason for our first line: "`d = ""`".

Finally, note that we have looped twice using `range`, just as above, but our loop variable `i` is never used - it is effectively a placeholder. All we want from that line is to have the body below run twice.

*Sidenote*: some people use an underscore "`_`" as a loop variable in this case, to indicate to someone reading that the variable is nothing more than that and never used - as far as Python is concerned, "`_`" is as good a name for a variable as any other, so this is entirely about readability.
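
For example, a sketch of the underscore convention, with a made-up chant:

```python
chant = ""
for _ in range(3):
    # We never look at _ ; we only want the body to run three times
    chant += "Hey! "
print(chant)
```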

Just to prove that "`for`" will iterate over any list, numeric, string or otherwise, we can try it with our todo list:

In [None]:
total_months = 0
for task in things_to_do:
    total_months += len(task)
    print(task, ":")
    print("  using Python, this task will take", len(task), "months to complete")
print("You cannot retire for at least", total_months // 12, "years")

Here we have added one or two minor surprises. One is the *horrendously* misleading use of *len* (hint: you are unlikely to get a Nobel Prize by 2020), which in reality calculates the number of items in any "`iterable`". Reminder: this is the general term for something sort-of-list-like. According to my script, tasks take as long as the number of items in their "list", i.e. characters in the string.

Another aspect is the double slash. This is an important difference between Python3 and Python2 (and many other languages). [Dividing non-divisible integers](https://en.wikipedia.org/wiki/Division_%28mathematics%29#Of_integers) with one slash, as normal, gives a float in Python3. If you want to get an integer (that is, the float answer rounded down to a whole number), you can use double-slash. Try removing the second slash and re-running to see the exact non-integer answer. In Python2, using one slash on two integers *always* gives an integer, unless you cast the bottom or top to a float (as in C, say).
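
A few one-liners to see the two kinds of division side by side in Python3 (the numbers are arbitrary):

```python
exact = 7 / 2              # one slash: always a float in Python3
floored = 7 // 2           # two slashes: rounded down to an integer
even_division = 6 / 2      # still a float, even though it divides exactly
negative_floored = -7 // 2 # "down" means towards minus infinity
remainder = 7 % 2          # % gives what is left over

print(exact, floored, even_division, negative_floored, remainder)
```

The negative case is the one that catches people out: `-7 // 2` is `-4`, not `-3`, because the result is always rounded down rather than towards zero.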

While we have looked at strings and lists, and one or two of their methods - functions within individual string and list objects - there are a few dictionary ones that are relevant to loops too.

In [None]:
for field in my_meetup_dot_com_profile.keys():
    print(field)

In [None]:
for value in my_meetup_dot_com_profile.values():
    print(value)

In [None]:
for pair in my_meetup_dot_com_profile.items():
    print(pair[0], "=", pair[1])
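
Each `pair` here is a two-item group of (key, value). As a sketch of a common shorthand, you can name both parts directly in the `for` line instead of indexing with `[0]` and `[1]`. The shortened `profile` dict and the `seen_keys` list are only for this illustration.

```python
profile = {
    "first name": "Ignatius",
    "favourite number": 9,
    3: "is the magic number"
}

seen_keys = []
for key, value in profile.items():
    # key and value are filled in from each pair, in turn
    print(key, "=", value)
    seen_keys.append(key)

print(seen_keys)
```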

# Exercise

...if we have time...

Use the cell below to create a function that takes a dictionary as an argument (like `name` and `age` were in an earlier example), and returns a new dictionary, with the same keys, but where each original item has been converted to a string.

Any items of a ``list`` type should become a single string with entries separated by commas and no brackets at either side (e.g. "one, two, kumquat, four").

Call your function with "**`my_meetup_dot_com_profile`**" as an argument and print each entry on a separate line like so:

    FAVOURITE PROGRAMMING LANGUAGE : FORTRAN66
    FIRST NAME : Ignatius
    ...

where each key is capitalized and item value printed beside it. The order of lines is not important.

Note that you can create a blank dictionary with "`= {}`".

Extension: if you get that done quickly, try making your function call itself on any (nested) items it finds of `dict` type, and turn the output into the string-value you need for your new dictionary. This is "recursion".

Set
```
   my_meetup_dot_com_profile['dogs'] = {"Freddie": 5, "Nitwit": "also a dog"}
```
and run your function on "**`my_meetup_dot_com_profile`**" again to test your improvement.

In [None]:
# Try the exercise in here

# Modules

## Gotta catch em all

## Introduction to modules

* Where Python gets its power

Wealth of modules providing everything from full GUI toolkits to astrophysics simulations. Mature and reliable numerics libraries, and an evolving ecosystem that grows day by day

* Some modules come bundled, but you can install many more

Python has tools so you can just specify the name and it will get it from the appropriate online repository. On Windows, this can be tricky, but a recent Python distribution, Anaconda, makes it easy, and includes thousands of packages out of the box. This seems your best bet if on Windows or Mac and I have USB sticks here for you to get going at the end of the day.

* Some are part of Python, lots are third-party

It is good to have an idea of which is which, as you can get (normally free) help, often at pretty short notice, by heading to the project's forums - remember, always be polite and, bear in mind, it isn't a commercial service, so an answer isn't guaranteed. For reference, when someone is really desperate they can offer a bounty for a solution or a bit of code on certain websites, not to mention people offering consultancy support, so that doesn't entirely mean there are no other options, but most of the time what's freely available is more than adequate.

## Using modules

First off, you *import* a module. This tells Python that it should hoke it out of its cave of treasures for upcoming use...

In [None]:
import math

This brings in a whole set of tools for dealing with basic mathematics. (so execute it!)

We reach into this with a dot...

In [None]:
math.pi

It has functions and constants...

In [None]:
math.sin(math.pi / 2)

So how do you find out what is available in `math`? Like before, Google `python3 math`... try it now:
https://www.google.ie/search?q=python3+math (this is just the Google search link)

For me, the closest version of Python was actually the second link (for Python 3.5) - normally, minor version differences are rarely a problem, but if what you see seems different to the manual, just Google for "`python 3.4 math`", or whatever. To check the current version that Jupyter is using, go to `Help->About`.

Another useful one is `os`...

In [None]:
import os
print(os.path.exists('/usr/bin/python'))

This is in fact using a submodule, `os.path` - we can reach in twice to get functions inside that.
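
A few more `os.path` functions, as a sketch. The path pieces (`results`, `run1`, `output.csv`) are hypothetical and do not need to exist on your machine for the first three calls to work.

```python
import os

# Build a path using the right separator for your operating system
data_path = os.path.join("results", "run1", "output.csv")
print(data_path)

# Pick the path apart again
file_name = os.path.basename(data_path)
extension = os.path.splitext(data_path)[1]
print(file_name, extension)

# Only this one actually looks at the disk
print(os.path.exists(data_path))
```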

Also `sys`...

In [None]:
import sys
print(sys.path)
# sys.exit(1)  # Exit with error code 1

The last command, commented out here, tells your script to exit with code 1. I don't really want to try that right now, as whatever happens won't be good.

## There are four key variants of importing

### You will see all four, so check back here

In [None]:
import math
print(math.e)

...we've seen that...

In [None]:
import math as m
print(m.e)

...in other words, give `math` an alias...

In [None]:
from math import e
print(e)

...which saves some typing if you're *sure* you don't want to use any other variable called `e`...

In [None]:
from math import *
print(e)

This is the most succinct as it just dumps everything in the `math` module into the global namespace, which saves much typing, but it can make debugging a headache, especially if there are some strange functions or variables in there that happen to have the same name as something else you later decide to use (this does happen from time to time)
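
To show the kind of name clash to watch for, here is a sketch where a variable of our own, also called `e`, gets quietly replaced by an import. The label text is invented; the same thing happens, for many names at once, with the `*` form.

```python
import math

e = "my experiment label"   # a hypothetical variable of our own
print(e)

from math import e          # silently replaces our e with Euler's number
print(e)

# Reaching in with a dot never clashes with anything we define
print(math.e)
```

No error, no warning - the old value of `e` is simply gone, which is why the more explicit variants are usually kinder to your future self.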

## We are near the end of the walk-through

Feel free to go back and forward through this. I am going to give you one final task to get you started on this topic...

Remember your line number from Etherpad? If not, it should be in variable `x`...

In [None]:
print("I am on line number", x)

But double-check Etherpad...

Remember the post-its? So, everybody stick your star post-it somewhere I can see it...

Your task is to write a Jupyter cell to give you the $x^{th}$ digit of $\pi$ and **write the digit at the start of that Etherpad line**, the one with your name on it. In particular, try and do this, just using `x`, without typing the number in directly. Then you can experiment to see if other numbers work too.

There are several ways to do this... I've given you enough tools above to find at least one. If, when you get it working, no-one has described your approach on Etherpad, write a short description at the bottom of the notepad.

When done, swap your star for an arrow.

In [None]:
# Overwrite this cell with your code and run it

So, hopefully, you now have a calculator for the $x^{th}$ digit of $\pi$. How can you use this without going back and editing the cell?

Well... copy and paste your text below the line in the cell below:

In [None]:
def get_xth_digit_of_pi(x):


Now indent everything beneath the `def` line by four spaces and run it. You have just defined a function! If so, this should give you a `5` (the digits run 3, 1, 4, 1, 5...):

In [None]:
get_xth_digit_of_pi(5)

If you had errors that don't make sense, try the chat window, try someone beside you (my normal approach in life) or ask me.

By the by, here are a couple of approaches - which is more Pythonic?
```
1.
    scaled_up = math.pi * (10 ** (x - 1))
    higher_digits = 10 * (scaled_up // 10)
    return int(scaled_up - higher_digits)

2.
    pi_as_str = str(math.pi)
    if x == 1:
        x = 0
    return pi_as_str[x]
```

# Pythonic
## Walks like a Python...

This is a fundamental principle of Python and, unusually for something so fundamental, it isn't part of the code. Python is a bit like Ultimate Frisbee - in case you haven't had the pleasure, referees are not required, even in international competitions. This is on the basis that, if you're playing the sport, then you've decided you're there with the ethos, and the players can sort out those decisions amongst themselves and get on with the game. If you're trying to take advantage of that, then the fundamental question is, why play Ultimate Frisbee?

Good Python practice frequently diverges from what is commended in other languages. Succinctness or efficiency is not necessarily good - clarity comes before cleverness. However, using a pattern from previous experience when Python has a neater, more beautiful, more *Pythonic* way - that's something to try to avoid. The upshot of all this is that, when you come to what the community considers "good code", it is simple, elegant and clear.

## Not-so-Pythonic way:

In [None]:
for i in range(len(things_to_do)):
    print(things_to_do[i])

## More Pythonic way:

In [None]:
for task in things_to_do:
    print(task)

Same output, but one feels more like pseudocode, more like how you would verbally convey the idea if somebody asked.
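
If you do need the position as well as the item, a sketch of the usual Pythonic answer is the built-in `enumerate`, rather than going back to `range(len(...))`. The shortened list and the `numbered` variable are just for illustration.

```python
things_to_do = ['Learn Python', 'Finish PhD', 'Publish research']

numbered = []
# enumerate hands back (position, item) pairs; start=1 counts from one
for number, task in enumerate(things_to_do, start=1):
    line = str(number) + ". " + task
    numbered.append(line)
    print(line)
```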

If you want to *get* what Python is about, try this (literally):

In [None]:
import this

This is one of the hardest things to get when switching to Python, particularly from compiled languages, but it is one of Python's greatest rewards. Nothing much beats opening up a script you hacked together two years ago to see zen-like transparency - no confusion, beautifully professional, and thanks to Python's design, no extra time spent writing it, just a feel for what Pythonic *is*.

# A couple of resources...

* If there is one link from this course to click: [What is Pythonic](http://blog.startifact.com/posts/older/what-is-pythonic.html)

* [PEP](https://www.python.org/dev/peps/pep-0008/) (Python Enhancement Proposal) 8 - code style

These are publicly accessible design documents and nearly everything new of importance passes through here. PEP8 is, in fact, a style guide; unlike many languages, there is more or less a definitive recommended style, and this spells it out. Since I discovered *flake8* and *pylint*, I have my IDE set to highlight style errors and it has reaped dividends to the moon and back - I strongly recommend you do so too, especially if we ever have to code together.

# What about all the other structures??

Well, they are not so hard once you get this far, but you'll come across them throughout today and tomorrow:

* Most Python entities are objects, with methods like `obj.method(arg1, arg2)` (like routines in modules)

* Loops can be `while [SOME BOOLEAN]:`, as well as `for`

* We will explore some actual modules after next

* List comprehensions (and their lazy cousins, generators) put loops inside lists

I'll let that boggle your mind for a moment. Our future sessions are not going to be so closely led, partially because it *cannot* be as interesting to experiment with my code as it is getting your hands dirty with your own. Now you've got plenty of tools at your disposal.
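
As a small taster of two of those structures, here is a sketch of a `while` loop and of a loop written inside a list. All the names are made up, and we will meet these properly later, so don't worry if it looks odd for now.

```python
# A while loop keeps going for as long as its condition is True
countdown = 3
steps = []
while countdown > 0:
    steps.append(countdown)
    countdown -= 1          # like +=, but subtracting
print(steps)

# A loop inside a list: build the whole list in one line
squares = [n * n for n in range(5)]
print(squares)

# The same idea in parens is lazy - values are made only when asked for
lazy_squares = (n * n for n in range(5))
total = sum(lazy_squares)
print(total)
```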

# Summary

* loops & conditionals

* lists & dictionaries

* modules & functions

* methods on objects & "Pythonic"

# Finally...
## After a whistle-stop tour, we pull into a Python siding...

* Please write one great thing about this morning on your arrow and leave it beside the door
* Please write one dire (or just not so good) thing about this morning on your star and leave it beside the door

Now, I'm going to do a quick tour of scripting applications of Python, a few ideas you might want to ask me about over break/lunch, to give you an idea of what Python can do, so sit back, relax and let me fiddle with projector technology at the front for a couple of minutes...

Tomorrow will involve a wider range of concepts. In particular, if you are unfamiliar with or rusty on object-oriented programming, please try out the concept intro in the Basic Control Structures II notebook before tomorrow ("PS Introduction to objects..." near the bottom)