# Notebook-1: Thinking like a computer

## _Or, why a Geographer should learn to program_

## What _is_ a computer?

At its most basic, a computer is a programmable device for performing calculations.

_This_ is a kind of computer.
![Abacus](img/notebook1/OldSchool.png)

As is _this_.
![Modern Computer](img/notebook1/Modern.png)

These calculations always happen in a particular order according to a set of _rules_: do 'A', then do 'B', then... These sets of rules are called algorithms. Finding the mean (average) of a set of numbers involves an algorithm. Calculating the probability that the lecturer won't show up to the first lecture also involves an algorithm; it's just a much more complicated one, unless you take matters into your own hands...

## Calculating the Mean

There are obviously many ways that you can calculate the mean: in your head, using pencil and paper, on a calculator, in Excel... and, of course, using code! For a small set of simple numbers, using your brain is going to be a lot faster than typing it into a calculator or computer.

_Quick_, what's the mean of: `1, 2, 3, 4`

In [2]:
# Now try doing this with code!
# Type in the formula for the mean on the empty line 
# below and then click the 'play' button above to see if
# you managed to run your first piece of Python code
# correctly!




What makes a computer potentially _better_ than a calculator (or your brain) is that a computer isn't daunted by having to count lots of numbers, nor does it have a problem with big numbers! Not only that, but the computer can also do things like: a) find out the amount of rain that fell in London, Manchester, and Edinburgh yesterday; b) work out the average rainfall for these three cities; and then c) work out the distance from the mean for each city. All in a matter of milliseconds!
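Steps (b) and (c) from the rainfall example can be sketched in a few lines of Python. The rainfall figures here are invented purely for illustration (step (a), fetching real measurements, is left out), and the variable names are our own choice:

```python
# Made-up rainfall totals in millimetres; these are NOT real measurements
rainfall = {'London': 4.0, 'Manchester': 7.0, 'Edinburgh': 10.0}

# (b) work out the average rainfall across the three cities
# float() makes sure the division keeps its decimal places on Python 2 as well
average = sum(rainfall.values()) / float(len(rainfall))

# (c) work out how far each city is from that average
differences = {}
for city in rainfall:
    differences[city] = rainfall[city] - average

print(average)
print(differences)
```

A negative difference means that the city was drier than average; a positive one means that it was wetter.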

Here's a good example of when computers start to get better and faster than brains:

In [4]:
(23495.23 + 9238832.657 + 2 + 12921)/4

2318812.72175
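Typing the formula out by hand works, but it has to be retyped for every new set of numbers. As a rough sketch of how the same algorithm could be written once and then reused, here is a small function; the names `mean` and `values` are our own choice for this illustration:

```python
def mean(numbers):
    """Return the arithmetic mean of a non-empty list of numbers."""
    # Step A: add all of the numbers together
    total = sum(numbers)
    # Step B: count how many numbers there are
    count = len(numbers)
    # Step C: divide the total by the count
    # (float() stops Python 2 from throwing away the remainder)
    return total / float(count)

# The same four numbers as in the cell above
values = [23495.23, 9238832.657, 2, 12921]
print(mean(values))
```

The function doesn't care whether the list holds four numbers or four million, which is exactly the kind of repetitive job that computers are good at.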

## Computers: good or bad?

What are computers good at?
- Doing the same thing over and over
- Doing _exactly_ what they are told to do

What are computers bad at?
- Generating knowledge
- Being creative

There is a long-standing challenge, called the Turing Test, that demonstrates this difference rather nicely. The point of the Turing Test is to have a computer fool a person into thinking that they're talking to another person. According to Turing, if a computer can manage this then we'll have to accept that machines can 'think' or, as we'd say now, that they have become AIs (Artificial Intelligences). Computers have been getting better and better at this recently, but they still have a hard time fooling anyone for very long.

In contrast, bigger and better computers have now beaten us at Chess and Go, and are being used to help us understand earthquakes and climate change on a huge scale. Here, computers can do billions -- or trillions -- of calculations a second to work out that if 'A' happens then 'B' is the next most likely thing to happen, and so on and so on.

#### More on the Turing Test
Turing, A. (1950), 'Computing Machinery and Intelligence', _Mind_ LIX (236): 433–460,
doi:10.1093/mind/LIX.236.433, ISSN 0026-4423


### Being a 'good' programmer

The best way to be a 'good' programmer is to know when the computer can help you and when it will get in the way. A computer cannot 'solve' a problem for you, but it _can_ help you to find the answer when you've told it what to look for. A computer can only do _exactly_ what you tell it to do, so if you don't know what to do then the computer won't either.

One of the founders of computing, [Charles Babbage](https://en.wikiquote.org/wiki/Charles_Babbage#Passages_from_the_Life_of_a_Philosopher_.281864.29), had this to say:

> On two occasions I have been asked, — "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" In one case a member of the Upper, and in the other a member of the Lower, House put this question. I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
> _Passages from the Life of a Philosopher (1864), ch. 5 "Difference Engine No. 1"_

Modern programmers call this: garbage in, garbage out. GIGO, for short.
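To see GIGO in action, here is a small sketch using invented figures: the computer happily calculates a mean from a list containing a typing mistake, because it has no way of knowing that `130` should have been `13`. The function and variable names are ours, for illustration only:

```python
def mean(numbers):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(numbers) / float(len(numbers))

correct_figures = [10, 12, 11, 13]
wrong_figures = [10, 12, 11, 130]  # the last value was mistyped

print(mean(correct_figures))  # 11.5
print(mean(wrong_figures))    # 40.75
```

Both answers are 'right' as far as the computer is concerned; only you can tell that the second one is garbage.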

**_The single most important thing that you can learn is: how to think abstractly about solving a problem in a way that you can communicate to a computer._** The rest of this course is really just about getting you started down that path. 

### The 3 virtues of a programmer

Another useful idea comes from [Larry Wall](https://en.wikipedia.org/wiki/Larry_Wall) (check out the 'tache!), who created a programming language called Perl. He said that programmers had three virtues: Laziness, Hubris, and Impatience. The reason that these are virtues in programming (but not in your university courses!) is as follows:

1. Laziness makes you want to put in the effort _now_ to reduce the amount of effort you'll have to put in _later_. So the effort that went into inventing the calculator in the first place (and in making it easy to use) meant that lots and lots of other people never had to do this stuff by hand or using slide rules.
2. Hubris makes you want to write code that other people won't want to "say bad things about". In the course we'll get into what makes 'good' code in more detail, but the short version is: it's efficient, it's easy to read, and it's clever.
3. Impatience is about wanting the answer _now_ rather than waiting for a lazy computer. If you find yourself waiting long periods of time for something that the computer should be able to do _quickly_ then you should be impatient. Chances are, you made a mistake in your code.

**_Hint: you'll also see a lot of laziness when you start trying to write code. Programmers don't like writing `remove` when they could just write `rm`, nor do they like writing `define` when they could just write `def`. Keep an eye out for these abbreviations as they can be pretty daunting at first._**
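A few of these abbreviations as they appear in Python, in a minimal sketch (the function name `double` and the variable `word` are made up for this example):

```python
# 'def' is short for 'define': this defines a function called double
def double(number):
    return number * 2

# 'len' is short for 'length' and 'str' is short for 'string' (i.e. text)
word = 'geography'
print(len(word))
print(str(double(21)))
```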

### The 3 false virtues

Larry also pointed out that these virtues had three mirror-image false virtues:

1. False laziness happens when you leave something working but half-finished and, most likely, about to break. When you start using [StackOverflow](http://www.stackoverflow.com/) you may find that it makes it easy to copy+paste answers into your homework and then you can glue it all together messily. This isn't the same as _understanding_ and _adapting_ the solution that you found online to _your_ problem, so it's false laziness.
2. False hubris is thinking that no one else's code is 'good enough' for you. Sometimes copy+paste is false laziness, but refusing to recognise when copy+paste _is_ the right thing to do is false hubris.
3. False impatience is getting started on solving a problem when you don't yet understand what the problem _is_. One thing that a lot of programmers do is half-listen to what someone has asked them to do and then go haring off without sitting down to make any kind of plan or structure. It's like writing an essay without having done the readings.

There's a lot more thinking on this here: http://blog.teamtreehouse.com/the-programmers-virtues

## The Benefits of Coding?

In a research context we think that the benefits of learning to code fall into three categories:
1. Flexibility -- the computer can often apply the same analytical process to completely different data sets (_e.g._ rainfall in the UK vs rainfall in the US) with minimal effort compared to trying to do each step in, say, Excel or SPSS. For students it comes down to this: if you discover a newer, better data set for your dissertation and want to use it for your analysis instead of the old, inaccurate data, it's a lot easier and faster to make the switch with code!
2. Reproducibility -- recently, it's been discovered that a lot of research cannot be reproduced; in other words, when one scientist tries to duplicate what someone else did in order to check it, the results often don't line up. For students the benefit is this: you've just finished your analysis when someone points out that you made a mistake with the data right back at the beginning; redoing all of that in Excel or SPSS would be a nightmare, but with code it can be as easy as changing one line and hitting 'Run'!
3. Scalability -- a computer doesn't care if you throw 10 lines or 10 billion lines at it; the only thing that changes is how long it takes to get an answer. In other words, if your code 'works' on a representative subset of your data it should also work on your entire data set, however big it is.
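As a rough illustration of flexibility and scalability, the sketch below applies one function, unchanged, to two small invented data sets and then to a million numbers. All of the names and figures are made up for this example:

```python
def mean(numbers):
    """Return the arithmetic mean of a non-empty collection of numbers."""
    return sum(numbers) / float(len(numbers))

# Flexibility: the same code works on completely different data sets
uk_rainfall = [2.0, 4.0, 6.0]        # invented values
us_rainfall = [1.0, 3.0, 5.0, 11.0]  # invented values
print(mean(uk_rainfall))  # 4.0
print(mean(us_rainfall))  # 5.0

# Scalability: nothing changes except the amount of data
one_million_numbers = range(1, 1000001)
print(mean(one_million_numbers))  # 500000.5
```

Reproducibility follows from the same idea: if one of the input values turns out to be wrong, you correct that one line and run everything again.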

## Computer Languages

In [Notebook 0](notebook-0.ipynb) we briefly mentioned that we'd be using the Python programming language. As with human languages, there are _many_ programming languages in the world, each with their own pluses and minuses, and each with their own vocabulary (allowed words) and grammar (syntax). We use Python.

#### Python what?

The points below are also made in [Python In A Nutshell](http://mbrochh.github.io/python-101/#/6/1) by [Martin Brochhaus](https://github.com/mbrochh).

Python was invented by [Guido van Rossum](https://en.wikipedia.org/wiki/Guido_van_Rossum) in the late 1980s and he continues in the role of 'benevolent dictator' to this day, which means that he (and some other very smart people) try to ensure that the language continues to meet the basic goals of:
* Being very easy to read (syntax)
* Using plain English for many functions and operators (allowed words)
* Having a comprehensive style guide: [PEP8](https://www.python.org/dev/peps/pep-0008/) (syntax)
* Having no unnecessary special formatting characters (syntax _and_ allowed words)
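To give a flavour of what 'plain English' means here, this short sketch uses the Python keywords `in`, `and` and `not`; the list of cities and the variable names are made up for the example:

```python
cities = ['London', 'Manchester', 'Edinburgh']

# This line reads almost like an English sentence
if 'London' in cities and 'Paris' not in cities:
    message = 'London is on the list, but Paris is not'
else:
    message = 'Something else is going on'

print(message)
```

Notice that there are no curly brackets or semi-colons: Python uses indentation (the spaces at the start of a line) to show which lines belong together.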

So while Python is not the fastest language (C and C++ are faster), nor is it the safest (you wouldn't use it to fly a rocket to Mars), it _is_ a very readable, learnable and maintainable language.

So if you want to learn to code, to do 'data science', or build a business, then use Python.

#### R

The other language that is often mentioned by people doing data-led research is [R](https://www.r-project.org). It's one that we use in a lot of our work. So why don't we teach it? Because it's an eccentric language that does a lot of things differently and it's not much used outside of academia. By teaching you Python we give you a platform from which you can learn _any_ other language (with enough effort!), whereas if we taught you R you would find this step a lot harder. And if we tried to teach you both, it would just be confusing.

We're not [the only people](http://www.dataschool.io/teaching-data-science/#includingtidydataandreproducibilityinthecurriculum) to think that.

### Python & Geography

* Programming and Geography (Geocomputation? Geographic Data Science?)
    - rationale 
    - cool use cases: geocomputation, spatial statistics in Python, R, Javascript, interactive webmaps, databases etc etc..

* The Python programming language
    - mention the fact we're using 2.7 NOT 3.XX
    - Why Python & Geography (or scientific computing in general..)
    - Using the console (with a Video?)





### References:

- a **must read** [the hard way is easier](http://learnpythonthehardway.org/book/intro.html)
- two easy and accessible videos to start wrapping your head around programming (although they are not Python-centric) [1](https://www.youtube.com/watch?v=qUVWM2Q4vAU) and [2](https://www.youtube.com/watch?v=AImF__7FyzM)
