# Notebook-1: Thinking like a computer

## _Or, why you should learn to program_

OK, so this first video is targeted at a slightly younger crowd than most of the people that we assume are following this introduction, but it sets out a lot of good reasons why _you_ should learn to program a computer. And it also makes a really important point about motivation:

[![Video: why you should learn to program](http://img.youtube.com/vi/pvAsqPbz9Ro/0.jpg)](http://www.youtube.com/watch?v=pvAsqPbz9Ro)

## What _is_ a computer?

At its most basic, a computer is a programmable device for performing calculations.

_This_ is a kind of computer.
<img src="img/notebook1/OldSchool.png" width="300">

As is _this_.
<img src="img/notebook1/Modern.png" width="300">

If you've never really got to grips with what is happening inside a computer, then this TED-Ed video would be a good way to get started because it helps to explain the basics of things like I/O and what actually happens when you click with the mouse on a button. In fact, you will see that we've used code to import the YouTube video in a way that requires us to do very little work, and this is one of the strengths of programming: someone else created the code to embed a YouTube video into an IPython notebook (which is what this web page is) and all we need to know is how to ask that code to find the video on the YouTube web site. Everything else happens automatically.
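As a rough sketch of what 'asking that code' can look like: IPython provides a helper called `YouTubeVideo` that only needs the video's ID. The ID below is copied from the video link in the next section; the `width` and `height` values and the fallback for when IPython isn't installed are our own assumptions, added so that the snippet also runs outside a notebook.

```python
# The ID is the part after "watch?v=" in the video's YouTube address.
video_id = "AkFi90lZmXA"

try:
    # IPython ships a small helper that builds the embedded player for us.
    from IPython.display import YouTubeVideo
    video = YouTubeVideo(video_id, width=560, height=315)
except ImportError:
    # Outside a notebook (no IPython installed) we fall back to a plain link.
    video = "http://www.youtube.com/watch?v=" + video_id

video  # in a notebook, the last line of a cell is displayed automatically
```

Everything about talking to YouTube is hidden inside the helper: we only supplied the ID.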

### What's Going on Inside Your Computer?

Let's find out through some videos – we've tried to pick ones that encompass a range of styles and levels, so we hope you'll find something that 'speaks to you' in here. If not, well, Google is your friend: we won't pretend to have all the answers, and by searching you might find something that is right at your level.

[![Video: what's going on inside your computer?](http://img.youtube.com/vi/AkFi90lZmXA/0.jpg)](http://www.youtube.com/watch?v=AkFi90lZmXA)

### How a Computer Adds Numbers

This next video is a little more technical and we don't really expect you to remember it, but it touches on a lot of really important concepts: binary numbers, Boolean logic, and how these basic building blocks are assembled into much more complex processes like adding numbers or, ultimately, manipulating data.

[![Video: how a computer adds numbers](http://img.youtube.com/vi/VBDoT8o4q00/0.jpg)](http://www.youtube.com/watch?v=VBDoT8o4q00)

The really important thing to get from this last video is that computers are chaining together long sets of simple operations which always basically work out to 1 or 0, which is the same as True or False. This is Boolean logic and we're going to be doing a lot more with it later in this set of sessions, but you should always keep in mind that a huge number of calculations are going on in your computer in an order specified by a set of _rules_: do 'A', then do 'B', then... When these rules become sufficiently complex they are called algorithms. And when they get that complex they are not easy to write down as a set of logical outputs; it's often easier to express them in a more human-readable form... which is why we have programming languages.
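To make the 'chaining simple operations' idea concrete, here is a minimal sketch of adding two numbers using nothing but Boolean operations on single bits. It is our own illustration rather than a transcription of the video, and the function names (`half_adder`, `full_adder`, `add_bits`) are ones we made up.

```python
def half_adder(a, b):
    """Add two single bits; return (sum_bit, carry_bit)."""
    sum_bit = a ^ b    # XOR: 1 when exactly one input is 1
    carry_bit = a & b  # AND: 1 only when both inputs are 1
    return sum_bit, carry_bit


def full_adder(a, b, carry_in):
    """Add two bits plus a carry from the previous column."""
    partial_sum, carry_1 = half_adder(a, b)
    sum_bit, carry_2 = half_adder(partial_sum, carry_in)
    return sum_bit, carry_1 | carry_2  # OR: a carry from either step


def add_bits(x_bits, y_bits):
    """Add two equal-length lists of bits, least significant bit first."""
    result, carry = [], 0
    for a, b in zip(x_bits, y_bits):
        sum_bit, carry = full_adder(a, b, carry)
        result.append(sum_bit)
    result.append(carry)  # the final carry becomes the top bit
    return result


# 3 is 011 in binary and 5 is 101; written least significant bit first:
print(add_bits([1, 1, 0], [1, 0, 1]))  # [0, 0, 0, 1], i.e. 1000 in binary = 8
```

Each step only ever produces a 1 or a 0, yet chaining enough of them together gives us ordinary addition.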

But remember: finding the average of a set of numbers involves an algorithm. And calculating the probability that the lecturer won't show up to the first lecture also involves an algorithm, it's just that it's a much more complicated one unless you take matters into your own hands and arrange for an accident...

## Calculating the Mean

There are obviously many ways that you can calculate the mean (also known as the _average_ if your maths is a little rusty): in your head, using pencil and paper, on a calculator, in Excel... and, of course, using code! For a small set of simple numbers, using your brain is going to be a lot faster than typing it into a calculator or computer.

_Quick_, what's the mean of: `1, 2, 3, 4`?

### Now try doing this with code!

In the area immediately below this sentence you should see something like "<span style="color:red">In [4]</span>". Next to this is an empty box into which you can type computer code. Do you remember how to calculate the mean using a set of numbers and a calculator? That's all we're doing now, it's just that we're doing it from a keyboard instead of a keypad. Type in the four numbers on the empty line below and then click the 'play' button (the sideways triangle with the bar) above to see if you managed to run your first piece of Python code correctly! If everything has gone well then you should see something like "<span style="color:red">Out [4]</span>" appear.

Did you get 2.5? Chances are that you got the number 2 instead, which isn't what you expected and probably not what you meant. 

This is something that we're going to come back to again and again: computers do exactly what you tell them to do, even if you tell them to do something silly.

### Silly is not stupid (learning languages is hard)

This is really important: just because you told the computer to do something 'silly' does _not_ mean that you are stupid. Did you ever try to learn a foreign language? Did you expect to be fluent after a few classes? Assuming that you had a realistic expectation of how far you'd get with French, Chinese, or English, then you probably figured it'd be a few years before you could even hold a conversation with someone else. So why would you expect to sit down at a computer and be able to hold a conversation with it (which is another way of thinking about what a program is) after reading a few pages of text and watching a YouTube video or two?

### Practice makes perfect

Your language class (assuming that you took one) probably had two parts: a 'lecture' where you were taught grammar, syntax, and words, and a 'lab' where you practiced. It's the same for programming: the reason you got a 'silly' answer is that we haven't taught you how to ask the right question yet! For a language like Python, `2` is not the same as `2.0`... can you now guess what you might need to change in the 'code block' above to get the right answer? Don't worry if you still can't get the right answer: how to 'talk numbers' is the main topic in the _next_ notebook.
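If you'd like to check your guess, here is a small sketch of the difference. It is written so that it gives the same results in Python 2 and Python 3 (in Python 3 a plain `/` already keeps the fraction); `numbers_sum` is just a name we made up.

```python
numbers_sum = 1 + 2 + 3 + 4

# Whole-number ('floor') division throws away the remainder.
# In Python 2, a plain `/` between two whole numbers behaves like this too.
print(numbers_sum // 4)   # 2

# Making one of the numbers a decimal (a 'float') keeps the fraction.
print(numbers_sum / 4.0)  # 2.5
```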

So, we want you to remember that there are no stupid questions when it comes to programming. We have _all_ been there at one point or another. Many of us still ask for help, just on harder questions. And that's only because we have had a lot more practice in the language of programming. So the _only_ stupid thing you can do in this course is to assume that you can skip the 'lab classes' and don't have to practice what you're learning. There are web sites that will give you the answer (in fact, we're going to point you to some of them) but if you don't expend any effort in trying to understand it, or if you just copy the answer off your friend, that's the same as assuming you'll learn a language just because your friend is taking the same language course! _That_ is silly.

### When computers beat brains (or calculators)

What makes a computer potentially _better_ than a calculator (or your brain) is that a computer isn't daunted by having to count lots of numbers and it doesn't need you to input each number individually! The computer can also do things like: a) find out the amount of rain that fell in London, Manchester, and Edinburgh yesterday from an online weather service; b) work out the average rainfall for these three cities; and then c) work out the standard deviation. All in a matter of milliseconds! Or, it can do the same for 3,000 cities: it'll take a little bit longer, but it's the _same basic code_. 

So code is scalable in a way that brains and calculators are not and that is a crucial difference.
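Steps (b) and (c) above can be sketched in a few lines. The rainfall figures here are made up purely for illustration (a real program would fetch them from a weather service in step (a), which we skip), and we've assumed the 'population' form of the standard deviation.

```python
import math

# Made-up rainfall totals in millimetres, for illustration only;
# a real program would fetch these from an online weather service.
rainfall_mm = {"London": 2.0, "Manchester": 4.0, "Edinburgh": 6.0}

values = list(rainfall_mm.values())
count = float(len(values))

# (b) the average rainfall across the cities
mean = sum(values) / count

# (c) population standard deviation: the square root of the average
# squared distance from the mean
variance = sum((v - mean) ** 2 for v in values) / count
std_dev = math.sqrt(variance)

print(mean)
print(std_dev)
```

Nothing in the calculation mentions the number three: put 3,000 cities in `rainfall_mm` and the rest of the code stays exactly the same.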

Here's a good example of when computers start to get better and faster than brains:

In [4]:
(23495.23 + 9238832.657 + 2 + 12921)/4

2318812.72175

## Computers: good or bad?

What are computers good at?
- Doing the same thing over and over
- Doing _exactly_ what they are told to do

What are computers bad at?
- Generating knowledge
- Being creative

There is a long-standing contest, called the Turing Test, that demonstrates this difference rather nicely. The point of the Turing Test is to have a computer fool a person into thinking that they're talking to another person. Basically, according to Turing, if a computer can do this then we'll have to declare that machines have become AIs (Artificial Intelligences). Computers still have a hard time fooling anyone for very long.

In contrast, bigger and better computers have now beaten us at Chess and Go, and are being used to help us understand earthquakes and climate change on a huge scale. Here, computers can do billions -- or trillions -- of calculations a second to work out that if 'A' happens then 'B' is the next most likely thing to happen, and so on and so on.

The difference is that games like Go and Chess have well-understood rules, as do processes like climate change and (sort of) earthquakes. We may not know all of the rules yet, and even simple games can produce trillions of possible 'next moves', but people don't have the same rules. Yes, conversations have norms (unless you're in an online comment forum you don't normally start a conversation by asking if someone is an idiot) but people don't just play 'games' according to those rules, they actually play with the rules themselves in a way that a computer finds very, very hard to follow.

That's why AI has been twenty years away for the last sixty years. Computers have been getting better and better at doing really difficult things recently, but it's usually in an area where we understand the rules and we normally need to spend a lot of time training the computer. 

#### More on the Turing Test
Turing, A. (1950), 'Computing Machinery and Intelligence', _Mind_ LIX (236): 433–460,
doi:10.1093/mind/LIX.236.433, ISSN 0026-4423


### Being a 'good' programmer

The best way to be a 'good' programmer is to know when the computer can help you and when it will get in the way. A computer cannot 'solve' a problem for you, but it _can_ help you to find the answer when you've told it what to look for and what rules to use in that search. A computer can only do _exactly_ what you tell it to do, so if you don't know what to do then the computer won't either.

One of the founders of computing, [Charles Babbage](https://en.wikiquote.org/wiki/Charles_Babbage#Passages_from_the_Life_of_a_Philosopher_.281864.29), had this to say:

> On two occasions I have been asked, — "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" In one case a member of the Upper, and in the other a member of the Lower, House put this question. I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
> _Passages from the Life of a Philosopher (1864), ch. 5 "Difference Engine No. 1"_

Modern programmers call this: garbage in, garbage out. GIGO, for short.

**_The single most important thing that you can learn is: how to think abstractly about solving a problem in a way that you can communicate to a computer._** What we mean is this: the real power of the computer isn't figuring out how to add `1, 2, 3, 4` together and calculate the mean, it's figuring out how to add _any possible set of numbers_ together and to work out the mean. That's what we mean about abstraction: it's not solving the problem _once_, it's solving any set of related problems as well. The rest of this course is really just about getting you started down that path. And remember, you're not stupid if you don't know how to explain to the computer how it can help you to find this answer as you're still learning the basics of how to communicate with it!
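As a minimal sketch of what that abstraction looks like in practice, here is one way to write down 'the mean of _any_ set of numbers' just once. The function name `mean` and the decision to reject an empty list are our own choices for illustration; you'll meet the pieces used here (`def`, `sum`, `len`) properly later on.

```python
def mean(numbers):
    """Return the mean of any non-empty collection of numbers."""
    numbers = list(numbers)
    if not numbers:
        raise ValueError("cannot take the mean of an empty set of numbers")
    # float() keeps the fractional part, whichever Python version runs this.
    return sum(numbers) / float(len(numbers))


print(mean([1, 2, 3, 4]))                       # 2.5
print(mean([23495.23, 9238832.657, 2, 12921]))  # the 'hard' numbers from earlier
print(mean(range(1, 101)))                      # 50.5
```

The same three lines of logic handle four numbers or a hundred: the problem has been solved once, for every list we might care about.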

### The 3 virtues of a programmer

Another useful idea comes from [Larry Wall](https://en.wikipedia.org/wiki/Larry_Wall) (check out the 'tache!), who created a programming language called Perl. He said that programmers had three virtues: Laziness, Hubris, and Impatience. 

<img src="http://cdn.quotationof.com/images/larry-wall-1.jpg" width="250">

The reason that these are virtues in programming (but not in your university courses!) is as follows:

1. Laziness makes you want to put in the effort _now_ to reduce the amount of effort you'll have to put in _later_. So the effort that went into inventing the calculator in the first place (and in making it easy to use) meant that lots and lots of other people never had to do this stuff by hand or using slide rules.
2. Hubris makes you want to write code that other people won't want to "say bad things about". In the course we'll get into what makes 'good' code in more detail, but the short version is: it's efficient, it's easy to read, and it's clever.
3. Impatience is about wanting the answer _now_ rather than waiting for a lazy computer. If you find yourself waiting long periods of time for something that the computer should be able to do _quickly_ then you should be impatient. Chances are, you made a mistake in your code.

**_Hint: you'll also see a lot of laziness when you start trying to write code. Programmers don't like writing `remove` when they could just write `rm`, nor do they like writing `define` when they could just write `def`. Keep an eye out for these abbreviations as they can be pretty daunting at first._**

### The 3 false virtues

Larry also pointed out that these virtues had three mirror-image false virtues:

1. False laziness happens when you leave something working but half-finished and, most likely, about to break. When you start using [StackOverflow](http://www.stackoverflow.com/) you may find that it makes it easy to copy+paste answers into your homework and then you can glue it all together messily. This isn't the same as _understanding_ and _adapting_ the solution that you found online to _your_ problem, so it's false laziness.
2. False hubris is thinking that no one else's code is 'good enough' for you. Sometimes copy+paste is false laziness, but refusing to recognise when copy+paste _is_ the right thing to do is false hubris.
3. False impatience is getting started on solving a problem when you don't yet understand what the problem _is_. One thing that a lot of programmers do is half-listen to what someone has asked them to do and then go haring off without sitting down to make any kind of plan or structure. It's like writing an essay without having done the readings.

There's a lot more thinking on this here: http://blog.teamtreehouse.com/the-programmers-virtues

## The Benefits of Coding?

In a research context we think that the benefits of learning to code fall into three categories:
1. Flexibility -- the computer can often apply the same analytical process to completely different data sets (_e.g._ rainfall in the UK vs rainfall in the US) with minimal effort compared to trying to do each step in, say, Excel or SPSS. For students it comes down to this: let's say you discover a newer, better data set for your dissertation and want to use this for your analysis instead of the old, inaccurate data. It's a lot easier and faster to do this with code!
2. Reproducibility -- recently, it's been discovered that a lot of research cannot be reproduced; in other words, if one scientist tries to duplicate what someone else did in order to check something out they're finding that the results don't line up. For students the benefit is this: you've just finished your analysis when someone points out that you made a mistake with the data right back at the beginning; redoing all of that in Excel or SPSS would be a nightmare, but with code it can be as easy as changing one line and hitting 'Run'!
3. Scalability -- a computer doesn't care if you throw 10 lines or 10 billion lines at it; the only thing that changes is how long it takes to get an answer. In other words, if your code 'works' on a representative subset of your data it should also work on your entire data set no matter how big it is.
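The three benefits above can be sketched with a tiny, made-up 'analysis'. Everything here is hypothetical: the data sets, the use of `None` to mark a missing reading, and the function names `clean` and `analyse`.

```python
def clean(readings):
    """Drop missing readings (recorded here as None)."""
    return [r for r in readings if r is not None]


def analyse(readings):
    """Return the smallest, largest and mean reading."""
    cleaned = clean(readings)
    return min(cleaned), max(cleaned), sum(cleaned) / float(len(cleaned))


# Hypothetical data sets: swapping one for another is a one-line change.
uk_rainfall = [2.0, None, 4.0, 6.0]
us_rainfall = [0.0, 10.0, None, 5.0, 1.0]

for name, data in [("UK", uk_rainfall), ("US", us_rainfall)]:
    print("%s: %s" % (name, analyse(data)))
```

If it turned out that missing readings should be handled differently, we would change `clean` once and re-run, and every result that depends on it would be regenerated.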

## Computer Languages

In [Notebook 0](notebook-0.ipynb) we briefly mentioned that we'd be using the Python programming language. As with human languages, there are _many_ programming languages in the world, each with their own plusses and minuses, and each with their own vocabulary (allowed words) and grammar (syntax). We use Python.

### Python what?

You can find the points below also made in [Python In A Nutshell](http://mbrochh.github.io/python-101/#/6/1) by [Martin Brochhaus](https://github.com/mbrochh)

Python was invented by [Guido van Rossum](https://en.wikipedia.org/wiki/Guido_van_Rossum) in the late 1980s and he continues in the role of 'benevolent dictator' to this day, which means that he (and some other very smart people) try to ensure that the language continues to meet the basic goals of:
* Being very easy to read (syntax)
* Using plain-English for many functions and operators (allowed words)
* Having a comprehensive style guide: [PEP8](https://www.python.org/dev/peps/pep-0008/) (syntax)
* Having no unnecessary special formatting characters (syntax _and_ allowed words)

So while Python is not the fastest language (C and C++ are faster), nor is it the safest (you wouldn't use it to fly a rocket to Mars), it _is_ a very readable, learnable and maintainable language.
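To give a flavour of what 'plain English' means here, this is a small sketch using a few of Python's word-like operators. The city list and the rainfall figure are made up for illustration.

```python
cities = ["London", "Manchester", "Edinburgh"]
city = "Manchester"
rainfall_mm = 12.5  # a made-up value, for illustration only

# `in`, `and` and `not` are ordinary English words rather than symbols.
if city in cities and not rainfall_mm < 10:
    message = city + " had a wet day"
else:
    message = city + " stayed fairly dry (or isn't on our list)"

print(message)  # Manchester had a wet day
```

Even without knowing any Python yet, you can probably read the `if` line aloud and guess what it does.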

So if you want to learn to code, to do 'data science', or build a business, then use Python.

#### Three takes on Python

Here are three videos pitched in quite different ways at the plusses of Python, all of which touch on issues we'll be dealing with later... so watch the videos (even if they're a bit silly in places)!

[![IMAGE ALT TEXT HERE](http://img.youtube.com/vi/aXKVOLwpDg8/0.jpg)](http://www.youtube.com/watch?v=aXKVOLwpDg8)

[![IMAGE ALT TEXT HERE](http://img.youtube.com/vi/Hn4FbT4wMms/0.jpg)](http://www.youtube.com/watch?v=Hn4FbT4wMms)

[![IMAGE ALT TEXT HERE](http://img.youtube.com/vi/G8brQdClo9s/0.jpg)](http://www.youtube.com/watch?v=G8brQdClo9s)

### R

The other language that is often mentioned by people doing data-led research is [R](https://www.r-project.org). It's one that we use in a lot of our work. So why don't we teach it? Because it's an eccentric language that does a lot of things differently and it's not much used outside of academia. By teaching you Python we give you a platform from which you can learn _any_ other language (with enough effort!), whereas if we taught you R you would find this step a lot harder. And if we tried to teach you both, it would just be confusing.

We're not [the only people](http://www.dataschool.io/teaching-data-science/#includingtidydataandreproducibilityinthecurriculum) to think that.

## Programming and Geography 

One of the big things that is changing with the emergence of what is being called 'computational social science', and of environmental modelling with computers, is the importance of programming to geography, regardless of whether you consider yourself a human or physical geographer. We are moving away from 'push button' spatial analysis or, rather, a big gap is opening up between the stuff that can be done by pushing buttons (which no longer even really requires geographical training) and the 'cutting edge'. We'll be pointing you to several pieces that argue this case, but here is one to start with:

* [GIS Jobs of Today](http://www.directionsmag.com/entry/gis-jobs-of-today-should-you-have-programming-skills/473296): should you have programming skills?

### Python & Geography

Python and R can both help us to undertake geographical analysis. That is, in fact, the premise of this entire course! But of the two languages, Python is the one most-used as part of a _geographical workflow_ – what we mean by this is that you can find Python buried inside of ESRI's ArcGIS and the open-source QGIS applications, and it also sits behind (or talks to) a number of other tools that allow us to work flexibly and scalably with geo-data.

Right now, we are using Python version 2.7, although you can also download a version 3. We're sticking with the older version for the time being because there are more (geo) tools available to us in Python 2 than Python 3. This is changing quickly, but for now it's the safest decision. So if at any point you are asked to choose between Python versions, make sure you pick 2.7 (it is the last release in the 2.x series: there will not be a 2.8).


### References:

- a **must** read: [the hard way is easier](http://learnpythonthehardway.org/book/intro.html)
- two easy and accessible videos to start wrapping your head around programming (although they are not Python-centric) [1](https://www.youtube.com/watch?v=qUVWM2Q4vAU) and [2](https://www.youtube.com/watch?v=AImF__7FyzM)