# Programming and Geography 

- Contributors: Jon Reades (jon@reades.com); James Millington (jamesdamillington@gmail.com)


## The Fall and Rise of Geocomputation

We live in a world transformed by big (geo)data: from Facebook likes and satellites, to travel cards and drones, the process of collecting and analysing data about the world around us is becoming very, very cheap. Twenty years ago, gathering data about the human and physical environment was expensive, but now a lot of it is generated as the ‘exhaust’ of day-to-day activity: tapping on to the bus or train, taking photos (whether from a satellite, drone, or disposable camera), making phone calls, using our credit cards, and surfing the web. And that's before you start looking at the terabytes of data being generated by air quality and river flow sensors, and other Earth Observation Systems!

As the costs of capturing, curating, and processing these data sets fall, the discipline of geography is changing. You face a world in which many of the defining career options for geographers with basic quantitative skills will either no longer exist, or will have been seriously de-skilled. So much can now be done through a web browser (e.g. [CartoDB](https://carto.com)) that specifying ‘Knowledge of ArcGIS’ is becoming superfluous; not because geo-analysis jobs are no longer in demand or no longer done -- in fact, they are more vital than ever -- but because the market for these skills has split in two: expensive, specialist software is being superseded by simple, non-specialist web-based tools on the ‘basic’ side, and by customised code on the ‘advanced’ side.

It is for these reasons that terms like 'geocomputation', 'computational geography' and 'geographic data science' are back in vogue. After a period in which GIS was front-and-center for many geographers with an interest in spatial data (as well as for many geographers who objected to the shortcomings of quantitative approaches), the availability of data and advanced computational techniques (including Machine Learning), together with the 'discovery' by other disciplines of the role of geography in 'big data' processes, has created a need for a 'new' (or old, depending on your view) type of geographer: one able to reason much more directly _through_ code while remaining rooted in the critical geographic tradition that is aware of the shortcomings (and opportunities) of data.

## What is Geocomputation?

Computational approaches -- which is to say, approaches to geography using computers executing commands written in programming code -- differ in important ways from the quantitative skills taught in traditional geography ‘methods’ classes: computational geography is underpinned by algorithms that employ concepts such as _iteration_ and _recursion_, and we use these to tackle everything from a data processing problem to an entire research question.

<img src="https://data.cdrc.ac.uk/uploads/group/2015-08-13-085820.634829productdefault.png" alt="Open Atlas Logo" width="350px"/>

For example, Alex Singleton’s OpenAtlas (available for free from the [Consumer Data Research Centre](https://data.cdrc.ac.uk/product/cdrc-2011-census-open-atlas)) contains 134,567 maps. Alex designed and wrote a script to _iterate_ over the Census areas (i.e. to ‘visit’ each area in turn when creating a map), and to _recurse_ into smaller sub-regions from larger regions (i.e. to keep drilling down into smaller and smaller geographies) in order to generate maps at, literally, every conceivable scale. Then he let the computer do the ‘boring bit’ of actually creating each and every map. 
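To make _iteration_ and _recursion_ a little more concrete, here is a minimal Python sketch. It is not Alex's actual script: the nested `regions` dictionary, the `make_map` stand-in (which only returns a label rather than drawing anything), and the function names are all hypothetical and exist purely for illustration.

```python
# Hypothetical nested geography: each region has a name and zero or more sub-regions.
regions = {
    "name": "Country",
    "children": [
        {"name": "Region A", "children": [
            {"name": "District A1", "children": []},
            {"name": "District A2", "children": []},
        ]},
        {"name": "Region B", "children": [
            {"name": "District B1", "children": []},
        ]},
    ],
}


def make_map(name, depth):
    """Stand-in for a real map-drawing step: just return an indented label."""
    return f"{'  ' * depth}Map of {name}"


def make_all_maps(region, depth=0):
    """Make a map for this region, then do the same for every sub-region."""
    maps = [make_map(region["name"], depth)]
    for child in region["children"]:                   # iteration: visit each sub-region in turn
        maps.extend(make_all_maps(child, depth + 1))   # recursion: drill down a level
    return maps


all_maps = make_all_maps(regions)
for label in all_maps:
    print(label)
```

The same two functions would work unchanged on a hierarchy with six regions or six thousand, which is the point: the rules are written once and the computer does the repetition.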

### Thinking Algorithmically

Thinking _algorithmically_ requires students and professionals to deal with abstraction: we don’t want to define how each analysis should work – or how each map should look – rather, we want to specify a set of rules about how to select and display data on a map, and then let the computer make them all for us. In this way of working it’s not really any more work to create 500 or 5,000 maps than it is to create 5 because we’ve already told the computer how to make useful maps. 

Here's another way to think about it:

> _An algorithm is like a recipe. It takes "inputs" (the ingredients), performs a set of simple and (hopefully) well-defined steps, and then terminates after producing an "output" (the meal)._

The article this quote comes from goes on to make some interesting points about AI and deep learning that are [well worth a read](https://medium.com/@geomblog/when-an-algorithm-isn-t-2b9fe01b9bb5), but for our purposes the _recipe_ is the important bit: how would you break your problem down into steps _like the ones you'd see for a recipe_?

Learning to think this way is _hard work_: the _first_ time I try a new recipe I really don't know how things are going to taste. Similarly, the first time I use an algorithm to make a map or solve a problem I usually don't actually know exactly how my maps are going to look until _after_ I've made them. The difference from the 'normal', non-computational way of working is that I make a few changes to my code and then just run it again. And again... as many times as I need to in order to get what I want. I can keep changing the recipe until I get it just right.
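As a rough illustration of 'changing the recipe and running it again', the Python sketch below sorts values into equal-width classes of the kind you might use to shade a map. The `classify` function and the `density` figures are made up for this example, and the sketch assumes the values are not all identical.

```python
def classify(values, n_classes):
    """Recipe: assign each value to one of n_classes equal-width classes (0 = lowest)."""
    lowest, highest = min(values), max(values)
    width = (highest - lowest) / n_classes  # assumes highest > lowest
    classes = []
    for v in values:
        # Cap the result so the maximum value lands in the top class
        # rather than just beyond it.
        k = min(int((v - lowest) / width), n_classes - 1)
        classes.append(k)
    return classes


# Hypothetical input: population density for six areas.
density = [10, 25, 40, 55, 80, 100]

first_try = classify(density, 3)   # first attempt: three classes
second_try = classify(density, 5)  # not detailed enough? change one number and re-run

print(first_try)
print(second_try)
```

Nothing about the recipe itself changed between the two attempts: only one ingredient (the number of classes) did.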

### Thinking Like a Programmer

However, trying the same recipe again and again and again _also_ sounds like hard work! Wouldn't it be faster to just click and choose what you want the computer to do in SPSS or ArcMap? Well, yes and no. The two advantages to doing this with code over pointing-and-clicking are: 1) your solution is _transferable_; and 2) thinking 'like a programmer' is also about problem-solving, and that _also_ transfers very nicely to the 'real world' of employment.

Why do we say this?

1. Programming solutions are transferable because you aren't just solving _one_ problem, you are solving _classes_ of problems. In the same way that many recipes build on the same basic ingredients (sometimes adding something new for added 'spice'), many applications use the same basic ingredients: it's how they're put together in new ways that leads to new outputs. It's a lot like Lego.

2. Thinking like a programmer also translates well because you are learning to deal with abstraction. Yes, the details of a problem matter (just as ignoring cultural differences between two countries can matter), but it's important to be able to break a really big, messy, complex problem down into smaller, tidier, more comprehensible bits that you _can_ tackle. Programmers deal with this every day, so they tend to develop important skills in understanding and dealing with practical challenges of the sort that you'll face every day in your career.

Here's another [useful bit of insight](https://medium.freecodecamp.org/how-to-think-like-a-programmer-lessons-in-problem-solving-d1d8bf1de7d2?source=userActivityShare-65ab89778550-1526122060&gi=f9005e8aacb5):

> The best way \[of solving problems\] involves a) having a framework and b) practicing it.
> 
> Problem-solving skills are almost unanimously the most important qualification that employers look for... more than programming languages proficiency, debugging, and system design...
>
> — Hacker Rank (2018 Developer Skills Report)

You really should [read the article](https://medium.freecodecamp.org/how-to-think-like-a-programmer-lessons-in-problem-solving-d1d8bf1de7d2) (it's not very long) but here are the key points:

1. Understand the problem -- most problems are hard because you don't understand them, _and_ you will only know that you've understood a problem when you can explain it in plain English.
2. Plan -- if you just dive in without thinking about what you need your code to take in and spit out then you're going to waste a _lot_ of time.
3. Divide -- break a hard problem down into simple, small steps and tackle them in small, simple blocks of code. _Never_ try to just sit down and 'code'.
4. Debug with a fresh eye -- if you really feel stuck, step away from the computer for 5 minutes, take a deep breath, and try to look at the problem with a fresh set of eyes rather than just diving back in. Most problems boil down to either not seeing the big picture, or not realising that the computer is doing _exactly what you told it to do_, and _not what you meant for it to do_.
5. Practice -- find ways to practice problem-solving and coding (not necessarily at the same time).

If you won't take our word for it, how about taking Richard Feynman's word for it?

> If you can’t explain something in simple terms, you don’t understand it.

<img src="https://kottke.org/plus/misc/images/feynman-blackboard.jpg" width="400" />

### The Open Source Ethos

Once I've got a solution to my current problem, I can take that code and apply it to a new problem. Or a new case study. _Or_, I can [post it online](https://github.com/kingsgeocomp) and let others build off of my work to tackle problems that I've not even considered! Giving away my code might seem like a bad idea, but think about this: in a world of exciting research questions, are you going to be able to tackle every single one? Your own work _already_ builds off of code that other people gave away (the Mac OS, Linux, QGIS, Python, etc.)... perhaps you should give something back to the community? Not _just_ because it's a nice thing to do, but because people will find out about you through your code. And those people might be in a position to offer you a job, or they might approach you as a collaborator, or they might point someone else with an interesting opportunity in your direction because you have built a reputation as a 'contributor'.

## Computers: good or bad?

The best way to be a 'good' programmer is to know when the computer can help you and when it will just get in the way. A computer cannot 'solve' a problem for you, but it _can_ help you to find the answer when you've told it what to look for and what rules to use in that search. A computer can only do _exactly_ what you tell it to do, so if you don't know what to do then the computer won't either.

One of the founders of computing, [Charles Babbage](https://en.wikiquote.org/wiki/Charles_Babbage#Passages_from_the_Life_of_a_Philosopher_.281864.29), had this to say:

> On two occasions I have been asked, — "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" In one case a member of the Upper, and in the other a member of the Lower, House put this question. I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
>
> _Passages from the Life of a Philosopher (1864), ch. 5 "Difference Engine No. 1"_
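A small Python sketch of 'wrong figures in, wrong answers out'. The readings are invented, and the use of `-999` as a code for 'no reading' is an assumption made for this example (conventions like this vary between data sets).

```python
from statistics import mean

# Hypothetical river-flow readings; -999 is an assumed code for 'no reading'.
readings = [12.0, 15.0, -999, 14.0, 13.0]

# The computer does exactly what it is told: it averages every number,
# including the placeholder that was never a real measurement.
naive = mean(readings)

# What we actually meant: average only the real measurements.
valid = [r for r in readings if r != -999]
cleaned = mean(valid)

print(naive)
print(cleaned)
```

The computer raises no error in either case; it is up to you to know that a negative river flow makes no sense and to tell the computer what to leave out.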

What are computers good at?
- Doing the same thing over and over
- Doing _exactly_ what they are told to do

What are computers currently still bad at?
- Generating knowledge
- Being creative

There is a long-standing challenge, called the Turing Test in honour of [the famous computer pioneer](https://en.wikipedia.org/wiki/Alan_Turing) who proposed it, that demonstrates this difference rather nicely: a computer passes the Turing Test if it can fool a person into thinking that they're talking to another person. Some people have claimed that if a computer can _really_ pass the Turing Test by keeping up a conversation of indefinite length on any range of topics then we'll have to declare that machines have become full AIs (Artificial Intelligences). To put it another way: if it sounds like a human and responds like a human... then is it a human?

Perhaps fortunately for us, although computers are getting a lot better at holding up their end of the conversation, they still seem to have a hard time fooling anyone for very long. In contrast, bigger and better computers have now beaten the best humans at Chess and Go, and are being used to help us understand earthquakes and climate change on a huge scale. Here, computers can do billions -- or trillions -- of calculations a second to work out that if 'A' happens then 'B' is the next most likely thing to happen, and so on and so on.

The difference is that games like Go and Chess have well-understood rules, as (ultimately) do natural processes like climate change and earthquakes. Chess is 'easier' for a computer than Go because a big enough computer can look far enough ahead through the possible chess moves to pick the best one it finds, whereas Go has so many more possible moves that the computer can't do that and so has to make 'choices' based on incomplete information. Earthquakes have even more 'rules', but as far as we know they still follow _some_ set of rules dictated by physics and chemistry.
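To see what searching through the possible moves of a rule-governed game looks like, here is a minimal Python sketch for a game far simpler than Chess or Go. The game is chosen purely for illustration: two players take turns removing 1, 2, or 3 stones from a pile, and whoever takes the last stone wins. Because the rules are fixed and the game is tiny, the computer can check every possible continuation; real chess programs cannot do this for a whole game and rely on much cleverer techniques.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def can_win(stones):
    """True if the player about to move can force a win from this position."""
    # Try every legal move. A move wins if it leaves the opponent
    # in a position from which they cannot win.
    for take in (1, 2, 3):
        if take <= stones and not can_win(stones - take):
            return True
    return False  # no winning move exists (this includes an empty pile)


def best_move(stones):
    """Return a number of stones to take that forces a win, or None if there is none."""
    for take in (1, 2, 3):
        if take <= stones and not can_win(stones - take):
            return take
    return None


print(best_move(5))
print(best_move(8))
```

Notice that the code contains no 'strategy' at all: it simply applies the rules to every option in turn, which is exactly the kind of repetitive work computers are good at.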

People, however, don't use the same unchanging rules in conversation. Yes, conversations have norms, unless you're using an online comment forum where it's normal to start a conversation by asking someone if they're an idiot, but people don't just 'play games' within the rules, they actually play with the rules themselves in a way that computers find very, very hard to follow. Think of sarcasm: you say one thing but it means exactly the opposite. And if it's delivered deadpan then sometimes even people have trouble knowing if you're being sincere!

That's why AI of the sort you might have seen in _2001_ or _Blade Runner_ has been twenty years away for the last sixty years! Recently, computers have been getting better and better at doing really difficult things, but it's usually still in a narrow area where we understand the rules and we normally need to spend a lot of time training the computer. 

#### More About the Turing Test
Turing, A. (1950), 'Computing Machinery and Intelligence', _Mind_ LIX (236): 433–460,
doi: [10.1093/mind/LIX.236.433](http://dx.doi.org/10.1093/mind/LIX.236.433), ISSN 0026-4423

### Further Reading 

A big gap is opening up between the stuff that can be done by pushing buttons (which no longer even really requires geographical training) and the 'cutting edge'. There are many pieces that argue this case, but here are a few to start with:

* [Why the Future of Geography is Cheap](https://www.rgs.org/schools/teaching-resources/why-the-future-of-geography-is-cheap/)
* [GIS Jobs of Today](http://www.directionsmag.com/entry/gis-jobs-of-today-should-you-have-programming-skills/473296): should you have programming skills?