<a href="https://colab.research.google.com/github/edwardoughton/GeoAI/blob/main/01_01_GeoAI_intro.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Welcome to GGS590 GeoAI

This GGS590 Special Topics class focuses on the use of AI in the geospatial sciences ("GeoAI").

This is the first time GGS has integrated an AI class into the department, and the aim is to move it onto the main roster of classes during 2026. You therefore have a great opportunity to influence the structure and content of this class.



By the end of this session, students should be able to:

*   Define GeoAI and distinguish it from GIS, geospatial data science, and general AI
*   Identify major GeoAI use cases across domains
*   Recognize core methods and model families used in GeoAI
*   Set up and run notebooks in Google Colab
*   Refresh essential Python skills (aimed at geospatial analysis, although today we will focus on basics such as data types)



This GeoAI course is pitched to explore the intersections between:

*   Geospatial data (raster, vector, spatiotemporal)
*   Artificial intelligence and machine learning (AI/ML)
*   Domain-driven spatial reasoning
*   The use of common geospatial and AI tools
*   Critical thinking in the spatial sciences



I will place a key emphasis on:

*   Practical modeling with real geospatial data (frequently satellite imagery and OpenStreetMap data)
*   Understanding why particular AI methods work (or fail) in spatial contexts
*   Scientifically reproducible GeoAI workflows
*   Code testing to ensure you have a correct/plausible answer!



Please be aware that this GeoAI course will not:

*   Be a general GIS introductory course
*   Teach you explicitly how to use Graphical User Interface (GUI) GIS software such as ESRI ArcGIS Pro or Quantum GIS
*   Be a general deep learning theory course
*   Be a “black box” AI tools class
*   Be a deeply math-based introduction to AI/ML tools



## Syllabus

You can find the syllabus on the class GitHub page here: https://github.com/edwardoughton/GeoAI

Let us take some time to cover what is there, including the course policies, structure and expected content.

As this is a new class, please do let me know if you think something should be added here.

## What is GeoAI?

There are many definitions, but let us run with the following working definition:



> The application of AI/ML to spatially explicit data, accounting for traditional geospatial characteristics such as location and spatial dependency.



By "traditional geospatial characteristics" we mean that GeoAI accounts for spatial structure (e.g., non-independent relationships imposed by geography):



*   Spatial autocorrelation (e.g., Tobler's first law)
*   Scale effects
*   Spatiotemporal dependence
*   Topological relationships
*   Heterogeneous geospatial data sources
*   Uncertainty and bias often tied to geographic space

A GeoAI model is “spatially aware” when these properties are explicitly encoded rather than implicitly hoped for.
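
To make spatial autocorrelation concrete, here is a minimal sketch using Moran's I, a common statistic that is not covered above and is introduced here purely for illustration. The function name `morans_i`, the toy values, and the simple chain of neighbors are all assumptions made for this example. Values near +1 suggest that similar values cluster together, while values near -1 suggest neighbors tend to differ.

```python
def morans_i(values, neighbors):
    """Moran's I for values with a binary neighbor list (every weight is 1)."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    # Sum of cross-products over every ordered pair of neighbors
    cross = sum(deviations[i] * deviations[j] for i, j in neighbors)
    total_weight = len(neighbors)  # each ordered neighbor pair has weight 1
    variance_sum = sum(d ** 2 for d in deviations)
    return (n / total_weight) * (cross / variance_sum)

# Five hypothetical locations along a line, each neighboring the next one
# (pairs are listed in both directions)
chain = [(i, i + 1) for i in range(4)] + [(i + 1, i) for i in range(4)]

smooth = [1, 2, 3, 4, 5]        # similar values sit next to each other
alternating = [1, 5, 1, 5, 1]   # neighbors are as different as possible

i_smooth = morans_i(smooth, chain)
i_alternating = morans_i(alternating, chain)
print(round(i_smooth, 3))        # 0.5
print(round(i_alternating, 3))   # -1.0
```

In practice you would likely use a dedicated spatial statistics library with proper spatial weights, but the arithmetic is the same idea.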







We can take a minute to understand more via this summary video:

https://www.youtube.com/watch?v=uUqYGP0UYTI

### How does GeoAI shape up versus related geospatial fields?


| Field                   | Focus                                             | Examples                                  |
|-------------------------|---------------------------------------------------|-------------------------------------------|
| GIS                     | Data management, visualization, spatial analysis | ArcGIS, QGIS, PostGIS                     |
| Geospatial Data Science | Statistics + ML on spatial data                  | Spatial regression, kriging, GWR          |
| Computer Vision         | Image understanding (often non-spatial)          | CNNs, object detection, image segmentation|
| GeoAI                   | AI models aware of spatial structure             | GNNs, spatial deep learning, remote sensing AI |


## Core GeoAI Use Cases

**Remote sensing and Earth Observation (EO)**:

*   Land cover / land use classification
*   Change detection
*   Object detection (buildings, roads, ships)
*   Environmental monitoring

Some example papers:

Zhu et al. (2017). Deep learning in remote sensing: A comprehensive review.
https://doi.org/10.1109/MGRS.2017.2762307

Ma et al. (2019). Deep learning in remote sensing applications.
https://doi.org/10.1016/j.isprsjprs.2019.04.015

Zhu et al. (2025) GlobalBuildingAtlas: an open global and complete dataset of building polygons, heights and LoD1 3D models. https://doi.org/10.5194/essd-17-6647-2025

**Urban/rural analytics, smart cities, and transportation planning**:

*   Population estimation
*   Transportation and mobility prediction
*   Urban growth modeling
*   Informal settlement detection

Some example papers:

Batty, M. (2018). Artificial intelligence and smart cities: https://doi.org/10.1177/2399808317751169

Tatem, A. (2017) WorldPop, open data for spatial demography: https://doi.org/10.1038/sdata.2017.4

  




**Climate, environment, and natural hazards**:

*   Flood, hurricane and wildfire (NatCat) risk modeling
*   Climate downscaling
*   Ecosystem and biodiversity modeling

Some example papers:

Reichstein et al. (2019). Deep learning and process understanding for data-driven Earth system science.
https://doi.org/10.1038/s41586-019-0912-1

Rolnick et al. (2022). Tackling climate change with machine learning.
https://doi.org/10.1145/3485128

**Human–environment and social applications**:

*   Disease mapping
*   Socioeconomic inference
*   Accessibility and equity analysis

Some example papers:

Jean et al. (2016). Combining satellite imagery and machine learning to predict poverty.
https://doi.org/10.1126/science.aaf7894

Stevens et al. (2015). Disaggregating census data using remote sensing and machine learning.
https://doi.org/10.1371/journal.pone.0107042

Oughton and Mathur (2021) Predicting cell phone adoption metrics using machine learning and satellite imagery. https://doi.org/10.1016/j.tele.2021.101622

## Core GeoAI methods

**Traditional ML modeling approaches**:

*  Random forests
*  Gradient boosting
*  Support vector machines

The benefit of using these approaches is that they are generally easier to interpret than deep learning methods (e.g., via feature importance measures). These techniques can also be much more computationally efficient than other options, so do not neglect them just because they are classic approaches.

**Deep learning approaches**:

*  Convolutional Neural Networks (CNNs)
*  U-Net and encoder–decoder architectures
*  Vision Transformers (ViTs)

These methods have achieved substantial improvements (especially in computer vision) over the past decade. They are particularly strong for working with raster data, imagery, and potentially spatial data patterns.

**Graphs and spatial models**:

*   Graph Neural Networks (GNNs)
*   Spatial autoregressive models
*   Point process models

Graph-based, autoregressive, and point process models each address spatial dependence from distinct perspectives. There are distinct advantages and disadvantages which make them suitable for different data structures and research objectives.

**Space: The final assumption violation**

Just as spatially dependent data can badly violate regression assumptions, the same is true when AI methods are applied naively to spatial data:

* Independent and Identically Distributed (IID) assumption violations
* Spatial leakage (e.g., when nearby observations leak between the training and testing data, which is one example of a non-IID violation)
* Edge effects (biases at the edge of the study area)
* Modifiable Areal Unit Problem (MAUP)
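
One common way to reduce spatial leakage is to split the data by spatial block rather than by individual observation, so that nearby points end up on the same side of the train/test split. The sketch below is illustrative only: the function name `spatial_block_split`, the 25-unit block size, and the randomly generated points are all assumptions, not part of the course material.

```python
import random

def spatial_block_split(points, block_size, test_fraction=0.25, seed=0):
    """Assign whole grid blocks (not individual points) to train or test."""
    # Group each point by the grid block that contains it
    blocks = {}
    for x, y in points:
        key = (int(x // block_size), int(y // block_size))
        blocks.setdefault(key, []).append((x, y))

    # Shuffle the block identifiers reproducibly and hold some out for testing
    keys = sorted(blocks)
    random.Random(seed).shuffle(keys)
    n_test = max(1, round(len(keys) * test_fraction))
    test_keys = set(keys[:n_test])

    train = [p for k in keys if k not in test_keys for p in blocks[k]]
    test = [p for k in keys if k in test_keys for p in blocks[k]]
    return train, test

# Hypothetical point locations in a 100 x 100 study area
rng = random.Random(42)
points = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(200)]

train, test = spatial_block_split(points, block_size=25)
print(f"Training points: {len(train)}, testing points: {len(test)}")
```

Points on either side of a block boundary can still be close together, so this reduces leakage rather than eliminating it.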

## Using AI as a coding partner

Most of us writing code regularly will probably be teaming with an AI tool. Generally, GenAI tools are good at:

*   Producing boilerplate code
*   Providing library usage examples
*   Debugging syntax errors that inevitably arise when coding
*   Explaining unfamiliar APIs
*   Helping you refactor your code
*   Speeding up the process of profiling your code

However, there are key limitations. For example, AI tools:

*   Have very limited understanding of data provenance
*   Cannot be relied upon to make modeling decisions that preserve spatial validity (e.g., correct results)
*   Cannot ensure unbiased results
*   Cannot guarantee correctness

Responsible use guidelines:

*   Always inspect and test AI-generated code
*   Never assume spatial correctness
*   Document AI assistance when used
*   Treat AI output as a draft, not an answer

So in summary, GenAI code is a starting point, not an end point.

If you really want to be a skeptic on AI, it can be argued that we have just shifted time previously spent on planning/architecture, to testing/validation. For example, you can get 90% of the way there in 20 minutes ("vibecoding"), but it takes an extremely long time to then test and validate this code, especially for all edge cases.





# Google Colab

There are lots of benefits to using Google Colab, which include:

*   Zero required effort to set up
*   Easy sharing and reproducibility
*   Swift integration of text notes and Python code
*   Possible progression to GPU/TPU processing (if required)

Some of you may have used Jupyter Notebooks previously to read, write and run code (formerly known as IPython Notebooks, hence the .ipynb file extension).

In that case, you may be familiar with the traditional approach to running Python. This often begins with downloading the data science meta-package Anaconda, creating a virtual environment to run the code, and then getting started with your notebook.

With Colab, each notebook is stored in your Google Drive, and you can easily share each one with friends, collaborators, professors etc.

This means we might not need to create a virtual environment at all, massively simplifying the process for newcomers to begin reading and writing Python code.



## Requirements

There are a set of key requirements you will need, including:

*   A Google account with Google Drive
*   An Internet connection (e.g., >10 Mbps)

If you do not have a Google account (I know, unlikely in this day and age), please can you create one now.

Now go get a Google Colab Pro account using your @gmu.edu address:

https://colab.research.google.com/signup



## Exploring Colab

Now we get to start the fun part (playing with some introductory Python code!).

To do that, we can go to Colab and start a new notebook (click the big New notebook button):

https://colab.research.google.com/

To summarize, a notebook is basically a collection of cells which we can define as either being text or code. Here, the text is written in Markdown, and the code is written in Python. In fact, we can combine executable code, Markdown text, HTML, images and many other file types in this single flexible format.

You can learn other programming languages; however, Python is one of the most powerful that you can still easily read.

Ideally, you should aim to become an advanced programmer in a language such as Python, before you look to become a beginner/intermediate programmer in another (as there is great value in becoming a pro, rather than being a beginner in multiple).





## Modes and navigation

Importantly, Colab, like any Jupyter (IPython) notebook (.ipynb), has two modes:

*    **Edit mode**. Press enter (or double-click) on a cell to directly edit the contents of the selected cell.
*    **Command mode**. Press escape to exit a cell and then navigate elsewhere.

Once you are in command mode, you can navigate between cells using the up and down arrow keys.

Holding shift while pressing up/down lets you select multiple cells (e.g., in preparation for a copy/paste).


## Executing code

Our first example of a code cell is present below, where we will print a string to the console (a string is a sequence of characters).

Either click the round play button, or type control + enter to execute (or Apple command + enter on Mac).



In [None]:
print("Welcome to GGS590")

## Commenting in a code cell

We can write comments in our Python cells using the pound (hash) sign, `#`.

In [None]:
# 2 + 2    <- hashed out, so Python does not execute!
1 + 1     #<- whereas Python will execute this

## Text cells

Alternatively you can add in your own text cells. When you hover between each cell, you will see either the Code or Text icons show up, which you can click on as desired (you can also convert a code cell to a text cell by pressing control + M M).

Have a go at adding a text cell below.

As mentioned previously, Markdown is a simple markup language (essentially, a basic approach for outlining a document text/image layout and presentation format).

Some key formatting includes the following:

*   Making an item bold using a **double asterisk**.
*   Making an item italicized using a *single asterisk*.
*   Adding a strikethrough using ~~double tilde~~.
*   Or adding a hyperlink [using square brackets followed by the address in regular parentheses](https://www.google.com/).

The other key formatting technique you might like to use is for heading titles, as follows:

*   "# Large heading"
*   "## Medium heading"
*   "### Small heading"
*   "#### Very small heading"

These titles are all in quotes to prevent Colab thinking they are real section headings. Remember to remove the quotes if you want to use them properly. Colab will add each heading to the table of contents.

For more information, see the Colab Markdown guide, [here](https://colab.research.google.com/notebooks/markdown_guide.ipynb).

## Python as a calculator

Obviously, we will want to run a range of operations and Python is very supportive for numerical computing.

In [None]:
# Addition
2 + 2

In [None]:
# Subtraction
5 - 3

In [None]:
# Multiplication
3 * 3

In [None]:
# Division
10 / 2

In [None]:
# Powers
3**3

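Do not forget to follow the proper order of operations: Parentheses, Exponents, Multiplication, Division, Addition and Subtraction (PEMDAS) (or another common variant such as BODMAS or BIDMAS). Multiplication and division share the same precedence and are evaluated left to right, as are addition and subtraction.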

In [None]:
# Consider
8 / 2 + 2

In [None]:
# Versus
8 / (2 + 2)

Before we even get to GeoAI, this is a good example of how we need to be careful in scientific computing.
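
A few further operator behaviors are worth checking for yourself. This is a small illustrative sketch (the variable names are assumptions chosen for the example) showing that `/` always returns a float, that `//` and `%` give the whole-number quotient and remainder, and that exponentiation is applied before a leading minus sign.

```python
true_division = 10 / 2     # division with / always returns a float
floor_division = 7 // 2    # floor division discards the remainder
remainder = 7 % 2          # modulo returns the remainder
negated_power = -3 ** 2    # the power is applied first, then the minus sign
power_of_negative = (-3) ** 2

print(true_division)       # 5.0
print(floor_division)      # 3
print(remainder)           # 1
print(negated_power)       # -9
print(power_of_negative)   # 9
```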

## Scientific notation

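We can also write numbers in scientific notation, where the `e` character means "times ten to the power of" (so the number after `e` is the exponent). This keeps very large or very small values compact and easy to read. Note that Python stores numbers written this way as floats.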

For example, for 1,000 we can write:

In [None]:
1e3

In [None]:
1e6
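
As a short illustrative sketch (the variable names are assumptions for this example), we can confirm that values written in scientific notation are floats, that a negative exponent gives a small number, and that `int()` converts the result to a whole number where needed.

```python
thousand = 1e3      # 1 x 10^3
small = 2.5e-3      # 2.5 x 10^-3
million = int(1e6)  # convert the float to an integer

print(thousand, type(thousand))  # 1000.0 <class 'float'>
print(small)                     # 0.0025
print(million)                   # 1000000
```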

## In-line commenting

Be aware that the pound/hash sign enables you to directly write natural language into a coding cell.

For example, in the coding cell below we have both hashed out text, and code we want to execute.

Examine how using # will indicate to Python you want to ignore whatever is written to the right hand side of this line.

In [None]:
# In-line commenting
# 2 + 2    <- hashed out, so Python does not execute 2 + 2!
1 + 1     #<- whereas Python will execute this

# Recap on Python

Let us recap on some basic introductory concepts.

The aim is to make this content fun and engaging. Therefore, we will avoid long one-directional lecturing, and provide the learner with the opportunity to interact with small programming examples.

1.  Declaring and assigning variables
2.  Updating assigned variables
3.  Variable types
4.  Multiple assignments
5.  Background to text-based programming (e.g., for strings)
6.  Formatting strings
7.  Different types of numbers


## Declaring and assigning variables

A variable is a name used to store data.

This could be any type of data, e.g. a list, dictionary, dataframe etc. but here we will begin first by just considering simple numerical values or strings of characters.

You can allocate a value to a variable name by using the equals sign. The value on the right of the equals sign is then added to the computer memory, given the variable name you declare.

For example, we can declare and allocate variables for a user, with a specific age, spending a certain amount of time at a Point of Interest (POI), for a certain location in space, as follows:


In [None]:
# Example: Declaring and assigning variables
user = "Ed"
age = 38
time_spent_at_poi_mins = 36
location_coordinates = (1.23, -4.67)

Now this information has been added to memory as a variable, we can subsequently recall this information, like so:

In [None]:
# Example
print("User:", user)
print("Age:", age)
print("Time Spent at POI:", time_spent_at_poi_mins)
print("Location Coordinates:", location_coordinates)

## Task

Can you fill out the code cell below with information for where you spent lunch (12-1pm)? Include the following variables:

* User
* Age
* Time spent at POI
* Location coordinates

Declare and assign these variables, and then print them to the console.

In [None]:
# Enter your attempt here


## Updating assigned variables

Once you have defined a variable, it is possible to then update this value again later.

For example, say you spent time at a specific POI for 2 minutes.

That could be defined below first as the initial number of minutes (e.g., "Initial minutes:").

Next, you could update this variable to add this quantity to the time spent (e.g., by an extra 2 minutes, to get a total time of 4 minutes).

In [None]:
# Example: Updating existing variables using arithmetic operations
minutes = 2
print(f"Initial minutes: {minutes}")

minutes = minutes + 2
print(f"Updated minutes: {minutes}")


## Task

You have been working on quantifying foot traffic in shopping malls. You installed a camera outside a shop front and implemented a machine-driven image recognition system to quantify the number of people passing by.

Declare a variable called pass_count and assign zero. Print this value with a string explaining what it represents.

Next, write a simple program which updates this value every time someone walks past.

For inspiration, you can use a similar format to the previous code box, but you must update the variable names, values etc.

Use in-line comments to describe what each step does.

In [None]:
# Enter your attempt here


You can also update variables using a shortened approach (via shorthand notation, such as +=).

This works for any operator you wish to choose, addition, subtraction, division, multiplication etc.

For example:

In [None]:
# Example: Shortened update
value = 2
print(f"This is our starting value: {value}")

value += 10  # Equivalent to value = value + 10
print(f"This is our updated value: {value}")
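
The same shorthand exists for the other arithmetic operators. A minimal sketch follows, using a hypothetical variable called `count`:

```python
count = 10
count -= 4    # equivalent to count = count - 4, giving 6
count *= 3    # equivalent to count = count * 3, giving 18
count /= 2    # equivalent to count = count / 2, giving 9.0 (now a float)
count **= 2   # equivalent to count = count ** 2, giving 81.0

print(count)  # 81.0
```

Note that the division step turns the value into a float, which carries through to the final result.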

## Task

Rewrite the previous task program (footcount quantification) but with shortened update notation.

In [None]:
# Enter your attempt here


## Variable types

In computer programming, there are a range of different types of variables.

To summarize, these include:

*  Integer - A whole number.
*  Float - A decimal number.
*  String - A sequence of characters.

Thankfully, Python is dynamically typed, meaning it infers the type of a variable from the value we assign. Therefore, we do not have to declare the type explicitly, as in some other languages (there are caveats though).

For example, see the variable demo below:

In [None]:
# Example: Variable types
x = 2      # This is an integer (a whole number)
y = 3.14    # This is a float (a decimal number)
z = "I am a string because I am in quotations" # This is a string (a character sequence)

print(f"Variable x: {x} (integer)")
print(f"Variable y: {y} (float)")
print(f"Variable z: {z} (string)")
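
One consequence worth seeing early is that a variable name is not tied to a single type. In this illustrative sketch (the names `value` and `mixed` are assumptions for the example), the built-in `type()` function reports the type before and after reassignment, and shows that mixing an integer with a float produces a float.

```python
value = 2
first_type = type(value)
print(first_type)    # <class 'int'>

value = "two"        # the same name now refers to a string
second_type = type(value)
print(second_type)   # <class 'str'>

mixed = 2 + 0.5      # an integer plus a float gives a float
print(mixed, type(mixed))  # 2.5 <class 'float'>
```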

## Task

Create a new set of variables which represent your first name, second name, height and age.

Print each variable, along with a description of the variable type.

Add in-line comments to describe what you are doing in your code at each stage.

In [None]:
# Enter your attempt here


## Multiple assignments

A very nice feature of Python is the ability to assign multiple variables in a single line of code (known colloquially on Stack Overflow as a "cheeky one-liner").

See the example below. This works provided we have the same number of variables declared on the left-hand side of the equals sign as values assigned on the right.

In [None]:
# Example: Multiple assignments
a, b, c = 8.1, 4, "I am a string as I am in quotations"
print(f"a: {a}")
print(f"b: {b}")
print(f"c: {c}")
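
The same mechanism lets us unpack an existing tuple, such as a coordinate pair. The sketch below is illustrative (the names `x_coord`, `y_coord`, `p` and `q` are assumptions) and also shows the error Python raises when the number of names and values does not match.

```python
# Unpack a coordinate tuple into two separate variables
location_coordinates = (1.23, -4.67)
x_coord, y_coord = location_coordinates
print(f"x: {x_coord}, y: {y_coord}")

# A mismatch between names and values raises a ValueError
try:
    p, q = 1, 2, 3
    unpacking_failed = False
except ValueError as error:
    unpacking_failed = True
    print(f"Unpacking failed: {error}")
```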

## Task

Allocate your name, age and height to variables n, a and h, in the code cell below.

Add in-line comments to describe what you are doing.



In [None]:
# Enter your attempt here


## Some background to text-based programming

In Python (and other computer languages), a text/character-based piece of information is referred to as a 'string'.

Strings consist of characters and symbols, often representing natural human language (as opposed to machine code, which is binary).

Being able to parse strings (e.g., read/process strings), is very handy for text manipulation, file modification, labelling of plots, requesting inputs from user, arguments in functions, debugging, signposting code, reporting errors etc.

They are created by enclosing a sequence of characters in a pair of single or double quotes. It does not matter whether you use single or double quotes, but the opening and closing quotes must match, e.g., "I am a string".

First, we will begin with a simple print() statement. Essentially, this function does what it suggests, by printing anything which is within the following parentheses.

In [None]:
# Example
print("I am a string")

Make a note to remember quotation marks = strings.

## Formatting strings

Next, we are going to learn a really useful string method called format().

This allows us to add a variable value into a string.

There are two parts:

*   You need to define a string with curly braces (e.g. {}) where you want the new value to be inserted.
*   Then after the string you need to add the .format() method name.

For example, we can print to the console both the first part of the following string which we have written out ("Add to my string"), and then the second part which is added ("the ending of my sentence.").


In [None]:
# Example:
print("Add to my string {}".format("the ending of my sentence."))

And this does not just need to be a string that you insert, it could be any variable type, including an integer, float etc.

In [None]:
# Example:
print("Class number GGS {}".format(366))

This is a very handy function for later when debugging using loops.

More recently, from Python 3.6 onwards, you can use f-strings to do this task, which saves space:

In [None]:
# Example:
ending = "the ending of my sentence."
print(f"Add to my string {ending}")

In [None]:
# Example:
print(f"Add to my string {'the ending'}")
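
f-strings can also control how a value is displayed, which is handy for distances or counts. This is a small sketch with made-up values (`distance_km` and `visitors` are hypothetical names): `:.2f` rounds to two decimal places, `:,` adds thousands separators, and an expression can be evaluated inside the braces.

```python
distance_km = 12.34567
visitors = 1234567

rounded = f"Distance: {distance_km:.2f} km"
grouped = f"Visitors: {visitors:,}"
doubled = f"Doubled: {distance_km * 2:.1f} km"

print(rounded)   # Distance: 12.35 km
print(grouped)   # Visitors: 1,234,567
print(doubled)   # Doubled: 24.7 km
```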

## Task

Create print statements which allow you to insert your existing user POI variables:

* User
* Age
* Time spent at POI
* Location coordinates

Use a descriptive string, followed by the inserted variable name.

In [None]:
# Enter your attempt here


Strings can also be concatenated (joined together) using the addition operator. For example:

In [None]:
# Example
a = "Add"
b = "strings"
c = "together"

print(a + " " + b + " " + c)
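
Concatenation only works between strings, so a number has to be converted with the built-in `str()` function first. A minimal sketch follows, using hypothetical variables `label` and `stops`:

```python
label = "Stops counted:"
stops = 12

# Adding a string and an integer directly raises a TypeError
try:
    print(label + " " + stops)
except TypeError as error:
    print(f"Cannot concatenate directly: {error}")

# Converting the integer to a string first makes the concatenation work
message = label + " " + str(stops)
print(message)   # Stops counted: 12
```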

## Task

Define the following variables, as v1, v2, v3.

* "The function yielded"
* "a final value of:"
* 44

Now concatenate them together in a print function, but make sure they are grammatically correct.

In [None]:
# Enter your attempt here


## More on different types of numbers

Numbers in Python can be integers, floats, and complex numbers.

To be completely clear, integers are whole numbers with no decimal part, regardless of whether they are positive or negative.

For example, we can clarify the value type by using the in-built type() function:


In [None]:
# Example: Integer numbers
a, b, c = 2, 0, -2

print(type(a), type(b), type(c))

The same goes for floats, regardless of the sign a number may have:

In [None]:
# Example: Float numbers
a, b, c = 2.5, 0.0, -2.5

print(type(a), type(b), type(c))

Now the tricky part is recognizing the difference between integers/floats and strings. You need to be careful to recognize what is in quotes, or not.

For example, the values below are all strings! Colab handily displays strings in a different color, which is a formatting style you should look out for.

In [None]:
# Example: Strings
a, b, c = "2.5", "0", "-2.5"

print(type(a), type(b), type(c))
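
When a number arrives as a string (for example, when read from a text file), it needs converting before doing arithmetic. The sketch below is illustrative, with assumed variable names, and uses the built-in `float()` and `int()` functions.

```python
text_value = "2.5"
number_value = float(text_value)   # convert the string to a float

repeated_text = text_value * 2     # multiplying a string repeats it
doubled_number = number_value * 2  # multiplying a float does arithmetic
whole_number = int("-2") + 1       # int() converts whole-number strings

print(repeated_text)    # 2.52.5
print(doubled_number)   # 5.0
print(whole_number)     # -1
```

Be aware that `int()` will raise a ValueError if the string contains a decimal point, such as "2.5".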

## Task

Define five different variables of your choice, using a mix of integers, floats and strings.

Then use the type() function to clarify their type.

Annotate your code.

In [None]:
# Enter your attempt here


There are some videos on my YouTube channel which go over some of these introductory concepts: https://www.youtube.com/@edwardoughton6864/videos