# Lab 2: Python Fundamentals for Geospatial Machine Learning
## Overview
In this lab, participants will be introduced to the foundational concepts of Python programming, tailored specifically for geospatial machine learning applications. The lab will focus on building core Python skills for handling geospatial data, developing machine learning models, and automating workflows. Participants will learn Python's syntax, data structures, and key libraries commonly used in geospatial and machine learning tasks through hands-on exercises and examples.

### Learning objectives
By the end of Lab 2, participants will be able to:
- Write Python scripts using essential syntax, variables, and operators.
- Work with different data types and structures such as lists, tuples, dictionaries, and arrays.
- Explore Python's standard libraries and their applications in data manipulation.



## Introduction to Python
Python has become a cornerstone language for geospatial machine learning because of its versatility, simplicity, and extensive ecosystem of libraries. Its open-source nature and active community make it a preferred choice for both beginners and advanced practitioners.

## Variables, data types, and basic operations
### Variables
Variables are containers used to store data values, allowing you to work with them later in your program. You can create a variable by assigning a value to a name using the `=` operator. For example, `x = 10` assigns the value `10` to the variable `x`.

#### Naming variables
When naming variables in Python, it is important to follow these rules to ensure clarity and maintain coding standards:

(1) Use descriptive names
- Choose names that clearly describe the variable's purpose (e.g., land_cover_type instead of lct).

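(2) Follow the snake_case convention:
- Use lowercase letters separated by underscores for readability (e.g., total_area rather than totalArea). snake_case is the style PEP 8 recommends for variable names; camelCase names such as totalArea are legal but less common in Python code.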

(3) Avoid reserved words
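- Do not use Python keywords (e.g., if, class) as variable names; doing so raises a syntax error. Also avoid built-in names such as print or object: reusing them is allowed, but it hides the built-in for the rest of the program.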

(4) Start with a letter or underscore:
- Variable names must begin with a letter (a-z, A-Z) or an underscore (_) but cannot start with a number.

(5) Use only alphanumeric characters and underscores
- Variable names can include letters, numbers, and underscores but no spaces or special characters.

(6) Be consistent
- Maintain a consistent naming style throughout your code for better readability and organization.
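
Rules (3) to (5) above can be checked programmatically. The sketch below is a minimal illustration using Python's `str.isidentifier()` method and the standard `keyword` module; the helper name `is_valid_variable_name` and the sample names are assumptions made for this example, not part of the lab material.

```python
import keyword

def is_valid_variable_name(name):
    """Return True if `name` is a legal identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

# Hypothetical candidate names, chosen to exercise each rule
candidate_names = ["land_cover_type", "_tmp", "2nd_band", "total area", "class"]

for name in candidate_names:
    print(f"{name!r}: {is_valid_variable_name(name)}")
```

Built-in names such as `print` pass this check because they are not keywords; reusing them is legal but hides the built-in, which is why rule (3) advises against it.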

In [None]:
# Variables
country_name = "Zimbabwe"  # The variable 'country_name' stores the string "Zimbabwe"

### Data types
Data types in Python define the kind of data a variable can hold. Common data types include integers (`int`), floating-point numbers (`float`), strings (`str`), and booleans (`bool`). Python also supports complex data types like lists, tuples, dictionaries, and sets, which are used for more advanced data handling.
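
One quick way to see these types in practice is the built-in `type()` function. The sample values below are illustrative only.

```python
# Illustrative sample values, one per data type mentioned above
samples = [25, 5.8, "Harare", True, [1, 2], (1, 2), {"a": 1}, {1, 2}]

for value in samples:
    # type(value).__name__ gives the short type name, e.g. 'int'
    print(f"{value!r} -> {type(value).__name__}")

type_names = [type(value).__name__ for value in samples]
```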



#### Integer (int)
An integer is a whole number, positive or negative, without any fractional or decimal part. In Python, integers can be of any size, limited only by the memory of your system.

In [None]:
# Integer
age = 25
print(age)

25


#### Float (float)
A float represents a number that contains a decimal point or is expressed in exponential (scientific) notation. Floats are used for calculations involving fractional values or very large and very small magnitudes. They are stored with finite precision, so results can carry small rounding errors.


In [None]:
# Float
elevation = 5.8
print(elevation)

5.8
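
Because floats have finite precision, exact equality checks can give surprising results. The short sketch below, using the standard `math.isclose()` function, shows one common way to compare floats; the variable names are chosen for illustration.

```python
import math

total = 0.1 + 0.2
print(total == 0.3)              # exact comparison can fail because of rounding
print(math.isclose(total, 0.3))  # tolerance-based comparison is safer

# Scientific notation also produces a float
earth_radius_m = 6.371e6
print(earth_radius_m)
```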


#### String (str)
A string is a sequence of characters enclosed within single (') or double (") quotes. Strings are used to represent text data in Python. They can include letters, numbers, symbols, and spaces.

In [None]:
# String
city = "Harare"
print(city)

Harare
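
Since a string is a sequence of characters, it supports indexing, slicing, and a range of built-in methods. The following is a small sketch of a few common operations; the `country` and `label` names are assumptions added for the example.

```python
city = "Harare"

print(len(city))       # number of characters
print(city[0])         # first character
print(city[-1])        # last character
print(city[:3])        # slice with the first three characters
print(city.upper())    # upper-case copy; the original string is unchanged

# f-strings embed values directly in text
country = "Zimbabwe"
label = f"{city}, {country}"
print(label)
```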


Multiline strings in Python are strings that span multiple lines. They are enclosed within triple quotes (""" or '''). These strings are especially helpful when you want to preserve formatting (such as new lines or paragraphs) or when writing lengthy SQL queries or other multi-line text.

In [None]:
# Multiline string example
info = """
What is the capital city of Zimbabwe?
The capital city of Zimbabwe is Harare.

Where is the capital city of Zimbabwe located?
Harare is located in the northern part of Zimbabwe.
It is situated on the central plateau at an elevation of approximately 1,483 meters (4,865 feet) above sea level.
"""

# Print the multiline string
print(info)


What is the capital city of Zimbabwe?
The capital city of Zimbabwe is Harare.

Where is the capital city of Zimbabwe located?
Harare is located in the northern part of Zimbabwe. 
It is situated on the central plateau at an elevation of approximately 1,483 meters (4,865 feet) above sea level.



#### Boolean (bool)
A boolean represents one of two values: True or False. Booleans are often used in conditions and comparisons to control program flow or decision-making. For example, the variable 'is_within_boundaries' represents whether a geographic point or feature falls within a specific region (e.g., a study area or administrative boundary).


In [None]:
# Boolean example for a geographic context
is_within_boundaries = True  # Example: Whether a location is within a specified area
print("Is the point within the study area boundaries?")
print(is_within_boundaries)

Is the point within the study area boundaries?
True


This can later be used in conditional checks or logical operations when analyzing spatial data.

In [None]:
if is_within_boundaries:
    print("The location is valid for analysis.")
else:
    print("The location is outside the study area.")

The location is valid for analysis.
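
In practice, a flag like `is_within_boundaries` is usually computed rather than typed in. The sketch below derives it from a simple bounding-box test. The box limits are rough, illustrative numbers (not an official boundary), and the variable names other than `is_within_boundaries` are assumptions for this example.

```python
# Rough, illustrative bounding box in decimal degrees; not an official boundary
min_lat, max_lat = -22.5, -15.5
min_lon, max_lon = 25.0, 33.5

# Point to test: Harare, as (latitude, longitude)
point_lat, point_lon = -17.8252, 31.0335

# Chained comparisons evaluate to a boolean
is_within_boundaries = (min_lat <= point_lat <= max_lat) and (min_lon <= point_lon <= max_lon)
print(is_within_boundaries)

# A point at (0, 0) falls outside the same box
is_origin_within = (min_lat <= 0.0 <= max_lat) and (min_lon <= 0.0 <= max_lon)
print(is_origin_within)
```

A bounding box is only a coarse filter; real boundaries are polygons and need a proper point-in-polygon test.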


### Basic operations
Basic operations in Python allow manipulation of data. Arithmetic operations include addition (`+`), subtraction (`-`), multiplication (`*`), and division (`/`). Python also supports comparison operators (`==`, `<`, `>`) and logical operators (`and`, `or`, `not`) for conditional expressions. For instance, `x + y` adds two variables, while `x > y` evaluates whether `x` is greater than `y`. Mastering these fundamentals is essential for performing calculations, logical reasoning, and building more complex Python programs.

#### Arithmetic operations
These operations perform basic mathematical calculations (addition, subtraction, multiplication, and division). The example below adds two numbers.


In [None]:
# Addition
result1 = 5 + 3
print(result1)

8
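
The other arithmetic operators work the same way. The sketch below also shows floor division (`//`), modulo (`%`), and exponentiation (`**`), which are standard Python operators not listed above; `n_tiles` is a hypothetical value added for illustration.

```python
area_km2 = 390757          # land area in km² (the value used for Zimbabwe in this lab)
n_tiles = 4                # hypothetical number of equal tiles

print(area_km2 - 1000)       # subtraction
print(area_km2 * 100)        # multiplication: 1 km² = 100 hectares
print(area_km2 / n_tiles)    # true division always returns a float
print(area_km2 // n_tiles)   # floor division discards the fractional part
print(area_km2 % n_tiles)    # modulo gives the remainder
print(2 ** 10)               # exponentiation
```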


Below is a simple geographic example that uses arithmetic operations to calculate the total and average land area of several countries.

In [None]:
# Calculating total and average land area
# Land area of countries (in square kilometers)
land_areas = {
    "Zimbabwe": 390757,
    "South Africa": 1219090,
    "Botswana": 581730,
    "Namibia": 825615
}

# Arithmetic operations
total_area = sum(land_areas.values())  # Calculate the total area
average_area = total_area / len(land_areas)  # Calculate the average area

# Print results
print(f"Total land area: {total_area:,} square kilometers")
print(f"Average land area: {average_area:,.2f} square kilometers")

Total land area: 3,017,192 square kilometers
Average land area: 754,298.00 square kilometers


#### Comparison operators
Comparison operators are used to compare two values and return a Boolean result: True if the condition is met, or False otherwise. These include == (equal to) and != (not equal to) for equality checks, as well as >, <, >=, and <= for comparing magnitudes. They are widely used for decision-making and logical expressions in programs. For example, they can determine whether a value exceeds a threshold or falls within a specific range.

In [None]:
# List of elevations (in meters) for different locations
elevations = [450, 1200, 950, 1800, 3000, 650]

# Threshold elevation (in meters)
threshold = 1000

# Use comparison operators to check which locations are above the threshold
above_threshold = [elevation > threshold for elevation in elevations]

# Use comparison operators to find the maximum elevation and compare with a value
highest_elevation = max(elevations)
is_highest_extreme = highest_elevation > 2500

# Print results
print("Elevations:", elevations)
print(f"Threshold: {threshold} meters")
print("Locations above the threshold:", above_threshold)
print(f"Highest elevation ({highest_elevation} meters) is extreme:", is_highest_extreme)

Elevations: [450, 1200, 950, 1800, 3000, 650]
Threshold: 1000 meters
Locations above the threshold: [False, True, False, True, True, False]
Highest elevation (3000 meters) is extreme: True
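
Comparison results can also be used to select or count values rather than just flag them. The following sketch, which reuses the elevation list from the example above, shows a filtering list comprehension and a chained comparison for a range check; the names `high_elevations`, `mid_range`, and `n_above` are assumptions for this illustration.

```python
elevations = [450, 1200, 950, 1800, 3000, 650]
threshold = 1000

# Keep the elevations themselves instead of True/False flags
high_elevations = [e for e in elevations if e > threshold]

# A chained comparison tests a range in one expression
mid_range = [e for e in elevations if 500 <= e <= 1500]

# Booleans count as 1 and 0, so sum() counts the matches
n_above = sum(e > threshold for e in elevations)

print(high_elevations)
print(mid_range)
print(n_above)
```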


#### Logical operators
Python provides three logical operators: and, or, and not. The and operator returns True if both conditions are True, such as in x > 10 and y < 20. The or operator returns True if at least one condition is True, as in x > 10 or y < 20. Lastly, the not operator reverses the result of a condition, returning True if the condition is False, as seen in not (x > 10). These operators are fundamental for combining and evaluating multiple logical conditions in decision-making processes.

Below is a simple geographic example using logical operators to determine whether a location meets specific criteria based on elevation and temperature.

In [None]:
# Geographic data
elevation = 800  # in meters
temperature = 22  # in degrees Celsius

# Criteria for suitability
is_suitable_elevation = elevation > 500 and elevation < 1500  # Elevation between 500m and 1500m
is_suitable_temperature = temperature > 15 and temperature < 25  # Temperature between 15°C and 25°C
is_not_extreme = not (elevation > 3000 or temperature < 0)  # Not extreme conditions

# Final suitability check
is_location_suitable = is_suitable_elevation and is_suitable_temperature and is_not_extreme

# Print results
print(f"Suitable elevation: {is_suitable_elevation}")
print(f"Suitable temperature: {is_suitable_temperature}")
print(f"Not extreme conditions: {is_not_extreme}")
print(f"Is the location suitable? {is_location_suitable}")

Suitable elevation: True
Suitable temperature: True
Not extreme conditions: True
Is the location suitable? True
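
To see how `and`, `or`, and `not` behave for every combination of inputs, it can help to print a small truth table. This is a minimal sketch; the `rows` list is only there to collect the results.

```python
rows = []
print(f"{'a':<6}{'b':<6}{'a and b':<8}{'a or b':<8}not a")

# Enumerate every combination of two boolean values
for a in (True, False):
    for b in (True, False):
        both, either, negated = a and b, a or b, not a
        rows.append((a, b, both, either, negated))
        print(f"{a!s:<6}{b!s:<6}{both!s:<8}{either!s:<8}{negated!s}")
```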


### Control structures: Loops and conditionals
Control structures in Python are fundamental tools that allow developers to control the flow of a program. They enable decision-making (conditionals) and repetitive execution (loops), making programs dynamic and efficient. For example, we can use loops and conditional statements to help categorize elevation.


In [None]:
# List of elevation points (in meters) for different locations
elevations = [450, 1200, 200, 950, 1800, 3000]

# Loop through elevations and categorize them
for elevation in elevations:
    if elevation < 500:
        category = "Lowland"
    elif 500 <= elevation < 1500:
        category = "Upland"
    elif 1500 <= elevation < 2500:
        category = "Highland"
    else:
        category = "Mountain"

    # Print the elevation and its category
    print(f"Elevation: {elevation} m - Category: {category}")

Elevation: 450 m - Category: Lowland
Elevation: 1200 m - Category: Upland
Elevation: 200 m - Category: Lowland
Elevation: 950 m - Category: Upland
Elevation: 1800 m - Category: Highland
Elevation: 3000 m - Category: Mountain


Notes:

Python provides two main types of loops: the for loop, which iterates over a sequence like lists or strings, and the while loop, which continues execution as long as a specified condition is True. For decision-making, Python uses conditional statements such as if, if-else, and if-elif-else to execute specific blocks of code based on conditions, with nested if statements allowing for more complex logic. These loops and conditional constructs are fundamental for controlling program flow, enabling dynamic and efficient execution of repetitive tasks and logical decisions.
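
The `while` loop described above can be sketched as follows. The scenario (halving a raster cell size until it drops below a target) and all values are illustrative assumptions, not part of the lab data.

```python
cell_size = 1000.0   # starting raster cell size in metres (illustrative value)
target = 100.0       # stop once cells are smaller than this
halvings = 0

# Repeat while the condition holds; the loop body must change cell_size,
# otherwise the loop would never end
while cell_size >= target:
    cell_size /= 2
    halvings += 1

print(f"Halved {halvings} times; final cell size: {cell_size} m")
```

A `for` loop suits a known sequence of items, whereas a `while` loop suits cases like this one where the number of repetitions is not known in advance.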

### Data Structures: Lists, Dictionaries, and Tuples
Data structures in Python are used to organize and store data efficiently. Among the most commonly used are lists, dictionaries, and tuples, each with distinct properties and use cases.

#### Lists
A list is a collection of ordered, mutable items, which means you can modify its contents after creation. Lists are defined using square brackets ([ ]) and can hold elements of any data type, including other lists.

Features of lists:

- Ordered - Elements are stored in the order they are added.
- Mutable - You can add, remove, or modify elements.
- Allow duplicates - Lists can contain duplicate elements.

In [None]:
# List of cities in Zimbabwe
cities = ["Harare", "Bulawayo", "Mutare", "Gweru"]

# Add a city to the list
cities.append("Victoria Falls")

# Access and print the second city
print("Second city:", cities[1])

# Print the complete list
print("Cities:", cities)

Second city: Bulawayo
Cities: ['Harare', 'Bulawayo', 'Mutare', 'Gweru', 'Victoria Falls']
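
Because lists are ordered and mutable, elements can also be replaced, removed, inserted, and sliced. The sketch below shows a few of these standard list operations; the extra city names are added purely for illustration.

```python
cities = ["Harare", "Bulawayo", "Mutare", "Gweru"]

cities[2] = "Masvingo"        # replace an element by index
cities.remove("Gweru")        # remove an element by value
cities.insert(0, "Kwekwe")    # insert at a given position

print(cities)
print(cities[:2])             # slice: the first two elements
print(cities[-1])             # negative index: the last element
print(sorted(cities))         # sorted copy; the original order is unchanged
print("Harare" in cities)     # membership test
```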


#### Dictionaries
A dictionary is a collection of key-value pairs, where each key is unique, and each key maps to a value. Dictionaries are defined using curly braces ({}).

Features of dictionaries:

- Insertion-ordered - Since Python 3.7, dictionaries preserve the order in which keys were inserted; values are looked up by key, not by position.
- Mutable - You can add, modify, or remove key-value pairs.
- Keys are unique - Duplicate keys are not allowed.

In [None]:
# Dictionary of city and its population
city_population = {
    "Harare": 1500000,
    "Bulawayo": 650000,
    "Mutare": 300000,
    "Gweru": 160000
}

# Print city populations with formatted numbers
for city, population in city_population.items():
    print(f"{city}: {population:,}")

# Add a new city with its population
city_population["Victoria Falls"] = 35000

# Print the updated dictionary with formatted numbers
print("\nUpdated city populations:")
for city, population in city_population.items():
    print(f"{city}: {population:,}")

Harare: 1,500,000
Bulawayo: 650,000
Mutare: 300,000
Gweru: 160,000

Updated city populations:
Harare: 1,500,000
Bulawayo: 650,000
Mutare: 300,000
Gweru: 160,000
Victoria Falls: 35,000
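
A few other dictionary operations are worth knowing: safe lookups with `.get()`, membership tests, updates, and removals. The sketch below reuses the population figures from the example above; the updated Gweru value and the `"no data"` default are hypothetical.

```python
city_population = {"Harare": 1500000, "Bulawayo": 650000, "Mutare": 300000, "Gweru": 160000}

# .get() returns a default instead of raising KeyError for a missing key
missing = city_population.get("Masvingo", "no data")
print(missing)

# Membership tests check the keys
print("Harare" in city_population)

# Update an existing value and remove an entry
city_population["Gweru"] = 165000          # hypothetical revised figure
removed = city_population.pop("Mutare")    # pop() returns the removed value

# Find the key with the largest value
largest_city = max(city_population, key=city_population.get)
print(largest_city, removed, list(city_population))
```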


#### Tuples
A tuple is an ordered, immutable collection of elements. Once created, the elements of a tuple cannot be changed. Tuples are defined using parentheses (()).

Features of tuples:

- Ordered - Elements maintain the order they were added.
- Immutable - Elements cannot be modified after creation.
- Allow duplicates - Tuples can contain duplicate elements.
- Efficient - Tuples typically use slightly less memory than lists of the same length, which suits fixed, read-only data.

In [None]:
# Creating a tuple
coordinates = (10.5, 20.3, 30.7)

# Accessing elements
print(coordinates[1])

# Attempting to modify a tuple (raises an error)
# coordinates[1] = 25.0  # TypeError: 'tuple' object does not support item assignment

# Unpacking a tuple
x, y, z = coordinates
print(x, y, z)

# Using tuples in functions
def get_dimensions():
    return (1920, 1080)
width, height = get_dimensions()
print(f"Width: {width}, Height: {height}")

20.3
10.5 20.3 30.7
Width: 1920, Height: 1080
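
One practical consequence of immutability is that tuples can serve as dictionary keys, while lists cannot. The sketch below illustrates this with (latitude, longitude) pairs and also shows the `TypeError` raised when a tuple element is assigned to. The dictionary name `city_by_coords` is an assumption for this example.

```python
# (latitude, longitude) tuples used as dictionary keys
city_by_coords = {
    (-17.8252, 31.0335): "Harare",
    (-20.1500, 28.5833): "Bulawayo",
}
print(city_by_coords[(-17.8252, 31.0335)])

# A list cannot be used as a key because it is mutable (unhashable)
try:
    bad = {[-17.8252, 31.0335]: "Harare"}
except TypeError as error:
    message = str(error)
    print("TypeError:", message)

# Assigning to a tuple element also raises TypeError
coordinates = (10.5, 20.3, 30.7)
try:
    coordinates[1] = 25.0
    modified = True
except TypeError:
    modified = False
    print("Tuples cannot be modified in place")
```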


### Writing and importing functions and modules
Functions and modules are fundamental for organizing and reusing code in Python. Functions are blocks of reusable code, while modules group related functions, variables, and classes into separate files.

#### Functions
A function is a block of reusable code that performs a specific task. In Python, functions are categorized based on their purpose and usage. We will focus on built-in and user-defined functions.

##### Built-in functions

Python comes with a rich set of built-in functions that are ready to use without any imports. Examples include:
- Mathematical functions: abs(), round(), pow()
- Type conversion functions: int(), float(), str()
- Data structure functions: len(), sum(), max(), min()

In [None]:
# List of elevations (in meters) for various locations
elevations = [450, 1200, 950, 1800, 3000, 650]

# Use built-in functions to analyze the elevation data
highest_elevation = max(elevations)  # Find the maximum elevation
lowest_elevation = min(elevations)  # Find the minimum elevation
total_locations = len(elevations)   # Count the number of locations
average_elevation = sum(elevations) / total_locations  # Calculate the average elevation

# Print the results
print(f"Highest elevation: {highest_elevation} meters")
print(f"Lowest elevation: {lowest_elevation} meters")
print(f"Number of locations: {total_locations}")
print(f"Average elevation: {average_elevation:.2f} meters")

type(elevations)

Highest elevation: 3000 meters
Lowest elevation: 450 meters
Number of locations: 6
Average elevation: 1341.67 meters


list

##### User-defined functions

These are functions that you define yourself to perform specific tasks.

In [None]:
# Define a function to calculate the area of a plot
def calculate_area(length, width):
    """
    Calculate the area of a rectangular region in square meters.
    """
    return length * width

# Example usage
plot_length = 50  # in meters
plot_width = 30   # in meters

area = calculate_area(plot_length, plot_width)
print(f"The area of the plot is: {area} square meters")

The area of the plot is: 1500 square meters
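
User-defined functions can also take default parameter values, which callers may override with keyword arguments. The sketch below wraps the elevation categories from the loop example earlier in this lab in a function; the function name `classify_elevation` and its parameter names are assumptions for illustration.

```python
def classify_elevation(elevation, lowland_max=500, upland_max=1500, highland_max=2500):
    """Return a terrain category for an elevation in metres.

    The default thresholds mirror the loop example earlier in this lab.
    """
    if elevation < lowland_max:
        return "Lowland"
    elif elevation < upland_max:
        return "Upland"
    elif elevation < highland_max:
        return "Highland"
    return "Mountain"

print(classify_elevation(1200))                    # uses the default thresholds
print(classify_elevation(1200, upland_max=1000))   # keyword argument overrides one default

categories = [classify_elevation(e) for e in [450, 1200, 200, 950, 1800, 3000]]
print(categories)
```

Putting the logic in a function means the thresholds are defined once and can be adjusted per call without copying the `if`/`elif` chain.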


#### Using built-in modules
A module is a single file that contains Python code, including functions, classes, and variables. It can also include runnable code. Modules are designed to organize code into reusable and logical units. Python has numerous built-in modules like math, os, and datetime, and third-party modules like numpy and pandas.
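
To make the idea of "a module is a file" concrete, the sketch below writes a tiny module to a temporary folder and imports it. The module name `lab2_geo_utils`, its contents, and the use of a temporary folder are all assumptions for this illustration; normally you would simply save the `.py` file next to your notebook or script and import it.

```python
import sys
import tempfile
from pathlib import Path

# Create a throwaway folder and write a small module file into it
module_dir = Path(tempfile.mkdtemp())
module_source = '''"""Hypothetical helper module for area conversions."""

HECTARES_PER_KM2 = 100

def km2_to_hectares(area_km2):
    """Convert square kilometres to hectares."""
    return area_km2 * HECTARES_PER_KM2
'''
(module_dir / "lab2_geo_utils.py").write_text(module_source)

# Make the folder importable, then import the module like any other
sys.path.insert(0, str(module_dir))
import lab2_geo_utils

print(lab2_geo_utils.km2_to_hectares(2.5))
```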

Below is an example of using Python's built-in math module, which provides mathematical functions like:
- radians: Converts degrees to radians.
- sin, cos: Trigonometric functions.
- sqrt: Calculates the square root.
- atan2: Computes the angle in radians.

In [None]:
# Using a built-in module
from math import radians, sin, cos, sqrt, atan2

# Define a function to calculate the great-circle distance between two coordinates
def calculate_distance(lat1, lon1, lat2, lon2):
    """
    Calculate the distance between two points (lat1, lon1) and (lat2, lon2) in kilometers.
    """
    # Radius of the Earth in kilometers
    R = 6371.0

    # Convert latitude and longitude from degrees to radians
    lat1, lon1, lat2, lon2 = map(radians, [lat1, lon1, lat2, lon2])

    # Differences in coordinates
    dlat = lat2 - lat1
    dlon = lon2 - lon1

    # Haversine formula
    a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlon / 2)**2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))
    distance = R * c

    return distance

# Example usage
harare_coords = (-17.8252, 31.0335)  # Harare, Zimbabwe
bulawayo_coords = (-20.1500, 28.5833)  # Bulawayo, Zimbabwe

distance = calculate_distance(harare_coords[0], harare_coords[1], bulawayo_coords[0], bulawayo_coords[1])
print(f"Distance between Harare and Bulawayo: {distance:.2f} km")



### Library
A library is a collection of modules packaged together to provide a specific set of functionalities. Libraries can range from a few modules to hundreds of them, designed to help developers perform tasks without having to write code from scratch. For example, NumPy is a library for numerical computations, containing numerous modules. You can use functions from the NumPy library after installing and importing it.

In the example below, elevations are stored in a NumPy array, which allows for efficient numerical computation and manipulation. The np.mean() function is used to calculate the average elevation.

In [None]:
# Import numpy
import numpy as np

# Array of elevations (in meters) for various locations
elevations = np.array([450, 1200, 950, 1800, 3000, 650])

# Calculate the average elevation
average_elevation = np.mean(elevations)

# Print results
print(f"Average elevation: {average_elevation:.2f} meters")

Average elevation: 1341.67 meters
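
Part of what makes NumPy arrays efficient is that arithmetic and comparisons apply to every element at once, without an explicit loop. The sketch below, which assumes NumPy is installed, illustrates this with the same elevation values; the names `elevations_km`, `mask`, and `high_elevations` are chosen for the example.

```python
import numpy as np

elevations = np.array([450, 1200, 950, 1800, 3000, 650])

# Arithmetic applies to every element at once (no explicit loop)
elevations_km = elevations / 1000

# A comparison yields a boolean array that can be used as a mask
mask = elevations > 1000
high_elevations = elevations[mask]

print(elevations_km)
print(high_elevations)
print(elevations.min(), elevations.max())
```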


## Best practices for functions and modules
- Use descriptive names for functions and modules.
- Write docstrings to describe what the function/module does.
- Avoid circular imports by structuring code logically.
- Reuse and modularize code into smaller modules for better readability and maintainability.

By mastering functions and modules, you can build scalable and efficient Python applications.

By C Kamusoko

© Copyright 2024.