
## Hints

In this notebook you will find some potential hints that can help you solve the tasks of exercise 1. 

- [Introduction to Python lecture 4: Functions](https://github.com/Sgeorgan/Lecture_4)
- [Introduction to Python lecture 6: Iterating dataframe rows](https://github.com/Sgeorgan/Lecture_6)
- [Introduction to Python lecture 6: Using assertions](https://github.com/Sgeorgan/Lecture_6)


### `assert` statements

**Assertions** are a Python language feature that lets programmers verify that a certain condition is met. They are particularly useful for ensuring that a variable lies within an appropriate range or has the right type for further analysis. For example, in a function that converts temperatures, an assertion can check that the input value is not below absolute zero, or that the input is a float or an integer.

In essence, `assert` statements function similarly to an electrical fuse: if the input current is too high, the fuse blows to protect the appliance that follows. Likewise, if input values are outside an expected range, the `assert` statement triggers an error, halting the program to prevent subsequent code from running with incorrect inputs.

Assertions are commonly used within functions to verify that input values are within acceptable bounds. They are also sometimes part of the exercises, so that the examiners (or you) can check that your code works as expected. Here's an example:

```python
def convert_temperature(value):
    # Ensure the temperature (in degrees Celsius) is not below absolute zero
    assert value >= -273.15, "Temperature below absolute zero!"
    # Temperature conversion logic here


# This call fails the assertion and raises an AssertionError
convert_temperature(-290.15)
```
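
The description above mentions both a type check and a range check. The following is a minimal sketch that combines the two. It assumes a Celsius-to-Kelvin conversion, and the function name `celsius_to_kelvin` is hypothetical, chosen only for illustration:

```python
def celsius_to_kelvin(temp_celsius):
    """Convert degrees Celsius to Kelvin (hypothetical helper for illustration)."""
    # Check the type first, so that the comparison below cannot fail on a string
    assert isinstance(temp_celsius, (int, float)), "Temperature must be an int or float!"
    # Then check the range
    assert temp_celsius >= -273.15, "Temperature below absolute zero!"
    return temp_celsius + 273.15


print(celsius_to_kelvin(0))  # 273.15

# A failed assertion raises an AssertionError that carries the message
try:
    celsius_to_kelvin(-290.15)
except AssertionError as error:
    print(f"AssertionError: {error}")
```

Note that Python skips `assert` statements when it is started with the `-O` option, so assertions are best suited to sanity checks during development.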

### Checking input validity in functions

In Python, ensuring that function arguments are of the expected type is important for writing robust code. The `isinstance()` function verifies that a variable or argument belongs to a specific data type, or to one of several types given as a tuple. This is particularly useful for validating inputs to functions.

The following example uses `isinstance()` to ensure that a function that squares its input only processes numbers (integers and floats). If the input is not a number, the function raises a `TypeError` with an informative message, preventing unexpected behavior further on.

```python
def square_number(number):
    # Check if the input is an instance of int or float
    if not isinstance(number, (int, float)):
        raise TypeError("Input must be an integer or float")

    # Perform the operation
    result = number ** 2

    return result

# Correct usage examples
print(square_number(4))    # Should print 16
print(square_number(3.5))  # Should print 12.25

# Incorrect usage example, uncomment to test
# print(square_number("not a number"))  # This will raise a TypeError
```
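
One detail of `isinstance()` checks that is easy to miss: `bool` is a subclass of `int`, so `True` and `False` pass a check against `(int, float)`. The sketch below shows one possible way to exclude them. The helper name `is_real_number` is hypothetical, and whether booleans should count as numbers depends on the task:

```python
def is_real_number(value):
    """Return True for ints and floats, but not for booleans (hypothetical helper)."""
    # bool is a subclass of int, so booleans are excluded explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)


print(isinstance(True, int))   # True
print(is_real_number(True))    # False
print(is_real_number(3.5))     # True
print(is_real_number("3.5"))   # False
```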


### Alternatives to `pandas.DataFrame.iterrows()`
It is entirely possible to solve *problem 3* using the `iterrows()` approach shown in [lecture 6](https://github.com/Sgeorgan/Lecture_6). Your code could look something like this:

```python
import pandas
import shapely.geometry

data = pandas.DataFrame({"x": [10, 20, 30], "y": [1, 3, 4]})

# Option 1: iterate over the DataFrame’s rows
for i, row in data.iterrows():
    point = shapely.geometry.Point(row["x"], row["y"])
    # ...
```


#### Enhancing efficiency with pandas `apply()`

**However**, there are more elegant, more concise, and usually faster solutions. One of them is the `apply()` method of a pandas `DataFrame`. It runs a user-defined function on each row or column of the DataFrame, depending on the `axis` parameter: setting `axis=1` passes each row to the function.

The results of calling the function once per row are collected in a `pandas.Series`, which `apply()` returns. This result can subsequently be assigned to a new column.

To illustrate, consider the following simple example: We define a function that takes a row as input and multiplies its `x` and `y` values:




```python
def multiply(row):
    """Multiply a row’s x and y values."""
    return (row["x"] * row["y"])

product = data.apply(multiply, axis=1)
# note how the function is not called here (no parentheses!),
# but only passed as a reference

product = list(product)
product
```
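
The text above notes that the result of `apply()` can be assigned to a new column. A minimal, self-contained sketch of that step follows. The column name `product` is an arbitrary choice for illustration:

```python
import pandas

data = pandas.DataFrame({"x": [10, 20, 30], "y": [1, 3, 4]})


def multiply(row):
    """Multiply a row's x and y values."""
    return row["x"] * row["y"]


# apply() returns a pandas.Series with one value per row,
# indexed like the original DataFrame
product = data.apply(multiply, axis=1)

# Assign the Series to a new column (the name "product" is arbitrary)
data["product"] = product
print(data)

# For plain arithmetic, operating on whole columns gives the same values
same_values = data["x"] * data["y"]
print(same_values.equals(data["product"]))
```

The column-wise multiplication in the last lines only works because `*` is defined for whole columns. Functions such as creating a geometry from each row still need `apply()` or a loop.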

#### Pandas’ `apply()` method

Exactly the same can be done with the more complex example of creating a point geometry:

```python
# Option 2: Define a custom function, and apply this function to the data frame

def create_point(row):
    """Create a Point geometry from a row with x and y values."""
    point = shapely.geometry.Point(row["x"], row["y"])
    return point

point_series = data.apply(create_point, axis=1)
```


### Applying a Lambda Function

For straightforward functions that are concise enough to fit in a single line, it's possible to utilize what's known as a *lambda function* or *lambda notation*. Lambda functions adhere to the syntax `lambda arguments: return-value`, which involves the `lambda` keyword followed by one or more argument names (separated by commas), a colon (`:`), and the expression defining the return value. For instance, a lambda function that takes two arguments and returns their sum would be written as `lambda a, b: (a + b)`.

These lambda functions are useful for their brevity and are meant to be used directly where they're defined, providing a convenient way to incorporate simple expressions without the need for defining separate functions. Although prevalent in data science, lambda functions should be used with caution. As a general guideline, lambda functions are best reserved for cases where the code can comfortably fit on one (short) line.
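
To make the syntax concrete, the sketch below compares a regular function with the equivalent lambda from the text, and then passes a lambda directly as an argument. The sample coordinates and the use of `sorted()` are illustrative choices, not part of the exercise:

```python
# A regular function and the equivalent lambda expression
def add(a, b):
    return a + b


print(add(2, 3))                     # 5
print((lambda a, b: (a + b))(2, 3))  # 5

# Typical use: define the lambda right where it is needed,
# here as the sort key for a list of (x, y) tuples
points = [(10, 1), (30, 4), (20, 3)]
points_by_y_descending = sorted(points, key=lambda point: point[1], reverse=True)
print(points_by_y_descending)  # [(30, 4), (20, 3), (10, 1)]
```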



You can find more detailed information on lambda functions in the [Python documentation](https://docs.python.org/3/tutorial/controlflow.html#lambda-expressions).


In the context of the previously mentioned geo-spatial problem, a lambda function can be applied to generate a point 'on-the-fly':


```python
# Option 3: Apply a lambda function to the data frame

point_series = data.apply(
    lambda row: shapely.geometry.Point(row["x"], row["y"]),
    axis=1
)
point_series
```


### Iterating over multiple lists simultaneously

The [built-in Python function `zip()`](https://docs.python.org/3/library/functions.html#zip) makes it easy to work with multiple lists at the same time, which is particularly useful when related data is stored in separate lists. `zip()` takes two or more iterables and returns an iterator of tuples, where each tuple contains the elements that share the same index in each input.

Consider the example of maintaining lists of pet names and their corresponding ages. Here's how you can iterate over these lists in parallel to print out a statement about each pet:



```python
# Example lists of pet names and their ages
pet_names = ["Whiskers", "Paws", "Fido"]
pet_ages = [3, 7, 2]

# Iterate over the pet names and ages lists in parallel:
for pet_name, pet_age in zip(pet_names, pet_ages):
    print(f"{pet_name} is {pet_age} years old")
```



This example illustrates why variable names should be chosen wisely: lists almost always represent multiple values, so their names should be plural (e.g., `pet_names`). In a loop, having more than one variable can quickly become confusing. Refrain from using short names such as `i` or `j` for anything but a simple counter, and use descriptive names such as `pet_name` or `pet_age` instead.
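
One behavior of `zip()` worth knowing is what happens when the lists differ in length. The sketch below uses a deliberately shortened, made-up `pet_ages` list to show that `zip()` stops at the shortest input, and that `itertools.zip_longest()` from the standard library pads the missing values instead:

```python
from itertools import zip_longest

pet_names = ["Whiskers", "Paws", "Fido"]
pet_ages = [3, 7]  # one age is missing (made-up data for illustration)

# zip() stops silently as soon as the shortest list is exhausted
pairs = list(zip(pet_names, pet_ages))
print(pairs)  # [('Whiskers', 3), ('Paws', 7)]

# zip_longest() continues to the end of the longest list
# and fills the gaps with fillvalue
padded_pairs = list(zip_longest(pet_names, pet_ages, fillvalue=None))
print(padded_pairs)  # [('Whiskers', 3), ('Paws', 7), ('Fido', None)]
```

From Python 3.10 on, `zip(pet_names, pet_ages, strict=True)` raises a `ValueError` in this situation, which can help catch mismatched inputs early.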





Sources: https://autogis-site.readthedocs.io/en/latest/lessons/lesson-1/exercise-1.html, licensed under CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/deed.en)