<a href="https://colab.research.google.com/github/bamacgabhann/GY5021/blob/2024/GY5021/1_Introduction_to_Geospatial_Data/GY5021_The_Python_Language.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>     <a href="https://mybinder.org/v2/gh/bamacgabhann/GY5021/9a706c8973d5bde0e50593ecc94941b0426f24a6?urlpath=lab%2Ftree%2FGY5021%2F1_Introduction_to_Geospatial_Data%2FGY5021_The_Python_Language.ipynb" target="_parent"><img src="https://mybinder.org/badge_logo.svg" alt="Open in Binder" /></a>

<img src="https://raw.githubusercontent.com/bamacgabhann/GY5021/2024/PD_logo.png" align=center alt="UL Geography logo"/>

# The Python language

This Notebook is going to throw a *lot* at you all at once. Don't expect, or even *try*, to fully understand and remember all of this in one go. My purpose in writing this is to give you an introduction, so that when you next see these things, they won't be brand new to you - and you'll be able to refer back to this for reference. Some of these concepts will only really make sense once you've used them for yourself, or at least once you've seen them in real action a few times. If you want to, you'll get there.

When you install Python, you're installing the Python interpreter along with what's called the Standard Library - the basic set of tools which ships with the Python language (https://docs.python.org/3/library/index.html).

But that's not where the power of Python lies. Rather, the power lies in the numerous packages - 509,528 listed on PyPI, the Python Package Index, at the time of writing - which have already been written for Python, for everything from the simplest operations all the way to machine learning and AI.

A Python package is a collection of modules which have been written for a common purpose - to do something specific. In this course, we'll use a few different Python packages, such as pandas for working with tables of data, and matplotlib for drawing graphs. 

A Python module is a file containing Python code, which defines classes of data, and methods for working with that data. For example, the ```datetime``` module includes classes of date and time data, and methods for working with these dates and times. Some Python packages will just have a single module, but others will have several different modules for different aspects of what the package is for.

A Python script is a single Python file with code for a specific purpose. You run the script to do something. If you run a module or a package, all that happens is that the different classes of data and methods for working with that data in that module or package are loaded by Python. There won't be anything produced. But a script might use some methods defined in a module or package to work with some particular data, and produce an output, like a graph, or a saved file containing processed data.

If you want your Python script to be able to use classes of data or methods from a particular package or module, you have to install the package, and then you have to import it.

## 1. Import Statements

When you open a python file or a Jupyter Notebook, you can write some simple code to do something. But even most of the Standard Library isn't loaded by default - only the subset referred to as the built-in functions (https://docs.python.org/3/library/functions.html). Loading elements requires memory, so to keep memory use low, you only bring in what's needed. So the first part of virtually every python file is doing just that, via import statements.

For example, to work with dates and times, we need the datetime module.

In [1]:
import datetime
datetime.datetime.now()

datetime.datetime(2024, 1, 29, 12, 33, 56, 145305)

When we run ```import datetime```, what we're doing is importing the datetime *module*.

The Python Standard Library includes several modules like this, containing classes of data and methods which are broadly useful enough to be worth being a default part of the language, but not essential enough to be a default part of every project - hence we only import what we need.

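We can go a little further, actually. If we only need that one method to get the current date and time, we don't have to refer to it through the module name every time. We can import just the name we need (Python still loads the whole ```datetime``` module behind the scenes, so this saves typing rather than memory):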

In [2]:
from datetime import datetime
datetime.now()

datetime.datetime(2024, 1, 29, 12, 33, 56, 166211)

This makes only the ```datetime``` class - the class of data for combined dates and times - available by name, and it comes with the method to get the current date and time.

We can also alias imports, to save writing out full names all the time.

In [3]:
from datetime import datetime as dt
dt.now()

datetime.datetime(2024, 1, 29, 12, 33, 56, 173734)
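The same three import styles work for any module. Here's a minimal sketch putting them side by side; the choice of ```math```, ```statistics```, ```date``` and ```timedelta``` is just mine for illustration (they're all in the Standard Library), and the variable names are made up:

```python
# Style 1: import a whole module, then reach into it with a dot
import math

# Style 2: import only specific names from a module
from datetime import date, timedelta

# Style 3: import a module under a shorter alias
import statistics as stats

root = math.sqrt(16)
start = date(2024, 1, 29)
one_week_later = start + timedelta(days=7)
average = stats.mean([1, 2, 3, 4])

print(root)            # 4.0
print(one_week_later)  # 2024-02-05
print(average)         # 2.5
```

Which style you pick is mostly a matter of readability: the full module name makes it obvious where something came from, while an alias saves typing for things you use constantly.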

## 2. Variables

To understand why Python is such a powerful language, you have to understand something of programming languages. 

Classic, old-school languages like FORTRAN - which my mother used in the 1970s - are _compiled_ languages. There are still a lot of these around - FORTRAN is still there, albeit rare, as is PASCAL; C underlies much of modern computing, along with its relatives C++ and C#; there are even new compiled languages, for example Rust, which will probably be my next language.

Compiled languages take the code you have written and translate it to machine code before you can run it. This compilation is heavily dependent on machine architecture and operating system - which is why you have different versions of software for PC, Mac, and Linux.

The benefit of compiling code is that it's fast.

The disadvantage is that you have to be very specific about everything: for example, if you want a variable, you have to specify what kind of data it will hold, and how much memory it will take up. This means compiled languages are generally not particularly user-friendly.

Python is a _dynamic_, interpreted language - you don't compile it yourself ahead of time. Instead, when you run a Python script, it is passed to the Python interpreter (the standard one is itself written in C), which translates the code and runs it. This intermediate step means you have a lot more latitude. You can assign variables, change their type, change their size - and the interpreter will deal with all the management.

You can assign variables with the ```=``` sign:

In [4]:
a = 5
print(a)

5


and change them, if you want:

In [5]:
a = dt.now()
print(a)

2024-01-29 12:33:56.189718
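To see the interpreter doing that management, here's a small sketch using the built-in ```type()``` function to check what kind of data a variable currently holds. The variable name ```value``` is just an example:

```python
value = 5
print(type(value).__name__)   # int

# Reassign the same name to a completely different kind of data
value = "five"
print(type(value).__name__)   # str

value = 5.0
print(type(value).__name__)   # float
```

In a compiled language with fixed types, reassigning a whole number variable to a piece of text like this would typically be rejected before the program could even run.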


## 3. Functions

Functions are pieces of code that do a specific job: typically they take some data, do something with it, and return a result. ```dt.now()```, for example, reads the current date and time from the system, and returns it. We can also define our own functions:

In [6]:
def add_5(x):
    x = x + 5
    return x

a = 5

b = add_5(a)

print(b)

10


This function _returns_ a value - note that it does not modify the value of the variable ```a```:

In [7]:
print(a)

5
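Functions aren't limited to one input. As a minimal sketch, here's a function that takes two values; the name ```rectangle_area``` and its parameter names are my own invention for this example:

```python
def rectangle_area(length, width):
    """Return the area of a rectangle with the given side lengths."""
    return length * width

# The same function can be reused with different inputs
area_small = rectangle_area(4, 5)
area_large = rectangle_area(7, 9)

print(area_small)  # 20
print(area_large)  # 63
```

As before, the function hands back a new value each time; nothing outside it changes unless you assign the result to a variable.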


## 4. Scope

It's possible to write a function which _does_ modify an existing variable, but only in some specific cases. Usually, even specifically referencing a variable inside a function doesn't make a difference:

In [8]:
def change_a():
    a = 7

change_a()

print(a)

5


This is because of a concept called _scope_. Every variable has a scope: the part of the program in which it is known. Global variables, defined at the top level of a script or Notebook, are known throughout it. Local variables, created inside a function, exist only within that function. In the example above, the assignment ```a = 7``` creates a local variable ```a``` which exists only within the function itself - it does not exist outside the function, so it does not affect the global variable ```a```, even though they have the same name.
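Here's a rough sketch of two common ways a function _can_ end up changing something outside itself: returning a value which you then assign back to the variable, and modifying the contents of an existing object such as a list. The names ```counter``` and ```scores``` are just examples. (Python also has a ```global``` statement for this purpose, which I'm not showing here.)

```python
counter = 5
scores = [1, 2, 3]

def local_only():
    counter = 7              # a new local variable; the global one is untouched

def next_value(x):
    return x + 2             # hands back a new value, changes nothing by itself

def append_score():
    scores.append(4)         # changes the existing list object, no reassignment

local_only()
print(counter)               # 5

counter = next_value(counter)  # the assignment out here is what changes counter
print(counter)               # 7

append_score()
print(scores)                # [1, 2, 3, 4]
```

The first pattern, returning a value and assigning it, is usually the easiest to follow, because the change is visible at the point where the function is called.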

## 5. Classes

In our example above, our first import was ```import datetime```.

This imports the module called ```datetime```, which is part of the standard library.

That module contains a class, which is also called ```datetime```. In our second example, ```from datetime import datetime```, we imported only this class.

You can think of classes as templates for objects, which have properties stored as variables, and actions which are functions or methods. 

For example, we could define a class for rectangles. When you create an object from a class, you can pass values to it. These are received by a special method called ```__init__()```, which you define in the class and which runs automatically each time the class is used to create an object.

In [9]:
class Rectangle:
    def __init__(self, a, b):
        self.length = max(a,b)
        self.width = min(a,b)

In [10]:
shape1 = Rectangle(4, 5)
print(shape1.length)

5


The class itself is ```Rectangle```. The object ```shape1``` is an _instance_ of this class. We could add other instances:

In [11]:
shape2 = Rectangle(7,9)
print(shape2.length)

9


This class just stores the length and width, after figuring out which of the sides is the larger, but say we want to store the area as a property. We can define a function to do this. A quirk of the language: functions defined inside a class are referred to as _methods_.

In [12]:
class Rectangle:
    def __init__(self, a, b):
        self.length = max(a,b)
        self.width = min(a,b)

    def calc_area(self):
        self.area = self.length * self.width

shape1 = Rectangle(4, 5)
shape1.calc_area()
print(shape1.area)

20


Notice that here, we didn't pass any arguments to the method, nor did it return a value. Instead, it set the variable ```shape1.area``` directly on the object - because the definition of the method took ```self``` as an argument. Python fills in ```self``` automatically with the instance the method was called on - in other words, it applies that method to the class instance itself.
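Not every method has to store its result on the object. As a minimal sketch, here's the same class with an extra ```perimeter``` method, which is a hypothetical addition of mine rather than part of the example above, and which returns a value instead:

```python
class Rectangle:
    def __init__(self, a, b):
        self.length = max(a, b)
        self.width = min(a, b)

    def calc_area(self):
        # Stores the result on the object, as in the cell above
        self.area = self.length * self.width

    def perimeter(self):
        # Hypothetical extra method: returns the result instead of storing it
        return 2 * (self.length + self.width)

shape1 = Rectangle(4, 5)
shape1.calc_area()

print(shape1.area)         # 20
print(shape1.perimeter())  # 18
```

Either approach works; storing the value is handy if you'll need it repeatedly, while returning it keeps the object from holding a number that could go stale if the sides change later.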

We can also define _subclasses_, which are just classes, but which inherit properties from the parent class. For example:

In [13]:
class Square(Rectangle):
    def __init__(self, a):
        super().__init__(a, a)

In [14]:
shape3 = Square(4)

shape3.calc_area()
print(shape3.area)

16


Note how I only passed one length when creating ```shape3```, rather than two: the ```__init__()``` method of ```Square``` takes a single value ```a```, and ```super().__init__(a, a)``` hands it to the ```__init__()``` method of ```Rectangle``` twice, instead of taking ```a``` and a different ```b```. But I also didn't have to define ```calc_area``` again - I was able to simply use the method from the ```Rectangle``` class, because my ```Square``` class is just a variant of my ```Rectangle``` class.
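You can check this "variant of" relationship directly with the built-in ```isinstance()``` function. A small sketch, repeating the two class definitions so it runs on its own (the variable names ```small_square``` and ```plain_rectangle``` are just examples):

```python
class Rectangle:
    def __init__(self, a, b):
        self.length = max(a, b)
        self.width = min(a, b)

    def calc_area(self):
        self.area = self.length * self.width

class Square(Rectangle):
    def __init__(self, a):
        super().__init__(a, a)

small_square = Square(4)
plain_rectangle = Rectangle(4, 5)

# A Square is also a Rectangle, but a plain Rectangle is not a Square
print(isinstance(small_square, Square))        # True
print(isinstance(small_square, Rectangle))     # True
print(isinstance(plain_rectangle, Square))     # False

# The attributes set by Rectangle's __init__ exist on the Square too
print(small_square.length, small_square.width)  # 4 4
```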

## 6. Flow

Once we have variables, we can do things with them. For example, we can compare values:

In [15]:
shape3.area > shape1.area

False

Because this is a straight comparison, it returns a boolean: ```True``` or ```False```. We can also use this in a conditional:

In [16]:
if shape3.area > shape1.area:
    print("shape 3 is bigger")

Nothing happened there because shape 3 _isn't_ bigger. We can account for that:

In [17]:
if shape3.area > shape1.area:
    print("shape 3 is bigger")

else:
    print("shape 1 is bigger")

shape 1 is bigger
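One gap in that version: if the two areas were equal, it would still print "shape 1 is bigger". A sketch of how ```elif``` (short for "else if") can handle a third case; I've used plain numbers and the made-up names ```area_1``` and ```area_3``` so that it runs on its own:

```python
area_1 = 20   # the area of shape1 from the cells above
area_3 = 16   # the area of shape3 from the cells above

if area_3 > area_1:
    result = "shape 3 is bigger"
elif area_3 < area_1:
    result = "shape 1 is bigger"
else:
    # Only reached when neither comparison above was True
    result = "the shapes have the same area"

print(result)   # shape 1 is bigger
```

Python checks the conditions from top to bottom and runs only the first branch that matches.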


```if``` statements are one example of flow control. Another is ```while``` loops. For example, we can repeat calculations until a certain variable reaches a particular value:

In [18]:
area = 0
n = 0
while area<30:
    square = Square(n)
    square.calc_area()
    area = square.area
    n = n+1

print(area)

36
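If you're wondering why the result is 36 rather than stopping at 30, it helps to record what happens on each pass. This is a simplified sketch of the same loop using plain multiplication rather than the ```Square``` class, with a made-up list called ```steps``` to keep track:

```python
side = 0
area = 0
steps = []    # record (side, area) on every pass through the loop

while area < 30:
    area = side * side
    steps.append((side, area))
    side = side + 1

print(steps)
# [(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36)]
```

The condition is only checked at the start of each pass. When the area was 25, the check passed, so the loop ran once more and produced 36; only then did the check fail and the loop stop.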


There's also ```for```, which is particularly useful combined with the range function:

In [19]:
for i in range(30):
    square = Square(i)
    square.calc_area()
    print(f"The area of a square side length {i} is {square.area}")

The area of a square side length 0 is 0
The area of a square side length 1 is 1
The area of a square side length 2 is 4
The area of a square side length 3 is 9
The area of a square side length 4 is 16
The area of a square side length 5 is 25
The area of a square side length 6 is 36
The area of a square side length 7 is 49
The area of a square side length 8 is 64
The area of a square side length 9 is 81
The area of a square side length 10 is 100
The area of a square side length 11 is 121
The area of a square side length 12 is 144
The area of a square side length 13 is 169
The area of a square side length 14 is 196
The area of a square side length 15 is 225
The area of a square side length 16 is 256
The area of a square side length 17 is 289
The area of a square side length 18 is 324
The area of a square side length 19 is 361
The area of a square side length 20 is 400
The area of a square side length 21 is 441
The area of a square side length 22 is 484
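The area of a square side length 23 is 529
The area of a square side length 24 is 576
The area of a square side length 25 is 625
The area of a square side length 26 is 676
The area of a square side length 27 is 729
The area of a square side length 28 is 784
The area of a square side length 29 is 841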

Now, note two things here. First, range(30) started at 0 and went up to 29. This is how numbers work in many aspects of python - indexes start at 0, not 1, and usually go up to n-1 for the number you type in. So list[4] is the _fifth_ item of a list, not the fourth, because there's also list[0], list[1], list[2], and list[3]. 
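A quick sketch of that zero-based counting, using a made-up list called ```letters```:

```python
letters = ["a", "b", "c", "d", "e"]

print(letters[0])     # a  (the first item)
print(letters[4])     # e  (the fifth and last item)
print(len(letters))   # 5

# range(n) gives 0 up to n-1
print(list(range(5)))      # [0, 1, 2, 3, 4]

# range(start, stop) begins at start, but still stops just before stop
print(list(range(1, 6)))   # [1, 2, 3, 4, 5]
```

Since the last valid index is always one less than the length, ```letters[5]``` here would raise an ```IndexError```.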

Second, I did something fancy in the print() statement. I've used print() a few times, to output a result, but this is the first time I've used an f-string. This formatting allows you to insert variables into a text string. This isn't just for print statements - anywhere you have a text string, you can use this formatting. Simply open with f" rather than just ", and you can put variables in braces within the string: {}.
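For instance, here's a small sketch of f-strings being stored in variables rather than printed straight away. The variable names are just examples, and the ```:.2f``` part is a format specifier asking for two decimal places:

```python
name = "shape1"
length = 5
width = 4

# f-strings work anywhere a string is needed, not only inside print()
label = f"{name}: {length} x {width}"

# Expressions can go inside the braces, with a format specifier after a colon
summary = f"area = {length * width}, ratio = {length / width:.2f}"

print(label)     # shape1: 5 x 4
print(summary)   # area = 20, ratio = 1.25
```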

This is just a very brief introduction to the Python language, and it went very quickly through some really fundamental concepts. As I said at the top, don't expect to have understood all of this, or to remember it perfectly - but you've seen some of the core basics now, so they'll be less new when they come up next, and you can refer back to this at any time. You'll very likely want to refer back to it a lot.

___

Week 1 Notebooks: 

1. Geospatial Software and Programming Languages <a href="https://colab.research.google.com/github/bamacgabhann/GY5021/blob/2024/GY5021/1_Introduction_to_Geospatial_Data/GY5021_1_Geospatial_Software_and_Programming_Languages.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>     <a href="https://mybinder.org/v2/gh/bamacgabhann/GY5021/9a706c8973d5bde0e50593ecc94941b0426f24a6?urlpath=lab%2Ftree%2FGY5021%2F1_Introduction_to_Geospatial_Data%2FGY5021_1_Geospatial_Software_and_Programming_Languages.ipynb" target="_parent"><img src="https://mybinder.org/badge_logo.svg" alt="Open in Binder" /></a>

2. Data Types <a href="https://colab.research.google.com/github/bamacgabhann/GY5021/blob/2024/GY5021/1_Introduction_to_Geospatial_Data/GY5021_2_Data_Types.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>     <a href="https://mybinder.org/v2/gh/bamacgabhann/GY5021/9a706c8973d5bde0e50593ecc94941b0426f24a6?urlpath=lab%2Ftree%2FGY5021%2F1_Introduction_to_Geospatial_Data%2FGY5021_2_Data_Types.ipynb" target="_parent"><img src="https://mybinder.org/badge_logo.svg" alt="Open in Binder" /></a>

3. Vector Data <a href="https://colab.research.google.com/github/bamacgabhann/GY5021/blob/2024/GY5021/1_Introduction_to_Geospatial_Data/GY5021_3_Vector_Data.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>     <a href="https://mybinder.org/v2/gh/bamacgabhann/GY5021/9a706c8973d5bde0e50593ecc94941b0426f24a6?urlpath=lab%2Ftree%2FGY5021%2F1_Introduction_to_Geospatial_Data%2FGY5021_3_Vector_Data.ipynb" target="_parent"><img src="https://mybinder.org/badge_logo.svg" alt="Open in Binder" /></a>

4. Attribute Data <a href="https://colab.research.google.com/github/bamacgabhann/GY5021/blob/2024/GY5021/1_Introduction_to_Geospatial_Data/GY5021_4_Attribute_Data.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>     <a href="https://mybinder.org/v2/gh/bamacgabhann/GY5021/9a706c8973d5bde0e50593ecc94941b0426f24a6?urlpath=lab%2Ftree%2FGY5021%2F1_Introduction_to_Geospatial_Data%2FGY5021_4_Attribute_Data.ipynb" target="_parent"><img src="https://mybinder.org/badge_logo.svg" alt="Open in Binder" /></a>

5. Coordinate Reference Systems <a href="https://colab.research.google.com/github/bamacgabhann/GY5021/blob/2024/GY5021/1_Introduction_to_Geospatial_Data/GY5021_5_Coordinate_Reference_Systems.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>     <a href="https://mybinder.org/v2/gh/bamacgabhann/GY5021/9a706c8973d5bde0e50593ecc94941b0426f24a6?urlpath=lab%2Ftree%2FGY5021%2F1_Introduction_to_Geospatial_Data%2FGY5021_5_Coordinate_Reference_Systems.ipynb" target="_parent"><img src="https://mybinder.org/badge_logo.svg" alt="Open in Binder" /></a>

6. Geospatial Data Files <a href="https://colab.research.google.com/github/bamacgabhann/GY5021/blob/2024/GY5021/1_Introduction_to_Geospatial_Data/GY5021_6_Geospatial_Data_Files.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>     <a href="https://mybinder.org/v2/gh/bamacgabhann/GY5021/9a706c8973d5bde0e50593ecc94941b0426f24a6?urlpath=lab%2Ftree%2FGY5021%2F1_Introduction_to_Geospatial_Data%2FGY5021_6_Geospatial_Data_Files.ipynb" target="_parent"><img src="https://mybinder.org/badge_logo.svg" alt="Open in Binder" /></a>

7. Vector Geoprocessing <a href="https://colab.research.google.com/github/bamacgabhann/GY5021/blob/2024/GY5021/1_Introduction_to_Geospatial_Data/GY5021_7_Vector_Geoprocessing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>     <a href="https://mybinder.org/v2/gh/bamacgabhann/GY5021/9a706c8973d5bde0e50593ecc94941b0426f24a6?urlpath=lab%2Ftree%2FGY5021%2F1_Introduction_to_Geospatial_Data%2FGY5021_7_Vector_Geoprocessing.ipynb" target="_parent"><img src="https://mybinder.org/badge_logo.svg" alt="Open in Binder" /></a>

Additional:

- The Python Language <a href="https://colab.research.google.com/github/bamacgabhann/GY5021/blob/2024/GY5021/1_Introduction_to_Geospatial_Data/GY5021_The_Python_Language.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>     <a href="https://mybinder.org/v2/gh/bamacgabhann/GY5021/9a706c8973d5bde0e50593ecc94941b0426f24a6?urlpath=lab%2Ftree%2FGY5021%2F1_Introduction_to_Geospatial_Data%2FGY5021_The_Python_Language.ipynb" target="_parent"><img src="https://mybinder.org/badge_logo.svg" alt="Open in Binder" /></a>

- Getting Started Seriously With Python <a href="https://colab.research.google.com/github/bamacgabhann/GY5021/blob/2024/GY5021/1_Introduction_to_Geospatial_Data/GY5021_Getting_Started_Seriously_With_Python.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>     <a href="https://mybinder.org/v2/gh/bamacgabhann/GY5021/9a706c8973d5bde0e50593ecc94941b0426f24a6?urlpath=lab%2Ftree%2FGY5021%2F1_Introduction_to_Geospatial_Data%2FGY5021_Getting_Started_Seriously_With_Python.ipynb" target="_parent"><img src="https://mybinder.org/badge_logo.svg" alt="Open in Binder" /></a>