### License and Disclaimer

**Copyright 2024 Red Bush Analytics (Pty) Ltd**

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions.

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.<br>

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.<br>

This notebook was prepared by Red Bush Analytics for the Geostatistical Association of Southern Africa. <br> 
The reader can use the information below when appropriate credit is given where text is extracted.<br>
This document's information is generally from public sources, and the copyrights to the content contained in the links belong to the authors.  It is your responsibility to check any restrictions on use with the respective copyright holders.<br>

Opinions presented in this document are those of Red Bush Analytics and Kathleen Body (Author) as of the date published (2024/08/26) and are subject to change when presented with different information.   If you see another document with a different date and opinion, it is because I found new information, learned something and amended my opinion. Or, the Python code, Jupyter Notebook/lab software, or third-party links have changed, etc., and I updated the information. The same goes for the links presented. The information may change in the future. Check the document dates to make sure you have the information appropriate to the version of the software you are using.<br>

**The information in this document is generally appropriate for Python 3.10 and later. Substantial changes have occurred from 3.10 onwards, and the information may not be appropriate for earlier versions of Python (especially 3.8 or earlier).**


# Python Basics

This workbook introduces some basic concepts and terminology in Python.<br>
It can be used as a reference for beginners.
You should have done the Jupyter Notebook tutorial and know the difference between a Code block and a Markdown Block.
This is a Markdown block with no code.<br>
You can run this entire notebook by clicking Run on each code cell.  You will learn more if you create a new Notebook and type in the code for each code cell and then run each cell as you go along.
***************************************************************************

## Other types of variables
More complex objects include:
>- lists
>- tuples
>- dictionaries
>- matrices
>- DataFrames (a type of matrix)
>- Series
    

A **list** is an ordered collection of data.<br>
It is ordered - each element keeps its position and can be called by its index.<br>
It is mutable - the contents can be changed.<br>
It allows duplicate values.<br>
A list is denoted by square brackets [].<br>
A list can include numbers, strings, variables, another object, etc.

In [2]:
x = 6
name = 'boat'
a_list = [1, 2, 3, 'cat','dog', x, name]
a_list

[1, 2, 3, 'cat', 'dog', 6, 'boat']
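The properties of a list can be checked directly. The short sketch below uses a made-up list called `pets` (it is not part of the original notebook) to show that a list accepts duplicates, can be changed in place, and keeps its elements in order.

```python
# 'pets' is a hypothetical example list used only for illustration
pets = ['cat', 'dog', 'cat']      # duplicate values are allowed

pets.append('bird')               # mutable: add an element to the end
pets[1] = 'fish'                  # mutable: replace the element at index 1

print(pets)                       # elements keep the order they were given

# Two lists with the same elements in a different order are not equal
print(pets == ['cat', 'cat', 'fish', 'bird'])
```

The last line prints `False` because the order of the elements is part of what makes two lists equal.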

A **tuple** is an ordered collection of data.<br>
It is immutable (cannot be changed) and allows duplicate values. <br>
A tuple is denoted by curved brackets (), also called parentheses.<br>
*Note: to change a tuple, first convert to a list, change the list and convert back to a tuple.*  There are better ways to edit than the sequence below. It is just to show the difference between lists and tuples.


In [3]:
a_tuple = (1, 2, 1, 'cat','dog', x, name)
a_tuple

(1, 2, 1, 'cat', 'dog', 6, 'boat')

In [4]:
#To convert a list to a tuple
b_tuple = tuple(a_list)
b_tuple

(1, 2, 3, 'cat', 'dog', 6, 'boat')

In [5]:
#to convert a tuple to a list 
b_list = list(b_tuple)
b_list

[1, 2, 3, 'cat', 'dog', 6, 'boat']
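To see what "immutable" means in practice, the sketch below tries to change a tuple directly, catches the error Python raises, and then makes the change through a list. The names `c_tuple` and `c_list` are made up for this illustration.

```python
# 'c_tuple' and 'c_list' are hypothetical names used only for illustration
c_tuple = (1, 2, 3)

try:
    c_tuple[0] = 99               # tuples do not support item assignment
except TypeError as err:
    print('Cannot change a tuple:', err)

c_list = list(c_tuple)            # convert the tuple to a list
c_list[0] = 99                    # a list can be changed
c_tuple = tuple(c_list)           # convert back to a tuple
print(c_tuple)
```

The `try` / `except` lines are only there so the cell keeps running after the error; without them the cell would stop with a `TypeError`.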

In [6]:
#to call an element of a list call by its index 
#(number denoting place in the list) indices start at 0
b_list[0], b_list[1], b_list[4]

(1, 2, 'dog')

A **dictionary** is a collection of key: value pairs.<br>
A dictionary is mutable and does not allow duplicate keys. From Python 3.7, it keeps its items in the order they were inserted.<br>
Dictionary keys must be immutable: strings, numbers, and tuples can be keys, but mutable objects such as lists cannot.<br>
It is denoted by curly brackets {}.<br>
More information: https://realpython.com/python-dicts/

Values in a dictionary can be numbers, strings, variables, lists and other dictionaries.<br>
A value is called by its key.<br>
Dictionaries are not indexed by position.

In [7]:
a_dict = {'a': 1, 'b':2, 'c':3, 2.5:"cat", 1.2:'dog', 'f': x, 'g':name}
print(a_dict)

{'a': 1, 'b': 2, 'c': 3, 2.5: 'cat', 1.2: 'dog', 'f': 6, 'g': 'boat'}


In [8]:
var = a_dict['g']
var

'boat'

In [9]:
# An alternative way to construct a dictionary is to call dict() on a list of tuples, in this case, pairs. There are use cases for both ways of constructing dictionaries.
# Notice that the print command returns a dictionary, b_dict, with the same format as a_dict.
b_dict = dict([('a', 1),
               ('b', 2),
               ('c', 3),
               (2.5, "cat"),
               (1.2, 'dog'),
               ('f', x),
               ('g', name)])
print(b_dict)
var = a_dict[2.5]
var

{'a': 1, 'b': 2, 'c': 3, 2.5: 'cat', 1.2: 'dog', 'f': 6, 'g': 'boat'}


'cat'
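The dictionary rules can be tried out in a few lines. The sketch below uses a made-up dictionary, `c_dict`, to show adding a key, overwriting an existing key, looking up a missing key safely, and what happens when a list is used as a key.

```python
# 'c_dict' is a hypothetical dictionary used only for illustration
c_dict = {'a': 1, 'b': 2}

c_dict['c'] = 3                    # mutable: add a new key and value
c_dict['a'] = 10                   # an existing key is overwritten, not duplicated
print(c_dict)

# .get() returns a default value instead of raising a KeyError
print(c_dict.get('z', 'missing'))

c_dict[(1, 2)] = 'tuple key'       # a tuple is immutable, so it can be a key
try:
    c_dict[[1, 2]] = 'list key'    # a list is mutable, so it cannot be a key
except TypeError as err:
    print('Lists cannot be keys:', err)
```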

Dictionaries like the ones above are not common. Generally, the items share a common theme, as in the examples below.

color_theme = {'red': 1, 'green': 2, 'blue': 3, 'cyan': 4, 'magenta': 5, 'yellow': 6, 'black': 7}<br>
dog_breeds = {'great_dane': 'large', 'doberman': 'large', 'boxer': 'medium', 'cocker_spaniel': 'medium', 'maltese': 'small', 'scottie': 'small'}

## Brackets and Braces
There are three types of brackets: (), [] and {}. They have different meanings and are not interchangeable.<br>
There are two string notation symbols: ' ' and " ". They are interchangeable, but the opening and closing symbols must match. For example, x = "boat" and x = 'boat' are the same thing, but x = 'boat" is an error.

More can be found here  https://www.geeksforgeeks.org/python-syntax/
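A quick way to confirm which bracket creates which object is to ask Python for the type. This is a minimal sketch; the variable names are invented for the example.

```python
# Hypothetical variables used only to compare the three kinds of brackets
items_list = [1, 2, 2]                # square brackets: list
items_tuple = (1, 2, 2)               # curved brackets: tuple
items_dict = {'one': 1, 'two': 2}     # curly brackets with key: value pairs: dictionary

print(type(items_list))
print(type(items_tuple))
print(type(items_dict))

# Single and double quotes produce exactly the same string
print('boat' == "boat")
```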

## Matrices
A Matrix is a list of lists.  This isn't particularly useful in itself as a limited number of operations can be performed on it.

Certain types of matrices allow mathematical operations. These are called **arrays** and are dealt with in a package called **numpy**. <br>
Other types of matrices are tables, similar to Excel tables, and we use a package called **Pandas** to manipulate and use these.  In Pandas, these tables are called **DataFrames**.  A one-dimensional version of a DataFrame is called a **Series** in Pandas. 

In [10]:
a_matrix = [[1, 'cat', 3], [4, name, 6]]
a_matrix

[[1, 'cat', 3], [4, 'boat', 6]]

In [11]:
b_matrix = [[[1, 'cat', 3], [4, name, 6]],
           [[7, 8, 9], b_list]]
b_matrix

[[[1, 'cat', 3], [4, 'boat', 6]],
 [[7, 8, 9], [1, 2, 3, 'cat', 'dog', 6, 'boat']]]

In [12]:
#Matrix elements are called by position (rows and columns)
# Each element is indexed
row1 = b_matrix[0]
row2 = b_matrix[1]
bright = b_matrix[1][1]
tleft = b_matrix[0][0]
xx = b_matrix[1][1][5]   #row, column, element

print('The first row is: ', row1)
print('The second row is: ', row2)
print('The bottom right cell is: ', bright)
print('The top left cell is: ', tleft)
print('The third element of the top left cell is: ', tleft[2])
print('The sixth element of the bottom right cell is: ', xx)

The first row is:  [[1, 'cat', 3], [4, 'boat', 6]]
The second row is:  [[7, 8, 9], [1, 2, 3, 'cat', 'dog', 6, 'boat']]
The bottom right cell is:  [1, 2, 3, 'cat', 'dog', 6, 'boat']
The top left cell is:  [1, 'cat', 3]
The third element of the top left cell is:  3
The sixth element of the bottom right cell is:  6


Mostly, what we want to do is manipulate numbers or sort data into categories.  Lists are very useful for this.
Our data is usually in tables.<br>

Tables are built from lists and can be converted into Series and Arrays.
More on Series and Arrays will be shown in later sections.
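The sketch below illustrates why a plain list of lists is limited for mathematics. It uses a made-up 2 x 3 matrix called `grid`. Multiplying a list by 2 repeats the list rather than doubling each number, so every element has to be visited explicitly. The last statement is a list comprehension, a compact way of writing a loop.

```python
# 'grid' is a hypothetical 2 x 3 matrix stored as a list of lists
grid = [[1, 2, 3],
        [4, 5, 6]]

# Multiplying a list repeats it; it does not multiply each number
print(grid[0] * 2)

# To double every number, each element has to be visited explicitly
doubled = [[value * 2 for value in row] for row in grid]
print(doubled)
```

Array packages exist to make this kind of element-by-element calculation simpler.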

## Syntax in Python

Like all languages, Python has a vocabulary and "grammar" or syntax. It also has style conventions used in the coding community. You will need to learn the difference between syntax and conventional styles. While not required by Python, certain styles are expected to be followed to make your code readable to others. Not following these conventions will cause problems, and your work will not be respected.

Some examples below:



x=y+1  #Python can read and execute<br>

**convention**<br>

x = y + 1    #add space between variables, operators and numbers. <br>  

**indentation**<br>

x = y + 1   # Start at left with NO SPACES.  Python can read this line.<br>
 >x = y +1   # get an indentation error.  Python cannot read this line.<br>
 
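Spaces between code elements, i.e. variables and operators, and blank lines between lines or code blocks are called white space. This kind of white space does nothing, but having enough makes the code easier to read. Spaces at the start of a line (indentation) are different: Python reads them as part of the syntax.<br>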

Try each of the above and see what happens.
To add a block below, click on this Markdown box, then click on the box with the + sign.

Or click on the + in the menu at the top of the page.


In [13]:
#Correct syntax
y = 1
x = y+1

print(x)

2


In [14]:
# The print line produces an error due to the indent
y = 1
x = y+1

 print(x)

IndentationError: unexpected indent (404601233.py, line 5)

**Indents** - Jupyter Notebook and other editing software will automatically indent to the correct position for some statements, e.g., if statements. DO NOT CHANGE THIS; these indents are required for syntax.<br>
Subsequent lines must line up with the correct indents.  The indents tell Python which lines belong to which group of code.<br>
If the auto-indent does not work, go to the left and use the tab key to get the indent.  By convention, each indent level is 4 spaces. Python accepts other widths, but the indent must be consistent within a block.

In [15]:
#Correct syntax
# Use line spaces between blocks of code.
y = 5
if x > y:
    print(x)
else:
    print(x, " Is not greater than " , y)

if x > y:
    print(y)
else:
    print(y, " Is not less than " , x)

2  Is not greater than  5
5  Is not less than  2


In [16]:
#Incorrect Syntax 1
y = 5
if x > y:
    print(x)
    else:
    print(x, " Is not greater than " , y)

SyntaxError: invalid syntax (2174677103.py, line 5)

In [None]:
#Incorrect Syntax 2
if x > y:
    print(y)
else:
print(y, " Is not less than " , x)

IndentationError: expected an indented block after 'else' statement on line 4 (2691334561.py, line 5)

Liberal use of # followed by a comment to record what you are doing is good practice. It helps you remember what you did and helps others understand what the code is supposed to do.<br>

If using Jupyter Notebook or Jupyter Lab or VS Code, I suggest using # comments for short comments in a code block and Markdown blocks, like this one, for longer commentary. R Studio also has a Markdown language.

## Loops

Loops are blocks of code that repeat until a condition is met. Then the loop stops, and any results stored in variables during the loop are available to the rest of the code.  There are two types of loops: a ***for*** loop and a ***while*** loop.  Good explanations with examples are given here:  https://www.geeksforgeeks.org/difference-between-for-loop-and-while-loop-in-programming/   , and <br>
https://www.codingem.com/for-and-while-loops-in-python/

***For*** loops are used when you have a defined condition and you know when the code will stop, e.g., in a block model where you know you have X blocks east, Y blocks North, and Z blocks up. Do something for each block and stop.  These are always finite and will terminate unless you have made a coding error.

A ***while*** loop keeps running until a condition is met. Check and ensure there is a final condition and it is finite, or the loop will run forever.    A final condition is, normally, a specific value or number of iterations.

Examples are given in the code blocks below. You will get more examples of loops in the course and have a chance to write some simple loops.  
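The block model case mentioned above can be sketched with three nested ***for*** loops. The dimensions are invented for the example, and the "do something" step is only a counter. `range(n)` is a built-in that produces the numbers 0 to n - 1.

```python
# Hypothetical block model dimensions, chosen only for illustration
blocks_east, blocks_north, blocks_up = 3, 2, 2

block_count = 0
for i in range(blocks_east):            # i runs 0, 1, 2
    for j in range(blocks_north):       # j runs 0, 1
        for k in range(blocks_up):      # k runs 0, 1
            # "Do something" for block (i, j, k); here we only count it
            block_count = block_count + 1

print(block_count)                      # 3 * 2 * 2 = 12 blocks visited
```

Because each `range` is finite, the loops are guaranteed to stop after every block has been visited once.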

In [None]:
numbers = [1, 2, 3, 4, 5, 6]     # Define a list (do not name it "list"; that would hide Python's built-in list function)
total = 0                        # Define an initial condition
for x in numbers:                # Define variable from the list
    total += x                   # Add the numbers
  
print(total)                     # Print the total

21


The += means total = total + x. For more operators like this, see https://www.w3schools.com/python/python_operators.asp
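The same shorthand exists for the other arithmetic operators. A minimal sketch, using a made-up variable `value`:

```python
value = 10
value += 5      # same as value = value + 5, giving 15
value -= 3      # same as value = value - 3, giving 12
value *= 2      # same as value = value * 2, giving 24
value /= 4      # same as value = value / 4, giving 6.0 (division returns a float)
print(value)
```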

In [None]:
num_list = []                    # Define a variable list empty to start
total = 0                        # Define an initial condition 
list_elements = 7                # Loop limit: with <, use 1 more than the number of elements wanted; with <=, use the number itself
counter = 1                      # Define the initial number value
while counter < list_elements:
    num_list.append(counter)    # Create the list by appending each number in order
    total += counter            # Add the number to the previous total
    counter += 1                # Move to the next number higher by the increment +1
print(num_list)                 # Print the list of numbers
print(total)                    # Print the total of the numbers
print(counter)                  # Print the last counter

[1, 2, 3, 4, 5, 6]
21
7
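When the number of repeats is not known in advance, it can help to combine the stopping value with a maximum number of iterations, so that a mistake in the condition cannot make the loop run forever. The sketch below is a made-up example that halves a number until it drops below 1; the names `value`, `iterations` and `max_iterations` are assumptions for the illustration.

```python
# Hypothetical example: keep halving a number until it drops below 1
value = 100.0
iterations = 0
max_iterations = 1000             # safety limit so the loop cannot run forever

while value >= 1 and iterations < max_iterations:
    value = value / 2             # move towards the stopping value
    iterations += 1               # count how many times the loop has run

print(iterations, value)
```

The loop stops as soon as either condition fails: the value has dropped below 1, or the safety limit has been reached.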


## Definitions or Functions

If you are going to be using the same calculation many times, you can define a function and call it by name and input variables instead of writing the whole code each time. A function is defined with the keyword **def**, followed by the function name, its input variables in brackets, a colon, and the indented code.

You can define functions at the beginning of the notebook and call them from anywhere in the same notebook.
You can set up a lot of these functions and create libraries or "package" them and then call the "package".  

In [None]:

def sum_list(list_elements, first_number, interval):  # Define a function called sum_list. Inputs: list_elements - number of elements, first_number - first value, interval - step between values
    num_list = []                    # Define an empty variable list
    total = 0                        # Define an initial condition 
    list_len = 0                     # Define an initial condition length or number of elements in the list
    counter = first_number           # Define the initial number value
    while list_len < list_elements :
        num_list.append(counter)     # Create the list by adding each number to the list at the beginning of the loop
        list_len = len(num_list)     # count the length- number of elements- in the list
        total += counter             # add the number to the previous total
        counter += interval          # move to the next number higher by the increment interval
        
    print("The numbers in the list are:  ", num_list)                     # print the list of numbers
    print("The sum of the numbers in the list is:  ", total)              # print the total of the numbers
    print("The number of elements in the list is:  ", len(num_list))      # print number of elements in the list

#Create a list of 6 numbers, starting with 1, increasing by an interval of 1 and adding them together.
sum_list(6,1,1)    

The numbers in the list are:   [1, 2, 3, 4, 5, 6]
The sum of the numbers in the list is:   21
The number of elements in the list is:   6
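The function above prints its results. A function can also hand its results back with `return`, so they can be stored in variables and used in later calculations. The sketch below is a hypothetical variant, named `make_sequence` for this example, that builds the same list and returns the list and its total.

```python
def make_sequence(list_elements, first_number, interval):
    """Return a list of evenly spaced numbers and their total.

    make_sequence is a hypothetical variant of sum_list, written for illustration.
    """
    num_list = []
    counter = first_number
    while len(num_list) < list_elements:
        num_list.append(counter)      # add the current number to the list
        counter += interval           # move to the next number
    return num_list, sum(num_list)    # hand both results back to the caller


numbers, total = make_sequence(6, 1, 1)
print(numbers, total)
print(total / len(numbers))           # the returned values can be reused
```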


## Exercise 10

Use the function **sum_list** above to

1. Create a list of 12 numbers, starting with -1, and increase by an interval of 2 and add them together.<br>
2. Create a list of 5 numbers, starting with 2, and increase by an interval of 0.5 and add them together.


## Some other functions

The statistical functions in Python itself aren't very comprehensive, and we will use other packages for some of the calculations. However, some simple ones are given below.

In [None]:
data = (2,4,4,6,3,7,12,7,8,9,1,6,10,5,5,9,3)    # This is a tuple with data
sum(data)

101

In [None]:
mean(data)    # When you run this block you will get an error - Python has no built-in mean or average function

NameError: name 'mean' is not defined

In [None]:
mean = sum(data)/len(data)
mean

5.9411764705882355

Trying to calculate the variance and standard deviation gets complicated, and it's unnecessary. There are "packages" already written to do these calculations; we need to import these packages to access the code.<br>
If the package is already installed on your machine, we just need to *import* the module and call the required function.  Python comes with a module called **statistics**, which has basic statistics functions, so nothing extra needs to be installed.<br>
There is also a module called **math** which has more math functions. You will use this module in later lessons.

In [None]:
# To find out what is in the package and the syntax required, type help('statistics'), and the help reference will appear.  
help('statistics')

Help on module statistics:

NAME
    statistics - Basic statistics module.

MODULE REFERENCE
    https://docs.python.org/3.11/library/statistics.html
    
    The following documentation is automatically generated from the Python
    source files.  It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations.  When in doubt, consult the module reference at the
    location listed above.

DESCRIPTION
    This module provides functions for calculating statistics of data, including
    averages, variance, and standard deviation.
    
    Calculating averages
    --------------------
    
    Function            Description
    mean                Arithmetic mean (average) of data.
    fmean               Fast, floating point arithmetic mean.
    geometric_mean      Geometric mean of data.
    harmonic_mean       Harmonic mean of data.
    median              Median (middle value) of data.
    median_low  

In [None]:
help('math')

Help on built-in module math:

NAME
    math

DESCRIPTION
    This module provides access to the mathematical functions
    defined by the C standard.

FUNCTIONS
    acos(x, /)
        Return the arc cosine (measured in radians) of x.
        
        The result is between 0 and pi.
    
    acosh(x, /)
        Return the inverse hyperbolic cosine of x.
    
    asin(x, /)
        Return the arc sine (measured in radians) of x.
        
        The result is between -pi/2 and pi/2.
    
    asinh(x, /)
        Return the inverse hyperbolic sine of x.
    
    atan(x, /)
        Return the arc tangent (measured in radians) of x.
        
        The result is between -pi/2 and pi/2.
    
    atan2(y, x, /)
        Return the arc tangent (measured in radians) of y/x.
        
        Unlike atan(y/x), the signs of both x and y are considered.
    
    atanh(x, /)
        Return the inverse hyperbolic tangent of x.
    
    cbrt(x, /)
        Return the cube root of x.
    
    ceil(x, /)

In [None]:
import statistics as st   # we use a short alias to cut down on typing

In [None]:
# The syntax is:  module.function(input)
st.mean(data)

5.9411764705882355

In [None]:
st.geometric_mean(data)

5.076494973909092

In [None]:
st.quantiles(data)

[3.5, 6.0, 8.5]

In [None]:
st.pvariance(data)    # same as VAR.P or VARP in Excel (population variance)

8.525951557093425

In [None]:
st.variance(data)   # same as VAR.S or VAR in Excel (sample variance)

9.058823529411764
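For reference, the two variances differ only in what the sum of squared deviations is divided by. The sketch below works them out by hand and prints them next to the results from the **statistics** module. The names `sum_sq`, `pop_var` and `sample_var` are made up for the example.

```python
import statistics as st

data = (2, 4, 4, 6, 3, 7, 12, 7, 8, 9, 1, 6, 10, 5, 5, 9, 3)

n = len(data)
mean_value = sum(data) / n

# Sum of the squared differences between each value and the mean
sum_sq = sum((value - mean_value) ** 2 for value in data)

pop_var = sum_sq / n              # population variance: divide by n
sample_var = sum_sq / (n - 1)     # sample variance: divide by n - 1

print(pop_var, st.pvariance(data))
print(sample_var, st.variance(data))
print(st.stdev(data))             # sample standard deviation: square root of the sample variance
```

The hand-calculated and module values may differ in the last decimal places because of floating point rounding.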

In [None]:
mean = st.mean(data)
mean

5.9411764705882355

In [None]:
# Example of a weighted average
grades = [85, 92, 83, 91]
weights = [0.20, 0.20, 0.30, 0.30]
st.fmean(grades, weights)    # the weights argument requires Python 3.11 or later

87.6
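The `weights` argument of `fmean` was added in Python 3.11. On Python 3.10, the same weighted average can be worked out by hand, as in the sketch below. `zip` is a built-in that pairs up the elements of the two lists; the names `weighted_sum` and `weighted_mean` are made up for the example.

```python
grades = [85, 92, 83, 91]
weights = [0.20, 0.20, 0.30, 0.30]

# Weighted average = sum(value * weight) / sum(weights)
weighted_sum = sum(g * w for g, w in zip(grades, weights))
weighted_mean = weighted_sum / sum(weights)

print(round(weighted_mean, 2))
```

Dividing by the sum of the weights means the calculation also works when the weights do not add up to 1.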

## Exercise 11


Try creating a list of data and calculating some of the statistics.

Create another list with the same number of elements, but these should be weights.  In the example above, the weights add up to one. Weights such as density and volume can be anything and do not need to add up to 1.   Try both options and see what happens.

Create another list with values related to the first list (y = ax + b, where b is not constant) and try the covariance, correlation, and regression options.  

## Submitting the Exercises

Once done, make a zip file using Windows Explorer and upload the entire Notebook with the completed exercises into the LMS or email as instructed for your course. You need to make a zip file because the file types for upload are restricted, and most servers do not accept files with code. Such files could contain viruses or damage the system, so even an innocent Notebook like this one is blocked.