# Introduction to Python for Open Source Geocomputation

![python](pics/python-logo-master-v3-TM.png)

* Instructor: Dr. Wei Kang

Content:

* Scripts
* Modules
* Packages

### Two working modes: interactively and running scripts

* Interactive working mode (the interpreter, for example a code cell in a Jupyter Notebook)
    * You can use the interpreter to build up and test pieces of code until they work to your liking, at which point you can save them to a text file (a script).
* Running scripts:
    * Scripts are useful when you want to permanently save some code with an eye for reuse later. 

Complementary: using both an interpreter and scripts together is a common use pattern for scientific programming in Python.

# What is a script?


* write the code in text files 
    * using a text editor
    * Integrated Development Environment (IDE)
        * more features such as build automation, code highlighting, testing and debugging.
        * examples: PyCharm, Spyder, Visual Studio
* The file has the **.py** extension  

### Writing and Running Python Scripts

* Create a new file from the Jupyter directory called `hello.py` and enter the following into this file 

```python 
print('Hello World!')
```
* Running the script `hello.py`

    * Shell (Windows: Git BASH, Mac: terminal): 
    ```
    python hello.py 
    ```

    * Python interpreter:
    ```python
    >>> exec(open("hello.py").read())
    ```


In [1]:
exec(open("hello11.py").read())

hello!


# What is a Module?

* More organized "script" 
    * A module is a set of **functions and data structures**.
* A module is a file containing Python code that can be **reused** in other Python code files.
* The file of a Python module has the **.py** extension.
* A module is often reused by `import` statements.

## Standard Python Modules

Python comes with a rich set of standard modules ([learn more](https://docs.python.org/3/tutorial/stdlib.html)):

* Numeric and Mathematical Modules
    * numbers — Numeric abstract base classes
    * math — Mathematical functions
    * cmath — Mathematical functions for complex numbers
    * decimal — Decimal fixed point and floating point arithmetic
    * fractions — Rational numbers
    * random — Generate pseudo-random numbers
    * statistics — Mathematical statistics functions
* Functional Programming Modules
    * itertools — Functions creating iterators for efficient looping
    * functools — Higher-order functions and operations on callable objects
    * operator — Standard operators as functions
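
As a brief illustration of the list above, the sketch below uses three of these standard modules. The variable names are chosen here for illustration only.

```python
from fractions import Fraction   # exact rational arithmetic
import itertools                 # iterator building blocks
import statistics                # basic descriptive statistics

# Fractions stay exact instead of turning into rounded floats.
half = Fraction(1, 3) + Fraction(1, 6)
print(half)            # 1/2

# Mean of a small sample.
sample_mean = statistics.mean([2, 4, 6])
print(sample_mean)     # 4

# All unordered pairs drawn from three letters.
pairs = list(itertools.combinations("abc", 2))
print(pairs)           # [('a', 'b'), ('a', 'c'), ('b', 'c')]
```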



## Using Modules

* getting the whole module: `import module_name`
* only getting a specific function/class/attribute: `from module_name import something`

In [2]:
import math

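Type `math.` and press the Tab key to list all the objects (functions, variables, etc.) defined in the `math` module. Note that running a cell that contains only `math.` raises a `SyntaxError`, because the expression is incomplete.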

In [3]:
math.

SyntaxError: invalid syntax (3171735483.py, line 1)

In [4]:
math.isinf?

Signature: math.isinf(x, /)
Docstring: Return True if x is a positive or negative infinity, and False otherwise.
Type:      builtin_function_or_method

In [5]:
help(math.isinf)

Help on built-in function isinf in module math:

isinf(x, /)
    Return True if x is a positive or negative infinity, and False otherwise.



In [6]:
dir(math) # used to find out which names a module defines

['__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'acos',
 'acosh',
 'asin',
 'asinh',
 'atan',
 'atan2',
 'atanh',
 'cbrt',
 'ceil',
 'comb',
 'copysign',
 'cos',
 'cosh',
 'degrees',
 'dist',
 'e',
 'erf',
 'erfc',
 'exp',
 'exp2',
 'expm1',
 'fabs',
 'factorial',
 'floor',
 'fmod',
 'frexp',
 'fsum',
 'gamma',
 'gcd',
 'hypot',
 'inf',
 'isclose',
 'isfinite',
 'isinf',
 'isnan',
 'isqrt',
 'lcm',
 'ldexp',
 'lgamma',
 'log',
 'log10',
 'log1p',
 'log2',
 'modf',
 'nan',
 'nextafter',
 'perm',
 'pi',
 'pow',
 'prod',
 'radians',
 'remainder',
 'sin',
 'sinh',
 'sqrt',
 'tan',
 'tanh',
 'tau',
 'trunc',
 'ulp']

Use the form ``module.function_name`` to call a function that lives inside the module.

In [7]:
math.sqrt(4)

2.0

Use the form ``module.attribute`` to access an attribute (such as a constant) that lives inside the module.

In [8]:
math.pi

3.141592653589793

Use an alias to rename the module upon import, which is handy when you have a long
module name and want to save some typing.

In [9]:
import math as m

In [10]:
m.pi

3.141592653589793

In [11]:
pi = 10

In [12]:
m.pi

3.141592653589793

`from module_name import something`

In [13]:
dir() # get a list of the names defined in the current namespace

['In',
 'Out',
 '_',
 '_10',
 '_12',
 '_6',
 '_7',
 '_8',
 '__',
 '___',
 '__builtin__',
 '__builtins__',
 '__doc__',
 '__loader__',
 '__name__',
 '__package__',
 '__session__',
 '__spec__',
 '_dh',
 '_i',
 '_i1',
 '_i10',
 '_i11',
 '_i12',
 '_i13',
 '_i2',
 '_i3',
 '_i4',
 '_i5',
 '_i6',
 '_i7',
 '_i8',
 '_i9',
 '_ih',
 '_ii',
 '_iii',
 '_oh',
 'exit',
 'get_ipython',
 'm',
 'math',
 'open',
 'pi',
 'quit']

In [14]:
from math import floor

In [15]:
help(floor)

Help on built-in function floor in module math:

floor(x, /)
    Return the floor of x as an Integral.
    
    This is the largest integer <= x.



In [16]:
dir()

['In',
 'Out',
 '_',
 '_10',
 '_12',
 '_13',
 '_6',
 '_7',
 '_8',
 '__',
 '___',
 '__builtin__',
 '__builtins__',
 '__doc__',
 '__loader__',
 '__name__',
 '__package__',
 '__session__',
 '__spec__',
 '_dh',
 '_i',
 '_i1',
 '_i10',
 '_i11',
 '_i12',
 '_i13',
 '_i14',
 '_i15',
 '_i16',
 '_i2',
 '_i3',
 '_i4',
 '_i5',
 '_i6',
 '_i7',
 '_i8',
 '_i9',
 '_ih',
 '_ii',
 '_iii',
 '_oh',
 'exit',
 'floor',
 'get_ipython',
 'm',
 'math',
 'open',
 'pi',
 'quit']

In [17]:
floor(4.1)

4

In [18]:
math.floor(4.1)

4

In [19]:
from math import * # import everything in the math module, not recommended

In [20]:
dir()

['In',
 'Out',
 '_',
 '_10',
 '_12',
 '_13',
 '_16',
 '_17',
 '_18',
 '_6',
 '_7',
 '_8',
 '__',
 '___',
 '__builtin__',
 '__builtins__',
 '__doc__',
 '__loader__',
 '__name__',
 '__package__',
 '__session__',
 '__spec__',
 '_dh',
 '_i',
 '_i1',
 '_i10',
 '_i11',
 '_i12',
 '_i13',
 '_i14',
 '_i15',
 '_i16',
 '_i17',
 '_i18',
 '_i19',
 '_i2',
 '_i20',
 '_i3',
 '_i4',
 '_i5',
 '_i6',
 '_i7',
 '_i8',
 '_i9',
 '_ih',
 '_ii',
 '_iii',
 '_oh',
 'acos',
 'acosh',
 'asin',
 'asinh',
 'atan',
 'atan2',
 'atanh',
 'cbrt',
 'ceil',
 'comb',
 'copysign',
 'cos',
 'cosh',
 'degrees',
 'dist',
 'e',
 'erf',
 'erfc',
 'exit',
 'exp',
 'exp2',
 'expm1',
 'fabs',
 'factorial',
 'floor',
 'fmod',
 'frexp',
 'fsum',
 'gamma',
 'gcd',
 'get_ipython',
 'hypot',
 'inf',
 'isclose',
 'isfinite',
 'isinf',
 'isnan',
 'isqrt',
 'lcm',
 'ldexp',
 'lgamma',
 'log',
 'log10',
 'log1p',
 'log2',
 'm',
 'math',
 'modf',
 'nan',
 'nextafter',
 'open',
 'perm',
 'pi',
 'pow',
 'prod',
 'quit',
 'radians',
 'remainder

In [21]:
pi

3.141592653589793

In [22]:
import math

In [23]:
math.pi

3.141592653589793

### Issues with `from math import * `

While this approach might seem like a good idea, it can create namespace
clashes, which arise when the module has objects with names that are identical
to those in the current namespace. In other words, if we already had a
``floor`` object in our namespace, that name is now overwritten and points to the
object in the math module instead of the original ``floor`` object.

So using the explicit module name or alias
to protect the namespace is generally **good practice**. It also lets us reuse the
same name across different modules.
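
A minimal sketch of such a clash is shown below. The user-defined `floor` is a made-up helper for illustration, and a single-name import stands in for the star import, which would rebind `floor` in the same way along with every other public name in `math`.

```python
def floor(x):
    """Hypothetical helper: round x down to the nearest multiple of ten."""
    return (x // 10) * 10

before = floor(47)
print(before)    # 40, our own function

# A star import would rebind every public name from math, including floor;
# importing just this one name shows the same effect.
from math import floor

after = floor(47)
print(after)     # 47, math.floor has silently replaced our function

# Keeping the module prefix avoids the clash.
import math
print(math.floor(47.9))   # 47
```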

### Further learning about python standard modules

* Python has many more useful standard modules.
    * The following tutorials and documentation cover them in more detail:
        * [Brief Tour of the Standard Library](https://docs.python.org/3/tutorial/stdlib.html)
        * [Python 3 Module of the Week](https://pymotw.com/3/)
        * [Python - Built-in Modules from tutorialsteacher](https://www.tutorialsteacher.com/python/python-builtin-modules)
* These standard Python modules are the building blocks of many external modules and projects.

In [24]:
import os #python standard module for operating system interface

In [25]:
os.getcwd() # Current Directory

'/Users/wk0110/My Drive (weikang9009@gmail.com)/teaching/Intro to Python/2023_Fall/geog5560_2023Fall/notebooks'

In [26]:
os.listdir()  #Return a list containing the names of the files in the directory.

['hello.py',
 '02.2_Functions.ipynb',
 '05.1_Strings.ipynb',
 '.DS_Store',
 '15.1_Mapping.ipynb',
 '01.1_Introduction.ipynb',
 '12.1_geopandas.ipynb',
 '06.1_Lists.ipynb',
 '02.1_Program-Variables-Operators.ipynb',
 '08.1_Python_Ecosystem.ipynb',
 '11.1_Matplotlib.ipynb',
 '01.2_Installation-Notebook-GitHub.ipynb',
 'ex1.csv',
 '08.2_Iteration(2).ipynb',
 'hw4.py',
 '10.2_Numpy(1).ipynb',
 'pics',
 '07.1_Sets_Dictionaries.ipynb',
 'ex5.csv',
 '__pycache__',
 '11.2_pandas.ipynb',
 'hello11.py',
 '04.1_Conditionals_Strings.ipynb',
 '04_Git.ipynb',
 '10.1_OOP(3).ipynb',
 'untitled1.txt',
 '8.1_Final_Project_Template.ipynb',
 '09.2_OOP(2).ipynb',
 '03.1_ScalarDataTypes.ipynb',
 '.ipynb_checkpoints',
 '10.2_Numpy(2).ipynb',
 '09.1_OOP(1).ipynb',
 '06.2_Lists_Tuples-Copy1.ipynb',
 '04.2_Strings_Iteration.ipynb',
 'data',
 '05.2_Strings_Lists.ipynb',
 '06.2_Lists_Tuples.ipynb',
 '08.1_Final_Project.ipynb',
 '08.2_Functions(2).ipynb',
 '08.1_Scripts_Modules.ipynb']

In [27]:
os.listdir('/Users/wk0110/My Drive/teaching') 

FileNotFoundError: [Errno 2] No such file or directory: '/Users/wk0110/My Drive/teaching'

## Writing our own module

We will create a module and include the function that was used to `Check whether two words (strings) are the reverse of each other` (Part B of HW4) in the module. Our module will be named `hw4.py` and the function will be called `is_reverse` and take two arguments:

* string1
* string2

It will return `True` or `False`.

From the Jupyter Notebook Dashboard, click "New" > "Text File" to create the file ``hw4.py`` with the following contents (you may instead copy your solution to Part B of HW4 into this file):

```python
def is_reverse(string1, string2):
    """compare two words (strings) and return True if one of the words is the reverse of the other
    
    Parameters
    ----------
    string1 : str
              A string containing one word
    string2 : str
              A string containing one word          
        
    Return
    ------
            : bool
              True if string1 is the reverse of string2; Otherwise, False
    
    """
    length1 = len(string1)
    length2 = len(string2)
    if length1 != length2:
        return False
    
    for i in range(length1):
        if string1[i] != string2[-1-i]:
            return False
    return True
```
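
For comparison, the same check can be sketched with slice notation, where `string2[::-1]` produces a reversed copy of `string2`. The name `is_reverse_slice` is made up here for illustration and is not part of the HW4 solution.

```python
def is_reverse_slice(string1, string2):
    """Return True if string1 is the reverse of string2, otherwise False."""
    # string2[::-1] walks string2 from the last character to the first.
    return string1 == string2[::-1]

print(is_reverse_slice("Happy", "yppaH"))   # True
print(is_reverse_slice("Happy", "Happy"))   # False
print(is_reverse_slice("ha", "ds"))         # False
```

The explicit loop in `is_reverse` can stop at the first mismatch, while the slice version is shorter to read; both should agree on every pair of strings.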

Importing the module gives access to its objects, using the `module.object` syntax

Now we can use our module by importing the function from the module:

In [28]:
from hw4 import is_reverse

In [29]:
dir()

['In',
 'Out',
 '_',
 '_10',
 '_12',
 '_13',
 '_16',
 '_17',
 '_18',
 '_20',
 '_21',
 '_23',
 '_25',
 '_26',
 '_6',
 '_7',
 '_8',
 '__',
 '___',
 '__builtin__',
 '__builtins__',
 '__doc__',
 '__loader__',
 '__name__',
 '__package__',
 '__session__',
 '__spec__',
 '_dh',
 '_i',
 '_i1',
 '_i10',
 '_i11',
 '_i12',
 '_i13',
 '_i14',
 '_i15',
 '_i16',
 '_i17',
 '_i18',
 '_i19',
 '_i2',
 '_i20',
 '_i21',
 '_i22',
 '_i23',
 '_i24',
 '_i25',
 '_i26',
 '_i27',
 '_i28',
 '_i29',
 '_i3',
 '_i4',
 '_i5',
 '_i6',
 '_i7',
 '_i8',
 '_i9',
 '_ih',
 '_ii',
 '_iii',
 '_oh',
 'acos',
 'acosh',
 'asin',
 'asinh',
 'atan',
 'atan2',
 'atanh',
 'cbrt',
 'ceil',
 'comb',
 'copysign',
 'cos',
 'cosh',
 'degrees',
 'dist',
 'e',
 'erf',
 'erfc',
 'exit',
 'exp',
 'exp2',
 'expm1',
 'fabs',
 'factorial',
 'floor',
 'fmod',
 'frexp',
 'fsum',
 'gamma',
 'gcd',
 'get_ipython',
 'hypot',
 'inf',
 'is_reverse',
 'isclose',
 'isfinite',
 'isinf',
 'isnan',
 'isqrt',
 'lcm',
 'ldexp',
 'lgamma',
 'log',
 'log10',
 'l

In [30]:
is_reverse?

Signature: is_reverse(string1, string2)
Docstring:
compare two words (strings) and return True if one of the words is the reverse of the other

Parameters
----------
string1 : str
          A string containing one word
string2 : str
          A string containing one word          
    
Return
------
        : bool
          True if string1 is the reverse of string2; Otherwise, False
File:      ~/My Drive (weikang9009@gmail.com)/teaching/Intro to Python/2023_Fall/geog5560_2023Fall/notebooks/hw4.py
Type:      function

In [31]:
is_reverse("Happy", "Happy")

False

In [32]:
is_reverse("Happy", "yppaH")

True

Or we import the module:

In [33]:
import hw4

In [34]:
hw4.is_reverse("ha","ds")

False

In [35]:
hw4?

Type:        module
String form: <module 'hw4' from '/Users/wk0110/My Drive (weikang9009@gmail.com)/teaching/Intro to Python/2023_Fall/geog5560_2023Fall/notebooks/hw4.py'>
File:        ~/My Drive (weikang9009@gmail.com)/teaching/Intro to Python/2023_Fall/geog5560_2023Fall/notebooks/hw4.py
Docstring:   <no docstring>

In [36]:
dir(hw4)

['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'is_reverse']

In [37]:
hw4.is_reverse("Happy", "Happy")

False

In [38]:
hw4.is_reverse("Happy", "yppaH")

True

### Documenting your module with Docstrings

* It is good practice to document your modules so that users know what the objects in the module do and how to use them.
* A Python docstring is a string used to document a Python module, class, function or method, so programmers can understand what it does without having to read the details of the implementation. 
* It is a common practice to generate online (html) documentation automatically from docstrings.
* Everything between the pair of *triple quotes* is our function's **docstring**.
* There are some [standard conventions](http://www.python.org/dev/peps/pep-0257/) and [numpy doc style](https://numpydoc.readthedocs.io/en/latest/format.html) for writing docstrings that provide consistency. 
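
The sketch below shows one way a docstring can be written and then read back programmatically. The function `circle_area` is a hypothetical example, and its docstring loosely follows the numpy doc style linked above.

```python
import inspect
import math

def circle_area(radius):
    """Return the area of a circle.

    Parameters
    ----------
    radius : float
        Radius of the circle.

    Returns
    -------
    float
        Area of the circle.
    """
    return math.pi * radius ** 2

# The raw docstring is stored on the function object itself.
raw_doc = circle_area.__doc__

# inspect.getdoc returns the same text with the indentation cleaned up.
clean_doc = inspect.getdoc(circle_area)
summary = clean_doc.splitlines()[0]
print(summary)   # Return the area of a circle.
```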

In [39]:
hw4.is_reverse?

Signature: hw4.is_reverse(string1, string2)
Docstring:
compare two words (strings) and return True if one of the words is the reverse of the other

Parameters
----------
string1 : str
          A string containing one word
string2 : str
          A string containing one word          
    
Return
------
        : bool
          True if string1 is the reverse of string2; Otherwise, False
File:      ~/My Drive (weikang9009@gmail.com)/teaching/Intro to Python/2023_Fall/geog5560_2023Fall/notebooks/hw4.py
Type:      function

In [40]:
help(hw4.is_reverse)

Help on function is_reverse in module hw4:

is_reverse(string1, string2)
    compare two words (strings) and return True if one of the words is the reverse of the other
    
    Parameters
    ----------
    string1 : str
              A string containing one word
    string2 : str
              A string containing one word          
        
    Return
    ------
            : bool
              True if string1 is the reverse of string2; Otherwise, False



### Locating your module

When issuing an ``import`` statement Python searches a list of directories: 

* The directory containing the input script (or the current directory).
* PYTHONPATH (a list of directory names, with the same syntax as the shell variable PATH).
* The installation-dependent default (by convention including a site-packages directory, handled by the site module).

Over time you will likely
accumulate more of your own modules. Rather than copying them
to each new working directory (and keeping multiple copies of the same file on your
system), you can create a central directory to hold your Python
modules. How you do this depends on your operating system. The Python documentation website provides:

* [instructions for windows](https://docs.python.org/3/using/windows.html#finding-modules)
* [instructions for Unix type systems (including Mac)](https://docs.python.org/3/tutorial/modules.html#the-module-search-path)
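
The search path can also be inspected and extended from within Python through `sys.path`. The sketch below is only an illustration: it uses a temporary directory as a stand-in for a central module directory, and the module name `my_tools_demo` and its `greet` function are made up for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# A throwaway directory stands in for a central directory of your own modules.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "my_tools_demo.py").write_text(
    "def greet():\n"
    "    return 'hello from my_tools_demo'\n"
)

# sys.path holds the directories Python searches, in order, on import.
sys.path.append(str(module_dir))
importlib.invalidate_caches()   # make sure the new file is noticed

import my_tools_demo

message = my_tools_demo.greet()
print(message)   # hello from my_tools_demo
```

Changes made through `sys.path` last only for the current session; setting the `PYTHONPATH` environment variable, as described in the linked instructions, makes the directory available to every session.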

# Python Packages

* Suitable for a large application that includes many *modules* 
* Allow for a hierarchical structuring of the *module* namespace using dot notation `.`. In the same way that modules help avoid collisions between global variable names, packages help avoid collisions between module names.
* A special file called ``__init__.py`` (which may be empty) tells Python that the directory is a Python package, from which modules can be imported.

## Importing a package

<img src="pics/pkg.png">

Given this structure, if the pkg directory resides in a location where it can be found (in one of the directories contained in `sys.path`), you can refer to the two modules with dot notation (`pkg.mod1`, `pkg.mod2`) and `import` them with the syntax you are already familiar with:

```python
import pkg.mod1, pkg.mod2
```

If `mod1` has a function `bar()` and `mod2` has a function `foo()`, we can import and call these functions as shown below:

```python
import pkg.mod1, pkg.mod2
pkg.mod1.bar()
pkg.mod2.foo()
```

Or

```python
from pkg.mod1 import bar
bar()
from pkg.mod2 import foo
foo()
```
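
To try this without preparing files by hand, the sketch below builds a small package in a temporary directory and imports from it. The package is named `pkg_demo` here (a made-up name, chosen to avoid clashing with anything already installed), and the bodies of `bar()` and `foo()` are placeholders.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build the layout: pkg_demo/__init__.py, pkg_demo/mod1.py, pkg_demo/mod2.py
root = Path(tempfile.mkdtemp())
pkg_dir = root / "pkg_demo"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")   # empty file marks the package
(pkg_dir / "mod1.py").write_text("def bar():\n    return 'bar from mod1'\n")
(pkg_dir / "mod2.py").write_text("def foo():\n    return 'foo from mod2'\n")

# The directory that contains the package must be on the search path.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Dot notation reaches the modules inside the package.
import pkg_demo.mod1
from pkg_demo.mod2 import foo

result_bar = pkg_demo.mod1.bar()
result_foo = foo()
print(result_bar)   # bar from mod1
print(result_foo)   # foo from mod2
```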

In [41]:
import scipy

In [42]:
scipy.__file__ #location of the package on your computer

'/Users/wk0110/anaconda3/lib/python3.11/site-packages/scipy/__init__.py'

In [43]:
scipy.__version__ #version of the package

'1.11.2'

In [44]:
dir(scipy)

['LowLevelCallable',
 '__version__',
 'cluster',
 'constants',
 'datasets',
 'fft',
 'fftpack',
 'integrate',
 'interpolate',
 'io',
 'linalg',
 'misc',
 'ndimage',
 'odr',
 'optimize',
 'show_config',
 'signal',
 'sparse',
 'spatial',
 'special',
 'stats',
 'test']

In [45]:
from scipy import stats

In [46]:
dir(stats)

['BootstrapMethod',
 'CensoredData',
 'Covariance',
 'FitError',
 'MonteCarloMethod',
 'PermutationMethod',
 '__all__',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 '_axis_nan_policy',
 '_biasedurn',
 '_binned_statistic',
 '_binomtest',
 '_boost',
 '_censored_data',
 '_common',
 '_constants',
 '_continuous_distns',
 '_covariance',
 '_crosstab',
 '_discrete_distns',
 '_distn_infrastructure',
 '_distr_params',
 '_entropy',
 '_fit',
 '_hypotests',
 '_kde',
 '_ksstats',
 '_levy_stable',
 '_mannwhitneyu',
 '_morestats',
 '_mstats_basic',
 '_mstats_extras',
 '_multicomp',
 '_multivariate',
 '_mvn',
 '_odds_ratio',
 '_page_trend_test',
 '_qmc',
 '_qmc_cy',
 '_qmvnt',
 '_rcont',
 '_relative_risk',
 '_resampling',
 '_rvs_sampling',
 '_sensitivity_analysis',
 '_sobol',
 '_statlib',
 '_stats',
 '_stats_mstats_common',
 '_stats_py',
 '_stats_pythran',
 '_survival',
 '_tukeylambda_stats',
 '_variation',
 'alexandergo

In [47]:
stats.boxcox?

Signature: stats.boxcox(x, lmbda=None, alpha=None, optimizer=None)
Docstring:
Return a dataset transformed by a Box-Cox power transformation.

Parameters
----------
x : ndarray
    Input array to be transformed.

    If `lmbda` is not None, this is an alias of
    `scipy.special.boxcox`.
    Returns nan if ``x < 0``; returns -inf if ``x == 0 and lmbda < 0``.

    If `lmbda` is None, array must be positive, 1-dimensional, and
    non-constant.

lmbda : scalar, optional
    If `lmbda` is None (default), find the value of `lmbda` that maximizes
    the log-likelihood function and return it as the second output
    argument.

    If `lmbda` is not None, do the transformation for that value.

alpha : float, optional
    If `lmbda` is None and `alpha` is not None (def

### Installing python packages

* `pip`: `pip install scipy`
    * The [Python Package Index (PyPI)](https://pypi.org/) is a repository of software for the Python programming language. 
* `conda`: `conda install scipy`
    * [Anaconda Packages](https://repo.anaconda.com/)
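
After installing, one way to confirm that a package is available is to query its metadata from Python. The helper `installed_version` below is a hypothetical convenience function built on the standard `importlib.metadata` module (Python 3.8 and later); it is a sketch, not part of `pip` or `conda`.

```python
from importlib import metadata

def installed_version(distribution_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None

# Prints a version string if scipy is installed, otherwise None.
print(installed_version("scipy"))

# A name that is very unlikely to be installed.
missing = installed_version("surely-not-an-installed-package-xyz")
print(missing)   # None
```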

## Further readings on creating a python package

Read the [Real Python tutorial on Python packages](https://realpython.com/python-modules-packages/) to learn more about creating modules and packages.

Read the open source book [Python Packages](https://py-pkgs.org/welcome) to learn more about the modern and efficient workflows for creating Python packages.