# Structure: *Programming is more than writing code*

**Table of contents**<a id='toc0_'></a>    
- 1. [Structure](#toc1_)    
- 2. [Design patterns](#toc2_)    
- 3. [Modules](#toc3_)    
- 4. [Git](#toc4_)    


You seldom write some code, run it, get the right results, and then never use it again.

 * Firstly: You make errors (bugs) when you code.
 * Secondly: You need to share your code with colleagues and your future self.

Transparent **macro- and microstructure** is important: 
* For preventing errors
* For finding errors
* For making your code interpretable for others and your future self

**No code is self-explanatory** - even though it might seem so when you write it. 

**Cleaning, commenting and documenting code takes time**, but is a crucial aspect of good programming.

In **scientific programming**, a transparent program structure and good documentation are also cornerstones in securing **replicability**. 

## 1. <a id='toc1_'></a>[Structure](#toc0_)

**Macro-structure** (wrt. folders and files):

1. **One folder** for each project with ALL required files.
2. **End goal**: *One file to run it all*. *Very important!*
3. **Module files** (.py): Define functions, classes, etc.
4. **Notebook files** (.ipynb): Call functions, classes etc. + explain and present the results.
5. **Larger projects:** Sub-folders for data, figures, etc. (*not relevant now*).

**Micro-structure** (within files): Follow the official [PEP8 guideline](https://www.python.org/dev/peps/pep-0008/).

**Note:** A style guide is about consistency. Consistency with this style guide is important. Consistency within a project is more important. Consistency within one module or function is the most important.

**Recommendations:**

1. **Code layout:**
    * **Indentation:** Four spaces
    * **Line length:** Max of 79 characters (wrap line + indent properly)
    * **Strings:** Use single or double quotes (be consistent)
    * **White space:**
        * Around assignment: ``x = y``
        * After colon: ``if x == 2: print(x)``
        * Around operators with lowest priority in a calculation: ``c = (a+b) * (a-b)`` or `z = x*x + y*y` 
2. **Naming conventions:** Short, but also precise
    * **Modules:** Lower case with potential underscores (e.g. ``numecon`` or ``num_econ``)
    * **Classes:** Camel case (e.g. ``ConsumerClass``)
    * **Variables, functions and methods:** Lower case with potential underscores     
3. **Ordered section comments:** Break your code into sections
    * Give each section a name and a place in the ordering
    * Level 1: a, b, c etc.
    * Level 2: i, ii, iii, iv etc.
    * Level 3: o, oo, ooo, oooo etc.
4. **Line comments:** Small additional hints
    * Again, short and precise
    * Avoid just explaining what the code does (must provide additional information)
5. **Docstrings:** Should be written for all functions, methods and classes (see how below).

**More on names:**

1. Name functions after their **intended use**.
2. Help yourself in debugging and name variables in a **searchable way**.
3. Normally avoid using any special characters.
4. Unused variables and non-public methods should start with a ``_``.
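
The naming rules above can be seen together in a small sketch. All names and numbers here (``ConsumerClass`` methods, ``budget_share``, the income level) are made up for illustration only:

```python
# a. class name in camel case
class ConsumerClass:

    def __init__(self, income):
        self.income = income  # variable: lower case

    def _spending(self, budget_share):
        # leading underscore: non-public helper, not meant to be called from outside
        return budget_share * self.income

    def food_spending(self):
        # named after its intended use, easy to search for
        return self._spending(0.25)

# b. lower case with underscores for variables
consumer = ConsumerClass(income=100.0)
food_spending = consumer.food_spending()

# c. the loop variable is never used, so it is called _
for _ in range(2):
    print(food_spending)
```

Searching for ``food_spending`` in an editor finds every place the quantity is used, which a name like ``fs`` or ``x2`` would not allow.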


**Two different perspectives on comments:**

1. The comments explain to humans what the code does.
2. The code makes the computer do what the comments say. 

**Example of well formatted code:**

In [1]:
import math

# a. name for section
alpha = 1
beta = 2
x = [-3, -2, -1, 1, 2, 3]

# b. name for section
def my_function(x,alpha,beta):
    """ explain what the function does (docstring)
    
    Args:
    
        x (float): explanation
        alpha (float): explanation
        beta (float): explanation
        
    Returns:
    
        y (float): explanation
    
    """
    
    y = x**2 
    return y

# c. name for section
for i in range(len(x)):
    
    # i. name for sub-section
    y = my_function(x[i],alpha,beta)
    
    # ii. name for sub-section
    cond = y > 0 # non-positive not allowed due to log (line comment)
    
    # iii. name for sub-section
    if cond:
        print(math.log(y))

2.1972245773362196
1.3862943611198906
0.0
0.0
1.3862943611198906
2.1972245773362196


**Try:** Hover over ``my_function``

**Recommendation:** Try to think about which sections and sub-sections you need beforehand. You can even write the section comments *before* you write the code! 
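
A minimal sketch of this way of working, using a hypothetical function ``mean_of_positive``: the three section comments were written first as a plan, and the code under each of them was filled in afterwards.

```python
def mean_of_positive(values):
    """ mean of the strictly positive entries in values

    Args:

        values (list): numbers of any sign

    Returns:

        mean (float): mean of positive entries (None if there are none)

    """

    # a. keep the positive entries
    positive = [value for value in values if value > 0]

    # b. handle the case with no positive entries
    if len(positive) == 0:
        return None

    # c. compute the mean
    return sum(positive) / len(positive)

print(mean_of_positive([-3, 2, 4]))
```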

## 2. <a id='toc2_'></a>[Design patterns](#toc0_)

When thinking about how to organize your functions and objects, a few commandments will serve you well: 

1. **DRY:** *Don't Repeat Yourself*. A given piece of logic should appear only once in your script. If you need it in several places, put it in a function and call that function.    
2. **One job:** A function has *one job only*. Sub-tasks within the main task are delegated to other functions.   
3. **No side effects:** If a function returns $x$, then it should *not also* produce lasting changes to $y$.
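
The three commandments can be sketched in a few lines. The function names, the utility formula and the numbers below are assumptions chosen for illustration, not part of any particular project:

```python
# a. one job: evaluate the utility formula for a single consumption level
def utility(c, rho=2.0):
    """ CRRA utility of consumption c (rho must differ from 1) """
    return c**(1 - rho) / (1 - rho)

# b. DRY: the formula is written once and re-used here
def total_utility(consumption, rho=2.0):
    """ sum of utility over a list of consumption levels """
    return sum(utility(c, rho) for c in consumption)

# c. no side effects: return a new list instead of changing the input
def scaled(consumption, factor):
    """ consumption levels multiplied by factor """
    return [factor * c for c in consumption]

consumption = [1.0, 2.0, 4.0]
doubled = scaled(consumption, 2.0)

print(total_utility(consumption))  # -1.75
print(consumption)                 # unchanged: [1.0, 2.0, 4.0]
print(doubled)                     # [2.0, 4.0, 8.0]
```

If the utility formula changes later, only ``utility`` has to be edited, and calling ``scaled`` never alters the list it was given.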

**More on design patterns**:  

* You can check out [**Google's Python style guide**](https://google.github.io/styleguide/pyguide.html) to catch a quick glimpse of how they organize their work. 
* One of the bibles on design patterns is written by the famous **Uncle Bob** (Robert C. Martin).

<img src="cleancode_book.jpg" alt="Drawing" style="width: 300px; margin-left: 300px"/>  

## 3. <a id='toc3_'></a>[Modules](#toc0_)

**Important:** If you change the code of your own module, e.g. `mymodule`, and `mymodule` has **already** been imported before the changes, then simply running the `import mymodule` statement again will **not** import your changes. 

**Solution:** Use the autoreload magics below

In [2]:
%load_ext autoreload
%autoreload 2

In [3]:
import mymodule as mm

In [6]:
try:
    mm.myfun(2)
except Exception as e:
    print(e)

name 'a' is not defined


**Task:** Place cursor at `myfun` and press `F12`
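
The stale-import behavior can also be reproduced outside a notebook, where the magics are not available. The sketch below is only an illustration: it writes a throwaway module called ``demo_reload_module`` (a hypothetical name) to a temporary folder, edits it, and uses the standard ``importlib.reload`` to pick up the change.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# always compile from the source file in this demo (no cached bytecode)
sys.dont_write_bytecode = True

with tempfile.TemporaryDirectory() as folder:

    # a. write a small module and make its folder importable
    module_path = Path(folder) / "demo_reload_module.py"
    module_path.write_text("def myfun(x):\n    return x + 1\n")
    sys.path.insert(0, folder)
    importlib.invalidate_caches()

    # b. first import
    import demo_reload_module as dm
    first = dm.myfun(2)

    # c. change the code of the module on disk
    module_path.write_text("def myfun(x):\n    return x + 100\n")

    # d. importing again re-uses the module already in memory
    import demo_reload_module as dm
    stale = dm.myfun(2)

    # e. an explicit reload runs the new code
    importlib.reload(dm)
    fresh = dm.myfun(2)

    # f. clean up the import path
    sys.path.remove(folder)

print(first, stale, fresh)  # 3 3 102
```

In a notebook the autoreload magics are more convenient, since they take care of step e before every cell is run.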

## 4. <a id='toc4_'></a>[Git](#toc0_)

The purpose of Git is to let you easily share your code with collaborators and track the changes each of you makes.

**Essential Git terms:**

* **Local:** your computer.
* **Remotes:** the code on GitHub and on other computers.
* **Branch:** a separate track or copy of the code base on which you can develop new stuff. Normally there is a main branch that holds the current working version of the code, and several testing branches where new stuff is developed. After development, those branches are merged into the main branch.
* **.gitignore:** a file that specifies which types of files are left out when changes are sent back and forth.
* **.git:** a hidden folder found in every Git repository. It holds the objects and references (such as `HEAD`) that make up the whole history of changes to the code so far. If you delete .git, your code folder is no longer a working repository. It is just regular code.

**Essential Git commands:**

* **Fetch** is the process of becoming aware of any changes to the code outside the local repository on your computer. It does not happen automatically! Fetching does not change your own files; your local repository just checks what has happened on the remote repo(s).
* **Stage** is the step before committing, where you decide which specific changes to include. Mostly, you will just stage all, that is, include all the changes you have made.
* **Commit** is the process of recording your staged changes as a new snapshot in the history of your local repository. A commit does not reach the remote repo until you push it.
* **Merge** is when you let changes to the code from another branch or from a remote get woven into your own code.
* **Push** is when, after committing, you ask the remote to take the changes you made. The remote will not automatically accept the request if you do not have permission to write to the remote repo.
* **Pull** is fetching and merging with the remote.
* **Sync** is pulling and then pushing to the remote. It's a function special to VS Code.
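
To see where each step takes effect, the sketch below drives the `git` command line from Python in a temporary folder. It is only an illustration under some assumptions: `git` must be installed (otherwise the block does nothing), a bare repository on disk stands in for the remote, and the helper `git`, the file `notes.txt` and the user name are made up.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def git(*args, cwd):
    """ run one git command in the folder cwd and return what it prints """
    cmd = ["git",
           "-c", "user.name=Demo",
           "-c", "user.email=demo@example.com",
           "-c", "commit.gpgsign=false",
           *args]
    result = subprocess.run(cmd, cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout.strip()


log_local = refs_before_push = log_remote = None

if shutil.which("git") is not None:
    with tempfile.TemporaryDirectory() as folder:

        # a. an empty stand-in remote and a local clone of it
        remote = Path(folder) / "remote.git"
        local = Path(folder) / "local"
        git("init", "--bare", str(remote), cwd=folder)
        git("clone", str(remote), str(local), cwd=folder)

        # b. stage and commit: both only touch the local repository
        (local / "notes.txt").write_text("first line\n")
        git("add", "notes.txt", cwd=local)
        git("commit", "-m", "add notes", cwd=local)
        log_local = git("log", "--format=%s", cwd=local)
        refs_before_push = git("for-each-ref", cwd=remote)

        # c. push: now the remote receives the commit
        git("push", "origin", "HEAD", cwd=local)
        log_remote = git("log", "--all", "--format=%s", cwd=remote)

print(log_local, repr(refs_before_push), log_remote)
```

After step b the commit message shows up in the local log while the stand-in remote still has no branches at all; only after step c does the remote contain the commit.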