# Code Style & Documentation

- readability
- PEP8 code style
- comments
- documentation
- linters
- versioning

## Code Readability

"Code is more often read than written" - Guido van Rossum, Python's creator

So: code should be written to be readable by humans.

Note: one of those humans is future you.

## The Zen of Python

(or, at least, its aspiration)

In [1]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


### Writing Readable Code

So how do we write good code for humans?

- Use good naming
- Use good structure
- Use code comments and include documentation

### Good Naming

Clear names are for humans. The computer doesn't care, but you and others reading your code do.
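
As a small illustration, the two functions below do exactly the same calculation; only the names differ. The names (`c`, `triangle_area`, `base`, `height`) are hypothetical and were chosen just for this sketch.

```python
# Unclear names: the reader has to work out what each letter means.
def c(a, b):
    return a * b * 0.5


# Clear names: the same calculation, readable at a glance.
def triangle_area(base, height):
    return base * height * 0.5


print(triangle_area(4, 3))
```

The computer runs both versions identically. The second one simply tells a human reader what is being computed.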

### Good Structure

If you design your program using a separate function for each task, use functions and loops instead of copying and pasting, and think about structure beforehand, you'll be set up for success.
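
Here is a minimal sketch of what "functions and loops instead of copying and pasting" can look like. The temperature example and all of its names are made up for illustration.

```python
# Copy-and-paste version: the same conversion is written out three times.
temp_mon_f = 20.0 * 9 / 5 + 32
temp_tue_f = 22.5 * 9 / 5 + 32
temp_wed_f = 19.0 * 9 / 5 + 32


# Structured version: one function for the task, one loop to apply it.
def celsius_to_fahrenheit(temp_c):
    return temp_c * 9 / 5 + 32


temps_c = [20.0, 22.5, 19.0]
temps_f = []
for temp_c in temps_c:
    temps_f.append(celsius_to_fahrenheit(temp_c))

print(temps_f)
```

If the formula ever needs to change, the structured version only has to be fixed in one place.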

### Code Comments & Documentation

Helpful comments and documentation take your code to the next level. The final piece in the trifecta of readable code! 

Good code has good documentation - but code documentation should _not_ be used to try and fix unclear names, or bad structure. 

Rather, comments should add any additional context and information that helps explain what the code is, how it works, and why it works that way. 

These will all be components of your final project grade.

#### Class Question #1

What does the following code do?

```python
def ff(jj):
    oo = list(); jj = list(jj) 
    for ii in jj: oo.append(str(ord(ii)))
    return '+'.join(oo)
```

- A) Returns unicode code points, as a list
- B) Encodes a string as a cypher, returning a string of alphabetical characters
- C) Returns unicode code points, as a string
- D) Encodes input alphabetical characters, returned as a list
- E) This code will fail

Improvement Considerations:
- Structural considerations: indentations & spacing
- Improved naming: functions & variables
- Add Comments within code

#### Class Question #2

What does the following code do?


```python
def return_unicode(input_list):
    string = list()
    input_list = list(input_list)

    for character in input_list: 
        string.append(str(ord(character)))

    output_string = '+'.join(string)
    return output_string
```

- A) Returns unicode code points, as a list
- B) Encodes a string as a cypher, returning a string of alphabetical characters
- C) Returns unicode code points, as a string
- D) Encodes input alphabetical characters, returned as a list
- E) This code will fail

Improvement Considerations:

- Structural considerations: indentations & spacing
- Improved naming: functions & variables
- Add Comments within code
- **Proper Documentation!**

We'll make this even better near the end of the lecture.

## Code Style: The PEP 8 Style Guide

<div class="alert alert-success">
Coding style refers to a set of conventions for how to write code for readability and consistency.
</div>

<div class="alert alert-success">
Python Enhancement Proposals (PEPs) are proposals for how something should be / work in the Python programming language. 
</div>

PEPs are written by members of the Python community and are reviewed by the language's maintainers, who accept or reject each one before anything is incorporated.

<a href="https://www.python.org/dev/peps/pep-0008/">PEP8</a> is an accepted proposal that outlines the style guide for Python. If you write Python for other people, read it: they may expect you to follow its conventions.

Here are the select rules we care about:

### Naming Style

- CapWords (leading capitals, no separation) for Classes
- snake_case (all lowercase, underscore separator) for variables, functions, and modules
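
A short sketch of both naming styles side by side. The class and function names here are hypothetical, invented only to show the conventions.

```python
class TaskList:  # CapWords for a class
    def __init__(self):
        self.task_names = []  # snake_case for a variable (attribute)

    def add_task(self, task_name):  # snake_case for a method
        self.task_names.append(task_name)


def count_tasks(task_list):  # snake_case for a function
    return len(task_list.task_names)


my_tasks = TaskList()
my_tasks.add_task('write docstrings')
print(count_tasks(my_tasks))
```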

### Use Blank Lines

- Use 2 blank lines between functions & classes, and 1 between methods
- Use 1 blank line between segments to indicate logical structure
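
The spacing rules above might look like this in practice. This is only a sketch, and the functions and the `Dataset` class are made up for the example.

```python
import math


def mean(values):
    return sum(values) / len(values)


def standard_deviation(values):
    # Segment 1: measure how far each value is from the mean
    average = mean(values)
    squared_diffs = [(value - average) ** 2 for value in values]

    # Segment 2: summarize the spread
    variance = sum(squared_diffs) / len(values)
    return math.sqrt(variance)


class Dataset:

    def __init__(self, values):
        self.values = values

    def spread(self):
        return standard_deviation(self.values)
```

Two blank lines separate the top-level functions and the class, one blank line separates the methods, and one blank line separates the two logical segments inside `standard_deviation`.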

### Avoid Long Lines

- PEP8 recommends that each line be at most 79 characters long

Computers used to require this because of the width of the display (or teletype). They don't anymore. But even today, super long lines are hard to read at a glance.

If your lines are too long, you can make them multi-line:

In [None]:
my_long_list = [1, 2, 3, 4, 5,
                6, 7, 8, 9, 10]

In [None]:
# Note: you can explicitly continue a statement onto the next line with '\'
my_string = 'Python is ' + \
            'an alright language.'
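
PEP8 itself prefers wrapping long lines inside parentheses, brackets, or braces over using a backslash. A minimal sketch of that alternative, reusing the string from above (`total` is just an extra illustration):

```python
# Adjacent string literals inside parentheses are joined into one string,
# so no backslash or '+' is needed.
my_string = ('Python is '
             'an alright language.')

# The same idea works for long expressions.
total = (1 + 2 + 3 + 4 + 5
         + 6 + 7 + 8 + 9 + 10)

print(my_string)
print(total)
```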

## Comments

#### Unhelpful Comments

In [None]:
# This is a loop that iterates over elements in a list and calls each one
for element in list_of_elements:
    element()

The comment adds no information over what the code says.

What are the elements? Why do we care?

#### Better Comments

In [None]:
# We already filtered the list down to the things
# that are due today, so now we do each one.
for element in list_of_elements:
    # Each element is a function, so we run it
    element()

Better, now the comments give some explanation.

But, with more descriptive names we could comment even less.

#### The Best

Although it's not always possible to forgo all comments, **careful naming is better than commenting!**

With better names, we can use shorter and fewer comments.

In [None]:
# Run all tasks due today.
for task_func in tasks_due_today:
    task_func()

The names describe what the things are:
    
- `tasks_due_today` sounds like a list of tasks that are due today.
- `task_func` sounds like a function that does the task

Because what we do here in the for-loop is a little unusual – it's uncommon to loop through a list of _functions_ – we still include the short comment "Run all tasks due today" just so the reader can know our goal and understand why we call `task_func()`.

(The short comment also lets the reader know what the loop does at a glance so they can then decide whether to even bother reading the code for the loop.)

How to use comments:
- Generally:
    - focus on the *how* and *why*, over literal 'what is the code'
    - explain any context needed to understand the task at hand
    - give a broad overview of what approach you are taking to perform the task
    - if you're using any unusual approaches, explain what they are, and why you're using them
- Comments need to be maintained as the code changes

**Out-of-date comments are worse than no comments at all!** Keep your comments up-to-date.
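
As a sketch of a comment that explains the *why* rather than restating the code, consider the function below. The gradebook scenario and the function name are hypothetical.

```python
def average_score(scores):
    # An empty list would make the division below fail, and for this
    # (hypothetical) gradebook "no scores yet" should count as 0.
    if not scores:
        return 0.0

    return sum(scores) / len(scores)


print(average_score([80, 90]))
```

A comment like `# check if scores is empty` would only repeat what the `if` statement already says. The comment here records the reason the check exists.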

## Code Style: Comments

- **Block comments** are comments that occur on their own line(s) of code
```python
# This is a block comment
# about some code here
some
code
here
```


- **Inline comments** are comments that occur at the end of a line of code

```python
some_line_of_code  # This is an inline comment about some_line_of_code
```


#### Block comments
- apply to some (or all) code that follows them
- are indented to the same level as that code. 
- Each line of a block comment starts with a # and a single space

Ugly block comment:

In [None]:
import random

def week_9():
#help try to destress students by picking one thing from the following list using random
    statements = ["You've totally got this!","You're so close!","You're going to do great!","Remember to take breaks!","Sleep, water, and food are really important!"]
    out = random.choice(statements)
    return out

week_9()

Nicely formatted block comment:

In [None]:
def week_9():
    
    # Randomly pick from list of de-stressing statements
    # to help students as they finish the quarter.
    statements = ["You've totally got this!", 
                  "You're so close!", 
                  "You're going to do great!",
                  "Remember to take breaks!",
                  "Sleep, water, and food are really important!"]
    
    out = random.choice(statements)
    
    return out

week_9()

#### Inline comments
- occur on the line of code they are talking about
- should be separated by at least two spaces from the statement
- start with a # and a single space

In [None]:
# Ugly
week_9()#words of encouragement

In [None]:
# Idiomatic
week_9()  # words of encouragement

## Code Documentation

**Comments** are text written directly in the code, typically directed at developers - people reading and potentially writing the code.

**Documentation** is descriptions and guides written for code users. 

## Docstrings

Docstrings are a primary form of documentation (info for code users).

<div class="alert alert-success">
<b>Docstrings</b> are in-code text that describe modules, classes and functions. They describe the operation of the code.
</div>

### Example Docstring

(In NumPy style.)

In [None]:
def add(num1, num2):
    """Add two numbers together. 
    
    Parameters
    ----------
    num1 : int or float
        The first number, to be added. 
    num2 : int or float
        The second number, to be added.
    
    Returns
    -------
    answer : int or float
        The result of the addition. 
    """
    
    answer = num1 + num2
    
    return answer

Docstrings

- multi-line string that describes what's going on
- starts and ends with triple quotes `"""`

**[NumPy style](https://numpydoc.readthedocs.io/en/latest/format.html)** is a particular convention for docstrings. It's the format we will ask you to use for your final project. [Here is a longer example.](https://numpydoc.readthedocs.io/en/latest/example.html#source)

- one sentence overview at the top - the task/goal of function
- **Parameters** : description of function arguments, keywords & respective types
- **Returns** : explanation of returned values and their types

It's a lot to write, but you write all this when you are writing code that you expect _others_ to use.

The term **API (Application Programming Interface)** refers to the parts of your code (functions, classes, and their parameters) that you intend other people to use.

NumPy style docstrings are a way to describe your API for others.

### Docstrings are available through the code

In [None]:
add?

In [None]:
# The `help` function prints out the `docstring` 
# of an object (module, function, class)
help(add)

In [None]:
# Docstrings get stored as the `__doc__` attribute
# and can also be accessed from there
print(add.__doc__)
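
The standard library's `inspect.getdoc` is one more way to read a docstring from code. The sketch below uses a small hypothetical function, `greet`, so that it runs on its own.

```python
import inspect


def greet(name):
    """Return a short greeting.

    Parameters
    ----------
    name : str
        The name of the person to greet.

    Returns
    -------
    greeting : str
        The greeting text.
    """
    return 'Hello, ' + name + '!'


# The raw attribute keeps the indentation from the source code
print(greet.__doc__)

# inspect.getdoc returns the same text with that indentation cleaned up
print(inspect.getdoc(greet))
```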

#### Docstrings can also be available *outside* of the source code.

For example, the [online pandas documentation](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html#pandas-dataframe) is rendered from the [docstrings in the code](https://github.com/pandas-dev/pandas/blob/v2.1.3/pandas/core/frame.py#L491-L12281).

#### Class Question #3

What should be included in a docstring?

1) Input arguments and their types  
2) A brief overview sentence about the code  
3) A copy of the `def` line  
4) Returned variables and their types  
5) A step by step description of the procedure used in the code  

- A) 1, 4 
- B) 2, 3, 5
- C) 3, 5 
- D) 1, 2, 4 
- E) 1, 2, 3, 4, 5

### Projects & Documentation

**Do *all* of my functions need NumPy-style documentation?**

- Your original code definitely needs NumPy-style docstrings (required)
    - Exception: if you write a teeny tiny function to accomplish a super small task _and_ you don't intend for users of your code to call that function directly _and_ how the function works is obvious with a quick glance at the code, then you do not need a docstring.
- If you include functions from an assignment, you do not have to document them.
    - But if they are main functions in the code (would be used by others), it is better to add documentation to them

### Practice

Together, let's try to:

1. Improve the names
2. Document this code

In [None]:
def f(input):
    a = []
    b = list(input)
    for x in b: 
        a.append(str(ord(x)))    
    c = '+'.join(a)
    return c


f(['m', 'y', 'c', 'o'])
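
One possible result of this exercise is sketched below. It is not the only good answer: the name `convert_to_unicode` and the variable names are choices made for this example.

```python
def convert_to_unicode(input_string):
    """Convert characters to their Unicode code points.

    Parameters
    ----------
    input_string : str or list of str
        The characters to convert. Each element must be a single character.

    Returns
    -------
    output_string : str
        The code point of each character, joined by '+'.
    """

    code_points = []

    # ord() gives the integer code point; store it as a string so the
    # values can be joined at the end
    for character in input_string:
        code_points.append(str(ord(character)))

    output_string = '+'.join(code_points)

    return output_string


print(convert_to_unicode(['m', 'y', 'c', 'o']))
```

The behavior matches `f`. The only changes are the names, the spacing, the comment, and the NumPy-style docstring.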

### A Note About the Real-World: Documentation

**Documentation in a large software project** may include many things, docstrings included:

- A `README` is a file that provides an overview of the project to potential users and developers
- A `LICENSE` file specifies the license under which the code is available (the terms of use)
- An `API Reference` is a collection of the docstrings, listing public interfaces, parameters and return values
- Tutorials and/or Examples walk through how to use the codebase for common tasks

Projects may have **online documentation sites** so you can access the docs on the web.

#### Function Documentation Example
`numpy.array` :
https://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html

#### Package Documentation Example
**scikit learn** (`sklearn`) : https://scikit-learn.org/stable/index.html

## A Note About the Real-World: Linters

<div class="alert alert-success">
A linter is a tool that analyzes code for both programmatic errors and stylistic issues. 
</div>

`pylint` is one such linter for Python. It is available from Anaconda, but it is not preinstalled on datahub.


```python
# to install on datahub
!pip install --user pylint
```

In [None]:
!pip install --user pylint

Let's lint this code:

```python
def MyFunction(input_num):
    
    my_list = [0,1,2,3]
    if 1 in my_list: ind = 1
    else:
      ind = 0
    qq = []
    for i in my_list [ind:]:
        qq.append(input_num/i)
    return qq
```

It should already be saved as `linter_example.py`.

In [None]:
# Check using pylint
!pylint linter_example.py
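
For comparison, here is a sketch of how the function might look after cleaning up the kinds of issues a linter tends to flag (naming, spacing, missing docstrings). The exact messages `pylint` reports depend on its version and configuration, and the new names are hypothetical. The `if`/`else` in the original always selects index 1, so this version simply divides by 1, 2, and 3.

```python
"""Example module for practicing with a linter."""


def divide_by_each(input_num):
    """Divide a number by each of 1, 2, and 3.

    Parameters
    ----------
    input_num : int or float
        The number to divide.

    Returns
    -------
    quotients : list of float
        The result of dividing `input_num` by each divisor.
    """

    divisors = [1, 2, 3]

    quotients = []
    for divisor in divisors:
        quotients.append(input_num / divisor)

    return quotients
```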

## A Note About the Real-World: Software Versioning

When you make changes to the software you've released into the world, you change the version of that software to let people know changes have occurred.

You've probably seen version numbers like `3.11.6`. There is some meaning to those numbers. The rules can be [dizzying](https://www.python.org/dev/peps/pep-0440/#version-scheme), so we'll simplify for now:

- `<MAJOR>.<MINOR>`
    - e.g. 1.3
    
- `<MAJOR>.<MINOR>.<MAINTENANCE>`
    - e.g. 1.3.1

- `<MAJOR>` - increase by 1 w/ incompatible API (Application Programming Interface) changes
- `<MINOR>` - increase by 1 w/ added functionality in a backwards-compatible manner
- `<MAINTENANCE>` - (aka patch) increase by 1 w/ backwards-compatible bug fixes

When `<MAJOR>` is `0`, that indicates the package is still in development. Usually `1.0.0` is the first version you expect the broader public to use.
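
The simplified scheme above can be sketched in a few lines of code. Both helper functions are hypothetical. They only handle plain `MAJOR.MINOR.MAINTENANCE` strings, and resetting the lower numbers to 0 after a bump is the usual convention rather than something stated above.

```python
def parse_version(version_string):
    """Split a simple 'MAJOR.MINOR.MAINTENANCE' string into integers."""
    return tuple(int(part) for part in version_string.split('.'))


def bump_version(version_string, change):
    """Return the next version string for a given kind of change."""
    major, minor, maintenance = parse_version(version_string)

    if change == 'major':        # incompatible API change
        return f'{major + 1}.0.0'
    if change == 'minor':        # new, backwards-compatible functionality
        return f'{major}.{minor + 1}.0'
    if change == 'maintenance':  # backwards-compatible bug fix
        return f'{major}.{minor}.{maintenance + 1}'

    raise ValueError(f'Unknown change type: {change}')


print(bump_version('1.3.1', 'minor'))

# Comparing the parsed numbers treats 11 as larger than 9;
# comparing the raw strings would not.
print(parse_version('3.11.6') > parse_version('3.9.0'))
```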

In [None]:
# see version information
import pandas as pd
pd.__version__

In Python package development, a `<MAJOR>` version of `0` suggests the package is still in development.

In [None]:
# see version information on command line
!pip show numpy