# Python Best Practices
This guide offers general advice for writing Python within the LME. It is aimed at business users who want to use Python more effectively, from ad-hoc analysis through to collaborative projects that will be put into production.

## Contents

- Code Style
- Repositories
- Testing
- Documentation
- Version Control
- PyPI
- Environments
- Object Oriented Programming
- Data Structures

## Overview


One of the most useful pieces of advice ships with Python itself: running `import this` prints the Zen of Python, reproduced below.

In [1]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


<br>
When starting out in Python, not all of these concepts will make sense, but over time they provide a steer back towards more 'pythonic' code. Code that is not 'pythonic' does not follow the common guidelines and does not express its intent in the most readable way.

An example of non-pythonic vs pythonic can be shown with the following problem:
> Sum all numbers between 10 and 1000 (inclusive)

Firstly the non-pythonic solution:

In [2]:
a = 10
b = 1000
total_sum = 0
while b >= a:
    total_sum += a
    a += 1

Now the pythonic solution:

In [3]:
total_sum = sum(range(10, 1001))

## Code Style
**Readability first.** 

The primary resources for coding style are [PEP8](https://www.python.org/dev/peps/pep-0008/), a general style guide, and [PEP257](https://www.python.org/dev/peps/pep-0257/), which covers docstring conventions. Between them they cover indentation, spacing, comments, naming conventions and so on. The guiding principle is that code is read more often than it is written, so readability is the primary goal of both guides.
Other useful rules of thumb can be found on [The Hitchhiker's Guide to Python](https://docs.python-guide.org/writing/style/).

Standardising these things across an organisation will make it easier to read other users' code.
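
As a small illustration of the kind of conventions PEP8 and PEP257 describe, the sketch below shows the same calculation written twice. The function names and values are hypothetical and chosen only for the example.

```python
# Hard to read: cryptic names, cramped spacing, no docstring.
def f(x,y): return x*y/100


# Easier to read: descriptive snake_case names, spaces around operators,
# and a one-line docstring describing what the function returns.
def percentage_of(amount, percent):
    """Return the given percentage of an amount."""
    return amount * percent / 100


fee = percentage_of(250, 4)
print(fee)  # 10.0
```

Both functions return the same result; only the second tells the reader what it is for without needing to inspect the arithmetic.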

## Code Repositories
### Repository Structure
A guide to structuring a repository can be found [here](https://kenreitz.org/essays/repository-structure-and-python), with an example repository available on [GitHub](https://github.com/navdeep-G/samplemod) as a reference point.

#### Module Code

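**Structuring of code is important.** Things to avoid include the following:
- Circular dependencies, where two modules each import from the other
- Hidden coupling, where a change in one place unexpectedly breaks unrelated code
- Overuse of global state
- Spaghetti code: long, tangled blocks of nested conditions and loops with a lot of copy-pasted logic
- Ravioli code: a large number of small, similar pieces of logic (often classes or objects) with no clear overall structure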
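
To make the point about global state concrete, here is a minimal sketch. The names `FEE_RATE`, `apply_fee_global` and `apply_fee` are hypothetical and not taken from any LME code base.

```python
# Hidden dependency: the result depends on a module-level variable that
# any other code in the program can change at any time.
FEE_RATE = 0.02


def apply_fee_global(amount):
    return amount * (1 + FEE_RATE)


# Explicit alternative: everything the function needs is passed in, so the
# call site shows exactly what the result depends on.
def apply_fee(amount, fee_rate):
    """Return the amount with the fee added."""
    return amount * (1 + fee_rate)


total = apply_fee(100.0, fee_rate=0.02)
```

The second version is also easier to test, because a test can pass in any rate it likes without touching shared state.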

## Testing
Add automated testing using [pytest](https://docs.pytest.org/en/latest/) or a similar package. Ideally, this should be done during development. Writing the tests before the code lets you:
1. Describe what the code is supposed to do in concrete, verifiable terms.
2. Provide an example of how the code should be used, in a working and tested example.
3. Provide a way to verify when the code is finished.

This testing can include an example input, and an assertion of the known result, as below:

In [4]:
def test_reversed():
    assert list(reversed([1, 2, 3, 4])) == [4, 3, 2, 1]

The assert statement allows for simple tests, without the user needing to know too much about what is happening behind the curtain. All one needs to do is write a statement that should evaluate to true if the function is working as intended. Pytest will collect all of the functions whose names start with `test`, run them together, and report any failures.
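
As a sketch of the test-first approach, the tests below describe the behaviour of a hypothetical function, `sum_inclusive`, based on the summing problem from the Overview. The function and test names are assumptions made for this example.

```python
def sum_inclusive(start, stop):
    """Return the sum of all integers from start to stop, inclusive."""
    return sum(range(start, stop + 1))


# Tests written before (or alongside) the function state what it should do.
def test_sum_inclusive_known_result():
    assert sum_inclusive(10, 1000) == 500455


def test_sum_inclusive_single_value():
    assert sum_inclusive(5, 5) == 5


def test_sum_inclusive_empty_range():
    # When start is greater than stop there is nothing to add.
    assert sum_inclusive(3, 2) == 0
```

If this were saved in a file whose name starts with `test_` (for example a hypothetical `test_sums.py`), running `pytest` in the same folder would find and run all three tests.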

Resources to help with learning how to run tests:
- [Pytest Tutorial](https://realpython.com/pytest-python-testing/)
- [Pytest Tutorial 2](https://semaphoreci.com/community/tutorials/testing-python-applications-with-pytest)

## Documentation
For project documentation, Sphinx is recommended. Its useful features include autodocumentation, which uses the docstrings inside the code to populate the documentation. More on docstrings below.
- [Sphinx docs](https://pythonhosted.org/an_example_pypi_project/sphinx.html)
- [Sphinx tutorial 1](https://buildmedia.readthedocs.org/media/pdf/brandons-sphinx-tutorial/latest/brandons-sphinx-tutorial.pdf)
- [Sphinx tutorial 2](https://samnicholls.net/2016/06/15/how-to-sphinx-readthedocs/)
- [Sphinx tutorial 3](https://www.patricksoftwareblog.com/python-documentation-using-sphinx/)

Sphinx is a tool for creating documentation from reStructuredText, Markdown, docstrings etc., and allows for easy output to HTML or PDF.

### Docstrings
In Python, documentation strings (docstrings) are a convenient way of associating documentation with Python classes, functions, methods and modules. As mentioned in the Documentation section, if they are written in the right format they can easily be exported to a documentation file with automated formatting.

In [1]:
def function_with_types_in_docstring(param1, param2):
    """Example function with types documented in the docstring.

    `PEP 484`_ type annotations are supported. If attribute, parameter, and
    return types are annotated according to `PEP 484`_, they do not need to be
    included in the docstring:

    Args:
        param1 (int): The first parameter.
        param2 (str): The second parameter.

    Returns:
        bool: The return value. True for success, False otherwise.

    .. _PEP 484:
        https://www.python.org/dev/peps/pep-0484/

    """
    
    return param1 == int(param2)

Furthermore, examples can be included within docstrings, and Python can run them as tests. Implementation is relatively simple using the [doctest module](https://docs.python.org/3/library/doctest.html). This means the unit tests and the documentation for functions and classes can be written within the code itself. In practice, this helps to achieve the aim of writing tests before or during the writing of the code, by integrating the tests into the function itself.
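
A minimal sketch of a docstring with runnable examples is shown below. The function `sum_inclusive` is hypothetical; the `doctest` calls are from the standard library.

```python
import doctest


def sum_inclusive(start, stop):
    """Return the sum of all integers from start to stop, inclusive.

    Examples:
        >>> sum_inclusive(10, 1000)
        500455
        >>> sum_inclusive(5, 5)
        5
    """
    return sum(range(start, stop + 1))


if __name__ == "__main__":
    # Runs every example found in this module's docstrings and reports
    # any whose actual output differs from the documented output.
    results = doctest.testmod()
    print(results)
```

Each line starting with `>>>` is executed, and the line beneath it is compared with what the call actually produced, so the examples in the documentation cannot silently drift out of date.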

## Version Control
Using a version control system helps teams to work collaboratively on code. When anyone makes a change to the code, they record a comment (a commit message) explaining what that change was, and the software tracks exactly what changed between the two versions. In practice this means that whenever code breaks, the problem can be traced back to a specific change made after the tests last passed, and then resolved. Version control also allows for branching, so that adjustments can be developed and released separately. Bitbucket, which primarily uses Git, is the LME's choice for version control.

A good habit to get into is combining your automated testing with your version control, essentially making sure your code works before merging it back into the main branch.

### Required Installs/Registrations
Using Bitbucket requires the installation of Git on the VDI/Desktop and access to the [LME Bitbucket site](https://bitbucket.lme.co.uk/).

## Use the Python Package Index
> "Stuck on a problem? There's a Python Package for that!"

For the majority of problems you will come across when coding, there is a package that someone has built to solve that problem. Use these as much as possible, rather than reinventing the wheel yourself. <br>
It should be noted that, at the time of writing, users only have access to the base packages that ship with Anaconda; ideally they would also be able to experiment with more specialist libraries. Enabling this may require a process for scanning packages for known vulnerabilities (CVEs). SourceClear can be used as an initial check.

## Use Virtual Environments

**Create a virtual environment for each project**. This helps avoid clashes between libraries, and allows the environment to be reproduced across multiple platforms. If code is run from its own virtual environment, it will continue to work regardless of the state of the root environment, so upgrades to packages elsewhere will not break the code or introduce deprecation errors over time. <br>
Conda environments should be the go-to for Anaconda users; see the [conda environments documentation](https://docs.conda.io/projects/conda/en/latest/user-guide/concepts/environments.html). Using them means any user can pick up the project and simply activate the environment it uses, without needing to alter their root environment. This is key to working collaboratively, and can be done either from the command line or from the Anaconda GUI. It should be noted that using the command line requires separate permissions to allow users to open Anaconda Prompt or similar.

## Write Object-Oriented Code

Python is an object-oriented language, and everything in Python is an object. Object-oriented programming (OOP) focuses on creating reusable code, in line with the DRY (don't repeat yourself) principle. The short version of taking advantage of OOP is to use classes and functions where you can, and the longer version can be found [here](https://python-textbok.readthedocs.io/en/1.0/).
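
As a minimal sketch of the idea, the hypothetical `Trade` class below keeps the data and the calculation that belongs to it in one place, so the calculation is written once and reused for every trade. The class, its attributes and the values are illustrative only.

```python
class Trade:
    """A single trade with a price and a quantity (hypothetical example)."""

    def __init__(self, symbol, price, quantity):
        self.symbol = symbol
        self.price = price
        self.quantity = quantity

    def notional(self):
        """Return the value of the trade: price multiplied by quantity."""
        return self.price * self.quantity


trades = [Trade("CU", 8500.0, 2), Trade("AL", 2300.0, 5)]

# The same method is reused for every trade instead of repeating the formula.
total_notional = sum(trade.notional() for trade in trades)
print(total_notional)  # 28500.0
```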

## Data Structures

Know the basic built-in data structures that Python has (list, set, dictionary etc.) and how to use them. These can be leveraged to help speed up code or simplify in-memory data storage. A useful example is data imported from JSON or XML: this translates well to a mixture of nested lists and dictionaries in Python, but can be difficult to map to a table, particularly when it contains a variety of data types.
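
The sketch below shows how a small piece of JSON maps onto nested dictionaries and lists, using the standard library `json` module. The data itself is made up for the example.

```python
import json

# A small JSON document with nested objects and arrays (illustrative data).
raw = """
{
    "desk": "metals",
    "trades": [
        {"symbol": "CU", "quantity": 2},
        {"symbol": "AL", "quantity": 5},
        {"symbol": "CU", "quantity": 1}
    ]
}
"""

# JSON objects become dictionaries and JSON arrays become lists.
data = json.loads(raw)

# Nested values are reached by chaining dictionary keys and list indexes.
first_symbol = data["trades"][0]["symbol"]

# A set keeps only the unique values and gives fast membership checks.
symbols = {trade["symbol"] for trade in data["trades"]}

# A dictionary can accumulate a total per key.
quantity_by_symbol = {}
for trade in data["trades"]:
    symbol = trade["symbol"]
    quantity_by_symbol[symbol] = quantity_by_symbol.get(symbol, 0) + trade["quantity"]

print(first_symbol)        # CU
print(sorted(symbols))     # ['AL', 'CU']
print(quantity_by_symbol)  # {'CU': 3, 'AL': 5}
```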

## Learning Resources

- The [r/learnpython wiki](https://www.reddit.com/r/learnpython/wiki/index) has as many links as you'll probably need.
- An O'Reilly subscription gives access to a large number of books.