# Introduction to Writing Functions in Python 

## Write efficient modular code using functions

In the introduction to DRY code, you learned about the DRY (Don't Repeat Yourself) principle.

Functions can help you both eliminate repetition and improve efficiency in your code through modularity.

Modularity means that code is separated into independent units that can be reused and even combined to complete a longer chain of tasks.
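As a minimal sketch of this idea, the two small functions below each perform one conversion and are then combined into a longer chain. The function names (`fahr_to_celsius`, `celsius_to_kelvin`, `fahr_to_kelvin`) are hypothetical and chosen only for illustration.

```python
def fahr_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9


def celsius_to_kelvin(temp_c):
    """Convert a temperature from degrees Celsius to kelvin."""
    return temp_c + 273.15


def fahr_to_kelvin(temp_f):
    """Chain the two independent units above into one longer task."""
    return celsius_to_kelvin(fahr_to_celsius(temp_f))


# Each unit can be used on its own or as part of the chain
boiling_c = fahr_to_celsius(212)
freezing_k = fahr_to_kelvin(32)
print(boiling_c, freezing_k)
```

Because each function is independent, either conversion can be reused elsewhere without bringing the other along.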

## The benefits of functions

- Modularity: Functions only need to be defined once in a workflow (e.g. Jupyter Notebook file, script). Functions that you write for specific tasks can be used over and over, without needing to redefine them. A function that you write for one Python workflow can also be reused in other workflows!
- Fewer variables: When you run a function, the intermediate variables (i.e. placeholders) that it creates are not stored as explicit variables unless you specify otherwise. This saves memory and keeps your Python environment cleaner.
- Better documentation: Well-documented functions help other users understand the steps of your processing and help your future self understand previously written code.
- Easier to maintain and edit your code: Because a function is only defined once in the workflow, you can simply update the original function definition. Then, each instance in which you call that function in your code (i.e. when the same task is performed) is automatically updated.
- Testing: You won't learn about this in this class, but writing functions allows you to more easily test your code to identify issues (i.e. bugs).

### Write modular functions and code

A well-defined function only does one thing, but it does it well and often in a variety of contexts. Often, the operations contained in a good function are generally useful for many tasks.

Take, for instance, the numpy function called mean(), which computes mean values from a numpy array.

This function only does one thing (i.e. computes a mean); however, you may use the np.mean() function many times in your code on multiple numpy arrays because it has been defined to take any numpy array as an input.

```python
import numpy as np

arr = np.array([1, 2, 3])

# Calculate mean of input array
np.mean(arr)
```

```
2.0
```

The np.mean() function is modular, and it can be easily combined with other functions to accomplish a variety of tasks.
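The sketch below illustrates both points: the same `np.mean()` call is reused on a second array, and it is combined with `np.abs()` and `np.max()` to answer a different question. The array `other_arr` and the variable names are made up for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3])
other_arr = np.array([10, 20, 30, 40])  # hypothetical second array

# The same function works on any numpy array
mean_arr = np.mean(arr)
mean_other = np.mean(other_arr)

# Combine np.mean() with np.abs() and np.max() to find the largest
# distance of any value from the mean
largest_deviation = np.max(np.abs(arr - np.mean(arr)))

print(mean_arr, mean_other, largest_deviation)
```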

### Functions create fewer variables

When you code a task line by line, you often create numerous intermediate variables that you do not need to use again.

This is inefficient and can make your code repetitive if you are constantly creating variables that you will not use again.

Functions allow you to focus on the inputs and the outputs of your workflow, rather than the intermediate steps, such as creating extra variables that are not needed.
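A minimal sketch of this behaviour, assuming a hypothetical function `mean_in_mm` and made-up precipitation values: the intermediate variables `total_in` and `mean_in` exist only while the function runs, so only the input and the returned value remain afterwards.

```python
def mean_in_mm(precip_in):
    """Return the mean of a list of precipitation values (inches) in mm."""
    # These intermediate variables exist only while the function runs
    total_in = sum(precip_in)
    mean_in = total_in / len(precip_in)
    return mean_in * 25.4  # 1 inch = 25.4 mm


monthly_precip_in = [1.0, 2.0, 3.0]  # made-up values for illustration
mean_mm = mean_in_mm(monthly_precip_in)

# Check whether the intermediate names were left behind in the environment
intermediates_leaked = "total_in" in globals() or "mean_in" in globals()
print(mean_mm, intermediates_leaked)
```

Writing the same steps line by line outside a function would leave `total_in` and `mean_in` in the environment even though they are never used again.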

## Reasons why functions improve code readability

### Functions can result in better documentation

Ideally, your code is easy to understand and is well-documented with Markdown.

Well-written functions help you document your workflow because:
- They are well-documented by clearly outlining the inputs and outputs.
- They use descriptive names that help you better understand the task that the function performs.
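One common way to outline inputs and outputs is a docstring placed directly under the `def` line. The sketch below uses a hypothetical function, `rescale_to_unit_range`, and a docstring layout with "Parameters" and "Returns" sections; both are illustrative choices rather than a required format.

```python
def rescale_to_unit_range(values):
    """Rescale a list of numbers so they run from 0 to 1.

    Parameters
    ----------
    values : list of float
        Input numbers. Must contain at least two distinct values.

    Returns
    -------
    list of float
        The rescaled numbers, in the same order as the input.
    """
    low, high = min(values), max(values)
    return [(value - low) / (high - low) for value in values]


rescaled = rescale_to_unit_range([2.0, 4.0, 6.0])

# The docstring travels with the function and can also be read with help()
first_doc_line = rescale_to_unit_range.__doc__.splitlines()[0]
print(rescaled)
print(first_doc_line)
```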

### Expressive function names make code self-describing

When writing your own functions, you should name functions using verbs and/or clear labels to indicate what the function does (e.g. in_to_mm for converting inches to mm).

This makes your code more expressive, and easier to read for you, your future self, and your colleagues.
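To make the contrast concrete, the sketch below defines the same conversion twice. Only the name `in_to_mm` comes from the text above; the vague name `f` and the parameter names are hypothetical.

```python
# Hard to read: the name says nothing about the task or the units
def f(x):
    return x * 25.4


# Easier to read: the name states the conversion (inches to mm)
def in_to_mm(length_in):
    return length_in * 25.4  # 1 inch = 25.4 mm


precip_mm = in_to_mm(0.5)
print(precip_mm)
```

Both functions return the same value, but only the second call tells a reader what is happening without opening the function definition.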

### Modular code is easier to maintain and edit

Organising your code using functions from the beginning allows you to explicitly document the tasks that your code performs, as all code and documentation for the function is contained in the function definition.
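The sketch below shows why this helps with maintenance, using a hypothetical function `clean_station_name` and made-up station names. The task is defined once and called in several places, so a change to the cleaning rule only needs to be made inside the function body.

```python
def clean_station_name(name):
    """Return a station name with tidy spacing and capitalisation."""
    return name.strip().title()


# The same task is performed in several places by calling one function
raw_names = ["  boulder ", "FORT COLLINS", "denver  "]
cleaned_names = [clean_station_name(name) for name in raw_names]
report_title = "Precipitation at " + clean_station_name(" longmont")

print(cleaned_names)
print(report_title)
```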

### You can incorporate testing to ensure code runs properly

As your code gets longer and more complex, it becomes more prone to mistakes. For example, if your analysis relies on data that gets updated often, you may want to make sure that all the columns in your spreadsheet are present before performing an analysis, or that the new data are not formatted in a different way.

Changes in data structure and format could cause your code to not run. Or, in the worst case, your code may run but return the wrong values!

If all your code is made up of functions (with built-in tests to ensure that they run as expected), then you can control the input to the function and test that the output returned is correct for that input. This would be difficult to do if all of your code were written line by line with repeated steps.
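As a rough sketch of the column check described above, the functions below compare a list of column names against the names an analysis requires. The function names (`find_missing_columns`, `check_columns`) and the column names are hypothetical, and plain lists stand in for a real spreadsheet.

```python
def find_missing_columns(columns, required):
    """Return the required column names that are absent from columns."""
    return sorted(set(required) - set(columns))


def check_columns(columns, required):
    """Raise a ValueError if any required column name is missing."""
    missing = find_missing_columns(columns, required)
    if missing:
        raise ValueError(f"Missing columns: {missing}")


required_columns = ["date", "station", "precip_in"]

# Control the input, then test that the output is what you expect
assert find_missing_columns(["date", "station", "precip_in"], required_columns) == []
assert find_missing_columns(["date", "precip_in"], required_columns) == ["station"]

# A malformed update to the data stops the analysis early
try:
    check_columns(["date", "rain"], required_columns)
    error_message = None
except ValueError as err:
    error_message = str(err)

print(error_message)
```

Because the check lives in a function, the same test can be rerun every time the data are updated.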

## Summary of writing modular code with functions

It is a good idea to learn how to:
1. Modularise your code into generalisable tasks using functions.
2. Write functions for parts of your code that include repeated steps.
3. Document your functions clearly, specifying the structure of the inputs and outputs with clear comments about what the function does.