# 10 Modules Extra


## 10.1 Introduction
We know how to write functions, but how can we re-use them? Imagine that you started writing code and functions in one file, and the project has grown so much that it would be easier to maintain if it were split over several files, each containing a specific part of the project. Or perhaps you want to re-use some of the functions in other projects as well.

In Python you can import functions and chunks of code from files. Such a file containing the functions is called a *module*. Generally we say that we import a *definition* from a *module*. A module can have one or multiple functions in it. 
The file name is the module name with the suffix `.py` appended. 

The code in a module is made available with the **import** statement. In this way you can import your own functions, but also draw on the very extensive library of functions provided by Python (the built-in modules of the standard library). In this extra section we will look at the syntax for imports and at how to import your own functions.

## 10.2 How imports work
The easiest way to import a module looks like this:

```python
import module1
```

Imagine that in the module `module1`, there is a function called `getMeanValue()`. This way of importing does not make the name of the function directly available; it only binds the module name `module1`, which you can then use to access the functions within the module:

```python
import module1
module1.getMeanValue([1,2,3])
```
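To see the same behaviour without writing a module first, the sketch below uses `math` from the standard library as a stand-in for `module1` (an assumption made only for illustration). It shows that the function is reachable through the module name, while the bare function name stays undefined.

```python
import math

# The module name is available, and its functions are reached through it.
root = math.sqrt(16)
print(root)  # 4.0

# The bare function name was not imported, so using it raises a NameError.
try:
    sqrt(16)
    bare_name_available = True
except NameError:
    bare_name_available = False

print(bare_name_available)  # False
```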

## 10.3 How to create your own module
The easiest example is importing a module from within the same working directory. Let's create a Python module called `module1.py` containing the function `getMeanValue()` that we wrote earlier (you can find it below).

**Create a module in Jupyter Lab/Notebook**
- In order to create a module in Jupyter Lab, first create a new notebook
- Rename the notebook (e.g. 'module1.ipynb') and copy and paste the code into the notebook
- Click 'File', 'Download as' and 'Python'
- Jupyter will now download the file to a local folder; copy it from there to your current working directory (in our case the same directory as we're in right now).

Unfortunately, Jupyter Lab/Notebook doesn't have a streamlined & straightforward way of creating Python modules and Python scripts. When you export the notebook, it will always export the whole Notebook and not just a part of it, which makes it very messy if you have a very large notebook. 

Put the following code in the `module1.py` file.

```python
# When you download this as a Python script, Jupyter will automatically insert the environment shebang here.

def getMeanValue(valueList):
    """
    Calculate the mean (average) value from a list of values.
    Input: list of integers/floats
    Output: mean value
    """
    valueTotal = 0.0

    for value in valueList:
        valueTotal += value
    numberValues = len(valueList)

    return (valueTotal/numberValues)
```
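As an alternative to exporting a whole notebook, a module file can also be written directly from Python. The following is a minimal sketch, not the method this course prescribes: it writes a small file into a temporary directory and imports it. The module name `mean_module_demo` and the temporary directory are assumptions, chosen so that the example cannot overwrite your own `module1.py`. The directory is added to `sys.path`, the list of directories Python searches for modules.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source code of the (hypothetical) demo module, kept as a plain string.
module_source = '''
def getMeanValue(valueList):
    """Return the mean of a list of numbers."""
    return sum(valueList) / len(valueList)
'''

# Write the source to <temporary directory>/mean_module_demo.py
module_dir = Path(tempfile.mkdtemp())
(module_dir / "mean_module_demo.py").write_text(module_source)

# Make the directory importable, then import the module
# by its file name without the .py suffix.
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()
import mean_module_demo

mean_value = mean_module_demo.getMeanValue([1, 2, 3])
print(mean_value)  # 2.0
```

Inside a notebook, writing the file next to the notebook instead of in a temporary directory would make the `sys.path` step unnecessary.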

## 10.4 Import syntax 
We can now use the module we just created by importing it. Because we import the whole `module1` file, we call the function through the module name using dot notation, similar to the way we called methods on lists and strings earlier:

```python
import module1

print(module1.getMeanValue([4,6,77,3,67,54,6,5]))
```

If we were writing code for a huge project, typing long names would quickly become tedious. Programmers therefore tend to create short aliases for the modules they use a lot. Renaming a module on import is a common thing to do (e.g. NumPy as np, pandas as pd, etc.):

```python
import module1 as m1

print(m1.getMeanValue([4,6,77,3,67,54,6,5]))
```
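The same aliasing works for any module. As a small sketch, the block below imports the standard-library module `statistics` under the alias `st` (the alias is an arbitrary choice for this example) and computes the mean of the same list, which gives a quick way to cross-check `getMeanValue()`.

```python
import statistics as st

values = [4, 6, 77, 3, 67, 54, 6, 5]

# The alias is used exactly like the full module name would be.
mean_value = st.mean(values)
print(mean_value)  # 27.75
```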

When importing a module, Python searches the directories listed in `sys.path`, in order. This list starts with the directory that the entry-point script is running from (or the current directory in an interactive session) and also includes locations such as the package installation directory (it's actually a little more complex than this, but this covers most cases).
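The search locations can be inspected, and extended, from within Python. The sketch below prints the current entries of `sys.path` and appends one extra directory; the folder name `my_modules` is hypothetical and only serves as an illustration.

```python
import os
import sys

# sys.path is an ordinary list of directory names, searched in order.
for directory in sys.path:
    print(repr(directory))

# Hypothetical folder name, used only for illustration.
extra_dir = os.path.abspath("my_modules")

# Appending a directory makes the modules inside it importable
# for the rest of this Python session.
if extra_dir not in sys.path:
    sys.path.append(extra_dir)

print(extra_dir in sys.path)  # True
```

Changes to `sys.path` made this way last only for the running session; they are not stored anywhere.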

You can also import a module that lives in a subdirectory by naming that directory in the import statement. Within our folders there is a directory named `modules`, and within this folder there is a module named `module2` (stored as the file `module2.py`). That module contains two functions: `getMeanValue` and `compareMeanValueOfLists`.

```python
from modules import module2

print(module2.getMeanValue([4,6,77,3,67,54,6,5]))
```

```python
from modules import module2 as m2

print(m2.getMeanValue([4,6,77,3,67,54,6,5]))
```

Another way of writing this is to give the full dotted path to the module and explicitly import a single attribute (here a function) from it:

```python
from modules.module2 import compareMeanValueOfLists

print(compareMeanValueOfLists([1,2,3,4,5,6,7], [4,6,77,3,67,54,6,5]))
```

So here we *import* the function compareMeanValueOfLists (without brackets!) from the file *module2* (without .py extension!).
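The two `from ... import ...` forms can be compared side by side. Because the `modules` folder may not be available where you run this, the sketch below uses the standard-library package `urllib` and its module `parse` as a stand-in (an assumption for illustration only). Both forms end up referring to the same function object.

```python
# Form 1: import the module from the package, then use dotted access.
from urllib import parse
print(parse.quote("mean value"))  # mean%20value

# Form 2: import one function directly; no prefix is needed afterwards.
from urllib.parse import quote
print(quote("mean value"))  # mean%20value

# Both names point to the very same function.
same_function = parse.quote is quote
print(same_function)  # True
```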

To get an overview of all the names (functions, variables and so on) defined in a module, use `dir()`:

```python
dir(module2)
```
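The list returned by `dir()` also contains internal names that start with an underscore. A possible way to narrow it down is sketched here, again with the standard-library module `math` standing in for `module2` (an assumption for illustration).

```python
import math

# Names that start with an underscore are internal; filter them out.
public_names = [name for name in dir(math) if not name.startswith("_")]
print(public_names[:5])

# callable() separates functions from constants such as math.pi
function_names = [name for name in public_names if callable(getattr(math, name))]
print("sqrt" in function_names)  # True
print("pi" in function_names)    # False
```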

---
### 10.4.5 Extra exercises

Inspect the file `SampleInfo.txt`. Write a program that:

- Has a function `readSampleInformationFile()` to read the information from this sample data file into a dictionary. Also check whether the file exists.
- Has a function `getSampleIdsForValueRange()` that can extract sample IDs from this dictionary. Print the sample IDs for pH 6.0-7.0, temperature 280-290 and volume 200-220 using this function.

---

```python
import os

def readSampleInformationFile(fileName):

    # Read in the sample information file in comma-delimited format

    # Check whether the file exists
    if not os.path.exists(fileName):
        print(f"File {fileName} does not exist!")
        return None

    # Open the file and read the information
    with open(fileName) as fileHandle:
        lines = fileHandle.readlines()

    # Now read the information. The first line has the header information which
    # we are going to use to create the dictionary!

    fileInfoDict = {}

    headerCols = lines[0].strip().split(',')

    # Now read in the information, use the first column as the key for the dictionary
    # Note that you could organise this differently by creating a dictionary with
    # the header names as keys, then a list of the values for each of the columns.

    for line in lines[1:]:

        line = line.strip()  # Remove newline characters
        if not line:
            continue  # Skip empty lines (e.g. at the end of the file)
        cols = line.split(',')

        sampleId = int(cols[0])

        fileInfoDict[sampleId] = {}

        # Don't use the first column, it is already the key!
        for i in range(1,len(headerCols)):
            valueName = headerCols[i]

            value = cols[i]
            if valueName in ('pH','temperature','volume'):
                value = float(value)

            fileInfoDict[sampleId][valueName] = value

    # Return the dictionary with the file information
    return fileInfoDict

def getSampleIdsForValueRange(fileInfoDict,valueName,lowValue,highValue):

    # Return the sample IDs that fit within the given value range for a kind of value

    sampleIdList = sorted(fileInfoDict.keys())
    sampleIdsFound = []

    for sampleId in sampleIdList:

        currentValue = fileInfoDict[sampleId][valueName]

        if lowValue <= currentValue <= highValue:
            sampleIdsFound.append(sampleId)

    return sampleIdsFound

if __name__ == '__main__':

    fileInfoDict = readSampleInformationFile("../data/SampleInfo.txt")

    # readSampleInformationFile() returns None when the file does not exist
    if fileInfoDict is not None:
        print(getSampleIdsForValueRange(fileInfoDict,'pH',6.0,7.0))
        print(getSampleIdsForValueRange(fileInfoDict,'temperature',280,290))
        print(getSampleIdsForValueRange(fileInfoDict,'volume',200,220))
```
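The solution ends with an `if __name__ == '__main__':` block, which ties back to the topic of this chapter: code under that guard runs when the file is executed as a script, but not when the file is imported as a module. The sketch below demonstrates this with a throwaway module; the name `guard_demo`, its `greet()` function and the temporary directory are all hypothetical and exist only for this illustration.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

# Source code of the (hypothetical) demo module.
source = '''
def greet():
    return "hello from guard_demo"

if __name__ == "__main__":
    print("running as a script")
'''

module_dir = Path(tempfile.mkdtemp())
module_path = module_dir / "guard_demo.py"
module_path.write_text(source)

sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()

# Importing: __name__ is "guard_demo", so the guarded block is skipped.
import guard_demo
print(guard_demo.__name__)  # guard_demo
print(guard_demo.greet())   # hello from guard_demo

# Running as a script: __name__ is "__main__", so the guarded block runs
# and prints "running as a script".
script_globals = runpy.run_path(str(module_path), run_name="__main__")
print(script_globals["__name__"])  # __main__
```

This is why the functions of the exercise solution could be imported from another file without the three `print` calls being executed.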