<div align="center" style=" font-size: 80%; text-align: center; margin: 0 auto">
<img src="https://raw.githubusercontent.com/Explore-AI/Pictures/master/Python-Notebook-Banners/Examples.png"  style="display: block; margin-left: auto; margin-right: auto;";/>
</div>

# Examples: Building a Python package
© ExploreAI Academy

In this notebook, we cover how to build a Python package and share it with others online. We also look at how to install the package and how to update it in the future.

## Learning objectives

In this lesson, we will:
* Understand how Python packages work.
* Create our own Python package.
* Distribute the package on GitHub.



## Outline

  - [Introduction](#Introduction)
  - [1. Set up project working directory](#1.-Set-up-project-working-directory)
  - [2. Build our package](#2.-Build-our-package)
  - [3. Distribute our package](#3.-Distribute-our-package)
  - [4. Install our package](#4.-Install-our-package)
  - [5. Maintaining our package](#5.-Maintaining-our-package)
  - [Conclusion](#Conclusion)

## Introduction

By now, we have made use of some popular Python packages such as `numpy` and `pandas`. It's hard to find a data scientist who hasn't, at the very least, *heard* of either of these packages. 

One of the great features of Python is that it is an **open-source programming language**, with an active community of developers who have helped, and continue to help, make it so user-friendly and versatile. 

NumPy and Pandas are just two examples of useful packages that we will come across on our data science journey. There are thousands of Python packages out there, and today we are going to **learn how to build our own**!

### Requirements

Before we get started, here are a few things we will need to do:

  - Install the [Visual Studio Code IDE](https://code.visualstudio.com/download).
  - Install [Git](https://git-scm.com/downloads).
  - Sign up for a [free GitHub account](https://github.com/).
  
**Note:** It is recommended that we be familiar with GitHub and know how to use Git.

## 1. Set up project working directory

Once all of the necessary software has been installed, we will need to **set up our project working directory**. Let's work through this process step by step to **create** our **file structure** and each of the **files we will need**.

### File structure

We navigate to a **location** in our computer's file system where we wish **to store our project** and **create a new folder**. 

We may **name the folder** whatever we like. For this tutorial, we will name it **`mypackage`**. 

From here on, we will refer to this new folder as our **project's root folder**.

> **Note:** The naming convention for Python packages is to use **short** and **all-lowercase** names. Underscores are permissible but discouraged.

<img src="https://github.com/Explore-AI/Pictures/blob/master/mypackage.jpg?raw=true" alt="Python package root folder - Windows" style="width: 60%;"/>

If you are using a Mac, your folder will look similar to this:

<img src="https://github.com/Explore-AI/Pictures/blob/master/mypackage-mac.png?raw=true" alt="Python package root folder - Mac" style="width: 60%;"/>

The end goal of this tutorial is to make our package **_pip installable_**. For this to be possible, we will need to **structure our files in a particular way**.

### Setup files

Now let's **create our files**. We will do this using the VS Code IDE we installed earlier. 

Open VS Code, click on **`Open...`**, then navigate to and select the root folder, `mypackage`, which we created in the previous step.

> VS Code has a built-in file browser that allows us to create new files and folders. The file browser is shown on the left side of our screen.

Our next step is to **create two new folders** within our project's **root folder**:

- **`mypackage`**
- **`tests`**

<img src="https://raw.githubusercontent.com/Explore-AI/Pictures/master/Python_package_1.png" width="500">

Within the **`mypackage`** folder, we **create two Python files**: 

- **`myModule.py`** : This is where we will write our function – the task we wish our package to do.
- **`__init__.py`** : This file tells Python that the directory is a package.

Within the **`tests`** folder, we **create one Python file**:

- **`test.py`** : This file will be our unit test, to ensure our module is working correctly before we publish our package.

Our project directory should now look like this:   

<img src="https://raw.githubusercontent.com/Explore-AI/Pictures/master/Python_package_2.png" width="400">
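
For readers who prefer to script this step, the same layout could be created with Python's standard `pathlib` module. This is only a minimal sketch and not part of the original walkthrough: the helper name `create_skeleton` is hypothetical, and the demo runs inside a temporary directory so that nothing is written to a real project folder.

```python
import tempfile
from pathlib import Path


def create_skeleton(root):
    """Create the folders and empty files described above under `root`."""
    root = Path(root)
    files = [
        root / "mypackage" / "__init__.py",
        root / "mypackage" / "myModule.py",
        root / "tests" / "test.py",
    ]
    for path in files:
        path.parent.mkdir(parents=True, exist_ok=True)  # create the folder if needed
        path.touch()  # create the file if missing; existing contents are not changed
    return sorted(p.relative_to(root).as_posix() for p in files)


# Demo in a throwaway directory; pass the real root folder path to use it for real.
with tempfile.TemporaryDirectory() as tmp:
    created = create_skeleton(Path(tmp) / "mypackage")
    print(created)
```

Creating the files by hand in VS Code, as described above, gives exactly the same result.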

## 2. Build our package

Now that we have our folder structure set out, we can start writing some code! We will need to do three things:

- Create our function.
- Test our function.
- Write some documentation for our package.

### Create our function

The function we are going to create will perform the task of **returning** the **top-n items** in an **array**, in **descending order**. To do this, we will write an algorithm similar to bubble sort.

> #### Docstrings
All good programmers need to know how to write **clean, concise,** and **descriptive** [docstrings](https://www.python.org/dev/peps/pep-0257/#:~:text=A%20docstring%20is%20a%20string,module%20should%20also%20have%20docstrings.) for their functions. 

Here is an example of a well-documented function:

```python
def fibonacci(n):
    """
    Calculate nth term in Fibonacci sequence

    Args:
        n (int): nth term in Fibonacci sequence to calculate

    Returns:
        int: nth term of Fibonacci sequence,
             equal to sum of previous two terms

    Examples:
        >>> fibonacci(1)
        1
        >>> fibonacci(2)
        1
        >>> fibonacci(3)
        2
    """

    if n <= 1:
        return n

    else:
        return fibonacci(n - 1) + fibonacci(n - 2)
```

Having this level of documentation will help anyone who uses your function to **properly understand how the function works**.
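
Docstring examples written in the `>>>` style can also be executed as small tests using the standard library's `doctest` module. The sketch below is an illustration only: the `square` function is a hypothetical stand-in, chosen so that the block runs on its own.

```python
import doctest


def square(x):
    """
    Return the square of a number.

    Args:
        x (int or float): number to square

    Returns:
        int or float: x multiplied by itself

    Examples:
        >>> square(3)
        9
        >>> square(-2)
        4
    """
    return x * x


# Collect the examples from the docstring and run them.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(square, "square", module=False, globs={"square": square}):
    runner.run(test)

results = runner.summarize(verbose=False)
print(results.failed, "failed out of", results.attempted)
```

If an expected value in the docstring does not match what the function returns, `doctest` reports the mismatch, which helps keep the documentation and the code in agreement.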

We'll do the same for our function. Add the following code into the **`myModule.py`** file:

```python
def top_n(items, n):
    """Return the top n items in an array, in descending order.

    Args:
        items (array): list or array-like object containing numerical values.
        n (int): number of top items to return.

    Returns:
        array: top n items, in descending order.

    Examples:
        >>> top_n([8, 3, 2, 7, 4], 3)
        [8, 7, 4]
    """

    # We add the body of the function just below the docstring:

    for i in range(n):  # Each pass moves the next-largest item to the end
        for j in range(len(items) - 1 - i):

            if items[j] > items[j + 1]:  # If this item is bigger than the next item...
                items[j], items[j + 1] = items[j + 1], items[j]  # swap the two!

    # Get the last n items
    top_items = items[-n:]

    # Return in descending order
    return top_items[::-1]
```

```python
# Check whether the function works
top_n([8, 3, 2, 7, 4], 3)
```

This is what it should look like in VS Code:


<img src="https://raw.githubusercontent.com/Explore-AI/Pictures/master/Python_package_3.png" alt="View of our code in VS Code" style="width: 65%;"/>
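
As a cross-check on the hand-written sort, the same result can be obtained with tools that ship with Python. The sketch below uses the built-in `sorted` and the standard library's `heapq.nlargest`; these are swapped-in alternatives, not the tutorial's bubble-sort approach, and the function names `top_n_sorted` and `top_n_heap` are hypothetical.

```python
import heapq


def top_n_sorted(items, n):
    """Top n items in descending order, using the built-in sorted()."""
    return sorted(items, reverse=True)[:n]


def top_n_heap(items, n):
    """Same result using heapq.nlargest, which avoids sorting the whole input."""
    return heapq.nlargest(n, items)


data = [8, 3, 2, 7, 4]
print(top_n_sorted(data, 3))  # [8, 7, 4]
print(top_n_heap(data, 3))    # [8, 7, 4]
print(data)                   # [8, 3, 2, 7, 4] (the input is left unchanged)
```

One design difference worth noting: the bubble-sort version in `myModule.py` swaps items inside the list it receives, so the caller's list is reordered, whereas both alternatives here return a new list.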

Now add the following to the **`__init__.py`** file. 

> Ensure that you **save** your files.

<br>

```python
from . import myModule
```
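
To see what this import line achieves, the sketch below builds a throwaway package in a temporary folder and imports it. It is an illustration under stated assumptions: the package name `demo_pkg` is hypothetical (chosen to avoid clashing with the real `mypackage`), and its `top_n` is a one-line stand-in rather than the function written above.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Stand-in module source; the real myModule.py contains the bubble-sort version.
MODULE_SOURCE = (
    "def top_n(items, n):\n"
    "    return sorted(items, reverse=True)[:n]\n"
)

with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / "demo_pkg"
    package_dir.mkdir()
    (package_dir / "myModule.py").write_text(MODULE_SOURCE)
    (package_dir / "__init__.py").write_text("from . import myModule\n")

    sys.path.insert(0, tmp)  # make the temporary folder importable
    try:
        importlib.invalidate_caches()
        demo_pkg = importlib.import_module("demo_pkg")
        # Because __init__.py imported the submodule, it is available as an attribute.
        result = demo_pkg.myModule.top_n([8, 3, 2, 7, 4], 3)
    finally:
        sys.path.remove(tmp)

print(result)  # [8, 7, 4]
```

With the import line in place, importing the package alone is enough to reach the module as `demo_pkg.myModule`.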

### Testing our package

It is good practice to write some tests for every function we create.

> Why is this a good practice?

In **`test.py`**, add the following:

```python
from mypackage import myModule

def test_top_n():
    """
    Make sure top_n works correctly
    """

    assert myModule.top_n([8, 3, 2, 7, 4], 3) == [8, 7, 4], 'incorrect'
    assert myModule.top_n([10, 1, 12, 9, 2], 2) == [12, 10], 'incorrect'
    assert myModule.top_n([1, 2, 3, 4, 5], 5) == [5, 4, 3, 2, 1], 'incorrect'
```
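
The same kind of check could also be written with the standard library's `unittest` module, which reports each test separately. This is a sketch rather than the tutorial's own test file: the class name `TestTopN` is hypothetical, and `top_n` is redefined here as a stand-in so that the block runs without the package being importable.

```python
import unittest


def top_n(items, n):
    """Stand-in for myModule.top_n so this sketch runs on its own."""
    return sorted(items, reverse=True)[:n]


class TestTopN(unittest.TestCase):
    def test_returns_largest_items_in_descending_order(self):
        self.assertEqual(top_n([8, 3, 2, 7, 4], 3), [8, 7, 4])

    def test_n_equal_to_length_returns_everything(self):
        self.assertEqual(top_n([1, 2, 3, 4, 5], 5), [5, 4, 3, 2, 1])

    def test_duplicates_are_kept(self):
        self.assertEqual(top_n([5, 5, 1], 2), [5, 5])


# Build and run the suite explicitly so this also works inside a notebook cell.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestTopN)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("All tests passed:", result.wasSuccessful())
```

In a real project, the stand-in would be replaced by `from mypackage import myModule`, and the assertions would call `myModule.top_n`.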

### Creating supporting files

Next, we will need to create another file named **`setup.py`** which **describes our package**. This setup file is what makes our **package installable**. 

In our package's **root directory**, create **`setup.py`** and add the following code. 

> Replace the `url`, `author`, and `author_email` value fields with what is relevant to your package.

```python
from setuptools import setup, find_packages

setup(
    name='mypackage',
    version='0.1',
    packages=find_packages(exclude=['tests*']),
    license='MIT',
    description='EDSA example python package',
    long_description=open('README.md').read(),
    install_requires=['numpy'],
    url='https://github.com/<username>/<package-name>',
    author='<Your Name>',
    author_email='<Your Email>'
)
```

Consult the table below for some additional information on the **parameters** in `setup.py`.

| Parameter | Comments |
|---|---|
| name | The name package managers will use for your project, like `numpy` or `pandas` |
| version | The current version number of your project |
| license | Name of the [license](https://opensource.org/licenses/) you chose |
| description | One-sentence description of your package |
| install_requires | List of all other packages this package depends on; package managers will install these automatically as needed |
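
The `packages=find_packages(exclude=['tests*'])` line in `setup.py` asks setuptools to discover package folders automatically while leaving out anything whose name matches `tests*`. The sketch below illustrates that idea using only the standard library. It is a simplified stand-in and not the setuptools implementation: the helper `discover_packages` is hypothetical, and in this demo the `tests` folder is given an `__init__.py` purely so that the exclusion has something to filter.

```python
import tempfile
from fnmatch import fnmatch
from pathlib import Path


def discover_packages(root, exclude=()):
    """Simplified stand-in for find_packages: list folders containing __init__.py."""
    root = Path(root)
    found = []
    for init_file in sorted(root.rglob("__init__.py")):
        name = ".".join(init_file.parent.relative_to(root).parts)
        # Skip the root itself (empty name) and anything matching an exclude pattern.
        if name and not any(fnmatch(name, pattern) for pattern in exclude):
            found.append(name)
    return found


with tempfile.TemporaryDirectory() as tmp:
    for folder in ("mypackage", "tests"):
        (Path(tmp) / folder).mkdir()
        (Path(tmp) / folder / "__init__.py").touch()

    everything = discover_packages(tmp)
    without_tests = discover_packages(tmp, exclude=["tests*"])

print(everything)     # ['mypackage', 'tests']
print(without_tests)  # ['mypackage']
```

Excluding the tests keeps them out of the installed package, so users only receive the code they need.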

Lastly, we create a **`README.md`** file in our project's root folder. Here, we add anything we would like to **describe our package in more detail**. 

> Go to [this website](https://www.makeareadme.com/) for some helpful info on how to make a proper README file.

<img src="https://github.com/James-Leslie/example-python-package/blob/master/images/4.0_readme.png?raw=true" alt="README file open in the editor" style="width: 65%;"/>

## 3. Distribute our package

We are now ready to ship our package. Let's **package it** up and **distribute it** on GitHub.

### Distribute to GitHub

We now want to publish our package so that anyone else can **download and use it**. We do this by pushing it to a public GitHub repository.

> What are some of the other ways to publish a Python package?

#### a) Initialise local Git repository
Using any terminal, we **navigate to our project's root folder** and issue the following commands, one line at a time.

```bash
git init
git add .
git commit -m "First commit"
```

#### b) Create remote repository
Log into **GitHub** and **create a new repository**. 

> The following image depicts this process, where the GitHub user 'James-Leslie' is creating a new repository. Ensure that your repository is marked as **Public**.

<img src="https://github.com/James-Leslie/example-python-package/blob/master/images/4.2_new_repo.png?raw=true" width="700">


#### c) Push to GitHub
Copy the **URL for the remote repository** and issue the following commands. 

> The image below shows where you can obtain the URL.

```bash
git remote add origin <remoteURL>
git push -u origin master
```

<img src="https://github.com/James-Leslie/example-python-package/blob/master/images/5_new_repo.png?raw=true" width="700">


## 4. Install our package

We can now install our package onto any computer with internet access!  

We issue the command below to **install our package from GitHub**.

> Make sure to replace `your-name` and `your-repo` with the appropriate text.  

```
pip install git+https://github.com/your-name/your-repo.git
```

If you need to **install a later version** of your package, then use:  

```
pip install --upgrade git+https://github.com/your-name/your-repo.git
```
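
After installing or upgrading, it can be useful to confirm which version pip actually installed. The sketch below uses the standard library's `importlib.metadata` (available from Python 3.8); the helper name `installed_version` is hypothetical, and the name passed to it is assumed to match the `name` field in `setup.py`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution_name):
    """Return the installed version string, or None if the package is not installed."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None


# Prints the version from setup.py once the package is installed, otherwise None.
print(installed_version("mypackage"))
```

Once the package is installed, its function can be used like any other: `from mypackage import myModule`, followed by a call to `myModule.top_n`.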

## 5. Maintaining our package

We now have version `0.1` of our first Python package! With this done, we are in a position to make improvements and expand its scope.

### Package development workflow
We follow these steps when making changes to our package:

1. Make changes locally.
2. Push changes to GitHub.
3. Install updated version.

#### 1) Make changes locally
Your package consists of **several interdependent files**. It is important to keep all of these dependencies in check.   

A likely workflow will look something like this:

- add new functions, or improve existing functions
- update `test.py` if needed
- update `__init__.py` if needed
- update `setup.py` if needed (make sure to update the version number)
    
Once we have **tested our functions** and are happy to push the new version, we can optionally build a source distribution (an archive placed in a new `dist` folder) to confirm that the package still builds:
```
python setup.py sdist
```
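
The workflow above asks us to update the version number in `setup.py` with every release. As a small illustration of what "bumping" a version such as `0.1` means, the sketch below increments the last numeric component of a version string. The helper `bump_last_component` is hypothetical and assumes a simple dotted, all-numeric version; it is not a full versioning tool.

```python
def bump_last_component(version):
    """Return `version` with its final numeric component increased by one."""
    parts = version.split(".")
    parts[-1] = str(int(parts[-1]) + 1)
    return ".".join(parts)


print(bump_last_component("0.1"))    # 0.2
print(bump_last_component("1.4.9"))  # 1.4.10
```

The new string would then be written into the `version` field of `setup.py` before committing.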

#### 2) Push changes to GitHub
When we are **ready to publish our updated package**, we follow the commands below:

```
git status
git add .
git commit -m "make sure to include an appropriate commit message"
git push
```

#### 3) Install the updated version
The last step is to **install our updated version**, using the command below: 

```
pip install --upgrade git+https://github.com/your-name/your-repo.git
```


## Conclusion

We have now built a modular Python package and published this package to GitHub. We should now understand how Python packages work and also have gained more experience using Git. Storing our projects on GitHub is a great way to share our portfolio of work with potential employers.


<div align="center" style=" font-size: 80%; text-align: center; margin: 0 auto">
<img src="https://raw.githubusercontent.com/Explore-AI/Pictures/master/ExploreAI_logos/EAI_Blue_Dark.png"  style="width:200px";/>
</div>