# Developing Python Packages

## Introduction

These are my notes for DataCamp's course [_Developing Python Packages_](https://www.datacamp.com/courses/developing-python-packages).

This course is presented by James Fulton, Climate Informatics Researcher. Collaborators are Amy Peterson and Maggie Matsui.

Prerequisites:

- Introduction to Shell
- [Writing Functions in Python](../Writing%20Functions%20in%20Python/Writing%20Functions%20in%20Python.ipynb)

This course is part of these tracks:

- Data Scientist Professional with Python
- Python Programmer

There are no downloadable data sets for this course.

## Versions

This notebook was created using Python 3.11.2.


## From Loose Code to Local Package

### Starting a Package

#### Why Build a Package?

Build a package to:
- make your code easier to use
- avoid copying and pasting code
- keep your functions up to date
- give your code to others

#### Course Content

This course involves building a full package. The course covers:

- file layout
- structuring imports
- making the package installable
- adding licenses and READMEs
- style and unit tests for a high quality package
- registering and publishing your package to PyPI (the Python Package Index)
- using package templates

#### Scripts, Modules, and Packages

| Term | Description |
| :--- | :--- |
| script | a Python file which is run like `python myscript.py` and which is designed to do one set of tasks |
| package | a directory of Python code files to be imported (e.g., `numpy`); all of the code is related and works together |
| subpackage | a smaller package inside a package (e.g., `numpy.random`, `numpy.linalg`) |
| module | a Python file inside a package which stores package code; each module stores some of the package code |
| library | either a package or a collection of packages (e.g., the Python standard library, which includes `math`, `os`, and `datetime`) |

#### Directory Tree of a Package

This is an example of a directory as used in this course:

    mysimplepackage/
    |-- simplemodule.py
    |-- __init__.py

- This directory, `mysimplepackage`, is an example of the simplest kind of Python package
- `simplemodule.py` contains all of the package code
- `__init__.py` marks this directory as a Python package

Initially, the `__init__.py` file is completely empty, but later in the course this file will be used to structure the package imports.
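
As a minimal sketch of how this layout is used, the following builds the same two-file package in a temporary directory and imports it. The function `greet`, the temporary directory, and the `sys.path` manipulation are assumptions made for illustration; they are not part of the course material.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build the package in a throwaway directory (illustrative only).
root = Path(tempfile.mkdtemp())
package_dir = root / "mysimplepackage"
package_dir.mkdir()

# An empty __init__.py marks the directory as a package.
(package_dir / "__init__.py").write_text("")

# simplemodule.py holds the package code; greet() is a hypothetical function.
(package_dir / "simplemodule.py").write_text(
    "def greet(name):\n"
    "    return f'Hello, {name}!'\n"
)

# Make the parent directory importable, then import the module from the package.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
simplemodule = importlib.import_module("mysimplepackage.simplemodule")

message = simplemodule.greet("world")
print(message)
```

In everyday use the import would simply be written `from mysimplepackage.simplemodule import greet`, run from the directory that contains `mysimplepackage/`.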

#### Subpackages

The directory tree for a package can contain subdirectories, as in this example, where `preprocessing` and `regression` are subdirectories of `mysklearn`:

    mysklearn/
    |-- __init__.py
    |-- preprocessing
    |   |-- __init__.py
    |   |-- normalize.py
    |   |-- standardize.py
    |-- regression
    |   |-- __init__.py
    |   |-- regression.py
    |-- utils.py

Each subpackage has its own `__init__.py` file. Use subpackages to organize your code, placing related functions and classes in the same module, and related modules in the same subpackage.

#### Modules, Packages, and Subpackages (Exercise)

Name the different parts of this package directory tree:

    directory1/
    |-- __init__.py
    |-- directory2
    |   |-- __init__.py
    |   |-- file1.py
    |-- file2.py

- Module
    - file1.py
    - file2.py
- Package
    - directory1
- Subpackage
    - directory2
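
The same classification can be sketched in code. This is a rough heuristic, not something from the course: the helper `classify_tree` is hypothetical, it treats any subdirectory containing an `__init__.py` as a subpackage, and it skips the `__init__.py` files themselves when listing modules.

```python
import tempfile
from pathlib import Path


def classify_tree(package_root):
    """Label each part of a package tree as package, subpackage, or module."""
    package_root = Path(package_root)
    labels = {package_root.name: "package"}
    for path in sorted(package_root.rglob("*")):
        relative = path.relative_to(package_root.parent).as_posix()
        if path.is_dir() and (path / "__init__.py").exists():
            labels[relative] = "subpackage"
        elif path.suffix == ".py" and path.name != "__init__.py":
            labels[relative] = "module"
    return labels


# Recreate the exercise's directory tree in a temporary location.
root = Path(tempfile.mkdtemp())
for relative in [
    "directory1/__init__.py",
    "directory1/directory2/__init__.py",
    "directory1/directory2/file1.py",
    "directory1/file2.py",
]:
    path = root / relative
    path.parent.mkdir(parents=True, exist_ok=True)
    path.touch()

labels = classify_tree(root / "directory1")
for name, kind in labels.items():
    print(f"{kind:<10} {name}")
```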

#### From Script to Package (Exercise)

Start with this code and convert it to a generalized function you can use on any text file for any list of search words. This will be the first function in a new library (module or package). This code comes from the course [Writing Functions in Python](../Writing%20Functions%20in%20Python/Writing%20Functions%20in%20Python.ipynb).
```python
# Open the text file
with open('alice.txt') as file:
    text = file.read()

n = 0
for word in text.split():
    # Count the number of times the words in the list appear
    if word.lower() in ['cat', 'cats']:
        n += 1

print('Lewis Carroll uses the word "cat" {} times'.format(n))
```
- Step 1: Create a new directory called `textanalysis` for your package.
- Step 2: Create `__init__.py` and `textanalysis.py` modules inside `textanalysis`.
- Step 3: Copy the code from `myscript.py` into `textanalysis.py`.
- Step 4: Modify `textanalysis.py` to create the function `count_words(filepath, words_list)`, which opens the text file `filepath` and returns the number of times the words in `words_list` appear.

The file `textanalysis/__init__.py` was empty.

This is the code for file `textanalysis/textanalysis.py`:

```python
# File textanalysis.py
def count_words(filepath, words_list):
    with open(filepath) as file:
        text = file.read()
    
    n = 0
    for word in text.split():
        # Count the number of times the words in the list appear.
        if word.lower() in words_list:
            n += 1
    return n
```
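
To check how the function behaves, here is a small sketch that runs it on a temporary file. The sample sentence and file name are made up for illustration, and the function body is repeated only so the snippet runs on its own.

```python
import tempfile
from pathlib import Path


def count_words(filepath, words_list):
    # Same function as in textanalysis.py, repeated so this runs alone.
    with open(filepath) as file:
        text = file.read()

    n = 0
    for word in text.split():
        if word.lower() in words_list:
            n += 1
    return n


# Hypothetical sample text written to a temporary file.
sample = "The cat sat. Cats like other cats, but one cat left."
sample_path = Path(tempfile.mkdtemp()) / "sample.txt"
sample_path.write_text(sample)

n_cats = count_words(sample_path, ["cat", "cats"])
print(n_cats)  # 3: "cats," is not counted because of the trailing comma
```

Because the text is split on whitespace only, a word with punctuation attached (such as `cats,`) does not match an entry in `words_list`. The words in `words_list` should also be lowercase, since each word from the file is lowercased before the comparison.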

This was the directory tree so far:

    textanalysis
    |-- __init__.py
    |-- textanalysis.py

#### Putting Your Package to Work (Exercise)

Create a script `newscript.py` that uses the `textanalysis` package. I couldn't find a way to get the contents of the file `hotel-reviews.txt`, since commands don't work in the terminal window when I use the Safari web browser.

```python
# File myscript.py
from textanalysis.textanalysis import count_words

# Count the number of positive words
nb_positive_words = count_words("hotel-reviews.txt", ["good", "great"])

# Count the number of negative words
nb_negative_words = count_words("hotel-reviews.txt", ["bad", "awful"])

print("{} positive words.".format(nb_positive_words))
print("{} negative words.".format(nb_negative_words))
```

The result was:

    $ python3 newscript.py
    18816 positive words.
    1706 negative words.

This was the directory tree at this point:

    myscript.py
    textanalysis
    |-- __init__.py
    |-- textanalysis.py

### Documentation

#### Why Include Documentation?

Writing documentation helps your users use your code.

Document each:
- function
- class
- class method
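
As a sketch of what documenting a function can look like, here is `count_words` with a docstring. The NumPy-style section layout is one common convention and is an assumption here, as is the one-line rewrite of the function body; the notes above do not yet say which docstring style to use.

```python
import inspect


def count_words(filepath, words_list):
    """Count the occurrences of the given words in a text file.

    Parameters
    ----------
    filepath : str
        Path to the text file to read.
    words_list : list of str
        Lowercase words to count.

    Returns
    -------
    int
        Total number of whitespace-separated tokens in the file that,
        once lowercased, appear in ``words_list``.
    """
    with open(filepath) as file:
        text = file.read()
    return sum(word.lower() in words_list for word in text.split())


# The docstring is the text that help(count_words) displays to a user.
summary = inspect.getdoc(count_words).splitlines()[0]
print(summary)
```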

## Install Your Package from Anywhere

## Increasing Your Package Quality

## Rapid Package Development