# Python Packages

Python packages provide functionality that is not included in the Python standard library. Anyone can write a package and share it through the Python Package Index (PyPI) or a code hosting site like GitHub.

## Python modules

Python modules are `.py` files that contain functions and attributes. They can be accessed by importing them. We can import the `analysis` module because it is in the same directory as this notebook.

In [1]:
import analysis
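
As a small illustration of what importing gives us, the sketch below uses `string` from the standard library, chosen here only as an example of a module that typically lives in a single `.py` file. It shows the file the module was loaded from, then accesses one attribute and one function through the module name.

```python
import string
from pathlib import Path

# A module object remembers which file it was loaded from.
module_path = Path(string.__file__)
print(module_path.name)

# Attributes and functions are accessed with dot notation.
print(string.ascii_lowercase)          # an attribute (a string constant)
print(string.capwords("hello world"))  # a function
```

The same dot notation works for our own modules, which is why the function in `analysis.py` is called as `analysis.dprime`.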

For example, the module has a `dprime` function that takes trial types and responses and calculates d-prime, a measure of response accuracy.

In [2]:
trial_type = ["target", "lure", "lure", "target", "target", "target"]
response = ["old", "old", "new", "new", "old", "old"]
analysis.dprime(trial_type, response)

np.float64(0.967421566101701)
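
The source code of `analysis.dprime` is not shown here. As a rough sketch of the idea, d-prime is commonly defined as the z-score of the hit rate minus the z-score of the false-alarm rate. The function below, `dprime_sketch`, is a hypothetical illustration of that textbook definition using the standard-library `statistics.NormalDist`. It applies no correction for extreme rates, so its values should not be expected to match the output of `analysis.dprime`.

```python
from statistics import NormalDist


def dprime_sketch(trial_type, response):
    """Textbook d-prime: z(hit rate) - z(false-alarm rate). Illustrative only."""
    pairs = list(zip(trial_type, response))
    # Hit rate: proportion of targets called "old".
    target_responses = [r for t, r in pairs if t == "target"]
    hit_rate = target_responses.count("old") / len(target_responses)
    # False-alarm rate: proportion of lures called "old".
    lure_responses = [r for t, r in pairs if t == "lure"]
    fa_rate = lure_responses.count("old") / len(lure_responses)
    # inv_cdf converts a proportion to a z-score; it raises an error for 0 or 1.
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


trial_type = ["target", "lure", "lure", "target", "target", "target"]
response = ["old", "old", "new", "new", "old", "old"]
print(dprime_sketch(trial_type, response))
```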

We can use `help` to see the docstrings for the module and each of its functions, just as we can for built-in modules like `math` and third-party modules like `numpy`.

In [3]:
help(analysis)

Help on module analysis:

NAME
    analysis - Sample module with some sample functions for data analysis.

FUNCTIONS
    dprime(trial_type, response)
        Calculate d-prime for recognition memory task responses.

        Args:
          trial_type:
            An iterable with strings, indicating whether each trial is a "target"
            or "lure".
          response:
            An iterable with strings, indicating whether the response on each trial
            was "old" or "new".

        Returns:
          The d-prime measure of recognition accuracy.

    exclude_fast_responses(response_times, threshold)
        Exclude response times that are too fast.

        Args:
          response_times:
            An iterable with response times.
          threshold:
            Threshold for marking response times. Response times less than or equal
            to the threshold will be marked False.

        Returns:
          filtered:
            An array with only the included respo

Let's try adding a new function. Edit `analysis.py` to add a function called `hello` that takes no inputs and just prints `"hello world"` when called. Add a docstring that says `"Print 'hello world'"`. Try calling it using `analysis.hello()`.

In [4]:
# analysis.hello()  # uncomment to try running the new function

You probably got an error saying `AttributeError: module 'analysis' has no attribute 'hello'`. This is because Python does not automatically update a module when you change the code. Try importing the module again using `import analysis`. Then call the function using `analysis.hello()`.

In [5]:
import analysis
# analysis.hello()  # uncomment to try running the new function

You probably got the same error again. This is because `import` only runs a module's code the first time the module is imported. Python skips modules that have already been imported because importing takes time, and the same module may be imported in many places within a package. To avoid running the same code multiple times, Python imports each module only once per session.
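
One way to see this caching at work is the `sys.modules` dictionary, where Python stores every module that has been imported. A minimal sketch, using the standard-library `math` module as a stand-in for `analysis`:

```python
import sys
import math

# After the first import, the module object is stored in sys.modules.
print("math" in sys.modules)  # True

# A second import statement just looks up the cached object;
# the module's code is not run again.
first = sys.modules["math"]
import math
print(math is first)  # True
```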

What should we do then, if we want to work on developing a new module?

We can always restart the notebook kernel and re-run all the cells. This takes time, though, and can slow things down when working on a notebook.

A better solution is to use the `importlib` module to *reload* a module that has already been imported.

In [6]:
import importlib
importlib.reload(analysis)
# analysis.hello()  # uncomment to try running the new function
# help(analysis.hello)  # uncomment to see function docstring

<module 'analysis' from '/Users/morton/VS Code/datascipsych/assignments/assignment8/analysis.py'>

The `importlib.reload` function takes a module, looks up the current source code, and updates the imported module to reflect the latest code.
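
To see the difference between re-importing and reloading without touching `analysis.py`, here is a self-contained sketch. It writes a throwaway module to a temporary directory, edits the file, and compares the two approaches. The module name `scratch_demo` and its functions are hypothetical and exist only for this demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Make a throwaway directory and put it on the search path.
tmp_dir = tempfile.mkdtemp()
sys.path.insert(0, tmp_dir)

# First version of the hypothetical module: one function.
module_file = Path(tmp_dir) / "scratch_demo.py"
module_file.write_text("def greet():\n    return 'hi'\n")
importlib.invalidate_caches()  # make sure the new file is noticed
import scratch_demo

# "Edit" the file by adding a second function.
module_file.write_text(
    "def greet():\n    return 'hi'\n\n"
    "def shout():\n    return 'HI'\n"
)

import scratch_demo  # already imported, so nothing is updated
print(hasattr(scratch_demo, "shout"))  # False

importlib.reload(scratch_demo)  # re-runs the edited source file
print(hasattr(scratch_demo, "shout"))  # True
```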

### Exercise: developing a module

Add a function called `ismissing` to `analysis.py` that takes in a `response` NumPy array and returns a boolean array that is `True` for items that are equal to `"n/a"`. Make a test array and check that your function works.

#### Advanced

Extend your function to also work with lists. If one of the inputs to a function may be either a list or a NumPy array, you can use `np.asarray` to either convert to a NumPy array (in case the input is a list) or do nothing (if the input is already a NumPy array).
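
The behavior of `np.asarray` described above can be checked directly. This is a minimal sketch, assuming NumPy is installed; the example values are made up for illustration.

```python
import numpy as np

# A list is converted to a new NumPy array.
from_list = np.asarray(["old", "new", "n/a"])
print(type(from_list))

# An existing array is passed through unchanged (no copy is made).
existing = np.array(["old", "new", "n/a"])
same = np.asarray(existing)
print(same is existing)  # True
```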

In [7]:
# answer here

## The search path

Python does not automatically have access to all Python code on your computer. To be imported, a module must be located in one of a list of directories called the search path.

We can see the current path using `sys.path`. The `sys` module provides access to information that is specific to your system.

In [8]:
import sys
sys.path

['/Users/morton/.pyenv/versions/3.12.8/lib/python312.zip',
 '/Users/morton/.pyenv/versions/3.12.8/lib/python3.12',
 '/Users/morton/.pyenv/versions/3.12.8/lib/python3.12/lib-dynload',
 '',
 '/Users/morton/VS Code/datascipsych/.venv/lib/python3.12/site-packages',
 '/Users/morton/VS Code/datascipsych/src']

The path shown will depend on your system. Each entry in the list is one directory where Python will look for modules. The empty string indicates the current directory; if it is included, this means that modules in your current directory, like `analysis.py`, can be imported. You should also see a `site-packages` directory; this is where Python packages are installed when you run `pip install`. You may also see the `src` directory of this project. If a project is installed using a command like `pip install -e .`, with the `-e` flag, then the source code will be added to your path so you can make changes and have them immediately "installed".
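
If you want to know where on the search path a particular module would be found, one option is `importlib.util.find_spec`, which looks a module up without running it. A minimal sketch, using the standard-library `json` package as the example and a made-up name for the missing case:

```python
import importlib.util

# Look up where a module would be loaded from, without importing it.
spec = importlib.util.find_spec("json")
print(spec.origin)

# A name that is not on the search path gives None instead of a spec.
missing = importlib.util.find_spec("no_such_module_for_demo")
print(missing)  # None
```

The path printed for `json` will depend on your system, but it should sit inside one of the directories listed in `sys.path`.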

We can use the `os` module to see files in a directory. Let's use it to look at the files in the current directory.

In [9]:
import os
os.listdir(".")

['analysis.py', '__pycache__', 'python_packages.ipynb']

This should show the `analysis.py` file that we've been working with. Because the current directory is automatically added to the search path, we can import modules from it.

If we look in the `site-packages` directory, we can see all the packages that have been installed into our virtual environment.

In [10]:
sp = [p for p in sys.path if p.endswith("site-packages")][0]
os.listdir(sp)[:10]

['fastjsonschema',
 'polars-1.21.0.dist-info',
 'overrides',
 'jupyterlab_rise-0.43.1.dist-info',
 'arrow-1.3.0.dist-info',
 'async_lru-2.0.4.dist-info',
 'uri_template-1.3.0.dist-info',
 'appnope',
 'packaging',
 'rfc3339_validator-0.1.4.dist-info']
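
The `.dist-info` directories hold metadata about each installed package. Rather than reading directory names, we can query that metadata with the standard-library `importlib.metadata` module. This is a minimal sketch; the names and versions it prints depend on what is installed in your environment.

```python
from importlib import metadata

# Collect (name, version) for each installed distribution.
installed = sorted(
    (
        (dist.metadata["Name"], dist.version)
        for dist in metadata.distributions()
        if dist.metadata["Name"]  # skip entries with missing metadata
    ),
    key=lambda item: item[0].lower(),
)

# Show the first few, alphabetically by name.
for name, version in installed[:10]:
    print(name, version)
```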

Finally, if the `datascipsych` package is installed in editable mode, the search path will include a `src` directory that contains the package. This setup helps us develop a Python package with one or more modules, which we can import and use in our notebooks.

In [11]:
sp = [p for p in sys.path if p.endswith("src")]
if sp:
    print(os.listdir(sp[0]))

['datascipsych.egg-info', 'datascipsych']


The `examples` module in that package has versions of the `hello` and `ismissing` functions that we wrote earlier.

In [12]:
from datascipsych import examples
help(examples)

Help on module datascipsych.examples in datascipsych:

NAME
    datascipsych.examples - Module with example functions.

FUNCTIONS
    hello()
        Print a greeting.

    ismissing(responses)
        Check if responses are n/a.

FILE
    /Users/morton/VS Code/datascipsych/src/datascipsych/examples.py


