# Module Basics

**Chapter 7 - Learning Python, 5th Edition**

Modules are the fundamental unit of code reuse in Python. Any `.py` file is a module,
and Python's import system provides flexible mechanisms for loading, organizing, and
managing code across files. This notebook covers module creation, import variants,
the module search path, and reloading.

## What Is a Module?

In its most common form, a module is a `.py` file containing Python definitions
and statements (built-in and extension modules are compiled code instead).
The first time you import a module, Python executes the file's code and creates
a module object whose attributes are the names defined in that file.

In [None]:
import types
import math

# A module is an object of type 'module'
print(f"type(math) = {type(math)}")
print(f"isinstance(math, types.ModuleType) = {isinstance(math, types.ModuleType)}")

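# Modules loaded from a file have a __file__ attribute; built-in modules do not.
# math is built in on some platforms (e.g. Windows), so read it defensively.
print(f"\nmath.__file__ = {getattr(math, '__file__', None)}")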
print(f"math.__name__ = {math.__name__}")
print(f"math.__package__ = {math.__package__!r}")

# Built-in modules have no __file__
import sys
print(f"\nsys has __file__: {hasattr(sys, '__file__')}")
print(f"sys is built-in: {'sys' in sys.builtin_module_names}")

# Modules are cached in sys.modules after first import
print(f"\n'math' in sys.modules: {'math' in sys.modules}")
print(f"sys.modules['math'] is math: {sys.modules['math'] is math}")
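
The "execute the file, then expose its names as attributes" step can be made visible by loading a file by hand. The following is a minimal sketch rather than how `import` is normally used: the file name `shapes.py` and its contents are hypothetical, and the `importlib.util` calls follow the standard-library recipe for importing a source file directly.

```python
import importlib.util
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical module source, written to a throwaway directory.
    source_path = Path(tmp) / "shapes.py"
    source_path.write_text(
        "SIDES = 4\n"
        "def area(width, height):\n"
        "    return width * height\n"
    )

    # Step 1: describe the module; step 2: create an empty module object;
    # step 3: execute the file's code inside that object's namespace.
    spec = importlib.util.spec_from_file_location("shapes", str(source_path))
    shapes = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(shapes)

# The names assigned in the file are now attributes of the module object.
defined = sorted(name for name in vars(shapes) if not name.startswith("__"))
print(f"Names defined by the file: {defined}")
print(f"shapes.area(3, 5) = {shapes.area(3, 5)}")
```

Because the module is loaded by hand here, it is not registered in `sys.modules`; a regular `import` statement performs that registration as well.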

## Import Statement Variants

Python provides several forms of the `import` statement, each with different
effects on the importing namespace:

| Syntax | Effect |
|--------|--------|
| `import x` | Bind module object to name `x` |
| `import x as alias` | Bind module object to name `alias` |
| `from x import y` | Bind attribute `y` from module `x` |
| `from x import y as z` | Bind attribute `y` to name `z` |
| `from x import *` | Bind all public names from `x` |

In [None]:
# 1. import module - binds the module object
import json
data: str = json.dumps({"key": "value"})
print(f"json.dumps: {data}")

# 2. import module as alias - useful for long names or conventions
import collections as col
counter: col.Counter[str] = col.Counter("mississippi")
print(f"Counter: {counter.most_common(3)}")

# 3. from module import name - binds specific attributes
from pathlib import Path
from typing import NamedTuple

class ServerConfig(NamedTuple):
    host: str
    port: int
    data_dir: Path

config = ServerConfig(host="localhost", port=8080, data_dir=Path("/tmp/data"))
print(f"Config: {config}")

# 4. from module import name as alias - rename on import
from datetime import datetime as dt
now: dt = dt.now()
print(f"Now: {now.isoformat()}")

# 5. from module import * - binds all public names (avoid in production code)
# If the module defines __all__, exactly those names are bound; otherwise
# every name that does not start with an underscore is bound.
# from math import *  # Avoid: pollutes namespace, hides name origins
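
The effect of `__all__` on `from x import *` is easier to see on a small module than on `math`. This sketch builds a throwaway in-memory module (the name `star_demo` and its attributes are hypothetical) and runs the star import inside a separate namespace via `exec`, so the surrounding namespace is not polluted.

```python
import sys
import types

# Build a throwaway module in memory and register it so it can be imported.
star_demo = types.ModuleType("star_demo")
exec(
    "public_name = 1\n"
    "_private_name = 2\n"
    "other_name = 3\n",
    star_demo.__dict__,
)
sys.modules["star_demo"] = star_demo


def star_import_names() -> list[str]:
    """Run 'from star_demo import *' in a fresh namespace and list what it bound."""
    namespace: dict[str, object] = {}
    exec("from star_demo import *", namespace)
    return sorted(name for name in namespace if name != "__builtins__")


# Without __all__: every name that does not start with an underscore.
without_all = star_import_names()
print(f"Without __all__: {without_all}")

# With __all__: exactly the listed names, even underscore-prefixed ones.
star_demo.__all__ = ["public_name", "_private_name"]
with_all = star_import_names()
print(f"With __all__:    {with_all}")

# Clean up the registration.
del sys.modules["star_demo"]
```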

## Module Search Path (sys.path)

When you `import x`, Python searches these locations in order:

1. `sys.modules` cache (already imported modules)
2. Built-in modules (`sys.builtin_module_names`)
3. Directories listed in `sys.path`

`sys.path` is initialized from:
- The directory containing the script being run (or the current directory when no script file is given, as in an interactive session)
- `PYTHONPATH` environment variable
- Installation-dependent defaults (site-packages, etc.)

In [None]:
import sys
from pathlib import Path

# sys.path is a list of strings - Python searches these for modules
print("Module search path (sys.path):")
for i, path in enumerate(sys.path[:8]):
    exists = Path(path).exists() if path else "(empty = cwd)"
    print(f"  [{i}] {path or '(empty string)'} - exists: {exists}")

if len(sys.path) > 8:
    print(f"  ... and {len(sys.path) - 8} more entries")

# Built-in modules are found without sys.path
print(f"\nBuilt-in modules ({len(sys.builtin_module_names)} total):")
print(f"  {', '.join(sorted(sys.builtin_module_names)[:12])}...")

# You can modify sys.path at runtime (useful for development)
custom_path = "/tmp/my_custom_modules"
added = custom_path not in sys.path
if added:
    sys.path.insert(0, custom_path)
    print(f"\nAdded '{custom_path}' to sys.path[0]")

# Clean up, but only remove the entry if this cell added it
if added:
    sys.path.remove(custom_path)
    print(f"Removed '{custom_path}' from sys.path")
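
To check where a top-level module would come from without importing it, `importlib.util.find_spec()` can be asked for the module's spec. The helper below is a small sketch: `describe_origin` is a hypothetical name, and the "missing" module name is assumed not to be installed.

```python
import importlib.util


def describe_origin(module_name: str) -> str:
    """Report where the import system would load a top-level module from."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return "not found"
    # origin is a file path for file-based modules and a label such as
    # 'built-in' for modules compiled into the interpreter.
    return spec.origin or "no origin"


for name in ("sys", "json", "definitely_missing_module_xyz"):
    print(f"{name}: {describe_origin(name)}")
```

For a module that is already imported, `find_spec()` returns the spec stored on the cached module, which mirrors the `sys.modules` check that comes first in the search order.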

## `__name__` and the `__main__` Pattern

Every module has a `__name__` attribute. When a file is run directly,
`__name__` is set to `"__main__"`. When the file is imported, `__name__` is set to the
module's qualified name. This difference enables the common `if __name__ == "__main__"`
pattern, which runs script-only code when the file is executed but skips it on import.

In [None]:
import math
import json

# When imported, __name__ is the module name
print(f"math.__name__ = {math.__name__!r}")
print(f"json.__name__ = {json.__name__!r}")

# In a notebook or script run directly, __name__ is '__main__'
print(f"This notebook's __name__ = {__name__!r}")


# The if __name__ == '__main__' pattern in a module file:
example_module_code = '''
"""Example module demonstrating the __main__ guard."""

def calculate_average(values: list[float]) -> float:
    """Calculate the arithmetic mean."""
    if not values:
        raise ValueError("Cannot average empty sequence")
    return sum(values) / len(values)


def main() -> None:
    """Entry point when run as a script."""
    test_data = [10.0, 20.0, 30.0, 40.0]
    result = calculate_average(test_data)
    print(f"Average: {result}")


if __name__ == "__main__":
    main()
'''

print("\nTypical module structure with __main__ guard:")
print(example_module_code)
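
The guard can also be exercised rather than only printed. As a hedged sketch, the standard-library `runpy.run_path()` function executes a file under a chosen `__name__` and returns the resulting globals; the file name `guard_demo.py` and the `RAN_AS_SCRIPT` flag are hypothetical.

```python
import runpy
import tempfile
from pathlib import Path

# A tiny file whose guarded block flips a flag.
source = (
    "RAN_AS_SCRIPT = False\n"
    'if __name__ == "__main__":\n'
    "    RAN_AS_SCRIPT = True\n"
)

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "guard_demo.py"
    script.write_text(source)

    # Execute the file as if it were run directly ...
    as_script = runpy.run_path(str(script), run_name="__main__")
    # ... and as if it were loaded under its module name.
    as_module = runpy.run_path(str(script), run_name="guard_demo")

print(f"run as __main__:   RAN_AS_SCRIPT = {as_script['RAN_AS_SCRIPT']}")
print(f"run as guard_demo: RAN_AS_SCRIPT = {as_module['RAN_AS_SCRIPT']}")
```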

## Module Attributes and `dir()`

Every module carries attributes set by the import system (dunder attributes such
as `__name__` and `__file__`) alongside the names its own code defines. Called on
a module, `dir()` returns a sorted list of all of those names.

In [None]:
import json

# dir() lists all attributes of a module
json_attrs: list[str] = dir(json)
print(f"Total attributes in json module: {len(json_attrs)}")

# Separate dunder attributes from public API
dunder_attrs: list[str] = [a for a in json_attrs if a.startswith("__")]
private_attrs: list[str] = [a for a in json_attrs if a.startswith("_") and not a.startswith("__")]
public_attrs: list[str] = [a for a in json_attrs if not a.startswith("_")]

print(f"\nDunder attributes ({len(dunder_attrs)}): {dunder_attrs[:6]}...")
print(f"Private attributes ({len(private_attrs)}): {private_attrs[:6]}")
print(f"Public attributes ({len(public_attrs)}): {public_attrs}")

# Key module dunder attributes
print(f"\n--- Key Module Attributes ---")
print(f"__name__:    {json.__name__!r}")
print(f"__file__:    {json.__file__!r}")
print(f"__doc__:     {json.__doc__[:80] if json.__doc__ else None!r}...")
print(f"__package__: {json.__package__!r}")
print(f"__spec__:    {json.__spec__!r}")

# Check whether an attribute exists before using it
print(f"\nhasattr(json, 'dumps'): {hasattr(json, 'dumps')}")
print(f"hasattr(json, 'nonexistent'): {hasattr(json, 'nonexistent')}")

# getattr with default for safe access
encoder_cls = getattr(json, 'JSONEncoder', None)
missing_cls = getattr(json, 'XMLEncoder', None)
print(f"\ngetattr(json, 'JSONEncoder'): {encoder_cls}")
print(f"getattr(json, 'XMLEncoder', None): {missing_cls}")
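
Filtering `dir()` by name prefix says nothing about what kind of object each name refers to. One possible refinement, sketched here with the standard-library `inspect` module, lists only the public plain functions of a module; `public_functions` is a hypothetical helper name.

```python
import inspect
import json
from types import ModuleType


def public_functions(module: ModuleType) -> list[str]:
    """Return the names of a module's public attributes that are plain functions."""
    return [
        name
        for name, _ in inspect.getmembers(module, inspect.isfunction)
        if not name.startswith("_")
    ]


json_functions = public_functions(json)
print(f"Public functions in json: {json_functions}")
```

Classes such as `json.JSONEncoder` are excluded because `inspect.isfunction` is false for them; swapping in `inspect.isclass` would list those instead.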

## Module Reload with `importlib`

Modules are cached in `sys.modules` after the first import. Subsequent imports
return the cached version. Use `importlib.reload()` to force re-execution of
a module's code, which is useful during interactive development.

In [None]:
import importlib
import sys
import types
import tempfile
from pathlib import Path

# Create a temporary module to demonstrate reload behavior
tmp_dir = tempfile.mkdtemp()
module_path = Path(tmp_dir) / "counter_mod.py"

# Write version 1 of the module
module_path.write_text(
    'VERSION: str = "1.0"\n'
    'LOAD_COUNT: int = 1\n'
    'print(f"Module loaded: version={VERSION}")\n'
)

# Add tmp_dir to sys.path so we can import it
sys.path.insert(0, tmp_dir)

# First import - executes the module code
import counter_mod
print(f"After import: VERSION={counter_mod.VERSION}, LOAD_COUNT={counter_mod.LOAD_COUNT}")

# Second import - returns cached version, does NOT re-execute
import counter_mod
print(f"After re-import: VERSION={counter_mod.VERSION} (no reload message)")

# Update the module file
module_path.write_text(
    'VERSION: str = "2.0"\n'
    'LOAD_COUNT: int = 2\n'
    'NEW_FEATURE: str = "added in v2"\n'
    'print(f"Module loaded: version={VERSION}")\n'
)

# importlib.reload() forces re-execution
importlib.reload(counter_mod)
print(f"After reload: VERSION={counter_mod.VERSION}, LOAD_COUNT={counter_mod.LOAD_COUNT}")
print(f"New attribute: NEW_FEATURE={counter_mod.NEW_FEATURE}")

# Important caveat: existing references are NOT updated
# If you did 'from counter_mod import VERSION' before reload,
# your local VERSION would still be "1.0"

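# Clean up. The import normally writes a __pycache__ directory next to the
# module, so remove the whole tree; Path.rmdir() only works on empty directories.
import shutil

sys.path.remove(tmp_dir)
del sys.modules['counter_mod']
shutil.rmtree(tmp_dir)
print("\nCleaned up temporary module")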
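
The caveat about existing references can be demonstrated directly. In this sketch the module name `reload_demo` is hypothetical; a name copied with `from ... import` keeps its old value after the reload, while attribute access through the module object sees the new one.

```python
import importlib
import shutil
import sys
import tempfile
from pathlib import Path

tmp_dir = tempfile.mkdtemp()
module_path = Path(tmp_dir) / "reload_demo.py"  # hypothetical module name
module_path.write_text('VERSION = "1.0"\n')
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()  # make sure the finders notice the new file

try:
    import reload_demo
    from reload_demo import VERSION as copied_version  # copies the current value

    # The new source has a different length, so the cached bytecode is not
    # reused even if both writes land within the same second.
    module_path.write_text('VERSION = "2.0"  # edited\n')
    reloaded = importlib.reload(reload_demo)
    module_version = reload_demo.VERSION

    print(f"reload_demo.VERSION = {module_version!r}")
    print(f"copied_version      = {copied_version!r}")
    print(f"reload() returned the same module object: {reloaded is reload_demo}")
finally:
    # Undo every change made to interpreter state and the file system.
    sys.path.remove(tmp_dir)
    sys.modules.pop("reload_demo", None)
    shutil.rmtree(tmp_dir)
```

This is why accessing reloadable values through the module (`reload_demo.VERSION`) is the safer habit during interactive development.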

## Creating Modules Programmatically

You can create module objects at runtime using `types.ModuleType`. This is
useful for dynamic code generation, testing, and plugin systems.

In [None]:
import types
import sys


def create_config_module(
    name: str,
    settings: dict[str, object],
) -> types.ModuleType:
    """Dynamically create a configuration module from a dictionary."""
    module = types.ModuleType(name, doc=f"Dynamic config module: {name}")
    for key, value in settings.items():
        setattr(module, key, value)

    # Register in sys.modules so it can be imported elsewhere
    sys.modules[name] = module
    return module


# Create a config module dynamically
app_config = create_config_module("app_config", {
    "DATABASE_URL": "postgresql://localhost/mydb",
    "DEBUG": True,
    "MAX_CONNECTIONS": 10,
    "ALLOWED_HOSTS": ["localhost", "127.0.0.1"],
})

# Now it can be imported anywhere in this process
import app_config

print(f"type: {type(app_config)}")
print(f"DATABASE_URL: {app_config.DATABASE_URL}")
print(f"DEBUG: {app_config.DEBUG}")
print(f"MAX_CONNECTIONS: {app_config.MAX_CONNECTIONS}")
print(f"ALLOWED_HOSTS: {app_config.ALLOWED_HOSTS}")
print(f"\nPublic attributes: {[a for a in dir(app_config) if not a.startswith('_')]}")

# Clean up
del sys.modules['app_config']
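
For the code-generation and plugin use cases, the module's contents may come from source text rather than a dictionary. A minimal sketch, assuming the source string is trusted (it is executed as-is): `module_from_source` and `greeter_plugin` are hypothetical names.

```python
import sys
import types


def module_from_source(name: str, source: str) -> types.ModuleType:
    """Create a module object and run the given source text in its namespace."""
    module = types.ModuleType(name)
    # Compiling with a pseudo-filename makes tracebacks easier to read.
    exec(compile(source, f"<{name}>", "exec"), module.__dict__)
    return module


plugin_source = (
    "def greet(who):\n"
    '    return f"Hello, {who}!"\n'
)
plugin = module_from_source("greeter_plugin", plugin_source)

print(plugin.greet("world"))
print(f"greet.__module__ = {plugin.greet.__module__!r}")
# The module is usable through its object but is not importable by name
# unless it is also registered in sys.modules.
print(f"registered in sys.modules: {'greeter_plugin' in sys.modules}")
```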

## Summary

### Key Concepts
1. **Any `.py` file is a module** - Python executes the file and creates a module object
2. **Import variants** - `import x`, `from x import y`, `import x as z` serve different use cases
3. **`sys.path`** determines where Python looks for modules to import
4. **`__name__ == "__main__"`** distinguishes between running a file directly vs importing it
5. **`dir()` and `getattr()`** provide runtime introspection of module contents
6. **`importlib.reload()`** forces re-execution of a module (but does not update existing references)

### Best Practices
- Prefer `import module` over `from module import *` for clarity
- Use `import x as alias` for conventional abbreviations (e.g., `import numpy as np`)
- Always include an `if __name__ == "__main__"` guard in executable scripts
- Avoid modifying `sys.path` in production code; use proper package installation instead