# Agenda, week 5

1. Q&A
2. What are modules? What do they give us?
3. The various forms of the `import` statement
4. Developing a simple module
5. What happens when we `import` a module -- the special `__name__` variable
6. Python's standard library
7. PyPI and `pip` -- installing third-party modules
8. A little bit about `uv`
9. Using third-party modules -- how do you find, choose, download, and use them (and do that safely)?
10. AMA -- ask me anything
11. What next? Now that you've finished this course, where do you go for additional Python learning, resources, etc?

# What are modules? What do they give us?

We've talked a number of times about the "Don't repeat yourself" rule in programming, aka "DRY." If you have written something once, you shouldn't repeat it!

1. If the same line is repeated several times in a row, we should replace that with a loop.
2. If the same code is repeated several times in a program, we should replace that with a function.
3. If the same code is repeated several times across multiple programs, we can use a *library*.

A library is a term used in many programming languages to describe either code or data that was defined by someone else, in the past, which you can load into ("borrow," if you will) your program. You don't have to reinvent the wheel this way!

Just about every programming language supports libraries. This way, you can save yourself time in the future, or you can save your colleagues time, or they can save you time.

Moreover, if there is a bug in a library, it can be fixed once, and the fix will reach all of our programs. Similarly, if there's an optimization or other improvement, we can make it once, and all of our programs will benefit.

In Python, we call our libraries "modules" and "packages":

- A module is a single file in which we've defined Python functions and/or variables
- A package is a directory containing one or more modules.

When we use a module in Python, we get to reuse code and data that other people have defined.

A module also defines what we call a "namespace," a way to ensure that variable names are separated from one another, so different pieces of your program don't step on each other's variables.
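
Here's a small sketch of what a module's namespace gives us. The variable `pi` and its deliberately crude value are my own invention for illustration; only `math.pi` comes from the standard library.

```python
import math

pi = 3  # our own (deliberately crude) value, in our program's namespace

# The module's name lives in the module's namespace, so the two never collide
print(pi)        # our variable
print(math.pi)   # the module's variable, untouched by our assignment
```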

# AP: What do you mean by "separate namespaces" for data and functions?

If I have a software product called ABCDE, and I take out a trademark on that product, then no one else in the software industry can have a product called ABCDE, because that would be confusing.

But if someone comes out with a car and calls it ABCDE, then I cannot stop them from having that trademark, because it's a different field.

You could say that even though the name is the same, it's in two different "namespaces," it has two different domains.

Another example: My name is "Reuven," and there are other Reuvens around. How do we distinguish between me and the others? We use last names! Last names serve as a "namespace," ensuring that I am easily distinguished from other Reuvens out there.

The problem of namespacing is basically that we need to resolve the ambiguity when there are multiple things with the same name.

In many languages, the namespace for functions and the namespace for variables are kept separate. You can have variables `a`, `b`, and `c`, and these names have nothing whatsoever to do with the functions called `a`, `b`, and `c`. They will never interfere with one another. (Whether this is a good idea for you to do in your program is a separate question.)

Python doesn't make this distinction. In Python, we have a single namespace, a single last name, a single domain for trademarks, for all of the things we define, both functions and variables. 

- If you define a variable `print`, you have just erased (in some ways) the builtin function `print`.   Only one can exist at a time! (This isn't quite true in the case of `print` and other builtin functions, but it will feel that way if you make such a definition!)
- If you define a function `total` that takes a bunch of numbers and returns their total, and then you invoke the function and assign the result to a variable named `total`, you have now erased the function's definition! Because now `total` is an integer, not a function.

Among other things, this means that you have to be a bit careful about what names you give to variables and functions, to ensure that they don't collide in this sort of way.

When we use `def` to define a function, we're really assigning to a variable. 
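
The `total` collision described above can be sketched as follows. The function body is my own minimal version, written just to show the name being reused.

```python
def total(numbers):
    """Return the sum of a list of numbers."""
    result = 0
    for one_number in numbers:
        result += one_number
    return result

print(type(total))             # total refers to a function object

total = total([10, 20, 30])    # oops: we reuse the name for the result

print(total)                   # 60
print(callable(total))         # False: the name now refers to an integer
```

After the assignment, calling `total([1, 2, 3])` would raise a `TypeError`, because an integer cannot be called.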

# How do we use a module in Python?

Python comes with a large collection of modules known as the "standard library." One of the modules in there is `random`, which contains a large number of functions having to do with random values -- getting random integers, choosing random elements of a list, etc.

If we want to use that functionality (rather than invent it ourselves), we need to tell Python that we want to load up the `random` module. How do we do that?

We use the `import` statement. It looks like this:

In [1]:
import random

# Dive into the syntax of `import`

1. We say `import`. Note that it's not a function! We don't put the name to its right in `()`.
2. The name to its right, unlike in *most* programming languages I've used in the past, is not a string indicating the filename that we want to import or read. Rather, it's the name of the variable to which the module object will be assigned.
3. There is no (easy, standard) way to use `import` with the name of a file. Rather, you provide the name of the module, and Python uses that as the basis for finding and loading the module you named.

Now that I've imported `random`, what can I do with it? 

Answer: Use any/all of the values that it now contains.

Well, what does it contain?

We can find out in a few different ways. The easiest (in Jupyter) is to run the `dir` function, which returns a list of strings, names that can go after a `.` after the module name.

If `dir(random)` includes the string `randint`, then we can use `random.randint` as a value or function, depending on what it is. I know that `random.randint` is a function, and we can then call

    random.randint(0, 100)

to get a random integer from 0 to 100, with both ends included.

If you want documentation showing not just the names, but what they do, then you can go to `docs.python.org` and look up the `random` module.
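
As a quick sketch of exploring a module with `dir`, the following checks for `randint` and filters out the internal names. The variable names `names` and `public_names` are my own.

```python
import random

names = dir(random)            # a list of strings

print('randint' in names)      # is there a random.randint?
print(callable(random.randint))

# Names starting with _ are internal; filter them out for a shorter list
public_names = [one_name for one_name in names if not one_name.startswith('_')]
print(public_names[:5])
```

In Jupyter or the interactive Python prompt, you can also run `help(random.randint)` to see the documentation for a single function.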

# Exercise: Guessing game

1. Have the computer choose a random integer from 0-100 using `random.randint` and assign to `number`.
2. Ask the user, repeatedly, to guess a number:
    - If it's too high, say "too high"
    - If it's too low, say "too low"
    - Otherwise, say "just right" and exit
3. If it's not right, then keep going, and ask the user to guess again.
4. When exiting the program, print not only the number, but how many guesses it took.

In [3]:
import random

number = random.randint(0, 100)
guess_counter = 0

while True:
    s = input('Guess: ').strip()

    if not s.isdigit():
        print('Not numeric; try again!')
        continue

    guess = int(s)
    guess_counter += 1

    if guess == number:
        print('You got it!')
        break
    elif guess < number:
        print('Too low!')
    else:
        print('Too high!')

print(f'Number was {number}, got it in {guess_counter} guesses.')        

Guess:  50


Too low!


Guess:  75


Too high!


Guess:  62


Too high!


Guess:  57


Too high!


Guess:  53


Too high!


Guess:  52


Too high!


Guess:  51


You got it!
Number was 51, got it in 7 guesses.


# Where did Python load `random` from?

We said `import random`, and we know that the `random` variable was assigned a module object, and that the module was populated from somewhere. But where? 

We can ask `random` to tell us about itself:

In [4]:
random

<module 'random' from '/Users/reuven/.pyenv/versions/3.13.5/lib/python3.13/random.py'>

How did saying `import random` get translated into loading the file in my `.pyenv` directory, deep down?

`sys.path` is the answer. The `sys` module gives us access to Python's runtime system. When Python runs, it consults `sys` all of the time. You have to `import sys` to use it, but that just defines the name `sys` in your program; the module itself is already loaded when Python starts up.

`sys.path` is a list of strings that Python consults to find modules we're loading.

In [5]:
import sys

sys.path

['/Users/reuven/.pyenv/versions/3.13.5/lib/python313.zip',
 '/Users/reuven/.pyenv/versions/3.13.5/lib/python3.13',
 '/Users/reuven/.pyenv/versions/3.13.5/lib/python3.13/lib-dynload',
 '',
 '/Users/reuven/.local/lib/python3.13/site-packages',
 '/Users/reuven/.pyenv/versions/3.13.5/lib/python3.13/site-packages']

In [6]:
random

<module 'random' from '/Users/reuven/.pyenv/versions/3.13.5/lib/python3.13/random.py'>

What if I want Python to look in other places? There are ways to do it. One way is with an environment variable, `PYTHONPATH`, set in your OS.
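
Because `sys.path` is an ordinary list of strings, another option is to add a directory to it from within a program. Here is a minimal sketch, assuming a throwaway directory and a made-up module named `pathdemo`; neither is part of this course's files.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway directory containing a tiny module (hypothetical name)
extra_dir = tempfile.mkdtemp()
Path(extra_dir, 'pathdemo.py').write_text('greeting = "Hello from pathdemo"\n')

# sys.path is a plain list of strings, so we can add to it
sys.path.append(extra_dir)
importlib.invalidate_caches()   # make sure the import system notices the new file

import pathdemo

print(pathdemo.greeting)
print(pathdemo.__file__)        # shows that it was loaded from extra_dir
```

Changing `sys.path` this way affects only the currently running program. The temporary directory is left behind here, which is fine for a demo but not something to do in real code.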

# Can I just invoke `randint`?

What if I get tired of writing `random.randint` all of the time? I just want to write `randint`! Will that work?

In [7]:
randint(0, 100)

NameError: name 'randint' is not defined

# `from .. import`

The special `from .. import` syntax lets us import a module, but instead of defining the module's name as a variable, Python defines only the name (or names) that we asked for.

If I were to say

    from random import randint

then I wouldn't be able to say `random.randint`. That's because the only thing we defined in the above line is one variable, `randint`. `random` isn't defined, and thus `random.randint` isn't available.

Can you say both of the following?

    import random
    from random import randint

Yes, and sometimes I even encourage that.    
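
A short sketch of using both forms together. It shows that the two names lead to the same function, so nothing is loaded twice.

```python
import random
from random import randint

# Both names refer to the very same function object
print(random.randint is randint)

print(randint(0, 100))           # short form, for the function we use most
print(random.choice('abc'))      # everything else is still available via the module
```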

# What if I want to give the module an alias?

What if, instead of importing the module and getting its name dictated by its filename, I want to have the module variable defined with a different name?

For this, we have `import .. as`. This imports as per usual, but we get to define an alias for our module.

This is useful in two cases:

1. The name is super long and annoying
2. The name appears in more than one place, and we want to ensure that there aren't collisions.

In [11]:
import random as r

In [12]:
r.randint(0, 100)

57

# The fourth version: `from MODULE import NAME as ALIAS`

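We can also combine the two previous forms: import just one name from a module, and give that name an alias of our choosing.
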
In [13]:
from random import randint as ri

# The four ways to use `import`:

1. `import MODNAME` -- the standard way to do things, which imports the entire module and defines `MODNAME` as a variable
2. `import MODNAME as ALIAS` -- just rename the variable that will be used for the module
3. `from MODNAME import NAME` -- don't define `MODNAME`, but do define `NAME`
4. `from MODNAME import NAME as ALIAS` -- rename the imported `NAME` as `ALIAS`.
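
The four forms, side by side, in one runnable sketch. The aliases `r` and `ri` are the same ones used in the cells above.

```python
import random                        # 1. defines random
import random as r                   # 2. defines r
from random import randint           # 3. defines randint
from random import randint as ri     # 4. defines ri

# All four names lead to the same module and the same function
print(r is random)
print(ri is randint)
print(r.randint is random.randint)

print(ri(0, 100))
```

In a real program you would normally pick just one of these forms for a given module; they're combined here only for comparison.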

# Importing more than one module

Each `import` should be on a line by itself. (Python does accept `import abcd, efgh`, but the convention is one module per line.)

If you want to import `abcd` and `efgh` and `ijkl`, you should say:

    import abcd
    import efgh
    import ijkl

If you're using `from .. import`, then you can import multiple names from the same module:

    from abcd import thing1, thing2
    

# Exercise: Another guessing game!

1. Define a list of strings, where each string represents one of the months of the year.
2. Use `random.choice`, a function in the `random` module, to select one of the months. However, you don't want to invoke it as `random.choice`, but rather as just `ch`, for short. You'll call `ch` on the list of month names, and get a random one.
3. Ask the user, repeatedly, to guess the month that was selected.
4. When they answer correctly, indicate how many times they guessed.

In [14]:
months = 'Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec'.split()

In [15]:
months

['Jan',
 'Feb',
 'Mar',
 'Apr',
 'May',
 'Jun',
 'Jul',
 'Aug',
 'Sep',
 'Oct',
 'Nov',
 'Dec']

In [16]:
# by using from .. import .. as
# - I imported the random module
# - I said that I want to define just the "choice" function from the module, not the module itself
# - but I want to alias "choice" to something else, "ch"

from random import choice as ch

In [20]:
ch(months)

'Dec'

In [21]:
from random import choice as ch

months = 'Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec'.split()
random_month = ch(months)
guess_counter = 0

while True:
    guess = input('Guess the month: ').strip().capitalize()[:3]
    guess_counter += 1

    if guess == random_month:
        print('You got it!')
        break

print(f'You got it in {guess_counter} guesses!')    

Guess the month:  Jul
Guess the month:  May
Guess the month:  Mar
Guess the month:  Jan
Guess the month:  Feb
Guess the month:  Apr
Guess the month:  Sep
Guess the month:  Oct
Guess the month:  Nov
Guess the month:  Dec
Guess the month:  Aug


You got it!
You got it in 11 guesses!


Many, *many* famous modules expect us to alias them when we import them:

- `import numpy as np`
- `import pandas as pd`
- `import matplotlib.pyplot as plt`
- `from plotly import express as px`

# Next up

1. Write a simple module and `import` it
2. How do modules work?

# Writing a module

Writing a module is one of the easiest things you can do in Python! (Distributing it, packaging it up, etc., can be frustrating and confusing.) However, if you're writing a module that will be used in one particular program, then you can get away with just putting the module file inside of the program's directory. That's because the directory containing your program (or the current directory, in Jupyter) is on `sys.path`, so `import` will look there.

What is a module?

- A file
- containing Python variable and function definitions
- with a `.py` suffix
- in a directory from which it can be loaded (in this case, the same as our program or our Jupyter notebook)

In [22]:
# right now, in the same directory as this Jupyter notebook, is an empty file
# called mymod.py

# if I say "import mymod", will it work?

import mymod

In [23]:
dir(mymod)  # show me all of the names defined on this module

['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__']

# What is defined on an empty module?

Python did take our `import mymod` statement, look for `mymod.py` in the current directory, and even found it there! It then loaded the module's contents (very fast and easy) into memory, and all of the things that were defined there (which is none) were set to be attributes on `mymod`, the module object.

But even though we didn't define anything, Python does, behind the scenes. In this case, they're all "dunder" names, meaning they start and end with "double underscore."

A few of these:

- `__builtins__` -- this is an alias to the `builtins` namespace in Python, where such things as `print`, `int`, `dict`, and `len` are all defined
- `__file__` -- the name of the file that was loaded, as a string
- `__name__` -- the name of the module that was loaded, as a string

In [24]:
mymod.__name__

'mymod'

In [25]:
mymod.__file__

'/Users/reuven/Courses/Current/oreilly-2025-08August-python/mymod.py'

In [26]:
# after I have added definitions for x, y, and hello, I'll reload my module

import mymod

In [27]:
dir(mymod)

['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__']

# Why don't I see the new names?

The rule is that Python will only import a module *once* during the run of any given program. Since we had already loaded `mymod` earlier, even though what we loaded was an earlier version of the file, Python won't load the new one.

Some options:

1. Restart Python ("restart kernel" in Jupyter)
2. `import importlib`, and then invoke `importlib.reload(mymod)`, which forces a reload
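
Here is a self-contained sketch of the "import only once" rule and of `importlib.reload`. It assumes a throwaway directory and a made-up module named `reloaddemo`, so that it doesn't depend on `mymod.py` being present.

```python
import importlib
import sys
import tempfile
from pathlib import Path

module_dir = tempfile.mkdtemp()
module_file = Path(module_dir, 'reloaddemo.py')   # hypothetical module name
sys.path.append(module_dir)

module_file.write_text('x = 100\n')
importlib.invalidate_caches()
import reloaddemo

print('reloaddemo' in sys.modules)    # True: Python remembers what it has imported

# Edit the file, adding a new definition
module_file.write_text('x = 100\ny = [10, 20, 30]\n')

import reloaddemo                     # already in sys.modules, so the file isn't re-read
print(hasattr(reloaddemo, 'y'))       # False

importlib.reload(reloaddemo)          # re-executes the (edited) file
print(hasattr(reloaddemo, 'y'))       # True
```

The `sys.modules` dictionary is where Python keeps track of the modules it has already loaded, which is why the second `import` does nothing new.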

In [28]:
from importlib import reload  # now I have a "reload" function at my disposal

In [29]:
reload(mymod)   # reload our mymod module

<module 'mymod' from '/Users/reuven/Courses/Current/oreilly-2025-08August-python/mymod.py'>

In [30]:
dir(mymod)

['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'hello',
 'x',
 'y']

In [31]:
mymod.x

100

In [32]:
mymod.y

[10, 20, 30]

In [33]:
mymod.hello('world')

'Hello, world from mymod!'

# Mapping the names

If you define a variable in your module file, it'll be available as an attribute on the module object after `import`. Meaning:

- In `mymod.py`, I defined `x`. After importing `mymod`, we have `mymod.x`.
- In `mymod.py`, I defined `y`. After importing `mymod`, we have `mymod.y`.
- In `mymod.py`, I defined the function `hello`. After importing `mymod`, we have `mymod.hello`.
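
The file itself isn't shown in these notes, so the following is only a guess at what `mymod.py` might have contained at this point, reconstructed from the outputs above. The parameter name `name` is my assumption.

```python
# mymod.py -- a reconstruction, not the actual file from class

x = 100
y = [10, 20, 30]

def hello(name):
    return f'Hello, {name} from mymod!'
```

Each top-level name in the file (`x`, `y`, and `hello`) becomes an attribute on the module object after `import mymod`.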

# Exercise: `menu` module

1. Create a new file in the same directory as your Jupyter notebook, called `menu.py`. This module will be used by many programs that want to ask the user to choose from several potential actions.
2. In `menu.py`, define a function called `menu`. That function should take a list of strings as an argument, and ask the user to choose from among them.
3. If the user chooses one of the strings in that argument, return that value. Otherwise, have the user try again.
4. `import` the module from Jupyter, and see how it works.

In [34]:
import menu

s = menu.menu(['a', 'b', 'c'])

print(f'You chose {s}.')

Choose: ['a', 'b', 'c'] no!
Choose: ['a', 'b', 'c'] really, no!
Choose: ['a', 'b', 'c'] 
Choose: ['a', 'b', 'c'] d
Choose: ['a', 'b', 'c'] C
Choose: ['a', 'b', 'c'] b


You chose b.
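
The contents of `menu.py` aren't shown above, so here is one possible sketch of it, written to behave like the session shown. The parameter name `options` and the exact prompt format are my assumptions.

```python
# menu.py -- a sketch, not necessarily the version written in class

def menu(options):
    """Ask the user to choose one of the strings in options; return the choice."""
    while True:
        choice = input(f'Choose: {options} ').strip()

        if choice in options:
            return choice
        # anything else (including empty input): loop around and ask again
```

Note that the comparison is case sensitive, which is why `C` was rejected in the session above when the options were `a`, `b`, and `c`.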


# What happens when we import a module?

We know that before a module is imported, the module hasn't been read into Python.

But after a module is imported, then all of its variables and functions are attributes on the newly created module object.

This means that when we create the module object, we also read through the module file and assign what the module's author asked to assign.

In other words, if I say `import mymod`, then Python creates a skeleton, bare-bones `mymod` module. Then Python turns to the module file, goes through it, and every definition is added to that module object.

But wait -- what does it mean for us to "go through the module file"? The answer is: We execute the module file! Every single assignment of a variable is turned into the assignment of an attribute on the module. The same goes for every function.

In other words: When we `import` a module, the entire module file is executed.

In [35]:
reload(mymod)  # tell mymod to be reloaded

Hello from mymod!
Goodbye from mymod!


<module 'mymod' from '/Users/reuven/Courses/Current/oreilly-2025-08August-python/mymod.py'>

# Notice:

I said that when Python reads through (and executes) the module, line by line, it turns every variable/function definition into an attribute on the module object.

Maybe this works both ways? Maybe, just as `x`, `y`, and `hello` are defined on the module object, maybe things on the module object are also available in the module? For example, `__name__` might be available inside of the module.

In [36]:
reload(mymod)  # tell mymod to be reloaded

Hello from mymod!
Goodbye from mymod!


<module 'mymod' from '/Users/reuven/Courses/Current/oreilly-2025-08August-python/mymod.py'>

# `__name__`

When we `import` a module, its `__name__` is set to the name of the module (the filename without the `.py` suffix), a short string. But if we execute a module as a program, as if it were the first and only program (rather than a library being loaded), then `__name__` is instead set to the string `'__main__'`.

This allows us to distinguish between when our module is imported and when it is being run. We can say

    if __name__ == '__main__':

The above line means:

- The code indented below it runs only when the file is executed directly as a program, which is when `__name__` is `'__main__'`.
- When the file is imported as a module, `__name__` is the module's name, so that code is skipped.

Who uses this? Everyone!
- Run tests
- Give a demo
- Let the user try something interactively
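
A minimal sketch of this pattern, using a made-up module called `greet.py` with a single function:

```python
# greet.py -- hypothetical module, for illustration only

def greet(name):
    return f'Hello, {name}!'

if __name__ == '__main__':
    # Runs only when we execute "python greet.py",
    # not when another program says "import greet"
    print(greet('world'))
```

Someone who says `import greet` gets the `greet.greet` function and no output. Someone who runs the file directly gets the little demo.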


# Exercise: Interactive menu demo

Modify `menu.py`, such that someone running it from the command line will be invited to enter words, and will then be asked to choose from among them.

- If I `import menu`, then it will do as before
- If I run `menu.py` as a program, then it'll demo my menu system.

In [37]:
reload(menu)

menu


<module 'menu' from '/Users/reuven/Courses/Current/oreilly-2025-08August-python/menu.py'>

In [39]:
reload(menu)

<module 'menu' from '/Users/reuven/Courses/Current/oreilly-2025-08August-python/menu.py'>

# Next up

- Python's standard library
- PyPI and `pip`

# What comes with Python?

When we download and install Python, we obviously get the Python language. This allows us to run Python programs. 

In addition, we get the "standard library," a bunch of modules that you can assume everyone running Python also has.

If you're running Python 3.13, then you can assume that anyone else with Python 3.13 has the same modules available, thanks to the standard library.

Even if "batteries included" is no longer as true as was once the case, the standard library still includes a *huge* amount of functionality. Some old things are being removed, but not many new things are being added -- that's because the standard library is only updated with new releases of Python. If you're a library author, you often want to be on your own schedule.

Where is the standard library located? It depends on your Python installation!

In [40]:
import sys
sys.path

['/Users/reuven/.pyenv/versions/3.13.5/lib/python313.zip',
 '/Users/reuven/.pyenv/versions/3.13.5/lib/python3.13',
 '/Users/reuven/.pyenv/versions/3.13.5/lib/python3.13/lib-dynload',
 '',
 '/Users/reuven/.local/lib/python3.13/site-packages',
 '/Users/reuven/.pyenv/versions/3.13.5/lib/python3.13/site-packages']

How do we use something in the standard library? We just `import` it. It's already in `sys.path`, so we can just `import` it and use it.

# Exercise: Globbing

The `glob` module contains a function `glob` (i.e., `glob.glob`) that takes a single argument, a string with a pattern (e.g., `*` and `?`). It returns a list of strings, filenames matching that pattern. 
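
To see how the `*` and `?` patterns behave, here is a small sketch that creates a few empty files in a temporary directory and matches against them. The filenames are made up for the demonstration.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create a few empty files (made-up names) to match against
    for one_name in ['a1.txt', 'a2.txt', 'abc.txt', 'b1.py']:
        open(os.path.join(tmp, one_name), 'w').close()

    # * matches any number of characters
    txt_files = sorted(os.path.basename(p)
                       for p in glob.glob(os.path.join(tmp, '*.txt')))

    # ? matches exactly one character
    a_files = sorted(os.path.basename(p)
                     for p in glob.glob(os.path.join(tmp, 'a?.txt')))

print(txt_files)   # ['a1.txt', 'a2.txt', 'abc.txt']
print(a_files)     # ['a1.txt', 'a2.txt']
```

`glob.glob` doesn't promise any particular order, which is why the results are sorted before printing.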

1. Ask the user to enter a pattern
2. Find all of the files matching that pattern
3. For each file, print its size (which you can calculate by opening the file and totalling up the lengths of the lines)
4. The output should be filenames and their sizes.

Use a pattern that matches only text files, not binary files, to avoid problems when reading them line by line.

In [50]:
import glob

user_pattern = input('Enter filename pattern: ').strip()

for one_filename in glob.glob(user_pattern):
    total_size = 0

    # "with" ensures that each file is closed when we're done with it
    with open(one_filename) as infile:
        for one_line in infile:
            total_size += len(one_line)

    print(f'{one_filename}:{total_size}')


Enter filename pattern:  *.py


menu.py:406
mymod.py:153


# Beyond the standard library

What if there's functionality that you know exists, and must have been shared with the Python community, but isn't in the standard library?

You can look at PyPI, the Python Package Index. This was originally just a set of links to the files people would download with Python functionality. Nowadays, PyPI is sponsored by the Python Software Foundation (PSF) and it hosts the files themselves.

If you find something in the standard library, that's always best. But if you don't, you'll often find a solution to your problem on PyPI.

In [51]:
!pip install pdfcombine

Collecting pdfcombine
  Downloading pdfcombine-1.1.5-py3-none-any.whl.metadata (7.6 kB)
Collecting docopt>=0.6.2 (from pdfcombine)
  Downloading docopt-0.6.2.tar.gz (25 kB)
  Preparing metadata (setup.py) ... done
Downloading pdfcombine-1.1.5-py3-none-any.whl (9.3 kB)
Building wheels for collected packages: docopt
  DEPRECATION: Building 'docopt' using the legacy setup.py bdist_wheel mechanism, which will be removed in a future version. pip 25.3 will enforce this behaviour change. A possible replacement is to use the standardized build interface by setting the `--use-pep517` option, (possibly combined with `--no-build-isolation`), or adding a `pyproject.toml` file to the source tree of 'docopt'. Discussion can be found at https://github.com/pypa/pip/issues/6334
  Building wheel for docopt (setup.py) ... done
  Created wheel for docopt: filename=docopt-0.6.2-py2.py3-none-any.whl size=13706 sha256=3bcde15da045c4239a9d6ed92caa718919b7613189367e47eb2f5c2a

In [52]:
import pdfcombine

# Exercise: Colorize text!

1. The "rich" library lets you invoke `rich.print('before [blue]stuff[/blue] after')`. When you do that, the text gets colorized.
2. Download and install `rich` from PyPI
3. Use `rich.print` to do the above, or something like it.

In [53]:
%pip install rich

Note: you may need to restart the kernel to use updated packages.


In [56]:
import rich 

rich.print('Hello, [red]out there[/red]!')

In [57]:
rich.print('Hello, [red on yellow]out there[/red on yellow]!')

# Next up

1. A bit more about PyPI, `pip`, and `uv`
2. AMA (ask me anything!)
3. Next steps in Python

# `pip` and installing packages from PyPI

A module is a single file, like the ones that we wrote earlier, containing Python code. And a package is a directory containing one or more modules. Things get more complex when we want to distribute a package, because then there's additional data about dependencies -- what other packages/modules this package needs. The sum total is known as a "distribution package," with the Python package and the metadata that comes with it. This is what PyPI helps us to distribute.

When you download and install a file using `pip`, it's going to PyPI, downloading the distribution package, opening it up, and putting it in the `site-packages` directory for your version of Python. Every version of Python gets its own `site-packages` directory. Every `site-packages` directory has room for one package of each name.
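
If you're curious where your own `site-packages` directory is, the standard library's `sysconfig` module can tell you. This is just a sketch; the variable name `site_packages` is mine, and the path you see will depend on your installation.

```python
import sys
import sysconfig

# Where does this particular Python install third-party packages?
site_packages = sysconfig.get_paths()['purelib']

print(site_packages)
print(site_packages in sys.path)   # usually True, which is how "import" finds installed packages
```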

If you have more than one Python project on your computer, then each will have its own packages to download and install. It's quite possible that they'll have contradictory requirements -- that one of your projects will need version 1.0 of a package, but another of your projects will require version 2.0 of that same package. Python can't handle that easily. To do it, we need something known as a "virtual environment."

Part of the basic way that Python tries to protect you from such issues is by having `pip` only download and install packages that it doesn't already have.

If you say, on the command line,

    pip install amazingpackage

and `amazingpackage` is already installed, then it won't do anything -- even if `amazingpackage` has been updated 100 times on PyPI since you installed the version that you have.

If you want to upgrade, you can use

    pip install -U amazingpackage
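
Before upgrading, you might want to know which version (if any) is currently installed. One way to check from within Python is the standard library's `importlib.metadata` module. The helper function `installed_version` is my own, written for this sketch.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package_name):
    """Return the installed version of a distribution package, or None."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None

print(installed_version('pip'))
print(installed_version('amazingpackage'))   # None, unless you really have one by that name
```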

The thing is, packaging in Python has become very complex, frustrating, and full of tools that do some (but not all) of what you want. Some check for these contradictions. Some put together distribution packages. Some run virtual environments.

About a year ago, some people came out with `uv`, a totally new package manager that claims to be 10-100 times faster than `pip`. It turns out that it really is dramatically faster, in part because it's written in Rust.

I started to use `uv` instead of `pip`:

    uv pip install mypackage

Just this alone is a big advance over `pip`, but I'm slowly but surely learning to use `uv` in new and better ways. Everything I do, I'm trying to do inside of a `uv` "project," meaning a directory for my software that contains information about what packages I'm using.

# AMA 

# What next?

1. You will only improve with practice.
    - http://practiceyourpython.com/
    - Do small projects that are on topics that are of interest to you!
    - Join an open-source project
2. Go to conferences!
    - Attend the talks, and learn from them
    - Be social! Talk to people at meals and coffee breaks, and ask what they do!
    - Big ones each year are PyCon US (Long Beach, CA, in May 2026) and EuroPython (which will be in July 2026)