# Agenda, day 5: Modules and packages

1. Q&A
2. What are modules? What do they contain?
3. `import`, and how to use a module
4. The different forms of `import`, and how/why we use them
5. How can we develop our own module?
6. The special `__name__` value in a module
7. Python's standard library
8. PyPI and `pip`, and installing packages from the Internet
9. What next?
10. AMA -- ask me anything!

# What are modules?

DRY rule -- "don't repeat yourself."

- If you have code that repeats itself, several lines in a row, then you should use a loop
- If you have code that repeats itself in several places in your program, then you should use a function
- If you have code that repeats itself in several places across several programs, then you can use a *library*.

"Library" is the term used across all programming languages. In Python, we call our libraries "modules." When someone uses a module, they're using some Python code that was written once, and can be used many times. If your code is being used in many programs, then you can also write a module -- using it yourself, or giving it to other people to use, as well.

Actually, a module in Python does two things:

1. It provides reusable code
2. It provides us with a "namespace," to ensure that we don't have "namespace collisions" -- in other words, when the same variable is defined in several different places.

It's a rare Python program that doesn't use modules.

# Using modules

We can take advantage of a module using the `import` statement in Python. It looks like this:

    import random

Several things to notice about `import`:

1. It isn't a function! It doesn't use `()`
2. The argument you give it is not a filename (or a string). It's the name of the module variable you want to define. You can think of it as the name of the module you want to load without the `.py` at the end of the file.

In [1]:
# I want to use the "random" module, because I want to generate random numbers

import random

# What happens when I `import random`?

1. Python takes the name we gave it and looks for a file called `random.py`
2. If it finds that file, then it loads all of the definitions the file contains, and puts them into a module
3. It makes that module available via a variable, the same name we gave it (i.e., `random`)
4. The functions and variables in that module are then available as *attributes* on `random`, i.e., names after a `.`
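Here is a small sketch of those steps as seen from our side of the `import`. It uses only the standard `random` and `types` modules; the variable name `roll` is made up for illustration.

```python
import random
import types

# After the import, "random" is an ordinary variable that refers to a module object
print(type(random))                           # <class 'module'>
print(isinstance(random, types.ModuleType))   # True

# Names defined in random.py are attributes on that object, reached via "."
roll = random.randint(1, 6)
print(1 <= roll <= 6)                         # True
```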

In [6]:
random.randint(0, 100)   # this invokes the "randint" function in the "random" module

22

# What names are in a module?

It's nice that I knew that `randint` was in `random` -- but how is a newcomer supposed to know?

1. You can invoke `dir` on a module value, and you'll get a list of strings -- the names defined. This will mix together variables, functions, data types, etc., so it's not perfect, but it's not a bad shortcut.
2. You can invoke `help` on a module value, and you'll get the docstring for the module and all of its functions. This is easier to understand and read. (`help` works in Jupyter; if you're in PyCharm or some editor like that, you can usually hover over a name to get its documentation.)
3. You can go to `docs.python.org` and find the documentation, and read it there.
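Since `dir` returns a plain list of strings, one way to make it easier to read is to filter out the names that start with `_`. This is only a sketch; the variable name `public_names` is my own.

```python
import random

# Keep only the names that don't start with an underscore
public_names = [name for name in dir(random) if not name.startswith('_')]

print(public_names)
print('randint' in public_names)   # True

# Every function carries its own docstring, which is what help() displays
print(random.randint.__doc__)
```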

In [7]:
dir(random)

['BPF',
 'LOG4',
 'NV_MAGICCONST',
 'RECIP_BPF',
 'Random',
 'SG_MAGICCONST',
 'SystemRandom',
 'TWOPI',
 '_ONE',
 '_Sequence',
 '__all__',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 '_accumulate',
 '_acos',
 '_bisect',
 '_ceil',
 '_cos',
 '_e',
 '_exp',
 '_fabs',
 '_floor',
 '_index',
 '_inst',
 '_isfinite',
 '_lgamma',
 '_log',
 '_log2',
 '_os',
 '_parse_args',
 '_pi',
 '_random',
 '_repeat',
 '_sha512',
 '_sin',
 '_sqrt',
 '_test',
 '_test_generator',
 '_urandom',
 'betavariate',
 'binomialvariate',
 'choice',
 'choices',
 'expovariate',
 'gammavariate',
 'gauss',
 'getrandbits',
 'getstate',
 'lognormvariate',
 'main',
 'normalvariate',
 'paretovariate',
 'randbytes',
 'randint',
 'random',
 'randrange',
 'sample',
 'seed',
 'setstate',
 'shuffle',
 'triangular',
 'uniform',
 'vonmisesvariate',
 'weibullvariate']

In [8]:
# You can invoke help

help(random)

Help on module random:

NAME
    random - Random variable generators.

MODULE REFERENCE
    https://docs.python.org/3.13/library/random.html

    The following documentation is automatically generated from the Python
    source files.  It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations.  When in doubt, consult the module reference at the
    location listed above.

DESCRIPTION
        bytes
        -----
               uniform bytes (values between 0 and 255)

        integers
        --------
               uniform within range

        sequences
        ---------
               pick random element
               pick random sample
               pick weighted random sample
               generate random permutation

        distributions on the real line:
        ------------------------------
               uniform
               triangular
               normal (Gaussian)
               l

# What about `print`?

`print` and many other names are "builtins," meaning that they live in a module (called `builtins`) that Python loads automatically when it starts up. That's true for `print`, `input`, `len`, `str`, `int`, `dict`, and dozens of others. You don't need to `import` anything to use them. You can use `help` on them, and you can also find them documented at `docs.python.org`; look for "Built-in Functions" and "Built-in Types" in the library reference.
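If you want to poke at the builtins yourself, you can import the `builtins` module explicitly. A quick sketch (the variable name `public_builtins` is made up):

```python
import builtins

# The names we use without any import are attributes of the builtins module
print(builtins.print is print)   # True
print(builtins.len is len)       # True

# Count the names that don't start with an underscore
public_builtins = [name for name in dir(builtins) if not name.startswith('_')]
print(len(public_builtins))
```

The exact count depends on your Python version, because new exceptions and functions are added over time.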

In [9]:
dir(__builtins__)

['ArithmeticError',
 'AssertionError',
 'AttributeError',
 'BaseException',
 'BaseExceptionGroup',
 'BlockingIOError',
 'BrokenPipeError',
 'BufferError',
 'ChildProcessError',
 'ConnectionAbortedError',
 'ConnectionError',
 'ConnectionRefusedError',
 'ConnectionResetError',
 'EOFError',
 'Ellipsis',
 'EnvironmentError',
 'Exception',
 'ExceptionGroup',
 'False',
 'FileExistsError',
 'FileNotFoundError',
 'FloatingPointError',
 'GeneratorExit',
 'IOError',
 'ImportError',
 'IndentationError',
 'IndexError',
 'InterruptedError',
 'IsADirectoryError',
 'KeyError',
 'KeyboardInterrupt',
 'LookupError',
 'MemoryError',
 'ModuleNotFoundError',
 'NameError',
 'None',
 'NotADirectoryError',
 'NotImplemented',
 'NotImplementedError',
 'OSError',
 'OverflowError',
 'PermissionError',
 'ProcessLookupError',
 'PythonFinalizationError',
 'RecursionError',
 'ReferenceError',
 'RuntimeError',
 'StopAsyncIteration',
 'StopIteration',
 'SyntaxError',
 'SystemError',
 'SystemExit',
 'TabError',
 'Timeo

# What's with `_`?

In Python, `_` is just another character that you can use in function / variable names. For example, you can say `first_name = 'Reuven'`.

BUT. Many variables and functions (and methods) adhere to conventions in Python regarding naming of things:

- If the first character is `_`, that means it's *private* by convention. Python won't stop you from using it, but if you do, you might get stuck in the future, because there are no guarantees that it won't change, or even that it'll stick around.
- If the first two characters are `__` (and the name doesn't also end with `__`), then inside of a class, Python rewrites ("mangles") the name to include the class name, in order to avoid collisions.
- If the final character is `_`, that's usually to avoid a clash with a Python keyword or builtin (e.g., `class_`). Some data-analysis libraries, such as scikit-learn, also use a trailing `_` for values that were computed while fitting a model.
- If the first two and final two are `__`, then the name is called a "dunder," for "double underscore," and it's either used in a special way or defined in a special way (or both!) in Python.

You should not start or end any of your variable/function names with `_`, to avoid people getting confused.
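To see these conventions in one place, here is a small sketch. It uses a class, which these notes haven't covered, because name mangling only happens inside of classes. `Account`, `owner`, `_notes`, and `__pin` are all invented for the example.

```python
class Account:
    def __init__(self, owner):
        self.owner = owner      # ordinary, public name
        self._notes = []        # one leading underscore: private by convention only
        self.__pin = '0000'     # two leading underscores: mangled to _Account__pin

    def __repr__(self):         # a "dunder" method that Python calls for us
        return f'Account({self.owner!r})'

account = Account('Reuven')
print(account)                      # Account('Reuven')
print(account._notes)               # [] -- nothing stops us, but we were warned
print(hasattr(account, '__pin'))    # False -- the name was mangled
print(account._Account__pin)        # 0000

class_ = 'economy'   # trailing underscore, because "class" is a keyword
```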

# Loading a module more than once

When we `import random` into our program, Python actually does the following:

1. It checks if the module has been loaded into memory already
    - If not, then it finds the file, and loads it
2. It defines the variable, assigning the module value to it

This means that if a given program has multiple files, and each file says `import random`, then only the first will really load it. The rest will have the variable defined, but won't waste time/memory loading the module again.
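You can watch this caching happen via `sys.modules`, the dict in which Python keeps every module it has already loaded. A minimal sketch (the alias `random_again` is made up):

```python
import sys
import random

# The loaded module is cached under its name
print('random' in sys.modules)            # True

# A second import doesn't load anything; it just defines another variable
import random as random_again
print(random_again is random)             # True
print(sys.modules['random'] is random)    # True
```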

# Exercise: Guessing game

1. Use `random.randint` to generate a random number between 0 and 100.
2. Ask the user, repeatedly, to guess the number.
    - If the user guesses correctly, congratulate them and stop asking.
    - If the user's guess is too low, then print "too low!"
    - If the user's guess is too high, then print "too high!"
3. Bonus: Indicate how many guesses the user required before getting it right.

In [14]:
import random 

number = random.randint(0, 100)
counter = 0

while True:
    s = input('Guess: ').strip()

    if not s.isdigit():
        print(f'{s} is not numeric; try again')
        continue
    
    guess = int(s)
    counter += 1
    
    if guess == number:
        print(f'You got it in {counter} tries!')
        break
    elif guess < number:
        print('Too low!')
    else:
        print('Too high!')    

Guess:  50


Too high!


Guess:  25


Too high!


Guess:  15


Too high!


Guess:  7


Too high!


Guess:  3


Too high!


Guess:  1


You got it in 6 tries!


In [None]:
# VR

import random

number = random.randint(0, 100)
guesses_left = 10

print("Welcome to the Guessing Game!")
print("I'm thinking of a number between 0 and 100.")

while guesses_left > 0:
    print(f"You have {guesses_left} guesses left.")
    try:
        guess = int(input("Take a guess: "))
    except ValueError:
        print("Invalid input. Please enter a number.")
        continue

    if guess < number:
        print("Too low!")
    elif guess > number:
        print("Too high!")
    else:
        # guesses_left hasn't been decremented yet for this guess, so add 1
        print(f"Congratulations! You guessed the number in {10 - guesses_left + 1} tries.")
        break
    
    guesses_left -= 1

    if guesses_left == 0:
        print(f"Sorry, you ran out of guesses. The number was {number}.")        

# Naming and namespaces

I mentioned before that when we `import` a module, we get all of its defined functions, variables, and data types. Those are all under the module's namespace, meaning that we say `MODNAME.NAME`, where the `.` is mandatory. 

We cannot run `randint` -- we have to run `random.randint`. This means that we know (and have to know) the module's name in order to run the function. We also need to use the full name. This resolves ambiguity, and ensures that if there is another `randint` function in the program, they won't collide.

But... it's a pain to write `random.randint` each time.

What happens if I write `randint` by itself?

In [15]:
randint(0, 100)

NameError: name 'randint' is not defined

Sometimes, we *want* to just call a function that was defined in a module, without having to specify the module name first.

We can do that with a different version of `import`, known as `from .. import`.

In [16]:
# this version (from ... import) still loads the module into memory!
# but it doesn't define the variable "random", so there's no module object that we can use.
# it does, however, define "randint" in our namespace, so we can use it directly.

from random import randint

In [17]:
randint(0, 100)

4

When should we use `import`, and when should we use `from .. import`?

- Easier to read/understand/maintain code with the full name, including the module name
- If the definition is under many layers of modules (`a.b.c.d.e.f.g()`), then it's much easier to use `from .. import` and just keep the final name

It depends. Usually, I think it's a good idea to keep the module name, unless the name you want will be used a *lot*.

If you say

    from random import randint

then the *only* variable defined in your program will be `randint`. The `random` module will not be defined or available, at least not as a variable. You can, however, say something like this:

    import random
    from random import randint

This defines *two* variables, `random` (the module with all names under it) and `randint` (the function that you can call directly without going through `random.` first).

Because the first `import` loads it into memory, the second one is just a variable definition.

You can use `from .. import` to get more than one name:

    from random import randint, choice

# Versions of `import`

1. `import MODULE`
2. `from MODULE import NAME`
3. `import MODULE as ALIAS` -- this is a very common way to do things, especially if the module's name is long, or if the people using it have a convention of shortening it. For example, everyone says `import numpy as np` and `import pandas as pd`. This doesn't change what module is loaded, but it does change the variable that is defined. If you do this, you cannot say `pandas.xyz`, but you can say `pd.xyz`.

4. `from MODULE import NAME as ALIAS` -- this is the same idea, but for `from .. import`, and basically lets you change the name of a function/variable imported from a module. This is useful for shortening, but also if you want to use `from .. import` but you worry about colliding with one or more names already defined in your program.
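Here is a short sketch of the two `as` forms in action. The aliases `stats` and `random_integer` are my own choices, not established conventions.

```python
import statistics as stats                     # form 3: alias for the module
from random import randint as random_integer   # form 4: alias for one name

print(stats.mean([10, 20, 30]))   # 20

roll = random_integer(1, 6)
print(1 <= roll <= 6)             # True

# The original module name was never defined as a variable
try:
    statistics.mean([1, 2, 3])
except NameError as error:
    print(error)                  # name 'statistics' is not defined
```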

# One final version of `import`

You can say:

    from random import *

This takes *all* of the names defined in the `random` module, and assigns all of them to variables in your current program.

**NEVER EVER EVER EVER EVER EVER DO THIS!**

This leads to chaos. You want the modules for their namespaces! You want to be able to distinguish between variables from module `a` and those from module `b`. This removes that possibility.

# Where are these modules located?

When you say `import MODNAME`, `MODNAME` is turned into a filename, ending with `.py`. Python then looks for `MODNAME.py` in a number of different directories. These are defined in `sys.path`, a list of strings.

In [18]:
sys.path

NameError: name 'sys' is not defined

In [19]:
import sys   # first, we have to import sys to get this module's info

In [20]:
sys.path

['/Users/reuven/.pyenv/versions/3.13.2/lib/python313.zip',
 '/Users/reuven/.pyenv/versions/3.13.2/lib/python3.13',
 '/Users/reuven/.pyenv/versions/3.13.2/lib/python3.13/lib-dynload',
 '',
 '/Users/reuven/.pyenv/versions/3.13.2/lib/python3.13/site-packages']

When I say `import random`, it looks for `random.py` in these directories, and the first match wins. This means that if you have multiple files named `random.py`, the first one that Python encounters will be loaded.
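If you're curious which file would win, you can ask Python without reading through `sys.path` yourself. This sketch uses `importlib.util.find_spec` from the standard library; the variable name `spec` is made up, and the path it prints will differ on your machine.

```python
import importlib.util
import sys

# Ask the import system where it finds "random"
spec = importlib.util.find_spec('random')
print(spec.origin)    # the full path to random.py on your computer

# The directories are searched in this order; the first match wins
for directory in sys.path:
    print(repr(directory))
```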

In [21]:
# let's ask random.py where it was loaded from

random

<module 'random' from '/Users/reuven/.pyenv/versions/3.13.2/lib/python3.13/random.py'>

# If I say

    from a import *
    from b import *

and both have a `hello` function, then we have b's `hello`, and a's `hello` is nowhere to be found.
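You can see the same thing with two real modules. Both `math` and `cmath` (for complex numbers) define `sqrt`, among other shared names. The sketch below imports the one name explicitly from each; `from .. import *` would do the same thing silently, for every shared name at once. The variable name `shared` is made up.

```python
import math
import cmath

# Public names that exist in both modules, and would collide with "import *"
shared = sorted(name
                for name in set(dir(math)) & set(dir(cmath))
                if not name.startswith('_'))
print(shared)

from math import sqrt
from cmath import sqrt    # this quietly replaces math's sqrt

print(sqrt is cmath.sqrt)   # True
print(sqrt(4))              # (2+0j) -- a complex number, not 2.0
```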



# Next up

- Developing a module
- What `__name__` is, and how it's used
- Python standard library


# Exercise: Pick a card

1. Define a string representing playing cards from Ace through King, with one letter per card.
2. Use `random.choice` to choose an element from this string.
3. Ask the user to guess the card that was picked; keep going until they guess correctly.

Example:

    I've chosen a card. Guess what it is!
    Guess: 5
    Nope! Try again
    Guess: J
    Nope! Try again
    Guess: 9
    You got it!

Instead of 10, just use 1 (because there is no 1 -- there is an Ace)    

In [22]:
random.choice('abc')

'a'

In [23]:
import random

cards = 'a234567891jqk'

chosen_card = random.choice(cards)

while True:
    user_choice = input('Pick a card: ').strip().lower()

    if user_choice not in cards:
        print(f'{user_choice} is not a valid card; try again')
        continue

    if user_choice == chosen_card:
        print(f'Good job!')
        break
    else:
        print(f'Try again')

Pick a card:  9


Try again


Pick a card:  5


Try again


Pick a card:  q


Good job!


In [None]:
# AG

import random

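cards = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '1', 'J', 'Q', 'K']
chosen_card = random.choice(cards)   # pick once, before we start asking

while True:   # keep asking until the user guesses correctly
    user_pick = input("Enter a card: ").strip().upper()
    if user_pick == chosen_card:
        print("You win!")
        break
    print("Try Again!")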

In [None]:
# VR

import random

cards = "A23456789TJQK"  # String representing cards from Ace to King
counter = 0

chosen_card = random.choice(cards)  # choose once, so the target doesn't change between guesses

while True:
    guess = input("Guess the card (A, 2, 3, 4, 5, 6, 7, 8, 9, T, J, Q, K): ").strip().upper()
    counter += 1
    if guess == chosen_card:
        print(f"Correct! You guessed in {counter} tries.")
        
        break
    else:
        print("Incorrect. Try again.")

# Writing our own module

So far, we've seen how we can use modules that others have written. Can we write a module, too?

Yes! A module is just a file containing Python code (definitions of variables and functions) that is in a directory in `sys.path`. We can then `import` it.
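To make that concrete, here is a sketch that writes a tiny module file, puts its directory in `sys.path`, and imports it. Everything here is an assumption for the demo: the module name `mymod_sketch`, its contents, and the use of a temporary directory. Normally you would just create the `.py` file in your editor.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module contents: two variables and one function
source = '''
x = 100
y = [10, 20, 30]

def hello(name):
    return f'Hello, {name} from mymod_sketch!'
'''

# Write the file into a fresh directory, and add that directory to sys.path
module_dir = tempfile.mkdtemp()
Path(module_dir, 'mymod_sketch.py').write_text(source)
sys.path.insert(0, module_dir)
importlib.invalidate_caches()   # make sure Python notices the new file

import mymod_sketch

print(mymod_sketch.x)                # 100
print(mymod_sketch.y)                # [10, 20, 30]
print(mymod_sketch.hello('world'))   # Hello, world from mymod_sketch!
```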

In [24]:
# mymod.py is in the same directory as this notebook
# the current directory is one of the entries in sys.path (the '' entry), so Python finds mymod.py there

import mymod

In [25]:
mymod.x

100

In [26]:
mymod.y

[10, 20, 30]

In [27]:
mymod.hello('world')

'Hello, world from mymod!'

In [28]:
dir(mymod)

['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'hello',
 'x',
 'y']

In [29]:
mymod.__name__

'mymod'

In [30]:
mymod.__file__

'/Users/reuven/Courses/Current/OReilly-2025-04April-python/mymod.py'

# Exercise: `greet` module

1. Create a module, `greet.py`, with a single function, `hello` in it. The function takes a string, and returns a greeting with that string.
2. Use your module (via `import greet`) to invoke `greet.hello('world')`.

Example:

    import greet
    print(greet.hello('world'))

In [31]:
import greet
print(greet.hello('world'))

Hello, to you, world!
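One possible `greet.py` that would produce this output is sketched below. The parameter name `name` and the exact wording of the greeting are my assumptions; any function that returns a string containing its argument satisfies the exercise.

```python
# Contents of greet.py (a sketch)

def hello(name):
    """Return a greeting that includes the given name."""
    return f'Hello, to you, {name}!'
```

Notice that the function *returns* the string, rather than printing it. That lets the caller decide what to do with the greeting.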


# Exercise: `menu` module

It's common for a program to ask users to choose from a number of limited/set choices. I want you to create a module called `menu.py`, in which you define a single function, `menu`. (Yes, it's common for a Python module and its main function to have the same name.) You'll call `menu.menu` with a list of strings, the menu options that the user can choose from.

When `menu.menu` is invoked, it asks the user to enter one of the provided choices. If the user does so, then the choice is returned from the function. If not, then the user is scolded and asked to do it again.

Example:

    import menu
    user_choice = menu.menu(['a', 'b', 'c'])    # user is asked to choose from a, b, or c
    print(f'User chose {user_choice}')          # this will print whatever they chose

If you `import` the module, use it, and find that it's buggy, and fix the bug, you cannot just `import` it again -- because each module is imported *once* in a Python session. If you need to re-import a module, you can restart the Python process in Jupyter under the Kernel menu.

In [32]:
import menu

In [33]:
user_choice = menu.menu(['a', 'b', 'c'])    # user is asked to choose from a, b, or c

Choose one from ['a', 'b', 'c']:  q


q is invalid; try again


Choose one from ['a', 'b', 'c']:  C


C is invalid; try again


Choose one from ['a', 'b', 'c']:  abcd


abcd is invalid; try again


Choose one from ['a', 'b', 'c']:  b


In [34]:
print(f'User chose {user_choice}')          # this will print whatever they chose

User chose b
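A `menu.py` along these lines would behave like the session above. This is a sketch, not the only solution; the parameter name `options` and the exact prompt text are assumptions.

```python
# Contents of menu.py (a sketch)

def menu(options):
    """Ask the user to pick one of the strings in options; repeat until they do."""
    while True:
        choice = input(f'Choose one from {options}: ').strip()

        if choice in options:
            return choice

        print(f'{choice} is invalid; try again')
```

Because `options` is a list of strings, `in` looks for an exact match with one element. That's why `C` and `abcd` were both rejected.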


# Next up

- What happens when we `import` a module?
- The magic of `__name__`
- Python standard library

# After I say `import mymod`

Once I've done that, then `mymod.x`, `mymod.y`, and `mymod.hello` are all defined.

This means that lines 1, 3, and 5 of `mymod.py` (where those names are defined) have all executed. Because if they didn't, then we wouldn't have the variables and function defined.

Does this mean that whenever we `import` a module the entire module is executed by Python?

Yes.

The first time we `import` a module, the file is executed, from the start to the finish. The result of the execution is packaged up into a module, which is stored and cached in Python. The module variable (`mymod`, in this case) then refers to that module value.

The second time (in the same program) we `import` that module, Python says: I have the cached version, so I can just use that; I don't need to re-execute or re-load the module.
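Here is a sketch that counts how many times a module's code actually runs. The module `noisy_sketch` is invented for the demo, and prints a line whenever its file is executed; we capture that output so that we can count it.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

# A hypothetical module whose only job is to announce that it is running
module_dir = tempfile.mkdtemp()
Path(module_dir, 'noisy_sketch.py').write_text("print('running noisy_sketch')\n")
sys.path.insert(0, module_dir)
importlib.invalidate_caches()

captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    import noisy_sketch               # first import: the file is executed
    import noisy_sketch               # second import: cached, nothing runs
    importlib.reload(noisy_sketch)    # forced reload: the file is executed again

run_count = captured.getvalue().count('running noisy_sketch')
print(run_count)   # 2
```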

In [35]:
import mymod

In [36]:
# I'll use a module, importlib, which provides us with a function, reload, that does what I want

from importlib import reload

reload(mymod)   # this forces a reload, even if the module was already loaded

Hello from mymod!
Goodbye from mymod!


<module 'mymod' from '/Users/reuven/Courses/Current/OReilly-2025-04April-python/mymod.py'>

In [37]:
dir(mymod)

['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'hello',
 'x',
 'y']

In [38]:
mymod.__name__

'mymod'

In `mymod.py`, we defined `x`, `y`, and `hello`. Outside of the module, those variables were all available as attributes (after a `.`) on the module, `mymod`.

Does this work in the opposite direction, too? If I have the attribute `__name__` outside of the module, is `__name__` available *inside* of the module?

In [39]:
reload(mymod)   # this forces a reload, even if the module was already loaded

Hello from mymod!
Goodbye from mymod!


<module 'mymod' from '/Users/reuven/Courses/Current/OReilly-2025-04April-python/mymod.py'>

# Should we be printing when a module is loaded?

No! This is considered weird and rude.




# Inside vs. outside

Inside of our module file, `mymod.py`, I define three variables -- `x`, `y`, and `hello`. Inside of that file, I can use any and all of these in printing, etc.

Outside of the module file, but in the module object that I create with `import mymod`, I still see `x`, `y`, and `hello`. But now, they are attributes on the `mymod` module object. I can access them as `mymod.x`, `mymod.y`, and `mymod.hello`.

So what were variables inside of the module file are also available as attributes outside of the module file.

But we can take this in the opposite direction, too:

The `mymod.__name__` attribute is available on the `mymod` module object. It turns out that inside of the module file `mymod.py`, we have access to the variable `__name__`. Just as `mymod.__name__` contains the module's name as a string, inside of the file, `__name__` contains the module's name as a string.

Anything defined inside the module as a variable is an attribute outside. And many special names, such as `__name__`, which are attributes outside, are available as variables inside.

# The magic `__name__` variable

`__name__` is always defined in Python. It describes the current "namespace," meaning, the environment in which variables are defined. There are basically two possible values for `__name__`:

- The string naming the current module that was imported. This is true whenever a module is imported.
- Anywhere else, if the code isn't in a module that was imported, the value is the string `'__main__'`.

So:

- If we `import mymod`, then its `__name__` is the string `'mymod'`
- If we execute `mymod.py`, then its `__name__` is the string `'__main__'`

Why am I telling you this? Because one of the most famous lines in all of Python, which is in many *many* modules, looks like this:

```python
if __name__ == '__main__':
    SOMETHING_HERE
```

This `if` statement lets us make the code conditional on our module being run as a program:

- If the module is imported, then `__name__` won't be `'__main__'`, and the code won't run
- If the module is run as a standalone program, then the stuff under `if` will run.

Whatever is under this `if` block will only run if the program is run standalone, not imported as a module. Modules do many things with this:

- Some run their own tests
- Some produce documentation
- Some demo their abilities
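To see both values of `__name__` from a single file, the sketch below writes a small module, imports it, and then runs the same file as a program using `runpy.run_path` from the standard library. The module name `name_sketch` and the variable `ran_as_program` are made up for the demo.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

source = '''
def hello(name):
    return f'Hello, {name}!'

ran_as_program = False

if __name__ == '__main__':
    # Only reached when the file is run as a program
    ran_as_program = True
'''

module_dir = tempfile.mkdtemp()
module_path = Path(module_dir, 'name_sketch.py')
module_path.write_text(source)
sys.path.insert(0, module_dir)
importlib.invalidate_caches()

# Imported: __name__ is the module's name, so the "if" block is skipped
import name_sketch
print(name_sketch.__name__)          # name_sketch
print(name_sketch.ran_as_program)    # False

# Run as a program: __name__ is '__main__', so the "if" block runs
program_globals = runpy.run_path(str(module_path), run_name='__main__')
print(program_globals['__name__'])         # __main__
print(program_globals['ran_as_program'])   # True
```

Running `python name_sketch.py` from the command line has the same effect as the `runpy` call here.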

In [40]:
reload(mymod)


<module 'mymod' from '/Users/reuven/Courses/Current/OReilly-2025-04April-python/mymod.py'>