# Agenda Day 5: Modules and packages

1. Intro to modules -- what are they?
2. The various forms of `import`
3. Developing our own module and using it
4. Python standard library
5. Modules vs. packages
6. PyPI and third-party packages
7. Using `pip` to install PyPI packages
8. What next?
9. AMA (ask me anything!) 

# DRY -- don't repeat yourself!

1. If you have one or more lines of code that repeat themselves, then you should consider a loop.
2. If you have the same code in several different places in your program, then you should consider writing a function.
3. If you have the same code in *several different programs*, then you should consider a *library*.

What is a library? It's a collection of functions and variables that someone else has written, which we can then use/reuse.

If I can use a library, then I don't need to reinvent the wheel. Plus, the library's authors have probably spent time improving and debugging it.

In Python, we call our libraries *modules* and *packages*.

- A module is a single file containing Python functions and data
- A package is a directory/folder containing one or more modules

A module in Python gives us not only the capabilities of a library in other languages, but it also gives us *namespaces*.

A namespace is basically a last name for functions/variables, so that we (and Python) don't get confused between them and encounter "namespace collisions."
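
To make the "last name" idea concrete, here is a small sketch using the `import` statement that we'll look at next. The variable `pi` below is our own, made-up approximation; `math.pi` is the one defined in the standard library's `math` module. Because each lives in its own namespace, neither overwrites the other.

```python
import math

pi = 3.14          # our own, rough value, defined in our program's namespace

# "math.pi" is a different variable: same first name, different "last name"
print(pi)          # our value
print(math.pi)     # the value defined inside the math module
```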



# How do we use a module?

In Python, we use the `import` statement to load a module into memory and make it available to our program.  The syntax for `import` is a bit strange: It's not a function, so we don't use `()`. It's not like in other languages, where we say what file we want to load. Rather, it looks like this:

```python
import modname
```

Notice how we run it:

1. `import` doesn't have `()` after it
2. The module we want to load isn't in `''`, because it's not a string or a filename. It's the name of the module we want to load, which also becomes the name of the variable that refers to it.
3. Python takes that module name, tacks on a `.py`, and then looks for a file -- in this case, `modname.py`.

Once we load the module with `import`, we then have a module variable defined (`modname` in this case), and we can access its data and functions via `modname.NAME`, where `NAME` is anything we might want to access in the module.

If the module `modname` has a function named `hello`, then after loading `modname`, we can run that function as `modname.hello()`. The `.` indicates that `hello` belongs to the module `modname`. That's the namespace!

Module names are case sensitive; by convention, they are all lowercase.

In [2]:
# here, I'll import the "random" module, which comes with Python

import random

In [3]:
# what is this "random" variable that I just defined? It's a module!

random

<module 'random' from '/Users/reuven/.pyenv/versions/3.13.0/lib/python3.13/random.py'>

In [8]:
# there is a function, random.randint, that returns a random integer between two extremes (both included).
# we can invoke "randint" that's defined inside of the "random" module

random.randint(0, 100)

33

In [9]:
# if you try to load a module that doesn't exist (or if you spelled it wrong!), then you'll get an error

import randommmmmmm

ModuleNotFoundError: No module named 'randommmmmmm'

In [10]:
# how can I know what names are defined in a given module?
# (1) Use the dir function on the module object

dir(random)  # this will return a list of strings, names that we can use inside of random

['BPF',
 'LOG4',
 'NV_MAGICCONST',
 'RECIP_BPF',
 'Random',
 'SG_MAGICCONST',
 'SystemRandom',
 'TWOPI',
 '_ONE',
 '_Sequence',
 '__all__',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 '_accumulate',
 '_acos',
 '_bisect',
 '_ceil',
 '_cos',
 '_e',
 '_exp',
 '_fabs',
 '_floor',
 '_index',
 '_inst',
 '_isfinite',
 '_lgamma',
 '_log',
 '_log2',
 '_os',
 '_parse_args',
 '_pi',
 '_random',
 '_repeat',
 '_sha512',
 '_sin',
 '_sqrt',
 '_test',
 '_test_generator',
 '_urandom',
 'betavariate',
 'binomialvariate',
 'choice',
 'choices',
 'expovariate',
 'gammavariate',
 'gauss',
 'getrandbits',
 'getstate',
 'lognormvariate',
 'main',
 'normalvariate',
 'paretovariate',
 'randbytes',
 'randint',
 'random',
 'randrange',
 'sample',
 'seed',
 'setstate',
 'shuffle',
 'triangular',
 'uniform',
 'vonmisesvariate',
 'weibullvariate']

In [11]:
# (2) We can use "help" in Jupyter to get some documentation
# In an IDE like PyCharm/VSCode, you can usually hover over a module's name and click to get full documentation

help(random)


Help on module random:

NAME
    random - Random variable generators.

MODULE REFERENCE
    https://docs.python.org/3.13/library/random.html

    The following documentation is automatically generated from the Python
    source files.  It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations.  When in doubt, consult the module reference at the
    location listed above.

DESCRIPTION
        bytes
        -----
               uniform bytes (values between 0 and 255)

        integers
        --------
               uniform within range

        sequences
        ---------
               pick random element
               pick random sample
               pick weighted random sample
               generate random permutation

        distributions on the real line:
        ------------------------------
               uniform
               triangular
               normal (Gaussian)
               l

In [12]:
# (3) Use the documentation on docs.python.org
# that's the official Python documentation, and often includes even more examples

# Look for random here: https://docs.python.org/3/library/random.html
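
The output of `dir` includes many names starting with `_`, which are meant for the module's internal use. If you only want the names intended for us, one possible approach (a sketch, not the only way) is to filter the list:

```python
import random

# keep only the names that don't start with an underscore
public_names = []
for one_name in dir(random):
    if not one_name.startswith('_'):
        public_names.append(one_name)

print(len(public_names))
print(public_names)
```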

# Exercise: Guessing game

1. Use `random.randint` to get a random number between 0 and 100.
2. Repeatedly let the user guess:
    - If they try a non-numeric guess, then scold them
    - If they guess correctly, then congratulate them and stop
    - If they're too low, say so
    - If they're too high, say so
3. When they get it, indicate how many guesses they needed.

Example:

    Guess: 50
    Too low!
    Guess: 75
    Too high!
    Guess: hello
    hello is not numeric
    Guess: 70
    Too low!
    Guess: 72
    You got it in 5 guesses.

In [15]:
import random

number = random.randint(0, 100)
guess_number = 0

while True:
    guess_number += 1
    s = input('Guess: ').strip()

    if not s.isdigit():
        print(f'{s} is not numeric; try again')
        continue

    guess = int(s)    # get the guess as an integer

    if guess == number:
        print(f'You got it, in {guess_number} guesses!')
        break
    elif guess < number:
        print('Too low!')
    else:
        print('Too high!')


Guess:  50


Too high!


Guess:  25


Too high!


Guess:  12


Too low!


Guess:  15


Too low!


Guess:  19


Too low!


Guess:  21


Too high!


Guess:  20


You got it, in 7 guesses!


# Where do the modules live?

When we say `import random`, I've told you that Python looks for a file called `'random.py'`. Where does Python look? Where does that file live?

We can get a bit of a hint from looking at the module's printed representation.

Where did Python look for it, though?

We can find out by looking at `sys.path` -- meaning, the module `sys` (which you need to `import`) and then the `path` variable inside of it, a list of strings. These strings are directories in which we'll look.

In [16]:
random

<module 'random' from '/Users/reuven/.pyenv/versions/3.13.0/lib/python3.13/random.py'>

In [17]:
import sys
sys.path

['/Users/reuven/.pyenv/versions/3.13.0/lib/python313.zip',
 '/Users/reuven/.pyenv/versions/3.13.0/lib/python3.13',
 '/Users/reuven/.pyenv/versions/3.13.0/lib/python3.13/lib-dynload',
 '',
 '/Users/reuven/.pyenv/versions/3.13.0/lib/python3.13/site-packages']

In [None]:
# JM

import random

num = random.randint(0, 100)

while True:
    guess = input("Guess the magic number, pick any number! ").strip()
    if not guess.isdigit():    # check for non-numeric input before comparing
        print("That is not a number you crazy kook! You know what a number is don't you?")
        continue

    guess = int(guess)         # compare the user's guess (not num) with num
    if guess == num:
        print("You guessed correctly, Congratulations! ")
        break
    elif guess < num:
        print("Your number is too low, try again! ")
    else:
        print("Your number is too high, try again! ")

What if you want to load modules from a directory that isn't in `sys.path`?

You have to add it to `sys.path`. There are a few ways to do this, but setting the environment variable `PYTHONPATH` outside of Python is probably the best way.
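
One of the other ways is to modify `sys.path` from inside the program, since it's an ordinary list of strings. The sketch below is a self-contained illustration: the module name `demo_pathmod` and its variable `greeting` are made up for this example, and the module is written to a temporary directory so that the code can run anywhere.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# create a throwaway directory containing a tiny module (hypothetical names)
module_dir = tempfile.mkdtemp()
Path(module_dir, 'demo_pathmod.py').write_text('greeting = "hello from demo_pathmod"\n')

# the directory isn't in sys.path yet, so add it
sys.path.append(module_dir)
importlib.invalidate_caches()   # make sure the import system notices the new file

import demo_pathmod
print(demo_pathmod.greeting)
print(demo_pathmod.__file__)
```

A change made this way lasts only as long as the program is running, which is one reason to prefer `PYTHONPATH` for anything permanent.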

In [24]:
# random also has the "choice" function, which chooses one element from a sequence

items = ['rock', 'paper', 'scissors']

person1 = random.choice(items)
person2 = random.choice(items)

print(f'{person1} vs. {person2}')

paper vs. rock


In [25]:
import math

In [26]:
math.e

2.718281828459045

In [27]:
math.pi

3.141592653589793

In [3]:
import unicodedata   # this lets us find out about characters in our strings -- not just English/Latin, but any

while True:
    s = input('Enter text: ').strip()

    if s == '':
        break

    for one_character in s:
        print(f'{one_character} is {unicodedata.name(one_character)}, category {unicodedata.category(one_character)}')

Enter text:  abc


a is LATIN SMALL LETTER A, category Ll
b is LATIN SMALL LETTER B, category Ll
c is LATIN SMALL LETTER C, category Ll


Enter text:  中国


中 is CJK UNIFIED IDEOGRAPH-4E2D, category Lo
国 is CJK UNIFIED IDEOGRAPH-56FD, category Lo


Enter text:  


# Next up

1. Different forms of `import`
2. Write our own module (that we can use)

# What happens when we `import`?

When we use `import`, we really do two (or 2.5) different things:

- Python checks: Is the module already loaded?
- If not, then it loads the module into memory from the file it finds in `sys.path`.
- It defines a variable with the name we gave, referring to the module object it created when loading.

If we have already loaded a module once, then only the variable assignment happens. If you want to `import` the same thing many times, that's fine; you'll just end up assigning to that variable many times.
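
Python keeps track of the modules it has already loaded in `sys.modules`, a dictionary whose keys are module names. A minimal sketch of the check-then-assign behavior described above:

```python
import sys
import random

print('random' in sys.modules)    # the module is now in the cache of loaded modules

first = random
import random                     # second import: nothing is reloaded, just reassigned
print(first is random)            # both names refer to the very same module object
```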

In [4]:
import random

# now that I've done that, I can use any variable/function defined in random

random.randint(0, 100000)

15471

In [5]:
random.randint(0, 5)

1

In [6]:
# what if I get tired of typing "random." before randint?

randint(0, 5)   # will this work?

NameError: name 'randint' is not defined

In [7]:
# we can ask the import system to define randint as a variable, and thus not have to go through random

from random import randint
randint(0, 5)

0

# What does `from .. import` do?

- Python checks: Is the module already loaded?
- If not, then it loads the module into memory from the file it finds in `sys.path`.
- It defines the variable we want (and only that variable), referring to the value in the module.

If you use `from .. import`, then the module name will **NOT** be defined as a variable.

Also note: `from .. import` still loads the entire module into memory. It is *not* a way to save memory by loading only part of a module.

Using `from .. import` is *ONLY* about making it easier/shorter to refer to names inside of the module.
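
Here is a small sketch of those points. It uses the standard library's `fractions` module rather than `random`, only because `random` was already imported as a variable earlier in this notebook:

```python
import sys
from fractions import Fraction

print(Fraction(1, 3))                 # the imported name works
print('fractions' in sys.modules)     # the whole module was loaded into memory

try:
    fractions                         # but the module name isn't a variable
    module_name_defined = True
except NameError:
    module_name_defined = False

print(module_name_defined)
```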

# Versions of `import`

1. `import MODNAME`, loads `MODNAME.py` that it finds in `sys.path` (if needed), and defines `MODNAME` as a variable referring to a module.
2. `from MODNAME import NAME`, loads `MODNAME.py` that it finds in `sys.path`, and defines `NAME` as a variable referring to `MODNAME.NAME`. It does *NOT* define `MODNAME`.

It's not uncommon for us to use `import`, but not to want to use the official module name as our variable name:

- It might be hard/long to write
- It might conflict with another variable/function in our system
- It might not be standard

In such cases, we want to load a module, but we want to give it a different name when defined. We want, in other words, to give it an alias. We can do that with `import MODNAME as ALIAS`.

In [8]:
import random as r    # this loads the random module, as before, but defines the variable "r" to refer to it

In [9]:
r.randint(0, 100)

46

In [10]:
# if I want to load NumPy and Pandas, I'll say:

import numpy as np
import pandas as pd

In [11]:
# we can also give an alias to a single imported name from a module

from random import randint as ri  

ri(100, 200)

109

# Versions of `import`

1. `import MODNAME`, loads `MODNAME.py` that it finds in `sys.path` (if needed), and defines `MODNAME` as a variable referring to a module.
2. `from MODNAME import NAME`, loads `MODNAME.py` that it finds in `sys.path`, and defines `NAME` as a variable referring to `MODNAME.NAME`. It does *NOT* define `MODNAME`.
3. `import MODNAME as ALIAS`, loads `MODNAME.py` that it finds in `sys.path` (if needed), and defines `ALIAS` as a variable referring to a module.
4. `from MODNAME import NAME as ALIAS`, loads `MODNAME.py` that it finds in `sys.path`, and defines `ALIAS` as a variable referring to `MODNAME.NAME`. It does *NOT* define `MODNAME`.


If we use `from MODNAME import NAME`, we can get one of the module's names into our namespace, so that we don't need to say `MODNAME.NAME`. What if I want more than one name? What if I want *all* of the names in a module to be imported into our namespace? Then we won't need to use the module name!

**THIS IS A TERRIBLE IDEA!** 

You can do this with

```python
from MODNAME import *
```

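This will take the variable, function, and class names defined in `MODNAME` (all of those that don't start with `_`, or just the names listed in the module's `__all__`, if it defines one) and define them in your namespace. If 500 such names are defined in `MODNAME`, then you'll now have 500 more variables in your namespace, possibly overwriting variables you had already defined.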
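
To get a feel for the problem, here is a hedged sketch using the `math` module. It runs the star import inside a separate dictionary (via `exec`, used here only so that the demonstration doesn't pollute our own namespace) and then looks at what arrived:

```python
import math

namespace = {}
exec('from math import *', namespace)   # star import into a scratch namespace

# ignore "__builtins__", which exec adds to the dictionary on its own
imported_names = []
for one_name in namespace:
    if not one_name.startswith('_'):
        imported_names.append(one_name)

print(len(imported_names))               # dozens of names from a single line of code

# one of them, "pow", has the same name as a builtin function but behaves differently
print(pow(2, 3))                         # builtin pow: 8
print(namespace['pow'](2, 3))            # math.pow: 8.0
```

Had we run the star import directly, `pow` in our program would have silently stopped being the builtin.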

# Exercise: Using aliases

1. Ask the user to enter 3 numbers, in a single string separated by spaces.
2. Choose one of them randomly, using `random.choice` -- but I want you to invoke it as `pick_a_number`.

Example:

    Enter numbers: 10 20 30
    I chose 20.
    

In [15]:
from random import choice as pick_a_number

numbers = input('Enter numbers: ').split()
picked_number = pick_a_number(numbers)
print(f'I chose {picked_number}.')

Enter numbers:  10 20 30


I chose 20.


# Writing our own module

How can we write our own module in Python?

1. We write some Python code (variable and function definitions) in a file whose name ends in `.py`, and which is in one of the directories listed in `sys.path` -- or in the same directory as the program that wants to `import` it.
2. We let someone else `import` it.

In [16]:
import mymod    # this will create a module based on the contents of mymod.py (in the current directory)

In [17]:
mymod

<module 'mymod' from '/Users/reuven/Courses/Current/OReilly-2024-autumn-python/mymod.py'>

In [18]:
dir(mymod)   # what names are defined on our module?  A bunch of "dunders," names with a double underscore before/after them

['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__']

In [19]:
mymod.__file__  # what file were you loaded from?

'/Users/reuven/Courses/Current/OReilly-2024-autumn-python/mymod.py'

In [20]:
mymod.__name__   # what do you think your name is, module?

'mymod'

In [21]:
dir(mymod)

['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__']

In [22]:
import mymod   # let's try loading it a second time...

In [24]:
dir(mymod)

['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__']

In [25]:
# if I'm in Jupyter and want to reload a module, I need some help from the importlib module!

from importlib import reload 
reload(mymod)

<module 'mymod' from '/Users/reuven/Courses/Current/OReilly-2024-autumn-python/mymod.py'>

In [26]:
dir(mymod)

['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'hello',
 'x',
 'y']
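
What presumably happened above: `mymod.py` was edited after the first `import`, but the second `import` just reused the already-loaded module, so the new names appeared only after `reload`. A self-contained sketch of the same sequence, using a made-up module named `demo_reloadmod` that is written to a temporary directory:

```python
import importlib
import sys
import tempfile
from pathlib import Path

module_dir = tempfile.mkdtemp()
module_file = Path(module_dir, 'demo_reloadmod.py')   # hypothetical module name
module_file.write_text('')                            # start with an empty module

sys.path.append(module_dir)
importlib.invalidate_caches()
import demo_reloadmod
print('x' in dir(demo_reloadmod))     # False: nothing defined yet

module_file.write_text('x = 100\n')   # "edit" the file on disk
import demo_reloadmod                 # already loaded, so the file isn't read again
print('x' in dir(demo_reloadmod))     # still False

importlib.reload(demo_reloadmod)      # force Python to re-read the file
print(demo_reloadmod.x)               # 100
```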

In [27]:
mymod.x

100

In [28]:
mymod.y

[10, 20, 30]

In [29]:
mymod.hello('world')

'Hello, world!'

In [30]:
from mymod import hello

In [31]:
hello('world')

'Hello, world!'
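
The contents of `mymod.py` aren't shown in these notes. A minimal version that is consistent with the outputs above (an assumption, not necessarily the file used in class) might look like this:

```python
# mymod.py -- a reconstruction based on the outputs shown above

x = 100
y = [10, 20, 30]

def hello(name):
    return f'Hello, {name}!'
```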

# Exercise: `menu` module

1. Define a module, `menu.py`, which contains a single variable (`x`) with a value and a single function, `hello`, that takes an argument (`name`) and returns a string with a greeting for that name.
2. `import` your module, and invoke `hello` on a name that the user provides via `input`.

In [32]:
import menu 

In [34]:
menu.hello('world')

'Hello, world from menu'

In [35]:
name = input('Enter your name: ').strip()
print(menu.hello(name))

Enter your name:  Reuven


Hello, Reuven from menu
