# Agenda: Week 5 (Modules and packages)

1. Review last week's challenge
2. Intro to modules
    - What are they?
    - How can we use them?
3. Using `import`
    - Different forms of `import`
4. Python's standard library    
5. Develop our own (simple) module
6. Modules vs. packages
7. PyPI (the Python Package Index)
    - Using `pip`
    - Installing third-party modules
    - Using third-party modules
8. What's next? Where do we go from here?    

# Modules

I've probably mentioned, several times, the "DRY rule" of programming, namely "Don't repeat yourself."

- If I have the same code, several lines in a row, then I can apply the DRY rule and replace those lines with a loop.
- If I have the same code, in several places in my program, then I can apply the DRY rule and replace those lines with a function.

If the same code is repeated, then:
- You have more to remember
- When (not if!) you need to debug/fix/modify the code, you have to do it in several places

By "DRYing up your code," you are saving yourself work and cognitive load (having to remember).

- If I have the same code, in several different programs, then I can apply the DRY rule and put that code in a *library*. In Python, we call our libraries "modules."
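As a small sketch of the first two cases (the names and the `banner` function are hypothetical, chosen only for illustration), repeated lines in a row turn into a loop, and repeated lines in different places turn into a function:

```python
# Repetition several lines in a row becomes a loop
for one_name in ['Alice', 'Bob', 'Carol']:      # hypothetical names
    print(f'Hello, {one_name}!')

# Repetition in several places becomes a function
def banner(text):
    return f'*** {text} ***'

print(banner('Start'))
# ... other code would go here ...
print(banner('End'))
```

If `banner` turned out to be useful in several programs, it would be a candidate for moving into a module.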

# Using modules

Python comes with many modules in its "standard library." If we want to use functionality in one of those modules, then we need to use the `import` statement.

It looks like this:

```python
import modname
```

Once we've done that, `modname` is defined as a global variable, an object of type "module," from which we can then run functions and grab data defined in that module.

In [1]:
# for example, the "random" module

import random

# `import` is weird!

1. It's not a function. So don't use parentheses after it
2. The argument to `import` is not a filename. It's a variable -- the variable we want to define, with the module's contents.
3. Python looks for a file with the same name as the module, plus the `.py` extension. If it finds such a file, then it loads the module into memory -- all of its functions, classes, and data definitions.
4. Functions and data defined on the module are then available as `modname.funcname`.  The module comes first, then a `.`, then the function we want to use.

In [2]:
import random    # this loads the "random" module into memory -- and yes, it's OK to do it more than once

random.randint(0, 100)   # go into random, find randint (a function), and execute it with (0, 100) as args

55

In [3]:
random.randint(0, 100)

50

# Lots of questions about what we just did!

1. What else is in the module? We know about `randint`, but what else is there?

We can use the `dir` function to ask a Python object what attributes (i.e., names after a `.`) are defined on it. It won't give us documentation, but it's a good, quick-and-dirty start.

In [4]:
dir(random)

['BPF',
 'LOG4',
 'NV_MAGICCONST',
 'RECIP_BPF',
 'Random',
 'SG_MAGICCONST',
 'SystemRandom',
 'TWOPI',
 '_ONE',
 '_Sequence',
 '_Set',
 '__all__',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 '_accumulate',
 '_acos',
 '_bisect',
 '_ceil',
 '_cos',
 '_e',
 '_exp',
 '_floor',
 '_index',
 '_inst',
 '_isfinite',
 '_log',
 '_os',
 '_pi',
 '_random',
 '_repeat',
 '_sha512',
 '_sin',
 '_sqrt',
 '_test',
 '_test_generator',
 '_urandom',
 '_warn',
 'betavariate',
 'choice',
 'choices',
 'expovariate',
 'gammavariate',
 'gauss',
 'getrandbits',
 'getstate',
 'lognormvariate',
 'normalvariate',
 'paretovariate',
 'randbytes',
 'randint',
 'random',
 'randrange',
 'sample',
 'seed',
 'setstate',
 'shuffle',
 'triangular',
 'uniform',
 'vonmisesvariate',
 'weibullvariate']

2. How can we know what is data, and what is a function?

Answer: You have to read the documentation, or ask the objects themselves.



In [5]:
type(random.seed)

method

In [6]:
random.seed(100)
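One way to "ask the objects themselves" in bulk is to combine the builtin `callable` and `getattr` functions. The sketch below sorts the public names in `random` (those not starting with an underscore) into callables and plain data. The variable names are just for illustration, and note that classes, such as `random.Random`, count as callable, too.

```python
import random

# Public names only: skip anything starting with an underscore
public_names = [one_name
                for one_name in dir(random)
                if not one_name.startswith('_')]

callables = []   # functions, methods, and classes
data = []        # everything else (numbers, strings, etc.)

for one_name in public_names:
    value = getattr(random, one_name)    # like random.NAME, but with the name given as a string
    if callable(value):
        callables.append(one_name)
    else:
        data.append(one_name)

print('Callable:', callables)
print('Data:', data)
```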

3. Where did Python load this information from?

We can look at the module object that we created to find out!

In [7]:
type(random)  # what kind of object was assigned to random?

module

In [9]:
# Module objects are basically warehouses for names
# they don't do much other than let us access names defined there.

# still, the data came from a file, right? 

random.__file__     # this is pronounced "random dot dunder file" (dunder == double underscore)

'/usr/local/Cellar/python@3.11/3.11.0/Frameworks/Python.framework/Versions/3.11/lib/python3.11/random.py'

When we said `import random`, Python looked for `random.py` in a bunch of directories. When it found `random.py`, it stopped looking further.

Which directories? The ones defined in `sys.path`.

So...
1. We can import `sys`
2. Then we can ask for the value of `sys.path`, a list of strings describing directories where Python modules might be.

In [10]:
import sys
sys.path

['/Users/reuven/Courses/Current/OReilly-2022-10Oct-PythonFirstSteps',
 '/usr/local/Cellar/python@3.11/3.11.0/Frameworks/Python.framework/Versions/3.11/lib/python311.zip',
 '/usr/local/Cellar/python@3.11/3.11.0/Frameworks/Python.framework/Versions/3.11/lib/python3.11',
 '/usr/local/Cellar/python@3.11/3.11.0/Frameworks/Python.framework/Versions/3.11/lib/python3.11/lib-dynload',
 '',
 '/usr/local/lib/python3.11/site-packages',
 '/usr/local/opt/python-tk@3.11/libexec']

In [11]:
import asdfafafaffa



ModuleNotFoundError: No module named 'asdfafafaffa'
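If we want to know where a module *would* be loaded from, without actually importing it, the standard library's `importlib.util.find_spec` can tell us. This is a minimal sketch; `where_is` is a hypothetical helper name, and it is meant for simple top-level module names like the ones we've used so far.

```python
import importlib.util

def where_is(modname):
    """Return the location a top-level module would be loaded from, or None if it isn't found."""
    spec = importlib.util.find_spec(modname)   # searches for the module, without importing it
    if spec is None:                           # nothing found anywhere Python looked
        return None
    return spec.origin                         # usually the path of the .py file

print(where_is('random'))
print(where_is('asdfafafaffa'))
```

For a module that Python can't find, we get `None` back instead of a `ModuleNotFoundError`.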

# Exercise: Comparing random numbers

1. Import the `random` module.
2. Use the `random.randint` function in the module to generate two random numbers between 0 and 100.
3. Tell the user which of the numbers comes before the other -- or that they're the same (if they are).

In [22]:
import random  

num1 = random.randint(0, 100)   # call random.randint(0, 100) and assign its results to num1
num2 = random.randint(0, 100)   # call random.randint(0, 100) and assign its results to num2

if num1 < num2:
    print(f'{num1} < {num2}')
elif num1 > num2:
    print(f'{num1} > {num2}')
else:
    print(f'{num1} == {num2}')


82 > 26


In [23]:
# what if I try to call randint without mentioning "random"?

x = randint(0, 100)

NameError: name 'randint' is not defined

In [24]:
# if I'm going to use randint a *lot* in my program, it's a pain to always have to say
# random.randint. 

# I can get around that by telling the import system: Yes, import the module. But define the function
# (randint) as a global variable, not inside of the "random" module's namespace

from random import randint   # this means: define randint as a global name here, so I don't have to use random

In [25]:
randint(0, 100)

42

# Options for importing

1. `import MODNAME` -- then `MODNAME` is defined as a variable, and we can access its functionality via `MODNAME`.
2. `from MODNAME import THING` -- then `THING` is defined as a variable, and we don't need to go through `MODNAME` for it. Note that this does *not* define `MODNAME` as a global variable!
3. `import MODNAME as ALIAS` -- then we don't define `MODNAME` but rather `ALIAS`.  This is good if we might have something else with the same name, or if we want a shorter name. 
4. `from MODNAME import THING as T` -- then `T` is a global variable that refers to `MODNAME.THING`, but we don't define either `MODNAME` or `THING`.
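Here are the four forms side by side, using the `math` module. The alias names `m` and `square_root` are arbitrary choices for this sketch. Because form 1 appears here, `math` is defined in this example, so the sketch can't show that forms 2 through 4 leave `math` undefined on their own.

```python
import math                             # form 1: defines the name math
from math import pi                     # form 2: defines the name pi
import math as m                        # form 3: defines the name m
from math import sqrt as square_root    # form 4: defines the name square_root

print(math.pi)            # via the module name
print(pi)                 # directly, without the module name
print(m.pi)               # via the alias
print(square_root(16))    # math.sqrt, under a different name
```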

# Exercise: Circle area

1. Define a function, `circle_area`, that takes a single argument - the length of a circle's radius.
2. The function should return the area of the circle, which can be calculated as π * r * r.
3. Use the definition of π that's defined in `math.pi`.
4. Test that the function works.

In [26]:
import math

def circle_area(r):
    return math.pi * r * r

circle_area(10)

314.1592653589793

In [27]:
from math import pi    # this doesn't define math, but does define pi

def circle_area(r):
    return pi * r * r

circle_area(9)

254.46900494077323

# The type of `import` you should *not* use

You will sometimes see code that looks like this:

    from MODULE import *
    
This means that Python should not only load the module into memory, but also define all of its names as global variables in our namespace.

This is a bad idea, for several reasons:

1. You can't know in advance what names are defined in the module. What if you are using the same variable name? Now you have a "namespace collision."
2. From an aesthetic perspective, it's nice to have things separate. By using `from .. import *`, you're mushing everything together.
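Here is a rough sketch of such a namespace collision. To avoid polluting a real session, it runs the star import inside a throwaway dictionary that stands in for our program's global variables; the variable `e` and its value are hypothetical.

```python
import math

# A throwaway dictionary standing in for our program's global variables
namespace = {'e': 'my error message'}     # hypothetical variable of our own, named e

exec('from math import *', namespace)     # run the star import inside that namespace

print(namespace['e'])                     # no longer our string: math's e replaced it
```

Nothing warned us that `e` was overwritten, which is why this kind of bug can be hard to track down.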



# Next up

1. Python standard library
2. Defining our own (simple) module



# Python standard library

The standard library is the collection of modules that comes with Python. Yes, you can write modules. And yes, you can download and install modules. But before you do either of those, it's useful to know what comes with Python in the standard library.

Note: The standard library is **HUGE**. No one, but no one, remembers everything in it.

Python documentation is here: https://docs.python.org/3/

If you click on "standard library reference," you'll see a list of the modules in the standard library.

# Exercise: Directory file sizes

1. Write a function, `all_file_sizes`, which takes a single argument, a string with a directory name.
2. Use the `os.listdir` function to get a list of the files in that directory.  This function returns a list of strings, the names of everything in that directory (including subdirectories), without the directory name in front.
3. Use a `for` loop on each element (filename) in that list.  Get the file's size by running `os.stat` on the filename; if the directory isn't the current one, first combine the directory name and filename with `os.path.join`.  `os.stat` returns a special data structure; you want the `.st_size` attribute from it.
4. Print the filename and its size.

Outline:
- The function gets a directory name.
- Use `os.listdir` to get a list of files in that directory.
- Go through each file, and run `os.stat` (a function in `os`) on it
- This will return a special data structure with stat info.  Use `.st_size` on it to get the size.
- Print the filename + size.

In [38]:
import os

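def all_file_sizes(dirname):                       # dirname will be a string -- a directory name
    for one_filename in os.listdir(dirname):       # go through each name returned by os.listdir(dirname)
        full_path = os.path.join(dirname, one_filename)   # os.listdir returns bare names, so add the directory
        size = os.stat(full_path).st_size          # size in bytes

        print(f'{one_filename}: {size}')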
    
    
all_file_sizes('.')    

myconfig.txt: 28
mini-access-log.txt: 36562
.DS_Store: 6148
nums.txt: 42
shoe-data.txt: 1676
First steps day 5, 2022-11Nov-10.ipynb: 20252
linux-etc-passwd.txt: 2683
README.md: 549
First steps day 3, 2022-10Oct-27.ipynb: 108331
wcfile.txt: 165
README.md~: 490
exercise-files.zip: 6148
First steps day 2, 2022-10Oct-20.ipynb: 81474
First steps day 1, 2022-10Oct-13.ipynb: 75063
myfile.txt: 16
.ipynb_checkpoints: 256
.git: 416
First steps day 4, 2022-11Nov-03.ipynb: 69115


# How can I define a module?

When I say `import modname`, Python looks in `sys.path` for `modname.py`, and then loads it.

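The first directory in `sys.path` is normally the directory containing the program being run (or the current directory, in an interactive session). In the `sys.path` output we saw earlier, it's the directory holding this notebook.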

If we want, we could write a module and then load it!

In [39]:
# what will happen if I now import mymod?
# it's an empty file.  Will it work?

In [40]:
import mymod

In [41]:
dir(mymod)  # what attributes have been defined on mymod?

['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__']

In [44]:
import mymod  # a second import doesn't re-read the file! Python loads a module only once per session

In [43]:
dir(mymod)

['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__']

In [45]:
# let's get "reload" defined, a function that reloads a module.
# (this used to be a builtin function, in Python 2; now it's in importlib.)

from importlib import reload   # load the "reload" function from importlib

reload(mymod)                  # reload the module

<module 'mymod' from '/Users/reuven/Courses/Current/OReilly-2022-10Oct-PythonFirstSteps/mymod.py'>

In [46]:
dir(mymod)

['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'hello',
 'x',
 'y']

In [47]:
mymod.x

100

In [48]:
mymod.y

[10, 20, 30]

In [49]:
mymod.hello('world')

'Hello, world!'
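The contents of `mymod.py` aren't shown in these notes. A version consistent with the outputs above might look like the following; this is a reconstruction from those outputs, not necessarily the file used in class.

```python
# mymod.py (a guess at the contents, reconstructed from the outputs above)

x = 100
y = [10, 20, 30]

def hello(name):
    return f'Hello, {name}!'
```

Each global name in the file (`x`, `y`, and `hello`) shows up as an attribute on the module object after `import mymod`.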

# Exercise: Temp conversion module

1. Create a new module, `temperatures.py`, containing two functions, `c2f` and `f2c`, which convert from Celsius to Fahrenheit (or back).  The functions assume that they get a single number as input.
2. The return value will be the converted temperature.
3. Import this module, and show that you can convert temperatures in either direction.

In [50]:
import temperatures

In [51]:
temperatures.c2f(-40)

-40.0

In [52]:
temperatures.c2f(100)

212.0

In [53]:
temperatures.c2f(0)

32.0

In [54]:
temperatures.f2c(-40)

-40.0

In [55]:
temperatures.f2c(212)

100.0

In [56]:
temperatures.f2c(32)

0.0
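The module file itself isn't shown in these notes. One possible `temperatures.py` that produces the results above is sketched here; the parameter names and the exact form of the formulas are my assumptions.

```python
# temperatures.py (one possible solution; the file used in class isn't shown)

def c2f(c):
    """Convert a temperature from Celsius to Fahrenheit."""
    return c * 9 / 5 + 32

def f2c(f):
    """Convert a temperature from Fahrenheit to Celsius."""
    return (f - 32) * 5 / 9
```

Because `/` always returns a float in Python 3, the results come back as floats (e.g., `212.0`) even when the input is an integer.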

# Next up:

1. What happens in a module when we `import` it?
2. `__name__` and other module-related idioms
3. Packages (a little)
4. PyPI

In [57]:
reload(temperatures)

<module 'temperatures' from '/Users/reuven/Courses/Current/OReilly-2022-10Oct-PythonFirstSteps/temperatures.py'>

In [58]:
reload(temperatures)

Hello from the temp module!
Goodbye from the temp module!


<module 'temperatures' from '/Users/reuven/Courses/Current/OReilly-2022-10Oct-PythonFirstSteps/temperatures.py'>

In [59]:
dir(temperatures)

['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'c2f',
 'f2c']

The two functions that we defined in our `temperatures` module are, after the module is imported, available as attributes (i.e., names after a `.`) on the `temperatures` module object.

So global variables (and function names) in our module then become attributes on the module object.

Does this work in the reverse way, too?

That is: Does `__file__`, the name of the file that we have on the module object, exist as a global variable inside of the module?



In [60]:
temperatures.__file__

'/Users/reuven/Courses/Current/OReilly-2022-10Oct-PythonFirstSteps/temperatures.py'

In [62]:
reload(temperatures)

Hello from the temp module!
Now in file: /Users/reuven/Courses/Current/OReilly-2022-10Oct-PythonFirstSteps/temperatures.py
Goodbye from the temp module!


<module 'temperatures' from '/Users/reuven/Courses/Current/OReilly-2022-10Oct-PythonFirstSteps/temperatures.py'>

In [63]:
reload(temperatures)

Hello from temperatures!
Now in file: /Users/reuven/Courses/Current/OReilly-2022-10Oct-PythonFirstSteps/temperatures.py
Goodbye from temperatures!


<module 'temperatures' from '/Users/reuven/Courses/Current/OReilly-2022-10Oct-PythonFirstSteps/temperatures.py'>

Bottom line:

1. The module is executed when we import it
2. The special "dunder" attributes on the module object are available as global variables inside of the module.
    - The `__file__` attribute becomes the global variable `__file__`
    - The `__name__` attribute becomes the global variable `__name__`
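Both points can be seen in a self-contained sketch. It writes a small, hypothetical module (`demo_mod.py`) into a temporary directory, adds that directory to `sys.path`, and imports it. The module name, its contents, and the use of a temporary directory are all choices made for this illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Contents of a hypothetical module, demo_mod.py
source = '''
print(f'Hello from {__name__}!')      # runs at import time
print(f'Now in file: {__file__}')     # __file__ is a global inside the module

x = 100
'''

with tempfile.TemporaryDirectory() as tmpdir:
    Path(tmpdir, 'demo_mod.py').write_text(source)
    sys.path.insert(0, tmpdir)          # let Python search the temporary directory first
    importlib.invalidate_caches()       # make sure the import system notices the new file
    try:
        import demo_mod                 # the module's two print lines run right here
    finally:
        sys.path.remove(tmpdir)         # put sys.path back the way it was

print(demo_mod.x)                       # the global x is now an attribute on the module
print(demo_mod.__name__)                # the same value the module saw as its global __name__
```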

# `__name__` and `'__main__'`

When we `import` a module, the special `__name__` variable is defined to be a string, the module's name -- that is, the filename without the `.py` extension.

But if we run a module directly from the command line, as a Python program, then `__name__` gets a special value. That value is the string `'__main__'`.  Yes, it's confusing that `__name__` is a variable and `'__main__'` is a string.

We said before that printing when someone imports a module is considered rude. But it's OK to print when a program is run standalone.

I want to modify `temperatures.py`, such that when we `import` the module, it doesn't say anything. But when we execute the module, it does.
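The general shape of this idiom, shown on a deliberately tiny and hypothetical module (not the solution to the exercise), looks something like this:

```python
# square.py (a hypothetical module, used only for illustration)

def square(n):
    return n * n

if __name__ == '__main__':
    # This block runs only when the file is executed directly,
    # e.g. with "python square.py". On import, it is skipped.
    print(f'Running standalone; square(5) is {square(5)}')
```

With `import square`, the function is defined but nothing is printed, because `__name__` is then `'square'` rather than `'__main__'`.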

# Exercise: Interactive temperatures

1. Modify `temperatures.py`, so that it has a line near the bottom: `if __name__ == '__main__':`.
2. If the program is imported, do nothing special.
3. If the program is run standalone, then ask the user to enter a number and a letter (C or F). Invoke the appropriate function, and print the conversion.

In [64]:
reload(temperatures)

Hello from temperatures!
Now in file: /Users/reuven/Courses/Current/OReilly-2022-10Oct-PythonFirstSteps/temperatures.py
Goodbye from temperatures!


<module 'temperatures' from '/Users/reuven/Courses/Current/OReilly-2022-10Oct-PythonFirstSteps/temperatures.py'>

In [1]:
import temperatures

In [2]:
from importlib import reload

In [5]:
reload(temperatures)

<module 'temperatures' from '/Users/reuven/Courses/Current/OReilly-2022-10Oct-PythonFirstSteps/temperatures.py'>

In [6]:
temperatures.c2f(100)

212.0

# Turning a module into a package

Python modules are individual files, as we've seen, containing code — mostly function and data definitions.

If we have a lot of functionality that could be in a module, but which would be overwhelming to put into a single file, we can use a "package" -- a directory containing modules. This way, I can spread the code across multiple files but still treat them as a single unit, which is easier to organize.

Many items in the standard library, even though we call them "modules," are really "packages" -- directories containing multiple modules.
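To make "a directory containing modules" concrete, here is a sketch that builds a tiny, hypothetical package (`demo_pkg`, with modules `temps` and `lengths`) in a temporary directory and imports from it. All of the names are invented for this example, and the empty `__init__.py` file is the conventional marker of a regular package.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmpdir:
    pkg_dir = Path(tmpdir, 'demo_pkg')      # the package is just a directory...
    pkg_dir.mkdir()
    (pkg_dir / '__init__.py').write_text('')    # ...conventionally marked with __init__.py
    (pkg_dir / 'temps.py').write_text(
        'def c2f(c):\n    return c * 9 / 5 + 32\n')
    (pkg_dir / 'lengths.py').write_text(
        'def feet_to_inches(feet):\n    return feet * 12\n')

    sys.path.insert(0, tmpdir)              # the directory *containing* the package goes in sys.path
    importlib.invalidate_caches()
    try:
        from demo_pkg import temps          # import one module from the package
        import demo_pkg.lengths             # or use the dotted name
    finally:
        sys.path.remove(tmpdir)

print(temps.c2f(100))
print(demo_pkg.lengths.feet_to_inches(2))
```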

A "distribution package" is the same as a regular package, but adds metadata to it: What other packages does this package need to use? What versions of those packages?

We're not going to talk about creating distribution packages, but you can do it pretty easily with https://python-poetry.org/.

Where are all of the distribution packages that I can use? The answer is: PyPI (https://pypi.org/).

# Exercise: `requests`

`requests` is one of the most popular Python packages on PyPI. It gives your Python program an HTTP client, which can send the same kinds of requests (GET, POST, and so on) that a browser sends.

1. Use `pip` to install the `requests` package from PyPI.  (that is: `pip install requests`)
2. In a program, `import requests`
3. Then use the `requests.get` function to retrieve the page associated with a URL.  This will return a "response" object.
4. You can read the `content` attribute on the response object, and get back a bytestring (sort of like a string). Just print the length of the bytestring.

In [7]:
import requests   # this assumes that requests has already been installed (with pip) on the machine

response = requests.get('https://python.org/')

print(len(response.content))

50877


# Next up

1. Beautiful Soup -- quick demo
2. More on PyPI
    - Safety
    - Licenses
    - Finding good packages
3. What's next? (And general Q&A)


In [8]:
from bs4 import BeautifulSoup

html = requests.get('https://python.org/').content    # get the python.org homepage's HTML

soup = BeautifulSoup(html, 'html.parser')

In [9]:
soup


<!DOCTYPE html>

<!--[if lt IE 7]>   <html class="no-js ie6 lt-ie7 lt-ie8 lt-ie9">   <![endif]-->
<!--[if IE 7]>      <html class="no-js ie7 lt-ie8 lt-ie9">          <![endif]-->
<!--[if IE 8]>      <html class="no-js ie8 lt-ie9">                 <![endif]-->
<!--[if gt IE 8]><!--><html class="no-js" dir="ltr" lang="en"> <!--<![endif]-->
<head>
<meta charset="utf-8"/>
<meta content="IE=edge" http-equiv="X-UA-Compatible"/>
<link href="//ajax.googleapis.com/ajax/libs/jquery/1.8.2/jquery.min.js" rel="prefetch"/>
<link href="//ajax.googleapis.com/ajax/libs/jqueryui/1.12.1/jquery-ui.min.js" rel="prefetch"/>
<meta content="Python.org" name="application-name"/>
<meta content="The official home of the Python Programming Language" name="msapplication-tooltip"/>
<meta content="Python.org" name="apple-mobile-web-app-title"/>
<meta content="yes" name="apple-mobile-web-app-capable"/>
<meta content="black" name="apple-mobile-web-app-status-bar-style"/>
<meta content="width=device-width, initial-scal

In [11]:
soup.prettify()



In [12]:
soup.title

<title>Welcome to Python.org</title>

In [13]:
soup.text

'\n\n\n\n \n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nWelcome to Python.org\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nNotice: While JavaScript is not essential for this website, your interaction with the content will be limited. Please turn JavaScript on for the full experience. \n\n\n\n\n\n\nSkip to content\n\n\n▼ Close\n                \n\n\nPython\n\n\nPSF\n\n\nDocs\n\n\nPyPI\n\n\nJobs\n\n\nCommunity\n\n\n\n▲ The Python Network\n                \n\n\n\n\n\n\n\n\n\nDonate\n\n≡ Menu\n\n\nSearch This Site\n\n\n                                    GO\n                                \n\n\n\n\n\nA A\n\nSmaller\nLarger\nReset\n\n\n\n\n\n\nSocialize\n\nFacebook\nTwitter\nChat on IRC\n\n\n\n\n\n\n\n\n\n\nAbout\n\nApplications\nQuotes\nGetting Started\nHelp\nPython Brochure\n\n\n\nDownloads\n\nAll releases\nSource code\nWindows\nmacOS\nOther Platforms\nLicense\nAlternative Implementations\n\n\n\nDocumentation\n\nDocs\nAudio/Visual Talks\nBeginner\'s Guide\nDeveloper\'s Guide\

In [14]:
import weather
forecast = weather.forecast()
forecast.today['6:00'].temp   # Get temperature in current location at 6:00

error: (25, 'Inappropriate ioctl for device')

# Exercise: `weather2`

1. Install `weather2` from PyPI
2. Make a few calls (even if they're only from the documentation) to the weather service.
3. Note: Looks like it won't work in Jupyter.  