# Week 5 agenda

1. Review last week's challenge
2. Modules and packages
    - Importing modules
    - Using modules
    - Writing modules (a tiny bit -- needs an external editor)
    - PyPI
    - `pip` and installing packages from the Internet
3. General Q&A about Python, software, etc.    

In [1]:
# challenge program:

def count_ips(filename):
    output = {}
    for one_line in open(filename):
        fields = one_line.split()
        ip_address = fields[0]

        if ip_address in output:    # have we seen this IP address already?
            output[ip_address] += 1 # if so, add 1 to the count
        else:
            output[ip_address] = 1  # otherwise, set it to 1

    return output

# count_ips('logfile.txt')        
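
One way to try out this kind of counting without a real log file is sketched below. It is a variation, not the challenge solution itself: it swaps the hand-built dict for `collections.Counter` from the standard library, and the function name `count_ips_with_counter` and the sample log lines are made up for illustration. It assumes, as the program above does, that the IP address is the first whitespace-separated field on each line.

```python
import os
import tempfile
from collections import Counter

def count_ips_with_counter(filename):
    """Count how often each first field (assumed to be an IP address) appears."""
    with open(filename) as infile:           # "with" closes the file for us
        return Counter(one_line.split()[0]
                       for one_line in infile
                       if one_line.strip())  # skip blank lines

# made-up sample data, written to a temporary file
sample = '10.0.0.1 GET /\n10.0.0.2 GET /about\n10.0.0.1 GET /contact\n'
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as f:
    f.write(sample)

ip_counts = count_ips_with_counter(f.name)
os.remove(f.name)                            # clean up the temporary file

print(dict(ip_counts))
```

A `Counter` is a kind of dict, so the result can be used in the same way as the dict returned by `count_ips`.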


# Modules and packages

Let's start with my favorite programming rule, DRY (don't repeat yourself):

1. If we have several lines repeated in our program, we can replace them ("DRY up our code") with a loop.
2. If we have the same code repeated in multiple places in our program, we can replace them with a function.
3. If we have the same code repeated in multiple programs, we can use a *library*.  Or, as it's known in Python, a *module*.

# Using a module in Python

In order to use a module in Python, we must "import" it.  This gives us access to whatever the module has defined. That'll typically be:

- Data structures
- Functions
- Entirely new types of data ("classes")

The `import` statement in Python is *not* a function! It's a statement -- so don't try to use it with parentheses. Think of `import` sort of like `def`.  `def` creates a new function object, and assigns it to a variable.  In the same way, `import` creates a new module object, and assigns it to a variable.

If I say `import abcd`, the variable `abcd` will then be defined, and it'll contain a module object.  Assuming, of course, that `abcd` exists as a module on your computer.

In [None]:
import random  

In [3]:
type(random)   # what kind of data does the "random" variable contain?

module

In [4]:
# once I've imported the module, I have access to all of the data and functions that it defines.
# I can access those via a .
# meaning: MODULENAME.DATA or MODULENAME.FUNCTION
# then I just use the data, or use the function, as per usual.

# for example, the "random" module defines the "randint" function.  I can call it as follows:

random.randint(0, 100)

3

In [6]:
# we can ask a module to print itself out ("printed representation" of an object)
random

<module 'random' from '/usr/local/Cellar/python@3.10/3.10.4/Frameworks/Python.framework/Versions/3.10/lib/python3.10/random.py'>

# How does Python know where to find `random` and load it?

If we say `import random`, Python looks for a file called `random.py`, where `py` is the standard Python suffix for program files.

Where does it look for `random.py`?

It looks in a whole bunch of directories, known as the "search path." It looks through each of the directories in this path, one at a time.  The first directory in which it finds `random.py` wins, and the search ends.

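If Python doesn't find a matching name in any directory on its search path, it raises an error (`ModuleNotFoundError`).  And because the first match wins, module import is a matter of "first come, first served": a file named `random.py` in an earlier directory will be loaded instead of the one that comes with Python.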
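
Here is a small sketch of what a failed import looks like. The module name `no_such_module_abcd` is hypothetical, and the example assumes that nothing by that name is installed on your computer.

```python
try:
    import no_such_module_abcd    # hypothetical name, assumed not to exist
except ModuleNotFoundError as e:
    error_name = type(e).__name__
    print(error_name, '->', e)

# ModuleNotFoundError is a more specific kind of ImportError
print(issubclass(ModuleNotFoundError, ImportError))
```

Without the `try`/`except`, the same error would stop the program at the `import` line.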

In [7]:
import sys     # sys is a special module -- it describes your Python running environment

In [8]:
sys.version    # what version of Python am I running?

'3.10.4 (main, Apr 26 2022, 19:42:59) [Clang 13.1.6 (clang-1316.0.21.2)]'

In [9]:
sys.path       # this is a list of strings -- the search path for modules we import

['/Users/reuven/Courses/Current/oreilly-2022-q2-first-steps',
 '/usr/local/Cellar/python@3.10/3.10.4/Frameworks/Python.framework/Versions/3.10/lib/python310.zip',
 '/usr/local/Cellar/python@3.10/3.10.4/Frameworks/Python.framework/Versions/3.10/lib/python3.10',
 '/usr/local/Cellar/python@3.10/3.10.4/Frameworks/Python.framework/Versions/3.10/lib/python3.10/lib-dynload',
 '',
 '/usr/local/lib/python3.10/site-packages',
 '/usr/local/lib/python3.10/site-packages/argclass-0.1.2-py3.10.egg',
 '/usr/local/Cellar/pybind11/2.9.2/libexec/lib/python3.10/site-packages',
 '/usr/local/lib/python3.10/site-packages/IPython/extensions',
 '/Users/reuven/.ipython']
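
To make the "first match wins" idea concrete, here is a rough sketch of the search using `sys.path`. The function `find_module_file` is made up for illustration and is much simpler than what `import` really does: it only looks for a plain `.py` file, and ignores packages, zip files, built-in modules, and modules that were already loaded.

```python
import os
import sys

def find_module_file(module_name):
    """Return the first file on sys.path named MODULE_NAME.py, or None."""
    for one_dir in sys.path:
        # an empty string in sys.path means "the current directory"
        candidate = os.path.join(one_dir or '.', module_name + '.py')
        if os.path.isfile(candidate):
            return candidate      # first match wins, so stop searching
    return None                   # no directory contained the file

found = find_module_file('random')
print(found)
```

On a typical installation, this prints the same file that appears when we ask the `random` module to print itself out.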

# Exercise: Character classification

1. Import the `string` module in Python.  This module once held a lot of functionality, but most of that was moved into methods on the `str` (string) class.  However, it still defines a few variables that can be useful, such as `string.digits` (all digits), `string.punctuation` (punctuation), and `string.ascii_letters` (letters).
2. Define a dict with three keys -- `digits`, `punctuation`, and `letters`, and set the value to be 0 in each.
3. Ask the user to enter a string.
4. Go through the string, one character at a time:
    - If the character is a digit, add 1 to the `digits` value
    - If the character is punctuation, add 1 to the `punctuation` value
    - If the character is a letter, add 1 to the `letters` value
5. Print out the resulting dict

In [10]:
import string

In [11]:
string.digits

'0123456789'

In [12]:
string.punctuation

'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'

In [13]:
string.ascii_letters

'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'

In [None]:
counts = {'digits':0, 'punctuation':0, 'letters':0}

s = input('Enter a string: ').strip()

for one_character in s:
    if one_character in string.digits:
        counts['digits'] += 1
    elif one_character in string.punctuation:
        counts['punctuation'] += 1
    elif one_character in string.ascii_letters:
        counts['letters'] += 1