# Agenda: Day 5 (Modules and packages)

1. Review of last week's challenge
2. Q&A
3. Modules
    - Why do we need modules?
    - Using `import` to load modules
    - Different variations on `import`
    - Writing a simple module
    - How do modules work?
4. Python's standard library
5. Modules vs. packages
6. PyPI
    - What is it?
    - How can we install things from PyPI?
    - Issues with installation
    - Understanding how to navigate through and use PyPI
7. Next steps -- where do we go from here?    

In [1]:
# Code from the interactive exercise

def count_ips(filename):
    output = {}   # new, empty dict

    for one_line in open(filename):           # go through the file, one line at a time, assigning to one_line
        ip_address = one_line.split()[0]      # grab the IP address from the start of each line

        if ip_address in output:              # if we've already seen ip_address, then add 1 to its count
            output[ip_address] += 1           # (ip_address is a key in the "output" dict)

        else:                                 # if this is the first time we see ip_address, set it to be a key
            output[ip_address] = 1            # in output, and the value is 1

    # - each new IP address adds a new key-value pair to output
    # - each repeat IP address adds 1 to the value of the existing key

    return output  # this is a dict


In [2]:
count_ips('mini-access-log.txt')

{'67.218.116.165': 2,
 '66.249.71.65': 3,
 '65.55.106.183': 2,
 '66.249.65.12': 32,
 '65.55.106.131': 2,
 '65.55.106.186': 2,
 '74.52.245.146': 2,
 '66.249.65.43': 3,
 '65.55.207.25': 2,
 '65.55.207.94': 2,
 '65.55.207.71': 1,
 '98.242.170.241': 1,
 '66.249.65.38': 100,
 '65.55.207.126': 2,
 '82.34.9.20': 2,
 '65.55.106.155': 2,
 '65.55.207.77': 2,
 '208.80.193.28': 1,
 '89.248.172.58': 22,
 '67.195.112.35': 16,
 '65.55.207.50': 3,
 '65.55.215.75': 2}

# Modules

One of the main things to keep in mind when programming is the DRY ("don't repeat yourself") rule:

- If you have several lines of code in a row that basically repeat themselves, you can replace them with a `for` loop.
- If you have several lines of code that repeat themselves in various places in your program, then you can replace them with a function.
- If you have several lines of code that repeat themselves across different programs, then you can replace them with a library. A library is a collection of code that you can use in numerous programs.
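
The first two steps can be sketched in a few lines. This is only an illustration: the word list and the function name `total_length` are made up for the example and are not part of the course code.

```python
# Repetition in a row: three nearly identical lines
total = 0
total += len('apple')
total += len('banana')
total += len('cherry')

# The same work, written once inside a for loop
total_with_loop = 0
for one_word in ['apple', 'banana', 'cherry']:
    total_with_loop += len(one_word)

# Repetition in several places: wrap the loop in a function
# (total_length is a hypothetical name, chosen for this sketch)
def total_length(words):
    output = 0
    for one_word in words:
        output += len(one_word)
    return output

print(total, total_with_loop, total_length(['apple', 'banana', 'cherry']))
```

All three approaches give the same total. The third step, sharing a function like `total_length` across programs, is where modules come in.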

In Python, we call our libraries "modules." A module is thus:

1. A collection of code (function and variable definitions) that we can use in numerous programs, and
2. The variable/namespace we use to access those functions and variables in our program.

In [3]:
# one example of a module is "random"
# it contains many functions and data structures for generating and working with random values.

# we can load it into memory using "import"
import random

In [4]:
# random is a variable, defined in our program
# we can ask it: what kind of value does it refer to?

type(random)

module

In [6]:
# what does the module provide us with? Functions and data we can use
# to work with random-related things

# for example, the random.randint function
random.randint(0, 100)

34
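
One detail worth knowing about `random.randint` is that both endpoints are included, so `random.randint(0, 100)` can return 0 and can return 100. A small sketch that checks this on a die-sized range; the range 1 to 6 and the variable name `rolls` are chosen just for this illustration:

```python
import random

# Draw many values from a small range, so that both
# endpoints are overwhelmingly likely to appear at least once
rolls = []
for _ in range(10_000):
    rolls.append(random.randint(1, 6))

# The smallest and largest values drawn should be the two endpoints
print(min(rolls), max(rolls))
```

Because the numbers are random, the individual values differ on every run, but none of them falls outside the range from 1 to 6.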

# Is `random` a module, or a variable?

It's both!

When we say `import random`, we're defining the `random` variable. Like all variables in Python, it refers to a value. In this case, the `random` variable refers to the module object that we loaded with `import`, which knows itself as `random`.

In everyday conversation, we call `random` both a variable and a module. Strictly speaking, `random` is a name (a variable) that refers to a module object.
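
A short sketch can make the distinction between the name and the object concrete. The second name, `r`, is made up for this example:

```python
import random

# The module object knows its own name,
# no matter which variable refers to it
print(random.__name__)     # random

# Assignment gives the same module object a second name
# (r is a hypothetical variable name, used only here)
r = random
print(r is random)         # True
print(r.__name__)          # random

# Both names reach the same functions
print(r.randint is random.randint)   # True
```

Assigning `r = random` does not copy or reload the module. It only creates another variable that refers to the same module object.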

# Exercise: Random numbers

1. `import` the `random` module.
2. Set two variables, `x` and `y`, to be random integers from 0 to 1,000.
3. Print `x`, `y`, and their product.