# Libraries & Pandas

### Objectives

- Explain what Python libraries and modules are.
- Write Python code that imports and uses modules from Python's standard library.
- Find and read documentation for standard libraries.
- Import the pandas library.
- Use pandas to load a CSV file as a data set.
- Get some basic information about a pandas DataFrame.

### Questions

- How can I extend the capabilities of Python?
- How can I use Python code that other people have written?
- How can I read tabular data?


In [2]:
## Python libraries are powerful collections of tools.

# Import the `string` module from the standard library
import string

print(f'The lowercase ASCII letters are {string.ascii_lowercase}')
print(string.capwords('capitalise this sentence please.'))


The lowercase ASCII letters are abcdefghijklmnopqrstuvwxyz
Capitalise This Sentence Please.
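
The same `import` pattern applies to any other standard library module. As a minimal sketch, the cell below uses `math`, which is chosen only for illustration; the `radius` value is a made-up number, not part of the lesson data.

```python
# `math` is another standard library module, so it needs no installation
import math

radius = 2  # hypothetical value, used only for this example
area = math.pi * radius ** 2

print(f'The square root of 81 is {math.sqrt(81)}')
print(f'The area of a circle with radius {radius} is {area:.2f}')
```

As with `string`, everything the module provides is reached through the module name followed by a dot.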


In [3]:
# Learn more about the contents of a library
help(string)

Help on module string:

NAME
    string - A collection of string constants.

MODULE REFERENCE
    https://docs.python.org/3.12/library/string.html

    The following documentation is automatically generated from the Python
    source files.  It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations.  When in doubt, consult the module reference at the
    location listed above.

DESCRIPTION
    Public module variables:

    whitespace -- a string containing all ASCII whitespace
    ascii_lowercase -- a string containing all ASCII lowercase letters
    ascii_uppercase -- a string containing all ASCII uppercase letters
    ascii_letters -- a string containing all ASCII letters
    digits -- a string containing all ASCII decimal digits
    hexdigits -- a string containing all ASCII hexadecimal digits
    octdigits -- a string containing all ASCII octal digits
    punctuation -- a string containing all A
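
The full `help()` output for a module can be long. One way to narrow it down, sketched here, is to list the module's public names with `dir()` and then ask for help on a single item. The variable name `public_names` is introduced only for this example.

```python
import string

# Collect the names the module exposes, skipping private ones that start with an underscore
public_names = [name for name in dir(string) if not name.startswith('_')]
print(public_names)

# Ask for help on one function instead of the whole module
help(string.capwords)
```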

In [5]:
# Import specific items from a library
from string import ascii_letters

print(f'The ASCII letters are {ascii_letters}')

The ASCII letters are abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
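
A `from ... import` statement can also pull in several names at once and rename one of them with `as`. The following is a small sketch of that; the alias `upper` is an arbitrary choice for the example.

```python
# Import two names from `string`, giving the second one a shorter alias
from string import digits, ascii_uppercase as upper

print(f'The digits are {digits}')
print(f'The uppercase letters are {upper}')
```

Names imported this way are used directly, without the `string.` prefix.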


In [None]:
# ModuleNotFoundError example (only run if you want the error)
# import pymarc
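
If you would rather see the error message without stopping the notebook, one option is to catch the exception. This sketch assumes a made-up module name, `not_a_real_library`, so that the import is expected to fail on any machine.

```python
# `not_a_real_library` is a hypothetical name; no such module is expected to exist
try:
    import not_a_real_library
except ModuleNotFoundError as error:
    # Store and print the message instead of letting the error stop the cell
    message = f'Import failed: {error}'
    print(message)
```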

In [6]:
# Use pip to install the package, then import it
%pip install pymarc
import pymarc

Collecting pymarc
  Downloading pymarc-5.2.2-py3-none-any.whl.metadata (12 kB)
Downloading pymarc-5.2.2-py3-none-any.whl (158 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 158.8/158.8 kB 1.8 MB/s eta 0:00:00
Installing collected packages: pymarc
Successfully installed pymarc-5.2.2
Note: you may need to restart the kernel to use updated packages.
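
Before installing, it can be useful to check whether Python can already find a package. A minimal sketch using the standard library's `importlib.util.find_spec` follows; the helper name `is_installed` is hypothetical, and the check is only meant for top-level module names.

```python
from importlib.util import find_spec

def is_installed(module_name):
    """Return True if Python can locate a top-level module with this name."""
    return find_spec(module_name) is not None

# `string` ships with Python; the result for `pymarc` depends on your environment
print(is_installed('string'))
print(is_installed('pymarc'))
```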


In [9]:
# Importing pandas with an alias
import pandas as pd


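# Load the CSV file into a pandas DataFrame (adjust the path to where your copy of the file lives)
df = pd.read_csv('../data/2022_circ.csv')

# Preview the first five rows; a notebook displays this only when it is the last expression in the cell
df.head()

# Use the `info()` method to inspect column names, non-null counts, and dtypes
df.info()

# Use `describe()` to get summary statistics for the numeric columns
df.describe()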


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 81 entries, 0 to 80
Data columns (total 17 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   branch     81 non-null     object 
 1   address    81 non-null     object 
 2   city       81 non-null     object 
 3   zip code   81 non-null     float64
 4   january    81 non-null     int64  
 5   february   81 non-null     int64  
 6   march      81 non-null     int64  
 7   april      81 non-null     int64  
 8   may        81 non-null     int64  
 9   june       81 non-null     int64  
 10  july       81 non-null     int64  
 11  august     81 non-null     int64  
 12  september  81 non-null     int64  
 13  october    81 non-null     int64  
 14  november   81 non-null     int64  
 15  december   81 non-null     int64  
 16  ytd        81 non-null     int64  
dtypes: float64(1), int64(13), object(3)
memory usage: 10.9+ KB


,zip code,january,february,march,april,may,june,july,august,september,october,november,december,ytd
count,81.0,81.0,81.0,81.0,81.0,81.0,81.0,81.0,81.0,81.0,81.0,81.0,81.0,81.0
mean,60632.358025,3452.987654,3047.444444,3613.469136,3500.91358,3115.345679,3606.641975,3777.074074,3773.62963,3487.123457,3424.123457,3420.592593,3045.740741,41265.08642
std,27.971552,4435.69522,4075.917038,4789.373973,4458.024162,4076.371294,4529.175102,4623.343008,4658.632751,4251.322307,4154.300384,4196.489988,3759.827362,51479.220988
min,60605.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,60617.0,611.0,579.0,755.0,775.0,629.0,804.0,835.0,783.0,747.0,761.0,718.0,753.0,8920.0
50%,60629.0,1712.0,1563.0,1709.0,1846.0,1575.0,1808.0,1914.0,2191.0,2172.0,2054.0,1988.0,1469.0,23735.0
75%,60643.0,4795.0,4258.0,5200.0,4931.0,4200.0,4959.0,5395.0,5309.0,4991.0,4733.0,4856.0,4241.0,55614.0
max,60827.0,25207.0,25276.0,29870.0,25578.0,23141.0,25830.0,26692.0,26071.0,24423.0,23921.0,24073.0,21258.0,301340.0
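
To try the same inspection steps without the circulation file, a CSV can be read from a string. The sketch below uses a tiny made-up data set (the branch names and numbers are invented, not taken from `2022_circ.csv`) and assumes pandas is installed.

```python
import io
import pandas as pd

# A tiny invented data set, used only to illustrate the calls
csv_text = """branch,january,february
North,120,95
South,80,110
"""

# `read_csv` accepts a file-like object as well as a file path
small_df = pd.read_csv(io.StringIO(csv_text))

print(small_df.shape)              # (number of rows, number of columns)
print(list(small_df.columns))      # column names
print(small_df['january'].sum())   # total of a single column
```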


In [10]:
## Challenge: Importing With Aliases

# Fill in the blanks so that the program below prints `0123456789` (shown here with the blanks already filled in).
import string as s
numbers = s.digits
print(numbers)


0123456789


In [12]:
# The alternative without an alias:
import string
numbers = string.digits
print(numbers)

# Which version do you find easier to read?


0123456789
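
An alias does not create a second copy of the module; it is just another name for the same object. A short sketch that checks this:

```python
import string
import string as s

# Both names refer to the same module object, so either can be used
print(s is string)
print(s.digits == string.digits)
```

Both lines print `True`, so the choice between the two forms is purely about readability.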


In [13]:

## Bonus Challenge: Locating the Right Module

# Given the variables year, month, and day, how would you generate a date in the standard ISO 8601 format (YYYY-MM-DD)?
year = 1971
month = 8
day = 26

# Use the `datetime` module to create an ISO-format date
import datetime

iso_date = datetime.date(year, month, day).isoformat()
print(iso_date)

# More compact version:
print(datetime.date(year, month, day).isoformat())


1971-08-26
1971-08-26
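
Going the other way is also possible. As a sketch, `datetime.date.fromisoformat` (available in Python 3.7 and later) turns an ISO-format string back into a `date` object; the variable name `parsed` is chosen only for this example.

```python
import datetime

# Parse an ISO 8601 date string back into a date object
parsed = datetime.date.fromisoformat('1971-08-26')

# The individual parts are available as attributes
print(parsed.year, parsed.month, parsed.day)

# Converting back to a string gives the original text
print(parsed.isoformat())
```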
