# Chapter 6: The Standard Library

## The Importance of the Standard Library

### High-Level Modules

Here is a detailed explanation of the turtle module code snippet:

    It creates a turtle in the middle of the screen.
    It then rotates it 180 degrees to the right.
    It moves forward 100 pixels, painting as it walks.
    It then rotates to the right once again, this time by 90 degrees.
    It then moves forward 50 pixels once again.
    It ends the program using done().

Other examples of high-level modules include:

    difflib: To check the differences line by line across two blocks of text.
    re: For regular expressions, which will be covered in Chapter 7, Being Pythonic.
    sqlite3: To create and interact with SQLite databases.
    Multiple data compressing and archiving modules, such as gzip, zipfile, and tarfile.
    xml, json, csv, and configparser: For working with multiple file formats.
    sched: To schedule events.
    argparse: For the straightforward creation of command-line interfaces.

### Lower-Level Modules

The standard library also contains multiple lower-level modules that users rarely interact with directly. Good examples are the different internet protocol modules, text formatting and templating, interacting with C code, testing, serving HTTP sites, and so on.

Finally, there is another type of low-level module, which extends or simplifies the language. Notable examples of these are the following:

    asyncio: To write asynchronous code
    typing: To add type hints
    contextvars: To save state based on the context
    contextlib: To help with the creation of context managers
    doctest: To verify code examples in documentation and docstrings
    pdb and bdb: To access debugging tools

## Exercise 85: Using the dataclass Module


This exercise can be performed in the Jupyter notebook:

1. Import the dataclasses module:

In [7]:
import dataclasses

This line brings the dataclasses module to the local namespace, allowing us to use it.

2. Define a dataclass:

In [8]:
@dataclasses.dataclass
class Point:
    x: int
    y: int

With these four lines, you have defined a dataclass with its most common methods generated for you. You can now see how it behaves differently from a standard class.

3. Create an instance, which is the data for a geographical point:

In [9]:
p = Point(x=10, y=20)
print(p)

Point(x=10, y=20)


4. Now, compare the data points with another Point object:

In [10]:
p2 = Point(x=10, y=20)
p == p2

True

5. Serialize the data:

In [11]:
dataclasses.asdict(p)

{'x': 10, 'y': 20}

The dataclasses module is part of the standard library, so most experienced users will understand how a class decorated with the dataclass decorator behaves. A custom implementation of those methods, by contrast, would require either further documentation to be written or users to read and fully understand the code of every class that crafts those methods by hand.
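To make the comparison concrete, here is a minimal sketch of what a hand-written class would need in order to match the behavior used in this exercise. ManualPoint is a hypothetical name, and it covers only the initializer, the representation, and the equality check, not everything the decorator can generate:

```python
import dataclasses


@dataclasses.dataclass
class Point:
    x: int
    y: int


class ManualPoint:
    """Hand-written stand-in for the methods the decorator generates."""

    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y

    def __repr__(self) -> str:
        return f"ManualPoint(x={self.x!r}, y={self.y!r})"

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, ManualPoint):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


print(Point(x=10, y=20))        # Point(x=10, y=20)
print(ManualPoint(x=10, y=20))  # ManualPoint(x=10, y=20)
```

Every extra field would have to be added by hand in three places in ManualPoint, whereas the dataclass only needs one more annotated attribute.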

## Exercise 86: Extending the echo.py Example

In [12]:
%run echo -h

usage: echo.py [-h] [-c] [--repeat REPEAT] message [message ...]

Prints out the words passed in, capitalizes them if required and repeats them in as many lines as requested.

positional arguments:
  message           Messages to be echoed

optional arguments:
  -h, --help        show this help message and exit
  -c, --capitalize
  --repeat REPEAT


In [13]:
%run echo.py hello packt reader --repeat=3 -c

Hello Packt Reader
Hello Packt Reader
Hello Packt Reader
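The source of echo.py is not shown in this notebook. The following is a minimal sketch that is consistent with the help text and the run above; the function names build_parser and echo_lines are assumptions made for illustration, and a real script would pass sys.argv[1:] instead of a hard-coded list:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="echo.py",
        description=(
            "Prints out the words passed in, capitalizes them if "
            "required and repeats them in as many lines as requested."
        ),
    )
    parser.add_argument("message", nargs="+", help="Messages to be echoed")
    parser.add_argument("-c", "--capitalize", action="store_true")
    parser.add_argument("--repeat", type=int, default=1)
    return parser


def echo_lines(argv):
    """Return the lines the script would print for the given arguments."""
    args = build_parser().parse_args(argv)
    words = args.message
    if args.capitalize:
        words = [word.capitalize() for word in words]
    return [" ".join(words)] * args.repeat


for line in echo_lines(["hello", "packt", "reader", "--repeat=3", "-c"]):
    print(line)
```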


## Dates and Times

In [14]:
import datetime
datetime.date.today()

datetime.date(2020, 11, 9)

In [15]:
import datetime
from dateutil import tz
datetime.datetime(1989, 4, 24, 10, 11, 
                      tzinfo=tz.gettz("Europe/Madrid"))

datetime.datetime(1989, 4, 24, 10, 11, tzinfo=tzfile('Europe/Madrid'))

## Exercise 87: Comparing datetime across Time Zones

1. Import the **datetime** module and the **tz** module from **dateutil**:

In [16]:
import datetime
from dateutil import tz
# Dateutil is not a module from the standard library, though it is the one recommended by the standard library.

2. Create the first **datetime** for **Madrid**:

In [17]:
d1 = datetime.datetime(1989, 4, 24, hour=11,
                         tzinfo=tz.gettz("Europe/Madrid"))

3. Create the second **datetime** for **Los Angeles**:

In [18]:
d2 = datetime.datetime(1989, 4, 24, hour=8,
                         tzinfo=tz.gettz("America/Los_Angeles"))

4. Now, compare them:

In [19]:
print(d1.hour > d2.hour)

True


In [20]:
print(d1 > d2)

False


5. Now, convert the datetime object to a different time zone. The astimezone method converts a datetime from one time zone to another, so you can use it to see what time the second datetime would show if it was in Madrid:

In [21]:
d2_madrid = d2.astimezone(tz.gettz("Europe/Madrid"))
print(d2_madrid.hour)

17
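The same comparison can be sketched with the standard library alone. This version assumes fixed offsets as stand-ins for the two zones on that date (UTC+2 for Madrid and UTC-7 for Los Angeles, both on summer time), so it does not need dateutil or a time zone database, but it also does not know about daylight saving transitions:

```python
import datetime as dt

# Assumed fixed offsets for 24 April 1989: CEST (UTC+2) and PDT (UTC-7).
madrid = dt.timezone(dt.timedelta(hours=2), "CEST")
los_angeles = dt.timezone(dt.timedelta(hours=-7), "PDT")

d1 = dt.datetime(1989, 4, 24, hour=11, tzinfo=madrid)
d2 = dt.datetime(1989, 4, 24, hour=8, tzinfo=los_angeles)

# Aware datetimes are compared by the instant they represent.
print(d1.astimezone(dt.timezone.utc).hour)  # 9
print(d2.astimezone(dt.timezone.utc).hour)  # 15
print(d1 > d2)                              # False
print(d2.astimezone(madrid).hour)           # 17
```

Converting both values to UTC shows why d1 > d2 is False even though d1.hour is larger: 11:00 in Madrid happens six hours before 08:00 in Los Angeles.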


## Exercise 88: Calculating the Time Delta between Two datetime Objects

1. Import the **datetime** module:

In [22]:
import datetime as dt

2. Create two **datetime** objects:

In [23]:
d1 = dt.datetime(2019, 2, 25, 10, 50,
                 tzinfo=dt.timezone.utc)
d2 = dt.datetime(2019, 2, 26, 11, 20,
                 tzinfo=dt.timezone.utc)

3. Subtract **d1** from **d2**. You can subtract one **datetime** from another to get a time delta back, or add a time delta to a **datetime**. Adding two **datetime** objects makes no sense, so that operation raises an exception. Hence, you subtract the two **datetime** objects to get the delta:

In [25]:
d2 - d1

datetime.timedelta(days=1, seconds=1800)

4. You can see that the delta between the two datetime objects is 1 day and 1,800 seconds, which can be translated to the total number of seconds by calling total_seconds on the time delta object that the subtraction returns:

In [26]:
td = d2 - d1
td.total_seconds()

88200.0

5. It happens quite often that you need to send datetime objects in formats such as JSON, which do not support native datetimes. A common way to serialize a datetime is to encode it as a string using the ISO 8601 standard. The isoformat method outputs such a string, and the fromisoformat method takes a string produced by isoformat and transforms it back into a datetime:

In [27]:
d1 = dt.datetime.now(dt.timezone.utc)
iso_date = d1.isoformat()
iso_date

'2020-11-09T18:57:16.550318+00:00'

In [28]:
d1_new = dt.datetime.fromisoformat(iso_date)
d1_new

datetime.datetime(2020, 11, 9, 18, 57, 16, 550318, tzinfo=datetime.timezone.utc)
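As a short illustration of the JSON use case mentioned in this step, the following sketch round-trips a datetime through a JSON document. The payload keys "event" and "created_at" are made up for the example:

```python
import datetime as dt
import json

# Hypothetical payload; the "event" and "created_at" keys are made up.
created_at = dt.datetime(2020, 11, 9, 18, 57, 16, tzinfo=dt.timezone.utc)
payload = json.dumps({"event": "signup", "created_at": created_at.isoformat()})
print(payload)

# The receiving side parses the string back into an aware datetime.
decoded = json.loads(payload)
restored = dt.datetime.fromisoformat(decoded["created_at"])
print(restored == created_at)  # True
```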

## Exercise 89: Calculating the Unix Epoch Time

1. Import the time and datetime modules to bring them into the current namespace:

In [29]:
import datetime as dt
import time

2. Get the current time. You use both datetime and time to do this:

In [30]:
time_now = time.time()
datetime_now = dt.datetime.now(dt.timezone.utc)

3. You can now calculate the epoch by subtracting a time delta from the datetime. The time delta is built from the current time, since time.time() returns the number of seconds since the epoch:

In [31]:
epoch = datetime_now - dt.timedelta(seconds=time_now)
print(epoch)

1969-12-31 23:59:59.999999+00:00
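The result above is one microsecond away from midnight. A tiny offset like this is plausible because the two clock readings are taken separately and rounded to microseconds. As a cross-check, the following sketch asks datetime directly for the moment that corresponds to timestamp 0:

```python
import datetime as dt

# Timestamp 0 is, by definition, the Unix epoch.
epoch = dt.datetime.fromtimestamp(0, dt.timezone.utc)
print(epoch)              # 1970-01-01 00:00:00+00:00
print(epoch.timestamp())  # 0.0
```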


In [17]:
import calendar
c = calendar.Calendar()
list(c.itermonthdates(2019, 2))

[datetime.date(2019, 1, 28),
 datetime.date(2019, 1, 29),
 datetime.date(2019, 1, 30),
 datetime.date(2019, 1, 31),
 datetime.date(2019, 2, 1),
 datetime.date(2019, 2, 2),
 datetime.date(2019, 2, 3),
 datetime.date(2019, 2, 4),
 datetime.date(2019, 2, 5),
 datetime.date(2019, 2, 6),
 datetime.date(2019, 2, 7),
 datetime.date(2019, 2, 8),
 datetime.date(2019, 2, 9),
 datetime.date(2019, 2, 10),
 datetime.date(2019, 2, 11),
 datetime.date(2019, 2, 12),
 datetime.date(2019, 2, 13),
 datetime.date(2019, 2, 14),
 datetime.date(2019, 2, 15),
 datetime.date(2019, 2, 16),
 datetime.date(2019, 2, 17),
 datetime.date(2019, 2, 18),
 datetime.date(2019, 2, 19),
 datetime.date(2019, 2, 20),
 datetime.date(2019, 2, 21),
 datetime.date(2019, 2, 22),
 datetime.date(2019, 2, 23),
 datetime.date(2019, 2, 24),
 datetime.date(2019, 2, 25),
 datetime.date(2019, 2, 26),
 datetime.date(2019, 2, 27),
 datetime.date(2019, 2, 28),
 datetime.date(2019, 3, 1),
 datetime.date(2019, 3, 2),
 datetime.date(2019, 3, 3)]

In [33]:
import calendar
c = calendar.Calendar()
list(d for d in c.itermonthdates(2019, 2)
        if d.month == 2)

## Activity 15: Calculating the Time Elapsed to Run a Loop

In [34]:
import random
import time

In [35]:
start = time.time()
l = [random.randint(1, 999) for _ in range(10 * 3)]
end = time.time()
print(end - start)

0.05600380897521973


In [29]:
start = time.time_ns()
l = [random.randint(1, 999) for _ in range(10 * 3)]
end = time.time_ns()
print(end - start)

1002000
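The cells above use time.time() and time.time_ns(), which follow the system wall clock. As an alternative sketch (not the approach used in this activity), time.perf_counter() is a clock intended for measuring short durations; the list size of 10 ** 3 is an arbitrary choice for the example:

```python
import random
import time

start = time.perf_counter()
numbers = [random.randint(1, 999) for _ in range(10 ** 3)]
elapsed = time.perf_counter() - start
print(f"Built {len(numbers)} numbers in {elapsed:.6f} seconds")
```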


## Interacting with the OS

## Exercise 90: Inspecting the Current Process Information

The goal of this exercise is to use the standard library to report information about the running process and the platform on your system:

1. Import the **os, platform, and sys** modules:

In [36]:
import platform
import os
import sys

2. Get basic process information. To obtain information such as the **process id** and the **parent process id**, you can use the **os** module:

In [37]:
print("Process id:", os.getpid())
print("Parent process id:", os.getppid())

Process id: 10480
Parent process id: 8468


3. Now, get the **platform** and Python interpreter information:

In [38]:
print("Machine network name:", platform.node())
print("Python version:", platform.python_version())
print("System:", platform.system())

Machine network name: DESKTOP-NIJ54KC
Python version: 3.8.3
System: Windows


4. Get the Python path and the arguments passed to the interpreter:

In [39]:
print("Python module lookup path:", sys.path)
print("Command to run Python:", sys.argv)

Python module lookup path: ['C:\\Users\\YCHZSH\\Python', 'C:\\Users\\YCHZSH\\anaconda3\\python38.zip', 'C:\\Users\\YCHZSH\\anaconda3\\DLLs', 'C:\\Users\\YCHZSH\\anaconda3\\lib', 'C:\\Users\\YCHZSH\\anaconda3', '', 'C:\\Users\\YCHZSH\\anaconda3\\lib\\site-packages', 'C:\\Users\\YCHZSH\\anaconda3\\lib\\site-packages\\win32', 'C:\\Users\\YCHZSH\\anaconda3\\lib\\site-packages\\win32\\lib', 'C:\\Users\\YCHZSH\\anaconda3\\lib\\site-packages\\Pythonwin', 'C:\\Users\\YCHZSH\\anaconda3\\lib\\site-packages\\IPython\\extensions', 'C:\\Users\\YCHZSH\\.ipython']
Command to run Python: ['C:\\Users\\YCHZSH\\anaconda3\\lib\\site-packages\\ipykernel_launcher.py', '-f', 'C:\\Users\\YCHZSH\\AppData\\Roaming\\jupyter\\runtime\\kernel-61f2f0ee-8cb5-4b2e-acbd-312aca632478.json']


5. Get the username through an environment variable:

In [40]:
print("USERNAME environment variable:", os.environ["USERNAME"])

USERNAME environment variable: YCHZSH
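The USERNAME variable is typically set on Windows, while Linux and macOS usually set USER instead, so indexing os.environ directly can raise a KeyError on another platform. A minimal sketch of a more tolerant lookup, where the "unknown" fallback is an arbitrary placeholder:

```python
import os

# USERNAME is usually set on Windows and USER on Linux/macOS; the
# "unknown" fallback is an arbitrary placeholder for this sketch.
username = os.environ.get("USERNAME") or os.environ.get("USER") or "unknown"
print("Current user:", username)
```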


## Exercise 91: Using the glob Pattern to List Files within a Directory

1. Create a path object for the directory you want to search:

In [41]:
import pathlib
# Raw string, so the backslash is not treated as an escape character.
p = pathlib.Path(r"F:\Zeinab 2016- 2019")

2. Find all files in the directory with the txt extension. You can start by listing all files with the txt extension by using the following glob:

In [42]:
txt_files = p.glob("*.txt")
print("*.txt:", list(txt_files))

*.txt: [WindowsPath('F:/Zeinab 2016- 2019/tel.txt')]


3. Recursively find all files with the txt extension. A leading double asterisk in the pattern matches the directory itself and any number of nested subdirectories, so the following glob lists every txt file below the path:

In [43]:
print("**/*.txt:", list(p.glob("**/*.txt")))

**/*.txt: [WindowsPath('F:/Zeinab 2016- 2019/tel.txt'), WindowsPath('F:/Zeinab 2016- 2019/Amoozesho Parvaresh/1-AP-Majmoe{@Yenipc2}/نمونه سوالات آموزش وپروش.txt'), WindowsPath('F:/Zeinab 2016- 2019/Elearning Courses/links.txt'), WindowsPath('F:/Zeinab 2016- 2019/Elearning Courses/AI Course/Learn to Program_The_Fundamentals/_9b843cf8f88dfd5332f16e78be773ba4_lists_transcript.txt'), WindowsPath('F:/Zeinab 2016- 2019/Elearning Courses/Python for Data Science_Udemy/Section 4/P4-BasketBall-Dataset.txt'), WindowsPath('F:/Zeinab 2016- 2019/Elearning Courses/SW/JetBrains.PyCharm.2019.3.4/JetBrains.PyCharm.2019.3.4/Crack/Offline/ACTIVATION_CODE.txt'), WindowsPath('F:/Zeinab 2016- 2019/Elearning Courses/SW/JetBrains.PyCharm.2019.3.4/JetBrains.PyCharm.2019.3.4/Crack/Offline/readme.txt'), WindowsPath('F:/Zeinab 2016- 2019/Elearning Courses/SW/JetBrains.PyCharm.2019.3.4/JetBrains.PyCharm.2019.3.4/Crack/Online/Readme.txt'), WindowsPath('F:/Zeinab 2016- 2019/Elearning Courses/SW/Navicat_Premium_15.0.1

4. List all entries one level deep within a subdirectory by using the */* pattern. This pattern matches folders as well as files, since a folder is also a path. If you want to get only files, you can filter each of the paths by checking the is_file method:

In [44]:
print("Files in */*:", [f for f in p.glob("*/*") if f.is_file()])

Files in */*: [WindowsPath('F:/Zeinab 2016- 2019/Amoozesho Parvaresh/eteducationalfiles_EToolsFile_9a10f155-78a7-4842-a2cf-69230f1607b7network1.pdf'), WindowsPath('F:/Zeinab 2016- 2019/Amoozesho Parvaresh/faragir-2-esfand-94_www.shenasname.ir_.pdf'), WindowsPath('F:/Zeinab 2016- 2019/Amoozesho Parvaresh/soalat-azmoon1.pdf'), WindowsPath('F:/Zeinab 2016- 2019/Amoozesho Parvaresh/آدرس مناطقه شهر تهران.png'), WindowsPath('F:/Zeinab 2016- 2019/Application/Application.docx'), WindowsPath('F:/Zeinab 2016- 2019/Application/letter.docx'), WindowsPath('F:/Zeinab 2016- 2019/Book/9780735626690.pdf'), WindowsPath('F:/Zeinab 2016- 2019/Book/Android App Development for Dummies (3rd ed.) [Burton 2015-03-09].pdf'), WindowsPath('F:/Zeinab 2016- 2019/Book/C All-in-One Desk Reference For Dummies ( PDFDrive ).pdf'), WindowsPath('F:/Zeinab 2016- 2019/Book/C.pdf'), WindowsPath('F:/Zeinab 2016- 2019/Book/C.Sharp.Learn_p30download.com.pdf'), WindowsPath('F:/Zeinab 2016- 2019/Book/C.Sharp.Learn_p30download.com
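The three patterns are easier to compare on a small, known tree. The following sketch builds a throwaway directory with a hypothetical layout (top.txt, folder_1, and folder_2/folder_3 are made-up names) and applies each glob to it:

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    # Hypothetical layout: one file at the top, two folders below it.
    (root / "folder_1").mkdir()
    (root / "folder_2" / "folder_3").mkdir(parents=True)
    (root / "top.txt").write_text("top")
    (root / "folder_1" / "a.txt").write_text("a")
    (root / "folder_2" / "folder_3" / "deep.txt").write_text("deep")

    def names(paths):
        """Return sorted paths relative to root, with forward slashes."""
        return sorted(p.relative_to(root).as_posix() for p in paths)

    top_level = names(root.glob("*.txt"))
    recursive = names(root.glob("**/*.txt"))
    one_level = names(root.glob("*/*"))
    one_level_files = names(f for f in root.glob("*/*") if f.is_file())

print(top_level)        # ['top.txt']
print(recursive)        # ['folder_1/a.txt', 'folder_2/folder_3/deep.txt', 'top.txt']
print(one_level)        # ['folder_1/a.txt', 'folder_2/folder_3']
print(one_level_files)  # ['folder_1/a.txt']
```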

## Listing All Hidden Files in Your Home Directory

In Unix, hidden files are those that start with a dot. Usually, those files are not listed when you list files with tools such as ls unless you explicitly ask for them. You will now use the pathlib module to list all hidden files in your home directory. The code snippet indicated here will show exactly how to list these hidden files:

In [45]:
import pathlib
p = pathlib.Path.home()
print(list(p.glob(".*")))

[WindowsPath('C:/Users/YCHZSH/.anaconda'), WindowsPath('C:/Users/YCHZSH/.android'), WindowsPath('C:/Users/YCHZSH/.bash_history'), WindowsPath('C:/Users/YCHZSH/.conda'), WindowsPath('C:/Users/YCHZSH/.condarc'), WindowsPath('C:/Users/YCHZSH/.dotnet'), WindowsPath('C:/Users/YCHZSH/.gitconfig'), WindowsPath('C:/Users/YCHZSH/.ipynb_checkpoints'), WindowsPath('C:/Users/YCHZSH/.ipython'), WindowsPath('C:/Users/YCHZSH/.jupyter'), WindowsPath('C:/Users/YCHZSH/.matplotlib'), WindowsPath('C:/Users/YCHZSH/.PhpStorm2019.3'), WindowsPath('C:/Users/YCHZSH/.php_history'), WindowsPath('C:/Users/YCHZSH/.PyCharm2019.3'), WindowsPath('C:/Users/YCHZSH/.viminfo')]


### Using the subprocess Module

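In [46]:
import pathlib
import subprocess
# ls is a Unix command; this run is on Windows, where cmd.exe does not
# recognize it, so the error message ends up in stderr.
subprocess.run(["ls"], capture_output=True,
    text=True,
    shell=True)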

CompletedProcess(args=['ls'], returncode=1, stdout='', stderr="'ls' is not recognized as an internal or external command,\noperable program or batch file.\n")

In [47]:
import pathlib
import subprocess
result = subprocess.run(['ls'], capture_output=True, text=True, shell = True)
print("stdout: ", result.stdout)
print("stderr: ", result.stderr)

stdout:  
stderr:  'ls' is not recognized as an internal or external command,
operable program or batch file.



In [48]:
import subprocess
result = subprocess.run(["ls", "-l"],
    capture_output=True,
    text=True,
    shell=True
)
print("stdout: \n", result.stdout)

stdout: 
 


In [49]:
result = subprocess.run(["ls", "non_existing_file"],
    capture_output=True,
    text=True,
    shell=True
)
print("rc: ", result.returncode)

rc:  1
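The runs above depend on ls being available, which is not the case in the Windows session shown here. As a platform-independent sketch of the same ideas (captured stdout and the return code), the child process below is the Python interpreter itself, started without a shell:

```python
import subprocess
import sys

# Run the current interpreter as the child process; no shell is involved.
ok = subprocess.run(
    [sys.executable, "-c", "print('hello from the child')"],
    capture_output=True,
    text=True,
)
print("stdout:", ok.stdout.strip())
print("rc:", ok.returncode)

# A child that exits with a non-zero status reports it in returncode.
failed = subprocess.run(
    [sys.executable, "-c", "import sys; sys.exit(3)"],
    capture_output=True,
    text=True,
)
print("rc:", failed.returncode)
```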


## Exercise 92: Customizing Child Processes with env vars

1. Import subprocess. Bring the subprocess module into the current namespace:

In [50]:
import subprocess

2. Run env to print the environment variables. You can run the env Unix command, which will list the process environment variables in stdout:

In [51]:
# For Linux / macOS users (env does not exist on Windows, hence the error below)
result = subprocess.run(
    ["env"],
    capture_output=True,
    text=True
)
print(result.stdout)

FileNotFoundError: [WinError 2] The system cannot find the file specified

In [32]:
# For Windows users
result = subprocess.run(
    ["set"],
    capture_output=True,
    text=True,
    shell = True
)
print(result.stdout)

ALLUSERSPROFILE=C:\ProgramData
APPDATA=C:\Users\YCHZSH\AppData\Roaming
CLICOLOR=1
COMMONPROGRAMFILES=C:\Program Files (x86)\Common Files
COMMONPROGRAMFILES(X86)=C:\Program Files (x86)\Common Files
COMMONPROGRAMW6432=C:\Program Files\Common Files
COMPUTERNAME=DESKTOP-NIJ54KC
COMSPEC=C:\Windows\system32\cmd.exe
DRIVERDATA=C:\Windows\System32\Drivers\DriverData
FPS_BROWSER_APP_PROFILE_STRING=Internet Explorer
FPS_BROWSER_USER_PROFILE_STRING=Default
GIT_PAGER=cat
HOMEDRIVE=C:
HOMEPATH=\Users\YCHZSH
LOCALAPPDATA=C:\Users\YCHZSH\AppData\Local
LOGONSERVER=\\DESKTOP-NIJ54KC
MPLBACKEND=module://ipykernel.pylab.backend_inline
NUMBER_OF_PROCESSORS=2
ONEDRIVE=C:\Users\YCHZSH\OneDrive
ONEDRIVECONSUMER=C:\Users\YCHZSH\OneDrive
OS=Windows_NT
PAGER=cat
PATH=C:\Users\YCHZSH\Anaconda3;C:\Users\YCHZSH\Anaconda3\Library\mingw-w64\bin;C:\Users\YCHZSH\Anaconda3\Library\usr\bin;C:\Users\YCHZSH\Anaconda3\Library\bin;C:\Users\YCHZSH\Anaconda3\Scripts;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\W

3. Use a different set of environment variables. If you want to customize the environment variables that the subprocess has, you can use the env keyword argument of the subprocess.run function:

In [None]:
# For Linux / Mac Users
result = subprocess.run(
    ["env"],
    capture_output=True,
    text=True,
    env={"SERVER": "OTHER_SERVER"}
)
print(result.stdout)

In [52]:
# For Windows Users
result = subprocess.run(
    ["set"],
    capture_output=True,
    text=True,
    shell = True,
    env={"SERVER": "OTHER_SERVER"}
)
print(result.stdout)

COMSPEC=C:\Windows\system32\cmd.exe
PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.JS;.WS;.MSC
PROMPT=$P$G
SERVER=OTHER_SERVER



4. Now, modify the default set of variables. Most of the time, you just want to modify or add one variable, not replace them all. Therefore, what you did in the previous step is too radical, as tools might require environment variables that are always present in the OS. To keep them, take the current process environment and modify it to match the expected result. You can access the current process environment variables via os.environ and copy them via the copy module. You can also use the dict expansion syntax together with the keys that you want to change, as shown in the following example:

In [None]:
# For Linux / macOS users
import os
result = subprocess.run(
    ["env"],
    capture_output=True,
    text=True,
    env={**os.environ, "SERVER": "OTHER_SERVER"}
)
print(result.stdout)

In [53]:
# For Windows users
import os
result = subprocess.run(
    ["set"],
    capture_output=True,
    text=True,
    shell = True,
    env={**os.environ, "SERVER": "OTHER_SERVER"}
)
print(result.stdout)

ALLUSERSPROFILE=C:\ProgramData
APPDATA=C:\Users\YCHZSH\AppData\Roaming
COMMONPROGRAMFILES=C:\Program Files\Common Files
COMMONPROGRAMFILES(X86)=C:\Program Files (x86)\Common Files
COMMONPROGRAMW6432=C:\Program Files\Common Files
COMPUTERNAME=DESKTOP-NIJ54KC
COMSPEC=C:\Windows\system32\cmd.exe
DRIVERDATA=C:\Windows\System32\Drivers\DriverData
FPS_BROWSER_APP_PROFILE_STRING=Internet Explorer
FPS_BROWSER_USER_PROFILE_STRING=Default
HOMEDRIVE=C:
HOMEPATH=\Users\YCHZSH
LOCALAPPDATA=C:\Users\YCHZSH\AppData\Local
LOGONSERVER=\\DESKTOP-NIJ54KC
NUMBER_OF_PROCESSORS=2
ONEDRIVE=C:\Users\YCHZSH\OneDrive
ONEDRIVECONSUMER=C:\Users\YCHZSH\OneDrive
OS=Windows_NT
PATH=C:\Users\YCHZSH\Anaconda3;C:\Users\YCHZSH\Anaconda3\Library\mingw-w64\bin;C:\Users\YCHZSH\Anaconda3\Library\usr\bin;C:\Users\YCHZSH\Anaconda3\Library\bin;C:\Users\YCHZSH\Anaconda3\Scripts;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Windows\System32\OpenSSH\;C:\Python34\Scripts;C:\
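The full environment dumps make it hard to spot the one variable that changed. The following sketch checks the same behavior on any platform by having a Python child process print just two variables. PARENT_ONLY is a hypothetical variable set in the parent for the demo, and SERVER is the one added through the env argument:

```python
import os
import subprocess
import sys

# Hypothetical variable, set in the parent so there is something to inherit.
os.environ["PARENT_ONLY"] = "inherited"

# The child prints the variable we add and the one it should inherit.
child_code = (
    "import os; "
    "print(os.environ.get('SERVER')); "
    "print(os.environ.get('PARENT_ONLY'))"
)
result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
    env={**os.environ, "SERVER": "OTHER_SERVER"},
)
print(result.stdout)
```

Because the dict expansion copies os.environ first, the child sees both the inherited variable and the new SERVER entry, while the parent's own environment gains no SERVER key.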

## Activity 16: Testing Python Code

The line, compile("1" + "+1" * 10 ** 6, "string", "exec"), will crash the interpreter; we will need to run it with the following code:

1. First, import the sys and subprocess modules as we are going to use them in the following steps:

In [54]:
import sys
import subprocess

2. We save the code that we were given in the code variable:

In [55]:
code = 'compile("1" + "+1" * 10 ** 6, "string", "exec")'

3. Run the code by calling subprocess.run and sys.executable to get the Python interpreter we are using:

In [56]:
result = subprocess.run([
    sys.executable, 
    "-c", code
])

The preceding code takes a code line, which compiles Python code that will crash and runs it in a subprocess by executing the same interpreter (retrieved via sys.executable) with the -c option to run Python code inline.
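4. Now, we print the final result using result.returncode. On Linux and macOS, this typically returns the value -11, which means the process was terminated by a segmentation fault; the Windows run shown here reports 3221225725 (0xC00000FD, a stack overflow) instead, which also means the process has crashed: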

In [57]:
print(result.returncode)

3221225725


In this activity, we have executed a small program that runs the requested code line in a child process and checked whether it would crash without breaking the current process. It did end up crashing: on Unix-like systems the expected value is -11, meaning the child was terminated by signal 11 (a segmentation fault), while the Windows run above reports 3221225725. Note that any returncode value other than 0 indicates that the code did not finish successfully. In your case, the returncode can be different from what is presented here; however, it will not be 0.

## Using Logging

## Exercise 93: Using a logger Object

The goal of this exercise is to create a logger object and use four different methods that allow us to log in the categories mentioned in the Logging section:

1. Import the logging module:

In [58]:
import logging

2. Create a logger. You can now get a logger through the factory method getLogger:

In [59]:
logger = logging.getLogger("logger_name")

This logger object will be the same everywhere, as long as you request it with the same name.

3. Log with different categories:

In [60]:
logger.debug("Logging at debug")
logger.info("Logging at info")
logger.warning("Logging at warning")
logger.error("Logging at error")
logger.fatal("Logging at fatal")

Logging at warning
Logging at error
Logging at fatal


By default, the logging stack is configured to log warnings and above, which explains why you only see those levels being printed to the console. You will see later how to configure the logging stack to include other levels, such as info, to use files, or to use a different format that includes further information.

4. Include information when logging:

In [61]:
system = "moon"
for number in range(3):
    logger.warning("%d errors reported in %s", number, system)

0 errors reported in moon
1 errors reported in moon
2 errors reported in moon


### Logging in warning, error, and fatal Categories

You should be mindful when you log in warning, error, and fatal. If there is something worse than an error, it is two errors. Logging an error is a way of informing the system of a situation that needs to be handled, and if you decide to log an error and raise an exception, you are basically duplicating the information. As a rule of thumb, following these two pieces of advice is key to an application or library that logs errors effectively:

- Never ignore an exception that transmits an error silently. If you handle an exception that notifies you of an error, log that error.
- Never raise and log an error. If you are raising an exception, the caller has the ability to decide whether it is truly an error situation, or whether they were expecting the issue to occur. They can then decide whether to log it following the previous rule, to handle it, or to re-raise it.

A good example of where a library author might be tempted to log an error or warning is in a database library when a constraint is violated. From the library's perspective, this might look like an error situation, but the user might be trying to insert a row without checking whether the key was already in the table. The user can therefore just try to insert and ignore the exception, but if the library code logs a warning whenever this happens, the log files fill up with warnings or errors for no valid reason. Usually, a library will rarely log an error unless it has no way of transmitting the error through an exception.

When you are handling exceptions, it is quite common to log them and the information they come with. If you want to include the exception and its full traceback, you can use the exc_info argument in any of the methods that we saw before:

In [62]:
try:
    int("nope")
except Exception:
    logging.error("Something bad happened", exc_info=True)
    

ERROR:root:Something bad happened
Traceback (most recent call last):
  File "<ipython-input-62-7f9c7873bb5a>", line 2, in <module>
    int("nope")
ValueError: invalid literal for int() with base 10: 'nope'


The error information now includes the message you passed in, but also the exception that was being handled with the traceback. This is so useful and common that there is a shortcut for it. You can call the exception method to achieve the same as using error with exc_info:

In [63]:
try:
    int("nope")
except Exception:
    logging.exception("Something bad happened")

ERROR:root:Something bad happened
Traceback (most recent call last):
  File "<ipython-input-63-39a74a45c693>", line 2, in <module>
    int("nope")
ValueError: invalid literal for int() with base 10: 'nope'


Now, you will review two common bad practices with the logging module.

The first one is greedy string formatting. You might see some linters complain about the user formatting a string, rather than relying on the logging module's string interpolation. This means that logging.info("string template %s", variable) is preferred over logging.info("string template {}".format(variable)). The reason is that, if you perform the string interpolation with format, you will be doing it no matter how the logging stack is configured. If the user who configures the application decides that they don't need to print out the logs at the info level, you will still have performed the interpolation when it wasn't necessary:

#### prefer
logging.info("string template %s", variable)
#### to
logging.info("string template {}".format(variable))
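The difference can be observed directly. In this sketch, Expensive is a hypothetical class that counts how often it is converted to a string, and "interpolation_demo" is a made-up logger name whose level is set to WARNING so that info records are discarded:

```python
import logging


class Expensive:
    """Counts how many times it is converted to a string."""

    def __init__(self):
        self.conversions = 0

    def __str__(self):
        self.conversions += 1
        return "expensive value"


# Hypothetical logger name; info records are below its WARNING level.
logger = logging.getLogger("interpolation_demo")
logger.setLevel(logging.WARNING)
logger.addHandler(logging.NullHandler())

lazy = Expensive()
logger.info("string template %s", lazy)  # dropped before any formatting
print(lazy.conversions)  # 0

eager = Expensive()
logger.info("string template {}".format(eager))  # formatted, then dropped
print(eager.conversions)  # 1
```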

The other, more important, bad practice is capturing and formatting exceptions when it's not really needed. Often, you see developers capturing broad exceptions and formatting them as part of a log message. This is not only boilerplate but also less explicit. Compare the following two approaches:

In [7]:
d = dict()
# Prefer
try:
    d["missing_key"] += 1
except Exception:
    logging.error("Something bad happened", exc_info=True)
# to
try:
    d["missing_key"] += 1
except Exception as e:
    logging.error("Something bad happened: %s", e)

ERROR:root:Something bad happened
Traceback (most recent call last):
  File "<ipython-input-7-b2af7cb7c0c5>", line 4, in <module>
    d["missing_key"] += 1
KeyError: 'missing_key'
ERROR:root:Something bad happened: 'missing_key'


With the second approach, the log does not tell you that it was a KeyError, nor where the issue appeared. If the exception was raised without a message, you would just get an empty message. Additionally, when logging an error from an exception handler, you can use the exception method, and you won't need to pass exc_info.

### Configuring the Logging Stack

Another part of the logging library is the functions to configure it, but before diving into how to configure the logging stack, you should understand its different parts and the role they play.

You've already seen logger objects, which are used to define the logging messages that need to be generated. There are also the following classes, which take care of processing and emitting a log:

- Log Records: This is the object that is generated by the logger and contains all the information about the log, including the line where it was logged, the level, the template, and arguments, among others.
- Formatters: These take log records and transform them into strings that can be used by handlers that output to streams.
- Handlers: These are the ones that actually emit the records. They frequently use a formatter to transform records into strings. The standard library comes with multiple handlers to emit log records into stdout, stderr, files, sockets, and so on.
- Filters: Tools to fine-tune log record mechanisms. They can be added to both handlers and loggers.

If the functionality that is already provided by the standard library is not enough, you can always create your own classes that customize how the logging process is performed.

Armed with this knowledge, there are multiple ways to configure all of the elements of the logging stack. You can do so by plugging together all the classes manually with code, passing a dict via logging.config.dictConfig, or through an ini file with logging.config.fileConfig.
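To see how the parts fit together, here is a small sketch that wires a logger, a handler, a formatter, and a custom filter by hand. OnlyEven and the "stack_demo" logger name are hypothetical, and an in-memory stream stands in for stdout or a file so the result can be inspected:

```python
import io
import logging


class OnlyEven(logging.Filter):
    """Hypothetical filter: keep records whose first argument is even."""

    def filter(self, record):
        return bool(record.args) and record.args[0] % 2 == 0


stream = io.StringIO()                      # stands in for stdout or a file
handler = logging.StreamHandler(stream)     # handler: emits records
handler.setFormatter(                       # formatter: record -> string
    logging.Formatter("%(levelname)s: %(message)s")
)
handler.addFilter(OnlyEven())               # filter: fine-tunes what passes

logger = logging.getLogger("stack_demo")    # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False                    # keep the demo self-contained
logger.addHandler(handler)

for number in range(4):
    logger.info("processing item %d", number)

print(stream.getvalue())
```

The logger creates one log record per call, the filter on the handler rejects the odd ones, and the formatter turns the remaining records into the strings that end up in the stream.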

### Exercise 94: Configuring the Logging Stack

Start by configuring the stack through code. The first way to configure the stack is by manually creating all the objects and plugging them together:

### Configure through code.
Restart the kernel here.

In [2]:
import logging
import sys
root_logger = logging.getLogger()
handler = logging.StreamHandler(sys.stdout)
formatter = logging.Formatter("%(levelname)s: %(message)s")
handler.setFormatter(formatter)
root_logger.addHandler(handler)
root_logger.setLevel("INFO")
logging.info("Hello logging world")

INFO: Hello logging world


In this code, you get a handle on the root logger in the third line by calling getLogger without any arguments. You then create a stream handler, which will output to sys.stdout (the console), and a formatter to configure how you want the logs to look. Finally, you just need to bind them together by setting the formatter in the handler and the handler in the logger. You set the level in the logger, though you could also configure it in the handler.

### Configure with dictConfig.
Restart the kernel here.

In [2]:
import logging
from logging.config import dictConfig

dictConfig({
    "version": 1,
    "formatters": {
        "short":{
            "format": "%(levelname)s: %(message)s",
        }
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "short",
            "stream": "ext://sys.stdout",
            "level": "DEBUG",
        }
    },
    "loggers": {
        "": {
            "handlers": ["console"],
            "level": "INFO"
        }   
    }
})
logging.info("Hello logging world")

INFO: Hello logging world


The dictionary configuring the logging stack is equivalent to the code in the previous step. Many of the configuration parameters that are passed in as strings can also be passed as Python objects. For example, you can use sys.stdout instead of the string passed to the stream option or logging.INFO rather than INFO.

### Configure with basicConfig.
Restart the kernel here.

In [1]:
import sys
import logging
logging.basicConfig(
    level="INFO",
    format="%(levelname)s: %(message)s",
    stream=sys.stdout
)
logging.info("Hello there!")

INFO: Hello there!


### Configure with fileConfig.
Restart the kernel here.

In [1]:
import logging
from logging.config import fileConfig
fileConfig("logging-config.ini")
logging.info("Hello there!")

INFO: Hello there!
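The contents of logging-config.ini are not shown in this notebook. The following sketch writes an assumed version of the file, modeled on the dictConfig example above (section and key names such as console and short are carried over from it), into a temporary directory and loads it with fileConfig:

```python
import logging
import pathlib
import tempfile
from logging.config import fileConfig

# Assumed contents for logging-config.ini, mirroring the dictConfig example.
INI_TEXT = """\
[loggers]
keys=root

[handlers]
keys=console

[formatters]
keys=short

[logger_root]
level=INFO
handlers=console

[handler_console]
class=StreamHandler
level=DEBUG
formatter=short
args=(sys.stdout,)

[formatter_short]
format=%(levelname)s: %(message)s
"""

with tempfile.TemporaryDirectory() as tmp:
    ini_path = pathlib.Path(tmp) / "logging-config.ini"
    ini_path.write_text(INI_TEXT)
    fileConfig(str(ini_path))

logging.info("Hello there!")
```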


### Exercise 95: Counting Words in a Text Document

1. Get the list of words from this link, which is our source data:

In [2]:
import urllib.request
url = 'https://www.w3.org/TR/PNG/iso_8859-1.txt'
response = urllib.request.urlopen(url)
words = response.read().decode().split()
len(words)  # 858

858

Here, you are using urllib, another module within the standard library, to get the contents of the URL. You then read the content and split it on spaces and line breaks. You will be using words to play with the counter.

2. Create a counter from the list of words:

In [3]:
import collections
word_counter = collections.Counter(words)

This creates a counter from the words list. You can now perform the operations you want on the counter.

In [4]:
for word, count in word_counter.most_common(5):
    print(word, "-", count)

LETTER - 114
SMALL - 58
CAPITAL - 56
WITH - 55
SIGN - 21


You can use the most_common method on the counter to get a list of tuples with all the words and their number of occurrences. You can also pass a limit as an argument, as done here with 5, to cap the number of results.


4. Now, explore occurrences of some words, as shown in the following code snippet:

In [5]:
print("QUESTION", "-", word_counter["QUESTION"])
print("CIRCUMFLEX", "-", word_counter["CIRCUMFLEX"])
print("DIGIT", "-", word_counter["DIGIT"])
print("PYTHON", "-", word_counter["PYTHON"])

QUESTION - 2
CIRCUMFLEX - 11
DIGIT - 10
PYTHON - 0


You can use the counter to explore the occurrences of specific words by querying it with a key, as done here for QUESTION, CIRCUMFLEX, DIGIT, and PYTHON.

Something else interesting to note is that when you query for a word that does not exist, you get 0. Some users might have expected a KeyError.
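The same behavior can be checked offline with a small sketch. The sample_words list below is made up for illustration and is not taken from the downloaded file.

```python
import collections

# Hypothetical sample data, so the example runs without network access.
sample_words = "LETTER SMALL LETTER SIGN LETTER SMALL".split()
sample_counter = collections.Counter(sample_words)

# The two most frequent words, as (word, count) tuples.
top_two = sample_counter.most_common(2)
print(top_two)  # [('LETTER', 3), ('SMALL', 2)]

# Querying a missing key returns 0 and does not add the key.
missing = sample_counter["PYTHON"]
print(missing)  # 0
print("PYTHON" in sample_counter)  # False
```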

In this exercise, you just learned how to get a text file from the internet and perform some basic processing operations, such as counting the number of words.

### Exercise 96: Refactoring Code with defaultdict

In this exercise, you will learn how to refactor code and simplify it by using defaultdict:

In [8]:
_audit = {}

def add_audit(area, action):
    if area in _audit:
        _audit[area].append(action)
    else:
        _audit[area] = [action]
        
def report_audit():
    for area, actions in _audit.items():
        print(f"{area} audit:")
        for action in actions:
            print(f"- {action}")
        print()

The code above keeps an audit of all the actions that are performed in a company, split by area and stored in a plain dictionary. In the add_audit function, you can clearly see the pattern discussed earlier: checking whether a key exists before appending to its list. You will see how you can transform that into simpler code by using defaultdict and how it could later be extended more easily:

1. Run the code that keeps an audit of all the actions to see how it behaves. Before doing any refactoring, you should understand what you are trying to change and, ideally, have tests for it.

In [10]:
add_audit("HR", "Hired Sam")
add_audit("Finance", "Used 1000£")
add_audit("HR", "Hired Tom")
report_audit()

HR audit:
- Hired Sam
- Hired Tom

Finance audit:
- Used 1000£



You can see that this works as expected, and you can add items to the audit and report them.

2. Introduce a defaultdict. You can swap dict for defaultdict so that a list is created whenever you access a key that does not exist. Apart from the definition of _audit, only the add_audit function needs to change. As report_audit uses the object as a dictionary and defaultdict is a dictionary, you don't need to change anything in that function. The result looks as follows:

In [12]:
import collections
_audit = collections.defaultdict(list)
def add_audit(area, action):
    _audit[area].append(action)
        
def report_audit():
    for area, actions in _audit.items():
        print(f"{area} audit:")
        for action in actions:
            print(f"- {action}")
        print()

In [16]:
add_audit("HR", "Hired Sam")
add_audit("Finance", "Used 1000£")
add_audit("HR", "Hired Tom")
report_audit()

HR audit:
- Hired Sam
- Hired Tom

Finance audit:
- Used 1000£


When a key is not found in the _audit object, defaultdict calls its factory, list, which returns an empty list. The code could hardly be simpler. What if you are asked to log the creation of an area in the audit? Whenever a new area is created in the audit object, its first element should be "Area created". The developer who initially wrote the code claims that this was easier to change with the old layout, without using defaultdict.

3. Use the add_audit function to create the first element. The code without defaultdict for add_audit will be as follows:

In [14]:
def add_audit(area, action):
    if area not in _audit:
        _audit[area] = ["Area created"]
    _audit[area].append(action)

This change to add_audit is more involved than the one needed with defaultdict. With defaultdict, you just change the factory from list to a function that returns a list containing the initial string:

In [15]:
import collections
_audit = collections.defaultdict(lambda: ["Area created"])
def add_audit(area, action):
    _audit[area].append(action)
        
def report_audit():
    for area, actions in _audit.items():
        print(f"{area} audit:")
        for action in actions:
            print(f"- {action}")

In [17]:
add_audit("HR", "Hired Sam")
add_audit("Finance", "Used 1000£")
add_audit("HR", "Hired Tom")
report_audit()

HR audit:
- Area created
- Hired Sam
- Hired Tom
Finance audit:
- Area created
- Used 1000£
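One detail of defaultdict worth keeping in mind for this exercise is that the factory runs on any plain lookup of a missing key, not only when appending. The following minimal sketch uses a separate, illustrative audit object and area names to show this.

```python
import collections

audit = collections.defaultdict(list)

# A plain lookup of a missing key calls the factory and stores the result.
_ = audit["Legal"]
print(dict(audit))  # {'Legal': []}

# .get() and the "in" operator do not trigger the factory.
print(audit.get("Sales"))  # None
print("Sales" in audit)  # False
```

If an area should only appear in the report once it has at least one action, checking with in or get avoids creating empty entries by accident.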


### Exercise 97: Using lru_cache to Speed Up Our Code

In this exercise, you will see how to configure a function to use cache with functools and to reuse the results from previous calls to speed up the overall process.

You use the lru_cache function of the functools module to reuse values that a function has already returned, without having to execute the function again.

We will start with a function that is mentioned in the following code snippet, which simulates taking a long time to compute, and we will see how we can improve this:

In [18]:
import time
def func(x):
    time.sleep(1)
    print(f"Heavy operation for {x}")
    return x * 10

If we call this function twice with the same arguments, we will be executing the code twice to get the same result:

In [19]:
print("Func returned:", func(1))
print("Func returned:", func(1))

Heavy operation for 1
Func returned: 10
Heavy operation for 1
Func returned: 10


We can see this in the output: the print within the function happens twice. Caching the result would be a clear improvement in performance because, once the function has been executed for a given argument, future calls with that argument are practically free. We will add this in the steps that follow:


1. Add the lru_cache decorator to the func function. The first step is to use the decorator on our function:

In [25]:
import functools
import time
@functools.lru_cache()
def func(x):
    time.sleep(1)
    print(f"Heavy operation for {x}")
    return x * 10

In [27]:
print("Func returned:", func(1))
print("Func returned:", func(1))
print("Func returned:", func(2))

Heavy operation for 1
Func returned: 10
Func returned: 10
Heavy operation for 2
Func returned: 20


Note: The Heavy operation only happens once for 1. We also call func with 2 here to show that the value is different based on its input and, since 2 was not cached before, the code has to be executed for it.

This is extremely useful; with just one line of code, we have at hand a fully working implementation of an LRU cache.

2. Change the cache size using the maxsize argument. The cache comes with a default size of 128 elements, but this can be changed if needed, through the maxsize argument:

In [1]:
import functools
import time
@functools.lru_cache(maxsize=2)
def func(x):
    time.sleep(1)
    print(f"Heavy operation for {x}")
    return x * 10

By setting it to 2, we are sure that only two different inputs will be saved. We can see this by using three different inputs and calling them in reverse order later:

In [2]:
print("Func returned:", func(1))
print("Func returned:", func(2))
print("Func returned:", func(3))
print("Func returned:", func(3))
print("Func returned:", func(2))
print("Func returned:", func(1))

Heavy operation for 1
Func returned: 10
Heavy operation for 2
Func returned: 20
Heavy operation for 3
Func returned: 30
Func returned: 30
Func returned: 20
Heavy operation for 1
Func returned: 10


The cache successfully returned the previous values for the second calls with 3 and 2, but the result for 1 was evicted once 3 arrived, since we limited the size to two elements only.
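The eviction can also be confirmed without reading the printed messages, by using the cache_info method that lru_cache adds to the decorated function. The sketch below repeats the same call sequence on a hypothetical times_ten function that omits the sleep.

```python
import functools

@functools.lru_cache(maxsize=2)
def times_ten(x):
    return x * 10

# Same call order as above: 1, 2, 3, then 3, 2, 1.
for value in (1, 2, 3, 3, 2, 1):
    times_ten(value)

info = times_ten.cache_info()
print(info)  # CacheInfo(hits=2, misses=4, maxsize=2, currsize=2)
```

The two hits are the second calls with 3 and 2. The four misses are the first calls with 1, 2, and 3, plus the final call with 1 after it was evicted.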

3. Now, use it in other functions. Sometimes, the functions we want to cache are not in our control to change, or we want to keep both a cached and an uncached version. We can achieve this by using lru_cache as a plain function rather than as a decorator, since decorators are just functions that take another function as an argument:

In [3]:
import functools
import time
def func(x):
    time.sleep(1)
    print(f"Heavy operation for {x}")
    return x * 10
cached_func = functools.lru_cache()(func)

Now, we can use either func or its cached version, cached_func:

In [4]:
print("Cached func returned:", cached_func(1))
print("Cached func returned:", cached_func(1))
print("Func returned:", func(1))
print("Func returned:", func(1))

Heavy operation for 1
Cached func returned: 10
Cached func returned: 10
Heavy operation for 1
Func returned: 10
Heavy operation for 1
Func returned: 10


We can see how the cached version of the function did not execute the code in the second call, but the uncached version did.

You just learned how to use functools to cache the values of a function. This is a really quick way to improve the performance of your application when applicable.

### Partial

Another frequently used function in functools is partial. partial allows us to adapt existing functions by providing values for some of their arguments. It is similar to binding arguments in other languages, such as C++ or JavaScript, and the result behaves like any other Python callable. partial can be used to remove the need for specifying positional or keyword arguments, which makes it useful when we need to pass a function that takes arguments to code that expects a function that takes none. Have a look at some examples:

In [6]:
def func(x, y, z):
    print("x:", x)
    print("y:", y)
    print("z:", z)
func(1, 2, 3)

x: 1
y: 2
z: 3


In [7]:
import functools
new_func = functools.partial(func, z='Wops')
new_func(1, 2)

x: 1
y: 2
z: Wops


In [8]:
import functools
new_func = functools.partial(func, 'Wops')
new_func(1, 2)

x: Wops
y: 1
z: 2
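The case mentioned earlier, passing a function that takes arguments where a function without arguments is expected, can be sketched as follows. The power and call_later names are hypothetical helpers invented for this illustration.

```python
import functools

def power(base, exponent):
    return base ** exponent

def call_later(callback):
    # Stand-in for any API that calls its callback with no arguments.
    return callback()

# Bind one keyword argument: the result still needs base.
square = functools.partial(power, exponent=2)
print(square(7))  # 49

# Bind every argument: the result can be called with none.
compute = functools.partial(power, 2, 10)
result = call_later(compute)
print(result)  # 1024
```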


### Exercise 98: Creating a print Function That Writes to stderr

By using partial, you can also rebind the optional arguments to a different default, allowing us to change the default value that the function has. You will see how you can repurpose the print function to create a print_stderr function that just writes to stderr.

In this exercise, you will create a function that acts like print, but the output is stderr rather than stdout:



1. Explore the print arguments. To start, you need to explore the arguments that print takes. Call help on print to see what the documentation offers:

In [9]:
help(print)

Help on built-in function print in module builtins:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
    
    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.



The argument that you are interested in is file, which allows you to specify the stream you want to write to.

2. Print to stderr. The default value for the optional argument file is sys.stdout, but you can pass sys.stderr to get the behavior you are looking for:

In [10]:
import sys
print("Hello stderr", file=sys.stderr)

Hello stderr


As you are printing to stderr, the output appears in red as expected.

3. Use partial to change the default. You can use partial to specify arguments to be passed and create a new function. You will bind file to stderr and see the output:

In [13]:
import functools
print_stderr = functools.partial(print, file=sys.stderr)
print_stderr("Hello stderr")

Hello stderr


Great – this works as expected; we now have a function that has changed the default value for the optional file argument.
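Because partial only changes the default, the bound keyword can still be overridden at call time. The following sketch assumes an in-memory io.StringIO buffer as the alternative stream, which is not part of the exercise.

```python
import functools
import io
import sys

print_stderr = functools.partial(print, file=sys.stderr)

# A keyword passed at call time takes precedence over the bound one.
buffer = io.StringIO()
print_stderr("Hello buffer", file=buffer)
captured = buffer.getvalue()
print(repr(captured))  # 'Hello buffer\n'

# The partial object exposes the wrapped function and the bound keywords.
print(print_stderr.func is print)  # True
print(list(print_stderr.keywords))  # ['file']
```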

## Activity 17: Using partial on class Methods

In [15]:
import functools
class Hero:
    DEFAULT_NAME = "Superman"
    def __init__(self):
        self.name = Hero.DEFAULT_NAME
   
    def rename(self, new_name):
        self.name = new_name
   
    reset_name = functools.partial(rename, DEFAULT_NAME)
   
    def __repr__(self):
        return f"Hero({self.name!r})"

In [16]:
if __name__ == "__main__":
    hero = Hero()
    assert hero.name == "Superman"
    hero.rename("Batman")
    assert hero.name == "Batman"
    hero.reset_name()
    assert hero.name == "Superman"

TypeError: rename() missing 1 required positional argument: 'new_name'

In [17]:
help(functools)

Help on module functools:

NAME
    functools - functools.py - Tools for working with functions and callable objects

MODULE REFERENCE
    https://docs.python.org/3.8/library/functools
    
    The following documentation is automatically generated from the Python
    source files.  It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations.  When in doubt, consult the module reference at the
    location listed above.

CLASSES
    builtins.object
        cached_property
        partial
        partialmethod
        singledispatchmethod
    
    class cached_property(builtins.object)
     |  cached_property(func)
     |  
     |  Methods defined here:
     |  
     |  __get__(self, instance, owner=None)
     |  
     |  __init__(self, func)
     |      Initialize self.  See help(type(self)) for accurate signature.
     |  
     |  __set_name__(self, owner, name)
     |  
     |  ---------------------

In [29]:
import functools

class Hero:
    DEFAULT_NAME = "Superman"
    def __init__(self):
        self.name = Hero.DEFAULT_NAME
   
    def rename(self, new_name):
        self.name = new_name
   
    reset_name = functools.partialmethod(rename, DEFAULT_NAME)
   
    def __repr__(self):
        return f"Hero({self.name!r})"

In [30]:
hero = Hero()
assert hero.name == "Superman"
hero.rename("Batman")
assert hero.name == "Batman"
hero.reset_name()
assert hero.name == "Superman"
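In the first attempt, the TypeError shows that "Superman" was taken as self and nothing was left for new_name, because the partial object did not receive the instance. partialmethod is meant for use inside a class body: when the attribute is accessed through an instance, the instance is passed as the first argument, followed by the bound arguments. The sketch below shows the same idea on a hypothetical Light class, binding once by position and once by keyword.

```python
import functools

class Light:
    def __init__(self):
        self.is_on = False

    def set_state(self, is_on):
        self.is_on = is_on

    # self is supplied by the instance, the remaining argument is pre-bound.
    turn_on = functools.partialmethod(set_state, True)
    turn_off = functools.partialmethod(set_state, is_on=False)

light = Light()
light.turn_on()
print(light.is_on)  # True
light.turn_off()
print(light.is_on)  # False
```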