Object reference naming conventions:
=
Do not reuse names of data types, python keywords or built in attributes
-
Calling dir() with no arguments lists the names defined in the current scope (here, the IPython session's namespace) rather than Python's built-in attributes:

In [3]:
dir()

['In',
 'Out',
 '_',
 '_1',
 '__',
 '___',
 '__builtin__',
 '__builtins__',
 '__doc__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 '_dh',
 '_i',
 '_i1',
 '_i2',
 '_i3',
 '_ih',
 '_ii',
 '_iii',
 '_oh',
 'exit',
 'get_ipython',
 'quit']

The __builtins__ attribute is, in effect, a module that holds all of Python’s
built-in attributes. We can use it as an argument to the dir() function. Apart
from the constants True, False, None, Ellipsis, and NotImplemented, the names
that begin with a capital letter are Python’s built-in exceptions;
the rest are function and data type names:

In [4]:
dir(__builtins__)

['ArithmeticError',
 'AssertionError',
 'AttributeError',
 'BaseException',
 'BlockingIOError',
 'BrokenPipeError',
 'BufferError',
 'ChildProcessError',
 'ConnectionAbortedError',
 'ConnectionError',
 'ConnectionRefusedError',
 'ConnectionResetError',
 'EOFError',
 'Ellipsis',
 'EnvironmentError',
 'Exception',
 'False',
 'FileExistsError',
 'FileNotFoundError',
 'FloatingPointError',
 'GeneratorExit',
 'IOError',
 'ImportError',
 'IndentationError',
 'IndexError',
 'InterruptedError',
 'IsADirectoryError',
 'KeyError',
 'KeyboardInterrupt',
 'LookupError',
 'MemoryError',
 'ModuleNotFoundError',
 'NameError',
 'None',
 'NotADirectoryError',
 'NotImplemented',
 'NotImplementedError',
 'OSError',
 'OverflowError',
 'PermissionError',
 'ProcessLookupError',
 'RecursionError',
 'ReferenceError',
 'RuntimeError',
 'StopAsyncIteration',
 'StopIteration',
 'SyntaxError',
 'SystemError',
 'SystemExit',
 'TabError',
 'TimeoutError',
 'True',
 'TypeError',
 'UnboundLocalError',
 'UnicodeDecode

Names that begin and end with two underscores (such as __lt__) should not be used. Python defines various special methods and variables that use such names.
-

The easiest way to check whether something is a valid identifier is to try to
assign to it in an interactive Python interpreter:

In [11]:
stretch-factor = 1 #This assignment fails because “-” is not a Unicode letter, digit, or underscore.

SyntaxError: cannot assign to operator (<ipython-input-11-76d7f333723d>, line 1)

In [7]:
2miles = 2 #fails because the start character is not a Unicode letter or underscore

SyntaxError: invalid syntax (<ipython-input-7-8b2c5ddd52bf>, line 1)

In [8]:
str = 3 # Legal but BAD

In [9]:
l'impôt31 = 4 #fails because a quote is not a Unicode letter, digit, or underscore

SyntaxError: EOL while scanning string literal (<ipython-input-9-3731f1bc0fe5>, line 1)

In [10]:
l_impôt31 = 5 #this is fine
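
Instead of provoking a SyntaxError, we can also ask Python directly. The sketch below uses str.isidentifier() and the keyword module; the helper name is_usable_name is made up for this illustration. Note that it cannot warn about legal-but-bad names such as str.

```python
import keyword

def is_usable_name(name):
    """Hypothetical helper: True if name is a valid identifier and not a keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

candidates = ["stretch_factor", "stretch-factor", "2miles", "l_impôt31", "class", "str"]
results = {name: is_usable_name(name) for name in candidates}
for name, ok in results.items():
    print(name, ok)
# "str" passes: it is legal, so this check says nothing about shadowing built-ins.
```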

Integral Types
=
When used in Boolean expressions, 0 and False
are False, and any other integer and True are True. When used in numerical
expressions True evaluates to 1 and False to 0.
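
A small sketch of both directions of this rule; the list of answers is invented for the example.

```python
print(bool(0), bool(-7))      # False True
print(True + True)            # 2

answers = [True, False, True, True]   # made-up data
correct = sum(answers)                # each True counts as 1, each False as 0
print(correct)                        # 3
```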

In [None]:
Syntax         Description
x + y          Adds number x and number y
x - y          Subtracts y from x
x * y          Multiplies x by y
x / y          Divides x by y; always produces a float (or a complex if x or y is complex)
x // y         Divides x by y and rounds the result down to a whole number (floor division), so two ints always produce an int; see also the round() function
x % y          Produces the modulus (remainder) of dividing x by y
x ** y         Raises x to the power of y; see also the pow() function
-x             Negates x; changes x’s sign if nonzero, does nothing if zero
+x             Does nothing; is sometimes used to clarify code
abs(x)         Returns the absolute value of x
divmod(x, y)   Returns the quotient and remainder of dividing x by y as a tuple of two ints
pow(x, y)      Raises x to the power of y; the same as the ** operator
pow(x, y, z)   A faster alternative to (x ** y) % z
round(x, n)    Returns x rounded to n integral digits if n is a negative int, or x rounded to n decimal places if n is a positive int; the returned value has the same type as x
bin(i)         Returns the binary representation of int i as a string, e.g., bin(1980) == '0b11110111100'
hex(i)         Returns the hexadecimal representation of i as a string, e.g., hex(1980) == '0x7bc'
int(x)         Converts object x to an integer; raises ValueError on failure, or TypeError if x’s data type does not support integer conversion. If x is a floating-point number it is truncated.
int(s, base)   Converts str s to an integer; raises ValueError on failure. If the optional base argument is given it should be an integer between 2 and 36 inclusive.
oct(i)         Returns the octal representation of i as a string, e.g., oct(1980) == '0o3674'

In [21]:
divmod(16, 5)

(3, 1)

Binary numbers are written with a leading 0b, octal numbers with a leading 0o, and hexadecimal numbers with a leading 0x. Uppercase letters can also be used.
-

When a negative rounding value is used on integers a subtle and useful behavior is achieved—for
example, round(13579, -3) produces 14000, and round(34.8, -1) produces 30.0.
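
The following sketch exercises a few of the integer operations described above. One detail worth seeing in action is that // rounds down (toward negative infinity), whereas int() applied to a float truncates toward zero, so the two differ for negative values.

```python
quotient, remainder = divmod(-7, 2)
print(quotient, remainder)              # -4 1
print(-7 // 2, int(-7 / 2))             # -4 -3

modpow = pow(3, 4, 5)                   # same result as (3 ** 4) % 5
print(modpow, (3 ** 4) % 5)             # 1 1

print(int("7bc", 16), int("0b101", 2))  # 1980 5
print(round(13579, -3))                 # 14000
```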

Bitwise operators are operators (just like + and *) that operate on ints at the binary level. This means they look directly at the binary digits, or bits, of an integer.
-

In [None]:
Syntax    Description
i & j     Bitwise AND of int i and int j; the result has a 1 bit wherever both i and j have a 1 bit, and a 0 bit everywhere else
i | j     Bitwise OR of int i and int j; negative numbers are assumed to be represented using 2’s complement
i ^ j     Bitwise XOR (exclusive or) of i and j
i << j    Shifts i left by j bits; like i * (2 ** j) without overflow checking
i >> j    Shifts i right by j bits; like i // (2 ** j) without overflow checking
~i        Inverts i’s bits

Python ints are not limited to a fixed size such as 4 bytes (32 bits); they grow to hold as many binary digits as the value needs.
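
A quick sketch of each bitwise operator on two small values, written as binary literals so the bit patterns are easy to compare by eye.

```python
i, j = 0b1100, 0b1010            # 12 and 10

print(bin(i & j))                # 0b1000  (1 only where both have a 1)
print(bin(i | j))                # 0b1110  (1 where either has a 1)
print(bin(i ^ j))                # 0b110   (1 where exactly one has a 1)
print(i << 2, i * 2 ** 2)        # 48 48
print(i >> 2, i // 2 ** 2)       # 3 3
print(~i)                        # -13  (for ints, ~i equals -i - 1)
```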

Booleans
=
Python provides three logical operators: and, or, and not.
Both and and or use short-circuit logic and return the operand that determined
the result, whereas not always returns either True or False.
Programmers who have been using older versions of Python sometimes use
1 and 0 instead of True and False; this almost always works fine, but new code
should use the built-in Boolean objects when a Boolean value is required.
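
The short-circuit behavior is easiest to see with operands that are not themselves Booleans. In this sketch the variable names are arbitrary.

```python
name = "" or "anonymous"        # "" is false, so or returns its right operand
both = "first" and "second"     # "first" is true, so and returns its right operand
skipped = 0 and 1 / 0           # 0 decides the result; 1 / 0 is never evaluated
negated = not "text"            # not always produces a bool

print(name, both, skipped, negated)   # anonymous second 0 False
```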

Floating-Point Types
=
Python provides three kinds of floating-point values: the built-in float and
complex types, and the decimal.Decimal type from the standard library. Computers natively represent floating-point numbers using base 2, which
means that some decimals can be represented exactly (such as 0.5), but others
only approximately (such as 0.1 and 0.2).

If we need really high precision there are two approaches we can take. One
approach is to use ints—for example, working in terms of pennies or tenths of
a penny or similar—and scale the numbers when necessary. This requires us
to be quite careful, especially when dividing or taking percentages. The other
approach is to use Python’s decimal.Decimal numbers from the decimal module.
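
To make the problem and the first approach concrete, here is a small sketch. The price and quantity are invented, and amounts are held as whole pennies and only split into pounds and pence (or dollars and cents) for display.

```python
print(0.1 + 0.2 == 0.3)        # False: 0.1 and 0.2 are only approximated
print(0.5 + 0.25 == 0.75)      # True: all three are exact in base 2

price_pennies = 1999           # 19.99 stored as an int (made-up value)
quantity = 3
total_pennies = price_pennies * quantity
whole, pennies = divmod(total_pennies, 100)
total_text = "{0}.{1:02d}".format(whole, pennies)
print(total_text)              # 59.97
```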

Mixed mode arithmetic is supported such that using an int and a float produces
a float, and using a float and a complex produces a complex. Because
decimal.Decimals are of fixed precision they can be used only with other
decimal.Decimals and with ints, in the latter case producing a decimal.Decimal result.
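
A brief sketch of these rules, including the TypeError we get if we try to mix a decimal.Decimal with a float:

```python
import decimal

mixed_float = 2 + 3.0            # int with float gives float
mixed_complex = 2.0 + 3j         # float with complex gives complex
print(type(mixed_float).__name__, type(mixed_complex).__name__)  # float complex

d = decimal.Decimal("1.5")
with_int = d + 2                 # Decimal with int gives Decimal
print(with_int)                  # 3.5

try:
    d + 2.0                      # Decimal with float is not allowed
    mixing_failed = False
except TypeError as err:
    mixing_failed = True
    print("TypeError:", err)
```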

It is possible that NaN (“not a number”) or “infinity” may be produced by a calculation involving floats—unfortunately the behavior is not consistent across implementations and may differ depending on the system’s underlying math library.

Here is a simple function for comparing floats for equality to the limit of the machine’s accuracy:
-

In [84]:
import sys

def equal_float(a, b):
    return abs(a - b) <= sys.float_info.epsilon

In [86]:
sys.float_info

sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)
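
One caveat is that epsilon is an absolute tolerance, so it only suits values of roughly unit magnitude. For large numbers, neighboring floats are further apart than epsilon. As an alternative (not part of the original function), the standard library's math.isclose(), available from Python 3.5, compares using a relative tolerance. A sketch of the difference:

```python
import math
import sys

def equal_float(a, b):
    return abs(a - b) <= sys.float_info.epsilon

print(equal_float(0.1 + 0.2, 0.3))   # True: the difference is below epsilon

a = 1e16
b = a + 2.0                          # the next representable float above 1e16
print(equal_float(a, b))             # False: adjacent floats, yet 2.0 apart
print(math.isclose(a, b))            # True: relative tolerance (default 1e-09)
```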

Floating-point numbers can be converted to integers using the int() function which returns the whole part and throws away the fractional part, or using round() which accounts for the fractional part, or using math.floor() or math.ceil() which convert down to or up to the nearest integer.
-
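
The four conversions differ most visibly on a negative value, as this sketch shows. Note also that round() with a single argument rounds exact halves to the nearest even integer.

```python
import math

x = -2.7
print(int(x), round(x), math.floor(x), math.ceil(x))   # -2 -3 -3 -2
print(round(2.5), round(3.5))                          # 2 4
```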

Complex Numbers
=
Literal complex numbers are written with the real and imaginary parts joined by a + or - sign, and with the imaginary part followed by a j. The separate parts of a complex are available as attributes real and imag. For example:

In [87]:
z = -89.5+2.125j
z.real, z.imag

(-89.5, 2.125)

Except for //, %, divmod(), and the three-argument pow(), all the numeric operators and functions can be used with complex numbers, and so can the augmented assignment versions.
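
A short sketch of what does and does not work with complex numbers:

```python
z = -89.5+2.125j
print(z.conjugate())        # (-89.5-2.125j)
print(abs(3+4j))            # 5.0

z += 1                      # augmented assignment is supported
print(z)                    # (-88.5+2.125j)

try:
    z // 2                  # floor division is not defined for complex
    floor_div_failed = False
except TypeError as err:
    floor_div_failed = True
    print("TypeError:", err)
```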

Decimal Numbers
=
The decimal module provides
immutable Decimal numbers that are as accurate as we specify. Calculations
involving Decimals are slower than those involving floats, but whether this is
noticeable will depend on the application.


In [88]:
import decimal
a = decimal.Decimal(9876)
b = decimal.Decimal("54321.012345678987654321") #An int or a str is used here; from Python 3.2 a float is also accepted, but it is converted exactly, binary rounding error included
a + b

Decimal('64197.012345678987654321')

From Python 3.1 it is possible to convert floats to decimals using the
decimal.Decimal.from_float() function. This function takes a float as argument
and returns the decimal.Decimal that is closest to the number the float approximates.
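
The sketch below shows why a float and the string that looks like it give different decimals: the conversion preserves the float's binary approximation digit for digit.

```python
import decimal

exact = decimal.Decimal.from_float(0.1)
print(exact)                                # many digits: the float's true value
print(exact == decimal.Decimal("0.1"))      # False
print(decimal.Decimal(0.1) == exact)        # True on Python 3.2 and later
```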

The math and cmath modules are not suitable for use with decimal.Decimals,
but some of the functions provided by the math module are provided as
decimal.Decimal methods. For example, to calculate e raised to the power x
where x is a float, we write math.exp(x), but where x is a decimal.Decimal, we write x.exp().

The decimal.Decimal data type also provides ln() which calculates the natural
(base e) logarithm (just like math.log() with one argument), log10(), and sqrt(),
along with many other methods specific to the decimal.Decimal data type.
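
A minimal sketch of these methods next to their math counterparts. The results carry as many digits as the current decimal context allows (28 by default), so here they are only compared approximately against the float versions.

```python
import decimal
import math

x = decimal.Decimal("2")
print(x.sqrt(), math.sqrt(2))
print(x.exp(), math.exp(2))
print(x.ln(), math.log(2))

thousand_log = decimal.Decimal(1000).log10()
print(thousand_log)            # 3
```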

When we call print() on the result of decimal.Decimal(23) /
decimal.Decimal("1.05") the bare number is printed; this output is in string form.
If we simply enter the expression we get a decimal.Decimal output; this output
is in representational form.

All Python objects have two output forms. String
form is designed to be human-readable. Representational form is designed to
produce output that if fed to a Python interpreter would (when possible) reproduce
the represented object.
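
The two forms can be requested explicitly with the built-in str() and repr() functions. This sketch also checks that the representational form really does rebuild the object when evaluated (using eval() purely for demonstration).

```python
import decimal

value = decimal.Decimal(23) / decimal.Decimal("1.05")
print(str(value))     # string form: the bare number
print(repr(value))    # representational form: Decimal('...')

# Feeding the representational form back to Python reproduces the object.
rebuilt = eval(repr(value), {"Decimal": decimal.Decimal})
print(rebuilt == value)   # True
```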

Strings
=

String literals are created using quotes, and we
are free to use single or double quotes provided we use the same at both ends.


We can also use a triple quoted string — this is Python-speak for a string
that begins and ends with three quote characters (either three single quotes or
three double quotes). For example:

In [94]:
text = """A triple quoted string like this can include 'quotes' and
"quotes" without formality. We can also escape newlines \
so this particular string is actually only two lines long."""

text

'A triple quoted string like this can include \'quotes\' and\n"quotes" without formality. We can also escape newlines so this particular string is actually only two lines long.'

In [None]:
Escape        Meaning
\newline      Escape (i.e., ignore) the newline
\\            Backslash (\)
\'            Single quote (')
\"            Double quote (")
\a            ASCII bell (BEL)
\b            ASCII backspace (BS)
\f            ASCII formfeed (FF)
\n            ASCII linefeed (LF)
\N{name}      Unicode character with the given name
\ooo          Character with the given octal value
\r            ASCII carriage return (CR)
\t            ASCII tab (TAB)
\uhhhh        Unicode character with the given 16-bit hexadecimal value
\Uhhhhhhhh    Unicode character with the given 32-bit hexadecimal value
\v            ASCII vertical tab (VT)
\xhh          Character with the given 8-bit hexadecimal value

If we want to use quotes inside a normal quoted string we can do so without
formality if they are different from the delimiting quotes; otherwise, we must
escape them:

In [95]:
a = "Single 'quotes' are fine; \"doubles\" must be escaped."
b = 'Single \'quotes\' must be escaped; "doubles" are fine.'

In [2]:
import re
phone1 = re.compile("^((?:[(]\\d+[)])?\\s*\\d+(?:-\\d+)?)$") #\\ escapes required to write backslashes into this string
phone1

re.compile(r'^((?:[(]\d+[)])?\s*\d+(?:-\d+)?)$', re.UNICODE)

In [3]:
phone2 = re.compile(r"^((?:[(]\d+[)])?\s*\d+(?:-\d+)?)$") #by using a raw string r"example", escaping is not necessary
phone2

re.compile(r'^((?:[(]\d+[)])?\s*\d+(?:-\d+)?)$', re.UNICODE)

In [4]:
t = "This is not the best way to join two long strings " + \
"together since it relies on ugly newline escaping"

s = ("This is the nice way to join two long strings "
"together; it relies on string literal concatenation.")

Slicing and Striding Strings
-
We know from Piece #3 that individual items in a sequence, and therefore individual
characters in a string, can be extracted using the item access operator
([]). In fact, this operator is much more versatile and can be used to extract not
just one item or character, but an entire slice (subsequence) of items or characters,
in which context it is referred to as the slice operator.

Index
positions into a string begin at 0 and go up to the length of the string minus
1. But it is also possible to use negative index positions—these count from the
last character back toward the first.

In [None]:
The slice operator has three syntaxes:
seq[start]
seq[start:end]
seq[start:end:step]

Stepping by -1 will extract every character backwards:
seq[::-1]
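
A short sketch of the three syntaxes on a sample string (the string itself is arbitrary):

```python
s = "The waxwork man"

print(s[0], s[-1])        # T n
print(s[4:11])            # waxwork
print(s[:3], s[-3:])      # The man
print(s[::2])             # Tewxokmn  (every second character)
print(s[::-1])            # nam krowxaw ehT
```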

String operators and methods
-
The str.join() method is good for joining lots of strings. The method takes a sequence as an argument (e.g., a list or tuple of strings), and joins them together into a single string with the
string the method was called on between each one.

In [6]:
treatises = ["Arithmetica", "Conics", "Elements"]

" ".join(treatises)
'Arithmetica Conics Elements'

"-<>-".join(treatises)
'Arithmetica-<>-Conics-<>-Elements'

"".join(treatises)
'ArithmeticaConicsElements'

'ArithmeticaConicsElements'

The * operator provides string replication:

In [7]:
s = "=" * 5
print(s)

=====


In [8]:
s *= 10
print(s)

==================================================



When applied to strings, the in membership operator returns True if its left-hand
string argument is a substring of, or equal to, its right-hand string argument.

In cases where we want to find the position of one string inside another, we
have two methods to choose from. One is the str.index() method; this returns
the index position of the substring, or raises a ValueError exception on failure.

The other is the str.find() method; this returns the index position of the substring,
or -1 on failure.
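
A small sketch contrasting membership testing with the two search methods, using an arbitrary sample line:

```python
line = "<b>bold</b> text"

print("bold" in line)          # True
print(line.find("text"))       # 12
print(line.find("italic"))     # -1

try:
    line.index("italic")
    index_failed = False
except ValueError:
    index_failed = True
    print("index() raised ValueError")
```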

If we are looking for multiple index positions, using str.index() rather than
str.find() often produces cleaner code, as the following two equivalent functions
illustrate:

In [9]:
def extract_from_tag(tag, line):
    opener = "<" + tag + ">"
    closer = "</" + tag + ">"
    try:
        i = line.index(opener)
        start = i + len(opener)
        j = line.index(closer, start)
        return line[start:j]
    except ValueError:
        return None

In [10]:
def extract_from_tag(tag, line):
    opener = "<" + tag + ">"
    closer = "</" + tag + ">"
    i = line.find(opener)
    if i != -1:
        start = i + len(opener)
        j = line.find(closer, start)
        if j != -1:
            return line[start:j]
    return None

In [11]:
record = "Leo Tolstoy*1828-8-28*1910-11-20"
fields = record.split("*")
fields

['Leo Tolstoy', '1828-8-28', '1910-11-20']

In [13]:
born = fields[1].split("-")
print(born)

died = fields[2].split("-")
print("lived about", int(died[0]) - int(born[0]), "years")

['1828', '8', '28']
lived about 82 years


The str.maketrans() method is used to
create a translation table which maps characters to characters. It accepts one,
two, or three arguments, but we will show only the simplest (two argument)
call where the first argument is a string containing characters to translate from
and the second argument is a string containing the characters to translate to.

Both arguments must be the same length. The str.translate() method takes
a translation table as an argument and returns a copy of its string with the
characters translated according to the translation table. Here is how we could
translate strings that might contain Bengali digits to English digits:

In [14]:
table = "".maketrans("\N{bengali digit zero}"
    "\N{bengali digit one}\N{bengali digit two}"
    "\N{bengali digit three}\N{bengali digit four}"
    "\N{bengali digit five}\N{bengali digit six}"
    "\N{bengali digit seven}\N{bengali digit eight}"
    "\N{bengali digit nine}", "0123456789")
    
print("20749".translate(table)) # prints: 20749
print("\N{bengali digit two}07\N{bengali digit four}"
    "\N{bengali digit nine}".translate(table)) # prints: 20749

20749
20749


The str.maketrans() and str.translate() methods can also be used
to delete characters by passing a string containing the unwanted characters as
the third argument to str.maketrans().
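
As a sketch of the three-argument form, the first call below only deletes characters (the first two arguments are empty strings), while the second both replaces and deletes. The sample strings are invented.

```python
vowels = str.maketrans("", "", "aeiou")
no_vowels = "translation table".translate(vowels)
print(no_vowels)                     # trnsltn tbl

tidy = str.maketrans("-", " ", "()")
phone = "(555) 123-4567".translate(tidy)
print(phone)                         # 555 123 4567
```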

String Formatting with the str.format() Method
-
The str.format() method provides a very flexible and powerful way of creating
strings. Using str.format() is easy for simple cases, but for complex formatting
we need to learn the formatting syntax the method requires.

The str.format() method returns a new string with the replacement fields in
its string replaced with its arguments suitably formatted. For example:

In [17]:
"The novel '{0}' was published in {1}".format("Hard Times", 1854)

"The novel 'Hard Times' was published in 1854"

If we need to include braces inside format strings, we can do so by doubling
them up. Here is an example:


In [18]:
"{{{0}}} {1} ;-}}".format("I'm in braces", "I'm not")


"{I'm in braces} I'm not ;-}"

If we try to concatenate a string and a number, Python will quite rightly raise
a TypeError. But we can easily achieve what we want using str.format():

In [19]:
"{0}{1}".format("The amount due is $", 200)

'The amount due is $200'

One other point to note is that replacement fields can contain replacement
fields. Nested replacement fields cannot have any formatting; their purpose is
to allow for computed formatting specifications. We will see an example of this
in the Format Specifications section.

Field Names
-
A field name can be either an integer corresponding to one of the str.format()
method’s arguments, or the name of one of the method’s keyword arguments.
We discuss keyword arguments in Chapter 4, but they are not difficult, so we
will provide a couple of examples here for completeness:

In [22]:
"{who} turned {age} this year".format(who="She", age=88)

'She turned 88 this year'

In [25]:
"The {who} was {0} last week".format(12, who="boy")

'The boy was 12 last week'

Notice that in an argument list, keyword
arguments always come after positional arguments; and of course we can make
use of any arguments in any order inside the format string.

Field names may refer to collection data types—for example, lists. In such
cases we can include an index (not a slice!) to identify a particular item:

In [26]:
stock = ["paper", "envelopes", "notepads", "pens", "paper clips"]
"We have {0[1]} and {0[2]} in stock".format(stock)


'We have envelopes and notepads in stock'

Later on we will learn about Python dictionaries. These store key–value items,
and since they can be used with str.format(), we’ll just show a quick example
here. Don’t worry if it doesn’t make sense; it will once you’ve read Chapter 3:

In [28]:
d = dict(animal="elephant", weight=12000)

"The {0[animal]} weighs {0[weight]}kg".format(d)


'The elephant weighs 12000kg'

We can also access named attributes with dot notation. Assuming we have imported the math and
sys modules, we can do this:

In [30]:
import math
import sys
"math.pi=={0.pi} sys.maxunicode=={1.maxunicode}".format(math, sys)

'math.pi==3.141592653589793 sys.maxunicode==1114111'

So in summary, the field name syntax allows us to refer to positional and keyword
arguments that are passed to the str.format() method. If the arguments
are collection data types like lists or dictionaries, or have attributes, we can access
the part we want using [] or . notation.

From Python 3.1 it is possible to omit field names, in which case Python will in
effect put them in for us, using numbers starting from 0. For example:

In [31]:
"{} {} {}".format("Python", "can", "count")


'Python can count'

The local variables that are currently in scope are available from the built-in
locals() function. This function returns a dictionary whose
keys are local variable names and whose values are references to the variables’ values. Now
we can use mapping unpacking to feed this dictionary into the str.format()
method. The mapping unpacking operator is ** and it can be applied to a
mapping (such as a dictionary) to produce a key–value list suitable for passing
to a function. For example:

In [34]:
element = "Silver"
number = 47
"Element {number} is {element}".format(**locals())


'Element 47 is Silver'

Unpacking a dictionary into the str.format() method allows us to use the
dictionary’s keys as field names. This makes string formats much easier to
understand, and also easier to maintain, since they are not dependent on the
order of the arguments. Note, however, that if we want to pass more than one
argument to str.format(), any mapping unpacking must come after the positional
arguments.
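
A sketch of combining a positional argument with an unpacked dictionary; the dictionary and its contents are invented for the example.

```python
animal = {"name": "elephant", "weight": 12000}   # made-up data

# The positional argument comes first, the unpacked mapping last.
text = "{0}: the {name} weighs {weight}kg".format("Fact", **animal)
print(text)    # Fact: the elephant weighs 12000kg
```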

Conversions
-
It is possible to override a data type’s normal behavior and force it to provide either its string or its representational form. This is done by adding a conversion specifier to the field.

Currently there
are three such specifiers: s to force string form, r to force representational form, and a to force representational form but only using ASCII characters. Here is
an example:

In [37]:
import decimal
"{0} {0!s} {0!r} {0!a}".format(decimal.Decimal("93.4"))

"93.4 93.4 Decimal('93.4') Decimal('93.4')"

In [47]:
movie = '\u7ffb\u8a33\u3067\u5931\u308f\u308c\u308b'

In [48]:
"{movie}".format(**locals())

'翻訳で失われる'

In [49]:
"{movie!a}".format(**locals())

"'\\u7ffb\\u8a33\\u3067\\u5931\\u308f\\u308c\\u308b'"

Format Specifications
-
For strings, the things that we can control are the fill character, the alignment
within the field, and the minimum and maximum field widths.

A string format specification is introduced with a colon (:) and this is followed
by an optional pair of characters—a fill character (which may not be }) and an
alignment character (< for left align, ^ for center, > for right align). Then comes
an optional minimum width integer, and if we want to specify a maximum
width, this comes last as a period followed by an integer.

In [51]:
s = "The sword of truth"
"{0}".format(s) # default formatting

'The sword of truth'

In [53]:
"{0:25}".format(s) # minimum width 25

'The sword of truth       '

In [54]:
"{0:>25}".format(s) # right align, minimum width 25

'       The sword of truth'

In [55]:
"{0:^25}".format(s) # center align, minimum width 25

'   The sword of truth    '

In [56]:
"{0:-^25}".format(s) # - fill, center align, minimum width 25

'---The sword of truth----'

In [57]:
"{0:.10}".format(s) # maximum width 10

'The sword '

As we noted earlier, it is possible to have replacement fields inside format specifications.
This makes it possible to have computed formats. Here, for example,
are two ways of setting a string’s maximum width using a maxwidth variable:

In [58]:
maxwidth = 12
"{0}".format(s[:maxwidth])

'The sword of'

In [59]:
"{0:.{1}}".format(s, maxwidth)

'The sword of'

Formatting integers
-
The format specification allows us to control the fill character, the
alignment within the field, the sign, whether to use a nonlocale-aware comma
separator to group digits (from Python 3.1), the minimum field width, and the
number base.

The optional sign character: + forces the output of the sign, - outputs the sign only for negative numbers, and a space outputs a space for positive numbers and a - sign for
negative numbers. 

Then comes an optional minimum width integer—this can
be preceded by a # character to get the base prefix output (for binary, octal, and
hexadecimal numbers), and by a 0 to get 0-padding.

Then, from Python 3.1,
comes an optional comma; if present this will cause the number’s digits to be
grouped into threes with a comma separating each group.

If we want the output
in a base other than decimal we must add a type character—b for binary,
o for octal, x for lowercase hexadecimal, and X for uppercase hexadecimal, although
for completeness, d for decimal integer is also allowed.
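
The pieces described above can be tried out one at a time. The following sketch uses two arbitrary numbers, and the expected results are shown as comments.

```python
n = 14613198

signed = "{0:+d}".format(n)          # force the sign
spaced = "{0: d}".format(42)         # space for positive numbers
padded = "{0:012}".format(n)         # 0-padding to a minimum width of 12
grouped = "{0:,}".format(n)          # comma grouping
filled = "{0:*>+8}".format(42)       # * fill, right align, sign, width 8
bases = "{0:b} {0:o} {0:x} {0:X}".format(255)
prefixed = "{0:#b} {0:#o} {0:#x} {0:#X}".format(255)

print(signed)      # +14613198
print(spaced)      #  42
print(padded)      # 000014613198
print(grouped)     # 14,613,198
print(filled)      # *****+42
print(bases)       # 11111111 377 ff FF
print(prefixed)    # 0b11111111 0o377 0xff 0XFF
```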

Example: print_unicode.py
=