# 2. Built-in Data Types

- Every object has an identity (ID), a type, and a value.

## (1) Mutability

 i. Example of an immutable object:

In [1]:
age = 42
print(id(age))

age = 43
print(id(age))

4378226536
4378226568


So, in fact, we did not change 42 to 43. We just rebound the name age to a different object: a new int whose value is 43, stored at a different location.

ii. Example of a mutable object (Set):

In [3]:
numbers = set()
print(id(numbers))
print(numbers)
numbers.add(3)
numbers.add(7)

print(id(numbers))
print(numbers)


4449336256
set()
4449336256
{3, 7}
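The difference between the two cases is easiest to see when two names refer to the same object. The following is a minimal sketch; the names same_age and alias are illustrative and not part of the original notes.

```python
# Rebinding a name that refers to an immutable object leaves other names untouched.
age = 42
same_age = age           # both names refer to the same int object
age = 43                 # rebinds `age`; the object 42 itself is unchanged
print(same_age)          # 42

# Mutating a mutable object is visible through every name that refers to it.
numbers = {3, 7}
alias = numbers          # both names refer to the same set object
numbers.add(11)          # mutates the set in place
print(alias is numbers)  # True
print(sorted(alias))     # [3, 7, 11]
```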


## (2) Numbers

Numbers are immutable objects.

 - Integer: positive, negative, or 0.
 - Boolean: True (1) or False (0).
 - Float: real numbers in double precision.

### i. Integers

In [7]:
# Division operators on integer objects
a = 14
b = 3
print(a/b) # True division: always returns a float
print(a//b) # Floor division: the result is always rounded toward minus infinity
print(int(a/b)) # Truncation toward zero

c = -7
d = 4
print(c/d)
print(c//d)
print(int(c/d))

4.666666666666667
4
4
-1.75
-2
-1
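Floor division pairs with the modulo operator: for integers, the quotient is floored and the remainder takes the sign of the divisor. The sketch below checks that relationship on a few sample pairs chosen for illustration and contrasts it with truncation.

```python
import math

# For integers, a == (a // b) * b + (a % b) holds for negative operands too.
for a, b in [(14, 3), (-7, 4), (7, -4)]:
    quotient, remainder = divmod(a, b)   # same as (a // b, a % b)
    assert a == quotient * b + remainder
    print(a, b, quotient, remainder, math.trunc(a / b))

# Output:
# 14 3 4 2 4
# -7 4 -2 1 -1
# 7 -4 -2 -1 -1
```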


In [9]:
print(int('10110', base=2))
print(int('1011000110', base=2))

22
710


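- pow() with three arguments performs modular exponentiation and gives the same result as (37 ** 2) % 43:

In [14]:
print(pow(37, 2, 43))
(37**(2))%43

36


36

- We can use '_' to make the number more readable:

In [15]:
1_024 * 100_000

102400000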

### ii. Booleans

In [20]:
print(int(True))
print(int(False))
print(bool(24))
print(bool(1))
print(bool(0))
print(34+True) # Python upcasts these boolean values to integers and performs the addition
print(42-False)
print(54-True)

1
0
True
True
False
35
42
53


**Upcasting** is a type conversion operation that goes from a subclass to its parent. In this example, True and False, which belong to a class derived from the integer class, are converted back to integers when needed. This topic is about **inheritance** and will be explained in detail in Chapter 6, OOP, Decorators, and Iterators.
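The subclass relationship can be checked directly. This short sketch also shows a common consequence, counting with booleans; the temperatures list is made-up sample data.

```python
# bool is a subclass of int, so True and False behave like 1 and 0 in arithmetic.
print(issubclass(bool, int))   # True
print(isinstance(True, int))   # True
print(True == 1, False == 0)   # True True

# A common consequence: summing booleans counts how many are True.
temperatures = [18, 25, 31, 22]               # hypothetical sample data
hot_days = sum(t > 24 for t in temperatures)
print(hot_days)                               # 2
```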

### iii. Floats

Real numbers, or floating point numbers, are represented in Python according to the **IEEE 754 double-precision binary floating point format**, which stores them in 64 bits of information divided into three sections: **sign, exponent, and mantissa**.

Several programming languages offer two different formats: single and double precision. The former takes up 32 bits of memory, the latter 64. Python supports only the double format. 
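To make the three sections concrete, the sketch below unpacks the 64 bits of a float with the standard struct module. The helper float_fields is hypothetical and written only for this illustration.

```python
import struct

def float_fields(x):
    """Split a float into its IEEE 754 sign, exponent, and mantissa bit fields."""
    # Pack as a big-endian 64-bit double, then read the same 8 bytes as an integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                    # 1 bit
    exponent = (bits >> 52) & 0x7FF      # 11 bits, stored with a bias of 1023
    mantissa = bits & ((1 << 52) - 1)    # 52 explicitly stored bits
    return sign, exponent, mantissa

print(float_fields(1.0))    # (0, 1023, 0)
print(float_fields(-2.0))   # (1, 1024, 0)
print(float_fields(0.5))    # (0, 1022, 0)
```

Only 52 mantissa bits are stored; the leading bit of a normalized number is implicit, which is why sys.float_info reports mant_dig=53.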

In [22]:
pi = 3.1415926536  # how many digits of PI can you remember?
radius = 4.5
area = pi * (radius ** 2)
area

63.617251235400005

In [24]:
import sys
sys.float_info

sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)

In [25]:
0.3 - 0.1 * 3

-5.551115123125783e-17
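The tiny non-zero result above comes from binary rounding: 0.1 and 0.3 have no exact binary representation. One way to cope, sketched here with the standard library only, is to compare floats with a tolerance or to switch to an exact numeric type.

```python
import math
from decimal import Decimal
from fractions import Fraction

# Binary floats cannot represent 0.1 or 0.3 exactly, so compare with a tolerance.
print(0.1 * 3 == 0.3)              # False
print(math.isclose(0.1 * 3, 0.3))  # True

# Decimal and Fraction do exact arithmetic when built from strings or integers.
print(Decimal("0.3") - Decimal("0.1") * 3)    # 0.0
print(Fraction(3, 10) - Fraction(1, 10) * 3)  # 0
```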

## (3) Immutable sequences : Strings, Tuples, and Bytes

### i. Strings and Bytes

- **Strings**: Textual data in Python is handled with str objects, more commonly known as strings. They are immutable sequences of Unicode code points. Unicode code points are the numbers assigned to each character in the Unicode standard, which is a universal character encoding scheme used to represent text in computers.
Unlike some other languages, Python does not have a char type, so a single character is represented by a string of length 1.
Encoding a string produces a bytes object, whose syntax and behavior are similar to those of strings.
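A short sketch of these points, using the built-ins ord() and chr() to move between a character and its code point. The character is written as an escape so the example does not depend on the file's encoding.

```python
char = "\u00fc"                 # the character ü, written as an escape
print(type(char), len(char))    # <class 'str'> 1
print(ord(char))                # 252, the Unicode code point of ü
print(chr(252) == char)         # True

# Encoding maps code points to bytes; in UTF-8, ü takes two bytes.
encoded = char.encode("utf-8")
print(encoded, len(encoded))    # b'\xc3\xbc' 2
print(encoded[0])               # 195: indexing a bytes object yields an integer
```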


- str() and repr():

In [30]:
print(str("what \nis\nthe difference?"))
print(repr("what \nis\nthe difference?"))

 # the Python repr() function returns a printable representation of the object by converting that object to a string.

what 
is
the difference?
'what \nis\nthe difference?'


- String methods added in Python 3.9: removeprefix() and removesuffix()

In [42]:
s = "Hello There"
print(s.removeprefix("Hell"))
print(s.removesuffix("here"))

o There
Hello T


- Unicode encoding and other encodings, and Bytes objects

In [40]:
s = "This is üŋíc0de string."
print(type(s))
encoded_s = s.encode("utf-8")
print(encoded_s)
print(type(encoded_s))
print(encoded_s.decode("utf-8"))
bytes_obj = b"A bytes object" # We can create a bytes object by prefixing a string literal with 'b'
print(type(bytes_obj))

<class 'str'>
b'This is \xc3\xbc\xc5\x8b\xc3\xadc0de string.'
<class 'bytes'>
This is üŋíc0de string.
<class 'bytes'>


- Indexing and slicing strings

Indexing comes in one form, zero-based access to any position within the sequence, but slicing comes in several forms.

The general form is my_sequence[start:stop:step]. All the arguments are optional; start is inclusive, and stop is exclusive.

In [None]:
s = "The trouble is you think you have time."
print(s[0])
print(s[:4])
print(s[4:])
print(s[2:14])
print(s[2:14:2])
print(s[:])
print(s[::-1]) # Reversed copy of a string using slicing

T
The 
trouble is you think you have time.
e trouble is
etobei
The trouble is you think you have time.
.emit evah uoy kniht uoy si elbuort ehT


- String formatting

One useful feature of strings is that they can be used as templates. This means that they can contain placeholders that can be replaced by arbitrary values using formatting operations. There are several ways of formatting a string. 

In [52]:
greet_old = "Hello %s!"
print("1. "+greet_old % 'Fabrizio')

greet_positional = "Hello {}!"
print("2. "+greet_positional.format("Fabrizio"))

greet_positional = "Hello {} {}!"
print("3. "+greet_positional.format("Fabrizio", "Romano"))

greet_positional_idx = "This is {0}! {1} loves {0}!"
print("4. "+greet_positional_idx.format("Python", "Heinrich"))
print("5. "+greet_positional_idx.format("Coffee", "Fab"))

keyword = "Hello, my name is {name} {last_name}"
print("6. "+keyword.format(name="Fabrizio", last_name="Romano"))

1. Hello Fabrizio!
2. Hello Fabrizio!
3. Hello Fabrizio Romano!
4. This is Python! Heinrich loves Python!
5. This is Coffee! Fab loves Coffee!
6. Hello, my name is Fabrizio Romano


Formatted string literals (f-strings) embed replacement fields directly in the string. Replacement fields are expressions evaluated at runtime, and then formatted using the format protocol:

In [53]:
name = "Fab"
age = 48
f"Hello! My name is {name} and I'm {age}"

"Hello! My name is Fab and I'm 48"

An interesting addition to f-strings, which was introduced in Python 3.8, is the ability to add an equal sign specifier within the f-string clause; this causes the expression to expand to the text of the expression, an equal sign, then the representation of the evaluated expression. 

In [55]:
user = "heinrich"
password = "super-secret"
print(f"Log in with: {user} and {password}")
print(f"Log in with: {user=} and {password=}")

Log in with: heinrich and super-secret
Log in with: user='heinrich' and password='super-secret'
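Replacement fields can also carry a format specification after a colon, which is where the format protocol comes into play. The sketch below shows a few common specifications; the population value is made up for illustration.

```python
pi = 3.1415926536
population = 1234567   # hypothetical value

print(f"{pi:.2f}")          # 3.14       (two decimal places)
print(f"{population:,}")    # 1,234,567  (thousands separator)
print(f"{42:>6}|")          #     42|    (right-aligned in a width of 6)
print(f"{255:#x}")          # 0xff       (hexadecimal with prefix)

# An f-string delegates to the same protocol as the built-in format().
print(format(pi, ".2f") == f"{pi:.2f}")   # True
```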


In [56]:
languages = ["Python", "Javascript"]
print(", ".join(languages))
# Reusing double quotes inside the replacement field requires Python 3.12 or later
f"Two very popular languages: {", ".join(languages)}"

Python, Javascript


'Two very popular languages: Python, Javascript'

In [57]:
# Backslashes inside a replacement field are also only allowed from Python 3.12
f"Who knew f-strings could be so powerful? {"\N{shrug}"}"

'Who knew f-strings could be so powerful? 🤷'

### ii. Tuples

The last immutable sequence type we are going to look at here is the tuple. A tuple is a sequence of arbitrary Python objects. In a tuple declaration, items are separated by commas. 

Tuples are used everywhere in Python. 

Sometimes tuples are used without parentheses: for example, to set up multiple variables on one line, or to allow a function to return multiple objects (in several languages, a function can only return one object). In the Python console, tuples can also be used implicitly to print multiple elements with a single instruction.
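As an illustration of returning multiple objects, here is a minimal sketch. The function min_max is a hypothetical helper, not something from the original notes.

```python
def min_max(values):
    """Return the smallest and largest value as a 2-tuple (hypothetical helper)."""
    return min(values), max(values)   # the comma builds a tuple; no parentheses needed

result = min_max([4, 9, 1, 7])
print(result)              # (1, 9)
print(type(result))        # <class 'tuple'>

lowest, highest = min_max([4, 9, 1, 7])   # unpack the returned tuple
print(lowest, highest)     # 1 9
```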

In [58]:
t = ()
type(t)

tuple

In [63]:
one_element_tuple = (42,)
a, b, c = 1, 2, 3 # tuple for multiple assignment
print(a)
print(a,b,c)
3 in one_element_tuple

1
1 2 3


False

We use the in operator to check whether a value is a member of a tuple. This membership operator can also be used with lists, strings, and dictionaries, and with collection and sequence objects, in general.

In [64]:
# One-line swap, the Pythonic way:
a, b = 1, 5
a, b = b, a
a, b

(5, 1)

Because they are immutable, tuples can be used as keys for dictionaries (we will see this shortly).
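A small sketch of tuples as dictionary keys, assuming a made-up grid of coordinates. It also shows that a list, being mutable, is rejected as a key.

```python
# Grid coordinates as dictionary keys (illustrative data).
grid = {(0, 0): "origin", (1, 2): "treasure"}
print(grid[(1, 2)])        # treasure
print((0, 0) in grid)      # True

# A list is mutable and therefore unhashable, so it cannot be a key.
try:
    grid[[3, 4]] = "oops"
except TypeError as err:
    print("TypeError:", err)
```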

To us, tuples are Python’s built-in data type that most closely represents a mathematical vector. This does not mean that this is why they were created, though.

Tuples usually contain a heterogeneous sequence of elements while, on the other hand, lists are, most of the time, homogeneous. 

Moreover, tuples are normally accessed via unpacking or indexing, while lists are usually iterated over.

## (4) Mutable sequences : Lists and Bytearrays

### i. Lists

In [65]:
list((1, 3, 5, 7, 9))  # list from a tuple

[1, 3, 5, 7, 9]

In [66]:
list("hello") # list from a string

['h', 'e', 'l', 'l', 'o']

- Main list methods:

In [80]:
a = [1, 2, 1, 3]
a.append(13)
print(a)
print(a.count(1))
a.extend([57, 5])
print(a)
print(a.index(57))  # position of `57` in the list (0-based indexing)
a.insert(3, 17)
print(a)
print(a.pop())
print(a)
print(a.pop(3))
print(a)
a.remove(13)
print(a)
a.reverse()
print(a)
a.sort()
print(a)
a.clear()
print(a)

[1, 2, 1, 3, 13]
2
[1, 2, 1, 3, 13, 57, 5]
5
[1, 2, 1, 17, 3, 13, 57, 5]
5
[1, 2, 1, 17, 3, 13, 57]
17
[1, 2, 1, 3, 13, 57]
[1, 2, 1, 3, 57]
[57, 3, 1, 2, 1]
[1, 1, 2, 3, 57]
[]


In [87]:
a = [1,3,5]
b = [11,22]
print(a+b)
print(a*2)
from math import prod
print(prod(a))
print(sum(a))

[1, 3, 5, 11, 22]
[1, 3, 5, 1, 3, 5]
15
9


In [89]:
from operator import itemgetter
a = [(5, 3), (1, 3), (1, 2), (2, -1), (4, 9)]
print(sorted(a))
print(sorted(a, key=itemgetter(0)))
print(sorted(a, key=itemgetter(0, 1)))
print(sorted(a, key=itemgetter(1)))
print(sorted(a, key=itemgetter(1), reverse=True))

[(1, 2), (1, 3), (2, -1), (4, 9), (5, 3)]
[(1, 3), (1, 2), (2, -1), (4, 9), (5, 3)]
[(1, 2), (1, 3), (2, -1), (4, 9), (5, 3)]
[(2, -1), (1, 2), (5, 3), (1, 3), (4, 9)]
[(4, 9), (5, 3), (1, 3), (1, 2), (2, -1)]


### ii. Bytearrays

Items in a bytearray are integers in the range [0, 256).

In [6]:
name = bytearray(b"Lina")
print(name.replace(b"L", b"l"))
print(name.endswith(b'na'))
print(name.upper())


bytearray(b'lina')
True
bytearray(b'LINA')


## (5) Set types: Set (Mutable), Frozenset (Immutable)

- **Hashability** is a characteristic that allows an object to be used as a set member as well as a key for a dictionary, as we will see very soon. An object is hashable if it has a hash value which never changes during its lifetime, and can be compared to other objects. 
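Hashability can be probed with the built-in hash(). The sketch below wraps it in a hypothetical helper, is_hashable, to show which built-in types qualify.

```python
# Immutable built-ins are hashable; equal objects have equal hashes.
print(hash((1, 2)) == hash((1, 2)))   # True
print(hash("abc") == hash("abc"))     # True

def is_hashable(obj):
    """Return True if hash(obj) succeeds (hypothetical helper)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

print(is_hashable(42), is_hashable("text"), is_hashable((1, 2)))  # True True True
print(is_hashable([1, 2]), is_hashable({1, 2}), is_hashable({}))  # False False False
print(is_hashable((1, [2])))   # False: a tuple is only hashable if its items are
```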

In [20]:
small_primes = set()
small_primes.add(2)
small_primes.add(3)
small_primes.add(5)
small_primes.remove(3)
print(small_primes)
print(7 in small_primes)
bigger_primes = set([5, 7, 11, 13])

# Union operator of Sets:
print(small_primes | bigger_primes)

# intersection operator of Sets
print(small_primes & bigger_primes)

# Difference operator
small_primes - bigger_primes

{2, 5}
False
{2, 5, 7, 11, 13}
{5}


{2}
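The examples above use the mutable set. A frozenset supports the same set algebra but cannot be changed after creation, which makes it hashable. A minimal sketch, with illustrative values:

```python
small_primes = frozenset([2, 3, 5])

# Set algebra works as with set, and the result is again a frozenset.
print(small_primes & {3, 5, 7} == frozenset({3, 5}))   # True

# There are no mutating methods such as add() or remove().
print(hasattr(small_primes, "add"))    # False

# Being hashable, a frozenset can be a set member or a dictionary key.
groups = {frozenset([2, 3]), frozenset([5, 7])}
print(frozenset([3, 2]) in groups)     # True
```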

## (6) Mapping types: Dictionaries

The only standard mapping type, the dictionary is the backbone of every Python object.

Here are five ways to create the same dictionary.

In [None]:
a = dict(A=1, Z=-1)
b = {"A": 1, "Z": -1}
c = dict(zip(["A", "Z"], [1, -1]))
d = dict([("A", 1), ("Z", -1)])
e = dict({"Z": -1, "A": 1})
a == b == c == d == e  # are they all the same?

True

The **is operator** checks whether two names refer to the same object (that is, **that they have the same ID, not just the same value**). Unless you have a good reason to use it, you should use the double equals (==) instead.
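The difference between identity and equality is sketched below with two equal but distinct lists. Testing for None is shown as one of the few cases where an identity check is the usual idiom.

```python
a = [1, 2, 3]
b = [1, 2, 3]    # a distinct list with the same contents
c = a            # another name for the same list

print(a == b)    # True: same value
print(a is b)    # False: different objects
print(a is c)    # True: same object

# Identity comparison is the idiomatic way to test for the None singleton.
value = {"x": 1}.get("y")
print(value is None)   # True
```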

In [6]:
list(zip('hello',range(0,6)))

[('h', 0), ('e', 1), ('l', 2), ('l', 3), ('o', 4)]

In [12]:
d = {}
d["a"] = 1
d["b"] = 2
print(len(d))
print(d["a"])
print(d)
d["c"] =3
print("c" in d)
print(3 in d)
d.clear()
d

2
1
{'a': 1, 'b': 2}
True
False


{}

Membership is checked against the keys, not the values.

In [20]:
d = dict(zip("hello", range(5)))
print(d.keys())
print(d.items())
print(3 in d.values())
print(('o',4) in d.items())

dict_keys(['h', 'e', 'l', 'o'])
dict_items([('h', 0), ('e', 1), ('l', 3), ('o', 4)])
True
True


In [None]:
d.popitem() # Removes and returns the last inserted (key, value) pair

('o', 4)

In [22]:
d

{'h': 0, 'e': 1, 'l': 3}

In [None]:
d.pop("l") # Removes the item with the key "l"

3

In [25]:
d

{'h': 0, 'e': 1}

In [27]:
d.pop("not-a-key", "default-value")

'default-value'

In [28]:
d.update({"another": "value"})

In [29]:
d

{'h': 0, 'e': 1, 'another': 'value'}

In [30]:
d.update(a=13)

In [31]:
d

{'h': 0, 'e': 1, 'another': 'value', 'a': 13}

In [32]:
d.get("a")

13

In [33]:
d

{'h': 0, 'e': 1, 'another': 'value', 'a': 13}

In [34]:
d.get("a",117)

13

In [35]:
d.get("b",177)

177

In [36]:
d.get("b")

In [39]:
d = {}
print(d.setdefault("a",1))
print(d)
print(d.setdefault("a",6))
print(d)

1
{'a': 1}
1
{'a': 1}


In [40]:
d = {}
print(d.setdefault("a",{}).setdefault("b",[]).append(1))
print(d)

None
{'a': {'b': [1]}}


In [44]:
d = {"a": "A", "b": "B"}
e = {"b": 8, "c": "C"}
d|e

{'a': 'A', 'b': 8, 'c': 'C'}

In [45]:
e|d

{'b': 'B', 'c': 'C', 'a': 'A'}

In [46]:
{**d, **e}

{'a': 'A', 'b': 8, 'c': 'C'}

In [47]:
{**e, **d}

{'b': 'B', 'c': 'C', 'a': 'A'}

In [49]:
d |=e
d

{'a': 'A', 'b': 8, 'c': 'C'}

## (7) Data types

https://docs.python.org/3/library/datatypes.html

### i. Dates and times

Standard library modules for dates and times in Python: *datetime*, *calendar*, *zoneinfo*, *time*

#### (a) Dates 

In [51]:
from datetime import date, datetime, timedelta, timezone, UTC
import time
import calendar as cal
from zoneinfo import ZoneInfo

In [52]:
today = date.today()
today

datetime.date(2025, 3, 27)

In [53]:
today.ctime()

'Thu Mar 27 00:00:00 2025'

In [54]:
today.isoformat()

'2025-03-27'

In [55]:
today.weekday()

3

In [56]:
cal.day_name[today.weekday()]

'Thursday'

In [57]:
today.day, today.month, today.year

(27, 3, 2025)

In [58]:
today.timetuple()

time.struct_time(tm_year=2025, tm_mon=3, tm_mday=27, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=3, tm_yday=86, tm_isdst=-1)

#### (b) Times

In [59]:
time.ctime()

'Thu Mar 27 15:34:00 2025'

In [60]:
time.daylight

0

In [61]:
time.gmtime()

time.struct_time(tm_year=2025, tm_mon=3, tm_mday=27, tm_hour=7, tm_min=34, tm_sec=26, tm_wday=3, tm_yday=86, tm_isdst=0)

In [62]:
time.localtime()

time.struct_time(tm_year=2025, tm_mon=3, tm_mday=27, tm_hour=15, tm_min=34, tm_sec=47, tm_wday=3, tm_yday=86, tm_isdst=0)

In [63]:
time.time()

1743060906.036994

#### (c) Datetime

In [65]:
now = datetime.now()
utcnow = datetime.now(UTC)
now

datetime.datetime(2025, 3, 27, 15, 38, 16, 583649)

In [66]:
utcnow

datetime.datetime(2025, 3, 27, 7, 38, 16, 583675, tzinfo=datetime.timezone.utc)

In [67]:
now.date()

datetime.date(2025, 3, 27)

In [68]:
now.day, now.month, now.year

(27, 3, 2025)

In [69]:
now.date() == date.today()

True

In [70]:
now.time()

datetime.time(15, 38, 16, 583649)

In [71]:
now.hour, now.minute, now.second, now.microsecond

(15, 38, 16, 583649)

In [72]:
now.ctime()

'Thu Mar 27 15:38:16 2025'

In [73]:
now.isoformat()

'2025-03-27T15:38:16.583649'

In [74]:
now.timetuple()

time.struct_time(tm_year=2025, tm_mon=3, tm_mday=27, tm_hour=15, tm_min=38, tm_sec=16, tm_wday=3, tm_yday=86, tm_isdst=-1)

In [77]:
now.tzinfo

In [78]:
utcnow.tzinfo

datetime.timezone.utc

In [79]:
now.weekday()

3

Date and time objects may be categorized as *aware* if they include time zone information, or *naïve* if they don’t.
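A minimal sketch of the distinction, using fixed example dates and timezone.utc so that it does not depend on the local clock or on the system's time zone database:

```python
from datetime import datetime, timezone

naive = datetime(2025, 3, 27, 15, 38)
aware = datetime(2025, 3, 27, 15, 38, tzinfo=timezone.utc)

print(naive.tzinfo is None)   # True: no time zone attached
print(aware.utcoffset())      # 0:00:00

# Ordering comparisons between naive and aware objects are not allowed.
try:
    naive < aware
except TypeError as err:
    print("TypeError:", err)

# replace() attaches a time zone without shifting the clock time.
print(naive.replace(tzinfo=timezone.utc) == aware)   # True
```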

In [90]:
import zoneinfo
print(type(zoneinfo.available_timezones()))
[x for x in zoneinfo.available_timezones() if "Europe/Paris" in x]

<class 'set'>


['Europe/Paris']

In [95]:
f_bday = datetime(1987, 12, 17, 8, 30, tzinfo = ZoneInfo('Asia/Seoul'))
h_bday = datetime(1985, 8, 30, 7, 30, tzinfo = ZoneInfo('Europe/Paris'))
diff =  f_bday - h_bday

In [96]:
type(diff)

datetime.timedelta

In [99]:
diff.days/365

2.2958904109589042

In [98]:
diff.total_seconds()

72468000.0

In [100]:
today + timedelta(days = 49)

datetime.date(2025, 5, 15)

In [101]:
now + timedelta(weeks = 7)

datetime.datetime(2025, 5, 15, 15, 38, 16, 583649)

Third-party libraries for Date and Time:

- **dateutil**: Powerful extensions to datetime (https://dateutil.readthedocs.io/)
- **Arrow**: Better dates and times for Python (https://arrow.readthedocs.io/)
- **Pendulum**: Python datetimes made easy (https://pendulum.eustace.io/)
- **Maya**: Datetimes for Humans™ (https://github.com/kennethreitz/maya)
- **Delorean**: Time Travel Made Easy (https://delorean.readthedocs.io/)
- **pytz**: World time zone definitions for Python (https://pythonhosted.org/pytz/)

In [104]:
import arrow
arrow.utcnow()

<Arrow [2025-03-27T07:54:37.373525+00:00]>

In [105]:
arrow.now()

<Arrow [2025-03-27T15:54:49.374667+08:00]>

In [107]:
local = arrow.now("Asia/Hong_Kong")
local

<Arrow [2025-03-27T15:55:19.062709+08:00]>

In [108]:
local.to("utc")

<Arrow [2025-03-27T07:55:19.062709+00:00]>

In [109]:
local.to("Europe/Moscow")

<Arrow [2025-03-27T10:55:19.062709+03:00]>

In [110]:
local.to("Asia/Seoul")

<Arrow [2025-03-27T16:55:19.062709+09:00]>

In [111]:
local.datetime

datetime.datetime(2025, 3, 27, 15, 55, 19, 62709, tzinfo=tzfile('/usr/share/zoneinfo/Asia/Hong_Kong'))

In [112]:
local.isoformat()

'2025-03-27T15:55:19.062709+08:00'

### ii. The collections module

*namedtuple(), deque, ChainMap, Counter, OrderedDict, defaultdict, UserDict, UserList, UserString*

#### (a) **namedtuple**: a subclass of tuple, indexable and iterable, accessible by attribute lookup.

In [113]:
from collections import namedtuple
Vision = namedtuple('Vision', ['left', 'right'])
vision = Vision(9.5, 8.8)

In [115]:
vision[0]

9.5

In [116]:
vision.left

9.5

In [117]:
vision.right

8.8

In [118]:
Vision = namedtuple('Vision', ['left', 'combined', 'right'])
vision = Vision(9.5, 9.2, 8.8)
vision.left

9.5

In [119]:
vision.right

8.8

In [120]:
vision.combined

9.2

#### (b) **defaultdict**

In [123]:
d = {}
d["age"] = d.get("age", 0) + 1
d

{'age': 1}

In [124]:
d = {"age":39}
d["age"] = d.get("age", 0) +1
d

{'age': 40}

In [125]:
from collections import defaultdict
dd = defaultdict(int)
dd["age"] += 1
dd

defaultdict(int, {'age': 1})

#### (c) **ChainMap**

A ChainMap behaves like a normal dictionary but, according to the Python documentation, **is provided for quickly linking a number of mappings so they can be treated as a single unit.** This is usually much faster than creating one dictionary and running multiple update calls on it.

In [None]:
from collections import ChainMap
default_connection = {'host': 'localhost', 'port':4567}
connection = {'port': 5678}
conn = ChainMap(connection, default_connection)
print(conn['port'])
print(conn['host'])
print(conn.maps)
conn['host'] = 'packtpub.com'
print(conn.maps)
del conn['port']
print(conn.maps)
print(conn['port'])
dict(conn)

5678
localhost
[{'port': 5678}, {'host': 'localhost', 'port': 4567}]
[{'port': 5678, 'host': 'packtpub.com'}, {'host': 'localhost', 'port': 4567}]
[{'host': 'packtpub.com'}, {'host': 'localhost', 'port': 4567}]
4567


{'host': 'packtpub.com', 'port': 4567}

### iii. Enumerations

Enumeration: a set of symbolic names (members) bound to unique, constant values. Within an enumeration, the members can be compared by identity, and the enumeration itself can be iterated over.
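The two properties just mentioned, comparison by identity and iteration, can be sketched with a small enumeration. Color is a hypothetical enum used only for this illustration.

```python
from enum import Enum

class Color(Enum):   # hypothetical enumeration for illustration
    RED = 1
    GREEN = 2
    BLUE = 3

# Members are singletons, so identity comparison is reliable.
print(Color(1) is Color.RED)          # True
print(Color["GREEN"] is Color.GREEN)  # True
print(Color.RED == 1)                 # False: a plain Enum member is not its value

# Iterating over the class yields members in definition order.
print([member.name for member in Color])   # ['RED', 'GREEN', 'BLUE']
```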

In [136]:
GREEN = 1
YELLOW = 2
RED = 4
TRAFFIC_LIGHTS = (GREEN, YELLOW, RED)
print(TRAFFIC_LIGHTS)
traffic_lights = {"GREEN": 1, "YELLOW": 2, "RED": 4}

(1, 2, 4)


In [137]:
from enum import Enum
class TrafficLight(Enum):
    GREEN = 1
    YELLOW = 2
    RED = 4

TrafficLight.GREEN

<TrafficLight.GREEN: 1>

In [138]:
TrafficLight.GREEN.name

'GREEN'

In [139]:
TrafficLight.GREEN.value

1

In [140]:
TrafficLight(1)

<TrafficLight.GREEN: 1>

In [141]:
TrafficLight(4)

<TrafficLight.RED: 4>

## (8) Final Consideration 

**Object interning** is a memory optimization technique that is used primarily for immutable data types, such as strings and integers in Python. The idea is to reuse existing objects instead of creating new ones every time an object with the same value is required.

This can lead to significant memory savings and performance improvements because it reduces the load on the garbage collector and speeds up comparisons since they can be done by comparing object identities.

In [142]:
# Object interning: CPython caches small integers, so a and b refer to the same object
a = 5
b = 5
id(a) == id(b)

True
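Which values are interned automatically is an implementation detail, so code should not rely on it when comparing values. For strings, interning can be requested explicitly with sys.intern, as in this sketch, where the strings are built at runtime on purpose:

```python
import sys

# Build two equal strings at runtime so the result does not depend on
# compile-time constant sharing.
parts = ["inter", "ning"]
a = "".join(parts)
b = "".join(parts)
print(a == b)    # True: equal values

# sys.intern returns the canonical copy from the interpreter's table of interned strings.
a_interned = sys.intern(a)
b_interned = sys.intern(b)
print(a_interned is b_interned)   # True: one shared object
```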

What about performance? For example, in a list, operations such as insertion and membership testing can take O(n) time (big O notation), while they are O(1) on average for a dictionary. It is not always possible to use dictionaries, though: we need the guarantee that each item of the collection can be uniquely identified by one of its properties, and that the property in question is hashable (so it can be a key in a dict).
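The gap can be observed with the standard timeit module. This is a rough sketch rather than a rigorous benchmark: the collection size and repetition count are arbitrary, and the timings will vary from machine to machine.

```python
import timeit

n = 100_000                       # illustrative collection size
as_list = list(range(n))
as_dict = dict.fromkeys(as_list)  # same items, stored as dictionary keys
target = n - 1                    # worst case for the list: the last element

list_time = timeit.timeit(lambda: target in as_list, number=100)
dict_time = timeit.timeit(lambda: target in as_dict, number=100)

# Both containers give the same answer; only the cost differs.
print((target in as_list) == (target in as_dict))   # True
print(f"list: {list_time:.5f}s  dict: {dict_time:.5f}s")
```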

## (9) About Names

- Names for data should be nouns.
- Names for functions should be verbs.
- Names should be as expressive as possible.