In statically types languages like Java, the data type is also associated with the variable. Python is dynamically typed. A variable in Python is just a reference, with no type attached. Hence the same variable can be reassigned to another data of different data type.
a = "hello"
print(type(a)) # <class 'str'>
a = 3 + 4j
print(type(a)) # <class 'complex'>
When variable is reassigned, the content is not changed. A new block is is created with the new value, and it reference is stored in the variable. This happens for integer
a = 10
print(hex(id(a))) # 0x10084eb50
a = 15
print(hex(id(a))) # 0x10084eb10 - address changed
a = 10 + 5
print(hex(id(a))) # 0x10084eb10
An object whose internal state can be changed, is called mutable
. While, the object whose internal state cannot be changed,
is called immutable
. For example, I have a class, with two attributes. I change the value of one the attributes. The address
of the object still remains the same. Hence, the object is mutable. In Python,
- Immutable: numbers(int, float, bool, etc), strings, tuples, frozen sets, user-defined classes(explicit mentions)
- Mutable: lists, sets, dictionaries, user-defined classes
There is a catch here. Suppose we have a tuple of integers. Tuples are immutable and the same goes for integers. Now, suppose we have a tuple with lists as element. I can modify the list items, and this will not change the reference of the list, as lists are mutable.
int
is used to store integers. Everything is represented in base 2 in the memory. If the system is n bits, the range of
numbers that can be stored will be [-2^n, 2^n -1]. The int
object uses a variable number of bits. It is limited by the
program's available memory. As the size increases, the standard operations (+,-,*, etc) computation time will also increase.
If the int()
constructor is based, it will truncate/typecast values to fit the concept of integer. The constructor takes a
keyword argument base
, used to define the base of the number. If base argument is used, the value should be in string type.
To convert base10 values to other standard bases, Python provides built-in functions like hex()
, oct()
, bin()
.
Refer number_base.py for conversion of base10 numbers to other bases.
import sys
import time
print(sys.getsizeof(0)) # 24 (bytes)
print(sys.getsizeof(1)) # 28 (bytes)
print(sys.getsizeof(2 ** 1000)) # 160 (bytes)
def calc(a):
for _ in range(10000000):
a * 2
start = time.perf_counter()
calc(10)
end = time.perf_counter()
print(end - start) # 0.556221267
start = time.perf_counter()
calc(12 ** 1000)
end = time.perf_counter()
print(end - start) # 2.373679042
print(int("ff", 16)) # 255
# print(int("B", 11)) # ValueError
print(oct(10)) # 0o12
Resultant of operations on int will be: (+,-,,*,//(floor division), %) -> int, (/) -> float. Floor division is not the same as truncation.
fractions
module can be used to represent fractions. From that module, Fraction
class is used which takes numerator
and denominator. By default, it maintains the sign with the numerator. Numerator and Denominator can be of type int, float,
string, decimal. Operations performed on fractions return fraction as result. If irrational numbers are passed as arguments,
fractions module finds an approximation in form of fraction. limit_denominator
function can be used to limit the denominator
value.
import math
from fractions import Fraction
print(repr(Fraction(1))) # Fraction(1, 1)
print(repr(Fraction(denominator=8, numerator=4))) # Fraction(1, 2)
print(repr(Fraction(denominator=-98, numerator=4))) # Fraction(-2, 49)
print(repr(Fraction('0.125'))) # Fraction(1, 8)
print(repr(Fraction('22/7'))) # Fraction(22, 7)
x = Fraction(2, 3)
y = Fraction(3, 4)
print(x.denominator) # 3
print(y.numerator) # 3
print(repr(x + y)) # Fraction(17, 12)
print(repr(x * y)) # Fraction(1, 2)
x = Fraction(math.pi)
print(repr(x)) # Fraction(884279719003555, 281474976710656)
print(repr(x.limit_denominator(10))) # Fraction(22, 7)
print(repr(x.limit_denominator(100))) # Fraction(311, 99)
print(repr(x.limit_denominator(500))) # Fraction(355, 113)
float
is used for representing real numbers. They have a fixed size i.e. usually 8 bytes. Sign takes 1 bit, exponent
takes 11 bits and the remaining is used by significant digits. (1234.5 -> 5 significant digits, 1234500000-> 5 significant
digits, 0.00012345 -> 5 significant digits). Irrational numbers cannot be represented in their exact form. Numbers after
decimal point are also converted to base2 while storing in memory. Certain decimal values are finite in base10, but an
infinite binary representation. For example, 0.1, 0.3, etc. Hence they cannot be stored precisely in the memory. There will
be an approximate float that is stored. This is a problem with all languages.
from fractions import Fraction
print(float(10)) # 10.0
print(float('12.5')) # 12.5
# print(float('22/7')) # ValueError: could not convert string to float: '22/7'
print(float(Fraction(22, 7))) # 3.142857142857143
print(format(0.1, '.15f')) # 0.100000000000000
print(format(0.1, '.25f')) # 0.1000000000000000055511151
print(format(0.125, '.25f')) # 0.1250000000000000000000000
print(format(0.125, '.25f')) # 0.1250000000000000000000000
Addition of approx float representations with also be an approximate, even if the expected result is not supposed to be
an approximation. For equality check, rounding can be used, say till 5 decimal points. This might not be very effective.
You can use tolerance value to find equality. This can be done using floating absolute in math module i.e. fabs()
. Tolerance
can be absolute or relative. For relative tolerance, tolerance = rel_tolerance(represented as percentage) * max(|x|,|y|)
However, relative tolerance does not work well for numbers very close to 0. Hence, a combination of absolute and relative
tolerance is recommended. tol = max(rel_tol * max(|x|, |y|), abs_tol)
. The details of this is mentioned in PEP 485. This
is taken care by math.isclose
. Use this method for equality checks.
from math import isclose
x, y = 0.125 + 0.125 + 0.125, 0.375
print(x == y) # True
x, y = 0.1 + 0.1 + 0.1, 0.3
print(x == y) # False
print(format(x, '.25f')) # 0.3000000000000000444089210
print(format(y, '.25f')) # 0.2999999999999999888977698
print(round(x, 3) == round(y, 3)) # True
print(isclose(x, y)) # True
a, b = 123456789.01, 123456789.02
x, y = 0.01, 0.02
print(isclose(a, b, rel_tol=0.01)) # True
print(isclose(x, y, rel_tol=0.01)) # False
x, y = 0.0000001, 0.0000002
print(isclose(x, y, rel_tol=0.01)) # False
print(isclose(x, y, rel_tol=0.01, abs_tol=0.01)) # True
print(isclose(a, b, rel_tol=0.01, abs_tol=0.0001)) # True
When float is converted to integer, there is a data loss. However, we can control how this data loss will take place.
Truncation returns only the integer portion. int
uses truncation for floats. Floor is the largest integer less
than or equal to the number. (floor and trunc is same for +ve numbers). Ceiling is the smallest integer greater than
or equal to the number.
from math import trunc, floor, ceil
print(trunc(10.3), trunc(10.5), trunc(10.8), trunc(-10.5)) # 10 10 10 -10
print(floor(10.3), floor(10.5), floor(10.8), floor(-10.5)) # 10 10 10 -11
print(ceil(10.3), ceil(10.5), ceil(10.8), ceil(-10.3)) # 11 11 11 -10
round()
will take two arguments. First argument is the number you want to round, and the second is the point to which it
will be rounded, approximated to the 10^-n place. When there are ties, for example round(1.25, 1)
, the number is rounded
to an even number. Hence the result will be 1.2. This is called banker's rounding. round(15,-1)=20 and round(25,-1)=20
.
This introduces bias to reduce error due to rounding.
print(round(1.9), round(1.9, 0)) # 2 2.0
print(round(1.888, 3), round(1.888, 2), round(1.888, 1)) # 1.888 1.89 1.9
print(round(888.88, 1), round(888.88, 0), round(888.88, -1), round(888.88, -2), round(888.88, -3), round(888.88, -4))
# 888.9 889.0 890.0 900.0 1000.0 0.0
print(round(1.25, 1), round(1.35, 1), round(-1.25, 1), round(-1.35, 1)) # 1.2 1.4 -1.2 -1.4
def _round(x): # rounding away from zero
from math import copysign
return int(x + 0.5 * copysign(1, x))
print(round(1.5), _round(1.5)) # 2 2
print(round(2.5), _round(2.5)) # 2 3
print(round(-1.5), _round(-1.5)) # -2 -2
print(round(-2.5), _round(-2.5)) # -2 -3
Using decimal
module, decimals can be represented.
The boolean values are : True
and False
. They are also integral values.
Lists are similar to arrays. The following operations can be performed on a list.
l = [1,2,3,4,5,6,7,8,9,0]
l.insert(1,3) # (index, value)
l.append(8) #add to the end of list
l.extend([9,10,11]) # add elements from another list
l.remove(6) #removes the first instance of the data in the list
l.pop() # removes last element
l.pop(4) # remove element ar index 4 and return the element
print(l) #prints list
l.sort() # sorts list in ascending order in place
l.reverse() #this does not change the original list. Used to reverse order of elements
del l[5] # remove the sixth element
To access elements in a list, square brackets are used, and the index is laced inside them. E.g. arr[3]
will return the
fourth element of the list. The index values can also be negative. This is used for reverse traversal.
Sorting a nested list based on say nth element of the list elements in list:
l.sort(key = lambda x: x[3])
where n is 3 here. Sorting is done based on 4th element
To print a list, where each element is separated by a common sequence, .join() is used.
l=[1,2,3,4]
print(" ".join(map(str,l))) # 1 2 3 4
A new list can be created from the existing list using slicing. The syntax is as follows:
l = [1,2,3,4,5,6]
print(l[2:5]) #[3,4,5]
The start index is inclusive and end index is exclusive, both separated by a colon(:). If start index section is left empty, slicing starts from the first element. Similarly, in absence of last element, slicing goes till end element.
l = [1,2,3,4,5,6]
print(l[:5:2]) #step size 2. Print [1,3,5]
To create a soft copy of a list: new_list = l
. Any changes made is l
will be reflected in new_list
.
To create a deep copy of the list: new_list = l[:]
. Changes to one list is kept separate from the other.
The map()
function applies a given function to each item of an iterable (list, tuple etc.) and returns a map object of the
results. This result is then typecast to list, set or any data type as required.
def square(n):
return n*n
numbers = [1, 5, 8, 14]
result = map(square, numbers)
print(result) # <map object at 0x7f722da129e8>
numbersSquare = set(result)
print(numbersSquare) # {1, 25, 64, 196}, might be in different order
# A single line substitute for the above code
print(set(map(lambda x: x*x, numbers)))
Passing multiple iterators to lambda function:
num1 = [4, 5, 6]
num2 = [5, 6, 7]
result = list(map(lambda n1, n2: n1+n2, num1, num2))
Strings are immutable. They can be modified only through string function. If character-wise operation is to be performed,
convert the string to list s=list(string)
. After perform operation, convert it back to string "".join(s)
.
.startswith
and endswith
are two methods that can be used with string to check the starting and ending substring
respectively.
s = "Hello There"
print(s.startswith("Hello")) # True
print(s.endswith("era")) # False
a = '''this is multi line
string'''
print(a) # 'this is multi line\nstring'
.split()
function splits the string, returns a list, where elements are decided based on a delimiter.
s="My name is Joker"
print(s.split(' ')) #['My', 'name', 'is','Joker']
print("12-08-2018".split('-')) # ["12","08","2018"]
.strip()
removes trailing whitespaces in the string. It can take one argument to remove characters other tha space, say
removing trailing zeroes will be using s.strip('0
). To strip fro left or right only, lstrip
or rstrip
can be used
respectively.
s = " hello "
print(s.strip()) #hello
To check content and case of string;
string = "Hello World"
len(string) #11
string.isalnum() #True
string.isalpha() #True
string.isdigit() #False
string.isupper() #False
string.islower() #False
string.upper() #HELLO WORLD
string.lower() #hello world
Dictionary implements the concept of hashmap in python. Data is stored in key-value pair. Declared using {}
or dict()
.
This data type is easily convertible to JSON.
d = {"a":1,"b":2,"c":3}
d1 = {}
print(d["a"]) # 1
print(d.get("f", "does not exist")) # does not exist
print(d.keys()) # dict_keys(['a', 'b', 'c'])
print(d.values()) # dict_values([1, 2, 3])
del d["b"]
If a key is being accessed, which does not exist in the dictionary, Python throws KeyError
. To prevent this error,
.get()
function is used. It takes two arguments: the key to accessed and the default value in case the key does not exist.
complex, byte, bytearray, tuples, set, frozenset
NOTE: Searching for membership is faster in sets, as compared to lists and tuples. Hence, membership in dictionary
is also fast.
Everything is an object. Functions are objects of type function. Similarly, class is an instance of type class. Hence, they
will also have memory addresses and you should be able create objects by calling the class All classes will have inbuilt
documentation. It can be viewed using help()
.
a = 10
print(type(a)) # <class 'int'>
b = int(10)
c = int('101', base=2)
print(type(b), type(c)) # <class 'int'> <class 'int'>
def square(a):
return a * a
print(type(square)) # <class 'function'>
help(int)