#### Knowledge Sharing Content
# <center> Basic Data Structures - Python
#### [Bhanu Pratap Singh](https://www.linkedin.com/in/bpst/)

## Numerical Data Types

The two most important numerical data types are the `integer` and the `float`. An `integer` is a whole number (positive, negative, or zero) without a fractional part (for example, 5). A `float` is a positive or negative number with floating-point precision (for example, 3.14159265359). Python offers a wide variety of built-in numerical operations, as well as functionality to convert between those numerical data types.

In [1]:
# Arithmetic Operations
x, y = 3, 2
print(x + y) # = 5
print(x - y) # = 1
print(x * y) # = 6
print(x / y) # = 1.5
print(x // y) # = 1
print(x % y) # = 1
print(-x) # = -3
print(abs(-x)) # = 3
print(int(3.9)) # = 3
print(float(x)) # = 3.0
print(x ** y) # = 9

5
1
6
1.5
1
1
-3
3
3
3.0
9


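Note that the `//` operator performs floor division: the result is rounded down to the nearest whole number (for example, `3 // 2 == 1`). With two integer operands the result is an integer; if either operand is a float, the result is a float (for example, `3.0 // 2 == 1.0`).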
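As a small sketch of how `//` behaves beyond the positive-integer case (the values and variable names below are chosen purely for illustration), note that "rounded down" means rounded toward negative infinity, which differs from the truncation performed by `int()`:

```python
# Floor division rounds toward negative infinity, not toward zero
pos = 7 // 2        # 3
neg = -7 // 2       # -4 (not -3)
flt = 7.0 // 2      # 3.0, a float because one operand is a float
trunc = int(-3.5)   # -3, int() truncates toward zero instead

print(pos, neg, flt, trunc)
```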

## Booleans

A variable of type Boolean can take only two values—either False or True.

In Python, Boolean and integer data types are closely related: the Boolean data type internally uses integer values (by default, the Boolean value False is represented by integer 0, and the Boolean value True is represented by integer 1).
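The relationship between Booleans and integers can be checked directly. The following is a minimal sketch; the `temperatures` list is hypothetical sample data used only to show one practical consequence, namely counting matches with `sum()`.

```python
# bool is a subclass of int, so True and False behave like 1 and 0
print(isinstance(True, int))    # True
print(True == 1, False == 0)    # True True
print(True + True)              # 2

# A common use: counting how many items satisfy a condition
temperatures = [12, 25, 31, 8]  # hypothetical sample values
warm_days = sum(t > 20 for t in temperatures)
print(warm_days)                # 2
```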

In [3]:
x = 1 > 2
print(x) # False

y = 2 > 1
print(y) # True

False
True


After evaluating the given expressions, variable x refers to the Boolean value False, and variable y refers to the Boolean value True.

#### Keywords: and, or, not

Boolean values can be combined with logical operators. Using only the following three keywords, we can craft a wide variety of potentially complicated expressions:

<b>`and`</b> - the expression `x and y` evaluates to `True` if value `x` is `True` <b><i>and</i></b> value `y` is `True`. If either of those is `False`, the overall expression becomes `False` too.

<b>`or`</b> - the expression `x or y` evaluates to `True` if value `x` is `True` <b><i>or</i></b> value `y` is `True` (or both values are `True`). If even just one of those is `True`, the overall expression becomes `True` too.

<b>`not`</b> - the expression `not x` evaluates to `True` if value `x` is `False`. Otherwise, the expression evaluates to `False`.

In [4]:
x, y = True, False

print((x or y) == True) # True
print((x and y) == False) # True
print((not y) == True) # True

True
True
True


#### Boolean Operator Precedence

The order in which Boolean operators are applied is an important aspect of understanding Boolean logic. For example, consider the natural language statement `it rains and it's cold or windy`. We can interpret this in two ways:

* `(it rains and it's cold) or windy` In this case, the statement would be `True` if it is windy — even if it doesn’t rain.

* `it rains and (it's cold or windy)` In this case, however, the statement would be `False` if it doesn’t rain — no matter whether it’s cold or windy.

The order of Boolean operators matters. In Python, the correct interpretation of this statement would be the first one because the `and` operator takes precedence over the `or` operator.

In [5]:
# Boolean Operations
x, y = True, False

print(x and not y) # True
print(not x and y or x) # True

True
True


In [6]:
# If condition evaluates to False
if None or 0 or 0.0 or '' or [] or {} or set():
    print("Dead code") # Not reached

This code shows two important points.
* First, Boolean operators are ordered by priority: the operator `not` has the highest priority, followed by the operator `and`, followed by the operator `or`.
* Second, the following values are automatically evaluated to `False`: the keyword `None`, the integer value `0`, the float value `0.0`, empty strings, and empty container types.
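Both points can be verified with a short sketch. The weather variables below are hypothetical values mirroring the natural-language example; explicit parentheses make each reading visible.

```python
rains, cold, windy = False, False, True   # hypothetical weather values

# `and` binds tighter than `or`, so these two expressions are equivalent
implicit = rains and cold or windy
explicit = (rains and cold) or windy

# Parentheses force the other reading
other = rains and (cold or windy)

print(implicit, explicit, other)          # True True False

# bool() shows which values count as false in a condition
falsy_values = [None, 0, 0.0, '', [], {}, set()]
print([bool(v) for v in falsy_values])    # seven times False
```

When precedence is not obvious at a glance, adding parentheses costs nothing and documents the intent.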

## Strings

Python strings are sequences of characters. Strings are immutable and so cannot be changed after creation. While other ways to create strings exist, these are the five most commonly used:

**Single quotes** `'Yes'`

**Double quotes** `"Yes"`

**Triple quotes for multiline strings** `'''Yes'''` or `"""Yes"""`

**The `str()` function** `str(5) == '5'` evaluates to `True`

**Concatenation** `'Py' + 'thon'` becomes `'Python'`

Often, we explicitly want to use whitespace characters in strings. The most frequently used whitespace characters are the newline character `\n`, the space character `' '`, and the tab character `\t`.
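A brief sketch of how these whitespace characters behave inside a string; the sample text is made up for illustration. `repr()` is handy for making otherwise invisible characters visible.

```python
text = "name:\tAda\nrole:\tengineer"   # hypothetical sample text

print(text)             # the tab and newline are rendered as whitespace
print(repr(text))       # repr() shows the escape sequences literally

parts = text.split()    # with no argument, split on any run of whitespace
print(parts)            # ['name:', 'Ada', 'role:', 'engineer']

print(len("\n"), len("\t"), len(" "))   # 1 1 1, each is a single character
```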

In [7]:
# Most Important String Methods
y = "    This is lazy\t\n   "

print(y.strip())
# Remove Whitespace: 'This is lazy'

print("DrDre".lower())
# Lowercase: 'drdre'

print("attention".upper())
# Uppercase: 'ATTENTION'

print("smartphone".startswith("smart"))
# Matches the string's prefix against the argument: True

print("smartphone".endswith("phone"))
# Matches the string's suffix against the argument: True

print("another".find("other"))
# Match index: 2

print("cheat".replace("ch", "m"))
# Replaces all occurrences of the first by the second argument: meat

print(','.join(["F", "B", "I"]))
# Glues together all elements in the list using the separator string: F,B,I

print(len("Rumpelstiltskin"))
# String length: 15

print("ear" in "earth")
# Contains: True

This is lazy
drdre
ATTENTION
True
True
2
meat
F,B,I
15
True


This non-exhaustive list of string methods shows that the string data type is powerful, and we can solve many common string problems with built-in Python functionality.

To learn more about built-in string methods, [click here](https://docs.python.org/3/library/stdtypes.html#string-methods).

#### The Keyword None

The keyword `None` is a Python constant and it means `the absence of a value`. Other programming languages such as Java use the value `null` instead. However, the term `null` often confuses beginners, who assume it’s equal to the integer value `0`.

Instead, Python uses the keyword `None` to indicate that it’s different from any numerical value for zero, an empty list, or an empty string.

An interesting fact is that the value `None` is the only value in the `NoneType` data type.

In [9]:
def f():
   x = 2

print(f() is None)
# True

print("" == None)
# False

print(0 == None)
# False

True
False
False


This code shows several examples of the `None` data value (and what it is not). If we don’t define a return value for a function, the default return value is `None`.
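The sketch below illustrates why `is None` is the usual way to test for a missing result. The helper `find_first_even` is a hypothetical function written only for this example.

```python
def find_first_even(numbers):
    """Return the first even number, or None if there is none (hypothetical helper)."""
    for n in numbers:
        if n % 2 == 0:
            return n
    # No explicit return here, so the function returns None


result = find_first_even([1, 3, 5])
print(result is None)    # True
print(type(None))        # <class 'NoneType'>

# `is None` tells "no result" apart from a falsy result such as 0
zero = find_first_even([0, 1])
print(zero is None)      # False
print(not zero)          # True, because 0 is falsy
```

Testing with `if not result:` would treat the valid result `0` like a missing one, which is why the explicit comparison is generally preferred.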

## Container Data Structures

### Lists

The `list` is a container data type that stores a sequence of elements. Unlike strings, lists are `mutable` — we can modify them at runtime.

In [10]:
l = [1, 2, 2]
print(len(l))
# 3

3


This code snippet shows how to create a list by using square brackets and how to populate it with three integer elements. We can also see that lists can have repeated elements. The `len()` function returns the number of elements in a list.

#### Keyword: is

The keyword `is` simply checks whether both variables refer to the same object in memory. 

Let's check whether two integers and two lists refer to the same object in memory.

In [11]:
y = x = 3

print(x is y)
# True

print([3] is [3])
# False

True
False


If we create two lists — even if they contain the same elements — they still refer to two different list objects in memory.

Modifying one list object does not affect the other list object. We say that lists are `mutable` because we can modify them after creation. Because the two lists are separate objects, checking `[3] is [3]` yields `False`, even though `[3] == [3]` is `True`.

In contrast, `x` and `y` were assigned the same integer object in a single statement, so `x is y` is `True`. Sharing is harmless here because integer values are `immutable`: there is no risk of one variable changing the object and thereby accidentally changing all other variables. We cannot change an integer object; an operation such as `x + 1` creates a new integer object and leaves the old one unmodified.
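To make the difference between identity (`is`) and equality (`==`) concrete, here is a minimal sketch with illustrative variable names. It also shows what happens when two names refer to the same mutable list.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal contents, but a separate object
c = a           # a second name for the same object

print(a == b, a is b)   # True False
print(a == c, a is c)   # True True

c.append(4)     # mutating through c is visible through a
print(a)        # [1, 2, 3, 4]
print(b)        # [1, 2, 3]
```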

#### Adding Elements

Python provides three common ways to add elements to an existing list: `append`, `insert`, or  `list concatenation`.

In [12]:
# 1. Append
l = [1, 2, 2]
l.append(4)
print(l)
# [1, 2, 2, 4]

[1, 2, 2, 4]


In [13]:
# 2. Insert
l = [1, 2, 4]
l.insert(2, 3)
print(l)
# [1, 2, 3, 4]

[1, 2, 3, 4]


In [14]:
# 3. List Concatenation
print([1, 2, 2] + [4])
# [1, 2, 2, 4]

[1, 2, 2, 4]


The `append` and list concatenation examples both produce the list `[1, 2, 2, 4]`, while the `insert` example places the element `3` at index 2. Of the three, the `append` operation is usually the fastest because it neither has to shift existing elements to make room at the insertion position (as with `insert`), nor create a new list out of two lists (as with `list concatenation`).

We use the insert operation only if we want to add an element at a specific position in the list that is not the last position. And we use the list concatenation operation to concatenate two lists of arbitrary length. 

Note that a fourth method, `extend()`, allows us to append multiple elements to the given list in an efficient manner.
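A short sketch of how `extend()` compares with concatenation and `append()`; the variable names are illustrative only.

```python
l = [1, 2]
same_object = l

l.extend([2, 4])          # adds every element of the argument, in place
print(l)                  # [1, 2, 2, 4]
print(same_object is l)   # True, still the same list object

m = l + [5]               # concatenation builds a new list
print(m is l)             # False

l.append([6, 7])          # append adds its argument as ONE element
print(l)                  # [1, 2, 2, 4, [6, 7]]
```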

#### Removing Elements

We can remove the first occurrence of an element `x` from a `list` by using the list method `remove(x)`.

In [15]:
l = [1, 2, 2, 4]
l.remove(1)
print(l)
# [2, 2, 4]

[2, 2, 4]


The method operates on the list object itself, rather than creating a new list with the changes made. Here we create a list object named `l` and modify this exact object in memory by removing an element. This saves memory overhead by reducing redundant copies of the same list data.
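Two details of `remove()` are worth seeing in action: it deletes only the first matching element, and it raises a `ValueError` if the element is absent. A minimal sketch:

```python
l = [1, 2, 2, 4]
l.remove(2)              # removes only the first 2
print(l)                 # [1, 2, 4]

try:
    l.remove(99)         # 99 is not in the list
except ValueError:
    print("99 not found")

# Guarding with `in` avoids the exception
if 4 in l:
    l.remove(4)
print(l)                 # [1, 2]
```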

#### Reversing Lists

We can reverse the order of list elements by using the method `list.reverse()`.

In [16]:
l = [1, 2, 2, 4]
l.reverse()
print(l)
# [4, 2, 2, 1]

[4, 2, 2, 1]


Reversing the list also modifies the original list object and does not merely create a new list object.

#### Sorting Lists

We can sort `list` elements by using the method `list.sort()`.

In [18]:
l = [2, 1, 4, 2]
l.sort()
print(l)
# [1, 2, 2, 4]

[1, 2, 2, 4]


Again, sorting the list modifies the original list object. The resulting list is sorted in an ascending manner. Lists containing string objects would be sorted in an ascending lexicographical manner (from `'a'` to `'z'`). 

In general, the sorting function assumes that any two elements can be compared. Python's sort relies on the `<` operator, so if we can calculate `a < b` for objects `a` and `b`, Python can also sort the list `[a, b]`.
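The following sketch shows a few sorting variations using the built-in `sorted()` function and the `key` and `reverse` parameters. The sample lists are made up for illustration.

```python
words = ["pear", "apple", "fig"]
words.sort()                       # in place, lexicographical order
print(words)                       # ['apple', 'fig', 'pear']

# sorted() returns a new list and leaves the original untouched
numbers = [2, 1, 4, 2]
descending = sorted(numbers, reverse=True)
print(descending)                  # [4, 2, 2, 1]
print(numbers)                     # [2, 1, 4, 2]

# A key function changes what is compared, here the string length
by_length = sorted(words, key=len)
print(by_length)                   # ['fig', 'pear', 'apple']

# Elements that cannot be compared with each other raise a TypeError
try:
    [1, "a"].sort()
except TypeError:
    print("cannot compare int and str")
```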

#### Indexing List Elements

We can find out the index of a specified list element `x` by using the method `list.index(x)`.

In [19]:
print([2, 2, 4].index(2))
# 0

print([2, 2, 4].index(2,1))
# 1

0
1


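The method `index(x)` finds the first occurrence of the element `x` in the list and returns its index. An optional second argument sets the position at which the search starts, so `index(2, 1)` finds the first `2` at or after index 1. Like other major programming languages, Python assigns index 0 to the first element and index `i-1` to the `i`-th element.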
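A compact sketch of zero-based indexing and of what `index()` does when the element is missing:

```python
l = [2, 2, 4]

print(l.index(2))       # 0, first occurrence
print(l.index(2, 1))    # 1, search starts at index 1
print(l.index(4))       # 2
print(l[0], l[-1])      # 2 4, first and last element

try:
    l.index(7)          # 7 is not in the list
except ValueError:
    print("7 is not in the list")
```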

### Dictionaries
The `dictionary` is a useful data structure for storing (key, value) pairs. Here is one simple example:

In [2]:
# Dictionary
calories = {'apple' : 52, 'banana' : 89, 'choco' : 546}
calories

{'apple': 52, 'banana': 89, 'choco': 546}

We can read and write elements by specifying the key within brackets.

In [3]:
# Comparison
print(calories['apple'] < calories['choco'])

True


In [4]:
# Add new key to dictionary
calories['cappu'] = 74
calories

{'apple': 52, 'banana': 89, 'choco': 546, 'cappu': 74}

In [5]:
# Comparison
print(calories['banana'] < calories['cappu'])

False
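Reading a key that does not exist raises a `KeyError`. The sketch below repeats the `calories` dictionary so that it runs on its own, and shows the `get()` method as one way to supply a fallback; the key `'kiwi'` is a made-up example.

```python
calories = {'apple': 52, 'banana': 89, 'choco': 546}

calories['apple'] = 55             # writing to an existing key overwrites it
print(calories['apple'])           # 55

# Reading a missing key with brackets raises KeyError
try:
    calories['kiwi']
except KeyError:
    print("no entry for kiwi")

# get() returns None, or a default of our choice, instead of raising
print(calories.get('kiwi'))        # None
print(calories.get('kiwi', 0))     # 0
```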


Use the `keys()` and `values()` methods to access all keys and values of the dictionary.

In [6]:
# Using key
print('apple' in calories.keys())

True


In [7]:
# Using value
print(52 in calories.values())

True


Access the `(key, value)` pairs of a dictionary with the `items()` method.

In [8]:
# Print every key whose value exceeds 500
for k, v in calories.items():
    if v > 500:
        print(k)

choco


#### Membership
Use the keyword **`in`** to check whether a set, list, or dictionary contains an element. For a dictionary, `in` checks the keys.

In [9]:
print(42 in [2, 39, 42])

True


In [10]:
print("21" in {"2", "39", "42"})

False


In [11]:
print("list" in {"list" : [1, 2, 3], "set" : {1,2,3}})

True


We say **x** is a member of **y** if element **x** appears in the collection **y**.

Checking set membership is faster than checking list membership: to check whether element `x` appears in list `y`, Python has to traverse the list until it finds `x` or has checked all elements. Sets, however, are implemented as hash tables, much like dictionaries: to check whether element `x` appears in set `y`, Python computes `hash(x)` and looks directly in the corresponding slot of the table, which on average takes constant time regardless of the size of the set.
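The speed difference can be observed with the standard `timeit` module. This is only a rough sketch: the collection size and repetition count are arbitrary choices, and the measured times depend on the machine, so no expected timings are shown.

```python
import timeit

n = 100_000                       # arbitrary size chosen for illustration
data_list = list(range(n))
data_set = set(data_list)
missing = -1                      # worst case for the list: every element is checked

list_time = timeit.timeit(lambda: missing in data_list, number=100)
set_time = timeit.timeit(lambda: missing in data_set, number=100)
print(f"list: {list_time:.4f}s  set: {set_time:.6f}s")

# For dictionaries, `in` looks at the keys; use values() to search the values
d = {"list": [1, 2, 3]}
print("list" in d)                # True
print([1, 2, 3] in d.values())    # True
```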

### List and Set Comprehension

List comprehension is a popular Python feature that helps you quickly create and modify lists. 

The simple formula is **[ expression + context ]**

**Expression** tells Python what to do with each element in the list.

**Context** tells Python which list elements to select. The context consists of one or more `for` clauses and any number of `if` clauses.

For example, in the list comprehension statement below,

In [12]:
[x for x in range(3)]

[0, 1, 2]

the first part, `x`, is the expression, and the second part, `for x in range(3)`, is the context. The statement creates the list `[0, 1, 2]`. When used with one argument, as in the example above, the `range()` function returns the consecutive integer values 0, 1, and 2.

Let's look at another code example for list comprehension:

In [13]:
# (name, $-income)
customers = [("John", 240000),
             ("Alice", 120000),
             ("Ann", 1100000),
             ("Zach", 44000)]

# Our high-value customers earning >$1M
whales = [x for x, y in customers if y>1000000]
print(whales)

['Ann']


Set comprehension is like list comprehension, but creates a set rather than a list.
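As a minimal sketch, a set comprehension uses curly braces instead of square brackets. The `words` list is hypothetical sample data. The second part shows a list comprehension whose context combines two `for` clauses with an `if` clause.

```python
words = ["apple", "Avocado", "banana", "blueberry", "cherry"]   # hypothetical data

# Set comprehension: duplicates collapse automatically
first_letters = {w[0].lower() for w in words}
print(sorted(first_letters))       # ['a', 'b', 'c']

# List comprehension with two for clauses and one if clause
pairs = [(x, y) for x in range(3) for y in range(3) if x < y]
print(pairs)                       # [(0, 1), (0, 2), (1, 2)]
```

Because sets are unordered, the example sorts the result before printing to obtain stable output.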

## Lambdas

We use the keyword `lambda` to define lambda functions in Python. Lambda functions are anonymous functions that are not defined in the namespace. They are functions without names, intended for single use.

Lambda function syntax 

**`lambda <arguments> : <return expression>`**

A lambda function can have one or multiple arguments, separated by commas. After the colon (:), we define the return expression that may (or may not) use the defined argument. The return expression can be any expression or even another function.

In [14]:
print((lambda x: x + 3)(3))

6


In [16]:
func = lambda x: x + 3
print(func(3))

6


In [17]:
# Parentheses alone do not create or call a function: (3) is just the integer 3
(3)

3

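In the first cell, we create a lambda function that takes a value `x` and returns the result of the expression `x + 3`, and we call it immediately with the argument 3. A lambda expression produces a function object that can be called like any other function, so in the second cell we assign it to the name `func` and call it through that name.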
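Lambda functions are often passed directly to another function. The sketch below uses one as the `key` argument of the built-in `sorted()`; the `customers` data is sample data for illustration, and the names `add` and `constant` are chosen only for this example.

```python
customers = [("John", 240000), ("Alice", 120000), ("Ann", 1100000)]  # sample data

# Sort by the second tuple element (income) using a lambda as the key
by_income = sorted(customers, key=lambda c: c[1])
print(by_income[0])      # ('Alice', 120000)

# A lambda with several arguments
add = lambda x, y: x + y
print(add(2, 3))         # 5

# A lambda without arguments whose expression ignores any input
constant = lambda: 42
print(constant())        # 42
```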

## Stacks

The `stack` data structure works intuitively as a last-in, first-out (LIFO) structure: the element added most recently is the first one to be removed.

Python lists can be used intuitively as stacks with the list operations `append()` to add to the stack and `pop()` to remove the most recently added item.

In [19]:
stack = [3]
stack.append(42) # [3, 42]
stack.pop() # 42 (stack: [3])
stack.pop() # 3 (stack: [])

3

Because of the efficiency of the list implementation, there is usually no need to import external stack libraries.
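A slightly longer sketch of a list used as a stack: the most recently appended item is always the first one removed. The item values are arbitrary.

```python
stack = []
for item in ["a", "b", "c"]:
    stack.append(item)        # push onto the top of the stack

print(stack[-1])              # 'c', peek at the top without removing it

popped = []
while stack:                  # an empty list is falsy, so this stops when empty
    popped.append(stack.pop())

print(popped)                 # ['c', 'b', 'a'], reverse of insertion order
print(len(stack))             # 0
```

Calling `pop()` on an empty list raises an `IndexError`, which is why the loop checks the stack before each `pop()`.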

## Sets

The set data structure is a basic collection data type in Python and many other programming languages. Popular frameworks for distributed computing (for example, MapReduce or Apache Spark) even focus almost exclusively on set operations as programming primitives. So what is a set exactly? A set is an unordered collection of unique elements. Let’s break this definition into its main pieces.

### Collection

A set is a collection of elements like a list or a tuple. The collection consists of either primitive elements (integers, floats, strings), or complex elements (objects, tuples). 

However, all elements in a set must be `hashable`, meaning that they have an associated hash value. The hash value of an object never changes during its lifetime; Python uses it to locate the object quickly and, together with an equality check, to compare it to other objects.

Let’s look at an example, which creates a set from three strings after checking their hash values.

In [21]:
hero = "Harry"
guide = "Dumbledore"
enemy = "Lord V."
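print(hash(hero))
# For example: 6175908009919104006 (string hashes differ between interpreter runs)

print(hash(guide))
# For example: -5197671124693729851

# Can we create a set of strings?
characters = {hero, guide, enemy}
print(characters)
# For example: {'Lord V.', 'Dumbledore', 'Harry'} (the order may vary)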

-9147996449234916283
568780430997353507
{'Dumbledore', 'Harry', 'Lord V.'}


In [22]:
# Can we create a set of lists?
team_1 = [hero, guide]
team_2 = [enemy]
teams = {team_1, team_2}

TypeError: unhashable type: 'list'

We cannot create a set of lists because lists are not hashable.

The reason is that the hash value depends on the content of the item, and lists are `mutable`; if we change the contents of a list, its hash value would have to change too. Because mutable built-in data types such as lists are not hashable, we cannot use them in sets.
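One common workaround, sketched below, is to use immutable counterparts: tuples instead of lists, or `frozenset` when the order of members should not matter. The variables are repeated so that the snippet runs on its own.

```python
hero, guide, enemy = "Harry", "Dumbledore", "Lord V."

# Tuples are immutable and hashable (as long as their elements are)
teams = {(hero, guide), (enemy,)}
print(len(teams))                  # 2
print((enemy,) in teams)           # True

# frozenset ignores member order, so these two teams count as one
frozen_teams = {frozenset([hero, guide]), frozenset([guide, hero])}
print(len(frozen_teams))           # 1
```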

### Unordered

Unlike lists, elements in a set have no fixed order. Regardless of the order in which we insert elements into the set, we can never be sure in which order the set stores them. Here is an example:

In [23]:
characters = {hero, guide, enemy}
print(characters)

{'Dumbledore', 'Harry', 'Lord V.'}


I put in the hero first, but my interpreter prints the guide first.

### Unique

All elements in the set must be unique. Formally, any two distinct elements x and y in the set satisfy x != y. Python uses hash values to find candidate matches quickly and then an equality check to decide whether a value is already present. Because every two elements x and y in the set are different, we cannot create an army of Harry Potter clones to fight Lord V.

In [25]:
clone_army = {hero, hero, hero, hero, hero, enemy}
print(clone_army)

{'Harry', 'Lord V.'}


No matter how often we put the same value into the same set, the set stores only one instance of this value. The reason is that those heroes are equal (and therefore also have the same hash value), and a set keeps at most one element from any group of equal values.
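The same behavior holds for values of different types that compare equal. In the sketch below, `1`, `1.0`, and `True` are equal and share a hash value, so the set keeps only one of them. The `votes` list is made-up sample data showing deduplication, a frequent practical use.

```python
numbers = {1, 1.0, True}
print(len(numbers))                          # 1
print(1 == 1.0 == True)                      # True
print(hash(1) == hash(1.0) == hash(True))    # True

# Deduplicating a list by converting it to a set
votes = ["Harry", "Lord V.", "Harry", "Harry"]   # hypothetical sample data
unique_votes = set(votes)
print(len(unique_votes))                     # 2
```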