# Python Data Structures

Data structures organize and store data.
Python has four built-in data structures:
lists, tuples, dictionaries, and sets.

## list

A list is an ordered collection of objects.
Its items are separated by commas,
and the entire list is enclosed in square brackets.
Lists are mutable: you can add, remove, and modify elements.
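
As a minimal sketch of what mutability means in practice, the snippet below adds, modifies, and removes elements of a single list in place. The name `colors` and its values are illustrative only and are not used elsewhere in this notebook.

```python
# Hypothetical example list; the name and values are illustrative only
colors = ['red', 'green', 'blue']

colors.append('yellow')   # add an element to the end
colors[0] = 'crimson'     # modify an element in place
colors.remove('green')    # remove the first matching element

print(colors)  # ['crimson', 'blue', 'yellow']
```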

In [1]:
example_1 = [2, 4, 7]
example_2 = ['Bob', 'John', 'Will']

# a list can hold items of different data types
example_3 = ['Ford', 'Mustang', 1964]

# creating a list
regions = ['Asia', 'America', 'Europe']

Lists are usually populated in a loop, starting from an empty list.
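
A minimal sketch of that pattern, assuming nothing beyond the standard library: start with an empty list and append one item per loop iteration. The name `squares` and the range are illustrative only.

```python
# Start empty, then add one item per iteration
squares = []
for n in range(5):
    squares.append(n * n)

print(squares)  # [0, 1, 4, 9, 16]
```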

In [2]:
# using common list methods
my_list = []

# .append() to add items
my_list.append('Pay bills')
my_list.append('Tidy up')
my_list.append('Walk the dog')
my_list.append('Cook dinner')

# output
print(my_list)
print(my_list[0]) # for first element since python uses zero indexing

['Pay bills', 'Tidy up', 'Walk the dog', 'Cook dinner']
Pay bills


In [3]:
# inserting an item in between two items
i = my_list.index('Cook dinner')
my_list.insert(i, 'Go to the pharmacy')
print(my_list)

# to check how many times an item appears
print(my_list.count('Tidy up'))

# using slice notation
print(my_list[0:3])

['Pay bills', 'Tidy up', 'Walk the dog', 'Go to the pharmacy', 'Cook dinner']
1
['Pay bills', 'Tidy up', 'Walk the dog']


In [4]:
# both start and end indices are optional
print(my_list[:3]) # omitting the start index
print(my_list[3:]) # omitting the end index
print(my_list[:]) # omitting both

['Pay bills', 'Tidy up', 'Walk the dog']
['Go to the pharmacy', 'Cook dinner']
['Pay bills', 'Tidy up', 'Walk the dog', 'Go to the pharmacy', 'Cook dinner']


In [5]:
# using slice notation for appending and inserting
my_list[len(my_list):] = ['Mow the lawn', 'Water plants']
# len returns the number of items in the list
print(my_list)

['Pay bills', 'Tidy up', 'Walk the dog', 'Go to the pharmacy', 'Cook dinner', 'Mow the lawn', 'Water plants']
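
Slice assignment can also insert items in the middle of a list, not only append at the end. The sketch below assigns to an empty slice, which selects no existing items, so nothing is overwritten. The name `tasks` is hypothetical and kept separate from `my_list`.

```python
# Hypothetical list, kept separate from my_list
tasks = ['Pay bills', 'Cook dinner']

# tasks[1:1] is an empty slice at position 1, so assigning to it inserts
tasks[1:1] = ['Tidy up', 'Walk the dog']

print(tasks)  # ['Pay bills', 'Tidy up', 'Walk the dog', 'Cook dinner']
```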


### using a list as a queue
A queue is an abstract data type.
Items are inserted at one end (enqueue) and removed from the other end (dequeue),
that is, first-in, first-out (FIFO).

In [6]:
# turning a list into a queue using Python's deque (double-ended queue) object
# using a to-do list example

from collections import deque
queue = deque(my_list)
queue.append('Wash the car')
print(queue.popleft(), ' - Done!')
my_list_upd = list(queue)

Pay bills  - Done!
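
To make the FIFO order explicit, here is a small sketch that drains a whole `deque` and records the order in which items come out. The names `line` and `served` and the item values are illustrative assumptions.

```python
from collections import deque

# Hypothetical items, enqueued left to right
line = deque(['first', 'second', 'third'])
line.append('fourth')              # enqueue at the right end

served = []
while line:
    served.append(line.popleft())  # dequeue from the left end

print(served)  # ['first', 'second', 'third', 'fourth']
```

Items leave in the same order they arrived. `popleft()` on a `deque` is also cheaper than `list.pop(0)`, which has to shift every remaining element.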


### using a list as a stack
A stack is an abstract data type.
It implements last-in, first-out (LIFO) order: the most recently added item is removed first.

In [7]:
my_list = ['Pay bills', 'Tidy up', 'Walk the dog', 'Go to the pharmacy', 'Cook dinner']
stack = []
for task in my_list:
    stack.append(task)
while stack:
    print(stack.pop(), ' - Done!')

Cook dinner  - Done!
Go to the pharmacy  - Done!
Walk the dog  - Done!
Tidy up  - Done!
Pay bills  - Done!
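
The three basic stack operations map directly onto list operations. The sketch below shows push, peek, and pop. The name `pile` and its contents are hypothetical.

```python
# Hypothetical stack contents
pile = ['Pay bills', 'Tidy up']

pile.append('Walk the dog')  # push: add to the top
top = pile[-1]               # peek: read the top without removing it
popped = pile.pop()          # pop: remove and return the top

print(top, '|', popped, '|', pile)
```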


### using lists and stacks for natural language processing

In [8]:
import spacy
txt = 'List is a ubiquitous data structure in the Python programming language.'

nlp = spacy.load('en_core_web_sm')
doc = nlp(txt)
stk = []
for w in doc:
    if w.pos_ == 'NOUN' or w.pos_ == 'PROPN':
        stk.append(w.text)
    elif (w.head.pos_ == 'NOUN' or w.head.pos_ == 'PROPN') and (w in w.head.lefts):
        stk.append(w.text)
    elif stk:
        chunk = ''
        while stk:
            chunk = stk.pop() + ' ' + chunk
        print(chunk.strip())

List
a ubiquitous data structure
the Python programming language


### improving with list comprehensions
Let's start by finding the syntactic head of each word in the sentence.

In [9]:
import spacy

txt = 'List is a ubiquitous data structure in the Python programming language.'

nlp = spacy.load('en_core_web_sm')
doc = nlp(txt)

for t in doc:
    print(t.text, t.head.text)

List is
is is
a structure
ubiquitous structure
data structure
structure is
in structure
the language
Python language
programming language
language in
. is


### creating a list using a list comprehension

In [10]:
import spacy

txt = 'List is arguably the most useful type in the Python programming language.'

nlp = spacy.load('en_core_web_sm')
doc = nlp(txt)

head_lefts = [t.text if t in t.head.lefts else 0 for t in doc]
print(head_lefts)

['List', 0, 0, 'the', 'most', 'useful', 0, 0, 'the', 'Python', 'programming', 0, 0]
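
The comprehension above puts a conditional expression (`x if cond else 0`) before the `for`, so the result keeps one entry per token. A trailing `if` would instead filter items out. The sketch below contrasts the two forms on plain strings, without spaCy. The set `keep` is a hypothetical stand-in for the `t in t.head.lefts` test.

```python
words = ['List', 'is', 'the', 'most', 'useful', 'type']

# Hypothetical stand-in for the `t in t.head.lefts` condition
keep = {'List', 'the', 'most', 'useful'}

# Conditional expression: same length as words, non-matches become 0
marked = [w if w in keep else 0 for w in words]

# Trailing if: non-matches are dropped, so the result is shorter
filtered = [w for w in words if w in keep]

print(marked)    # ['List', 0, 'the', 'most', 'useful', 0]
print(filtered)  # ['List', 'the', 'most', 'useful']
```

Keeping the zeros preserves each word's position, which the later cells rely on when they search for the next zero.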


Moving through the sentence word by word, building the list for the rest of the text from each word onward

In [12]:
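for w in doc:
    # slice from the current token to the end of the sentence
    head_lefts = [t.text if t in t.head.lefts else 0 for t in doc[w.i:]]
    print(head_lefts)

['List', 0, 0, 'the', 'most', 'useful', 0, 0, 'the', 'Python', 'programming', 0, 0]
[0, 0, 'the', 'most', 'useful', 0, 0, 'the', 'Python', 'programming', 0, 0]
[0, 'the', 'most', 'useful', 0, 0, 'the', 'Python', 'programming', 0, 0]
['the', 'most', 'useful', 0, 0, 'the', 'Python', 'programming', 0, 0]
['most', 'useful', 0, 0, 'the', 'Python', 'programming', 0, 0]
['useful', 0, 0, 'the', 'Python', 'programming', 0, 0]
[0, 0, 'the', 'Python', 'programming', 0, 0]
[0, 'the', 'Python', 'programming', 0, 0]
['the', 'Python', 'programming', 0, 0]
['Python', 'programming', 0, 0]
['programming', 0, 0]
[0, 0]
[0]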


Analyzing each fragment: find the next zero, then the last noun in the fragment up to and including that position

In [13]:
for w in doc:
    head_lefts = [t.text if t in t.head.lefts else 0 for t in doc[w.i:]]
    i0 = head_lefts.index(0)
    if i0 > 0:
        # 1 marks a noun; reversed so the noun nearest the zero comes first
        noun = [1 if t.pos_ in ('NOUN', 'PROPN') else 0 for t in reversed(doc[w.i:w.i + i0 + 1])]
        try:
            i1 = noun.index(1) + 1
        except ValueError:
            continue  # no noun in this fragment, so there is nothing to print
        print(head_lefts[:i0 + 1])
        print(doc[w.i + i0 + 1 - i1])

['List', 0]
List
['the', 'most', 'useful', 0]
type
['most', 'useful', 0]
type
['useful', 0]
type
['the', 'Python', 'programming', 0]
language
['Python', 'programming', 0]
language
['programming', 0]
language
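
The cell above relies on `list.index()`, which returns the position of the first match and raises `ValueError` when the value is absent. The sketch below wraps that behavior in a helper that returns a default instead. `first_index` is a hypothetical helper written for illustration, not a built-in.

```python
def first_index(items, value, default=-1):
    """Return the index of the first match, or `default` if there is none.

    Hypothetical helper for illustration; not part of Python or spaCy.
    """
    try:
        return items.index(value)
    except ValueError:
        return default

fragment = ['the', 'most', 'useful', 0]
print(first_index(fragment, 0))         # 3
print(first_index(['a', 'b'], 0))       # -1
```

Handling the exception at the call site, as the notebook cells do, works equally well. The helper only makes the fallback value explicit.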


### putting it all together

In [14]:
import spacy

txt = 'List is arguably the most useful type in the Python programming language.'

nlp = spacy.load('en_core_web_sm')
doc = nlp(txt)
stk = []

for w in doc:
    head_lefts = [t.text if t in t.head.lefts else 0 for t in doc[w.i:]]
    i0 = 0
    try:
        i0 = head_lefts.index(0)
    except ValueError:
        pass
    i1 = 0
    if i0 > 0:
        noun = [1 if t.pos_== 'NOUN' or t.pos_== 'PROPN' else 0 for t in reversed(doc[w.i:w.i+i0 +1])]
        try:
            i1 = noun.index(1) + 1
        except ValueError:
            pass
    # these checks run for every token, not only when i0 > 0
    if w.pos_ == 'NOUN' or w.pos_ == 'PROPN':
        stk.append(w.text)
    elif (i1 > 0):
        stk.append(w.text)
    elif stk:
        chunk = ''
        while stk:
            chunk = stk.pop() + ' ' + chunk
        print(chunk.strip())

## tuples

In [None]:
"""
A tuple is an ordered collection of objects.
Tuples are immutable: once created, a tuple cannot be changed.
They are typically used to store collections of heterogeneous data
and are especially useful for holding the properties of an object.
"""

# example of a simple tuple
('Ford', 'Mustang', 1964)

# example of a list of tuples
# a to-do list with a tuple of time-task pairs
[('8:00', 'Pay bills'), ('8:30', 'Tidy up'), ('9:30', 'Walk the dog'), ('10:00', 'Go to the pharmacy'), ('10:30', 'Cook dinner')]
task_list = ['Pay bills', 'Tidy up', 'Walk the dog', 'Go to the pharmacy', 'Cook dinner']
tm_list = ['8:00', '8:30', '9:30', '10:00', '10:30']
sched_list = [(tm, task) for tm, task in zip(tm_list, task_list)]
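
A short sketch of the two tuple behaviors described above: unpacking a tuple into separate names, and the error raised on an attempt to modify it. The names `car`, `make`, `model`, `year`, and `error_message` are illustrative only.

```python
car = ('Ford', 'Mustang', 1964)

# Unpacking assigns each element to its own name
make, model, year = car
print(make, model, year)

# Tuples are immutable, so item assignment raises TypeError
try:
    car[2] = 1965
except TypeError as err:
    error_message = str(err)
    print('TypeError:', error_message)
```

The same unpacking works in a loop over a list of tuples, for example `for tm, task in sched_list:`.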