## Data Structures
The trick is that it is all about state!

In [2]:
import os
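for item in os.listdir('sample_data'):
    # listdir() returns bare names, so join them with the directory before testing
    if os.path.isdir(os.path.join('sample_data', item)):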
        print("This is a directory {0}".format(item))
    else:
        print("This is a file {0}".format(item))

In [3]:
# Looping is easy, but what about state?
# here state is captured in a new variable called important_directories
important_directories = []
for item in os.listdir('.'):
    if os.path.isdir(item):
        important_directories.append(item)
    print(important_directories)

['.sound']
['.sound']
['.sound']
['.sound']
['.sound']
['.sound']
['.sound', 'sample_data']


In [26]:
list_direc = os.listdir('.')
for item in list_direc:
    print(item)

dataStructure.ipynb
iterating-lists.ipynb
jupyterCourse.ipynb
other-datastructure.ipynb
sample_data


In [4]:
important_directories = []
for item in os.listdir('.'):
    if item.startswith('_'):
        continue # flow control
    if os.path.isdir(item):
        important_directories.append(item)
print(important_directories)

['.sound', 'sample_data']


In [63]:
items = ['first', 'second', 'third', 'foo']
#items[-1]

url = "https://colab.research.com/drive/asdfjhasdf/alfredo/oreilly"
parts = url.split('/')

#print(parts)
#print(parts[3:])
protocol, _, fqdn = parts[:3]
print("protocol is: %s" % protocol)
print(fqdn)
company = parts[-1]
print(company)

print("The first item is: {0}".format(items[0]))
print(items[1])

items.index('foo')


protocol is: https:
colab.research.com
oreilly
The first item is: first
second


3
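Slicing and unpacking are easier to follow with every intermediate value spelled out. This is a small sketch that reuses the `url` from the cell above; the names `head`, `path`, and `last` are only illustrative.

```python
url = "https://colab.research.com/drive/asdfjhasdf/alfredo/oreilly"
parts = url.split('/')
# splitting on '/' leaves an empty string between the two slashes of '//':
# ['https:', '', 'colab.research.com', 'drive', 'asdfjhasdf', 'alfredo', 'oreilly']

head = parts[:3]    # the first three items
path = parts[3:]    # everything from index 3 onwards
last = parts[-1]    # negative indexes count from the end

# unpacking needs exactly as many names as there are items;
# '_' is a conventional name for a value we do not care about
protocol, _, fqdn = head
print(protocol, fqdn, path, last)
```

Note that `protocol` still carries the trailing colon, which is why the cell above prints `https:`.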

## Tuples
Should be treated as "read-only" lists; the differences are subtle!

In [74]:
ro_items = ('first', 'second', 'third')
print("index of 'first' in the tuple is: %s" % ro_items.index('first'))
print(ro_items[-1])
print("============")
for item in ro_items:
    print(item)

index of 'first' in the tuple is: 0
third
============
first
second
third


In [79]:
# Find out what methods are available in a tuple. Tuples are also immutable
for method in dir(tuple()):
    if method.startswith('__'):
        #print(method)
        continue
    print(method)

# ro_items.append('a') would raise AttributeError: tuples have no append method

count
index
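To see what "read-only" means in practice, here is a minimal sketch that tries to change a tuple and records the exceptions. The `errors` and `extended` names are made up for the illustration.

```python
ro_items = ('first', 'second', 'third')

errors = []
try:
    ro_items[0] = 'changed'      # item assignment is not supported
except TypeError as error:
    errors.append(type(error).__name__)

try:
    ro_items.append('fourth')    # tuples have no append method
except AttributeError as error:
    errors.append(type(error).__name__)

print(errors)

# "changing" a tuple really means building a new one
extended = ro_items + ('fourth',)
print(extended)
print(ro_items)  # the original is untouched
```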


## Sets
Sets are like lists and look like dictionaries (both use curly braces), but they only keep unique items.


In [84]:
# creating an empty set
unique = set()
# add items with .add()
unique.add("one")
unique.add("two")
unique.add("one")

print(len(unique))

unique.pop()
print(unique)

2
{'one'}
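Sets earn their keep when removing duplicates and comparing groups. A rough sketch follows; the directory names `notes` and `.git` are invented for the example.

```python
visited = ['sample_data', '.sound', 'sample_data', 'notes', '.sound']

# building a set from a list collapses the duplicates
unique_dirs = set(visited)
print(len(unique_dirs))

# a set literal uses braces like a dictionary, but holds no key/value pairs
hidden = {'.sound', '.git'}

print(unique_dirs & hidden)             # intersection: items in both sets
print(sorted(unique_dirs - hidden))     # difference, sorted because sets are unordered
```

Since sets have no guaranteed order, sort the result (or convert it back to a list) whenever the order matters.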


## List Comprehensions
So easy to abuse! Be careful and use them wisely.

In [86]:
items = ['a', '1', '23', 'b', '4', 'c', 'd']
numeric = []
for item in items:
    if item.isnumeric():
        numeric.append(item)
    print(numeric)

[]
['1']
['1', '23']
['1', '23']
['1', '23', '4']
['1', '23', '4']
['1', '23', '4']


In [89]:
# notice the 'if' condition at the end, is this more readable? or less?
inlined_numeric = [item for item in items if item.isnumeric()]
inlined_numeric
#print(len(inlined_numeric))

['1', '23', '4']


In [93]:
# doubly nested items are usually targeted for list comprehensions
items = ['a', '1', '23', 'b', '4', 'c', 'd']
nested_items = [items, items]
print(len(nested_items))
nested_items

2


[['a', '1', '23', 'b', '4', 'c', 'd'], ['a', '1', '23', 'b', '4', 'c', 'd']]

In [94]:
numeric = []
for parent in nested_items:
    for item in parent:
        if item.isnumeric():
            numeric.append(item)
numeric

['1', '23', '4', '1', '23', '4']

In [98]:
# and now with list comprehension: the 'for' clauses go in the same order as the nested loops
numeric = [item for parent in nested_items for item in parent if item.isnumeric()]

numeric

['1', '23', '4', '1', '23', '4']

In [109]:
# this can improve readability
numeric = [
    item
    for parent in nested_items
    for item in parent
    if item.isnumeric()
]

numeric

['1', '23', '4', '1', '23', '4']
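A quick way to keep the clause order straight in a nested comprehension is to write the loops first and then read them top to bottom. This sketch builds the result both ways and compares them; `looped` and `flattened` are just illustrative names.

```python
items = ['a', '1', '23', 'b', '4', 'c', 'd']
nested_items = [items, items]

# nested loops: the outer loop comes first
looped = []
for parent in nested_items:
    for item in parent:
        if item.isnumeric():
            looped.append(item)

# comprehension: the 'for' clauses appear in the same order as the loops above
flattened = [
    item
    for parent in nested_items
    for item in parent
    if item.isnumeric()
]

print(looped == flattened)
```

Writing the inner `for` first only appears to work in a notebook when a variable named `parent` is left over from an earlier cell; in a fresh session it raises `NameError`.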

## The Awesome Dictionary
Dictionaries are one of my favorites. They are mappings, usually referred to as key/value mappings.

In [112]:
# dictionaries are mappings, usually referred to as key/value mappings.
contacts = {
    'alfredo': '405-886-3312',
    'noah': '980-555-5555'
}

contacts


{'alfredo': '405-886-3312', 'noah': '980-555-5555'}

In [113]:
contacts.keys()

dict_keys(['alfredo', 'noah'])

In [114]:
contacts.values()

dict_values(['405-886-3312', '980-555-5555'])

In [120]:
# looping over a dictionary defaults to 'keys()', and you can loop over both keys and values with 'items()'
for key in contacts:
    print(key)

for name, phone in contacts.items():
    print("key is {0}, and value is: {1}".format(name, phone))

# treat dictionaries like a small database, with cheap (and fast!) access
#contacts['john']

alfredo
noah
key is alfredo, and value is: 405-886-3312
key is noah, and value is: 980-555-5555


In [123]:
# Super way to fall back when things do not exist
print(contacts.get('john', "Peter"))

try:
    contacts['john']
except KeyError:
    print("Peter")

contacts['noah']


Peter
Peter


'980-555-5555'
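There are a few more ways to deal with missing keys, and they differ in whether they change the dictionary. A small sketch, where `'unknown'` and `'000-000-0000'` are placeholder values and not real data:

```python
contacts = {
    'alfredo': '405-886-3312',
    'noah': '980-555-5555'
}

# membership tests look at the keys
print('john' in contacts)

# get() returns the fallback and leaves the dictionary alone
print(contacts.get('john', 'unknown'))
size_after_get = len(contacts)

# setdefault() returns the fallback AND stores it under the key
print(contacts.setdefault('john', '000-000-0000'))
size_after_setdefault = len(contacts)

print(size_after_get, size_after_setdefault)
```

Reach for `get()` when a missing key is normal and should not be recorded, and for `setdefault()` when the fallback should stick around.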

## Walking the filesystem, inspecting files
Python has built-in utilities to walk the filesystem. They are a bit clunky, and creating something useful requires stitching things together to produce good output.

In [124]:
import os
# yields the current directory, then the directories, and then any files it finds
# for each level it traverses
for path_info in os.walk('.'):
    print(path_info)
    break

('.', ['.sound', 'sample_data'], ['dataStructure.ipynb', 'iterating-lists.ipynb', 'jupyterCourse.ipynb', 'other-datastructure.ipynb'])


In [125]:
import os
from os.path import abspath, join
# producing absolute paths, instead of a tuple of three items
for top_dir, directories, files in os.walk('.'):
    for directory in directories:
        print(abspath(join(top_dir, directory)))
    for _file in files:
        print(abspath(join(top_dir, _file)))
    break

c:\Users\da7ty\Study 2024\Python course\Python_Jupyter\.sound
c:\Users\da7ty\Study 2024\Python course\Python_Jupyter\sample_data
c:\Users\da7ty\Study 2024\Python course\Python_Jupyter\dataStructure.ipynb
c:\Users\da7ty\Study 2024\Python course\Python_Jupyter\iterating-lists.ipynb
c:\Users\da7ty\Study 2024\Python course\Python_Jupyter\jupyterCourse.ipynb
c:\Users\da7ty\Study 2024\Python course\Python_Jupyter\other-datastructure.ipynb
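Stitching `os.walk` into something reusable could look like the sketch below. `find_files` is a hypothetical helper, not part of the standard library, and the example builds a throwaway directory tree with `tempfile` so it does not depend on any local files.

```python
import os
import tempfile


def find_files(top, extension):
    """Return sorted paths, relative to top, of files ending with extension."""
    matches = []
    # without the 'break' used earlier, os.walk descends into every subdirectory
    for top_dir, directories, files in os.walk(top):
        for name in files:
            if name.endswith(extension):
                full_path = os.path.join(top_dir, name)
                matches.append(os.path.relpath(full_path, top))
    return sorted(matches)


# build a small temporary tree: two notebooks and one text file
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, 'sample_data'))
    for relative in ('a.ipynb', 'notes.txt', os.path.join('sample_data', 'b.ipynb')):
        with open(os.path.join(top, relative), 'w') as handle:
            handle.write('')
    found = find_files(top, '.ipynb')

print(found)
```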


In [128]:
import time
now = time.time()

now
time.time() - now

0.00011491775512695312
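The same subtraction pattern can time a piece of work. This sketch uses `time.perf_counter()`, which is intended for measuring short durations; the workload (summing a range) is only a stand-in.

```python
import time

start = time.perf_counter()
total = sum(range(100_000))          # stand-in for the work being measured
elapsed = time.perf_counter() - start

print("summing took {0:.6f} seconds".format(elapsed))
```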

## Working with Assert
Assertions are normally used during debugging. They are skipped when Python runs with the `-O` flag, so production code should not rely on them.

In [131]:
# assert 1 == 2
assert 1 == 2, "You didn't think that was true?!"

AssertionError: You didn't think that was true?!
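Because assertions can be switched off, checks that must always run are better written as explicit exceptions. A minimal sketch, where `parse_port` is a made-up helper used only to show the contrast:

```python
def parse_port(value):
    """Hypothetical helper: convert text to a TCP port number."""
    port = int(value)
    # explicit check: still runs when Python is started with -O
    if not 0 < port < 65536:
        raise ValueError("port out of range: {0}".format(port))
    return port


# assert suits internal sanity checks while developing
assert parse_port("8080") == 8080, "parsing should round-trip"

try:
    parse_port("70000")
except ValueError as error:
    message = str(error)

print(message)
```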