# Fundamentals

In [48]:
import numpy as np
import pandas as pd
import sys

sys.executable

'/opt/homebrew/Caskroom/miniforge/base/envs/main-dev/bin/python'

## `sed` command

`sed` (short for *stream editor*) reads text line by line, applies editing commands, and writes the result to standard output. It can also stand in for viewing commands in the terminal, such as `head`, `tail`, and `cat`.

```bash
sed "s/<text-to-replace>/<text-to-insert>/gip" <filename.text>

## `s` substitutes one piece of text with another
## the trailing flags `gip` tell `sed` to:
#### <g> replace every match on a line rather than stop at the first one,
#### <i> match case-insensitively (a GNU extension; support varies between `sed` implementations), and
#### <p> print each line on which a substitution was made (combine with `sed -n` to avoid printing those lines twice)
```
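For readers more at home in Python, the effect of the `g`, `i`, and `p` flags can be approximated with the `re` module. This is only a rough analogue under stated assumptions: the sample text and the pattern `foo` are made up for illustration, and Python regular expressions are not identical to `sed`'s.

```python
import re

# Hypothetical sample input; stands in for the contents of <filename.text>
text = "Foo bar\nfoo FOO baz\nno match here\n"

changed = []
for line in text.splitlines():
    # re.subn replaces every match on the line by default (like `g`);
    # re.IGNORECASE ignores case (like `i`)
    new_line, n_subs = re.subn("foo", "qux", line, flags=re.IGNORECASE)
    if n_subs:
        # like `p` combined with `sed -n`: keep only lines that changed
        changed.append(new_line)

print("\n".join(changed))
```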

## Python

### `set`

`set`s are mutable collections that behave similarly to mathematical sets.
- unordered
- elements are unique; duplicates are disallowed
- may be modified, but its elements must be hashable (in practice, immutable)

In [49]:
# example set
x = {'foo', 'bar', 'baz'}

# set size
print(len(x))

# set membership
print('bar' in x)
print('qux' in x)

3
True
False
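The properties listed above can be checked directly. The following is a minimal sketch: it modifies a set in place, shows that an unhashable element such as a `list` is rejected, and uses the built-in `frozenset` as the immutable variant that can itself be a set element.

```python
x = {'foo', 'bar', 'baz'}

# sets can be modified in place
x.add('qux')
x.discard('foo')   # discard() does nothing if the element is absent
x.add('bar')       # adding a duplicate leaves the set unchanged
print(len(x))      # 3

# elements must be hashable, so a list cannot be added
try:
    x.add(['a', 'b'])
except TypeError as err:
    print('TypeError:', err)

# frozenset is immutable and hashable, so it can be nested in a set
nested = {frozenset({'a', 'b'}), frozenset({'c'})}
print(frozenset({'b', 'a'}) in nested)   # True
```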


#### Operators & Methods

Operators and methods differ in the operands they accept. The operators (`|`, `&`, `-`) require both operands to be sets; using one with a non-set raises a `TypeError`. The equivalent methods accept any iterable as an argument and convert it to a set before applying the operation.

In [50]:
# example sets
x1 = {'foo', 'bar', 'baz'}
x2 = {'baz', 'qux', 'quux'}

# operators
print(x1 | x2)
print(x1 & x2)
print(x1 - x2)

# methods
print(x1.union(x2))
print(x1.intersection(x2))
print(x1.difference(x2))

{'baz', 'bar', 'qux', 'foo', 'quux'}
{'baz'}
{'foo', 'bar'}
{'baz', 'bar', 'qux', 'foo', 'quux'}
{'baz'}
{'foo', 'bar'}
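The difference between operators and methods shows up as soon as one operand is not a set. In this small sketch the name `other` is introduced only for illustration; results are sorted before printing because set ordering is arbitrary.

```python
x1 = {'foo', 'bar', 'baz'}
other = ['baz', 'qux', 'quux']   # a list, not a set

# the method accepts any iterable
merged = x1.union(other)
print(sorted(merged))            # ['bar', 'baz', 'foo', 'quux', 'qux']

# the operator requires both operands to be sets
operator_failed = False
try:
    x1 | other
except TypeError as err:
    operator_failed = True
    print('TypeError:', err)

# converting explicitly makes the operator usable
common = x1 & set(other)
print(sorted(common))            # ['baz']
```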


### Copy, Shallow Copy, & Deep Copy

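**Copy**: the `=` operator does not copy anything. It binds a second name to the same object, so a change made through either name is visible through both, and both names report the same `id()`.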

In [51]:
import copy

In [52]:
old_list = [[1, 2, 3], [4, 5, 6], [7, 8, 'a']]
new_list = old_list

new_list[2][2] = 9

print('Old List:', old_list)
print('ID of Old List:', id(old_list))

print('New List:', new_list)
print('ID of New List:', id(new_list))

Old List: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ID of Old List: 6305819328
New List: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ID of New List: 6305819328


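**Shallow Copy**: the `copy.copy()` function creates a new outer object but fills it with references to the same nested objects. Appending to the outer list affects only that list, while changing a nested object is visible through both.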

In [53]:
old_list = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
new_list = copy.copy(old_list)

# appending a new nested object
old_list.append([4, 4, 4])

print("Old list:", old_list)
print("New list:", new_list)

# changing a nested object
old_list[1][1] = 'AA'

print("Old list:", old_list)
print("New list:", new_list)

Old list: [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
New list: [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
Old list: [[1, 1, 1], [2, 'AA', 2], [3, 3, 3], [4, 4, 4]]
New list: [[1, 1, 1], [2, 'AA', 2], [3, 3, 3]]



**Deep Copy**: the `copy.deepcopy()` function copies the object and, recursively, every object nested inside it, so changing a nested object in one list does not affect the other.

In [54]:
old_list = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
new_list = copy.deepcopy(old_list)

# changing a nested object
old_list[1][0] = 'BB'

print("Old list:", old_list)
print("New list:", new_list)

Old list: [[1, 1, 1], ['BB', 2, 2], [3, 3, 3]]
New list: [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
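The three behaviours can be compared side by side with the `is` operator, which tests whether two names refer to the same object. A minimal sketch, with variable names chosen for illustration:

```python
import copy

original = [[1, 2], [3, 4]]

alias = original                  # assignment: same object
shallow = copy.copy(original)     # new outer list, shared inner lists
deep = copy.deepcopy(original)    # new outer list and new inner lists

print(alias is original)          # True
print(shallow is original)        # False
print(shallow[0] is original[0])  # True
print(deep[0] is original[0])     # False

# a change to a nested list reaches the shallow copy but not the deep copy
original[0][0] = 'changed'
print(shallow[0][0])              # changed
print(deep[0][0])                 # 1
```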


### `collections` Module

In [55]:
import collections

#### `namedtuple`

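`namedtuple` is a factory function that creates `tuple` subclasses with named fields, which makes them similar to lightweight class objects. Each value can be read by field name as well as by position, so code that indexes into the tuple becomes easier to read. As with any `tuple`, the values cannot be reassigned after the instance is created.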

In [56]:
fruit = collections.namedtuple("fruit", "number variety colour")

In [57]:
guava = fruit(
    number=2,
    variety="Honey Crisp",
    colour="green",
)

apple = fruit(
    number=5,
    variety="Granny Smith",
    colour="red",
)

In [58]:
# `namedtuples` can be used to assign names to values, unlike Python's built-in `tuple` data type
print(guava.colour)
print(apple.variety)

green
Granny Smith


Note that `namedtuple`s are also a memory-efficient option when defining an immutable `class` in Python.
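A short sketch of what immutability means in practice for a `namedtuple`. The type name `Fruit` and the field values here are illustrative; `_replace()` and `_asdict()` are part of the standard `namedtuple` API.

```python
import collections

Fruit = collections.namedtuple("Fruit", "number variety colour")
apple = Fruit(number=5, variety="Granny Smith", colour="green")

# fields can be read by name, by index, or by unpacking
print(apple.variety, apple[1])
number, variety, colour = apple

# assigning to a field raises AttributeError
assignment_failed = False
try:
    apple.number = 6
except AttributeError as err:
    assignment_failed = True
    print("AttributeError:", err)

# _replace() returns a modified copy; the original is untouched
more_apples = apple._replace(number=6)
print(more_apples)

# _asdict() converts the instance to a dictionary
print(apple._asdict())
```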

#### `Counter`
`Counter` is a `dict` subclass used to count hashable objects. The elements are stored as dictionary ***keys*** and their counts are stored as the ***values***. Wrapping an iterable in a `Counter` gives access to counting operations such as `elements()` and `most_common()`.

In [59]:
c = collections.Counter('abcabcabcabcd')
print(c)

print(sorted(c.elements()))
print(c.most_common(5))

Counter({'a': 4, 'b': 4, 'c': 4, 'd': 1})
['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'c', 'c', 'd']
[('a', 4), ('b', 4), ('c', 4), ('d', 1)]


In [60]:
l = [5,6,7,1,3,9,9,1,2,5,5,7,7]
c = collections.Counter(l)
print(c)

print(sorted(c.elements()))
print(c.most_common(5))

Counter({5: 3, 7: 3, 1: 2, 9: 2, 6: 1, 3: 1, 2: 1})
[1, 1, 2, 3, 5, 5, 5, 6, 7, 7, 7, 9, 9]
[(5, 3), (7, 3), (1, 2), (9, 2), (6, 1)]


In [61]:
s = 'the lazy dog jumped over another lazy dog'
words = s.split()
c = collections.Counter(words)
print(c)

print(sorted(c.elements()))
print(c.most_common(5))

Counter({'lazy': 2, 'dog': 2, 'the': 1, 'jumped': 1, 'over': 1, 'another': 1})
['another', 'dog', 'dog', 'jumped', 'lazy', 'lazy', 'over', 'the']
[('lazy', 2), ('dog', 2), ('the', 1), ('jumped', 1), ('over', 1)]
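Beyond `elements()` and `most_common()`, `Counter` objects support arithmetic and return `0` for missing keys instead of raising `KeyError`. A brief sketch with made-up inputs:

```python
import collections

a = collections.Counter('aabbbc')   # a: 2, b: 3, c: 1
b = collections.Counter('abcd')     # a: 1, b: 1, c: 1, d: 1

print(a['z'])    # 0, and no KeyError is raised

total = a + b         # add counts
difference = a - b    # subtract, keeping only positive counts
minimum = a & b       # minimum of each count
maximum = a | b       # maximum of each count
print(total, difference, minimum, maximum, sep='\n')

# update() adds counts from another iterable in place
a.update('cc')
print(a['c'])    # 3
```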


#### `defaultdict`

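A `defaultdict` supplies a default value for keys that do not exist yet. It is instantiated with a factory callable such as `int` or `list`; when a missing key is accessed, the factory is called with no arguments, and its result is stored under that key and returned.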

In [62]:
nums = collections.defaultdict(int)
nums['one'] = 1
nums['two'] = 2
print(nums['three'])

0


In [63]:
count = collections.defaultdict(int)
names = "Mike John Mike Anna Mike John John Mike Mike Britney Smith Anna Smith".split()
for name in names:
    count[name] += 1
print(count)

defaultdict(<class 'int'>, {'Mike': 5, 'John': 3, 'Anna': 2, 'Britney': 1, 'Smith': 2})
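The same pattern works with other factories. The sketch below, using made-up data, groups values with `defaultdict(list)` and also shows a side effect worth knowing: reading a missing key with `[]` inserts it, whereas `.get()` does not.

```python
import collections

pairs = [("fruit", "apple"), ("veg", "leek"), ("fruit", "guava")]

# list() is called for each new key, so append() always has a list to work on
groups = collections.defaultdict(list)
for kind, name in pairs:
    groups[kind].append(name)
print(dict(groups))

# indexing a missing key inserts it with the default value
print("nut" in groups)       # False
groups["nut"]
print("nut" in groups)       # True

# .get() does not call the factory and does not insert the key
print(groups.get("seed"))    # None
print("seed" in groups)      # False
```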


#### `OrderedDict`

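`OrderedDict` is a dictionary that remembers the order in which its keys were inserted. Since Python 3.7 the built-in `dict` also preserves insertion order, but `OrderedDict` adds order-aware features such as `move_to_end()` and order-sensitive equality.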

In [64]:
od = collections.OrderedDict()
od['a'] = 1
od['b'] = 2
od['c'] = 3
print(od)

for key, value in od.items():
    print(key, value)

OrderedDict([('a', 1), ('b', 2), ('c', 3)])
a 1
b 2
c 3


In [65]:
# `OrderedDict`s can also be initialized from an iterable of key-value pairs,
# such as the list of tuples returned by `Counter.most_common()`
letters = ["a","c","c","a","b","a","a","b","c"]
cnt = collections.Counter(letters)
od = collections.OrderedDict(cnt.most_common())
for key, value in od.items():
    print(key, value)

a 4
c 3
b 2
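A few operations are specific to `OrderedDict` and are the main reason to reach for it over a plain `dict`. This is a minimal sketch with illustrative keys:

```python
import collections

od = collections.OrderedDict(a=1, b=2, c=3)

# move a key to the end, or to the front with last=False
od.move_to_end('a')
print(list(od))                 # ['b', 'c', 'a']
od.move_to_end('c', last=False)
print(list(od))                 # ['c', 'b', 'a']

# popitem(last=False) removes and returns the first item
first = od.popitem(last=False)
print(first)                    # ('c', 3)

# equality between two OrderedDicts is order-sensitive
od1 = collections.OrderedDict(a=1, b=2)
od2 = collections.OrderedDict(b=2, a=1)
print(od1 == od2)               # False
print(dict(od1) == dict(od2))   # True
```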


#### `deque`

A `deque` (double-ended queue) is a list-like sequence optimized for adding and removing items at either end. It can be initialized from any iterable, such as a list.

In [66]:
l = ["a", "b", "c"]
deq = collections.deque(l)
print(deq)

deque(['a', 'b', 'c'])


In [67]:
deq.append("d")
deq.appendleft("e")
print(deq)

deque(['e', 'a', 'b', 'c', 'd'])


In [68]:
deq.pop()
deq.popleft()
print(deq)

deque(['a', 'b', 'c'])


In [69]:
deq.clear()
print(deq)

deque([])


In [70]:
l = ["a", "b", "c"]
deq = collections.deque(l)
print(deq.count("b"))

1
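Two further `deque` features are often useful: a bounded length via `maxlen`, and `rotate()`. The sketch below uses made-up values to show both, along with `extendleft()`, which adds items one at a time and therefore reverses their order.

```python
import collections

# with maxlen set, appending to a full deque drops items from the other end
recent = collections.deque(maxlen=3)
for n in range(5):
    recent.append(n)
print(recent)             # deque([2, 3, 4], maxlen=3)

deq = collections.deque("abcde")

# rotate(n) shifts items n steps to the right; negative n shifts left
deq.rotate(2)
print(deq)                # deque(['d', 'e', 'a', 'b', 'c'])
deq.rotate(-2)

# extendleft() appends each item to the left in turn
deq.extendleft("xy")
print(deq)                # deque(['y', 'x', 'a', 'b', 'c', 'd', 'e'])
```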


#### `ChainMap`

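`ChainMap` groups several dictionaries or other mappings into a single view. When a key appears in more than one mapping, lookups return the value from the first mapping that contains it, so the order of the arguments sets the priority. The underlying mappings are stored by reference in the `maps` attribute, which is a list, so later changes to them show through the `ChainMap`.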

In [71]:
dict1 = {"a": 1, "b": 2}
dict2 = {"c": 3, "b": 4}
chain_map = collections.ChainMap(dict1, dict2)
print(chain_map.maps)
print(chain_map["a"])

[{'a': 1, 'b': 2}, {'c': 3, 'b': 4}]
1


In [72]:
dict2["c"] = 5
print(chain_map.maps)

[{'a': 1, 'b': 2}, {'c': 5, 'b': 4}]


In [73]:
print(chain_map.keys())
print(chain_map.values())

KeysView(ChainMap({'a': 1, 'b': 2}, {'c': 5, 'b': 4}))
ValuesView(ChainMap({'a': 1, 'b': 2}, {'c': 5, 'b': 4}))
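The priority rule for duplicate keys is easiest to see with the key `"b"`, which appears in both dictionaries. A minimal sketch reusing the same two dictionaries; the values written in the second half are illustrative.

```python
import collections

dict1 = {"a": 1, "b": 2}
dict2 = {"c": 3, "b": 4}
chain_map = collections.ChainMap(dict1, dict2)

# lookups search the maps in order, so dict1 wins for the duplicate key
print(chain_map["b"])     # 2
print(chain_map["c"])     # 3

# writes and deletions only affect the first mapping
chain_map["d"] = 9
print(dict1)              # {'a': 1, 'b': 2, 'd': 9}
print(dict2)              # {'c': 3, 'b': 4}

# new_child() puts a new mapping in front without changing the others
child = chain_map.new_child({"b": 100})
print(child["b"])         # 100
print(chain_map["b"])     # 2
```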


### Time & Date

#### `datetime`

In [90]:
from datetime import datetime
import calendar

In [77]:
# get the current local date and time
dt = datetime.now()
print(dt)
print(type(dt))

2024-04-03 14:11:02.905569
<class 'datetime.datetime'>


In [88]:
date_str = "2019-10-31"
parsed_date = datetime.strptime(date_str, "%Y-%m-%d")
print(parsed_date)
print(type(parsed_date))
print("Month: \t\t", parsed_date.month)
print("Year: \t\t", parsed_date.year)
print("Day: \t\t", parsed_date.day)
print("Weekday: \t", parsed_date.weekday())

2019-10-31 00:00:00
<class 'datetime.datetime'>
Month: 		 10
Year: 		 2019
Day: 		 31
Weekday: 	 3


In [91]:
print(calendar.day_name[parsed_date.weekday()])

Thursday
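Parsing with `strptime` has a counterpart in `strftime`, which formats a `datetime` back into a string, and dates can be shifted or subtracted using `timedelta`. A short sketch using the same date; the output formats are chosen only for illustration.

```python
from datetime import datetime, timedelta

parsed_date = datetime.strptime("2019-10-31", "%Y-%m-%d")

# strftime() formats a datetime using the same directives as strptime()
formatted = parsed_date.strftime("%d/%m/%Y")
print(formatted)            # 31/10/2019

# adding a timedelta shifts the date
next_week = parsed_date + timedelta(days=7)
print(next_week.date())     # 2019-11-07

# subtracting two datetimes gives a timedelta
delta = datetime(2020, 1, 1) - parsed_date
print(delta.days)           # 62
```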
