# Fundamentals

In [19]:
import numpy as np
import pandas as pd
import collections
import copy
import sys

sys.executable

'/opt/homebrew/Caskroom/miniforge/base/envs/main-dev/bin/python'

## `sed` command

`sed` is a stream editor: it reads text line by line, applies editing commands, and writes the result to the terminal. It can also stand in for viewing commands such as `head`, `tail`, and `cat`.

```bash

sed "s/<text-to-replace>/<text-to-insert>/gip" <filename.text>

## `s/` substitutes some text with another
## `/gip` tells `sed` to:
#### <g> replace every match on a line rather than stop at the first one,
#### <i> perform a case-insensitive search (a GNU `sed` extension), and
#### <p> print each line on which a substitution was made
####     (without `sed -n`, those lines are printed twice)
```
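For comparison, the same "substitute every match, ignoring case, and print the changed lines" idea can be sketched in Python with `re.subn`. This is only a rough analogue: the sample text and the `foo`/`bar` pattern are made up for illustration, and Python's regular-expression dialect is not identical to `sed`'s.

```python
import re

# Hypothetical sample text standing in for the contents of a file
text = "Foo fighters\nno match here\nfoo FOO"

# Rough analogue of: sed -n "s/foo/bar/gip"
changed = []
for line in text.splitlines():
    # count=0 replaces every match (like `g`); re.IGNORECASE mirrors `i`
    new_line, n_subs = re.subn("foo", "bar", line, count=0, flags=re.IGNORECASE)
    if n_subs:
        # like `p` combined with -n: keep only lines where a substitution happened
        changed.append(new_line)

print("\n".join(changed))
```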

## Python

### `set`

`set`s are mutable data types that behave similarly to mathematical sets (the immutable variant is `frozenset`).
- unordered
- elements are unique; duplicates are disallowed
- a set may be modified, but its elements must be hashable (in practice, immutable)

In [6]:
# example set
x = {'foo', 'bar', 'baz'}

# set size
print(len(x))

# set membership
print('bar' in x)
print('qux' in x)

3
True
False
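The properties listed above can be checked directly. The following is a minimal sketch with illustrative values; `sorted()` is used only to get a stable display order, since sets are unordered.

```python
# Sets can be changed in place
s = {'foo', 'bar'}
s.add('baz')
s.add('foo')        # adding a duplicate leaves the set unchanged
s.discard('bar')
print(sorted(s))    # ['baz', 'foo']

# Elements must be hashable, so a list cannot be a set element
unhashable_rejected = False
try:
    bad = {[1, 2], [3, 4]}
except TypeError as err:
    unhashable_rejected = True
    print('TypeError:', err)

# frozenset is the immutable variant; it has no add() method
fs = frozenset(s)
print(hasattr(fs, 'add'))   # False
```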


#### Operators & Methods

Operators and methods differ in what they accept. Operators require both operands to be sets; using an operator with a non-set operand raises a `TypeError`. Methods, on the other hand, accept any iterable as an argument and convert it to a set before applying the operation.

In [10]:
# example sets
x1 = {'foo', 'bar', 'baz'}
x2 = {'baz', 'qux', 'quux'}

# operators
print(x1 | x2)
print(x1 & x2)
print(x1 - x2)

# methods
print(x1.union(x2))
print(x1.intersection(x2))
print(x1.difference(x2))

{'baz', 'bar', 'quux', 'qux', 'foo'}
{'baz'}
{'foo', 'bar'}
{'baz', 'bar', 'quux', 'qux', 'foo'}
{'baz'}
{'foo', 'bar'}
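To make the operator/method difference concrete, here is a small sketch that passes a list (an illustrative choice; any iterable would do) to both forms.

```python
x1 = {'foo', 'bar', 'baz'}
other = ['baz', 'qux', 'quux']   # a list, not a set

# The method accepts any iterable and returns a new set
merged = x1.union(other)
print(sorted(merged))

# The operator requires both operands to be sets
operator_failed = False
try:
    x1 | other
except TypeError as err:
    operator_failed = True
    print('TypeError:', err)

# Converting explicitly makes the operator work
converted = x1 | set(other)
print(converted == merged)   # True
```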


### Copy, Shallow Copy, & Deep Copy

**Copy**: the `=` operator does not copy anything. It binds a new name to the same object, so a change made through either name is visible through both.

In [11]:
old_list = [[1, 2, 3], [4, 5, 6], [7, 8, 'a']]
new_list = old_list

new_list[2][2] = 9

print('Old List:', old_list)
print('ID of Old List:', id(old_list))

print('New List:', new_list)
print('ID of New List:', id(new_list))

Old List: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ID of Old List: 4559624192
New List: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ID of New List: 4559624192
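The matching IDs above can also be checked with the `is` operator. The sketch below uses throwaway names `a` and `b` and additionally shows that rebinding a name (as opposed to mutating the object) does not affect the other name.

```python
a = [1, 2, 3]
b = a                  # no copy: both names are bound to the same list

same_object = b is a   # identity check
print(same_object)     # True

b.append(4)            # mutation is visible through either name
print(a)               # [1, 2, 3, 4]

b = [0]                # rebinding b points it at a different object
print(a)               # [1, 2, 3, 4]
```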


**Shallow Copy**: the `copy.copy()` function creates a new outer object but fills it with references to the same nested objects, so the copy is only one level deep.

In [16]:
old_list = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
new_list = copy.copy(old_list)

# appending a new nested object
old_list.append([4, 4, 4])

print("Old list:", old_list)
print("New list:", new_list)

# changing a nested object
old_list[1][1] = 'AA'

print("Old list:", old_list)
print("New list:", new_list)

Old list: [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
New list: [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
Old list: [[1, 1, 1], [2, 'AA', 2], [3, 3, 3], [4, 4, 4]]
New list: [[1, 1, 1], [2, 'AA', 2], [3, 3, 3]]
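An identity check makes the behaviour in the output above explicit: the outer list is new, the inner lists are shared. The sketch also shows two other common ways to make a shallow copy of a list; the variable names are illustrative.

```python
import copy

old_list = [[1, 1, 1], [2, 2, 2]]
new_list = copy.copy(old_list)

outer_is_shared = new_list is old_list          # the outer list is a new object
inner_is_shared = new_list[0] is old_list[0]    # the nested lists are not copied
print(outer_is_shared, inner_is_shared)         # False True

# Other shallow-copy spellings for lists
via_method = old_list.copy()
via_slice = old_list[:]
print(via_method[0] is old_list[0], via_slice[0] is old_list[0])   # True True
```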



**Deep Copy**: the `copy.deepcopy()` function copies the object and, recursively, every object nested inside it.

In [18]:
old_list = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
new_list = copy.deepcopy(old_list)

# changing a nested object
old_list[1][0] = 'BB'

print("Old list:", old_list)
print("New list:", new_list)

Old list: [[1, 1, 1], ['BB', 2, 2], [3, 3, 3]]
New list: [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
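The same identity checks show that a deep copy is equal to the original but shares no nested lists with it. The second half is a side note under the same assumptions: if one object appears twice inside the original, `deepcopy` copies it once and keeps that sharing inside the copy.

```python
import copy

old_list = [[1, 1, 1], [2, 2, 2]]
deep = copy.deepcopy(old_list)

print(deep == old_list)          # True: same contents
print(deep is old_list)          # False: different outer list
print(deep[0] is old_list[0])    # False: nested lists are copied too

# Shared structure inside the original is preserved inside the copy
row = [0, 0]
grid = [row, row]                # the same list appears twice
grid_copy = copy.deepcopy(grid)
print(grid_copy[0] is grid_copy[1])   # True
print(grid_copy[0] is row)            # False
```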


### `collections` Module

#### `namedtuple`

In [20]:
fruit = collections.namedtuple("fruit", "number variety colour")

In [21]:
guava = fruit(
    number=2,
    variety="Honey Crisp",
    colour="green",
)

apple = fruit(
    number=5,
    variety="Granny Smith",
    colour="red",
)

In [24]:
# a `namedtuple` lets you access values by field name, unlike Python's built-in `tuple` data type
print(guava.colour)
print(apple.variety)

green
Granny Smith


Note that a `namedtuple` is also a memory-efficient way to define a simple immutable `class` in Python: instances are stored as plain tuples, with no per-instance `__dict__`.
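A short sketch of what "immutable" means in practice for a `namedtuple`, plus a few of its helper methods. The `Fruit` type mirrors the field layout of `fruit` above; the `pear` values are made up for illustration.

```python
import collections

# Same field layout as the `fruit` example; values are illustrative
Fruit = collections.namedtuple("Fruit", ["number", "variety", "colour"])
pear = Fruit(number=3, variety="Bartlett", colour="yellow")

# Still a tuple: supports indexing and unpacking
number, variety, colour = pear
print(pear[1], variety)

# Fields cannot be reassigned
assignment_failed = False
try:
    pear.number = 10
except AttributeError as err:
    assignment_failed = True
    print("AttributeError:", err)

# _replace() returns a new instance and leaves the original untouched
more_pears = pear._replace(number=10)
print(pear.number, more_pears.number)   # 3 10

# Introspection helpers
print(pear._fields)
print(pear._asdict())
```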

#### `Counter`

`Counter` is a `dict` subclass used to count hashable objects. The elements are stored as dictionary ***keys*** and their counts are stored as the corresponding ***values***.

In [38]:
c = collections.Counter('abcabcabcabcd')
print(c)

print(sorted(c.elements()))
print(c.most_common(5))

Counter({'a': 4, 'b': 4, 'c': 4, 'd': 1})
['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'c', 'c', 'd']
[('a', 4), ('b', 4), ('c', 4), ('d', 1)]


In [39]:
numbers = [5, 6, 7, 1, 3, 9, 9, 1, 2, 5, 5, 7, 7]
c = collections.Counter(numbers)
print(c)

print(sorted(c.elements()))
print(c.most_common(5))

Counter({5: 3, 7: 3, 1: 2, 9: 2, 6: 1, 3: 1, 2: 1})
[1, 1, 2, 3, 5, 5, 5, 6, 7, 7, 7, 9, 9]
[(5, 3), (7, 3), (1, 2), (9, 2), (6, 1)]


In [40]:
s = 'the lazy dog jumped over another lazy dog'
words = s.split()
c = collections.Counter(words)
print(c)

print(sorted(c.elements()))
print(c.most_common(5))

Counter({'lazy': 2, 'dog': 2, 'the': 1, 'jumped': 1, 'over': 1, 'another': 1})
['another', 'dog', 'dog', 'jumped', 'lazy', 'lazy', 'over', 'the']
[('lazy', 2), ('dog', 2), ('the', 1), ('jumped', 1), ('over', 1)]
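Beyond `elements()` and `most_common()`, a `Counter` differs from a plain `dict` in a few ways that are worth seeing once. The sketch below reuses the first example string; the `a` and `b` counters are illustrative.

```python
import collections

c = collections.Counter('abcabcabcabcd')

# Missing keys report a count of zero instead of raising KeyError
print(c['z'])        # 0
print('z' in c)      # False: the lookup did not add the key

# update() adds counts from another iterable
c.update('dd')
print(c['d'])        # 3

# Counters support arithmetic; results keep only positive counts
a = collections.Counter(x=3, y=1)
b = collections.Counter(x=1, y=2)
total = a + b
diff = a - b
print(total)
print(diff)
```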
