# Lesson 2: Class Warfare

> Disclaimer: Most of these points should only be applied to Python.

- Overutilising or underutilising classes can lead to ruin
- Classes can be a powerful tool or an endless garden path

## Benefits vs Drawbacks

### Benefits

- Can keep track of state
  - No need to pass parameters back and forth
  - No thread-unsafe global variables
  - Can logically initialise state and then use it
- Can organise a hierarcy of states that belong together
- Provide dot-methods for accessing properties
  - "ask, don't tell"

- We have a collection of files
- Must get some attributes from each, and add those into a shared collection

## Functions vs Classes

### Functions vs Methods

In python

- a **function** takes parameters, returns a value
- a **method** can be called on an object, and can access state in the object

## Baby steps

In [1]:
CONFIG = {
    'thing': 'a',
    'identifiier': 'b',
    'name': 'c'
}

_This isn't very safe if something goes wrong_

In [2]:
CONFIG['identifier']

KeyError: 'identifier'

`namedtuple` == Quick 'n' dirty class!

Used when you just need to
- make sure that the correct keys/values are present
- access something a few times (safely) via a dot method rather than a dict key lookup

In [3]:
from collections import namedtuple

Config = namedtuple('config', ['thing', 'identifier', 'name'])

CONFIG = Config('a', 'b', 'c')

print(CONFIG)
print(CONFIG.thing, CONFIG.identifier)

config(thing='a', identifier='b', name='c')
a b


Now let's try the failing example again

In [4]:
Config(**{
    'thing': 'a',
    'identifiier': 'b',
    'name': 'c'
})

TypeError: <lambda>() got an unexpected keyword argument 'identifiier'

Much better! This is useful when loading a JSON config, and you need to make sure all the key are present

In [5]:
import json
raw = '{"identifier": 123, "name": "me", "thing": 123}'

Config(**json.loads(raw))

config(thing=123, identifier=123, name='me')

In [6]:
raw = '{"identifier": 123, "name": "me", "thing": 123, "extra": 1}'

Config(**json.loads(raw))

TypeError: <lambda>() got an unexpected keyword argument 'extra'

---

## An Example

- You have a collection of items, in this case ids and emails
- Need to iterate through them, collect some values, and pass them on

In [37]:
from faker import Faker
from utils import ppj
from itertools import islice

fake = Faker()

def fake_record(i, spanner=False):
    if spanner and i % 5 == 0:
        return (i, fake.uuid4(), fake.email(), None)
    else:
        return (i, fake.uuid4(), fake.email(), fake.pyint())

def iterate(n=10, spanner=False):
    '''
    This method will yield a tuple of each item, and a boolean indicating 
    if there are more items.
    After all items are consumed, this method will yield None/False
    (This means that we can't just do "for item in iterate(collection)")
    '''
    for i in range(n-1):
        yield fake_record(i, spanner), True
    yield fake_record(i, spanner), False
    while True:
        yield None, False

In [39]:
collection = iterate()
        
sample = list(islice(collection, 2))
ppj(json.dumps(sample, indent=2))

[
  [
    [
      [38;2;102;102;102m0[39m,
      [38;2;186;33;33m"a0af3f82-089f-437d-bf08-2b5994ab0824"[39m,
      [38;2;186;33;33m"smithelijah@gmail.com"[39m,
      [38;2;102;102;102m6957[39m
    ],
    [38;2;0;128;0;01mtrue[39;00m
  ],
  [
    [
      [38;2;102;102;102m1[39m,
      [38;2;186;33;33m"35a3006f-d53d-450d-a5c2-8766c183850b"[39m,
      [38;2;186;33;33m"heatherjones@yahoo.com"[39m,
      [38;2;102;102;102m4612[39m
    ],
    [38;2;0;128;0;01mtrue[39;00m
  ]
]



In [40]:
collection = iterate()
for i, (el, has_next) in enumerate(collection):
    print(i, el, has_next)
    if i >= 12:
        break

0 (0, 'a9dedfac-4fb7-4125-bfc4-a32fffb0da7e', 'jacobgutierrez@spears.com', 2274) True
1 (1, 'fbb9d54a-5e0e-45dd-8cc3-a8cf55460520', 'mgarcia@roach-rocha.com', 395) True
2 (2, '8343bd45-6e43-475d-81e7-3fb6cd1e9040', 'jessica31@gill.com', 6241) True
3 (3, 'a5f76e6d-7acc-49c0-8f4d-45e89e9b2fa8', 'hannah50@hotmail.com', 2526) True
4 (4, 'd8ac05ce-e460-4d4b-b1b7-a267fcf35ff1', 'dustinwilson@gmail.com', 235) True
5 (5, 'bc75006f-31fa-4adb-9820-6ef09f1472b0', 'judithwatts@bennett-shaw.com', 8462) True
6 (6, '5eebd20c-31a2-46b7-818c-a99bf6693b41', 'hodgechristopher@gmail.com', 9908) True
7 (7, '957b23da-60bc-468a-b209-54c324be4eb2', 'jessejarvis@hall.com', 1646) True
8 (8, '49b4f394-3a2d-4071-9926-3642a52d636f', 'reedkevin@adams.net', 3694) True
9 (8, '69d37719-7e1c-4551-92f1-1020c3d084df', 'zward@jones-morris.com', 4370) False
10 None False
11 None False
12 None False


---

_For our exercise, we are only going to collect the ID if the last column has the value "true"_

In [41]:
collection = iterate()

ids = []
for el, has_next in collection:
    print(el, has_next)
    if el[3] % 2:
        ids.append(el[1])
    if not has_next:
        break

(0, '1cb201f8-5def-4f4b-8979-1e3b3ff8a310', 'jessepayne@gmail.com', 7111) True
(1, 'd0d06752-8122-4620-bf45-7493b2495dcb', 'lblevins@gmail.com', 9351) True
(2, '71155947-f977-43e1-9309-8cb1de08a1df', 'omccarthy@robinson.info', 2057) True
(3, 'f2828bb1-f61f-4b57-bd57-7e48f6904af2', 'victor45@gmail.com', 2796) True
(4, 'b1c28e1d-2f56-42cc-b465-32b88eed0513', 'igilmore@hotmail.com', 9946) True
(5, '9712d1ff-36ae-42c8-99be-b9ee4596a072', 'michael26@welch-hoffman.net', 7678) True
(6, '76ed314e-94fe-4002-8bb1-dfb596b95a27', 'rstafford@martinez.com', 4402) True
(7, 'f11b0b86-7c1c-4cab-978b-506c2ffa8ebf', 'andrew51@delgado.com', 3332) True
(8, '7206c1fa-3084-4f69-a5bb-28f01923e094', 'payneluis@gmail.com', 6878) True
(8, '60d20e6f-d6d1-4355-a47c-def6a1b5a255', 'qpena@allen-lawson.biz', 1251) False


### Adding more stuff

Let's add some details around how many items we consumed/are up to

In [10]:
def process(collection):
    ids = []
    for i, (el, has_next) in enumerate(collection):
        print(i, el, has_next)
        if el[3] % 2:
            ids.append(el[1])
        if has_next == False:
            break

## Problems

Let's add a spanner

In [11]:
process(iterate(10, True))

0 (0, '286e921d-6dca-4a10-91a6-7f93509fd7ec', 'christopherrobinson@maxwell-wells.com', None) True


TypeError: unsupported operand type(s) for %: 'NoneType' and 'int'

OK, so let's just add an `isinstance` check

In [12]:
def process(collection):
    ids = []
    for i, (el, has_next) in enumerate(collection):
        print(i, el, has_next)
        if isinstance(el, int) and el[3] % 2:
            ids.append(el[1])
        if len(el[2]) > 15:
            print(f'{"-".join(el)}')
        if has_next == False:
            break

In [13]:
process(iterate())

0 (0, '77c6e156-07f7-47b2-af38-ae691341891e', 'schneiderkathy@gmail.com', 3982) True


TypeError: sequence item 0: expected str instance, int found

Now add something that makes sure we can stringify the item

In [14]:
def process(collection):
    ids = []
    for i, (el, has_next) in enumerate(collection):
        print(i, el, has_next)
        if isinstance(el, int) and el[3] % 2:
            ids.append(el[1])
        if len(el[2]) > 20:
            print(f'!! {"-".join(map(str, el))}')
        if has_next == False:
            break
process(iterate())

0 (0, 'ddc68ae6-42a3-4d6a-97f1-e1faaa59d5c7', 'smitheric@hotmail.com', 4502) True
!! 0-ddc68ae6-42a3-4d6a-97f1-e1faaa59d5c7-smitheric@hotmail.com-4502
1 (1, '15964d0c-776e-44bb-843a-fd8a86598c4a', 'potterbradley@gmail.com', 8620) True
!! 1-15964d0c-776e-44bb-843a-fd8a86598c4a-potterbradley@gmail.com-8620
2 (2, 'c87a00fe-f844-4561-9e19-62a3db63f4f0', 'erik57@campbell.net', 2882) True
3 (3, '902fcda8-164e-460f-8c0f-0cad7f778adf', 'boonemary@faulkner.com', 2293) True
!! 3-902fcda8-164e-460f-8c0f-0cad7f778adf-boonemary@faulkner.com-2293
4 (4, '529e5153-a3d1-414f-bac3-6bc9e5e09cba', 'sclarke@gmail.com', 4266) True
5 (5, 'ae953247-e933-4ff4-9563-2b33439fc5ae', 'choinicholas@mendoza.biz', 2616) True
!! 5-ae953247-e933-4ff4-9563-2b33439fc5ae-choinicholas@mendoza.biz-2616
6 (6, 'dfb70b1a-2ec7-4295-b6bb-c7c798f55043', 'jstewart@yahoo.com', 7837) True
7 (7, '894dbaa4-b6be-4ee3-9ff7-c6e82316da8f', 'lindarodriguez@gmail.com', 8875) True
!! 7-894dbaa4-b6be-4ee3-9ff7-c6e82316da8f-lindarodriguez@gmail

It looks confusing now so let's add some comments

In [15]:
def process(collection):
    ids = []
    for i, (el, has_next) in enumerate(collection): # <--- ⚠
        print(i, el, has_next)
        # collect item if it's even
        if isinstance(el, int) and el[3] % 2: # < ----------------⚠
            ids.append(el[1])
        # Warn about large items
        if len(el[2]) > 20: # <-----------------------------------⚠
            print(f'!! {"-".join(map(str, el))}') # <-------------⚠
        if has_next == False:
            break

process(iterate())

0 (0, 'd0638cdc-5451-4818-b0f4-6c7567c4254f', 'njohnson@sanchez.net', 6290) True
1 (1, 'b1a1164c-a4eb-48c4-9556-060dbc32cd27', 'rjohnson@gmail.com', 5172) True
2 (2, '3335de9c-4f44-4fa3-84dd-3f589fc1c944', 'ogalvan@kelly.biz', 7305) True
3 (3, '6edfdbad-b4d8-42b7-a3a6-e7a77dfdcdb5', 'zrussell@williams.net', 4783) True
!! 3-6edfdbad-b4d8-42b7-a3a6-e7a77dfdcdb5-zrussell@williams.net-4783
4 (4, 'cf4c7925-db32-481b-9bf0-c8d5cde52ea7', 'rroberts@carter.com', 9756) True
5 (5, '7e9d0305-56d6-42d6-ad5d-906a4ab1d182', 'tiffany98@carter.info', 8213) True
!! 5-7e9d0305-56d6-42d6-ad5d-906a4ab1d182-tiffany98@carter.info-8213
6 (6, 'f61bcd11-d45c-4693-8beb-5d2d4f519803', 'myerseric@hotmail.com', 7829) True
!! 6-f61bcd11-d45c-4693-8beb-5d2d4f519803-myerseric@hotmail.com-7829
7 (7, '66775230-1377-4a93-a5ab-2ef561a857dd', 'rmorrison@martin-smith.com', 3510) True
!! 7-66775230-1377-4a93-a5ab-2ef561a857dd-rmorrison@martin-smith.com-3510
8 (8, 'b2a6a7b2-4f83-4378-9006-d0aca09809da', 'alexanderraymond@gmai

---

### Let's take a step back

We use values from the raw item without knowing that they're usable

Instead of holding all the logic in this method, what if we could _ask_ each element if it was even?

In [16]:
from dataclasses import asdict, dataclass

@dataclass
class Element:
    numeric_id: int
    uuid: str
    email: str
    score: int
        
def process(collection):
    ids = []
    for i, (el, has_next) in enumerate(collection):
        el = Element(*el)
        print(i, el, has_next)
        # collect item if it's even
        if isinstance(el.score, int) and el.score % 2:
            ids.append(el.uuid)
        # Warn about large items
        if len(el.email) > 20:
            print(f'!! {"-".join(map(str, asdict(el).values()))}')
        if has_next == False:
            break
process(iterate())

0 Element(numeric_id=0, uuid='5031da3d-40f5-464a-9255-3af3d2098521', email='cochoa@hotmail.com', score=175) True
1 Element(numeric_id=1, uuid='4b001e3b-ce75-48c3-9d0d-04c9ba210229', email='anna51@gmail.com', score=5019) True
2 Element(numeric_id=2, uuid='3197f3c8-3a3c-4948-881a-6b597f14ac5e', email='annabarron@gmail.com', score=4862) True
3 Element(numeric_id=3, uuid='625cd61c-315e-48e3-a994-28303fc11cf0', email='franklinjackson@yahoo.com', score=4481) True
!! 3-625cd61c-315e-48e3-a994-28303fc11cf0-franklinjackson@yahoo.com-4481
4 Element(numeric_id=4, uuid='74a61f75-e2f8-4406-a83c-704ca2cb5b5a', email='dmelton@yahoo.com', score=1617) True
5 Element(numeric_id=5, uuid='af825dbd-a521-4ec4-b3f8-594410881a8f', email='nunezkenneth@church-davis.biz', score=9027) True
!! 5-af825dbd-a521-4ec4-b3f8-594410881a8f-nunezkenneth@church-davis.biz-9027
6 Element(numeric_id=6, uuid='0e7c058b-0c6e-47d8-b0ea-58b610ce27fd', email='silvasarah@henderson.com', score=1334) True
!! 6-0e7c058b-0c6e-47d8-b0ea-5

In [17]:
from dataclasses import dataclass

@dataclass
class Element:
    numeric_id: int
    uuid: str
    email: str
    score: int

    def is_even(self) -> bool:
        try:
            return self.score % 2
        except TypeError:
            return False

    def email_len(self, limit=20) -> bool:
        return len(self.email) > limit
    
    def as_row(self, delim='-'):
        return delim.join(map(str, [
            self.numeric_id, self.uuid, self.email, self.score,
        ]))
        
        
def process(collection):
    ids = []
    for i, (el, has_next) in enumerate(collection):
        el = Element(*el)
        print(i, el, has_next)

        if el.is_even:
            ids.append(el.uuid)
        if el.email_len() > 20:
            print(f'!! {el.as_row()}')

        if has_next == False:
            break
process(iterate())

0 Element(numeric_id=0, uuid='b966cc04-9d02-4187-8b9a-7abc57950645', email='clloyd@gmail.com', score=1629) True
1 Element(numeric_id=1, uuid='7427c9fd-fe4f-457d-8c94-fc5e78279d89', email='johnanderson@freeman.com', score=3124) True
2 Element(numeric_id=2, uuid='96edcbb4-0ac5-442b-ab84-c20d8491c71c', email='craig53@hotmail.com', score=9259) True
3 Element(numeric_id=3, uuid='ad90883b-2a34-4e35-8851-b92a360ed73a', email='rhonda63@yahoo.com', score=9320) True
4 Element(numeric_id=4, uuid='d85d4d58-ed83-4141-8871-17f4b07c0f2b', email='jamesandrea@reese.org', score=1586) True
5 Element(numeric_id=5, uuid='d2e11475-3d93-4827-b1c6-54c77a775a1f', email='lowedaniel@gray.biz', score=2359) True
6 Element(numeric_id=6, uuid='97fb1dc9-72c2-4e11-97a2-1b160491dbf7', email='frodriguez@thomas.com', score=7277) True
7 Element(numeric_id=7, uuid='49e47e1f-1fa5-479d-8efe-8c19e6184140', email='denise95@welch-smith.com', score=8398) True
8 Element(numeric_id=8, uuid='cc15a59f-cf3a-4c40-b0b4-85948e859c25', e

## Now the collection itself

The collection iterator needs some work.
We need something that we can use like this:

```python
# Loop exits when no more items
for el in X:
    Element.from_api(el)
```

In [18]:
def paginate(collection):
    for i, (el, has_next) in enumerate(collection):
        yield i, el
        if not has_next:
            return

print('---- old')
for i, el in enumerate(iterate(5)):
    if i > 8:
        break
    print(el)
    
print('---- new')
for el in paginate(iterate(5)):
    print(el)

---- old
((0, '37773e43-8da4-4637-bcb7-4d5af7d6d8cb', 'anna41@hotmail.com', 9707), True)
((1, '76c59f55-f876-4b90-bec9-6df9cdda5491', 'robert83@gmail.com', 2483), True)
((2, 'cbd3cad0-be49-4a8c-a6dd-f493015bb9f2', 'roberta05@gmail.com', 2814), True)
((3, 'cafa9d35-191a-4dca-b52e-fe507d4a3306', 'john73@hotmail.com', 8512), True)
((3, 'f592e946-9af8-4ab9-beb2-0c252527fafe', 'brendan61@young.info', 9866), False)
(None, False)
(None, False)
(None, False)
(None, False)
---- new
(0, (0, 'd372c4e5-169d-4e3c-b549-967f32801ca5', 'derrickortiz@yahoo.com', 1257))
(1, (1, '8fb8f6b4-0c62-4032-a898-be3320d6183a', 'angelahamilton@torres.com', 4038))
(2, (2, '4eb11c0c-04db-442a-824c-facbcfef8b93', 'ibrooks@simmons-reynolds.com', 7695))
(3, (3, '2297adcc-de3b-4156-b10c-88926aa19640', 'veronica92@morgan-fletcher.com', 4328))
(4, (3, '5eca273f-669c-43e7-9ccd-9d2f6c510e3e', 'russellgregory@gallagher.net', 7332))


In [19]:
def process(collection):
    ids = []
    for i, el in paginate(collection): # <-------- ✓
        el = Element(*el)              # <-------- ⚠
        print(i, el)

        if el.is_even:
            ids.append(el.uuid)
        if el.email_len() > 20:
            print(f'!! {el.as_row()}')

process(iterate(5))

0 Element(numeric_id=0, uuid='f116aebf-93e1-4f84-b655-1b1b99933302', email='alicia39@yahoo.com', score=7240)
1 Element(numeric_id=1, uuid='29555322-f3f1-4837-a979-e8f3d7b185c3', email='ashley18@hotmail.com', score=5827)
2 Element(numeric_id=2, uuid='8aa41651-0152-4c43-9aa5-cd7e3fba0f73', email='llewis@livingston.com', score=4157)
3 Element(numeric_id=3, uuid='841e397f-c496-4467-af45-a9f1ef6df03e', email='austinmcneil@simpson.com', score=6539)
4 Element(numeric_id=3, uuid='b2ea3310-67bb-49ed-96a8-b50ebaa0450d', email='erinjimenez@hotmail.com', score=2369)


In [20]:
@dataclass
class Element:
    numeric_id: int
    uuid: str
    email: str
    score: int

    def is_even(self) -> bool:
        try:
            return self.score % 2
        except TypeError:
            return False

    def email_len(self) -> bool:
        return len(self.email)
    
    def as_row(self, delim='-'):
        return delim.join(map(str, [
            self.numeric_id, self.uuid, self.email, self.score,
        ]))
    
    @staticmethod
    def from_api(raw):
        return Element(*raw)

def paginate(collection):
    for i, (el, has_next) in enumerate(collection):
        yield i, el
        if not has_next:
            return
    
def process(collection):
    ids = []
    for i, el in paginate(collection):
        el = Element.from_api(el)
        print(i, el)

        if el.is_even:
            ids.append(el.uuid)
        if el.email_len() > 20:
            print(f'!! {el.as_row()}')

process(iterate(5))

0 Element(numeric_id=0, uuid='69053f4e-0784-4a32-b2a3-674d5aae95ed', email='gibsonmeagan@parrish.com', score=4710)
!! 0-69053f4e-0784-4a32-b2a3-674d5aae95ed-gibsonmeagan@parrish.com-4710
1 Element(numeric_id=1, uuid='f1762314-551f-4947-9ca3-78316de85773', email='jason41@gmail.com', score=9549)
2 Element(numeric_id=2, uuid='7e5d5cde-b743-40b5-9923-85af032ceb7d', email='wgarrett@tanner.org', score=9584)
3 Element(numeric_id=3, uuid='f6a83638-1155-4942-bd73-97d1f36465b4', email='paul20@hotmail.com', score=1930)
4 Element(numeric_id=3, uuid='539b7b00-8e79-40fe-9df8-bfcc0a4af24a', email='hsimon@hotmail.com', score=4449)


What if we want to send all the even and odd records to different places?
Or, collect all the emails from both categories?

In [21]:
def process(collection, debug=False):
    even_ids = []
    odd_ids = []
    for i, el in paginate(collection):
        el = Element.from_api(el)
        print(i, el)

        if el.is_even():
            even_ids.append(el.uuid)
        else:
            odd_ids.append(el.uuid)
        if el.email_len() > 20 and debug:
            print(f'!! {el.as_row()}')
    return even_ids, odd_ids

even, odd = process(iterate(8))
print('\n', 'even:', len(even), 'odd:', len(odd))
print(even)

0 Element(numeric_id=0, uuid='41fd6194-0ba9-43ad-986d-b0413dbb0096', email='ffrancis@buckley-alvarez.net', score=6558)
1 Element(numeric_id=1, uuid='67385abf-e1ea-4033-b007-5fbac39e3882', email='michael02@yahoo.com', score=9177)
2 Element(numeric_id=2, uuid='8d8b126b-1858-48bb-ac23-8e175954ab9d', email='gary58@rice.com', score=6762)
3 Element(numeric_id=3, uuid='84f81dad-b4ab-44bb-9d65-e288fcf9f335', email='gamblejason@mcpherson.info', score=9362)
4 Element(numeric_id=4, uuid='5edcf716-d01e-46f9-a461-4fd2c0b8af51', email='christopherrose@hall-montgomery.com', score=3072)
5 Element(numeric_id=5, uuid='f20e0fc7-c763-414e-bcd3-e0216b709375', email='choimelissa@hotmail.com', score=2945)
6 Element(numeric_id=6, uuid='0cc45990-9b7d-40ae-a448-5717397587dc', email='webercarol@yahoo.com', score=7343)
7 Element(numeric_id=6, uuid='60e8b6e2-9c97-448d-b7c2-20468fde1a06', email='jordanjeffrey@hotmail.com', score=9857)

 even: 4 odd: 4
['67385abf-e1ea-4033-b007-5fbac39e3882', 'f20e0fc7-c763-414e-bcd

What if we want to collect the emails of odd/even people instead? or something else in the future?

Step 1: just return the entire objects, don't grab values from them

In [22]:
def process(collection, debug=False):
    even = []
    odd = []
    for i, el in paginate(collection):
        el = Element.from_api(el)
        print(i, el)

        if el.is_even():
            even.append(el)
        else:
            odd.append(el)
        if el.email_len() > 20 and debug:
            print(f'!! {el.as_row()}')
    return even, odd

even, odd = process(iterate(8))
print('\n', 'even:', len(even), 'odd:', len(odd))
print([el.email for el in even])

0 Element(numeric_id=0, uuid='cafcd7d2-367f-493c-a0ce-0875abdd9641', email='psmith@palmer.com', score=3898)
1 Element(numeric_id=1, uuid='7221d8df-844e-431d-a880-5009cb413f03', email='hudsonjay@hotmail.com', score=1353)
2 Element(numeric_id=2, uuid='5e219c0b-6d3d-443a-800f-ba6c74a6169b', email='hernandezdevin@gmail.com', score=1531)
3 Element(numeric_id=3, uuid='f3c51d7f-ca88-49fa-b863-2927f1045133', email='turnerthomas@miller.com', score=9213)
4 Element(numeric_id=4, uuid='5570a2bd-1c01-4d8b-9ba7-534f8928fdab', email='amanda90@newton.com', score=3296)
5 Element(numeric_id=5, uuid='b895166e-61f2-4d70-b8ad-0d6345fe015f', email='ksimpson@webster.com', score=9639)
6 Element(numeric_id=6, uuid='39385694-f51e-4503-8bfa-4fd682bed96e', email='darren16@mitchell.com', score=3675)
7 Element(numeric_id=6, uuid='3ca687e1-3e08-4c25-8a2c-50e8c2e0358b', email='kathleentownsend@gmail.com', score=9431)

 even: 6 odd: 2
['hudsonjay@hotmail.com', 'hernandezdevin@gmail.com', 'turnerthomas@miller.com', 'ks

In [23]:
from collections import Counter
from typing import List
from itertools import filterfalse

def paginate(collection):
    for el, has_next in collection:
        yield el
        if not has_next:
            return

@dataclass
class Collection:
    items: List[Element]

    def from_raw(items):
        return Collection(list(map(Element.from_api, items)))
        
    def emails(self):
        return [el.email for el in self.items]

    def __iter__(self):
        yield from self.items
        
    def odd_records(self):
        return Collection(list(filter(lambda x: x.score % 2, self.items)))
    
    def even_records(self):
        return Collection(list(filterfalse(lambda x: x.score % 2, self.items)))


c = Collection.from_raw(paginate(iterate(8)))
print('all emails\n', c.emails())

print('\nodd records\n', c.odd_records())
print('\neven records\n', c.even_records())

print('\neven emails!\n', c.even_records().emails())

all emails
 ['bvargas@morris-mueller.org', 'derekpope@horn-gomez.com', 'rgonzales@yahoo.com', 'phernandez@cooper.biz', 'loganleblanc@yahoo.com', 'natalie14@gmail.com', 'christie85@mahoney.com', 'michael64@gmail.com']

odd records
 Collection(items=[Element(numeric_id=2, uuid='abe8e4d0-2a69-4b84-a61e-19dae340b084', email='rgonzales@yahoo.com', score=1009), Element(numeric_id=3, uuid='cabd1900-35d4-4eb6-8368-68a9e0d82686', email='phernandez@cooper.biz', score=4993), Element(numeric_id=4, uuid='e854a982-4593-458f-91b9-ea3db3169880', email='loganleblanc@yahoo.com', score=5665), Element(numeric_id=5, uuid='87569e30-93e6-44a0-bdd0-bc00fac8e3dc', email='natalie14@gmail.com', score=5843), Element(numeric_id=6, uuid='51092b04-a3ab-4d3b-9c00-7c17f2427d72', email='michael64@gmail.com', score=2259)])

even records
 Collection(items=[Element(numeric_id=0, uuid='8d27f75a-1812-4ba9-af4f-ac873581a919', email='bvargas@morris-mueller.org', score=1876), Element(numeric_id=1, uuid='58f9a8b3-84f6-4450-a8c9

---

### Filtering and sorting

If you have a static method (not an instance method), you can filter with that instead of having to use a lambda

In [24]:
@dataclass
class Element:
    numeric_id: int
    uuid: str
    email: str
    score: int

    @staticmethod
    def from_api(raw):
        return Element(*raw)
        
    def is_even(self) -> bool:
        try:
            return self.score % 2
        except TypeError:
            return False
    
    @staticmethod
    def _is_even(element):
        return element._is_even()
    
    @staticmethod
    def _is_false(element):
        return not element._is_even()

    def email_len(self) -> bool:
        return len(self.email)
    
    def as_row(self, delim='-'):
        return delim.join(map(str, [
            self.numeric_id, self.uuid, self.email, self.score,
        ]))

@dataclass
class Collection:
    items: List[Element]

    def from_raw(items):
        return Collection(list(map(Element.from_api, items)))
        
    @property
    def emails(self):
        return [el.email for el in self.items]

    def __iter__(self):
        yield from self.items
        
    def filter_records(self, pred):
        return Collection(list(filter(pred, self.items)))


c = Collection.from_raw(paginate(iterate()))
print('total items', len(c.items))

even = c.filter_records(Element.is_even)
print('\neven items\n', len(even.emails), even.emails)

odd = c.filter_records(Element.is_even)
print('\nodd items\n', len(even.emails), even.emails)

total items 10

even items
 4 ['yfitzpatrick@gmail.com', 'donaldsonjillian@davila.com', 'jonescatherine@davis.net', 'knoxthomas@gmail.com']

odd items
 4 ['yfitzpatrick@gmail.com', 'donaldsonjillian@davila.com', 'jonescatherine@davis.net', 'knoxthomas@gmail.com']


In [25]:
c.filter_records(lambda x: x.email.startswith('a'))

Collection(items=[])

In [26]:
c.filter_records(lambda x: '@' in x.email)

Collection(items=[Element(numeric_id=0, uuid='2d613f5b-43ab-4ecf-b9ee-51dadd3166b5', email='irivera@yahoo.com', score=5408), Element(numeric_id=1, uuid='8599f467-480d-4dd3-a257-96a76a10c5ee', email='yfitzpatrick@gmail.com', score=5779), Element(numeric_id=2, uuid='3aedc8b3-e4ca-4d92-9649-4cc464ab05ff', email='fosterphilip@gmail.com', score=5320), Element(numeric_id=3, uuid='fcad899c-26b1-4c13-a0aa-05923b66b393', email='donaldsonjillian@davila.com', score=7273), Element(numeric_id=4, uuid='633c519e-3b35-4077-b2cf-3ac74abb963f', email='robert02@hotmail.com', score=4468), Element(numeric_id=5, uuid='3d1aec3d-7159-4ee0-a15c-37f68aefaffb', email='jonescatherine@davis.net', score=5269), Element(numeric_id=6, uuid='89c3dcf9-6f49-427a-b68a-8d9be96d3692', email='xortega@singleton.com', score=6576), Element(numeric_id=7, uuid='b2d4f7e7-868a-450a-9af6-344410c99cc1', email='bobbylewis@hotmail.com', score=3588), Element(numeric_id=8, uuid='e1b88691-9e66-4baa-93e8-469ce592028e', email='knoxthomas@gm

In [27]:
@dataclass
class Element:
    numeric_id: int
    uuid: str
    email: str
    score: int

    @staticmethod
    def from_api(raw):
        return Element(*raw)
        
    def is_even(self) -> bool:
        try:
            return self.score % 2
        except TypeError:
            return False
    
    @staticmethod
    def _is_even(element):
        return element._is_even()
    
    @staticmethod
    def _is_false(element):
        return not element._is_even()

    def email_len(self) -> bool:
        return len(self.email)
    
    def as_row(self, delim='-'):
        return delim.join(map(str, [
            self.numeric_id, self.uuid, self.email, self.score,
        ]))

@dataclass
class Collection:
    items: List[Element]

    def from_raw(items):
        return Collection(list(map(Element.from_api, items)))

    @property
    def emails(self):
        return [el.email for el in self.items]

    def __iter__(self):
        yield from self.items

    def filter_records(self, pred):
        return Collection(list(filter(pred, self.items)))

    
c = Collection.from_raw(paginate(iterate()))
print('total items', len(c.items))

even = c.filter_records(Element.is_even)
print('\neven items\n', len(even.emails), even.emails)

odd = c.filter_records(Element.is_even)
print('\nodd items\n', len(even.emails), even.emails)

total items 10

even items
 3 ['john91@hotmail.com', 'uarellano@jordan.com', 'anthony71@smith.com']

odd items
 3 ['john91@hotmail.com', 'uarellano@jordan.com', 'anthony71@smith.com']


## Representation

Dunder methods!

Let's make a few options:

- All objects as JSON
- All objects as rows/lists

In [28]:
@dataclass
class Element:
    numeric_id: int
    uuid: str
    email: str
    score: int

    @staticmethod
    def from_api(raw):
        return Element(*raw)
        
    def is_even(self) -> bool:
        try:
            return self.score % 2
        except TypeError:
            return False
    
    @staticmethod
    def _is_even(element):
        return element._is_even()
    
    @staticmethod
    def _is_false(element):
        return not element._is_even()

    def email_len(self) -> bool:
        return len(self.email)
    
    def as_row(self, delim='-'):
        return delim.join(map(str, [
            self.numeric_id, self.uuid, self.email, self.score,
        ]))
    
    def as_json(self):
        return json.dumps(self.__dict__)

@dataclass
class Collection:
    items: List[Element]

    def from_raw(items):
        return Collection(list(map(Element.from_api, items)))
        
    @property
    def emails(self):
        return [el.email for el in self.items]

    def __iter__(self):
        yield from self.items
        
    def filter_records(self, pred):
        return Collection(list(filter(pred, self.items)))
    
    def as_json(self, **kwargs):
        return json.dumps(
            [asdict(el) for el in self.items],
            **kwargs
        )

    
c = Collection.from_raw(paginate(iterate()))
print('total items', len(c.items))

even = c.filter_records(Element.is_even)
print('\neven items\n', len(even.emails), even.emails)

odd = c.filter_records(Element.is_even)
print('\nodd items\n', len(even.emails), even.emails)

total items 10

even items
 7 ['jacob47@gmail.com', 'stoneemily@hotmail.com', 'harrisbrianna@yahoo.com', 'whitedaniel@yahoo.com', 'kelly76@spence-fletcher.com', 'rlandry@hotmail.com', 'andrea28@hotmail.com']

odd items
 7 ['jacob47@gmail.com', 'stoneemily@hotmail.com', 'harrisbrianna@yahoo.com', 'whitedaniel@yahoo.com', 'kelly76@spence-fletcher.com', 'rlandry@hotmail.com', 'andrea28@hotmail.com']


In [29]:
c = Collection.from_raw(paginate(iterate()))

ppj(c.items[0].as_json())
ppj(c.as_json())

{[38;2;0;128;0;01m"numeric_id"[39;00m: [38;2;102;102;102m0[39m, [38;2;0;128;0;01m"uuid"[39;00m: [38;2;186;33;33m"2d23a66c-0837-49c2-81a8-cf64236f3f47"[39m, [38;2;0;128;0;01m"email"[39;00m: [38;2;186;33;33m"ksanchez@hotmail.com"[39m, [38;2;0;128;0;01m"score"[39;00m: [38;2;102;102;102m2082[39m}

[{[38;2;0;128;0;01m"numeric_id"[39;00m: [38;2;102;102;102m0[39m, [38;2;0;128;0;01m"uuid"[39;00m: [38;2;186;33;33m"2d23a66c-0837-49c2-81a8-cf64236f3f47"[39m, [38;2;0;128;0;01m"email"[39;00m: [38;2;186;33;33m"ksanchez@hotmail.com"[39m, [38;2;0;128;0;01m"score"[39;00m: [38;2;102;102;102m2082[39m}, {[38;2;0;128;0;01m"numeric_id"[39;00m: [38;2;102;102;102m1[39m, [38;2;0;128;0;01m"uuid"[39;00m: [38;2;186;33;33m"492620c2-455e-4ae5-b874-2a65ce3976ad"[39m, [38;2;0;128;0;01m"email"[39;00m: [38;2;186;33;33m"qbenson@gmail.com"[39m, [38;2;0;128;0;01m"score"[39;00m: [38;2;102;102;102m8032[39m}, {[38;2;0;128;0;01m"numeric_id"[39;00m: [38;2;102;102;102m2[39m, 

In [30]:
ppj(c[:2].as_json(indent=2))

TypeError: 'Collection' object is not subscriptable

In [31]:
What if we just want to print the first few items?

Object `items` not found.


In [None]:
What if we just want to print the first few items

In [32]:
ppj(c.items[:2].as_json(indent=2))

AttributeError: 'list' object has no attribute 'as_json'

In [33]:
ppj(Collection(c.items[:2]).as_json(indent=2))

[
  {
    [38;2;0;128;0;01m"numeric_id"[39;00m: [38;2;102;102;102m0[39m,
    [38;2;0;128;0;01m"uuid"[39;00m: [38;2;186;33;33m"2d23a66c-0837-49c2-81a8-cf64236f3f47"[39m,
    [38;2;0;128;0;01m"email"[39;00m: [38;2;186;33;33m"ksanchez@hotmail.com"[39m,
    [38;2;0;128;0;01m"score"[39;00m: [38;2;102;102;102m2082[39m
  },
  {
    [38;2;0;128;0;01m"numeric_id"[39;00m: [38;2;102;102;102m1[39m,
    [38;2;0;128;0;01m"uuid"[39;00m: [38;2;186;33;33m"492620c2-455e-4ae5-b874-2a65ce3976ad"[39m,
    [38;2;0;128;0;01m"email"[39;00m: [38;2;186;33;33m"qbenson@gmail.com"[39m,
    [38;2;0;128;0;01m"score"[39;00m: [38;2;102;102;102m8032[39m
  }
]



In [34]:
@dataclass
class Collection:
    items: List[Element]

    def from_raw(items):
        return Collection(list(map(Element.from_api, items)))

    def __getitem__(self, i):
        return Collection(self.items[i])
    
    @property
    def emails(self):
        return [el.email for el in self.items]

    def __iter__(self):
        yield from self.items
        
    def filter_records(self, pred):
        return Collection(list(filter(pred, self.items)))
    
    def as_json(self, **kwargs):
        return json.dumps(
            [asdict(el) for el in self.items],
            **kwargs
        )

In [35]:
c = Collection.from_raw(paginate(iterate()))

ppj(c[1:2].as_json(indent=2))
ppj(c[1:3].as_json(indent=2))

[
  {
    [38;2;0;128;0;01m"numeric_id"[39;00m: [38;2;102;102;102m1[39m,
    [38;2;0;128;0;01m"uuid"[39;00m: [38;2;186;33;33m"dfe7fdbd-9f3d-444d-a856-7552fab7acdf"[39m,
    [38;2;0;128;0;01m"email"[39;00m: [38;2;186;33;33m"moorejoshua@martin.com"[39m,
    [38;2;0;128;0;01m"score"[39;00m: [38;2;102;102;102m8345[39m
  }
]

[
  {
    [38;2;0;128;0;01m"numeric_id"[39;00m: [38;2;102;102;102m1[39m,
    [38;2;0;128;0;01m"uuid"[39;00m: [38;2;186;33;33m"dfe7fdbd-9f3d-444d-a856-7552fab7acdf"[39m,
    [38;2;0;128;0;01m"email"[39;00m: [38;2;186;33;33m"moorejoshua@martin.com"[39m,
    [38;2;0;128;0;01m"score"[39;00m: [38;2;102;102;102m8345[39m
  },
  {
    [38;2;0;128;0;01m"numeric_id"[39;00m: [38;2;102;102;102m2[39m,
    [38;2;0;128;0;01m"uuid"[39;00m: [38;2;186;33;33m"64a35f6e-1561-430a-9c5a-fe89739e06d6"[39m,
    [38;2;0;128;0;01m"email"[39;00m: [38;2;186;33;33m"george24@hotmail.com"[39m,
    [38;2;0;128;0;01m"score"[39;00m: [38;2;102;102;102m1768[

### Sorting!

You can sort easily if you already have handy methods available for getting the values to sort by

In [36]:
list(sorted(c, key=lambda x: x.score))

[Element(numeric_id=3, uuid='23b2b93e-24a5-49db-8c7f-76db20aedc55', email='floresbrandon@hotmail.com', score=1396),
 Element(numeric_id=2, uuid='64a35f6e-1561-430a-9c5a-fe89739e06d6', email='george24@hotmail.com', score=1768),
 Element(numeric_id=5, uuid='5431ad4c-6feb-45a9-ba71-4b5e42ecec8c', email='travis87@blake-brown.biz', score=2949),
 Element(numeric_id=8, uuid='c0edd942-2078-48ee-8c6a-93246a0e340e', email='carolburns@hotmail.com', score=3815),
 Element(numeric_id=0, uuid='89d6a291-e484-4149-838b-ebbcb7a87f5f', email='cherylellis@giles.com', score=5285),
 Element(numeric_id=6, uuid='79378ef4-49ea-4601-b6b2-59b062a81ecf', email='rbrown@hotmail.com', score=6605),
 Element(numeric_id=8, uuid='e17478a5-c64e-4cc0-a77d-3199d1f5579b', email='brenda03@casey-sharp.com', score=7416),
 Element(numeric_id=1, uuid='dfe7fdbd-9f3d-444d-a856-7552fab7acdf', email='moorejoshua@martin.com', score=8345),
 Element(numeric_id=7, uuid='7a2217a7-5864-4307-9f5a-9720d1a5a0b2', email='wrightjennifer@young.