# Use more iterators and fewer lists for loops

Whether you use a list or a tuple, Python has to store a reference to every value in memory, so both take up more memory as they grow. (Note that `getsizeof` reports the size of the container itself, not the objects it references.)

In [13]:
from sys import getsizeof

little_tuple = tuple(range(10))
getsizeof(little_tuple)

120

In [14]:
big_tuple = tuple(range(10000))
getsizeof(big_tuple)

80040

Lists and tuples are iterables because you can iterate over their values. An iterator is a special type of iterable that evaluates its values lazily, only when they are needed. It doesn't hold references to all of its values in memory, so its size stays the same no matter how many values it produces. For example, `enumerate` produces an iterator. ArcPy cursors also produce iterators.


In [15]:
little_tuple_iterator = enumerate(little_tuple)
getsizeof(little_tuple_iterator)

72

In [16]:
big_tuple_iterator = enumerate(big_tuple)
getsizeof(big_tuple_iterator)

72
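To make "lazily" concrete, here is a minimal sketch that pulls values out of an `enumerate` iterator one at a time with `next`. The variable names (`pairs`, `first`, `second`, `remaining`) are only for illustration.

```python
little_tuple = tuple(range(10))
pairs = enumerate(little_tuple)

# Nothing has been produced yet. Each call to next() computes one value.
first = next(pairs)
second = next(pairs)
print(first)   # (0, 0)
print(second)  # (1, 1)

# Consuming the rest picks up where the iterator left off.
remaining = list(pairs)
print(len(remaining))  # 8

# Once an iterator is exhausted, it has nothing more to give.
print(list(pairs))  # []
```

A `for` loop does the same thing behind the scenes: it calls `next` on the iterator until the values run out.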

An iterator is any object that implements both the `__iter__` and `__next__` methods. You can make your own iterators by creating a custom class that implements those methods (or a class that inherits from `collections.abc.Iterator`). An easier way is to write a generator, which is a special type of iterator. Generators are created by functions that use a `yield` statement to produce values lazily, one at a time (instead of all at once like a list or tuple). For example, you could use a generator to produce an endless sequence of ObjectID numbers.

In [18]:
def make_oids(start):
    while True:
        yield start
        start += 1

oid = make_oids(start=1000)

# Every time you want a new oid, just call next(oid). The size of oid never changes
print(f"At first the size of oid is {getsizeof(oid)} bytes")
print(f"The first value produced by oid is {next(oid)}")

for i in range(10000):
    new_oid = next(oid)

print(f"After 10,000 iterations, the next value produced by oid is {next(oid)}")
print(f"After 10,000 iterations, the size of oid is {getsizeof(oid)} bytes")

At first the size of oid is 184 bytes
The first value produced by oid is 1000
After 10,000 iterations, the next value produced by oid is 11001
After 10,000 iterations, the size of oid is 184 bytes
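For comparison, the class-based approach mentioned above might look something like the sketch below. `OidIterator` is a hypothetical name, not something from ArcPy or the standard library. Inheriting from `collections.abc.Iterator` supplies `__iter__` automatically, so only `__next__` needs to be written.

```python
from collections.abc import Iterator


class OidIterator(Iterator):
    """Hypothetical class-based equivalent of make_oids."""

    def __init__(self, start):
        self.current = start

    def __next__(self):
        # Return the current value, then advance. This sequence never
        # ends, so StopIteration is never raised.
        value = self.current
        self.current += 1
        return value


oid = OidIterator(start=1000)
print(next(oid))  # 1000
print(next(oid))  # 1001
```

The generator version does the same job in four lines, which is why generators are usually the more convenient choice.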


Iterators (including generators) also help you separate concerns. Maybe you need to create many records and assign each of them an ObjectID. You could do that in a loop.

In [10]:
records = []
oid = 1000
for value in range(10):
    record = {
        'OID': oid,
        'value': value,
    }
    oid += 1
    records.append(record)
records

[{'OID': 1000, 'value': 0},
 {'OID': 1001, 'value': 1},
 {'OID': 1002, 'value': 2},
 {'OID': 1003, 'value': 3},
 {'OID': 1004, 'value': 4},
 {'OID': 1005, 'value': 5},
 {'OID': 1006, 'value': 6},
 {'OID': 1007, 'value': 7},
 {'OID': 1008, 'value': 8},
 {'OID': 1009, 'value': 9}]

That works, but the loop is doing two things: making a record and making an OID value. It would be better to separate them. That way, if you later need to change how OID values are generated, you don't have to change the code in the loop. Here the `make_oids` generator defined earlier takes over the job of producing OIDs.

In [11]:
records = []
oid = make_oids(1000)
for value in range(10):
    record = {
        'OID': next(oid),
        'value': value,
    }
    records.append(record)
records

[{'OID': 1000, 'value': 0},
 {'OID': 1001, 'value': 1},
 {'OID': 1002, 'value': 2},
 {'OID': 1003, 'value': 3},
 {'OID': 1004, 'value': 4},
 {'OID': 1005, 'value': 5},
 {'OID': 1006, 'value': 6},
 {'OID': 1007, 'value': 7},
 {'OID': 1008, 'value': 8},
 {'OID': 1009, 'value': 9}]
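One way to see the payoff of that separation is to swap in a different source of OIDs without touching the record-building code. The sketch below assumes a hypothetical helper, `make_records`, and a made-up numbering scheme (even numbers starting at 2000) built with `itertools.count` from the standard library.

```python
from itertools import count


def make_records(values, oids):
    """Pair each value with the next OID from any iterator of OIDs."""
    return [{'OID': next(oids), 'value': value} for value in values]


# Hypothetical alternative scheme: even-numbered OIDs starting at 2000.
even_oids = count(start=2000, step=2)
records = make_records(range(3), even_oids)
print(records)
# [{'OID': 2000, 'value': 0}, {'OID': 2002, 'value': 1}, {'OID': 2004, 'value': 2}]
```

Because `make_records` only ever calls `next`, it works the same way with `make_oids(1000)`, `count(...)`, or any other iterator that produces OIDs.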