**Summary**
1. Iterable: an object that `iter()` can turn into an iterator, e.g. lists, strings, dictionaries, file connections.
2. Iterator: produces the next value with `next()`.
3. `enumerate()`: takes any iterable as an argument and returns an enumerate object with pairs of each element and its index.
4. `zip()`: accepts an arbitrary number of iterables and returns an iterator of tuples.
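5. `pd.read_csv(path, chunksize=1000)`: returns an iterator that yields DataFrames of up to 1000 rows each instead of loading the whole file at once.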
6. List comprehensions: the basic syntax is **[output expression for iterator variable in iterable]**.
7. Generators: a special type of iterator that iterates over a sequence of values without creating the entire sequence in memory at once. Generator functions use the `yield` statement.
8. List comprehensions vs. generators: (1) a list comprehension returns a list; a generator expression returns a generator object. (2) Both can be iterated over.

In [1]:
import pandas as pd

**1. Introduction to iterators**

In [1]:
#Iterable:
#Examples: lists, strings, dictionaries, file connections
#An object with an associated __iter__() method, which iter() calls
#Applying iter() to an iterable creates an iterator
#Iterator:
#Produces next value with next()

In [4]:
#iterating over iterables: next()
word = 'Terry'  #define an iterable
it = iter(word)
next(it)

'T'

In [5]:
next(it)

'e'
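The two `next()` calls above are what a `for` loop does behind the scenes. A minimal sketch of that loop written by hand, assuming the same `word` as above; `letters` is a name introduced here only for illustration:

```python
word = 'Terry'
it = iter(word)

letters = []  # hypothetical name, collects what the manual loop sees
while True:
    try:
        letters.append(next(it))
    except StopIteration:
        # Raised once the iterator has no values left
        break

print(letters)

# Passing a default to next() returns it instead of raising StopIteration
leftover = next(it, 'done')
print(leftover)
```

Once an iterator is exhausted it stays exhausted; call `iter(word)` again to start over.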

In [8]:
#unpacking all values of an iterator at once with *
word = 'Terry'
it = iter(word)
print(*it)

T e r r y


In [10]:
#iterating over dictionaries
pythonistas = {'hugo': 'bowne-anderson', 'francis': 'castro'}
for key, value in pythonistas.items():
    print(key, value)

hugo bowne-anderson
francis castro


In [15]:
#iterating over file connections
file = open('C:/Users/89751/OneDrive/Desktop/test.csv')

it = iter(file)
print(next(it))

System boundary,Functional unit,Impact assessment method,Impact category,PDF Path



In [16]:
print(next(it))

Cradle to grave,79 t of methanol,CML2001,"['Global Warming Potential (GWP)', 'Acidification Potential (AP)', 'Human Toxicity Potential (HTP)']",C:/Users/89751/OneDrive/Desktop/LCA ontology/Ragas_evaluation/Paper/1.pdf
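The cells above leave the file connection open. A minimal sketch of the same iteration inside a `with` block, which closes the file automatically; the temporary file and its contents are made up so the example runs anywhere:

```python
import os
import tempfile

# Write a small throwaway CSV (hypothetical contents) so the example is self-contained
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False) as tmp:
    tmp.write('name,value\nalpha,1\nbeta,2\n')
    tmp_path = tmp.name

with open(tmp_path) as file:
    it = iter(file)
    header = next(it)     # first line, including its trailing newline
    first_row = next(it)  # second line

print(header.strip())
print(first_row.strip())

os.remove(tmp_path)  # clean up the temporary file
```

Each line returned by `next()` keeps its trailing newline, which is why `print(next(it))` above shows an extra blank line after each row.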



**2. Playing with iterators**

In [5]:
#enumerate(): takes any iterable as an argument and returns an enumerate object,
#which yields pairs of (index, element) for the elements of the original iterable.
avengers = ["hawkeye", "iron man", "thor", "quicksilver"]
e = enumerate(avengers)
e_list = list(e)
print(e_list)

[(0, 'hawkeye'), (1, 'iron man'), (2, 'thor'), (3, 'quicksilver')]


In [6]:
#enumerate() and unpack
avengers = ["hawkeye", "iron man", "thor", "quicksilver"]
for index, value in enumerate(avengers):
    print(index, value)

0 hawkeye
1 iron man
2 thor
3 quicksilver


In [7]:
#zip(): accepts an arbitrary number of iterables and returns an iterator of tuples.
avengers = ["hawkeye", "iron man", "thor", "quicksilver"]
names = ["barton", "stark", "odinson", "maximoff"]
z = zip(avengers, names)
z_list = list(z)
print(z_list)

[('hawkeye', 'barton'), ('iron man', 'stark'), ('thor', 'odinson'), ('quicksilver', 'maximoff')]
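The example above zips two lists, but "an arbitrary number of iterables" includes three or more. A short sketch, where `codes` is a hypothetical third list added only for illustration:

```python
avengers = ["hawkeye", "iron man", "thor", "quicksilver"]
names = ["barton", "stark", "odinson", "maximoff"]
codes = [1, 2, 3, 4]  # hypothetical third list, not part of the original data

# Each tuple now has one element from every input iterable
triples = [triple for triple in zip(avengers, names, codes)]
print(triples)
```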


In [8]:
#zip() and unpack
avengers = ["hawkeye", "iron man", "thor", "quicksilver"]
names = ["barton", "stark", "odinson", "maximoff"]
for z1, z2 in zip(avengers, names):
    print(z1, z2)

hawkeye barton
iron man stark
thor odinson
quicksilver maximoff


In [9]:
#print zip with *
avengers = ["hawkeye", "iron man", "thor", "quicksilver"]
names = ["barton", "stark", "odinson", "maximoff"]
z = zip(avengers, names)
print(*z)

('hawkeye', 'barton') ('iron man', 'stark') ('thor', 'odinson') ('quicksilver', 'maximoff')


In [10]:
# Define mutants and powers
mutants = ['Wolverine', 'Cyclops', 'Storm']
powers = ['Regeneration', 'Optic Blast', 'Weather Control']

# Create a zip object from mutants and powers
z1 = zip(mutants, powers)

# 'Unzip' the tuples in z1 by unpacking them into positional arguments using the * operator in a zip() call
result1, result2 = zip(*z1)

# Convert the result tuples to lists for better readability (optional)
result1 = list(result1)
result2 = list(result2)

# Print the results
print(result1)  # Output: ['Wolverine', 'Cyclops', 'Storm']
print(result2)  # Output: ['Regeneration', 'Optic Blast', 'Weather Control']

['Wolverine', 'Cyclops', 'Storm']
['Regeneration', 'Optic Blast', 'Weather Control']


**3. Using iterators to load large files into memory**

In [15]:
#iterating over data
result = []
path = "C:/Users/89751/OneDrive/Desktop/embeddings_train.csv"
for chunk in pd.read_csv(path, chunksize=1000):
    result.append(sum(chunk['embedding_0']))
total = sum(result)
total

-2.38281186831227

In [16]:
#iterating over data: same total, accumulated without an intermediate list (reuses path from the previous cell)
total = 0
for chunk in pd.read_csv(path, chunksize=1000):
    total += sum(chunk['embedding_0'])
total

-2.38281186831227
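What chunked iteration does can be sketched without pandas. The following is a standard-library analogue, not how `pd.read_csv` is implemented: `iter_chunks` is a hypothetical helper, and the in-memory CSV is made-up data standing in for a large file.

```python
import csv
import io
from itertools import islice

def iter_chunks(rows, chunksize):
    """Yield lists of at most `chunksize` rows from any iterable of rows."""
    rows = iter(rows)
    while True:
        chunk = [row for row in islice(rows, chunksize)]
        if not chunk:
            return
        yield chunk

# Made-up in-memory CSV with one column holding the values 0..9
fake_csv = io.StringIO("embedding_0\n" + "\n".join(str(i) for i in range(10)))
reader = csv.DictReader(fake_csv)

total = 0.0
chunk_sizes = []
for chunk in iter_chunks(reader, chunksize=4):
    chunk_sizes.append(len(chunk))
    total += sum(float(row["embedding_0"]) for row in chunk)

print(chunk_sizes, total)
```

Only one chunk is held in memory at a time, which is the property that makes the `chunksize` pattern above suitable for files larger than memory.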

**4. List comprehensions**

**List comprehensions**
1. Collapse for loops for building lists into a single line
2. Components
   * Iterable
   * Iterator variable (represent members of iterable)
   * Output expression
3. The basic syntax for a list comprehension is **[output expression for iterator variable in iterable]**
4. The advanced syntax is **[output expression + conditional on output for iterator variable in iterable + conditional on iterable]**
5. *Note*: when you use list comprehensions, pay attention to readability.
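The advanced form in item 4 can carry both conditionals at once. A small sketch with arbitrary numbers chosen for illustration: the trailing `if` filters the iterable, and the `if ... else` in front picks the output for each element that survives the filter.

```python
# Filter first (keep num >= 3), then choose the output per element:
# square the even numbers, negate the odd ones
result = [num ** 2 if num % 2 == 0 else -num for num in range(8) if num >= 3]
print(result)
```

A comprehension this dense is already near the limit of what reads comfortably on one line, which is the readability concern raised in the note above.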

In [1]:
#populate a list (common way)
nums = [12, 8, 21, 3, 16]
new_nums = []
for num in nums:
    new_nums.append(num + 1)
print(new_nums)

[13, 9, 22, 4, 17]


In [3]:
#a list comprehension
nums = [12, 8, 21, 3, 16]
new_nums = [num + 1 for num in nums]
new_nums

[13, 9, 22, 4, 17]

In [4]:
result = [num for num in range(11)]
result

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

In [6]:
#nested loops (1)
pairs_1 = []
for num1 in range(0, 2):
    for num2 in range(6, 8):
        pairs_1.append((num1, num2))
pairs_1

[(0, 6), (0, 7), (1, 6), (1, 7)]

In [8]:
#nested loops (2)
pairs_2 = [(num1, num2) for num1 in range(0, 2) for num2 in range(6, 8)]
print(pairs_2)

[(0, 6), (0, 7), (1, 6), (1, 7)]


In [9]:
# Create a 5 x 5 matrix using a list of lists: matrix
matrix = [[col for col in range(5)] for row in range(5)]

# Print the matrix
for row in matrix:
    print(row)

[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]


**5. Advanced comprehensions**

In [10]:
#conditionals in comprehensions
[num ** 2 for num in range(10) if num % 2 == 0]

[0, 4, 16, 36, 64]

In [11]:
#conditional on the output expression: odd numbers are replaced by 0 rather than dropped
[num ** 2 if num % 2 == 0 else 0 for num in range(10)]

[0, 0, 4, 0, 16, 0, 36, 0, 64, 0]

In [14]:
#Dict comprehensions
#create dictionaries; Use curly braces {} instead of brackets []
pos_neg = {num: -num for num in range(9)}
pos_neg

{0: 0, 1: -1, 2: -2, 3: -3, 4: -4, 5: -5, 6: -6, 7: -7, 8: -8}
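Dict comprehensions combine naturally with `zip()` from section 2. A brief sketch reusing the `avengers` and `names` lists from earlier; `surname_of` is a name introduced here for illustration:

```python
avengers = ["hawkeye", "iron man", "thor", "quicksilver"]
names = ["barton", "stark", "odinson", "maximoff"]

# Keys come from the first list, values from the second
surname_of = {hero: name for hero, name in zip(avengers, names)}
print(surname_of)
```

When no transformation is needed, `dict(zip(avengers, names))` builds the same dictionary.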

**6. Introduction to generator expressions**

In [16]:
#generator expressions
[2 * num for num in range(10)]

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

In [17]:
(2 * num for num in range(10))

<generator object <genexpr> at 0x000001DE0EA153C0>

In [24]:
#List comprehensions vs. generators
#List comprehension - returns a list
#Generator expression - returns a generator object (values are produced one at a time, so the full sequence is never stored in memory)
#Both can be iterated over
results = (2 * num for num in range(6))
for num in results:
    print(num)

0
2
4
6
8
10


In [25]:
results = (2 * num for num in range(6))
print(list(results))

[0, 2, 4, 6, 8, 10]
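The memory difference between the two forms can be made visible with `sys.getsizeof`. This is a rough sketch: `getsizeof` reports the size of the container object itself, not of the integers it refers to, and the exact byte counts vary by Python version, so none are shown here.

```python
import sys

n = 100_000
as_list = [2 * num for num in range(n)]  # all values built up front
as_gen = (2 * num for num in range(n))   # nothing computed yet

list_size = sys.getsizeof(as_list)  # grows with n
gen_size = sys.getsizeof(as_gen)    # small, independent of n
print(list_size, gen_size)

# Consuming the generator still yields the same values
same_total = sum(as_gen) == sum(as_list)
print(same_total)
```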


In [26]:
#conditionals in generator expressions
even_nums = (num for num in range(10) if num % 2 == 0)
print(list(even_nums))

[0, 2, 4, 6, 8]


In [27]:
#build a generator function (using yield)
def num_sequence(n):
    """Generate values from 0 up to, but not including, n"""
    i = 0
    while i < n:
        yield i
        i += 1

In [30]:
# Create a list of strings
lannister = ['cersei', 'jaime', 'tywin', 'tyrion', 'joffrey']

# Define generator function get_lengths
def get_lengths(input_list):
    """Generator function that yields the
    length of the strings in input_list."""

    # Yield the length of a string
    for person in input_list:
        yield len(person)

# Print the values generated by get_lengths()
for value in get_lengths(lannister):
    print(value)

6
5
5
6
7


In [28]:
result = num_sequence(5)
type(result)

generator

In [29]:
for item in result:
    print(item)

0
1
2
3
4
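Calling a generator function does not run its body; the code runs only as values are requested and pauses at each `yield`. A minimal sketch of that behavior, using a logged variant of `num_sequence`; the `events` list and the `num_sequence_logged` name are hypothetical additions for illustration:

```python
events = []  # hypothetical log, used only to show when the body runs

def num_sequence_logged(n):
    """Like num_sequence above, but records each step in `events`."""
    i = 0
    while i < n:
        events.append(f"yielding {i}")
        yield i
        i += 1

gen = num_sequence_logged(3)
events_before = events.copy()      # still empty: the body has not started
first = next(gen)                  # runs the body up to the first yield
events_after_first = events.copy()
rest = [value for value in gen]    # resumes after the yield and runs to the end

print(events_before)
print(first, events_after_first)
print(rest, events)
```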


**7. Using Python generators for streaming data**

In [4]:
#avoid naming a variable `list`: it would shadow the built-in list()
records = [(91401583.0, 44.5079211390026), (92237118.0, 45.206665319194), (93014890.0, 45.866564696018), (93845749.0, 46.5340927663649), (94722599.0, 47.2087429803526)]

In [14]:
records_new = [int(tup[0] * tup[1] * 0.01) for tup in records]
records_new

[40680944, 41697325, 42662734, 43670267, 44717348]
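The same calculation can be written as a generator expression so that each result is computed only when it is requested, which is the pattern that suits streaming data. A sketch under assumptions: `stream` is a hypothetical stand-in for a source that delivers one row at a time, and the names `amount` and `pct` assume the second tuple element is a percentage, which the `0.01` factor suggests but the data above does not state.

```python
rows = [(91401583.0, 44.5079211390026), (92237118.0, 45.206665319194), (93014890.0, 45.866564696018)]

def stream(source):
    """Hypothetical stand-in for a data source that yields one row at a time."""
    for row in source:
        yield row

# Nothing is computed until a value is requested
scaled = (int(amount * pct * 0.01) for amount, pct in stream(rows))

first = next(scaled)                      # computes only the first row
remaining = [value for value in scaled]   # computes the rest on demand
print(first, remaining)
```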