## Learn-by-doing Python
- In this workbook I shall try to explain some intermediate Python features with simple examples.
- As usual, there will be less theory and more code.

### enumerate()
- `Want a free counter for your iterator? Use enumerate()`
- Often, while looping over an iterable, we realize we need a counter.
- A common workaround is to loop over `range(len(...))` and index into the sequence, which overcomplicates the situation.
- Python's built-in **enumerate()** function is a good fit for such cases.

_Let's see an example_


In [1]:
"""
* Say we have a list of students sorted by marks obtained,
in descending order.
* We want to display the names of the students along with their rank
(as students are arranged from highest to lowest marks, the student at index 0 came 1st,
the student at index 1 came 2nd, and so on)

"""
lst = ["StudA","StudB","StudC","StudD","StudE",]

In [3]:
# Approach 1: via range()
for stud in range(len(lst)):
    print(f"{lst[stud]} has secured position {stud+1} in the class")

StudA has secured position 1 in the class
StudB has secured position 2 in the class
StudC has secured position 3 in the class
StudD has secured position 4 in the class
StudE has secured position 5 in the class


The above code is clumsy for the following reasons:
- I have to get the length of the list
- I have to run range() over the length of the list to get the index
- I have to add 1 to the index and look up lst[index] on every iteration

**enumerate() addresses this situation in a pythonic way**.

In [4]:
for idx, stud in enumerate(lst,start=1):
    print(f"{stud} has secured position {idx} in the class")
    

StudA has secured position 1 in the class
StudB has secured position 2 in the class
StudC has secured position 3 in the class
StudD has secured position 4 in the class
StudE has secured position 5 in the class


**What's happening here?**
- enumerate() accepts any iterable (a list, a string, a generator, and so on)
- It returns an enumerate object, which is a lazy iterator
- On each iteration it yields a tuple holding the running count and the next value from the iterable passed to enumerate()
- The count starts from 0 by default, but `start=<any integer>` sets a different first value
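
The behaviour described above can be approximated with a small generator function. This is only an illustrative sketch: `my_enumerate` is a hypothetical name, and the real `enumerate` is a built-in class, not this function.

```python
def my_enumerate(iterable, start=0):
    """Rough pure-Python approximation of enumerate() (illustrative only)."""
    count = start
    for item in iterable:
        yield count, item   # one (count, item) tuple per step
        count += 1


pairs = my_enumerate(["a", "b", "c"], start=1)
first = next(pairs)   # values are produced lazily, one at a time
rest = list(pairs)    # consume whatever is left

print(first)  # (1, 'a')
print(rest)   # [(2, 'b'), (3, 'c')]
```

Because the pairs are produced one at a time, nothing is copied up front, which is why it works on any iterable and not just on lists.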

In [6]:
# enumerate returns an enumerate object
enumerate(["a","b","c","d"])

<enumerate at 0x1092dc0a0>

In [7]:
# each element is a tuple with the index and the original item value
list(enumerate(["a","b","c","d"]))

[(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd')]

In [8]:
for i, j in enumerate(["a","b","c","d"]):
    print(i,j)

0 a
1 b
2 c
3 d


In [5]:
help(enumerate)

Help on class enumerate in module builtins:

class enumerate(object)
 |  enumerate(iterable, start=0)
 |  
 |  Return an enumerate object.
 |  
 |    iterable
 |      an object supporting iteration
 |  
 |  The enumerate object yields pairs containing a count (from start, which
 |  defaults to zero) and a value yielded by the iterable argument.
 |  
 |  enumerate is useful for obtaining an indexed list:
 |      (0, seq[0]), (1, seq[1]), (2, seq[2]), ...
 |  
 |  Methods defined here:
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __next__(self, /)
 |      Implement next(self).
 |  
 |  __reduce__(...)
 |      Return state information for pickling.
 |  
 |  ----------------------------------------------------------------------
 |  Static methods defined here:
 |  
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.



### list comprehension

Remember the following:
- List comprehension generates a list. 
- Syntax: 2 or 3 parts:
    - expression
    - source of data (iterable)
    - condition (optional)
- Usage: Generates a new list by applying an expression to each element of a given iterable
- Note: Any list comprehension can be rewritten as `for` loops, but not every `for` loop can be rewritten as a list comprehension
- <u>Beginners trying to understand the syntax of list comprehensions can follow the approach below</u>:
    - Create the new list using a **for** loop and **append**
    - Convert that code into a list comprehension (everything stays except the `append`)

_Let's jump into examples_

### Typical syntax for creating a new list from an old list:
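
A minimal sketch of the loop-and-append pattern, assuming a made-up list `nums` (not from the original workbook). It collects the squares of the odd numbers.

```python
nums = [1, 2, 3, 4, 5, 6]   # hypothetical input list

squares = []                        # 1. start with an empty list
for num in nums:                    # 2. loop over the source of data
    if num % 2 != 0:                # 3. optional condition
        squares.append(num ** 2)    # 4. append the expression

print(squares)  # [1, 9, 25]
```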

### Tackling the above via list comprehension
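
A minimal sketch of a list comprehension that builds a new list from an old one: the squares of the odd numbers in a made-up list `nums` (an illustrative name, not from the original workbook). The parts appear in the order expression, source of data, condition, and no `append` call is needed.

```python
nums = [1, 2, 3, 4, 5, 6]   # hypothetical input list

# [expression  for item in iterable  if condition]
squares = [num ** 2 for num in nums if num % 2 != 0]

print(squares)  # [1, 9, 25]
```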

### What about nested loops
- Nothing special here
- **Just remember that the `for` clauses appear in the same order as in the equivalent nested `for` loops**
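
The ordering rule can be checked with a small sketch. The lists `colours` and `sizes` are made-up illustration data. The outer loop is written first in both versions.

```python
colours = ["red", "blue"]   # hypothetical data
sizes = ["S", "M", "L"]

# Nested for loops: outer loop first, inner loop second
pairs_loop = []
for colour in colours:
    for size in sizes:
        pairs_loop.append((colour, size))

# Same loops, in the same left-to-right order, inside a comprehension
pairs_comp = [(colour, size) for colour in colours for size in sizes]

print(pairs_comp)
print(pairs_loop == pairs_comp)  # True
```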

### Examples

In [1]:
# create a list of even numbers between 0 and 9:
newlst = [num for num in range(10) if num % 2 ==0]
print(newlst)

[0, 2, 4, 6, 8]


In [13]:
# For each number in list below (s_list), get the number and its position in mylist as a list of tuples.
mylist = [9, 3, 6, 1, 5, 0, 8, 2, 4, 7]
s_list = [6, 4, 6, 1, 2, 2]
[(num,mylist.index(num)) for num in s_list ]

[(6, 2), (4, 8), (6, 2), (1, 3), (2, 7), (2, 7)]

In [16]:
# For each number in list below (s_list), get the number and its position in mylist as a dict.

{num:mylist.index(num) for num in s_list }

{6: 2, 4: 8, 1: 3, 2: 7}
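
The dict above has four entries although `s_list` has six items, because dict keys are unique and a repeated number overwrites the same key. Separately, `list.index()` scans the list on every call. One possible alternative, sketched here under the assumption that every value in `mylist` is unique, is to build a value-to-position lookup once with `enumerate()`. The name `position` is illustrative.

```python
mylist = [9, 3, 6, 1, 5, 0, 8, 2, 4, 7]
s_list = [6, 4, 6, 1, 2, 2]

# Build the lookup once (assumes every value in mylist is unique)
position = {num: idx for idx, num in enumerate(mylist)}

pairs = [(num, position[num]) for num in s_list]
lookup = {num: position[num] for num in s_list}   # duplicate keys collapse into one

print(pairs)   # [(6, 2), (4, 8), (6, 2), (1, 3), (2, 7), (2, 7)]
print(lookup)  # {6: 2, 4: 8, 1: 3, 2: 7}
```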

In [6]:
mylist = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# Negate the values from 3 to 8 (inclusive); leave the rest unchanged.
[-val if 3 <= val <= 8 else val for val in mylist]

[1, 2, -3, -4, -5, -6, -7, -8, 9, 10]

In [7]:
# In mylist, square the number if it's even; otherwise, cube it.
[ val**2 if val%2==0 else val**3 for val in mylist]


[1, 4, 27, 16, 125, 36, 343, 64, 729, 100]
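
The position of `if` changes its meaning, which a small sketch can make concrete. The list `nums` is made-up illustration data. An `if ... else` placed before the `for` chooses between two values for every item; a bare `if` placed after the `for` filters items out.

```python
nums = [1, 2, 3, 4, 5, 6]   # hypothetical data

# if/else BEFORE the for: a conditional expression, every item is kept
labels = ["even" if n % 2 == 0 else "odd" for n in nums]

# if AFTER the for: a filter, items that fail the test are dropped
evens = [n for n in nums if n % 2 == 0]

print(labels)  # ['odd', 'even', 'odd', 'even', 'odd', 'even']
print(evens)   # [2, 4, 6]
```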

In [9]:
# Flatten the matrix (a list of lists).
sample = [[11,12,13,14], [15,16,17,18], [19,120,21,22], [23,24,25,26]]
[element for lst in sample for element in lst]

[11, 12, 13, 14, 15, 16, 17, 18, 19, 120, 21, 22, 23, 24, 25, 26]

In [10]:
# Keep only the odd numbers from the above
[element for lst in sample for element in lst if element%2 !=0]

[11, 13, 15, 17, 19, 21, 23, 25]

In [54]:
# The goal is to tokenize the following 5 sentences into words, excluding the stop words.
sentences = ["a new world record was set", 
             "in the holy city of ayodhya", 
             "on the eve of diwali on tuesday", 
             "with over three lakh diya or earthen lamps", 
             "lit up simultaneously on the banks of the sarayu river"]

stopwords = ['for', 'a', 'of', 'the', 'and', 'to', 'in', 'on', 'with']

# for loop approach

newlst=[]
for line in sentences:
    tlst=[]
    for word in (line.split()):
        if word not in stopwords:
            tlst.append(word)
    newlst.append(tlst)    

print("Via for loop",newlst)  

# list comprehension approach
print("\nVia List comprehension:",[[word for word in (line.split()) if word not in stopwords] for line in sentences])


Via for loop [['new', 'world', 'record', 'was', 'set'], ['holy', 'city', 'ayodhya'], ['eve', 'diwali', 'tuesday'], ['over', 'three', 'lakh', 'diya', 'or', 'earthen', 'lamps'], ['lit', 'up', 'simultaneously', 'banks', 'sarayu', 'river']]

Via List comprehension: [['new', 'world', 'record', 'was', 'set'], ['holy', 'city', 'ayodhya'], ['eve', 'diwali', 'tuesday'], ['over', 'three', 'lakh', 'diya', 'or', 'earthen', 'lamps'], ['lit', 'up', 'simultaneously', 'banks', 'sarayu', 'river']]


In [None]:
# Make a dictionary mapping each of the 26 English letters to its position in the alphabet.

# Desired output
{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6,
 'g': 7, 'h': 8, 'i': 9, 'j': 10, 'k': 11, 'l': 12,
 'm': 13, 'n': 14, 'o': 15, 'p': 16, 'q': 17, 'r': 18,
 's': 19, 't': 20, 'u': 21, 'v': 22, 'w': 23, 'x': 24,
 'y': 25, 'z': 26}

In [3]:
import string
{ltr:idx for idx, ltr in enumerate(string.ascii_lowercase,start=1)}

{'a': 1,
 'b': 2,
 'c': 3,
 'd': 4,
 'e': 5,
 'f': 6,
 'g': 7,
 'h': 8,
 'i': 9,
 'j': 10,
 'k': 11,
 'l': 12,
 'm': 13,
 'n': 14,
 'o': 15,
 'p': 16,
 'q': 17,
 'r': 18,
 's': 19,
 't': 20,
 'u': 21,
 'v': 22,
 'w': 23,
 'x': 24,
 'y': 25,
 'z': 26}

In [None]:
# Replace each letter in the string 'Lee Quan Yew' with its corresponding number (1 for 'a', 2 for 'b', and so on); keep other characters as they are.
# Desired output:

[12, 5, 5, ' ', 17, 21, 1, 14, ' ', 25, 5, 23]

In [12]:
import string
dct = {ltr:idx for idx, ltr in enumerate(string.ascii_lowercase,start=1)}

print("Option 1:",[dct[element] if element in dct else element for element in "Lee Quan Yew".lower() ])
print("Option 2:",[dct.get(element, element) for element in "Lee Quan Yew".lower()])


Option 1: [12, 5, 5, ' ', 17, 21, 1, 14, ' ', 25, 5, 23]
Option 2: [12, 5, 5, ' ', 17, 21, 1, 14, ' ', 25, 5, 23]


In [25]:
# Get the unique list of words from the following sentences, excluding any stopwords
sentences = ["The Hubble Space Telescope has spotted", 
             "a formation of galaxies that resembles", 
             "a smiling face in the sky"]

stopwords = ['for', 'a', 'of', 'the', 'and', 'to', 'in', 'on', 'with']

# Desired output:
{'face', 'formation', 'galaxies', 'has', 'hubble', 'resembles',
 'sky', 'smiling', 'space', 'spotted', 'telescope', 'that'}

{'face',
 'formation',
 'galaxies',
 'has',
 'hubble',
 'resembles',
 'sky',
 'smiling',
 'space',
 'spotted',
 'telescope',
 'that'}

In [27]:
# A set comprehension removes duplicates directly; sorting first has no effect on a set.
{word.lower() for line in sentences for word in line.split() if word.lower() not in stopwords}

{'face',
 'formation',
 'galaxies',
 'has',
 'hubble',
 'resembles',
 'sky',
 'smiling',
 'space',
 'spotted',
 'telescope',
 'that'}

In [35]:
# Tokenize the following sentences, excluding all stopwords and punctuation.
sentences = ["The Hubble Space telescope has spotted", 
             "a formation of galaxies that resembles", 
             "a smiling face in the sky", 
             "The image taken with the Wide Field Camera", 
             "shows a patch of space filled with galaxies", 
             "of all shapes, colours and sizes"]

stopwords = ['for', 'a', 'of', 'the', 'and', 'to', 'in', 'on', 'with']

import string

# Option 1: Via regular loop
newlst=[]
for line in sentences:
    tlst=[]
    for word in line.split():
        # lowercase the word and strip leading/trailing punctuation
        token = word.lower().strip(string.punctuation)
        if token and token not in stopwords:
            tlst.append(token)
    newlst.append(tlst)

print("Option 1:",newlst)

# Option 2: Via list comprehension (the inner generator cleans each word once)
print("\nOption 2:",[[token for token in (word.lower().strip(string.punctuation) for word in line.split()) if token and token not in stopwords] for line in sentences])

Option 1: [['hubble', 'space', 'telescope', 'has', 'spotted'], ['formation', 'galaxies', 'that', 'resembles'], ['smiling', 'face', 'sky'], ['image', 'taken', 'wide', 'field', 'camera'], ['shows', 'patch', 'space', 'filled', 'galaxies'], ['all', 'shapes', 'colours', 'sizes']]

Option 2: [['hubble', 'space', 'telescope', 'has', 'spotted'], ['formation', 'galaxies', 'that', 'resembles'], ['smiling', 'face', 'sky'], ['image', 'taken', 'wide', 'field', 'camera'], ['shows', 'patch', 'space', 'filled', 'galaxies'], ['all', 'shapes', 'colours', 'sizes']]


In [36]:
# Create a list of (word, id) pairs for all words in the following sentences, where id is the sentence index.

# Input
sentences = ["The Hubble Space telescope has spotted", 
             "a formation of galaxies that resembles", 
             "a smiling face in the sky"]

# Desired output:
[('the', 0), ('hubble', 0), ('space', 0), ('telescope', 0), ('has', 0), ('spotted', 0),
 ('a', 1), ('formation', 1), ('of', 1), ('galaxies', 1), ('that', 1), ('resembles', 1),
 ('a', 2), ('smiling', 2), ('face', 2), ('in', 2), ('the', 2), ('sky', 2)]


In [37]:
[(word.lower(),idx) for idx,line in enumerate(sentences) for word in line.split()]

[('the', 0),
 ('hubble', 0),
 ('space', 0),
 ('telescope', 0),
 ('has', 0),
 ('spotted', 0),
 ('a', 1),
 ('formation', 1),
 ('of', 1),
 ('galaxies', 1),
 ('that', 1),
 ('resembles', 1),
 ('a', 2),
 ('smiling', 2),
 ('face', 2),
 ('in', 2),
 ('the', 2),
 ('sky', 2)]