<a href="https://colab.research.google.com/github/ShaunakSen/problem-solving-with-code/blob/master/DSA_in_Python_Fundamentals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Data Structures and Algorithms - Improving concepts

> Notes, code, and solutions from multiple resources to improve DSA fundamentals

- Data Structures and Algorithms in Python by Michael T. Goodrich: https://www.amazon.in/Structures-Algorithms-Python-Michael-Goodrich/dp/1118290275

- https://realpython.com/introduction-to-python-generators/

---

## Some advanced Python concepts

### Iterators and Generators

An instance of a list is an iterable, but not itself an iterator.
With data = [1, 2, 4, 8], it is not legal to call next(data). However, an iterator object can be produced with the syntax i = iter(data), and then each subsequent call to next(i) will return an element of that list. The for-loop syntax in Python simply automates this process, creating an iterator for the given iterable, and then repeatedly calling for the next element until catching the StopIteration exception.
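
As a rough sketch of what the for loop automates, the steps can be written out by hand. The helper name manual_for is made up for illustration and is not part of the book's code:

```python
def manual_for(iterable):
    """Collect items the way a for loop would, using iter() and next()."""
    collected = []
    iterator = iter(iterable)      # ask the iterable for a fresh iterator
    while True:
        try:
            item = next(iterator)  # request the next element
        except StopIteration:      # raised once the iterator is exhausted
            break
        collected.append(item)
    return collected


data = [1, 2, 4, 8]
print(manual_for(data))  # [1, 2, 4, 8]

try:
    next(data)           # a list is iterable but is not an iterator
except TypeError as err:
    print("TypeError:", err)
```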

More generally, it is possible to create multiple iterators based upon the same
iterable object, with each iterator maintaining its own state of progress. However,
iterators typically maintain their state with indirect reference back to the original
collection of elements. For example, calling iter(data) on a list instance produces
an instance of the list iterator class. That iterator does not store its own copy of the
list of elements. Instead, it maintains a current index into the original list, representing the next element to be reported. Therefore, if the contents of the original list
are modified after the iterator is constructed, but before the iteration is complete,
the iterator will be reporting the updated contents of the list.

Python also supports functions and classes that produce an implicit iterable series of values, that is, without constructing a data structure to store all of its values
at once. For example, the call range(1000000) does not return a list of numbers; it
returns a range object that is iterable. This object generates the million values one
at a time, and only as needed. Such a lazy evaluation technique has a great advantage. In the case of range, it allows a loop of the form for j in range(1000000):
to execute without setting aside memory for storing one million values. Also, if
such a loop were to be interrupted in some fashion, no time will have been spent
computing unused values of the range.
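
A small sketch of these points, using the same example list. The variable names are illustrative only:

```python
data = [1, 2, 4, 8]

first = iter(data)
second = iter(data)            # an independent iterator over the same list

next(first)                    # first moves past index 0
next(first)                    # first moves past index 1
from_second = next(second)     # second still starts from the beginning
print(from_second)             # 1

data[2] = 99                   # modify the list while first is mid-iteration
after_update = next(first)     # the iterator reads from the live list
print(after_update)            # 99

big = range(1000000)           # no million-element list is built here
print(len(big), big[999999])   # 1000000 999999
```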


A generator is implemented with a syntax that
is very similar to a function, but instead of returning values, a yield statement is
executed to indicate each element of the series. As an example, consider the goal
of determining all factors of a positive integer. For example, the number 100 has
factors 1, 2, 4, 5, 10, 20, 25, 50, 100. A traditional function might produce and
return a list containing all factors, implemented as:

In [None]:
def factors(n):
    results = []
    for k in range(1, n+1):
        if n%k == 0:
            results.append(k)

    return results

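In contrast, a generator for computing those factors yields each factor as it is found, so no results list is needed:

In [None]:
def factors(n):
    for k in range(1, n+1):
        if n%k == 0:
            yield k  ## yield this factor as our next result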

In [None]:
next(factors(200))

1

Notice the use of the keyword yield rather than return to indicate a result. This indicates to Python that we are defining a generator, rather than a traditional function.

If a programmer writes a loop such as for factor in factors(100):, an instance of our generator is created. For each iteration of the loop, Python executes our procedure until a yield statement indicates the next value. At that point the procedure is suspended, and it resumes only when another value is requested.

A generator can also contain more than one yield statement. The version below tests candidates only up to the square root of n and yields the paired factor n//k along with each k:
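
To see when the body of a generator actually runs, here is a sketch with print calls added. The name traced_factors is made up for this illustration; the logic is the same as the simple factors generator:

```python
def traced_factors(n):
    """Same logic as the simple factors generator, with prints added."""
    for k in range(1, n + 1):
        if n % k == 0:
            print(f"  about to yield {k}")
            yield k
            print(f"  resumed after {k}")
    print("  body finished")


gen = traced_factors(4)  # nothing is printed yet: the body has not started
print(next(gen))         # runs up to the first yield, then prints 1
print(next(gen))         # resumes, runs to the next yield, then prints 2
print(list(gen))         # drains the rest: [4]
```

Once the body finishes, any further call to next(gen) raises StopIteration, which is the signal a for loop uses to stop.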

In [None]:
def factors(n):
    k=1
    while k*k < n: ## while k < sqrt(n)
        if n%k == 0:
            yield k ## k is a factor
            yield n//k ## so is n/k
        k+=1
    if k*k == n: ##  special case if n is perfect square
        yield k

We should note that this generator differs from our first version in that the factors are not generated in strictly increasing order. For example, factors(100) generates the series 1, 100, 2, 50, 4, 25, 5, 20, 10.
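
As a quick check that both versions produce the same factors, the two generators can be placed side by side. They are renamed factors_simple and factors_fast here (names chosen only for this sketch) so they can coexist:

```python
def factors_simple(n):
    """Test every candidate from 1 to n, in increasing order."""
    for k in range(1, n + 1):
        if n % k == 0:
            yield k


def factors_fast(n):
    """Test candidates only up to sqrt(n), yielding factors in pairs."""
    k = 1
    while k * k < n:
        if n % k == 0:
            yield k
            yield n // k
        k += 1
    if k * k == n:   # perfect square: yield the root only once
        yield k


print(list(factors_fast(100)))    # [1, 100, 2, 50, 4, 25, 5, 20, 10]
print(sorted(factors_fast(100)))  # [1, 2, 4, 5, 10, 20, 25, 50, 100]
print(sorted(factors_fast(100)) == list(factors_simple(100)))  # True
```

If sorted output matters, wrapping the call in sorted() restores the order, at the cost of materializing every factor in a list.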

### How to Use Generators and yield in Python

> By Kyle Stratis: https://realpython.com/introduction-to-python-generators/


Generator functions are a special kind of function that return a lazy iterator. These are objects that you can loop over like a list. However, unlike lists, lazy iterators do not store their contents in memory.
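
A rough way to see the difference is to compare the size of a list with the size of a generator producing the same values. The squares function below is a made-up example, and sys.getsizeof measures only the container object itself, so treat the comparison as indicative rather than exact:

```python
import sys


def squares(n):
    """Yield the squares 0, 1, 4, ... one at a time."""
    for i in range(n):
        yield i * i


as_list = [i * i for i in range(100000)]  # all values stored at once
as_gen = squares(100000)                  # nothing computed yet

# The list grows with the number of items; the generator object does not.
print(sys.getsizeof(as_list) > sys.getsizeof(as_gen))  # True
print(sum(as_gen) == sum(as_list))                     # True
```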


#### Example 1: Reading Large Files


What if you want to count the number of rows in a CSV file? The code block below shows one way of counting those rows:

```python

csv_gen = csv_reader("some_csv.txt")
row_count = 0

for row in csv_gen:
    row_count += 1

print(f"Row count is {row_count}")
```

Looking at this example, you might expect csv_gen to be a list. To populate this list, csv_reader() opens a file and loads its contents into csv_gen. Then, the program iterates over the list and increments row_count for each row.

This is a reasonable explanation, but would this design still work if the file is very large? What if the file is larger than the memory you have available? To answer this question, let’s assume that csv_reader() just opens the file and reads it into an array:

```python
def csv_reader(file_name):
    file = open(file_name)
    result = file.read().split("\n")
    return result
```

This function opens a given file and uses file.read() along with .split() to add each line as a separate element to a list. If you were to use this version of csv_reader() in the row counting code block you saw further up, then you’d get the following output:

```
Traceback (most recent call last):
  File "ex1_naive.py", line 22, in <module>
    main()
  File "ex1_naive.py", line 13, in main
    csv_gen = csv_reader("file.txt")
  File "ex1_naive.py", line 6, in csv_reader
    result = file.read().split("\n")
MemoryError
```

In this case, open() returns a file object that you can lazily iterate through line by line. However, file.read().split() loads everything into memory at once, causing the MemoryError.

Before that happens, you’ll probably notice your computer slow to a crawl. You might even need to kill the program with a KeyboardInterrupt. So, how can you handle these huge data files? Take a look at a new definition of csv_reader():

```python
def csv_reader(file_name):
    for row in open(file_name, "r"):
        yield row
```
In this version, you open the file, iterate through it, and yield a row. This code should produce the following output, with no memory errors:

```
Row count is 64186394
```

What’s happening here? Well, you’ve essentially turned csv_reader() into a generator function. This version opens a file, loops through each line, and yields each row, instead of returning it.
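
Here is a self-contained sketch of the same idea that can be run anywhere. It differs from the version above in two ways that are assumptions of this example, not part of the article: the file is opened in a with block so it is closed once iteration finishes, and a small temporary file with made-up contents stands in for the large CSV:

```python
import os
import tempfile


def csv_reader(file_name):
    """Yield one line at a time; the with block closes the file afterwards."""
    with open(file_name, "r") as file:
        for row in file:
            yield row


# Build a small throwaway file so the example can run anywhere.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("id,name\n1,a\n2,b\n")
    sample_path = tmp.name

row_count = 0
for row in csv_reader(sample_path):
    row_count += 1

print(f"Row count is {row_count}")  # Row count is 3
os.remove(sample_path)
```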


#### Example 2: Generating an Infinite Sequence

Let’s switch gears and look at infinite sequence generation. In Python, to get a finite sequence, you call range() and evaluate it in a list context:

```python
>>> a = range(5)
>>> list(a)
[0, 1, 2, 3, 4]
```

Generating an infinite sequence, however, will require the use of a generator, since your computer memory is finite:



In [1]:
def infinite_sequence():
    num = 0
    while True:
        yield num
        num += 1  ### this statement is executed after yield, unlike return

This code block is short and sweet. First, you initialize the variable num and start an infinite loop. Then, you immediately yield num so that you can capture the initial state. This mimics the action of range().

After yield, you increment num by 1. If you try this with a for loop, then you’ll see that it really does seem infinite:

In [None]:
for i in infinite_sequence():
    print(i, end=' ')

The program will continue to execute until you stop it manually.
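
If you only want the first few values, one option is itertools.islice from the standard library, which stops pulling from the generator after a fixed number of items. This is a sketch of that alternative, not something the original article uses at this point:

```python
from itertools import islice


def infinite_sequence():
    num = 0
    while True:
        yield num
        num += 1


# islice stops after five items, so this terminates on its own
first_five = list(islice(infinite_sequence(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```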

Instead of using a for loop, you can also call next() on the generator object directly. This is especially useful for testing a generator in the console:

In [3]:
gen = infinite_sequence()

next(gen)

0

In [4]:
next(gen)

1

Here, you have a generator called gen, which you manually iterate over by repeatedly calling next(). This works as a great sanity check to make sure your generators are producing the output you expect.

#### Example 3: Detecting Palindromes

You can use infinite sequences in many ways, but one practical use for them is in building palindrome detectors. A palindrome detector will locate all sequences of letters or numbers that are palindromes. These are words or numbers that are read the same forward and backward, like 121. First, define your numeric palindrome detector:



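For example, with num = 121 the reversal loop below starts from temp = 121 and reversed_num = 0. After two iterations, reversed_num = 12 and temp = 1. The third iteration gives reversed_num = 121 and temp = 0, so the loop ends and num equals reversed_num.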

In [6]:
def is_palindrome(num):
    # Skip single-digit inputs
    if num//10 == 0:
        return False

    temp = num
    reversed_num = 0

    ### calculate the reverse of num
    while temp!=0:
        reversed_num = (reversed_num * 10) + (temp % 10)
        temp = temp // 10

    if num == reversed_num:
        return num
    else:
        return False

In [None]:
for i in infinite_sequence():
    pal = is_palindrome(i)
    if pal:
        print(pal)

In this case, the only numbers that are printed to the console are those that read the same forward and backward.
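
Since the loop above never terminates, a bounded variant is handy for checking the output. The sketch below repeats both functions so it runs on its own, and stops after 12 palindromes (an arbitrary limit chosen for this example):

```python
def infinite_sequence():
    num = 0
    while True:
        yield num
        num += 1


def is_palindrome(num):
    # Skip single-digit inputs
    if num // 10 == 0:
        return False
    temp = num
    reversed_num = 0
    # Build the reverse of num digit by digit
    while temp != 0:
        reversed_num = (reversed_num * 10) + (temp % 10)
        temp = temp // 10
    if num == reversed_num:
        return num
    return False


found = []
for i in infinite_sequence():
    if is_palindrome(i):
        found.append(i)
    if len(found) == 12:   # stop once enough palindromes are collected
        break

print(found)  # [11, 22, 33, 44, 55, 66, 77, 88, 99, 101, 111, 121]
```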

Now that you’ve seen a simple use case for an infinite sequence generator, let’s dive deeper into how generators work.

### Understanding Generators


Generator functions look and act just like regular functions, but with one defining characteristic. Generator functions use the Python yield keyword instead of return. Recall the generator function you wrote earlier:

```python
def infinite_sequence():
    num = 0
    while True:
        yield num
        num += 1
```
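
One way to see the distinction is to inspect what the function returns when called. The checks below use the standard inspect and types modules and are offered as a small illustration:

```python
import inspect
import types


def infinite_sequence():
    num = 0
    while True:
        yield num
        num += 1


gen = infinite_sequence()   # calling the function runs none of its body

print(inspect.isgeneratorfunction(infinite_sequence))  # True
print(isinstance(gen, types.GeneratorType))            # True
print(iter(gen) is gen)     # True: a generator is its own iterator
print(next(gen), next(gen)) # 0 1
```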

