# <center>Generators</center>
## <center>Practical cases</center>

### Table of contents
---
- **Cases.**
- **Pipeline (Unix).**
- **Building a real-world _Shell_ command into a script.**
- **Benchmarking.**
---

### Cases
---

Generators can be particularly effective when applied to certain kinds of systems programming problems, especially those that process data one item at a time.

- **Files**
    - Searching
        - Keywords in text.
        - Word frequency.
        - Syntax errors.
        - ...
    - Filtering
        - Based on specific column in .csv file.
        - Based on combinations.
        - ...
    - Ordering
        - Commands for execution from a configuration file.
        - By unique columns.
        - ...
    - Sorting
        - Memory log files by Memory Consumption status.
        - ...
    - Moving
        - Reorganizing files in a directory or multiple directories.
        - ...
    - Parsing
        - JSON
        - HTML
        - XML
        - ...
    - Templating engines
    - ...
- **Threads**
    - Microthreads.
    - ...
- **Networking**
    - Building an interactive two-player network game - 'Rock paper scissors'.
    - Simultaneously manage client connections.
    - Sending objects through sockets.
    - ...
    
---
**NOTE**

> Generators can be applied to many more cases than those listed above. The list is meant to give you concrete examples, spark ideas, and help you recognize your own problem among them.
---
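
As one illustration of the "Filtering" case above, here is a minimal sketch that filters rows by a specific column of a .csv file. The sample data, the column names, and the helper names `read_rows` and `filter_by_column` are all hypothetical; an in-memory string stands in for a real file so that the example runs on its own.

```python
import csv
import io

# Hypothetical sample data; in practice this would be an open .csv file.
SAMPLE_CSV = """name,status,memory_mb
web,running,512
db,stopped,2048
cache,running,128
"""


def read_rows(lines):
    """Lazily parse CSV lines into dictionaries, one row at a time."""
    yield from csv.DictReader(lines)


def filter_by_column(rows, column, value):
    """Yield only the rows whose `column` equals `value`."""
    for row in rows:
        if row[column] == value:
            yield row


rows = read_rows(io.StringIO(SAMPLE_CSV))
running = filter_by_column(rows, "status", "running")

# Nothing is read or filtered until the comprehension pulls the rows.
running_names = [row["name"] for row in running]
print(running_names)
```

Because each stage is a generator, only one row is held in memory at a time, which is what makes this approach attractive for large files.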

### Pipeline (Unix)
---

A pipeline is a series of processes in which each process consumes the output of the previous one and produces output for the next, just like chaining commands with a pipe (`|`) in the UNIX shell. You have probably used pipes in the shell before.

Linux:

```bash
ps aux | grep python
```

Windows equivalent:

```powershell
tasklist | findstr python
```

![Pipeline diagram of UNIX ps and aux command](diagrams/pipeline_diagram_unix_ps_aux_command.png)

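- The first process is _ps aux_; it lists the running processes.
- The second process is _grep python_; it consumes the output of the first process and keeps only the lines that contain "python".
- The result is the list of running processes whose entry mentions "python".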
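
The same two-stage pipeline can be sketched with Python generators. The functions `ps_aux` and `grep` below are hypothetical stand-ins for the real commands, and the process lines are made-up sample data rather than real command output.

```python
def ps_aux():
    """Stand-in for `ps aux`: yields one process line at a time.

    The lines are made-up sample data, not real command output.
    """
    sample_lines = [
        "alice  101  0.0  0.1  bash",
        "alice  202  1.5  2.0  python app.py",
        "bob    303  0.2  0.4  vim notes.txt",
        "bob    404  3.1  1.2  python -m http.server",
    ]
    for line in sample_lines:
        yield line


def grep(pattern, lines):
    """Stand-in for `grep`: yields only the lines containing `pattern`."""
    for line in lines:
        if pattern in line:
            yield line


# Equivalent of: ps aux | grep python
matches = list(grep("python", ps_aux()))
for line in matches:
    print(line)
```

As in the shell, the second stage never sees the whole output at once; it pulls one line from the first stage, decides whether to keep it, and moves on.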

### Building a real-world Shell command into a script
---

A real-world example we can use is _wc_, a shell command available on Linux and other Unix-like operating systems. The _wc_ command counts the number of lines, words, and characters (bytes, by default) in each given file.

In [None]:
# Let's see the command in action.
!wc data/molecules/*.pdb

---
Let's create a Python script that does the same thing by using Generators. 

Tasks:
- Split the logic into small processes that form a pipeline.
- Implement each process as a generator.

![Pipeline diagram of UNIX wc command](diagrams/pipeline_diagram_unix_wc_command.png)

In [None]:
# Implementation
from pathlib import Path


def open_files(files_paths):
    """Lazily opens each file for reading, one at a time.

    Each file is closed as soon as the consumer asks for the next one.

    :yields: :class:`io.TextIOWrapper`
    """
    for file_path in files_paths:
        with open(file_path, 'r') as file:
            yield file


def counter_generator(texts):
    """Counts the number of lines, words, characters
    for each given text.

    :yields: `tuple`
    """
    for text in texts:
        lines_count = len(text.splitlines())
        words_count = len(text.split())
        characters_count = len(text)

        yield (
            lines_count,
            words_count,
            characters_count
        )


# Lazily yield every matching path.
# Note: rglob() also searches subdirectories, unlike the shell glob *.pdb.
files_paths = Path('data/molecules').rglob('*.pdb')
# Open the files.
files = open_files(files_paths)
# Read the files.
files_texts = (file.read() for file in files)

for tl, tw, tc in counter_generator(files_texts):
    # Prints for each text:
    # 1. Number of lines.
    # 2. Number of words.
    # 3. Number of characters.
    print(tl, tw, tc)


---
To summarize, we have a directory called _molecules/_ that contains six files describing some simple organic molecules. The _.pdb_ extension indicates that these files are in Protein Data Bank format, a simple text format that specifies the type and position of each atom in the molecule. Our program counts the number of lines, words, and characters in each of the files.
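
The real _wc_ also prints each file name and a final "total" row. The sketch below shows one way the pipeline could be extended with an extra stage to do the same. The stage names `read_texts`, `count`, and `with_total` are hypothetical, and the two tiny _.pdb_ files are made-up stand-ins written to a temporary directory so that the example runs without the _molecules/_ data.

```python
import tempfile
from pathlib import Path


def read_texts(paths):
    """Lazily yield (path, text) pairs, opening one file at a time."""
    for path in paths:
        with open(path, "r") as file:
            yield path, file.read()


def count(named_texts):
    """Yield (lines, words, characters, name) for each text."""
    for path, text in named_texts:
        yield len(text.splitlines()), len(text.split()), len(text), path.name


def with_total(rows):
    """Pass every row through, then yield one extra 'total' row."""
    total_lines = total_words = total_chars = 0
    for lines, words, chars, name in rows:
        total_lines += lines
        total_words += words
        total_chars += chars
        yield lines, words, chars, name
    yield total_lines, total_words, total_chars, "total"


# Hypothetical stand-in files so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as directory:
    Path(directory, "a.pdb").write_text("ATOM 1 C\nATOM 2 H\n")
    Path(directory, "b.pdb").write_text("ATOM 1 O\n")
    # Sort the paths so the rows come out in a stable order.
    paths = sorted(Path(directory).glob("*.pdb"))
    results = list(with_total(count(read_texts(paths))))

for lines, words, chars, name in results:
    print(lines, words, chars, name)
```

Adding the totals required no change to the existing stages; a new generator was simply appended to the end of the pipeline.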