## For efficiency's sake: structured programs, packages, regular expressions 

Efficiency is key. Therefore, in this session, you will learn more about advanced aspects of efficient programming. We have already discussed how to improve code efficiency and maintainability in small programs. Sooner or later, however, additional questions may emerge: How should you **distribute code over a main program and importable modules**? Where can time and effort be saved by relying on already available **modules** and **packages**? Can **repeating patterns** in the task at hand be detected and exploited for further automation?

### Structured programs
Python code (in .py files) can be used in two ways:
- as a standalone program (e.g., the code file that you run from within Spyder)
- as a module that is imported and used from inside another program

**Recap**: Any .py file can be loaded as a module. As such, modules allow us to distribute code across multiple files (which is convenient for bigger coding projects). Modules are loaded with an *import* statement.

**Larger projects should make use of the distinction between standalone programs and modules.**
- package the main functionality in different Python files
- import these files as modules from a main file where the execution code resides
- this main code should only be executed when the code is run as a standalone program

#### The *main* block

Upon importing a module, **all statements in the imported module are executed**. This may be fine if the module is explicitly designed for a single purpose, namely to be imported by other programs. But **sometimes you may want to be able to use a script as both a module and an executable standalone program in and of itself**. This is where the *main* block comes in handy.
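
To see that an import really runs a module's top-level statements, the following sketch writes a small throwaway module to a temporary folder and imports it. The module name `noisy_module` and its contents are made up for this illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module source: one function definition plus one top-level statement.
MODULE_SOURCE = '''
def greet():
    return "hello from greet()"

print("top-level statement in noisy_module is running")
'''

with tempfile.TemporaryDirectory() as tmp_dir:
    Path(tmp_dir, "noisy_module.py").write_text(MODULE_SOURCE)
    sys.path.insert(0, tmp_dir)      # make the temporary folder importable
    try:
        importlib.invalidate_caches()
        # The import executes the whole file, so the print statement fires here.
        noisy_module = importlib.import_module("noisy_module")
    finally:
        sys.path.remove(tmp_dir)

# The function defined in the module is now available to the importing program.
print(noisy_module.greet())
```

The message from the module appears during the import itself, before `greet()` is ever called.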

**Why would you want this functionality?**
- Your code's **main function may be for use as a module**, but you still want to include a script mode where it runs some **unit tests or a demo**.
- Your code's **main function may be for use as a standalone program**, but you still want to **make some of its functionalities available to other programs**, without needing to create a new .py file.
- Finally, your code may be for **use as a standalone program (by you)**, but it may have some unit tests, and the **testing framework (used by your teachers)** works by importing .py files like your script and running special test functions. This is the case for your homework exercises.

In [1]:
#Example from homework exercise 01
def pig_latin_name(name):
    """
    Transform a name to the Pig Latin variant (paying attention to capital letters).
    """
    pass


# paste your test code here (will not be tested)
if __name__ == '__main__':
    pass

In your homework exercises, **you can test your own code in the main block of your script**. This part of the code is **only executed when the file runs as a standalone program**. If it is instead imported as a module (e.g., by our testing scripts), execution of the module's main block is skipped.

**But how does this work, exactly?**

Upon initializing a script, the Python interpreter will...
1. read the source files and **assign certain special variable names**
2. execute all code

Special variables are preceded and followed by two underscores (\_\_). Here, we care in particular about the **\_\_name\_\_** variable.

If you are **running a source file as the main program**, the interpreter will assign the hard-coded string "\_\_main\_\_" to the \_\_name\_\_ variable. It's as if the interpreter inserts this at the top of your code (you don't actually have to add this to your code):

    __name__ = "__main__" 
    
If you want to convince yourself of this, try printing the variable's value in your next homework exercise:

In [4]:
print(__name__)

__main__


If you are **importing a source file as module to another program**, the interpreter will search for the respective .py file and will assign the name from the import statement as value of its \_\_name\_\_ variable.

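For instance, if we are importing the module *math*, it's as if the interpreter inserts this at the top of the module's code (in CPython, *math* is a compiled built-in module rather than a *math.py* file, but the principle is the same):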

    __name__ = "math"
    
Note that the name of your main program remains \_\_main\_\_:

In [9]:
import math
print("the live script's name remains: ",__name__)
print("the imported module's name is:",math.__name__)

the live script's name remains:  __main__
the imported module's name is: math
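
As a small addition, the sketch below checks two cases that the example above does not cover: a module imported under an alias, and a submodule of a package. The alias `m` is an arbitrary choice; `json` is used only because it ships with Python.

```python
import math as m              # the alias "m" exists only in the importing file
from json import decoder      # a submodule of the standard-library json package

# An alias does not change the module's own name.
print(m.__name__)             # math

# A submodule's name includes the package it belongs to.
print(decoder.__name__)       # json.decoder
```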


Coming back to the question of **how this is useful**, note that we can **make the execution of certain statements conditional on whether the file is executed as the main file or as a module**. Code that is nested under the *if* statement will only be executed if the \_\_name\_\_ variable holds the value "\_\_main\_\_" at runtime:

In [11]:
if __name__ == '__main__':
    print("This is now running as standalone program!")

This is now running as standalone program!
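
Both cases can be compared side by side. The sketch below writes a hypothetical file `dual_use.py` to a temporary folder, imports it once as a module, and then runs it once as a script with the standard-library function `runpy.run_path`. The file name, the function `double`, and the variable `demo_result` are all made up for this illustration.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

# Hypothetical file that can act as both a module and a script.
SOURCE = '''
def double(x):
    return 2 * x

if __name__ == "__main__":
    demo_result = double(21)
'''

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir, "dual_use.py")
    path.write_text(SOURCE)

    # Case 1: import as a module, so __name__ is "dual_use" and the main block is skipped.
    sys.path.insert(0, tmp_dir)
    try:
        importlib.invalidate_caches()
        dual_use = importlib.import_module("dual_use")
    finally:
        sys.path.remove(tmp_dir)

    # Case 2: run as a script, so __name__ is "__main__" and the main block is executed.
    script_globals = runpy.run_path(str(path), run_name="__main__")

print(dual_use.__name__, hasattr(dual_use, "demo_result"))        # dual_use False
print(script_globals["__name__"], script_globals["demo_result"])  # __main__ 42
```

In the imported module, `demo_result` was never created because the main block did not run. In the script run it exists and holds the value 42.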


#### Using the main block to separate functionality from function

It is efficient, and good practice, in programming to **separate code functionality (the function definitions) from code function (the executable program code)**. You are already using this pattern in your homework!

The following program provides an **example of well-packaged code** that could readily be imported as module to other programs, which would provide them with the specified functions' functionalities:

In [17]:
def sum_lists(list1, list2):
    # Add up all numbers from both lists
    return sum(list1) + sum(list2)

def read_floats_from_file(filename):
    # Read a text file that holds one number per line
    with open(filename, "r") as in_file:
        lines = in_file.readlines()
    float_list = []
    for line in lines:
        float_list.append(float(line.strip()))
    return float_list

# Executed only when the file runs as a standalone program
if __name__ == "__main__":
    l1 = read_floats_from_file("test.txt")
    l2 = read_floats_from_file("test2.txt")
    print("Sum of both lists: " + str(sum_lists(l1, l2)))

Sum of both lists: 103301.5
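
A common variant of this pattern, shown here only as a sketch, moves the executable part into a function of its own (named `main` by convention), so that the main block shrinks to a single call. The sample numbers are made up and written to temporary files so that the sketch runs without the original *test.txt* and *test2.txt*. Unlike the version above, this reader skips blank lines.

```python
import tempfile
from pathlib import Path

def sum_lists(list1, list2):
    return sum(list1) + sum(list2)

def read_floats_from_file(filename):
    # One number per line; blank lines are skipped (an assumption of this sketch).
    with open(filename, "r") as in_file:
        return [float(line.strip()) for line in in_file if line.strip()]

def main(file1, file2):
    """Executable part of the program, kept apart from the reusable functions."""
    total = sum_lists(read_floats_from_file(file1), read_floats_from_file(file2))
    print("Sum of both lists: " + str(total))
    return total

if __name__ == "__main__":
    # Made-up sample data in temporary files.
    with tempfile.TemporaryDirectory() as tmp_dir:
        first = Path(tmp_dir, "test.txt")
        second = Path(tmp_dir, "test2.txt")
        first.write_text("1.5\n2.5\n")
        second.write_text("10\n20\n")
        main(first, second)
```

With this layout, another program can import the file and call `main` or any of the helper functions itself, with its own file names.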


### Packages

### Pattern detection via regular expressions

#### Regular languages

#### Finite state automata