# Section E - Writing Scripts / Programs

Feedback: https://forms.gle/Le3RAsMEcYqEyswEA

**Topics** - Python Editors, Writing and running scripts, argparse library

So far, we've been working in python notebooks - these are great for workflows or data centered presentations where you want to mix text, code, and graphs/plots and a human will be interacting with it.  

But sometimes we want to make a tool that we can run, maybe as part of an automated process, and don't want to see the code that does it.  We can write a script - a text file that the python interpreter runs for us.  

**Script** vs **Program** - call it what you want.  Calling something a program implies that it's compiled, or at least that it's more complicated than a script.  Python works just as well for short utility scripts as for gigantic programs, but it's not really compiled\*. In computationally intensive programs, you might write modules in C or another compiled language to handle the cpu-heavy tasks, and call those modules from python, keeping all of the complex program logic in python so that it's more human friendly to work on. 

\* *When you run a python .py script, it is compiled to bytecode, which is executed by the interpreter (for imported modules, the bytecode is cached in a .pyc file).  The bytecode is obfuscated, but is trivial to convert back to python code, unlike a truly compiled language.  Sometimes you'll see people/companies distribute .pyc files to obfuscate their tools somewhat.*
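To see how thin that disguise is, the standard library's `dis` module can print bytecode in a readable form. This is a minimal sketch using a made-up `add` function; the exact instruction names printed vary between python versions.

```python
import dis


def add(a, b):
    '''Hypothetical function, used only to have something to disassemble.'''
    return a + b


# Print the bytecode instructions the interpreter executes for add().
dis.dis(add)
```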

## Installing Python
You need to have a python interpreter installed on your computer in order to run a python script. You should **select the option during install to add the interpreter to the system PATH**. Then, when you open up a cmd or powershell window, you can directly run a .py file that has "python3" in its shebang line, or you can type "python my_script.py", and it just works because python is in your PATH. 

https://www.python.org/downloads/

## Python Editors (IDEs)
It's important to use a good editor for writing scripts.  Some features of a good editor include:
* Syntax highlighting
* Linting - A Linter is a tool that looks at your code for issues like missing variables, misspelled stuff, and any time you diverge from standards and conventions that the rest of the world thinks are a good idea. 
* Running code - You may like your IDE to be able to run your code from the editor and give you the output.  Or you may prefer to run it from a separate terminal window. 
* Debugging - If you run your code from the editor, you should be able to set breakpoints to pause your script and see what variables are set, etc. 

A few good ones are:
* VS Code
* IntelliJ or PyCharm
* Atom
* Sublime
* VIM, NeoVIM, or Emacs (these are more of a Linux thing)

#### *Exercise*
Open a cmd (windows) or xterm (linux) and try running `python --version` or `python3 --version`.  If you don't have python installed, you'll get an error.  Windows might pop up a windows store page to install python.  That'll probably work. Install python if you need to and get this working. 

Install an IDE - I recommend VS Code, but use what you want. The instructions here will all be for VS Code. Once installed, go into the extensions menu and search for "python microsoft".  Install the "python", "Pylance", "Pylint", and "autopep8" extensions, specifically the ones from microsoft.  You may also like the "Jupyter" extension for notebooks. "Copilot" is great too, but requires signing up for it at github.com. 

Make a folder in your home directory for python scripts and open that folder with Code. Use the explorer window pane on the left to create a new file called hello.py.  Add the following to the file and save it (ctrl+s is a shortcut to save):

    #! python3

    print("Hello World!")

Pretty Simple!  Now cd to that directory in your command window and run the script by typing its name and hitting enter!  You should see the hello world in the cmd window. 

We can run it from the IDE too.  Use the little triangle in the top right corner of the window to run it and you should see a console pop up on the bottom showing the hello world message. 

Finally, if you have any highlighted lines in the script indicating issues, try to resolve them.  For example, I see "Missing module docstring" and "Final newline missing".  A module docstring is a triple-quoted string at the top of the file, just below the shebang line, that says what the file/module does.  For example:

    #! python3
    '''Simple hello world test script'''




## Template for a script

    #!/usr/bin/env python3
    '''Short note about the script/module'''

    import sys  # import whatever modules the script needs

    GLOBAL_VAR = 'foo'

    def funcDefinition(some, args):
        '''foo'''
        return 'bar'

    def main():
        '''Entry point when the file is run as a script'''
        print('Hello, world!')

    if __name__ == '__main__':
        main()

**The #! shebang line**

Shebang is short for hash bang.  This line says which interpreter to use to run the script. A couple of common entries are:
* In linux for python3:
  * `#!/usr/bin/env python3`
* In windows, this would be common:
  * `#! python3`

In both cases the system PATH variable will be checked to find the given interpreter. 

You can also specify the complete path to the interpreter you want to use.  I might do this to make sure it uses a virtual env that I've configured:
* `#!/home/my_user/venv/bin/python3`

**module description**

You can add short documentation about the purpose of the script/module below the shebang line. 

**import statements**

They go at the top.

**global variables**

Things like paths to tools that are called by the script, or directories it uses.  Global variables should be in ALL_CAPS with underscores between the words if they are multi-word.  It's common to have a VERBOSE or DEBUG boolean global that's referenced elsewhere to decide whether or not to print debug messages for troubleshooting issues. 

**function definitions**

This is the main body of the script.  It's not uncommon to have a "main" function that is the first thing called when the script starts.  You don't have to define any functions if you don't want to. 

**the `if __name__ ...` condition**

This is important if your script might be used as a module, imported by another script or module in order to access your script's functions and global variables.  If your script is imported, its `__name__` will not be `"__main__"`, but if it is run as a script, its `__name__` will be `"__main__"`, so the code below the condition gets executed. 

You can also skip this section and just start writing code that will run.  
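A minimal sketch of how the condition behaves, assuming a hypothetical file called greeter.py (the file and function names are made up for illustration):

```python
'''greeter.py - hypothetical module that also works as a script'''


def make_greeting(name):
    '''Return a greeting for the given name.'''
    return f'Hello, {name}!'


def main():
    '''Runs only when the file is executed directly.'''
    print(make_greeting('world'))


# True when run as "python greeter.py", False when another file does "import greeter".
if __name__ == '__main__':
    main()
```

Running the file directly prints the greeting.  Importing it from another script only defines `make_greeting()` and `main()`, so the importing script decides what to call.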

## Helpful Libraries for Scripts

### sys
The sys module provides access to some variables and functions that interact with the Python interpreter.
* **sys.argv** - A list of command-line arguments passed to the script. sys.argv[0] is the script name, and if len(sys.argv) > 1, then it was passed some arguments when it was run. 
* **sys.exit()** - Exits the program with an optional exit code.  Exit code zero says that everything worked as expected, and non-zero (positive) says there was an error.  You might return different numbers for different errors so if another tool calls your script, it can do something different depending on the exit codes. 
* sys.stdin, sys.stdout, sys.stderr - File objects corresponding to the interpreter’s standard input, output, and error streams.
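Here is a small sketch of these pieces working together, assuming a made-up script that expects one NAME argument.  The function name `check_args` and the exit code 2 are illustrative choices, not a standard.

```python
import sys


def check_args(argv):
    '''Return an exit code: 0 on success, 2 if the NAME argument is missing.'''
    if len(argv) < 2:
        # Error messages go to stderr so they don't mix with normal output.
        print(f'usage: {argv[0]} NAME', file=sys.stderr)
        return 2
    print(f'Hello, {argv[1]}!')
    return 0


# A real script would end with: sys.exit(check_args(sys.argv))
# Here the function is called with sample lists so the sketch runs anywhere.
ok_code = check_args(['greet.py', 'world'])
error_code = check_args(['greet.py'])
print('exit codes:', ok_code, error_code)
```

Keeping the logic in a function that returns the code, and calling `sys.exit()` only once at the bottom of the script, makes the script easier to test and to import.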

### os
The os module provides a way of interacting with the operating system. These are a few essential functions to view current working dir and change directories; list or remove files and directories; create directories.  

* os.getenv() - Retrieves the value of an environment variable.
* os.environ - A dictionary representing the environment variables.
* os.chdir() - Changes the current working directory.
* os.getcwd() - Returns the current working directory.
* os.listdir() - Lists the contents of a directory.
* os.mkdir() and os.makedirs() - Create directories.
* os.remove() and os.rmdir() - Remove files and directories.
* os.path - A submodule for working with file and directory paths, providing functions like:
  * os.path.join()
  * os.path.exists()
  * os.path.isfile()
  * os.path.isdir()

These modules are essential for performing system-level tasks and interacting with the environment in which your Python code is running.
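A short sketch of several of the os functions in action.  It works inside a temporary directory (via the standard `tempfile` module) so nothing is left behind; the folder and file names are made up for illustration.

```python
import os
import tempfile

print('Current working directory:', os.getcwd())

with tempfile.TemporaryDirectory() as base_dir:
    # Build paths with os.path.join() so they work on windows and linux.
    work_dir = os.path.join(base_dir, 'reports', 'drafts')
    os.makedirs(work_dir, exist_ok=True)  # creates both folders if needed

    note_path = os.path.join(work_dir, 'note.txt')
    if not os.path.exists(note_path):
        with open(note_path, 'w', encoding='utf-8') as handle:
            handle.write('created by the sketch\n')

    listing = os.listdir(work_dir)
    is_file = os.path.isfile(note_path)
    is_dir = os.path.isdir(work_dir)

    os.remove(note_path)
    after_remove = os.listdir(work_dir)

print('Listing before remove:', listing)
print('Listing after remove:', after_remove)
```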

#### *Exercise*
Create a whats_here.py script that does the following:
* Prints the current working directory.  This is the directory that the process that called your script has as its CWD. 
* Gets the current user and saves it to a variable
* Make a directory called "foo" if it does not exist. 
* Checks if a file called f"foo/{current_user}_was_here.txt" exists and creates it if not.  You should see this in the file explorer and in your command window (run dir foo or ls foo for windows/linux) after it's created. 
* Change directory to f"C:/Users/{current_user}/Desktop" and list out the files here. 

## Script Arguments and sys.argv
Sometimes we need scripts to take some parameters to change their behavior.  Any arguments you pass to your script when running it get stored in sys.argv so you can check them from the script. This is just like passing arguments to a function. 

#### *Exercise*
Put the following code into a script called "show_args.py" and run it with different combinations of arguments passed to it:

    #! python3
    '''A simple tool to see what arguments are set when running.'''
    import sys

    VERBOSE = False

    if '-v' in sys.argv or '--verbose' in sys.argv:
        print('Verbose is set, so I will give detailed messages about what is happening.')
        VERBOSE = True

    for n, arg in enumerate(sys.argv):
        if n == 0:
            print(f"The name of the script is: {arg}")
        else:
            print(f'arg {n} is: {arg}')
        if VERBOSE:
            print(f"This arg was {len(arg)} characters long!")
    
    if VERBOSE:
        print("Number of args received:", len(sys.argv) - 1)

For example, you could run:  `show_args.py foo bar omg-wow`

For 10 points extra credit, you can make the following modification to the args script: Check if each arg has an '=' character in it.  If an = is present, split the arg on the = to get a key=value pair and assign it to an args_kv dictionary.  Then print all of the key/value pairs given in addition to the other non-kv arguments. 

And for an additional 5 points extra credit, handle all of the argument checking in a function called checkArguments() which returns a tuple with args and args_kv.  The function should not print anything.  The script should call the function, save the return values, and then print a summary. 

## Argparse
The previous exercise hopefully proved that we can simply pass data into a script and check for it.  But a really high quality script also documents what it does and what arguments are allowed, verifies that correct arguments are given and that their values are valid, and returns them all in a simple data structure that the script can use.  It would require a lot of code to do this ourselves, so thankfully we have the "argparse" library.  

Argparse lets you define what arguments the script accepts, say which of them are required, if any, set default values, set required data types, etc.  You can find documentation for it here: https://docs.python.org/3/library/argparse.html

Here's an example script utilizing argparse:
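This is a minimal sketch, assuming a hypothetical greeting tool with a positional `name` argument plus `--count` and `--verbose` options.  The argument names and help strings are made up for illustration.

```python
#! python3
'''Hypothetical example: print a greeting one or more times.'''
import argparse


def build_parser():
    '''Define the arguments this script accepts.'''
    parser = argparse.ArgumentParser(description='Print a greeting for NAME.')
    # Positional argument: required, no dashes.
    parser.add_argument('name', help='who to greet')
    # Optional argument converted to an int, with a default value.
    parser.add_argument('-c', '--count', type=int, default=1,
                        help='how many times to print the greeting (default: 1)')
    # Flag: False unless -v or --verbose is given.
    parser.add_argument('-v', '--verbose', action='store_true',
                        help='print extra detail')
    return parser


# A real script would call parse_args() with no arguments to read sys.argv[1:].
# A sample list is passed here so the sketch runs anywhere.
args = build_parser().parse_args(['Ada', '--count', '2', '-v'])

if args.verbose:
    print(f'Parsed arguments: {args}')
for _ in range(args.count):
    print(f'Hello, {args.name}!')
```

Running a script like this with `-h` prints a help message that argparse builds from the description and the `help=` strings.  If `--count` is given a value that isn't an integer, argparse prints an error and exits before the rest of the script runs.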

## Logging