## Building Python Tools
This notebook is based on a [LinkedIn Learning course](https://www.linkedin.com/learning/building-tools-with-python/).

* A tool is a small piece of software that accomplishes a specific goal or task
  + command-line tools
    - bash scripts, Python scripts, and compiled software
    - often designed to consolidate sets of tasks (automation to replace manual work)
  + graphical tools
    - a wrapper for a CLI tool, or a purpose-built GUI
    - includes browser add-ons
  + hosted tools on servers
    - can be designed for end users
    - can be part of a larger workflow for an automated process
* choose a type based on the task
  + some types are more appropriate than others for a given kind of work

### Develop Python tools
* Python:
  + a cross-platform tool development language
  + includes many useful modules, with the Tkinter GUI framework built in
* planning a tool:
  + A tool idea comes from a specific need
  + keep the scope focused
  + write down inputs, actions, and outputs
* plan scope
  + avoid multifunction tools
    - they confuse users who don't need all the features
  + one task, one tool
  + focus aids reliability
  + single-purpose tools are easier to maintain
* involve users once you have the plan
  + check that your plan addresses user needs
  + "why" is as important as "how"
  + get feedback before and during the building process
* strategy
  + what options are there?
  + automation, API, script?
    - are there too many variables in the process to make an automation tool?
    - is the sequence of events too fragile or brittle to rely on a tool for?
  + is this reasonable to build?
  + choose what's right for the situation
* to build a tool that works on many platforms, use an interpreted or scripting language such as Python or Ruby

#### Important: when running the following shell command, the shell automatically replaces decks/* with the files in decks before the script starts!

In [11]:
# Important: when running the following shell command, the shell automatically
# replaces decks/* with the files in decks!
!python3 slidecount-args.py -s decks/*

Item decks/not_a_deck.pdf is not a .pptx file or recognized option and will be ignored.
Error reading bad_deck.pptx (File is not a zip file). Count will be 0.
Slides	Deck
0	bad_deck.pptx
13	deck1.pptx
14	deck2.pptx
16	deck3.pptx
11	deck4.pptx
- - - - -
54 total slides in 5 decks.
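
Because the shell, not Python, expands `decks/*`, the script receives one argument per file. When the pattern is quoted, or the script is launched from a shell that does not expand wildcards, the literal pattern arrives instead. A minimal sketch of handling that case with the standard `glob` module follows; the helper name `expand_patterns` is hypothetical and is not taken from `slidecount-args.py`.

```python
import glob


def expand_patterns(items):
    """Expand any wildcard patterns the shell left untouched.

    Hypothetical helper for illustration; not part of the course code.
    """
    expanded = []
    for item in items:
        matches = sorted(glob.glob(item))
        # Keep the item as-is when nothing matches (for example an option
        # such as "-s"), so the caller can still handle or report it.
        expanded.extend(matches if matches else [item])
    return expanded
```

A script could call `expand_patterns(sys.argv[1:])` before processing its arguments; items that are already plain file names pass through unchanged.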


### Using shell commands
* interact with a shell command from your script
* use Python to manage interaction with CLI tools
* supplement Python's capabilities as needed
  + using subprocess to call out to shell commands is helpful
    - allows us to interact with existing programs instead of re-creating their functionality ourselves
    - writing a Python script is much easier for non-sysadmins than writing in Bash
    - can manage the input and output, letting users run programs they otherwise couldn't or wouldn't
* the subprocess module lets us start child processes
* construct the command as a list (`["ls", "-a", "/var"]`) in which each element is a command, option, or argument
* subprocess.Popen starts the program in a new process; its first argument is the command to run
* text=True (or its older alias universal_newlines=True) means
  + we want to communicate with the program using text, as we would type at the command line
  + if we set this to False, or don't provide it, the module will assume we want to work in bytes
* stdout=subprocess.PIPE, stderr=subprocess.PIPE
  + connect the process's stdout and stderr to pipes that the script can read
  + so we can access the stdout and stderr from the script
* communicate() tells the module to interact with the process, not just start it
  + it sends any input, waits for the process to finish, and returns (stdout, stderr)
  + otherwise, we wouldn't get a result back from the command

In [8]:
import subprocess

# Represent the command ls -a /var as a list to use in subprocess.Popen
the_command = ["ls", "-a", "/var"]

# Send the stdout and stderr of the process to variables we can use in the script.
# Popen starts the program in a new process; the first argument is the command to run

stdout, stderr = subprocess.Popen(the_command, text=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate()

# View the output of each variable.
print("stdout: %s" % stdout)
print("stderr: %s" % stderr)

stdout: .
..
backups
cache
crash
lib
local
lock
log
mail
metrics
opt
run
snap
spool
tmp

stderr: 
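
For a command that only needs to run to completion, `subprocess.run` wraps the Popen-and-communicate pattern shown above in a single call. This is a minimal sketch, assuming Python 3.7 or later (needed for `capture_output`). The child command is an illustrative stand-in chosen so the example does not depend on `ls` or `/var`; it is not from the course.

```python
import subprocess
import sys

# Use the current Python interpreter as the child program (an assumption
# made for portability; any command list would work the same way).
the_command = [sys.executable, "-c", "print('hello from the child')"]

# run() starts the process, waits for it to finish, and collects the output.
# capture_output=True is shorthand for stdout=PIPE and stderr=PIPE.
result = subprocess.run(the_command, capture_output=True, text=True)

print("return code: %s" % result.returncode)
print("stdout: %s" % result.stdout)
print("stderr: %s" % result.stderr)
```

`Popen` remains the better fit when the script has to interact with the process while it is still running.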


### Using arguments as input for a script
* example invocation: `./myscript.py datafile.txt stats.txt`
* sys.argv
  + holds the command-line arguments
  + simple to use, but it is one flat list of strings with no option parsing
  + the 1st element of sys.argv is the name of the script, followed by the other arguments in the order they were given

In [10]:
!python3 arguments.py apple banana cherry

Items in the list 'arguments'
-----------------------------
arguments: 			['arguments.py', 'apple', 'banana', 'cherry']
arguments[0] script name:	arguments.py
arguments[1] first arg:		apple
arguments[2] second arg:	banana
arguments[1:] all but [0]:	['apple', 'banana', 'cherry']
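
The source of `arguments.py` is not shown in this notebook. The following is a minimal sketch of a script that produces similar information from `sys.argv`; the function name `describe_arguments` and the output wording are hypothetical.

```python
import sys


def describe_arguments(arguments):
    """Return labelled lines describing an argv-style list."""
    lines = ["arguments: %s" % (arguments,),
             "script name: %s" % arguments[0]]
    # Everything after index 0 was supplied by the user, in input order.
    for position, value in enumerate(arguments[1:], start=1):
        lines.append("arguments[%d]: %s" % (position, value))
    return lines


if __name__ == "__main__":
    for line in describe_arguments(sys.argv):
        print(line)
```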


### Using getopt to parse optional args and other args
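* short options are defined as a single string of option letters (e.g. `"s"`); a letter followed by `:` takes a value
* long options are given as a list of strings; a name followed by `=` takes a value
* getopt.getopt accepts `sys.argv[1:]` as the 1st argument to parse, and the short and long options as the 2nd and 3rd arguments
* it returns a list of (option, value) pairs and a list of the remaining arguments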

In [12]:
!python3 slidecount-getopt.py -s decks/deck1.pptx decks/deck2.pptx

[('-s', '')]
['decks/deck1.pptx', 'decks/deck2.pptx']
Slides	Deck
13	deck1.pptx
14	deck2.pptx
- - - - -
27 total slides in 2 decks.
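
The first two output lines above are the two values `getopt.getopt` returns. A minimal sketch of the parsing step follows, assuming a `-s`/`--summary` flag like the one used in the command. The `-o`/`--output` option is hypothetical and is included only to show an option that takes a value.

```python
import getopt

# "s" is a flag; "o:" takes a value because of the trailing colon.
SHORT_OPTIONS = "so:"
# A trailing "=" plays the same role for long options.
LONG_OPTIONS = ["summary", "output="]


def parse(argv):
    """Split argv (without the script name) into options and other args."""
    options, remaining = getopt.getopt(argv, SHORT_OPTIONS, LONG_OPTIONS)
    return options, remaining


options, remaining = parse(["-s", "decks/deck1.pptx", "decks/deck2.pptx"])
print(options)
print(remaining)
```

In a real script, `parse(sys.argv[1:])` would be used, and an unknown option raises `getopt.GetoptError`, which can be caught to print a usage message.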


### Using argparse to organize optional arguments
* allows you to add short and long optional arguments
  + parser = argparse.ArgumentParser(description=".....")
  + parser.add_argument("-s", "--summary", help="....")
  + args = parser.parse_args()
* retrieve an optional argument's value through the attribute named after its long option (args.long_option_name), e.g. args.summary
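
The calls listed above can be put together as follows. This is a minimal sketch, not the course's script: the description text, the use of `action="store_true"` to make `--summary` a flag, and the positional `decks` argument are all assumptions made for illustration.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Count slides in .pptx decks.")
    # action="store_true" (an assumption) makes -s/--summary a flag
    # that takes no value and defaults to False.
    parser.add_argument("-s", "--summary", action="store_true",
                        help="print only the total")
    # Hypothetical positional argument: one or more deck files.
    parser.add_argument("decks", nargs="+", help="deck files to count")
    return parser


# Passing a list parses it instead of sys.argv[1:], which is handy for testing.
args = build_parser().parse_args(["-s", "decks/deck1.pptx"])
print(args.summary)
print(args.decks)
```

Calling `parse_args()` with no argument reads `sys.argv[1:]`, and argparse generates the `-h`/`--help` output from the `help` strings.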

### Error handling
* explain errors in human language
* prevent technical jargon from reaching users
* make error messages actionable
* There are two ways to check for problems:
  + test for known conditions with an if statement
    - if len(tasks) == 0
  + catch exceptions
    - try / except
* if necessary, write the application's own log file, with troubleshooting details, to a location where it has write permission

#### Use case:
* if a tool requires a specific module
  + put the relevant import statement inside a try statement
  + you can then provide the user a message if the module is not installed    
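
The use case above can be sketched as a small helper that turns a failed import into a plain-language, actionable message. The function name `load_required_module` and the message wording are hypothetical. `importlib.import_module` is used here so the module name can be passed in as a string; a plain `import` statement inside `try` works the same way.

```python
import importlib


def load_required_module(name, install_hint):
    """Import a module by name, or return a plain-language message.

    Returns (module, None) on success and (None, message) on failure.
    """
    try:
        return importlib.import_module(name), None
    except ImportError:
        # Tell the user what is missing and what to do about it,
        # instead of showing a traceback.
        message = ("This tool needs the '%s' module, which is not installed. "
                   "Try: %s" % (name, install_hint))
        return None, message


module, problem = load_required_module("json", "reinstall Python")
if problem:
    print(problem)
```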

### Running the tool
* make the script executable: `chmod +x script.py`
  + requires a shebang line like `#!/usr/bin/env python3` as the first line
  + run with `./script.py`
  + the script can also be double-clicked in a GUI

### Building a script as a single application using PyInstaller
* PyInstaller creates all-in-one apps
  + available through pip (pip install pyinstaller)
  + bundles Python and all required modules and files into a single app
  + the app can run your tool even if the target system doesn't have Python installed
  
### Converting slidecount.py to an executable file with PyInstaller
* the script imports several Python packages
* run `pyinstaller -w -F slidecount.py` (prefixed with `!` in a notebook cell); `-F` builds a single file and `-w` suppresses the console window
* an executable file is generated in the dist/ folder
* make sure all the required packages are installed and available in your virtual environment so that PyInstaller can find them

In [19]:
!pyinstaller -w -F slidecount.py

46 INFO: PyInstaller: 4.7
46 INFO: Python: 3.8.10
64 INFO: Platform: Linux-5.11.0-40-generic-x86_64-with-glibc2.29
64 INFO: wrote /home/yuan/git_repos/advanced_python/notebooks/python_tools/slidecount.spec
67 INFO: UPX is not available.
69 INFO: Extending PYTHONPATH with paths
['/home/yuan/git_repos/advanced_python/notebooks/python_tools']
257 INFO: checking Analysis
257 INFO: Building Analysis because Analysis-00.toc is non existent
257 INFO: Initializing module dependency graph...
259 INFO: Caching module graph hooks...
271 INFO: Analyzing base_library.zip ...
2799 INFO: Processing pre-find module path hook distutils from '/home/yuan/Documents/py3_env/lib/python3.8/site-packages/PyInstaller/hooks/pre_find_module_path/hook-distutils.py'.
2800 INFO: distutils: retargeting to non-venv dir '/usr/lib/python3.8'
5271 INFO: Caching module dependency graph...
5444 INFO: running Analysis Analysis-00.toc
5485 INFO: Analyzing /home/yuan/git_repos/advanced_python/notebooks/python_tools/slidecoun