# Python From the Terminal Prompt
## Why? I LOVE JupyterHub!
JupyterHub is great for developing code and running short scripts, but it's not the best option for developing reusable code or processing large datasets. A few reasons why:
- Jupyter adds some overhead when running code, and it adds a layer of complexity to the Python kernels it runs.
- Because of this overhead and added complexity, it's not the best option for processing large datasets.
- Jupyter doesn't enforce modular coding techniques.  In fact, I'd argue that it encourages long, unstructured scripts.
- If you want to shift to a different geographic region or a different time window, you have to actually edit the code, making your process prone to errors.
- Jupyter doesn't exist everywhere... and requires more setup than just Python.

## Putting Our Workflow in a .py File
Now that we have our workflow running relatively well in Jupyter, it's time to structure it in a .py file so we can reuse it in other notebooks and other projects.  It's really as easy as cut and paste.

[Python File](python/uber-workflow.py)


## Running Python from the Command Line
Now that we have our .py file, we can open up a terminal in JupyterHub and type ```python uber-workflow.py``` from the python directory.

## Organizing our Workflow Code
As your code base gets bigger, you'll find that having all of your code in a single file makes functions hard to find. It also makes working on a coding project as a team more difficult, as everyone will be editing the same file. (See the Git Workshop later this week!) So, what is often done is that code is divided up into files based on its function, and the core workflow code also gets its own file. You might hear other programming tutorials refer to this core workflow file as the "driver module".  Below is a link to the files split up into workflow.py and modules.py.  Notice how the workflow.py file imports the modules file to find the functions it calls.  As your code grows, you'll probably want to continue to break the code up into manageable-sized files. One option here might be to put the acquisition modules in one file, the manipulation modules in another and the visualization modules in a third.

[workflow.py](workflow.py)  
[modules.py](modules.py)

You run this from the command line just like before... with ```python workflow.py```.


## Supercharging the Terminal Prompt: Command Line Arguments
So, now we can run our code from the command line, but we still have to modify our code every time we want to change the data we're loading, so how is this better? I thought you said this is one reason why JupyterHub isn't so great.

Introducing... command line arguments. When you type ```python <filename>``` at the terminal prompt, you can pass additional arguments to the file to tell it what to do. Let's, for a quick example, pass our start and end date to the file. We'll call this new version, which supports command line arguments, workflowCLI.py, where CLI stands for Command Line Interface. You'll see that term used a lot to describe controlling how a program runs through command line arguments.

[workflowCLI.py](workflowCLI.py)

You'll see we used the sys.argv list, from Python's built-in sys module, to retrieve our command line arguments in the code. Note that the first element in the list, \[0\], is the script name (here, workflowCLI.py), so normally you'll want to start with the second element in the list, \[1\]. We'll assume the first argument is the start date, and the second the end date, but if we had more time, we might add some code to print out what we're expecting for command line arguments.
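
As a rough idea of what that extra checking might look like, here's a minimal sketch. The function name `parse_dates` and the wording of the usage message are made up for this example and are not taken from workflowCLI.py.

```python
import sys


def parse_dates(argv=None):
    """Return (start_date, end_date) from an argv-style list.

    parse_dates is a hypothetical helper for illustration only.
    argv[0] is the script name, so the dates are expected at argv[1] and argv[2].
    """
    if argv is None:
        argv = sys.argv  # the real command line when nothing is passed in
    if len(argv) != 3:
        # Tell the user what we expected instead of failing with an IndexError.
        raise SystemExit(f"Usage: python {argv[0]} <start-date> <end-date>")
    return argv[1], argv[2]


# Inside the workflow file you would then write something like:
#     start_date, end_date = parse_dates()
```

Passing the list in as a parameter also makes the function easy to try out in a notebook without touching the real command line.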

Now, we can call our code and specify the start and end dates without ever touching the actual code with something like ```python workflowCLI.py 2016-06-01 2016-11-01```. Note that by default, a space will separate the command line arguments, so if you need a space inside an argument, put that argument in quotes.
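
To see how quoting changes what ends up in the argument list, you can imitate the shell's splitting with the standard library's `shlex.split`. This is only an approximation of what a POSIX-style shell does, and the date-with-time value below is invented for the example.

```python
import shlex

# Without quotes, the space splits the date and the time into two arguments.
unquoted = shlex.split("python workflowCLI.py 2016-06-01 00:00 2016-11-01")

# With quotes, "2016-06-01 00:00" arrives as a single argument.
quoted = shlex.split('python workflowCLI.py "2016-06-01 00:00" 2016-11-01')

print(unquoted)  # ['python', 'workflowCLI.py', '2016-06-01', '00:00', '2016-11-01']
print(quoted)    # ['python', 'workflowCLI.py', '2016-06-01 00:00', '2016-11-01']
```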

## Python argparse Module
For a more sophisticated CLI, you can use the argparse module from Python's standard library. It lets you pass command line arguments by name, in any order, like ```python workflowCLIv2.py --startDate 2016-06-01 --endDate 2016-11-01```, and it generates a ```--help``` message for you.  More info at the links below.  
[Python Docs](https://docs.python.org/3/library/argparse.html)  
[Python Tutorial](https://docs.python.org/3/howto/argparse.html)
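
Here's a minimal sketch of what an argparse version could look like. The flag names `--startDate` and `--endDate` and the helper names are assumptions made for this example; they aren't taken from an actual workflowCLIv2.py.

```python
import argparse


def build_parser():
    """Build a parser with two required, named date arguments (hypothetical names)."""
    parser = argparse.ArgumentParser(description="Run the workflow for a date range.")
    parser.add_argument("--startDate", required=True, help="start date, e.g. 2016-06-01")
    parser.add_argument("--endDate", required=True, help="end date, e.g. 2016-11-01")
    return parser


def parse_args(argv=None):
    # With argv=None, argparse reads the arguments from sys.argv[1:].
    return build_parser().parse_args(argv)


# Inside the workflow file you would then write something like:
#     args = parse_args()
#     print(args.startDate, args.endDate)
```

If a required argument is missing, argparse prints a usage message and exits, which covers the "tell the user what we're expecting" step without any extra code.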