If you are running on Quest, run the following commands to load the required packages and files:
<br>`module load python/anaconda3.6`
<br>`module load parallel`
<br>`wget https://raw.githubusercontent.com/aGitHasNoName/PythonForAutomation/main/questFiles5.txt`
<br>`parallel wget {} :::: questFiles5.txt`

# Pipe the output of one script to the input of another
<br>There are two things we need to do to make this happen. 
1. We need to set the second Python script up to take in piped input.
2. We need to use the pipe character `|` on the command line.

The second part is easy, but the first part is a little tricky.

### `sys.stdin`
`sys.stdin` is a Python file object that gives your script access to whatever is piped or directed into it on the command line. To use it, the script needs `import sys`.

**`sys.stdin` behaves like a file opened for reading: looping over it gives you one line at a time. You can also read everything into a single string (which is easy to convert to a number, if that's what you need) with `sys.stdin.read()` - just like `f.read()`.**
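
As a minimal sketch of the two ways of reading described above (this is not the contents of `PIPEargv2.py`): the function names `read_number` and `read_lines` are made up for illustration, and `io.StringIO` stands in for `sys.stdin` so the example runs without a pipe.

```python
import io
import sys


def read_number(stream=None):
    """Read a whole text stream as one string and convert it to an int."""
    if stream is None:
        stream = sys.stdin  # whatever was piped or directed in
    text = stream.read()  # one string, just like f.read()
    return int(text.strip())  # drop the trailing newline, then convert


def read_lines(stream):
    """Loop over a text stream one line at a time, as with an open file."""
    return [line.rstrip("\n") for line in stream]


# io.StringIO stands in for sys.stdin here.
print(read_number(io.StringIO("3\n")) + 100)  # prints 103
print(read_lines(io.StringIO("kiwi\nplum\n")))  # prints ['kiwi', 'plum']
```

In a real script, calling `read_number()` with no argument would read from `sys.stdin` instead.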

You can also take in a file from the command line using `sys.stdin`, without needing to receive that file object from another script, so that is how we will practice first. Let's use the script `PIPEargv2.py` to see how I've set it up.
<br><br>After we look at the script, we can run it on the command line by directing a file in to the script. We'll use the file `number.txt` which only contains the number 3.
<br><br>When we use sys.stdin, we don't need to represent that data on the command line with {} or another placeholder. Instead, we will **direct** the file in with the **<** sign.
<br><br>`python PIPEargv2.py kiwi < number.txt`

<br><br>Now that the script is ready to take input in through `sys.stdin`, we can pipe the output from another script into it.
<br><br>Remember `add100.py`, which adds 100 to a number? We can call that script with the argument 8, and then pipe the output (108) into the `PIPEargv2.py` script. This means that instead of printing the number to the screen, bash takes whatever the first script prints and hands it directly to the next script as `sys.stdin`. Nothing is saved to a file along the way.
<br><br>`python add100.py 8 | python PIPEargv2.py kiwi`
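
To make the hand-off more concrete, here is a rough sketch that imitates a pipe from inside Python with the `subprocess` module. The two one-line programs are hypothetical stand-ins, not the contents of `add100.py` or `PIPEargv2.py`. A real shell pipe starts both programs at the same time and streams data between them; this sketch runs them one after the other, which is enough to show that the first program's standard out becomes the second program's standard in.

```python
import subprocess
import sys

# Hypothetical stand-ins for the two scripts in the pipeline.
first_code = "print(8 + 100)"
second_code = "import sys; print('received', int(sys.stdin.read()))"

# Run the first program and capture what it prints to standard out.
first = subprocess.run(
    [sys.executable, "-c", first_code],
    stdout=subprocess.PIPE,
    universal_newlines=True,
)

# Hand that captured text to the second program as its standard in.
second = subprocess.run(
    [sys.executable, "-c", second_code],
    input=first.stdout,
    stdout=subprocess.PIPE,
    universal_newlines=True,
)

print(second.stdout, end="")  # prints: received 108
```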

### <br><br>`sys.stdout`
Let's think about how the output of the first script is getting to the second script.

### <br>Redirect output from `print()` with > on the command line
This will take anything that gets printed to the screen and **instead** print it to a file.

`python add100.py 3 > number.txt`

### Print to the screen and save to a file by piping and using tee
If we want the output to appear on the screen and also get saved in a file, we can also pipe the output of a script to the tee command.
<br><br>`python add100.py 3 | tee numberT.txt`

<br>The PowerShell equivalent is Tee-Object.
<br>`python add100.py 3 | Tee-Object numberT.txt`

<br>By default, the `print()` function writes to what is called standard out. We can also use `sys.stdout.write()` to write to standard out directly; unlike `print()`, it needs a string and does not add a newline for you. There are ways to explicitly print to another file, but we're not going to cover that today. Let's look at a reworked version of the add100 script, called `PIPEadd100.py`.
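
A rough sketch of how the two ways of writing to standard out compare. The helpers `add100` and `format_result` are hypothetical and are not necessarily what `PIPEadd100.py` contains.

```python
import sys


def add100(value):
    """Hypothetical helper: convert the input to an int and add 100."""
    return int(value) + 100


def format_result(value):
    """Build the exact text to send to standard out."""
    # write() needs a string and, unlike print(), adds no newline itself.
    return str(add100(value)) + "\n"


print(add100("3"))  # prints 103
sys.stdout.write(format_result("3"))  # also prints 103
```

Both lines end up in the same place, so either version can be redirected with `>` or piped with `|`.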

<br>And we can run it the exact same way on the command line to see that it works the same as the version with `print()`:
<br><br>`python PIPEadd100.py 3 | tee numberT.txt`

### <br><br>Pipe with or without saving the intermediate outputs.
Let's revisit our full pipeline, but use the number 60 as the input for the first script.
<br><br>`python add100.py 60 | python PIPEargv2.py kiwi`
<br><br>This will not save the number 160 to any file - it only gets passed to the next piece of the pipeline. If we want to save it, we have to tee it up in the middle:
<br><br>`python add100.py 60 | tee number.txt | python PIPEargv2.py kiwi`
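
To see what `tee` is doing in the middle of that pipeline, here is a small Python imitation. It is only a sketch: the function name `tee` is reused for illustration, and `io.StringIO` objects stand in for standard in, the next program in the pipeline, and `number.txt`.

```python
import io


def tee(source, *sinks):
    """Copy every line from source to each sink, as the tee command does."""
    for line in source:
        for sink in sinks:
            sink.write(line)


# StringIO objects stand in for standard in, the next script, and number.txt.
incoming = io.StringIO("160\n")
downstream = io.StringIO()
saved = io.StringIO()
tee(incoming, downstream, saved)

print(downstream.getvalue() == saved.getvalue())  # prints True
```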

### <br><br>Exercise
1. Copy and paste the last pipeline we wrote and have the final output print to a file called kiwis.txt instead of printing to the screen.
2. Copy and paste the last pipeline we wrote and have the final output print to a file called kiwis.txt AND print to the screen.

ANSWERS:
<br><br>`python add100.py 60 | tee number.txt | python PIPEargv2.py kiwi > kiwis.txt`
<br><br>`python add100.py 60 | tee number.txt | python PIPEargv2.py kiwi | tee kiwis.txt`

### <br><br>A more complicated example
We can also include command line software programs in our pipeline. Each program may have its own way of being told to use standard in or standard out, or it may do so by default. With `pdftotext`, for example, a `-` in place of the output file name sends the text to standard out.
<br><br>I'm going to walk through a pipeline that starts with a PDF, uses a command line tool to convert it to text, feeds that into a Python script that reformats the text into a CSV file, then passes the CSV to a version of our `sortEmails.py` script. The pipeline looks like this:
<br><br>`pdftotext email_addresses4.pdf - | python PIPEemailsTxtToCsv.py | python PIPEsortEmails.py > final.csv`
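
The middle step of a pipeline like this is a filter: it reads text from standard in and writes CSV to standard out. Below is a minimal sketch of that pattern. It is not the real `PIPEemailsTxtToCsv.py`; the function name `text_to_csv`, the column names, and the assumption that the text has one email address per line are all made up for illustration.

```python
import csv
import io


def text_to_csv(source, sink):
    """Read one address per line from source; write CSV rows to sink."""
    writer = csv.writer(sink, lineterminator="\n")
    writer.writerow(["address", "user", "domain"])  # assumed column names
    for line in source:
        address = line.strip()
        if "@" not in address:
            continue  # skip blank lines and anything that is not an address
        user, domain = address.split("@", 1)
        writer.writerow([address, user, domain])


# In a real pipeline the call would be text_to_csv(sys.stdin, sys.stdout).
sample = io.StringIO("ann@example.com\n\nbob@example.org\n")
result = io.StringIO()
text_to_csv(sample, result)
print(result.getvalue(), end="")
```

Because the function only needs file-like objects, the same code works whether the text comes from a pipe, a redirected file, or a test string.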

And in parallel!
<br><br>`parallel "pdftotext {.}.pdf - | python PIPEemailsTxtToCsv.py | python PIPEsortEmails.py > {.}.csv" ::: *.pdf`
<br><br>Or append all of the outputs to one file:
<br><br>`parallel "pdftotext {.}.pdf - | python PIPEemailsTxtToCsv.py | python PIPEsortEmails.py >> final.csv" ::: *.pdf`