## Creating Scripts

(NOTE: our notebook containers are rebuilt every time we stop and start our server, so we need to reinstall any software and redo any system changes we make.)

## THIS NOTEBOOK ONLY WORKS on a machine that the user can modify!
Now, let's create our first "script", which we can execute directly from the shell. A script is a sequence of shell commands that can be invoked as a single command. Scripts also support if-then-else, loops, and so on.
But for now, we will keep things simple.

* Go to the shell (click on the Jupyter icon, then New, then Terminal)
* Type `sudo apt-get -y install nano`   (This will install a simple text editor)
* Type `cd`
* Type `nano getTemp`: This will use the text editor `nano` and create a file called `getTemp`
* Type the command from the earlier module, but without the `!` in front of `curl` (why?)




```
#!/bin/sh
curl -s "http://api.ipstack.com/128.122.85.5?access_key=c2192e9aa79a13153a328f383b810862" | \
jq '"http://api.openweathermap.org/data/2.5/weather?q=" + .city + "," + .region_name + "&mode=json&units=imperial&appid=ffb7b9808e07c9135bdcc7d1e867253d"' | \
xargs curl -s | jq '.main.temp'


```
* Notice the addition of the first line `#!/bin/sh`. We will get back to that later.
* Press Ctrl+X to exit; type Y when asked whether to save, then press Enter to confirm the file name
* In the shell, type `chmod +rx getTemp`. This makes our file _executable_.
* Finally type `./getTemp` and see what happens.
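
The same cycle of creating a file, marking it executable, and running it can also be done without an editor. The following is a minimal sketch, assuming a POSIX shell; the script name `hello` and the scratch directory are made up for illustration and are not part of the exercise.

```shell
# Work in a scratch directory so nothing in the home directory is touched
workdir=$(mktemp -d)
cd "$workdir"

# Write a two-line script; the quoted 'EOF' stops the shell from expanding anything inside
cat > hello <<'EOF'
#!/bin/sh
echo "Hello from a script"
EOF

# Make the file readable and executable, then run it by path
chmod +rx hello
output=$(./hello)
echo "$output"
```

Without the `chmod` step, running `./hello` would typically fail with a "Permission denied" error.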

#### Exercise

* Modify the script to add an extra command that also prints the date
* Modify the script so that the output goes into the file `~/NYC-Temperature.txt`


## Running Jobs in the Background

Sometimes we would like to start a task and let it run in the background. To do so, we add the character `&` at the end of the command. For example, to run a long-running `grep` task and store the results in a file, we can type the following (again, you need to be at the command prompt in Linux):

`cd`

`cd data`

`grep -R 'MORIMOTO' . > morimoto.txt &`
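
As a small illustration of what `&` does, the sketch below uses `sleep` as a stand-in for a long-running task (the file name `result.txt` is hypothetical). The special variable `$!` holds the process ID of the most recent background job, and `wait` blocks until that job finishes.

```shell
workdir=$(mktemp -d)

# The trailing & starts the task in the background; the shell continues at once
(sleep 1; echo "search finished" > "$workdir/result.txt") &
bg_pid=$!
echo "Started background job with PID $bg_pid"

# The shell is free to do other work here while the task runs

# Block until the background job has exited, then read what it wrote
wait "$bg_pid"
result=$(cat "$workdir/result.txt")
echo "$result"
```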

#### Standard output, Standard Error

When we run tasks in the background, it is often useful to store the program output and the program errors separately. The `>` operator redirects only standard output; the `2>` operator redirects standard error (the error messages) to the file of our choice.

`grep -R 'MORIMOTO' . > morimoto.txt 2> morimoto-errors.txt &`

If we prefer to store both standard output and standard error in the same file, we add the `2>&1` redirection after the output redirect:

`grep -R 'MORIMOTO' . > morimoto.txt 2>&1 &`
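
To see the two streams separately, here is a minimal sketch in which a small shell function stands in for `grep` (the function `noisy` and the file names are made up for illustration). It also shows why the position of `2>&1` matters.

```shell
workdir=$(mktemp -d)

# Stand-in task: one line to standard output, one line to standard error
noisy() {
    echo "a result"
    echo "a warning" >&2
}

# Separate files for output and errors
noisy > "$workdir/out.txt" 2> "$workdir/err.txt"

# One file for both: 2>&1 comes AFTER the > redirect
noisy > "$workdir/both.txt" 2>&1

# Reversed order: stderr is pointed at the old stdout before stdout is moved,
# so the warning escapes the redirect and is what gets captured here
escaped=$(noisy 2>&1 > /dev/null)
echo "$escaped"
```

Redirections are processed left to right, which is why `> file 2>&1` collects both streams in the file while `2>&1 > file` does not.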

#### Nohup

When we use the `&` operator, the task runs in the background, but stops running the moment we log out of our SSH session. To allow the task to continue running even after we log out, we can use the `nohup` command, as follows:

`nohup grep -R 'MORIMOTO' . > morimoto.txt 2> morimoto-errors.txt &`
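
The sketch below applies the same pattern to a short stand-in task, assuming the standard `nohup` utility is installed; the file names `task.out` and `task.err` are hypothetical. Because both streams are redirected, `nohup` has no need to create its default `nohup.out` file.

```shell
workdir=$(mktemp -d)

# Start a stand-in task that ignores hangup signals and runs in the background
nohup sh -c 'sleep 1; echo finished' > "$workdir/task.out" 2> "$workdir/task.err" &
task_pid=$!

# kill -0 sends no signal; it only checks whether the process still exists
if kill -0 "$task_pid" 2>/dev/null; then
    echo "Task $task_pid is still running"
fi

# In a real session we could log out here; in this sketch we simply wait
wait "$task_pid"
cat "$workdir/task.out"
```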

### Exercise

Start downloading a big data set (e.g., the restaurant data set) using `curl`. Use the `-s` option to put it in silent mode, and use the `nohup` command and the `&` operator to let the process run in the background.

## Cron: Scheduling Tasks

      
Cron is used to execute desired tasks (in the background) at designated times. 

A crontab is a simple text file with a list of commands meant to be run at specified times. These jobs run regardless of whether the user is logged into the system.

To use cron for tasks meant to run only for your user profile, add entries to your own user's crontab file. Start the crontab editor from a terminal window (without `sudo`, since `sudo crontab -e` would edit root's crontab instead of your own):

#### `export EDITOR=nano`

#### `crontab -e`


We will see below how to generate a crontab entry from a notebook cell, without using nano.

### The structure of the crontab file

This is how a cron job is laid out:

minute (0-59), hour (0-23, 0 = midnight), day (1-31), month (1-12), weekday (0-6, 0 = Sunday), command

and each line of the crontab file has the following format:

`minute hour day_of_month month day_of_week   command`

The fields are separated by whitespace; the final field (the command) may itself contain spaces.
For example, you can back up all user home directories at 5 a.m. every Monday with:

`0 5 * * 1 tar -zcf /var/backups/home.tgz /home/`
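
To make the field layout concrete, the following sketch splits that entry into its parts with the shell's `read` builtin. It only illustrates the format and is not how cron itself parses the file; the variable names are chosen for this example.

```shell
entry='0 5 * * 1 tar -zcf /var/backups/home.tgz /home/'

# read splits on whitespace; the last variable receives the rest of the line,
# which is why the command may contain spaces
read -r minute hour day_of_month month day_of_week command <<EOF
$entry
EOF

echo "minute:       $minute"
echo "hour:         $hour"
echo "day of month: $day_of_month"
echo "month:        $month"
echo "day of week:  $day_of_week"
echo "command:      $command"
```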

#### More examples

`01 04 1 1 1 /usr/bin/somedirectory/somecommand`

The above example will run /usr/bin/somedirectory/somecommand at 4:01am on January 1st plus every Monday in January (when both the day-of-month and weekday fields are restricted, cron runs the job when either one matches). An asterisk (\*) can be used so that every instance (every hour, every weekday, every month, etc.) of a time period is used:


`01 04 * * * /usr/bin/somedirectory/somecommand`

The above example will run /usr/bin/somedirectory/somecommand at 4:01am on every day of every month.

Comma-separated values can be used to run a command at more than one value of a field. Dash-separated values can be used to run a command over a range of values. For example:

`01,31 04,05 1-15 1,6 * /usr/bin/somedirectory/somecommand`

The above example will run /usr/bin/somedirectory/somecommand at 01 and 31 past the hours of 4:00am and 5:00am on the 1st through the 15th of every January and June.

The `/usr/bin/somedirectory/somecommand` text in the above examples indicates the task which will be run at the specified times. It is recommended that you use the full path to the desired commands, as shown in the above examples. Enter `which somecommand` in the terminal to find the full path to `somecommand`. The crontab takes effect as soon as it is properly edited and saved.
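
As a small sketch of the path lookup, the snippet below uses `command -v`, the POSIX builtin that performs the same kind of lookup as `which`. It looks up `sh` only because that program exists on practically every system; the name `no-such-command-xyz` is deliberately made up.

```shell
# Print the full path of an executable found on PATH
cmd_path=$(command -v sh)
echo "Full path to use in the crontab: $cmd_path"

# A command that is not installed produces no path and a non-zero exit status
if ! command -v no-such-command-xyz > /dev/null 2>&1; then
    echo "no-such-command-xyz is not on PATH"
fi
```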

You may want to run a script some number of times per time unit. For example, if you want to run it every 10 minutes, use the following crontab entry (it runs on minutes divisible by 10: 0, 10, 20, 30, etc.):

`*/10 * * * * /usr/bin/somedirectory/somecommand`

which is also equivalent to the more cumbersome

`0,10,20,30,40,50 * * * * /usr/bin/somedirectory/somecommand`
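
The equivalence can be checked with a short loop that lists every minute from 0 to 59 that a step of 10 selects. This is only a sketch of the arithmetic, not of cron's own implementation.

```shell
step=10
minutes=""
m=0
while [ "$m" -le 59 ]; do
    # A minute matches */10 when it is divisible by the step
    if [ $((m % step)) -eq 0 ]; then
        # Append with a comma separator, except before the first value
        minutes="${minutes:+$minutes,}$m"
    fi
    m=$((m + 1))
done
echo "$minutes"
```

Changing `step` to 15 gives the minute list for `*/15`, and so on.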


(See https://help.ubuntu.com/community/CronHowto for more details)


### You need to start the system "cron" service that checks crontab entries
The following command will do that...

In [None]:
# Start cron
!sudo service cron start


#### You will also need to reinstall nano every time your machine is restarted, and recreate your crontab entry...
(Actually see below, there is an easier way)


In [None]:
!sudo apt-get -y install nano


##### Exercise (We will work on this in class)

* Create a cron job that keeps track of the temperature in New York (at an NYU server location), running every 5 minutes. Use the OpenWeatherMap service and jq to get the temperature (`.main.temp`). Use the append redirect operator (`>>`) to store the temperature in a text file called `~/NYC-Temperatures.txt`, adding a new line for every measurement. Once you have that working, add the current time (`.dt`) to the jq filter. Let it run for a while, and then create a homework notebook that has the crontab entry (`!crontab -l`), a listing of your getTemp script (`!cat ~/getTemp`), and the times and temperatures (`!cat ~/NYC-Temperatures.txt`).

In [31]:
# The following commands create a crontab entry for me (replacing any existing crontab) and list it
!echo "*/10 * * * * /home/nwhite/getTemp"|crontab
!crontab -l

*/10 * * * * /home/nwhite/getTemp
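
Note that piping a single line into `crontab` replaces the whole crontab. A common pattern for adding an entry while keeping the existing ones is sketched below. To keep the sketch safe to run, it builds the new crontab text in a scratch file and leaves the actual `crontab` calls as comments; the sample existing entry is made up.

```shell
workdir=$(mktemp -d)
new_entry='*/10 * * * * /home/nwhite/getTemp'

# On a real system this line would be: crontab -l > "$workdir/current" 2>/dev/null
# Here a made-up existing entry stands in for the current crontab
printf '%s\n' '0 5 * * 1 tar -zcf /var/backups/home.tgz /home/' > "$workdir/current"

# Keep every existing line, and add the new entry only if it is not already present
cp "$workdir/current" "$workdir/merged"
if ! grep -qxF -- "$new_entry" "$workdir/merged"; then
    printf '%s\n' "$new_entry" >> "$workdir/merged"
fi

cat "$workdir/merged"
# To install the result on a real system: crontab "$workdir/merged"
```

Because of the `grep` check, running the snippet twice does not add the entry a second time.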


In [33]:
#
#Here is my shell script
!cat /home/nwhite/getTemp

#!/bin/sh
curl -s "http://api.ipstack.com/128.122.85.5?access_key=c2192e9aa79a13153a328f383b810862" | \
jq '"http://api.openweathermap.org/data/2.5/weather?q=" + .city + "," + .region_name + "&mode=json&units=imperial&appid=ffb7b9808e07c9135bdcc7d1e867253d"' | 
xargs curl -s | jq '.dt, .main.temp' >>~/NYC-Temperatures.txt



In [34]:
!cat /home/nwhite/NYC-Temperatures.txt


1536864960
76.57
1536864960
76.57
1536864960
76.57
1537201620
77.04
1538406900
71.13
1538409360
72.99
1538409360
72.99
1538409360
72.99
1538410620
73.33
1538410620
73.33
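
Because the jq filter in the script prints `.dt` and `.main.temp` on separate lines, the file alternates between timestamps and temperatures. A hedged sketch of pairing them up with `paste` follows; the sample values are copied from the listing above and the variable names are illustrative.

```shell
# Two readings in the same alternating layout as ~/NYC-Temperatures.txt
sample='1537201620
77.04
1538406900
71.13'

# paste - - reads two lines at a time and joins them; -d' ' uses a space as the separator
paired=$(printf '%s\n' "$sample" | paste -d' ' - -)
echo "$paired"
```

On the real file, the equivalent command would be `paste -d' ' - - < ~/NYC-Temperatures.txt`.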


In [35]:
!ls -alt /home/nwhite


total 100
-rw-rw-r--  1 nwhite ubuntu   170 Oct  1 16:50 NYC-Temperatures.txt
-rw-r--r--  1 nwhite ubuntu    34 Oct  1 16:42 mycron
drwxrwsr-x 15 root   ubuntu  4096 Oct  1 16:34 .
drwxrwsr-x 16 nwhite ubuntu  4096 Oct  1 14:23 notebooks
drwxr-xr-x  1 nwhite ubuntu     0 Oct  1 14:23 data
drwxrwxrwx  1 root   root    4096 Oct  1 14:22 ..
-rw-rw----  1 nwhite ubuntu  7200 Sep 26 21:18 .bash_history
drwxrwsr-x  2 nwhite ubuntu  4096 Sep 26 21:18 mystuff
-rw-rw-r--  1 nwhite ubuntu  1166 Sep 15 19:54 .nbgrader.log
drwxrwsr-x  3 nwhite ubuntu  4096 Sep 15 19:48 .jupyter
-rwxrwxr-x  1 nwhite ubuntu   325 Sep 13 19:13 getTemp
-rwxrwxr-x  1 nwhite ubuntu   324 Sep 13 19:03 getTemp~
-rw-rw-r--  1 nwhite ubuntu    66 Sep 12 22:23 .selected_editor
drwxrwxrwx  8 root   root    4096 Sep 12 16:15 assignments
drwxrwS---  2 nwhite ubuntu  4096 Sep 11 19:50 .ssh
drwxrwsr-x  2 nwhite ubuntu  4096 Sep  2 21:58 .keras
drwxrwS---  5 nwhite ubuntu  4096 Aug 31 15:47 .cache
drwxrwsr-x  3 n