## Scripting Version Control and Deployment

## Magic Numbers

Magic numbers provide a way of identifying a file's type (e.g. gif, jpg, executable). Alternative ways of determining this include extension conventions (.gif, .jpg, .exe, .txt, etc.).

A magic number is a number embedded at or near the beginning of a file that indicates its file format (i.e., the type of file it is). It is also sometimes referred to as a file signature.

Use the Linux `file` command to inspect a file's magic number and report its type, e.g. `file <filename>`.
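
The idea of matching a signature at the start of a file can be sketched in a few lines of Python. This is only an illustration, not how the `file` command is implemented: the `SIGNATURES` table and the `sniff` and `sniff_file` names are made up for this example, and the table covers only a handful of well-known signatures.

```python
# A tiny, incomplete table of well-known file signatures (magic numbers).
SIGNATURES = [
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\x7fELF", "elf executable"),
    (b"#!", "script"),
]


def sniff(data):
    """Return a type name for the leading bytes of a file, or 'unknown'."""
    for signature, name in SIGNATURES:
        if data.startswith(signature):
            return name
    return "unknown"


def sniff_file(path):
    """Read only the first few bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff(handle.read(16))
```

Calling `sniff_file` on a path classifies the file by its content, regardless of its extension.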

### The Shebang

Executable script files in Linux (and Unix in general), i.e. script files that have the executable mode bit set, begin with the two characters `#!`, referred to as the "shebang". The next few bytes of the file contain the path of the interpreter executable that is responsible for parsing and executing the script, followed by a newline character, e.g. `#!/usr/bin/python3`.

The `#!` is simply a human-readable magic number (the magic byte string being 0x23 0x21) that is identified by the Linux loader (the code that loads the code and data of an executable into memory and then runs the program by jumping to its first instruction). The loader in turn starts the interpreter whose path begins at byte 3 and passes the script to it as an argument. You can always do this manually by launching the interpreter with the script as an argument, e.g. `/usr/bin/python3 myscript.py` (where `myscript.py` stands for your own script).
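
To make the layout of a shebang line concrete, here is a minimal Python sketch that extracts the interpreter path from the first bytes of a script. The function name `parse_shebang` is hypothetical, and the sketch assumes the Linux behaviour in which everything after the interpreter path is passed on as a single optional argument.

```python
def parse_shebang(data):
    """Return (interpreter, argument) for a script's leading bytes, or None.

    `argument` is None when the shebang line names only an interpreter.
    """
    if not data.startswith(b"#!"):  # the magic bytes 0x23 0x21
        return None
    # The interpreter path starts at byte 3 and runs to the end of the line.
    line = data[2:].split(b"\n", 1)[0].strip()
    if not line:
        return None
    parts = line.split(None, 1)
    interpreter = parts[0].decode()
    argument = parts[1].decode() if len(parts) > 1 else None
    return interpreter, argument
```

For a script starting with `#!/usr/bin/python3`, the function returns the interpreter path and no argument; for `#!/usr/bin/env python3` it returns `/usr/bin/env` with `python3` as the argument.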

## Version Control with Git

Git is an open source distributed version control system. It has a tiny footprint with lightning fast performance.

Git uses files for storage. A commit is a file with the commit message, associated data (name, email, date/time, previous commit, etc) and with a link to a tree file. The tree file contains a list of objects or other trees. The object or blob is the actual content associated with the commit (the filename isn’t stored in the object, but in the tree). Files are stored with a filename of a SHA-1 hash of the object.
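
As a small illustration of content-addressed storage, the sketch below computes the SHA-1 identifier Git assigns to a blob: the hash of a short header (`blob`, the content length, a NUL byte) followed by the content itself. The function names `git_blob_id` and `loose_object_path` are made up for this example, and the path helper assumes the usual layout for unpacked ("loose") objects under `.git/objects/`.

```python
import hashlib


def git_blob_id(content):
    """Return the SHA-1 object id Git would assign to a blob with this content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_id):
    """Return the relative path of a loose object: two hex digits, then the rest."""
    return ".git/objects/" + object_id[:2] + "/" + object_id[2:]
```

Because the id depends only on the content, two files with identical content share a single blob, whatever their filenames are.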

#### Create a new repository:

Create a new directory, open it and run `git init` to create a new Git repository.

#### Checkout a repository:

Create a working copy of a local repository by running `git clone /path/to/repository`. When using a remote server, the command is `git clone username@host:/path/to/repository`.

#### Add and commit

You can propose changes (add them to the index) using `git add <filename>` or `git add *`.

This is the first step in the basic Git workflow. To actually commit these changes, use `git commit -m "Commit message"`.

Now the file is committed to the HEAD, but not in your remote repository yet.

#### Pushing Changes to the Remote Repository (eg. to Github)

Your changes are now in the HEAD of your local working copy. To send those changes to your remote repository, execute `git push origin master`.

Change master to whatever branch you want to push your changes to (master is default).

If you have not cloned an existing repository and want to connect your repository to a remote server, you need to add it with `git remote add origin <server>`.

Now you are able to push your changes to the selected remote server.


#### Reviewing the Commit Log

To review the repository's history, use `git log`.

You can add many parameters to make the log look the way you want, for example `git log --oneline` or `git log --author=<name>`.

## Coding style and structure, PEP-8, text editors and IDEs

PEP-8 gives coding conventions for the Python code comprising the standard library in the main Python distribution. See https://www.python.org/dev/peps/pep-0008/

### Integrated Development Environments and Editors

- PyCharm: https://www.jetbrains.com/pycharm/
- Vim
- Notepad++
- Whatever you're comfortable with

## Python Pip

The standard packaging tools are all designed to be used from the command line.

The command `python -m pip install SomePackage` installs the latest version of a module and its dependencies from the Python Package Index (PyPI), where `SomePackage` stands for the name of the package you want.

## Writing Command Line Tools

### Argparse Module

The argparse module makes it easy to write user-friendly command-line interfaces. The program defines what arguments it requires, and argparse will figure out how to parse those out of sys.argv. The argparse module also automatically generates help and usage messages and issues errors when users give the program invalid arguments.
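
A minimal sketch of this workflow follows. The greeting tool itself, its argument names and the `build_parser` and `main` helpers are invented for illustration; only the `argparse` calls are part of the standard library.

```python
import argparse


def build_parser():
    """Declare the arguments the tool accepts; argparse derives --help from this."""
    parser = argparse.ArgumentParser(description="Greet someone a number of times.")
    parser.add_argument("name", help="name of the person to greet")
    parser.add_argument("-n", "--count", type=int, default=1,
                        help="how many times to print the greeting (default: 1)")
    parser.add_argument("-s", "--shout", action="store_true",
                        help="print the greeting in upper case")
    return parser


def main(argv=None):
    """Parse argv (defaults to sys.argv[1:]) and print the greetings."""
    args = build_parser().parse_args(argv)
    greeting = "Hello, " + args.name + "!"
    if args.shout:
        greeting = greeting.upper()
    for _ in range(args.count):
        print(greeting)
    return greeting
```

Passing an explicit list such as `main(["Ada", "-n", "2"])` makes the tool easy to exercise from tests, while calling `main()` with no argument reads the real command line. An invalid value for `--count` makes argparse print a usage message and exit.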

For full documentation, visit https://docs.python.org/3/library/argparse.html

## Deployment

### VirtualEnv

virtualenv is a tool to create isolated Python environments.

The basic problem being addressed is one of dependencies and versions, and indirectly permissions. Imagine you have an application that needs version 1 of LibFoo, but another application requires version 2. How can you use both these applications? If you install everything into /usr/lib/python2.7/site-packages (or whatever your platform’s standard location is), it’s easy to end up in a situation where you unintentionally upgrade an application that shouldn’t be upgraded.

Or more generally, what if you want to install an application and leave it be? If an application works, any change in its libraries or the versions of those libraries can break the application.

Also, what if you can’t install packages into the global site-packages directory? For instance, on a shared host.

In all these cases, virtualenv can help you. It creates an environment that has its own installation directories, that doesn’t share libraries with other virtualenv environments (and optionally doesn’t access the globally installed libraries either).

virtualenv has one basic command, `virtualenv ENV`, where ENV is the directory in which to place the new virtual environment. It has a number of usual effects (modifiable by many options):

- ENV/lib/ and ENV/include/ are created, containing supporting library files for a new virtualenv python. Packages installed in this environment will live under ENV/lib/pythonX.X/site-packages/.
- ENV/bin is created, where executables live, notably a new python. Thus running a script with `#! /path/to/ENV/bin/python` would run that script under this virtualenv’s python.
- The crucial packages pip and setuptools are installed, which allow other packages to be easily installed to the environment. This associated pip can be run from ENV/bin/pip.

The python in your new virtualenv is effectively isolated from the python that was used to create it.
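
One way to see this isolation from inside Python is to compare the interpreter's prefixes. The sketch below is a common heuristic rather than an official virtualenv API; the function name `in_virtual_environment` is made up, and the check assumes that older virtualenv releases set `sys.real_prefix` while newer ones (and the standard library's `venv`) make `sys.base_prefix` differ from `sys.prefix`.

```python
import sys


def in_virtual_environment():
    """Best-effort check for whether this interpreter runs inside a virtual environment."""
    # Older virtualenv releases recorded the original prefix in sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # Otherwise compare the environment's prefix with the base installation's.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return base_prefix != sys.prefix
```

Printing `sys.prefix` alongside the result shows which installation directory the running interpreter uses for its packages.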

In a newly created virtualenv there will also be an activate shell script. For Windows systems, activation scripts are provided for the Command Prompt and PowerShell.

On POSIX systems, this resides in /ENV/bin/, so you can run `source bin/activate` from within the ENV directory.

For some shells (e.g. the original Bourne Shell) you may need to use the . command, when source does not exist. There are also separate activate files for some other shells, like csh and fish. bin/activate should work for bash/zsh/dash.

This will change your $PATH so its first entry is the virtualenv’s bin/ directory. (You have to use source because it changes your shell environment in-place.) This is all it does; it’s purely a convenience. If you directly run a script or the python interpreter from the virtualenv’s bin/ directory (e.g. path/to/ENV/bin/pip or /path/to/ENV/bin/python-script.py) there’s no need for activation.

The activate script will also modify your shell prompt to indicate which environment is currently active. To disable this behaviour, see VIRTUAL_ENV_DISABLE_PROMPT.

When activated, all pip commands will only affect the Python installation within the virtualenv. The `pip freeze` command lists all packages installed within that virtualenv. It is a good convention to store the list in a text file in your /ENV directory with `pip freeze > requirements.txt`.

To later install these requirements on a fresh virtualenv, just copy your code to it, activate it and run `pip install -r requirements.txt`.

To undo these changes to your path (and prompt), just run `deactivate`.

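To deploy your code, you can compress the entire /ENV directory and copy it to new systems (e.g. under /opt/). This works most reliably when the target system has the same Python installation and uses the same path, because the scripts in the environment embed absolute paths.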

For more detail, see https://virtualenv.pypa.io/en/stable/

### Conda

Conda is a more modern alternative to VirtualEnv with similar functionality - see https://conda.io/docs/user-guide/overview.html

## Scheduling your Python Scripts

If you design your script to be short-lived and automatically restarted (eg for batch runs, periodic reporting, etc), you can have the Cron service in Linux invoke it periodically.

The best documentation source for crontabs can be found by typing "man 5 crontab" on a Linux host.

The main configuration file for cron is /etc/crontab. In the example described here, it begins with four variable assignments (SHELL, PATH, MAILTO and HOME), followed by the task entries.

The first four lines are variables used to configure the environment in which the cron tasks are run. The SHELL variable tells the system which shell environment to use (in this example the bash shell), while the PATH variable defines the path used to execute commands. The output of the cron tasks is emailed to the username defined with the MAILTO variable. If the MAILTO variable is defined as an empty string (MAILTO=""), email is not sent. The HOME variable can be used to set the home directory to use when executing commands or scripts.

Each line in the /etc/crontab file represents a task and has the format `minute hour day month dayofweek command`, where:
- minute — any integer from 0 to 59
- hour — any integer from 0 to 23
- day — any integer from 1 to 31 (must be a valid day if a month is specified)
- month — any integer from 1 to 12 (or the short name of the month such as jan or feb)
- dayofweek — any integer from 0 to 7, where 0 or 7 represents Sunday (or the short name of the week such as sun or mon)
- command — the command to execute (the command can either be a command such as ls /proc >> /tmp/proc or the command to execute a custom script)

For any of the above values, an asterisk (*) can be used to specify all valid values. For example, an asterisk for the month value means execute the command every month within the constraints of the other values.

A hyphen (-) between integers specifies a range of integers. For example, 1-4 means the integers 1, 2, 3, and 4.

A list of values separated by commas (,) specifies a list. For example, 3, 4, 6, 8 indicates those four specific integers.

The forward slash (/) can be used to specify step values. The value of an integer can be skipped within a range by following the range with /<integer>. For example, 0-59/2 can be used to define every other minute in the minute field. Step values can also be used with an asterisk. For instance, the value */3 can be used in the month field to run the task every third month.
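
The asterisk, range, list and step rules above can be sketched as a small Python function that expands one field into the integers it matches. This is an illustration only, not cron's actual parser: `expand_field` is a hypothetical name, the caller supplies the valid range for the field, and short names such as jan or sun are not handled.

```python
from typing import List


def expand_field(field: str, lowest: int, highest: int) -> List[int]:
    """Expand one numeric crontab field into a sorted list of matching values."""
    values = set()
    for part in field.split(","):              # a comma separates list items
        range_text, _, step_text = part.strip().partition("/")
        step = int(step_text) if step_text else 1
        if step < 1:
            raise ValueError("step must be a positive integer: %r" % part)
        if range_text == "*":                  # asterisk: all valid values
            start, end = lowest, highest
        elif "-" in range_text:                # hyphen: inclusive range
            first, last = range_text.split("-", 1)
            start, end = int(first), int(last)
        else:                                  # a single integer
            start = end = int(range_text)
        if not lowest <= start <= end <= highest:
            raise ValueError("value out of range: %r" % part)
        values.update(range(start, end + 1, step))
    return sorted(values)
```

For example, `expand_field("*/3", 1, 12)` expands the month field from the text into months 1, 4, 7 and 10, and `expand_field("1-4", 0, 59)` gives 1, 2, 3 and 4.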

Any lines that begin with a hash mark (#) are comments and are not processed.

As shown in the /etc/crontab file, the run-parts script executes the scripts in the /etc/cron.hourly/, /etc/cron.daily/, /etc/cron.weekly/, and /etc/cron.monthly/ directories on an hourly, daily, weekly, or monthly basis respectively. The files in these directories should be shell scripts.

If a cron task is required to be executed on a schedule other than hourly, daily, weekly, or monthly, it can be added to the /etc/cron.d/ directory. All files in this directory use the same syntax as /etc/crontab.

Users other than root can configure cron tasks by using the crontab utility. All user-defined crontabs are stored in the /var/spool/cron/ directory and are executed using the usernames of the users that created them. To create a crontab as a user, log in as that user and type the command `crontab -e` to edit the user's crontab using the editor specified by the VISUAL or EDITOR environment variable. The file uses the same format as /etc/crontab. When the changes to the crontab are saved, the crontab is stored according to username and written to the file /var/spool/cron/username.

The cron daemon checks the /etc/crontab file, the /etc/cron.d/ directory, and the /var/spool/cron/ directory every minute for any changes. If any changes are found, they are loaded into memory. Thus, the daemon does not need to be restarted if a crontab file is changed.

## Daemonised (Long Running) Python Scripts

### Systemd

systemd is an init system used in Linux distributions to bootstrap the user space and manage all processes subsequently, instead of the UNIX System V or Berkeley Software Distribution (BSD) init systems.

Like the init daemon, systemd is a daemon that manages other daemons, which, including systemd itself, are background processes. systemd is the first daemon to start during booting and the last daemon to terminate during shutdown. The systemd daemon serves as the root of the user space's process tree; the first process (pid 1) has a special role on Unix systems, as it receives a SIGCHLD signal when a daemon process (which has detached from its parent) terminates. Therefore, the first process is particularly well suited for the purpose of monitoring daemons; systemd attempts to improve in that particular area over the traditional approach, which would usually not restart daemons automatically but only launch them once without further monitoring.

systemd executes elements of its startup sequence in parallel, which is faster than the traditional startup sequence's sequential approach. For inter-process communication (IPC), systemd makes Unix domain sockets and D-Bus available to the running daemons. The state of systemd itself can also be preserved in a snapshot for future recall.

#### Managing processes

The basic object that systemd manages and acts upon is a "unit". Units can be of many types, but the most common type is a "service" (indicated by a unit file ending in .service). To manage services on a systemd enabled server, our main tool is the systemctl command.

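All of the normal init system commands have equivalent actions with the systemctl command. We will use the nginx.service unit to demonstrate (you'll have to install Nginx with your package manager to get this service file). For example, `sudo systemctl start nginx.service`, `sudo systemctl stop nginx.service` and `sudo systemctl restart nginx.service` start, stop and restart the service.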

#### Enabling or Disabling Units

By default, most systemd units are not started automatically at boot. To configure this functionality, you need to "enable" the unit, e.g. with `sudo systemctl enable nginx.service`. This hooks it up to a certain boot "target", causing it to be triggered when that target is started. Use `sudo systemctl disable nginx.service` to undo this.

#### Getting an Overview of the System State

There is a great deal of information that we can pull from a systemd server to get an overview of the system state. For example, `systemctl list-units` lists the active units that systemd knows about.

#### Querying Unit States and Logs

Use `systemctl status nginx.service` to see the current state of a unit, and `journalctl -u nginx.service` to read the log entries recorded for it.

#### Stopping or Rebooting the Server

Use `sudo systemctl poweroff` to shut the server down and `sudo systemctl reboot` to restart it.

#### Creating unit files and building new systemd services

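Create a .service file in the systemd folder, for example /etc/systemd/system/my_daemon.service. A minimal unit file has a `[Unit]` section with a `Description`, a `[Service]` section whose `ExecStart` names the command to run, and an `[Install]` section whose `WantedBy` names the target that should start the service.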
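
The shape of such a file can be sketched with a small Python helper that renders a minimal unit file as text. The original example file is not reproduced here, so everything specific in this sketch is an assumption: the `render_unit` helper, the description and the `ExecStart` command (Python's built-in HTTP server on port 8000) are placeholders for your own service.

```python
# Template for a minimal service unit; the directive names are standard systemd ones.
UNIT_TEMPLATE = """\
[Unit]
Description={description}
After=network.target

[Service]
ExecStart={exec_start}
Restart=on-failure

[Install]
WantedBy=multi-user.target
"""


def render_unit(description, exec_start):
    """Return the text of a minimal .service file for the given command."""
    return UNIT_TEMPLATE.format(description=description, exec_start=exec_start)


# Hypothetical example values; replace them with your own script and description.
EXAMPLE_UNIT = render_unit("My example daemon", "/usr/bin/python3 -m http.server 8000")
```

Writing `EXAMPLE_UNIT` to /etc/systemd/system/my_daemon.service (as root) would give systemd a unit named my_daemon to manage.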

Reload systemd to read your new unit file with `sudo systemctl daemon-reload`.

Start your service with `sudo systemctl start my_daemon` and make it persist across reboots with `sudo systemctl enable my_daemon`.

Check your service is running by visiting http://localhost:8000

There are more options you can specify in unit files. For example, in addition to ExecStart you can specify ExecStop and ExecReload to control what happens when stopping and restarting. Those are not required though. If you omit the ExecStop option, systemd is smart enough to know it should kill the process. If you need a more graceful shutdown though, specify that with ExecStop. To see more options, look at man systemd.service in your distribution or at [Freedesktop.org's man systemd.service](https://www.freedesktop.org/software/systemd/man/systemd.service.html).
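
When systemd stops a service without an ExecStop command, it sends the process SIGTERM by default. A long-running Python script can cooperate by catching that signal and leaving its main loop cleanly. The following is a minimal sketch under that assumption; the names `stop_requested`, `request_stop` and `run_forever` are invented for this example, and `do_work` stands for whatever your daemon does on each iteration.

```python
import signal
import threading

# Set once the process has been asked to stop.
stop_requested = threading.Event()


def request_stop(signum, frame):
    """Signal handler: ask the main loop to finish and exit."""
    stop_requested.set()


def run_forever(do_work, interval=5.0):
    """Call do_work() repeatedly until SIGTERM or SIGINT (Ctrl-C) is received."""
    signal.signal(signal.SIGTERM, request_stop)
    signal.signal(signal.SIGINT, request_stop)
    while not stop_requested.is_set():
        do_work()
        # Pause between iterations; the wait ends early once a stop is requested.
        stop_requested.wait(interval)
```

A script started by ExecStart could end with a call such as `run_forever(my_task)` and place any cleanup code after it, so the cleanup runs once the loop has exited.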