## Summary of INF3331 and INF4331 course

## Version control system: Git
### Why
* Collaboration
* Storing versions
* Restoring versions
* Backup
* Use cases: LaTeX documents (Master/PhD thesis, paper), software projects (open-source and proprietary)

### Creating a repository

We *initialise* an empty git repository with

In [1]:
rm -Rf project; mkdir project
cd project
git init .

Initialised empty Git repository in /home/funsi/Documents/resources-15/doc/slides/src/2015/15_summary/project/.git/


If we want to work with an existing repository we can *clone a remote*, for example from the web:

```bash
git clone https://github.com/UiO-INF3331/code_snippets.git
```
or from another local git repository:
```bash
git clone ~/src/my_project
```

Every git repository has a `.git` directory (the dot makes it a hidden directory) that contains all necessary repository files.

### Adding and committing files

Git distinguishes between three *areas* when adding changes to a repository:

<img src="images/git_areas.png" style="max-width:100%; width: 50%">

Making changes to a repository consists of *three steps*:
1. Modify/Add files in the working directory
2. Add files to the staging area
3. Commit the files

Use
```bash
git status
```
to list the files that are modified in the working directory or already added to the staging area.

You can view a *line-by-line difference between the working directory and the staging area* with:

```bash
git diff
```

You can *add changes from the working directory to the staging area* with:

```bash
git add file.txt
git add *.py
```
This works both on *modified* and *new* files.  

The files in the staging area can be *committed to the repository* with:

```bash
git commit -m "Add file.txt and the Python scripts"
```


### Pull and push changes

Download and apply changes from the remote repository:
```bash
git pull origin master
```


Upload all new commits to the remote repository:
```bash
git push origin master
```
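
To try pushing and pulling without a server, a local *bare* repository can stand in for the remote. The following is only a sketch: the directory names, the user names `Alice`/`bob` and the file `notes.txt` are made up, and the branch name is read from git because it may be `master` or `main` depending on your configuration.

```bash
# Sketch: a local bare repository plays the role of the remote (hypothetical paths).
work=$(mktemp -d)
git init -q --bare "$work/remote.git"

# First clone: create a commit and upload it
git clone -q "$work/remote.git" "$work/alice" 2>/dev/null
cd "$work/alice"
echo "first line" > notes.txt
git add notes.txt
git -c user.name="Alice" -c user.email="alice@example.com" \
    commit -q -m "Add notes"
branch=$(git symbolic-ref --short HEAD)   # master or main, depending on the setup
git push -q origin "$branch"

# Second clone: receives the commit; later changes would arrive with git pull
git clone -q "$work/remote.git" "$work/bob"
cd "$work/bob"
git pull -q origin "$branch"
cat notes.txt
```

The second clone already contains the pushed commit, so the final `git pull` has nothing new to download here.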

### Example

In [2]:
echo "print 'Hallo world'" > helloworld.py
echo "Hello world for Python" > Readme.md
ls -R1 ../project

../project:
helloworld.py
Readme.md


In [3]:
git status

On branch master

Initial commit

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	Readme.md
	helloworld.py

nothing added to commit but untracked files present (use "git add" to track)


In [4]:
git add Readme.md
git add *.py



In [5]:
git status

On branch master

Initial commit

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

	new file:   Readme.md
	new file:   helloworld.py



In [6]:
git commit -m "Add a hello world program and a Readme file"

[master (root-commit) 4d212d6] Add a hello world program and a Readme file
 2 files changed, 2 insertions(+)
 create mode 100644 Readme.md
 create mode 100644 helloworld.py


Let us modify the files again:

In [7]:
echo "Run with python helloworld.py" >> Readme.md
echo "# A sample comment" >> helloworld.py



In [8]:
git status

On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   Readme.md
	modified:   helloworld.py

no changes added to commit (use "git add" and/or "git commit -a")


In [9]:
git diff

diff --git a/Readme.md b/Readme.md
index 1553fac..b617deb 100644
--- a/Readme.md
+++ b/Readme.md
@@ -1 +1,2 @@
 Hello world for Python
+Run with python helloworld.py
diff --git a/helloworld.py b/helloworld.py
index 32066b0..a8f5316 100644
--- a/helloworld.py
+++ b/helloworld.py
@@ -1 +1,2 @@
 print 'Hallo world'
+# A sample comment

In [10]:
git add Readme.md



In [11]:
git status

On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	modified:   Readme.md

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   helloworld.py



In [12]:
git commit -m "Add an installation instruction"

[master f71a9c5] Add an installation instruction
 1 file changed, 1 insertion(+)


In [13]:
git status

On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   helloworld.py

no changes added to commit (use "git add" and/or "git commit -a")


### Useful to know


* Typical workflow:
  1. Make changes to a file
  2. `git add file.txt`
  3. `git commit`
  4. `git pull`
  5. `git push`

* You can stage only a subset of the modifications to a file with:
```bash
git add --patch file.txt
```

* You can *unstage* changes with:
```bash
git reset HEAD file.txt
```

* You can stage all modified files in the working directory *and* commit them with
```bash
git commit -a -m "Some commit message"
```

* If you are unsure, create a copy of the entire directory (including the .git directory) - if something goes wrong you can always go back to that backup.

* Never use `git reset --hard` and `git push --force` as you can lose data and create inconsistent repositories!

* Get a pretty commit history of your repository with:
   ```bash
   git log --graph --oneline --decorate
   ```
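
The backup tip in the list above can be done with `cp -a`, which copies recursively and keeps hidden entries such as `.git`. A minimal sketch in a throwaway directory; the names `project`, `project.backup` and `thesis.tex` are made up for illustration:

```bash
# Sketch: back up a whole project directory, including the hidden .git directory.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/project/.git"          # stands in for a real repository
echo "draft" > "$sandbox/project/thesis.tex"

# -a copies recursively and preserves permissions, timestamps and hidden files
cp -a "$sandbox/project" "$sandbox/project.backup"

ls -A "$sandbox/project.backup"           # -A also lists hidden entries
```

If an experiment goes wrong, you can delete `project` and copy the backup back in its place.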

### What to do/learn next?

* Set up a graphical difftool, for example `meld`:
  <img src="images/meld.png" style="max-width:100%; width: 50%">
  Setup instructions:
  ```bash
  git config --global diff.tool meld
  git difftool
  ```
* Add a `.gitignore` file to let git automatically ignore certain files (e.g. `.pyc` files).
* Learn how to use branches (`git branch`)
* Learn `git stash`, `git revert`, `git reset`, ...
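
As a starting point for the `.gitignore` suggestion, the following sketch writes a small ignore file for a Python project. The patterns are only examples, and the temporary directory stands in for your repository root:

```bash
# Sketch: create a small .gitignore (patterns are examples, adapt them to your project).
repo=$(mktemp -d)      # stands in for the repository root
cat > "$repo/.gitignore" <<'EOF'
# compiled Python files
*.pyc
__pycache__/
# editor backup files
*~
EOF
cat "$repo/.gitignore"
```

The `.gitignore` file itself is usually added and committed like any other file, so that all collaborators share the same rules.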

## Scripting: Bash

### Why
* Available on most UNIX systems
* You are already using it
* Simple and fast way to get things done
* Good for file management, communicating and piping between programs, rapid prototyping, simple output
* Use cases: Automation of simple tasks, system administration, web application deployment, data crunching, automated backups

### Examples

Bash variables

In [1]:
s=42
echo "The answer is $s"

The answer is 42


In [51]:
time=`date`   # or time=$(date)
echo "The time is $time"

The time is Tue Dec  1 07:31:33 CET 2015
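
Command substitution with `$(...)` also works on whole pipelines, and `${name}` marks where a variable name ends inside a string. A small sketch (the variable names are made up):

```bash
# Sketch: capture the output of a pipeline and use it in a string.
course="inf3331"
upper=$(echo "$course" | tr '[:lower:]' '[:upper:]')   # output of a pipeline
year=$(date +%Y)                                       # output of a single command
msg="Course ${upper} (${year})"
echo "$msg"
```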


Command line arguments:

In [19]:
echo "Command: $0
First command line argument $1
Second command line argument $2
All command line arguments $@
Exit of last executed command $?"

Command: /bin/bash
First command line argument 
Second command line argument 
All command line arguments 
Exit of last executed command 0
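
The arguments are empty above because the notebook shell was started without any. Inside a function the positional parameters work the same way, which makes them easy to try out. A sketch with a made-up function name `show_args`:

```bash
# Sketch: positional parameters inside a function (show_args is a made-up name).
show_args() {
  echo "Number of arguments: $#"
  echo "First argument: $1"
  for arg in "$@"; do        # "$@" keeps arguments containing spaces intact
    echo "  arg: $arg"
  done
}

show_args one "two words" three
```

Quoting `"$@"` matters: without the quotes, `two words` would be split into two separate arguments.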


Conditionals:

In [22]:
if [ "$?" -eq 0 ]; then
  echo "Last command was successful"
fi

Last command was successful
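
Conditionals are often combined with file tests such as `-f` (regular file) and `-d` (directory). A sketch with a made-up helper `describe_path`:

```bash
# Sketch: classify a path with file tests (describe_path is a made-up helper).
describe_path() {
  if [ -d "$1" ]; then
    echo "directory"
  elif [ -f "$1" ]; then
    echo "file"
  else
    echo "missing"
  fi
}

tmp=$(mktemp -d)
touch "$tmp/a.txt"
describe_path "$tmp"
describe_path "$tmp/a.txt"
describe_path "$tmp/nope"
```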


Loops:

In [24]:
for f in images/*; do echo "$f"; done   # a glob is safer than parsing ls output

images/git_areas.png
images/meld.png
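
Loops are typically used to accumulate a result over several files, and a `while` loop repeats as long as a condition holds. A sketch using temporary files with made-up names:

```bash
# Sketch: sum line counts over files with a for loop, then count down with while.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one.txt"
printf 'c\n'    > "$dir/two.txt"

total=0
for f in "$dir"/*.txt; do
  n=$(wc -l < "$f")          # number of lines in this file
  total=$((total + n))       # integer arithmetic
done
echo "Total lines: $total"

i=3
while [ "$i" -gt 0 ]; do
  echo "countdown $i"
  i=$((i - 1))
done
```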


Combining Bash commands: Piping

In [28]:
du -a .. | sort -rn | head -n 10

17856	..
8420	../13
8364	../13/Rhinoceros.png
5944	../num_itg.eps
2116	../14_introduction_to_scikit_learn
1652	../14_introduction_to_scikit_learn/02.2-Basic-Principles.ipynb
456	../15_summary
380	../snippets
268	../14_introduction_to_scikit_learn/02.1-Machine-Learning-Intro.ipynb
200	../15_summary/project
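
The same pattern (produce, sort, cut down) works for text as well. As a sketch, this pipeline finds the most frequent word in a made-up string:

```bash
# Sketch: find the most frequent word with a pipeline.
text="git bash git python bash git"
top=$(echo "$text" |
  tr ' ' '\n' |        # one word per line
  sort |               # group identical words
  uniq -c |            # count each group
  sort -rn |           # highest count first
  head -n 1 |          # keep the top entry
  awk '{ print $2 }')  # drop the count, keep the word
echo "Most frequent word: $top"
```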


Bash redirects

In [30]:
echo "hei verden" > hei.txt   # Save output to file



In [32]:
wc -w < hei.txt    # Use file as input

2


Stdout and stderr

In [46]:
echo "Hallo"   
ls -y

Hallo
ls: invalid option -- 'y'
Try 'ls --help' for more information.


In [55]:
echo "Hallo" 1> stdout    # redirect stdout to file
ls -y 2> stderr           # redirect stderr to file
echo "stdout: $(cat stdout)"
echo "stderr: $(head -n 1 stderr)"

stdout: Hallo
stderr: ls: invalid option -- 'y'
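
Two further redirects are common: `2>&1` sends stderr to wherever stdout currently goes, and `2> /dev/null` discards error messages. A sketch (the path `/nonexistent-path-for-demo` is made up and assumed not to exist):

```bash
# Sketch: combine or discard the two output streams.
out=$(mktemp)

# Both streams end up in the same file
{ echo "to stdout"; echo "to stderr" >&2; } > "$out" 2>&1
cat "$out"

# Discard the error message but keep the exit status
ls /nonexistent-path-for-demo 2> /dev/null
status=$?
echo "ls exit status: $status"
```

The order matters: `> file 2>&1` captures both streams, while `2>&1 > file` would still print stderr to the terminal.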


### An example bash script: Send email when disk space is nearly full

In [9]:
MAX=90                 # warning threshold for disk usage in percent
EMAIL=USER@domain.com  # email address of the warning recipient
DISK=sda1              # disk identifier
USAGE=$(df -h | grep "$DISK" | awk '{ print $5 }' | cut -d'%' -f1)

if [ "$USAGE" -gt "$MAX" ]; then
  echo "Percent used: $USAGE" | mail -s "Running out of disk space" "$EMAIL"
else
  echo "You've got enough space"
fi

You've got enough space
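
One possible variation, not part of the original script, splits the check into small functions so that the threshold logic can be tried without sending mail. The function names `usage_of` and `over_limit` are made up; `df -P` is used because it prints one line per file system in a fixed column layout.

```bash
# Sketch: the same check split into small functions (names are made up).

# Print the usage percentage of the file system that holds the given path.
usage_of() {
  df -P "$1" | awk 'NR == 2 { sub("%", "", $5); print $5 }'
}

# Succeed (exit status 0) when the usage exceeds the threshold.
over_limit() {
  [ "$1" -gt "$2" ]
}

usage=$(usage_of /)
if over_limit "$usage" 90; then
  echo "Warning: / is ${usage}% full"
else
  echo "Enough space on / (${usage}% used)"
fi
```

In a real script the warning branch would pipe the message to `mail` as in the example above.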




### What next
More material to read:
* Get familiar with more UNIX commands (e.g. sed, find, ...)
* Learn how to use Makefiles
* Learning by doing