[Kudos to **Aron Ahmadia** (US Army ERDC) and **David Ketcheson** (KAUST) from whom I copied shamelessly]

# Course philosophy

What is data science?
* hopefully involves data (lots of it)
* and a scientific approach that is
  - reproducible
  - transparent
  - _open_
  - (later) scalable

When working with lots of data you will 
* spend 90% of your time finding and cleaning your data and figuring out which tools you should use to do so
* spend 9% of your time complaining about how long it takes you to mess with your data
* spend 1% of your time doing the actual analysis

Therefore the course will
* commit lots of time to teaching you _tools_ to do your science
* involve lots of hands-on exercises (about 2/3 of the time!)
* follow a completely open and transparent approach

# Outline day 1 morning _"the tools we use"_

1. Version control with git
  * Why version control?
  * Basic workflows
2. Git exercises
3. Jupyter notebooks
  * Intro
  * What is a notebook?
  * Navigation & keyboard shortcuts
  * Markdown 
4. Jupyter exercises

By the end of this lecture you will
* have working local version control for this course,
* have your _own_ remote repository with all the resources/exercises in this course,
* know how to write and execute code,
* know how to create and display graphics and
* know how to structure and describe your work in text.



# Git
### What is Git?
* Git is software for distributed version control of files.
* Git is open source and free
* Git is the most widespread tool to work with code collaboratively
* It is a command-line tool (although there are GUIs for it)

### What can you do with Git?
* local version control
* remote backup of your work
* collaborative work online via GitHub and GitLab
* accessibility for others
* publishing of code  

Git is far too large and powerful to teach comprehensively here. We will introduce you to a few basics and then add new functionality as we go along.

### The local repository (init, status, clone)

* create a new directory for this course and switch to it
* to create a new and empty git repository, type  
``` 
git init 
```  
* check the status of the repository using  
``` 
git status 
```
* create a working copy of a local repository  
``` 
git clone /path/to/repository 
```   


### The git playground

Your local repository consists of three "trees" maintained by git:
* your **working directory** (holds the actual files)
* the **index** (acts as a staging area for changes you make)
* the **HEAD** (points to the last change you have committed)
![trees](http://rogerdudler.github.io/git-guide/img/trees.png)
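
As a rough mental model, the three trees can be imitated with three Python dictionaries. This is a toy sketch, not how Git stores data internally; the class `ToyRepo` and its methods are hypothetical names invented for this illustration.

```python
class ToyRepo:
    """Toy model of Git's three trees (hypothetical, for illustration only)."""

    def __init__(self):
        self.working_directory = {}  # file name -> current content on disk
        self.index = {}              # staged snapshot for the next commit
        self.head = {}               # snapshot stored by the last commit

    def add(self, filename):
        # like `git add <filename>`: copy the current content into the index
        self.index[filename] = self.working_directory[filename]

    def commit(self):
        # like `git commit`: the staged snapshot becomes the new HEAD
        self.head = dict(self.index)

    def status(self):
        # "not staged": the working copy differs from the index
        # "staged": the index differs from the last commit
        not_staged = sorted(
            name for name, content in self.working_directory.items()
            if self.index.get(name) != content
        )
        staged = sorted(
            name for name, content in self.index.items()
            if self.head.get(name) != content
        )
        return {"not staged": not_staged, "staged": staged}


repo = ToyRepo()
repo.working_directory["notes.txt"] = "first draft"
print(repo.status())  # the file only exists in the working directory
repo.add("notes.txt")
print(repo.status())  # the file is now staged in the index
repo.commit()
print(repo.status())  # all three trees agree
```

A file moves from left to right through the picture: it is edited in the working directory, staged into the index, and finally recorded in the HEAD.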

### The remote repository (clone, pull, remote add) 

* to clone a repository from a remote server, pass the server address instead of a local path
```
git clone <server>
``` 

* in the case of this course, we created a repository at the server
```
git@gitlab.gwdg.de:pycnic/datascience-course-ggnb
```
* if you haven't cloned an existing remote repository, you can add your local one to the remote server via
```
git remote add origin <server>
```
* if you do that, don't forget to get your local version up to date with the remote repository
```
git pull origin master
```

HINT: use ```git remote``` to show the name of your remote repository

### The workflow (add, commit, push)

1. propose changes to a single file or to all files by adding them to the **Index**
```
git add <filename>  
git add *
```
2. commit _all_ staged changes to the **HEAD**, and make sure to add a meaningful message to _every_ commit
```
git commit -m "Commit message"
```
3. the changes are now in the **HEAD** of your local working copy. To send them to the remote repository, use
```
git push origin master
```

HINT: use ```git status``` to see which files are already in the **Index**
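
The add and commit steps above can also be scripted. The following sketch drives the `git` command line from Python inside a throwaway directory. It assumes `git` is installed, leaves out the push step because no remote exists, and uses made-up names (`git`, `demo_workflow`, `analysis.txt`, the author identity) purely for illustration.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def git(*args, cwd):
    """Run one git command in `cwd` and return its standard output as text."""
    done = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return done.stdout


def demo_workflow():
    """Create a temporary repository, then add and commit one file."""
    if shutil.which("git") is None:
        return None  # git is not installed, so there is nothing to show
    with tempfile.TemporaryDirectory() as repo:
        git("init", cwd=repo)
        # a commit needs an author; set a placeholder for this repository only
        git("config", "user.name", "Course Participant", cwd=repo)
        git("config", "user.email", "participant@example.org", cwd=repo)

        Path(repo, "analysis.txt").write_text("first result\n")
        git("add", "analysis.txt", cwd=repo)                        # working directory -> index
        git("commit", "-m", "Add first analysis result", cwd=repo)  # index -> HEAD
        return git("log", "--oneline", cwd=repo)


print(demo_workflow())
```

The printed log should contain one line with a short commit hash followed by the commit message.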

### The branches (checkout, branch) 

* Branches let you work on new features or experiments in isolation from each other.
* The master branch is the "default" branch when you create a repository. 
* Use other branches for development/experimentation and merge them back to the master branch upon completion.  

![branching](http://rogerdudler.github.io/git-guide/img/branches.png)

1. create a new branch named feature_x and switch to it
```
git checkout -b feature_x
```
2. switch back to the master branch
```
git checkout master
```
3. delete the branch again
```
git branch -d feature_x
```
4. or push the branch to the remote repository so it is available to others
```
git push origin <branch>
```
5. or merge the branch into your active branch (e.g. master)
```
git merge <branch>
```

HINT: ```git status``` also shows you which branch is currently your _active_ branch
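
To see the branch commands work together, here is a small sketch that creates `feature_x` in a temporary repository, commits on it, merges it back and deletes it. It assumes `git` is installed; the helper names (`git`, `commit_file`, `demo_branches`), the file names and the author identity are hypothetical. Because a fresh repository may call its default branch `master` or `main` depending on the Git configuration, the sketch asks Git for the name instead of hard-coding it.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def git(*args, cwd):
    """Run one git command in `cwd` and return its output without the final newline."""
    done = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return done.stdout.strip()


def commit_file(repo, name, message):
    """Write a small file, stage it and commit it."""
    Path(repo, name).write_text(name + "\n")
    git("add", name, cwd=repo)
    git("commit", "-m", message, cwd=repo)


def demo_branches():
    """Return (file visible before merge, file visible after merge, only default branch left)."""
    if shutil.which("git") is None:
        return None  # git is not installed
    with tempfile.TemporaryDirectory() as repo:
        git("init", cwd=repo)
        git("config", "user.name", "Course Participant", cwd=repo)
        git("config", "user.email", "participant@example.org", cwd=repo)
        commit_file(repo, "README.txt", "Add README")
        default = git("symbolic-ref", "--short", "HEAD", cwd=repo)  # master or main

        git("checkout", "-b", "feature_x", cwd=repo)  # create the branch and switch to it
        commit_file(repo, "feature.txt", "Add feature file")

        git("checkout", default, cwd=repo)            # switch back to the default branch
        before = Path(repo, "feature.txt").exists()   # the feature file is not here yet
        git("merge", "feature_x", cwd=repo)           # bring the feature into the default branch
        after = Path(repo, "feature.txt").exists()
        git("branch", "-d", "feature_x", cwd=repo)    # the merged branch can now be deleted
        branches = git("branch", "--format=%(refname:short)", cwd=repo).splitlines()
        return before, after, branches == [default]


print(demo_branches())
```

The file committed on `feature_x` only shows up in the default branch after the merge, which is the isolation that branches are meant to provide.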

# Git exercises

A word on exercises: our work as data scientists and programmers involves a lot of searching online for existing solutions. Not knowing a command, keyboard shortcut or function by heart is nothing to be ashamed of. Just look it up: the answer will be there, and we all do it all the time! Therefore, if you are stuck
1. google your problem; if you can't find an answer within 2-3 min, proceed to
2. ask your neighbor; if she is already annoyed with you, proceed to
3. ask the teachers

1. **Your own repository**
  1. clone the remote repository we use for the course to create a _local_ repository
  2. create your own _remote_ repository using your GWDG user credentials
  3. add the remote address of your own _remote_ repository to your _local_ repository  
  HINT: ```git remote add <name> <server>```   
2. **A branch a day...**
  1. create a new branch with the name of this exercise
  2. make sure the new branch is your active branch
3. **Basic workflow**
  1. create a new file in the new branch
  2. add the new file to the index
  3. commit the new file with a meaningful commit message
  4. merge the branch back to ```master```

# Jupyter

### What is Jupyter?

We will use Jupyter for the rest of the course to
1. Teach you new concepts by presenting a Jupyter notebook with code snippets in a short talk.
2. Give you exercises for every new concept to play around with (coding is learning by doing first and foremost!).
3. Give you access to the teaching-notebook so you can extend existing code snippets directly.

Jupyter is

* an editor where we can write and structure text to describe what we do
* an interpreter where we can write code and execute it  

Different functionalities require different _cells_:  
* markdown cells for text
* code cells for code 

*Markdown* is a simple way to structure text (make it look nicer) using a couple of symbols

**this** is a _markdown_ cell

In [2]:
#this is a code cell
3 * 10

30

### The notebook itself

* Notebooks are stored as JSON files and rendered as HTML in the browser
* the files have the extension `.ipynb` and are listed in the _dashboard_ (where you can also create new files)
* notebooks are shown and run in a browser such as Firefox, Chromium or Edge
* the code is executed by a kernel (in our case Python, but other languages are possible too)
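
On disk, an `.ipynb` file is a JSON document that lists the cells of the notebook. The following sketch builds a heavily simplified notebook by hand and reads it back with the standard `json` module. Real notebooks written by Jupyter carry more metadata, and the variable names here are chosen only for illustration.

```python
import json

# A minimal notebook in the nbformat 4 layout, written by hand for illustration.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["**this** is a _markdown_ cell"],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,  # the cell has not been run yet
            "outputs": [],
            "source": ["3 * 10"],
        },
    ],
}

text = json.dumps(notebook, indent=1)  # the text that would be saved as an .ipynb file
loaded = json.loads(text)
cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(cell_types)  # → ['markdown', 'code']
```

Because the file is plain text, git can track notebooks like any other file.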

### Modal editor

Jupyter notebook has a modal user interface. This means that the keyboard does different things depending on which mode the Notebook is in. There are two modes: edit mode and command mode.

**Edit mode** is indicated by a green cell border and **Command mode** by a grey one.

When a cell is in edit mode, you can type into the cell, like a normal text editor.  

When you are in command mode, you are able to edit the notebook as a whole, but not type into individual cells. Most importantly, in command mode, the keyboard is mapped to a set of shortcuts that let you perform notebook and cell actions efficiently. For example, in command mode pressing `c` copies the current cell and pressing `v` pastes it; no modifier key is needed.

<div class="alert alert-success" style="margin: 10px">
Enter edit mode by pressing `enter` or using the mouse to click *inside* a cell's editor area.
</div>

<div class="alert alert-success" style="margin: 10px">
Enter command mode by pressing `esc` or using the mouse to click *outside* a cell's editor area.
</div>

### Navigation

**Mouse navigation:** All navigation and actions in the Notebook are available using the mouse through the menubar and toolbar, which are both above the main Notebook area.  

**Keyboard navigation:** In edit mode, most of the keyboard is dedicated to typing into the cell's editor. In command mode, the entire keyboard is available for shortcuts.

### Keyboard shortcuts (edit mode):

In edit mode, keyboard shortcuts are similar to those of text editors like Word, gedit etc. Examples:
* `ctrl-c` to copy
* `ctrl-v` to paste
* `tab` for text-completion.


### Keyboard shortcuts (command mode):

Most important:
* `enter` enters edit mode
* `esc` enters command mode
* `h` calls the help menu
---
Basic navigation:
* `up/down` select cell above/below
* `shift-enter` execute cell and select below
* `alt-enter` execute cell and insert new cell below
---
Cell types:
* `y` to code
* `m` to markdown
---
Cell editing:
* `d-d` delete cell
* `c` copy cell
* `x` cut cell
* `v` paste cell
* `shift-up/down` to select multiple cells

## Markdown 

##### Text formatting

You can make text _italic_ or **bold** or `monospace`

---
# You
## can
### make
#### headings

---
Courtesy of MathJax, you can beautifully render mathematical expressions, both inline: 
$e^{i\pi} + 1 = 0$, and displayed:

$$e^x=\sum_{i=0}^\infty \frac{1}{i!}x^i$$
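
The displayed series can be checked numerically in a code cell. This is a minimal sketch that sums the first terms of the series; the function name `exp_series` and the default of 20 terms are arbitrary choices for illustration.

```python
import math


def exp_series(x, n_terms=20):
    """Approximate e**x by the first n_terms terms of the series x**i / i!."""
    total = 0.0
    term = 1.0  # the i = 0 term: x**0 / 0!
    for i in range(n_terms):
        total += term
        term *= x / (i + 1)  # turn x**i / i! into x**(i+1) / (i+1)!
    return total


print(exp_series(1.0), math.e)  # both values should agree to many digits
```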

##### Lists
Itemized list
* One
  - sublist
    - subsublist
* Two
  - sublist
* Three
  - sublist
    - subsublist
      - subsubsublist

---
Enumerated list
1. First
  1. sublist
  2. sublist
2. Second

##### Code
This is a code snippet:    
    
```Python
def f(x):
    """a docstring"""
    return x**2
```
  

##### Tables
Time (s) | Audience Interest
---------|------------------
 0       | High
 1       | Medium
 3       | Food

##### Command line

In [5]:
!ls

01-jupyter-python-intro.ipynb	    03-control-structures.ipynb
02-data-types-and-containers.ipynb  04-input-output-and-libraries.ipynb
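
The leading `!` is IPython syntax that hands the rest of the line to the shell. A similar listing can be produced in plain Python; the following sketch uses only the standard library, and the function name `list_notebooks` is made up for this example.

```python
from pathlib import Path


def list_notebooks(directory="."):
    """Return the sorted names of all .ipynb files in `directory`."""
    return sorted(path.name for path in Path(directory).glob("*.ipynb"))


print(list_notebooks())
```

Unlike `!ls`, this version returns a Python list, so the file names can be used directly in later code.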


# Exercises

0. **Git**
  1. Create a new branch for the exercise and switch to it
1. **The Dashboard**
  1. Create a new notebook
  2. give the notebook a meaningful name
  3. add the notebook to the index of your working branch
  4. give the notebook a structure for the exercises using new cells and headings  
2. **Keyboard shortcuts** 
  1. learn 3-4 useful keyboard shortcuts and use them wherever possible
  2. make a table with your favourite keyboard shortcuts
  3. find three more useful shortcuts for jupyter notebook  
3. **Markdown & Code**
  1. practice text formatting
  2. lists
  3. LaTeX  
  4. make a code cell, perform some basic calculations
4. **(Optional) Command line**
  1. experiment with command line commands
  2. find out how to display an image in jupyter notebook using the command line
5. **Git**
  1. commit the notebook to the working branch (commit message!)
  2. (optional) make a change in the notebook (for example 'accidentally' deleting it) and undo the change using git
  3. merge your working branch back to ```master```