# Git: Getting started

   - Management of changes to programs, documents, and other computer files
   - Backup, restore, synchronize, track changes, track ownership, branch, and merge
   - Test changes to code without losing the original
   - Allow multiple developers to collaborate on a single codebase
   - Revert back to old versions of code
   - Maintain sanity
   
# GitHub

A site whose purpose is to store Git repositories on the internet.

Some resources:

   - Git: Obtain the primary software (mac, linux, windows): http://git-scm.com/
   - Git Magic: a great online book introducing the software http://www-cs-students.stanford.edu/~blynn/gitmagic/
   - A list of graphical front-end clients for Git: https://git.wiki.kernel.org/index.php/Interfaces,_frontends,_and_tools#Graphical_Interfaces
   - GitHub: Online hosting service for Git repositories. https://github.com/

It is very easy to start version control of your files:

 * Initialize repository
 * Add files
 * Commit changes to repository
 
Follow along:

```
sudoh> mkdir LearnPythonClass ; cd LearnPythonClass ; ls -la
total 0
drwxr-xr-x   2 sudoh  wheel    68 Jan 07 22:16 .
drwxrwxrwt  41 root    wheel  1394 Jan 07 22:16 ..
```

Initialize this as a directory that Git will track:

```
sudoh> git init
Initialized empty Git repository in /private/tmp/LearnPythonClass/.git/
sudoh> ls -la
total 0
drwxr-xr-x   3 sudoh  wheel   102 Jan 07 22:16 .
drwxrwxrwt  41 root    wheel  1394 Jan 07 22:16 ..
drwxr-xr-x  10 sudoh  wheel   340 Jan 07 22:16 .git
```

Make a file, add it so Git knows about it, then commit it.

```bash
sudoh> echo "this is my first file in this repo" > README
sudoh> ls
README
sudoh> git add README
sudoh> git commit -m "This is my initial commit"
[master (root-commit) ebc83b9] This is my initial commit
 1 file changed, 1 insertion(+)
 create mode 100644 README
```

# Git: Basic Workflow

* modify code/document

* commit changes

* repeat

# Clone

```bash
# Download a remote version of a repository into a local machine
$ git clone https://github.com/udohsolomon/LearnPython_Class.git
Cloning into 'LearnPython_Class'...
remote: Enumerating objects: 85, done.
remote: Total 85 (delta 0), reused 0 (delta 0), pack-reused 85
Unpacking objects: 100% (85/85), done.
Checking connectivity... done.

```

# Add

```bash
# make a new python file blah.py
$ git add blah.py 
$ git commit -m "created blah to show off that I know git"
[master 2c7aded] created blah to show off that I know git
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 blah.py

# {-- add this directory, its contents, and every subdirectory --}
$ git add .
```

# Delete

```bash
# {-- to just stop tracking changes --}
$ git rm --cached uselessfile.txt
$ git rm --cached -r abunchof/oldfiles/

# {-- to delete the file from disk as well and stage the removal for the next commit --}
$ git rm uselessfile.txt
```
see http://stackoverflow.com/questions/4124792/remove-an-existing-file-from-a-git-repo

# Rename (same as delete + add)

```bash
git mv oldname.txt newname.txt
```

## Git Status and Diff

```bash
$ git status
# On branch master
# Changed but not updated: 
#  (use "git add <file>..." to update what will be committed) 
#  (use "git checkout -- <file>..." to discard changes in working directory)
#
#     modified: mystuff.py
#
no changes added to commit (use "git add" and/or "git commit -a")
```

**Try this:**
 
 - make a new file
 - edit it
 - add it to git and then commit
 - edit the file
 - type `git diff`
 - commit your changes
 - `git log`
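
To get a feel for what `git diff` reports without touching a repository, here is a minimal sketch using Python's standard `difflib` module. The file name `mystuff.py` is borrowed from the status example above; the two versions of its contents are made up for illustration. Git's real output also carries extra header lines (such as `diff --git` and `index`) that this sketch does not reproduce.

```python
import difflib

# Hypothetical "committed" and "edited" versions of the same file
old_version = ["def greet():\n", "    print('hello')\n"]
new_version = ["def greet():\n", "    print('hello, world')\n"]

def unified_diff_text(old_lines, new_lines, filename):
    """Return a unified diff using the a/<file> and b/<file> labels Git shows."""
    diff = difflib.unified_diff(
        old_lines,
        new_lines,
        fromfile="a/" + filename,
        tofile="b/" + filename,
    )
    return "".join(diff)

diff_text = unified_diff_text(old_version, new_version, "mystuff.py")
print(diff_text)
```

Lines starting with `-` were removed and lines starting with `+` were added, which is the same convention `git diff` uses.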

# Commit, Push and Pull

## Git GUI Tools ##

Unless you really like working from the command line, tools like `gitk` (a history visualizer) and `git gui` (a graphical interface to Git's functionality) are incredibly helpful. (Both are included automatically on OS X; on Linux you may need to install the `git-gui` package.)
```bash
# {-- from inside of working directory --}
$ gitk 
# {-- to view all branches --}
$ gitk --all

# {-- and --}
$ git gui &
```
Those two are pretty standard, but there are many other options as well:

https://git.wiki.kernel.org/index.php/Interfaces,_frontends,_and_tools#Graphical_Interfaces

# Undoing changes

The first few characters of a commit hash are enough to specify the commit.

* `git reset --hard`: (e.g. `git reset --hard 766f`) load an old commit and delete all commits newer than the one just loaded.  This ‘changes history.’

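* `git checkout`: (e.g. `git checkout 82f5`) load an old commit without discarding anything. Git puts you in a "detached HEAD" state, so create a branch (`git checkout -b <name>`) before committing new edits from there; your newer commits will still be accessible on their original branch.  This moves around through history.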

* `git revert`:  Add a commit that undoes an earlier commit; it should be run from a clean working directory.  Follow this with `git checkout .` (including the dot) to make your working directory match the Git version.  This does not change history; it adds an undo instead.

* You can refer to commits in other ways, e.g. by the start of the commit message (`git checkout :/"Replace printf()"`) or by position, such as the 3rd last saved state (`git checkout master~3`)
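
Why does a short prefix work? A prefix is fine as long as exactly one commit starts with it. The sketch below shows that lookup rule in plain Python; the function name `resolve_commit` and the hashes in `history` are made up for illustration, and Git's own lookup is more involved than this.

```python
def resolve_commit(prefix, known_hashes):
    """Return the single full hash that starts with `prefix`.

    Raises ValueError when the prefix matches no commit or more than one.
    """
    matches = [h for h in known_hashes if h.startswith(prefix.lower())]
    if not matches:
        raise ValueError("unknown revision: " + prefix)
    if len(matches) > 1:
        raise ValueError("ambiguous prefix: " + prefix)
    return matches[0]

# Made-up 40-character hashes, for illustration only
history = [
    "766f9881690d240ba334153047649b8b8f11c664",
    "82f5ea346a2e651544956a8653c0f58dc151275c",
    "82a1c0d2e3f4a5b6c7d8e9f00112233445566778",
]

print(resolve_commit("766f", history))  # unique prefix: prints the first hash
```

Here `resolve_commit("82", history)` would raise, because two hashes start with `82`; adding more characters (`82f5`) makes the prefix unique again.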


## Branching & Merging ##

`branch`: faster and more space efficient than cloning for situations when we need to switch gears and modify multiple versions of the code simultaneously 
```bash
 $ git status 
# On branch master 
[...]

$ git branch work-newalgo
$ git checkout work-newalgo 
Switched to branch 'work-newalgo'

$ git checkout -b work-betterdocs 
Switched to a new branch 'work-betterdocs' 
$ git branch 
  master
* work-betterdocs
  work-newalgo
```

# Merging

`merge`: to combine branches back together (see `git mergetool`)

[ make a bunch of commits on work branch ]

```bash
$ git checkout master 
Switched to branch 'master'
$ git merge work-betterdocs 
Updating 213a816..a9fae1e 
Fast-forward
 README  | 100 +++++++++
 LICENSE |   5 +
 INSTALL | 150 ++++++++++++----
 3 files changed, 225 insertions(+), 30 deletions(-)
 create mode 100644 LICENSE 
 create mode 100644 README
$ git branch -d work-betterdocs 
Deleted branch work-betterdocs (was a9fae1e).
```

# Hosting Projects on GitHub ##

<center>
<img src="http://octodex.github.com/images/baracktocat.jpg" width=30%></img>
</center>

From now on, for this class, we’ll use GitHub as a standard, so let’s go through a complete example

- log into GitHub, create a new repository (w/o a readme yet)

- go to https://github.com/edu for a free student account

- on the command line, create some files, and push to this new repository:

```bash
nano README
git init
git add README
git commit -m "first commit"
git remote add origin https://github.com/[user_name]/[repo_name].git
git push origin master
```

# Updating code 

* After the bare repository is hosted on another server, clone a copy to your local disk
* Pull any changes from other users
* Push any new changes you make to the new server

```bash
git clone https://github.com/sudoh/projects/LearnPythonClass/test_this.git 
Cloning into test_this...
# Make some changes....
# -a stages every modified file that Git already tracks before committing
git commit -a -m "Long and descriptive commit message"
# Pull to update to latest version of the code
git pull origin master
# Resolve any merge conflicts and then commit them, if necessary
# Push to check-in local changes to the hosted repository:
git push origin master
```

# Git collaboration workflow

* Pull latest version from central repo
* Modify code/document
* Commit changes (early and often!)
* Pull (again, and resolve any merge conflicts)
* Push your changes to the central repo
* Repeat
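
If you find yourself typing the same commands every round, the commit, pull, push part of the loop can be scripted. The following is only a sketch: the helper name `collaboration_round` is made up, it assumes the remote is called `origin` and the branch `master` (as in the examples above), and it assumes you have already edited your files. By default it only builds the commands so you can inspect them.

```python
import subprocess

def collaboration_round(message, remote="origin", branch="master", dry_run=True):
    """Build (and optionally run) the Git commands that follow an editing session."""
    commands = [
        ["git", "commit", "-a", "-m", message],  # commit edits to tracked files
        ["git", "pull", remote, branch],         # pick up what others have pushed
        ["git", "push", remote, branch],         # publish your commits
    ]
    if not dry_run:
        for command in commands:
            # check=True stops at the first failing step, e.g. a merge conflict
            subprocess.run(command, check=True)
    return commands

# Dry run: just show what would be executed
for command in collaboration_round("Fix typo in README"):
    print(" ".join(command))
```

Stopping at the first failure matters here: if the pull ends in a merge conflict, you want to resolve it by hand before anything is pushed.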

Generally, Git will happily deal with multiple workers on the same file and merge the changes automatically when you pull from the central repository. But when the same line is edited by multiple people, human intervention is generally needed to resolve the conflict.

Try **`git mergetool`** if things go awry.
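
To see why edits to different lines merge cleanly while edits to the same line do not, here is a toy three-way merge. It is a deliberate simplification: it assumes all three versions have the same number of lines and compares them line by line, whereas Git's real merge works on diff hunks and copes with inserted and deleted lines. The names `merge_lines`, `base`, `ours`, and `theirs` are chosen for this example.

```python
def merge_lines(base, ours, theirs):
    """Toy three-way merge of three equally long lists of lines.

    Returns (merged_lines, conflicted_line_numbers); a conflicted line
    is represented by None in merged_lines.
    """
    merged, conflicts = [], []
    for number, (b, o, t) in enumerate(zip(base, ours, theirs), start=1):
        if o == t:        # both sides agree (or neither changed the line)
            merged.append(o)
        elif o == b:      # only "theirs" changed it
            merged.append(t)
        elif t == b:      # only "ours" changed it
            merged.append(o)
        else:             # both changed it differently: needs a human
            merged.append(None)
            conflicts.append(number)
    return merged, conflicts

base = ["import math", "x = 1", "print(x)"]
ours = ["import math", "x = 2", "print(x)"]
theirs = ["import sys", "x = 3", "print(x)"]

merged, conflicts = merge_lines(base, ours, theirs)
print(merged)     # ['import sys', None, 'print(x)']
print(conflicts)  # [2]
```

Line 1 was changed by only one side, so it merges automatically; line 2 was changed by both, so it is flagged for manual resolution.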


Git can keep track of changes made to code, synchronize code between different people, test changes to code without losing the original, and revert back to old versions of code.
GitHub is a website that stores Git repositories on the internet to facilitate the collaboration that Git allows for. A repository is simply a place to keep track of code and all the changes to code.

Git commands:

```
  git clone <url> : take a repository stored on a server (like GitHub) and download it
  git add <filename(s)> : add files to the staging area to be included in the next commit
  git commit -m "message" : take a snapshot of the repository and save it with a message about the changes
  git commit -am "message" : stage all modified tracked files and commit them in one step
  git status : print what is currently going on with the repository
  git push : push any local changes (commits) to a remote server
  git pull : pull any remote changes from a remote server to a local computer
  git log : print a history of all the commits that have been made
  git reflog : print a list of all the different references to commits
  git reset --hard <commit> : reset the repository to a given commit
  git reset --hard origin/master : reset the current branch to the last state fetched from the remote (e.g. the version cloned from GitHub)
```
    
When combining different versions of code, e.g. using git pull, a merge conflict can occur if the different versions have different data in the same location. Git will try to take care of merging automatically, but if two users edit, for example, the same line, a merge conflict will have to be manually resolved.
To resolve a merge conflict, edit the affected files locally to remove the conflict markers and any lines that are not wanted, then add and commit the result and push it.
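
When a conflict occurs, Git marks the disputed region inside the file with `<<<<<<<`, `=======`, and `>>>>>>>` lines: your version sits above the `=======` and the incoming version below it. The sketch below lists those regions for a piece of text. The helper name `find_conflicts` and the sample file contents are made up for illustration, and it only handles the basic two-sided marker style.

```python
def find_conflicts(text):
    """Return a list of (ours, theirs) line lists, one pair per conflict block."""
    conflicts = []
    ours, theirs, state = [], [], None
    for line in text.splitlines():
        if line.startswith("<<<<<<<"):
            ours, theirs, state = [], [], "ours"   # start of a conflict block
        elif line.startswith("=======") and state == "ours":
            state = "theirs"                       # switch to the incoming side
        elif line.startswith(">>>>>>>") and state == "theirs":
            conflicts.append((ours, theirs))       # end of the block
            state = None
        elif state == "ours":
            ours.append(line)
        elif state == "theirs":
            theirs.append(line)
    return conflicts

# Hypothetical file contents after a conflicting `git pull`
conflicted = """\
def square_me(x):
<<<<<<< HEAD
    return x ** 2
=======
    return x * x
>>>>>>> a9fae1e
"""

for ours, theirs in find_conflicts(conflicted):
    print("ours:  ", ours)
    print("theirs:", theirs)
```

Resolving the conflict means keeping the lines you want and deleting the three marker lines, so that a check like this finds nothing left in the file.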

# Debugging Code #

 * Standard technique of inserting “print” statements may be inconvenient if code takes a long time to run
 * The module `pdb` is an interactive source  debugger
 * Variables are preserved at the breakpoint, and you can step through lines of code interactively
 
 http://docs.python.org/library/pdb.html
 
Use `pdb.set_trace()` to explore and interact with problematic code:
```python
#[ in some problematic python file ]
import pdb
#[ code code code ]
pdb.set_trace()   # hard-code a breakpoint at a given point in a program
#[ problematic code ]
```
Using pdb from the command line:
```bash
python -m pdb myscript.py
```
pdb will automatically enter post-mortem debugging if the program being debugged exits abnormally. 

You can also debug within IPython by typing `debug` after an exception is raised.
`help` shows the commands available, for both pdb and IPython's ipdb.
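
Post-mortem debugging can also be triggered from inside your own code with `pdb.post_mortem`, which opens the debugger at the frame where the exception was raised. The following is a minimal sketch: `average` is a made-up function with a bug (it fails on an empty list), and the helper `run_with_post_mortem` with its `interactive` flag is invented for this example.

```python
import pdb
import sys

def average(values):
    """Return the arithmetic mean; fails on an empty list (the 'bug')."""
    return sum(values) / len(values)

def run_with_post_mortem(func, *args, interactive=True):
    """Call func(*args); on an exception, optionally open pdb at the failing frame."""
    try:
        return func(*args)
    except Exception:
        if interactive:
            # Opens the (Pdb) prompt in the frame that raised the exception
            pdb.post_mortem(sys.exc_info()[2])
        raise

print(run_with_post_mortem(average, [1, 2, 3]))  # prints 2.0

# Uncomment to land at the (Pdb) prompt inside average(), where
# `p values` shows the empty list that caused the ZeroDivisionError:
# run_with_post_mortem(average, [])
```

At the prompt, `p values` prints a variable, `up` and `down` move between stack frames, and `q` quits.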

# Good coding style

 - There is an official Python style guide, called PEP8:
    http://www.python.org/dev/peps/pep-0008/
 
 - Guido sez: ‘code is read much more often than it is written’
     - especially true for scientific code
     
 - Python enables you to write readable code, so do it!  The world will seem a happier place.
 
 - `pylint`, `pep8` (since renamed `pycodestyle`), and similar tools tell you whether code is self-consistent, if sections are duplicated, and where/how you break the style rules
 
 - get them with pip, run them like: `pep8 my_code.py`
 

# Testing - nosetest ##

* nosetest (run as the `nosetests` command, from the `nose` package) is a program that runs any pre-defined tests you have written, and reports back the results
* within your code, add test functions that act as self-checks
* naming convention: give test functions names that start with `test` (e.g. `test_1`)
* lay out everything you want your code to do beforehand, as a self check
    * test-driven development

Write test functions that use the `assert` statement; it raises an `AssertionError` if its expression does not evaluate to true.

```python
#[ within my code ]
def square_me(x):
    return x**2

def test_1():
    # test the simple square_me function
    assert square_me(2.) == 4.

def test_2():
    # testing without an absolute equality
    assert abs(square_me(2.34) - 5.4756) < .001
```

# Distributions - distutils2 #

* distutils2: the standard way to take your directories of code and bundle them up for easy installation and use by others
* You create a setup.py file which allows others to install your code in the standard fashion.
* There are myriad options for the metadata you can define; an incredibly simple example is below:
```python
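from distutils2.core import setup

# Minimal metadata; 'my_package_name' is the package directory
# (the folder containing __init__.py) shown in the layout below
setup(name='My_Package',
      version='0.01',
      license='License_to_not_kill',
      packages=['my_package_name'],
      )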
```

http://pythonhosted.org/Distutils2/distutils/introduction.html

http://pypi.python.org/pypi/Distutils2

Standard hierarchy includes several ancillary files, as well as the package and modules.
```bash
My_Package_folder/
  README.txt
  LICENSE.txt
  setup.py  
  my_package_name/
    __init__.py
    my_module.py
    my_other_module.py
```
Running `python setup.py install` will install your package (and modules) into the Python path.

If you’ve written your own package, use distutils2 to create a standard, share-able zipped file
```bash
cd My_Package_folder
python setup.py sdist

[... created My_Package-0.01.tar.gz ...]
```
http://guide.python-distribute.org/

<center>In the end, all of these things are designed to be HELPFUL, not an additional burden, so learn about them and choose what works for you (and your coworkers/collaborators) and what does not.</center>

<p>
<center>For a great and much deeper guide on how to be an effective code-builder, see: http://software-carpentry.org/</center>