# Tracking Files with `git add`

## Git-Add Basics
`git add` tells Git to track the contents of a file.

## Learning Objectives
By the end of this lesson you will be able to:
* Work with temporary HOG-Sessions
* Use `git add` to track an untracked file
* Discuss what part of the `.git/` directory is changed by `git add`

## The Basics of Git Add

### Tracked and Untracked files
Recall from the [Basics of Git](git-basics.ipynb) lesson that, at a high level, there are two
types of files in the Git universe:

1. Tracked files

2. Untracked files

Tracked files are files that Git is storing some data about. As we will see, Git stores this data
in a few different places; of note, information is stored in the _Git database_ (aka `.git/objects`)
and in the _staging area_ (aka `.git/index`).

An untracked file is one that Git knows nothing about: it has no entry in the index and
none of its contents are stored in the database. Leaving files untracked is often useful. Here are
some things that we usually don't want Git to track:

1. Anything containing a **password** - this is critical information that should not be sent along with the repository
2. Binary files, such as compiled code (Java class files, compiled C/C++ files, etc) - these can get large
   and Git is not good at compressing these since it is built to work on human-readable text (i.e., code).
   Also, since compiled code comes from the source code we can always recompile. This makes our life easier:
   we only care about the source code!
   
   Additionally, image files, music files, video files, etc. are often best kept elsewhere and downloaded
   separately. However, it will not be too noticeable if a few images slip into the repo here and there.
3. Temporary files and one-off scripts
4. Backup files generated by text editors
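
If you'd like to check from Python which files Git currently treats as tracked or untracked, here is a minimal sketch. It assumes `git` is installed and on your `PATH`; the helper names (`git`, `tracked_files`, `untracked_files`) and the file `notes.txt` are made up for this illustration and are not part of HOG. It relies on `git ls-files`, which lists the files in the index, and on its `--others --exclude-standard` options, which list files Git is not tracking.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo, *args):
    """Run a git command inside `repo` and return its stdout as text."""
    result = subprocess.run(
        ["git", *args], cwd=repo, check=True, capture_output=True, text=True
    )
    return result.stdout


def tracked_files(repo):
    """List the files that have an entry in the index."""
    return git(repo, "ls-files").splitlines()


def untracked_files(repo):
    """List files Git knows nothing about (skipping ignored files)."""
    return git(repo, "ls-files", "--others", "--exclude-standard").splitlines()


# Throwaway repository: one new file, nothing added yet.
with tempfile.TemporaryDirectory() as repo:
    git(repo, "init", "--quiet")
    Path(repo, "notes.txt").write_text("hello\n")
    print("tracked:  ", tracked_files(repo))
    print("untracked:", untracked_files(repo))
```

In a fresh repository the new file should show up only in the untracked list, since nothing has been added to the index yet.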

Now, let's say we have a file and we've decided that we want to track it: that's where the `git add` command comes into play. But first, let's get ourselves a temporary Git repository set up!

## Setup

To get set up for the exercise, go ahead and run the following two cells. This just gets us
set up and ready to explore some cool Git stuff. Additionally, open up a new terminal.

In [1]:
# Workaround: add the parent of the working directory to sys.path
# so that the project's modules (gitutil, snapshots) can be imported.
import sys
import os
from os import path as osp
cwd = osp.dirname(os.getcwd())
print('cwd = ' + cwd)
sys.path.append(cwd)  # Make the project root importable

cwd = /home/ben/Projects/hog/src


In [2]:
# Begin normal session
from gitutil.session import GitSession
from snapshots import GitDirLog
import code

# Directory with the lesson we'll be using
lesson = '../lessons/basics/add1/'

# Some scripts to help us set up our repository
scripts =   { 'add':    lesson + 'scripts/add.gcs'
            , 'append': lesson + 'scripts/append.gcs'
            , 'touch':  lesson + 'scripts/touch.gcs'
            }

# The session gives us a sandbox git session where we can mess around
# without fear of messing anything up. We use scripts to set up basic
# repositories.
session = GitSession()

def run_script(s):
    script = scripts[s]
    session.run_script(script)

# log will hold the history of the .git directory. For more information,
# check out the git-basics notebook.
log = GitDirLog(session.dir() + "/.git")

# snap() lets us take snapshots so we can review our history
def snap(m='', verbose=True):
    log.take_snapshot(m, verbose)

# print the difference objects
def print_diffs(start=0, end=-1):
    log.print_diffs(start,end)
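
The internals of `GitDirLog` are not shown in this lesson. If you're curious how a snapshot-and-diff of a directory could work in principle, here is a rough sketch. It is an assumption for illustration only, not HOG's actual implementation: `take_snapshot` and `diff_snapshots` are hypothetical names, and hashing every file's contents is just one simple way to detect modifications.

```python
import hashlib
import tempfile
from pathlib import Path


def take_snapshot(root):
    """Map each path under `root` to a content hash (None for directories)."""
    root = Path(root)
    snapshot = {}
    for p in sorted(root.rglob("*")):
        rel = p.relative_to(root).as_posix()
        snapshot[rel] = hashlib.sha1(p.read_bytes()).hexdigest() if p.is_file() else None
    return snapshot


def diff_snapshots(old, new):
    """Return the paths created, modified, and removed between two snapshots."""
    created = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    modified = sorted(k for k in set(old) & set(new) if old[k] != new[k])
    return {"created": created, "modified": modified, "removed": removed}


# Small demo on a scratch directory standing in for .git
with tempfile.TemporaryDirectory() as d:
    before = take_snapshot(d)
    Path(d, "objects").mkdir()
    Path(d, "HEAD").write_text("ref: refs/heads/master\n")
    after = take_snapshot(d)
    print(diff_snapshots(before, after))
```

The three lists mirror the `created`, `modified`, and `removed` sections that the difference objects in this lesson print.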

**Run the following cell and in your terminal, change directory to the temporary directory that it outputs**
This is the temporary directory where we generated a Git repository.

In [3]:
print("Temporary session in {}".format(session.dir()))

Temporary session in /tmp/tmp60h98doi


*  Once there, look around. There should be a `.git` directory and not much else.

*  Run the `snap()` command in the following cell to take a snapshot of the git directory

In [4]:
snap(m="New repository")        # This creates a snapshot of the .git directory

+------------------------------------------------------------------------------+
|               Difference Object - New Snapshot: New repository               |
+------------------------------------------------------------------------------+
| created [22]:
|     [f] .git/HEAD
|     [d] .git/branches
|     [f] .git/config
|     [f] .git/description
|     [d] .git/hooks
|     [f] .git/hooks/applypatch-msg.sample
|     [f] .git/hooks/commit-msg.sample
|     [f] .git/hooks/post-update.sample
|     [f] .git/hooks/pre-applypatch.sample
|     [f] .git/hooks/pre-commit.sample
|     [f] .git/hooks/pre-push.sample
|     [f] .git/hooks/pre-rebase.sample
|     [f] .git/hooks/prepare-commit-msg.sample
|     [f] .git/hooks/update.sample
|     [d] .git/info
|     [f] .git/info/exclude
|     [d] .git/objects
|     [d] .git/objects/info
|     [d] .git/objects/pack
|     [d] .git/refs
|     [d] .git/refs/heads
|     [d] .git/refs/tags
| modified [0]:
| removed [0]:
+------------------------------------------------------------------------------+

*  You'll see a fairly long listing that may look like nonsense at first; these are the contents of your `.git/` directory, which is what HOG tracks. We will see how this listing changes as we make updates to the Git repo.

*  Now we are going to make our first update to the Git repository with `run_script()`. Note that this could be done by hand with `touch f1`.

In [5]:
run_script('touch')            # Let's add a file to our repository

*  Look around in the Git session - what difference do you see?

*  In your open terminal, run `git status`.

  - **Question:** What does it say? Is Git tracking the new file?

*  Run another `snap()` in the cell below:

In [6]:
snap(m="Created file f1")

+------------------------------------------------------------------------------+
|             Difference Object: New repository -> Created file f1             |
+------------------------------------------------------------------------------+
| created [0]:
| modified [0]:
| removed [0]:
+------------------------------------------------------------------------------+


*  **Question:** Were there any changes after this last snapshot was taken? Why do you think this is?

*  In your terminal, run `git add f1` followed by `git status`.

  - **Question:** What does the output say? What has changed since you last ran `git status`?

In [7]:
snap(m="Added a file")

+------------------------------------------------------------------------------+
|                                 Index Entry                                  |
+------------------------------------------------------------------------------+
| name       :b'f1'
| ctime-sec  :0x5b3a2098
| ctime-nano :0x1b7b494a
| mtime-sec  :0x5b3a2098
| mtime-nano :0x1b7b494a
| dev        :0x80d
| ino        :0xcb9bb6
| mode       :0x8225
| uid        :0x3eb
| gid        :0x3eb
| file size  :0x0
| sha1       :0xf858724c887985b7b3edb3241e2e52f54303f408
| flags      :0x2
+------------------------------------------------------------------------------+

+------------------------------------------------------------------------------+
|              Difference Object: Created file f1 -> Added a file              |
+------------------------------------------------------------------------------+
| created [3]:
|     [f] .git/index
|     [d] .git/objects/e6
|     [f] .git/objects/e6/9de29bb2d1d6434b8b29ae775ad

*  Running the above cell takes a snapshot of the `.git` directory and prints the difference. Additionally, you
   will see an Index listing. We will talk about this in the next lesson.

  - **Question:** What files in the .git directory have changed?

*  Finally, run the `session.cleanup()` command below to get rid of the temporary directory.

In [None]:
session.cleanup()
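
The last snapshot showed a new object appearing under `.git/objects/e6/`. As a sketch of where such a path can come from: Git names each stored file (a "blob") after a hash of its contents, and stores loose objects in a directory named after the first two hex characters of that hash. The example below assumes the default SHA-1 object format and assumes `f1` is empty (the index entry reports a file size of `0x0`); the function names are made up for this illustration.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object ID Git assigns to a file's contents (a blob)."""
    # Git hashes a small header ("blob <size>" + NUL byte) followed by the contents.
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_id: str) -> str:
    """Loose objects live at .git/objects/<first 2 hex chars>/<remaining chars>."""
    return f".git/objects/{object_id[:2]}/{object_id[2:]}"


empty_id = git_blob_id(b"")
print(empty_id)                     # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(loose_object_path(empty_id))  # .git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

If you have a repository handy, you can compare the result with `git hash-object <file>`, which prints the ID Git would assign to that file's contents.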

## Questions
1. In your terminal, run `man git-add` (if you've never heard of man pages, check them out! They can be a bit terse but they are a lifesaver). Read the first paragraph in the `DESCRIPTION` section and make a list of terms that were either covered in this lesson or that you don't know about yet. For example, do you know what a _working tree_ is?

2. Look at the `OPTIONS` section. Is there an option that will allow you to do a test run for `git add`; that is, is there some option that you can append to `git add` that will not actually update the `index`?

3. Look at the `OPTIONS` section. Is there an option that will allow us to add the file `--dumb-file-name.txt` without adding any other files and without the name being mistaken for an option?
