## Tracking changes with Git

#### Some simple commands to know with Git

#### Tags:
    Data: labeled data
    Technologies: n/a
    Techniques: tracking changes to code
    
#### Resources:

[Git Documentation](https://git-scm.com/doc)

[How to write commit messages](https://chris.beams.io/posts/git-commit/)

[Git Branching Workflows](https://git-scm.com/book/en/v2/Git-Branching-Branching-Workflows)

[TortoiseGit](https://tortoisegit.org/)
    

As data scientists, we develop coded solutions that we want to share with others and maintain once they are in production. Tracking changes to the code is an essential practice: it lets us revert to a previous version and have our code checked by colleagues before it goes to the production environment.

Git is one of the most widely used version control systems. It is meant to be fast, adoptable and secure. The Git workflow is not simple, but there is usually no need to know all the intricacies of Git's implementation. A handful of commands are essential whether you work on your own or collaborate with others using Git.

There are several graphical interfaces for Git that can help with the process, but the command line is simple to understand and use, so I would recommend learning Git Bash or using Git from another command line interface.

### Cloning the Git repo

Usually a Git repo already exists or has been created on the server (e.g. GitHub), so what we need is to clone the repository from the server into a local environment.

```
git clone https://github.com/scenthr/project-portfolio
```

### Inspecting the status of a Git folder

You can inspect the current status of a Git folder by going to that folder and using the status command

```
git status
```

### Adding a file into a Git repo

Once the whole repo directory structure is copied locally, we can add new files to it. The add command stages a file so that it is included in the next commit

```
git add new_file.txt
```

Or stage all new and modified files in the current directory with

```
git add .
```
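To see what staging does, the following sketch compares the output of `git status --short` before and after `git add`. It runs in a throwaway directory, and the file names are made up for illustration.

```shell
# Hypothetical sandbox: a temporary directory, not a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q

echo "first draft" > new_file.txt
echo "scratch notes" > notes.txt

# Before staging, both files are untracked (marked "??").
git status --short

# Stage one file by name; notes.txt stays untracked.
git add new_file.txt
git status --short

# Stage everything that is new or modified under the current directory.
git add .
git status --short
```

Staged files are marked with an `A` in the short status, which is a quick way to check what the next commit will contain.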

### Committing changes to a file

Once you have made some changes to a file or a set of files, you can commit them locally with the commit command. This adds a new entry to the tracking system holding information about the changes, along with the message you wrote. Later we can revert to any such entry if needed.

```
git commit -m "add project predict future sales"
```
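As a rough illustration of going back to an earlier state, the sketch below makes two commits in a throwaway repository and then undoes the second one with `git revert`. This is only one of several ways to go back, and the file name, messages and identity values are placeholders.

```shell
# Hypothetical sandbox: a temporary directory, not a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
# Placeholder identity, stored only in this sandbox repository.
git config user.name "Example User"
git config user.email "user@example.com"

echo "version 1" > model.txt
git add model.txt
git commit -q -m "Add first model notes"

echo "version 2" > model.txt
git commit -q -a -m "Update model notes"

# Undo the latest commit by recording a new commit that reverses it.
git revert --no-edit HEAD

# The file is back to its first version, and the history keeps all three entries.
cat model.txt
git log --oneline
```

Because `git revert` adds a new commit instead of deleting the old one, the history still shows what was tried and when it was undone.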

#### Conventions for commit messages

Writing understandable commit messages is important because it speeds up understanding of the sequence of changes made to a file. This matters even more when several people work on the same file, as it makes collaboration easier. My suggestion is to write a simple headline in the imperative mood and add a body only if it is really needed (think of really complex implementations whose steps need to be explained). Describe what changed and why, not how. Here is a resource that discusses how to write commits with purpose.

[Git commits](https://chris.beams.io/posts/git-commit/)
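When a body is needed, one option is to pass `-m` twice: the first value becomes the headline and the second becomes the body. The sketch below tries this in a throwaway repository; the file, the message text and the identity values are invented for illustration.

```shell
# Hypothetical sandbox: a temporary directory, not a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
# Placeholder identity, stored only in this sandbox repository.
git config user.name "Example User"
git config user.email "user@example.com"

echo "store_id,item_id,sales" > features.csv
git add features.csv

# First -m: imperative headline. Second -m: body explaining what and why.
git commit -q \
  -m "Add sales feature table" \
  -m "Downstream notebooks need one shared schema, so keep the column names in a single file."

# Print the headline and the body of the latest commit separately.
git log -1 --format=%s
git log -1 --format=%b
```

Running `git commit` without `-m` opens an editor instead, which is usually more comfortable for longer bodies.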


### Pushing changes to the server

Pushing changes to the server may be as simple as calling a single command, or a bit more complex if you are using a branch. If we did not work in a branch, we can just push the changes from the local folder to the server with

```
git push origin master
```

If you worked inside a branch, you first need to merge the branch into master. Start by moving to the master branch

```
git checkout master
```

and once there, merge the branch into master

```
git merge branch-name
```

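After that, master can be pushed to the server with the push command shown earlier. The merged branch is no longer needed and can be deleted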

```
git branch -d branch-name
```
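The branch steps can be tried end to end without touching a real server. In the sketch below, a local bare repository stands in for the server; the paths, file name and identity values are assumptions made for the example, and `branch-name` is the same placeholder used above.

```shell
# Hypothetical sandbox: a bare repository plays the role of the server.
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/server.git"
git init -q "$sandbox/work"
cd "$sandbox/work"
# Placeholder identity, stored only in this sandbox repository.
git config user.name "Example User"
git config user.email "user@example.com"
git remote add origin "$sandbox/server.git"

echo "baseline" > report.txt
git add report.txt
git commit -q -m "Add baseline report"
git branch -M master              # make sure the branch is called master
git push -q origin master

# Do the work on a separate branch.
git checkout -q -b branch-name
echo "new section" >> report.txt
git commit -q -a -m "Add new section to report"

# Merge it back, publish the result, and remove the branch.
git checkout -q master
git merge -q branch-name
git push -q origin master
git branch -d branch-name
```

The `-d` flag refuses to delete a branch that has not been merged, so it acts as a small safety check at the end of the workflow.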

### Looking at the logs

We can inspect the change log to understand the sequence of changes by looking at the header of each committed change

```
git log --oneline
```
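To follow the changes to one particular file, the log can be limited to a path by adding `--` and the file name. The sketch below builds a small history in a throwaway repository; the file names, messages and identity values are invented for illustration.

```shell
# Hypothetical sandbox: a temporary directory, not a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
# Placeholder identity, stored only in this sandbox repository.
git config user.name "Example User"
git config user.email "user@example.com"

echo "print('train')" > train.py
git add train.py
git commit -q -m "Add training script"

echo "Project notes" > README.md
git add README.md
git commit -q -m "Add README"

echo "print('validate')" >> train.py
git commit -q -a -m "Log validation score"

# Whole history, newest first, one line per commit.
git log --oneline

# Only the commits that touched train.py.
git log --oneline -- train.py
```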

## A lot more

A lot more can be done with Git and with the hosting services built on it, such as GitHub, GitLab and Bitbucket. There are some resources at the beginning of the article.