# Version Control

Version control is a way of monitoring and logging changes to your files.
Used properly, it allows you to maintain not only a full history of your work but also multiple versions of it.
It can also be used to create a remote backup of your work and changes to it.
Advanced use will allow collaboration with colleagues and even complete strangers.

We will use the most common version control system, Git, in tandem with the online repository hosting service GitHub.

**git is not GitHub**


Like bash, this tool is extremely powerful. What we will learn here is not comprehensive, but it covers the main workflows in use.

`git` is installed on most Linux systems; if it is not on yours, install it.

## Level 1: git (no GitHub)

Our goal here is to have one folder of code that maintains the history of the changes made to it.
If we do something that breaks the code or changes its behavior, we can then 'roll back' to a previous version to understand what is going on.

First we need a directory in which to keep our project.
Note: one project, one folder, one git repository.


```bash
mkdir my_project
cd my_project
git init
ls -a
```

```bash
./    ../   .git/
```

Perfect. That `.git/` folder is where all of your project's history will be stored.
If, like me, you do something horrible to this repository, you can run `rm -rf .git` and it will all go away. A new `git init` will then make a fresh repository with no history and no memory of whatever it was that you did so wrong.

Now we need to add a file to track.

```bash
touch file_a.txt
git status
```


```bash
~/my_project (main)$ git status                                                                            
On branch main

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	file_a.txt

nothing added to commit but untracked files present (use "git add" to track)
```

See that `(main)` after `~/my_project` in the prompt? Some terminals use it to show you your current branch; more on this later.

`git status` gives us the current status of the git repo (short for repository).
Here it shows that you have untracked files.
We can use `git add` to add them to the files git will track for us.

```bash
git add file_a.txt
git status
```

```bash
On branch main

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
	new file:   file_a.txt
```

A change has been staged. Here the change is the addition of a blank file.
Staged changes are not saved yet, and you can stage multiple changes before saving them.

```bash
touch file_b.txt
git add file_b.txt
git status
```

```bash
On branch main

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
	new file:   file_a.txt
	new file:   file_b.txt
```

Two changes, both new files.
To add them to the log that makes up the history of this repo, we commit them.

```bash
git commit -m "adding two new files"
git status
```

```bash
On branch main
nothing to commit, working tree clean
```

And now the repo is up to date.
Note: You must provide a message with the commit, using `-m` followed by some text in quotes. If you do not, git will open a text editor and make you write one.

Let's make changes.

```bash
echo "some text in file a" >> file_a.txt
echo "some text in file b" >> file_b.txt
git status
```

```bash
On branch main
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   file_a.txt
	modified:   file_b.txt

no changes added to commit (use "git add" and/or "git commit -a")
```

Now the status message says that the files have been modified.
Let's stage them and then commit them.

```bash
git add file_*.txt 
git status
```

```bash
On branch main
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
	modified:   file_a.txt
	modified:   file_b.txt
```

Now we have staged them we can commit them.

```bash
git commit -m "adding text to files"
git log
```

```bash
commit a1dc7fc041705c5fd61308eaff1418cc86b68ea4 (HEAD -> main)
Author: 
Date:   Fri Jul 26 12:08:00 2024 +0100

    adding text to files

commit 6ff23042a0ea5fca337fc8ad7277f18650a7d77f
Author: 
Date:   Fri Jul 26 12:00:45 2024 +0100

    adding two new files

```

Using `git log` you can view the history of your files to see what you did.
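The full log gets long quickly, so shorter views can help. The block below is a minimal sketch that rebuilds a two-commit history like the one above in a scratch directory (the temporary directory and the throwaway identity are assumptions for the example, not part of the tutorial project) and then shows three ways of reading it.

```bash
set -eu
# Throwaway identity so the commits work even with no git config (assumption).
export GIT_AUTHOR_NAME="John Doe" GIT_AUTHOR_EMAIL="johndoe@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Scratch repository, so the real my_project is left untouched.
repo=$(mktemp -d)
cd "$repo"
git init -q
touch file_a.txt file_b.txt
git add file_a.txt file_b.txt
git commit -q -m "adding two new files"
echo "some text in file a" >> file_a.txt
git add file_a.txt
git commit -q -m "adding text to files"

# One line per commit: short id and message.
git log --oneline
# The same history, with the files each commit touched.
git log --stat
# The line-by-line changes introduced by the latest commit.
git show HEAD
```

The short ids printed by `git log --oneline` are the same kind of id that can be passed to `git checkout`.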

The process of going back to an old commit is less obvious.

Say you want to look at the old code but keep the new code:

```bash
git checkout 6ff2304

cat file_a.txt
``` 

Then go back to your latest commit:

```bash
git checkout main

cat file_a.txt
```

More on checkout later.
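If you only want to read an old version of a file, you do not have to check out the old commit at all. The following is a sketch under the same assumptions as before (a scratch repository and a throwaway identity); `HEAD~1` means "one commit before the latest".

```bash
set -eu
# Throwaway identity so the commits work even with no git config (assumption).
export GIT_AUTHOR_NAME="John Doe" GIT_AUTHOR_EMAIL="johndoe@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo"
git init -q
touch file_a.txt
git add file_a.txt
git commit -q -m "adding a new file"
echo "some text in file a" >> file_a.txt
git add file_a.txt
git commit -q -m "adding text to files"

# Print file_a.txt as it was one commit ago (empty at that point),
# without changing anything in the working directory.
git show HEAD~1:file_a.txt
# Compare the previous commit with the latest one for this file.
git diff HEAD~1 HEAD -- file_a.txt
# The working copy still holds the newest version.
cat file_a.txt
```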

If you commit at the end of every day you will have a daily log of your changes.
Even better, if you commit each time you do-a-thing, you have a play-by-play of all the changes you made and, with good commit messages, why you made them!


## Level 2: GitHub

One stated benefit of version control is having a remote backup of your code.
For this we will use [GitHub](https://github.com/).
You will need to create an account. A personal account is what most people use; however, Warwick has an enterprise subscription which is already attached to your university ID: [Warwick GitHub](https://github.warwick.ac.uk/).

Either will work. A personal account has the benefit of portability; the Warwick account has the benefit of being suitable for sensitive code.

### Set up your GitHub credentials

#### SSH

Follow [this guide](https://docs.github.com/en/authentication/connecting-to-github-with-ssh/adding-a-new-ssh-key-to-your-github-account)

#### Local Identity

We will use the `git config` tool to set some global variables.

Use the same name and email as you used on GitHub.

Name:

`git config --global user.name "John Doe"`

Email:

`git config --global user.email johndoe@example.com`

You can see all your config variables:

`git config --list`
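The list can be long. As a small sketch, a single value can be read back with `--get`, and leaving out `--global` stores a setting for one repository only. The scratch repository here is an assumption for the example, so your real global settings are not touched.

```bash
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

# Without --global the value is written to this repository's .git/config only.
git config user.name "John Doe"
git config user.email johndoe@example.com

# Read back a single value instead of listing everything.
git config --get user.name
git config --get user.email

# Show only the settings stored in this repository.
git config --local --list
```

A repository-level value takes precedence over the global one, which is handy if one project needs a different email address.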

### Creating a remote

You need to create a new repository on GitHub to act as the remote; we will link your local repo to this remote repo later.

You can either click the `+` icon in the top right of the nav bar:

![Nav bar, personal](./images/NavbarPer.png)
![Nav bar, enterprise](./images/NavbarEnt.png)

Or click the green `New` button in the sidebar:

![Sidebar](./images/Sidebar.png)

Clicking either of these buttons presents a form to fill in. Name the repo `my_project` and add a description noting that this is for learning git.
Leave it public, and don't add any of the initializing items.

> #### Note:
>
> In the future, for new projects it is recommended to start with the remote and add a `README`, a `.gitignore`, and a `license`.
> You can then connect it to a local repo. In this tutorial the local repo exists first because, for projects where there is already code, it is easier to create the local repo and then push to an empty remote.
>

Once you have done this you will be presented with an empty repo, and GitHub will helpfully give you the commands to link an existing local repository from the command line.

```bash

git remote add origin https://github.com/PipGrylls/my_project.git
git branch -M main
git push -u origin main

```

Copy and paste these into the command line in your local repo and you will get the following.

```bash
Enumerating objects: 7, done.
Counting objects: 100% (7/7), done.
Delta compression using up to 8 threads
Compressing objects: 100% (4/4), done.
Writing objects: 100% (7/7), 532 bytes | 532.00 KiB/s, done.
Total 7 (delta 0), reused 0 (delta 0), pack-reused 0
To https://github.com/<username>/my_project.git
 * [new branch]      main -> main
branch 'main' set up to track 'origin/main'.
```

Refresh the GitHub page and you will see `file_a.txt` and `file_b.txt`.
From here you can click on the files to view them, click on `2 Commits` to view the history, and many other things.
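You can also check the link between local and remote from the command line. The sketch below assumes a bare repository on disk (`remote.git`, a stand-in for the empty GitHub repo so the example runs offline) and a throwaway identity; with a real GitHub remote the same two inspection commands apply.

```bash
set -eu
# Throwaway identity so the commits work even with no git config (assumption).
export GIT_AUTHOR_NAME="John Doe" GIT_AUTHOR_EMAIL="johndoe@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"
# A bare repository on disk stands in for the empty GitHub repo (assumption).
git init -q --bare remote.git

mkdir my_project
cd my_project
git init -q
touch file_a.txt
git add file_a.txt
git commit -q -m "adding a file"
git branch -M main
git remote add origin "$work/remote.git"
git push -q -u origin main

# List the remotes this repo knows about, with their URLs.
git remote -v
# Show each local branch and the remote branch it tracks.
git branch -vv
```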

We will use the remote to create a README. Click the big green button that says 'Add a README'.
Use the editor to add some descriptive text about the purpose of this repo; I use `This repo is an example for training.`.
Then click the green `Commit changes...` button. You can edit the message if you wish; finally click `Commit changes` again.
GitHub automatically displays the README.md on the homepage of your repo!

### Pull

Now we have an issue: our remote has something our local does not.
Run `git status` and you will see that the local has no idea this has happened.
We use `git fetch` to collect information about the remote.
Now `git status` will tell us that there is a difference and that our branch is behind.
To bring the local up to date we can simply use `git pull`; run that now.

```bash
Updating a1dc7fc..2832703
Fast-forward
 README.md | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 README.md
```

This output tells us which file was changed, how many lines were changed, and the method by which the change was made (here a fast-forward).
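The fetch, status, pull sequence can be tried offline. This is a sketch only: a bare repository on disk stands in for GitHub, a second clone called `elsewhere` plays the role of the web editor that added the README, and the directory names and identity are assumptions for the example.

```bash
set -eu
# Throwaway identity so the commits work even with no git config (assumption).
export GIT_AUTHOR_NAME="John Doe" GIT_AUTHOR_EMAIL="johndoe@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git            # stand-in for GitHub (assumption)

# "laptop" is our local repo.
mkdir laptop
cd laptop
git init -q
echo "some text in file a" > file_a.txt
git add file_a.txt
git commit -q -m "adding a file"
git branch -M main
git remote add origin "$work/remote.git"
git push -q -u origin main

# "elsewhere" plays the role of the GitHub web editor adding a README.
cd "$work"
git clone -q -b main remote.git elsewhere
cd elsewhere
echo "This repo is an example for training." > README.md
git add README.md
git commit -q -m "Create README.md"
git push -q origin main

# Back in the local repo: status knows nothing until we fetch.
cd "$work/laptop"
git status -sb        # no mention of being behind yet
git fetch -q
git status -sb        # now reports that main is behind origin/main
git pull -q origin main
ls                    # README.md has arrived
```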

### Push

We saw push earlier, but let's look at it again in isolation.
First we make, add, and commit some local changes.

```bash
echo "some more text" >> file_a.txt
touch file_c.txt
echo "this is file c" > file_c.txt
git add *.txt
git commit -m "some local changes"
git status
```

```bash
[main 375d9da] some local changes
 2 files changed, 2 insertions(+)
 create mode 100644 file_c.txt
On branch main
Your branch is ahead of 'origin/main' by 1 commit.
  (use "git push" to publish your local commits)

nothing to commit, working tree clean
```

This is a similar message to before; however, our local knows the last state of the remote and so knows that we are ahead. By default `git push` will push to the remote.
However, as best practice we will use `git push origin main`, which says: do a `git push`, use the remote we call `origin`, and push to the branch there called `main`.

Run `git push origin main` and you will see:

```bash
Enumerating objects: 6, done.
Counting objects: 100% (6/6), done.
Delta compression using up to 8 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (4/4), 389 bytes | 389.00 KiB/s, done.
Total 4 (delta 0), reused 0 (delta 0), pack-reused 0
To https://github.com/PipGrylls/my_project.git
   2832703..375d9da  main -> main
```

Then you can look on Github and see your changes.
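The same check can be made without leaving the terminal. As a sketch (again with a bare repository on disk standing in for GitHub and a throwaway identity, both assumptions for the example), `git rev-list --count origin/main..main` counts the commits that `main` has and `origin/main` does not.

```bash
set -eu
# Throwaway identity so the commits work even with no git config (assumption).
export GIT_AUTHOR_NAME="John Doe" GIT_AUTHOR_EMAIL="johndoe@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git            # stand-in for GitHub (assumption)

mkdir my_project
cd my_project
git init -q
echo "some text in file a" > file_a.txt
git add file_a.txt
git commit -q -m "adding a file"
git branch -M main
git remote add origin "$work/remote.git"
git push -q -u origin main

# A local commit that has not been pushed yet.
echo "this is file c" > file_c.txt
git add file_c.txt
git commit -q -m "some local changes"

git rev-list --count origin/main..main   # prints 1: one commit to publish
git push -q origin main
git rev-list --count origin/main..main   # prints 0: local and remote agree
```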

> #### Note 
>
> Looking at the commands we used earlier, you can see that when we ran `git remote add` we specified `origin` as the name and then passed the URL for it.
> The branch was already called `main`.
> `git branch -M main` renames the current branch to `main`; since ours already had that name, it changed nothing here.
> > Note in a note:
> >
> > In older repositories you may still see `master` as the `primary` or main branch.
> > This was changed to `main` due to the connotations of the descriptor `master`.
>
> ```bash 
> git remote add origin https://github.com/PipGrylls/my_project.git
> git branch -M main
> ```
>

### Cloning some other code.

It's worth noting that you will likely work with someone else on their code.

To do this you will need to clone their code; use [my test project](https://github.com/PipGrylls/my_project/tree/main) as an example.

Click the green `<> Code` button, select SSH, and copy the repo's SSH location.
Create a folder called `test` and navigate inside it (this is necessary because git is going to create a folder called `my_project` and you already have one).
Open a command line, type `git clone`, then paste the SSH location you just copied.
Run this command, use `ls` to see the `my_project` folder, navigate inside, and run `git status`. Otherwise look around; it contains some things from the next section.

You can do this for any public code, but you may not (and likely will not) be able to push to the repo unless you are added as a collaborator or you use a branch and a pull request, which we will go over later.

If you want to make a copy of someone's code that is no longer directly linked to theirs, you can use the fork button (above and to the right of the code button).
This creates a copy of their code in your profile. You can then follow the instructions above to clone it locally; it will be linked to your version and you will have push permissions.

> #### Note
>
> Before copying someone else's code, check the license and/or ask for permission.
> You should ensure you understand the terms of use and don't violate them.

## Level 3: Conflicts and Merging

If you follow the above and never edit your local without first synchronizing it with the origin, and vice versa, you will never run into an issue.
This is, however, unlikely.

### Conflicts

Let's make a conflict.

In the remote click on `file_a.txt` and use the editor (pencil icon in the top right) to add a line with the text `A change made in the remote`. Commit this change.

Then in the terminal run `echo "A change made locally" >> file_a.txt`. Add and commit this change.

Then run `git fetch`, followed by `git status`.

```bash
On branch main
Your branch and 'origin/main' have diverged,
and have 1 and 1 different commits each, respectively.
  (use "git pull" to merge the remote branch into yours)

nothing to commit, working tree clean
```

Now we see that there are commits in both places. Use `git pull origin main` to try to bring them all to the local.

```bash
From https://github.com/<username>/my_project
 * branch            main       -> FETCH_HEAD
hint: You have divergent branches and need to specify how to reconcile them.
hint: You can do so by running one of the following commands sometime before
hint: your next pull:
hint: 
hint:   git config pull.rebase false  # merge
hint:   git config pull.rebase true   # rebase
hint:   git config pull.ff only       # fast-forward only
hint: 
hint: You can replace "git config" with "git config --global" to set a default
hint: preference for all repositories. You can also pass --rebase, --no-rebase,
hint: or --ff-only on the command line to override the configured default per
hint: invocation.
fatal: Need to specify how to reconcile divergent branches.
```

This message is not yet the conflict itself: the two branches have diverged and `git` is asking how we want to reconcile them.

### Merging

We need to tell git how we want to reconcile the branches; the way we will use is a merge.
Copy the line `git config pull.rebase false` and paste it into the terminal.
Then rerun the pull, `git pull origin main`.

```bash
From https://github.com/PipGrylls/my_project
 * branch            main       -> FETCH_HEAD
Auto-merging file_a.txt
CONFLICT (content): Merge conflict in file_a.txt
Automatic merge failed; fix conflicts and then commit the result.
```

> Note: if you always want to merge, you can use `git config --global pull.rebase false` to make this the default for all repositories.

Now we have a proper conflict: both commits edit the same line in a way that `git` cannot automatically reconcile, so we need to open the file and resolve it manually.

```bash
vim file_a.txt
```

```text
<<<<<<< HEAD
A change made locally
=======
A change made in the remote
>>>>>>> 668d9526d42a530bf60d1317919f06987be31ff6
```

This is how git shows us what the conflict is. You can choose to resolve it however you want.

You can keep the remote change only: delete all lines between `<<<<<<<` and `>>>>>>>` (markers included) except the line that says `A change made in the remote`.
Or keep the local change only: delete all lines except the line that says `A change made locally`.
Or delete the whole lot and add the line `A resolved conflict`.
You could also keep both, in whatever order you like, by deleting just the marker lines (the `<<<<<<< HEAD` line, the `=======` line, and the `>>>>>>>` line with the commit id).
Finally, you could keep both as above and then add another line saying the conflict is fixed.

Whatever you plan to do, do it, save the file, then `add` and `commit`.
Then `git status` gives:

```bash
On branch main
Your branch is ahead of 'origin/main' by 2 commits.
  (use "git push" to publish your local commits)

nothing to commit, working tree clean
```

This resolves the conflict, but we should then push to get everything synced up: `git push origin main`.
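The whole conflict can be rehearsed offline before trying it against GitHub. This is a sketch under stated assumptions: a bare repository on disk stands in for GitHub, a second clone called `web_editor` plays the role of the edit made in the browser, the identity is a throwaway, and the resolution chosen is "keep both lines". `--no-rebase` on the pull has the same effect as setting `pull.rebase false`.

```bash
set -eu
# Throwaway identity so the commits work even with no git config (assumption).
export GIT_AUTHOR_NAME="John Doe" GIT_AUTHOR_EMAIL="johndoe@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git            # stand-in for GitHub (assumption)

mkdir local
cd local
git init -q
echo "some text in file a" > file_a.txt
git add file_a.txt
git commit -q -m "adding a file"
git branch -M main
git remote add origin "$work/remote.git"
git push -q -u origin main

# A second clone plays the role of the edit made in the GitHub web editor.
cd "$work"
git clone -q -b main remote.git web_editor
cd web_editor
echo "A change made in the remote" >> file_a.txt
git commit -q -am "remote change"
git push -q origin main

# The competing local change.
cd "$work/local"
echo "A change made locally" >> file_a.txt
git commit -q -am "local change"

# The pull stops with a conflict, so its non-zero exit status is expected.
git pull --no-rebase origin main || true
cat file_a.txt                           # shows the conflict markers

# Resolve by keeping both lines, then add and commit to finish the merge.
printf '%s\n' "some text in file a" "A change made locally" "A change made in the remote" > file_a.txt
git add file_a.txt
git commit -q -m "resolve conflict in file_a.txt"
git push -q origin main
git status -sb
```

Overwriting the file with `printf` is only a convenience for a script; by hand you would edit the file in your editor as described above.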


## Level 4: Feature Branch Workflow

> Note: 
> 
> If you use git like this you are likely among the most proficient git users.
> This section is designed to lay out the processes of how to do this; a full tutorial with code is out of scope.

The previous example is a poor way to handle git: it creates messy merges that can take a great deal of time to unpick.

The 'Feature Branch Workflow' is designed to take a project and make it so that changes are isolated and the core of the code is always in a workable state.

> Branches:
> 
> A branch is a copy of the code that diverges from `main`.
> Branches are used to develop changes that may impact the usability of the code without rendering the whole repository unusable while the change is made.
> We will also see that they can be used to compartmentalize different changes so that blocks of work are kept separate and manageable.
> In some cases, branches are used to separate versions of code that have different defaults, such as a code written with both a GPU backend and a CPU backend but a consistent frontend API.
>

### `main` and `dev`

To start this process we should ensure we have two standard branches, `main` and `dev`.
`main`, as we have seen, is the primary or default branch; it should be kept in a working state.
Thus to make any change we need another branch on which to test additions; we will call this `dev`.
To create `dev`:

```bash
git branch dev
git checkout dev
```

```bash
Switched to branch 'dev'
```

Now we have `dev`, which is an identical copy of `main`, and we should push it to origin so that it is tracked remotely:

```bash
git push origin dev
```

```bash
Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
remote: 
remote: Create a pull request for 'dev' on GitHub by visiting:
remote:      https://github.com/PipGrylls/my_project/pull/new/dev
remote: 
To https://github.com/PipGrylls/my_project.git
 * [new branch]      dev -> dev
```

Differences between `dev` and `main` are stored in `.git/`, and you can switch to any branch using `git checkout <branch name>`.
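When you have several branches it helps to be able to ask git which ones exist and which one you are on. A minimal sketch, run in a scratch repository with a throwaway identity (both assumptions for the example); the branch name `scratch` is hypothetical and only there to show creating and deleting a branch.

```bash
set -eu
# Throwaway identity so the commits work even with no git config (assumption).
export GIT_AUTHOR_NAME="John Doe" GIT_AUTHOR_EMAIL="johndoe@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo"
git init -q
touch file_a.txt
git add file_a.txt
git commit -q -m "adding a file"
git branch -M main

git branch dev
git checkout -q dev

# List local branches; the current one is marked with *.
git branch
# Print only the name of the current branch.
git rev-parse --abbrev-ref HEAD

# -b creates a branch and switches to it in one step ("scratch" is hypothetical).
git checkout -q -b scratch
git checkout -q main
# Delete a branch that is no longer needed; -d refuses if it has unmerged work.
git branch -d scratch
```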

You can now develop on `dev` and then when you are ready you can do one of two things to bring your changes to `main`.

#### Merge locally

Doing things locally is okay:

```bash
echo "a change made on dev" >> file_c.txt
git add *.txt
git commit -m "changing things on dev"
git push origin dev
git checkout main
```

Now that we have done this we want to bring the changes back to `main`.

Note: we are now on the `main` branch.

```bash
git merge dev
```

Now inspect file_c.txt and you will see the dev change.

```bash
git push origin main
```

Push to the remote and you will see that the remote is also updated.
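Before running `git merge dev` it can help to preview what the merge will bring in. The sketch below uses a scratch repository and a throwaway identity (assumptions for the example): `main..dev` lists the commits on `dev` that `main` lacks, and `main...dev` (three dots) shows the changes made on `dev` since the two branches split.

```bash
set -eu
# Throwaway identity so the commits work even with no git config (assumption).
export GIT_AUTHOR_NAME="John Doe" GIT_AUTHOR_EMAIL="johndoe@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "this is file c" > file_c.txt
git add file_c.txt
git commit -q -m "adding file c"
git branch -M main

# Work on dev, then return to main.
git checkout -q -b dev
echo "a change made on dev" >> file_c.txt
git commit -q -am "changing things on dev"
git checkout -q main

# Commits on dev that main does not have yet.
git log --oneline main..dev
# The changes those commits would bring in.
git diff main...dev

git merge -q dev
# After the merge nothing is left to bring over, so this prints nothing.
git log --oneline main..dev
cat file_c.txt
```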


#### Launch a pull request to merge on GitHub (preferred for collaborative projects)

Start by making a change to a file on `dev`.

```bash
git checkout dev
echo "a change made on dev to be pushed to the remote" >> file_b.txt
git add *.txt
git commit -m "a change to go to the dev remote"
git push origin dev
```

Go to the GitHub page, where you will see this:

![note of a push](./images/NoteOfPush.png)

You can click that to get to the pull request page directly. If that note isn't there, take the 'long route' and go to `Pull requests` in the navbar.

![repo navbar](./images/RepoNavbar.png)

Then click the green 'New pull request' button.

![pull request](./images/PullRequest.png)

Modify the dropdowns to compare `dev` to `main` as in the image above.
GitHub then indicates whether the branches can be merged automatically.
We can then click 'Create pull request'; on the next page more details can be added about the PR and why it's being made.
You can also add Reviewers, Assignees, Labels, and more, all of which are useful when you work collaboratively.
You can leave these blank for now and click 'Create pull request' again.

Finally you get to the pull request information page.
This is where, in a collaborative project, your request will be reviewed and merged by a maintainer (depending on the professionalism of the project).
Here just press 'Merge pull request', then 'Confirm merge'.
Then go back to `<> Code`.

Locally we can check out `main` and then pull to get that merge back locally.

```bash
git checkout main
git pull origin main
```

And that's the merge done. This seems much more complicated, but for most large projects it is the only way to contribute to `main`.

### Feature branches


> Overview:
> 
> The idea of this style of development is to keep `main` as the 'release' version.
> It will always work, it is what is documented for people to use, and you can point to it (and to specific versions of it via commits or releases) in papers.
>
> `dev` is also protected from direct `push`es and only updated via pull requests.
> To make changes we use a specific feature branch. I like to use GitHub issues to create branches, although the nomenclature here starts to fail us.
> You can also just make branches using `git branch <branch-name>`, then follow the steps above, making changes on your new branch and opening a pull request to `dev`.
>
> Once you have merged a given number of features into `dev` so that it has a new ability or function and, most importantly, has been tested to be stable, you can merge it into `main`.
> For more complicated projects you can expect to find more complicated structures or, unfortunately, no structure at all, which can be very hard work.
>

Here is an example of using GitHub to do this branching for the above workflow.

Go to `Issues` in the navbar:

![navbar](./images/RepoNavbar.png)

Click new issue, use the form to create an issue and description.

![issue page](./images/Issue.png)

Then click 'Submit new issue'. You will land on the issue page, where you can have a discussion about the issue and many other things.
We want to create a branch to track this issue; look for the link in the sidebar under 'Development'.

![Issue sidebar](./images/IssueSidebar.png)

A pop-up box gives you a generated branch name (best left as is) and a bit of other information. The one thing we do want to change is the branch source, from `main` to `dev`.

First click on 'Change branch source'.

![Issue Popup](./images/IssuePopup2.png)

Then in the dropdown select `dev` (or whichever branch you want to branch off):

![Issue Popup](./images/IssuePopup.png)

Click the green 'Create branch' button, and the commands to check out the branch locally are given.

```bash
git fetch origin
git checkout 2-content-for-file_b
```

`git fetch origin` gets all branches from origin, and the checkout then puts us on that branch.
Add some content to `file_b.txt`:

```bash
echo "content for the issue branch" >> file_b.txt
git add file_b.txt
git commit -m "adding some content to file b #2"
git push origin 2-content-for-file_b
```

If you go back to the issue page and refresh it, you will see that because the commit message includes `#2`, the issue number (check what yours is), the commit is logged in the issue comments.

Now follow the steps above closely, but not exactly, to create an issue to track the creation of a new file called `file_d.txt`.

Check out the branch, create the file, commit it, and push back.

#### Merging it all back together

Go to `Pull requests` and open new pull requests; each pull request should take one of our 'issue branches' and merge it into `dev`.

![pr for an issue](./images/IssueBranchPR.png)

Now we have 2 pull requests and can merge them into `dev` one by one. Normally you would have a reviewer audit and merge your pull request to avoid mistakes, but here we will merge them ourselves.

![image of multiple waiting PRs](./images/MultiplePRs.png)

After merging a pull request you will see a prompt asking if you want to delete the branch.
For a feature branch we are usually done with that development at this point and ready to delete it, so press that button.

![Prompt to delete branch](./images/DeleteBranch.png)

We can also go back to the issues and click 'Close issue' on each of them to show that we have stopped tracking them.
It's good practice to close with a comment noting the pull request that solved the issue.

Go back to your local and run `git fetch origin` and then `git checkout dev`.
Use the pull request method to merge `dev` into `main`, noting that it contains the content for `file_b.txt` and the new `file_d.txt`.

We see from this process that `main` remained more stable.
In practice `dev` could have as many pull requests as needed and `main` would only have one, which would have been run through testing to make sure it met the repository's standards.

That's it. Use git like that and you'll thank yourself later.


### Last note

Sometimes merging won't be automatic and there will be conflicts. 

If `dev` has been modified and the pull request from a feature branch cannot be merged automatically, we need to do the following.

1. fetch all branches locally
2. `git checkout dev`
3. `git pull origin dev`
4. `git checkout <feature branch>`
5. `git merge dev`
6. resolve conflicts
7. push the feature branch

The pull request will automatically update to note your conflict resolution and should now show that merging can be automatic.
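The seven steps above can be sketched as one script. Everything in the set-up is an assumption made so the example runs offline: a bare repository on disk stands in for GitHub, a clone called `colleague` makes the competing change on `dev`, the feature branch is given the hypothetical name `my-feature`, and the conflict is resolved by keeping both lines.

```bash
set -eu
# Throwaway identity so the commits work even with no git config (assumption).
export GIT_AUTHOR_NAME="John Doe" GIT_AUTHOR_EMAIL="johndoe@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git              # stand-in for GitHub (assumption)

# Set-up: main and dev on the remote, plus a feature branch cut from dev.
mkdir local
cd local
git init -q
echo "first line" > file_b.txt
git add file_b.txt
git commit -q -m "adding a file"
git branch -M main
git remote add origin "$work/remote.git"
git push -q -u origin main
git checkout -q -b dev
git push -q -u origin dev
git checkout -q -b my-feature              # hypothetical branch name
echo "line from the feature" >> file_b.txt
git commit -q -am "feature work"
git push -q -u origin my-feature

# Meanwhile someone else changes the same place in the file on dev.
cd "$work"
git clone -q -b dev remote.git colleague
cd colleague
echo "line from dev" >> file_b.txt
git commit -q -am "dev work"
git push -q origin dev
cd "$work/local"

git fetch -q origin                        # 1. fetch all branches locally
git checkout -q dev                        # 2. checkout dev
git pull -q origin dev                     # 3. bring local dev up to date
git checkout -q my-feature                 # 4. checkout the feature branch
git merge dev || true                      # 5. merge dev; stops on the conflict
# 6. resolve the conflict (here: keep both lines), then add and commit.
printf '%s\n' "first line" "line from dev" "line from the feature" > file_b.txt
git add file_b.txt
git commit -q -m "merge dev into my-feature"
git push -q origin my-feature              # 7. push the feature branch
```

Because the feature branch now contains everything on `dev`, a pull request from it into `dev` no longer has anything left to conflict with.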