# Introduction to Git 


In [2]:
from IPython.display import Image
from IPython.core.display import HTML 

<b>Git</b> is a version control system that allows you to track changes in your code over time.
* it helps us keep track of code changes
* it records the authors of the changes
* it also helps us collaborate on code

Git is currently by far the most popular version control system. It is primarily used for
software development, but you can version all kinds of data with it.

## What does Git do?

* it allows us to manage projects with <b> Repositories </b>
* it allows us to <b> clone </b> a project to work on a local copy
* it allows us to track changes through <b> staging </b> and <b> committing </b>
* <b> branch </b> and <b> merge </b> to allow for work on different parts and versions of the project
* <b> pull </b> the latest version of the project to a local (on your pc) copy
* <b> push </b> local updates to the main project

In [3]:
Image(url= "img/git1.png", width=400, height=1200)

## Version control

When more than one person (or even just a single person) works on a program
(or on data in the broadest sense), there is a danger of quickly creating a chaos of versions.

Version control systems, such as Git, attempt to
solve this problem by
- storing the individual versions, documenting them and making them retrievable
- making changes visible and automatically merging them if necessary

In [4]:
Image(url= "img/git2.png", width=400, height=1200)

## Basic Git commands

| Command   | Purpose    |
|-------------:|:-----------|
| ``` git init ``` | initializes a new Git repository in the current directory|
| ``` git add <file> ```  | adds a file to the staging area | 
| ``` git commit -m "<message>" ``` | commits the changes in the staging area with a message describing the changes | 
| ``` git status ``` | shows the current status of the repository | 
| ``` git log ``` | shows the commit history of the repository | 
| ``` git <command> -h ``` | shows a short summary of the options for a command | 


### Let's get started

To start using Git, we are first going to open up our command shell.

On Windows, you can use Git Bash, which is included in Git for Windows. On Mac and Linux, you can use the built-in terminal.

The first thing we need to do is check whether Git is properly installed:



In [29]:
Image(url= "img/git3.png")

If Git is installed, it should show something like ```git version X.Y```.
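The check itself is a single command, ```git --version```. As a minimal sketch, the snippet below also handles the case where Git is missing; the variable name `version_line` and the hint text are only illustrations:

```shell
# Print the installed Git version, or a hint if Git is not on the PATH.
if command -v git >/dev/null 2>&1; then
    version_line=$(git --version)   # something like "git version X.Y.Z"
    echo "$version_line"
else
    version_line=""
    echo "Git does not seem to be installed"
fi
```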



Now let Git know who you are. This is important for version control systems, as each Git commit uses this information:



```
$ git config --global user.name "lucijakrusic"

$ git config --global user.email "lucija.krusic1@gmail.com"

```

We can also use:
    
```
$ git config --list
```
    
to list all the configuration settings that are currently set. Exit by pressing ```q```.

We can also, for example, tell Git which text editor it should open (e.g. for writing commit messages). To do that, use:
 ``` 
 $ git config --global core.editor <editor_path>
 ```
 
 such as
 
 ```
 $ git config --global core.editor "C:/WINDOWS/system32/notepad.exe"
 ```
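If you want to try out configuration settings without touching your global setup, you can leave out ```--global```, so that the values are stored only in one repository. The following is a sketch under that assumption: it works in a throwaway temporary directory, and the name and e-mail address are placeholders, not real accounts.

```shell
set -e
# Work in a throwaway directory that is removed when the script exits.
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q

# Without --global, the values are written to this repository's .git/config only.
git config user.name "Example Student"          # placeholder name
git config user.email "student@example.com"     # placeholder address

# Read single settings back instead of listing everything.
name=$(git config --get user.name)
email=$(git config --get user.email)
echo "$name <$email>"
```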

## Git status


We can check the status of the repo by using:

```
$ git status

```

This gives us information on the Git repository, such as the current branch and the changes we haven't "saved" (committed) yet:


In [30]:
Image(url= "img/git4.png")

## Git help

If we need help remembering options for a specific command, we can always use ``` git <command> -h```, which prints a short summary of the options (``` git <command> --help``` opens the full manual page), e.g.

```
 git commit -h

```

And to list all possible commands we use

```

git help --all

```

## Branching and merging

In Git, a branch is a new or separate version of the main repository. A branch branches off from the main branch at a certain point, and from there the development is <i> basically </i> independent. The branches share history, though, and are aware of it, which enables subsequent merging.

In [13]:
Image(url= "img/git10.png")

Git uses this feature extensively. This is facilitated by the fact that the repository is available locally on the computer, so switching from one branch to another takes only a fraction of a second, even for large projects. This is why branches are omnipresent in Git.

Over time, various models for the effective use of branches have emerged, such as the following one:


This model provides for two basic branches:
- ```master``` - the traditional name for the first branch created by Git itself. The master branch represents the official version of the project. Recently, the more neutral ```main``` has become the preferred name.
- ```develop``` - the development branch of the project, which is merged back into the master branch when a milestone is reached.

In addition, there are so-called ```feature branches```, in each of which a new feature is developed, and, if necessary, further branches (release branches, hotfix branches, etc.).




In [14]:
Image(url= "img/git11.png")


A repository with several branches could look like this:
- Parallel to the master branch, there is the develop branch.
- Branching off from the develop branch, there are feature branches, here e.g. input_validation and core_algorithm. Such branches are only created to develop a new feature.
- When the feature has been merged back into the develop branch, the feature branch is usually deleted.
- alternative_algorithm is an example of an ad-hoc branch, where you quickly try something out. If the idea works, you merge the code into the branch from which it was forked. If it doesn't work, the branch is simply deleted.
- The release branch contains only released versions.
- A hotfix branch becomes necessary when a dangerous bug is found in a released version. The fix, i.e. the code that corrects the error, must be merged back into master as soon as possible and leads to another release. Often it also has to be incorporated into the develop branch.


### Commands dealing with branches

| Command   | Purpose    |
|-------------:|:-----------|
| ``` git branch ``` | lists all branches in the repository|
| ``` git branch <branch-name>``` | creates a branch|
| ``` git checkout <branch>```  | switches to the specified branch| 
| ``` git merge <branch> ``` | merges the specified branch into the current branch | 
| ``` git branch -m old-name new-name ``` | renames a branch | 
| ``` git branch -d <branch>``` | deletes a branch | 
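The commands in the table can be tried out safely in a throwaway repository. The following is a minimal sketch, not a recommended workflow: the temporary directory, the branch names `feature` and `finished-feature`, the file `notes.txt` and the user details are all made up for illustration.

```shell
set -e
# Throwaway repository, removed again when the script exits.
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the first branch "main" on any Git version
git config user.name "Example Student"     # placeholder identity for the commits
git config user.email "student@example.com"

echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

git branch feature                         # create a branch
git checkout -q feature                    # switch to it
echo "second line" >> notes.txt
git commit -q -a -m "Extend notes on the feature branch"

git checkout -q main                       # back to the main branch
git merge -q feature                       # merge the feature branch into main
git branch -m feature finished-feature     # rename the branch
git branch -q -d finished-feature          # delete it, since it is fully merged

branches=$(git branch --format='%(refname:short)')
last_line=$(tail -n 1 notes.txt)
echo "remaining branches: $branches"
echo "last line of notes.txt on main: $last_line"
```

After the merge, the change made on the feature branch is part of main, so deleting the branch loses nothing.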


In our repo, we are currently on the master branch. We can tell because the output of ```git status``` says so.

## Creating a "notes branch" for our class repository

### Step 1: Creating a personal branch

``` git checkout -b my_notes ```


In [5]:
Image(url= "img/git-branch.png")

This creates a new branch called my_notes, where you can freely make changes without affecting the main branch.



### Step 3: Working in the Personal Branch


During the session (or after it), you can work and take your notes in the branch you've created (my_notes). The notebooks will be saved there, and you don't need to worry about overwriting the instructor's changes.

### Step 4: Pulling Updates from the class repo (Before Each Class)

Before every class, you should switch back to the main branch to pull the latest updates from the course repository:
```
git checkout main
git pull origin main
```

This will bring in all the new content or updates from the instructor.



### Step 5: Merging the updates into the personal branch
After pulling the latest updates, you can merge them into your personal branch without causing conflicts in your own notes:

```
git checkout my_notes
git merge main

```


Since you haven’t made any changes to the course notebooks in the main branch, this merge will go smoothly with no conflicts.



### Step 6: Keeping personal notes safe

Now, all the latest course content is available in your my_notes branch along with your personal notes. You can continue to work in this branch without ever touching the main branch, where the instructor’s updates live.

If you ever want to back up your personal branch to a remote repository (optional):

```
git push origin my_notes

```
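The whole notes workflow can be rehearsed offline before you try it on the real class repository. The sketch below makes several assumptions: a local bare repository `course.git` stands in for the course repository on GitHub, the directories `instructor` and `student`, the `.txt` files and the user details are invented for the example, and everything lives in a temporary directory.

```shell
set -e
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# A bare repository stands in for the course repository on GitHub.
git init -q --bare course.git
git -C course.git symbolic-ref HEAD refs/heads/main

# The "instructor" publishes the first lesson.
git init -q instructor
cd instructor
git symbolic-ref HEAD refs/heads/main
git config user.name "Instructor"                 # placeholder identity
git config user.email "instructor@example.com"
git remote add origin "$workdir/course.git"
echo "lesson 1" > lesson1.txt
git add lesson1.txt
git commit -q -m "Add lesson 1"
git push -q origin main

# The "student" clones the course and creates a personal branch (Step 1).
cd "$workdir"
git clone -q course.git student
cd student
git config user.name "Example Student"            # placeholder identity
git config user.email "student@example.com"
git checkout -q -b my_notes
echo "my remark" > my_notes.txt                   # working in the personal branch (Step 3)
git add my_notes.txt
git commit -q -m "Add personal notes"

# Meanwhile the instructor publishes a second lesson.
cd "$workdir/instructor"
echo "lesson 2" > lesson2.txt
git add lesson2.txt
git commit -q -m "Add lesson 2"
git push -q origin main

# Before class: pull the updates (Step 4) and merge them into my_notes (Step 5).
cd "$workdir/student"
git checkout -q main
git pull -q origin main
git checkout -q my_notes
git merge -q --no-edit main

# Optional backup of the personal branch (Step 6).
git push -q origin my_notes

files=$(ls | tr '\n' ' ')
echo "files on my_notes: $files"
```

Because the student only ever adds new files on my_notes, the merge combines both histories without a conflict.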

## Git staging environment


Two of the core concepts of Git are the <b>Staging Environment</b> and the <b>Commit</b>.

As you are working, you may be adding, editing and removing files. Whenever you hit a milestone or finish a part of the work, you should add the files to the Staging Environment.




```
$ git add <filename>

```

or


```
$ git add --all
```

### Git commit

Since we have finished our work, let's commit it.

Adding commits keeps track of our progress and changes as we work. Git considers each commit a change point or "save point". It is a point in the project you can go back to if you find a bug or want to make a change.

When we commit, we should always include a message explaining what the newly committed change was.

By adding clear messages to each commit, it is easy for yourself (and others) to see what has changed and when.



```
$ git commit -m "First release of my notes"

```

where ```commit``` performs the commit and ```-m``` stands for "message". This should give us a result looking like:

In [36]:
Image(url= "img/git-commit.png")
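To see how staging and committing interact, it helps to stage only some of the files and then look at what Git reports. The following is a small sketch in a throwaway repository; the file names `notes.txt` and `scratch.txt` and the user details are assumptions made for the example.

```shell
set -e
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q
git config user.name "Example Student"        # placeholder identity for the commit
git config user.email "student@example.com"

echo "draft of my notes" > notes.txt
echo "temporary scribbles" > scratch.txt

git add notes.txt                             # stage only notes.txt
staged=$(git diff --cached --name-only)       # files currently in the staging area

git commit -q -m "First release of my notes"  # the commit contains only what was staged
commits=$(git rev-list --count HEAD)          # number of commits so far
status_after=$(git status --short)            # what is still not committed

echo "staged before the commit: $staged"
echo "number of commits: $commits"
echo "status after the commit: $status_after"
```

The file that was never staged is not part of the commit and still shows up as untracked (marked with ```??```) in the short status.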

## Remote repositories

To be able to use a branch from another repository, we need a so-called remote, which allows interaction with that repository.
If we have cloned a repository with ```git clone```, Git has automatically created a remote for us, which is called origin by default. Clone a repository (either from another directory or from a server), go to the
directory where your clone exists and enter this command:

```
$ git remote -v

```

And what we'll see is the remote origin that we cloned from.
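If you have no clone at hand, you can reproduce this with a local stand-in. The sketch below assumes an empty bare repository `source.git` as the repository to clone from and `copy` as the name of the clone; both names are made up for the example.

```shell
set -e
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

git init -q --bare source.git                  # stand-in for a repository on a server
git clone -q source.git copy 2>/dev/null       # hides the "empty repository" warning

remote_name=$(git -C copy remote)              # names of all configured remotes
remote_url=$(git -C copy remote get-url origin)
git -C copy remote -v                          # same information, with fetch and push URLs
```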

But let's go back to the most important commands for working with GitHub:
- ```push``` - after committing our changes in the local repo, we can push them to the remote repository
- ```pull``` - fetches the latest version of the remote repo into our local copy (we already know how to use this one)

### The Pull Request Workflow
The pull request workflow is often preferred because it allows better control over 
which changes flow back into a repository.

Example with GitHub:
1. User A creates a repository on GitHub.
2. User B forks this repository on GitHub. A fork is a kind of clone (read: copy) that semantically corresponds to a project fork. User B thus becomes the owner of the forked project in their own space on GitHub.
3. User B now clones their fork from GitHub onto their local computer and programs e.g. a bugfix or an extension.
4. When they are done, they push the change back to their fork on the GitHub server.
5. After that, they can open a pull request in the GitHub web interface. User A is notified of this and can review the proposed changes and either add them to their repository or reject them. In both cases User B is notified, and if User A rejects the changes, User B can fix them and submit another pull request.


## GitHub

In practice, collaboration often requires a central repository on an always-available server, which is accessible from everywhere and where one can, for example, also set up user management and automatic backups.

Since individuals usually neither have their own server nor want to deal
with the administration of such a server, there are providers that offer such
services (and additional features like bug trackers).

In their basic version, the repositories are free of charge, but there are also paid versions. The best-known
Git services are:
- GitHub (https://github.com/)
- Bitbucket (https://bitbucket.org/)
- GitLab (https://gitlab.com/)

* <b>GitHub</b> is a web-based hosting service for Git repositories. It offers collaboration and management features for projects hosted on its platform.
* Together, Git and GitHub are powerful tools for managing code and collaborating with others.
* GitHub and Git are not the same: Git is the version control tool, while GitHub builds services and tools around it
* GitHub is a host of source code (actually the largest in the world) and is owned by Microsoft

You're already familiar with GitHub since all of our materials are hosted there. Considering that you all also have a GitHub account (or should have it, like we spoke about last time), you'll now be making your own new repository.

In [22]:
Image(url= "img/git21.png")

In [23]:
Image(url= "img/git22.png")

In [24]:
Image(url= "img/git23.png")

Here you can name your repo and set it to public, so that people can see it, or to private. You can add a README file with a short description of the repository and a .gitignore file. The latter is basically a text file in which you list the parts of your project that you don't want Git to track and therefore don't want to share with others (e.g. log files, personal files, hidden files, etc.). Editing and committing is also possible in GitHub:

In [25]:
Image(url= "img/git24.png")
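The effect of a .gitignore file can also be checked locally. The following sketch assumes two example patterns, `*.log` for log files and `private/` for a personal folder; the files it creates are placeholders and exist only in a temporary directory.

```shell
set -e
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q

# One pattern per line: all .log files and everything in the private/ folder.
printf '%s\n' '*.log' 'private/' > .gitignore

mkdir private
echo "debug output" > run.log
echo "personal remarks" > private/diary.txt
echo "print('hello')" > script.py

git add --all                 # ignored files are skipped
tracked=$(git ls-files)       # what actually ended up in the staging area
echo "$tracked"
```

Only the .gitignore file itself and `script.py` are staged; the log file and the private folder stay out of the repository.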

Since we have already set up a toy local Git repo, let's push that to GitHub. 

In [26]:
Image(url= "img/git25.png")

Copy the URL, or click the clipboard marked in the image above. 

Now paste it into the following command:
    
```
 $ git remote add origin https://github.com/<username>/test.git
 ```
 
 Now we are going to push our main branch to the origin URL and set it as the default remote branch:
 
 ```
 $ git push --set-upstream origin main
 ```
 
 Now, go back into GitHub and see that the repository has been updated:



In [27]:
Image(url= "img/git26.png")
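The same two commands can be practised without a GitHub account. In the sketch below, a local bare repository `hosted.git` plays the role of the empty GitHub repository; that stand-in, the directory `local`, the README content and the user details are assumptions for the example only.

```shell
set -e
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

git init -q --bare hosted.git                 # stand-in for the new, empty GitHub repository

git init -q local
cd local
git symbolic-ref HEAD refs/heads/main         # name the first branch "main" on any Git version
git config user.name "Example Student"        # placeholder identity for the commit
git config user.email "student@example.com"
echo "# test" > README.md
git add README.md
git commit -q -m "Add README"

git remote add origin "$workdir/hosted.git"   # on GitHub this would be the copied URL
git push -q --set-upstream origin main        # push and remember origin/main as the upstream

upstream=$(git rev-parse --abbrev-ref 'main@{upstream}')
local_commit=$(git rev-parse main)
hosted_commit=$(git -C "$workdir/hosted.git" rev-parse main)
echo "upstream of main: $upstream"
```

Once the upstream is set, a plain ```git push``` or ```git pull``` on this branch knows which remote branch to use.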

## Collaborating with GitHub

* Forking: Creating a copy of a repository on your GitHub account.
* Cloning: Downloading a copy of a repository to your local machine.
* Pull Requests: Proposing changes to a repository by submitting a pull request on GitHub.
* Issues: Creating and tracking issues or bugs in a repository on GitHub.


## Best Practices

* Commit regularly and write clear commit messages.
* Use branches to work on new features or bug fixes.
* Pull changes from the remote repository frequently to avoid conflicts.
* Keep your repository organized and maintain a clear directory structure.
* Use pull requests and code reviews to ensure high-quality code.

# A more comprehensive guide

If you have any questions about Git and GitHub in the future or run into issues, please first consult these more extensive guides:
* https://github.com/lucijakrusic/programming2SS23/blob/main/intro_selected_topics/Git%20%26%20GitHub.ipynb
* https://www.w3schools.com/git/git_intro.asp?remote=github