# Overview

This section describes the basics of working with Git.

Git is a distributed version control system: every copy of a repository holds the complete project history instead of relying on a single central store. A Git repository typically lives on two kinds of hosts:

- A Git hosting service or server that stores the project.
- A developer's computer. The entire project history is stored on each developer's computer. When needed, developers can upload local changes to the hosting server or download changes contributed by other developers.
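
The exchange between the two kinds of hosts can be sketched without a real hosting service. In the following minimal sketch, a bare repository in a temporary directory stands in for the server; the `alice` and `bob` folders, the identity values, and the file name are hypothetical and chosen only for illustration.

```shell
# A bare repository stands in for the hosting server (assumption for this sketch).
work=$(mktemp -d)
git init --quiet --bare "$work/server.git"

# The first developer clones the still-empty server repository.
git clone --quiet "$work/server.git" "$work/alice" 2> /dev/null
cd "$work/alice"
git config user.name "Alice"
git config user.email "alice@example.com"

# Create a commit locally, then upload it to the server.
echo "hello" > readme.txt
git add readme.txt
git commit --quiet -m "add readme"
git push --quiet origin HEAD

# The second developer clones the server and receives the full history.
git clone --quiet "$work/server.git" "$work/bob"
git -C "$work/bob" log --oneline
```

After the last command, the `bob` clone contains the commit that was created in the `alice` clone, even though the two folders never communicated directly.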

This is an entry page containing basic information you need to know to start working with `git`. Some sections refer to specific pages that provide more detailed descriptions of various `git` concepts with more advanced examples.

## Repository 

This section discusses what a `git` repository is and what a typical folder needs to have to be considered a `git` repository.

You should use the `git init` command to create a new repository. This command adds a `.git` folder to the directory being initialized as a repository - this folder serves as a marker indicating that the directory is a `git` repository.

---

The following cell creates a folder and initializes a repository in it. This allows you to see the typical messages provided by `git` during these operations.

In [None]:
mkdir /tmp/git_init
cd /tmp/git_init

git init

hint: Using 'master' as the name for the initial branch. This default branch name
hint: is subject to change. To configure the initial branch name to use in all
hint: 
hint: 	git config --global init.defaultBranch <name>
hint: 
hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
hint: 'development'. The just-created branch can be renamed via this command:
hint: 
hint: 	git branch -m <name>
Initialized empty Git repository in /tmp/git_init/.git/


Check what the empty folder looks like after running `git init`.

In [None]:
ls -la

total 20
drwxrwxr-x  3 fedor fedor  4096 Jan  8 18:47 .
drwxrwxrwt 43 root  root  12288 Jan  8 18:50 ..
drwxrwxr-x  7 fedor fedor  4096 Jan  8 18:47 .git


There is a `.git` folder which contains all the files that `git` uses to keep track of changes.
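
As a small complement, the following sketch shows two ways to confirm from the command line that a directory is a repository. It assumes a fresh temporary directory; the `repo` variable is a name introduced here for illustration.

```shell
# Create a throwaway repository in a temporary directory.
repo=$(mktemp -d)
cd "$repo"
git init --quiet

# List the contents of the .git folder (object database, references, local config).
ls .git

# Ask git whether the current directory is inside a working tree.
git rev-parse --is-inside-work-tree   # prints "true" inside a repository

# Print the location of the repository metadata.
git rev-parse --git-dir               # prints ".git" at the repository root
```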

### Git kernel

This section typically uses a custom Jupyter kernel for Git. The kernel creates a fresh Git repository each time you run a cell with the `%init` magic on the first line, so the examples can start directly with `git` commands and skip the boilerplate for managing the repo.

## Stages

A file in Git can be in one of three stages:

- `Modified`: the file in the working directory has been changed, but the changes have not been staged yet.
- `Staged`: the current version of the file is marked to be included in the next commit.
- `Committed`: this version of the file is stored in the local database in the `.git` folder.

The following picture is an adaptation of a popular way to visualise the stages in Git:

![](overview_files/git_stages.svg)

See the [Stages](stages.ipynb) page for more details.
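
The three stages can be traced on a single file. The following is a minimal sketch that assumes a fresh temporary repository; the `notes.txt` file name and the identity values are hypothetical.

```shell
# Throwaway repository with a local identity so that commits succeed.
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Example"
git config user.email "example@example.com"

# Start from a committed file.
echo "v1" > notes.txt
git add notes.txt
git commit --quiet -m "add notes"

# Modified: the change exists only in the working directory.
echo "v2" >> notes.txt
git status --short

# Staged: the change is marked for the next commit.
git add notes.txt
git status --short

# Committed: the change is stored in the .git database.
git commit --quiet -m "update notes"
git status --short
```

The last `git status --short` call prints nothing because the working directory matches the latest commit.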

## Configuration

Use the `git config` command to work with configurations from the command line.

Git has three levels of configuration. Each level keeps its settings in its own file and has a corresponding flag for the `git config` command. The following table maps each configuration level to its file and flag:

| Level                               | Configuration file | `git config` flag |
|-------------------------------------|--------------------|-------------------|
| System: for all users of the system | `/etc/gitconfig`   | `--system`        |
| Global: for the user                | `~/.gitconfig`     | `--global`        |
| Local: for the repository           | `./.git/config`    | `--local`         |

For more details, see the [Configuration](configuration.ipynb) page.
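
The following minimal sketch writes and reads a setting at the local level only, so it does not touch the user's global or system files. It assumes a fresh temporary repository; the value `Local Name` is a placeholder. When the same key is set at several levels, the local value takes precedence over the global one, and the global value over the system one.

```shell
# Throwaway repository so that only ./.git/config is modified.
repo=$(mktemp -d)
cd "$repo"
git init --quiet

# Write a value at the local level; it is stored in ./.git/config.
git config --local user.name "Local Name"

# Read the value back from the local level only.
git config --local --get user.name

# Show every effective setting together with the file it comes from.
git config --list --show-origin
```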

## Status

The `git status` command prints the status of the repository.

For more compact output, use the `-s` (`--short`) option. The result lists just the files, each prefixed with a two-character marker:

- The **first symbol** shows the file's status in the staging area.
- The **second symbol** shows whether the file has unstaged modifications in the working directory.
- Untracked files are denoted with the `??` marker.

---

The following cell creates files in different stages.

In [42]:
%init
echo "initial" > modified
echo "initial" > modified_staged
echo "initial" > modified_staged_modified
git add modified modified_staged modified_staged_modified
git commit -m "add modified" &> /dev/null

echo "initial" > staged_file
git add staged_file &> /dev/null

echo "initial" > new_file
echo "new" >> modified

echo "new" >> modified_staged
git add modified_staged

echo "initial" > new_staged_modified
git add new_staged_modified
echo "new" >> new_staged_modified

echo "new" >> modified_staged_modified
git add modified_staged_modified
echo "new" >> modified_staged_modified

The `git status` output for this repo is shown in the following cell:

In [43]:
git status

On branch master
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
	modified:   modified_staged
	modified:   modified_staged_modified
	new file:   new_staged_modified
	new file:   staged_file

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   modified
	modified:   modified_staged_modified
	modified:   new_staged_modified

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	new_file



Most of the output is easy to understand, but consider some tricky cases:

- `modified_staged_modified`: the file has both staged and unstaged changes, so it appears in both *"Changes to be committed"* and *"Changes not staged for commit"*.
- `new_staged_modified`: similar to the previous case, but the file was staged as a new file and then modified. Therefore, it appears as a *"new file"* in *"Changes to be committed"*.

The corresponding short output is in the following cell:

In [44]:
git status -s

 M modified
M  modified_staged
MM modified_staged_modified
AM new_staged_modified
A  staged_file
?? new_file


## Ignore

Specify the files you want `git` to ignore in the `.gitignore` file. Several features are important to know:

- Patterns use glob-style wildcards such as `*` and `?`, similar to shell patterns, rather than full regular expressions.
- A `.gitignore` file does not have to be in the root directory of the project. If you place it in a nested folder, its rules apply only to that folder and its subfolders.
- Lines that begin with the `#` symbol are treated as comments.

See the [Ignore](ignore.ipynb) page for details.

---

The following cell adds `file` to the `.gitignore`.

In [46]:
%init

echo "content" > file
echo "content2" > file2

cat << EOF > .gitignore
file
EOF

git status

On branch master

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	.gitignore
	file2

nothing added to commit but untracked files present (use "git add" to track)


As a result, `file` does not appear in the `git status` output because `git` ignores it.
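
The other features of `.gitignore` files, namely wildcard patterns, comments, and nested files, can be sketched in the same way. The following minimal example assumes a fresh temporary repository; the file names and the `.scratch` extension are hypothetical. It uses `git check-ignore -v` to report which rule, if any, matches each path.

```shell
# Throwaway repository with a few files at the root and in a nested folder.
repo=$(mktemp -d)
cd "$repo"
git init --quiet
mkdir nested
touch app.log keep.txt notes.scratch nested/app.log nested/notes.scratch

# Rules in the root .gitignore apply to the whole repository.
cat << 'EOF' > .gitignore
# Lines starting with '#' are comments.
*.log
EOF

# Rules in a nested .gitignore apply only inside that folder.
cat << 'EOF' > nested/.gitignore
*.scratch
EOF

# Print the matching rule for every ignored path; paths that are not ignored are omitted.
git check-ignore -v app.log nested/app.log nested/notes.scratch notes.scratch keep.txt
```

Only the two `.log` files and `nested/notes.scratch` are reported. The root-level `notes.scratch` is not ignored because the `*.scratch` rule lives in the nested folder.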