## BASH and ZSH CLI Cheat Sheet

**Vertical bar**: The vertical bar `|` represents a pipe in Linux/UNIX systems. It is a form of redirection that sends the output of one program to the input of another program. For example, `cat file1 | wc` means that `cat` reads the contents of `file1` and then sends its output as input to the `wc` command. Any time there is a vertical bar, multiple commands are chained together, almost like function composition in programming.
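
As a rough sketch of a longer chain, the pipeline below counts the distinct lines in some input. The sample words are invented for illustration.

```shell
# Each command's stdout becomes the next command's stdin.
# printf emits four lines, sort groups duplicates together,
# uniq drops adjacent duplicates, wc -l counts what is left.
unique_count=$(printf 'pear\napple\npear\nfig\n' | sort | uniq | wc -l | tr -d ' ')
echo "$unique_count"   # prints 3
```

The trailing `tr -d ' '` is only there because some `wc` implementations pad their output with spaces.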

**Single dot**: A single dot, aka a period, in a path refers to the current directory. Double dots refer to the parent directory.

**cat**: The `cat` command refers to concatenation and is used for viewing text files in the terminal itself. Example usages below:

- `cat file1 file2`: Shows contents of file1 and file2 in order. Multiple files can be viewed this way.
- `cat -n file1`: Shows contents of file1 and numbers each line so you know which line a certain piece of text appeared on.
- `cat > newfile`: Creates a new file named 'newfile' (or overwrites it if it exists). The shell will wait for you to type the text that fills the file. Press Ctrl+D to finish; doing so immediately leaves the file empty.
- `cat file1 > file2`: Copies file1 to file2. file2 is overwritten if it exists, or created if it doesn't.
- `cat file1 file2 file3 > file4`: Copies file1, file2, file3 to file4 in order, creating file4 if it doesn't exist.
- `cat file1 >> file2`: Appends file1 to file2.
- `cat >> file1`: Appends anything typed in the terminal to file1.
- `cat < file2`: Provides file2 as input to the `cat` command, which will just read out file2.
- `cat file1 | more`: Shows as much content from file1 as fits in the terminal and, if it doesn't all fit, waits for the user to ask for more.

**source**: The `source` command in bash runs a file in the current shell instance instead of starting a new shell. So `source filename` will run the file in the current shell. A shorthand for this command is `. filename`. Notice the space between the period and the filename.

**touch**: The `touch` command is used to create files and change the access/modification timestamps of files. So `touch filename1 filename2` creates 2 **empty** files (or just updates the timestamps if the files already exist). `touch -c filename` updates the timestamps only if the file already exists; if it does not, no file is created. 

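**./**: Not really a command: `./` is a path prefix meaning "in the current directory". Running `./filename` executes the file as a new process (for a shell script, a new shell instance), unlike `source`, which runs it in the current shell. The file needs execute permission for this to work.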
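
The difference between running a script in a child shell and sourcing it can be sketched as follows. The script and the `GREETING` variable are hypothetical, and `sh "$script"` stands in for `./filename` so the example does not depend on execute permissions.

```shell
# Hypothetical script that only sets a variable.
script=$(mktemp)
echo 'GREETING=hello' > "$script"

unset GREETING
sh "$script"                     # runs in a child shell, as ./filename would
after_child=${GREETING:-unset}   # the current shell never sees the assignment

. "$script"                      # same as: source "$script" in bash
after_source=${GREETING:-unset}  # now the variable exists in the current shell

echo "$after_child $after_source"   # prints: unset hello
rm -f "$script"
```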

**> or <**: This isn't so much a command but a redirection operator. It controls the input and output of commands. So if I have a command called `foo` then `foo < filename` will provide the `filename` as the input to the `foo` command. Similarly, `foo > filename.txt` will redirect the output to be put into a text file called `filename.txt`. Using `foo >> filename.txt` will append the output of `foo` to `filename.txt`.

**>>**: As stated above, this redirects the output of a command to a file and appends the output to the file. 
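
A minimal sketch of the three operators working on a scratch file (the file comes from `mktemp`, so no real file is touched):

```shell
log=$(mktemp)                  # scratch file for the demo

echo "first"  > "$log"         # > creates or overwrites
echo "second" > "$log"         # overwrites again: only "second" remains
echo "third" >> "$log"         # >> appends

line_count=$(wc -l < "$log" | tr -d ' ')   # < feeds the file to wc's stdin
contents=$(cat "$log")
echo "$line_count"             # prints 2
rm -f "$log"
```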

**which**: Used to identify the location of the executable that runs when you type its name into the shell. So `which python` shows the location of the executable used to run the Python interpreter. Any script, program, or executable that lives in a directory listed in the `PATH` variable can have its location found by using `which`. You can list the location of more than one executable at a time: `which executable1 executable2`. 

**echo**: Used to display a string or whatever output of another command to the terminal. So `echo "Hello"` will literally display `Hello` in the terminal. You can use this in conjunction with other commands that return a string output. `echo` is mainly used in bash scripts to display info to the terminal. You can also list out specific files such as `echo *.txt`, which will display all .txt files. You can also display the PATH variable using `echo $PATH`. 

**printenv**: Used to display the environment variables of the current shell. Whenever a shell program starts up it gathers a bunch of information that the shell program (as well as its child processes) needs in order to run. The info is stored in a data structure called the environment. Environment variables are the info stored within this data structure that are defined for the current shell and inherited by any child shells or processes. Shell variables are data that is stored exclusively within the shell that created them; they are not inherited by child processes unless exported. 

**pwd**: Command that prints the full path of the current working directory to stdout. The `-P` option prints the physical (canonical) path, with every symlink resolved. The `-L` option prints the 'logical' path, which is the value of `$PWD` and may still contain symlinks if you used one to `cd` into the directory. In bash, `-L` is the default.
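
A small sketch of the two options, assuming a scratch directory with one symlink. The names `real` and `link` are made up.

```shell
base=$(cd "$(mktemp -d)" && pwd -P)   # scratch dir, its own symlinks resolved
mkdir "$base/real"
ln -s "$base/real" "$base/link"

cd "$base/link"
logical=$(pwd -L)    # keeps the symlink in the path: .../link
physical=$(pwd -P)   # resolves the symlink:          .../real
echo "$logical"
echo "$physical"
cd / && rm -rf "$base"
```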

**cp**: Copy command. Command structure is `cp [option] source destination`. So `cp remote.sh /local/bin` will copy the file `remote.sh` in the current directory to the `/local/bin` directory. `destination` can also be another file so then `cp` will overwrite the `destination` file with the `source` file. If the `destination` file doesn't exist, then a new file will be created with that filename.  

**~**: Known as a tilde expansion. This has several uses depending on the context and it is dependent on which shell you are running. All characters following the tilde up to the first unquoted slash are considered a 'tilde-prefix'. This 'tilde-prefix' will have different interpretations. The below apply to the bash shell.

- When used by itself in a path such as `~/` then this refers to the `$HOME` environment variable of the current user running the shell. So `~/foo` will be the path to the `foo` directory located in the current user's home directory.
- When followed by unquoted characters, then the shell interprets this as a login or username. So `~foo/` will be the path to the home directory of user `foo`. 
- `~+/` refers to the current working directory. It is equivalent to the output of `$PWD`. 
- `~-/` refers to the old working directory. It is equivalent to the output of `$OLDPWD`. 
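
The expansions above can be checked with a short bash sketch. It is bash-specific, since `~+` and `~-` are not available in every shell, and the scratch directory is only there to give `cd` somewhere to go.

```shell
scratch=$(mktemp -d)
cd /
cd "$scratch"                # now $PWD is the scratch dir and $OLDPWD is /

home_dir=~                   # same value as $HOME
current_dir=~+               # same value as $PWD
previous_dir=~-              # same value as $OLDPWD

pwd_value=$PWD
oldpwd_value=$OLDPWD
echo "$home_dir" "$current_dir" "$previous_dir"
cd / && rmdir "$scratch"
```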

**$HOME**: Current user's home directory. On Windows it defaults to `C:\Users\currentusername`. 

**$PATH**: Current path variable for running executables. For the cmder terminal, you can put any executables into the `cmder/bin` directory to have it load into the initial `$PATH` variable that gets created when the shell starts up. 

**$PWD**: Stands for "print working directory". This shell environment variable contains the path of the current working directory.

**$OLDPWD**: Stands for "old print working directory". This shell environment variable contains the directory you were in before the last `cd` command was executed. So if I am in directory `~/foo` and I `cd` to `~/foo/bar`, then `$OLDPWD` will contain `~/foo`, as it was the directory prior to the last `cd` command, and `$PWD` will contain `~/foo/bar`. 

**hash**: This command allows viewing, modifying, and resetting the shell's hash table of remembered command locations (see Command/PATH Hash Table below). The command `hash -r` will empty and remove all entries in the hash table. 

**rm**: The obvious remove/delete command for files and directories.

# Other Linux Stuff

## SCP
Secure Copy Protocol that is used to transfer files between machines. It runs over SSH, which uses port 22 by default. 

The general command structure for SCP is:

```
scp [option] [user_name@source_host:path/to/source/file] [user_name@target_host:target/path]
```

Here `user_name@source_host:path/to/source/file` is the source machine path. `user_name` is the log-in user name, `source_host` is the machine name, then a colon is used to indicate the path to use for that machine, then the actual `path/to/source/file`. The same goes for the target machine.

If using this command in PowerShell, you may have issues using `~` in the path to indicate the `$HOME` directory. 

## Command/PATH Hash Table

The command or path hash table is a hash table in which the shell remembers the full path of each command it has already looked up. The first time a command is run, the shell searches the directories listed in `$PATH` and stores the location it found; later runs use the stored location instead of searching `$PATH` again. This table is managed by the shell program. 
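
A bash-specific sketch of filling and clearing the table. `ls` is used only because it is a command that is almost certainly on `$PATH`.

```shell
hash -r                       # start with an empty table
hash ls                       # search $PATH once and remember where ls lives
cached_path=$(hash -t ls)     # print the remembered location
echo "$cached_path"

hash -r                       # forget everything again
if hash -t ls >/dev/null 2>&1; then still_cached=yes; else still_cached=no; fi
echo "$still_cached"          # prints no
```

Running `hash` with no arguments prints the whole table, which is handy after installing a program in a new location and wondering why the old one still runs.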


# FILENAME/DIRECTORY STRUCTURE INFORMATION

**~**: Represents the home directory in bash. For Windows systems, this usually defaults to `c:/users/username`.

**period in front of directory**: Indicates the directory is "hidden" from normal view. Also known as dot files or dot directories, this convention comes from Unix and Linux systems as a way to hide configuration files from the user.

**symbolic link/symlink/soft link**: A file that contains a reference to another file or directory in the form of an absolute or relative path. 

**canonical path**: This is the "true" path of the file or directory starting from the root directory. There is only one canonical path for every file or directory.

**absolute path**: This is the path starting from the root directory but it allows for path manipulations like up-level references such as `c:/foo/bar/../example.txt`. Here `example.txt` is located in the `foo` directory since the `..` sends it back to the parent directory. So an absolute path is like the canonical path except it allows path manipulations; consequently, the canonical path is just the shortest absolute path. So for the above example, it would read `c:/foo/example.txt`, in which it removes the additional path manipulations. So there can be multiple absolute paths but only one canonical path. 

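**ntpath**: For Windows systems, the path components are separated by a backslash. It often appears as a **double** backslash in source code because the backslash is an escape character in many string literals.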

**posix path**: Unix systems have a single forward slash

# GIT AND GITHUB NOTES

## GIT COMMAND SHEET

`git remote -v`: View the remotes (such as `origin`) associated with the current repository, along with their fetch and push URLs.

`git branch`: View all branches on the local repository. `-a` to view all branches including remote tracking branches. `-r` to view remote only. `-v` to view the branches + the SHA-1 hash + commit message. `-vv` to view the branches + SHA-1 + commit message + the remote tracking branch that the local branch is linked to. 

`git show HEAD`: Shows the commit that the HEAD pointer is currently pointing to.

`git show-ref`: Show the refs available on the local repository along with the commit IDs. By default it shows the tags, heads, and remote refs.

`git add *.ext`: This will add all files with the appropriate extension to the staging area. So `git add *.txt` will add all `.txt` files to the staging area. 

`git log --all --decorate --oneline --graph`: View the git commit history as a graph with the locations of the local and remote branch refs. 

`git ls-tree <tree-ish>`: View information about the files being tracked in the tree-ish rev. So `git ls-tree main` lists the files and directories located at the root level of the commit that the main branch ref points to. If you want to view all files, you can use `git ls-tree -r main` to recursively look into the subtrees. 

`git reflog`: A log of all the times a ref in the git commit history is updated. So it shows all the activities of the git refs and the commits they were associated with. So making a commit will update the ref of the current branch which will then be recorded in the reflog. 

## GIT TERMINOLOGY AND COMMANDS

**working tree / working directory**: This refers to the entire project directory of the repo. It is basically the directory of your project and all the directories and files inside it, since a project is just a giant tree structure with different files and directories branching off the root directory.

**git snapshot**: git stores information as a 'snapshot'. So for a repo, the snapshot consists of all the files in the repo along with the current state of those files. So when git loads a snapshot, it will load all the files listed in the snapshot (also removing any files not present in the snapshot) into the repo. 

**git repo**: A directory that has a .git folder inside of it so that git knows how to manage the files. 

**git command structure**: git commands follow the general structure of `git command --options path`. Command refers to the name of the command and options are the optional actions a command can perform. Options come after the command and have either `-` or `--` before the option name; you can have multiple options for a single command. Path is either the path to a directory or a file that the command should act on. Git documentation usually puts optional parts in brackets `[]` and placeholders that you fill in yourself in angled brackets `<>`. So a command such as `git rm --cached myfile.txt` will be presented in documentation roughly as `git rm [--cached] <pathspec>`. Some options take a value as an argument and the documentation will list them as `[--option=<value>]`.

**Repository**: The basic container for the project, usually in the form of a directory. Git will track any changes to the directory. It will not track changes made to the parent directory. When creating a repository, the initial branch has traditionally been named `master`; newer setups may use `main` instead. To set a current directory as a repository (aka create a repository), use `git init`.

**Origin**: A commonly accepted term used to indicate the "original" repository. In practice, `origin` is the default name git gives to the remote repository that you cloned from. It is a naming convention rather than a special keyword, so a remote can be given any other name.

**Commits**: Basically a save-point of the files inside a repository. When making a commit, you save the current state of all the tracked files. Making multiple commits builds up a commit history like a sequence of events. The stages of making a commit (save, stage, commit) are listed under BASIC STEPS FOR ADDING AND COMMITTING FILES.

**`<rev>`**: A `<rev>` in the git documentation refers to the name of a commit object or possibly a tree object. This could either be the alias of a git object like `main` or the full SHA-1 hash. You can add additional characters to the `<rev>` value such as ^ or ~ to indicate other revs relative to the specified rev. For example `<rev>~2` will indicate a rev that is the 2nd generation ancestor or grand-parent of the specified `<rev>`. So if we have a branch called `b1`, then `b1~2` indicates the grand-parent commit of the latest commit on the branch. [More information found online](https://git-scm.com/docs/gitrevisions)

**`<refname>`**: The alias for a ref, so something like a branch name like `main` or `branch1`. 

**`<tree-ish>`**: This refers to a rev such that if you follow the chain of pointers starting from rev, it will eventually wind up pointing to a tree object. So a `tree-ish` rev is any object, commit or tree, that ultimately down the line ends up pointing to a tree.

**`<commit-ish>`**: Like the tree-ish example, this is something that ultimately ends up pointing to a commit object.

**Colon or :** = This is typically used to separate a `<rev>` from the path that is used to navigate relative to that rev or within that rev. So `<tree-ish>:path` in the documentation means that we specify a tree-ish rev, which will end up pointing to a tree object, and the `path` part refers to the path used to navigate through the tree contained within the tree object. 

**`<refspec>`**: A refspec is a mapping from a source to a destination (target). It has the syntax `<src>:<dst>`. So if one wanted to `git push` the local master branch to the remote master branch, we could use: `git push origin master:master`.

The syntax of a refspec is as follows: `[optional +]<src>:<dst>`

`<src>` = a matching pattern for finding refs. For example, `refs/heads/*` will grab all refs in the directory `refs/heads` from the source, which can either be the remote server or another directory on your local machine. Here we are using the wildcard * for globbing all refs. 

`<dst>` = a matching pattern for the target/destination. So `refs/remotes/origin/*` will update all refs in the target/destination directory. 

`[optional +]` = An optional `+` sign in front of `<src>:<dst>`. This tells git to update the ref even if it isn't a fast-forward. 

Therefore, something like `git push origin b1:b5` will find the ref file for the branch named `b1` and push it to update the `b5` ref file, or `b5` branch, on the remote server. 
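
The `b1:b5` push can be tried without a real server. The sketch below uses a local bare repository as a stand-in for the remote; the directory names, file name, and identity values are all made up.

```shell
work=$(mktemp -d) && cd "$work"

git init -q --bare remote.git             # stand-in for the remote server
git init -q local && cd local
git config user.name "Example" && git config user.email "example@example.com"
git symbolic-ref HEAD refs/heads/b1       # make b1 the first branch
echo "hello" > file.txt
git add file.txt && git commit -q -m "first commit"

git remote add origin ../remote.git
git push -q origin b1:b5                  # <src> = local b1, <dst> = remote b5

local_b1=$(git rev-parse b1)
remote_b5=$(git --git-dir=../remote.git rev-parse refs/heads/b5)
echo "$local_b1 $remote_b5"               # the two hashes are identical
```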

**`<pathspec>`**: Similar to the refspec, this is a matching pattern used to find certain files or configure the search so that it starts from a certain directory or place. So `docs/*.jpg` for example, is a pathspec that will grab all jpg images from the docs directory. Meanwhile, `:(top)*.js` tells git to start matching for javascript files from the root of the repository rather than the current working directory. So this will find **all** javascript files in your repository.
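
A sketch of the difference between a plain pathspec and one with the `:(top)` magic, run from inside a subdirectory. The file names are hypothetical, and `git ls-files` is used here simply because it accepts pathspecs and prints what they match.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
mkdir docs
touch app.js docs/viewer.js docs/logo.jpg   # hypothetical files
git add .

cd docs
# Plain pathspec: matched relative to the current directory (docs/).
from_cwd=$(git ls-files '*.js' | wc -l | tr -d ' ')
# :(top) pathspec: matched from the root of the repository.
from_top=$(git ls-files ':(top)*.js' | wc -l | tr -d ' ')
echo "$from_cwd $from_top"
```

The single quotes keep the shell from expanding `*` or interpreting the parentheses before git sees the pattern.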

## BASIC STEPS FOR ADDING AND COMMITTING FILES

1. Save the file normally without involving git.
2. Staging: When a file is going to be committed, it needs to be added to a staging area. This is like a buffer area before we actually make the commit and lets us look over things before we commit. Here are some commands regarding the staging area:
  * `git status` = get info about what files are currently staged
  * `git add [filename]` = add the file to the staging area
  * `git add .` = add all new and modified files in the current directory and its subdirectories to the staging area
  * `git rm --cached [filename]` = remove the file from the staging area
3. Commit: Make a commit and save the current status of the files. Here are some commands for making a commit:
  * `git commit -m "[message]"` = commit the files in the staging area. The option `-m` means to add a message to describe the commit. It is standard to always add a message. Be specific in the message about the commit. When this command is executed, it prints a summary to the console: the number of files changed and the number of lines inserted and deleted. A deletion count doesn't necessarily mean any code was actually removed. Git counts a modified line as one deleted line plus one inserted line.
  * `git log` = display the full commit history of the repository.
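
The steps above can be walked through in a throwaway repository. This is only a sketch: the file name, commit message, and identity values are placeholders.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example" && git config user.email "example@example.com"

echo "draft" > notes.txt            # 1. save the file
git add notes.txt                   # 2. stage it
git rm -q --cached notes.txt        #    unstage it again; the file stays on disk
git add notes.txt                   #    stage it once more
git status --short                  #    prints: A  notes.txt
git commit -q -m "Add notes.txt"    # 3. commit what is staged

commit_count=$(git rev-list --count HEAD)
git log --oneline                   # one line per commit
```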


## GIT OBJECTS

Git is essentially a giant content-addressable key-value data store whose entries link to one another. When something is inserted into a git repo, git returns a unique key that is used to retrieve the stored item. These keys are also used as the pointers that link one stored item to another. Git stores these items as "objects". There are different types of objects:

  * commit = store information about the commit
  * tree = store information about a directory
  * blob = file content stored as raw binary data; blob is short for 'binary large object'
  * annotated tag = store tag information

Each object has a unique hash to identify it, made using SHA-1 (secure hash algorithm). The hash is generated using the contents of the object. So for a source code file, it literally uses the source code itself to generate a hash. This pretty much guarantees every object has a unique hash. For tree objects, which refer to directories, it uses the structure of the directory to calculate the hash.

The hashes are used as the "keys" to an "object" in this linked structure. They are also used as pointers to point to other objects. So the hash of a commit is used as a pointer to point to that commit. 

Now git organizes the object types in the following way:

```
commit
  |-- tree
        |-- blob
        |-- blob
        |-- tree
              |-- blob

```

The commit object has a hash that points to a tree object. That tree object has 3 hashes: 2 for blobs, 1 for another tree. That second tree object contains a hash for another blob object. Therefore, git simply stores everything in a giant graph structure. You can view the objects it currently stores in the .git/objects directory. The objects folder will list a bunch of folders, with each folder name only being 2 characters long. Specifically, when git generates a hash for an object, the hash will be 40 characters long. Git takes the first 2 characters as the name of the folder and uses the remaining 38 characters as the name of the file in that folder that holds the object.

Let's talk more about how git generates hashes for objects. For blob objects, it prefixes the object content with the word "blob", a space, the length of the content in bytes written as a decimal number, and a null character. The SHA-1 hash is then computed over this header plus the content.
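
The blob scheme can be reproduced by hand. The sketch below assumes `sha1sum` from GNU coreutils is available; `blob_hash` is a helper name invented for this example, not a git command.

```shell
# Hash a file the way git hashes a blob:
# "blob <size in bytes>\0" followed by the raw content.
blob_hash() {
  size=$(wc -c < "$1" | tr -d ' ')
  { printf 'blob %s\0' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}

empty_file=$(mktemp)                  # a zero-byte file
empty_hash=$(blob_hash "$empty_file")
echo "$empty_hash"                    # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
rm -f "$empty_file"
```

Running `git hash-object <file>` on the same file should print the same value, which is a quick way to check the helper against git itself.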

Tree hashes are different. Tree objects contain one line per file or subdirectory/subtree with each line having the following format:
`file-permission object-type object-hash file-name`
Example is: `100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 index.txt`

File permission here is 644 (the leading 100 marks the entry as a regular file), and permissions of 644 mean that the owner has read/write access to the file while others can only read. Permissions of 600 mean the owner can read/write but other people can't do anything. To generate the tree hash, git feeds all this information into the SHA-1 algorithm.

As for commits, their hashes are generated using the information contained in them:
  * parent commit hash, so each new commit points to the previous one. May or may not be present depending on commit history.
  * starting tree hash
  * author
  * committer
  * date (including time)
  * commit message
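
To see these fields in a real object, `git cat-file` can print a commit and the tree it points to. The sketch below builds a one-commit throwaway repository first; the file name and identity values are placeholders.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example" && git config user.email "example@example.com"
echo "hello" > index.txt
git add index.txt && git commit -q -m "first commit"

commit_hash=$(git rev-parse HEAD)
git cat-file -p "$commit_hash"            # tree, author, committer, message
git cat-file -p "$commit_hash^{tree}"     # one line per entry in the root tree

commit_type=$(git cat-file -t "$commit_hash")        # commit
tree_type=$(git cat-file -t "$commit_hash^{tree}")   # tree

# Loose objects live at .git/objects/<first 2 chars>/<remaining chars>.
dir_part=$(echo "$commit_hash" | cut -c1-2)
file_part=$(echo "$commit_hash" | cut -c3-)
object_path=".git/objects/$dir_part/$file_part"
ls "$object_path"
```

A first commit has no `parent` line, which matches the note that the parent hash may or may not be present.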

## GIT BRANCHES AND REMOTE BRANCHES

### MASTER BRANCH IS NOW CALLED MAIN BRANCH ON GITHUB

A branch is simply a movable pointer to a commit within a repo; basically the hash of the commit is used as the pointer. Each time a commit is made, the current branch moves forward to point to the new commit. Branch info is stored in the .git/refs/heads directory. There are also files located within .git called "HEAD", "ORIG_HEAD", and "FETCH_HEAD", and possibly other "HEAD" files.

  * "HEAD" is a file that holds a pointer to the current branch you are on. It doesn't actually use a hash as a pointer but rather references the file located in .git/refs/heads that contains the pointer. So "HEAD" is basically a pointer to a pointer; it points to the current branch which is also a pointer.
  * "ORIG_HEAD" records where "HEAD" was pointing before an operation that moves it drastically, such as `git reset`, `git merge`, or `git rebase`.
  * "FETCH_HEAD" records the tip of the remote branch retrieved by the most recent `git fetch`. So it locally records where the remote branch had advanced to at that time.

To create a new branch, use the command `git branch <branch-name>`. This creates a new file in the .git/refs/heads directory whose filename is the name of the branch and the contents of the file is the hash of the commit that the previous branch was pointing to when the new branch was created. However this command won't move you to the new branch after the branch is created. To move onto the new branch use: `git checkout <branch-name>`. Now the "HEAD" file will point to the new branch we created.
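
The files described here can be inspected directly. This sketch assumes git's default file-based ref storage and uses `feature` as a hypothetical branch name.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example" && git config user.email "example@example.com"
echo "hello" > file.txt
git add file.txt && git commit -q -m "first commit"

git branch feature                          # creates .git/refs/heads/feature
branch_file=$(cat .git/refs/heads/feature)  # the ref file holds a commit hash
head_commit=$(git rev-parse HEAD)

git checkout -q feature                     # HEAD now names the new branch
head_file=$(cat .git/HEAD)
echo "$branch_file"
echo "$head_file"                           # prints: ref: refs/heads/feature
```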

Now what happens if we make a commit after we made a new branch and switched to it? Well, the new branch will move its pointer to point to the new commit but the previous branch will not move to point to the new commit. Diagrammatically it can look like:

```
//before new commit
                      master
                        |
commit1 <- commit2 <- commit3
                        |
                      new branch
                          |
                        HEAD

//after new commit
                      master
                        |
commit1 <- commit2 <- commit3 <- commit4
                                    |
                                  new branch
                                      |
                                    HEAD

```

So the previous branch, master in this case, will remain stationary but the new branch will move. If we move back to the master branch, then the "HEAD" file will point to the master branch which in turn points back to a previous commit.

We can also make a new commit while working back in the old branch. Any new commits made there will have the old branch move to the new commit as expected. So in the above example, we can make a new commit in the master branch. This will add a new commit that will point to commit3 in addition to having commit4 point to it. This means the commit history will look like:

```
//after new commit on master branch

                            HEAD
                              |
                            master
                              |
                          new commit5
                          /
commit1 <- commit2 <- commit3 <- commit4
                                    |
                                  new branch

```

Now what happens if we want to reset back to an old commit? We simply move the branch pointer back to the previous commit. We can do this with the command `git reset <commit HASH>`. Instead of using the hash to explicitly specify the commit we move to, we can do a relative move: `git reset HEAD~<num>`. So `git reset HEAD~2` means to move back 2 commits relative to the commit we are currently on. So in the above example, if we were on commit4 in the new branch, then `git reset HEAD~2` moves us back to commit2. Note that both `~` and `^` move backwards through history; neither moves forward. `HEAD~2` follows the first parent twice (the grandparent), while `HEAD^2` means the second parent of a merge commit. For a commit with a single parent, `HEAD^` and `HEAD~1` are the same commit.
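
A sketch of a relative reset in a throwaway repository with four commits, using `HEAD` as the starting rev. Commit messages and identity values are placeholders.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example" && git config user.email "example@example.com"
for n in 1 2 3 4; do
  echo "$n" > file.txt
  git add file.txt && git commit -q -m "commit$n"
done

parent_tilde=$(git rev-parse 'HEAD~1')
parent_caret=$(git rev-parse 'HEAD^')     # same commit as HEAD~1

git reset -q 'HEAD~2'                     # branch now points at commit2
subject=$(git log -1 --format=%s)
echo "$subject"                           # prints commit2
```

A plain `git reset` leaves the working directory alone, so `file.txt` still contains the text from commit4 and shows up as modified.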

What if we just want to view a previous commit, but not reset the code back? We use `git checkout <branch>` or `git checkout <commit>`. This simply moves the "HEAD" pointer around to the specified branch or commit. Checking out a commit directly puts you in what is known as a detached HEAD state. Normally, the HEAD pointer points to the head of a branch (which in turn points to the last commit in the branch) but by using `git checkout <commit>`, you can force it to point to a specific commit. This means it no longer necessarily points to the latest commit in a branch. To get back to a non-detached HEAD state, simply check out a branch. So `git checkout <branch>` will simply move the HEAD pointer back to the ref (pointer) that points to the last commit in the branch you check out. 

In the detached HEAD state, you are allowed to make changes and make new commits. These new commits will be based off of the commit you originally checked out with the HEAD in the detached state. These commits do not belong to any branch. That means they are in a sort of limbo state in which they can disappear when no ref points to them. If the commits you make in the detached HEAD state are not needed anymore, simply check out another branch to move the HEAD pointer somewhere else. This pseudo-branch will eventually be garbage collected since no ref or pointer points to its last commit. If you want to keep the changes you made in this experimental branch, create a new branch before leaving it with `git checkout -b <branch-name>`. This will create a new branch head pointer that points to the last commit in the experimental branch and move the HEAD pointer to point at the new branch head pointer. (Plain `git branch <branch-name>` also creates the pointer, but it leaves HEAD detached.) 
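
A sketch of the whole round trip: detach HEAD, commit, then keep the commit by creating a branch. The branch name `rescue` and the other names are hypothetical.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Example" && git config user.email "example@example.com"
echo "1" > file.txt && git add file.txt && git commit -q -m "commit1"
echo "2" > file.txt && git add file.txt && git commit -q -m "commit2"

git checkout -q 'HEAD~1'                  # detach HEAD at commit1
if git symbolic-ref -q HEAD >/dev/null; then state=attached; else state=detached; fi

echo "experiment" > file.txt
git commit -q -a -m "experimental commit" # reachable only through HEAD for now
experiment=$(git rev-parse HEAD)

git checkout -q -b rescue                 # new branch keeps the commit alive
rescued=$(git rev-parse rescue)
current_branch=$(git symbolic-ref --short HEAD)
echo "$state $current_branch"             # prints: detached rescue
```

`git symbolic-ref -q HEAD` fails when HEAD holds a commit hash instead of a branch reference, which is what the `state` check relies on.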

The difference between `git checkout` and `git reset` is that `git checkout` moves the HEAD pointer around while `git reset` moves the branch head pointer around in addition to the HEAD pointer. So `git reset` on a branch called b1 will move the b1 pointer somewhere else. 


## GIT REMOTE TRACKING BRANCHES

As for remote branches, the branches in the remote repo, git needs to do a bit of coordination between those branches and the current branches on your local computer. Basically, when cloning a repository from a remote server, your local repository will be setup such that it has two `main` pointers. One is the `main` pointer for your local repository, while the other is `origin/main` which is used to point to the commit that the `main` pointer on the remote server is pointing to. This `origin/main` pointer is a remote tracking branch as it tracks and follows the movement of the `main` pointer on the remote repo. You cannot manually edit or move the position of the `origin/main` pointer. 

So if you clone a git repo, your local repository will look something like:

```
//cloned a remote repo, this is your local repo
                      main
                        |
commit1 <- commit2 <- commit3
                        |
                     origin/main

```

So there are 'two' `main` pointers. Now when working on your local repository, the remote repo may receive updates different from your repo. So the remote repo and your local repo may look like:

```
//the structure of the remote repo

                                                      main
                                                       |
commit1 <- commit2 <- commit3 <- remotecommit4 <- remotecommit5



//your local repo
                                                  main
                                                    |
commit1 <- commit2 <- commit3 <- localcommit4 <-localcommit5
                        |
                     origin/main


```

So now you can see that your local `origin/main` pointer no longer matches with the `main` pointer on the remote repo. To make it match up, you can call `git fetch` which will adjust your local repo so that it looks like the following:

```
//the structure of your local repo after calling git fetch

                                                   origin/main
                                                       |
commit1 <- commit2 <- commit3 <- remotecommit4 <- remotecommit5
                        \
                        localcommit4 <- localcommit5
                                             |
                                            main

```

Note that after calling `git fetch` and having this split in commit history, the new remote commits are not yet part of any of your local branches. You cannot commit directly onto `origin/main`, and checking it out only puts you in a detached HEAD state, because it is a remote tracking pointer rather than a local branch. In order to incorporate these changes, you can use `git merge origin/main` to actually merge the branches. 

Or, you can create a local branch that is based off of the remote tracking branch pointer. Let's say that there was an additional branch created on the remote repo:


```
//the structure of your local repo after calling git fetch with additional branch on the remote repo


                               origin/remotebranch1
                                        |
                                  remotecommit6
                                       /            origin/main
                                      /                 |
commit1 <- commit2 <- commit3 <- remotecommit4 <- remotecommit5
                        \
                        localcommit4 <- localcommit5
                                             |
                                            main

```

So now we have an additional remote tracking branch called `origin/remotebranch1`. We can now call `git checkout -b localbranch1 origin/remotebranch1` which will create a new local branch called `localbranch1` that points to the commit that `origin/remotebranch1` points to. This new local branch is called a **tracking branch**. Local tracking branches have a direct relationship with the remote branches they are tracking. So calling `git pull` on one of these branches means that git knows exactly which branch to pull from on the remote repo. You can change which remote branch a local branch is tracking by first checking out the local branch then calling `git branch -u origin/<branch-name>`. You can list the tracking branches with `git branch -vv`. 
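
A sketch of creating a tracking branch, using a local bare repository as a stand-in for the remote server. The `seed` and `local` directory names and the identity values are assumptions made for the example.

```shell
work=$(mktemp -d) && cd "$work"
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main

# Seed the stand-in remote with main and remotebranch1.
git init -q seed && cd seed
git config user.name "Example" && git config user.email "example@example.com"
git symbolic-ref HEAD refs/heads/main
echo "hello" > file.txt && git add file.txt && git commit -q -m "commit1"
git branch remotebranch1
git push -q ../remote.git main remotebranch1
cd ..

# Clone it and create a local branch from the remote tracking branch.
git clone -q remote.git local && cd local
git checkout -q -b localbranch1 origin/remotebranch1
upstream=$(git rev-parse --abbrev-ref 'localbranch1@{upstream}')
echo "$upstream"                          # prints origin/remotebranch1
git branch -vv                            # shows the upstream in brackets
```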

Deleting a remote branch can be done by pushing onto the specific branch with `git push origin --delete <branch-name>`. All this does is remove the branch head pointer so that the chain of commits the pointer was associated with eventually gets garbage collected. 

## NAVIGATING COMMITS WITH GIT CHECKOUT AND RESET

**working directory here does not mean the current directory, it refers to the entire project directory/repo**

The git index is the staging area of git. It serves as the space between the actual files in your working directory (the project folder/repository you actually see on your computer) and what is recorded in your commit history. Running `git add` copies the current state of a file from the working directory into the index, and running `git commit` records whatever is in the index as a new commit in the commit history. When you run `git checkout`, git goes to the commit history, locates the files associated with the commit, and then populates the index and your working directory with the files from that commit. So the index can be compared in two directions: against the working directory (changes not yet staged) and against the last commit (changes staged but not yet committed). 

So when we run `git add` followed by `git commit`, what happens is the following:

1. A file is edited in the current working directory. 
2. Git detects that there are changes in the current working directory that do not show up in the git index. 
3. The file is added to the staging area with `git add`.
4. The git index and current working directory now agree on the current status of the files.
5. The git index and the HEAD pointer do not agree however. HEAD points to the last commit made on the branch, but the index now holds the snapshot for a new potential commit (whose tree hash differs from that of the last commit), so the index and HEAD differ.
6. The staged changes are committed and added to the commit history with `git commit`. The HEAD pointer now points to this new commit, so HEAD and the git index agree again: the tree hash recorded in the latest commit (that HEAD points to) matches the snapshot in the index.

Remember, git identifies things by their SHA-1 hash. So if they have the same hash, which pretty much requires the items to be identical, then everything is the same. 
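The steps above can be observed with `git diff` (working directory vs. index) and `git diff --cached` (index vs. HEAD). This is a sketch in a throwaway repository; the file name and identity are placeholders.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main          # name the initial branch "main"
git config user.email "you@example.com"        # placeholder identity
git config user.name "Example User"
echo "one" > notes.txt
git add notes.txt
git commit -q -m "commit1"

# Steps 1-2: edit a file; the working directory now differs from the index.
echo "two" >> notes.txt
if git diff --quiet; then edit_worktree="same"; else edit_worktree="differ"; fi

# Steps 3-5: stage it; the index matches the working directory but not HEAD.
git add notes.txt
if git diff --quiet; then add_worktree="same"; else add_worktree="differ"; fi
if git diff --cached --quiet; then add_head="same"; else add_head="differ"; fi

# Step 6: commit; HEAD, index, and working directory all agree again.
git commit -q -m "commit2"
if git diff --cached --quiet; then commit_head="same"; else commit_head="differ"; fi

echo "after edit:   worktree vs index: $edit_worktree"
echo "after add:    worktree vs index: $add_worktree, index vs HEAD: $add_head"
echo "after commit: index vs HEAD: $commit_head"
```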

When you run `git checkout`, then what happens is that the HEAD pointer now points to a new branch ref, then uses the commit that the branch ref points to in order to populate the git index with the files listed in the commit and then populates the current working directory (your repo) with the files listed in the git index. 

Now when running `git reset`, several things will happen:

1. git will move the branch head to point at the commit specified. This also means that the HEAD pointer will now point to a different commit since it points to the branch head which in turn points to a new commit. If you run `git reset --soft <commit>`, then this is all the reset operation will do. 

This also means that the HEAD pointer now points to a different commit than the one the git index was populated from. The git index still holds the file information from the old commit, and it still matches the current working directory. So only the HEAD pointer has changed; the files listed in the git index and the working directory are the same as before and match each other.

2. The second step is to update the git index with the files listed in the commit that the HEAD is pointing to now. This is the default behavior of `git reset`; passing `--mixed` explicitly does the same thing. This means that the git index has been reset to the new commit. So you have effectively 'unstaged' all your changes, which is like undoing both `git add` and `git commit`.

Note that no actual files in your working directory have been changed yet. So nothing has been deleted or added yet. 

3. The third step is optional and only runs if you call `git reset --hard`. Then git will use the files located in the git index to update the working directory. This involves actually deleting/adding files in your working directory. Note that the loss is only permanent for changes that were never committed. If you committed the changes, you can undo the reset by moving back to the original commit, since the state of the files is contained in that commit. But if you did not, you lose the uncommitted changes, as there is no commit to recover them from.
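A sketch comparing the three reset modes on a repository with two commits. `git status --porcelain` shows a staged change as `M ` in the first column and an unstaged change as ` M` in the second; the file name and identity are placeholders.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main          # name the initial branch "main"
git config user.email "you@example.com"        # placeholder identity
git config user.name "Example User"
echo "v1" > file.txt && git add file.txt && git commit -q -m "commit1"
echo "v2" > file.txt && git add file.txt && git commit -q -m "commit2"
commit2=$(git rev-parse HEAD)

# --soft: only the branch head moves; index and working directory keep v2.
git reset -q --soft HEAD~1
soft_status=$(git status --porcelain)
git reset -q --soft "$commit2"                 # move back to commit2

# --mixed (default): branch head and index move; working directory keeps v2.
git reset -q --mixed HEAD~1
mixed_status=$(git status --porcelain)
mixed_content=$(cat file.txt)
git reset -q --hard "$commit2"                 # move back to commit2

# --hard: branch head, index, and working directory all move to commit1.
git reset -q --hard HEAD~1
hard_status=$(git status --porcelain)
hard_content=$(cat file.txt)

echo "soft:  [$soft_status]"
echo "mixed: [$mixed_status] file contains $mixed_content"
echo "hard:  [$hard_status] file contains $hard_content"
```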

You can modify the git index independently of the HEAD pointer. Since the HEAD pointer has to point to a full commit, it can't point to individual files within a commit, but the index can be updated one file at a time. You do this by supplying a path to the file relative to the root of your working directory. So `git reset <commit-hash> <path-to-file>` will update the git index so that the specific file in the index matches the state of the file in the commit specified by `<commit-hash>`. The working directory and HEAD are not touched by this command. Note that `--hard` cannot be combined with a path; to also overwrite the file in the working directory, use `git checkout <commit-hash> -- <path-to-file>` instead.
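A small sketch of resetting a single path. It uses `git show :file.txt` to print the version of the file stored in the index; the file name and identity are placeholders.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main          # name the initial branch "main"
git config user.email "you@example.com"        # placeholder identity
git config user.name "Example User"
echo "v1" > file.txt && git add file.txt && git commit -q -m "commit1"
echo "v2" > file.txt && git add file.txt && git commit -q -m "commit2"
head_before=$(git rev-parse HEAD)

# Reset only file.txt in the index to its state in the previous commit.
git reset -q HEAD~1 -- file.txt

head_after=$(git rev-parse HEAD)
index_content=$(git show :file.txt)            # version stored in the index
worktree_content=$(cat file.txt)               # version on disk
echo "index: $index_content, working directory: $worktree_content"
```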

This adjustment of the HEAD pointer and the git index allows you to skip over commits. 

```
//current state of the repo

                    main,HEAD
                        |
commit1 <- commit2 <- commit3


main,HEAD -> commit3
git index -> commit3
working directory -> commit3

```

```
//after calling: git reset --soft HEAD~2

  main,HEAD
   |                     
commit1 <- commit2 <- commit3

main -> commit1
git index -> commit3
working directory -> commit3

```

Next run `git commit`. This will take the information listed in the index, create a new commit, have the new commit point to the commit that the branch head currently points to, then move the branch head and HEAD to point to the new commit.

```
//after calling: git commit

  main,HEAD
     |
  commit4
   /                     
commit1 <- commit2 <- commit3

main -> commit4
git index -> commit4 
working directory -> commit4

```
In this case, the contents of commit4 are the same as commit3, but its parent is commit1. commit2 and commit3 are technically no longer part of the commit history of `main`, since no branch or other commit points to them.
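The diagrams above can be reproduced as follows. This is a sketch; `commit_file` is a hypothetical helper defined here only to keep the setup short, and the identity is a placeholder.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main          # name the initial branch "main"
git config user.email "you@example.com"        # placeholder identity
git config user.name "Example User"

# Hypothetical helper: append a line to a file and commit it.
commit_file() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$3"; }

commit_file file.txt "one" "commit1"
commit_file file.txt "two" "commit2"
commit_file file.txt "three" "commit3"
commit1=$(git rev-parse HEAD~2)
commit3_tree=$(git rev-parse 'HEAD^{tree}')

# Move main back to commit1 but keep the index and working directory at commit3.
git reset -q --soft HEAD~2
# Commit the index: commit4 has commit3's contents and commit1 as its parent.
git commit -q -m "commit4"

commit4_parent=$(git rev-parse HEAD~1)
commit4_tree=$(git rev-parse 'HEAD^{tree}')
history_length=$(git rev-list --count HEAD)
git log --oneline
```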

Running `git checkout <branch>` is similar to `git reset --hard` in that it updates the HEAD, git index, and working directory, but there are important differences. First, it does not throw away uncommitted changes. `git reset --hard` simply overwrites everything to match the target commit. `git checkout` carries your uncommitted changes over to the branch you switch to, and refuses to switch if doing so would overwrite them. In addition, `git checkout` only moves the HEAD pointer and not the branch head that the HEAD pointer points to. This is also how you get the detached HEAD state: checking out a commit hash (or tag) instead of a branch makes HEAD point directly at that commit.

`git checkout <branch> <path-to-file>` will do something different. Here the HEAD pointer will not move, but the git index will be updated so that the specific file in the index matches the file in the commit that `<branch>` points to. The working directory will also be updated to reflect the changes to the file, so this will delete/add things in the file that cannot be undone if they were never committed.
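A sketch of checking out a single file from another branch. The branch name `feature`, the file name, and the identity are placeholders.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main          # name the initial branch "main"
git config user.email "you@example.com"        # placeholder identity
git config user.name "Example User"
echo "main version" > file.txt && git add file.txt && git commit -q -m "commit1"
git checkout -q -b feature
echo "feature version" > file.txt && git commit -q -a -m "feature change"
git checkout -q main

# Take file.txt from the feature branch; HEAD stays on main.
git checkout feature -- file.txt 2>/dev/null

current_branch=$(git symbolic-ref --short HEAD)
content=$(cat file.txt)
status=$(git status --porcelain)               # the file shows up as staged
echo "on $current_branch, file.txt contains: $content, status: [$status]"
```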

## GIT REFS

A ref is just a file containing a commit hash. So it is a shorthand way of referring to a specific commit; basically it is a pointer to a commit. This is used so that each branch pointer knows which commit to point to. Located in .git/refs, we can see various ref files. As discussed in the previous section, the refs/heads directory contains a bunch of files detailing the different branches in the repo; the filename is the name of the branch. But those aren't the only refs. There are different kinds of refs:

  * packed refs: a compression of a bunch of refs into a single file. It is basically a line by line file with each line containing the directory location of the ref and the hash the ref contained.
  * special refs: like the "HEAD", and "FETCH_HEAD" files. The "HEAD" file points to the ref file that contains the commit hash. So it points to a pointer. 

When specifying a ref you can either use the short name, like the name of the branch, or the full name which is basically the full directory path such as refs/heads/master.
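A sketch that looks at the ref files directly. It assumes git's default on-disk ref storage and a freshly created branch whose ref has not been packed yet; the identity is a placeholder.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main          # name the initial branch "main"
git config user.email "you@example.com"        # placeholder identity
git config user.name "Example User"
echo "hello" > file.txt && git add file.txt && git commit -q -m "commit1"

# HEAD is a pointer to a pointer: it names the branch ref file.
cat .git/HEAD                                  # ref: refs/heads/main
head_target=$(git symbolic-ref HEAD)

# The branch ref file holds the commit hash.
stored_hash=$(cat .git/refs/heads/main)

# The short name and the full name resolve to the same commit.
short_name_hash=$(git rev-parse main)
full_name_hash=$(git rev-parse refs/heads/main)

# List every ref in the repository with the hash it contains.
git show-ref
```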

Remote refs (the remote-tracking refs stored under refs/remotes) are local files that contain commit hashes for the commits on the remote repository server. So when you do something like `git push`, you will be updating the branch ref on the server with a new commit hash as well as pushing any new commit, tree, and blob objects onto the remote server; your local remote ref is then updated to match. In addition, when you do something like `git fetch`, you are updating your local repository's remote refs with the information contained on the server.


## CONNECTING TO GITHUB: PUSH/PULL/FETCH/CLONE/FORK AND OTHER THINGS ABOUT WORKING WITH THE REMOTE REPOSITORY

GitHub hosts a copy of the repository in the cloud, referred to as the remote repo. To get files from and put files into the remote repository, there are several commands that will be useful:

1. `git clone <git-url> [<dir>]`: This command allows you to copy everything from the remote repository to your computer. If `<dir>` is given, the repository is cloned into that directory; if it is omitted, a new folder is created in the current directory whose name is the same as the name of the repository you are copying. You cannot clone into an existing directory if it is not empty. Here are some useful options with this command:
  *

2. `git push <repo> [<refspec>]`: Command to update the remote refs using local refs and sending over any necessary objects to complete the change.

3. `git fetch [<repo> [<refspec>]]`: Update the local repo with the refs from the remote repo along with any objects needed to complete the ref/commit histories. So this command will download the commit, tree, and blob objects needed to update your local repo with the information contained in the remote repo. Remote tracking branches are also updated. Look online for specific options to use with this command. This command only updates the refs and objects needed; it does not merge the commits with the commits in your local branches.

Note that you cannot update the current branch ref that the HEAD pointer is pointing to with `git fetch`. So if you are on the main branch, and you run `git fetch origin main:main`, that is, use the remote main ref to update the local main ref, it is not allowed since HEAD is currently pointing at the local main ref. You can however update other refs that the HEAD pointer is not pointing at. 
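A sketch of this restriction, again using a local bare repository as a stand-in for the remote. The branch name `mirror`, the paths, and the identity are placeholders.

```shell
set -eu
work=$(mktemp -d)
# A bare repository stands in for the remote server (hypothetical local path).
git init -q --bare "$work/remote.git"
git clone -q "$work/remote.git" "$work/local" 2>/dev/null
cd "$work/local"
git symbolic-ref HEAD refs/heads/main          # name the initial branch "main"
git config user.email "you@example.com"        # placeholder identity
git config user.name "Example User"
echo "one" > file.txt && git add file.txt && git commit -q -m "commit1"
git push -q origin main

# HEAD points at main, so fetching into main is refused.
if git fetch -q origin main:main 2>/dev/null; then
  current_branch_fetch="allowed"
else
  current_branch_fetch="refused"
fi

# Fetching into a ref that HEAD does not point at works.
git fetch -q origin main:mirror

echo "fetch into checked-out branch: $current_branch_fetch"
git branch
```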

4. `git merge`: Join two branches. This typically involves creating a new commit that points back to the two branches that were merged. So if you are on one branch `other` and want to merge into the `main` branch, then you first checkout the `main` branch and then execute `git merge other`. This will create a new commit that then points back to the previous two branches and their commits. 

```
//current branch tree
                      master
                        |
commit1 <- commit2 <- commit3 
              \
               commit4 <- commit5
                              |
                            other

//after merging 'other' with 'master' by calling: git merge other

                                      master
                                        |
commit1 <- commit2 <- commit3 <----- commit6
              \                   /        
               commit4 <- commit5
                              |
                            other


```

So in the above example, a new commit6 is made that then points to commit3 and commit5 and master now points to commit6. 
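The merge in the diagram can be reproduced with the sketch below. It uses `main` where the diagram says `master`; `commit_file` is a hypothetical helper and the identity is a placeholder.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main          # name the initial branch "main"
git config user.email "you@example.com"        # placeholder identity
git config user.name "Example User"

# Hypothetical helper: append a line to a file and commit it.
commit_file() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$3"; }

commit_file a.txt "1" "commit1"
commit_file a.txt "2" "commit2"
git checkout -q -b other
commit_file b.txt "x" "commit4"
commit_file b.txt "y" "commit5"
commit5=$(git rev-parse HEAD)
git checkout -q main
commit_file a.txt "3" "commit3"
commit3=$(git rev-parse HEAD)

# Merge 'other' into 'main'; -m supplies the message so no editor opens.
git merge -q -m "commit6: merge other" other

first_parent=$(git rev-parse HEAD^1)           # the branch we were on
second_parent=$(git rev-parse HEAD^2)          # the branch merged in
git log --oneline --graph
```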

Not all merges will go smoothly. Specifically, if the two branches to be merged contain different things in the same place in a file, then git will not know which version to pick after the merge. So if line 34 of file1 reads "thing" in the master branch but the same line 34 reads "thinggs" in the other branch, then git cannot decide which to pick. Conflicts need to be resolved manually. You can view the status of the merge conflict with `git status`, which will show that the merge process is paused.

Git will automatically add conflict markers into the files that have conflicts. Specifically, they are written into the copies of the files in your working directory, on the branch you are currently on. For example, a conflict marker block looks like:

```
<<<<<<< HEAD:index.html
<div id="footer">contact : email.support@github.com</div>
=======
<div id="footer">
 please contact us at support@github.com
</div>
>>>>>>> iss53:index.html
```

Everything between the `<<<<<<<` line and the `=======` line is what is present in the current branch, and everything between `=======` and `>>>>>>>` is what is in the branch being merged into the current branch. To resolve the conflict, replace the ENTIRE block, the `<<<<<<<`, `=======`, and `>>>>>>>` lines included, with the content you want to keep. For example, a resolution might be to replace the entire block with:

```
<div id="footer">
please contact us at email.support@github.com
</div>
```

After replacement, save the file and add it to the staging area to let git know that the conflict is resolved. Once all file conflicts have been resolved and added to the staging area, you can execute `git commit` to commit the files and finish the merge. 
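A sketch of the full conflict cycle: trigger a conflict, inspect it, resolve it, and finish the merge. The file contents and identity are placeholders, and the resolution chosen here (keeping the current branch's line) is just one possibility.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main          # name the initial branch "main"
git config user.email "you@example.com"        # placeholder identity
git config user.name "Example User"
echo "thin" > file1 && git add file1 && git commit -q -m "base"
git checkout -q -b other
echo "thinggs" > file1 && git commit -q -a -m "change on other"
git checkout -q main
echo "thing" > file1 && git commit -q -a -m "change on main"

# Both branches changed the same line, so the merge stops with a conflict.
if git merge -m "merge other" other >/dev/null 2>&1; then
  merge_result="clean"
else
  merge_result="conflict"
fi
marker_count=$(grep -c '^<<<<<<<' file1)       # conflict markers are in the file
git status --short                             # file1 is listed as unmerged

# Resolve by replacing the whole marked block, then stage and commit.
echo "thing" > file1
git add file1
git commit -q -m "merge other, keeping 'thing'"

parent_count=$(git rev-list --parents -n 1 HEAD | wc -w)   # hash + 2 parents
echo "merge was: $merge_result, final content: $(cat file1)"
```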

**git fast forwarding**: When the target branch (the one you are on) has no commits of its own since the source branch split off, that is, the target branch head can be reached by simply following the source branch's commit history backwards, git does not create a merge commit. It simply moves the target branch head forward to match the source branch.

```
//current branch tree
                      master
                        |
commit1 <- commit2 <- commit3
                        |
                      other


//after a few commits
                      master
                        |
commit1 <- commit2 <- commit3 <- commit4 <- commit5
                                               |
                                              other



//after doing a fast forward merge of the 'other' and 'master' branches
                                             master
                                               |
commit1 <- commit2 <- commit3 <- commit4 <- commit5
                                               |
                                              other

```
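A sketch of the fast-forward case in the diagram, with `main` in place of `master`. The `--ff-only` flag makes git refuse the merge if a fast forward is not possible; `commit_file` is a hypothetical helper and the identity is a placeholder.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main          # name the initial branch "main"
git config user.email "you@example.com"        # placeholder identity
git config user.name "Example User"

# Hypothetical helper: append a line to a file and commit it.
commit_file() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$3"; }

commit_file a.txt "1" "commit1"
commit_file a.txt "2" "commit2"
commit_file a.txt "3" "commit3"
git checkout -q -b other
commit_file a.txt "4" "commit4"
commit_file a.txt "5" "commit5"
git checkout -q main

# main has no commits of its own since 'other' branched off: fast forward.
git merge -q --ff-only other

main_tip=$(git rev-parse main)
other_tip=$(git rev-parse other)
parent_count=$(git rev-list --parents -n 1 HEAD | wc -w)   # hash + 1 parent
git log --oneline
```

No merge commit is created: both branch heads end up on commit5, which still has a single parent.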

You can undo a `git merge` in several ways. One way is via a `git reset --hard` call. Say you merged branch b1 into branch b2, which resulted in a merge commit that points to the previous two branch heads. To undo the merge, you can reset the branch head of b2. So first check out branch b2, then call `git reset --hard HEAD~`. This moves the b2 branch head back to the merge commit's first parent (where b2 was before the merge), along with the HEAD pointer, and updates the git index and the working tree. The merge commit is then left over as an unused commit that will eventually be garbage collected.
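A sketch of undoing a merge with `git reset --hard HEAD~`, using the b1 and b2 branch names from the paragraph above. File names and identity are placeholders.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/b2            # name the initial branch "b2"
git config user.email "you@example.com"        # placeholder identity
git config user.name "Example User"
echo "base" > base.txt && git add base.txt && git commit -q -m "base"
git checkout -q -b b1
echo "from b1" > b1.txt && git add b1.txt && git commit -q -m "work on b1"
git checkout -q b2
echo "from b2" > b2.txt && git add b2.txt && git commit -q -m "work on b2"
before_merge=$(git rev-parse HEAD)

# Merge b1 into b2, which creates a merge commit.
git merge -q -m "merge b1 into b2" b1

# Undo: move b2 back to the merge commit's first parent.
git reset -q --hard HEAD~
after_reset=$(git rev-parse HEAD)
ls
```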

Another is to use `git revert`. This command creates a new commit that undoes the changes introduced by an earlier commit. The merge commit will still be in the commit history, but its changes are effectively 'overwritten' by this new commit. A merge commit remembers its parents and their order. The 'mainline' parent (the branch you were on when merging) has parent-number = 1, and the branch that was merged in has parent-number = 2. So merging a branch called b1 into main will have: main = 1, b1 = 2.

So the command `git revert -m 1 HEAD` will do the following:

1. Look at the commit pointed to by HEAD. In this case the commit should be a merge commit, otherwise the `-m 1` option doesn't make sense as there is only one parent.
2. Take the parent with parent-number = 1 as the mainline and work out what the merge changed relative to that parent.
3. Create a new commit that applies the reverse of those changes, so its contents match that parent.
4. Attach the commit to the commit history on top of the merge commit, and then update the git index, working tree, and HEAD to this new commit.

So graphically, the command looks like:

```
// before calling: git revert. We merge the 'other' branch into the 'master' branch.

                                      master,HEAD
                                        |
commit1 <- commit2 <- commit3 <----- commit6
              \                   /        
               commit4 <- commit5
                              |
                            other


//after calling: git revert -m 1 HEAD

                                                master,HEAD
                                                  |
commit1 <- commit2 <- commit3 <----- commit6 <- commit7(contents match commit3)
              \                   /        
               commit4 <- commit5
                              |
                            other
```

So now a new commit7 is made, which points to commit6 (the merge commit), but the contents of commit7 match commit3 in the main branch before the merge. 
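A sketch of the revert shown above, with `main` in place of `master`. `--no-edit` keeps git from opening an editor for the revert message; `commit_file` is a hypothetical helper and the identity is a placeholder.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main          # name the initial branch "main"
git config user.email "you@example.com"        # placeholder identity
git config user.name "Example User"

# Hypothetical helper: append a line to a file and commit it.
commit_file() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$3"; }

commit_file a.txt "1" "commit1"
commit_file a.txt "2" "commit2"
git checkout -q -b other
commit_file b.txt "x" "commit4"
commit_file b.txt "y" "commit5"
git checkout -q main
commit_file a.txt "3" "commit3"
commit3_tree=$(git rev-parse 'HEAD^{tree}')
git merge -q -m "commit6: merge other" other
commit6=$(git rev-parse HEAD)

# Revert the merge, keeping parent 1 (main before the merge) as the mainline.
git revert --no-edit -m 1 HEAD >/dev/null

commit7_parent=$(git rev-parse HEAD~1)
commit7_tree=$(git rev-parse 'HEAD^{tree}')
git log --oneline --graph
```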

5. `git pull`: This is actually a combination of two commands: `git fetch` and then `git merge`. As such, this command WILL change your current local branch and working directory, while `git fetch` only updates the remote refs and downloads any objects needed, leaving your local branches untouched.

6. `git rebase`: This changes the parent commit of the current commit you are viewing with HEAD. So if you create a new branch, and then want the commit on this branch to actually be part of the main branch, then you can use `git rebase main` to change the parent commit to be the last commit on the main branch. This will not update the main branch head.

The above explanation is a bit simplified but a more complete picture of what happens is the following:

1. Checkout the branch you want to rebase. Let's say the branch is b1 and you want to rebase it onto the main branch. 
2. Run `git rebase main`
3. git will now go through and find the commit that is the common ancestor of both branches. This can graphically look like:

```
                        master
                          |
commit1 <- commit2 <- commit3
              \                          
               commit4 <- commit5
                              |
                            b1,HEAD
```

So in this case, the common ancestor is commit2. 

4. Replay the b1 commits that come after commit2 on top of the commit that the main branch ref points to. Since the main branch ref points to commit3, they are replayed onto commit3. The replayed commits are new commits with new hashes (shown as commit4' and commit5'); the originals are left behind unreferenced. Now it will look like:

```
                        master                b1,HEAD
                          |                     |
commit1 <- commit2 <- commit3 <- commit4' <- commit5'

```

You can now fast-forward the main branch to include the rebased commits by checking out main and running `git merge b1`.
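A sketch of the rebase walk-through, with `main` in place of `master`. It ends by checking out `main` and fast-forwarding it to the rebased `b1`; `commit_file` is a hypothetical helper and the identity is a placeholder.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main          # name the initial branch "main"
git config user.email "you@example.com"        # placeholder identity
git config user.name "Example User"

# Hypothetical helper: append a line to a file and commit it.
commit_file() { echo "$2" >> "$1"; git add "$1"; git commit -q -m "$3"; }

commit_file a.txt "1" "commit1"
commit_file a.txt "2" "commit2"
git checkout -q -b b1
commit_file b.txt "x" "commit4"
commit_file b.txt "y" "commit5"
old_b1_tip=$(git rev-parse HEAD)
git checkout -q main
commit_file a.txt "3" "commit3"
commit3=$(git rev-parse HEAD)

# Steps 1-4: check out b1 and replay its commits on top of main.
git checkout -q b1
git rebase -q main

new_b1_tip=$(git rev-parse HEAD)
new_base=$(git rev-parse HEAD~2)               # parent of the replayed commit4

# main is now an ancestor of b1, so it can be fast-forwarded.
git checkout -q main
git merge -q --ff-only b1
git log --oneline
```

The b1 tip gets a new hash because the replayed commits have a different parent than the originals.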

## GITIGNORE

There are some files/directories in your project that you may not want git to commit. This includes compiled code, dependency caches, files generated at runtime, etc. To stop git from working with these files/directories, you make a .gitignore file. The syntax of .gitignore files follows that of Linux globbing patterns. Use `touch .gitignore` to make the file as Windows sometimes has issues with files that only have an extension in their name. 

Example ignores:

`bar/`: Ignores any directory named `bar`, at the same level as the .gitignore file or anywhere below it. So if the .gitignore file is `foo/.gitignore`, then `bar/` will ignore `foo/bar` as well as a deeper directory such as `foo/sub/bar`. The trailing slash means the pattern only matches directories, not files named `bar`.

`/bar/goo`: Ignores the `goo` file or directory that is within the `bar` directory, with the `bar` directory being at the same level as the .gitignore file. The leading slash anchors the pattern to the directory containing the .gitignore file.
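Patterns can be tested with `git check-ignore`, which exits successfully when a path is ignored. This is a sketch; `is_ignored` is a hypothetical helper and the `sub` directory and file names are made up for illustration.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
mkdir -p bar/goo sub/bar/goo
touch bar/a.txt bar/goo/c.txt sub/bar/b.txt sub/bar/goo/d.txt

# Hypothetical helper: report whether git would ignore a path.
is_ignored() {
  if git check-ignore -q "$1"; then echo "ignored"; else echo "not ignored"; fi
}

# 'bar/' matches a directory named bar at any level.
echo 'bar/' > .gitignore
top_bar=$(is_ignored bar/a.txt)
nested_bar=$(is_ignored sub/bar/b.txt)

# '/bar/goo' is anchored to the directory containing .gitignore.
echo '/bar/goo' > .gitignore
anchored_goo=$(is_ignored bar/goo/c.txt)
nested_goo=$(is_ignored sub/bar/goo/d.txt)
sibling_file=$(is_ignored bar/a.txt)

echo "bar/     -> bar/a.txt: $top_bar, sub/bar/b.txt: $nested_bar"
echo "/bar/goo -> bar/goo/c.txt: $anchored_goo, sub/bar/goo/d.txt: $nested_goo, bar/a.txt: $sibling_file"
```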

## SETTING UP SSH CONNECTION TO A REMOTE REPOSITORY

There are several ways to add a remote repository. The simplest is to clone the repository. This will implicitly set up `origin` to be the remote repository you cloned from.

You can set up the remote repository to communicate using either HTTPS or SSH. If using HTTPS, you will be prompted to provide credentials every time you push, pull, or otherwise interact with the remote, unless a credential helper has stored them. With SSH, this will not be necessary. To set up SSH, follow the general steps:

0. Check for existing SSH keys in use using `ls -al ~/.ssh`. GitHub allows the use of one SSH key for multiple repositories but only as an account key, which associates the key with a specific user and account. There are also deploy keys, which are "accountless": they are not associated with a specific account, just with the repository they are attached to. You can't use the same SSH key as both an account key and a deploy key. You also can't use an SSH key that belongs to a different account or user.
1. Generate a new SSH private/public key pair using `ssh-keygen -t ed25519 -C github_account_email_address`. This will ask for a place to save the key. Pressing Enter accepts the default location, which is the `.ssh` folder in your home directory (`~/.ssh/id_ed25519` for the private key and `~/.ssh/id_ed25519.pub` for the public key).
2. Add the key to your GitHub account. Do this by going to account settings, then access section, then add new SSH key. Copy and paste the public key into the provided field.
3. Set the url of the remote repository to use the SSH url. You can get the URL by visiting the remote repository page on Github and clicking the "Clone" or "Code" button. Then select the SSH option which will show the URL for the SSH connection to the repository. Use that to set the url of the repository with the command `git remote set-url remote_shorthand_name git@github.com:USERNAME/REPOSITORY.git`. Here `remote_shorthand_name` is the shorthanded name for the repository; most times it is just 'origin'. 
4. Test the SSH connection with `ssh -T git@github.com`. The first time you connect, you will usually get a message that asks if you want to continue connecting. The information displayed is GitHub's public host key fingerprint. Make sure that it matches one of GitHub's published key fingerprints, which can be found on the GitHub website. Then enter `yes` and it should show that the connection works. Now you should be able to push/pull from the remote repo on GitHub.

## DIFFERENCE BETWEEN SHELL, CONSOLE, AND TERMINAL

The **terminal** is simply an interface that accepts inputs and passes them along somewhere else and displays any outputs it receives. In the olden days, this might have been a teletypewriter (TTY), which was literally just a typewriter wired up to a computer. Nowadays, we use software versions of the traditional terminals (terminal emulators). These software terminals do the exact same thing: they take your input and **pass it on**; they don't do anything else with the inputs.

The **shell** is the program that a terminal sends inputs to. The shell generates outputs and passes them to the terminal, which then displays them. The shell is also the program that is used to start other programs. In UNIX, shell usually refers to a command-line shell, which is basically a command line + shell combined. This means users can submit commands and start other programs using the shell. The shell is the first thing that a user sees when logging in on a text terminal.

The **console** refers to the physical machine that hosts the terminal. But for all practical purposes nowadays, it is basically a terminal since we don't have physical machines that are dedicated to just hosting a terminal. Terminal emulators (sometimes called console emulators) simulate a terminal and let you change which shell program the terminal talks to.

Therefore in order to interact with a shell, you need a terminal to send inputs to it. Many shells also double as command line interfaces so you can also execute computer commands using the shell.