## BASH CLI Cheat Sheet

**Vertical bar**: The vertical bar `|` represents a pipe in Linux/UNIX systems. It is a form of redirection that sends the output of one program to the input of another program. For example, `cat file1 | wc` means that `cat` reads the contents of `file1` and sends them as input to the `wc` command. Whenever a vertical bar appears, multiple commands are chained together, much like function composition in programming.
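The chaining idea can also be sketched from Python. This is only an illustration under assumptions: the two child programs are small Python one-liners standing in for `cat file1` and `wc -w`, so the example does not depend on which command-line tools are installed.

```python
import subprocess
import sys

# Stand-ins for `cat file1` and `wc -w` (hypothetical, for illustration only).
producer_code = "print('one two three')"
consumer_code = "import sys; print(len(sys.stdin.read().split()))"

# Start the producer with its stdout connected to a pipe.
producer = subprocess.Popen(
    [sys.executable, "-c", producer_code],
    stdout=subprocess.PIPE,
)

# Run the consumer with its stdin connected to the read end of that pipe.
consumer = subprocess.run(
    [sys.executable, "-c", consumer_code],
    stdin=producer.stdout,
    capture_output=True,
    text=True,
)

producer.stdout.close()
producer.wait()

word_count = int(consumer.stdout)
print(word_count)  # 3
```

The shell does the same wiring when it sees `|`: the left command's standard output becomes the right command's standard input.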

**Single dot**: A single dot, aka a period, in a path refers to the current directory. Double dots refer to the parent directory.

**cat**: The `cat` command refers to concatenation and is used for viewing text files in the terminal itself. Example usages below:

- `cat file1 file2`: Shows contents of file1 and file2 in order. Multiple files can be viewed this way.
- `cat -n file1`: Shows contents of file1 with each line numbered, so you know on which line a certain piece of text appears.
- `cat > newfile`: Creates a new file named 'newfile', or overwrites it if it already exists. `cat` then waits for you to type the file's contents; press Ctrl+D on an empty line to finish. Pressing it immediately leaves the file empty.
- `cat file1 > file2`: Copies file1 to file2. file2 is overwritten if it exists, or created if it doesn't.
- `cat file1 file2 file3 > file4`: Copies file1, file2, file3 into file4 in order. file4 is overwritten if it exists, or created if it doesn't.
- `cat file1 >> file2`: Appends file1 to file2.
- `cat >> file1`: Appends anything typed in the terminal to file1.
- `cat < file2`: Provides file2 as input to the `cat` command, which just prints out file2.
- `cat file1 | more`: Shows as much of file1 as fits in the terminal. If there is more, `more` pauses and lets the user page through the rest.

**source**: The `source` command in BASH means to run a file in the current SHELL instance instead of starting a new shell. So `source filename` will run the file in the current shell. A short hand for this command is `. filename`. Notice the space between the period and the filename.

**touch**: The `touch` command is used to create files and to update the access and modification timestamps of existing files. So `touch filename1 filename2` creates 2 **empty** files if they do not already exist. `touch -c filename` updates the timestamps only if the file already exists; with `-c`, a missing file is not created.

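**./**: This is not a command by itself but a path prefix meaning "in the current directory". `./filename` runs the file as an executable in a new process, unlike `source`, which runs it in the current shell. The file needs execute permission, and for a script the interpreter is chosen by its shebang line (for example `#!/bin/bash`), so a shell script run this way gets a new shell instance.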

**> or <**: This isn't so much a command but a redirection operator. It controls the input and output of commands. So if I have a command called `foo` then `foo < filename` will provide the `filename` as the input to the `foo` command. Similarly, `foo > filename.txt` will redirect the output to be put into a text file called `filename.txt`. Using `foo >> filename.txt` will append the output of `foo` to `filename.txt`.

**>>**: As stated above, this redirects the output of a command to a file and appends the output to the file. 

**which**: Used to identify the location of the executable that runs when you type its name into the shell. So `which python` shows the location of the executable used to run the Python interpreter. Any script, program, or executable located in one of the directories listed in the `PATH` variable can have its location found by using `which`. You can list the location of more than one executable at a time: `which executable1 executable2`.

**echo**: Used to display a string in the terminal. So `echo "Hello"` will literally display `Hello` in the terminal. You can combine it with the output of other commands through command substitution, such as `echo "Today is $(date)"`. `echo` is mainly used in bash scripts to display info to the terminal. You can also list files with `echo *.txt`, which displays the names of all .txt files in the current directory because the shell expands the pattern before `echo` runs. You can also display the PATH variable using `echo $PATH`.

**printenv**: Used to display the environment variables of the current shell. Whenever a shell program starts up, it gathers a bunch of information that the shell program (as well as its child processes) needs in order to run. The info is stored in a data structure called the environment. Environment variables are the info stored within this data structure that are defined for the current shell and inherited by any child shells or processes. Shell variables are data stored exclusively within the shell that created them; they are not passed on to child processes.

**pwd**: Command that prints the full path of the current working directory to the stdout of the shell. The `-P` option prints the physical (canonical) path, with every symlink resolved. The `-L` option prints the 'logical' path, which is the path as you navigated to it and may still contain symlinks. In bash, `-L` is the default and matches the value of `$PWD`.

**~**: Known as a tilde expansion. This has several uses depending on the context and it is dependent on which shell you are running. All characters following the tilde up to the first unquoted slash are considered a 'tilde-prefix'. This 'tilde-prefix' will have different interpretations. The below apply to the bash shell.

- When used by itself in a path such as `~/` then this refers to the `$HOME` environment variable of the current user running the shell. So `~/foo` will be the path to the `foo` directory located in the current user's home directory.
- When followed by unquoted characters, then the shell interprets this as a login or username. So `~foo/` will be the path to the home directory of user `foo`. 
- `~+/` refers to the current working directory. It is equivalent to the output of `$PWD`. 
- `~-/` refers to the old working directory. It is equivalent to the output of `$OLDPWD`. 
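For comparison, Python offers a similar expansion through `os.path.expanduser`. A minimal sketch follows; note that `expanduser` only understands `~` and `~user`, so the bash-specific `~+` and `~-` forms are not covered by it.

```python
import os.path

# "~" alone expands to the current user's home directory.
home = os.path.expanduser("~")

# "~/foo" expands to the foo entry inside that home directory.
foo_path = os.path.expanduser("~/foo")

print(home)
print(foo_path)
```

The printed values depend on the machine and user running the code, so no fixed output is shown.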

**$HOME**: Current user's home directory. On Windows it defaults to `C:\Users\currentusername`.

**$PATH**: Current path variable for running executables. For the cmder terminal, you can put any executables into the `cmder/bin` directory to have it load into the initial `$PATH` variable that gets created when the shell starts up. 

**$PWD**: Stands for "print working directory". This shell environment variable contains the path of the current working directory.

**$OLDPWD**: Stands for "old print working directory". This shell environment variable contains the directory you were in before the last `cd` command was executed. So if I am in directory `~/foo` and I `cd` to `~/foo/bar`, then `$OLDPWD` will contain `~/foo`, as it was the directory prior to the last `cd` command, while `$PWD` will contain `~/foo/bar`.



# FILENAME/DIRECTORY STRUCTURE INFORMATION

**~**: Represents the home directory in bash. For Windows systems, this usually defaults to `c:/users/username`.

**period in front of directory**: Indicates the directory is "hidden" from normal view. Also known as dot files or dot directories, this convention comes from Unix and Linux systems as a way to hide configuration files from the user.

**symbolic link/symlink/soft link**: A file that contains a reference to another file or directory in the form of an absolute or relative path. 

**canonical path**: This is the ''true'' path of the file or directory starting from the root directory. There is only one canonical path for every file or directory.

**absolute path**: This is the path starting from the root directory but it allows for path manipulations like up-level references such as `c:/foo/bar/../example.txt`. Here `example.txt` is located in the `foo` directory since the `..` sends it back to the parent directory. So an absolute path is like the canonical path except it allows path manipulations; consequently, the canonical path is just the shortest absolute path. So for the above example, it would read `c:/foo/example.txt`, in which it removes the additional path manipulations. So there can be multiple absolute paths but only one canonical path. 

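**ntpath**: For Windows systems, the path components are separated by a backslash `\`. The backslash often appears doubled (`\\`) in source code only because it has to be escaped inside string literals.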

**posix path**: Unix systems have path components separated by a forward slash `/`.

## PYTHON OS.PATH MODULE

The path structure is dependent on the operating system that Python is running on. So Windows machines will naturally use `ntpath` and Unix uses `posixpath`. If you want to use one specific convention over the other, you can instead import `posixpath` or `ntpath`, which provide the same interface as the `os.path` module. So you can use `import posixpath` to use the posix style path convention on a Windows machine if you want.
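A small sketch of using both conventions side by side, regardless of the host operating system. The path components are made-up names for illustration.

```python
import ntpath
import posixpath

# Both modules expose the same functions as os.path, so the same call
# can be made with either convention on any machine.
windows_style = ntpath.join("foo", "bar", "example.txt")
posix_style = posixpath.join("foo", "bar", "example.txt")

print(windows_style)  # foo\bar\example.txt
print(posix_style)    # foo/bar/example.txt
```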

`os.path.realpath`: Dereferences symbolic links to get the canonical path.
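A minimal sketch of `os.path.realpath` resolving a symlink, assuming the platform allows creating symlinks (on Windows this may need extra privileges). The file names are hypothetical and live in a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "target.txt")
    link = os.path.join(tmp, "link.txt")

    with open(target, "w") as f:
        f.write("hello\n")
    os.symlink(target, link)  # link.txt now refers to target.txt

    # realpath follows the link, so both names resolve to the same
    # canonical path.
    resolved_link = os.path.realpath(link)
    resolved_target = os.path.realpath(target)
    same_file = resolved_link == resolved_target

print(same_file)  # True
```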

`os.path.split`: Splits path into tuple of (head, tail). The tail part will never contain a slash so if the path ends with a slash, then the tail will be empty. 
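A short sketch of the (head, tail) behaviour, using `posixpath` so the result is the same on any operating system. The paths are made-up examples.

```python
import posixpath

# The tail is the last component; the head is everything before it.
head, tail = posixpath.split("/foo/bar/example.txt")
print(head, tail)  # /foo/bar example.txt

# A trailing slash leaves the tail empty.
head_slash, tail_slash = posixpath.split("/foo/bar/")
print(repr(head_slash), repr(tail_slash))  # '/foo/bar' ''
```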

`os.path.normpath`: Takes a pathname and collapses redundant separators and up-level references. So on a posix system, `a//b` and `a/b/` and `a/./b` and `a/foo/../b` become `a/b` since the extra slashes or up-level references are redundant. On Windows, it can be used to convert forward slashes in path names to backward slashes. 
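The cases listed above can be checked with a short sketch. `posixpath` and `ntpath` are used explicitly so the result does not depend on the host system. Keep in mind that `normpath` is pure string manipulation, so it may change the meaning of a path that contains symbolic links.

```python
import ntpath
import posixpath

# All four spellings collapse to the same normalized path.
paths = ["a//b", "a/b/", "a/./b", "a/foo/../b"]
normalized = [posixpath.normpath(p) for p in paths]
print(normalized)  # ['a/b', 'a/b', 'a/b', 'a/b']

# With the Windows convention, forward slashes become backslashes.
windows_path = ntpath.normpath("c:/foo/bar/../example.txt")
print(windows_path)  # c:\foo\example.txt
```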

# GIT AND GITHUB NOTES

## BASIC GIT TERMINOLOGY AND COMMANDS

**git command structure**: git commands follow the general structure of `git command --options path`. Command refers to the name of the command and options are the optional actions a command can perform. Options always come after the command and have either `-` or `--` before the option name; you can have multiple options for a single command. Path is either the path to a directory or a file that the command should act on. Git documentation usually puts optional parts in square brackets `[]` and placeholders that you must fill in, such as file or directory names, in angle brackets `<>`. So a command such as `git rm --cached myfile.txt` will be presented in documentation as `git rm [--cached] <file>`. Some options take an argument, and the documentation will list this as `[--option=<value>]`.

**Repository**: The basic container for the project, usually in the form of a directory. Git will track changes to the files inside that directory, including its subdirectories. It will not track changes made to the parent directory. When creating a repository, git starts you on an initial branch, traditionally named master. To set a current directory as a repository (aka create a repository), use `git init`.

**Origin**: The default name git gives to the remote repository that you cloned from. It is a convention rather than a special keyword. So `origin` usually refers to the "original" repository that your local copy came from, and names like `origin/master` refer to your local record of that remote's branches.

**Commits**: Basically a save-point of the files inside a repository. When making a commit, you save the current status of all the files. Making multiple commits builds up a commit history like a sequence of events. There are three stages to making a commit:

**Basic Steps for adding and committing files**

1. Save the file normally without involving git.
2. Staging: When a file is going to be committed, it needs to be added to a staging area. This is like a buffer area before we actually make the commit and lets us look over things before we commit. Here are some commands regarding the staging area:
  * `git status` = get info about what files are currently staged
  * `git add [filename]` = add the file to the staging area
  * `git add .` = add all new and modified files in the current directory and its subdirectories to the staging area
  * `git rm --cached [filename]` = remove the file from the staging area
3. Commit: Make a commit and save the current status of the files. Here are some commands for making a commit:
  * `git commit -m "[message]"` = commit the files in the staging area. The option `-m` means to add a message to describe the commit. It is standard to always add a message. Be specific in the message about the commit. When this command is executed, it prints a summary with the number of lines inserted and deleted. A reported deletion doesn't necessarily mean code was removed: a modified line is counted as one deletion plus one insertion.
  * `git log` = display the full commit history of the repository.


## GIT OBJECTS

Git is just a giant key-value data store modeled as a linked-list. When something is inserted into a git repo, git returns a unique key that is used to retrieve the stored item. These keys are also used as the pointers within git's linked list structure. Git stores these items as "objects". There are different types of objects:

  * commit = store information about the commit
  * tree = store information about a directory
  * blob = file content stored as raw binary data; blob is short for 'binary large object'
  * annotated tag = store tag information

Each object has a unique hash to identify it, made using SHA-1 (secure hash algorithm). The hash is generated using the contents of the object. So for a source code file, it literally uses the source code itself to generate a hash. This pretty much guarantees every object has a unique hash. For tree objects, which refer to directories, it uses the structure of the directory to calculate the hash.

Now git organizes the object types in the following way:

```
commit
  |-- tree
        |-- blob
        |-- blob
        |-- tree
              |-- blob

```

The commit object has a hash that identifies a tree object. That tree object has 3 hashes: 2 for blobs, 1 for another tree. That second tree object contains a hash for another blob object. Therefore, git simply stores everything in a giant graph structure. You can view the objects it currently stores in the .git/objects directory. The objects folder will list a bunch of folders, with each folder name only being 2 characters long. Specifically, when git generates a hash for an object, the hash will be 40 characters long. It will take the first 2 characters and use that as the name of the folder and store the remaining 38 characters as a file in that folder.
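The folder layout described above can be sketched as a small helper. `loose_object_path` is a hypothetical name for illustration; it only mirrors the 2 + 38 character split and does not read a real repository.

```python
import posixpath


def loose_object_path(object_hash: str) -> str:
    """Return the path under .git/objects for a 40-character hash."""
    if len(object_hash) != 40:
        raise ValueError("expected a 40-character hex hash")
    # First 2 characters name the folder, the remaining 38 name the file.
    return posixpath.join(".git", "objects", object_hash[:2], object_hash[2:])


path = loose_object_path("e69de29bb2d1d6434b8b29ae775ad8c2e48c5391")
print(path)  # .git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
```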

Let's talk more about how git generates hashes for objects. For blob objects, it prefixes the object content with a header made of the word "blob", a space, the content's length in bytes written as an ordinary decimal number, and a null character. The SHA-1 hash is then computed over the header plus the content.
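A minimal sketch of the blob hashing rule, assuming the header is the word `blob`, a space, the byte length in decimal, and a null byte. `git_blob_hash` is a hypothetical helper name; it reproduces the well-known hash of an empty file.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Hash content the way git hashes a blob: header + raw bytes."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# An empty file always hashes to the same value.
empty_hash = git_blob_hash(b"")
print(empty_hash)  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because the hash depends only on the content, two files with identical content share one blob, whatever their names are.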

Tree hashes are different. Tree objects contain one entry per file or subdirectory/subtree. When pretty-printed (for example with `git cat-file -p`), each entry is shown as a line with the following format:
`file-permission object-type object-hash file-name`
Example is: `100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 index.txt`

File permission here is 644 (ignoring the 100), and permissions of 644 mean that owner has read/write access of the file while others can only read. Permissions of 600 mean owner can read/write but other people can't do anything. To generate the tree hash, it uses all this information to feed into the SHA-1 algorithm.
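A rough sketch of hashing a tree, under the assumption that each stored entry is the mode, a space, the file name, a null byte, and the 20 raw bytes of the entry's hash, with a `tree <length>` header in front. `git_tree_hash` and the second file name are hypothetical, and the plain sort by name is a simplification of git's real ordering rules for subdirectories.

```python
import hashlib


def git_tree_hash(entries):
    """entries: iterable of (mode, name, hex_hash) tuples."""
    body = b""
    for mode, name, hex_hash in sorted(entries, key=lambda e: e[1]):
        body += mode.encode() + b" " + name.encode() + b"\0"
        body += bytes.fromhex(hex_hash)  # raw 20-byte hash, not hex text
    header = b"tree " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# A tree with no entries gives git's well-known empty-tree hash.
empty_tree = git_tree_hash([])
print(empty_tree)  # 4b825dc642cb6eb9a060e54bf8d69288fbee4904

empty_blob = "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"
entries = [
    ("100644", "index.txt", empty_blob),
    ("100644", "notes.txt", empty_blob),  # hypothetical second file
]
tree_hash = git_tree_hash(entries)
print(tree_hash)
```

Renaming a file or changing its content changes an entry, which changes the tree hash and, in turn, the hash of every tree and commit that points to it.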

As for commits, their hashes are generated using the information contained in them:
  * parent commit hash, so each new commit points to the previous one. May or may not be present depending on commit history.
  * starting tree hash
  * author
  * committer
  * date (including time)
  * commit message
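The fields above can be sketched as input to the hash. This is an illustration under assumptions: `git_commit_hash` is a hypothetical helper, the author details are made up, and the tree used is git's empty-tree hash. Real commits may carry extra headers that are not modeled here.

```python
import hashlib


def git_commit_hash(tree, parents, author, committer, message):
    """Build commit text from its fields and hash it with a header."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]  # absent for a first commit
    lines.append(f"author {author}")
    lines.append(f"committer {committer}")
    body = ("\n".join(lines) + "\n\n" + message).encode("utf-8")
    header = b"commit " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# Hypothetical identity: name, email, Unix timestamp, timezone offset.
who = "Jane Doe <jane@example.com> 1700000000 +0000"
tree = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"

first = git_commit_hash(tree, [], who, who, "Initial commit\n")
second = git_commit_hash(tree, [first], who, who, "Second commit\n")
print(first)
print(second)
```

Since the parent hash is part of the hashed text, changing any earlier commit would change the hash of every commit after it.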

## GIT BRANCHES AND REMOTE BRANCHES

### MASTER BRANCH IS NOW CALLED MAIN BRANCH ON GITHUB

A branch is simply a movable pointer to a commit within a repo; basically the hash of the commit is used as the pointer. Branch info is stored in the .git/refs/heads directory. There are also files located within .git called "HEAD", "ORIG_HEAD", and "FETCH_HEAD", and possibly other "HEAD" files.

  * "HEAD" is a file that is a pointer to the current branch you are on. It doesn't actually use a hash as a pointer but rather references the file located in .git/refs/heads that contains the pointer. So "HEAD" is basically a pointer to a pointer; it points to the current branch, which is also a pointer.
  * "ORIG_HEAD" records where "HEAD" pointed before a command that moves it drastically, such as `git reset`, `git merge`, or `git rebase`, so that the change can be undone.
  * "FETCH_HEAD" records the tip of the remote branch retrieved by the most recent `git fetch`. So it locally tracks where the remote branch has advanced.

To create a new branch, use the command `git branch <branch-name>`. This creates a new file in the .git/refs/heads directory whose filename is the name of the branch and the contents of the file is the hash of the commit that the previous branch was pointing to when the new branch was created. However this command won't move you to the new branch after the branch is created. To move onto the new branch use: `git checkout <branch-name>`. Now the "HEAD" file will point to the new branch we created.
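The file bookkeeping described above can be imitated with a toy model. This sketch only writes plain files into a temporary directory laid out like .git; it is not how git is implemented, the branch name `feature` and the placeholder hash are made up, and a real `git checkout` also updates the working files.

```python
import os
import tempfile


def read_head_commit(git_dir):
    """Follow HEAD -> branch file -> commit hash."""
    with open(os.path.join(git_dir, "HEAD")) as f:
        head = f.read().strip()
    ref_name = head[len("ref: "):]  # e.g. refs/heads/master
    with open(os.path.join(git_dir, *ref_name.split("/"))) as f:
        return ref_name, f.read().strip()


with tempfile.TemporaryDirectory() as git_dir:
    heads = os.path.join(git_dir, "refs", "heads")
    os.makedirs(heads)
    commit3 = "c3" * 20  # placeholder 40-character hash

    # Starting state: master points at commit3, HEAD points at master.
    with open(os.path.join(heads, "master"), "w") as f:
        f.write(commit3 + "\n")
    with open(os.path.join(git_dir, "HEAD"), "w") as f:
        f.write("ref: refs/heads/master\n")

    # Like `git branch feature`: a new file holding the current hash.
    _, current = read_head_commit(git_dir)
    with open(os.path.join(heads, "feature"), "w") as f:
        f.write(current + "\n")

    # Like `git checkout feature`: only HEAD is rewritten.
    with open(os.path.join(git_dir, "HEAD"), "w") as f:
        f.write("ref: refs/heads/feature\n")

    branch, commit = read_head_commit(git_dir)

print(branch, commit == commit3)  # refs/heads/feature True
```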

Now what happens if we make a commit after we made a new branch and switched to it? Well, the new branch will move its pointer to point to the new commit but the previous branch will not move to point to the new commit. Diagrammatically it can look like:

```
//before new commit
                      master
                        |
commit1 <- commit2 <- commit3
                        |
                      new branch
                          |
                        HEAD

//after new commit
                      master
                        |
commit1 <- commit2 <- commit3 <- commit4
                                    |
                                  new branch
                                      |
                                    HEAD

```

So the previous branch, master in this case, will remain stationary but the new branch will move. If we move back to the master branch, then the "HEAD" file will point to the master branch, which in turn points back to a previous commit.

We can also make a new commit while working back in the old branch. Any new commits made there will have the old branch move to the new commit as expected. So in the above example, we can make a new commit in the master branch. This will add a new commit that will point to commit3 in addition to having commit4 point to it. This means the commit history will look like:

```
//after new commit on master branch

                            HEAD
                              |
                            master
                              |
                          new commit5
                          /
commit1 <- commit2 <- commit3 <- commit4
                                    |
                                  new branch

```

Now what happens if we want to reset back to an old commit? We simply move the branch pointer back to the previous commit. We can do this with the command `git reset <commit HASH>`. Instead of using the hash to explicitly specify the commit we move to, we can do a relative move: `git reset HEAD~<num>`. So `git reset HEAD~2` means to move back 2 commits relative to the commit we are currently on. So in the above example, if we were on commit4 in the new branch, then `git reset HEAD~2` moves us back to commit2. Note that neither `~` nor `^` moves forward; both walk back toward parents. `~<num>` follows the first parent `<num>` times, while `^<num>` selects the `<num>`-th parent of a single commit, which only matters for merge commits that have more than one parent. So `HEAD^` and `HEAD~1` are the same commit, but `HEAD^2` is the second parent of a merge, not two commits back.
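In git's revision syntax, `~n` follows the first parent n times and `^n` picks the n-th parent of one commit. A toy model of that lookup is sketched below. The history is a hypothetical dictionary mirroring the commit1 to commit4 diagram, and the helper names are made up.

```python
# Hypothetical history: each commit maps to its list of parents.
parents = {
    "commit1": [],
    "commit2": ["commit1"],
    "commit3": ["commit2"],
    "commit4": ["commit3"],
}


def ancestor(commit, n):
    """Model of `<commit>~n`: follow the first parent n times."""
    for _ in range(n):
        commit = parents[commit][0]
    return commit


def nth_parent(commit, n):
    """Model of `<commit>^n`: pick the n-th parent (1-based)."""
    return parents[commit][n - 1]


print(ancestor("commit4", 2))    # commit2
print(nth_parent("commit4", 1))  # commit3
```

In this linear history every commit has at most one parent, so `^2` has nothing to select; it becomes meaningful only for merge commits.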

What if we just want to view a previous commit, but not reset the code back? We use `git checkout <branch>` or `git checkout <commit>`. This simply moves the "HEAD" pointer around to the specified branch or commit.

As for remote branches, the branches in the remote repo, git needs to do a bit of coordination between those branches and the current branches on your local computer.

## GIT REFS

A ref is just a file containing a commit hash. So it is a shorthand way of referring to a specific commit. This is used so that each branch pointer knows which commit to point to. Located in .git/refs, we can see various ref files. As discussed in the previous section, the refs/heads directory contains a file for each branch in the repo; the filename is the name of the branch. But those aren't the only refs. There are different kinds of refs:

  * packed refs: a compression of a bunch of refs into a single file. It is basically a line-by-line file with each line containing the hash followed by the full path of the ref it belongs to.
  * special refs: like the "HEAD", and "FETCH_HEAD" files. The "HEAD" file points to the ref file that contains the commit hash. So it points to a pointer. 

When specifying a ref you can either use the short name, like the name of the branch, or the full name which is basically the full directory path such as refs/heads/master.

A refspec is a mapping of a source branch to a target branch. It has syntax of `<src>:<target>`. So if one wanted to `git push` the local master branch to the remote master branch, we could use: `git push origin master:master`.

## CONNECTING TO GITHUB: PUSH/PULL/FETCH/CLONE/FORK AND OTHER THINGS ABOUT WORKING WITH THE REMOTE REPOSITORY

Github hosts a copy of the repository on the cloud, referred to as the remote repo. To get files and put files into the remote repository, there are several commands that will be useful:

1. `git clone <git-url> <dir>`: This command copies everything from the remote repository to your computer. If `<dir>` is given, the repository is cloned into that directory; if it is omitted, git creates a new folder named after the repository. You cannot clone into an existing directory if it is not empty. Here are some useful options with this command:
  *

2. `git push <repo> [<refspec>]`: Command to update the remote refs using local refs and sending over any necessary objects to complete the change.


## GITIGNORE

There are some files/directories in your project that you may not want git to commit. This includes compiled code, dependency caches, files generated at runtime, etc. To stop git from working with these files/directories, you make a .gitignore file. The syntax of .gitignore files follows that of Linux globbing patterns. Use `touch .gitignore` to make the file as Windows sometimes has issues with files that only have an extension in their name. 

Example ignores:

`bar/`: Ignores directories named `bar`. The trailing slash restricts the pattern to directories, and because there is no leading slash it matches at any level below the .gitignore file. So if the .gitignore file is `foo/.gitignore`, then `bar/` ignores `foo/bar` as well as any deeper directory named `bar`, such as `foo/src/bar`.

`/bar/goo`: Ignore the `goo` directory that is within the `bar` directory with the `bar` directory being at the same level as the .gitignore file. 


## GIT COMMAND SHEET

**`git remote -v`**: View the remotes (such as origin) associated with the current repository, along with their URLs.

**`git branch`**: View all branches on the local repository. 

**`git show HEAD`**: Shows the commit that the HEAD pointer is currently pointing to.

**`git show-ref`**: Show the refs available on the local repository along with the commit IDs. By default it shows the tags, heads, and remote refs.

**`git add *.ext`**: This will add all files with the appropriate extension to the staging area. So `git add *.txt` will add all `.txt` files to the staging area. 

## DIFFERENCE BETWEEN SHELL, CONSOLE, AND TERMINAL

The **terminal** is simply an interface that accepts inputs and passes them along somewhere else and displays any outputs it receives. In the olden days, this might have been a teletypewriter (TTY), which was essentially an electromechanical typewriter connected to a computer. Nowadays, we use software versions of the traditional terminals. These software terminals do the exact same thing. They take your input and **pass it on**; they don't do anything else with the inputs.

The **shell** is the program that a terminal sends inputs to. The shell generates outputs and passes it to the terminal which then displays it. The shell is also the program that is used to start other programs. In UNIX, shell refers to a command-line shell which is basically a command-line + shell combined. This means users can submit commands and start other programs using the shell. The shell is the first thing that a user sees when logging in.

The **console** refers to the physical machine that hosts the terminal. But for all practical purposes nowadays, it is basically a terminal since we don't have physical machines that are dedicated to just hosting a terminal. Console emulators basically simulate a terminal and you can change what shell program you wish the terminal to talk to.

Therefore in order to interact with a shell, you need a terminal to send inputs to it. Many shells also double as command line interfaces so you can also execute computer commands using the shell.