# Lesson 2: Basic command line skills, Git, and GitHub

## Getting started with the command line

Now, as you did in lesson 1, open up your terminal application.

### pwd and ls

Let's start by using the `pwd` __(print working directory)__ command to figure out what directory we're in. Type the following into Terminal:

In [None]:
pwd

`pwd` tells you the **path** of your current directory. The path shows where a file or folder is stored on your computer. The path lists all of the parent directories, separated by slashes (`/`), all the way up to the root directory, which is signified by the initial `/`. You are probably in your home directory.

To list all files and folders in the current directory, we employ the `ls` __(list)__ command:

In [None]:
ls

### cd, change directory:
Let's make sure we are in the home directory:

In [None]:
cd

Invoking the `cd` __(change directory)__ command without specifying a target directory defaults to the home directory. Another way to specify your home directory is by its shortcut, `~/`. In general, the tilde-slash means "home directory." 

Use `pwd` to check where you are now. 

### mkdir and rmdir

To make a directory, use the `mkdir` (***m***a***k***e ***dir***ectory) command, followed by the name of the directory you want to create. For example, to make a directory called `test`:

In [None]:
mkdir test

You now have an empty directory called test. You can see it if you list the contents of the directory.

In [None]:
ls

You can move into this directory with the `cd` command.

In [None]:
cd test

Verify that you are in the directory by checking your path:

In [None]:
pwd

If you want to move back into your home directory, you could either type `cd` + [path to home directory], or use the following shortcut to go up one folder in the path:

In [None]:
cd ../

We do not need (nor want) this directory, so let's delete it. To delete an *empty* directory, the command is `rmdir`.

In [None]:
rmdir test
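
The whole round trip from this section can be replayed in one go. The following is a minimal sketch, assuming you run it in a throwaway folder created with `mktemp -d` (not part of the lesson, used here only so nothing in your home directory is touched). The variable names `scratch`, `inside`, and `outside` are placeholders.

```shell
# Work in a throwaway folder (assumption: mktemp is available, as on macOS and Linux).
scratch="$(mktemp -d)"
cd "$scratch"

mkdir test            # make the directory
ls                    # "test" now shows up in the listing
cd test               # move into it
inside="$(pwd)"       # remember the path while inside
cd ../                # go up one folder
outside="$(pwd)"      # remember the path after leaving
rmdir test            # delete the (empty) directory

echo "inside:  $inside"
echo "outside: $outside"
```

The two printed paths should differ only by the trailing `/test`, which is the directory you entered and then left.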

### nano: creating text files

For the next demonstration, I want to start by making a folder and adding a text file to it:

Make the "Test" directory with `mkdir`:

In [None]:
mkdir Test

Move into the "Test" directory with `cd`:

In [None]:
cd Test

Now we're going to use a new command called `nano`. Nano is a simple text editor that you can access from Terminal. You can open the text editor by typing `nano` into Terminal, or you can provide a filename for the text document you want to create:

In [None]:
nano Textfile.txt

This will open the Nano text editor in Terminal. You can add any text you want. Here's an example:
    
    Hello world
    
Once you've written some text, press `control` + `x` to exit Nano. Nano will ask the following:

`Save modified buffer (ANSWERING "No" WILL DESTROY CHANGES) ?`

Press the `y` key to save your file. Nano will then double-check what you want to name the file:

`File Name to Write: Textfile.txt`

If you never provided a file name, you will have the opportunity to do so now. If you are happy with the file name, hit the `enter / return` key. This will take you back to the normal Terminal window.

Verify that you now have a text file called "Textfile.txt" in your Test folder:

In [None]:
ls

You should get "Textfile.txt" in response.

## Setting up Git

The Gold lab uses Git for **version control** and for sharing files, data, and code.

Put briefly, `git` is a **version control system** that allows multiple programmers to work together on the same program while keeping track of the changes each person makes. Git allows locally stored __repositories__ of code to be synchronized against a remotely stored master copy.

Install the latest version of Git with Homebrew:

In [None]:
brew install git

## Set up a GitHub account

GitHub is a company (owned by Microsoft) that provides hosting for software development. GitHub connects with Git so that local computers can easily interact with software stored on the website.

Go to [http://github.com/](http://github.com/) to get an account. You should register with your academic email address (you can get free repositories as an academic). __Be thoughtful about the username you choose. This website is part of your professional profile--for some careers your GitHub is more important than your resume.__

## Setting your username in Git

You can change the name that is associated with your Git commits using the git config command. The new name you set will be visible in any future commits you push to GitHub from the command line. If you'd like to keep your real name private, you can use any text as your Git username.

Configure your Git username (__Don't forget to replace "Barbara McClintock" with your preferred name!__):

In [None]:
git config --global user.name "Barbara McClintock"

Now configure your Git email (__Don't forget to replace "your_email@example.com" with your actual email address!__):

In [None]:
git config --global user.email "your_email@example.com"
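
To confirm that Git stored what you typed, you can read a setting back with `git config --get`. Here is a minimal sketch, assuming a scratch repository made with `git init` and settings applied only to that repository (no `--global`), so your real configuration stays untouched. In everyday use you would keep the `--global` flag shown above.

```shell
# Throwaway repository (assumption: used only so your real settings are not changed).
scratch="$(mktemp -d)"
cd "$scratch"
git init --quiet .

# Without --global, these settings apply to this one repository only.
git config user.name "Barbara McClintock"
git config user.email "your_email@example.com"

# Read the values back.
name="$(git config --get user.name)"
email="$(git config --get user.email)"
echo "$name <$email>"
```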

## Forking a repository

Let's say you want to do some work on a project with code stored in a GitHub **repository**, but you are not an active collaborator. For example, there could be a useful package a lab at another university put on GitHub that is useful for your research. You want to do something *almost* exactly like the package does, but need to make some small modifications yourself. You want to clone the repository and add a couple functions and maybe modify one or two they already have, leaving much of the rest of the repository untouched. Of course, you also want to update your local copy of all that untouched (but still used) code when the maintainers update it.

There is a nice way to do this called **forking**. To fork a repository on GitHub, simply navigate to the website of the repository and click the Fork button. Be sure you are logged in as yourself when you do this. 

Click [this link](https://github.com/DavidGoldLab/Gold_Lab_Training) to go to the GitHub page for our lab.

The fork button is in the upper right. Just click the button, and you now have a **fork** of the `Gold_Lab_Training` repository on your GitHub account.

### Cloning your fork to your local machine

Now you can clone your fork of the repository to your local machine. We will keep all of your material under version control in a directory called `git` in your home directory. Do the following on the command line.

    mkdir -p ~/git
    
Now you can clone *your forked copy of the repository* (not the original `DavidGoldLab/Gold_Lab_Training` repository). To do this, navigate your browser to the forked copy of the `Gold_Lab_Training` repository on your account (this is where clicking the "Fork" button took you in your browser). The browser URL will be `https://github.com/yourusername/Gold_Lab_Training`, and the top left of the website will say something like "`yourusername/Gold_Lab_Training` forked from `DavidGoldLab/Gold_Lab_Training`". Copy this URL.


Now, you can clone it by doing the following on the command line, making the obvious substitution for `the_url_you_just_copied`.
    
    cd ~/git
    git clone the_url_you_just_copied
    
You now have a local copy of your own fork of the `Gold_Lab_Training` repository. You can add files and edit it. When you commit and push, it will all be on your account, and the original repository will not see the changes.

### Syncing your forked repository to the upstream repository

As I mentioned before, you want to be able to sync your repository with the original `Gold_Lab_Training` repository so you can retrieve any updates in it. The original repository is typically called the **upstream repository**, since your fork derives from it, which puts you downstream. You want the upstream repository to be a **remote** repository, which is just what we call a repository we track, fetch from, and merge from. To see which repositories are remote, do the following on the command line.

    cd Gold_Lab_Training
    git remote -v

The `-v` just means "verbose," so it will also tell you the URLs. Entering that now will show a single repository, `origin`, which you can fetch from and push to. In your case, `origin` is your fork of the `Gold_Lab_Training` repository.

We now want to add the upstream repository. To do this, add the **original** repository as the upstream repository.

    git remote add upstream https://github.com/DavidGoldLab/Gold_Lab_Training

Now try doing `git remote -v`, and you will see that you are now also tracking the upstream repository.
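
To see the remote bookkeeping without touching GitHub, here is a minimal sketch that uses two empty local repositories as stand-ins: `upstream.git` plays the original `DavidGoldLab/Gold_Lab_Training` repository and `fork.git` plays your fork. Those two names, the scratch folder, and the use of `git init --bare` are assumptions made for illustration only. With real repositories you would use the GitHub URLs shown above.

```shell
scratch="$(mktemp -d)"
cd "$scratch"

# Local stand-ins for the two GitHub repositories (illustration only).
git init --quiet --bare upstream.git
git init --quiet --bare fork.git

# Clone the "fork"; Git warns that it is empty, which is expected here.
git clone --quiet fork.git Gold_Lab_Training
cd Gold_Lab_Training

git remote -v                                    # only origin is listed
git remote add upstream "$scratch/upstream.git"  # start tracking the original
git remote -v                                    # origin and upstream are listed

remotes="$(git remote)"                          # just the names, one per line
```

After the `git remote add` line, each remote appears twice in the `-v` output, once for fetching and once for pushing.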

If you ever want to delete your local copy of a repository, go to the `git` directory and remove the repository's folder using the recursive (`-r`) and force (`-f`) flags:

    cd ~/git/
    rm -rf Gold_Lab_Training

## Committing, Pushing and Pulling

Now that you've got a local copy and a copy on your GitHub account, there are four things that you'll need to know how to do in order to interact with your forked repository on GitHub:

- __Commit__ - This command records changes you have made to the repository. Think of it as a snapshot of the current status of the project. Commits are done locally (in other words, on your personal computer).
- __Push__ - This command "pushes" the recent commit from your local repository up to GitHub. If you're the only one working on a repository, pushing is fairly simple. If there are others accessing the repository, you may need to pull before you can push.
- __Pull__ - This command "pulls" any changes from the GitHub repository and merges them into your local repository.
- __Sync__ - syncing is like pulling, but instead of connecting to your GitHub copy of the forked repo, it goes back to the original repository and brings in any changes. Once you've synced your repository, you need to push those changes back to your GitHub account.
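
The four operations can be tried end to end without a network connection. The following is a rough sketch under several assumptions: local folders stand in for GitHub (`upstream` for the original repository, `fork.git` for your fork, `local` for your computer), the file names and the "Lab Maintainer" identity are made up, and syncing is done with `git fetch` followed by `git merge`, which is one common way to do it rather than the only one.

```shell
scratch="$(mktemp -d)"
cd "$scratch"

# --- Setup: stand-ins for the original repository and your fork ---
git init --quiet upstream
cd upstream
git config user.name "Lab Maintainer"            # hypothetical maintainer identity
git config user.email "maintainer@example.com"
echo "Lesson 1" > lessons.txt
git add lessons.txt
git commit --quiet -m "Add lesson 1."
git branch -M master                             # make sure the branch is called master
cd "$scratch"
git clone --quiet --bare upstream fork.git       # plays the role of your fork on GitHub
git clone --quiet fork.git local                 # plays the role of your local clone
cd local
git config user.name "Barbara McClintock"
git config user.email "your_email@example.com"
git remote add upstream "$scratch/upstream"

# --- Commit: record a snapshot locally ---
echo "Hello world" > Textfile.txt
git add Textfile.txt
git commit --quiet -m "Add Textfile.txt."

# --- Push: send the commit to your fork ---
git push --quiet origin master

# --- Pull: bring in anything new from your fork (nothing new here) ---
git pull --quiet origin master

# --- Sync: the maintainers update the original, and you bring that change in ---
(cd "$scratch/upstream" && echo "Lesson 2" >> lessons.txt && git commit --quiet -am "Add lesson 2.")
git fetch --quiet upstream
git merge --quiet -m "Merge upstream changes." upstream/master
git push --quiet origin master                   # pass the synced state on to your fork
```

At the end, `local` and `fork.git` both contain your `Textfile.txt` commit and the maintainers' second lesson.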


### mv: moving and renaming files

Suppose we try to rename the file `some sequence.fasta` to `some_sequence.fasta` by typing `mv some sequence.fasta some_sequence.fasta`. Uh-oh! That gives us some strange output, talking about the usage of `mv`. This is because the space in the file name `some sequence.fasta` was interpreted as a gap between arguments of the `mv` command. To specify that the space is part of the file name, we need to use an **escape character**. The escape character for macOS or Linux is `\`. This works (but don't do it just yet):

    mv some\ sequence.fasta some_sequence.fasta   # Don't do this

Because these files are under version control, you should precede the `mv` command with `git`. That way, Git will keep track of the naming changes you made. So, do this:

    git mv some\ sequence.fasta some_sequence.fasta
    
Now, we probably want this file in the `sequences` directory. We can also move files into directories (without changing their file names) using the `mv` command.

    git mv some_sequence.fasta sequences/

The trailing slash is not necessary, but I always include it out of habit to remind myself that I am moving a file to a directory.
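
If you would like to practice escaping before touching the real files, here is a minimal sketch in a throwaway folder. It assumes plain `mv` rather than `git mv`, because the scratch folder is not under version control, and the file names created with `touch` are placeholders. It also shows quoting, which is another standard way to protect a space in a file name.

```shell
scratch="$(mktemp -d)"
cd "$scratch"
mkdir sequences
# touch creates empty files; these names are placeholders.
touch "some sequence.fasta" "another sequence.fasta"

mv some\ sequence.fasta some_sequence.fasta          # backslash escapes the space
mv "another sequence.fasta" another_sequence.fasta   # quotes protect the space too
mv some_sequence.fasta sequences/                    # move into a directory, name unchanged

ls sequences
```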

Now let's go into the `sequences` directory and see what we have.

    cd sequences
    ls

We see that `some_sequence.fasta` is there, along with other FASTA files.

### Exploring file content

We would like to see what is in the sequence files. Bash offers various ways to display the content of files. We'll look at the genome of the dengue virus in the file `dengue.fasta`. There are lots of ways to do it. We'll start with `less`. It got its name because it is more feature-rich than `more`, which was used to look at files before `less` came to be. ("`less` is `more`," get it?) It lets you use the up and down arrow keys to move up or down by line. It also allows scrolling by touchpad or mouse. Since it doesn't require the whole file to be read before displaying the top content, it's ideal for larger files. It also supports searching, initiated by "/" followed by the query; `shift+g` will go to the end of the file; `g` will go to the beginning; and you can jump to a specific line by typing the line number followed by `g`.

    less dengue.fasta
    
To exit `less` or `more`, hit `Q`.

We'll now look at several other ways to look at files.  Just substitute them for `less` in the above command.

#### cat
`cat` prints the entire file to the standard output (terminal). This is especially useful if the files are very small.

#### head
`head` just prints the top lines of the file (the first 10 by default) to the standard output. The number of lines can be changed:

    head -n 5 dengue.fasta

This will print the first 5 lines to the standard output.

#### tail
Like `head`, but for the last lines of the file.
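
The three commands are easy to compare on a tiny file. This is a minimal sketch, assuming a made-up six-line FASTA-style file called `example.fasta` (a placeholder, not the real `dengue.fasta`) written with `printf` into a throwaway folder.

```shell
scratch="$(mktemp -d)"
cd "$scratch"

# Hypothetical six-line FASTA-style file: one header line, five sequence lines.
printf '>example_sequence\nACGT\nGGCC\nTTAA\nCCGG\nATAT\n' > example.fasta

cat example.fasta                          # the whole file
first_two="$(head -n 2 example.fasta)"     # only the first two lines
last_one="$(tail -n 1 example.fasta)"      # only the last line

echo "$first_two"
echo "$last_one"
```

The `-n` flag works the same way for `tail` as it does for `head`.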

### Copying files and directories: cp

If you want to retain a copy of the folder/file in the original folder you can use the copy command `cp`. It works straightforwardly with files. Applied to directories it requires a **flag**: `cp -r`, meaning "recursive." A **flag** typically begins with a hyphen (`-`) and gives the command some extra directions on how you want to do things. In this case, we are telling `cp` to work recursively.

Let's have a look at the `cp` command in action.

    cp dengue.fasta copy_of_dengue.fasta

Maybe we want a copy of the entire `sequences` directory. To do that, we will `cd` one directory up to the `command_line_tutorial` directory.

    cd ../

We went up one directory using `../`. This is an example of a **relative path**. The current directory is "`./`", "`../../`" is two directories up, "`../../../`" is three directories up, and so on. This is very useful when navigating directory structures. Now let's try copying an entire directory with the `-r` flag.

    cp -r sequences copy_of_sequences

We can also rename directories with the `mv` command. Let's rename `copy_of_sequences` to `sequences_copy`. This is silly, but illustrates how things work.

    mv copy_of_sequences sequences_copy
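
The copy-and-rename steps above can be rehearsed safely in a throwaway folder. This is a minimal sketch, assuming a stand-in `sequences` directory that holds a one-line placeholder file instead of the real dengue genome.

```shell
scratch="$(mktemp -d)"
cd "$scratch"
mkdir sequences
echo ">placeholder" > sequences/dengue.fasta   # stand-in file, not the real genome

cd sequences
cp dengue.fasta copy_of_dengue.fasta     # copy a single file
cd ../                                   # relative path: up one directory
cp -r sequences copy_of_sequences        # copy a whole directory
mv copy_of_sequences sequences_copy      # rename the copied directory

ls sequences_copy
```

The listing shows both FASTA files, since the directory was copied after the file copy was made.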

### Removing files and directories with rm

Yes, some of the things we just did are silly. We have no need for having a copy of a given sequence or a copy of the whole sequences directory. We can clean things up by deleting them. First, let's get rid of our copy of the dengue sequence. Let's `cd` into the sequences directory and make sure it's there.

    cd sequences
    ls

Now let's remove the file and verify it is gone.

    rm copy_of_dengue.fasta
    ls

And poof, it's gone! And I mean gone. It is pretty much irrecoverable. **Warning**: `rm` is a wrecking ball. It will destroy any files you have that do not have restrictive permissions. This is so important, I will say it again.

<center><font color="red"><b>rm is unforgiving</b></font></center><br />

Therefore, I always like to use the `-i` flag, which means that `rm` will ask me if I'm sure before deletion.

    rm -i some_sequence.fasta

You will get a prompt. Answer "`n`" if you do not want to delete it.

Now, let's use `rm` to remove an entire directory. To do this, we need to use the `-r` flag.

    rm -r sequences_copy
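
Because `rm` cannot be undone, a throwaway folder is a good place to try it. Below is a minimal sketch with placeholder file names. Since a script cannot type at the `rm -i` prompt, it assumes the answer "n" is fed in through a pipe, which stands in for you pressing `n` at the keyboard.

```shell
scratch="$(mktemp -d)"
cd "$scratch"
mkdir sequences_copy
touch sequences_copy/a.fasta copy_of_dengue.fasta keep_me.fasta   # placeholder files

rm copy_of_dengue.fasta                 # gone immediately, no questions asked
printf 'n\n' | rm -i keep_me.fasta      # answering "n" keeps the file
rm -r sequences_copy                    # removes the directory and its contents

ls
```

Only `keep_me.fasta` should remain in the listing.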

### Aliases (PowerShell users, skip this section)

Yes, `rm` is a wrecking ball, but we can temper it using the `-i` flag. For safety, we would like `rm` to always ask us about deletion. We can instruct `bash` to do this for us by creating an **alias**.

    alias rm="rm -i"

After executing this, any time we use `rm`, `bash` will instead execute `rm -i`, thereby keeping us out of trouble.

One of my favorite aliases is to make `ls` list things more prettily.

    alias ls="ls -FG"

The `-F` flag makes `ls` put a slash at the end of directories. This helps us tell the difference between files and directories. The `-G` flag enables coloring of the output, also useful for differentiating file types.
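
An alias defined this way lasts only for the current terminal session. You can inspect an alias by passing its name to `alias`, and remove it with `unalias`. A minimal sketch follows; the variable name `shown` is a placeholder. To keep an alias across sessions, the usual approach is to add the `alias` line to your shell's startup file (for bash, typically `~/.bashrc` or `~/.bash_profile`, depending on your setup).

```shell
alias rm="rm -i"          # define the alias for this session
shown="$(alias rm)"       # ask the shell what rm now stands for
echo "$shown"
unalias rm                # remove the alias again
```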

### Updating your bootcamp directory

You have now updated the name of the file `some_sequence.fasta`. Git kept track of that, so you should **commit** and push your change. We will talk more about Git later in the bootcamp. For now, do the following commands to commit your change and then **push** it to the **master branch** of your fork (`origin`).

    git commit -m "Changed file name of some_sequence.fasta."
    git push origin master