One of the most useful ways to use Git is in conjunction with [GitHub](https://github.com/), a website built on Git that adds a familiar graphical interface. Using Git with GitHub allows us to push our code to remote repositories. This enables us to:

* Share our code with others and build a portfolio.
* Collaborate with others on a project and build code together.
* Download and use code others have created.

Remote repositories aren't just useful for building a portfolio. Pushing to GitHub also allows us to collaborate with others on code. For example, thousands of different [contributors](https://github.com/torvalds/linux/graphs/contributors) are developing [Linux](https://github.com/torvalds/linux) on GitHub. Many companies, including [Google](https://github.com/google) and [Facebook](https://github.com/facebook), also use GitHub to work on code projects across teams.

Remote repositories also enable us to access and use code we didn't write. For instance, [this repo](https://github.com/amzn/amazon-dsstne) will let us download Amazon's Deep Learning tools and start training models. Because the repository is public, anyone can download and use it. Repositories on GitHub can also be private, in which case they're hidden and not accessible to others.

To download a remote repository to our own computer, we'll need to clone it. Cloning copies a repository from one location (in this case, a remote one) to a folder on our computer. The repository retains all of its Git history, and we can work with it just like we would with a Git repository we created ourselves.

We use the [git clone](https://git-scm.com/docs/git-clone) command to clone a remote repository. If we were cloning a repository we found on GitHub, we'd specify the GitHub URL for that repository.

Here's how we'd typically clone the [Amazon Deep Learning repo](https://github.com/amzn/amazon-dsstne) from GitHub:

git clone https://github.com/amzn/amazon-dsstne.git

https://github.com/amzn/amazon-dsstne.git is the URL for the Git repository we're cloning. Our clone command will automatically create a folder named amazon-dsstne in our current folder, and place the repository there.

However, because we're working with a simplified remote repository for the purposes of this file, we'll clone it a bit differently:

git clone /waqas/user/git/chatbot

This will clone the repository from /waqas/user/git/chatbot, a path on our local computer, to our current folder, and place it in a subfolder named chatbot.

If we specify a second argument to git clone, we can change the folder the repository saves to:

git clone /waqas/user/git/chatbot silentbot

This command will place the chatbot repository in a folder called silentbot.
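
The two forms of git clone described above can be tried safely in a scratch directory. The following is a minimal sketch, not the setup used in this lesson: the temporary folder, the stand-in chatbot repo, and the Example User identity are all assumptions made for illustration.

```shell
set -e
# Hypothetical scratch area; nothing here touches a real project.
work=$(mktemp -d)

# Build a tiny stand-in for the "remote" chatbot repo.
mkdir -p "$work/remote/chatbot"
cd "$work/remote/chatbot"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the main branch master, as in this text
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
echo "README for chatbot" > README.md
git add README.md
git commit -q -m "Add the initial version of README.md"

# Clone it twice: once with the default folder name, once renamed.
mkdir "$work/local"
cd "$work/local"
git clone -q "$work/remote/chatbot"            # creates ./chatbot
git clone -q "$work/remote/chatbot" silentbot  # creates ./silentbot

ls   # lists chatbot and silentbot
```

Both folders contain the same repository and the same history; only the folder name differs.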

Now that we've cloned a repository, we can make changes to it, just like we did in the last file. We'll be able to edit files, add them to the staging area, and then commit the changes. The local version of the repo will then reflect the changes, but the remote version won't.

Review the following diagram carefully. It illustrates the relationship between the local repo and the remote repo, and how they're separate:

![image.png](attachment:image.png)

After making the commit in the diagram, the local repo will have one more commit than the remote repo, and the file README.md will be different.

Most GitHub projects include a README.md file, which helps people understand what the project is about and how to install it. It's common to write the README file in [Markdown](https://daringfireball.net/projects/markdown/syntax) format, which allows us to create lists and other complex but useful structures in plain text. The Markdown format has an .md file extension.

Similar to the diagram above, we'll edit the README.md file to add a line, then commit it to the repository. It's important to add informative messages when committing to shared repositories, so that other people can figure out what each commit is doing without having to read through the code. This is very important when debugging code that multiple people are working on.

**Task:**

* cd into the chatbot folder to navigate to the chatbot repo.
* Add the line **This project needs no installation** to the bottom of README.md.
* Add our changes to the staging area using git add.
* Commit our changes using git commit, with the commit message Updated README.md.
* Run git status to see the status of the repo.

**Answer**

* cd chatbot
* printf "This project needs no installation\n" >> README.md (echo would also work in place of printf)
* git add README.md
* git commit -m "Updated README.md"
* git status

When we ran git status above, our output looked something like this:

![image.png](attachment:image.png)

The first two lines mention the terms branch, master, and origin. Every Git repository consists of one or more branches. Each branch contains a slightly different version of the code. The important fact to know is that the main branch of a Git repo is typically called master. Developers create separate branches when they want to work on new features for a project, then add the commits in those branches back into master when the features are ready.

All of the changes we've made so far have been on the master branch of the chatbot repo. The master branch is usually the most up-to-date shared version of any code project.

We can check which branch we're on with the [git branch](https://git-scm.com/docs/git-branch) command. This command will list all of the branches in the repo. It will also highlight the currently active branch, and add an asterisk next to its name.
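
As a rough illustration of what git branch reports, here is a small sketch run in a throwaway repo. The scratch folder, the placeholder identity, and the branch name new-feature are assumptions for this example only.

```shell
set -e
work=$(mktemp -d)                          # hypothetical scratch folder
cd "$work"
git init -q chatbot
cd chatbot
git symbolic-ref HEAD refs/heads/master    # name the main branch master, as in this text
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "README for chatbot" > README.md
git add README.md
git commit -q -m "Add the initial version of README.md"

git branch                                 # prints: * master
git branch new-feature                     # hypothetical second branch
git branch                                 # lists both; the asterisk stays on master

# The active branch name can also be read directly.
current=$(git rev-parse --abbrev-ref HEAD)
echo "Active branch: $current"
```

Creating a branch with git branch does not switch to it, which is why the asterisk remains next to master.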

Once we've made changes to the local version of a repo, we can push those changes to the remote repo so that everyone can see them. Edits we make locally are only reflected in our local repo. Unless we push them to the remote, the remote repo doesn't change.

To do this, we'll need to use the [git push](https://git-scm.com/docs/git-push) command, which pushes commits from our local repo to the remote repo. Here's a diagram showing what happens when we run git push:

![image.png](attachment:image.png)

As the diagram shows, until we push the branch to the remote repo, the changes are only in our local repo. Pushing to the remote will update the remote with our latest changes. Anyone else who pulls from the remote repo will then have access to the same two commits that we have in our local repo.

When we run git push, we need to specify both the name of the remote repo to push to, and the name of the branch we're pushing. When we clone a repo, Git automatically names the remote repo origin. This means that the following command will push the master branch to the remote repo:

**git push origin master**

It's possible, but rare, that a remote will have a name other than origin. If we're unsure, we can list the remote(s) associated with our local repo using [git remote](https://git-scm.com/docs/git-remote).

The git remote command will list all of the repo's remotes. If we specify the -v option (git remote -v), we'll get additional information about where the remote repos are located.
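
The push workflow can be sketched end to end without GitHub. In this example a bare repository on disk stands in for the remote; that choice, the scratch paths, and the placeholder identity are assumptions for illustration, not part of the lesson's environment.

```shell
set -e
work=$(mktemp -d)                                   # hypothetical scratch folder

# A bare repository plays the role of the remote.
git init -q --bare "$work/remote.git"
git -C "$work/remote.git" symbolic-ref HEAD refs/heads/master

# Cloning automatically names the remote "origin".
git clone -q "$work/remote.git" "$work/chatbot"
cd "$work/chatbot"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"                 # placeholder identity
git config user.email "user@example.com"

echo "This project needs no installation" > README.md
git add README.md
git commit -q -m "Updated README.md"

git remote          # prints: origin
git remote -v       # also shows the fetch and push locations

# Until this point the commit exists only locally.
git push -q origin master

# The remote now holds the same commit as the local repo.
git -C "$work/remote.git" --no-pager log --oneline master
```

A bare repository is used because Git, by default, refuses pushes into the checked-out branch of a regular repository.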

As we know, Git stores a repo's history as a series of commits. Each commit records what changed since the previous commit. This allows Git to store history very efficiently, and to replay that history to reconstruct the working directory, the folder on our computer where we edit files, add the changes, and then make commits. Commits are separate from the working directory. They're essentially snapshots of all of the files in the working directory at specific points in time.

We can see the full commit history of the master branch of the local chatbot repo with git log. Here's the output:

![image.png](attachment:image.png)

This history shows two commits: the first one with the message **Add the initial version of README.md**, and the second with the message **Updated README.md**. The great thing about Git is that it stores both commits, so we can quickly revert to a previous commit if we want to.

To do this, we'd need to use the commit's hash, or unique identifier. Hashes allow us to perform operations like reverting to a specific commit. We can find the hash for a commit in the output from git log. In the output we generated above, the first commit has the ID **8a1ca35dd5c5de8f93aa6cbbd153caa40233386c**, and the second commit has the ID **6a95e94ea10caa28013b767510d4bc59369d83fa**.

We can use the [git show](https://git-scm.com/docs/git-show) command with a hash to see what changed in a specific commit. For example, running **git show 6a95e94ea10caa28013b767510d4bc59369d83fa** would return:

![image.png](attachment:image.png)

This output indicates that someone changed the README.md file in this commit, and added the line **This project needs no installation!**. a/README.md is the file state before the commit, and b/README.md is the file state after the commit.

git show will allow us to scroll up and down and side to side. We can exit by typing q.
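
To see git log and git show working together, we could run something like the sketch below in a scratch repo. The hashes it produces will differ from the ones quoted above, and the folder and identity are placeholders chosen for this example.

```shell
set -e
work=$(mktemp -d)                          # hypothetical scratch folder
cd "$work"
git init -q chatbot
cd chatbot
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "README for chatbot" > README.md
git add README.md
git commit -q -m "Add the initial version of README.md"

echo "This project needs no installation" >> README.md
git add README.md
git commit -q -m "Updated README.md"

# List the history, newest commit first, one line per commit.
git --no-pager log --oneline

# Look up the newest commit's full hash and show what it changed.
newest=$(git rev-parse HEAD)
git --no-pager show "$newest"
```

The --no-pager option prints the output directly instead of opening the scrollable view, so there is nothing to exit with q.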

Let's take a closer look at the working directory and how it interacts with commits. The Git commit workflow has three main components:

* The working directory
* The staging area
* Commits

The working directory is the folder we're version controlling with Git, and the contents of the working directory are what we see when we list the contents of the folder with **ls**. In our case, **chatbot** is the working directory. We can edit the working directory by changing or adding files. So let's say our working directory looks like this:

![image.png](attachment:image.png)

In this example, we have one file named README.md in the working directory. There are no files in the staging area, and no commits.

When we run git add, Git adds the difference between the most recent commit and the current status of our working directory to the staging area, like this:

![image.png](attachment:image.png)

When we run git commit, we create a commit that contains all of the changes Git added to the staging area. The commit has a unique commit hash, so we can refer to it later. Note how making a commit removes all changes from the staging area:

![image.png](attachment:image.png)

We now have a commit with the hash 53d. This commit is a snapshot of the working directory at the moment it contained a file called README.md that had the text This is a README!.

Next, we can add a new file to the working directory, and edit README.md. This will only affect the working directory, where we're making changes -- not the remote:

![image.png](attachment:image.png)

Then we can use git add to stage our changes:
    
![image.png](attachment:image.png)

In this case, Git adds both the new file (bot.py) and the changed file to the staging area. Then we can commit the changes:

![image.png](attachment:image.png)
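
The walkthrough in the diagrams above can be traced with git status at each stage. This is a sketch under assumptions: the scratch repo, the placeholder identity, and the contents of bot.py are invented for illustration.

```shell
set -e
work=$(mktemp -d)                          # hypothetical scratch folder
cd "$work"
git init -q chatbot
cd chatbot
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "This is a README!" > README.md
git status --short                         # ?? README.md (working directory only)

git add README.md
git status --short                         # A  README.md (now in the staging area)

git commit -q -m "Add README.md"
git status --short                         # prints nothing: the staging area is empty

# Second round: edit one file and create another.
echo "More details to come." >> README.md
echo 'print("Hello from the bot")' > bot.py   # hypothetical file contents
git add README.md bot.py
staged=$(git diff --cached --name-only)       # names of the staged files
echo "$staged"

git commit -q -m "Add bot.py and update README.md"
```

Here git diff --cached lists what is staged but not yet committed, which is one way to inspect the staging area before running git commit.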

We now have two commits, each storing a snapshot of our working directory at a different point in time. We can pull up the difference between two commits with the [git diff](https://git-scm.com/docs/git-diff) command by passing the two commit hashes as arguments. To save typing, we can also write just the first few characters of a hash, as long as they uniquely identify the commit (Git requires at least four, and larger repos may need more). The order in which we pass the two hashes to git diff influences whether changes appear as deletions or additions.

We need to use q to exit git diff when we're done.

**Task:**

Use git diff to find the difference between your two commits.

**Answer:**

* `HASH=$(git rev-parse HEAD)`
* `HASH2=$(git rev-parse HEAD~1)`
* `git --no-pager diff $HASH2 $HASH`
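
The effect of argument order on git diff, and the use of a shortened hash, can be seen in a small sketch like the following. The scratch repo and placeholder identity are assumptions for this example.

```shell
set -e
work=$(mktemp -d)                          # hypothetical scratch folder
cd "$work"
git init -q chatbot
cd chatbot
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "README for chatbot" > README.md
git add README.md
git commit -q -m "Add the initial version of README.md"

echo "This project needs no installation" >> README.md
git add README.md
git commit -q -m "Updated README.md"

old=$(git rev-parse HEAD~1)
new=$(git rev-parse HEAD)

# Older commit first: the new line is reported as an addition (+).
git --no-pager diff "$old" "$new"

# Newer commit first: the same line is reported as a deletion (-).
git --no-pager diff "$new" "$old"

# An abbreviated hash works as long as it is unambiguous in this repo.
short=$(git rev-parse --short=4 HEAD)
git --no-pager diff "$old" "$short"
```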

Now that we know about commit hashes, we can use them to switch to a specific commit. Switching between commits allows us to quickly move between different historical versions of a project. If we introduce a change that causes issues and want to revert to an earlier version, for example, switching between commits will let us do so.

Commit hashes are permanent; Git preserves them and includes them in transfers between the local repo and the remote repo. For instance, let's say we have two commits, c12 and c53. The following diagram shows what happens to them as we clone, commit, and push.

![image.png](attachment:image.png)

c12 originally existed on the remote, and when we cloned the repo, the commit kept the same hash. This is because the commit is the same in the remote and our local repo: the same changes were made to the same files.

When we changed a file and made a commit locally, Git gave it the hash c53. When we pushed this commit to the remote later on, it kept the same hash because it was still the same commit. In the diagram above, both the local repo and the remote repo have two commits, c12 and c53. We can switch between commits in the local repo without changing what commits are in the remote repo. We can do this with the [git reset](https://git-scm.com/docs/git-reset) command:

![image.png](attachment:image.png)

The diagram shows the commit on the left, and a representation of our working directory on the right. If we type git reset --hard c12, Git switches back to the commit with the hash c12, and changes all of the files in the working directory so that they're exactly the same as the files in the commit. This will essentially let us rewind the repo to past commits if there are problems with more recent ones, or if we want to see what the project looked like at an earlier point in time.

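The --hard flag resets both the working directory and the Git history to a specific state. If we used the --soft flag instead, Git would only reset the history and leave both the staging area and the working directory untouched. If we omitted the flag entirely, Git would use its default (--mixed) mode, which resets the history and the staging area but still skips the working directory.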
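
The difference between --soft and --hard is easiest to see by checking a file's contents after each reset. The sketch below assumes a scratch repo with two commits and a placeholder identity; none of it comes from the lesson's own environment.

```shell
set -e
work=$(mktemp -d)                          # hypothetical scratch folder
cd "$work"
git init -q chatbot
cd chatbot
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "README for chatbot" > README.md
git add README.md
git commit -q -m "Add the initial version of README.md"
first=$(git rev-parse HEAD)

echo "This project needs no installation" >> README.md
git add README.md
git commit -q -m "Updated README.md"
second=$(git rev-parse HEAD)

# --soft: history moves back, but README.md keeps the newer text.
git reset -q --soft "$first"
soft_last_line=$(tail -n 1 README.md)
echo "After --soft: $soft_last_line"

# Return to the newest commit, then rewind with --hard.
git reset -q --hard "$second"
git reset -q --hard "$first"
hard_last_line=$(tail -n 1 README.md)
echo "After --hard: $hard_last_line"
```

After the --soft reset the second line is still in the file, while after the --hard reset the file matches the first commit exactly.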

**Task**

1. Use the git log command to find the commit hash corresponding to the oldest commit in the chatbot repo.
2. Use git reset to reset the chatbot repo to the oldest commit.
3. Explore README.md and see what text it contains.

**Answer**

1. git log
2. `HASH=$(git rev-list --max-parents=0 HEAD)`
3. `git reset --hard $HASH`
4. cat README.md

Now that we've reverted our local chatbot repo to an older version, the remote repo actually has a newer commit that our local repo doesn't have. This often happens when other people make changes to a project's code, and then push those changes to a remote repo. Here's a diagram showing which commits exist in which locations:

![image.png](attachment:image.png)

When the latest commit in our local repo is older than the latest commit in the remote repo, we can use [git pull](https://git-scm.com/docs/git-pull) to update the current branch with the latest commits. The git pull command will also update our working directory so that it has the same files as the latest commit.

In our case, we'll be updating the master branch, because the chatbot repo only has a single branch.
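
The situation described above, where the local repo is behind the remote, can be reproduced and fixed with git pull in a sketch like this one. A bare repository on disk stands in for the remote; that, the scratch paths, and the placeholder identity are assumptions for illustration.

```shell
set -e
work=$(mktemp -d)                                   # hypothetical scratch folder

# A bare repository plays the role of the remote.
git init -q --bare "$work/remote.git"
git -C "$work/remote.git" symbolic-ref HEAD refs/heads/master

git clone -q "$work/remote.git" "$work/chatbot"
cd "$work/chatbot"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"                 # placeholder identity
git config user.email "user@example.com"

echo "README for chatbot" > README.md
git add README.md
git commit -q -m "Add the initial version of README.md"

echo "This project needs no installation" >> README.md
git add README.md
git commit -q -m "Updated README.md"
git push -q origin master

# Rewind the local repo so it is one commit behind the remote.
git reset -q --hard HEAD~1
git --no-pager log --oneline                        # shows only the first commit

# Pull the missing commit back from the remote.
git pull -q origin master
git --no-pager log --oneline                        # shows both commits again
```

Because the local history is simply behind the remote, the pull only has to move the local master branch forward and update README.md to match.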

When using Git, we'll often want to refer to the most recent commit. While we can use the full commit hash, that approach can be cumbersome. Fortunately, Git has a special variable called HEAD that always refers to the most recent commit in the current branch.

We can use the HEAD variable to switch to the most recent commit more easily. Let's say we modify a file and then want to undo our changes; running git reset --hard HEAD will revert the working directory to the state of the most recent commit.

We can also use shortcuts to get older commit hashes. HEAD~1 will get the second newest commit in the local repo, HEAD~2 will get the third newest commit, and so on. Here's a diagram of a local repo where 646 is the newest hash on the master branch, and c12 is the oldest:

![image.png](attachment:image.png)

We can use [git rev-parse](https://git-scm.com/docs/git-rev-parse) along with the HEAD variable to find the commit hash corresponding to a particular commit number. In the diagram above, git rev-parse HEAD will return 646, and git rev-parse HEAD~3 will return c53.
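
As a quick check of how HEAD~n counts backward, we could build a short history and resolve a few references. The sketch assumes a scratch repo with four commits, a hypothetical notes.txt file, and a placeholder identity.

```shell
set -e
work=$(mktemp -d)                          # hypothetical scratch folder
cd "$work"
git init -q chatbot
cd chatbot
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# Create four commits, oldest to newest.
for n in 1 2 3 4; do
  echo "line $n" >> notes.txt              # hypothetical file
  git add notes.txt
  git commit -q -m "Commit $n"
done

git rev-parse HEAD                         # hash of the newest commit
git rev-parse HEAD~1                       # hash of the second newest commit

# Show the message of the commit three steps back from the newest.
git --no-pager log --format=%s -1 HEAD~3   # prints: Commit 1

# In this four-commit history, HEAD~3 is also the oldest commit.
oldest=$(git rev-list --max-parents=0 HEAD)
```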

**Task**

Use git reset with the HEAD variable to reset the chatbot repo to the second most recent commit.

**Answer**

* git reset --hard HEAD~1
* git rev-parse HEAD