# Introduction to Git and GitHub
by Google

This course focuses on how to keep track of the different versions of your code and configuration files using **Version Control Systems** (VCS). A VCS lets us easily roll back when mistakes happen and also helps us collaborate with others.

In this course, we'll introduce you to a popular VCS called Git, and show you some of the ways you can use it. We'll also go through how to set up an account with the service called GitHub, so that you can create your very own remote repositories to store your code and configuration. By the end of this course, you'll be able to store your code's history in Git and collaborate with others on GitHub, where you'll also start creating your own portfolio.

**Finding out more information**

Throughout this course, we’ll teach you how to do a range of things with Git and GitHub. While we’ll provide a lot of information through videos and supplemental readings, sometimes you may need to look things up on your own, both now and throughout your career. Things change fast in IT, so it’s critical to do your own research to stay up-to-date on what’s new. We recommend you use your favorite search engine to find more information about the concepts we cover in this course; it’s great practice for the real world!

On top of search results, here are some great Git resources available online:

- Pro Git: This book (available online and in print) covers all the fundamentals of how Git works and how to use it. Refer to it if you want to learn more about the subjects that we cover throughout the course.
- Git tutorial: This tutorial includes a very brief reference of all Git commands available. You can use it to quickly review the commands that you need to use.


## Week 1

### Before Version Control

**Keeping Historical Copies**

The goal of a version control system is to keep track of changes made to our files.

**Diffing Files (how to compare two versions of a file over time)**

We can use the `diff` command. Syntax:

    diff -u file1 file2

The `diff -u` command shows additions and removals line by line. Another option is

    wdiff file1 file2

which highlights the words that have changed in a file instead of working line by line. To help us even more, there are a bunch of graphical tools that display files side by side and highlight the differences using color. Some examples include:

- `meld`
- `KDiff3`
- `vimdiff`


**Applying changes (Now, how to apply)**

Imagine a colleague sends you a script with a bug and asks you to help fix the issue. You find the error and fix it. Then, instead of sending the whole script back, you can send just a .diff file. With the .diff file, your colleague will be able to see exactly what changed. Syntax to create the .diff file:

    diff -u old_file new_file > change.diff

Remember that `>` redirects the output of the command on its left into the file on its right. Your colleague can then apply the changes using the `patch` command. Syntax:

    patch old_file < change.diff

*diff and patch Cheat Sheet*

    diff

`diff` is used to find differences between two files. On its own, its output is a bit hard to read; instead, use `diff -u` to find the lines that differ between two files:

    diff -u

`diff -u` compares two files line by line and shows the differing lines together in a single unified output, with removed lines prefixed by `-` and added lines prefixed by `+`. See below:

    ~$ cat menu1.txt
    Menu1:

    Apples
    Bananas
    Oranges
    Pears

    ~$ cat menu2.txt
    Menu:

    Apples
    Bananas
    Grapes
    Strawberries

    ~$ diff -u menu1.txt menu2.txt
    --- menu1.txt   2019-12-16 18:46:13.794879924 +0900
    +++ menu2.txt   2019-12-16 18:46:42.090995670 +0900
    @@ -1,6 +1,6 @@
    -Menu1:
    +Menu:
     
     Apples
     Bananas
    -Oranges
    -Pears
    +Grapes
    +Strawberries

`patch` 

Patch is useful for applying file differences. See the below example, which compares two files. The comparison is saved as a .diff file, which is then patched to the original file!

    ~$ cat hello_world.txt
    Hello World
    ~$ cat hello_world_long.txt
    Hello World

    It's a wonderful day!
    ~$ diff -u hello_world.txt hello_world_long.txt
    --- hello_world.txt     2019-12-16 19:24:12.556102821 +0900
    +++ hello_world_long.txt        2019-12-16 19:24:38.944207773 +0900
    @@ -1 +1,3 @@
     Hello World
    +
    +It's a wonderful day!
    ~$ diff -u hello_world.txt hello_world_long.txt > hello_world.diff
    ~$ patch < hello_world.diff
    patching file hello_world.txt
    ~$ cat hello_world.txt
    Hello World

    It's a wonderful day!

There are some other interesting `patch` and `diff` options, such as `patch -p1` and `diff -r`.

Check them out in the following references:

- http://man7.org/linux/man-pages/man1/diff.1.html
- http://man7.org/linux/man-pages/man1/patch.1.html
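
As a minimal sketch of the two options mentioned above, the following compares two directory trees with `diff -ru` and applies the result with `patch -p1`. The directory and file names (`old`, `new`, `docs/readme.txt`) are made up for illustration and are not part of the course material.

```shell
set -eu
# Work in a throwaway directory; all names below are hypothetical.
work=$(mktemp -d)
cd "$work"
mkdir -p old/docs new/docs
printf 'Hello World\n' > old/docs/readme.txt
printf 'Hello World\n\nIt is a wonderful day!\n' > new/docs/readme.txt

# -r recurses into subdirectories, -u produces unified output.
# diff exits with status 1 when it finds differences, hence "|| true".
diff -ru old new > tree.diff || true

# Paths in the diff start with "old/" and "new/"; -p1 strips that first
# path component so the patch can be applied from inside old/.
(cd old && patch -p1 < ../tree.diff)

# No output from diff -r means the two trees are now identical.
diff -r old new && echo "trees match"
```

The `-p1` form is the one you will usually need for patches that were generated from the top of a project directory.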

### Version Control Systems

**What is version control?**

We've seen up till now how we can use existing tools to extract differences between versions of files and apply those changes back to the original files. Those tools are really useful. But most of the time, we won't be using them directly. Instead, we'll use them through a Version Control System, or VCS. A Version Control System keeps track of the changes that we make to our files. By using a VCS, we can know when the changes were made and who made them. It also lets us easily revert a change if it turned out not to be a good idea. It makes collaboration easier by allowing us to merge changes from lots of different sources.

- With a VCS, we can make edits to multiple files and treat that collection of edits as a single change, which is commonly known as a commit. A VCS even provides a mechanism to allow the author of a commit to record why the change was made, including what bugs, tickets or issues were fixed by the change.

**What is Git?**

Git is a VCS created in 2005 by Linus Torvalds, the developer who started the Linux kernel.

Git is a free, open source system available for installation on Unix-based platforms, Windows and macOS. Linus originally created Git to help manage the task of developing the Linux kernel. This was difficult because a lot of geographically distributed programmers were collaborating to write a whole bunch of code. Linus had some requirements for the way that the system worked, and for its performance, that weren't being met by the VCS tools at the time. So he decided to write his own. Git is now one of the most popular version control systems out there and is used in millions of projects. Unlike some version control systems that are centralized around a single server, Git has a distributed architecture. This means that every person contributing to a repository has a full copy of the repository on their own development machine.

- When looking for information online, you might notice that the official Git website is called git-scm.com and wonder what the SCM at the end is for. It's actually another acronym similar to VCS. It stands for Source Control Management. While both terms mean the same, we generally prefer VCS because, as we've called out already, these systems can be used to store much more than just source code.

In this course we chose Git for its popularity, multi-platform support and robust set of features. As with most things in the IT world, though, there are plenty of other tools that can be used to accomplish the same task. Other VCS programs include:

- Subversion
- Mercurial

**More Info about Git**

Check out the following links for more information:

- https://git-scm.com/doc
- https://www.mercurial-scm.org/
- https://subversion.apache.org/
- https://en.wikipedia.org/wiki/Version_control

**Installing Git**

The first step on the way to using Git is to install it! The directions in the Git documentation below are pretty thorough and helpful. Check them out for the best method of getting Git onto your platform of choice.

- Git download page
- Git installation instructions for each platform

To install Git on Ubuntu, run

    sudo apt-get install git-all

### Using Git

**First steps with Git**

Let's start by setting some basic configuration. Remember when we said that a VCS tracks who made which changes? For this to work, we need to tell Git who we are. We can do this by using the `git config` command to set the values of `user.email` and `user.name` to our email and our name, like this:

    git config --global user.email "me@example.com"
    git config --global user.name "my name"

- We can check our current configuration by using the `git config -l` command.
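
The configuration steps above can be tried out safely with a sketch like the following. It assumes Git is installed and deliberately omits `--global`, so the placeholder name and email are stored only in a throwaway repository instead of your real settings.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

# Without --global, the values are stored in this repository's .git/config
# only, so the sketch leaves your real global settings untouched.
git config user.email "me@example.com"
git config user.name "My Name"

# Read single values back, then list everything Git sees for this repo.
git config user.email
git config user.name
git config -l
```

Repository-level values take precedence over global ones, which is handy when one project needs a different email address from the rest.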

With that done, there are two ways to start working with a Git repository. We can create one from scratch using the

    git init

command, or we can use the

    git clone "address"

command to make a copy of a repository that already exists somewhere else. We'll talk about remote repositories later in the course. To initialize a repository inside an existing directory, identified here by "path" (for example, a project directory under `/home`), run:

    cd "path"
    git init
    
So when we run `git init`, we initialize an empty *Git repository* in the current directory. The message that we get mentions a directory called `.git`. We can check that this directory exists using the `ls -la` command, which also lists files that start with a dot. We can also use the `ls -l .git` command to look inside of it and see the many different things it contains. This is called a *Git directory*. You can think of it as a database for your Git project that stores the changes and the change history. We can see it contains a bunch of different files and directories. We won't touch any of these files directly; we'll always interact with them through Git commands. So whenever you clone a repository, this Git directory is copied to your computer. Whenever you run `git init` to create a new repository like we just did, a new Git directory is initialized.

The area outside the *Git directory* is the *working tree*. The *working tree* is the current version of your project. You can think of it like a workbench or a sandbox where you perform all the modifications you want to your files. This working tree will contain all the files that are currently tracked by Git and any new files that we haven't yet added to the list of tracked files.

- The Git directory contains all the changes and their history, and the working tree contains the current versions of the files.
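
To see the split between the Git directory and the working tree for yourself, a small sketch along these lines may help. It assumes Git is installed; the file name `hello.py` is only an illustration.

```shell
set -eu
project=$(mktemp -d)
cd "$project"
git init -q

# The Git directory: created by git init, managed only through Git commands.
ls -la          # shows the hidden .git entry
ls -l .git      # HEAD, config, objects/, refs/, ...

# The working tree: everything outside .git, where we edit files.
echo "print('hello')" > hello.py    # hypothetical file, not yet tracked
git rev-parse --git-dir             # prints the location of the Git directory
git rev-parse --show-toplevel       # prints the root of the working tree
```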

![image.png](attachment:image.png)

To make Git track our file, we'll add it to the project using the

    git add
    
command, passing the file that we want as a parameter. With that, we've added our file to the *staging area*. The staging area, also known as the index, is a file maintained by Git that contains all of the information about which files and changes are going to go into your next commit. We can use the

    git status

command to get some information about the current working tree and pending changes.

To get it committed into the `.git` directory, we run the 

    git commit

command. When we run this command, we tell Git that we want to save our changes. It opens a text editor where we can enter a commit message.
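
Put together, the add, status and commit steps might look like the sketch below. The file name and identity are placeholders, and `-m` is used here only so that the example runs without opening an editor.

```shell
set -eu
project=$(mktemp -d)
cd "$project"
git init -q
git config user.email "me@example.com"   # placeholder identity, local to this repo
git config user.name "My Name"

echo "print('hello')" > hello.py   # hypothetical file
git status                         # hello.py is listed as untracked
git add hello.py                   # record its current content in the staging area
git status                         # hello.py now appears under "Changes to be committed"

# -m supplies the message on the command line instead of opening an editor.
git commit -q -m "Add hello.py"
git status                         # nothing left to commit
```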


**Tracking Files**

- We mentioned that any Git project will consist of three sections. 
    - Git directory, 
    - the working tree,
    - and the staging area.

- The *Git directory* contains the history of all the files and changes. The *working tree* contains the current state of the project, including any changes that we've made. And the *staging area* contains the changes that have been marked to be included in the next commit.

![image-2.png](attachment:image-2.png)

![image-3.png](attachment:image-3.png)

![image-4.png](attachment:image-4.png)

![image-5.png](attachment:image-5.png)

We can check which files are in each area using the `git status` command.
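
As a rough illustration of how a file moves between these sections, the sketch below uses the compact `git status --short` output, where the two-character code in front of each file name shows its state. The file name and identity are placeholders.

```shell
set -eu
project=$(mktemp -d)
cd "$project"
git init -q
git config user.email "me@example.com"   # placeholder identity
git config user.name "My Name"

echo "first draft" > notes.txt     # hypothetical file
git status --short                 # "?? notes.txt": untracked, only in the working tree

git add notes.txt
git status --short                 # "A  notes.txt": staged for the next commit

git commit -q -m "Add notes"
git status --short                 # no output: working tree matches the last commit

echo "second draft" >> notes.txt
git status --short                 # " M notes.txt": modified but not yet staged

git add notes.txt
git status --short                 # "M  notes.txt": modification staged
```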

**Anatomy of a Commit Message**

A commit message is generally broken up into a few sections. The first line is a short summary of the commit followed by a blank line. This is followed by a full description of the changes which details why they're necessary and anything that might be especially interesting about them or difficult to understand. When you run the git commit command, Git will open up a text editor of your choice so you can write your commit message. A good commit message might look something like this.
 
- A short description of the change (first line up to 50 characters), followed by one or more paragraphs giving more details of the change (if needed).

We can check the history of the commits of our project using the

    git log

command. The output packs a lot of information into just a few lines. The first thing listed for each commit is its *identifier*, which is a long string of letters and numbers that uniquely identifies each commit. The first commit in the list also says that the *HEAD indicator* is pointing to the master branch. If this is gibberish to you, don't worry. We'll talk more about what HEAD and master mean in later videos. For each commit, we see the *name and the email of the person who made the commit*, who is indicated as the author. Then we get the *date and time* the commit was made. Finally, the *commit message* is displayed.
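
The following sketch builds a two-commit history and prints it in a few different ways. The `--oneline` and `--format` variants are additions for illustration and are not covered in the text above; the identity and file name are placeholders.

```shell
set -eu
project=$(mktemp -d)
cd "$project"
git init -q
git config user.email "me@example.com"   # placeholder identity
git config user.name "My Name"

echo "one" > file.txt
git add file.txt
git commit -q -m "Add file.txt"
echo "two" >> file.txt
git commit -q -a -m "Extend file.txt"

# Full entries, newest first: identifier, author, date, message.
# --no-pager keeps the output from opening in a pager.
git --no-pager log

# One line per commit: abbreviated identifier and summary line.
git --no-pager log --oneline

# Pick individual fields with a format string:
# %h = short identifier, %an = author name, %ae = author email, %s = summary.
git --no-pager log --format='%h %an <%ae> %s'
```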

**Initial Git Cheat Sheet**

Check out the following resources for more information:

- For advice on writing commit messages: the Linux kernel documentation itself, as well as impassioned opinions from other developers.
- For configuring the email address attached to your commits: "Setting your email in Git" and "Keeping your email address private" on the GitHub help site.
- The `git commit` command with the `-m` flag takes the commit message on the command line, instead of opening the editor as the command without the flag does. If multiple `-m` flags are given, their values are concatenated as separate paragraphs.
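
A short sketch of the multiple `-m` behaviour follows: the first value becomes the summary line and the second becomes a separate body paragraph. The file and the message text are invented for the example.

```shell
set -eu
project=$(mktemp -d)
cd "$project"
git init -q
git config user.email "me@example.com"   # placeholder identity
git config user.name "My Name"

echo "timeout = 30" > settings.conf       # hypothetical file
git add settings.conf

# First -m: summary line. Second -m: a separate paragraph with the details.
git commit -q \
    -m "Add default timeout setting" \
    -m "Requests previously waited forever when the server was down."

# %B prints the raw message: summary, blank line, then the body paragraph.
git --no-pager log -1 --format=%B
```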

## Week 3 - Working with Remotes

### Introduction to GitHub

**What is GitHub**

GitHub is a web-based Git repository hosting service. On top of the version control functionality of Git, GitHub includes extra features like bug tracking, wikis, and task management. GitHub lets us share and access repositories on the web and copy or clone them to our local computer, so we can work on them. GitHub is a popular choice with a robust feature set, but it's not the only one. Other services that provide similar functionality are BitBucket, and GitLab.

- GitHub can be described in a few words as a remote repository hosting service for Git.

**Basic Interaction with GitHub**

There are various remote repository hosting sites:

- [GitHub](https://github.com/)
- [BitBucket](https://bitbucket.org/product)
- [GitLab](https://about.gitlab.com/)

Follow the workflow at https://github.com/join to set up a free account, username, and password. After that, [these steps](https://docs.github.com/en/free-pro-team@latest/github/getting-started-with-github/create-a-repo) will help you create a brand new repository on GitHub.

Some useful commands for getting started:

- `git clone URL`: [Git clone is used to clone a remote repository into a local workspace](https://git-scm.com/docs/git-clone)
- `git push`: [Git push is used to push commits from your local repo to a remote repo](https://git-scm.com/docs/git-push)
- `git pull`: [Git pull is used to fetch the newest updates from a remote repository](https://git-scm.com/docs/git-pull). This can be useful for keeping your local workspace up to date.

For help with authenticating to GitHub, see:

- https://help.github.com/en/articles/caching-your-github-password-in-git
- https://help.github.com/en/articles/generating-an-ssh-key
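
The three commands can be tried without a GitHub account by letting a local bare repository stand in for the hosted one. This is only a sketch under that assumption: `remote.git`, `laptop` and `desktop` are invented names, and on GitHub the clone address would be an HTTPS or SSH URL instead of a local path.

```shell
set -eu
# Placeholder identity for every commit made in this script.
export GIT_AUTHOR_NAME="My Name" GIT_AUTHOR_EMAIL="me@example.com"
export GIT_COMMITTER_NAME="My Name" GIT_COMMITTER_EMAIL="me@example.com"

work=$(mktemp -d)
cd "$work"

# Stand-in for the hosted repository: a seed repo with one commit,
# copied into a bare repository that plays the role of the remote.
git -c init.defaultBranch=master init -q seed
(cd seed && echo "# Demo" > README.md && git add README.md && git commit -q -m "Add README")
git clone -q --bare seed remote.git

# git clone: here the "URL" is just a local path.
git clone -q remote.git laptop
git clone -q remote.git desktop

# git push: publish a commit made on the laptop.
(cd laptop && echo "More text" >> README.md && git commit -q -a -m "Extend README" && git push -q origin master)

# git pull: bring the desktop copy up to date.
(cd desktop && git pull -q origin master)
```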


### Using a Remote Repository

**What is a remote?**

When working with a remote, we'll still modify, stage, and commit our local changes. After committing, we'll fetch any new changes from the remote repo, manually merge if necessary, and only then push our changes to the remote repo. Git supports a variety of ways to connect to a remote repository. Some of the most common are using the HTTP, HTTPS and SSH protocols and their corresponding URLs. HTTP is generally used to allow read-only access to a repository. In other words, it lets people clone the contents of your repo without letting them push new contents to it. Conversely, HTTPS and SSH both provide methods of authenticating users, so you can control who gets permission to push.

It's a good idea to control who can push code to repos and to make sure you give access only to people you trust. Web services like GitHub offer a bunch of different mechanisms to control access to repositories. Some of these are available to the general public, while others are only available to enterprise users.
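
To make the URL forms concrete, the sketch below registers a remote with an HTTPS URL and then switches it to the SSH form. The repository address (`example-user/example-repo`) is hypothetical, and no network connection is made because nothing is fetched or pushed.

```shell
set -eu
project=$(mktemp -d)
cd "$project"
git init -q

# Hypothetical repository address; nothing is contacted until we fetch or push.
git remote add origin https://github.com/example-user/example-repo.git
git remote -v

# Switch the same remote to the SSH form of the URL.
git remote set-url origin git@github.com:example-user/example-repo.git
git remote -v
```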

**Working with Remotes**

If we want to get even more information about our remote, we can call

    git remote show origin

There's a ton of information here, and we don't need all of it right now. We can see the fetch and push URLs that we saw before, and the local and remote branches too.

We could have a look at the remote branches that our Git repo is currently tracking by running 

    git branch -r
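
A sketch of these inspection commands, again assuming a local bare repository (`remote.git`, an invented name) in place of a hosted one:

```shell
set -eu
# Placeholder identity for the commit made in this script.
export GIT_AUTHOR_NAME="My Name" GIT_AUTHOR_EMAIL="me@example.com"
export GIT_COMMITTER_NAME="My Name" GIT_COMMITTER_EMAIL="me@example.com"

work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q seed
(cd seed && echo "# Demo" > README.md && git add README.md && git commit -q -m "Add README")
git clone -q --bare seed remote.git
git clone -q remote.git local
cd local

git remote -v              # fetch and push URLs for origin
git remote show origin     # URLs plus tracked remote and local branches
git branch -r              # remote-tracking branches such as origin/master
```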
    
**Fetching New Changes**

To sync the data, we use the 

    git fetch
    
command. This command copies the commits made in the remote repository to our remote branches, so we can see what other people have committed.
If we want to integrate those changes into our master branch, we can perform a merge operation, which merges the `origin/master` branch into our local master branch. To do that, we'll call

    git merge origin/master
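
The fetch-then-merge sequence can be sketched with two clones of a local stand-in remote. The names `alice`, `bob` and `remote.git` are invented for the example.

```shell
set -eu
# Placeholder identity for every commit made in this script.
export GIT_AUTHOR_NAME="My Name" GIT_AUTHOR_EMAIL="me@example.com"
export GIT_COMMITTER_NAME="My Name" GIT_COMMITTER_EMAIL="me@example.com"

work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q seed
(cd seed && echo "line 1" > notes.txt && git add notes.txt && git commit -q -m "Add notes")
git clone -q --bare seed remote.git
git clone -q remote.git alice
git clone -q remote.git bob

# A collaborator publishes a new commit.
(cd alice && echo "line 2" >> notes.txt && git commit -q -a -m "Add line 2" && git push -q origin master)

cd bob
git fetch -q                                          # updates origin/master only
git --no-pager log --oneline master..origin/master    # commits we do not have yet
git merge -q origin/master                            # fast-forwards local master
```

Between the fetch and the merge, the local master branch is untouched, which gives us a chance to review the incoming commits first.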

**Updating the Local Repository**

Since fetching and merging are so common, Git gives us the

    git pull
    
command that does both for us. Running git pull will fetch the remote copy of the current branch and automatically try to merge it into the current local branch.


**Git Remotes Cheat-Sheet**

- `git remote`: [Lists remote repos](https://git-scm.com/docs/git-remote)
- `git remote -v`: [List remote repos verbosely](https://git-scm.com/docs/git-remote#Documentation/git-remote.txt--v)
- `git remote show <name>`: [Describes a single remote repo](https://git-scm.com/docs/git-remote#Documentation/git-remote.txt-emshowem)
- `git remote update`: [Fetches the most up-to-date objects](https://git-scm.com/docs/git-remote#Documentation/git-remote.txt-emupdateem)
- `git fetch`: [Downloads specific objects](https://git-scm.com/docs/git-fetch)
- `git branch -r`: [Lists remote branches](https://git-scm.com/docs/git-branch#Documentation/git-branch.txt--r); can be combined with other branch arguments to manage remote branches

You can also see more in the video Cryptography in Action from the course IT Security: Defense against the digital dark arts.

### Solving Conflicts

**The Pull-Merge-Push Workflow**

We saw earlier how we can use the `git push` command to send our changes to the remote repo. But what if, when we go to push our changes, there are new changes in the remote repo?

One thing to notice is that Git will try to do all possible automatic merges and only leave manual conflicts for us to resolve when the automatic merge fails. In this case, we can see that the other changes we made were merged successfully without intervention. Only the change that happened in the same line of the file needed our input. We fixed the conflict here, and the file is short enough that we can very quickly check that there are no other conflicts. For larger files, it might make sense to search the whole file for the conflict markers, such as `>>>>>>>`. This lets us check that there are no unresolved conflicts left. Now that we've fixed the conflict, we can finish the merge.
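
The sketch below reproduces a small conflict of this kind and resolves it. It is an illustration with invented names (`alice`, `bob`, `theme.conf`) and a local bare repository in place of GitHub, not the example from the course video.

```shell
set -eu
# Placeholder identity for every commit made in this script.
export GIT_AUTHOR_NAME="My Name" GIT_AUTHOR_EMAIL="me@example.com"
export GIT_COMMITTER_NAME="My Name" GIT_COMMITTER_EMAIL="me@example.com"

work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q seed
(cd seed && echo "color = red" > theme.conf && git add theme.conf && git commit -q -m "Add theme")
git clone -q --bare seed remote.git
git clone -q remote.git alice
git clone -q remote.git bob

# Both collaborators change the same line; Alice pushes first.
(cd alice && echo "color = blue" > theme.conf && git commit -q -a -m "Use blue" && git push -q origin master)
cd bob
echo "color = green" > theme.conf
git commit -q -a -m "Use green"

git fetch -q
# The merge stops with a conflict and a non-zero exit status.
git merge origin/master || true
grep -n -E '^(<<<<<<<|=======|>>>>>>>)' theme.conf   # locate the conflict markers

# Resolve by writing the final content, then stage and commit to finish the merge.
echo "color = green" > theme.conf
git add theme.conf
git commit -q -m "Merge origin/master, keep green theme"
git push -q origin master
```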


**Pushing Remote Branches**

The first time we push a branch to a remote repo, we need to add a few more parameters to the `git push` command. We'll need to add the `-u` flag to create the branch upstream, which is another way of referring to remote repositories. We'll also have to say that we want to push this to the origin repo, and that we're pushing the refactor branch: `git push -u origin refactor`.
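
A sketch of pushing a new branch for the first time, assuming a local bare repository as the remote; `app.py` and its content are placeholders.

```shell
set -eu
# Placeholder identity for every commit made in this script.
export GIT_AUTHOR_NAME="My Name" GIT_AUTHOR_EMAIL="me@example.com"
export GIT_COMMITTER_NAME="My Name" GIT_COMMITTER_EMAIL="me@example.com"

work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q seed
(cd seed && echo "# Demo" > README.md && git add README.md && git commit -q -m "Add README")
git clone -q --bare seed remote.git
git clone -q remote.git local
cd local

git checkout -q -b refactor
echo "def main(): pass" > app.py    # hypothetical file
git add app.py
git commit -q -m "Start refactor"

# -u records origin/refactor as the upstream of the local refactor branch.
git push -q -u origin refactor
git branch -vv      # shows [origin/refactor] next to refactor

# With the upstream recorded, later pushes need no extra arguments.
echo "# cleanup" >> app.py
git commit -q -a -m "Continue refactor"
git push -q
```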


**Rebasing Your Changes**

In our last video, we mentioned that once our branch has been properly reviewed and tested, it can get merged back into the master branch. This can be done by us or by someone else. One option is to use the `git merge` command that we discussed earlier. Another option is to use the 

    git rebase 

command. Rebasing means changing the base commit that's used for our branch.
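
As a minimal sketch of what changing the base commit looks like in practice, the example below lets `master` and a `feature` branch diverge, rebases `feature` onto `master`, and then fast-forwards `master`. Branch and file names are illustrative.

```shell
set -eu
# Placeholder identity for every commit made in this script.
export GIT_AUTHOR_NAME="My Name" GIT_AUTHOR_EMAIL="me@example.com"
export GIT_COMMITTER_NAME="My Name" GIT_COMMITTER_EMAIL="me@example.com"

repo=$(mktemp -d)
cd "$repo"
git -c init.defaultBranch=master init -q
echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"

git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

git checkout -q master
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix on master"

# Replay the feature commits on top of the current master.
git checkout -q feature
git rebase -q master
git --no-pager log --oneline --graph    # a single straight line of three commits

# master can now be fast-forwarded to feature: no merge commit is needed.
git checkout -q master
git merge -q --ff-only feature
```

Because the rebased commit is a new commit with a new identifier, this is best kept to branches that have not been shared yet.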

**Best Practices for Collaboration**

It's a good idea to always synchronize your branches before starting any work on your own. That way, whenever you start changing code, you know that you're starting from the most recent version and you minimize the chances of conflicts or the need for rebasing. Another common practice is to try and avoid having very large changes that modify a lot of different things. Instead, try to make changes as small as possible as long as they're self-contained.

If you need to maintain more than one version of a project at the same time, it's common practice to have the latest version of the project in the master branch and a stable version of the project on a separate branch.

When using these two branches, some bug fixes for the stable version may be done directly on the stable branch if they aren't relevant to the latest version anymore. In the last couple of videos, we looked at how we can use rebase to make sure our history is linear. Rebasing can help a lot with identifying bugs, but use it with caution. Whenever we do a rebase, we're rewriting the history of our branch. The old commits get replaced with new commits, so they'll be based on different snapshots than the ones we had before and they'll have completely different hash sums. This works fine for local changes, but can cause a lot of trouble for changes that have been published and downloaded by other collaborators. So as a general rule, you shouldn't rebase changes that have been pushed to remote repos.

Early in our Git journey, we mentioned that having good commit messages is important. It's already important when you're working alone since good commit messages help the future you understand what's going on, but it's even more important when you're collaborating with others since it gives your collaborators more context on why you made the change and can help them understand how to solve conflicts when necessary. So commit to being a good collaborator and remember to add those commit messages.


**Conflict Resolution Cheat Sheet**

Merge conflicts are not uncommon when working in a team of developers, or on Open Source Software. Fortunately, GitHub has some good documentation on how to handle them when they happen:

- https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-merge-conflicts
- https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/resolving-a-merge-conflict-using-the-command-line

You can also use [git rebase branchname](https://git-scm.com/book/en/v2/Git-Branching-Rebasing) to change the base of the current branch to be branchname.

The git rebase command is a lot more powerful.  Check out [this link](https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History) for more information.

**Module Review**

- We talked about what GitHub is and what the basic interaction with the service looks like. 
- Then we discussed how remote repositories and the distributed nature of Git let lots of contributors develop a project independently and at the same time. We then learned how to pull data down from remote repositories, push our local changes to them, and resolve conflicts that pop up when our local and remote branches are out of sync.
- We wrapped up by looking at a complex example of using a feature branch for a refactor of our code and using rebase to make sure that our history stayed linear.

**QuickLabs**

In this lab, you'll practice the basics of interacting with GitHub. You'll practice setting up an account, logging in, creating a repository, making changes on the local machine, and pushing changes back to the remote repository. We use these Git operations to share changes from the remote repository to the local repository and vice versa.

What you'll do

- Create a GitHub account
- Create a Git repository
- Git clone to create a local copy on your local machine
- Add a file to this repository
- Create snapshot/snapshots of the local repository
- Push the snapshots to the master branch

*See details in the git-github gitbook or the Coursera course.*


## Week 4

The Quicklabs on forking and more can be found on my [git-github gitbook](https://app.gitbook.com/@avertere/s/coursera/~/drafts/-MIQEIl2OYyDwhJbT73O/introduction-to-git-and-github#introduction-to-github).