## DataCamp - Intro to Git

A version control system is a tool that manages changes made to the files and directories in a project. Many version control systems exist; this lesson focuses on one called Git, which is used by many of the data science tools covered in our other lessons. Its strengths are:

Nothing that is saved to Git is ever lost, so you can always go back to see which results were generated by which versions of your programs.

Git automatically notifies you when your work conflicts with someone else's, so it's harder (but not impossible) to accidentally overwrite work.

Git can synchronize work done by different people on different machines, so it scales as your team does.

[git status] shows you which files are in the staging area (where Git holds the changes you intend to include in your next commit), and which files have changes that haven't yet been put there. To compare a file as it currently is with what you last saved, you can use [git diff filename].

[git diff] without any filenames will show you all the changes in your repository, while [git diff directory] will show you the changes to the files in some directory.

To compare the state of your files with the most recent commit, you can use [git diff -r HEAD]. [HEAD] is a shortcut meaning "the most recent commit". The course describes the [-r] flag as meaning "compare to a particular revision", although [git diff HEAD] gives the same comparison without it.

You can restrict the results to a single file or directory using [git diff -r HEAD path/to/file], where the path to the file is relative to where you are (for example, the path from the root directory of the repository).
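The status and diff commands above can be tried out in a throwaway repository. This is a minimal sketch, assuming Git is installed. The file name `report.txt`, the user details, and the temporary directory are hypothetical and not part of the course.

```shell
set -e  # stop at the first failing command

# Hypothetical scratch repository in a temporary directory.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Create and commit a first version of a file.
echo "first line" > report.txt
git add report.txt
git commit -q -m "Add report"

# Change the file without staging the change.
echo "second line" >> report.txt
git status --short                       # report.txt is listed as modified

# Working tree compared with the staging area.
unstaged="$(git diff report.txt)"

# Stage the change: the working tree now matches the staging area...
git add report.txt
after_staging="$(git diff report.txt)"

# ...but it still differs from the most recent commit.
against_head="$(git diff HEAD report.txt)"
```

After staging, a plain `git diff` comes back empty, while the comparison against `HEAD` still reports the added line.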

Useful shortcuts in the Nano text editor:

- Ctrl-K: delete a line.
- Ctrl-U: un-delete a line.
- Ctrl-O: save the file ('O' stands for 'output').
- Ctrl-X: exit the editor.

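To save the changes in the staging area, you use the command [git commit]. To give the log message on the command line, add [-m]:

[git commit -m 'log message']

To change the log message of the most recent commit, amend it:

[git commit --amend -m "new message"]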
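As an illustration of committing and then amending the message, here is a small sketch in a scratch repository. The file name, messages, and user details are made up for the example.

```shell
set -e  # stop at the first failing command

# Hypothetical scratch repository.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "draft" > notes.txt
git add notes.txt

# Commit with a message that contains a typo.
git commit -q -m "Add ntoes"

# Replace the message of that same commit.
git commit -q --amend -m "Add notes"

message="$(git log -1 --format=%s)"      # subject line of the latest commit
count="$(git rev-list --count HEAD)"     # number of commits on this branch
echo "$message ($count commit)"
```

Amending rewrites the last commit rather than adding a second one, so the history still contains a single commit.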

...go back to record notes

### Section 4: How can I merge two branches?

Branching lets you create parallel universes; merging is how you bring them back together. When you merge one branch (call it the source) into another (call it the destination), Git incorporates the changes made to the source branch into the destination branch.

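To merge two branches, the course runs [git merge source destination] (without .. between the two branch names). Git always merges into the branch that is currently checked out, so the usual form is to switch to the destination branch and run [git merge source]. Git automatically opens an editor so that you can write a log message for the merge. In Nano, you can use Ctrl+O to save the message and then Ctrl+X to exit.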
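A merge without conflicts might look like the sketch below. The branch name `source` and the file names are hypothetical, and `--no-edit` is used only so that the example runs without opening an editor.

```shell
set -e  # stop at the first failing command

# Hypothetical scratch repository.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > data.txt
git add data.txt
git commit -q -m "Initial commit"

# Remember the name of the starting branch; it plays the role of the destination.
destination="$(git symbolic-ref --short HEAD)"

# Make a commit on a separate branch (the source).
git checkout -q -b source
echo "from source" > source.txt
git add source.txt
git commit -q -m "Add source.txt"

# Make a different commit on the destination branch.
git checkout -q "$destination"
echo "from destination" > destination.txt
git add destination.txt
git commit -q -m "Add destination.txt"

# Merge the source into the checked-out destination, accepting the default message.
git merge -q --no-edit source

merges="$(git rev-list --count --merges HEAD)"
echo "merge commits: $merges"
```

Because the two branches changed different files, Git creates the merge commit on its own and both files end up in the destination branch.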

When there is a conflict during a merge, Git tells you that there's a problem, and running [git status] after the merge reminds you which files have conflicts that you need to resolve by printing [both modified:] beside the files' names.

Inside the file, Git leaves markers that look like this to tell you where the conflicts occurred:

<<<<<<< destination-branch-name

...changes from the destination branch...

=======

...changes from the source branch...

>>>>>>> source-branch-name

In many cases, the destination branch name will be [HEAD] because you will be merging into the current branch. To resolve the conflict, edit the file to remove the markers and make whatever other changes are needed to reconcile the changes, then commit those changes.
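To see the conflict markers and one way of resolving them, here is a sketch that deliberately creates a conflict. The file `config.txt`, its contents, and the branch name `source` are invented for the example, and keeping the destination's value is just one possible resolution.

```shell
set -e  # stop at the first failing command

# Hypothetical scratch repository.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "value = 1" > config.txt
git add config.txt
git commit -q -m "Initial value"
destination="$(git symbolic-ref --short HEAD)"

# Change the same line differently on two branches.
git checkout -q -b source
echo "value = 2" > config.txt
git commit -q -am "Set value to 2"

git checkout -q "$destination"
echo "value = 3" > config.txt
git commit -q -am "Set value to 3"

# The merge stops with a conflict and exits non-zero, so the failure is tolerated here.
git merge --no-edit source >/dev/null 2>&1 || true

status_line="$(git status --short config.txt)"   # "UU" marks a file modified on both sides
conflicted="$(cat config.txt)"                   # file contents including the markers
echo "$conflicted"

# Resolve by writing the reconciled content, then stage and commit.
echo "value = 3" > config.txt
git add config.txt
git commit -q -m "Merge source, keeping value = 3"
```

Since the merge is made into the current branch, the first marker names `HEAD` rather than the destination branch.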

If you want to create a repository for a new project in the current working directory, you can simply say [git init project-name], where "project-name" is the name you want the new repository's root directory to have.

One thing you should not do is create one Git repository inside another. While Git does allow this, updating nested repositories becomes very complicated very quickly, since you need to tell Git which of the two .git directories the update is to be stored in. Very large projects occasionally need to do this, but most programmers and data analysts try to avoid getting into this situation.

Turn an existing project into a repository:

[git init]

in the project's root directory, or:

[git init /path/to/project]

from anywhere else on your computer.
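Both forms of [git init] can be sketched as follows. The directory names `new-project` and `existing-project` are hypothetical.

```shell
set -e  # stop at the first failing command

# Hypothetical working area in a temporary directory.
workdir="$(mktemp -d)"
cd "$workdir"

# New project: Git creates the directory and the repository inside it.
git init -q new-project

# Existing project: files are already there before Git is involved.
mkdir existing-project
echo "old work" > existing-project/analysis.txt

# Turn it into a repository by giving its path, from anywhere on the machine.
git init -q "$workdir/existing-project"

ls -d new-project/.git existing-project/.git
```

Initializing an existing directory only adds the `.git` directory; the files already there are left untouched and untracked until you add them.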

Cloning a repository does exactly what the name suggests: it creates a copy of an existing repository (including all of its history) in a new directory.

To clone a repository, use the command [git clone URL], where URL identifies the repository you want to clone.

When you clone a repository, Git uses the name of the existing repository as the name of the clone's root directory, for example:

[git clone /existing/project]

creates a new directory called [project] inside the current working directory (your home directory, if that is where you run the command).

If you want to call the clone something else, add the directory name you want to the command:

[git clone /existing/project newprojectname]
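Cloning from a local path can be tried as below. This sketch first builds a small repository to clone from. All paths are inside a temporary directory and are hypothetical.

```shell
set -e  # stop at the first failing command

# Hypothetical working area in a temporary directory.
workdir="$(mktemp -d)"

# Build a small repository to clone from.
mkdir -p "$workdir/existing/project"
cd "$workdir/existing/project"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Add readme"

cd "$workdir"

# Default: the clone's directory takes the name of the original repository.
git clone -q "$workdir/existing/project"

# Explicit directory name for the clone.
git clone -q "$workdir/existing/project" newprojectname

# Each clone carries the full history of the original.
commits="$(git -C newprojectname rev-list --count HEAD)"
echo "commits in clone: $commits"
```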

When you clone a repository, Git remembers where the original repository was. It does this by storing a remote in the new repository's configuration. A remote is like a browser bookmark with a name and a URL.

If you are in a repository, you can list the names of its remotes using [git remote].

If you want more information, you can use [git remote -v] (for "verbose"), which shows the remotes' URLs.


When you clone a repository, Git automatically creates a remote called origin that points to the original repository. You can add more remotes using:

[git remote add remote-name URL]

and remove existing ones using:


[git remote rm remote-name]

You can connect any two Git repositories this way, but in practice, you will almost always connect repositories that share some common ancestry.
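Listing, adding, and removing remotes can be sketched like this. The remote name `thunk` is borrowed from the pull example in these notes and the URL is a placeholder. Adding a remote only records the name and URL, so nothing is contacted.

```shell
set -e  # stop at the first failing command

# Hypothetical scratch repository.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Record a remote; the URL is a placeholder and is not contacted.
git remote add thunk https://example.com/thunk.git

names="$(git remote)"        # just the names
verbose="$(git remote -v)"   # names with fetch and push URLs
echo "$verbose"

# Remove the remote again.
git remote rm thunk
remaining="$(git remote)"
```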

Pulling changes is straightforward: the command [git pull remote branch] gets everything in branch in the remote repository identified by remote and merges it into the current branch of your local repository. For example, if you are in the [quarterly-report] branch of your local repository, the command:

[git pull thunk latest-analysis]

would get changes from the [latest-analysis] branch in the repository associated with the remote called [thunk] and merge them into your [quarterly-report] branch.
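The pull described above can be reproduced locally. In this sketch a directory called `shared` stands in for the remote repository, and the clone is made with `-o thunk` so that its remote is named `thunk` instead of `origin`. The paths and file contents are hypothetical.

```shell
set -e  # stop at the first failing command

# Hypothetical working area in a temporary directory.
workdir="$(mktemp -d)"

# A repository that plays the role of the remote.
mkdir "$workdir/shared"
cd "$workdir/shared"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "q1" > results.txt
git add results.txt
git commit -q -m "Initial results"

# Local clone whose remote is named thunk.
cd "$workdir"
git clone -q -o thunk "$workdir/shared" local
cd "$workdir/local"
git checkout -q -b quarterly-report

# New work appears on a branch in the shared repository.
cd "$workdir/shared"
git checkout -q -b latest-analysis
echo "q2" >> results.txt
git commit -q -am "Add q2 results"

# Pull that branch into the current local branch.
cd "$workdir/local"
git pull -q thunk latest-analysis

current="$(git symbolic-ref --short HEAD)"
cat results.txt
```

The local branch stays `quarterly-report`; only its contents are updated with the commits from `latest-analysis`.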

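If [git pull] reports an error because it would overwrite local changes that you haven't committed, either commit those changes or discard them with [git checkout -- filename], then pull again.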


The complement of [git pull] is [git push], which pushes the changes you have made locally into a remote repository. The most common way to use it is:

[git push remote-name branch-name]

which pushes the contents of your branch [branch-name] into a branch with the same name in the remote repository associated with [remote-name]. It's possible to use different branch names at your end and the remote's end, but doing this quickly becomes confusing: it's almost always better to use the same names for branches across repositories.
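A push can be tried against a local bare repository standing in for the remote. This is a sketch under that assumption. The remote name `origin`, the branch name, and the paths are hypothetical.

```shell
set -e  # stop at the first failing command

# Hypothetical working area in a temporary directory.
workdir="$(mktemp -d)"

# A bare repository (no working files) plays the role of the remote.
git init -q --bare "$workdir/remote.git"

# Local repository with one commit on a named branch.
git init -q "$workdir/local"
cd "$workdir/local"
git config user.name "Example User"
git config user.email "user@example.com"
echo "summary" > report.txt
git add report.txt
git commit -q -m "Add report"
git checkout -q -b quarterly-report

# Register the remote and push the branch under the same name.
git remote add origin "$workdir/remote.git"
git push -q origin quarterly-report

local_commit="$(git rev-parse HEAD)"
remote_commit="$(git -C "$workdir/remote.git" rev-parse refs/heads/quarterly-report)"
echo "local:  $local_commit"
echo "remote: $remote_commit"
```

After the push, the branch in the remote points at the same commit as the local branch.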