# What is git?

## Backups

- Known-working version of a script
- Version to try something that might work, or might not
- Version trying a different thing that might work if the first doesn't

## Collaborative editing

If you change the start of a script, and a friend changes the end, git can make a version with both changes together.

# What is GitHub?

- Cloud backups
- Central repository
- Tracker for problem reports and feature requests
- Other people's code they think someone else would find useful

# Git commands

## Creating a repository

- New project: [``git init``](https://git-scm.com/docs/git-init)
- Existing project: [``git clone``](https://git-scm.com/docs/git-clone)
  - Multiple ways to specify location of original project:
    - ``git clone /path/to/original``
    - ``git clone username@server.com:/path/to/original``
    - ``git clone https://github.com/username/original``
    - ``git clone /path/to/original name_of_new_copy``
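
As a rough sketch of the two starting points above, the commands below make a new repository with `git init` and then copy it with `git clone`. The scratch directory from `mktemp`, the directory name `original`, and the name and email are all placeholders picked for this illustration.

```shell
set -e
# Hypothetical scratch directory, so nothing in your real work is touched
work=$(mktemp -d)
cd "$work"

# New project: start an empty repository and save one file in it
git init -q original
cd original
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo 'print("Hello, World!")' > script.py
git add script.py
git commit -q -m "Add hello world"

# Existing project: copy it, history included, under a new name
cd "$work"
git clone -q original name_of_new_copy
ls name_of_new_copy                        # shows script.py
```

The same `git clone` line works with any of the other location styles listed above in place of `original`.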

## Adding changes to the history

[``git add``](https://git-scm.com/docs/git-add)

Which changes to add can be specified as:
- Single file: ``git add path/to/file``
- Single subdirectory (and everything under it): ``git add path/to``
- Current directory (and everything under it): ``git add .``

Then to actually save the changes for later: [``git commit``](https://git-scm.com/docs/git-commit)
  - It will ask for a description: people usually do a short overview, then a blank line, then a longer description
  - Ways to specify the description:
    - ``git commit -m "Description of changes"``
    - If no message specified with ``-m``, git will bring up an editor.  You can change which one by setting ``EDITOR``:
      - ``export EDITOR="emacs -nw"`` (Bourne shell and friends)
      - ``setenv EDITOR "vi"`` (C-shell and friends)
    - If using git from JupyterHub terminal, your browser will try to interpret some of the keystrokes intended for the editor.  It is probably best to specify the message using ``-m`` on that platform.
  - `git commit` will also want your name and email, so people can contact you if they have questions about what your code is doing and how it's doing it.  These can be saved with [``git config``](https://git-scm.com/docs/git-config) with the keys `user.name` and `user.email`: `git commit` will tell you how if you haven't yet done so.
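
Put together, one possible session looks like the sketch below. The file names, the `plots` directory, and the name and email are made up for the example; the second commit passes `-m` twice, which git joins into a short overview, a blank line, and a longer description.

```shell
set -e
work=$(mktemp -d)                          # hypothetical scratch directory
cd "$work"
git init -q demo
cd demo

# Tell git who you are (placeholder values, saved for this repository only)
git config user.name "Example User"
git config user.email "user@example.com"

mkdir plots
echo 'print("Hello, World!")' > script.py
echo 'print("plotting")' > plots/make_map.py

# Add a single file, then save it with a one-line description
git add script.py
git commit -q -m "Add hello world"

# Add a subdirectory and everything under it, then save it with
# a short overview plus a longer description
git add plots
git commit -q -m "Add map-plotting script" -m "Longer explanation of why the change was made."

git log --format=%s                        # newest commit first
```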

## Checking what needs to be added

- Which files have been changed since last commit: [``git status``](https://git-scm.com/docs/git-status)
  - Adding file names or patterns to `.gitignore` will keep untracked files out of this list
    - This can be useful for script outputs that can be regenerated by running the script again
      - `.pyc` files created when reusing functions from another script
      - Saved weather maps for the current date
- What lines have changed since last commit: [``git diff``](https://git-scm.com/docs/git-diff)
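  - With no arguments, this will show which lines are different from what you have staged with `git add`; if nothing is staged, that is the same as the last commit (`git diff HEAD` always compares against the last commit)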
    - Unchanged lines start with a space
    - The old version of changed lines starts with a `-`
    - The new version of changed lines starts with a `+`
    - The header lines say which file it's looking at, and each line like `@@ -1,3 +1,4 @@` says which line it's starting on (in both the old and new versions of the file) and how many lines it shows
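
Here is a small sketch of checking what has changed. It assumes a throwaway repository in a scratch directory, a placeholder identity, and a made-up output file called `todays_map.png`.

```shell
set -e
work=$(mktemp -d)                          # hypothetical scratch directory
cd "$work"
git init -q demo
cd demo
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
printf 'print("Hello, World!")\n' > script.py
git add script.py
git commit -q -m "Add hello world"

# Change the tracked file and create an output that could be regenerated
printf 'print("Hello, World!")\nprint("Goodbye!")\n' > script.py
touch todays_map.png

git status --short      # lists both script.py and todays_map.png

# Ignore regenerable outputs
printf '*.png\n*.pyc\n' > .gitignore
git status --short      # todays_map.png is no longer listed

git diff                # one unchanged line (leading space), then the added line (leading +)
```

The `.gitignore` file itself shows up as a new file in `git status`; it is usually worth committing so everyone ignores the same outputs.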

## Marking different versions

### Fixed versions that should not change: [``git tag``](https://git-scm.com/docs/git-tag)
  - Exact script used to generate plots for a paper
  - Known-working version of some code, so you can find it again
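
A minimal sketch of marking a fixed version, assuming a throwaway repository, a placeholder identity, and the made-up tag name `paper-figures-v1`:

```shell
set -e
work=$(mktemp -d)                          # hypothetical scratch directory
cd "$work"
git init -q demo
cd demo
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo 'print("Hello, World!")' > script.py
git add script.py
git commit -q -m "Add hello world"

# Mark the exact version used for the paper
git tag paper-figures-v1

# Keep working; the tag stays on the old commit
echo 'print("new analysis")' >> script.py
git commit -q -a -m "Start new analysis"

git tag                                    # lists paper-figures-v1
git show paper-figures-v1:script.py        # prints the script as it was when tagged
```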

### Versions that will change with new code: Branches: [``git branch``](https://git-scm.com/docs/git-branch) and [``git checkout``](https://git-scm.com/docs/git-checkout)
- Current stable/working version
- Side project trying something new
- Different side project trying something else, currently broken but hopefully working soon
- How to use:
  - `git checkout -b name_of_version`
    - First time
  - `git checkout name_of_version`
    - Later times
  - `git checkout main`
    - Main/original version
    - Might be called `master` in older repositories.

## Interacting with other repositories

- [``git remote``](https://git-scm.com/docs/git-remote)` add name /path/to/other`
  - You can use any of the variants from `git clone` above
  - In `git fetch`, `git pull`, and `git push` you can also give the location itself (any of the `/path/to/other` variants) in place of `name`, but there's less to remember this way
  - `git clone /path/to/other` effectively does `git remote add origin /path/to/other` in the new repository
- [``git fetch``](https://git-scm.com/docs/git-fetch): Find updates from specified repository
- [``git merge``](https://git-scm.com/docs/git-merge): Add updates to your current code
  - `git merge your-branch-name` will get updates from your local branch `your-branch-name`
  - `git merge remote-name/their-branch-name` will get updates from GitHub or a friend's version of the code
- [``git pull``](https://git-scm.com/docs/git-pull): Fetch followed by merge
- [``git push``](https://git-scm.com/docs/git-push): Put your changes somewhere other people can find them
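
The sketch below strings these commands together using only local directories, so it runs without a network connection. The directory names `mine`, `friend`, and `shared.git`, the file `notes.txt`, the branch name `demo-main`, and the identities are all invented for the example; the bare repository stands in for GitHub. You change the start of a file, a friend changes the end, and the merge ends up with both.

```shell
set -e
work=$(mktemp -d)                          # hypothetical scratch directory
cd "$work"

# Your repository, with a five-line file
git init -q mine
cd mine
git config user.name "Me"                  # placeholder identity
git config user.email "me@example.com"
git checkout -q -b demo-main
printf 'one\ntwo\nthree\nfour\nfive\n' > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# A friend clones it and changes the end of the file
cd "$work"
git clone -q mine friend
cd friend
git config user.name "Friend"
git config user.email "friend@example.com"
printf 'one\ntwo\nthree\nfour\nFIVE\n' > notes.txt
git commit -q -a -m "Change the end"

# Meanwhile you change the start
cd "$work/mine"
printf 'ONE\ntwo\nthree\nfour\nfive\n' > notes.txt
git commit -q -a -m "Change the start"

# Name the friend's repository, find their updates, and add them to your code
git remote add friend ../friend
git fetch -q friend
git merge -q --no-edit friend/demo-main
cat notes.txt                              # both changes are present

# Put the result somewhere other people can find it
git init -q --bare "$work/shared.git"
git remote add shared "$work/shared.git"
git push -q shared demo-main
```

`git pull friend demo-main` would usually do the fetch and the merge in one step; they are separate here so each one is visible.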

## Looking at history: [``git log``](https://git-scm.com/docs/git-log)

- Displays summary of what changes were made, when, and by whom
- I like using `git log --remotes --graph --branches --pretty='format:%h %d %an %as %s'`
  - The `--remotes` includes other people's work in the log
  - `--branches` includes all the different things I'm working on
  - `--graph` draws an ASCII-art graph to show how the commits are connected: where branches diverge and come back together
  - `--pretty='format:%h %d %an %as %s'` tells git how to print the commits, which I chose to fit nicely on a single line
    - `%h`: the abbreviated commit id (hash)
    - `%d`: other names for the commit (branches and tags that point to it)
    - `%an`: Who made the changes
    - `%as`: When the changes were made (short date format)
    - `%s`: What were the changes trying to do (the "subject" line of the commit message)
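
To see what that command prints, here is a sketch that builds a tiny history with two branches and then runs it. The repository, branch names, commit messages, and identity are placeholders, and `%as` needs a reasonably recent git.

```shell
set -e
work=$(mktemp -d)                          # hypothetical scratch directory
cd "$work"
git init -q demo
cd demo
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

git checkout -q -b demo-main
echo 'print("Hello, World!")' > script.py
git add script.py
git commit -q -m "Add hello world"

# A side branch with its own commit
git checkout -q -b demo-experimental
echo 'print(1/0)' >> script.py
git commit -q -a -m "Add division test"

# Another commit back on the main branch, so the history diverges
git checkout -q demo-main
echo '# says hello' >> script.py
git commit -q -a -m "Document the script"

log=$(git log --remotes --graph --branches --pretty='format:%h %d %an %as %s')
echo "$log"
```

Each commit gets one line; the graph characters at the left show where `demo-experimental` splits off from `demo-main`.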

## Graphical interfaces

- [`gitk`](https://git-scm.com/docs/gitk) offers one graphical interface for looking at commit history
- [`git gui`](https://git-scm.com/docs/git-gui) focuses on creating commits or looking at who most recently edited which lines of a file
- Many editors have git integrations or plugins of one variety or another
- [A list of other graphical programs is provided here](https://git-scm.com/downloads/guis)

## Interacting with GitHub

Club main repository: https://github.com/karlwx/wxdatasci

### Make a linked copy on GitHub

See the "Fork" button in the top right?  Clicking that will make a copy of the repository that is entirely yours and linked to this copy.  The number next to the "Fork" button will show how many others have done this, and clicking on it will open a page that shows who has done this, when, and how much work they've done since.

### Make a local copy

There is a green "Code" button a few lines down and a bit closer to the center. Clicking on that will detail a few download options.  Each can be used with `git clone` or `git remote add`.  The same menu also has a "Download ZIP" option in case you don't have git on your machine.

If you don't feel like clicking on anything, `git clone https://github.com/user-name/repository-name` should work.

### Make changes

It's common to immediately check out a new branch describing what you want to do, so updating the main version stays easy.  You can then edit the files using whatever process you usually use, and run the scripts to see if they do what you want.  It is useful to add the changes to the repository (using `git add` and `git commit`) every time you have a somewhat complete change, before starting on a new change, so it's easy to undo the later changes if they turn out not to work the way you want.

### Put changes on GitHub

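Run `git push origin your-branch-name`.  Here `origin` is the remote name for your copy on GitHub; if you gave that copy a different name, such as `github`, because you use another central repository more often, run `git push github your-branch-name` instead.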

### Create a Pull Request

This asks whoever wrote the original code to consider your changes for inclusion in their code.  If you go to the repository from which you forked your copy, there will be a "Pull Requests" tab under the repository name (currently third from the left near the top, after "Code" and "Issues").  On that page there is a "New Pull Request" button.  You can click that, select the branch you did your work on (you may need to click the "compare across forks" link to see it), and click the "Create pull request" button.  At some point, it will ask for a one-line summary of what you did, together with a longer description of what problem you were trying to solve.

They may ask for changes, to make your code look more like theirs or to let them quickly check that your code still works after they make changes.  You can go back to your local repository, make the changes, then push them back to GitHub.

### Create an issue

These generally fall into two broad categories: bug reports and feature requests.

#### Bug reports

Some portion of the code is not behaving as expected.  Describing what code is misbehaving, how you're using it, what you expect to happen, and what actually happens instead will help other people figure out what's happening and why.
If the behavior is expected, they may provide alternate methods for doing what you want, and may update the documentation to keep other people from being confused the way you were.

An example for the club repository would be one of the example notebooks not working, or not explaining what they're doing in enough detail to work with other data.

#### Feature requests

Things the code could do but currently doesn't.  What do you want to happen?  Why do you think this project would be the best place for this feature, as opposed to some existing project or a new one?

An example for the club repository would be requests for new tutorials and example notebooks.  What do you want to learn about?  Open-source licensing?  Plotting fields on maps?  Getting data on theta and theta-e surfaces for plotting?  Getting data from NCEP or the ECMWF to do some analyses on?  Creating web pages?  Creating 3-D animations of the tropopause and jet stream?  How to test code to make sure it still works on a new machine, with a new Python installation, or after a complete rewrite of the code?  How to package your functions for other people to download and use?

# Examples

In [11]:
%%bash
rm -rf demo

In [2]:
%%bash
mkdir demo
cd demo
git init
git checkout -b demo-main
cat >script.py <<EOF
print("Hello, World!")
EOF
git add script.py
git commit -m "Add hello world"
# Git commit will fail on a new machine, and ask for your name and email

Initialized empty Git repository in /home/Daniel/src/DWesl/wxdatasci/tutorials/demo/.git/
[demo-main (root-commit) 34caa40] Add hello world
 1 file changed, 1 insertion(+)
 create mode 100644 script.py


hint: Using 'master' as the name for the initial branch. This default branch name
hint: is subject to change. To configure the initial branch name to use in all
hint: of your new repositories, which will suppress this warning, call:
hint: 
hint: 	git config --global init.defaultBranch <name>
hint: 
hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
hint: 'development'. The just-created branch can be renamed via this command:
hint: 
hint: 	git branch -m <name>
Switched to a new branch 'demo-main'


In [3]:
%%bash
cd demo
# Provide name and email
git config user.name Me
git config user.email me@psu.edu
# Then redo the commit so everything actually gets saved
git commit -m "Add hello world"
# This will fail if the previous step succeeded, since there's nothing new to save

On branch demo-main
nothing to commit, working tree clean


CalledProcessError: Command 'b'cd demo\ngit config user.name Me\ngit config user.email me@psu.edu\ngit commit -m "Add hello world"\n'' returned non-zero exit status 1.

In [4]:
%%bash
cd demo
# Let's try some work that might be broken
git checkout -b demo-experimental
cat >script.py <<EOF
print("Hello, world!")
print(1/0)
EOF
git commit -a -m "Add division test"

[demo-experimental 842c1d7] Add division test
 1 file changed, 2 insertions(+), 1 deletion(-)


Switched to a new branch 'demo-experimental'


In [5]:
%%bash
cd demo
cat >script.py <<EOF
import sys
print("Hello,", sys.argv[1])
print(1/0)
EOF
git commit -a -m "Only say hello to given name"

[demo-experimental 691109f] Only say hello to given name
 1 file changed, 2 insertions(+), 1 deletion(-)


In [7]:
%%bash
cd demo
python script.py
# This doesn't work

Hello, World!


In [10]:
%%bash
cd demo
git checkout demo-main
python script.py
# But this still does

Hello, World!


Already on 'demo-main'
