# STA 141B Lecture 3

The class website is <https://github.com/2019-winter-ucdavis-sta141b/notes>

### Announcements

* Do not use external packages for assignment 1.
* Before or when you submit assignment 1, fill out the GitHub Username and Project Group Form. A link will be on Piazza later today.

### Topics

* git
* Modules
* Iterators
* Comprehensions and Generators

### References

* [ProGit][], Ch. 1-2

The [Git cheatsheet](https://services.github.com/on-demand/downloads/github-git-cheat-sheet.pdf) is also helpful.

[PDSH]: https://jakevdp.github.io/PythonDataScienceHandbook/
[ProGit]: https://git-scm.com/book/

## More about Jupyter

Jupyter breaks sections of the notebook into _cells_. You can choose the type of cell in the `Cell -> Cell Type` menu. Use "Code" for cells that contain code and "Markdown" for cells that contain text or images.

Code cells are set up to run Python code. When you open a Jupyter notebook, Jupyter runs a Python session called a _kernel_ in the background. Each time you run a code cell, the code is sent to the kernel, and then the results are printed in the notebook. The kernel maintains state between cells, so code you run in one cell can affect code you run in another cell.

__Caution!__ The state of the kernel depends on the order you run cells in, not the order cells appear in the notebook.

You can stop or restart the kernel using the `Kernel` menu. This is mostly useful when you want to cancel a computation.

Markdown cells allow you to input text and format it using the Markdown language. You can learn more about Markdown [here](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet).

* This
* is
* a
* list

**This is bold**

$\int$

$$ $$

In [3]:
x = 3

In [4]:
x

3

In [5]:
y

NameError: name 'y' is not defined

In [None]:
y = 3

## Git

[git](http://git-scm.com/) is a distributed version control system. Let's break that down:

* _Distributed_ means every copy of a project holds its full history, so git can share files across multiple computers without depending on a single central copy.
* A _version control system_ is a tool to keep track of different versions or drafts of files.

With git, you can

* Get or send sets of files with a single command.
* Back up your work to another computer or server.
* Work collaboratively with others (git will help resolve editing conflicts).
* Undo changes to files or entire directories.

A collection of files tracked by git is called a _repository_ or _repo_. A repo looks like any other directory on your computer, but always contains a hidden `.git` directory to store git tracking info.

You've already used a git repo -- the class website.
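
To see the hidden `.git` directory for yourself without a network connection, here is a minimal sketch. It uses `git init`, which these notes don't otherwise cover, to turn an empty throwaway directory into a repo. The variable name `demo` and the use of `mktemp` are just choices made for this example.

```sh
# Create a throwaway directory and turn it into an empty repo.
demo=$(mktemp -d)
cd "$demo"
git init --quiet

# -a makes ls show hidden entries, so the listing includes .git
ls -a
```

Plain `ls` would show nothing here, which is why a repo looks like any other directory at first glance.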

### GitHub

The class website is hosted on [GitHub][], an online service for backing up and sharing git repos. You'll need a free GitHub account in order to submit assignments for this class.

[GitHub]: https://github.com/

### The Shell

We'll run git commands in the shell, a text-based program for interacting with computers. Git and the shell are not part of Python or Jupyter. To open a shell window:

* On Windows, run "Git Bash"
* On macOS, run "Terminal"
* On Linux, run your favorite terminal emulator. Mine is `st`.

You can use the shell to navigate and modify directories on your computer. Directories are like places, and the shell is always in exactly one of them, called the _working directory_. By default, commands you run in the shell act on the working directory.

The essential shell commands for navigation are:

* `pwd` to print the working directory path.
* `cd PATH` to change the working directory. Replace `PATH` with a path to a directory, or with `..` to go up one directory.
* `ls` to list files and directories in the working directory.
* `man COMMAND` to get help. Replace `COMMAND` with the name of a shell command.
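
Here's a short sketch of these commands in action. The directory and file names (`scratch`, `notes`, `lecture3.md`) are made up for the example, and `mktemp`, `mkdir`, and `touch` are only there to create something to look at.

```sh
# Make a scratch directory with one subdirectory and one file in it.
scratch=$(mktemp -d)
mkdir "$scratch/notes"
touch "$scratch/notes/lecture3.md"

cd "$scratch"   # change the working directory
pwd             # print where we are now
ls              # lists: notes

cd notes        # a path can be relative to the working directory
ls              # lists: lecture3.md

cd ..           # go back up one directory
pwd
```

The two `pwd` calls print the same path, since `cd notes` followed by `cd ..` ends up where it started.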

To learn more, I recommend Software Carpentry's [Unix Shell Notes][swc-shell].

[swc-shell]: https://swcarpentry.github.io/shell-novice/

### Configuring git

The first time you use git, you need to set your name and email address so that you'll get credit for your work.

Replace my name with yours, and run
```sh
git config --global user.name "Nick Ulle"
```

Then replace my email with yours, and run
```sh
git config --global user.email naulle@ucdavis.edu
```

You can check the settings any time by running these commands with the last value omitted. For instance:
```sh
git config --global user.name
```

### Clone and Pull

Let's use git to download the class repo.

When you want to download a git repo __for the first time__, use `git clone URL`. Replace `URL` with the web URL of the repo. For GitHub repos, the web URL is always listed under the bright green "Clone or Download" button on the repo's front page.

So to clone the class repo, run
```sh
git clone https://github.com/2019-winter-ucdavis-sta141b/notes.git
```
Now you have a _local_ copy of the repo, one that's on your computer. The copy on GitHub is _remote_, since it's not on your computer.

When someone else owns the remote repo, or when you work on a remote repo with other people, they might make changes after you've cloned the repo. For instance, I might upload some new notes to the class repo. You can use `git pull` to check for and download changes from the remote repo to your local repo.
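
If you'd like to watch `git pull` work without touching the class repo, the sketch below fakes the situation on one computer. It assumes a few things that aren't part of these notes: a _bare_ repo created with `git init --bare` plays the role of GitHub, and the directory names `remote.git`, `theirs`, and `mine` are invented for the example.

```sh
set -e                       # stop at the first command that fails
work=$(mktemp -d)
cd "$work"

# A bare repo on disk plays the role of the remote (GitHub).
git init --quiet --bare remote.git

# The instructor's copy: add the first notes and push them.
git clone --quiet remote.git theirs
cd theirs
git config user.name "Instructor"
git config user.email "instructor@example.com"
echo "Lecture 1" > notes.md
git add notes.md
git commit --quiet -m "Add lecture 1 notes"
git push --quiet origin HEAD
cd ..

# Your copy, cloned for the first time.
git clone --quiet remote.git mine

# Later, the instructor uploads new notes.
cd theirs
echo "Lecture 2" >> notes.md
git add notes.md
git commit --quiet -m "Add lecture 2 notes"
git push --quiet origin HEAD

# Your copy doesn't change until you pull.
cd ../mine
cat notes.md                 # only the first lecture
git pull --quiet
cat notes.md                 # now both lectures
```

With a real GitHub repo you'd only run the `git clone` and `git pull` lines; everything else here simulates the other person.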

### Add, Commit, Push

A typical git workflow is:

1. Clone a repo from a server (like GitHub) with `git clone`. This downloads remote -> local.
2. Make some changes to your local copy of the repo.
3. Tell git to track your changes with `git add`.
4. Tell git to save your changes with `git commit -m`.
5. Push the changes on your computer back to the server with `git push`. This uploads local -> remote.
6. Repeat 2-5 as many times as you like until finished.

Let's learn the add, commit, and push commands.

Git calls a record of changes a _commit_. A commit is similar to a snapshot or save point. Before you create a commit, you need to tell git which changes to record. Use `git add PATH` to tell git to record changes to a file. Replace `PATH` with the path to the file.

Once you're done adding files, it's time to create the commit. Before you create the commit, optionally use `git status` to check that the changes you meant to add were added.

If everything looks correct, use `git commit -m "MESSAGE"` to create a commit. Replace `MESSAGE` with a one-sentence message explaining what's changed in the commit. The commit message is a reminder for you and anyone else using your repo, so make sure it's clear.

You can make as many commits as you want before pushing them back to the server. When you are ready to push them back to the server, make sure you are connected to the internet and then use `git push`.
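
Putting the steps together, here is a sketch of one pass through the workflow that runs entirely on your computer. As assumptions for the example, a bare repo (`git init --bare`) stands in for the server, and the names `remote.git`, `local`, and `hello.py` are made up. The name and email are set for this one repo only, so the sketch doesn't touch your global settings.

```sh
set -e
work=$(mktemp -d)
cd "$work"
git init --quiet --bare remote.git       # stand-in for the server

git clone --quiet remote.git local       # 1. clone: remote -> local
cd local
git config user.name "Your Name"
git config user.email "you@example.com"

echo "print('hello')" > hello.py         # 2. make a change
git add hello.py                         # 3. tell git which change to record
git status                               # optional check before committing
git commit --quiet -m "Add hello script" # 4. save the change as a commit
git push --quiet origin HEAD             # 5. push: local -> remote

git log --oneline                        # one line per commit so far
```

The push is spelled `git push origin HEAD` here because the stand-in server starts out empty; `origin` is the name git gives to the repo you cloned from, and `HEAD` means the current branch. For a repo you cloned from GitHub that already has commits, plain `git push` is usually enough.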

There are lots of steps in this process, so there are lots of places where it can go wrong. __Pay attention to error messages__ and search online or ask on Piazza to get help!

### Merge Conflicts

When you work on a GitHub repository with other people, they might change a file, commit the changes, and then _push_ the commit to GitHub. Your local copy of the file won't change unless you _pull_ the new commit from GitHub. In other words, your local repository can easily get out of sync with the remote repository on GitHub. If you also change your local copy of the file and commit the changes, your commit and theirs may _conflict_. If you try to push your commit to GitHub before pulling theirs, you'll see an error message:
```
git push

To github.com:USERNAME/REPOSITORY.git
 ! [rejected]        master -> master (fetch first)
error: failed to push some refs to 'git@github.com:USERNAME/REPOSITORY.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.
```
When you see an error, __don't panic!__ The error message hints that you should try pulling commits from GitHub before pushing your commit. If you pull commits from GitHub, you might see another error message:
```
git pull

remote: Counting objects: 3, done.
remote: Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (3/3), done.
From github.com:USERNAME/REPOSITORY
   6fe289c..48e44d3  master     -> origin/master
Auto-merging README.md
CONFLICT (content): Merge conflict in README.md
Automatic merge failed; fix conflicts and then commit the result.
```
This is okay! Git tried to automatically fix the conflict by _merging_ your commit with the other person's commit, but couldn't figure out how because both commits changed the same part of the same file (`README.md` in the example). An automatic merge succeeds when the commits being merged changed different files, or different parts of the same file. Otherwise, it's up to you to resolve the conflict manually. If you open the file causing the conflict in a text editor, you'll see something like this:
```
# Our README.md

<<<<<<< HEAD
Here are the changes you made.
=======
Here are the changes the other person made.
>>>>>>> 48e44d3a60af614f3a0da794a1701d040221d40f

Here's some text that was added to the file in an earlier commit.
```
Git automatically marked which parts of the file conflict. Changes from your commit are shown between `<<<<<<<` and `=======`. Changes from the other person's commit are shown between `=======` and `>>>>>>>`. All you need to do is edit the file to look the way you want. If you wanted to keep your changes and the other person's changes (the polite thing to do), you could edit the file to look like this:
```
# Our README.md

Here are the changes you made.

Here are the changes the other person made.

Here's some text that was added to the file in an earlier commit.
```
When you're done editing, save the file, mark the conflict as resolved with `git add`, and then create the commit with `git commit`. This is called a _merge commit_. If you leave off `-m`, git will automatically provide a commit message indicating that you merged your commit with the other person's commit:
```
[master 9594c15] Merge branch 'master' of github.com:USERNAME/REPOSITORY
```
Finally, you can push your commit along with the merge commit to GitHub:
```
git push

Counting objects: 6, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (6/6), 602 bytes | 0 bytes/s, done.
Total 6 (delta 0), reused 0 (delta 0)
To github.com:USERNAME/REPOSITORY.git
   48e44d3..9594c15  master -> master
```
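If you want to practice before it happens for real, the sketch below reproduces this whole section on one computer: rejected push, conflicting pull, manual fix, merge commit, and final push. Several things in it are assumptions made for the example and not part of the notes above: a bare repo (`git init --bare`) stands in for GitHub, the directory names and the `identify` helper are invented, `git pull --no-rebase` asks explicitly for a merge (newer versions of git want you to say so when histories have diverged), and `git commit --no-edit` accepts git's default merge message without opening an editor.

```sh
set -e
work=$(mktemp -d)
cd "$work"
git init --quiet --bare remote.git   # stand-in for the GitHub repo

# Hypothetical helper: give the current clone a name and email for commits.
identify() {
    git config user.name "$1"
    git config user.email "$1@example.com"
}

# The other person creates README.md and pushes it.
git clone --quiet remote.git theirs
cd theirs
identify other
echo "Original line." > README.md
git add README.md
git commit --quiet -m "Add README"
git push --quiet origin HEAD
cd ..

# You clone the repo. Then the other person changes the line and pushes.
git clone --quiet remote.git mine
cd theirs
echo "Their change." > README.md
git add README.md
git commit --quiet -m "Change README (theirs)"
git push --quiet origin HEAD

# Meanwhile, you change the same line and commit.
cd ../mine
identify me
echo "My change." > README.md
git add README.md
git commit --quiet -m "Change README (mine)"

# Both commands below fail on purpose; "|| true" keeps the script going.
git push origin HEAD || true     # rejected: the remote has work you don't
git pull --no-rebase || true     # fetches, then stops at the conflict
cat README.md                    # shows the conflict markers

# Resolve by hand (here: keep both changes), then add, commit, and push.
printf 'My change.\n\nTheir change.\n' > README.md
git add README.md
git commit --quiet --no-edit     # keep git's default merge message
git push --quiet origin HEAD
```

In practice you'd edit `README.md` in a text editor; the `printf` line just does the same thing in a way a script can repeat.
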
Note that if you pull and git asks you to merge a file, but you'd like to undo the pull and make more changes before merging, you can use the command `git merge --abort`. Git will remind you about "unmerged paths" in the `git status` message when it's waiting for you to merge a file:
```
git status

On branch master
Your branch and 'origin/master' have diverged,
and have 1 and 1 different commits each, respectively.
  (use "git pull" to merge the remote branch into yours)
You have unmerged paths.
  (fix conflicts and run "git commit")
  (use "git merge --abort" to abort the merge)

Unmerged paths:
  (use "git add <file>..." to mark resolution)

        both modified:   README.md

no changes added to commit (use "git add" and/or "git commit -a")
```
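
Here is a sketch of backing out of a merge with `git merge --abort`, again simulated on one computer. The same assumptions apply: a bare repo (`git init --bare`) plays the role of GitHub, `git pull --no-rebase` asks explicitly for a merge, and the directory names and the `save` helper are invented for the example.

```sh
set -e
work=$(mktemp -d)
cd "$work"
git init --quiet --bare remote.git

# Hypothetical helper: overwrite README.md with "$1" and commit it.
save() {
    git config user.name "Demo User"
    git config user.email "demo@example.com"
    echo "$1" > README.md
    git add README.md
    git commit --quiet -m "$1"
}

# Set up two clones whose latest commits change the same line.
git clone --quiet remote.git theirs
(cd theirs && save "Original line." && git push --quiet origin HEAD)
git clone --quiet remote.git mine
(cd theirs && save "Their change." && git push --quiet origin HEAD)
cd mine
save "My change."

git pull --no-rebase || true    # conflict: git is now waiting for a merge
git status                      # mentions "You have unmerged paths."

git merge --abort               # undo the merge that the pull started
git status                      # no unmerged paths any more
cat README.md                   # prints: My change.
```

After the abort, your commit is untouched and the other person's commit is still waiting on the remote, so you'll meet the same conflict the next time you pull.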