# GitHub Workflow Illustration
## Dataset Autogeneration Project

#### Jul 2022 @Eli, Fei
#### Epstein ISE @Prof. Bruce Wilcox

---

## Outline
1. General process
2. Detailed steps
    1. Branch
    2. Pull, Commit, and Push
        1. Git basics
    3. Pull request and Merge
    4. Issues
3. Packages
    1. Install packages using pip
    2. Develop conda package

---

## 1. General Process

### 1.0. Preface
There are at least two ways to perform git operations: through **GitHub Desktop** or on the **command line**. This edition covers the command-line approach.

A full cycle of editing and updating code is:

Fetch/Pull ➡️ Edit code ➡️ Commit ➡️ Push

![workflow cycle](images/1.0-cycle.jpg)


### 1.1. The absolute first step: Clone the repository
The syntax is `git clone <url>`. You can find the URL on the corresponding GitHub repo page.
![github clone url](images/1.1-clone.jpg)


1. Open a terminal and use the `cd` command to go to the directory where you would like the project folder to live
2. Type in `git clone https://github.com/Faye-yufan/analytics-dataset.git`
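
To try the two steps above without touching the real project, here is a minimal sandbox sketch. It assumes a throwaway local repository (`upstream`, a made-up name) standing in for the GitHub URL, and a placeholder git identity; only the `git clone` line mirrors what you would type for the real repo.

```shell
set -eu
# Throwaway identity so the commit works on a machine with no git config (assumption)
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)   # scratch directory, safe to delete afterwards

# Stand-in for the GitHub repo: a local repository with one commit
git init -q "$work/upstream"
git -C "$work/upstream" symbolic-ref HEAD refs/heads/main
echo "# demo" > "$work/upstream/README.md"
git -C "$work/upstream" add README.md
git -C "$work/upstream" commit -q -m "Initial commit"

# Step 1: go to the directory that should contain the project folder
cd "$work"
# Step 2: clone; a local path is used here where the GitHub URL would go
git clone -q "$work/upstream" analytics-dataset

cd analytics-dataset
git remote -v     # the remote named "origin" points at what you cloned
git status -sb    # current branch and the remote branch it tracks
```

After cloning, `origin` is the name git gives to the place you cloned from, which is why later commands refer to `origin`.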



### 1.2. Fetching and Pulling

The first step is almost always to get the latest data from the remote repo, so that you work on top of the newest version.
You can run either `$ git fetch <remote>` or `$ git pull <remote>`.

The difference between `fetch` and `pull` is explained in detail in section 2.2.1.


### 1.3. Committing

**Step 1: Stage the changes**

After editing the code, we need to **stage** the changes in order to record them.

To stage a file, we run `git add <file>`.


**Step 2: Commit**

Now that your staging area is set up the way you want it, you can commit your changes. To commit is to record your staged changes together with a short, high-level explanation. This is done with `git commit -m "<message>"`.


### 1.4. Pushing

When you have your project at a point that you want to share, you have to push it upstream. The command for this is simple: `git push <remote> <branch>`. 
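
The whole cycle from section 1.0 can be rehearsed end to end in a sandbox. This is only a sketch under stated assumptions: a local bare repository (`remote.git`) plays the role of GitHub, and `notes.txt`, the commit messages, and the demo identity are made up for illustration.

```shell
set -eu
# Throwaway identity so commits work on a machine with no git config (assumption)
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
cd "$work"

# A local bare repository stands in for the GitHub remote
git init -q --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/main

# Local working copy wired to that remote under the name "origin"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/main
git remote add origin "$work/remote.git"
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git push -q origin main

# One pass through the cycle
git pull -q origin main            # 1. Fetch/Pull: start from the newest version
echo "v2" >> notes.txt             # 2. Edit code
git add notes.txt                  # 3. Stage the change ...
git commit -q -m "Update notes"    #    ... and commit it with a message
git push -q origin main            # 4. Push the commit to the remote

git log --oneline                  # two commits, newest first
```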


---

## 2. Detailed steps

### 2.1. Branch

![branch](images/2.1-branch.png)


#### 2.1.1. View your branch

`git branch`

e.g.

```
$ git branch
  add-categorical
* add-tutorial/eli
  add-tutorial/main
  main
  refactor-into-package
```

#### 2.1.2. Create a new branch
To create a new branch and switch to it at the same time, you can run `git checkout -b <branch_name>`:

```
$ git checkout -b iss53
Switched to a new branch "iss53"
```

#### 2.1.3. Switch to another branch

Let’s assume you’ve committed all your changes, so you can switch back to your `main` branch:

```
$ git checkout main
Switched to branch 'main'
```
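
The three branch commands above can be tried together in a scratch repository. This is a sketch only; the repository name `demo` and the demo identity are assumptions, and `iss53` is the branch name from the example above.

```shell
set -eu
# Throwaway identity so the commit works on a machine with no git config (assumption)
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "Initial commit"   # a branch needs at least one commit

git checkout -q -b iss53    # create iss53 and switch to it in one step
git branch                  # the asterisk marks the branch you are on (iss53)

git checkout -q main        # switch back to main
git branch                  # the asterisk has moved to main; iss53 still exists
```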


### 2.2. Pull, Commit, and Push

#### 2.2.1. Fetching and Pulling

The first step is almost always to get the latest data from the remote repo, so that you work on top of the newest version.
You can run:

`$ git fetch <remote>`

The command goes out to that remote project and pulls down all the data from that remote project that you don’t have yet. After you do this, you should have references to all the branches from that remote, which you can merge in or inspect at any time.

On the other hand, you can also run:

`$ git pull <remote>`

Running `git pull` generally fetches data from the server you originally cloned from and automatically tries to merge it into the code you’re currently working on.

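When to use `fetch` and when to use `pull`? Use `fetch` when you want to inspect the incoming changes before integrating them; use `pull` when you are ready to merge them into your current branch right away.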
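
The difference is easiest to see side by side. The sketch below assumes a local bare repository as the remote and two made-up working copies, `teammate` and `mine`; after the teammate pushes, `fetch` only moves `origin/main`, while `pull` also moves the local `main`.

```shell
set -eu
# Throwaway identity so commits work on a machine with no git config (assumption)
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/main

# The teammate creates the first commit and pushes it
git init -q teammate
cd teammate
git symbolic-ref HEAD refs/heads/main
git remote add origin "$work/remote.git"
echo "one" > file.txt
git add file.txt
git commit -q -m "First"
git push -q origin main

# You clone, then the teammate pushes a second commit
cd "$work"
git clone -q remote.git mine
cd "$work/teammate"
echo "two" >> file.txt
git commit -q -am "Second"
git push -q origin main

cd "$work/mine"
before=$(git rev-parse main)

git fetch -q origin                    # download only; local main is untouched
after_fetch=$(git rev-parse main)
git log --oneline main..origin/main    # inspect what is waiting to be merged

git pull -q origin main                # fetch + merge into the current branch
after_pull=$(git rev-parse main)

echo "main before:      $before"
echo "main after fetch: $after_fetch"
echo "main after pull:  $after_pull"
```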


#### 2.2.2. Committing


**Step 1: Stage the changes**

After editing the code, we need to **stage** the changes in order to record them.

To stage a file, we run `git add <file>`.

For example, before:

```
$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

    new file:   README

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

    modified:   CONTRIBUTING.md
```

After running `git add CONTRIBUTING.md`:

```
$ git add CONTRIBUTING.md
$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

    new file:   README
    modified:   CONTRIBUTING.md
```

**Step 2: Commit**

Now that your staging area is set up the way you want it, you can commit your changes. Remember that anything that is still unstaged — any files you have created or modified that you haven’t run git add on since you edited them — won’t go into this commit.

The syntax of a commit is `git commit -m "<message>"`.
For example,

```
$ git commit -m "Story 182: fix benchmarks for speed"
[master 463dc4f] Story 182: fix benchmarks for speed
 2 files changed, 2 insertions(+)
 create mode 100644 README
 ```
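
To see that only staged changes go into a commit, here is a small sandbox sketch. The file names follow the example above; the file contents, commit messages, and demo identity are assumptions made for illustration.

```shell
set -eu
# Throwaway identity so commits work on a machine with no git config (assumption)
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
git symbolic-ref HEAD refs/heads/main
echo "intro" > README
echo "rules" > CONTRIBUTING.md
git add README CONTRIBUTING.md
git commit -q -m "Initial commit"

# Edit both files, but stage only one of them
echo "more intro" >> README
echo "more rules" >> CONTRIBUTING.md
git add README

# Short status: an M in the first column means staged,
# an M in the second column means modified but not staged
git status --short

git commit -q -m "Expand README"

git show --stat --oneline HEAD   # the commit contains README only
git status --short               # CONTRIBUTING.md is still waiting to be staged
```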
 
#### 2.2.3. Pushing

When you have your project at a point that you want to share, you have to push it upstream. The command for this is simple: `git push <remote> <branch>`. The name for `remote` in our project is `origin`.

`$ git push origin name_of_the_branch`
 

### 2.3. Pull request and Merge

#### Pull Request
> - submitting a change to another code base
> - it's a conversation, to get some review on your code
> - educational tool for people who join the team at a later time
- workflow
    - create a branch
    - make some commits
        >`git add <FILENAME>`  
        >`git commit -m "<YOUR COMMIT MESSAGE>"`
    - when you are ready to have it reviewed, publish the branch and create a PR to send it back to the team
        - if it's your first push of this branch to the repo
        >`git push --set-upstream origin <BRANCH NAME>`
        - after that, each later push only needs
        >`git push`
    - merge if it's good to go
    - delete the branch
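
The command-line part of the workflow above can be sketched as follows. Opening, reviewing, and merging the PR happen on GitHub and are not shown. The bare repository `remote.git`, the branch name `add-tutorial/demo`, the file `tutorial.md`, and the demo identity are all assumptions for this sandbox.

```shell
set -eu
# Throwaway identity so commits work on a machine with no git config (assumption)
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git          # stand-in for the GitHub repo
git -C remote.git symbolic-ref HEAD refs/heads/main
git init -q project
cd project
git symbolic-ref HEAD refs/heads/main
git remote add origin "$work/remote.git"
git commit -q --allow-empty -m "Initial commit"
git push -q origin main

branch="add-tutorial/demo"             # hypothetical branch name

git checkout -q -b "$branch"           # create a branch
echo "draft" > tutorial.md             # make some commits
git add tutorial.md
git commit -q -m "Add tutorial draft"

# First push of this branch: publish it and remember where it goes
git push -q --set-upstream origin "$branch"

echo "more detail" >> tutorial.md
git commit -q -am "Extend tutorial"
git push -q                            # later pushes need no arguments

git status -sb                         # shows the branch and its upstream
```

Because `--set-upstream` records `origin/<branch>` as the branch's upstream, the plain `git push` afterwards knows where to send the new commit.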


#### Merge
- Merge using the command line  
>`git checkout add-tutorial/main`  
>`git branch`  
>`git merge add-tutorial/fei`
- Resolving a conflict during a merge  
    - Git marks the conflicting part of the file, like below:  
    ```
    <<<<<<< HEAD
           ------------------------------------------
          |what the file looks like on the side of   |
          |the branch you currently have checked out |
           ------------------------------------------
    =======
           ----------------------------------------------
          | what was on the incoming branch you are      |
          | attempting to merge that caused the conflict |
           ----------------------------------------------
    >>>>>>> add-tutorial/fei  
    ```
    - identify what content should stay and what should go, remove the conflict markers, and save the file
    - stage the file and commit (see section 2.2.2)
- delete the branch
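
The conflict steps above can be reproduced safely in a scratch repository. This is a sketch under assumptions: the file `config.txt`, its contents, the branch name `feature`, and the demo identity are invented so that both branches change the same line.

```shell
set -eu
# Throwaway identity so commits work on a machine with no git config (assumption)
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
git symbolic-ref HEAD refs/heads/main
echo "greeting: hello" > config.txt
git add config.txt
git commit -q -m "Add config"

# Two branches change the same line in different ways
git checkout -q -b feature
echo "greeting: hi" > config.txt
git commit -q -am "Use hi"

git checkout -q main
echo "greeting: hey" > config.txt
git commit -q -am "Use hey"

# git merge exits with an error on conflict, so guard it under "set -e"
if ! git merge feature > /dev/null 2>&1; then
    echo "Conflict; the file now contains markers:"
    cat config.txt
fi

# Resolve: keep the content you want, with no markers left in the file
echo "greeting: hi" > config.txt
git add config.txt                       # staging marks the conflict as resolved
git commit -q -m "Merge feature into main"

git branch -d feature                    # safe to delete once it is merged
git log --oneline --graph
```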

### 2.4. Issues
- Traditional use: bug reports and feature requests
- Anybody who has access to the repo can assign an issue to a member of the team
- Organize with labels
- Link with a PR
## 3. Packages

#### How to install pip with get-pip.py
To manually install pip on Windows, you will need a copy of get-pip.py. For older Python versions, you may need the version-specific copy of the file from bootstrap.pypa.io. Download the file to a folder on your computer, or use the curl command:

> `curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py`

Next, run the following command to install pip:
> `python get-pip.py`

Check that pip was installed successfully:
> `pip --version`

#### Install the `Analytics Dataframe` test package
> `pip install -i https://test.pypi.org/simple/ example-package-analyticsdf`

### Develop conda version of Analytics Dataframe package