# Session on Git

## What is Git?
Explain what Git is and how it differs from GitHub. What is the definition of Git on Google? What is meant by a Version Control System (VCS) and a Source Code Manager (SCM), and what is the difference between the two? Are they the same thing? What is an MVP, and what does it say? Name some other VCSs.
What is the need for Git or a VCS?
Explain the types of VCS: centralized and distributed.
Mention the advantages of Git or a VCS: version control, bug fixing, non-linear development, and collaborative development.
Can you demonstrate one or two scenarios where a VCS or Git saves the day?

## **Notes: Git, VCS, and SCM**

### **1. What is Git?**

* Git is a **Distributed Version Control System (DVCS)** used to track changes in files and manage code.
* Created by **Linus Torvalds (2005)** for Linux kernel development.
* Key characteristics:

  * Every developer has a **complete local copy** of the repository.
  * Very fast operations due to local processing.
  * Supports **branching, merging, non-linear development**, and experimentation.
  * Ensures strong history tracking and data integrity.

---

### **2. How is Git different from GitHub?**

* **Git**:

  * A tool/software.
  * Installed on your system.
  * Manages version control and history of your files.
  * Works even without internet or any online account.

* **GitHub**:

  * A cloud-based platform.
  * Hosts Git repositories online.
  * Adds collaboration features: pull requests, issues, code review, project management.
  * Acts as a *remote storage + collaboration* layer.

**In short**:
Git = version control system
GitHub = online service built around Git

---

### **3. What is a Version Control System (VCS)?**

A **Version Control System** is a tool that:

* Tracks changes made to files over time.
* Stores file history (who changed what & when).
* Allows restoring older versions.
* Helps teams avoid overwriting each other's work.

---

### **4. What is a Source Code Manager (SCM)? Are VCS and SCM the same?**

* **SCM (Source Code Manager, also expanded as Source Code Management)** manages source code changes and keeps track of different versions.
* SCM and VCS are generally used as **synonyms**.
* Some people consider SCM a broader term, but in practice:

  **SCM ≈ VCS**

Both store code, track versions, and manage revisions.

---

### **5. What is MVP? Why is it mentioned in Git context?**

* **MVP = Minimum Viable Product**
  A basic version of a product that contains only core features needed to test an idea or gather feedback.
* Relevance to Git:

  * While developing MVPs, teams make rapid changes.
  * Git helps manage frequent iterations, experimentation, and quick rollbacks.
  * Developers can test ideas in separate branches without affecting the main project.

---

### **6. Why do we need Git or a VCS?**

Key reasons:

1. **History Tracking**
   Complete record of all changes made over time.

2. **Recovery from Mistakes**
   You can revert to any previous working version.

3. **Safe Experimentation (Branches)**
   Try new features without affecting the main code.

4. **Collaboration**
   Multiple developers can work on the same project at once; Git flags conflicting edits at merge time instead of silently overwriting them.

5. **Offline Work (Distributed VCS)**
   You can commit and work even without internet.

6. **Reduces risk of losing work**
   Multiple copies exist due to distributed architecture.

---

### **7. Types of VCS**

#### **1. Centralized VCS (CVCS)**

* Single central server stores the entire code.
* Developers check out and commit to **one main repository**.
* Pros: simple to understand, easy administration.
* Cons:

  * **Single Point of Failure**: if the server dies and there is no backup, the history is lost.
  * Need a network connection to the server to commit or update.

Examples: SVN, CVS, Perforce.

---

#### **2. Distributed VCS (DVCS)**

* Every developer gets a **complete copy** of the repository.
* All commits, branches, history exist locally.
* Pros:

  * Faster, safer, works offline.
  * No single point of failure.
  * Better branching and merging.
* Cons:

  * Slightly more complex for beginners.

Examples: Git, Mercurial.

---

### **8. Other Version Control Systems (VCS)**

* **SVN (Subversion)**
* **Mercurial**
* **CVS**
* **Perforce**
* **Bazaar**

---

### **9. Advantages of Git / VCS**

#### **1. Version Control**

* Keeps full history
* Easy to compare changes
* Allows rollback

#### **2. Bug Fixing**

* Find the exact change where a bug came from
* Create bug-fix branches
* Test and then merge

#### **3. Non-Linear Development**

* Multiple branches
* Experimental features
* Safely merge once complete

#### **4. Collaborative Development**

* Many people can work at once
* Git manages merges and conflicts
* Remote repositories allow global teamwork

---

### **10. Scenarios where Git/VCS "saves the day"**

#### **Scenario 1: A major bug breaks the project**

* Developer accidentally breaks a critical function.
* Without Git → panic, difficult to find where it broke.
* With Git:

  * Go through commit history.
  * Identify the commit that introduced the issue.
  * Revert or fix the problem in a safe branch.
  * Project restored quickly.

#### **Scenario 2: Two developers overwrite each other's work**

* Without VCS → files may get overwritten and lost.
* With Git:

  * Each developer works in their own branch.
  * Git warns about conflicting changes during merge.
  * Conflicts can be resolved with clarity.
  * No work is lost.

#### **Scenario 3: Laptop crash / accidental deletion**

* In distributed VCS like Git:

  * Every developer has a full copy.
  * Even if one computer fails, project history is safe.
  * Just clone from another copy or remote.

---

## Git & GitHub Workflow
Working dir -> staging area/index (both are part of a "repo")

Suppose you add a code file to the working dir and make some changes to it. A snapshot is taken and the changes move to the staging area/index. The file then moves from the staging area into the repo, and this step is called a "commit". The committed versions are stored in the repo under names such as c1, c2, etc., and you can switch to any of these stored versions later.

# **Git & GitHub Workflow (Detailed Notes)**

## **1. Basic Flow Overview**

A Git repository internally works with **three main areas**:

1. **Working Directory (Working Tree)**

   * Where you actually create, edit, delete, and view files.
   * Contains the real physical files.

2. **Staging Area / Index**

   * A middle area where Git stores *snapshots of changes* that are ready to be committed.
   * You manually choose which changes should be included in the next commit.

3. **Repository (Local Repo / .git folder)**

   * Stores committed snapshots as versions (c1, c2, c3…).
   * Contains the full history and metadata.

---

## **2. Git Workflow Explanation**

### ➤ **Step 1: Creation of Working Directory**

You start by creating a project folder and running this command inside it:

```
git init
```

After this, the project has:

* A **Working Directory** (the files you see)
* A hidden folder **.git** that holds:

  * **Staging Area / Index**
  * **Repository**

All of Git's own data lives inside this `.git` folder.

---

### ➤ **Step 2: Add Files to Working Directory**

You add or modify files normally in your project folder.

Example:

* Create `app.py`
* Add some code to it

At this point the file is **untracked**: `git status` lists it as new, but Git is not recording its history yet.

---

### ➤ **Step 3: Move Changes to Staging Area (Snapshot Taken)**

When you run:

```
git add app.py
```

Git takes a **snapshot** of your current file state and moves it into the **Staging Area (Index)**.

> Note that adding each file by name can be tedious in a large project. An alternative is `git add .`, which stages every change under the current directory; list the files that should never be committed in `.gitignore`.

This means:

* Git is preparing this version of the file to be stored permanently.
* You can choose which files or changes to stage.

---

### ➤ **Step 4: Commit the Changes (Move to Repo)**

When you commit:

```
git commit -m "Initial commit"
```

Git:

* Takes everything in the staging area
* Saves it into the **Repository**
* Creates a **commit** identified by a SHA-1 hash (referred to in these notes as c1, c2, c3…)

A commit is:

* A permanent snapshot
* A version you can always return to
* Part of project history

---

### ➤ **Step 5: Switching Between Versions**

Git allows switching to any commit using:

```
git checkout <commit-id>
```

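or by switching branches. Checking out a commit ID directly puts the repository in a "detached HEAD" state, which is fine for inspecting an old version.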

This means you can go back to:

* c1: initial version
* c2: after adding a feature
* c3: after fixing a bug
* etc.

---

# **3. ASCII Diagram of the Git Workflow**

The workflow as a diagram:

```
               ---------------------------
              |        WORKING DIR        |
              |  (Your project files)     |
               ---------------------------
                      |
                      |  git add (snapshot)
                      v
               ---------------------------
              |   STAGING AREA / INDEX    |
              | (Prepared changes ready   |
              |       for commit)         |
               ---------------------------
                      |
                      |  git commit
                      v
               ---------------------------
              |         REPOSITORY        |
              |  (.git folder with commit |
              |   history: c1, c2, c3...) |
               ---------------------------

```

Another simplified version showing the flow:

```
WORKING DIR  --git add-->  STAGING AREA  --git commit-->  REPOSITORY
      |                         |                            |
   modify files               snapshot                    saved history
```

Commit history example:

```
REPOSITORY:

c1 → c2 → c3 → c4 → ...
```

You can jump to any of these versions anytime.
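
The three areas and the add/commit/checkout flow can be mimicked with a few dictionaries. This is only a toy sketch to make the flow concrete: the class `ToyRepo` and its method names are made up for illustration, and real Git stores its data very differently.

```python
class ToyRepo:
    """A toy model of Git's three areas; not how Git stores data internally."""

    def __init__(self):
        self.working_dir = {}  # filename -> current contents "on disk"
        self.staging = {}      # filename -> contents staged for the next commit
        self.commits = []      # list of snapshots; index 0 plays the role of c1

    def add(self, filename):
        # Like "git add": copy the file's current contents into the staging area.
        self.staging[filename] = self.working_dir[filename]

    def commit(self, message):
        # Like "git commit": freeze a copy of the staging area as a new snapshot.
        self.commits.append({"message": message, "files": dict(self.staging)})
        return len(self.commits)  # 1 for c1, 2 for c2, ...

    def checkout(self, number):
        # Like "git checkout <commit>": restore the working dir from a snapshot.
        self.working_dir = dict(self.commits[number - 1]["files"])
        self.staging = dict(self.working_dir)


repo = ToyRepo()
repo.working_dir["app.py"] = "print('v1')"
repo.add("app.py")
c1 = repo.commit("Initial commit")

repo.working_dir["app.py"] = "print('v2')"
repo.add("app.py")
c2 = repo.commit("Update greeting")

repo.checkout(c1)
print(repo.working_dir["app.py"])  # print('v1')
```

Note that an edit made after `add` but before `commit` would not be part of the snapshot, which mirrors why a re-edited file has to be staged again.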

---

# **Making Changes in Git**

## **1. `git add`**

* Moves changes from **Working Directory → Staging Area**.
* You can add:

  * A specific file: `git add file.py`
  * Multiple files: `git add file1.py file2.py`
  * All changes: `git add .`

---

## **2. `git commit`**

* Saves (records) the staged snapshot into the repository.
* Equivalent to “taking a permanent photo of your project at this point.”

### **When to Commit?**

* Commit **every time a meaningful change is completed**, such as:

  * A feature implemented
  * A bug fixed
  * A section of code refactored
  * Before pulling changes from others
* Avoid committing:

  * Half-written code
  * Debugging junk
  * Irrelevant temporary files

---

## **3. How to Write a Good Commit Message**

A good commit message contains two parts:

### **A) Title (Short Summary)**

* 50–70 characters.
* Imperative tone:

  * **Bad:** "Fixed bug in login"
  * **Good:** "Fix login authentication bug"

### **B) Description (Optional, but useful)**

* 1–3 lines explaining what and why.
* Example:

  ```
  Fix login authentication bug

  Updated password hashing logic to ensure consistent encoding.
  ```
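
The guidelines above can be turned into a small checker. This is a rough sketch, not a standard tool: the function `check_commit_message` is hypothetical, the 70-character limit is simply the upper bound from these notes, and the "imperative mood" test is a crude heuristic that only looks for a first word ending in "ed".

```python
def check_commit_message(message, max_title_len=70):
    """Return a list of style problems found in a commit message (empty if none)."""
    problems = []
    lines = message.splitlines()
    title = lines[0] if lines else ""
    words = title.split()

    if not words:
        problems.append("title is empty")
    if len(title) > max_title_len:
        problems.append(f"title is longer than {max_title_len} characters")
    # Crude heuristic only: a first word ending in "ed" is often past tense.
    if words and words[0].lower().endswith("ed"):
        problems.append("title may not be in the imperative mood")
    # A description, if present, should be separated from the title by a blank line.
    if len(lines) > 1 and lines[1].strip():
        problems.append("missing blank line between title and description")
    return problems


good = (
    "Fix login authentication bug\n"
    "\n"
    "Updated password hashing logic to ensure consistent encoding."
)
bad = "Fixed bug in login"

print(check_commit_message(good))  # []
print(check_commit_message(bad))   # ['title may not be in the imperative mood']
```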

---

# **VS Code Git File Abbreviations (VERY USEFUL)**

When using Git inside VS Code, you’ll see letters next to each file representing its **status**:

| Symbol | Meaning                  | Explanation                                                                  |
| ------ | ------------------------ | ---------------------------------------------------------------------------- |
| **U**  | Untracked                | File is new; Git does not know it yet.                                       |
| **A**  | Added                    | File is staged (added) and ready to commit.                                  |
| **M**  | Modified                 | A tracked file has been changed (listed under Changes until it is staged).   |
| **D**  | Deleted                  | File was deleted and Git detected it.                                        |
| **R**  | Renamed                  | File was renamed.                                                            |
| **C**  | Copied                   | File was copied.                                                             |
| **??** | Untracked (command line) | `git status --short` shows this for untracked files.                         |
| **!!** | Ignored (command line)   | `git status --short --ignored` shows this for files matched by `.gitignore`. |
| **T**  | Type changed             | E.g., file → symlink.                                                        |

### **Typical Scenarios**

* You create a new file → **U**
* You run `git add file.py` → **A**
* You edit that same file again after staging → **M** appears along with **A**
* You delete a file → **D**
* You rename file → **R**

---

# ✅ **Seeing Commits**

## **1. View commit history**

### **`git log`**

* Shows a detailed log of commits.
* Displays:

  * Commit SHA ID
  * Author
  * Date
  * Commit message

Example output:

```
commit 92acd44979b4...
Author: Harsh Chandra
Date: Thu Nov 20 16:38:52 2025 +0530

    add 2 more courses
```

### **`git log --oneline`**

* Shorter form → **one line per commit**
* Shows:

  * Short SHA (first 7 characters)
  * Commit message

Example:

```
92acd44 add 2 more courses
70bbf4d add courses to the homepage
f11f5d2 Initial test commit via bash
```

#### **Difference:**

* `git log` → detailed, multiline output
* `git log --oneline` → compact, readable summary

---

## **2. What is the SHA ID?**

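* A **40-character hexadecimal identifier** that is unique to each commit.
* Example:
  `92acd44979b416b02534f3ea1756ef1db7c0da97`
* Generated by SHA-1 hashing the commit object, which records:

  * The snapshot of the file contents (through a tree hash)
  * Author and committer info
  * Date/time
  * Commit message
  * Previous (parent) commit hash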

Purpose:

* Guarantees integrity.
* Used for referencing commits:

  ```
  git checkout <sha>
  git show <sha>
  git revert <sha>
  ```
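
To make the hashing idea concrete, the sketch below computes the ID Git gives to a file's contents (a "blob" object). Commit IDs are produced the same way, only over the text of the commit object rather than a file. The function name `git_blob_sha1` is made up for this example; the result should match what `git hash-object <file>` prints for the same bytes.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """SHA-1 that Git assigns to a file's contents (a "blob" object)."""
    # Git hashes a small header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


sha = git_blob_sha1(b"hello world\n")
print(sha)       # full 40-character ID
print(sha[:7])   # abbreviated form, like the IDs in `git log --oneline`

# Changing even one character produces a completely different ID.
print(git_blob_sha1(b"hello world!\n") == sha)  # False
```

Because the ID is derived from the content, identical content always gets the same ID, and any corruption or tampering changes it.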

---

## **3. Commit Order in Git**

* By default, Git logs are shown in **reverse chronological order**:

  * Latest commit first
  * Oldest commit last

---

## **4. What does (HEAD -> master) mean?**

Example:

```
92acd44 (HEAD -> master) add 2 more courses
```

Meaning:

* **HEAD** = pointer to your current checkout
* **master** = your current branch

Thus:

* “You are currently on the master branch, and HEAD is pointing to this commit.”

---

# ✅ **Advanced Commit Viewing**

## **1. `git log --stat`**

Shows:

* Which files were changed
* How many lines added/removed

Example:

```
 app.py  | 2 ++
 .gitignore | 1 +
```

---

## **2. `git log -p`**

Shows:

* All commits **with full diffs**
* Every line added or removed

This is detailed and heavy → used less often.

---

## **3. Better alternative: `git show <SHA>`**

Shows:

* Commit details
* Patch (diff)
* Which files changed

Example:

```
git show f11f5d2
```

### **Reading the output line by line**

```
commit f11f5d2958937d1f2c0d55c1d2014e27aa8a516b
```

→ The full commit SHA.

```
Author: Harsh Chandra Pvt <iamharshchandra@gmail.com>
```

→ Who made the commit.

```
Date: Thu Nov 20 16:00:36 2025 +0530
```

→ When the commit happened.

```
Initial test commit via bash
```

→ Commit message.

---

### **Diff section**

```
diff --git a/Xdp.jpg b/Xdp.jpg
```

→ A diff between the old and new versions of the file.

```
new file mode 100644
```

→ This file was newly added.

```
Binary files /dev/null and b/Xdp.jpg differ
```

→ Image files are binary → Git simply notes that it’s a binary change.

---

### **Next file**

```
diff --git a/app.py b/app.py
new file mode 100644
```

→ The file `app.py` is newly added.

```
--- /dev/null
+++ b/app.py
```

→ No old version existed; new version added.

```
@@ -0,0 +1,14 @@
```

→ 14 new lines added.

```
+import streamlit as st
+st.title("CampusX")
...
```

→ Every line starting with `+` is added.

---

# **What is `git diff`?**

* **`git diff`** shows the **difference between two states of your project**.
* By default, when you simply run `git diff` (with no arguments), Git compares:
  **✔ the Working Directory**
  **with**
  **✔ the Staging Area (Index)**

Meaning:

* It shows **what you have changed in your working files**,
* **but have NOT yet added (`git add`) to the staging area**.

This makes `git diff` extremely useful for reviewing changes **before** committing.

---

# **What does `git diff` show?**

* Which files were changed.
* Which lines were added.
* Which lines were removed.
* Exact line-by-line differences.

It uses typical diff symbols:

* **`-`** → line removed
* **`+`** → line added
* **`@@ ... @@`** → the part of the file in which the changes occurred
* **`--- a/file` / `+++ b/file`** → comparison between old and new version
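
The same `-` / `+` notation can be reproduced without Git using Python's standard `difflib` module, which may help when getting used to reading diffs. This is a sketch for illustration only: the two lists are a shortened stand-in for two versions of `app.py`, and `difflib` is not what Git uses internally.

```python
import difflib

old = [
    'st.header("Courses Offered")',
    'st.subheader("Data Science")',
    'st.subheader("Data Analysis")',
    'st.subheader("Data Engineering")',
    'st.subheader("Python")',
]
new = [
    'st.header("Courses Offered")',
    'st.subheader("Data Science and Machine Learning")',
    'st.subheader("Data Analysis")',
    'st.subheader("Python")',
]

# Produce a unified diff, the same general format that `git diff` prints.
diff_lines = list(
    difflib.unified_diff(old, new, fromfile="a/app.py", tofile="b/app.py", lineterm="")
)
print("\n".join(diff_lines))

# Collect changed lines, skipping the "---" / "+++" file header lines.
removed = [
    line[1:] for line in diff_lines
    if line.startswith("-") and not line.startswith("---")
]
added = [
    line[1:] for line in diff_lines
    if line.startswith("+") and not line.startswith("+++")
]
print("removed:", removed)
print("added:", added)
```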

---

# **Understanding an Example Output**

You ran:

```
git diff
```

and got:

```
diff --git a/app.py b/app.py
index 1b0ffe0..1242e30 100644
--- a/app.py
+++ b/app.py
@@ -14,8 +14,7 @@ with col2:
     )
 
 st.header("Courses Offered")
-st.subheader("Data Science")
+st.subheader("Data Science and Machine Learning")
 st.subheader("Data Analysis")
-st.subheader("Data Engineering")
 st.subheader("Python")
 st.subheader("SQL")
```

Let’s break this down **line by line**:

---

### **`diff --git a/app.py b/app.py`**

Git is comparing two versions of the file **app.py**.

---

### **`index 1b0ffe0..1242e30 100644`**

* `1b0ffe0` → hash of the old file contents (blob)
* `1242e30` → hash of the new file contents (blob)
* `100644` → file mode (a regular, non-executable file)

---

### **`--- a/app.py` / `+++ b/app.py`**

* `a/app.py` = old version
* `b/app.py` = new version

---

### **`@@ -14,8 +14,7 @@`**

This means:

* The hunk starts at **line 14** in both the old and the new file
* In the old file the hunk spans **8 lines**
* In the new file it spans **7 lines**
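
The numbers in a hunk header can be pulled apart with a short regular expression. The helper `parse_hunk_header` below is a hypothetical name used for illustration; it is a minimal sketch that handles the standard `@@ -start,count +start,count @@` form, where an omitted count means 1.

```python
import re

HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(line):
    """Return (old_start, old_count, new_start, new_count) for a hunk header."""
    match = HUNK_HEADER.match(line)
    if match is None:
        raise ValueError(f"not a hunk header: {line!r}")
    old_start, old_count, new_start, new_count = match.groups()
    # When a count is omitted (e.g. "@@ -3 +3 @@"), it defaults to 1.
    return (
        int(old_start),
        int(old_count) if old_count is not None else 1,
        int(new_start),
        int(new_count) if new_count is not None else 1,
    )


print(parse_hunk_header("@@ -14,8 +14,7 @@ with col2:"))  # (14, 8, 14, 7)
print(parse_hunk_header("@@ -0,0 +1,14 @@"))              # (0, 0, 1, 14)
```

The second call is the header of a newly added file: nothing on the old side, 14 lines on the new side.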

---

### **The Changes Inside**

Now the important part:

#### ❌ Removed lines (start with `-`)

```
-st.subheader("Data Science")
-st.subheader("Data Engineering")
```

#### ✅ Added line (starts with `+`)

```
+st.subheader("Data Science and Machine Learning")
```

#### ➡️ Unchanged context lines (prefixed with a single space in the diff)

```
st.header("Courses Offered")
st.subheader("Data Analysis")
st.subheader("Python")
st.subheader("SQL")
```

---

# What Happened in the Code?

You made the following modifications in `app.py`:

### 1️⃣ Replaced this:

```
st.subheader("Data Science")
```

with:

```
st.subheader("Data Science and Machine Learning")
```

### 2️⃣ Removed this line entirely:

```
st.subheader("Data Engineering")
```

The other lines stayed the same.

Git is showing **exactly what you changed**, like a before/after comparison.

---

# Summary of What `git diff` Does

| State Compared                   | Command             |
| -------------------------------- | ------------------- |
| Working Directory ↔ Staging Area | `git diff`          |
| Staging Area ↔ Latest Commit     | `git diff --staged` |
| Working Dir ↔ Latest Commit      | `git diff HEAD`     |

`git diff` is mainly used **before `git add`** to verify what you’re about to stage.

---

# **Creating Versions of a Software (Using Tags in Git)**

Software generally evolves in stages. To keep track of stable, meaningful points in a project’s history, **Git provides “tags.”**

## **What Are Tags?**

Tags in Git are *labels* assigned to specific commits.
They are typically used to mark release versions, which are usually numbered with three parts:

* **X (Major Version):**
  Indicates large, possibly backward-incompatible changes.
  Example: shifting from v1.x.x to v2.x.x.

* **Y (Minor Version):**
  Indicates new features that are backward-compatible.
  Example: v1.1.x → v1.2.x.

* **Z (Patch Version):**
  Indicates small fixes, bug patches, optimizations, etc., without breaking compatibility.
  Example: v1.1.1 → v1.1.2.

Together this forms **Semantic Versioning (SemVer)**:
`vX.Y.Z`
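
The bumping rules can be sketched in a few lines. The helpers `parse_version` and `bump` are illustrative names, not part of Git, and the sketch assumes plain `vX.Y.Z` tags with no pre-release suffixes. Under SemVer, bumping a part resets the parts to its right to zero.

```python
def parse_version(tag):
    """Split a tag such as 'v1.2.3' into (major, minor, patch) integers."""
    number = tag[1:] if tag.startswith("v") else tag
    major, minor, patch = number.split(".")
    return int(major), int(minor), int(patch)


def bump(tag, part):
    """Return the next version tag after bumping 'major', 'minor', or 'patch'."""
    major, minor, patch = parse_version(tag)
    if part == "major":
        return f"v{major + 1}.0.0"         # breaking change
    if part == "minor":
        return f"v{major}.{minor + 1}.0"   # backward-compatible feature
    if part == "patch":
        return f"v{major}.{minor}.{patch + 1}"  # bug fix
    raise ValueError(f"unknown part: {part!r}")


print(bump("v1.1.1", "patch"))  # v1.1.2
print(bump("v1.1.2", "minor"))  # v1.2.0
print(bump("v1.2.0", "major"))  # v2.0.0

# Sorting by the parsed tuple orders versions numerically, not alphabetically.
tags = ["v1.10.0", "v1.2.0", "v1.9.3"]
print(sorted(tags, key=parse_version))  # ['v1.2.0', 'v1.9.3', 'v1.10.0']
```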

---

## **Why Use Tags?**

* They mark **stable releases** of software.
* They allow teams to **easily return** to a particular version.
* They ensure **clarity** about what version is deployed or distributed.
* They make it easy to **compare different releases**.

Example:
A series of commits may represent work on a feature. Once complete, the final commit can be marked with a tag like `v1.0.0`.

---

# **How Tags Work in Practice**

Below is the commit history for the example repository:

```
2d9ff95 (HEAD -> master) change name of DS course
92acd44 add 2 more courses
70bbf4d add courses to the homepage
f11f5d2 Initial test commit via bash
```

The commit **`2d9ff95`** is the latest commit.

---

## **Creating a Tag**

### **1. Creating an Annotated Tag on the Latest Commit**

```
git tag -a v1.0.0
```

* `-a` → creates an **annotated tag** (recommended).
* Git asks for a tag message.
* The tag gets attached to the **current commit** (HEAD).

Checking the log afterward:

```
commit 2d9ff9554c3996ed40cc3e876015fce795f95935 (HEAD -> master, tag: v1.0.0)
```

This shows that `v1.0.0` now marks the latest commit.

---

## **Deleting a Tag**

To delete a tag locally:

```
git tag -d v1.0.0
```

Output:

```
Deleted tag 'v1.0.0' (was e39a49e)
```

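The tag label is removed, but the commit itself remains untouched. The hash in the output (`e39a49e`) belongs to the annotated tag object, which is why it differs from the commit hash `2d9ff95`.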

---

## **Tagging an Older Commit**

Git allows assigning a tag to any commit by using its **SHA ID**.

Example:

```
git tag -a v1.0.0 92acd4
```

Here, `92acd4` is the shortened SHA for:

```
92acd44979b416b02534f3ea1756ef1db7c0da97
```

Checking the log now:

```
commit 92acd44979b416b02534f3ea1756ef1db7c0da97 (tag: v1.0.0)
```

This shows the tag is now attached to the older commit, not the latest one.
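
Conceptually, a tag is just a name that points at one commit, and a shortened SHA works as long as it matches exactly one commit. The sketch below models the create, delete, and re-tag sequence with a plain dictionary; the `resolve` helper and the `tags` dictionary are made up for illustration and only use the abbreviated IDs from the example log.

```python
# Abbreviated commit IDs from the example history, newest first.
commits = ["2d9ff95", "92acd44", "70bbf4d", "f11f5d2"]


def resolve(prefix, known=commits):
    """Return the single commit ID that starts with the given prefix."""
    matches = [commit for commit in known if commit.startswith(prefix)]
    if len(matches) != 1:
        raise ValueError(f"{prefix!r} matches {len(matches)} commits")
    return matches[0]


tags = {}

# Like `git tag -a v1.0.0`: tag the latest commit (HEAD).
tags["v1.0.0"] = resolve("2d9ff95")

# Like `git tag -d v1.0.0`: only the label goes away, the commit stays.
del tags["v1.0.0"]

# Like `git tag -a v1.0.0 92acd4`: tag an older commit by a shortened ID.
tags["v1.0.0"] = resolve("92acd4")

print(tags)          # {'v1.0.0': '92acd44'}
print(len(commits))  # 4
```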

---

# **How Tags Benefit Versioning**

* A tag can represent a **complete feature**, **release**, or **milestone**.
* Tags make rollback easy: for example, `git checkout v1.0.0` checks out an older stable build.
* When collaborating, team members know *exactly* which version is deployed.
* Tools such as Docker, CI/CD systems, deployment pipelines, and GitHub Releases use tags to create versioned builds.

---

# **Summary (Bullet Format)**

### **Tags**

* Labels assigned to commits.
* Used for version numbering.
* Usually follow **Semantic Versioning X.Y.Z**.

### **X (Major)**

* Breaking changes.

### **Y (Minor)**

* New features.
* Backward compatible.

### **Z (Patch)**

* Bug fixes.
* Small improvements.

### **Commands Used**

| Action                       | Command                   | Meaning                             |
| ---------------------------- | ------------------------- | ----------------------------------- |
| Create annotated tag on HEAD | `git tag -a v1.0.0`       | Creates a tag for the latest commit |
| Delete a tag                 | `git tag -d v1.0.0`       | Removes the tag locally             |
| Tag a specific commit        | `git tag -a v1.0.0 <SHA>` | Assigns tag to an older commit      |
| List all tags                | `git tag`                 | Shows available tags                |

