# 🚀 Databricks Repos (Git Integration) - GitHub Workflow

Databricks Repos integrates directly with Git providers like **GitHub**, enhancing version control and collaboration for data projects.

---

## Why Use Databricks Repos?

- Supports **branching, merging, and pull requests**, which basic notebook revision history does not.
- Seamlessly integrates with GitHub, Azure DevOps, GitLab, Bitbucket, and more.
- Simplifies collaboration on notebooks, code, and data workflows.

---

## 1️⃣ Configuring Git Integration in Databricks

**Steps:**

1. Go to **User → Settings** (top-right corner).
2. Navigate to the **Linked accounts** tab.
3. Select **GitHub** as your provider.
4. Choose to link the account using a **Personal Access Token (PAT)**.
5. Enter:
   - Your **GitHub username**
   - Your **Personal Access Token (PAT)** for authentication
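
The same linking step can also be scripted. The sketch below is a minimal example, assuming the Databricks Git credentials REST endpoint (`POST /api/2.0/git-credentials`) and the provider value `gitHub`. The workspace URL, tokens, username, and the helper name `build_git_credential_request` are placeholders for illustration. It only builds the request and does not send it.

```python
import json
import urllib.request


def build_git_credential_request(workspace_url, databricks_token, git_username, github_pat):
    """Build (but do not send) a request that registers a GitHub PAT with Databricks."""
    payload = {
        "git_provider": "gitHub",  # assumed provider identifier for GitHub
        "git_username": git_username,
        "personal_access_token": github_pat,
    }
    return urllib.request.Request(
        url=f"{workspace_url.rstrip('/')}/api/2.0/git-credentials",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {databricks_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; replace them with your own before sending.
credential_request = build_git_credential_request(
    "https://example.cloud.databricks.com/",
    "dapi-placeholder",
    "octocat",
    "ghp_placeholder",
)
# To send it: urllib.request.urlopen(credential_request)
```

Keeping the request construction separate from the network call makes it easy to inspect the payload before any credentials leave your machine.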

---

## 2️⃣ Generating a Personal Access Token (PAT)

In GitHub:

1. Go to **Settings → Developer Settings → Personal access tokens**.
2. Click **Generate new token**.
3. Enter a descriptive **Note** so you can identify the token later.
4. Grant the necessary permissions by checking the **repo** scope.
5. Copy and save the token securely; it is shown only once.
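
Before pasting the token into Databricks, it can help to confirm that it carries the **repo** scope. The sketch below assumes a classic PAT, for which GitHub reports granted scopes in the `X-OAuth-Scopes` response header; fine-grained tokens may not return this header. The helper names are hypothetical.

```python
import urllib.request


def has_scope(scopes_header, required="repo"):
    """Return True if a comma-separated scopes header value lists `required` exactly."""
    scopes = {s.strip() for s in (scopes_header or "").split(",") if s.strip()}
    return required in scopes


def fetch_token_scopes(github_pat):
    """Call the GitHub API and return the raw X-OAuth-Scopes header (or None)."""
    request = urllib.request.Request(
        "https://api.github.com/user",
        headers={"Authorization": f"Bearer {github_pat}"},
    )
    with urllib.request.urlopen(request) as response:
        return response.headers.get("X-OAuth-Scopes")


# Offline check on an example header value; no network call is made here.
example_ok = has_scope("repo, workflow")
# With a real token: has_scope(fetch_token_scopes("<your token>"))
```

The match is exact, so a token that only has `public_repo` is not treated as having full `repo` access.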

---

## 3️⃣ Creating a New GitHub Repository

In GitHub:

- Create a new repository (private or public).
- Give it a meaningful name (e.g., `databricks-project`).
- Optionally, add a README file.
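
If you prefer to script repository creation, the following sketch builds the equivalent request for the GitHub REST API (`POST /user/repos`). The token value and the helper name `build_create_repo_request` are placeholders, and the request is built but not sent.

```python
import json
import urllib.request


def build_create_repo_request(github_pat, name, private=True, add_readme=True):
    """Build (but do not send) a GitHub API request that creates a repository."""
    payload = {
        "name": name,
        "private": private,
        "auto_init": add_readme,  # creates an initial commit with a README
    }
    return urllib.request.Request(
        url="https://api.github.com/user/repos",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {github_pat}",
            "Accept": "application/vnd.github+json",
        },
        method="POST",
    )


# Placeholder token; the repository name matches the example above.
create_repo_request = build_create_repo_request("ghp_placeholder", "databricks-project")
# To send it: urllib.request.urlopen(create_repo_request)
```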

---

## 4️⃣ Cloning a GitHub Repository into Databricks

Quickly clone a repo into Databricks Repos:

1. In **GitHub**, copy the repository URL (click **Code → Copy HTTPS URL**).
2. In the Databricks workspace, open the **Workspace** tab (left sidebar).
3. Select the **Workspace → Repos** folder.
4. Click **Add Repo**.
5. Paste the GitHub repository URL.
6. The provider and repo name auto-fill.
7. Click **Submit**.

✅ Your GitHub repo is now linked to Databricks. You can manage code & notebooks directly.
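
The clone steps above can be approximated in code. This is a sketch that assumes the Databricks Repos REST endpoint (`POST /api/2.0/repos`) and a target path of the form `/Repos/<user>/<repo-name>`. The workspace URL, token, user name, and helper names are placeholders. Deriving the repo name from the URL mirrors the auto-fill behaviour in the dialog.

```python
import json
import urllib.request
from urllib.parse import urlparse


def repo_name_from_url(git_url):
    """Derive the repo name from an HTTPS clone URL, dropping a trailing '.git'."""
    last_segment = urlparse(git_url).path.rstrip("/").split("/")[-1]
    return last_segment[:-4] if last_segment.endswith(".git") else last_segment


def build_add_repo_request(workspace_url, databricks_token, git_url, databricks_user):
    """Build (but do not send) a request that clones a Git repo into Databricks Repos."""
    payload = {
        "url": git_url,
        "provider": "gitHub",  # assumed provider identifier for GitHub
        "path": f"/Repos/{databricks_user}/{repo_name_from_url(git_url)}",
    }
    return urllib.request.Request(
        url=f"{workspace_url.rstrip('/')}/api/2.0/repos",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {databricks_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values for illustration only.
add_repo_request = build_add_repo_request(
    "https://example.cloud.databricks.com",
    "dapi-placeholder",
    "https://github.com/octocat/databricks-project.git",
    "someone@example.com",
)
# To send it: urllib.request.urlopen(add_repo_request)
```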

---

## 5️⃣ Working with Repositories

- **Branching:**  
  Create branches (e.g., `feature-branch`) to work independently from `main`.

- **Organizing Files:**  
  Add notebooks and folders inside the repo.

- **Cloning Existing Notebooks:**  
  Right-click an existing notebook → **Clone to Repo** → select your repo.
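
Switching a repo to a branch can also be automated. The sketch below assumes the Databricks Repos update endpoint (`PATCH /api/2.0/repos/{repo_id}`), which checks out a branch that already exists on the remote; creating the branch itself is still done in the Databricks UI or in GitHub. The repo ID, workspace URL, token, and helper name are placeholders.

```python
import json
import urllib.request


def build_checkout_request(workspace_url, databricks_token, repo_id, branch):
    """Build (but do not send) a request that switches a Databricks repo to `branch`."""
    return urllib.request.Request(
        url=f"{workspace_url.rstrip('/')}/api/2.0/repos/{repo_id}",
        data=json.dumps({"branch": branch}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {databricks_token}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )


# Placeholder repo ID and credentials; 'feature-branch' matches the example above.
checkout_request = build_checkout_request(
    "https://example.cloud.databricks.com", "dapi-placeholder", 123, "feature-branch"
)
# To send it: urllib.request.urlopen(checkout_request)
```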

---

## 6️⃣ Committing & Pushing Changes

**Typical Workflow:**

1. Edit notebooks or code in Databricks.
2. Click **Commit & Push** (top-right of notebook toolbar).
3. Enter a commit message and push changes to GitHub.

---

## 7️⃣ Pull Requests & Merging

In GitHub:

1. Open a **Pull Request** to merge changes from your branch into `main`.
2. Review and merge changes.
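
Opening the pull request can be scripted as well. This sketch assumes the GitHub REST endpoint `POST /repos/{owner}/{repo}/pulls`. The owner, repository, token, title, and helper name are placeholders, and the request is only built, not sent. Reviewing and merging still happen in GitHub.

```python
import json
import urllib.request


def build_pull_request(github_pat, owner, repo, head, base="main", title="", body=""):
    """Build (but do not send) a GitHub API request that opens a pull request."""
    payload = {
        "title": title or f"Merge {head} into {base}",
        "head": head,  # branch that contains your changes
        "base": base,  # branch you want to merge into
        "body": body,
    }
    return urllib.request.Request(
        url=f"https://api.github.com/repos/{owner}/{repo}/pulls",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {github_pat}",
            "Accept": "application/vnd.github+json",
        },
        method="POST",
    )


# Placeholder owner, repo, and token for illustration only.
pull_request = build_pull_request(
    "ghp_placeholder", "octocat", "databricks-project", head="feature-branch"
)
# To send it: urllib.request.urlopen(pull_request)
```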

---

## 8️⃣ Collaborative Best Practices

- **Pull latest changes regularly** to avoid conflicts.
- Use **branches** to isolate development.
- Use Pull Requests to manage reviews and code integrity.

---

## 🏁 Summary

| Feature                        | Purpose                                                    |
|--------------------------------|------------------------------------------------------------|
| **Databricks Repos**           | Integrate GitHub with Databricks workspace                  |
| **Git Integration Setup**      | Configure user settings and GitHub credentials              |
| **Cloning Repos**              | Easily bring GitHub repos into Databricks                   |
| **Branching & Commit**         | Version control workflows: commit, push, pull, PRs          |
| **Collaboration**              | Structured workflow to manage shared development            |

> **Tip:**  
> For larger projects, export or back up notebooks via **File → Export → IPython notebook**, and use GitHub as your central source of truth.

