# Git & GitHub for Data Science
Version control is a critical skill for any data scientist. It helps track changes to code, collaborate effectively with others, and maintain a history of work. This notebook introduces Git and GitHub with practical examples and exercises.

## Version Control with Git
Git is a distributed version control system that tracks changes in files. It allows multiple people to collaborate on a project efficiently.

### Why Use Git?
- Track changes in code.
- Revert to earlier versions if needed.
- Collaborate with others without overwriting each other's work.

### Basic Git Commands
- `git init`: Create a new Git repository in the current directory.
- `git add`: Stage changes for the next commit.
- `git commit`: Record the staged changes as a new commit in the local repository.
- `git status`: Show which files are modified, staged, or untracked.
- `git log`: View the commit history.
- `git push`: Upload local commits to a remote repository.
- `git pull`: Fetch changes from a remote repository and merge them into the current branch.

### Example:
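```bash
# Initialize a new repository
git init

# Add a file to the staging area
git add filename.py

# Commit changes
git commit -m "Initial commit"

# Name the current branch main (older Git versions default to master)
git branch -M main

# Connect the repository to a remote (placeholder URL), then push
git remote add origin https://github.com/username/repo.git
git push origin main
```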
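
The commands above can be tried end to end without touching a real project. The following is a minimal sketch, assuming Git is installed; the temporary directory, the file name `hello.py`, and the identity values are placeholders chosen for illustration.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Work in a throwaway directory so no real project is affected
repo="$(mktemp -d)"
cd "$repo"

git init -q

# Placeholder identity, stored only in this repository's config
git config user.name "Example User"
git config user.email "user@example.com"

# Create a file, stage it, and commit it
echo 'print("Hello, Git!")' > hello.py
git add hello.py
git commit -q -m "Initial commit"

# Name the branch main regardless of the installed Git's default
git branch -M main

# Inspect the result
git status --short    # prints nothing when the working tree is clean
git log --oneline     # one line per commit
```

Nothing is pushed here because no remote has been configured yet; that step is covered in the GitHub section.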

## Branching and Merging
Branches allow you to work on different features or fixes independently. Once the work is complete, you can merge the branch back into the main branch.

### Commands:
- `git branch`: List, create, or delete branches.
- `git checkout`: Switch between branches (`git switch` does the same in Git 2.23 and later).
- `git merge`: Merge another branch into the current branch.

### Example:
```bash
# Create a new branch
git branch feature-branch

# Switch to the new branch
git checkout feature-branch

# Merge the branch back into main
git checkout main
git merge feature-branch
```
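
The example above leaves out the commits made on the branch. A fuller sketch of the same cycle might look like the following, assuming Git is installed; the file `notes.txt`, its contents, and the identity values are hypothetical.

```bash
#!/usr/bin/env bash
set -euo pipefail

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# First commit on main
echo "raw data notes" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git branch -M main

# Create the branch and switch to it in one step
git checkout -q -b feature-branch
echo "cleaning steps" >> notes.txt
git commit -q -am "Describe cleaning steps"

# Merge back into main; main has not moved, so this is a fast-forward
git checkout -q main
git merge -q feature-branch

# Delete the branch once it is merged
git branch -d feature-branch

git log --oneline
```

Because `main` received no commits while the branch was in progress, Git only moves the `main` pointer forward and does not create a merge commit.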

## Working with Remote Repositories on GitHub
GitHub is a platform for hosting Git repositories. On top of Git's version control, it adds collaboration features such as issue tracking, pull requests, and code review.

### Steps to Push Code to GitHub:
1. Create a repository on GitHub.
2. Copy the repository URL.
3. Register the GitHub repository as a remote named `origin` in your local repository:
   ```bash
   git remote add origin https://github.com/username/repo.git
   ```
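4. Push the `main` branch. The `-u` flag sets `origin/main` as its upstream, so later `git push` and `git pull` calls need no arguments: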
   ```bash
   git push -u origin main
   ```
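
The steps above can be rehearsed offline. The sketch below assumes Git is installed and uses a local bare repository as a stand-in for the GitHub repository; the directory names, `README.md`, and the identity values are hypothetical.

```bash
#!/usr/bin/env bash
set -euo pipefail

work="$(mktemp -d)"

# A bare repository plays the role of the GitHub repository here
git init -q --bare "$work/remote.git"

# Local repository with one commit on main
git init -q "$work/local"
cd "$work/local"
git config user.name "Example User"
git config user.email "user@example.com"
echo "# Project" > README.md
git add README.md
git commit -q -m "Initial commit"
git branch -M main

# Same two steps as above, with a local path instead of a GitHub URL
git remote add origin "$work/remote.git"
git push -q -u origin main

# Confirm the remote and the upstream branch
git remote -v
git rev-parse --abbrev-ref 'main@{upstream}'    # origin/main
```

With a real GitHub repository, only the URL passed to `git remote add` changes; pushing over HTTPS or SSH additionally requires authentication.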

## Collaborative Workflows
Collaboration involves multiple people working on the same project. GitHub provides tools to manage collaboration effectively.

- **Forking:** Create your own copy of a repository under your GitHub account.
- **Pull Requests:** Propose merging changes from a branch or fork into a repository.
- **Code Reviews:** Review and discuss proposed changes.

### Example Workflow:
1. Clone a repository:
   ```bash
   git clone https://github.com/username/repo.git
   ```
2. Create a branch for your changes:
   ```bash
   git checkout -b feature-branch
   ```
3. Make changes and commit:
   ```bash
   git add .
   git commit -m "Implemented new feature"
   ```
4. Push the branch, then open a pull request from it on GitHub:
   ```bash
   git push origin feature-branch
   ```
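
The workflow above can be sketched as one script. This assumes Git is installed and again substitutes a local bare repository (`shared.git`) for the GitHub repository; the directory names, identities, and file contents are hypothetical.

```bash
#!/usr/bin/env bash
set -euo pipefail

work="$(mktemp -d)"
cd "$work"

# Stand-in for the shared GitHub repository, seeded with one commit on main
git init -q seed
git -C seed config user.name "Maintainer"
git -C seed config user.email "maintainer@example.com"
echo "# Project" > seed/README.md
git -C seed add README.md
git -C seed commit -q -m "Initial commit"
git -C seed branch -M main
git clone -q --bare seed "$work/shared.git"

# 1. Clone the repository
git clone -q "$work/shared.git" contributor
cd contributor
git config user.name "Contributor"
git config user.email "contributor@example.com"

# 2. Create a branch for the change
git checkout -q -b feature-branch

# 3. Make changes and commit
echo "Usage notes" >> README.md
git add .
git commit -q -m "Document usage"

# 4. Push the branch; the pull request itself is opened on GitHub
git push -q origin feature-branch

# The shared repository now has both branches
git -C "$work/shared.git" branch --list
```

Note that `main` in the shared repository is unchanged after the push. The new commit reaches `main` only when the pull request is reviewed and merged.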

### Use Case in Data Science
Collaborating on a machine learning project with multiple team members requires version control to:
- Track code changes.
- Detect and resolve conflicting edits.
- Integrate contributions efficiently.

For example, one team member works on data preprocessing while another focuses on model development. By committing to separate branches and merging through GitHub, both can contribute without overwriting each other's work.
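
The scenario above can be simulated in a single local repository. This is a sketch under stated assumptions: Git is installed, and the branch names `preprocessing` and `model`, the two script files, and the identity values are invented for illustration.

```bash
#!/usr/bin/env bash
set -euo pipefail

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "# ML project" > README.md
git add README.md
git commit -q -m "Initial commit"
git branch -M main

# Teammate A: data preprocessing on its own branch
git checkout -q -b preprocessing main
echo 'print("clean data")' > preprocess.py
git add preprocess.py
git commit -q -m "Add preprocessing script"

# Teammate B: model development, also branched from main
git checkout -q -b model main
echo 'print("train model")' > model.py
git add model.py
git commit -q -m "Add model script"

# Integrate both; the branches touch different files, so no conflict arises
git checkout -q main
git merge -q --no-edit preprocessing    # fast-forward
git merge -q --no-edit model            # creates a merge commit

ls    # README.md, model.py, and preprocess.py are all present
git log --oneline --graph
```

If both branches had edited the same lines of the same file, the second merge would stop and ask for the conflict to be resolved by hand before committing.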

## Practice Exercises
1. Create a new directory and initialize a Git repository in it.
2. Create a Python script that prints "Hello, Git!" and commit it to the repository.
3. Create a new branch and make changes to the script.
4. Merge the changes back into the main branch.
5. Push the repository to GitHub and share it with a collaborator.