# 📚 1.3 Introduction to Git

This notebook introduces Git, a version control system essential for managing code and data in nutrition research projects.

**Objectives**:
- Understand Git’s purpose and basic commands.
- Practise committing changes to a repository.
- Explore GitHub for collaboration.

**Context**: Git records who changed what and when, which supports reproducibility and collaboration, both vital for MSc projects like NDNS analysis.

<details><summary>Fun Fact</summary>
Git is like a hippo’s journal, tracking every snack and meal with precision! 🦛
</details>

In [None]:
# Setup for Google Colab: Fetch datasets automatically or manually
import os
from google.colab import files

# Define the module and dataset for this notebook
MODULE = '01_infrastructure'
DATASET = 'hippo_diets.csv'
DATASET_PATH = os.path.join('data', DATASET)

# Step 1: Attempt to clone the repository (automatic method)
try:
    print('Attempting to clone repository...')
    !git clone https://github.com/ggkuhnle/data-analysis-toolkit-FNS.git
    os.chdir(f'/content/data-analysis-toolkit-FNS/notebooks/{MODULE}')
    if os.path.exists(DATASET_PATH):
        print(f'Dataset found: {DATASET_PATH} 🦛')
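    else:
        # Raise with a message so the except block reports what went wrong
        raise FileNotFoundError(f'Dataset {DATASET} not found after cloning.')
except Exception as e:
    # Reached if the clone, the directory change, or the dataset check fails
    print(f'Automatic setup failed: {e}')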
    print('Falling back to manual upload option...')

    # Step 2: Manual upload option
    print(f'Please upload {DATASET} manually.')
    print('1. Click the "Choose Files" button below.')
    print(f'2. Select {DATASET} from your local machine.')
    print(f'3. The file will then be saved to {DATASET_PATH}.')
    
    # Create the data directory if it doesn't exist
    os.makedirs('data', exist_ok=True)
    
    # Prompt user to upload the dataset
    uploaded = files.upload()
    
    # Check if the dataset was uploaded
    if DATASET in uploaded:
        with open(DATASET_PATH, 'wb') as f:
            f.write(uploaded[DATASET])
        print(f'Successfully uploaded {DATASET} to {DATASET_PATH} 🦛')
    else:
        raise FileNotFoundError(f'Upload failed. Please ensure you uploaded {DATASET}.')

# Install required packages for this notebook
%pip install pandas numpy
print('Python environment ready.')

## Setting Up Git

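Install Git on your own machine, or work in an environment where it is already installed, such as Google Colab. GitHub hosts Git repositories online but does not replace Git itself. Run the following in a terminal (not in a Python cell):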

```bash
git --version  # Check Git installation
git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"
```
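
To check that the settings were saved, you can read them back. The sketch below is one way to do this, not the only one. It points `HOME` at a temporary directory (an assumption made purely so the demo leaves your real settings untouched); the variable names `configured_name` and `configured_email` are also just illustrative. To inspect your actual configuration, run only the `git config --global --get` and `--list` lines.

```bash
# Assumption for illustration: use a throwaway HOME so the demo does not
# overwrite your real global settings. Skip this line to check your own setup.
export HOME="$(mktemp -d)"

git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"

# Read the values back one at a time...
configured_name="$(git config --global --get user.name)"
configured_email="$(git config --global --get user.email)"
echo "Name:  $configured_name"
echo "Email: $configured_email"

# ...or list every global setting at once.
git config --global --list
```

Git attaches this name and email to every commit you make, so it is worth confirming them before the first exercise.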

## Exercise 1: Create a Repository

Create a local Git repository and commit a sample file. Follow these steps in a terminal:

1. Create a directory: `mkdir my-nutrition-repo`
2. Navigate: `cd my-nutrition-repo`
3. Initialise: `git init`
4. Create a file: `echo "# Nutrition Notes" > README.md`
5. Stage: `git add README.md`
6. Commit: `git commit -m "Initial commit"`
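
The steps above can be sketched as a single terminal session, with `git status` and `git log` added so you can see what Git thinks at each stage. Two things here are assumptions rather than part of the exercise: the scratch directory created with `mktemp -d` (so the sketch can be re-run safely), and the repository-level `git config` lines (so the commit works even if no global identity is set). The `workdir` and `commit_count` names are illustrative.

```bash
# Assumption: start from a scratch directory so the sketch is repeatable.
# In the exercise you would work wherever you keep your projects.
workdir="$(mktemp -d)"
cd "$workdir"

mkdir my-nutrition-repo
cd my-nutrition-repo
git init

# Assumption: set an identity for this repository only, in case no
# global one has been configured yet.
git config user.name "Your Name"
git config user.email "your.email@example.com"

echo "# Nutrition Notes" > README.md
git status --short    # README.md is listed as untracked (??)
git add README.md
git status --short    # README.md is now staged (A)
git commit -m "Initial commit"

# Confirm that the commit was recorded
git log --oneline
commit_count="$(git rev-list --count HEAD)"
echo "Commits so far: $commit_count"
```

Running `git status` before and after `git add` makes the difference between an untracked file and a staged file visible, which is useful to note when you document your experience.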

Document your experience in a Markdown cell.

**Guidance**: Note any errors or successes.

**Answer**:

I created the repository by...

## Conclusion

You’ve learned the basics of Git for version control. Practise these skills to manage your nutrition data projects.

**Next Steps**: In notebook 1.4, explore Quarto for reproducible documents.

**Resources**:
- [Git Documentation](https://git-scm.com/doc)
- [GitHub Guides](https://guides.github.com/)
- Repository: [github.com/ggkuhnle/data-analysis-toolkit-FNS](https://github.com/ggkuhnle/data-analysis-toolkit-FNS)