### Using a GitHub Personal Access Token (PAT) to Push from a SageMaker Notebook

When working in SageMaker notebooks, you will often need to push code updates to GitHub repositories. However, SageMaker notebooks typically run on temporary instances that don’t persist configurations, including SSH keys, across sessions. This makes HTTPS-based authentication with a GitHub Personal Access Token (PAT) a practical solution. A PAT lets you authenticate to both public and private repositories directly from your notebook, with permissions limited to the scopes you select.

> **Important Note**: Personal access tokens are powerful credentials that grant specific permissions to your GitHub account. To ensure security, only select the minimum necessary permissions and handle the token carefully.


### Step 1: Generate a Personal Access Token (PAT) on GitHub

1. Go to **Settings > Developer settings > Personal access tokens** on GitHub.
2. Click **Generate new token**, select **Classic**.
3. Give your token a descriptive name (e.g., "SageMaker Access Token") and set an expiration date if desired for added security.
4. **Select the minimum permissions needed**:
   - **For public repositories**: Choose only **`public_repo`**.
   - **For private repositories**: Choose **`repo`** (full control of private repositories).
   - Optional permissions, if needed:
     - **`repo:status`**: Access commit status (if checking status checks).
     - **`workflow`**: Update GitHub Actions workflows (only if working with GitHub Actions).
5. Generate the token and **copy it** (you won’t be able to see it again).

> **Caution**: Treat your PAT like a password. Avoid sharing it or exposing it in your code. Store it securely (e.g., via a password manager like LastPass) and consider rotating it regularly.
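Before storing a freshly copied token, it can help to check that the pasted value actually looks like a PAT rather than a placeholder or a truncated copy. The sketch below is only a format check, assuming the documented `ghp_` (classic) and `github_pat_` (fine-grained) prefixes; the function name `looks_like_github_pat` is hypothetical, and tokens created before GitHub introduced prefixes will not pass.

```python
def looks_like_github_pat(value):
    """Return True if value starts with a known GitHub PAT prefix.

    This only inspects the format; it does not contact GitHub,
    so it cannot tell whether the token is valid or expired.
    """
    prefixes = ("ghp_", "github_pat_")  # classic and fine-grained tokens
    value = value.strip()
    # Require some characters after the prefix, not just the prefix itself.
    return value.startswith(prefixes) and len(value) > max(len(p) for p in prefixes)
```

A placeholder such as `"your_token"` fails this check, which catches the common mistake of running a setup cell without replacing the example value.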

### Step 2: Temporarily Store the PAT in an Environment Variable on SageMaker

To avoid embedding the PAT directly in commands, you can temporarily set it as an environment variable in your SageMaker notebook:



```python
import os

# Replace 'your_token' with your actual GitHub PAT, then uncomment and run
# os.environ["GITHUB_PAT"] = "your_token"
```

> **Note**: **Delete this cell after running it** to ensure the token isn’t permanently stored in your notebook code.

Environment variables set with `os.environ` exist only in the running kernel process. They are lost when the kernel restarts or the instance is stopped and started again, so any setup involving environment variables or temporary credentials (like a PAT) needs to be re-executed after each restart.
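An alternative that keeps the token out of the notebook source entirely is to prompt for it with Python's standard `getpass` module, which does not echo what you type. The following is a minimal sketch, not the only way to do this; the helper name `load_pat` and its `prompt` parameter (there only so the function can be exercised without a keyboard) are assumptions for illustration.

```python
import getpass
import os


def load_pat(env_var="GITHUB_PAT", prompt=None):
    """Prompt for a token without echoing it and store it in os.environ."""
    # Resolve the prompt function at call time so the kernel's own
    # getpass implementation is used inside a notebook.
    prompt = prompt or getpass.getpass
    token = prompt("GitHub PAT: ").strip()
    if not token:
        raise ValueError("No token entered")
    os.environ[env_var] = token
    # Nothing is returned, so the token never appears as cell output.


# In a notebook cell you would simply call: load_pat()
```

Because the token is typed at run time, there is no cell containing the secret to delete afterwards.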

### Step 3: Configure Git to Use HTTPS with the PAT

Now you’ll configure Git to use HTTPS with the stored token.

1. **Set up Git to store credentials**:

    ```python
    # Set up Git to store the credentials
    !git config --global credential.helper 'store'
    ```

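2. **Write the PAT to Git’s credential store**: read the token from the environment variable and save it to `~/.git-credentials` so Git can authenticate automatically. The `store` helper expects entries of the form `https://<username>:<token>@<host>`, so replace `your-username` with your GitHub username. Note that this file holds the token in plain text.

    ```python
    # Create the .git-credentials file with the PAT.
    # Format expected by the store helper: https://<username>:<token>@<host>
    cred_path = os.path.expanduser("~/.git-credentials")
    with open(cred_path, "w") as f:
        f.write(f"https://your-username:{os.environ['GITHUB_PAT']}@github.com\n")
    os.chmod(cred_path, 0o600)  # make the file readable only by your user
    ```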

3. **Set your Git username and email** (if you haven’t already):

    ```python
    !git config --global user.name "your-username"
    !git config --global user.email "your-email@example.com"
    ```

4. **Remove the PAT from the environment variable** immediately after it’s no longer needed:

    ```python
    del os.environ["GITHUB_PAT"]
    ```
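Since Step 3 has to be repeated after every restart, the file-writing part can be wrapped in a small helper. The sketch below assumes the `https://<username>:<token>@<host>` entry format that Git's `store` helper documents; the function name `write_git_credentials` and its parameters are hypothetical. It URL-encodes both values and creates the file with owner-only permissions.

```python
import os
from urllib.parse import quote


def write_git_credentials(username, token, path="~/.git-credentials", host="github.com"):
    """Write a single credential-store entry and return the file path."""
    path = os.path.expanduser(path)
    # Percent-encode both parts so characters like '/' or '@' cannot break the URL.
    line = f"https://{quote(username, safe='')}:{quote(token, safe='')}@{host}\n"
    # Create (or truncate) the file with owner-only read/write permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(line)
    # Tighten permissions as well in case the file already existed.
    os.chmod(path, 0o600)
    return path
```

In the notebook this could be called as `write_git_credentials("your-username", os.environ["GITHUB_PAT"])` before deleting the environment variable.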

### Step 4: Push Changes to GitHub

Now you can add, commit, and push your changes to GitHub without being prompted for credentials:

```python
# Navigate to the repository directory (adjust path if needed)
%cd test_AWS

# Add, commit, and push changes
!git add .
!git commit -m "Added updates from Jupyter notebook"
!git push origin main
```
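One thing to be aware of: `git commit` exits with a non-zero status when there is nothing to commit, which matters if you script these commands rather than run them by hand. The following is a rough sketch of a wrapper that commits only when the working tree has changes and then pushes; the function name `commit_and_push` and the injectable `run` parameter are assumptions made for illustration and testing.

```python
import subprocess


def commit_and_push(repo_dir, message, branch="main", run=None):
    """Stage and commit changes if there are any, then push to origin."""
    run = run or subprocess.run
    # 'git status --porcelain' prints one line per changed file, or nothing.
    status = run(["git", "status", "--porcelain"], cwd=repo_dir,
                 capture_output=True, text=True, check=True)
    if status.stdout.strip():
        run(["git", "add", "."], cwd=repo_dir, check=True)
        run(["git", "commit", "-m", message], cwd=repo_dir, check=True)
    # Push even when nothing new was committed, since earlier local
    # commits may still be waiting to be published.
    run(["git", "push", "origin", branch], cwd=repo_dir, check=True)
```

With `check=True`, a failed push raises `subprocess.CalledProcessError` instead of failing silently.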

### Optional Security Cleanup

Once you’ve completed your push, you may want to remove the saved PAT from `~/.git-credentials` for additional security, especially if you’re working in a shared environment:

```python
!rm ~/.git-credentials
```
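The same cleanup can be done from Python, which avoids an error message if the file is already gone. This is a minimal sketch using `pathlib`; the function name `remove_git_credentials` is hypothetical, and `missing_ok` requires Python 3.8 or later.

```python
from pathlib import Path


def remove_git_credentials(path="~/.git-credentials"):
    """Delete the stored credentials file; return True if a file was removed."""
    target = Path(path).expanduser()
    existed = target.exists()
    target.unlink(missing_ok=True)  # no error if the file is already absent
    return existed
```

After the file is removed, Git will prompt for credentials again on the next push until Step 3 is repeated.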

---

### Summary

Using a PAT in a SageMaker notebook provides the flexibility to push updates to GitHub securely, even in temporary or ephemeral environments. By selecting only the permissions you need and managing your token carefully, you can safely automate GitHub interactions while maintaining the security of your repositories.

```python
!git config --global user.name "Chris Endemann"
!git config --global user.email endeman@wisc.edu
!ls
```

```
00_Data-storage-and-access-via-buckets.ipynb  push-updates.ipynb
01_Intro-train-models.ipynb                   __pycache__
02_Hyperparameter-tuning.ipynb                README.md
create_large_data.ipynb                       train_nn.py
data                                          train_xgboost.py
LICENSE
```

```python
!git add .
!git commit -m "Added updates from SageMaker notebooks"
!git push origin main
```

```
On branch main
Your branch is ahead of 'origin/main' by 4 commits.
  (use "git push" to publish your local commits)

nothing to commit, working tree clean
Username for 'https://github.com/UW-Madison-DataScience/test_AWS.git': ^C
```
