### Using a GitHub Personal Access Token (PAT) to Push from a SageMaker Notebook

When working in SageMaker notebooks, you will often need to push code updates to GitHub repositories. However, SageMaker notebooks typically run on temporary instances that don’t persist configuration, including SSH keys, across sessions. This makes HTTPS-based authentication with a GitHub Personal Access Token (PAT) a practical solution. A PAT works with both public and private repositories and can be supplied directly from your notebook.

> **Important Note**: Personal access tokens are powerful credentials that grant specific permissions to your GitHub account. To ensure security, only select the minimum necessary permissions and handle the token carefully.


### Step 1: Generate a Personal Access Token (PAT) on GitHub

1. Go to **Settings > Developer settings > Personal access tokens** on GitHub.
2. Click **Generate new token** and choose the **classic** token type.
3. Give your token a descriptive name (e.g., "SageMaker Access Token") and set an expiration date if desired for added security.
4. **Select the minimum permissions needed**:
   - **For public repositories**: Choose only **`public_repo`**.
   - **For private repositories**: Choose **`repo`** (full control of private repositories).
   - Optional permissions, if needed:
     - **`repo:status`**: Access commit status (if checking status checks).
     - **`workflow`**: Update GitHub Actions workflows (only if working with GitHub Actions).
5. Generate the token and **copy it** (you won’t be able to see it again).

> **Caution**: Treat your PAT like a password. Avoid sharing it or exposing it in your code. Store it securely (e.g., via a password manager like LastPass) and consider rotating it regularly.
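
To confirm that a classic token carries only the scopes you selected, you can ask the GitHub REST API, which reports a classic PAT's scopes in the `X-OAuth-Scopes` response header. The following is a minimal sketch, not an official recipe: the helper names `parse_scopes` and `fetch_token_scopes` are hypothetical, and the check assumes a classic (not fine-grained) token.

```python
import urllib.request


def parse_scopes(header_value):
    """Split an X-OAuth-Scopes header value such as 'repo, workflow' into a list."""
    if not header_value:
        return []
    return [scope.strip() for scope in header_value.split(",") if scope.strip()]


def fetch_token_scopes(token):
    """Return the scopes GitHub reports for a classic PAT (makes a network call)."""
    request = urllib.request.Request(
        "https://api.github.com/user",
        headers={
            "Authorization": f"token {token}",
            "User-Agent": "pat-scope-check",  # hypothetical client name
        },
    )
    with urllib.request.urlopen(request) as response:
        return parse_scopes(response.headers.get("X-OAuth-Scopes"))
```

Calling `fetch_token_scopes(token)` with a token limited to public repositories would be expected to return a list containing only `public_repo`. If broader scopes appear, consider regenerating the token with fewer permissions.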

### Optional: Temporarily Store the PAT in an Environment Variable on SageMaker

To avoid embedding the PAT directly in commands, you can temporarily set it as an environment variable in your SageMaker notebook:
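
A minimal sketch of such a cell, assuming the variable name `GITHUB_PAT` (a hypothetical name chosen for illustration) and a placeholder where your real token would go:

```python
import os

# Hypothetical variable name and placeholder value:
# replace the placeholder with your real PAT before running.
os.environ["GITHUB_PAT"] = "<your-token-here>"

# Confirm the variable is set without printing the token itself
print("GITHUB_PAT set:", "GITHUB_PAT" in os.environ)
```

Later cells in the same kernel session can then read the token with `os.environ["GITHUB_PAT"]` instead of repeating it.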



> **Note**: **Delete this cell after running it** to ensure the token isn’t permanently stored in your notebook code.

Environment variables set from a notebook cell live only in the running kernel. They reset when the kernel restarts or the instance is stopped and started again, so any setup involving environment variables or temporary credentials (like a PAT) needs to be re-executed afterward.

### Step 2: Use `getpass` to Prompt for Username and PAT

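The `getpass` module from Python’s standard library lets you type your PAT without echoing it to the screen or saving it in the notebook. The username is read with the built-in `input()`, so it is visible but not sensitive. This approach ensures you’re not hardcoding sensitive information.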


```python
import getpass

# Prompt for GitHub username and PAT securely
username = input("GitHub Username: ")
token = getpass.getpass("GitHub Personal Access Token (PAT): ")
```

```
GitHub Username:  qualiaMachine
GitHub Personal Access Token (PAT):  ········
```



### Explanation

- **`input("GitHub Username: ")`**: Prompts you to enter your GitHub username.
- **`getpass.getpass("GitHub Personal Access Token (PAT): ")`**: Prompts you to securely enter the PAT, keeping it hidden on the screen.
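
If you want to guard against empty input or stray whitespace from copy and paste, the two prompts can be wrapped in a small helper. This is a sketch under stated assumptions: `prompt_credentials` is a hypothetical name, and the two prompt functions are passed in as parameters only so the helper can be exercised without a keyboard.

```python
import getpass


def prompt_credentials(ask_user=input, ask_secret=getpass.getpass):
    """Prompt for a GitHub username and PAT, rejecting empty values."""
    # Strip whitespace that often sneaks in when pasting a token
    username = ask_user("GitHub Username: ").strip()
    token = ask_secret("GitHub Personal Access Token (PAT): ").strip()
    if not username or not token:
        raise ValueError("Both a username and a PAT are required.")
    return username, token
```

In a notebook cell, `username, token = prompt_credentials()` behaves like the two prompts above but raises an error right away if either field is left blank.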



### Step 3: Add, Commit, and Push Changes with Manual Authentication
1. **Navigate to the Repository Directory** (adjust the path if needed):


```python
!pwd
# %cd test_AWS  # use %cd rather than !cd so the directory change persists across cells
```

```
/home/ec2-user/SageMaker/test_AWS
```



2. **Add and Commit Changes**:



```python
!git add .
!git commit -m "Added updates from Jupyter notebook"
```

```
On branch main
Your branch is ahead of 'origin/main' by 5 commits.
  (use "git push" to publish your local commits)

nothing to commit, working tree clean
```


3. **Push Changes and Enter Credentials**:

```python
# Push with embedded credentials from getpass (avoids interactive prompt)
!git push https://{username}:{token}@github.com/UW-Madison-DataScience/test_AWS.git main
```

```
Username for 'https://github.com/UW-Madison-DataScience/test_AWS.git': ^C
```


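Because the username and PAT are interpolated into the push URL, Git should authenticate without prompting. If Git does prompt (as in the interrupted output above), it asks for:
- **Username**: Your GitHub username.
- **Password**: Your PAT (even though it’s called “password,” GitHub expects the PAT here).

This method keeps the credentials out of the notebook’s saved code and out of Git’s configuration files. They exist only in the kernel’s memory for the current session.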
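
A username or token containing characters such as `@` or `:` can break a URL that embeds credentials. One way to guard against that is to percent-encode both values before building the URL. The sketch below assumes a hypothetical helper named `build_push_url`; the repository path is the one used in this lesson.

```python
from urllib.parse import quote


def build_push_url(username, token, repo_path):
    """Build an HTTPS push URL with percent-encoded credentials."""
    # safe="" encodes every reserved character, including '@', ':' and '/'
    user = quote(username, safe="")
    secret = quote(token, safe="")
    return f"https://{user}:{secret}@github.com/{repo_path}.git"
```

In a notebook, you could set `push_url = build_push_url(username, token, "UW-Madison-DataScience/test_AWS")` and then run `!git push {push_url} main`. Avoid printing `push_url`, since it contains the token.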

---

### Summary

By entering credentials at a prompt instead of hardcoding them, you keep the PAT out of the notebook file and out of Git’s saved configuration, which protects both the token and your GitHub account. This is safer than embedding the token in code, particularly in shared or temporary environments like SageMaker notebooks.

```python
!git config --global user.name "Chris Endemann"
!git config --global user.email endeman@wisc.edu
!ls
```

```
00_Data-storage-and-access-via-buckets.ipynb  push-updates.ipynb
01_Intro-train-models.ipynb		      __pycache__
02_Hyperparameter-tuning.ipynb		      README.md
create_large_data.ipynb			      train_nn.py
data					      train_xgboost.py
LICENSE
```


```python
!git add .
!git commit -m "Added updates from SageMaker notebooks"
!git push origin main
```

```
On branch main
Your branch is ahead of 'origin/main' by 4 commits.
  (use "git push" to publish your local commits)

nothing to commit, working tree clean
Username for 'https://github.com/UW-Madison-DataScience/test_AWS.git': ^C
```
