<a href="https://colab.research.google.com/github/TylerWichman/mgmt467-analytics-portfolio/blob/main/Week2_1_Prompt_Practice.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Week 2.1 — Prompt Practice: Git, GitHub, and Google Colab

**Course:** MGMT 467 — AI‑Assisted Big Data Analytics in the Cloud  
**Session:** Tuesday (2.1) — Developer Environment Setup

### How to use this notebook
- This is a **practice and planning** notebook: most cells are **Markdown** with copy‑pasteable prompt templates you will run in your AI tool (e.g., Gemini).  
- After you run a prompt in your AI tool, **summarize what you learned** in the provided **Reflection** cells here.  
- When a task asks for a short code snippet (e.g., Git or Colab), paste the **final, validated** snippet in the designated cell and add a one‑sentence explanation.

> **Validate everything.** Cross‑check AI outputs with official docs or a second prompt. If two sources disagree, note it and explain which you chose and why.



---
## Prompt Patterns Quick Reference

Use these as starting points and **adapt** them to your context.

### 1) Zero‑Shot (definition/explanation)
```
Act as a clear, concise tutor for first‑year CS students.
Explain {TOPIC} in 5 bullet points max. Include one analogy and one pitfall to avoid.
```

### 2) Few‑Shot (guided answers consistent with examples)
```
You will answer in the same style as the examples.

Q: What is a "commit" in Git?  
A: A snapshot of tracked file changes with a message explaining why.

Q: What is "pushing" in Git?  
A: Sending local commits to a remote repository so others can see them.

Q: {YOUR QUESTION}
A:
```

### 3) Step‑by‑Step Reasoning (show key steps)
```
I need a **numbered, step‑by‑step plan** for {TASK}.
For each step: the goal, one command (if applicable), and a 1‑line verification check.
Avoid hidden steps; keep it to 6–8 steps total.
```
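If you reuse these templates often, the placeholders can be filled in programmatically before pasting the prompt into your AI tool. The following is a minimal sketch, not part of the course materials: `fill_prompt` and `ZERO_SHOT` are hypothetical names, and the check for leftover placeholders assumes placeholders are written in capitals inside braces, as in the templates above.

```python
import re

# Hypothetical constant: the zero-shot template from this notebook, stored as a string.
ZERO_SHOT = (
    "Act as a clear, concise tutor for first-year CS students.\n"
    "Explain {TOPIC} in 5 bullet points max. "
    "Include one analogy and one pitfall to avoid."
)


def fill_prompt(template: str, **values: str) -> str:
    """Replace each {NAME} placeholder with its value and fail if any remain."""
    filled = template
    for name, value in values.items():
        filled = filled.replace("{" + name + "}", value)
    # Assumption: placeholders use capital letters, underscores, or spaces.
    leftover = re.findall(r"\{[A-Z_ ]+\}", filled)
    if leftover:
        raise ValueError(f"Unfilled placeholders: {leftover}")
    return filled


prompt = fill_prompt(ZERO_SHOT, TOPIC="Git branching")
print(prompt)
```

Plain string replacement is used instead of `str.format` because some placeholders in the templates, such as `{YOUR QUESTION}`, contain a space and are not valid format field names.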



---
## Group A — Git Fundamentals (3 questions)

### A1. What problem does Git solve? How is it different from file syncing?
**Use:** Zero‑Shot, then Few‑Shot for refinement.  
**Run this prompt:**
```
Act as a version control coach.
Explain what Git is and the specific problem it solves compared to simple file syncing (e.g., Drive).
List 3 concrete benefits for a small analytics team.
End with a 2‑sentence analogy.
```
**Reflection (2–4 sentences):** What did you learn that you didn’t already know?


One thing I learned from this prompt is that Git lets you experiment with different code solutions to the same problem, then either roll back if the original is stronger or commit the changes if they are an improvement. This would be very helpful in data analytics: changing a model's parameters can lead to different results, so this feature makes it easy to keep track of the strongest model.


### A2. Commit → Branch → Merge: the minimal workflow
**Use:** Step‑by‑Step Reasoning.  
**Run this prompt:**
```
Create a minimal, step‑by‑step workflow to:
1) initialize a repo, 2) create and switch to a feature branch, 3) commit changes,
4) merge back to main locally, 5) push to a remote named "origin".
For each step include: goal, command, and a quick verification.
```
**Paste final validated commands below and add one sentence on when to branch.**


In [None]:

# Paste your validated minimal Git workflow commands here as comments, e.g.:
# git init -b main                                # -b needs Git 2.28+; names the first branch "main"
# git commit --allow-empty -m "Initial commit"    # main must have a commit before you can switch back to it
# git checkout -b feature/readme-polish
# git add README.md
# git commit -m "Clarify setup steps"
# git checkout main
# git merge feature/readme-polish --no-ff
# git remote add origin <REMOTE_URL>
# git push -u origin main


# git init -b main
# git commit --allow-empty -m "Initial commit"
# git checkout -b feature/your-feature-name
# git add .
# git commit -m "Descriptive message about your changes"
# git checkout main
# git merge feature/your-feature-name
# git remote add origin <REMOTE_URL>
# git push -u origin main

# Branch whenever you start a new feature, an experiment, or a bug fix,
# so main stays stable and work in progress is saved separately.
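The A2 prompt asks for a goal, a command, and a quick verification per step. One way to keep those three together is a small table of steps, sketched below. This is only an illustration under assumptions: `Step` and `build_workflow` are hypothetical names, the script prints the plan without running Git, and it assumes Git 2.28 or newer for `git init -b` (and 2.22 or newer for `git branch --show-current`). The initial empty commit is included because `main` needs at least one commit before you can switch back to it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    goal: str
    command: str
    verify: str


def build_workflow(branch: str, remote_url: str) -> list[Step]:
    """Return the minimal init -> branch -> commit -> merge -> push plan."""
    return [
        Step("Create the repo with a main branch", "git init -b main", "git status"),
        Step("Give main a first commit",
             'git commit --allow-empty -m "Initial commit"', "git log --oneline"),
        Step("Create and switch to a feature branch",
             f"git checkout -b {branch}", "git branch --show-current"),
        Step("Stage and commit changes",
             'git add . && git commit -m "Describe the change"', "git log --oneline -1"),
        Step("Switch back to main", "git checkout main", "git branch --show-current"),
        Step("Merge the feature branch", f"git merge {branch}", "git log --oneline --graph"),
        Step("Register the remote", f"git remote add origin {remote_url}", "git remote -v"),
        Step("Push main and set upstream", "git push -u origin main", "git status -sb"),
    ]


# Dry run only: print the plan, do not execute anything.
workflow = build_workflow("feature/readme-polish", "<REMOTE_URL>")
for number, step in enumerate(workflow, start=1):
    print(f"{number}. {step.goal}\n   run:   {step.command}\n   check: {step.verify}")
```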


### A3. Resolving a simple merge conflict
**Use:** Step‑by‑Step Reasoning.  
**Run this prompt:**
```
I have a merge conflict in README.md after merging a feature branch into main.
Give a 6-step recipe to resolve it safely:
- how to open the file, identify conflict markers, choose/merge lines,
- add/commit the resolution, verify the merge, and push.
Include one common pitfall and how to avoid it.
```
**Reflection:** What’s your personal checklist to avoid conflicts getting messy?



1.   Identify the conflict
2.   Open the conflicted file
3.   Resolve the conflict
4.   Add the resolved file
5.   Commit the resolution
6.   Verify and Push
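A common pitfall in the "resolve" step is committing a file that still contains conflict markers. The check below is a minimal sketch of how leftover markers could be detected before committing; `find_conflict_markers` is a hypothetical helper, and the sample README text is made up for illustration.

```python
def find_conflict_markers(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) for each line that looks like a Git conflict marker."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Git writes "<<<<<<< <ref>", "=======", and ">>>>>>> <ref>" at the start of a line.
        if line.startswith(("<<<<<<<", ">>>>>>>")) or line.strip() == "=======":
            hits.append((number, line))
    return hits


# Made-up example of a README.md left in a conflicted state.
sample = """# Project
<<<<<<< HEAD
Setup: run pip install -r requirements.txt
=======
Setup: open the notebook in Colab
>>>>>>> feature/readme-polish
"""

markers = find_conflict_markers(sample)
for number, line in markers:
    print(f"line {number}: {line}")
```

A line of only `=======` can also be a legitimate Markdown heading underline, so treat a hit as a prompt to look at the file rather than as proof of a conflict.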


---
## Group B — GitHub Collaboration (3 questions)

### B1. Branch vs. Fork vs. Clone
**Use:** Few‑Shot to drive crisp distinctions with examples.  
**Run this prompt:**
```
Answer using this format:
Term — One-sentence definition — When to use — One example.

Branch —
Fork —
Clone —
```
**Reflection:** Which one will your team use for this course and why?


In this course we will likely use branches: each member clones the shared team repo and works on their own branch, so everyone can work on the project at the same time and push changes that the whole team can see.


### B2. Pull Request (PR) checklist for this course
**Use:** Step‑by‑Step Reasoning.  
**Run this prompt:**
```
Write a "PR Checklist" for a university analytics course team repo.
Include: naming convention, description template, screenshots policy, reviewers, CI checks (if any),
and a revert plan. Limit to 8 concise checklist items.
```
**Paste your final checklist below.**


In [None]:

# Example (edit to your team's needs)
pr_checklist = [
    "Title: Use a clear naming convention like <week>-<assignment>-<short-description> (e.g., wk3-lab5-eda-customer-trends).",
    "Description: Include a brief summary of the changes, the problem it solves, key files modified, and instructions on how to test the changes.",
    "Screenshots/Visuals: Include screenshots of relevant outputs like plots or dashboard changes if applicable.",
    "Linked Issues: Link the PR to the relevant issue or assignment requirement it addresses.",
    "Reviewers: Request a review from at least one other teammate. Avoid merging your own PRs.",
    "Passes Checks: Ensure the code runs without errors (e.g., confirm the notebook runs top-to-bottom). Mention if any CI checks (like linters or tests) are in place and passing.",
    "No Secrets/PII: Double-check that no API keys, passwords, or personally identifiable information are included in the code or notebook outputs.",
    "Revert Plan: Briefly note how the changes could be quickly reverted if necessary (usually just reverting the merge commit)."
]
pr_checklist
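The naming convention in the first checklist item can be checked automatically. The sketch below assumes one possible reading of `<week>-<assignment>-<short-description>`: a `wk` prefix with a number, an assignment label such as `lab5`, and a lowercase hyphenated description. The pattern and the name `is_valid_pr_title` are illustrative assumptions, so adjust them to whatever your team actually agrees on.

```python
import re

# Assumed pattern, modeled on the example "wk3-lab5-eda-customer-trends".
PR_TITLE_PATTERN = re.compile(r"wk\d+-[a-z]+\d*-[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_pr_title(title: str) -> bool:
    """Return True if the title follows the assumed <week>-<assignment>-<description> shape."""
    return PR_TITLE_PATTERN.fullmatch(title) is not None


for title in ["wk3-lab5-eda-customer-trends", "Fix stuff", "wk3-lab5"]:
    status = "ok" if is_valid_pr_title(title) else "rename"
    print(f"{status:6} {title}")
```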



### B3. Protected `main` workflow
**Use:** Zero‑Shot + Step‑by‑Step.  
**Run this prompt:**
```
Explain how to protect the main branch in a GitHub repo for a class team:
- Require PRs, at least one review, and passing checks
- Disallow force-pushes
Provide a numbered setup guide and a 3-line "why this matters" explanation.
```
**Reflection:** Which protection rules will you actually enable first, and why?


I think the protection rule to enable first is requiring a pull request before merging. That way no team member can, intentionally or unintentionally, push changes straight to `main` and overwrite or break the team's progress there.


---
## Group C — Google Colab for Analytics (3 questions)

### C1. Why Colab? Benefits & limits for this course
**Use:** Zero‑Shot.  
**Run this prompt:**
```
Act as a data science tech advisor.
List 5 advantages and 3 limitations of Google Colab for analytics coursework.
Tailor to a class that uses BigQuery and dashboards. Keep it to bullet points.
```
**Reflection:** Which two advantages will help *you* most this semester?


These two advantages will help me the most this semester:

1. **Easy sharing and collaboration:** Notebooks can be easily shared with classmates and instructors, which makes collaboration and feedback simpler.
2. **Integration with the Google ecosystem:** Colab works with Google Drive for file storage and connects readily to Google Cloud services like BigQuery.




### C2. Authenticate to GCP in Colab and query BigQuery
**Use:** Step‑by‑Step Reasoning for a minimal working snippet.  
**Run this prompt:**
```
Provide a minimal Colab snippet to:
1) authenticate to Google Cloud,
2) run a simple BigQuery SQL (e.g., SELECT 1),
3) get results into a pandas DataFrame,
4) print row count.
Include a one-line note on costs and safe use of LIMIT.
```
**Paste your final validated code below.**


In [None]:

# Minimal BigQuery test in Colab (paste your validated version)
# from google.colab import auth
# auth.authenticate_user()
#
# from google.cloud import bigquery
# client = bigquery.Client(project="<YOUR_PROJECT_ID>")
# sql = "SELECT 1 AS test_col"
# df = client.query(sql).result().to_dataframe()
# print("Rows:", len(df))
# df.head()


In [2]:
# Authenticate to Google Cloud
# from google.colab import auth
# auth.authenticate_user()

# Initialize BigQuery client and run a sample query
# from google.cloud import bigquery
# client = bigquery.Client(project="<YOUR_PROJECT_ID>") # Replace with your GCP project ID
# sql = "SELECT 1 AS test_col" # Simple test query
# df = client.query(sql).result().to_dataframe()

# print("Rows:", len(df))

# display(df.head())
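For the cost note the prompt asks about, a dry run is one way to see how many bytes a query would scan before actually running it. The sketch below is an assumption-laden illustration: `estimate_bytes_scanned` and `human_readable` are hypothetical helper names, the BigQuery call only works in an authenticated session with a real project ID, and the import is placed inside the function so the formatting helper can be tried without the client library.

```python
def estimate_bytes_scanned(sql: str, project_id: str) -> int:
    """Dry-run a query and return the bytes BigQuery reports it would process."""
    # Imported here so human_readable() below works without the client library installed.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=job_config)  # a dry run does not execute the query
    return job.total_bytes_processed


def human_readable(num_bytes: int) -> str:
    """Format a byte count using binary units (KiB, MiB, ...)."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if size < 1024 or unit == "TiB":
            return f"{size:.1f} {unit}"
        size /= 1024


# In Colab, after auth.authenticate_user(), something like:
# print(human_readable(estimate_bytes_scanned("SELECT 1 AS test_col", "<YOUR_PROJECT_ID>")))
print(human_readable(1536))
```

A dry run reports bytes for the query as written, so it is also a quick way to compare a `SELECT *` against a version that selects only the columns you need.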



### C3. Save notebooks to GitHub from Colab
**Use:** Step‑by‑Step Reasoning.  
**Run this prompt:**
```
Give two safe workflows to keep Colab notebooks versioned in GitHub:
(A) using "File > Save a copy in GitHub",
(B) local git with Drive sync (brief).
Provide steps and cautions (e.g., large outputs, secrets) for each.
```
**Reflection:** Which workflow will your team adopt and why?


I believe Workflow A will be the easiest for our team to adopt because connecting the accounts involves less complexity than in Workflow B.
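Both workflows share the caution about large outputs and secrets ending up in the committed notebook. One mitigation is to clear cell outputs before committing. The following is a minimal sketch using only the standard library; `strip_outputs` is a hypothetical helper, and it assumes the notebook is in nbformat 4 JSON, where each code cell carries `outputs` and `execution_count` fields.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of an nbformat-4 notebook dict with code-cell outputs cleared."""
    cleaned = json.loads(json.dumps(notebook))  # deep copy via a JSON round trip
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Tiny in-memory notebook used only for demonstration.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Week 2.1"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 2,
            "source": ["print('Rows:', 1)"],
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["Rows: 1\n"]}],
        },
    ],
}

cleaned = strip_outputs(notebook)
print(json.dumps(cleaned["cells"][1], indent=2))
```

To apply it to a real file, load the `.ipynb` with `json.load`, pass the dict through `strip_outputs`, and write the result back with `json.dump`. Clearing outputs does not remove secrets typed into the code itself, so those still need to be checked separately.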


---
## Capstone Synthesis (end of class)

**Scenario:** Your team needs a reproducible workflow for this course: team repo on GitHub, branching, Colab auth to BigQuery, and a PR checklist.

**Run this prompt:**
```
Act as a DevEx lead for a university analytics team.
Produce a one-page "Runbook" with:
- Repo structure (folders for notebooks, data, dashboards, docs)
- Branching model (who creates branches, when to merge)
- Colab ↔ BigQuery quickstart (auth, sample query, cost-safe LIMIT)
- PR checklist (max 8 bullets) and protection rules for main
- Two risks + mitigations (e.g., secrets leakage, merge conflicts)
Use concise bullets and keep it classroom-ready.
```

**Paste your final runbook below (or attach as a Markdown file in your repo) and add a 3‑bullet reflection on what you changed after validation.**


As your DevEx lead, here is a concise runbook for your university analytics team, designed to keep your workflow smooth and reproducible.

### Team Analytics Project Runbook

#### Repository Structure

    /
    ├── notebooks/         # Jupyter/Colab notebooks for analysis, modeling, etc.
    ├── data/              # Sample data, processed data (avoid large raw files)
    ├── dashboards/        # Code or config files for dashboarding tools (e.g., Looker Studio, Tableau)
    ├── docs/              # Project documentation, meeting notes, design docs
    ├── src/               # Python scripts or modules (if any)
    ├── .gitignore         # Files/folders to ignore (e.g., large raw data files, .env, __pycache__/)
    ├── README.md          # Project overview, setup instructions, team members
    └── requirements.txt   # Python dependencies

#### Branching Model

- **Main branch (`main`):** Always represents the stable, production-ready state of the project. Protected.
- **Feature branches:** Team members create a new branch off `main` for each new task, feature, or bug fix (e.g., `feat/eda-customer-segmentation`, `fix/dashboard-date-filter`).
- **Branch creation:** Any team member can create a feature branch.
- **Merging:** Feature branches are merged into `main` via a pull request after review and approval.

#### Colab ↔ BigQuery Quickstart

Authenticate:

    from google.colab import auth
    auth.authenticate_user()

Query BigQuery:

    from google.cloud import bigquery
    client = bigquery.Client(project="<YOUR_PROJECT_ID>")
    sql = """
    SELECT COUNT(*) AS row_count
    FROM `your_gcp_project_id.your_dataset.your_table`
    """
    df = client.query(sql).result().to_dataframe()
    print("Row count:", df["row_count"].iloc[0])

Note: Under on-demand pricing, BigQuery charges for the bytes a query scans. `LIMIT` caps the rows returned but generally does not reduce the bytes scanned, so select only the columns you need and filter early (e.g., by date on a partitioned table) to manage costs.

#### Pull Request (PR) Checklist & `main` Protection

PR checklist (before merging):

- Title follows the `<week>-<assignment>-<short-description>` convention.
- Description summarizes changes, problem solved, and testing notes.
- Screenshots/visuals included for relevant outputs.
- Linked to the relevant assignment/issue.
- At least one teammate has reviewed and approved.
- Notebook/code runs top-to-bottom without errors.
- No secrets or PII are included.
- Brief revert strategy noted.

`main` branch protection rules (enable in GitHub Settings > Branches):

- Require a pull request before merging.
- Require approvals (set to 1+).
- Require status checks to pass (if configured).
- Disallow force pushes.

#### Risks & Mitigations

- **Risk 1: Secrets leakage (API keys, credentials).** Mitigation: Never hardcode secrets in notebooks or scripts. Use Colab's built-in Secrets manager or environment variables. Add secret file names (e.g., `.env`) to `.gitignore`.
- **Risk 2: Merge conflicts.** Mitigation: Work on small, focused feature branches. Sync (`git pull origin main`) frequently, both before starting new work and before merging. Tell teammates which files you are working on.

This runbook provides a solid foundation for your team's collaborative workflow. Remember to adapt it as needed throughout the semester!
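The secrets mitigation in the runbook can be illustrated with the environment-variable option. The sketch below is an assumption, not a prescribed course pattern: `require_secret` is a hypothetical helper and `ANALYTICS_API_KEY` is a made-up variable name. The demo value is set in code only so the example runs on its own; in real use the variable would be set outside the notebook.

```python
import os


def require_secret(name: str) -> str:
    """Read a secret from the environment and fail loudly if it is missing."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Missing secret {name!r}: set it as an environment variable, not in the notebook."
        )
    return value


# Demo only: a placeholder so this example is runnable. Never commit a real key like this.
os.environ.setdefault("ANALYTICS_API_KEY", "demo-value-not-a-real-key")

api_key = require_secret("ANALYTICS_API_KEY")
print("Loaded key of length", len(api_key))  # report the length, never the secret itself
```

Failing early with a clear message keeps a missing key from surfacing later as a confusing authentication error, and it means the notebook itself never contains the value.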


---
## Submission Checklist (to your team repo + Brightspace link)

- [ ] All **Reflection** sections completed (A1–A3, B1–B3, C1–C3, Capstone).
- [ ] Any code snippets pasted are **validated** and include a 1‑line explanation.
- [ ] Notebook runs top‑to‑bottom without errors (where code cells exist).
- [ ] Commit message: `week2.1-prompt-practice` and open a PR for review.
- [ ] Add this notebook path to your repo **README.md** under Week 2.1.
