<a href="https://colab.research.google.com/github/caitlyn-cai/mgmt467-analytics-portfolio/blob/main/Week2_1_Prompt_Practice.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Week 2.1 — Prompt Practice: Git, GitHub, and Google Colab

**Course:** MGMT 467 — AI‑Assisted Big Data Analytics in the Cloud  
**Session:** Tuesday (2.1) — Developer Environment Setup

### How to use this notebook
- This is a **practice and planning** notebook: most cells are **Markdown** with copy‑pasteable prompt templates you will run in your AI tool (e.g., Gemini).  
- After you run a prompt in your AI tool, **summarize what you learned** in the provided **Reflection** cells here.  
- When a task asks for a short code snippet (e.g., Git or Colab), paste the **final, validated** snippet in the designated cell and add a one‑sentence explanation.

> **Validate everything.** Cross‑check AI outputs with official docs or a second prompt. If two sources disagree, note it and explain which you chose and why.



---
## Prompt Patterns Quick Reference

Use these as starting points and **adapt** them to your context.

### 1) Zero‑Shot (definition/explanation)
```
Act as a clear, concise tutor for first‑year CS students.
Explain {TOPIC} in 5 bullet points max. Include one analogy and one pitfall to avoid.
```

### 2) Few‑Shot (guided answers consistent with examples)
```
You will answer in the same style as the examples.

Q: What is a "commit" in Git?  
A: A snapshot of tracked file changes with a message explaining why.

Q: What is "pushing" in Git?  
A: Sending local commits to a remote repository so others can see them.

Q: {YOUR QUESTION}
A:
```

### 3) Step‑by‑Step Reasoning (show key steps)
```
I need a **numbered, step‑by‑step plan** for {TASK}.
For each step: the goal, one command (if applicable), and a 1‑line verification check.
Avoid hidden steps; keep it to 6–8 steps total.
```



---
## Group A — Git Fundamentals (3 questions)

### A1. What problem does Git solve? How is it different from file syncing?
**Use:** Zero‑Shot, then Few‑Shot for refinement.  
**Run this prompt:**
```
Act as a version control coach.
Explain what Git is and the specific problem it solves compared to simple file syncing (e.g., Drive).
List 3 concrete benefits for a small analytics team.
End with a 2‑sentence analogy.
```
**Reflection (2–4 sentences):** What did you learn that you didn’t already know?

I learned that Git resembles Google Drive in that it tracks changes to code and other files over time, but it does so in a more detailed and organized way: each change is saved as a commit with a message explaining why. Git is well suited to collaboration and to recovering previous states of code. It also supports experimenting with new features on a branch without affecting the main project.



### A2. Commit → Branch → Merge: the minimal workflow
**Use:** Step‑by‑Step Reasoning.  
**Run this prompt:**
```
Create a minimal, step‑by‑step workflow to:
1) initialize a repo, 2) create and switch to a feature branch, 3) commit changes,
4) merge back to main locally, 5) push to a remote named "origin".
For each step include: goal, command, and a quick verification.
```
**Paste final validated commands below and add one sentence on when to branch.**

Branch whenever you start a new feature, fix, or experiment, so unfinished work stays off `main` until it is ready to merge.


In [None]:

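# Paste your validated minimal Git workflow commands here as comments

# 1) Initialize the repo and make a first commit so that `main` exists
# git init
# git add .
# git commit -m "Initial commit"
# git branch -M main

# 2) Create and switch to a feature branch
# git checkout -b feature/your-feature-name

# 3) Commit changes
# git add .
# git commit -m "Descriptive commit message about your changes"

# 4) Merge back to main locally (--no-ff keeps a merge commit in the history)
# git checkout main
# git merge --no-ff feature/your-feature-name

# 5) Verify the history, then push to the remote named "origin"
# git log --oneline
# git push origin main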
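
The prompt above asks for a goal, a command, and a quick verification for each step. One way to keep those three together is a small table in Python. This is only a sketch: the branch name is the placeholder from the cell above, the verification commands are my own suggestions, and step 1 assumes an empty first commit is acceptable so that `main` exists before branching.

```python
# Each step pairs a goal with its command and a quick verification command.
# BRANCH is the placeholder branch name from the cell above.
BRANCH = "feature/your-feature-name"

WORKFLOW = [
    ("Initialize the repo with a first commit on main",
     'git init && git commit --allow-empty -m "Initial commit" && git branch -M main',
     "git status"),
    ("Create and switch to a feature branch",
     f"git checkout -b {BRANCH}",
     "git branch --show-current"),
    ("Stage and commit changes",
     'git add . && git commit -m "Describe the change"',
     "git log --oneline -1"),
    ("Merge back into main locally",
     f"git checkout main && git merge --no-ff {BRANCH}",
     "git log --oneline --graph"),
    ("Push to the remote named origin",
     "git push origin main",
     "git status -sb"),
]


def format_workflow(steps):
    """Render the steps as a numbered plan with a run line and a verify line."""
    lines = []
    for number, (goal, command, verify) in enumerate(steps, start=1):
        lines.append(f"{number}. {goal}")
        lines.append(f"   run:    {command}")
        lines.append(f"   verify: {verify}")
    return "\n".join(lines)


print(format_workflow(WORKFLOW))
```

The code only prints the plan; it does not run Git. The push step assumes a remote named `origin` has already been added to the repo.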



### A3. Resolving a simple merge conflict
**Use:** Step‑by‑Step Reasoning.  
**Run this prompt:**
```
I have a merge conflict in README.md after merging a feature branch into main.
Give a 6-step recipe to resolve it safely:
- how to open the file, identify conflict markers, choose/merge lines,
- add/commit the resolution, verify the merge, and push.
Include one common pitfall and how to avoid it.
```
**Reflection:** What’s your personal checklist to avoid conflicts getting messy?

1. Identify the conflict: run `git status` to see which files are unmerged.
2. Open the conflicting file: `README.md`.
3. Resolve the conflict: edit the file to keep the lines you want, then delete the conflict markers (`<<<<<<<`, `=======`, `>>>>>>>`).
4. Stage the resolved file to tell Git that the conflict in `README.md` is resolved.
   - Command: `git add README.md`
5. Commit the resolution.
   - Command: `git commit`
6. Push the changes.
   - Command: `git push origin main`
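
A common pitfall in step 3 is leaving a conflict marker behind. The sketch below shows one way to check a file's text for leftover markers before staging it. The function name and the sample text are hypothetical. The check is deliberately simple, so a Markdown heading underlined with exactly seven `=` characters would be flagged as a false positive.

```python
def find_conflict_markers(text):
    """Return (line_number, line) pairs for lines that look like Git conflict markers."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        is_marker = (
            line.startswith("<<<<<<< ")
            or line.startswith(">>>>>>> ")
            or line == "======="
        )
        if is_marker:
            hits.append((number, line))
    return hits


# Hypothetical README.md contents in the middle of a merge conflict
sample = """# Project
<<<<<<< HEAD
Team notes for main
=======
Team notes for the feature branch
>>>>>>> feature/your-feature-name
"""

for number, line in find_conflict_markers(sample):
    print(number, line)
```

An empty result suggests the file is ready for `git add`. To check the real file, read it with `open("README.md").read()` and pass the text in.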



---
## Group B — GitHub Collaboration (3 questions)

### B1. Branch vs. Fork vs. Clone
**Use:** Few‑Shot to drive crisp distinctions with examples.  
**Run this prompt:**
```
Answer using this format:
Term — One-sentence definition — When to use — One example.

Branch — A separate line of development within a single repository — Use when working on a new feature or bug fix without affecting the main codebase — Example: Creating a feature/user-authentication branch to add login functionality.
Fork — A copy of a repository under a different user's account — Use when you want to contribute to an open-source project or use a project as a starting point for your own, independent work — Example: Forking the pandas library to suggest a new feature or bug fix.
Clone — Downloading a copy of a repository from a remote location to your local machine — Use to get a local working copy of a repository you want to work with — Example: Cloning a team repository from GitHub to start working on the project files.


```
**Reflection:** Which one will your team use for this course and why?

We will each clone the team repository so we can work on the project files locally and push our changes back to the shared repo.



### B2. Pull Request (PR) checklist for this course
**Use:** Step‑by‑Step Reasoning.  
**Run this prompt:**
```
Write a "PR Checklist" for a university analytics course team repo.
Include: naming convention, description template, screenshots policy, reviewers, CI checks (if any),
and a revert plan. Limit to 8 concise checklist items.
```
**Paste your final checklist below.**

**PR Title:** Follows <unit>-<lab>-<short-desc> convention (e.g., u1-lab2-eda-trends).

**Description:** Includes problem addressed, approach taken, key files changed, and how to test.

**Screenshots:** Attach 1-2 relevant screenshots if visuals (plots, dashboards) were added or significantly changed.

**Linked Items:** Link to the related issue or assignment requirement in the description.

**Reviewers:** Request review from at least one teammate; self-merging is not allowed.

**CI Checks:** Passes any automated checks (e.g., code formatting, basic tests, if configured).

**Secrets Check:** Ensure no secrets, API keys, or personally identifiable information (PII) are included in code or output cells.

**Revert Plan:** Briefly describe how to quickly revert the changes if necessary.

In [None]:

# Example (edit to your team's needs)
pr_checklist = [
    "PR title: <unit>-<lab>-<short-desc> (e.g., u1-lab2-eda-trends)",
    "Description includes: problem, approach, key files, and how to test",
    "Attach 1–2 screenshots (plots/dashboards) if visuals changed",
    "Link related issue or assignment requirement",
    "Request review from 1 teammate; no self-merge",
    "Passes notebook re-run without errors (Runtime > Run all)",
    "No secrets, tokens, or PII in code or outputs",
    "Revert plan: how to roll back quickly if needed"
]
pr_checklist
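
The title convention in the first checklist item can be checked mechanically. The sketch below is one possible check, not a course requirement. The regular expression is my assumption about what `<unit>-<lab>-<short-desc>` means, inferred from the single example `u1-lab2-eda-trends`, and `is_valid_pr_title` is a hypothetical helper name.

```python
import re

# Assumed pattern: "u<number>-lab<number>-<lowercase words separated by hyphens>"
PR_TITLE_PATTERN = re.compile(r"^u\d+-lab\d+-[a-z0-9]+(?:-[a-z0-9]+)*$")


def is_valid_pr_title(title):
    """Return True if the title matches the assumed <unit>-<lab>-<short-desc> pattern."""
    return bool(PR_TITLE_PATTERN.match(title))


for title in ["u1-lab2-eda-trends", "Fix stuff", "u1-lab2-"]:
    print(title, "->", is_valid_pr_title(title))
```

If your team names units or labs differently, adjust the pattern before relying on it.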



### B3. Protected `main` workflow
**Use:** Zero‑Shot + Step‑by‑Step.  
**Run this prompt:**
```
Explain how to protect the main branch in a GitHub repo for a class team:
- Require PRs, at least one review, and passing checks
- Disallow force-pushes
Provide a numbered setup guide and a 3-line "why this matters" explanation.
```
**Reflection:** Which protection rules will you actually enable first, and why?

I will enable four rules first: require a pull request before merging, require approvals, require status checks to pass before merging, and do not allow bypassing these settings. Together they ensure that teammates review each other's work before it reaches `main`.
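
The same rules can also be written down as data. The sketch below builds a settings payload shaped like the body of GitHub's REST "Update branch protection" endpoint, as I understand it. Treat the field names as something to verify against the official docs, and `OWNER` and `REPO` as hypothetical placeholders. The code only prints the payload and sends no request.

```python
import json

# Hypothetical placeholders; replace with your organization and repo names.
OWNER = "your-org"
REPO = "your-team-repo"

protection = {
    # Require a PR with at least one approving review
    "required_pull_request_reviews": {"required_approving_review_count": 1},
    # Require status checks; the empty list means no specific check is named yet
    "required_status_checks": {"strict": True, "contexts": []},
    # Apply the rules to admins too (no bypassing)
    "enforce_admins": True,
    # No extra restrictions on who can push
    "restrictions": None,
    # Disallow force-pushes to the protected branch
    "allow_force_pushes": False,
}

endpoint = f"/repos/{OWNER}/{REPO}/branches/main/protection"
print("PUT", endpoint)
print(json.dumps(protection, indent=2))
```

For a class repo, the settings page in the GitHub web UI is usually simpler. Writing the rules as a payload mainly helps when you want to document or review them.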


---
## Group C — Google Colab for Analytics (3 questions)

### C1. Why Colab? Benefits & limits for this course
**Use:** Zero‑Shot.  
**Run this prompt:**
```
Act as a data science tech advisor.
List 5 advantages and 3 limitations of Google Colab for analytics coursework.
Tailor to a class that uses BigQuery and dashboards. Keep it to bullet points.
```
**Reflection:** Which two advantages will help *you* most this semester?

The two that will help me most are easy sharing and collaboration, and integration with the Google ecosystem.


### C2. Authenticate to GCP in Colab and query BigQuery
**Use:** Step‑by‑Step Reasoning for a minimal working snippet.  
**Run this prompt:**
```
Provide a minimal Colab snippet to:
1) authenticate to Google Cloud,
2) run a simple BigQuery SQL (e.g., SELECT 1),
3) get results into a pandas DataFrame,
4) print row count.
Include a one-line note on costs and safe use of LIMIT.
```
**Paste your final validated code below.**


In [1]:
# Minimal BigQuery test in Colab
from google.colab import auth
from google.cloud import bigquery
import pandas as pd # Import pandas

# 1) Authenticate to Google Cloud
auth.authenticate_user()

# Replace with your Google Cloud Project ID
project_id = "<YOUR_PROJECT_ID>" # <--- REPLACE THIS WITH YOUR PROJECT ID
client = bigquery.Client(project=project_id)

# 2) Run a simple BigQuery SQL query
# 3) Get results into a pandas DataFrame
sql = "SELECT 1 AS test_col"
df = client.query(sql).result().to_dataframe()

# 4) Print row count
print("Rows:", len(df))

# Display the DataFrame head (optional, but good practice)
display(df.head())

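# Note on costs and LIMIT: on-demand BigQuery pricing is based on bytes scanned, not rows returned.
# LIMIT keeps results small and iteration fast, but it usually does not reduce the bytes scanned;
# to lower cost, select only the columns you need and filter on partitioned columns.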

BadRequest: 400 POST https://bigquery.googleapis.com/bigquery/v2/projects/%3CYOUR_PROJECT_ID%3E/jobs?prettyPrint=false: ProjectId must be non-empty

Location: None
Job ID: bb5fbd03-b571-4daf-bd05-7f252fabea76



### C3. Save notebooks to GitHub from Colab
**Use:** Step‑by‑Step Reasoning.  
**Run this prompt:**
```
Give two safe workflows to keep Colab notebooks versioned in GitHub:
(A) using "File > Save a copy in GitHub",
(B) local git with Drive sync (brief).
Provide steps and cautions (e.g., large outputs, secrets) for each.
```
**Reflection:** Which workflow will your team adopt and why?

Workflow A: Using "File > Save a copy in GitHub"

It seems like the easier way to keep the whole team in sync.
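
The prompt's caution about large outputs applies to either workflow, because saved outputs make notebooks bulky and can leak data. The sketch below shows one way to clear outputs from a notebook's JSON before committing. It assumes the standard `.ipynb` layout (a top-level `cells` list whose code cells have `outputs` and `execution_count`). `strip_outputs` is a hypothetical helper, and the example notebook is a tiny hand-made dictionary.

```python
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict with code-cell outputs and execution counts cleared."""
    cleaned = json.loads(json.dumps(notebook))  # deep copy via a JSON round trip
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Tiny hand-made notebook used only to demonstrate the function
example = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {
            "cell_type": "code",
            "source": ["print('hi')"],
            "execution_count": 3,
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["hi\n"]}],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

cleaned = strip_outputs(example)
print(cleaned["cells"][1]["outputs"], cleaned["cells"][1]["execution_count"])
```

To use it on a real file, read the notebook with `json.load`, pass the result through `strip_outputs`, and write it back with `json.dump`.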


---
## Capstone Synthesis (end of class)

**Scenario:** Your team needs a reproducible workflow for this course: team repo on GitHub, branching, Colab auth to BigQuery, and a PR checklist.

**Run this prompt:**
```
Act as a DevEx lead for a university analytics team.
Produce a one-page "Runbook" with:
- Repo structure (folders for notebooks, data, dashboards, docs)
- Branching model (who creates branches, when to merge)
- Colab ↔ BigQuery quickstart (auth, sample query, cost-safe LIMIT)
- PR checklist (max 8 bullets) and protection rules for main
- Two risks + mitigations (e.g., secrets leakage, merge conflicts)
Use concise bullets and keep it classroom-ready.
```

**Paste your final runbook below (or attach as a Markdown file in your repo) and add a 3‑bullet reflection on what you changed after validation.**

1. Repository Structure

My version:

Included only 4 folders (notebooks, data, dashboards, docs).

Kept it simpler for a “one-page” classroom-ready doc.

Sample version:

Added subfolders under notebooks (e.g., notebooks/lab1_eda/).

Added src/ for reusable Python scripts.

Change: I removed src/ and didn’t detail subfolders to keep it lean and beginner-friendly.

2. Branching Model

My version:

Focused on roles: who creates (contributors), when to merge (after review/tests).

Added hotfix branches explicitly.

Sample version:

Emphasized naming conventions (e.g., feature/js-eda-trends).

No mention of hotfixes.

Change: I added hotfix workflow but didn’t include naming rules (kept lightweight).

3. Colab ↔ BigQuery Quickstart

My version:

Gave a realistic group-by query with LIMIT 1000.

Highlighted cost-safety with that example.

Sample version:

Used a minimal SELECT 1 test query.

Stated cost-safety separately as a principle.

Change: I replaced the trivial test query with a realistic one to show useful classroom practice while still cost-safe.

4. Pull Request Checklist & Main Protection

My version:

Kept it to 8 bullets max (title/description, runs without errors, outputs cleared, no secrets, docs updated, tests pass, reviewer assigned, small PRs).

Separate section for protection rules (review, CI, no direct pushes, up-to-date merges).

Sample version:

Much more detailed (titles in specific format, screenshots, linked issues, revert plan).

Mixed protection rules into the same section.

Change: I streamlined the checklist (shorter, simpler, fits classroom runbook) and separated protection rules for clarity.

5. Risks & Mitigations

My version:

Two risks: secrets leakage + merge conflicts.

Mitigations: .gitignore, scanners, env vars, small PRs, frequent merges.

Sample version:

Same two risks.

Mitigations more detailed (Colab Secrets Manager, clearing outputs, splitting notebooks).

Change: I kept risks simpler and focused on general Git hygiene rather than Colab-specific mitigations.
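
The "scanners" mitigation above could be as small as the sketch below, which searches text for a few well-known secret formats before a commit. The patterns are illustrative assumptions and far from complete, so a clean result does not prove a file is safe. `scan_for_secrets` is a hypothetical helper, and the sample key is made up.

```python
import re

# Illustrative patterns only; real scanners cover many more formats.
SECRET_PATTERNS = {
    "Google API key": re.compile(r"AIza[0-9A-Za-z_\-]{35}"),
    "GitHub personal access token": re.compile(r"ghp_[0-9A-Za-z]{36}"),
    "Private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def scan_for_secrets(text):
    """Return (line_number, label) pairs for lines that match a known secret pattern."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label))
    return findings


fake_key = "AIza" + "x" * 35  # made-up value in the right shape, not a real key
sample = f"project = 'demo'\napi_key = '{fake_key}'\n"
print(scan_for_secrets(sample))
```

Running it over a notebook's raw text, including output cells, covers the "no secrets in code or outputs" item in the PR checklist.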




---
## Submission Checklist (to your team repo + Brightspace link)

- [ ] All **Reflection** sections completed (A1–A3, B1–B3, C1–C3, Capstone).
- [ ] Any code snippets pasted are **validated** and include a 1‑line explanation.
- [ ] Notebook runs top‑to‑bottom without errors (where code cells exist).
- [ ] Commit message: `week2.1-prompt-practice` and open a PR for review.
- [ ] Add this notebook path to your repo **README.md** under Week 2.1.
