# CS 5588: Data Science Capstone
## Week 4 Hands-On: Capstone Product Integration Sprint  
**Deadline:** Feb. 12, 2026 (Thu), 11:59 PM

### What this week is about
This Week 4 hands-on upgrades your Week 3 prototype into a **capstone-ready product module**:
- **Application integration** (a simple UI or endpoint that demonstrates a real workflow)
- **Operational logging + monitoring** (so you can measure usage and failures)
- **Impact-focused evaluation** (not only IR metrics, but also stakeholder/product impact)
- **Deployment readiness plan** (architecture + run instructions)
- **Failure & risk analysis** (what can go wrong, how you detect/mitigate)

### Submission policy
- **Team deliverables (GitHub):** code + notebook + brief/report + screenshots/diagram  
- **Individual reflection (Canvas/Survey):** one short paragraph

> This notebook is a **template**. Replace placeholders with your project specifics (data, users, goals, models).

---

## Recommended repo structure
```
/app/                 # Streamlit (or other) UI (recommended)
/src/                 # reusable pipeline code (data + modeling + retrieval)
/logs/                # monitoring logs (auto-created)
/reports/             # integration brief + diagrams
/notebooks/           # this notebook
requirements.txt
README.md
```

---

## Checklist (quick)
- [ ] Week-4 Integration Brief completed (Section 2)
- [ ] Working app demo (Section 6)
- [ ] Logging file created and populated (Sections 4–5)
- [ ] Impact evaluation + technical metrics (Section 5)
- [ ] Deployment plan + architecture diagram (Section 7)
- [ ] One realistic failure/risk + mitigation (Section 8)
- [ ] Individual reflection (Section 9)


## 1) Team & project metadata (Required)
Fill these fields first. They will be reused in your report and README.

## ✅ Step 0.5: Fill in your project info (required)

Before generating files, update the configuration values below. This prevents generic submissions.


In [49]:
# CS5588_WEEK4_CONFIG_VALIDATION
import os

# ---- REQUIRED: edit these ----
TEAM_NAME = "WEAI"
PROJECT_TITLE = "WeatherTwin"
TARGET_USER = "Common People"

# Optional
DEPLOYMENT_TARGET = "Streamlit Cloud"  # or HuggingFace Spaces, Render, etc.

def _require_filled(label, value):
    if value.strip().startswith('<REPLACE_') or value.strip() == "":
        raise ValueError(f"Please edit {label} at the top of this cell before continuing.")

_require_filled('TEAM_NAME', TEAM_NAME)
_require_filled('PROJECT_TITLE', PROJECT_TITLE)
_require_filled('TARGET_USER', TARGET_USER)

print('✅ Config looks good')
print('TEAM_NAME:', TEAM_NAME)
print('PROJECT_TITLE:', PROJECT_TITLE)
print('TARGET_USER:', TARGET_USER)


✅ Config looks good
TEAM_NAME: WEAI
PROJECT_TITLE: WeatherTwin
TARGET_USER: Common People


In [50]:
from dataclasses import dataclass
from pathlib import Path
import pandas as pd
import numpy as np
import json, time
from datetime import datetime, timezone

@dataclass
class Week4Config:
    course: str = "CS 5588 Data Science Capstone"
    week: str = "Week 4"
    deadline: str = "2026-02-12 23:59"
    team_name: str = "TEAM_NAME"
    project_title: str = "PROJECT_TITLE"
    stakeholder: str = "TARGET_USER / STAKEHOLDER"
    problem_statement: str = "1–2 sentences: what real problem are you solving?"
    data_summary: str = "What data (sources/modalities) does your project use?"
    model_summary: str = "What model(s) do you use (baseline + main)?"
    app_dir: str = "./app"
    src_dir: str = "./src"
    logs_dir: str = "./logs"
    reports_dir: str = "./reports"
    log_file: str = "./logs/week4_events.csv"

cfg = Week4Config()
Path(cfg.app_dir).mkdir(parents=True, exist_ok=True)
Path(cfg.src_dir).mkdir(parents=True, exist_ok=True)
Path(cfg.logs_dir).mkdir(parents=True, exist_ok=True)
Path(cfg.reports_dir).mkdir(parents=True, exist_ok=True)

cfg


Week4Config(course='CS 5588 Data Science Capstone', week='Week 4', deadline='2026-02-12 23:59', team_name='TEAM_NAME', project_title='PROJECT_TITLE', stakeholder='TARGET_USER / STAKEHOLDER', problem_statement='1–2 sentences: what real problem are you solving?', data_summary='What data (sources/modalities) does your project use?', model_summary='What model(s) do you use (baseline + main)?', app_dir='./app', src_dir='./src', logs_dir='./logs', reports_dir='./reports', log_file='./logs/week4_events.csv')
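
Note that the `Week4Config` defaults are literal placeholder strings, so the values validated in the first cell (`TEAM_NAME`, `PROJECT_TITLE`, `TARGET_USER`) do not reach the generated brief or README unless they are passed in explicitly. The snippet below is a minimal sketch of one way to wire them in. `DemoConfig` is a hypothetical, trimmed-down stand-in for `Week4Config`, used only so the example runs on its own.

```python
from dataclasses import dataclass, replace

# Values as set in the config cell at the top of the notebook.
TEAM_NAME = "WEAI"
PROJECT_TITLE = "WeatherTwin"
TARGET_USER = "Common People"


@dataclass
class DemoConfig:
    # Hypothetical stand-in for Week4Config with the same placeholder defaults.
    team_name: str = "TEAM_NAME"
    project_title: str = "PROJECT_TITLE"
    stakeholder: str = "TARGET_USER / STAKEHOLDER"


# replace() returns a copy of the dataclass with the listed fields overridden.
demo_cfg = replace(
    DemoConfig(),
    team_name=TEAM_NAME,
    project_title=PROJECT_TITLE,
    stakeholder=TARGET_USER,
)
print(demo_cfg)
```

The same idea should carry over to the real class, for example `cfg = Week4Config(team_name=TEAM_NAME, project_title=PROJECT_TITLE, stakeholder=TARGET_USER)`.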

## 2) Week-4 Capstone Integration Brief (Required)
Create a **1-page** brief (can be in `reports/week4_integration_brief.md` or a README section).

Include:
1. **Where this module fits** in your capstone architecture  
2. **Primary user workflow** (what the user does end-to-end)  
3. **Success metrics** (product/impact metrics + technical metrics)  
4. **Risks if it fails** (stakeholder harm / wrong decision / wasted time)  
5. **Next sprint** (what you would build next)

Below is a starter you can export to Markdown.


In [51]:
brief_path = Path(cfg.reports_dir) / "week4_integration_brief.md"

brief_template = f"""# Week 4 Integration Brief: {cfg.project_title}
**Team:** {cfg.team_name}
**Stakeholder/User:** {cfg.stakeholder}
**Problem:** {cfg.problem_statement}

## 1) Module placement in capstone system
- Upstream inputs:
- Module responsibilities:
- Downstream outputs:

## 2) User workflow (end-to-end)
1.
2.
3.

## 3) Success metrics
### Product / impact metrics (required)
- Time-to-decision:
- Trust/verification signals:
- Adoption/usage signal:

### Technical metrics (recommended)
- Quality (e.g., accuracy/F1, Precision@K/Recall@K, calibration, etc.):
- Latency:
- Failure rate:

## 4) Failure & risk (what happens if wrong?)
- Likely failure:
- Impact:
- Mitigation:

## 5) Next sprint plan
- Next feature:
- Data improvement:
- Evaluation improvement:
"""

# Write (or overwrite) template file
brief_path.write_text(brief_template, encoding="utf-8")
print("Wrote:", brief_path)
print("\nPreview (first 25 lines):\n")
print("\n".join(brief_template.splitlines()[:25]))


Wrote: reports/week4_integration_brief.md

Preview (first 25 lines):

# Week 4 Integration Brief: PROJECT_TITLE
**Team:** TEAM_NAME
**Stakeholder/User:** TARGET_USER / STAKEHOLDER
**Problem:** 1–2 sentences: what real problem are you solving?

## 1) Module placement in capstone system
- Upstream inputs:
- Module responsibilities:
- Downstream outputs:

## 2) User workflow (end-to-end)
1.
2.
3.

## 3) Success metrics
### Product / impact metrics (required)
- Time-to-decision:
- Trust/verification signals:
- Adoption/usage signal:

### Technical metrics (recommended)
- Quality (e.g., accuracy/F1, Precision@K/Recall@K, calibration, etc.):
- Latency:
- Failure rate:


## 3) Data + modeling hook (Project-aligned)
Week 4 must use **your capstone project data and models**.

- If you are building a **RAG / search / recommender**: wire your retrieval + generation here.
- If you are building a **predictive model**: wire your training/inference function here.
- If you are building a **dashboard / analytics product**: wire your data processing + visualization logic here.

Below is a **minimal runnable stub** so this notebook executes even without your data.
Replace the stubs with your actual project code from Week 3.


In [52]:
# --- Replace these with your real project pipeline imports ---
# from src.pipeline import load_data, run_inference, retrieve_evidence, generate_output

def load_data_stub():
    # Example synthetic dataset (replace with your real dataset loader)
    df = pd.DataFrame({
        "id": [1, 2, 3],
        "text": [
            "Project doc: model deployment requires monitoring.",
            "Project doc: user trust improves with evidence and citations.",
            "Project doc: define success metrics and failure cases."
        ],
        "label": [1, 1, 0]
    })
    return df

def predict_stub(df: pd.DataFrame, query: str):
    # Replace with your real model inference
    # For demo: return top rows containing keyword overlap
    scores = df["text"].str.lower().apply(lambda t: sum(w in t for w in query.lower().split()))
    top = df.assign(score=scores).sort_values("score", ascending=False).head(3)
    return top

df_demo = load_data_stub()
df_demo


Unnamed: 0,id,text,label
0,1,Project doc: model deployment requires monitor...,1
1,2,Project doc: user trust improves with evidence...,1
2,3,Project doc: define success metrics and failur...,0


## 4) Monitoring & logging (Required)
You must implement **automatic logging** for every user interaction / query.

Minimum columns (recommended):
- timestamp
- event_type (query / feedback / error)
- user_task_type (what workflow this supports)
- config (model/retrieval settings)
- latency_ms
- output_quality_signal (e.g., faithfulness pass/fail, confidence, error flag)
- artifact_ids (evidence ids, record ids, etc.)

This creates the foundation for **production-style monitoring** and **capstone evaluation**.


**Mapping to the Week 4 handout terminology (1-to-1):**
- `artifact_ids` → **evidence IDs**
- `model_or_mode` → **retrieval configuration**
- `quality_signal` → **confidence/faithfulness indicator**


In [53]:
import csv

LOG_COLUMNS = [
    "timestamp",
    "event_type",
    "user_task_type",
    "model_or_mode",
    "latency_ms",
    "artifact_ids",
    "quality_signal",
    "notes"
]

def ensure_csv(path: str, header: list):
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    if not p.exists():
        with open(p, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(header)

ensure_csv(cfg.log_file, LOG_COLUMNS)
print("Logging to:", cfg.log_file)

def log_event(event_type: str,
              user_task_type: str,
              model_or_mode: str,
              latency_ms: float,
              artifact_ids,
              quality_signal: str,
              notes: str = ""):
    row = [
        datetime.now(timezone.utc).isoformat(),
        event_type,
        user_task_type,
        model_or_mode,
        round(float(latency_ms), 2),
        json.dumps(artifact_ids, ensure_ascii=False),
        quality_signal,
        notes
    ]
    with open(cfg.log_file, "a", newline="", encoding="utf-8") as f:
        csv.writer(f).writerow(row)

# Demo log entry
log_event(
    event_type="query",
    user_task_type="demo",
    model_or_mode="stub",
    latency_ms=12.3,
    artifact_ids=[1,2],
    quality_signal="OK",
    notes="demo event"
)

pd.read_csv(cfg.log_file).tail(5)


Logging to: ./logs/week4_events.csv


Unnamed: 0,timestamp,event_type,user_task_type,model_or_mode,latency_ms,artifact_ids,quality_signal,notes
0,2026-02-12T22:39:17.173305+00:00,query,demo,stub,12.3,"[1, 2]",OK,demo event
1,2026-02-12T23:02:51.445484+00:00,query,demo,stub,12.3,"[1, 2]",OK,demo event
2,2026-02-12T23:05:09.313678+00:00,query,demo,stub,12.3,"[1, 2]",OK,demo event
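
Once events accumulate, the log can be summarized into a few monitoring numbers. The following is a minimal sketch using only the standard library; the sample rows are made up for illustration and follow the `LOG_COLUMNS` layout, and the helper names `percentile` and `summarize_log` are hypothetical.

```python
import csv
import io
from collections import Counter

# Illustrative rows in the LOG_COLUMNS layout (values are invented).
SAMPLE_LOG = """timestamp,event_type,user_task_type,model_or_mode,latency_ms,artifact_ids,quality_signal,notes
2026-02-12T22:39:17+00:00,query,demo,stub,12.3,"[1, 2]",OK,demo event
2026-02-12T22:40:02+00:00,query,demo,stub,48.0,"[3]",OK,demo event
2026-02-12T22:41:30+00:00,error,demo,stub,250.0,[],LOW,timeout
2026-02-12T22:42:11+00:00,query,demo,stub,20.5,"[1]",OK,demo event
"""


def percentile(sorted_values, q):
    """Nearest-rank percentile (q is an integer 0..100) of a sorted list."""
    if not sorted_values:
        return None
    rank = max(1, -(-len(sorted_values) * q // 100))  # ceil(n * q / 100)
    return sorted_values[rank - 1]


def summarize_log(file_obj):
    """Return event count, error rate, and latency percentiles for a log file."""
    rows = list(csv.DictReader(file_obj))
    latencies = sorted(float(r["latency_ms"]) for r in rows)
    events = Counter(r["event_type"] for r in rows)
    return {
        "n_events": len(rows),
        "error_rate": events["error"] / len(rows) if rows else 0.0,
        "p50_latency_ms": percentile(latencies, 50),
        "p95_latency_ms": percentile(latencies, 95),
    }


summary = summarize_log(io.StringIO(SAMPLE_LOG))
print(summary)
```

With a real log, pass `open(cfg.log_file, newline="", encoding="utf-8")` in place of the `StringIO` object.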


## 5) Week-4 evaluation: Impact + Technical metrics (Required)
Capstone evaluation must include **impact-oriented metrics**, not only technical ones.

### A) Impact metrics (required)
Choose 2–3 that match your stakeholder workflow:
- time-to-decision (before vs after)
- trust/verification rate (e.g., citations shown, evidence opened)
- task success rate (user can complete task)
- adoption signal (weekly active usage in demo, number of queries run)
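
As a rough illustration of how two of these could be computed, the sketch below compares time-to-decision and task success rate before and after introducing the module. The trial records are invented for the example, and the record layout and the `impact_summary` helper are assumptions, not part of the handout.

```python
from statistics import median

# Illustrative trial records; the numbers are invented for this example.
# Each record: (condition, seconds_to_decision, task_completed)
trials = [
    ("before", 310, True),
    ("before", 420, False),
    ("before", 365, True),
    ("after", 140, True),
    ("after", 180, True),
    ("after", 205, False),
]


def impact_summary(records, condition):
    """Median time-to-decision and task success rate for one condition."""
    subset = [r for r in records if r[0] == condition]
    times = [seconds for _, seconds, _ in subset]
    successes = sum(1 for _, _, done in subset if done)
    return {
        "median_time_to_decision_s": median(times),
        "task_success_rate": successes / len(subset),
    }


before = impact_summary(trials, "before")
after = impact_summary(trials, "after")

# Relative reduction in median time-to-decision, in percent.
time_saved_pct = 100 * (
    1 - after["median_time_to_decision_s"] / before["median_time_to_decision_s"]
)
print(before, after, round(time_saved_pct, 1))
```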

### B) Technical metrics (recommended)
Pick metrics that match your system type:
- Classification/regression: accuracy/F1/AUC/MAE + calibration
- Retrieval/RAG: Precision@K/Recall@K + citation coverage + refusal correctness
- Forecasting: MAE/MAPE/CRPS, interval coverage, etc.

Below is a small example that computes simple metrics from your demo data.
Replace with your project-specific metrics.


In [54]:
# Example technical metrics (replace for your system)
# Here we pretend "label" is the target and a keyword match in "text" is the predicted positive.

def compute_demo_metrics(df: pd.DataFrame):
    y_true = df["label"].values
    y_pred = (df["text"].str.contains("trust|monitoring|metrics", case=False, regex=True)).astype(int).values
    acc = (y_true == y_pred).mean()
    return {"accuracy_demo": float(acc)}

metrics = compute_demo_metrics(df_demo)
metrics


{'accuracy_demo': 0.6666666666666666}
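
For a retrieval or RAG system, Precision@K and Recall@K are a more natural fit than the keyword accuracy above. The sketch below assumes a small hand-labelled evaluation set (hypothetical here) in which each query has a ranked list of retrieved IDs and a set of IDs judged relevant. Precision uses K as the denominator, which is one common convention.

```python
def precision_at_k(retrieved_ids, relevant_ids, k):
    """Share of the top-k retrieved IDs that are relevant (denominator is k)."""
    if k <= 0:
        return 0.0
    hits = sum(1 for doc_id in retrieved_ids[:k] if doc_id in relevant_ids)
    return hits / k


def recall_at_k(retrieved_ids, relevant_ids, k):
    """Share of all relevant IDs that appear in the top-k retrieved IDs."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in retrieved_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)


# Hypothetical labelled queries: ranked retrieved IDs and the relevant set.
eval_set = [
    {"retrieved": [1, 4, 5], "relevant": {1, 5}},
    {"retrieved": [2, 3, 1], "relevant": {3}},
]

k = 3
mean_p = sum(precision_at_k(q["retrieved"], q["relevant"], k) for q in eval_set) / len(eval_set)
mean_r = sum(recall_at_k(q["retrieved"], q["relevant"], k) for q in eval_set) / len(eval_set)
print(f"Precision@{k}={mean_p:.2f}  Recall@{k}={mean_r:.2f}")
```

In a real evaluation, the `retrieved` lists would come from the `artifact_ids` returned by your pipeline for each test query.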

## 6) Build your capstone demo app (Required)
Your team must expose the module via an application interface:
- Streamlit UI (recommended), OR
- an API endpoint + simple client, OR
- a dashboard component integrated into your project

### Required app behavior
- Accept user input (question / task / parameters)
- Produce output aligned to your project workflow
- Display artifacts (evidence / records / plots) as appropriate
- Log events automatically to `logs/week4_events.csv`

Below is a Streamlit skeleton generator that you can commit to `/app/main.py`.


In [55]:
!pip install streamlit pyngrok



In [56]:
# Replace the placeholder with your own token; never commit a real authtoken.
!ngrok config add-authtoken YOUR_NGROK_AUTHTOKEN

!streamlit run app.py &>/content/logs.txt &


Authtoken saved to configuration file: /root/.config/ngrok/ngrok.yml


In [57]:
# # streamlit_app = r'''
# import json, time
# from pathlib import Path
# import pandas as pd
# import streamlit as st

# # --- Customize: import your project pipeline here ---
# # from src.pipeline import load_data, run_inference

# LOG_FILE = "logs/week4_events.csv"
# LOG_COLUMNS = ["timestamp","event_type","user_task_type","model_or_mode","latency_ms","artifact_ids","quality_signal","notes"]

# def ensure_csv(path: str, header):
#     p = Path(path)
#     p.parent.mkdir(parents=True, exist_ok=True)
#     if not p.exists():
#         pd.DataFrame(columns=header).to_csv(p, index=False)

# def log_event(event_type, user_task_type, model_or_mode, latency_ms, artifact_ids, quality_signal, notes=""):
#     ensure_csv(LOG_FILE, LOG_COLUMNS)
#     df = pd.read_csv(LOG_FILE)
#     row = {
#         "timestamp": pd.Timestamp.utcnow().isoformat(),
#         "event_type": event_type,
#         "user_task_type": user_task_type,
#         "model_or_mode": model_or_mode,
#         "latency_ms": round(float(latency_ms), 2),
#         "artifact_ids": json.dumps(artifact_ids, ensure_ascii=False),
#         "quality_signal": quality_signal,
#         "notes": notes
#     }
#     df = pd.concat([df, pd.DataFrame([row])], ignore_index=True)
#     df.to_csv(LOG_FILE, index=False)

# # --- Demo stub (replace) ---
# def load_data_stub():
#     return pd.DataFrame({
#         "id":[1,2,3],
#         "text":[
#             "Project doc: model deployment requires monitoring.",
#             "Project doc: user trust improves with evidence and citations.",
#             "Project doc: define success metrics and failure cases."
#         ]
#     })

# def run_inference_stub(df, query: str):
#     scores = df["text"].str.lower().apply(lambda t: sum(w in t for w in query.lower().split()))
#     top = df.assign(score=scores).sort_values("score", ascending=False).head(3)
#     artifacts = top["id"].tolist()
#     output = top[["id","text","score"]].to_dict(orient="records")
#     return output, artifacts

# # --- UI ---
# st.set_page_config(page_title="CS5588 Week 4: Capstone Demo", layout="wide")
# st.title("CS 5588 Week 4: Capstone Product Integration Demo")
# st.caption("Project-aligned module + monitoring logs + deployment readiness")

# df = load_data_stub()

# user_task_type = st.selectbox("User task type (workflow)", ["analysis", "search", "decision-support", "reporting", "other"])
# model_or_mode = st.selectbox("Model/mode", ["baseline", "main", "ablation"])

# query = st.text_area("Enter your question / task input", height=120)
# run_btn = st.button("Run")

# if run_btn and query.strip():
#     t0 = time.time()
#     output, artifacts = run_inference_stub(df, query)
#     latency_ms = (time.time() - t0) * 1000

#     # --- Response panel ---
#     st.subheader("Response")
#     st.json(output)

#     # --- Evidence panel (explicit, required by handout) ---
#     # In your real system, this should render retrieved chunks/records with IDs + short text/snippet + source link/citation.
#     with st.expander("Evidence (IDs + preview)", expanded=True):
#         st.write("Evidence IDs:", artifacts)
#         # Stub evidence table (replace with real evidence objects)
#         evidence_rows = [{"evidence_id": a, "preview": "Replace with evidence snippet/source"} for a in artifacts]
#         st.table(pd.DataFrame(evidence_rows))

#     # --- Metrics panel (explicit, required by handout) ---
#     st.subheader("Metrics")
#     st.write({
#         "latency_ms": round(latency_ms, 2),
#         "retrieval_configuration": model_or_mode,   # maps to handout: retrieval configuration
#         "confidence_or_faithfulness": "OK" if len(artifacts) > 0 else "LOW"  # maps to handout: confidence/faithfulness indicator
#     })

#     # Simple quality signal stub
#     quality_signal = "OK" if len(artifacts) > 0 else "LOW"

#     log_event(
#         event_type="query",
#         user_task_type=user_task_type,
#         model_or_mode=model_or_mode,
#         latency_ms=latency_ms,
#         artifact_ids=artifacts,
#         quality_signal=quality_signal,
#         notes="week4 demo"
#     )
#     st.success(f"Logged event to {LOG_FILE}")
# # '''

# app_path = Path(cfg.app_dir) / "main.py"
# # app_path.write_text(streamlit_app, encoding="utf-8")
# print("Wrote Streamlit app:", app_path)
# print("\nRun locally (terminal):")
# print("  pip install -r requirements.txt")
# print("  streamlit run app/main.py")


In [58]:
!pip -q install faiss-cpu rank-bm25


# src/pipeline.py
import pandas as pd
import numpy as np
from sentence_transformers import SentenceTransformer
import faiss
import time

class WeatherTwinRAG:
    def __init__(self):
        print("🔄 Loading Climate Intelligence Models...")
        self.model = SentenceTransformer('all-MiniLM-L6-v2')
        self.index = None
        self.data = None
        self._load_data()

    def _load_data(self):
        """
        Load your project data.
        For this demo, we use a hardcoded list of weather/climate facts.
        In your real project, load your PDFs and chunk them here.
        """
        # --- REAL PROJECT DATA PLACEHOLDER ---
        # df = pd.read_csv("your_processed_data.csv")
        # texts = df['text_column'].tolist()

        # --- DEMO DATA (WeatherTwin Context) ---
        self.data = pd.DataFrame([
            {"id": 1, "text": "Zone AE areas have a 1% annual chance of flooding. Flood insurance is mandatory.", "source": "FEMA Map"},
            {"id": 2, "text": "Heatwaves in urban areas increase AC load by 20% due to the heat island effect.", "source": "Climate Study 2023"},
            {"id": 3, "text": "Old brick buildings retain heat longer than modern wood structures, increasing indoor temps.", "source": "Building Guide"},
            {"id": 4, "text": "Coastal properties within 500m of the shoreline are at high risk for storm surge.", "source": "FEMA Report"},
            {"id": 5, "text": "Flash flood warnings are issued when rainfall exceeds 2 inches per hour.", "source": "Weather Service"}
        ])

        texts = self.data['text'].tolist()

        # Create Embeddings
        print("🔄 Generating embeddings for search...")
        embeddings = self.model.encode(texts, show_progress_bar=True)

        # Build FAISS Index (L2 distance)
        dimension = embeddings.shape[1]
        self.index = faiss.IndexFlatL2(dimension)
        self.index.add(embeddings.astype('float32'))
        print("✅ System Ready.")

    def search(self, query: str, k=3):
        """
        Perform a search based on user query.
        Returns: (Results DataFrame, List of IDs, Latency)
        """
        start_time = time.time()

        # 1. Encode the query
        query_vector = self.model.encode([query])

        # 2. Search FAISS
        distances, indices = self.index.search(query_vector.astype('float32'), k)

        # 3. Retrieve results
        results = []
        artifact_ids = []

        for i in range(k):
            idx = indices[0][i]
            if idx != -1: # FAISS returns -1 if index is empty/not found
                row = self.data.iloc[idx].to_dict()
                row['score'] = float(distances[0][i]) # Lower is better
                results.append(row)
                artifact_ids.append(row['id'])

        latency_ms = (time.time() - start_time) * 1000

        return results, artifact_ids, latency_ms

# Singleton instance to avoid reloading models on every click
_rag_instance = None

def get_rag_system():
    global _rag_instance
    if _rag_instance is None:
        _rag_instance = WeatherTwinRAG()
    return _rag_instance

# app/main.py
import json
import sys
from pathlib import Path

import pandas as pd
import streamlit as st

sys.path.append('../')  # Go up one level to find 'src'

# --- IMPORT YOUR REAL PIPELINE ---
# from src.pipeline import get_rag_system

LOG_FILE = "logs/week4_events.csv"
LOG_COLUMNS = ["timestamp","event_type","user_task_type","model_or_mode","latency_ms","artifact_ids","quality_signal","notes"]

def ensure_csv(path: str, header):
    """Create the log file with a header row if it does not exist yet."""
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    if not p.exists():
        pd.DataFrame(columns=header).to_csv(p, index=False)

def log_event(event_type, user_task_type, model_or_mode, latency_ms, artifact_ids, quality_signal, notes=""):
    ensure_csv(LOG_FILE, LOG_COLUMNS)
    row = {
        "timestamp": pd.Timestamp.now(tz="UTC").isoformat(),
        "event_type": event_type,
        "user_task_type": user_task_type,
        "model_or_mode": model_or_mode,
        "latency_ms": round(float(latency_ms), 2),
        "artifact_ids": json.dumps(artifact_ids, ensure_ascii=False),
        "quality_signal": quality_signal,
        "notes": notes
    }
    # Append one row rather than re-reading and rewriting the whole log on every event.
    pd.DataFrame([row], columns=LOG_COLUMNS).to_csv(LOG_FILE, mode="a", header=False, index=False)

# --- UI ---
st.set_page_config(page_title="WeatherTwin - CS5588 Demo", layout="wide")
st.title("🌤️ WeatherTwin: AI-Powered Climate Intelligence")
st.caption("Personalized risk assessment based on Semantic Search")

# Load the RAG system (only loads once)
with st.spinner("Loading Climate Models..."):
    rag = get_rag_system()

# Sidebar
with st.sidebar:
    st.header("User Profile")
    user_address = st.text_input("Address", "123 River St")
    building_type = st.selectbox("Building Type", ["Apartment", "House"])
    model_mode = st.selectbox("Model Mode", ["Semantic Search (FAISS)", "Keyword Search"])

# Main Area
st.header("Risk Assessment")
user_task = st.selectbox("Task", ["Flood Risk", "Heat Risk", "General Info"])
query_input = st.text_area("Ask about your location or building:", "Is my area at risk of flooding?")

if st.button("Analyze Risk", type="primary"):
    # 1. Run the REAL inference
    results, artifact_ids, latency_ms = rag.search(query_input, k=3)

    # 2. Display Response
    if results:
        st.subheader("Analysis Result")

        # Construct a simple answer from the top result
        top_result = results[0]
        st.info(f"**Match (Score: {top_result['score']:.2f}):**\n\n{top_result['text']}")

        # 3. Evidence Panel
        with st.expander("View Evidence & Sources", expanded=True):
            for res in results:
                st.markdown(f"**Source:** {res['source']} (ID: {res['id']})")
                st.text(res['text'])
                st.markdown("---")

        # 4. Metrics
        col_a, col_b = st.columns(2)
        col_a.metric("Latency", f"{latency_ms:.2f} ms")
        col_b.metric("Documents Retrieved", len(results))

        # 5. Log Event
        log_event(
            event_type="query",
            user_task_type=user_task,
            model_or_mode=model_mode,
            latency_ms=latency_ms,
            artifact_ids=artifact_ids,
            quality_signal="OK",
            notes=f"Query: {query_input[:50]}..."
        )
        st.success("✅ Analysis logged successfully")
    else:
        st.error("No relevant information found in the knowledge base.")



🔄 Loading Climate Intelligence Models...


Loading weights:   0%|          | 0/103 [00:00<?, ?it/s]

BertModel LOAD REPORT from: sentence-transformers/all-MiniLM-L6-v2
Key                     | Status     |  | 
------------------------+------------+--+-
embeddings.position_ids | UNEXPECTED |  | 

Notes:
- UNEXPECTED	:can be ignored when loading from different task/architecture; not ok if you expect identical arch.


🔄 Generating embeddings for search...


Batches:   0%|          | 0/1 [00:00<?, ?it/s]



✅ System Ready.
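
The app above always logs `quality_signal="OK"`, so the log cannot distinguish strong matches from weak ones. One possible refinement is to derive the signal from the retrieval score. The sketch below assumes the embeddings are unit-normalized (worth verifying for your model) and relies on the fact that `IndexFlatL2` reports squared L2 distances, so the score can be converted to a cosine similarity. The function names and the `min_cosine` threshold of 0.4 are hypothetical and would need tuning on real queries.

```python
def cosine_from_squared_l2(squared_l2: float) -> float:
    """For unit vectors, ||a - b||^2 = 2 - 2 * cos(a, b)."""
    return 1.0 - squared_l2 / 2.0


def quality_signal_from_scores(scores, min_cosine=0.4):
    """Return "OK" if the best hit clears the threshold, otherwise "LOW".

    `scores` are squared L2 distances (lower is better). `min_cosine` is an
    illustrative threshold, not a recommended value.
    """
    if not scores:
        return "LOW"
    best_cosine = max(cosine_from_squared_l2(s) for s in scores)
    return "OK" if best_cosine >= min_cosine else "LOW"


print(quality_signal_from_scores([0.8, 1.1, 1.3]))  # best cosine is about 0.6
print(quality_signal_from_scores([1.6, 1.7]))       # best cosine is about 0.2
```

In the app, the input would be `[r["score"] for r in results]`, and the returned value would replace the hardcoded `"OK"` passed to `log_event`.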




### 6.1) Generate `requirements.txt` and a starter `README.md` (Recommended)

For deployment reproducibility, create a minimal dependency list and run instructions.
If these files already exist, this cell will **not overwrite** them unless `FORCE_OVERWRITE=True`.


In [59]:
from pathlib import Path

FORCE_OVERWRITE = False

req_path = Path('requirements.txt')
readme_path = Path('README.md')

requirements_text = """streamlit
pandas
numpy
requests
python-dotenv
# Packages imported by the retrieval pipeline in this notebook:
sentence-transformers
faiss-cpu
# Add other ML/LLM packages below (examples):
# scikit-learn
# torch
# transformers
"""

readme_text = f"""# {cfg.project_title or 'CS 5588 Capstone'}: Week 4 Hands-On

**Team:** {cfg.team_name}
**Stakeholder/User:** {cfg.stakeholder}

## Deployment Link
- (Paste your deployed URL here)

## Run Locally
```bash
pip install -r requirements.txt
streamlit run app/main.py
```

## What this demo does
- Briefly describe the product workflow (2–4 bullets).

## Logs
- Logs are written to `logs/week4_events.csv`.

## Week 4 Metrics Summary
- Impact metrics:
- Technical metrics:

## Failure & Risk
- Link or summary of `reports/week4_failure_risk.md`.
"""

def write_if_missing(path: Path, content: str):
    if path.exists() and not FORCE_OVERWRITE:
        print(f"Exists (skipping): {path}")
        return
    path.write_text(content.strip() + "\n", encoding='utf-8')
    print(f"Wrote: {path}")

write_if_missing(req_path, requirements_text)
write_if_missing(readme_path, readme_text)


Exists (skipping): requirements.txt
Exists (skipping): README.md


## 7) Deployment readiness plan (Required)
In your README or `reports/`, include:
- **Deployment target:** (HF Spaces / Streamlit Cloud / Render / Railway)
- **Data handling:** what is included vs excluded from repo
- **Monitoring plan:** what you log and how you review it
- **Governance / guardrails:** what the system refuses to do, and why
- **Architecture diagram:** a simple block diagram of components and data flow

Below is a starter diagram description you can paste into your report (replace with your architecture).


In [60]:
arch_path = Path(cfg.reports_dir) / "week4_architecture_notes.md"
arch_notes = f"""# Week 4 Deployment Readiness: Architecture Notes

## Deployment target
- Hosting option:
- Public link:

## Components
- UI: Streamlit (`/app/main.py`)
- Core module: `/src/` (data + model + retrieval)
- Logs: `/logs/week4_events.csv`
- Data: `/data/` (not committed if private/large)

## Data flow
User → UI → Core module → Output + Artifacts → UI
                         ↘ log_event() → logs/week4_events.csv

## Monitoring & governance
- What gets logged:
- Refusal / safe behavior:
- What triggers an alert (failure signal):

## Scaling considerations (short)
- What becomes slow at 10× data?
- What caching/indexing would you add?
"""
arch_path.write_text(arch_notes, encoding="utf-8")
print("Wrote:", arch_path)


Wrote: reports/week4_architecture_notes.md


In [65]:
!pip install streamlit pyngrok
# Replace the placeholder with your own token; never commit a real authtoken.
!ngrok config add-authtoken YOUR_NGROK_AUTHTOKEN
!streamlit run app.py &>/dev/null &


Authtoken saved to configuration file: /root/.config/ngrok/ngrok.yml


In [66]:
from pyngrok import ngrok
public_url = ngrok.connect(8501)
print(public_url)


NgrokTunnel: "https://unshrunken-jerilyn-concussant.ngrok-free.dev" -> "http://localhost:8501"


## 8) Failure & risk analysis (Required)
Document **one realistic deployment-level failure** and how you will detect/mitigate it.

Examples:
- Wrong evidence leads to wrong user decision
- Data drift reduces model performance
- Retrieval returns irrelevant context causing hallucination
- Latency spikes make product unusable

Below is a short template you can paste into your report.


In [64]:
risk_path = Path(cfg.reports_dir) / "week4_failure_risk.md"
risk_template = f"""# Week 4 Failure & Risk Analysis

## Failure scenario (realistic)
- What fails?
- Example user input that triggers it:

## Impact (stakeholder/product)
- What wrong decision could be made?
- What harm/cost could occur?

## Detection signals (monitoring)
- Which log fields reveal this?
- Thresholds (example):

## Mitigation
- Guardrails/refusal conditions:
- Model/data fix:
- Evaluation fix:

## Post-mortem plan
- What would you change next sprint?
"""
risk_path.write_text(risk_template, encoding="utf-8")
print("Wrote:", risk_path)


Wrote: reports/week4_failure_risk.md
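
To make the "Detection signals" part of the template concrete, the sketch below checks recent log events against two thresholds: p95 latency and the share of events whose `quality_signal` is not `"OK"`. The function name, the window size, and both threshold values are illustrative assumptions; pick values that match your own system.

```python
def check_alerts(events, max_p95_latency_ms=2000.0, max_low_share=0.2, window=50):
    """Return the names of alerts triggered by the most recent `window` events.

    `events` is a list of dicts with "latency_ms" and "quality_signal" keys,
    ordered oldest to newest (for example, rows read from the log CSV).
    """
    recent = events[-window:]
    if not recent:
        return []

    latencies = sorted(float(e["latency_ms"]) for e in recent)
    # Simple index-based p95; adequate for a small monitoring window.
    p95 = latencies[min(len(latencies) - 1, int(0.95 * len(latencies)))]
    low_share = sum(e["quality_signal"] != "OK" for e in recent) / len(recent)

    alerts = []
    if p95 > max_p95_latency_ms:
        alerts.append("latency_p95")
    if low_share > max_low_share:
        alerts.append("low_quality_share")
    return alerts


# Invented events: seven fast OK queries followed by three slow LOW ones.
degraded = [{"latency_ms": 100, "quality_signal": "OK"}] * 7 + [
    {"latency_ms": 3000, "quality_signal": "LOW"}
] * 3
healthy = [{"latency_ms": 100, "quality_signal": "OK"}] * 10

print(check_alerts(degraded))
print(check_alerts(healthy))
```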


## 9) Individual reflection (Required, individual submission)
Each student submits **one paragraph** addressing:

1. What part of the capstone module is closest to production-ready?
2. What is the biggest risk to deploying this?
3. What would you build next sprint?

Paste your paragraph in the individual survey/Canvas submission.


## 10) Final deliverables (Team)
Your GitHub repo should include:
- `/app/main.py` (or equivalent app interface)
- `/src/` (pipeline modules; reuse Week 3 work)
- `/logs/week4_events.csv` (auto-created, with sample rows)
- `/reports/week4_integration_brief.md`
- `/reports/week4_architecture_notes.md`
- `/reports/week4_failure_risk.md`
- `requirements.txt`
- `README.md` with:
  - deployment link
  - run instructions
  - screenshots
  - metrics summary (impact + technical)
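
Before submitting, the list above can be checked mechanically. The following is a small sketch that reports which of the expected paths are missing. `REQUIRED_PATHS` mirrors the list, `missing_deliverables` is a hypothetical helper, and the temporary directory is used only so the demo does not depend on your repo.

```python
import tempfile
from pathlib import Path

# Paths taken from the deliverables list; adjust if your app lives elsewhere.
REQUIRED_PATHS = [
    "app/main.py",
    "src",
    "logs/week4_events.csv",
    "reports/week4_integration_brief.md",
    "reports/week4_architecture_notes.md",
    "reports/week4_failure_risk.md",
    "requirements.txt",
    "README.md",
]


def missing_deliverables(repo_root="."):
    """Return the required paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in REQUIRED_PATHS if not (root / p).exists()]


# Demo on a throwaway directory containing only two of the deliverables.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "src").mkdir()
    (Path(tmp) / "README.md").write_text("# demo\n", encoding="utf-8")
    demo_missing = missing_deliverables(tmp)

print(demo_missing)
```

From the repository root, calling `missing_deliverables()` with no arguments should return an empty list once everything is in place. It does not check the README contents (deployment link, screenshots, metrics summary), which still need a manual look.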

---

### Tip (grading-friendly)
Make sure a TA can:
1) run `pip install -r requirements.txt`  
2) run `streamlit run app/main.py`  
3) see logs populate in `logs/week4_events.csv`


## GitHub Deployment (Required Example)

### Step 1: Push your repository
```bash
git init
git add .
git commit -m "Week4 capstone app"
git branch -M main
git remote add origin https://github.com/<username>/<repo>.git
git push -u origin main
```

### Step 2: Deploy using GitHub-connected hosting
Example: Streamlit Community Cloud

1. Go to https://share.streamlit.io
2. Click **New App**
3. Select your GitHub repository
4. Branch: `main`
5. App path: `app/main.py`
6. Click **Deploy**

### Step 3: Add deployment link to README
Include the deployed URL in your repository README.
