In [7]:
import pathlib
import textwrap

REPO_NAME = "homework1" 
FOLDERS = ["data", "src", "notebooks", "docs"]

base = pathlib.Path.cwd() / REPO_NAME
base.mkdir(exist_ok=True)
for f in FOLDERS:
    (base / f).mkdir(parents=True, exist_ok=True)

README_TEMPLATE = textwrap.dedent("""
# Cross-Sell Feasibility: Auto Insurance → Medical Insurance
**Stage:** Problem Framing & Scoping (Stage 01)

## Problem Statement
The company’s existing auto-insurance portfolio contains a large, engaged customer base whose demographic and behavioral traits align closely with prospects for medical coverage. Despite strong overlap, the attach rate of medical policies among auto customers remains low. The problem is to determine whether targeted cross-selling can convert a meaningful share of this base without degrading the core auto experience. Solving this would unlock incremental premium revenue, lower acquisition costs (versus cold leads), and strengthen customer loyalty through a unified protection bundle.


## Stakeholder & User
- Executive Sponsor (Chief Growth Officer): Sets financial targets, allocates budget, and signs off on go / no-go decisions.
- Product Marketing Team: Consumes model outputs to design campaigns, creative assets, and channel mix.
- Data Science & Analytics: Owns the predictive framework, monitors performance, and schedules retraining.
- Legal & Compliance: Reviews feature sets and outreach scripts for regulatory adherence.
- Customer-Facing Teams (call center, mobile app, web): Receive real-time next-best-offer prompts to use during service interactions.

## Useful Answer & Decision
- Type: Predictive (propensity) and Prescriptive (uplift).
- Metric: Incremental lift in medical-policy conversion rate relative to random targeting.
- Decision: Which auto customers to include in cross-sell campaigns, via which channel, and with what offer.
- Artifacts:
  - Nightly customer-level propensity and uplift scores.
  - Explainability summary highlighting the top drivers behind each score.
  - Campaign playbook with recommended segments, messaging, and contact rules.

## Assumptions & Constraints
- Historical auto and medical policy data exist and are linkable at customer level.
- Marketing consent flags are available and respected.
- Model must be interpretable enough for compliance review.
- Runtime budget: nightly batch must complete within existing on-prem infrastructure.
- No use of protected health data beyond explicitly allowed variables.

## Known Unknowns / Risks
- Class imbalance: Medical conversions are rare; will test resampling and cost-sensitive methods.
- Temporal drift: Consumer attitudes toward medical coverage may shift; will track monthly.
- Channel saturation: Over-contact could hurt auto retention; will cap touches per customer.
- Regulatory changes: New privacy rules could restrict variable usage; will maintain fallback feature sets.

## Lifecycle Mapping
Goal → Stage → Deliverable
- Define opportunity → Problem Framing & Scoping (Stage 01) → This scoping document

## Repo Plan
/data/, /src/, /notebooks/, /docs/ ; cadence for updates
""")

readme_path = base / "README.md"
if not readme_path.exists():
    readme_path.write_text(README_TEMPLATE, encoding="utf-8")
readme_path.resolve()

WindowsPath('C:/Users/Yvaine/bootcamp_Rui_Han/homework/homework1/README.md')
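
The README names "incremental lift in medical-policy conversion rate relative to random targeting" as the success metric. One possible reading of that metric is sketched below: the conversion rate among the top-scored customers divided by the overall conversion rate, which is what random targeting would be expected to achieve. The function name `lift_over_random`, the `top_fraction` parameter, and the toy data are all hypothetical and not part of the homework repo. A true uplift estimate would additionally need a randomized control group, which this sketch does not model.

```python
def lift_over_random(scores, converted, top_fraction=0.1):
    """Conversion rate in the top-scored slice divided by the overall rate.

    scores:    model scores, higher means more likely to buy (assumed)
    converted: 0/1 outcomes aligned with scores
    """
    if len(scores) != len(converted) or not scores:
        raise ValueError("scores and converted must be non-empty and equal length")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")

    # Size of the targeted slice; always at least one customer.
    n_top = max(1, int(len(scores) * top_fraction))

    # Rank customers by score, highest first.
    ranked = sorted(zip(scores, converted), key=lambda pair: pair[0], reverse=True)

    top_rate = sum(outcome for _, outcome in ranked[:n_top]) / n_top
    base_rate = sum(converted) / len(converted)
    if base_rate == 0:
        raise ValueError("no conversions in the data; lift is undefined")
    return top_rate / base_rate


# Toy data: both conversions sit in the two highest-scored customers.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_converted = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(lift_over_random(toy_scores, toy_converted, top_fraction=0.2))
```

In the toy data the top 20% convert at 100% against a 20% overall rate, so the targeted slice converts about five times as often as a random one would.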

In [9]:
Stakeholder_Memo_TEMPLATE = textwrap.dedent("""
# Stakeholder Memo  
**Subject:** Cross-Sell Feasibility – Auto to Medical Insurance  

## Background  
Our book of auto-insurance customers is large, loyal, and demographically well matched to typical buyers of private medical insurance. Yet fewer than one in ten auto policyholders currently hold a medical product, suggesting substantial untapped revenue. Shifting our outreach from broad-brush campaigns to precision targeting is now a Board-level priority.

## Problem  
How can we identify which auto-insurance customers are most likely to purchase medical coverage, and through which channel, so that we can reduce acquisition cost and increase lifetime value without eroding trust or triggering regulatory scrutiny?


## Proposed Solution  
- **Descriptive:** Exploratory visuals showing overlap of auto vs. medical customer profiles by age, geography, payment behavior, and claims history.  
- **Predictive / Prescriptive:** Uplift model estimating incremental probability of medical purchase under a cross-sell treatment, with interpretable driver features.  
- **Activation Layer:** API returning nightly scores + rule-based channel recommendations (email, SMS, agent call).


## Data & Assumptions  
- Historical auto and medical policy records linkable at customer level.  
- Marketing consent flags and opt-in status available.  
- Permitted variables only (no protected health data beyond allowed attributes).  
- Scoring pipeline must complete on existing on-prem cluster (< 45 min nightly).  


## Risks & Mitigations  

| Risk | Impact | Mitigation |
|------|--------|------------|
| Class imbalance (rare conversions) | Model instability | Use uplift modeling + cost-sensitive loss; monitor weekly. |
| Temporal drift (policy changes, seasonality) | Degraded lift | Schedule monthly retrain; monitor PSI. |
| Over-contact causing auto churn | Brand damage | Cap touches per customer via governance rules. |
| Regulatory scrutiny on variable usage | Launch delay | Pre-clear feature list with Legal; maintain interpretable model. |


## Next Steps & Commitments  
- **Product Marketing:** Finalize offer variants and contact rules by 25 Aug.  
- **Compliance:** Complete variable whitelist review by 30 Aug.  
- **Data Science:** Deliver v0 descriptive insights by 1 Sep, v1 uplift API by 15 Sep.
""")

# Build the path from separate parts so the file name carries no stray leading space.
Stakeholder_Memo_path = base / "docs" / "STAKEHOLDER_MEMO.md"
if not Stakeholder_Memo_path.exists():
    Stakeholder_Memo_path.write_text(Stakeholder_Memo_TEMPLATE, encoding="utf-8")
Stakeholder_Memo_path.resolve()

WindowsPath('C:/Users/Yvaine/bootcamp_Rui_Han/homework/homework1/docs/STAKEHOLDER_MEMO.md')
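
The memo's risk table lists "monitor PSI" as the mitigation for temporal drift. A minimal sketch of a Population Stability Index check is shown below, assuming equal-width bins derived from the baseline sample and a small floor on empty bins to keep the logarithm defined. The function name `population_stability_index`, the bin count, and the floor value are illustrative assumptions rather than anything fixed by the memo.

```python
import math


def population_stability_index(expected, actual, n_bins=10, floor=1e-6):
    """PSI = sum over bins of (actual_share - expected_share) * ln(actual_share / expected_share)."""
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")

    # Bin edges come from the baseline (expected) sample.
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant baseline

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the baseline range are clamped to the edge bins.
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # Floor empty bins so the log ratio stays finite.
        return [max(c / len(values), floor) for c in counts]

    e_shares = bin_shares(expected)
    a_shares = bin_shares(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(e_shares, a_shares))


# Toy check: an identical sample gives 0, a compressed sample gives a positive value.
baseline = [i / 100 for i in range(100)]
shifted = [v * 0.5 for v in baseline]
print(population_stability_index(baseline, baseline))
print(population_stability_index(baseline, shifted))
```

Each term in the sum is non-negative, so the index is zero only when the binned shares match and grows as the two distributions diverge. The alert threshold for the monthly retrain would need to be agreed separately.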