# IBM AI Capstone (Minimal Demo)

This lightweight notebook creates a tiny synthetic dataset that mimics SpaceX Falcon 9 launches, draws a **quick EDA plot**, trains a **Logistic Regression** model, and prints its **accuracy** along with a classification report. It needs no external data files.


In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.metrics import accuracy_score, classification_report
print('Notebook ready ✔️')

In [None]:
# --- Make a tiny synthetic dataset (no downloads needed) ---
rng = np.random.default_rng(42)
n = 120
payload = rng.normal(4000, 1200, size=n).clip(800, 8000)              # kg
flight_num = rng.integers(1, 110, size=n)                             # flight #
is_reused = rng.integers(0, 2, size=n)                                # 0/1
site = rng.choice(['CCAFS','KSC','VAFB'], size=n, p=[0.5,0.35,0.15])  # launch site

# Simple rule to generate success probability
p = (0.35
     + 0.00006*payload
     + 0.002*flight_num
     + 0.15*is_reused
     + np.where(site=='KSC', 0.08, 0)
     - np.where(site=='VAFB', 0.05, 0))
p = np.clip(p, 0.05, 0.95)
success = rng.binomial(1, p)

df = pd.DataFrame({
    'payload_mass_kg': payload.round(0),
    'flight_number': flight_num,
    'is_reused': is_reused,
    'site_pad': site,
    'success': success
})
df.head()
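
To make the generating rule concrete, the sketch below evaluates it for two hypothetical launches. The helper name `success_probability` and the example inputs are illustrative assumptions, not part of the notebook; the coefficients and the clipping bounds are copied from the cell above.

```python
import numpy as np

def success_probability(payload_kg, flight_number, is_reused, site):
    """Success probability under the toy rule used to generate the data."""
    p = (0.35
         + 0.00006 * payload_kg
         + 0.002 * flight_number
         + 0.15 * is_reused
         + (0.08 if site == 'KSC' else 0.0)
         - (0.05 if site == 'VAFB' else 0.0))
    # Same clipping as the notebook: probabilities stay within [0.05, 0.95].
    return float(np.clip(p, 0.05, 0.95))

# Hypothetical launch: 4000 kg, flight 50, reused booster, from KSC.
# 0.35 + 0.24 + 0.10 + 0.15 + 0.08 = 0.92
p_ksc = success_probability(4000, 50, 1, 'KSC')

# Hypothetical launch: 1000 kg, flight 5, new booster, from VAFB.
# 0.35 + 0.06 + 0.01 + 0.00 - 0.05 = 0.37
p_vafb = success_probability(1000, 5, 0, 'VAFB')

print(f'KSC example:  {p_ksc:.2f}')
print(f'VAFB example: {p_vafb:.2f}')
```

Because the rule is linear before clipping, heavy, late, reused KSC launches all saturate at 0.95, so the model has little signal to separate them.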

In [None]:
# --- Quick EDA: success rate by site ---
rate = df.groupby('site_pad')['success'].mean().sort_values()
ax = rate.plot(kind='bar', rot=0)
ax.set_ylabel('Success Rate')
ax.set_title('Success Rate by Launch Site (toy data)')
plt.tight_layout()
plt.show()
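
A per-site success rate is easier to interpret next to the number of launches behind it, since a site with few launches gives a noisy rate. The sketch below shows one way to get both with a single `groupby`; the small `df_demo` frame is hand-written purely for illustration and is not the notebook's data.

```python
import pandas as pd

# Hypothetical miniature frame with the same column names as the notebook.
df_demo = pd.DataFrame({
    'site_pad': ['CCAFS', 'CCAFS', 'CCAFS', 'KSC', 'KSC', 'VAFB'],
    'success':  [1, 0, 1, 1, 1, 0],
})

# Named aggregation: success rate and launch count per site.
summary = df_demo.groupby('site_pad')['success'].agg(rate='mean', launches='count')
print(summary)
```

Applied to the notebook's `df`, the same call would show how many of the 120 rows fall under each site.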

In [None]:
# --- Train a tiny model ---
X = df.drop(columns=['success'])
y = df['success']
X = pd.get_dummies(X, columns=['site_pad'], drop_first=True)  # one-hot
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0, stratify=y)

pipe = Pipeline([
    ('scaler', StandardScaler(with_mean=False)),  # scale to unit variance, no centering
    ('clf', LogisticRegression(max_iter=1000))
])
pipe.fit(X_train, y_train)
pred = pipe.predict(X_test)
acc = accuracy_score(y_test, pred)
print(f'Accuracy: {acc:.3f}')
print('\nClassification Report:\n', classification_report(y_test, pred))
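
With a 25% split of 120 rows, the accuracy above is measured on only 30 launches, so it can swing noticeably with the split. One way to put it in context, sketched below, is 5-fold cross-validation compared against a majority-class baseline. The use of `DummyClassifier` and `StratifiedKFold` is an addition for illustration, not part of the original notebook; the data is rebuilt with the same rule so the block runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Rebuild the toy data with the notebook's rule so this block is standalone.
rng = np.random.default_rng(42)
n = 120
payload = rng.normal(4000, 1200, size=n).clip(800, 8000)
flight_num = rng.integers(1, 110, size=n)
is_reused = rng.integers(0, 2, size=n)
site = rng.choice(['CCAFS', 'KSC', 'VAFB'], size=n, p=[0.5, 0.35, 0.15])
p = np.clip(0.35 + 0.00006 * payload + 0.002 * flight_num + 0.15 * is_reused
            + np.where(site == 'KSC', 0.08, 0)
            - np.where(site == 'VAFB', 0.05, 0), 0.05, 0.95)
y = rng.binomial(1, p)
X = pd.get_dummies(
    pd.DataFrame({'payload_mass_kg': payload.round(0),
                  'flight_number': flight_num,
                  'is_reused': is_reused,
                  'site_pad': site}),
    columns=['site_pad'], drop_first=True)

# Same pipeline as the notebook, plus a baseline that always predicts
# the most frequent class.
model = Pipeline([
    ('scaler', StandardScaler(with_mean=False)),
    ('clf', LogisticRegression(max_iter=1000)),
])
baseline = DummyClassifier(strategy='most_frequent')

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
model_scores = cross_val_score(model, X, y, cv=cv, scoring='accuracy')
baseline_scores = cross_val_score(baseline, X, y, cv=cv, scoring='accuracy')

print(f'Logistic Regression: {model_scores.mean():.3f} +/- {model_scores.std():.3f}')
print(f'Majority baseline:   {baseline_scores.mean():.3f} +/- {baseline_scores.std():.3f}')
```

If the two means are close, the model is mostly reproducing the class imbalance of the toy data rather than learning from the features.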

### Conclusion
- We showed **a simple EDA** bar chart and trained a **Logistic Regression** model.
- This minimal setup is enough to demonstrate the full workflow for peer review. ✅