# 🏆 Caprae Capital Lead Generation Pipeline – Jupyter Walkthrough

This notebook walks through the modular lead generation pipeline step by step:

- Generate leads (simulated scraping, now with European company data)
- Enrich leads (dummy email and LinkedIn URL)
- Export as CSV

All code is production-ready and tested on Linux (VPS).


In [1]:
import sys
import os
sys.path.append(os.path.abspath('src'))

# Force reload if module is already imported (anti-cache)
if 'scraper' in sys.modules:
    del sys.modules['scraper']
if 'enrich' in sys.modules:
    del sys.modules['enrich']

from scraper import scrape_leads
from enrich import enrich_contacts
import pandas as pd


## 1. Generate Sample Leads
This simulates scraping business leads. `scrape_leads` returns a list of dicts, each with a `name` and a `website` key.
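
The internals of `scrape_leads` are not shown in this notebook. Purely as an illustration of what "simulated scraping" can mean, the sketch below returns canned records instead of making network requests. The names `scrape_leads_stub`, `_SAMPLE_LEADS`, and the `limit` parameter are hypothetical, and the three records are copied from the output of the cell below.

```python
from copy import deepcopy

# Hypothetical canned records, copied from this notebook's own output.
_SAMPLE_LEADS = [
    {"name": "Berlin Data GmbH", "website": "https://berlindata.de"},
    {"name": "Paris Analytics", "website": "https://parisanalytics.fr"},
    {"name": "Madrid Innovate", "website": "https://madridinnovate.es"},
]


def scrape_leads_stub(criteria=None, limit=None):
    """Return canned leads instead of hitting the network.

    `criteria` is accepted so the call shape matches the notebook, but it is
    ignored here: the sample records carry no industry or location fields.
    """
    leads = deepcopy(_SAMPLE_LEADS)  # callers may mutate the result safely
    return leads if limit is None else leads[:limit]


sample = scrape_leads_stub({"industry": "Any", "location": "Any"}, limit=2)
print([lead["name"] for lead in sample])
```

Returning deep copies keeps the canned data intact if a later step edits the dicts in place.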


In [2]:
# Generate dummy leads
leads = scrape_leads(criteria={"industry": "Any", "location": "Any"})
leads


[{'name': 'Berlin Data GmbH', 'website': 'https://berlindata.de'},
 {'name': 'Paris Analytics', 'website': 'https://parisanalytics.fr'},
 {'name': 'Madrid Innovate', 'website': 'https://madridinnovate.es'},
 {'name': 'Rome Digital Srl', 'website': 'https://romedigital.it'},
 {'name': 'London Cloud Ltd', 'website': 'https://londoncloud.co.uk'},
 {'name': 'Amsterdam Insights BV', 'website': 'https://amsterdaminsights.nl'},
 {'name': 'Zurich Tech AG', 'website': 'https://zurichtech.ch'},
 {'name': 'Stockholm Solutions AB',
  'website': 'https://stockholmsolutions.se'},
 {'name': 'Helsinki AI Oy', 'website': 'https://helsinkiai.fi'},
 {'name': 'Lisbon Digital', 'website': 'https://lisbondigital.pt'},
 {'name': 'Vienna Cloud GmbH', 'website': 'https://viennacloud.at'},
 {'name': 'Brussels Data SA', 'website': 'https://brusselsdata.be'},
 {'name': 'Copenhagen Software ApS',
  'website': 'https://copenhagensoftware.dk'},
 {'name': 'Prague Solutions s.r.o.', 'website': 'https://praguesolutions

## 2. Enrich Leads
Add a dummy email address and LinkedIn URL for each company. `enrich_contacts` returns the result as a pandas DataFrame.
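
The dummy values in this notebook's output look like the company name lowercased with spaces removed. A minimal sketch of that idea follows. `slugify` and `enrich_contacts_sketch` are hypothetical names, the exact rule used inside `enrich_contacts` is not shown here, and the sketch returns plain dicts rather than a DataFrame.

```python
import re


def slugify(name):
    """Lowercase the name and drop everything except ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def enrich_contacts_sketch(leads):
    """Attach a placeholder email and LinkedIn URL to each lead (assumed rule)."""
    enriched = []
    for lead in leads:
        slug = slugify(lead["name"])
        enriched.append({
            **lead,  # keep the original name and website
            "email": f"{slug}@example.com",
            "linkedin": f"https://www.linkedin.com/in/{slug}",
        })
    return enriched


rows = enrich_contacts_sketch(
    [{"name": "Berlin Data GmbH", "website": "https://berlindata.de"}]
)
print(rows[0]["email"])  # berlindatagmbh@example.com
```

Passing the returned list to `pd.DataFrame(...)` would give a table with the same four columns as the output below.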


In [3]:
df = enrich_contacts(leads)
df


| | name | website | email | linkedin |
|---|---|---|---|---|
| 0 | Berlin Data GmbH | https://berlindata.de | berlindatagmbh@example.com | https://www.linkedin.com/in/berlindatagmbh |
| 1 | Paris Analytics | https://parisanalytics.fr | parisanalytics@example.com | https://www.linkedin.com/in/parisanalytics |
| 2 | Madrid Innovate | https://madridinnovate.es | madridinnovate@example.com | https://www.linkedin.com/in/madridinnovate |
| 3 | Rome Digital Srl | https://romedigital.it | romedigitalsrl@example.com | https://www.linkedin.com/in/romedigitalsrl |
| 4 | London Cloud Ltd | https://londoncloud.co.uk | londoncloudltd@example.com | https://www.linkedin.com/in/londoncloudltd |
| 5 | Amsterdam Insights BV | https://amsterdaminsights.nl | amsterdaminsightsbv@example.com | https://www.linkedin.com/in/amsterdaminsightsbv |
| 6 | Zurich Tech AG | https://zurichtech.ch | zurichtechag@example.com | https://www.linkedin.com/in/zurichtechag |
| 7 | Stockholm Solutions AB | https://stockholmsolutions.se | stockholmsolutionsab@example.com | https://www.linkedin.com/in/stockholmsolutionsab |
| 8 | Helsinki AI Oy | https://helsinkiai.fi | helsinkiaioy@example.com | https://www.linkedin.com/in/helsinkiaioy |
| 9 | Lisbon Digital | https://lisbondigital.pt | lisbondigital@example.com | https://www.linkedin.com/in/lisbondigital |


## 3. Export as CSV
Save the enriched leads for CRM or sales workflow integration.


In [4]:
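# Create the output directory first; to_csv does not create missing folders
os.makedirs('data', exist_ok=True)
df.to_csv('data/leads_from_notebook.csv', index=False)
print("CSV saved as data/leads_from_notebook.csv")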


CSV saved as data/leads_from_notebook.csv


## ✅ Recap & Next Steps

- The pipeline is modular: real scraping or enrichment can be swapped in at any time.
- All steps run in a cloud/Linux VPS environment.
- It can be extended with more complex enrichment, deduplication, or CRM integration.
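
As one example of the deduplication extension mentioned above, leads could be deduplicated by website host before enrichment. This is only a sketch: `website_key` and `dedupe_leads` are hypothetical helpers, not part of the pipeline, and they assume every `website` value includes a scheme such as `https://`, as in the sample data.

```python
from urllib.parse import urlsplit


def website_key(url):
    """Reduce a URL to a comparable host: lowercase, without a leading 'www.'."""
    host = urlsplit(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def dedupe_leads(leads):
    """Keep the first lead seen for each normalized website host."""
    seen = set()
    unique = []
    for lead in leads:
        key = website_key(lead["website"])
        if key not in seen:
            seen.add(key)
            unique.append(lead)
    return unique


leads_with_dupe = [
    {"name": "Berlin Data GmbH", "website": "https://berlindata.de"},
    {"name": "Berlin Data", "website": "https://www.BerlinData.de/about"},
    {"name": "Paris Analytics", "website": "https://parisanalytics.fr"},
]
print([lead["name"] for lead in dedupe_leads(leads_with_dupe)])
```

Keying on the host rather than the company name avoids treating spelling variants of the same company as separate leads, at the cost of merging distinct companies that share a domain.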

*Developed by Rafif Sudanta for Caprae Capital Prework Technical Challenge.*
