Reads France's official business registry (BODACC) every morning at 6am, finds newly opened restaurants, scores them on 12 criteria, and generates a personalized 5-touch email sequence using Claude.
Built as a school project (L3 AI student, ESGI Paris), it ended up becoming a real working system.
First real test in progress — results coming soon.
A B2B sales team spends 2–4 hours per prospect on manual research: searching for the restaurant's POS system, scrolling Google Maps reviews, guessing pain points, writing an email that doesn't sound like a template.
At 20 prospects per week, that's 40–80 hours of manual work — just for the first touch.
Most prospecting tools find contacts. This one uses BODACC — France's official business registry — as a buying signal.
Newly opened restaurants share 3 traits:
- They haven't signed with a POS vendor yet
- The founder is still personally making decisions
- They're in the first 30–60 days (peak buying window)
BODACC publishes every French business registration, every day, for free. Nobody was using it for outbound prospecting.
The same signal logic applies to B2B SaaS targets: a company posting "Chef de projet IA" on LinkedIn is in the same buying window as a restaurant that just opened.
Score : 65/100 | Statut : prospect_tiede_a_qualifier
Vocabulaire miroir : "restauration rapide", "halal", "click & collect"
Pain points détectés :
- 3 tablettes séparées (Uber Eats, Deliveroo, Just Eat) à gérer en rush
- Pas d'intégration unifiée salle + livraison + bornes
- Aucun pilotage temps réel sur les 2 points de vente
Email objet : "Vos 3 tablettes Uber Eats, Deliveroo, Just Eat sur un seul écran ?"
Generated email (actual output, not a mock):
Bonjour,
Je vois que O'Tacos Porte Dorée et Porte de Vincennes gèrent les commandes Uber Eats, Deliveroo et Just Eat en parallèle — c'est 3 tablettes à surveiller en plein rush.
Notre solution unifie toutes vos commandes livraison, salle et click & collect dans une seule interface. Les restaurants fast-food qui ont franchi le pas réduisent typiquement les erreurs de commande et gagnent du temps sur chaque service.
Est-ce qu'un échange de 15 minutes cette semaine vous conviendrait ?
No human wrote this. It took ~25 seconds.
All Claude prompts enforce strict data discipline:
- ⛔ No invented restaurant names or client references
- ⛔ No fabricated stats or figures not present in the scraped data
- ⛔ No fictional "nearby competitor" or "same-city customer" stories
- ✅ Social proof uses generic segment formulations ("restaurants in this segment typically…")
- ✅ Stats are either sourced from the restaurant's own data or explicitly hedged ("typically", "in general")
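The rules above are described as prompt constraints. As a complement, a post-generation check could flag figures that appear in an email but not in the scraped data. This is a minimal sketch only: the function name `unsupported_figures`, its `allowed` parameter, and the sample strings are hypothetical, and nothing in this README states that the pipeline performs such a check.

```python
import re

# Matches integers and simple decimals such as "3", "15" or "2,5".
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)?")

def unsupported_figures(email_text, source_text, allowed=()):
    """Return figures found in the email but not in the source data or allow-list."""
    known = set(NUMBER_RE.findall(source_text)) | set(allowed)
    return sorted(set(NUMBER_RE.findall(email_text)) - known)

# Illustrative inputs (not real pipeline data).
source = "3 tablettes: Uber Eats, Deliveroo, Just Eat; 2 points de vente"
email = "3 tablettes à surveiller. Un échange de 15 minutes ? 40% d'erreurs en moins."

# "15" is allowed because it is a meeting length, not a claimed statistic.
flagged = unsupported_figures(email, source, allowed={"15"})
```

Any non-empty result would mean the email cites a number that cannot be traced back to the source data.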
Leads scored < 40/100 get no email sequence generated — only a "cold lead" flag.
Every morning at 6am, the pipeline reads BODACC for:
- New restaurant openings (`Créations`)
- Ownership transfers (`Cessions`)
- Active hiring signals (growth indicator)
12-criteria scoring with dynamic decrements:
| Signal | Points |
|---|---|
| BODACC signal < 30 days | +35 |
| BODACC signal 30–60 days | +20 |
| No competitor POS detected (greenfield) | +25 |
| Delivery without unified integration | +20 |
| Manager identified on LinkedIn (high confidence) | +15 |
| Fast-growth sector (fast food, Asian, halal) | +10 |
| Main establishment (not a branch) | +10 |
| Multi-location detected | +10 |
| Active hiring signal (expansion) | +10 |
| SMP — external constraint (opening < 30 days) | +25 |
| Inactive 30+ days | -20 |
| Email bounce | -50 |
Leads below 40/100 are flagged as cold — no sequence generated, no send button shown.
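A minimal sketch of table-driven scoring, using the weights from the table above. The signal key names are hypothetical, and clamping the total to 0..100 is an assumption (the positive weights sum to more than 100, and scores are reported out of 100); the actual scoring inside `analyze_restaurant()` may work differently.

```python
# Weights copied from the scoring table; key names are hypothetical.
WEIGHTS = {
    "signal_under_30_days": 35,
    "signal_30_to_60_days": 20,
    "greenfield_no_pos": 25,
    "delivery_not_unified": 20,
    "manager_high_confidence": 15,
    "fast_growth_sector": 10,
    "main_establishment": 10,
    "multi_location": 10,
    "hiring_signal": 10,
    "smp_external_constraint": 25,
    "inactive_30_days": -20,
    "email_bounce": -50,
}
COLD_THRESHOLD = 40  # below this, no email sequence is generated

def score_lead(signals):
    """Sum the weights of the active signals, clamped to 0..100 (assumed)."""
    raw = sum(WEIGHTS[name] for name in signals)
    return max(0, min(100, raw))

def is_cold(score):
    """True when the lead falls below the cold-lead threshold."""
    return score < COLD_THRESHOLD

warm = score_lead({"signal_30_to_60_days", "greenfield_no_pos", "delivery_not_unified"})
bounced = score_lead({"signal_30_to_60_days", "email_bounce"})
```

Keeping the weights in one dict makes the decrements (`inactive_30_days`, `email_bounce`) ordinary entries rather than special cases.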
| Touch | Goal | Format |
|---|---|---|
| J+0 | First contact anchored on BODACC signal | 150 words |
| J+3 | Social proof — segment-level (no invented names) | 60 words |
| J+7 | ROI + ADERA (pre-answering the likely objection) | 100 words |
| J+14 | Call request | 3 lines |
| J+30 | Reactivation with fresh market data | 80 words |
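The cadence in the table (J+N meaning N days after the first email) can be turned into concrete send dates. This is a sketch under assumptions: `build_schedule` and the goal labels are hypothetical names, and the README does not say how the pipeline stores or schedules touches.

```python
from datetime import date, timedelta

# (day offset, goal label, length limit) taken from the table above.
TOUCHES = [
    (0, "first_contact", "150 words"),
    (3, "social_proof", "60 words"),
    (7, "roi_objection", "100 words"),
    (14, "call_request", "3 lines"),
    (30, "reactivation", "80 words"),
]

def build_schedule(first_send):
    """Return one entry per touch with its calendar send date."""
    return [
        {"send_on": first_send + timedelta(days=offset), "goal": goal, "limit": limit}
        for offset, goal, limit in TOUCHES
    ]

schedule = build_schedule(date(2025, 3, 3))
```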
The system uses mirror language: it detects the exact words the restaurant uses to describe itself and reintegrates them naturally. The prospect reads their own language — they feel understood, not spammed.
- Scrapes the restaurant website (homepage + `/contact`) for email + phone via regex
- Falls back to TripAdvisor via Exa semantic search
- Falls back to a Claude-estimated email format
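The fallback chain above can be sketched as an ordered list of sources, stopping at the first one that yields a contact. The regexes, the `first_contact` helper, and the idea of passing each source as a callable returning text are assumptions for illustration; the real `scrape_contact_from_website()` and `scrape_tripadvisor_contact()` may differ, and the Claude-estimated fallback is left out.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# French numbers: 0X XX XX XX XX or +33 X XX XX XX XX, separators optional.
PHONE_RE = re.compile(r"(?:\+33[\s.-]?|0)[1-9](?:[\s.-]?\d{2}){4}")

def extract_contact(text):
    """Return the first email and phone number found in text, or None for each."""
    email = EMAIL_RE.search(text)
    phone = PHONE_RE.search(text)
    return {
        "email": email.group(0) if email else None,
        "phone": phone.group(0) if phone else None,
    }

def first_contact(sources):
    """Try each (label, fetch) pair in order; stop at the first hit."""
    for label, fetch in sources:
        contact = extract_contact(fetch())
        if contact["email"] or contact["phone"]:
            return {**contact, "source": label}
    return {"email": None, "phone": None, "source": None}

# Stand-in fetchers; real ones would perform HTTP requests or Exa searches.
result = first_contact([
    ("website", lambda: "<html>no contact details here</html>"),
    ("tripadvisor", lambda: "Call +33 1 23 45 67 89 to book"),
])
```

Recording the `source` label keeps it visible which fallback produced the contact.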
LinkedIn enrichment via Exa returns a confidence score:
- High — LinkedIn profile title or URL matches the restaurant name/city → used in email
- Low — profile is ambiguous (wrong company, different city) → email uses generic opener, UI shows ⚠ badge
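One way to approximate the high/low split is a normalized substring match between the restaurant name and the profile title or URL. This sketch assumes a name match alone is sufficient and ignores the city; `match_confidence` and `normalize` are hypothetical names, and the rule used by `enrich_gerant()` may be stricter.

```python
import re
import unicodedata

def normalize(text):
    """Lowercase, strip accents, and collapse punctuation to single spaces."""
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", " ", ascii_only.lower()).strip()

def match_confidence(profile_title, profile_url, restaurant_name):
    """Return 'high' if the restaurant name appears in the title or URL, else 'low'."""
    haystack = normalize(f"{profile_title} {profile_url}")
    name = normalize(restaurant_name)
    return "high" if name and name in haystack else "low"

# Illustrative profiles (example.com URLs, not real LinkedIn data).
good = match_confidence("Gérant chez O'Tacos Porte Dorée", "https://example.com/in/profile-123", "O'Tacos")
bad = match_confidence("Manager at Burger Place", "https://example.com/in/profile-456", "O'Tacos")
```

A low result would then route the email to the generic opener and show the ⚠ badge described above.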
- 6am cron job — detects new signals, analyzes leads, sends digest email to sales team
- Streamlit dashboard: the "Today" tab shows the Top 5 leads by ITO (Optimal Timing Index) score, with one-click email send
- Notion CRM — automatic Kanban sync on every status change
Targets newly opened restaurants via France's official business registry. Scoring based on 12 criteria including signal recency, POS competitor detection, LinkedIn manager identification, and expansion signals.
Targets B2B SaaS companies in the same buying window: actively hiring an "AI project manager" = they have budget and a defined use case.
| Signal | Points |
|---|---|
| Open "Chef de projet IA" job posting | +35 |
| AI tools confirmed in stack (Claude, Dust, GPT) | +25 |
| Recent growth signal (funding, expansion) | +20 |
| Decision-maker identified (CEO, CTO, Head of Sales) | +15 |
| Pain point documented publicly (blog, job post, interview) | +5 |
Priority targets identified: Amenitiz, Skello, Inpulse, Fullsoon, Combo, Zelty, L'Addition.
BODACC API
│
▼
pipeline_signals.py ──► signal list (new openings, transfers, hiring)
│
▼
pipeline.py
├── enrich_gerant() # LinkedIn enrichment via Exa + confidence scoring
├── scrape_contact_from_website() # free email + phone extraction
├── scrape_tripadvisor_contact() # phone fallback via Exa
└── analyze_restaurant() # Claude: score + 5-touch email sequence (score ≥ 40 only)
│
├── outputs/*.json
│ │
│ └── Notion Kanban (auto-sync)
│
└── daily_run.py
├── Top 5 ITO ranking (Optimal Timing Index)
├── IRP alerts (leads at risk of competitor signing)
└── Gmail digest at 6am
│
▼
streamlit_app.py ──► dashboard (Détecter / Agir / Suivre / Analyse manuelle / Contexte)
| | Manual | This pipeline |
|---|---|---|
| Time per prospect | 2–4 hours | ~25 seconds |
| Cost per prospect | €40–80 (at €20/h) | ~€0.05 |
| Email personalization | Human-written | Mirror vocabulary (automated) |
| CRM update | Manual | Automatic |
| 50 prospects/week | 100–200 hours | ~20 minutes |
Apollo.io charges €99/month. Clay charges €149/month. Neither writes the email.
| Component | Tool |
|---|---|
| LLM | Claude Sonnet 4.6 (Anthropic) — Haiku in demo mode |
| Semantic search + LinkedIn | Exa |
| Signal source | BODACC (French public registry) |
| Web scraping | requests + regex |
| Dashboard | Streamlit |
| CRM | Notion API |
| Email | Gmail SMTP |
| Scheduling | Windows Task Scheduler |
git clone https://github.com/AkmaDev/prospection_pipeline
cd prospection_pipeline
pip install -r requirements.txt

Copy .env.example → .env:
ANTHROPIC_API_KEY=sk-ant-... # required
# Enrichment (optional but recommended)
EXA_API_KEY=... # LinkedIn + TripAdvisor enrichment
# CRM sync (optional)
NOTION_API_KEY=...
NOTION_DATABASE_ID=...
# Digest email (optional)
DIGEST_EMAIL_FROM=you@gmail.com
DIGEST_EMAIL_TO=you@gmail.com
DIGEST_EMAIL_PASSWORD=xxxx-xxxx # Gmail app password
# Customization
COMPANY_NAME=Your Company
COMPANY_CONTEXT=Your product description
SALES_REP_NAME=Your Name
# Demo mode (limits to 3 analyses, uses Haiku)
DEMO_MODE=false
DEMO_LIMIT=3

# Schedule daily run at 6am (Windows, run as admin)
setup_cron_windows.bat
# Launch dashboard
streamlit run streamlit_app.py
# Or run manually
python daily_run.py --days 7 --limit 20
# Analyze a single restaurant
python pipeline.py "O'Tacos" "Paris 12"
# Setup Notion Kanban (first time)
python notion_kanban.py --setup

Built by Manassé Akpovi (L3 AI, ESGI Paris)
