# 💹🚀 Building a Stock Analyzer with WaterCrawl × Claude 3.7 × E2B × Tavily

Welcome to this **step-by-step Jupyter Notebook**, which builds a stock-analysis pipeline from four AI and scraping tools:

1. 🕷️ **WaterCrawl** – Lightning-fast, customizable web crawling for up-to-date stock info
2. 🤖 **Claude 3.7** – Expert-level financial analysis and crisp, structured insights
3. 📊 **E2B Code Interpreter** – A sandboxed Python runtime for running code and rendering charts
4. 🔎 **Tavily Search API** – Find just the right stock pages (so you're always using the freshest data!)

---

### What will you learn? 👨‍🎓👩‍💻

- How to set up WaterCrawl + API keys in minutes
- 🕸️ Crawl and extract financial data, zero HTML hassle
- 📈 Supercharge stock insights using Claude 3.7
- 📊 Generate pretty Python charts from your AI analysis
- 💡 Pro tips for reliable, robust data scraping and analytics

---

#### 🔑 **What do you need?**

| Service       | Purpose                   | Where to get a key                                            |
|---------------|---------------------------|---------------------------------------------------------------|
| WaterCrawl    | Crawl anything            | [Get API key](https://app.watercrawl.dev/dashboard/api-keys)  |
| Anthropic     | Power Claude 3.7          | [Get API key](https://console.anthropic.com/settings/keys)    |
| E2B           | In-notebook code & plots  | [Get API key](https://app.e2b.dev/)                           |
| Tavily Search | Find relevant URLs fast   | [Get API key](https://app.tavily.com/home)                    |

Let's begin! 🎉

---
## 1. 🚦 Setup and Installation

> **Install all dependencies in one go!**

In [ ]:
!pip install watercrawl-py anthropic e2b-code-interpreter python-dotenv matplotlib requests

---
## 2. 📦 Import Required Libraries

> Bring in the packages you'll use for everything from crawling to charting.

In [ ]:
# Standard library
import base64
import json
import os

# Third-party
import anthropic
import matplotlib.pyplot as plt
import requests
from dotenv import load_dotenv
from e2b_code_interpreter import Sandbox
from IPython.display import Image, display
from watercrawl import WaterCrawlAPIClient

---
## 3. 🔑 Load Environment Variables

> **Never hardcode your API keys!** Store them safely in a `.env` file for peace of mind 💼🔒

A sample `.env` should look like:
```env
WATERCRAWL_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
E2B_API_KEY=e2b_...
TAVILY_API_KEY=tvly-...
```


In [ ]:
# Load environment variables from .env
load_dotenv()

# Retrieve API keys
watercrawl_api_key = os.getenv("WATERCRAWL_API_KEY")
anthropic_api_key = os.getenv("ANTHROPIC_API_KEY")
e2b_api_key = os.getenv("E2B_API_KEY")
tavily_api_key = os.getenv("TAVILY_API_KEY")

# Initialize clients
watercrawl_client = WaterCrawlAPIClient(api_key=watercrawl_api_key)
claude_client = anthropic.Anthropic(api_key=anthropic_api_key)
sandbox = Sandbox(api_key=e2b_api_key)

print("✅ All clients initialized!")
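
`os.getenv` returns `None` for any variable that is missing, so a typo in `.env` only surfaces later as a confusing client error. The sketch below checks all four keys up front. The names `REQUIRED_KEYS`, `find_missing_keys`, and `require_keys` are illustrative helpers, not part of any of the libraries used here.

```python
import os

# Variable names taken from the sample .env above
REQUIRED_KEYS = (
    "WATERCRAWL_API_KEY",
    "ANTHROPIC_API_KEY",
    "E2B_API_KEY",
    "TAVILY_API_KEY",
)


def find_missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required variable names that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not (env.get(name) or "").strip()]


def require_keys(env=None):
    """Raise one clear error listing every missing key."""
    missing = find_missing_keys(env)
    if missing:
        raise RuntimeError("Missing API keys: " + ", ".join(missing))
```

Calling `require_keys()` right after `load_dotenv()` would stop the notebook before any client is created with an empty key.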

---
## 4. 🛠️ Define Helper Functions

> These functions stitch together the search, crawl, and analysis magic! 🪄

In [ ]:
def find_relevant_stock_pages(stock_symbol, base_url):
    """
    🔎 Find relevant stock pages using Tavily Search API
    """
    # Tavily's include_domains takes domain names, so strip the scheme and trailing slash
    domain = base_url.split("://")[-1].strip("/")
    print(f"Searching for 📈 stock: {stock_symbol} on {domain}")
    try:
        url = "https://api.tavily.com/search"
        payload = {
            "query": f"{stock_symbol} stock analysis site:{domain}",
            "topic": "finance",
            "search_depth": "basic",
            "max_results": 5,
            "include_domains": [domain],
            "include_answer": False
        }
        headers = {
            "Authorization": f"Bearer {tavily_api_key}",
            "Content-Type": "application/json"
        }
        # A timeout keeps a stalled request from hanging the notebook
        response = requests.post(url, json=payload, headers=headers, timeout=30)
        response.raise_for_status()
        results = response.json()
        urls = [item.get("url") for item in results.get("results", []) if item.get("url")]
        print(f"🔗 Found {len(urls)} relevant pages!")
        return urls
    except Exception as e:
        print(f"⚠️ Error finding stock pages: {str(e)}")
        return []
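
Search results can include duplicates or pages from other hosts, so it may be worth filtering the URL list before scraping. This is a minimal sketch using only the standard library. `to_domain` and `keep_urls_on_domain` are hypothetical helper names, and the subdomain rule is an assumption about what counts as "the same site".

```python
from urllib.parse import urlparse


def to_domain(base_url):
    """Reduce 'https://finance.yahoo.com/' to 'finance.yahoo.com'."""
    # urlparse only fills netloc when a scheme or leading '//' is present
    parsed = urlparse(base_url if "://" in base_url else "//" + base_url)
    return parsed.netloc.lower()


def keep_urls_on_domain(urls, base_url):
    """Keep unique URLs hosted on the domain or one of its subdomains, in order."""
    domain = to_domain(base_url)
    seen, kept = set(), []
    for url in urls:
        host = urlparse(url).netloc.lower()
        on_domain = host == domain or host.endswith("." + domain)
        if on_domain and url not in seen:
            seen.add(url)
            kept.append(url)
    return kept
```

Applied to the list returned by `find_relevant_stock_pages`, this would drop off-site links before any scrape credits are spent.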

In [ ]:
def analyze_stock_data(stock_pages, client):
    """
    🤖 Analyze stock data using Claude 3.7
    """
    if not stock_pages:
        print("❌ No pages provided for analysis!")
        return None
    try:
        stock_contents = []
        for page_url in stock_pages[:5]:  # Only the top 5
            try:
                scrape_result = watercrawl_client.scrape_url(
                    url=page_url,
                    page_options={
                        "exclude_tags": ["nav", "footer"],
                        "include_tags": ["article", "main"],
                        "wait_time": 1500,
                        "include_html": False,
                        "only_main_content": True
                    }
                )
                content = scrape_result.get('content', '')
                if content:  # Skip pages that came back empty
                    stock_contents.append({
                        'url': page_url,
                        'content': content
                    })
            except Exception as e:
                print(f"⚠️ Error scraping {page_url}: {str(e)}")
        
        if not stock_contents:
            print("😢 No content scraped from pages!")
            return None
        
        analyze_prompt = """
You are a financial analyst. Using the stock information below, return a single JSON object with exactly these keys:
- company_overview
- financial_health
- growth_potential
- risk_factors
- investment_score (a number from 0 to 100)

Stock Information:
"""
        for stock in stock_contents:
            analyze_prompt += f"\nURL: {stock['url']}\nContent: {stock['content']}\n"
        
        # Claude 3.7 call
        completion = client.messages.create(
            model="claude-3-7-sonnet-20250219",
            max_tokens=2000,  # Leave room for a complete JSON object; truncated output will not parse
            temperature=0,
            system="You are a financial analyst. Provide detailed, accurate analysis in valid JSON format.",
            messages=[{"role": "user", "content": analyze_prompt}]
        )
        response_text = completion.content[0].text

        try:
            json_start = response_text.find('{')
            json_end = response_text.rfind('}') + 1
            if json_start != -1 and json_end != 0:
                json_str = response_text[json_start:json_end]
                return json.loads(json_str)
            else:
                print('⚠️ Claude did not return valid JSON. Output:')
                print(response_text)
                return None
        except json.JSONDecodeError as e:
            print(f'⚠️ JSON parsing error: {str(e)}')
            print('Full response:', response_text)
            return None
    except Exception as e:
        print(f'⚠️ Error analyzing stock data: {str(e)}')
        return None
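
Slicing from the first `{` to the last `}` fails when the model adds trailing text that itself contains braces. One alternative, sketched here with the standard library's `json.JSONDecoder.raw_decode`, is to try decoding at each `{` until one attempt yields an object. `extract_json_object` is an illustrative name, not something the notebook defines.

```python
import json


def extract_json_object(text):
    """Return the first parseable JSON object found in `text`, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores what follows
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        start = text.find("{", start + 1)
    return None
```

This also copes with replies wrapped in a Markdown code fence, since everything outside the decoded object is skipped.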

In [ ]:
def visualize_analysis(analysis_result):
    """
    📊 Visualize the stock analysis results with a beautiful chart!
    """
    try:
        plt.figure(figsize=(8, 5))
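        # Cast in case the model returned the score as a string
        plt.bar(['Investment Score'], [float(analysis_result['investment_score'])], color='skyblue')
        plt.ylim(0, 100)
        # Plain-text title: matplotlib's default font has no emoji glyphs
        plt.title('Stock Investment Analysis Score')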
        plt.ylabel('Score (higher = better)')
        plt.tight_layout()
        plt.savefig('stock_analysis.png')
        plt.close()
        # Show inline in notebook
        display(Image('stock_analysis.png'))
        # print() output is not rendered as Markdown, so use plain labels
        print("\n🔍 Detailed Insights")
        print(f"- Company Overview: {analysis_result['company_overview']}")
        print(f"- Financial Health: {analysis_result['financial_health']}")
        print(f"- Growth Potential: {analysis_result['growth_potential']}")
        print(f"- Risk Factors: {analysis_result['risk_factors']}")
        print(f"- Investment Score: {analysis_result['investment_score']}/100")
    except Exception as e:
        print(f'⚠️ Visualization error: {str(e)}')
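
`visualize_analysis` draws the chart with the local matplotlib install, while the `sandbox` created earlier is never used. The following is a rough sketch of rendering the same chart inside the E2B sandbox instead. It assumes the SDK's `run_code` method returns an execution whose `results` items expose a base64-encoded `png` attribute, and that matplotlib is available in the sandbox image. Both should be checked against the installed `e2b-code-interpreter` version. `build_chart_code` and `render_chart_in_sandbox` are hypothetical names.

```python
import base64


def build_chart_code(score):
    """Return Python source that draws the score bar chart."""
    return (
        "import matplotlib.pyplot as plt\n"
        "plt.figure(figsize=(8, 5))\n"
        f"plt.bar(['Investment Score'], [{float(score)!r}], color='skyblue')\n"
        "plt.ylim(0, 100)\n"
        "plt.title('Stock Investment Analysis Score')\n"
        "plt.ylabel('Score (higher = better)')\n"
        "plt.show()\n"
    )


def render_chart_in_sandbox(sandbox, score):
    """Run the chart code remotely; return PNG bytes, or None if no image came back."""
    execution = sandbox.run_code(build_chart_code(score))  # assumed SDK method
    for result in execution.results:
        png = getattr(result, "png", None)  # assumed to be a base64 string
        if png:
            return base64.b64decode(png)
    return None
```

In the notebook, the returned bytes could be shown with `display(Image(data=png_bytes))`.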

---
## 5. 🏁 Main Analysis Function

> The "easy button" for stock analysis: runs search, scrape, AI, and chart in one call!

In [ ]:
def analyze_stock(stock_symbol):
    """
    🏢🔍 Analyze any stock and visualize the result!
    """
    base_url = "https://finance.yahoo.com"  # Trusted finance portal
    print(f'\n=== 🏦 Starting analysis for: {stock_symbol} ===')
    stock_pages = find_relevant_stock_pages(stock_symbol, base_url)
    if not stock_pages:
        print('❌ Could not find relevant stock pages')
        return
    analysis_result = analyze_stock_data(stock_pages, claude_client)
    if analysis_result:
        visualize_analysis(analysis_result)
    else:
        print('❌ Failed to analyze stock data.')
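
`analyze_stock` passes whatever Claude returned straight to `visualize_analysis`, so a missing key or an out-of-range score only shows up as a plotting error. A small validation step between the two calls could look like the sketch below. `REQUIRED_FIELDS` mirrors the keys requested in the prompt, and `normalize_analysis` is an illustrative helper, not part of the notebook.

```python
# Keys requested from the model in the analysis prompt
REQUIRED_FIELDS = (
    "company_overview",
    "financial_health",
    "growth_potential",
    "risk_factors",
    "investment_score",
)


def normalize_analysis(result):
    """Return a cleaned copy of `result`, or raise ValueError if it is unusable."""
    if not isinstance(result, dict):
        raise ValueError("analysis result must be a JSON object")
    missing = [field for field in REQUIRED_FIELDS if field not in result]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    try:
        score = float(result["investment_score"])
    except (TypeError, ValueError):
        raise ValueError("investment_score is not numeric") from None
    cleaned = dict(result)
    # Clamp to the 0-100 range used by the chart's y-axis
    cleaned["investment_score"] = min(100.0, max(0.0, score))
    return cleaned
```

Clamping rather than rejecting an out-of-range score is a design choice. Raising instead would be equally reasonable if a bad score should stop the run.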

---
## 6. 🎬 Run the Analysis!

Ready to test? Just type a ticker (e.g. `AAPL`, `TSLA`, `GOOG`, `NFLX`) and watch the pipeline go! 🚦

In [ ]:
analyze_stock('AAPL')   # 🍏 Feel free to try other symbols here!

---
## 7. 💡 Best Practices & Tips

✔️ **Rate Limiting**: Don't hammer target sites! Add a short sleep between requests on larger scrapes, and back off when a request fails.

✔️ **Error Handling**: Always check responses, catch exceptions, and log unexpected content.

✔️ **Data Quality**: Validate what you scrape, and if investing, always double-check data from multiple sources.

✔️ **Performance**: For mass crawling, consider async techniques or chunked batches.
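
The rate-limiting and error-handling tips above can be combined into one small wrapper. This is a sketch, not a tuned implementation: `call_with_backoff` and `scrape_politely` are hypothetical names, the delays are arbitrary defaults, and `scrape_fn` stands in for whatever function fetches a single URL.

```python
import time


def call_with_backoff(fn, *args, retries=3, base_delay=1.0, sleep=time.sleep, **kwargs):
    """Call fn; after a failure wait base_delay * 2**attempt and retry, up to `retries` times."""
    for attempt in range(retries + 1):
        try:
            return fn(*args, **kwargs)
        except Exception:
            if attempt == retries:
                raise  # Out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)


def scrape_politely(urls, scrape_fn, delay=1.0, sleep=time.sleep):
    """Scrape URLs one at a time, pausing `delay` seconds between requests."""
    results = {}
    for index, url in enumerate(urls):
        if index:
            sleep(delay)
        results[url] = call_with_backoff(scrape_fn, url, sleep=sleep)
    return results
```

The `sleep` parameter exists so the waiting can be replaced in tests. In normal use the default `time.sleep` applies.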

_**Note:** This demo is for educational purposes. Do not use AI output as investment advice!_


---
## 8. 🌱 Next Steps & Remix Ideas

🎯 Try analyzing more tickers or sectors

📅 Add historical trends & time series charts

🔄 Incorporate technical indicators

🔥 Build a portfolio analyzer by looping over a stock list
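
As a starting point for the portfolio idea, the loop below ranks tickers by their investment score. It assumes an `analyze_fn(symbol)` that returns the analysis dict (or `None` on failure). The notebook's `analyze_stock` only prints and plots, so it would need a variant that returns `analysis_result`. `rank_portfolio` is a hypothetical name.

```python
def rank_portfolio(symbols, analyze_fn):
    """Analyze each symbol; return ([(symbol, score), ...] best first, [failed symbols])."""
    scored, failed = [], []
    for symbol in symbols:
        result = analyze_fn(symbol)
        try:
            scored.append((symbol, float(result["investment_score"])))
        except (TypeError, KeyError, ValueError):
            # None result, missing key, or non-numeric score
            failed.append(symbol)
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored, failed
```

Collecting failures separately keeps one bad ticker from aborting the whole run.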

_Happy Crawling & Analyzing!_
