<a href="https://colab.research.google.com/github/trancethehuman/ai-workshop-code/blob/main/Baby_step_scrape_your_first_website_and_pipe_to_an_LLM.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Install Firecrawl's Python package.

In [None]:
%pip install firecrawl-py -q

Set up Firecrawl's Python client.

In [None]:
import getpass

FIRECRAWL_API_KEY = getpass.getpass("Firecrawl API Key: ")

In [None]:
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key=FIRECRAWL_API_KEY)

Let's pick two competitors: Stripe and Paddle. Both process payments, but they differ in subtle ways that I don't want to spend time reading about.

The good news is that those differences show up on their pricing pages. So we're going to scrape those pages, hand them to an LLM, and ask it to compare.


Let's scrape Stripe's pricing and features page.

In [None]:
stripe = app.scrape_url(
    'https://stripe.com/en-ca/pricing',
    params={'formats': ['markdown', 'html']})
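
The cells in this notebook index the scrape result like a dict (`stripe["markdown"]`). Depending on which `firecrawl-py` release gets installed, the result may instead be an object with attributes. A minimal sketch of a helper that reads the Markdown either way is below; `get_markdown` is a hypothetical name introduced here, not part of the Firecrawl package.

```python
from typing import Any


def get_markdown(result: Any) -> str:
    """Return the Markdown from a scrape result.

    Hypothetical helper (not part of firecrawl). It assumes the result is
    either a dict with a "markdown" key or an object with a `.markdown`
    attribute.
    """
    if isinstance(result, dict):
        markdown = result.get("markdown")
    else:
        markdown = getattr(result, "markdown", None)
    if not markdown:
        # Fail early rather than sending an empty page to the LLM.
        raise ValueError("Scrape result has no markdown content")
    return markdown
```

With this in place, `get_markdown(stripe)` could stand in for `stripe["markdown"]` in the cells that follow.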

Print the Markdown that came back for Stripe and keep it in a variable for later.

In [None]:
print(stripe["markdown"])
stripe_markdown = stripe["markdown"]

Standard

Access a complete payments platform with simple, pay-as-you-go pricing. No setup fees, monthly fees, or hidden fees.

[Get started](https://dashboard.stripe.com/register)
 

2.9% + C$0.30
-------------

per successful charge for domestic cards

Custom

Design a custom package – available for businesses with large payments volume or unique business models.

[Contact sales](/en-ca/contact/sales)
 

*   IC+ pricing

*   Volume discounts

*   Multi-product discounts

*   Country-specific rates

Get [100+ features](https://stripe.com/en-ca/payments/features)
 out of the box with Stripe’s integrated per-transaction pricing.

Software and infrastructure for e-commerce, recurring billing, marketplaces, and more.

#### Cards and wallets

Integrated per-transaction pricing means no setup fees or monthly fees. The price is the same for all cards and digital wallets.

Visa Mastercard Maestro American Express Discover JCB Diners Club UnionPay Applepay Googlepay Link

**2.9% + C$0.30**

pe

Next, scrape Paddle's pricing page the same way.

In [None]:
paddle = app.scrape_url('https://www.paddle.com/pricing', params={'formats': ['markdown']})

In [None]:
print(paddle["markdown"])
paddle_markdown = paddle["markdown"]

Event

Happy Hour: Growing your consumer subscription business

[Join us live](https://www.paddle.com/events/happy-hour-2024?utm_medium=website&utm_source=website-banner&utm_campaign=webinars_fy2024_q4_core_paddle_happy_hour_webinar_series&utm_content=homepage-banner)

Our pricing

Your payments, billing, tax compliance, and more at an all-inclusive price.  
No added costs, no hidden fees.

[Interested in working with our Price Intelligently team? Get in touch.](/price-intelligently/contact-price-intelligently)

Pay-as-you-go
-------------

Global payments and billing seamlessly unified in one platform

Cross-border sales tax compliance

Protection against fraud and chargebacks

No migration fees, monthly fees, or hidden extras

[Sign up](https://login.paddle.com/signup)

5% + 50¢per Checkout transaction

Custom pricing

Tailored pricing for rapidly scaling and established large-scale businesses.

[Contact Sales](/demo)

Custom pricing to fit your business model and products

Get acces

Now let's concatenate the Markdown from both sites into a single string, wrapping each page in its own tag so the model can tell them apart.

In [None]:
context = f"""The following are pricing and features for Stripe and Paddle. \n\n<stripe>{stripe_markdown}</stripe>\n\n<paddle>{paddle_markdown}</paddle>"""
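
If you want to compare more than two companies, the same tagging idea can be wrapped in a small function. This is only a sketch: `build_context` is a hypothetical helper, and it assumes company names are single words that are safe to use as tag names.

```python
def build_context(pages: dict[str, str]) -> str:
    """Wrap each company's Markdown in a lowercase tag named after it.

    Hypothetical helper; `pages` maps a company name to its scraped Markdown.
    """
    header = "The following are pricing and features for " + ", ".join(pages) + "."
    sections = [
        f"<{name.lower()}>{markdown}</{name.lower()}>"
        for name, markdown in pages.items()
    ]
    # Blank lines between sections keep the pages visually separate.
    return header + "\n\n" + "\n\n".join(sections)
```

For this notebook the call would be `build_context({"Stripe": stripe_markdown, "Paddle": paddle_markdown})`.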

Now we set up our LLM for extraction. Let's install Groq's Python client to access Llama 3.2 11B.

In [None]:
%pip install groq -q

In [None]:
from groq import Groq
import getpass

GROQ_API_KEY = getpass.getpass("Groq API Key: ")

client = Groq(api_key=GROQ_API_KEY)

In [None]:
system_message = """
You're a helpful competitive intelligence assistant.
You're going to look at the pricing and features
of two companies and give me insights.
Your answer is always extremely short and straight to the point.
"""

In [None]:
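def get_response(query: str, context: str) -> str:
    """Ask the model a question about the scraped pricing pages."""
    completion = client.chat.completions.create(
        # Preview model IDs can be retired; swap in a currently available one if this errors.
        model="llama-3.2-11b-text-preview",
        messages=[
            {"role": "system", "content": system_message},
            # The scraped pages go first, then the actual question.
            {"role": "user", "content": context},
            {"role": "user", "content": query},
        ],
        temperature=0,  # keep answers as repeatable as possible
        max_tokens=1470,
        top_p=1,
        stream=False,
        stop=None,
    )

    return completion.choices[0].message.content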
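
It can help to look at exactly what gets sent before spending an API call. The sketch below pulls the message list out into a pure function that mirrors the structure used in `get_response`; `build_messages` is a hypothetical name, and stripping the system message's surrounding whitespace is a choice made here, not something the original cell does.

```python
def build_messages(system_message: str, context: str, query: str) -> list[dict[str, str]]:
    """Build the chat messages: system prompt, scraped context, then the question.

    Hypothetical helper that mirrors the message order used in get_response.
    """
    return [
        # Strip the leading/trailing newlines left by the triple-quoted string.
        {"role": "system", "content": system_message.strip()},
        {"role": "user", "content": context},
        {"role": "user", "content": query},
    ]
```

Printing `build_messages(system_message, context, "which one is cheaper")` shows the full payload without needing a Groq key.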

In [None]:
answer = get_response(
    query="which one is cheaper",
    context=context,
    )

print(answer)

Based on the pricing information provided, Stripe's pay-as-you-go pricing for domestic cards is 2.9% + C$0.30 per successful charge. Paddle's pricing is 5% + 50¢ per Checkout transaction. 

Stripe is cheaper for domestic card transactions.
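
To sanity-check the model's answer, here is the arithmetic on a single 100-unit charge using the two rates from the scraped pages. It's a rough sketch only: Stripe's rate comes from the Canadian page (C$), the Paddle text gives "50¢" without naming a currency, and the code treats both as the same unit. The `fee` function is a hypothetical helper.

```python
def fee(amount: float, percent: float, fixed: float) -> float:
    """Per-transaction fee: a percentage of the amount plus a fixed charge."""
    return round(amount * percent / 100 + fixed, 2)


# Rates as they appear in the scraped pricing pages above.
stripe_fee = fee(100, 2.9, 0.30)  # 2.9% + 0.30
paddle_fee = fee(100, 5, 0.50)    # 5% + 0.50

print(f"Stripe: {stripe_fee:.2f}  Paddle: {paddle_fee:.2f}")
```

On these numbers Stripe comes to 3.20 and Paddle to 5.50 per 100 charged, which matches the model's conclusion. Keep in mind that Paddle describes its price as all-inclusive (tax compliance, fraud protection), so the raw fee is not the whole comparison.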


In [None]:
print(
    get_response(
    query="what features does Paddle have that Stripe doesn't",
    context=context,
    )
)

Based on the provided information, here are some features that Paddle has that Stripe doesn't:

1. **Invoicing**: Paddle offers international invoicing without worrying about sales taxes or local regulations.
2. **Advisory services**: Paddle provides advisory services to ensure customers have performance updates, ongoing suggestions, and data insights.
3. **Implementation service**: Paddle offers bespoke implementation support from Solutions Architects to help customers navigate the setup process.
4. **Tax remittance**: Paddle collects the correct rate of tax, tracks filing deadlines, prepares records, submits sales tax returns, and makes payments to stay compliant.
5. **Upsell insights**: Paddle identifies groups of inbound subscribers from the same business domain.
6. **Customer support**: Paddle provides guidance on using Paddle, so customers can stay focused on growth.
7. **Migration service**: Paddle migrates customers without putting subscribers at risk in the process.
8. **Billi

In [None]:
print(
    get_response(
    query="what features does Stripe have that Paddle doesn't",
    context=context,
    )
)

Based on the provided information, here are some features that Stripe has that Paddle doesn't:

1. **Vault and Forward API**: Stripe offers a custom pricing plan for the Vault and Forward API, which allows tokenizing and storing card details in Stripe's PCI-compliant vault and routing payment requests to other processors. Paddle does not mention this feature.
2. **Adaptive Pricing**: Stripe includes adaptive pricing in its standard plan, which presents prices in local currencies to buyers globally to increase revenue and improve the customer experience. Paddle does not mention this feature.
3. **Post-payment invoices**: Stripe offers post-payment invoices with a 0.4% fee on transaction total and a US$2.00 cap per invoice. Paddle does not mention this feature.
4. **Platform management and risk tools**: Stripe offers platform management and risk tools as part of its custom pricing plan for large-scale businesses. Paddle does not mention this feature.
5. **Automated 1099 generation and de

In [None]:
print(
    get_response(
    query="what fraud detection services do each company have",
    context=context,
    )
)

Stripe offers several fraud detection services, including:

*   Radar: A machine learning-based fraud detection system that uses billions of data points across the Stripe network to identify and prevent fraudulent transactions.
*   Radar for Fraud Teams: A more advanced version of Radar that provides additional features and tools for professionals who need greater control and customisable settings.
*   ID document and selfie verification: A service that captures and verifies the authenticity of global ID documents, and uses biometric verification to match ID photos with selfies of the document holder.
*   ID number lookup: A service that allows users to key-in name, date of birth, and local government ID number, and validates against government and 3rd-party databases.

Paddle also offers several fraud detection services, including:

*   Fraud protection: A service that prevents card attacks, fights bad-faith chargebacks, and screens fraudsters.
*   Upsell insights: A service that iden