Unable to Scrape PDF URL #58

nagendrakumar02 · 2024-07-15T14:49:02Z

I'm unable to scrape a PDF URL using crawl4ai. The URL in question is https://www.myelectric.coop/wp-content/uploads/Electric-Vehicle-Charging-Equipment-Rebates.pdf.

Also, is there an example of using crawl4ai with Azure OpenAI?

Steps to Reproduce:

1. Attempt to scrape the PDF URL using crawl4ai.
2. Observe that the scraping process fails and returns an error.

Expected Behavior:

crawl4ai should successfully scrape the PDF URL and return its contents.

Actual Behavior:

crawl4ai fails to scrape the PDF URL and returns the error shown under Error Message.

Error Message:
""" Failed to crawl https://www.myelectric.coop/wp-content/uploads/Electric-Vehicle-Charging-Equipment-Rebates.pdf, error: can only concatenate str (not "NoneType") to str"""
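
For context, the text after "error:" is the standard Python TypeError raised when a str is concatenated with None. The sketch below reproduces that message in isolation. The helpers `join_html` and `join_html_safe` are hypothetical names used only for illustration; they are not crawl4ai code, and where the None value originates inside the library is an assumption that this issue does not confirm.

```python
from typing import Optional


def join_html(prefix: str, body: Optional[str]) -> str:
    """Hypothetical helper (not crawl4ai code): concatenates without a None check."""
    return prefix + body  # raises TypeError when body is None


def join_html_safe(prefix: str, body: Optional[str]) -> str:
    """Same hypothetical helper, treating a missing body as an empty string."""
    return prefix + (body or "")


try:
    join_html("<html>", None)
except TypeError as exc:
    # Capture the message so it can be compared with the one in the report.
    message = str(exc)
    print(message)

safe_result = join_html_safe("<html>", None)
```

If the crawler receives no HTML for a PDF response, an unguarded concatenation like the first helper would plausibly produce the reported message, while a guard like the second would avoid the exception without extracting any PDF text.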

Reproduction Code:

    from crawl4ai import WebCrawler

    def fetch_with_crawl(url):
        # Create an instance of WebCrawler
        crawler = WebCrawler()

        # Warm up the crawler (load necessary models)
        crawler.warmup()

        # Run the crawler on a URL
        result = crawler.run(url=url)

        # Print the extracted content
        # print(result.markdown)
        return result.markdown
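
As a possible stopgap until PDF URLs are handled, the caller could check whether a URL points at a PDF before passing it to the crawler. This is a minimal sketch under stated assumptions: `looks_like_pdf` and `PDF_MAGIC` are hypothetical names introduced here, not part of crawl4ai, and the check relies only on the URL path suffix and the `%PDF-` signature that PDF files start with.

```python
from urllib.parse import urlparse

# PDF files begin with this signature; used when the first bytes are available.
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(url: str, first_bytes: bytes = b"") -> bool:
    """Hypothetical pre-check: True if the URL path ends in .pdf
    or the supplied leading bytes carry the PDF signature."""
    path = urlparse(url).path.lower()
    return path.endswith(".pdf") or first_bytes.startswith(PDF_MAGIC)
```

A caller could then skip `crawler.run` for such URLs and hand them to a dedicated PDF text extractor instead; which extractor to use is left open here.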

Let me know if you'd like me to add anything else to the issue!
