insumanth/webscrapping

webscrapping

Sample repository with basic web scraping

Welcome, DSAC students.

I'm Sumanth.

You can find all the resources here.

Environment Needed

  • Python 3 🐍

  • The Python packages listed below 📦

    • pip install lxml
    • pip install Scrapy
    • pip install requests
    • pip install gTTS (Optional)
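
To confirm that the installs above worked, a small check like the one below can help. It is only a sketch: the helper name `missing_packages` is made up for this example, and it relies on the fact that Scrapy and gTTS are imported under the lowercase names `scrapy` and `gtts`.

```python
from importlib.util import find_spec

# Map each pip name to the module name used in "import ...".
# Scrapy is imported as "scrapy" and gTTS as "gtts".
REQUIRED = {"lxml": "lxml", "Scrapy": "scrapy", "requests": "requests"}
OPTIONAL = {"gTTS": "gtts"}


def missing_packages(packages):
    """Return the pip names whose module cannot be found (hypothetical helper)."""
    return [pip_name for pip_name, module in packages.items()
            if find_spec(module) is None]


if __name__ == "__main__":
    for label, group in (("required", REQUIRED), ("optional", OPTIONAL)):
        missing = missing_packages(group)
        status = "all installed" if not missing else "missing: " + ", ".join(missing)
        print(f"{label}: {status}")
```

If anything shows up as missing, rerun the matching `pip install` command from the list above.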

Why Should i use these packages?

lxml is a Pythonic, mature binding for the libxml2 and libxslt libraries. It provides safe and convenient access to these libraries using the ElementTree API.
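
As a rough illustration of what this looks like in practice, the sketch below parses a small HTML string with `lxml.html` and pulls out links with an XPath query. The HTML snippet, the `courses` id, and the function name `extract_links` are invented for this example and are not taken from the repository.

```python
from lxml import html

# Inline sample page so the example runs without a network connection.
PAGE = """
<html><body>
  <ul id="courses">
    <li><a href="/python">Python Basics</a></li>
    <li><a href="/scraping">Web Scraping</a></li>
  </ul>
</body></html>
"""


def extract_links(markup):
    """Return (text, href) pairs for every link inside the #courses list."""
    tree = html.fromstring(markup)
    return [(a.text_content().strip(), a.get("href"))
            for a in tree.xpath('//ul[@id="courses"]/li/a')]


links = extract_links(PAGE)
print(links)
```

On a real site, the markup would usually come from an HTTP response body instead of a hard-coded string.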

Scrapy is a fast high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.

Requests is a simple, yet elegant, HTTP library. Requests allows you to send HTTP/1.1 requests extremely easily. There’s no need to manually add query strings to your URLs, or to form-encode your PUT & POST data — but nowadays, just use the json method!
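
The two conveniences mentioned above, query strings and JSON bodies, can be seen without sending anything over the network by preparing a request and inspecting it. This is a sketch only; the `example.com` URLs and the payload are placeholders.

```python
import json

import requests

# Query parameters are passed as a dict; Requests encodes them into the URL.
search = requests.Request(
    "GET",
    "https://example.com/search",
    params={"q": "web scraping", "page": 2},
).prepare()
print(search.url)

# The json argument serialises the payload and sets the Content-Type header.
post = requests.Request(
    "POST",
    "https://example.com/api/notes",
    json={"title": "DSAC"},
).prepare()
print(post.headers["Content-Type"])
print(json.loads(post.body))
```

In everyday use the same arguments go straight into `requests.get(url, params=...)` or `requests.post(url, json=...)`, which prepare and send the request in one call.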

gTTS (Google Text-to-Speech) is a Python library and CLI tool to interface with Google Translate's text-to-speech API. It can write spoken mp3 data to a file, to a file-like object (bytestring) for further audio manipulation, or to stdout. It can also pre-generate Google Translate TTS request URLs to feed to an external program.
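
A minimal sketch of writing speech to an mp3 file is shown below. The function name `speak_to_file` and the default file name are made up for this example. Because gTTS is an optional package here, it is imported inside the function, and calling the function needs an internet connection since the audio comes from Google Translate.

```python
def speak_to_file(text, path="welcome.mp3", lang="en"):
    """Save `text` as spoken audio at `path` and return the path (hypothetical helper)."""
    # Imported lazily so the rest of a script still works without gTTS installed.
    from gtts import gTTS

    gTTS(text=text, lang=lang).save(path)
    return path


if __name__ == "__main__":
    # Requires network access; writes welcome.mp3 in the current directory.
    print(speak_to_file("Welcome DSAC students"))
```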
