🌐 Web Text Scraper is your go-to tool for effortlessly extracting text elements from web pages. 🧰 Customize your extraction process by selecting 📃 paragraphs, 🏷️ titles, or specific HTML tags. With robust error handling and a visually appealing display of the extracted text, it simplifies web scraping, making data gathering a breeze. 🚀
- Flexible Element Selection 🖋️
- Interactive Interface 🌐
- Real-Time Text Extraction ⏳
- Feedback Messages 📢
To run the Web Text Scraper, make sure you have the following dependencies installed:
- streamlit
- beautifulsoup4==4.11.1
- pip==23.1.2
- requests==2.28.0
Command:
👉 pip install -r requirements.txt 👈
- Run the streamlit_app.py script using the following command:
🚀 streamlit run streamlit_app.py 🚀
- Enter the URL of the web page to scrape.
- Select the elements to scrape: "Paragraphs", "Titles", "Paragraphs and Titles", "All", or "Custom".
- If choosing the "Custom" option, enable the "Custom Tag" checkbox and enter HTML tags (comma-separated).
- Click the "Scrape" button to start scraping.
- View the extracted text.
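To make the selection step concrete, here is a minimal sketch of how a chosen option or a comma-separated custom tag string might be turned into a tag list and used to pull text from a page. It is not the app's actual code: the `PRESETS` mapping, `parse_custom_tags`, and `extract_text` are hypothetical names, and the standard-library `html.parser` stands in for BeautifulSoup so the sketch runs without extra dependencies.

```python
from html.parser import HTMLParser

HEADINGS = ["h1", "h2", "h3", "h4", "h5", "h6"]

# Hypothetical mapping from UI options to HTML tags; the app's own mapping may differ.
PRESETS = {
    "Paragraphs": ["p"],
    "Titles": HEADINGS,
    "Paragraphs and Titles": ["p"] + HEADINGS,
}

# Void elements never get a closing tag, so they cannot wrap text.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


def parse_custom_tags(raw):
    """Turn 'div, span ,LI' into ['div', 'span', 'li'], dropping blanks and duplicates."""
    tags = []
    for part in raw.split(","):
        tag = part.strip().lower()
        if tag and tag not in tags:
            tags.append(tag)
    return tags


class TagTextCollector(HTMLParser):
    """Collects the text found inside each outermost occurrence of the wanted tags."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted) - VOID_TAGS
        self.depth = 0      # number of wanted tags currently open
        self.buffer = []    # text pieces seen since the outermost wanted tag opened
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag in self.wanted:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.wanted and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # Collapse runs of whitespace and skip empty elements.
                text = " ".join("".join(self.buffer).split())
                if text:
                    self.results.append(text)
                self.buffer = []

    def handle_data(self, data):
        if self.depth > 0:
            self.buffer.append(data)


def extract_text(html, tags):
    """Return the text of every element in `html` whose tag name is in `tags`."""
    collector = TagTextCollector(tags)
    collector.feed(html)
    collector.close()
    return collector.results
```

For example, `extract_text(page_html, PRESETS["Titles"])` would return the heading texts of `page_html`, and `extract_text(page_html, parse_custom_tags("li, blockquote"))` would cover the "Custom" case.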
📌 Note:
Make sure to replace 'Jatin_Agrawal_20BCS6606' with your desired page title and 'LOGO.png' with the path to your desired page icon in the set_page_config function.
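As a rough illustration of that note, a page configuration call might look like the fragment below. The title and icon values are placeholders, not the project's own; `page_icon` is shown as an emoji here, though Streamlit also accepts an image path such as the 'LOGO.png' mentioned above.

```python
import streamlit as st

# set_page_config has to be the first Streamlit command the script runs.
st.set_page_config(
    page_title="My Web Text Scraper",  # placeholder: your desired page title
    page_icon="🌐",                    # placeholder: an emoji or a path to an image file
)
```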
Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.
This project is licensed under the MIT License.