🕸️ Scrapipy – AI Web Scraper Dashboard

Scrapipy is an AI-powered web scraping dashboard built with Streamlit, Selenium, and LangChain. It lets users extract, summarize, and analyze web content through a browser-based interface with LLM integration.

Live demo: https://scrapipy.streamlit.app/

🚀 Features

🔍 Input a website URL and extract clean text using Selenium & BeautifulSoup
🧠 Analyze and summarize content using LLMs via langchain_together
📊 Interactive UI with Streamlit
🌐 API keys kept out of the code via .env or Streamlit Secrets
💬 Modular design for easy LLM and model integration
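
As a rough illustration of the modular design mentioned above, the sketch below treats fetching, cleaning, and summarizing as three swappable callables. The `ScrapePipeline` class and its field names are hypothetical and are not taken from the project's source; the stub stages stand in for a Selenium fetcher and an LLM call so the example runs without a browser or an API key.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ScrapePipeline:
    """Hypothetical wiring of the three stages; each stage is a plain callable."""

    fetch: Callable[[str], str]      # URL -> raw HTML (e.g. a Selenium wrapper)
    clean: Callable[[str], str]      # raw HTML -> readable text
    summarize: Callable[[str], str]  # readable text -> summary (e.g. an LLM call)

    def run(self, url: str) -> str:
        html = self.fetch(url)
        text = self.clean(html)
        return self.summarize(text)


# Stub stages so the sketch runs offline.
pipeline = ScrapePipeline(
    fetch=lambda url: f"<p>content of {url}</p>",
    clean=lambda html: html.replace("<p>", "").replace("</p>", ""),
    summarize=lambda text: text.upper(),
)
print(pipeline.run("https://example.com"))
```

With this shape, switching to a different model would only mean passing a different `summarize` callable; the fetch and clean stages stay untouched.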

🛠️ Tech Stack

Streamlit – For building the web UI
Selenium + BeautifulSoup – For scraping dynamic and static content
LangChain + langchain_together – For LLM integration
Python-dotenv – For environment variables
OpenAI / Together API – For running language models
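
To make the "clean text" step concrete, here is a minimal sketch of turning raw HTML into readable text. It uses Python's standard-library `html.parser` as a dependency-free stand-in for the BeautifulSoup step, so it is not the project's actual code; the `clean_text` name and the choice of skipped tags are assumptions. In the real app the HTML would come from Selenium's rendered page source.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, ignoring anything inside script/style/noscript."""

    SKIP_TAGS = {"script", "style", "noscript"}  # assumed set of non-content tags

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def clean_text(html: str) -> str:
    """Return the visible text of an HTML document, one non-empty line per row."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    raw = "".join(parser.parts)
    # Collapse runs of whitespace inside each line and drop blank lines.
    lines = (" ".join(line.split()) for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


sample = (
    "<html><head><style>p{color:red}</style></head><body><h1>Title</h1>\n"
    "<script>var x = 1;</script>\n<p>Hello   <b>world</b></p></body></html>"
)
print(clean_text(sample))
```

Dropping script and style content before passing text to an LLM keeps the prompt shorter and avoids summarizing markup instead of the page itself.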

📦 Installation

1. Clone the Repository

git clone https://github.com/akarshmi/Scrapipy.git
cd Scrapipy

2. Create a Virtual Environment (Optional)

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

3. Install Dependencies

pip install -r requirements.txt

4. Set Up Environment Variables

Create a .env file in the root folder:

OPENAI_API_KEY=your_openai_key_here
TOGETHER_API_KEY=your_together_api_key_here

Alternatively, add them securely on Streamlit Cloud under Secrets.
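
The sketch below shows one way an app could look up these keys at startup and report any that are missing. The helper names, the secrets-before-environment precedence, and treating both keys as required are all assumptions made for illustration, not behavior confirmed by the project. Plain dicts stand in for Streamlit's secrets and for `os.environ` (which python-dotenv's `load_dotenv()` fills from the .env file).

```python
from typing import List, Mapping, Optional

# Key names come from the .env example above; requiring both is an assumption.
REQUIRED_KEYS = ("OPENAI_API_KEY", "TOGETHER_API_KEY")


def resolve_key(name: str, secrets: Mapping[str, str],
                environ: Mapping[str, str]) -> Optional[str]:
    """Prefer the secrets store, then fall back to environment variables."""
    value = secrets.get(name) or environ.get(name)
    return value or None  # treat empty strings as "not set"


def missing_keys(secrets: Mapping[str, str], environ: Mapping[str, str],
                 required=REQUIRED_KEYS) -> List[str]:
    """List the required key names that neither source provides."""
    return [name for name in required
            if resolve_key(name, secrets, environ) is None]


# Stand-ins for the real sources, so the example runs anywhere.
demo_secrets = {"TOGETHER_API_KEY": "from-secrets"}
demo_environ = {"TOGETHER_API_KEY": "from-env", "OPENAI_API_KEY": ""}

print(resolve_key("TOGETHER_API_KEY", demo_secrets, demo_environ))
print(missing_keys(demo_secrets, demo_environ))
```

Checking for missing keys once at startup lets the UI show a clear configuration message instead of failing later in the middle of an LLM call.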

▶️ Run the App

streamlit run main.py

🌐 Deployment

This app can be deployed on Streamlit Cloud:

1. Push your code to GitHub
2. Go to Streamlit Cloud
3. Click “New App”
4. Select your repo and main.py as the entry point
5. Add secrets (API keys), then deploy 🎉

📁 Project Structure

Scrapipy/
├── main.py
├── parse.py
├── utils/
│   └── dom_utils.py
├── requirements.txt
└── README.md

🧠 Credits

Created with 💻 by Akarsh Mishra
Feel free to fork, star ⭐ and contribute!

📜 License

This project is licensed under the MIT License.
