akarshmi/scrapipy

🕸️ Scrapipy – AI Web Scraper Dashboard

Scrapipy is an AI-assisted web scraping dashboard built with Streamlit, Selenium, and LangChain. It lets you extract, summarize, and analyze web content through a browser-based interface backed by an LLM.

🚀 Features

  • 🔍 Input a website URL and extract clean text using Selenium & BeautifulSoup
  • 🧠 Analyze and summarize content using LLMs via langchain_together
  • 📊 Interactive UI with Streamlit
  • 🌐 API keys kept out of the code via a .env file or Streamlit Secrets
  • 💬 Modular design that makes it easy to swap in other LLMs and models
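
To illustrate what "clean text" means in the first feature, here is a minimal sketch of stripping markup, scripts, and styles from a page's HTML. It deliberately uses Python's standard-library html.parser as a stand-in so it runs without extra installs; the project itself uses BeautifulSoup, and the function name extract_clean_text is hypothetical rather than taken from the repository.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and skips the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside a skipped tag
        self._chunks = []      # text fragments in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Put each fragment on its own line, then drop blank lines and
        # surrounding whitespace.
        lines = (line.strip() for line in "\n".join(self._chunks).splitlines())
        return "\n".join(line for line in lines if line)


def extract_clean_text(html: str) -> str:
    """Hypothetical helper: return the visible text of an HTML string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    page = "<body><script>track()</script><h1>Title</h1><p>Some text</p></body>"
    print(extract_clean_text(page))
```

In the real app, the HTML string would come from the Selenium-driven browser after the page has rendered, which is what allows dynamic content to be captured.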

🛠️ Tech Stack

  • Streamlit – For building the web UI
  • Selenium + BeautifulSoup – For scraping dynamic and static content
  • LangChain + langchain_together – For LLM integration
  • Python-dotenv – For environment variables
  • OpenAI / Together API – For running language models

📦 Installation

1. Clone the Repository

git clone https://github.com/akarshmi/Scrapipy.git
cd Scrapipy

2. Create a Virtual Environment (Optional)

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

3. Install Dependencies

pip install -r requirements.txt

4. Set Up Environment Variables

Create a .env file in the root folder:

OPENAI_API_KEY=your_openai_key_here
TOGETHER_API_KEY=your_together_api_key_here

Alternatively, add them securely on Streamlit Cloud under Secrets.
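
As a rough sketch of how an app could read these keys from either source, the helper below checks a secrets mapping first and falls back to environment variables. The function name get_api_key, the lookup order, and the check for the "your_" placeholder prefix are all assumptions made for illustration; they are not taken from the Scrapipy source. In a Streamlit app, the secrets argument could be st.secrets, and python-dotenv's load_dotenv() would populate the environment from the .env file beforehand.

```python
import os
from typing import Mapping, Optional


def get_api_key(
    name: str,
    secrets: Optional[Mapping[str, str]] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Hypothetical helper: look up an API key in secrets, then in the environment."""
    environ = os.environ if environ is None else environ

    if secrets is not None and name in secrets:
        value = secrets[name]
    else:
        value = environ.get(name)

    # Treat a missing key or an unedited placeholder from the template as an error.
    if not value or value.startswith("your_"):
        raise RuntimeError(
            f"{name} is not set. Add it to .env or to Streamlit Secrets."
        )
    return value


if __name__ == "__main__":
    try:
        key = get_api_key("TOGETHER_API_KEY")
        print("TOGETHER_API_KEY found, length:", len(key))
    except RuntimeError as err:
        print(err)
```

Failing early with a clear message is usually friendlier than letting the LLM client raise an authentication error deep inside a request.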


▶️ Run the App

streamlit run main.py

🌐 Deployment

This app can be deployed on Streamlit Cloud in a few steps:

  1. Push your code to GitHub
  2. Go to Streamlit Cloud
  3. Click “New App”
  4. Select your repo and main.py as the entry point
  5. Add secrets (API keys), then deploy 🎉

📁 Project Structure

Scrapipy/
├── main.py
├── parse.py
├── utils/
│   └── dom_utils.py
├── requirements.txt
└── README.md

🧠 Credits

Created with 💻 by Akarsh Mishra
Feel free to fork, star ⭐ and contribute!


📜 License

This project is licensed under the MIT License.
