A web-based monitoring application that crawls and scrapes content from trusted URLs, searches for keywords, and displays results in a beautiful monitoring dashboard. Now powered by Django for better scalability and robustness.
- 🔍 Keyword Monitoring: Search for keywords across all trusted sources
- 🌐 URL Management: Add, view, and delete trusted sources
- 📊 Statistics Dashboard: Track your monitoring activity
- 🎯 Match Percentage: See relevance scores for each result
- 💾 Database Storage: Unified SQLite database using Django ORM
- 🎨 Modern UI: Beautiful dark-themed dashboard with smooth animations
- 🛡️ Admin Interface: Built-in Django admin for data management
- Backend: Python 3.x, Django 5.x
- Frontend: HTML5, CSS3, JavaScript (Vanilla)
- Database: SQLite (via Django ORM)
- Scraping: BeautifulSoup4, Requests
- CORS: django-cors-headers

    source env/bin/activate
    pip install -r requirements.txt
    python3 manage.py migrate
    python3 manage.py createsuperuser

(Follow the prompts to create an admin account.)

    python3 manage.py runserver 3000

The server will start on http://localhost:3000

- Navigate to the URLs tab in the dashboard
- Add your trusted sources (websites you want to monitor)
- Each URL should be a valid web page
- Go to the Dashboard tab
- Enter keywords you want to search for
- Click Scan Now
- View results with match percentages and snippets
- Visit http://localhost:3000/admin
- Log in with your superuser credentials
- Manage URLs, Scraped Content, and Search Results directly
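
The first usage step asks that each trusted source be a valid web page. As a rough illustration of what such a check might look like, here is a minimal sketch using only the standard library. The function name `is_valid_source_url` is hypothetical and is not taken from this project; the application may validate URLs differently.

```python
from urllib.parse import urlparse


def is_valid_source_url(url: str) -> bool:
    """Return True if url looks like an absolute http(s) address.

    Hypothetical helper for illustration only; it checks the shape of
    the URL and does not confirm that the page actually exists.
    """
    try:
        parsed = urlparse(url.strip())
    except ValueError:
        # urlparse rejects some malformed inputs, e.g. broken IPv6 hosts.
        return False
    # Require an http/https scheme and a non-empty host part.
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

A check like this only filters out obviously malformed input such as `example.com/news` (no scheme). Whether the page is reachable is only known once the scraper fetches it.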
scraper/
├── scraper_project/ # Django project configuration
├── scraper_app/ # Main application logic (Models, Views, Admin)
├── requirements.txt # Python dependencies
├── script/
│ ├── scraper.py # Web scraping and keyword search logic
│ └── filter.py # Content filtering (future use)
├── frontend/
│ ├── index.html # Main dashboard HTML (Django template)
│ ├── index.css # Styles (Static file)
│ └── script.js # Frontend JavaScript (Static file)
├── manage.py # Django management script
└── db.sqlite3 # Unified SQLite database
All API endpoints are located under the root path and follow the same structure as the previous version:
- POST /api/search - Trigger keyword search
- GET /api/articles - Get search results
- GET /api/urls - Manage trusted URLs
- GET /api/statistics - Get monitoring statistics
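
To show how these endpoints could be called from a script, here is a minimal client sketch built on the standard library. It relies on assumptions that this README does not confirm: that `POST /api/search` accepts a JSON body with a `keyword` field, and that the read endpoints return JSON. The helper names `build_search_request` and `fetch_json` are hypothetical.

```python
import json
from urllib.request import Request, urlopen

# Matches the port passed to runserver in this README.
BASE_URL = "http://localhost:3000"


def build_search_request(keyword: str, base_url: str = BASE_URL) -> Request:
    """Build (but do not send) a POST /api/search request.

    Assumption: the endpoint accepts a JSON body with a "keyword" field.
    """
    body = json.dumps({"keyword": keyword}).encode("utf-8")
    return Request(
        f"{base_url}/api/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_json(path: str, base_url: str = BASE_URL):
    """GET a read endpoint such as /api/articles and decode it as JSON.

    Assumption: the response body is JSON.
    """
    with urlopen(f"{base_url}{path}", timeout=10) as response:
        return json.load(response)
```

With the development server running, passing the result of `build_search_request("django")` to `urlopen` would trigger a scan, and `fetch_json("/api/articles")` would then read back the results. Depending on how the view is configured, Django's CSRF protection may reject a POST that carries no CSRF token.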
- URL Storage: Trusted URLs are stored in the Django database using the TrustedURL model.
- Scraping: When you search for a keyword, the system:
  - Fetches content from URLs defined in the database.
  - Parses HTML structure (headings, paragraphs).
  - Searches for keyword matches.
- Scoring: Match percentage is calculated based on keyword prominence and frequency.
- Storage: Scraped content and search results are stored in the database for future reference.
- Display: Results are shown in the dashboard with snippets and links.
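
The parsing and scoring steps above can be sketched as follows. This is not the logic in script/scraper.py, which is not shown in this README. It uses the standard library's `html.parser` instead of BeautifulSoup4 so that it runs without extra dependencies. The scoring formula, `heading_weight`, and `saturation` are illustrative assumptions that stand in for "keyword prominence and frequency".

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class TextExtractor(HTMLParser):
    """Collect the text of heading and paragraph elements separately."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self.paragraphs = []
        self._current = None  # tag currently being captured, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS or tag == "p":
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Collapse runs of whitespace into single spaces.
            text = " ".join("".join(self._buffer).split())
            if text:
                is_heading = tag in HEADING_TAGS
                (self.headings if is_heading else self.paragraphs).append(text)
            self._current = None


def extract_text(html):
    """Return (headings, paragraphs) found in an HTML string."""
    extractor = TextExtractor()
    extractor.feed(html)
    return extractor.headings, extractor.paragraphs


def match_percentage(keyword, headings, paragraphs,
                     heading_weight=3.0, saturation=5.0):
    """Hypothetical score in [0, 100].

    Prominence: a hit in a heading counts heading_weight times.
    Frequency: every occurrence adds to the total, capped at saturation.
    """
    needle = keyword.lower()
    if not needle:
        return 0.0
    heading_hits = sum(h.lower().count(needle) for h in headings)
    paragraph_hits = sum(p.lower().count(needle) for p in paragraphs)
    weighted = heading_weight * heading_hits + paragraph_hits
    return round(min(weighted / saturation, 1.0) * 100, 1)
```

Under these assumed weights, a page with one keyword hit in a heading and one in a paragraph scores higher than a page with two paragraph hits, which is one simple way to make prominence count alongside frequency.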