This project is designed to scrape news articles from specified websites, store the data in a MySQL database, generate concise summaries using a Large Language Model (LLM), and provide a user-friendly interface via Streamlit. It aims to make news consumption easier by summarizing articles and presenting them in a simple, digestible format.
- Web Scraping: Collect news articles from various websites using FastAPI.
- Database Management: Store and manage news articles in a MySQL database.
- Summary Generation: Use an LLM to generate concise summaries of the articles.
- User Interface: Provide a simple UI for viewing news and summaries using Streamlit.
- Real-Time Updates: Periodically update the news database with the latest articles.
- Backend:
- FastAPI: For creating the RESTful API using Python.
- Database
- MySQL: For storing news articles.
- Machine Learning:
- LLM Model: For generating summaries of the articles.
- Frontend:
- Streamlit: For creating the user interface.
- Python: For scripting and automation.
- Python 3.8 or higher
- MySQL Server
- Docker (optional, for containerization)
fastapi
uvicorn
mysql-connector-python
sqlalchemy
groq
(depending on the LLM)streamlit
requests
pydantic
- OS: Windows
- Hardware: Minimum 8 GB RAM, 4 CPU cores
Follow the steps below to set up the project on your local machine:
python -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
pip install -r requirements.txt
Install MySQL Server and create a database. Create a .env file in the root directory with your MySQL credentials:
DB_HOST=localhost
DB_USER=youruser
DB_PASSWORD=yourpassword
DB_NAME=newsdb
Backend (FastAPI)
run main.py
Frontend (Streamlit)
streamlit run Home.py
Open your browser and navigate to http://localhost:8501 to view the Streamlit UI. But as it set on local server it can't be view in publicly that's why I have attached the screenshots in below. For Api Documentation run code and navigate to http://localhost/8011/endpoints#