A web application for summarizing and searching scientific articles from arXiv and the HAL open archive. It generates article summaries automatically and extracts keywords so that users can quickly grasp the content of a research paper.
[Screenshot: example of an article summary showing details, generated summary, and key concepts]
[Screenshot: search interface showing results from HAL archives]
- Article Summarization: Generate concise summaries of scientific papers
- Key Concept Extraction: Identify the most important concepts in research papers
- Research Search: Search for articles on ArXiv and HAL archives
- Clean User Interface: Modern, responsive design for easy navigation
The project is divided into two main components:
- Backend (Flask API): Handles article extraction, summarization, and search
- Frontend (React): Provides user interface for interacting with the API
- Python 3.11+
- Flask (Web framework)
- LangChain (LLM integration)
- Groq (LLM provider)
- Pinecone (Vector database)
- React 19
- JavaScript/JSX
- CSS
- Vite (Build tool)
- Python 3.11+
- Node.js 18+ (for frontend)
- pnpm or npm
- Groq API Key
- Pinecone API Key and environment
- Clone the repository

      git clone https://github.com/Hamza-cpp/research-assistant.git
      cd research-assistant

- Create and activate a Python virtual environment

      python -m venv .venv
      source .venv/bin/activate  # On Windows: .venv\Scripts\activate

- Install backend dependencies

      pip install -e .

- Configure environment variables by creating a .env file in the root directory:

      # Flask settings
      FLASK_APP=main.py
      FLASK_DEBUG=True
      PORT=5000

      # API Keys
      GROQ_API_KEY=your_groq_api_key_here
      PINECONE_API_KEY=your_pinecone_api_key_here
      PINECONE_ENVIRONMENT=your_pinecone_environment_here

- Run the backend server

      python main.py

  The API will be available at http://localhost:5000
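Before starting the server, it can help to confirm that the API keys from the configuration step are actually visible to the process. The following is a minimal sketch, not part of the project: the helper name `missing_env_vars` is hypothetical, and treating all three variables as required is an assumption based on the example .env file. It only inspects the process environment, so it assumes the values have been exported in the shell or loaded from .env by some other means.

```python
import os

# Variable names come from the example .env file in the setup steps.
# Treating all three as mandatory is an assumption, not confirmed by the project.
REQUIRED_VARS = ("GROQ_API_KEY", "PINECONE_API_KEY", "PINECONE_ENVIRONMENT")


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or blank.

    `environ` defaults to the current process environment; passing a plain
    dict makes the check easy to exercise without touching os.environ.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


# Report which variables still need a value in the current environment.
missing = missing_env_vars()
print("Missing variables:", ", ".join(missing) if missing else "none")
```

If the script lists any names, add them to the .env file (or export them) before running `python main.py`.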
- Navigate to the web-ui directory

      cd web-ui

- Install frontend dependencies using pnpm

      pnpm install

  (Alternatively, you can use npm: npm install)

- Start the development server

      pnpm run dev  # or: npm run dev

- The frontend application will be available at http://localhost:5173
The backend provides the following API endpoints:
- GET /api/health: Health check endpoint

- POST /api/summarize: Summarize an article from its URL

      // Request Body
      { "article_url": "https://arxiv.org/abs/2303.08774" }

- GET /api/search: Search for articles

      // Query Parameters
      ?q=machine+learning&source=arxiv&max_results=10
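The endpoints listed above can be exercised from a small Python client. This is a sketch using only the standard library, under several assumptions: the backend runs at the default http://localhost:5000, the helper names (`search_url`, `summarize_request`, `fetch_json`) are hypothetical, the defaults for `source` and `max_results` simply mirror the example query rather than any documented server defaults, and responses are assumed to be JSON (their exact shape is not specified here).

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Default backend address from the setup steps; adjust if PORT differs.
BASE_URL = "http://localhost:5000"


def search_url(query, source="arxiv", max_results=10, base_url=BASE_URL):
    """Build the GET /api/search URL with URL-encoded query parameters."""
    params = urlencode({"q": query, "source": source, "max_results": max_results})
    return f"{base_url}/api/search?{params}"


def summarize_request(article_url, base_url=BASE_URL):
    """Build the POST /api/summarize request with a JSON body."""
    body = json.dumps({"article_url": article_url}).encode("utf-8")
    return Request(
        f"{base_url}/api/summarize",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_json(request_or_url, timeout=30):
    """Send the request and decode the body, assuming the API returns JSON."""
    with urlopen(request_or_url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the URL needs no running server; fetch_json does.
print(search_url("machine learning"))
```

With the backend running, `fetch_json(search_url("machine learning"))` would send the search, and `fetch_json(summarize_request("https://arxiv.org/abs/2303.08774"))` would request a summary.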
