F1-AI is a Retrieval-Augmented Generation (RAG) application built for Formula 1 information. It features an intelligent web scraper that automatically discovers and extracts Formula 1-related content from the web and stores it in a vector database, enabling natural-language querying of the stored information.
- Web scraping of Formula 1 content with automatic content extraction
- Vector database storage using Pinecone for efficient similarity search
- OpenRouter integration for advanced LLM capabilities
- RAG-powered question answering with contextual understanding and source citations
- Command-line interface for automation and scripting
- User-friendly Streamlit web interface with chat history
- Asynchronous data ingestion and processing for improved performance
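The ingest-then-query flow behind these features can be illustrated with a small, dependency-free sketch. It uses a bag-of-words stand-in for the real embedding model and Pinecone index; all names here are illustrative, not taken from the codebase:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words vector. The real app uses a
    # model-based embedding (e.g. via Ollama) instead of word counts.
    return Counter(w.strip(".,?!").lower() for w in text.split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ToyVectorStore:
    # Stands in for Pinecone: stores (vector, text) pairs and returns
    # the documents most similar to a query.
    def __init__(self) -> None:
        self.docs: list[tuple[Counter, str]] = []

    def add(self, text: str) -> None:
        self.docs.append((embed(text), text))

    def query(self, question: str, k: int = 1) -> list[str]:
        qv = embed(question)
        ranked = sorted(self.docs, key=lambda d: cosine(qv, d[0]), reverse=True)
        return [text for _, text in ranked[:k]]

store = ToyVectorStore()
store.add("Max Verstappen won the 2023 Formula 1 World Championship.")
store.add("The Monaco Grand Prix is held on a street circuit.")
context = store.query("Who won the 2023 World Championship?")
```

In the real pipeline, LangChain wires the embedding model and the Pinecone index together, so this logic lives behind library calls rather than hand-rolled code.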
F1-AI is built on a modern tech stack:
- LangChain: Orchestrates the RAG pipeline and manages interactions between components
- Pinecone: Vector database for storing and retrieving embeddings
- OpenRouter: Primary LLM provider with Mistral-7B-Instruct model
- Ollama: Alternative local LLM provider for embeddings
- Playwright: Handles web scraping with JavaScript support
- BeautifulSoup4: Processes HTML content and extracts relevant information
- Streamlit: Provides an interactive web interface with chat functionality
- Python 3.8 or higher
- OpenRouter API key (set as OPENROUTER_API_KEY environment variable)
- Pinecone API key (set as PINECONE_API_KEY environment variable)
- 8GB RAM minimum (16GB recommended)
- Internet connection for web scraping
- Ollama installed (optional, for local embeddings)
- Download from Ollama's website
- Pull the required model:

  ```bash
  ollama pull all-minilm-l6-v2
  ```
- Clone the repository:

  ```bash
  git clone <repository-url>
  cd FASTapi-RAG
  ```
- Install the required dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Install Playwright browsers:

  ```bash
  playwright install
  ```
- Set up environment variables: create a .env file with:

  ```bash
  OPENROUTER_API_KEY=your_api_key_here  # Required for LLM functionality
  PINECONE_API_KEY=your_api_key_here    # Required for vector storage
  ```
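A quick way to fail fast on missing keys before running any command is a small startup check. This is a hedged sketch; `check_env` is not part of the codebase:

```python
import os

REQUIRED = ("OPENROUTER_API_KEY", "PINECONE_API_KEY")

def check_env() -> list[str]:
    # Return the names of required variables that are unset or empty.
    return [name for name in REQUIRED if not os.environ.get(name)]

missing = check_env()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```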
- Scrape and ingest F1 content:

  ```bash
  python f1_scraper.py --start-urls https://www.formula1.com/ --max-pages 100 --depth 2 --ingest
  ```
Options:
- --start-urls: Space-separated list of URLs to start crawling from
- --max-pages: Maximum number of pages to crawl (default: 100)
- --depth: Maximum crawl depth (default: 2)
- --ingest: Flag to ingest discovered content into the RAG system
- --max-chunks: Maximum chunks per URL for ingestion (default: 50)
- --llm-provider: LLM provider to use (openrouter or ollama)
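The flags above could be declared with argparse roughly as follows. This is a sketch mirroring the documented defaults, not the actual f1_scraper.py source, and the openrouter default is an assumption:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Mirrors the documented f1_scraper.py options and defaults.
    p = argparse.ArgumentParser(description="F1 content crawler (sketch)")
    p.add_argument("--start-urls", nargs="+", required=True,
                   help="URLs to start crawling from")
    p.add_argument("--max-pages", type=int, default=100,
                   help="Maximum number of pages to crawl")
    p.add_argument("--depth", type=int, default=2,
                   help="Maximum crawl depth")
    p.add_argument("--ingest", action="store_true",
                   help="Ingest discovered content into the RAG system")
    p.add_argument("--max-chunks", type=int, default=50,
                   help="Maximum chunks per URL for ingestion")
    p.add_argument("--llm-provider", choices=["openrouter", "ollama"],
                   default="openrouter",  # assumed default
                   help="LLM provider to use")
    return p

args = build_parser().parse_args(
    ["--start-urls", "https://www.formula1.com/", "--depth", "2", "--ingest"]
)
```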
- Ask questions about Formula 1:

  ```bash
  python f1_ai.py ask "Who won the 2023 F1 World Championship?"
  ```
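Under the hood, a RAG answer is produced by stuffing retrieved chunks into the prompt alongside the question. A typical assembly step looks roughly like this; the actual prompt template in f1_ai.py may differ:

```python
def build_prompt(question: str, chunks: list[tuple[str, str]]) -> str:
    # chunks: (text, source_url) pairs retrieved from the vector store.
    context = "\n\n".join(
        f"[{i}] {text} (source: {url})"
        for i, (text, url) in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below, "
        "and cite sources by number.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_prompt(
    "Who won the 2023 F1 World Championship?",
    [("Max Verstappen won the 2023 title.", "https://www.formula1.com/")],
)
```

Numbering the chunks is what lets the model emit the source citations shown in the answers.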
Run the Streamlit app:

```bash
streamlit run streamlit_app.py
```

This will open a web interface where you can:
- Ask questions about Formula 1
- View responses in a chat-like interface
- See source citations for answers
- Track conversation history
- Get real-time updates on response generation
f1_scraper.py: Intelligent web crawler implementation
- Automatically discovers F1-related content
- Handles content relevance detection
- Manages crawling depth and limits
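Content relevance detection can be as simple as keyword scoring. Here is a hypothetical version of the idea; the real heuristic in f1_scraper.py may be more sophisticated:

```python
F1_KEYWORDS = (
    "formula 1", "formula one", "grand prix", "qualifying",
    "pole position", "pit stop", "constructor", "fia",
)

def looks_f1_related(text: str, threshold: int = 2) -> bool:
    # Count keyword hits in the page text; pages scoring below the
    # threshold are skipped by the crawler.
    t = text.lower()
    return sum(kw in t for kw in F1_KEYWORDS) >= threshold
```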
f1_ai.py: Core RAG application implementation
- Handles data ingestion and chunking
- Manages vector database operations
- Implements question-answering logic
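Chunking splits each page into overlapping windows before embedding. A minimal fixed-size sketch of the idea follows; the real pipeline likely delegates this to LangChain's text splitters:

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    # Fixed-size character windows with overlap, so sentences cut at a
    # boundary still appear whole in at least one chunk.
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]
```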
llm_manager.py: LLM provider management
- Integrates with OpenRouter for advanced LLM capabilities
- Handles embeddings generation
- Manages API interactions
streamlit_app.py: Streamlit web interface
- Provides a chat-based UI
- Manages conversation history
- Handles async operations
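Because Streamlit callbacks are synchronous while the RAG pipeline is async, the app has to bridge between the two. One common pattern is shown below with a placeholder coroutine; this is a sketch, not the actual streamlit_app.py code:

```python
import asyncio

async def answer(question: str) -> str:
    # Placeholder for the real async RAG call (retrieve, prompt, LLM).
    await asyncio.sleep(0)
    return f"(answer to: {question})"

def ask_sync(question: str) -> str:
    # asyncio.run creates an event loop, awaits the coroutine, and tears
    # the loop down again: a simple bridge for synchronous UI callbacks.
    return asyncio.run(answer(question))

reply = ask_sync("Who won in 2023?")
```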
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Submit a Pull Request
