Personal Portfolio Search Engine

A custom search engine that indexes and provides searchable access to all my portfolio content, personal websites, and professional profiles.

Features

  • 🔍 Real-time search functionality
  • 🎯 TF-IDF based search ranking
  • 🕷️ Automated web crawler for content indexing
  • 🎨 Clean, modern UI inspired by popular search engines
  • 📱 Responsive design for all devices
  • ⚡ Fast and efficient search results

Tech Stack

Frontend

  • React 19 with TypeScript
  • Vite for build tooling
  • React Router for navigation
  • Custom CSS with CSS Variables for theming

Backend

  • FastAPI for the REST API
  • PostgreSQL for data storage
  • psycopg for database connectivity
  • BeautifulSoup4 for web crawling

Getting Started

Prerequisites

  • Node.js (v18 or higher)
  • Python 3.8+
  • PostgreSQL database

Backend Setup

  1. Create a virtual environment and install dependencies:
cd server
python -m venv venv
source venv/bin/activate  # On Windows: .\venv\Scripts\activate
pip install -r requirements.txt
  2. Set up your environment variables in .env:
DB_HOST=localhost
DB_PORT=5432
DB_NAME=your_db_name
DB_USER=your_db_user
DB_PASSWORD=your_db_password
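
For reference, here is a minimal sketch of how the backend can open a connection with these variables using psycopg (the repo's actual startup code may load .env differently):

import os
import psycopg

# Read connection settings from the environment (matching the .env above).
conn = psycopg.connect(
    host=os.environ["DB_HOST"],
    port=os.environ["DB_PORT"],
    dbname=os.environ["DB_NAME"],
    user=os.environ["DB_USER"],
    password=os.environ["DB_PASSWORD"],
)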

Frontend Setup

  1. Install dependencies:
cd client
npm install
  2. Start the development server:
npm run dev

Architecture

Search Implementation

The search functionality uses TF-IDF (Term Frequency-Inverse Document Frequency) scoring for ranking results. The crawler processes content and stores word frequencies in the database, which are then used to calculate relevance scores during search.
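
As a rough illustration of the idea (not the repo's exact code), a per-term relevance score can be computed like this:

import math

def tf_idf(term_count, doc_length, total_docs, docs_with_term):
    # Term frequency: occurrences of the term, normalized by page length.
    tf = term_count / doc_length
    # Inverse document frequency: terms found on fewer pages weigh more.
    idf = math.log(total_docs / (1 + docs_with_term))
    return tf * idf

# A page's relevance for a multi-word query is the sum of its per-term scores.
print(tf_idf(term_count=3, doc_length=200, total_docs=50, docs_with_term=5))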

Database Schema

  • pages: Stores crawled web pages
  • keywords: Stores unique keywords
  • keyword_pages: Maps keywords to pages with frequency counts
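
The exact columns are not documented here, so the DDL below is an illustrative guess at a minimal version of this schema, applied through psycopg:

import os
import psycopg

DDL = """
CREATE TABLE IF NOT EXISTS pages (
    id SERIAL PRIMARY KEY,
    url TEXT UNIQUE NOT NULL,
    title TEXT
);
CREATE TABLE IF NOT EXISTS keywords (
    id SERIAL PRIMARY KEY,
    word TEXT UNIQUE NOT NULL
);
CREATE TABLE IF NOT EXISTS keyword_pages (
    keyword_id INTEGER REFERENCES keywords(id),
    page_id INTEGER REFERENCES pages(id),
    frequency INTEGER NOT NULL,  -- how often the keyword appears on the page
    PRIMARY KEY (keyword_id, page_id)
);
"""

# Connect as in the backend setup sketch and apply the schema.
with psycopg.connect(
    host=os.environ["DB_HOST"],
    port=os.environ["DB_PORT"],
    dbname=os.environ["DB_NAME"],
    user=os.environ["DB_USER"],
    password=os.environ["DB_PASSWORD"],
) as conn:
    conn.execute(DDL)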

Web Crawler

The crawler automatically indexes content from specified start URLs, following links to related pages while respecting robots.txt rules and implementing polite crawling practices.
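
A simplified sketch of that flow using the standard library plus BeautifulSoup4 (the actual crawler is more involved; restricting links to the start host is an assumption made here for brevity):

import time
import urllib.robotparser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen
from bs4 import BeautifulSoup

def crawl(start_url, delay=1.0, max_pages=50):
    # Honor robots.txt for the start URL's host.
    robots = urllib.robotparser.RobotFileParser()
    robots.set_url(urljoin(start_url, "/robots.txt"))
    robots.read()

    host = urlparse(start_url).netloc
    seen, queue = set(), [start_url]
    while queue and len(seen) < max_pages:
        url = queue.pop(0)
        if url in seen or not robots.can_fetch("*", url):
            continue
        seen.add(url)
        soup = BeautifulSoup(urlopen(url).read(), "html.parser")
        # Indexing step omitted: extract text and store word frequencies.
        for link in soup.find_all("a", href=True):
            next_url = urljoin(url, link["href"])
            if urlparse(next_url).netloc == host:  # follow related pages
                queue.append(next_url)
        time.sleep(delay)  # polite crawling: pause between requests
    return seen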

API Endpoints

  • GET /search?q={query}: Search for pages matching the query
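
A hedged sketch of what this endpoint could look like in FastAPI (the response shape and module layout are assumptions, not the repo's exact code):

from fastapi import FastAPI

app = FastAPI()

@app.get("/search")
def search(q: str):
    # Look up the query terms, rank pages by TF-IDF, and return them.
    # Database lookup omitted; see the schema and scoring sketches above.
    results = []  # e.g. a list of {"url": ..., "title": ..., "score": ...}
    return {"query": q, "results": results}

With a layout like this, the API could be served locally with uvicorn (for example, uvicorn main:app --reload, module name assumed).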

Contributing

Feel free to submit issues and enhancement requests!

License

MIT License

Contact
