Search-Engine-Project

A comprehensive web-based search engine solution built with ASP.NET Core (.NET 8), Entity Framework Core, SQL Server, and Python for data processing. This project demonstrates crawling, indexing, ranking, and searching web pages, with a modern frontend and robust backend API.


Table of Contents

  • Folder & File Explanations
  • How to Clone and Run the Full Project
  • How to Run with Docker & Docker Compose
  • API Usage & Examples
  • Troubleshooting & FAQ
  • Contribution Guidelines
  • Credits & Acknowledgments
  • Technologies Used
  • Security & Performance Notes
  • Future Improvements & Roadmap
  • Docker Hub & GitHub Links

Folder & File Explanations

Engine Scripts

  • crawler.py: Crawls the web and collects page data.
  • inverted_index.py: Builds the inverted index from the crawled data.
  • pageRank.py: Computes PageRank scores for the collected pages.
  • Jsons/: Output folder for the generated JSON files.

Frontend/search-ui

  • index.html: Main HTML file for the search UI.
  • script.js: Handles search requests, result rendering, modal logic, and UI interactivity.
  • style.css: Styles the search UI, including dark mode and responsive design.

Organize Scrapping Using Python

  • Optional helper scripts for cleaning and converting the scraped data (see step 2 of the run instructions below).

Search Engine (ASP.NET Core Backend)

  • The ASP.NET Core Web API project, including configuration (appsettings.json), EF Core migrations, and the /api/SearchEngine endpoint.

How to Clone and Run the Full Project

Prerequisites

  • .NET 8 SDK
  • Python 3.x (for engine scripts)
  • SQL Server (or compatible connection string)

How to Run with Docker & Docker Compose

Prerequisites

  • Docker and Docker Compose installed

Build and Run

  1. Clone the repository:

    git clone https://github.com/omarovici/Search-Engine-Project.git
    cd Search-Engine-Project
  2. Build and start services:

    docker compose up --build

    This will build the backend image and start both the backend and SQL Server containers.

  3. Access the backend API: once the containers are running, the backend is exposed on the port mapped in docker-compose.yml (see API Usage & Examples below for the endpoint format).

DockerHub Image

A pre-built backend image is published to Docker Hub as omarovici/search-engine-project (see Docker Hub & GitHub Links below).

Environment Variables

  • Database credentials and connection strings are managed via environment variables in docker-compose.yml.
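
For reference, a minimal compose sketch of this wiring is shown below. The service names, ports, image tags, and credentials here are illustrative placeholders, not the project's actual compose file; the repository's docker-compose.yml is the authoritative source.

    services:
      sqlserver:
        image: mcr.microsoft.com/mssql/server:2022-latest
        environment:
          ACCEPT_EULA: "Y"
          MSSQL_SA_PASSWORD: "YourStrong!Passw0rd"   # placeholder password
        ports:
          - "1433:1433"
      backend:
        image: omarovici/search-engine-project:latest
        depends_on:
          - sqlserver
        environment:
          # ASP.NET Core reads this variable as ConnectionStrings:DefaultConnection
          ConnectionStrings__DefaultConnection: "Server=sqlserver;Database=SearchEngine;User Id=sa;Password=YourStrong!Passw0rd;TrustServerCertificate=True"
        ports:
          - "8080:8080"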

Stopping Services

docker compose down

1. Clone the Repository

    git clone https://github.com/omarovici/Search-Engine-Project.git
    cd Search-Engine-Project

2. Prepare the Data (Python Engine)

  • Navigate to Engine Scripts/ and run the scripts in order (a sample run is shown after this list):
    1. crawler.py to crawl and collect web data.
    2. inverted_index.py to build the index.
    3. pageRank.py to compute PageRank.
  • Ensure the output JSON files are generated in Engine Scripts/Jsons/.
  • (Optional) Use scripts in Organize Scrapping Using Python/ for data cleaning or conversion.
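
For reference, a typical end-to-end run of the engine scripts (assuming Python 3 is on your PATH and any packages listed in the script headers are installed) looks like this:

    cd "Engine Scripts"
    python crawler.py          # crawl and collect web data
    python inverted_index.py   # build the inverted index from the crawled pages
    python pageRank.py         # compute PageRank scores
    # the generated JSON files should now be in Jsons/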

3. Configure the Database

  • Update the DefaultConnection string in Search Engine/appsettings.Development.json and/or appsettings.json to point to your SQL Server instance.
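
For example, a local SQL Server setup might use an entry like the one below in appsettings.Development.json; the server, database name, and authentication options are placeholders to adapt to your environment.

    {
      "ConnectionStrings": {
        "DefaultConnection": "Server=localhost;Database=SearchEngine;Trusted_Connection=True;TrustServerCertificate=True"
      }
    }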

4. Apply Database Migrations

cd "Search Engine"
dotnet ef database update
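
If the dotnet ef command is not recognized, the EF Core CLI tool likely needs to be installed first:

    dotnet tool install --global dotnet-ef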

5. Run the Backend API

dotnet run
  • The API will be available at https://localhost:<port>/api/SearchEngine
  • Swagger UI is available at https://localhost:<port>/swagger

6. Run the Frontend

  • Open Frontend/search-ui/index.html in your browser.
  • The frontend will connect to the backend API to perform searches and display results.

API Usage & Examples

Search Endpoint

  • GET /api/SearchEngine?word={word}&orderBy={orderBy}
    • word: The word to search for (required)
    • orderBy: pagerank or count (optional, default: pagerank)

Example Request

GET https://localhost:5001/api/SearchEngine?word=python&orderBy=pagerank

Example Response

[
  {
    "Url": "http://example.com/python-tutorial",
    "Count": 8,
    "PageRank": 0.92
  },
  {
    "Url": "http://another.com/python-guide",
    "Count": 5,
    "PageRank": 0.85
  }
]

Error Handling

  • If no results are found, the API returns HTTP 404 Not Found.
  • If parameters are missing or invalid, the API returns HTTP 400 Bad Request.
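
As an illustration, the endpoint can also be called programmatically. The sketch below assumes the API is running locally on port 5001 (as in the example above) and uses the third-party requests package; disabling TLS verification is only appropriate for a local self-signed development certificate.

    import requests

    BASE_URL = "https://localhost:5001/api/SearchEngine"  # adjust host/port as needed

    def search(word, order_by="pagerank"):
        # verify=False only because the local dev certificate is self-signed
        response = requests.get(
            BASE_URL,
            params={"word": word, "orderBy": order_by},
            verify=False,
        )
        if response.status_code == 404:
            return []  # no pages contain the word
        response.raise_for_status()  # surface 400/500 responses as exceptions
        return response.json()

    if __name__ == "__main__":
        for result in search("python"):
            print(result["Url"], result["Count"], result["PageRank"])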

Troubleshooting & FAQ

Q: The API is not responding or returns 500 errors.

  • Check your database connection string in appsettings.json.
  • Ensure SQL Server is running and accessible.
  • Check for missing migrations or run dotnet ef database update.

Q: The frontend does not display results.

  • Make sure the backend API is running and accessible at the expected URL.
  • Check the browser console for CORS or network errors.

Q: Python scripts fail to run.

  • Ensure you have Python 3.x installed and all required packages (see script headers for requirements).

Q: How do I reset the database?

  • Delete the database and run migrations again with dotnet ef database update.
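
With the EF Core CLI installed, one way to script that from the backend project directory is:

    cd "Search Engine"
    dotnet ef database drop --force
    dotnet ef database update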

Contribution Guidelines

  1. Fork the repository and create a new branch for your feature or bugfix.
  2. Write clear, concise commit messages.
  3. Ensure your code follows the existing style and conventions.
  4. Add or update documentation and tests as needed.
  5. Submit a pull request with a detailed description of your changes.

Credits & Acknowledgments

  • Team Members:
    • Abd El-Rahman Eldeeb (Frontend Developer)
    • Abd El-Rahman Ehab (Frontend Developer)
    • Omar Khalid (.NET Developer)
    • Shehab Mohamed (.NET Developer)
    • Shehab Yasser (Python Developer)
    • Haneen Hassan (Python Developer)
  • Special Thanks:
    • Open source libraries and the .NET, Python, and SQL Server communities.

Technologies Used

  • ASP.NET Core (.NET 8)
  • Entity Framework Core
  • SQL Server
  • Python 3.x
  • JavaScript (ES6+)
  • HTML5 & CSS3
  • Swagger (Swashbuckle)
  • Modern browser APIs

Security & Performance Notes

  • Security:
    • Always validate and sanitize user input.
    • Use HTTPS in production.
    • Restrict CORS as needed for your deployment.
  • Performance:
    • Use database indexes on frequently queried columns.
    • Consider caching frequent queries.
    • Use pagination in the backend for large result sets.

Future Improvements & Roadmap

  • Add user authentication and personalized search history.
  • Implement backend pagination and lazy loading for even faster responses.
  • Add more advanced ranking algorithms and machine learning integration.
  • Improve crawler to handle JavaScript-heavy sites.
  • Add automated tests and CI/CD pipeline.
  • Deploy demo version online.

Docker Hub & GitHub Links

  • GitHub repository: https://github.com/omarovici/Search-Engine-Project
  • Docker Hub image: omarovici/search-engine-project

Running with Docker Hub Image

You can pull the pre-built backend image from Docker Hub and run the full stack using Docker Compose.

  1. Pull the backend image:
    docker pull omarovici/search-engine-project:latest
  2. Update your docker-compose.yml (if needed): Ensure the backend service uses the image:
    services:
      backend:
        image: omarovici/search-engine-project:latest
        # ... other settings ...
  3. Start the application:
    docker compose up
    This will start both the backend and SQL Server containers using the pulled image.

You can view the image on Docker Hub: omarovici/search-engine-project
