david-allison/manx-corpus-search

manx-corpus-search

A corpus search tool for primarily bilingual Manx-to-English texts.

Deployed at https://corpus.gaelg.im/

To add/modify documents, see: manx-search-data

Installation

  1. Clone the source
  2. Copy the OpenData folder from manx-search-data into the CorpusSearch/OpenData folder
  3. Run dotnet run
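
The steps above can be sketched as a small shell script. This is only an illustration under a few assumptions that the README does not confirm: that manx-search-data is hosted under the same GitHub owner as this repository, that both repositories are cloned side by side, and that dotnet run is meant to be started from the CorpusSearch folder. The function names are hypothetical helpers, not part of the project.

```shell
#!/bin/sh
# Sketch of the three installation steps. The data repository URL and
# the run directory are assumptions, not taken from the README.
set -eu

APP_REPO="https://github.com/david-allison/manx-corpus-search.git"
DATA_REPO="https://github.com/david-allison/manx-search-data.git"  # assumed location

# Step 1: clone the application source and the data repository side by side.
clone_sources() {
    git clone "$APP_REPO" manx-corpus-search
    git clone "$DATA_REPO" manx-search-data
}

# Step 2: copy OpenData from the data checkout ($1) into the app checkout ($2).
copy_open_data() {
    mkdir -p "$2/CorpusSearch/OpenData"
    cp -R "$1/OpenData/." "$2/CorpusSearch/OpenData/"
}

# Step 3: start the server (assumes the project lives in CorpusSearch/).
run_server() {
    (cd "$1/CorpusSearch" && dotnet run)
}
```

With these definitions loaded, the whole installation would look like: clone_sources && copy_open_data manx-search-data manx-corpus-search && run_server manx-corpus-search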

Tech Stack

  • React
  • C# (ASP.NET Core, both WebAPI and content server)
  • Document Searching: Apache Lucene.NET
  • Query Search Syntax: csly
  • CSV: CsvHelper
  • JSON: Newtonsoft.Json

Aims

  • Run in RAM on a cheap (under $20/month) droplet
  • No expectation of scaling up for a large number of users
  • Expected corpus size is unlikely to exceed 10 million words of Manx (and 10 million words of English)
  • Stateless

Deployment

Deployable on a $5 DigitalOcean droplet. See the GitHub Actions workflows.

Analytics

Server requirements

  • git
  • dotnet-sdk-6.0
  • TODO