WordCount

Semester project for the course B6B36PJC at FEE CTU, winter semester 2019.

Task

Word Count

  • A typical problem that is simple to parallelize.
  • The input is a set of files; the output is a sorted list of words and their frequencies.
  • In addition to single words, n-grams (sequences of n consecutive words) can be counted.

(Translated from the Czech original)
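
The task above can be sketched in a few lines of C++14. This is a minimal illustration, not the code in this repository: the helper names `tokenize`, `count_ngrams`, and `sorted_by_frequency` are hypothetical, and the sketch assumes ASCII text, lowercases all words, and treats any non-alphanumeric character as a separator.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: split text into lowercase words. Any character that
// is not an ASCII letter or digit acts as a separator (assumption).
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> words;
    std::string current;
    for (unsigned char c : text) {
        if (std::isalnum(c)) {
            current += static_cast<char>(std::tolower(c));
        } else if (!current.empty()) {
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) words.push_back(current);
    return words;
}

// Count n-grams: n consecutive words joined by a single space.
// With n == 1 this is a plain word count.
std::map<std::string, std::size_t> count_ngrams(
        const std::vector<std::string>& words, std::size_t n) {
    assert(n >= 1);
    std::map<std::string, std::size_t> counts;
    for (std::size_t i = 0; i + n <= words.size(); ++i) {
        std::string gram = words[i];
        for (std::size_t j = 1; j < n; ++j) {
            gram += ' ';
            gram += words[i + j];
        }
        ++counts[gram];
    }
    return counts;
}

// Produce the ordered listing: most frequent first. std::stable_sort keeps
// the map's alphabetical order among entries with equal frequency.
std::vector<std::pair<std::string, std::size_t>> sorted_by_frequency(
        const std::map<std::string, std::size_t>& counts) {
    std::vector<std::pair<std::string, std::size_t>> out(counts.begin(),
                                                         counts.end());
    std::stable_sort(out.begin(), out.end(),
                     [](const auto& a, const auto& b) {
                         return a.second > b.second;
                     });
    return out;
}
```

For example, calling `sorted_by_frequency(count_ngrams(tokenize(text), 2))` would yield the bigrams of `text` ordered by how often they occur.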

Implementation

Overview

  • Written in: C++14
  • Supports multiple threads (one thread per input file)
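
The one-thread-per-file model could look roughly like the sketch below. It is an assumption about the general approach, not the repository's actual implementation: `count_file`, `count_files_parallel`, and the `Counts` alias are hypothetical names, and words are split on whitespace only to keep the example short.

```cpp
#include <cstddef>
#include <fstream>
#include <map>
#include <string>
#include <thread>
#include <vector>

using Counts = std::map<std::string, std::size_t>;

// Count whitespace-separated words in a single file. A file that cannot
// be opened simply yields an empty result.
Counts count_file(const std::string& path) {
    Counts counts;
    std::ifstream in(path);
    std::string word;
    while (in >> word) ++counts[word];
    return counts;
}

// Launch one thread per input file. Each thread writes only to its own
// slot in `partial`, so no mutex is needed; the per-file results are
// merged on the calling thread after every worker has been joined.
Counts count_files_parallel(const std::vector<std::string>& paths) {
    std::vector<Counts> partial(paths.size());
    std::vector<std::thread> workers;
    workers.reserve(paths.size());
    for (std::size_t i = 0; i < paths.size(); ++i) {
        workers.emplace_back([&partial, &paths, i] {
            partial[i] = count_file(paths[i]);
        });
    }
    for (auto& t : workers) t.join();

    Counts total;
    for (const auto& file_counts : partial) {
        for (const auto& kv : file_counts) total[kv.first] += kv.second;
    }
    return total;
}
```

Giving each thread a private map avoids locking during counting, at the cost of a sequential merge step at the end. On some toolchains the sketch needs to be compiled with `-pthread`.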

Build

git clone https://github.com/Baterka/WordCount.git
cd WordCount
mkdir Release
cd Release
cmake -DCMAKE_BUILD_TYPE=Release ..
make

Usage

  • WordCount - Creates a dictionary of the words (or n-grams) found in the input files and counts their frequencies.
  • Generator - A utility that generates multiple files filled with random words.
  • Tests - Unit tests (work in progress)