This repository was archived by the owner on Jan 24, 2019. It is now read-only.

sahid/scality

MAP REDUCE
==========

Introduction
------------

A modest implementation.

The framework is composed of four operations: 'inputify', 'map',
'reduce' and 'outputify'.

First, the 'inputify' operation takes any document and splits it
into chunks, each held in a 'struct input_split'. A set of 'struct
input_split' is a simple linked list.
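
The splitting step could look roughly like the sketch below. The
field names, the fixed chunk size and the function signatures are
assumptions made for illustration; the actual 'struct input_split'
in the source may be laid out differently.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout; the real 'struct input_split' may differ. */
struct input_split {
	char *data;			/* owned, NUL-terminated copy of one chunk */
	size_t len;			/* chunk length in bytes */
	struct input_split *next;	/* next chunk, NULL at the tail */
};

/* Release a whole list of splits. */
void free_splits(struct input_split *head)
{
	while (head) {
		struct input_split *next = head->next;

		free(head->data);
		free(head);
		head = next;
	}
}

/*
 * Split 'doc' into chunks of at most 'chunk' bytes.
 * Returns the list head, or NULL on failure or for an empty document.
 */
struct input_split *inputify(const char *doc, size_t chunk)
{
	struct input_split *head = NULL, **tail = &head;
	size_t left = strlen(doc);

	if (chunk == 0)
		return NULL;

	while (left > 0) {
		size_t n = left < chunk ? left : chunk;
		struct input_split *s = malloc(sizeof(*s));

		if (!s)
			goto err;
		s->data = malloc(n + 1);
		if (!s->data) {
			free(s);
			goto err;
		}
		memcpy(s->data, doc, n);
		s->data[n] = '\0';
		s->len = n;
		s->next = NULL;

		/* Append at the tail so chunks keep document order. */
		*tail = s;
		tail = &s->next;
		doc += n;
		left -= n;
	}
	return head;
err:
	free_splits(head);
	return NULL;
}
```

Splitting on a fixed byte count can cut a word in half; a real
'inputify' would more likely split on line or record boundaries.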

An internal process called 'distribute' is responsible for sharing
the inputs equally among the map processes.
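
One way to share a linked list equally is to count the nodes and cut
the list into contiguous runs whose lengths differ by at most one.
The sketch below does that; the node layout, the 'out' array and the
signature are assumptions, not the project's actual 'distribute'.

```c
#include <stddef.h>

/* Hypothetical minimal node; the payload fields are omitted. */
struct input_split {
	int id;
	struct input_split *next;
};

/*
 * Cut the list starting at 'head' into 'nr' sub-lists and store their
 * heads in 'out', which must have room for 'nr' pointers. Nodes are
 * relinked, not copied. The first (total % nr) sub-lists get one
 * extra node, so sizes differ by at most one.
 */
void distribute(struct input_split *head, struct input_split **out,
		size_t nr)
{
	struct input_split *s;
	size_t total = 0;
	size_t i;

	for (s = head; s; s = s->next)
		total++;

	for (i = 0; i < nr; i++) {
		size_t share = total / nr + (i < total % nr ? 1 : 0);

		out[i] = share ? head : NULL;
		while (share-- > 0) {
			s = head;
			head = head->next;
			if (share == 0)
				s->next = NULL;	/* end of this share */
		}
	}
}
```

Each map thread would then walk only the sub-list at its own index
in 'out', so no locking is needed while reading the inputs.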

The 'map' function is executed in parallel, according to the number
of threads requested. The key/value pairs produced by map should be
sent to the in-memory storage with 'emit'. The storage is
thread-safe and appends each value under its key.

  The in-memory storage concept

             ____ ____ ____ ____
  storage = | k1 | k2 | .. | kn |
             ‾|‾‾ ‾‾‾‾ ‾‾‾‾ ‾‾‾‾
             _|__
            | v1 |
             ‾|‾‾
             _|__
            | v2 |
             ‾|‾‾
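
The storage pictured above, an array of keys with a chain of values
hanging off each one, could be sketched as follows. The type names,
the fixed capacity, the 'int' value type and the signatures are all
assumptions for illustration. The single mutex and the linear key
scan are deliberate, to mirror the behaviour described in this
README rather than to suggest a better design.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define MAX_KEYS 64	/* arbitrary capacity chosen for this sketch */

struct value {
	int v;
	struct value *next;
};

struct entry {
	const char *key;	/* not copied; must outlive the storage */
	struct value *head;	/* first value emitted for this key */
	struct value *tail;	/* last value, for O(1) append */
};

struct storage {
	pthread_mutex_t lock;	/* one lock for the whole storage */
	size_t nr;		/* number of keys in use */
	struct entry entries[MAX_KEYS];
};

int storage_init(struct storage *st)
{
	st->nr = 0;
	return pthread_mutex_init(&st->lock, NULL);
}

/* Append 'v' under 'key'. Returns 0 on success, -1 on failure. */
int emit(struct storage *st, const char *key, int v)
{
	struct value *val = malloc(sizeof(*val));
	struct entry *e = NULL;
	size_t i;

	if (!val)
		return -1;
	val->v = v;
	val->next = NULL;

	pthread_mutex_lock(&st->lock);
	/* Linear scan over the key array: O(n) per lookup. */
	for (i = 0; i < st->nr; i++) {
		if (strcmp(st->entries[i].key, key) == 0) {
			e = &st->entries[i];
			break;
		}
	}
	if (!e && st->nr < MAX_KEYS) {
		e = &st->entries[st->nr++];
		e->key = key;
		e->head = NULL;
		e->tail = NULL;
	}
	if (e) {
		if (e->tail)
			e->tail->next = val;
		else
			e->head = val;
		e->tail = val;
	}
	pthread_mutex_unlock(&st->lock);

	if (!e) {
		free(val);	/* storage full */
		return -1;
	}
	return 0;
}
```

A map function would call 'emit' once per key/value pair it
produces; because every call takes the same lock, concurrent mappers
are serialized at this point.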


The 'reduce' process receives a reference to the storage, computes
the result and returns a user-defined data structure, which is then
passed to the 'outputify' function, responsible for handling the
result.
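
As an illustration of that hand-off, the sketch below sums the
values stored under each key and prints one line per key. The
'struct result' type, the simplified storage layout and both
signatures are hypothetical; the framework only requires that
'reduce' return some user-defined structure that 'outputify'
understands.

```c
#include <stdio.h>
#include <stdlib.h>

/* Simplified, read-only view of the storage for this sketch. */
struct value {
	int v;
	struct value *next;
};

struct entry {
	const char *key;
	struct value *head;
};

struct storage {
	size_t nr;
	struct entry *entries;
};

/* User-defined result: one total per key. */
struct result {
	size_t nr;
	const char **keys;	/* borrowed from the storage */
	long *totals;
};

void result_free(struct result *res)
{
	if (!res)
		return;
	free(res->keys);
	free(res->totals);
	free(res);
}

/* Sum the values of every key. Returns NULL on allocation failure. */
struct result *reduce(const struct storage *st)
{
	struct result *res = malloc(sizeof(*res));
	size_t i;

	if (!res)
		return NULL;
	res->nr = st->nr;
	/* +1 so that an empty storage never asks calloc for 0 items */
	res->keys = calloc(st->nr + 1, sizeof(*res->keys));
	res->totals = calloc(st->nr + 1, sizeof(*res->totals));
	if (!res->keys || !res->totals) {
		result_free(res);
		return NULL;
	}

	for (i = 0; i < st->nr; i++) {
		const struct value *val;

		res->keys[i] = st->entries[i].key;
		for (val = st->entries[i].head; val; val = val->next)
			res->totals[i] += val->v;
	}
	return res;
}

/* Write one "key<TAB>total" line per key to 'out'. */
void outputify(const struct result *res, FILE *out)
{
	size_t i;

	for (i = 0; i < res->nr; i++)
		fprintf(out, "%s\t%ld\n", res->keys[i], res->totals[i]);
}
```

With a map function that emits the value 1 for every word it sees,
this pair of functions would amount to a word count.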


Hacking note
------------

Trying to follow:

  https://www.kernel.org/doc/html/v4.10/process/coding-style.html


Known issues
------------

- The in-memory storage is based on an array and takes O(n) to find
  a key
- The emit is globally locked by a mutex
- This was written in a few days, so the design may not be optimal
  at all :/
- ...

  
