<a href="https://colab.research.google.com/github/Praxis-QR/BDSN/blob/main/Basic_WordCount_Concept.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

![alt text](https://4.bp.blogspot.com/-gbL5nZDkpFQ/XScFYwoTEII/AAAAAAAAAGY/CcVb_HDLwvs2Brv5T4vSsUcz7O4r2Q79ACK4BGAYYCw/s1600/kk3-header00-beta.png)<br>


<hr>

[Prithwis Mukerjee](http://www.linkedin.com/in/prithwis)<br>

# Map Reduce Concept
Here we demonstrate the principles of Map-Reduce without needing complex software such as Hadoop or Spark: a mapper script, the Unix `sort` command, and a reducer script are chained together with shell pipes.
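
As a rough in-memory picture of the same idea, the three phases can be sketched in plain Python. The function names `map_phase` and `reduce_phase` are illustrative assumptions and not part of the notebook; the scripts in the following cells do the same work over STDIN and STDOUT.

```python
from itertools import groupby
from operator import itemgetter


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.split():
            yield word, 1


def reduce_phase(sorted_pairs):
    """Sum the counts of adjacent pairs that share the same word."""
    for word, group in groupby(sorted_pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


text = ["the king beneath the mountain the king of carven stone"]
mapped = list(map_phase(text))           # MAP: one (word, 1) pair per word
shuffled = sorted(mapped)                # SORT: bring equal words together
counts = list(reduce_phase(shuffled))    # REDUCE: add up each group

for word, count in counts:
    print(word, count)
```

The sort in the middle matters: `groupby` only merges neighbouring items, just as the reducer script below only compares each word with the previous one.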

## Create the map.py and red.py (reduce) files

In [1]:
%%writefile map.py
#!/usr/bin/env python

import sys

# input comes from STDIN (standard input)
for line in sys.stdin:
    # remove leading and trailing whitespace
    line = line.strip()
    # split the line into words
    words = line.split()
    # emit one (word, 1) pair per word
    for word in words:
        # write the results to STDOUT (standard output);
        # what we output here will be the input for the
        # Reduce step, i.e. the input for red.py
        #
        # space-delimited; the trivial word count is 1
        print(word, 1)

Writing map.py


In [2]:
%%writefile red.py
#!/usr/bin/env python

import sys

current_word = None
current_count = 0
word = None

# input comes from STDIN
for line in sys.stdin:
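    # remove leading and trailing whitespace
    line = line.strip()
    # skip blank lines so the unpacking below cannot fail
    if not line:
        continue

    # parse the input we got from map.py (word and count, space-delimited)
    word, count = line.split(' ')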
    # convert count (currently a string) to int
    count = int(count)

    # this IF-switch only works because the map output is sorted
    # by key (here: word) before it reaches the reducer; Hadoop does
    # this in its shuffle phase, here the Unix sort command does it
    if current_word == word:
        current_count += count
    else:
        if current_word:
            # write result to STDOUT
            print (current_word, current_count)
        current_count = count
        current_word = word

# do not forget to output the last word;
# the None check avoids printing "None 0" when the input is empty
if current_word is not None:
    print(current_word, current_count)

Writing red.py
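
To see why the reducer depends on sorted input, the sketch below re-implements the same comparison logic as a function and feeds it the same pairs unsorted and then sorted. The name `stream_reduce` is hypothetical and introduced only for this illustration; it is not part of `red.py`.

```python
def stream_reduce(pairs):
    """Mimic red.py: merge counts only for adjacent, identical words."""
    results = []
    current_word, current_count = None, 0
    for word, count in pairs:
        if word == current_word:
            current_count += count
        else:
            # a new word starts, so flush the previous one
            if current_word is not None:
                results.append((current_word, current_count))
            current_word, current_count = word, count
    # flush the last word, if there was any input at all
    if current_word is not None:
        results.append((current_word, current_count))
    return results


pairs = [("the", 1), ("king", 1), ("the", 1), ("king", 1)]

unsorted_result = stream_reduce(pairs)
sorted_result = stream_reduce(sorted(pairs))

print(unsorted_result)  # equal words are not adjacent, so nothing is merged
print(sorted_result)    # equal words are adjacent, so their counts are summed
```

Because the reducer keeps only the current word and its running count, it needs very little memory, but it is correct only when all occurrences of a word arrive one after another.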


In [3]:
!ls -al
# Removing carriage returns ... not required anymore
#!sed 's/\r$//' map.py > map1.py 
#!sed 's/\r$//' red.py > red1.py

total 24
drwxr-xr-x 1 root root 4096 Oct  7 04:47 .
drwxr-xr-x 1 root root 4096 Oct  7 04:45 ..
drwxr-xr-x 4 root root 4096 Oct  5 13:34 .config
-rw-r--r-- 1 root root  526 Oct  7 04:47 map.py
-rw-r--r-- 1 root root  871 Oct  7 04:47 red.py
drwxr-xr-x 1 root root 4096 Oct  5 13:35 sample_data


In [4]:
# make files executable
!chmod u+rwx /content/map.py
!chmod u+rwx /content/red.py
!ls -al

total 24
drwxr-xr-x 1 root root 4096 Oct  7 04:47 .
drwxr-xr-x 1 root root 4096 Oct  7 04:45 ..
drwxr-xr-x 4 root root 4096 Oct  5 13:34 .config
-rwxr--r-- 1 root root  526 Oct  7 04:47 map.py
-rwxr--r-- 1 root root  871 Oct  7 04:47 red.py
drwxr-xr-x 1 root root 4096 Oct  5 13:35 sample_data


## 3 steps > MAP > SORT > REDUCE

In [5]:
# Step 0 > just PRINT the input
!echo "the king beneath the mountain the king of carven stone"

the king beneath the mountain the king of carven stone


In [6]:
# Step 1 > MAP the data
!echo "the king beneath the mountain the king of carven stone" | ./map.py

the 1
king 1
beneath 1
the 1
mountain 1
the 1
king 1
of 1
carven 1
stone 1


In [7]:
# Step 2 > SORT the MAP output
!echo "the king beneath the mountain the king of carven stone" | ./map.py | sort

beneath 1
carven 1
king 1
king 1
mountain 1
of 1
stone 1
the 1
the 1
the 1


In [8]:
# Step 3 > REDUCE the SORT output
!echo "the king beneath the mountain the king of carven stone" | ./map.py | sort | ./red.py

beneath 1
carven 1
king 2
mountain 1
of 1
stone 1
the 3
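
As a cross-check on the pipeline output above, the same counts can be computed in a single process with `collections.Counter` from the standard library. This is only a sanity check, assuming the input fits in memory; it is not how Map-Reduce distributes the work.

```python
from collections import Counter

sentence = "the king beneath the mountain the king of carven stone"

# Counter tallies every word in one pass, with no separate sort step
expected = Counter(sentence.split())

# print in sorted word order to make comparison with the pipeline easy
for word in sorted(expected):
    print(word, expected[word])
```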


## Use an input file

In [9]:
%%writefile hobbit.txt
The King beneath the mountains
The King of carven stone
The lord of silver fountains
Shall come into his own
His crown shall be upholding
His harp shall be restrung
His halls shall echo golden
To songs of yore re-sung
The woods shall wave on mountains
And grass beneath the sun
His wealth shall flow in fountains
And the rivers golden run
The streams shall run in gladness
The lakes shall shine and burn
All sorrow fail and sadness
At the mountain kings return
The King beneath the mountains
The King of carven stone
The lord of silver fountains
Shall come into his own
His crown shall be upholding
His harp shall be restrung
His halls shall echo golden
To songs of yore re-sung
The woods shall wave on mountains
And grass beneath the sun
His wealth shall flow in fountains
And the rivers golden run
The streams shall run in gladness
The lakes shall shine and burn
All sorrow fail and sadness
At the mountain kings return
The King beneath the mountains
The King of carven stone
The lord of silver fountains
Shall come into his own
His crown shall be upholding
His harp shall be restrung
His halls shall echo golden
To songs of yore re-sung
The woods shall wave on mountains
And grass beneath the sun
His wealth shall flow in fountains
And the rivers golden run
The streams shall run in gladness
The lakes shall shine and burn
All sorrow fail and sadness
At the mountain kings return
All sorrow fail and sadness
At the mountain kings return

Writing hobbit.txt


In [None]:
#!cat hobbit.txt

In [10]:
!cat hobbit.txt | ./map.py | sort | ./red.py

All 4
and 7
And 6
At 4
be 6
beneath 6
burn 3
carven 3
come 3
crown 3
echo 3
fail 4
flow 3
fountains 6
gladness 3
golden 6
grass 3
halls 3
harp 3
his 3
His 12
in 6
into 3
King 6
kings 4
lakes 3
lord 3
mountain 4
mountains 6
of 9
on 3
own 3
restrung 3
re-sung 3
return 4
rivers 3
run 6
sadness 4
shall 21
Shall 3
shine 3
silver 3
songs 3
sorrow 4
stone 3
streams 3
sun 3
the 13
The 18
To 3
upholding 3
wave 3
wealth 3
woods 3
yore 3
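
The listing above counts `The` and `the` (and likewise `His` and `his`) as different words, because the mapper emits each token exactly as it appears. If case-insensitive counts are wanted, one possible variation is to lower-case each word in the map step. The sketch below is an assumption about how that could look, written as a function with the hypothetical name `map_line` rather than as a replacement for `map.py`.

```python
def map_line(line):
    """Return (word, 1) pairs with every word folded to lower case."""
    return [(word.lower(), 1) for word in line.split()]


pairs = map_line("The King beneath the mountains")
print(pairs)
```

In `map.py` the equivalent change would be to print `word.lower()` instead of `word`; the separate rows for `the` (13) and `The` (18) would then be summed into a single row.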


In [14]:
from datetime import datetime
import pytz
print('signed off at  ',datetime.now(pytz.timezone('Asia/Kolkata')))

signed off at   2022-10-07 10:20:02.709225+05:30

