# Triadic Closure Study

The `data` folder contains two files, `edges.txt` and `mini_edges.txt`. Both are text files in which each row is a user and friend pair. I am going to build a MapReduce job that makes friend suggestions based on mutual friends.

__Investigate data in `mini_edges.txt`__

```Python
with open('data/mini_edges.txt') as f:
    for row in f:
        # each row already ends with a newline, so suppress print's own
        print(row, end='')
```

```
0 1
0 2
0 5
1 3
1 4
2 3
2 4
3 4
4 5
```


## 1.  Write a MapReduce job that returns each user's list of friends

__friends.py__

```Python
from mrjob.job import MRJob

class friends_list(MRJob):

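    def mapper(self, _, line):
        # Each line is "user friend"; emit the edge in both directions
        # so that every user is keyed with all of their friends.
        f1, f2 = (int(x) for x in line.split())
        yield (f1, f2)
        yield (f2, f1)

    def reducer(self, key, values):
        # Collect every friend emitted for this user into one list.
        yield (key, list(values))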

if __name__ == '__main__':
    friends_list.run()
```

Run the file in the terminal and observe the results.
```bash

python friends.py ../data/mini_edges.txt

```
The results look like this:

```
0   [1, 2, 5]
1   [0, 3, 4]
2   [0, 3, 4]
3   [1, 2, 4]
4   [1, 2, 3, 5]
5   [0, 4]
```
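
As a sanity check, the same map and reduce steps can be simulated in plain Python without mrjob. This is only a minimal sketch, assuming the nine edges printed earlier; `MINI_EDGES` and `friend_lists` are hypothetical names used for illustration and are not part of `friends.py`.

```python
from collections import defaultdict

# Edges copied from the mini_edges.txt listing (assumed to be the full file).
MINI_EDGES = """0 1
0 2
0 5
1 3
1 4
2 3
2 4
3 4
4 5"""


def friend_lists(lines):
    """Emulate the mapper/reducer pair: emit each edge both ways, then group by user."""
    grouped = defaultdict(list)
    for line in lines:
        f1, f2 = (int(x) for x in line.split())
        grouped[f1].append(f2)  # corresponds to: yield (f1, f2)
        grouped[f2].append(f1)  # corresponds to: yield (f2, f1)
    # Corresponds to the reducer; sorting is only for stable, readable output.
    return {user: sorted(fs) for user, fs in sorted(grouped.items())}


friends = friend_lists(MINI_EDGES.splitlines())
for user, f_list in friends.items():
    print(user, f_list)
```

The grouping dictionary plays the role of the shuffle phase, which in a real MapReduce run is handled by the framework between the mapper and the reducer.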

## 2. Modify the script to return friend pairs for each user

```Python
from mrjob.job import MRJob
from mrjob.step import MRStep
import itertools


class friends_list(MRJob):

    def mapper(self, _, line):
        clean = [int(x) for x in line.strip().split(' ')]
        f1 = clean[0]
        f2 = clean[1]
        yield (f1 ,f2)
        yield (f2, f1)

    def reducer(self, key, values):
        f_list = list(values)
        yield (key, f_list)

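    def mapper2(self, key, f_list):
        # Any two friends of the same user share that user as a mutual friend.
        # Sort first so each pair is always emitted in the same (low, high) order.
        pairs = itertools.combinations(sorted(f_list), 2)
        for pair in pairs:
            yield (pair, 1)

    def reducer2(self, key, values):
        # Total the 1s to get the number of mutual friends for the pair.
        yield (key, sum(values))

    def steps(self):
        return [
            MRStep(mapper=self.mapper, reducer=self.reducer),
            MRStep(mapper=self.mapper2, reducer=self.reducer2),
        ]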


if __name__ == '__main__':
    friends_list.run()
```

Run the file in the terminal and verify the results.
```bash

python friends.py ../data/mini_edges.txt

```

The results look like this:

```
[1, 2]	3
[1, 3]	1
[1, 4]	1
[1, 5]	2
[2, 3]	1
[2, 4]	1
[2, 5]	2
[0, 3]	2
[0, 4]	3
[3, 4]	2
[3, 5]	1
```
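
The counts above can be cross-checked in plain Python. The sketch below is an illustration rather than the job itself: it assumes the friend lists from step 1 (hard-coded as the hypothetical constant `FRIENDS`) and counts, for every pair of users, how many friend lists contain both of them.

```python
from collections import Counter
from itertools import combinations

# Friend lists as produced by step 1 for mini_edges.txt.
FRIENDS = {
    0: [1, 2, 5],
    1: [0, 3, 4],
    2: [0, 3, 4],
    3: [1, 2, 4],
    4: [1, 2, 3, 5],
    5: [0, 4],
}


def mutual_counts(friends):
    """Count, for each pair of users, how many users list both as friends."""
    counts = Counter()
    for f_list in friends.values():
        # Sorting keeps every pair in a consistent (low, high) order.
        for pair in combinations(sorted(f_list), 2):
            counts[pair] += 1
    return counts


counts = mutual_counts(FRIENDS)
for pair, n in sorted(counts.items()):
    print(list(pair), n)
```

Each entry of `counts` is the number of mutual friends for that pair, which is the value `reducer2` produces by summing the 1s.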

## 3. Remove pairs that are already friends

Modify the `mapper2` and `reducer2` functions to discard any pairs of users who are already friends.

```Python

def mapper2(self, key, f_list):
    # Sort so each candidate pair is emitted in a consistent (low, high) order.
    pairs = itertools.combinations(sorted(f_list), 2)
    for pair in pairs:
        yield (pair, 1)
    # Emit a 0 flag for every existing friendship so the reducer can detect it.
    for f in f_list:
        yield (key, f), 0

def reducer2(self, key, values):
    count = 0
    for flag in values:
        if flag == 0:
            # The pair is already friends: discard it.
            return
        count += flag
    yield (key, count)
```


Run the script in the terminal again and observe the results.
```bash

python friends.py ../data/mini_edges.txt

```

```
[3, 5]	1
[1, 2]	3
[1, 5]	2
[0, 3]	2
[0, 4]	3
[2, 5]	2
```

The output lists each potential friend pair together with the number of friends the two users have in common.
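
The 0 flag works as a marker: a pair that receives at least one 0 is an existing friendship and is dropped, while all other pairs keep their summed count. The following plain-Python sketch mimics that logic outside of mrjob; `FRIENDS` and `candidate_pairs` are hypothetical names, and the friend lists are assumed to be those from step 1.

```python
from collections import defaultdict
from itertools import combinations

# Friend lists as produced by step 1 for mini_edges.txt.
FRIENDS = {
    0: [1, 2, 5],
    1: [0, 3, 4],
    2: [0, 3, 4],
    3: [1, 2, 4],
    4: [1, 2, 3, 5],
    5: [0, 4],
}


def candidate_pairs(friends):
    """Return {pair: mutual friend count} for pairs that are not yet friends."""
    emitted = defaultdict(list)
    for user, f_list in friends.items():
        # Mapper part 1: every pair of this user's friends shares this user.
        for pair in combinations(sorted(f_list), 2):
            emitted[pair].append(1)
        # Mapper part 2: flag existing friendships with 0.
        for f in f_list:
            emitted[(user, f)].append(0)

    result = {}
    for pair, flags in emitted.items():
        # Reducer: a single 0 means the pair is already friends.
        if 0 in flags:
            continue
        result[pair] = sum(flags)
    return result


candidates = candidate_pairs(FRIENDS)
print(sorted(candidates.items()))
```

Because every friendship is flagged in both directions, the flag always reaches the key that uses the same (low, high) order as the candidate pairs.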

## 4. Make Friend Recommendations

Add another mapper and reducer, registered as a third `MRStep` in `steps()`, that make one friend recommendation for each user based on their mutual connections.

```Python

def mapper3(self, key, count):
    # Re-key each pair by user so both users see the other as a candidate.
    yield key[0], (key[1], count)
    yield key[1], (key[0], count)

def reducer3(self, key, values):
    # Recommend the candidate who shares the most mutual friends with this user.
    f2 = max(values, key=lambda x: x[1])[0]
    yield key, f2
    

```
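
To see what this step does on the small data set, here is a minimal plain-Python sketch. It assumes the candidate pairs and counts from section 3 (hard-coded as the hypothetical constant `CANDIDATES`). Unlike `reducer3`, which leaves ties to the order in which values arrive, this sketch breaks ties by choosing the smaller user ID; that rule is an assumption made here for reproducibility, not something the job specifies.

```python
from collections import defaultdict

# Candidate pairs and mutual friend counts from section 3.
CANDIDATES = {
    (1, 2): 3,
    (1, 5): 2,
    (2, 5): 2,
    (0, 3): 2,
    (0, 4): 3,
    (3, 5): 1,
}


def recommend(candidates):
    """Pick, for each user, the candidate with the most mutual friends."""
    per_user = defaultdict(list)
    for (a, b), count in candidates.items():
        # Same re-keying as mapper3: each user sees the other as a candidate.
        per_user[a].append((b, count))
        per_user[b].append((a, count))
    # Highest count wins; ties go to the smaller user ID (assumed rule).
    return {
        user: min(options, key=lambda o: (-o[1], o[0]))[0]
        for user, options in sorted(per_user.items())
    }


recommendations = recommend(CANDIDATES)
for user, suggestion in recommendations.items():
    print(user, suggestion)
```

User 5 is the only case where the tie rule matters here, since users 1 and 2 each share two friends with user 5.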

Run the script in the terminal and observe the results.
```bash

python friends.py ../data/mini_edges.txt

```

The excerpt below contains user IDs that do not occur in `mini_edges.txt`, so it appears to come from a run on the full `edges.txt` data.

```
...

2864	2877
2865	3255
2866	2694
2867	2966
2868	3291
2869	3434
287	228
2870	2951
2871	2951
2872	2927

...
```

## 5. Run on full set of data

I saved the results to the text file `friend_suggestions.txt` in the `print_outs` folder.

```bash
python friends.py ../data/edges.txt > print_outs/friend_suggestions.txt
```

Query the results in the terminal using `grep`:

```bash
grep -w '^2882' print_outs/friend_suggestions.txt
```

The result will be:
```
2882	2734
```
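
The same lookup can be done in Python. This is a sketch under the assumption that the job uses mrjob's default output format, in which each line holds a JSON-encoded key and value separated by a tab; `load_suggestions` is a hypothetical helper, and the two sample lines are taken from the output shown in this document.

```python
import json


def load_suggestions(lines):
    """Parse 'key<TAB>value' lines into a {user: suggested_friend} dictionary."""
    suggestions = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, value = line.split("\t", 1)
        suggestions[json.loads(key)] = json.loads(value)
    return suggestions


# Two lines copied from the results shown in this document.
sample = ["2882\t2734\n", "287\t228\n"]
suggestions = load_suggestions(sample)
print(suggestions[2882])

# To read the saved file instead (path assumed from the command above):
# with open("print_outs/friend_suggestions.txt") as f:
#     suggestions = load_suggestions(f)
```

A dictionary lookup matches the user ID exactly, so a query for `287` cannot accidentally return the rows for users such as `2870`.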