# Lab - Investigating the performance advantages of NumPy arrays

The purpose of this lab is to measure how much more efficient bulk, vectorized operations on NumPy arrays can be than standard, dictionary-oriented Python computations.

In this lab, your task is to compare three different solutions to the co-occurrence problem, a fundamental computation in text analytics.

For this lab, our corpus consists of 50 documents (docId from 0 to 49) generated from 2000 words (wordId from 0 to 1999) by an LDA (latent Dirichlet allocation) process.

For each possible (word, word) pair, we want to compute the number of documents (a value between zero and 50) that contain both words of the pair. That is, we compute the co-occurrence of each possible word pair in the corpus of documents.

For example, let’s say our documents are:
```
doc 1: [word1, word2, word4, word5]
doc 2: [word1, word2, word5]
doc 3: [word2, word3, word5]
```
Then the result of the co-occurrence computation is:
```
{word1, word1}: 2 co-occurs (meaning that word1 occurs in two documents in the corpus)
{word1, word2}: 2 co-occurs
{word1, word4}: 1 co-occurs
{word1, word5}: 2 co-occurs
{word2, word2}: 3 co-occurs
{word2, word3}: 1 co-occurs
{word2, word4}: 1 co-occurs
{word2, word5}: 3 co-occurs
{word3, word3}: 1 co-occurs
{word3, word5}: 1 co-occurs
{word4, word4}: 1 co-occurs
{word4, word5}: 1 co-occurs
{word5, word5}: 3 co-occurs
```
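
The counts in the example above can be reproduced with a few lines of Python. This is only a minimal sketch on the toy corpus: the names `docs` and `co_occurs` are illustrative, and each unordered pair is stored once, which is one of several acceptable layouts.

```python
from collections import Counter
from itertools import combinations_with_replacement

# Toy corpus from the example; each document is reduced to its set of words.
docs = {
    1: {"word1", "word2", "word4", "word5"},
    2: {"word1", "word2", "word5"},
    3: {"word2", "word3", "word5"},
}

co_occurs = Counter()
for words in docs.values():
    # Count every unordered pair once per document, including (w, w) pairs.
    for pair in combinations_with_replacement(sorted(words), 2):
        co_occurs[pair] += 1

for pair, count in sorted(co_occurs.items()):
    print(pair, count)
```

Pairs that never appear together, such as (word1, word3), are simply absent from `co_occurs`.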

For this lab, you need to complete the 3 tasks below, to compute co-occurrences in three different ways. 

## Task 1 - compute co-occurrences from a Python dictionary (4 pts)
First, run the pure, dictionary-based LDA implementation provided below to build a document corpus (this is similar to the approach we used in the last NumPy array lab). This builds the `wordsInCorpus` object, which is a Python dictionary. Each key is a document identifier (0-49), and each value is another dictionary. In the dictionary associated with a particular document identifier, each key is a word identifier (0-1999), and each value is the number of occurrences of that word in the document. Here is the structure of the nested dictionary `wordsInCorpus`:

```
{ docId0 : { wordId# : # of occurrences of wordId# in docId0,
           
             ...
             
            },
            
  docId1 : { wordId# : # of occurrences of wordId# in docId1,
             
             ...
             
            }, 
  .
  .
  .
  
  docId49 : { wordId# : # of occurrences of wordId# in docId49,
              
              ...
              
            }
}

```

If a word identifier (0 to 1999) is **not present** in a document's dictionary, that word does not occur in the document.
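
As a small illustration of this nested layout, the sketch below looks up counts in a hypothetical miniature corpus (`mini_corpus` and `count_in_doc` are made-up names, not part of the lab code). Using `dict.get` with a default of 0 handles word identifiers that are absent from a document.

```python
# Hypothetical miniature corpus with the same nested layout as wordsInCorpus:
# {docId: {wordId: number of occurrences}}
mini_corpus = {
    0: {3: 2, 7: 1},
    1: {7: 4},
}

def count_in_doc(corpus, doc_id, word_id):
    """Return how often word_id occurs in doc_id, or 0 if it is absent."""
    return corpus[doc_id].get(word_id, 0)

print(count_in_doc(mini_corpus, 0, 3))  # word 3 is present in document 0
print(count_in_doc(mini_corpus, 1, 3))  # word 3 is absent from document 1
print(3 in mini_corpus[1])              # membership test on the inner dictionary
```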

In the first part of this lab, given a corpus of documents as a Python dictionary, your task is to compute the co-occurrence of each possible word pair in the corpus. For this task, you must structure the co-occurrences as a Python dictionary whose keys are (wordId, wordId) tuples and whose values are the number of documents containing both words.

You must then time the execution of your Python dictionary-based computation. 
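
One possible way to time a computation is with `time.perf_counter` from the standard library. The sketch below uses a hypothetical helper named `timed` and a stand-in workload (`sum` over a range); in the lab you would time your own co-occurrence computation instead.

```python
import time

def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed

# Stand-in workload, used only to show the pattern.
result, elapsed = timed(sum, range(1_000_000))
print(f"took {elapsed:.4f} seconds")
```

Measuring only the computation itself, and not the corpus generation or any printing, keeps the comparison between the three solutions fair.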

Some notes on your `coOccurrences` dictionary:
* You can omit pairs that do not occur in any document from your `coOccurrences` dictionary.
* The co-occurrence of word_i and word_j can appear twice in your `coOccurrences` dictionary, once as the (word_i, word_j) pair and once as the (word_j, word_i) pair.
* Your `coOccurrences` dictionary can include (word_i, word_i) pairs. 
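
The notes above can be turned into a quick sanity check on a result. The following is a sketch only: `find_problems` is a hypothetical helper, and the rules it checks (2-tuple keys, counts between 1 and the number of documents, equal counts for mirrored pairs when both are present) are read off the description in this task.

```python
def find_problems(co_occurrences, num_docs):
    """Return a list of messages describing entries that look wrong."""
    problems = []
    for key, count in co_occurrences.items():
        if not (isinstance(key, tuple) and len(key) == 2):
            problems.append(f"{key!r}: key is not a (wordId, wordId) tuple")
            continue
        # Pairs that occur in no document are omitted, so counts start at 1.
        if not 1 <= count <= num_docs:
            problems.append(f"{key!r}: count {count} is outside 1..{num_docs}")
        # If both orderings are stored, they must agree.
        mirrored = (key[1], key[0])
        if mirrored in co_occurrences and co_occurrences[mirrored] != count:
            problems.append(f"{key!r}: count differs from {mirrored!r}")
    return problems

# Hand-written entries taken from the three-document example (words 1 and 2).
example = {(1, 1): 2, (1, 2): 2, (2, 1): 2, (2, 2): 3}
print(find_problems(example, num_docs=3))  # prints []
```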

In [1]:
import numpy as np
import pprint 
import time

### Dictionary-based LDA

In [2]:
# use the next line to produce the same results every time
np.random.seed(553)

# this returns an index i, chosen with probability p[i]
def sampleValue (p):
    return np.flatnonzero (np.random.multinomial (1, p, 1))[0]
 
# there are 2000 words in the corpus
alpha = np.full (2000, .1)
 
# there are 100 topics
beta = np.full (100, .1)
 
# this gets us the probability of each word happening in each of the 100 topics
wordsInTopic = np.random.dirichlet (alpha, 100)
# wordsInCorpus[i] will be a dictionary that gives us the number of each word in the document
wordsInCorpus = {}
 
# generate each doc
for doc in range (0, 50):
    #
    # no words in this doc yet
    wordsInDoc = {}
    #
    # get the topic probabilities for this doc
    topicsInDoc = np.random.dirichlet (beta)
    #
    # generate each of the 2000 words in this document
    for word in range (0, 2000):
        #
        # select the topic and the word
        whichTopic = sampleValue (topicsInDoc)
        whichWord = sampleValue (wordsInTopic[whichTopic])
        #
        # and record the word
        wordsInDoc [whichWord] = wordsInDoc.get (whichWord, 0) + 1
        #
    # now, remember this document
    wordsInCorpus [doc] = wordsInDoc
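
To see what the `sampleValue` helper in the cell above does, the sketch below repeats its definition and draws many samples from a small, made-up probability vector. The seed, the vector `p`, and the number of draws are arbitrary choices for illustration and are not part of the lab setup.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, used only for this illustration

def sampleValue(p):
    # One multinomial trial yields a one-hot row; flatnonzero finds the index of the 1.
    return np.flatnonzero(np.random.multinomial(1, p, 1))[0]

p = np.array([0.1, 0.2, 0.7])  # made-up distribution over three outcomes
draws = [sampleValue(p) for _ in range(10_000)]

# The empirical frequencies should be close to p.
freqs = np.bincount(draws, minlength=len(p)) / len(draws)
print(freqs)
```

In the generator above, the same helper is applied first to the topic probabilities of a document and then to the word probabilities of the chosen topic.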

In [3]:
pprint.pprint(wordsInCorpus)

{0: {0: 1,
     4: 2,
     7: 1,
     10: 1,
     12: 1,
     15: 2,
     16: 4,
     17: 1,
     19: 3,
     20: 3,
     21: 1,
     22: 1,
     23: 1,
     24: 2,
     26: 2,
     27: 1,
     28: 1,
     30: 3,
     32: 2,
     33: 1,
     34: 2,
     35: 2,
     36: 1,
     37: 3,
     42: 1,
     43: 2,
     44: 2,
     45: 1,
     46: 1,
     48: 1,
     50: 2,
     54: 1,
     55: 7,
     56: 3,
     58: 4,
     59: 2,
     60: 2,
     62: 2,
     64: 2,
     65: 2,
     66: 1,
     67: 2,
     69: 1,
     71: 3,
     72: 6,
     73: 3,
     74: 1,
     77: 2,
     78: 1,
     85: 2,
     87: 3,
     89: 1,
     91: 2,
     93: 2,
     95: 4,
     96: 1,
     97: 1,
     101: 1,
     102: 2,
     106: 2,
     108: 2,
     109: 1,
     111: 2,
     114: 1,
     115: 2,
     116: 1,
     118: 1,
     119: 5,
     120: 3,
     121: 1,
     122: 3,
     132: 1,
     134: 4,
     136: 1,
     137: 1,
     138: 1,
     140: 2,
     142: 1,
     143: 4,
     145: 1,
     146: 2,
     ...
      

      1789: 3,
      1792: 2,
      1793: 1,
      1795: 3,
      1799: 2,
      1800: 2,
      1802: 1,
      1803: 2,
      1805: 1,
      1807: 1,
      1808: 1,
      1811: 1,
      1814: 2,
      1815: 4,
      1816: 1,
      1824: 4,
      1826: 1,
      1827: 3,
      1828: 2,
      1831: 1,
      1832: 2,
      1834: 1,
      1835: 2,
      1838: 1,
      1840: 2,
      1845: 1,
      1847: 1,
      1852: 2,
      1853: 1,
      1854: 2,
      1857: 4,
      1858: 1,
      1865: 1,
      1866: 1,
      1868: 2,
      1870: 1,
      1871: 1,
      1872: 1,
      1874: 1,
      1875: 1,
      1876: 1,
      1878: 3,
      1879: 1,
      1880: 2,
      1881: 2,
      1882: 1,
      1883: 1,
      1885: 1,
      1886: 2,
      1887: 1,
      1888: 3,
      1890: 2,
      1896: 2,
      1900: 1,
      1901: 2,
      1903: 2,
      1906: 2,
      1908: 4,
      1909: 2,
      1910: 1,
      1911: 3,
      1913: 1,
      1915: 4,
      1916: 2,
      1917: 1,
      1918: 1,
      1919

      96: 1,
      97: 8,
      99: 1,
      100: 2,
      101: 1,
      103: 4,
      107: 2,
      108: 1,
      109: 1,
      110: 2,
      112: 1,
      113: 1,
      114: 13,
      116: 2,
      118: 1,
      119: 2,
      120: 2,
      122: 2,
      123: 2,
      124: 8,
      128: 1,
      132: 2,
      133: 2,
      136: 1,
      138: 1,
      141: 1,
      143: 2,
      145: 6,
      146: 2,
      147: 3,
      148: 2,
      149: 1,
      151: 2,
      152: 1,
      155: 1,
      156: 2,
      157: 1,
      158: 4,
      163: 1,
      164: 1,
      166: 2,
      167: 2,
      168: 4,
      169: 1,
      170: 2,
      171: 1,
      175: 3,
      176: 1,
      179: 1,
      182: 1,
      183: 5,
      184: 1,
      185: 1,
      186: 3,
      189: 3,
      194: 2,
      195: 1,
      198: 1,
      200: 1,
      201: 2,
      203: 1,
      205: 1,
      206: 2,
      207: 2,
      209: 1,
      210: 2,
      211: 2,
      212: 1,
      213: 1,
      214: 1,
      215: 1,
      21

      1094: 2,
      1097: 2,
      1099: 1,
      1101: 2,
      1104: 3,
      1105: 1,
      1106: 2,
      1108: 1,
      1110: 1,
      1111: 2,
      1114: 4,
      1115: 6,
      1116: 3,
      1117: 1,
      1119: 1,
      1120: 1,
      1121: 1,
      1125: 2,
      1126: 1,
      1128: 1,
      1129: 1,
      1132: 1,
      1133: 1,
      1134: 10,
      1137: 2,
      1138: 2,
      1140: 1,
      1141: 1,
      1142: 2,
      1146: 1,
      1147: 2,
      1148: 3,
      1150: 1,
      1151: 1,
      1154: 1,
      1159: 1,
      1163: 5,
      1164: 2,
      1168: 2,
      1169: 3,
      1171: 1,
      1172: 1,
      1173: 1,
      1174: 3,
      1176: 8,
      1177: 3,
      1178: 4,
      1179: 1,
      1182: 1,
      1185: 1,
      1188: 1,
      1194: 2,
      1197: 2,
      1200: 2,
      1202: 1,
      1204: 1,
      1205: 1,
      1206: 1,
      1207: 2,
      1209: 4,
      1210: 1,
      1212: 2,
      1214: 1,
      1216: 8,
      1217: 1,
      1218: 2,
      121

      1492: 3,
      1493: 1,
      1496: 3,
      1501: 1,
      1502: 1,
      1503: 10,
      1504: 1,
      1505: 1,
      1506: 2,
      1508: 1,
      1509: 1,
      1510: 5,
      1512: 1,
      1513: 4,
      1516: 1,
      1518: 1,
      1519: 2,
      1521: 1,
      1523: 3,
      1524: 2,
      1526: 2,
      1530: 1,
      1531: 1,
      1532: 1,
      1533: 1,
      1534: 5,
      1535: 4,
      1536: 2,
      1538: 1,
      1539: 1,
      1540: 2,
      1544: 2,
      1547: 2,
      1549: 3,
      1552: 1,
      1555: 1,
      1557: 2,
      1559: 1,
      1561: 2,
      1565: 1,
      1566: 1,
      1568: 5,
      1569: 5,
      1572: 1,
      1573: 4,
      1577: 5,
      1578: 3,
      1579: 4,
      1581: 9,
      1582: 2,
      1587: 1,
      1588: 1,
      1589: 3,
      1590: 1,
      1592: 1,
      1593: 2,
      1594: 2,
      1595: 1,
      1598: 2,
      1600: 1,
      1601: 4,
      1604: 2,
      1606: 1,
      1608: 1,
      1609: 1,
      1611: 1,
      161

      1712: 1,
      1714: 1,
      1716: 1,
      1717: 1,
      1718: 2,
      1720: 1,
      1722: 2,
      1724: 2,
      1725: 1,
      1727: 2,
      1733: 1,
      1734: 1,
      1735: 1,
      1736: 1,
      1738: 1,
      1739: 2,
      1741: 1,
      1742: 1,
      1743: 1,
      1744: 2,
      1745: 1,
      1747: 2,
      1749: 1,
      1750: 1,
      1751: 1,
      1752: 1,
      1753: 3,
      1754: 2,
      1756: 2,
      1758: 2,
      1760: 2,
      1762: 2,
      1763: 4,
      1766: 6,
      1769: 3,
      1772: 2,
      1774: 7,
      1775: 1,
      1776: 1,
      1777: 1,
      1782: 1,
      1784: 4,
      1785: 2,
      1788: 1,
      1790: 5,
      1791: 1,
      1792: 1,
      1793: 5,
      1794: 1,
      1797: 2,
      1798: 1,
      1800: 1,
      1801: 1,
      1803: 3,
      1806: 2,
      1812: 1,
      1813: 1,
      1814: 1,
      1815: 2,
      1817: 1,
      1818: 4,
      1819: 2,
      1820: 3,
      1821: 1,
      1823: 3,
      1824: 1,
      1825

      1: 1,
      2: 2,
      6: 2,
      8: 4,
      12: 1,
      13: 1,
      15: 3,
      17: 3,
      19: 5,
      20: 2,
      22: 2,
      23: 3,
      27: 3,
      28: 3,
      30: 1,
      31: 3,
      34: 2,
      35: 1,
      36: 1,
      37: 1,
      39: 1,
      40: 3,
      41: 1,
      43: 1,
      45: 1,
      46: 1,
      47: 1,
      48: 1,
      50: 2,
      51: 1,
      54: 1,
      55: 1,
      57: 2,
      58: 2,
      59: 1,
      61: 2,
      62: 2,
      63: 1,
      64: 4,
      65: 2,
      67: 2,
      68: 2,
      69: 1,
      72: 1,
      73: 1,
      74: 1,
      75: 1,
      76: 2,
      77: 2,
      80: 2,
      81: 1,
      82: 3,
      85: 2,
      88: 2,
      89: 2,
      95: 2,
      96: 3,
      98: 3,
      99: 5,
      100: 1,
      103: 5,
      106: 1,
      107: 1,
      108: 1,
      109: 1,
      110: 1,
      112: 1,
      113: 1,
      114: 2,
      115: 1,
      116: 1,
      119: 1,
      120: 1,
      122: 1,
      123: 3,
      124: 2,

      642: 3,
      644: 2,
      645: 1,
      647: 3,
      651: 1,
      653: 1,
      656: 2,
      659: 5,
      660: 1,
      662: 3,
      663: 1,
      664: 2,
      665: 2,
      668: 1,
      671: 3,
      672: 3,
      673: 1,
      676: 2,
      679: 1,
      680: 8,
      682: 9,
      683: 1,
      686: 2,
      692: 1,
      694: 1,
      695: 1,
      696: 3,
      703: 1,
      704: 1,
      705: 2,
      710: 3,
      711: 4,
      712: 5,
      713: 2,
      717: 2,
      719: 5,
      720: 1,
      723: 1,
      725: 1,
      728: 6,
      732: 1,
      733: 2,
      736: 3,
      739: 1,
      740: 2,
      741: 1,
      747: 2,
      748: 1,
      749: 2,
      750: 1,
      752: 2,
      756: 4,
      758: 2,
      759: 2,
      764: 2,
      765: 1,
      766: 1,
      767: 4,
      769: 1,
      770: 2,
      771: 4,
      772: 8,
      774: 2,
      775: 2,
      778: 1,
      780: 1,
      783: 1,
      786: 3,
      787: 2,
      790: 1,
      791: 1,
      

      822: 3,
      823: 1,
      826: 4,
      827: 1,
      828: 3,
      831: 1,
      832: 2,
      838: 1,
      839: 2,
      840: 3,
      847: 3,
      848: 1,
      850: 1,
      852: 1,
      853: 1,
      854: 2,
      858: 2,
      860: 1,
      861: 1,
      862: 2,
      864: 3,
      866: 1,
      871: 1,
      875: 1,
      879: 2,
      880: 2,
      883: 7,
      884: 2,
      885: 1,
      886: 2,
      887: 1,
      888: 1,
      890: 2,
      891: 1,
      895: 4,
      897: 2,
      904: 2,
      905: 2,
      906: 1,
      909: 2,
      911: 2,
      912: 1,
      915: 3,
      917: 3,
      919: 2,
      921: 1,
      924: 4,
      925: 1,
      927: 1,
      930: 1,
      931: 3,
      932: 1,
      936: 1,
      937: 1,
      938: 1,
      939: 1,
      940: 1,
      943: 1,
      944: 1,
      945: 1,
      946: 3,
      947: 2,
      949: 1,
      951: 1,
      954: 1,
      955: 1,
      956: 2,
      957: 1,
      958: 1,
      961: 2,
      963: 2,
      

      1677: 3,
      1679: 1,
      1681: 1,
      1683: 1,
      1684: 1,
      1685: 2,
      1686: 1,
      1688: 1,
      1691: 1,
      1692: 1,
      1693: 4,
      1694: 1,
      1695: 2,
      1696: 2,
      1701: 1,
      1704: 3,
      1709: 4,
      1710: 6,
      1711: 1,
      1715: 3,
      1716: 2,
      1718: 1,
      1719: 2,
      1728: 2,
      1729: 2,
      1731: 1,
      1732: 1,
      1735: 1,
      1736: 3,
      1738: 1,
      1739: 4,
      1740: 2,
      1742: 2,
      1743: 1,
      1744: 1,
      1746: 2,
      1747: 2,
      1748: 2,
      1751: 1,
      1752: 3,
      1753: 3,
      1756: 4,
      1759: 2,
      1760: 8,
      1763: 2,
      1764: 1,
      1766: 1,
      1768: 2,
      1769: 1,
      1770: 1,
      1773: 1,
      1774: 1,
      1775: 1,
      1776: 1,
      1778: 2,
      1779: 1,
      1781: 1,
      1782: 1,
      1783: 1,
      1786: 1,
      1788: 1,
      1790: 3,
      1793: 3,
      1797: 2,
      1798: 1,
      1799: 3,
      1800

      1795: 2,
      1797: 2,
      1798: 1,
      1799: 3,
      1802: 3,
      1804: 2,
      1805: 1,
      1806: 1,
      1807: 1,
      1808: 1,
      1809: 1,
      1810: 1,
      1812: 2,
      1815: 2,
      1818: 3,
      1819: 1,
      1820: 2,
      1821: 3,
      1823: 4,
      1824: 1,
      1825: 1,
      1826: 2,
      1827: 4,
      1828: 1,
      1830: 1,
      1831: 2,
      1832: 2,
      1833: 4,
      1834: 1,
      1835: 2,
      1836: 2,
      1837: 1,
      1838: 2,
      1839: 1,
      1840: 1,
      1841: 2,
      1843: 5,
      1844: 1,
      1846: 1,
      1847: 1,
      1850: 2,
      1851: 1,
      1852: 1,
      1853: 1,
      1858: 1,
      1859: 4,
      1860: 1,
      1861: 1,
      1863: 1,
      1864: 1,
      1865: 1,
      1866: 4,
      1868: 1,
      1869: 1,
      1870: 1,
      1871: 1,
      1877: 1,
      1881: 1,
      1883: 3,
      1885: 4,
      1886: 2,
      1887: 1,
      1888: 5,
      1889: 1,
      1890: 2,
      1891: 1,
      1895

      160: 1,
      163: 1,
      164: 1,
      166: 1,
      168: 1,
      172: 1,
      173: 3,
      174: 2,
      175: 1,
      176: 1,
      177: 2,
      179: 2,
      182: 2,
      183: 1,
      186: 4,
      188: 3,
      190: 1,
      196: 1,
      198: 1,
      199: 2,
      200: 1,
      201: 1,
      202: 2,
      204: 1,
      205: 4,
      210: 1,
      212: 1,
      213: 1,
      216: 2,
      217: 4,
      219: 1,
      220: 2,
      221: 1,
      224: 1,
      225: 4,
      226: 1,
      227: 1,
      229: 1,
      231: 2,
      232: 3,
      239: 2,
      240: 3,
      241: 3,
      243: 4,
      249: 1,
      251: 1,
      253: 1,
      255: 3,
      260: 1,
      261: 1,
      263: 1,
      267: 3,
      268: 3,
      270: 1,
      271: 4,
      272: 1,
      273: 4,
      274: 1,
      275: 1,
      276: 1,
      278: 2,
      279: 1,
      281: 1,
      283: 1,
      285: 1,
      287: 1,
      290: 1,
      294: 1,
      296: 1,
      297: 1,
      299: 3,
      

      273: 2,
      274: 1,
      278: 4,
      279: 4,
      280: 1,
      281: 8,
      282: 3,
      283: 1,
      284: 1,
      287: 2,
      289: 1,
      291: 1,
      292: 1,
      293: 2,
      295: 3,
      298: 1,
      300: 1,
      302: 1,
      304: 1,
      306: 2,
      307: 1,
      308: 2,
      311: 3,
      315: 2,
      317: 2,
      318: 2,
      319: 1,
      322: 3,
      323: 1,
      324: 1,
      326: 3,
      328: 4,
      330: 1,
      331: 1,
      332: 1,
      333: 1,
      334: 5,
      336: 1,
      338: 2,
      339: 2,
      340: 1,
      341: 1,
      342: 1,
      343: 1,
      344: 1,
      345: 2,
      347: 1,
      349: 1,
      351: 2,
      357: 2,
      358: 1,
      359: 1,
      364: 2,
      365: 3,
      366: 4,
      368: 4,
      371: 1,
      372: 1,
      375: 6,
      377: 1,
      378: 2,
      379: 2,
      380: 2,
      381: 3,
      385: 1,
      386: 2,
      388: 2,
      389: 3,
      391: 1,
      394: 1,
      395: 3,
      

      436: 1,
      441: 1,
      442: 1,
      444: 4,
      446: 1,
      448: 2,
      450: 2,
      451: 2,
      453: 2,
      454: 1,
      455: 1,
      459: 3,
      461: 3,
      463: 1,
      464: 1,
      467: 3,
      469: 1,
      473: 3,
      475: 1,
      478: 1,
      479: 1,
      480: 1,
      481: 1,
      482: 2,
      484: 1,
      485: 2,
      488: 7,
      491: 1,
      492: 2,
      494: 2,
      496: 3,
      497: 3,
      498: 1,
      499: 1,
      500: 2,
      501: 1,
      502: 1,
      503: 1,
      512: 1,
      517: 1,
      519: 1,
      520: 1,
      521: 1,
      522: 3,
      523: 1,
      524: 1,
      525: 4,
      527: 3,
      529: 2,
      530: 2,
      531: 5,
      535: 1,
      539: 4,
      541: 3,
      544: 1,
      546: 2,
      548: 2,
      549: 1,
      550: 1,
      552: 2,
      553: 2,
      554: 7,
      555: 1,
      556: 3,
      558: 3,
      560: 1,
      565: 2,
      566: 2,
      569: 4,
      572: 3,
      573: 1,
      

      717: 1,
      718: 1,
      719: 2,
      720: 1,
      722: 2,
      726: 1,
      730: 3,
      731: 1,
      734: 1,
      739: 2,
      740: 1,
      741: 1,
      744: 3,
      750: 1,
      751: 1,
      753: 2,
      755: 2,
      758: 2,
      759: 2,
      761: 2,
      765: 1,
      766: 2,
      769: 5,
      770: 2,
      772: 9,
      774: 1,
      775: 2,
      778: 1,
      781: 1,
      782: 1,
      783: 1,
      785: 1,
      790: 3,
      791: 5,
      792: 3,
      798: 1,
      802: 4,
      803: 1,
      804: 1,
      805: 3,
      806: 1,
      808: 5,
      809: 1,
      810: 1,
      811: 6,
      812: 1,
      813: 1,
      815: 1,
      817: 1,
      819: 1,
      824: 5,
      826: 1,
      827: 2,
      829: 1,
      832: 3,
      835: 4,
      836: 1,
      837: 2,
      840: 1,
      841: 3,
      847: 1,
      849: 2,
      850: 1,
      851: 1,
      852: 1,
      853: 1,
      854: 2,
      855: 1,
      856: 2,
      857: 1,
      861: 2,
      

In case you don’t remember, here are some common ways to loop through a Python dictionary:

In [4]:
# loop through keys and values together
for key, value in wordsInCorpus.items():
    # code here
    pass

In [5]:
# loop through keys
for key in wordsInCorpus:
    # code here
    pass

In [6]:
# loop through values
for value in wordsInCorpus.values():
    # code here
    pass
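
Because `wordsInCorpus` is a dictionary of dictionaries, these patterns are usually nested. The following is a minimal sketch on a made-up corpus; `toy_corpus` and `total_tokens` are hypothetical names used only for illustration and are not part of the lab code.

```python
# Hypothetical toy corpus with the same shape as wordsInCorpus:
# docId -> {wordId: number of occurrences in that document}
toy_corpus = {
    0: {1: 2, 2: 1, 4: 1, 5: 3},
    1: {1: 1, 2: 2, 5: 1},
    2: {2: 1, 3: 1, 5: 2},
}

# The outer loop visits documents; the inner loop visits the words
# that appear in the current document.
total_tokens = 0
for doc_id, word_counts in toy_corpus.items():
    for word_id, count in word_counts.items():
        total_tokens += count

print(total_tokens)
```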

### Enter your code for task 1 here 

In [7]:
start = time.time()
# coOccurrences will be a dictionary where the key is a
# (word_i, word_j) pair, and the value is the number of times
# those two words co-occurred
# initialize coOccurrences
coOccurrences = {}

# now, have a nested loop that fills up coOccurrences
#  YOUR  CODE  HERE
for i in wordsInCorpus:
    for j in wordsInCorpus[i]:
        for k in wordsInCorpus[i]:
            if (j, k) not in coOccurrences:
                coOccurrences[(j, k)] = 1
            else:
                coOccurrences[(j, k)] += 1
    
end = time.time()
end - start

38.09063673019409
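
As a cross-check, the same counting logic can be run on the three-document example from the introduction, where the expected answers are known. This is only a sketch: `docs` and `toy_co` are hypothetical names, and `collections.Counter` with `itertools.product` is one possible way to write the nested loop, not the required one.

```python
from collections import Counter
from itertools import product

# The three example documents, written as docId -> {wordId: count}.
docs = {
    1: {1: 1, 2: 1, 4: 1, 5: 1},
    2: {1: 1, 2: 1, 5: 1},
    3: {2: 1, 3: 1, 5: 1},
}

# For each document, count every ordered (word_i, word_j) pair once.
# Iterating over a dictionary yields its keys, i.e. the word ids.
toy_co = Counter()
for word_counts in docs.values():
    toy_co.update(product(word_counts, repeat=2))

print(toy_co[(1, 1)], toy_co[(1, 2)], toy_co[(2, 5)], toy_co[(3, 4)])
```

Unlike the listing in the introduction, this stores both orderings of each pair, so `toy_co[(2, 5)]` and `toy_co[(5, 2)]` hold the same value. A pair that never co-occurs, such as `(3, 4)`, reads as 0 from a `Counter`.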

### Look up the value of `coOccurrences[(1041, 1976)]`

In [8]:
coOccurrences[(1041, 1976)]

12

In [9]:
coOccurrences

{(1041, 1041): 25,
 (1041, 95): 14,
 (1041, 576): 14,
 (1041, 74): 19,
 (1041, 488): 14,
 (1041, 7): 14,
 (1041, 1295): 9,
 (1041, 1982): 16,
 (1041, 1841): 11,
 (1041, 573): 14,
 (1041, 659): 14,
 (1041, 1452): 16,
 (1041, 1317): 14,
 (1041, 1230): 14,
 (1041, 1624): 9,
 (1041, 1327): 10,
 (1041, 1580): 13,
 (1041, 1169): 16,
 (1041, 1876): 16,
 (1041, 30): 20,
 (1041, 674): 14,
 (1041, 252): 13,
 (1041, 1367): 17,
 (1041, 1445): 15,
 (1041, 48): 15,
 (1041, 1379): 12,
 (1041, 335): 16,
 (1041, 1210): 11,
 (1041, 985): 15,
 (1041, 1249): 13,
 (1041, 1959): 13,
 (1041, 431): 15,
 (1041, 1748): 18,
 (1041, 1389): 11,
 (1041, 258): 22,
 (1041, 1750): 14,
 (1041, 1488): 21,
 (1041, 1578): 20,
 (1041, 965): 17,
 (1041, 1311): 14,
 (1041, 462): 13,
 (1041, 1465): 12,
 (1041, 758): 20,
 (1041, 942): 17,
 (1041, 798): 14,
 (1041, 1120): 17,
 (1041, 1935): 15,
 (1041, 1868): 12,
 (1041, 719): 14,
 (1041, 187): 18,
 (1041, 1451): 9,
 (1041, 665): 11,
 (1041, 1160): 17,
 (1041, 58): 21,
 (1041, 

## Task 2 - compute co-occurrences from a Numpy array (3 pts)
Now, run the Numpy array-based LDA implementation provided below (again, this is similar to the approach we used in the last Numpy array lab). This code builds the `wordsInCorpus` Numpy array, a 2D array whose rows represent documents and whose columns represent words. Row and column indices correspond to document and word identifiers, respectively. Each entry stores the number of times a word occurs in a document.

For example, `wordsInCorpus[34, 355]` represents the number of times that word 355 occurred in document 34.

In the second part of this lab, given a corpus of documents as a Numpy array, your task is to compute the co-occurrence of each possible word pair in this corpus. For this task, you must structure the co-occurrences as a 2D Numpy array (`coOccurrences`) whose number of rows and number of columns both equal the number of words in the corpus. The number of documents containing both word_i and word_j is then stored in `coOccurrences[i, j]`.

You must then time the execution of your Numpy array-based computation.
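
A single `time.time()` measurement can be noisy. One possible alternative, sketched here under the assumption that repeating the computation is acceptable, is to take the best of several runs with `time.perf_counter()`. The helper `best_of` and the stand-in workload are hypothetical and not part of the lab.

```python
import time
import numpy as np

def best_of(fn, repeats=3):
    """Return (best elapsed seconds, last result) over several runs of fn."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn()
        best = min(best, time.perf_counter() - start)
    return best, result

# Hypothetical workload standing in for the co-occurrence computation.
elapsed, total = best_of(lambda: np.ones((200, 200)).sum())
print(f"best of 3 runs: {elapsed:.6f} s")
```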

Here is the approach you must take for this task to compute `coOccurrences` array:

Get the vector of word occurrences for each document from `wordsInCorpus`, and clip it to 0/1 so that each entry records only whether the word appears in the document. Then compute the [**outer product**](https://en.wikipedia.org/wiki/Outer_product) of each clipped vector with itself to create a matrix of co-occurrences for that document. Summing the 50 matrices gives the answer.

Recall, the [**outer product**](https://en.wikipedia.org/wiki/Outer_product) of two column vectors $u$ and $v$, where $u \in \mathbb{R}^m$ and $v \in \mathbb{R}^n$ is defined as:

$$
\begin{bmatrix}
u_1\\
u_2\\
\vdots \\
u_m\\
\end{bmatrix}
\bigotimes
\begin{bmatrix}
v_1 & v_2 & \ldots & v_n\\
\end{bmatrix}
= uv^T =
\begin{bmatrix}
u_1v_1 & u_1v_2 & \ldots & u_1v_n\\
u_2v_1 & u_2v_2 & \ldots & u_2v_n\\
\vdots & \vdots & \ddots & \vdots \\
u_mv_1 & u_mv_2 & \ldots & u_mv_n\\
\end{bmatrix}
$$
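
To make the definition concrete, here is a small sketch with made-up vectors (`u`, `v`, and `outer_uv` are illustrative names). It checks that `np.outer` gives an $m \times n$ matrix whose $(i, j)$ entry is $u_i v_j$.

```python
import numpy as np

u = np.array([1, 0, 1])      # m = 3
v = np.array([1, 1, 0, 1])   # n = 4

outer_uv = np.outer(u, v)    # shape (3, 4), entry (i, j) is u[i] * v[j]
print(outer_uv)

# The same matrix written with broadcasting: a column times a row.
same = u[:, None] * v[None, :]
print(np.array_equal(outer_uv, same))
```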

**hint**: in order to take a NumPy array ```foo``` and "clip" all of its entries between 0 and 1, use:
[```np.clip(foo, 0, 1)```](https://docs.scipy.org/doc/numpy/reference/generated/numpy.clip.html)

To compute the outer product between two vectors or matrices, use:
[```np.outer(foo, bar)```](https://docs.scipy.org/doc/numpy/reference/generated/numpy.outer.html)
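
The clipping step matters because the outer product of raw counts multiplies the counts together, while the lab asks for the number of documents. A minimal sketch with a made-up count vector for a single document (all names here are illustrative):

```python
import numpy as np

# Hypothetical word counts for one document over a 5-word vocabulary.
counts = np.array([0., 3., 1., 0., 2.])

# Clip to 0/1 so each entry only says whether the word is present.
presence = np.clip(counts, 0, 1)

raw = np.outer(counts, counts)            # products of counts
indicator = np.outer(presence, presence)  # 1 where both words are present

print(raw[1, 4], indicator[1, 4])
```

Here words 1 and 4 occur 3 and 2 times, so the raw outer product has 6 in that position, while the clipped version has 1, which is this document's contribution to the co-occurrence count.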

### Array-based LDA

In [10]:
# seed the random number generator to produce the same results every time
np.random.seed(553)

# there are 2000 words in the corpus
alpha = np.full (2000, .1)
 
# there are 100 topics
beta = np.full (100, .1)
 
# this gets us the probability of each word happening in each of the 100 topics
wordsInTopic = np.random.dirichlet (alpha, 100)
 
# wordsInCorpus[i] will give us the vector of words in document i
wordsInCorpus = np.zeros ((50, 2000))
 
# generate each doc
for doc in range (0, 50):
    #
    # get the topic probabilities for this doc
    topicsInDoc = np.random.dirichlet (beta)
    #
    # assign each of the 2000 words in this doc to a topic
    wordsToTopic = np.random.multinomial (2000, topicsInDoc)
    # 
    # and generate each of the 2000 words
    for topic in range (0, 100):
        wordsFromCurrentTopic = np.random.multinomial (wordsToTopic[topic], wordsInTopic[topic])
        wordsInCorpus[doc] = np.add (wordsInCorpus[doc], wordsFromCurrentTopic)

In [11]:
wordsInCorpus

array([[0., 1., 1., ..., 2., 1., 0.],
       [0., 1., 1., ..., 0., 2., 1.],
       [0., 3., 3., ..., 4., 0., 1.],
       ...,
       [0., 2., 3., ..., 0., 1., 2.],
       [1., 2., 0., ..., 0., 1., 2.],
       [0., 1., 0., ..., 1., 3., 1.]])

In [12]:
wordsInCorpus.shape

(50, 2000)

### Enter your code for task 2 here 

In [13]:
start = time.time()
# coOccurrences[i, j] will give the count of the number of times that
# word i and word j appear in the same document in the corpus
coOccurrences = np.zeros ((2000, 2000))
# now, have a loop that fills up coOccurrences
# YOUR CODE HERE
my_word = np.clip(wordsInCorpus, 0, 1)
for i in range(len(wordsInCorpus)):
    u = my_word[i]
    coOccurrences = np.add(coOccurrences, np.outer(u, u))

end = time.time()
end - start

0.6105649471282959

### Look up the value of `coOccurrences[1041, 1976]`

In [14]:
coOccurrences[1041, 1976]

14.0

In [15]:
coOccurrences

array([[14.,  9.,  5., ...,  9.,  6.,  9.],
       [ 9., 30., 18., ..., 18., 11., 20.],
       [ 5., 18., 26., ..., 15., 12., 16.],
       ...,
       [ 9., 18., 15., ..., 31., 14., 17.],
       [ 6., 11., 12., ..., 14., 22., 14.],
       [ 9., 20., 16., ..., 17., 14., 31.]])
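
A result like this can be sanity-checked with two properties: the matrix should be symmetric, and its diagonal should equal the number of documents containing each word. The sketch below applies the outer-product approach to the three-document example from the introduction, with word1 to word5 mapped to columns 0 to 4 and some counts raised above 1 to exercise the clipping. The names `toy`, `toy_presence`, and `toy_co` are hypothetical.

```python
import numpy as np

# Rows are documents, columns are word1..word5 (indices 0..4).
toy = np.array([
    [2., 1., 0., 1., 3.],   # doc 1: word1, word2, word4, word5
    [1., 2., 0., 0., 1.],   # doc 2: word1, word2, word5
    [0., 1., 1., 0., 2.],   # doc 3: word2, word3, word5
])

toy_presence = np.clip(toy, 0, 1)
toy_co = np.zeros((5, 5))
for row in toy_presence:
    toy_co += np.outer(row, row)

# Symmetry: co-occurrence of (i, j) equals that of (j, i).
print(np.array_equal(toy_co, toy_co.T))
# Diagonal: number of documents that contain each word.
print(np.diag(toy_co), toy_presence.sum(axis=0))
```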

## Task 3 - compute co-occurrences from a Numpy array without using loops (3 pts)

In the last part of this lab, your task is to compute co-occurrences with vectorized Numpy operations, without using any loops.

The corpus of documents was already built in task 2 by the Numpy array-based LDA implementation. So, for this task, you will use the same `wordsInCorpus` Numpy array that was built in task 2.

As in task 2, you must structure the co-occurrences as a 2D Numpy array (`coOccurrences`) whose number of rows and number of columns both equal the number of words in the corpus. The number of documents containing both word_i and word_j is then stored in `coOccurrences[i, j]`.

However, in this task, you must avoid using any loop, and instead use a single matrix multiplication operation. 

Recall, the [**product**](https://en.wikipedia.org/wiki/Matrix_multiplication) of two matrices $A$ and $B$, where $A \in \mathbb{R}^{n \times m}$ and $B \in \mathbb{R}^{m \times p}$, is an $n \times p$ matrix $C$:


$$
\begin{bmatrix}
a_{1,1} & a_{1,2} & \ldots & a_{1,m} \\
a_{2,1} & a_{2,2} & \ldots & a_{2,m} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n,1} & a_{n,2} & \ldots & a_{n,m} \\
\end{bmatrix}
\begin{bmatrix}
b_{1,1} & b_{1,2} & \ldots & b_{1,p} \\
b_{2,1} & b_{2,2} & \ldots & b_{2,p} \\
\vdots & \vdots & \ddots & \vdots \\
b_{m,1} & b_{m,2} & \ldots & b_{m,p} \\
\end{bmatrix}
=
\begin{bmatrix}
c_{1,1} & c_{1,2} & \ldots & c_{1,p} \\
c_{2,1} & c_{2,2} & \ldots & c_{2,p} \\
\vdots & \vdots & \ddots & \vdots \\
c_{n,1} & c_{n,2} & \ldots & c_{n,p} \\
\end{bmatrix}
$$


where each entry is $c_{i,j} = \langle A_i, B_{\cdot,j} \rangle = \sum_{k=1}^m a_{i,k}b_{k,j}$. Note that $B_{\cdot,j}$ indicates the $j^{th}$ column of $B$ and $A_i$ indicates the $i^{th}$ row of $A$.
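
As a quick numerical check of this formula, the sketch below multiplies two small made-up matrices and recomputes one entry from the sum. Indices are 0-based in the code, so `C[0, 1]` corresponds to $c_{1,2}$; the names `A`, `B`, `C`, and `c_01` are illustrative.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])        # n = 2, m = 3
B = np.array([[1, 0],
              [0, 1],
              [2, 2]])           # m = 3, p = 2

C = np.dot(A, B)                 # shape (2, 2)

# Recompute one entry directly from the definition:
# row 0 of A dotted with column 1 of B.
c_01 = sum(A[0, k] * B[k, 1] for k in range(A.shape[1]))

print(C)
print(c_01 == C[0, 1])
```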


Note that you can use [`np.transpose`](https://numpy.org/doc/stable/reference/generated/numpy.transpose.html) to transpose a matrix and [`np.dot()`](https://numpy.org/doc/stable/reference/generated/numpy.dot.html) to multiply two matrices. Your solution should not contain any loops.
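
The reason a single multiplication is enough: if $X$ is the clipped documents-by-words matrix, then entry $(i, j)$ of $X^T X$ is $\sum_d x_{d,i} x_{d,j}$, which is exactly the sum over documents of the per-document outer products. The following sketch checks this on a small random 0/1 matrix; the shape, the seed, and the names `presence`, `by_loop`, and `by_matmul` are arbitrary choices for illustration.

```python
import numpy as np

# Hypothetical presence matrix: 6 documents, 8 words, entries 0 or 1.
rng = np.random.default_rng(0)
presence = rng.integers(0, 2, size=(6, 8))

# Loop version: sum of per-document outer products.
by_loop = np.zeros((8, 8), dtype=presence.dtype)
for row in presence:
    by_loop += np.outer(row, row)

# Vectorized version: one matrix multiplication.
by_matmul = np.dot(np.transpose(presence), presence)

print(np.array_equal(by_loop, by_matmul))
```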

### Enter your code for task 3 here 

In [16]:
start = time.time()
# now, create coOccurrences via a matrix multiply
# YOUR CODE HERE
word_a = np.clip(wordsInCorpus, 0, 1)
word_b = np.transpose(word_a)

coOccurrences = np.dot(word_b, word_a)
end = time.time()
end - start

0.019419431686401367

### Look up the value of `coOccurrences[1041, 1976]`


In [17]:
coOccurrences[1041, 1976]

14.0

In [18]:
coOccurrences

array([[14.,  9.,  5., ...,  9.,  6.,  9.],
       [ 9., 30., 18., ..., 18., 11., 20.],
       [ 5., 18., 26., ..., 15., 12., 16.],
       ...,
       [ 9., 18., 15., ..., 31., 14., 17.],
       [ 6., 11., 12., ..., 14., 22., 14.],
       [ 9., 20., 16., ..., 17., 14., 31.]])

Copyright ©  2020 Rice University

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.