<a href="https://colab.research.google.com/github/roitraining/SparkProgram/blob/master/Day1/IntroToSpark.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Create the Spark context to start a session and connect to the cluster.

In [0]:
import sys
sys.path.append('/home/student/ROI/SparkProgram')
from initspark import *
sc, spark, conf = initspark()


Read a text file from the local file system.

In [0]:
shake = sc.textFile('/home/student/ROI/SparkProgram/datasets/text/shakespeare.txt')
print(shake.count())
print(shake.take(10))

124796
['The Project Gutenberg EBook of The Complete Works of William Shakespeare, by ', 'William Shakespeare', '', 'This eBook is for the use of anyone anywhere at no cost and with', 'almost no restrictions whatsoever.  You may copy it, give it away or', 're-use it under the terms of the Project Gutenberg License included', 'with this eBook or online at www.gutenberg.org', '', '** This is a COPYRIGHTED Project Gutenberg eBook, Details Below **', '**     Please follow the copyright guidelines in this file.     **']


Use the map method to apply a function call on each element.

In [0]:
shake2 = shake.map(str.upper)
shake2.take(10)

['THE PROJECT GUTENBERG EBOOK OF THE COMPLETE WORKS OF WILLIAM SHAKESPEARE, BY ',
 'WILLIAM SHAKESPEARE',
 '',
 'THIS EBOOK IS FOR THE USE OF ANYONE ANYWHERE AT NO COST AND WITH',
 'ALMOST NO RESTRICTIONS WHATSOEVER.  YOU MAY COPY IT, GIVE IT AWAY OR',
 'RE-USE IT UNDER THE TERMS OF THE PROJECT GUTENBERG LICENSE INCLUDED',
 'WITH THIS EBOOK OR ONLINE AT WWW.GUTENBERG.ORG',
 '',
 '** THIS IS A COPYRIGHTED PROJECT GUTENBERG EBOOK, DETAILS BELOW **',
 '**     PLEASE FOLLOW THE COPYRIGHT GUIDELINES IN THIS FILE.     **']

Using the split method you get a list of lists.

In [0]:
shake3 = shake.map(lambda x : x.split(' '))
shake3.take(10)

[['The',
  'Project',
  'Gutenberg',
  'EBook',
  'of',
  'The',
  'Complete',
  'Works',
  'of',
  'William',
  'Shakespeare,',
  'by',
  ''],
 ['William', 'Shakespeare'],
 [''],
 ['This',
  'eBook',
  'is',
  'for',
  'the',
  'use',
  'of',
  'anyone',
  'anywhere',
  'at',
  'no',
  'cost',
  'and',
  'with'],
 ['almost',
  'no',
  'restrictions',
  'whatsoever.',
  '',
  'You',
  'may',
  'copy',
  'it,',
  'give',
  'it',
  'away',
  'or'],
 ['re-use',
  'it',
  'under',
  'the',
  'terms',
  'of',
  'the',
  'Project',
  'Gutenberg',
  'License',
  'included'],
 ['with', 'this', 'eBook', 'or', 'online', 'at', 'www.gutenberg.org'],
 [''],
 ['**',
  'This',
  'is',
  'a',
  'COPYRIGHTED',
  'Project',
  'Gutenberg',
  'eBook,',
  'Details',
  'Below',
  '**'],
 ['**',
  '',
  '',
  '',
  '',
  'Please',
  'follow',
  'the',
  'copyright',
  'guidelines',
  'in',
  'this',
  'file.',
  '',
  '',
  '',
  '',
  '**']]

The flatMap method flattens the inner lists to return one big list of strings instead.

In [0]:
shake4 = shake.flatMap(lambda x : x.split(' '))
shake4.take(20)

['The',
 'Project',
 'Gutenberg',
 'EBook',
 'of',
 'The',
 'Complete',
 'Works',
 'of',
 'William',
 'Shakespeare,',
 'by',
 '',
 'William',
 'Shakespeare',
 '',
 'This',
 'eBook',
 'is',
 'for']
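
To see the difference between map and flatMap without a cluster, here is a minimal plain-Python sketch. The two sample lines and the variable names are made up for illustration; list comprehensions and itertools stand in for the RDD methods.

```python
from itertools import chain

# Two sample lines standing in for elements of the shake RDD (illustrative only).
lines = ['William Shakespeare', 'This eBook is for']

# map analogue: one output element per input element, so the result is a list of lists.
mapped = [line.split(' ') for line in lines]

# flatMap analogue: the inner lists are chained together into one flat list.
flat_mapped = list(chain.from_iterable(line.split(' ') for line in lines))

print(mapped)
print(flat_mapped)
```

The element count changes with flatMap: two lines go in, six words come out.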

The parallelize method loads locally created data into the Spark cluster as an RDD.

In [0]:
r = sc.parallelize(range(1,11))
print(r.collect())
print(r.take(5))

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[1, 2, 3, 4, 5]


Load a folder stored on HDFS.

In [0]:
sc.textFile('hdfs://localhost:9000/categories').collect()

['1,Beverages,Soft drinks coffees teas beers and ales',
 '2,Condiments,Sweet and savory sauces relishes spreads and seasonings',
 '3,Confections,Desserts candies and sweet breads',
 '4,Dairy Products,Cheeses',
 '5,Grains/Cereals,Breads crackers pasta and cereal',
 '6,Meat/Poultry,Prepared meats',
 '7,Produce,Dried fruit and bean curd',
 '8,Seafood,Seaweed and fish']

Use the helper function to point to the HDFS URI.

In [0]:
cat = sc.textFile(hdfsPath('categories'))
print(cat.takeOrdered(5))
print(cat.top(5))
print(cat.takeSample(False,5))
cat.foreach(lambda x : print(x.upper())) # output does not display properly in notebook

['1,Beverages,Soft drinks coffees teas beers and ales', '2,Condiments,Sweet and savory sauces relishes spreads and seasonings', '3,Confections,Desserts candies and sweet breads', '4,Dairy Products,Cheeses', '5,Grains/Cereals,Breads crackers pasta and cereal']
['8,Seafood,Seaweed and fish', '7,Produce,Dried fruit and bean curd', '6,Meat/Poultry,Prepared meats', '5,Grains/Cereals,Breads crackers pasta and cereal', '4,Dairy Products,Cheeses']
['7,Produce,Dried fruit and bean curd', '4,Dairy Products,Cheeses', '8,Seafood,Seaweed and fish', '1,Beverages,Soft drinks coffees teas beers and ales', '3,Confections,Desserts candies and sweet breads']
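
The three actions above can be approximated in plain Python to make their meaning concrete. This is only a sketch on a small made-up list; heapq and random stand in for takeOrdered, top and takeSample. Note that the rows are strings, so the ordering is alphabetical rather than numeric.

```python
import heapq
import random

# Shortened, shuffled rows standing in for the cat RDD (illustrative only).
rows = ['3,Confections', '1,Beverages', '8,Seafood', '2,Condiments', '5,Grains/Cereals']

smallest = heapq.nsmallest(2, rows)   # like takeOrdered(2): lowest values first
largest = heapq.nlargest(2, rows)     # like top(2): highest values first
sample = random.sample(rows, 2)       # like takeSample(False, 2): random, without replacement

# Strings compare character by character, so a row starting with '10'
# sorts before a row starting with '2'.
text_order_surprise = '10,Snacks' < '2,Condiments'
```

Parsing the ID into an int before ordering avoids the string-comparison surprise.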


Save the results in an RDD to disk. Note how it makes a folder and fills it with one part file per partition of the RDD, rather than writing a single file. Also, you must make sure that the folder does not already exist, or Spark throws an exception.

In [0]:
! rm -r /home/student/file1.txt
cat.saveAsTextFile('/home/student/file1.txt')
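
As an alternative to the shell rm, the cleanup can be done in Python. The helper below is hypothetical (clear_output_dir is not part of Spark or of initspark) and only handles local paths; a folder on HDFS would need the hdfs command-line tools instead. The demo runs against a throwaway temporary directory.

```python
import os
import shutil
import tempfile

def clear_output_dir(path):
    """Delete path if it exists so a later save does not fail.

    Returns True if something was removed, False if nothing was there.
    """
    if os.path.isdir(path):
        shutil.rmtree(path)      # saveAsTextFile output is a folder of part files
        return True
    if os.path.exists(path):
        os.remove(path)          # a plain file left over under the same name
        return True
    return False

# Demonstrate on a temporary directory rather than a real output folder.
demo_dir = tempfile.mkdtemp()
removed_first = clear_output_dir(demo_dir)   # the directory existed
removed_again = clear_output_dir(demo_dir)   # already gone, nothing to do
```

Unlike rm -r on a missing path, the second call does not report an error; it simply returns False.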

In [0]:
print(cat.map(str.upper).collect())

['1,BEVERAGES,SOFT DRINKS COFFEES TEAS BEERS AND ALES', '2,CONDIMENTS,SWEET AND SAVORY SAUCES RELISHES SPREADS AND SEASONINGS', '3,CONFECTIONS,DESSERTS CANDIES AND SWEET BREADS', '4,DAIRY PRODUCTS,CHEESES', '5,GRAINS/CEREALS,BREADS CRACKERS PASTA AND CEREAL', '6,MEAT/POULTRY,PREPARED MEATS', '7,PRODUCE,DRIED FRUIT AND BEAN CURD', '8,SEAFOOD,SEAWEED AND FISH']


Parse the string into a tuple to resemble a record structure.

In [0]:
cat1 = cat.map(lambda x : tuple(x.split(',')))
cat1 = cat1.map(lambda x : (int(x[0]), x[1], x[2]))
cat1.take(10)

[(1, 'Beverages', 'Soft drinks coffees teas beers and ales'),
 (2, 'Condiments', 'Sweet and savory sauces relishes spreads and seasonings'),
 (3, 'Confections', 'Desserts candies and sweet breads'),
 (4, 'Dairy Products', 'Cheeses'),
 (5, 'Grains/Cereals', 'Breads crackers pasta and cereal'),
 (6, 'Meat/Poultry', 'Prepared meats'),
 (7, 'Produce', 'Dried fruit and bean curd'),
 (8, 'Seafood', 'Seaweed and fish')]
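
Splitting on ',' works here because no field contains a comma. As a hedged sketch of a more defensive parser, the standard csv module can be used inside a named function; parse_category is a hypothetical name and the quoted row is made up to show the difference.

```python
import csv

def parse_category(line):
    """Turn one CSV line into an (id, name, description) tuple."""
    # csv.reader expects an iterable of lines, so wrap the single line in a list.
    category_id, name, description = next(csv.reader([line]))
    return (int(category_id), name, description)

plain = parse_category('4,Dairy Products,Cheeses')
# Made-up row with an embedded comma; str.split(',') would cut it into four pieces.
quoted = parse_category('9,"Snacks, Salty",Chips and nuts')
```

A function like this could be passed to map in place of the two lambdas, for example cat.map(parse_category).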

**LAB:** Put the regions folder found in /home/student/ROI/datasets/northwind/csv/regions into HDFS. Read it into an RDD and convert it into a tuple shape.

Convert the tuple into a dictionary as an alternative form.

In [0]:
cat2 = cat1.map(lambda x : dict(zip(['CategoryID', 'Name', 'Description'], x)))
cat2.take(10)

[{'CategoryID': 1,
  'Name': 'Beverages',
  'Description': 'Soft drinks coffees teas beers and ales'},
 {'CategoryID': 2,
  'Name': 'Condiments',
  'Description': 'Sweet and savory sauces relishes spreads and seasonings'},
 {'CategoryID': 3,
  'Name': 'Confections',
  'Description': 'Desserts candies and sweet breads'},
 {'CategoryID': 4, 'Name': 'Dairy Products', 'Description': 'Cheeses'},
 {'CategoryID': 5,
  'Name': 'Grains/Cereals',
  'Description': 'Breads crackers pasta and cereal'},
 {'CategoryID': 6, 'Name': 'Meat/Poultry', 'Description': 'Prepared meats'},
 {'CategoryID': 7,
  'Name': 'Produce',
  'Description': 'Dried fruit and bean curd'},
 {'CategoryID': 8, 'Name': 'Seafood', 'Description': 'Seaweed and fish'}]

You can chain multiple transformations together to do it all in one step.

In [0]:
cat2 = cat.map(lambda x : tuple(x.split(','))) \
      .map(lambda x : (int(x[0]), x[1], x[2])) \
      .map(lambda x : dict(zip(['CategoryID', 'Name', 'Description'], x)))
cat2.take(10)


[{'CategoryID': 1,
  'Name': 'Beverages',
  'Description': 'Soft drinks coffees teas beers and ales'},
 {'CategoryID': 2,
  'Name': 'Condiments',
  'Description': 'Sweet and savory sauces relishes spreads and seasonings'},
 {'CategoryID': 3,
  'Name': 'Confections',
  'Description': 'Desserts candies and sweet breads'},
 {'CategoryID': 4, 'Name': 'Dairy Products', 'Description': 'Cheeses'},
 {'CategoryID': 5,
  'Name': 'Grains/Cereals',
  'Description': 'Breads crackers pasta and cereal'},
 {'CategoryID': 6, 'Name': 'Meat/Poultry', 'Description': 'Prepared meats'},
 {'CategoryID': 7,
  'Name': 'Produce',
  'Description': 'Dried fruit and bean curd'},
 {'CategoryID': 8, 'Name': 'Seafood', 'Description': 'Seaweed and fish'}]

The filter method takes a function (often a lambda) that returns True or False; only the elements for which it returns True are kept.

In [0]:
cat1.filter(lambda x : x[0] <= 5).collect()


[(1, 'Beverages', 'Soft drinks coffees teas beers and ales'),
 (2, 'Condiments', 'Sweet and savory sauces relishes spreads and seasonings'),
 (3, 'Confections', 'Desserts candies and sweet breads'),
 (4, 'Dairy Products', 'Cheeses'),
 (5, 'Grains/Cereals', 'Breads crackers pasta and cereal')]

The filter expressions can be more complicated.

In [0]:
cat2.filter(lambda x : x['CategoryID'] % 2 == 0 and 'e' in x['Name']).collect()

[{'CategoryID': 2,
  'Name': 'Condiments',
  'Description': 'Sweet and savory sauces relishes spreads and seasonings'},
 {'CategoryID': 6, 'Name': 'Meat/Poultry', 'Description': 'Prepared meats'},
 {'CategoryID': 8, 'Name': 'Seafood', 'Description': 'Seaweed and fish'}]
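
When a filter condition grows, a named function is easier to read and to test than a lambda. A minimal sketch, using a hypothetical function name and three shortened rows:

```python
def even_id_with_e(row):
    """True when CategoryID is even and Name contains a lowercase 'e'."""
    return row['CategoryID'] % 2 == 0 and 'e' in row['Name']

# Shortened rows standing in for cat2 (illustrative only).
rows = [
    {'CategoryID': 2, 'Name': 'Condiments'},
    {'CategoryID': 3, 'Name': 'Confections'},
    {'CategoryID': 4, 'Name': 'Dairy Products'},
]

# A list comprehension plays the role of filter here.
kept = [r['Name'] for r in rows if even_id_with_e(r)]
```

The same function could be handed to the RDD directly, as in cat2.filter(even_id_with_e).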

The sortBy method takes a function that returns the value used to sort the data.

In [0]:
cat1.sortBy(lambda x : x[2]).collect()

[(5, 'Grains/Cereals', 'Breads crackers pasta and cereal'),
 (4, 'Dairy Products', 'Cheeses'),
 (3, 'Confections', 'Desserts candies and sweet breads'),
 (7, 'Produce', 'Dried fruit and bean curd'),
 (6, 'Meat/Poultry', 'Prepared meats'),
 (8, 'Seafood', 'Seaweed and fish'),
 (1, 'Beverages', 'Soft drinks coffees teas beers and ales'),
 (2, 'Condiments', 'Sweet and savory sauces relishes spreads and seasonings')]

sortBy has an optional ascending parameter to sort in reverse order.

In [0]:
cat1.sortBy(lambda x : x[0], ascending = False).collect()

[(8, 'Seafood', 'Seaweed and fish'),
 (7, 'Produce', 'Dried fruit and bean curd'),
 (6, 'Meat/Poultry', 'Prepared meats'),
 (5, 'Grains/Cereals', 'Breads crackers pasta and cereal'),
 (4, 'Dairy Products', 'Cheeses'),
 (3, 'Confections', 'Desserts candies and sweet breads'),
 (2, 'Condiments', 'Sweet and savory sauces relishes spreads and seasonings'),
 (1, 'Beverages', 'Soft drinks coffees teas beers and ales')]
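
The key function can also return a tuple to sort on more than one value. The sketch below uses Python's built-in sorted on a few shortened rows to show the idea; negating a numeric field is one common way to get descending order for just that field.

```python
# Shortened rows standing in for cat1 (illustrative only).
rows = [(1, 'Beverages'), (2, 'Condiments'), (3, 'Confections'), (4, 'Dairy Products')]

# Sort by the first letter of the name ascending, then by ID descending.
# Tuples compare element by element, and -x[0] reverses the numeric order.
ordered = sorted(rows, key=lambda x: (x[1][0], -x[0]))
```

Within the names starting with 'C', ID 3 now comes before ID 2.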

**LAB:** Try sorting regions by name, and then in descending order by ID.

Reshape categories from a tuple of three elements like (1, 'Beverages', 'Soft drinks') to a tuple with two elements (key, value) like (1, ('Beverages', 'Soft drinks')).

In [0]:
cat3 = cat1.map(lambda x : (x[0], (x[1], x[2])))
cat3.collect()

[(1, ('Beverages', 'Soft drinks coffees teas beers and ales')),
 (2,
  ('Condiments', 'Sweet and savory sauces relishes spreads and seasonings')),
 (3, ('Confections', 'Desserts candies and sweet breads')),
 (4, ('Dairy Products', 'Cheeses')),
 (5, ('Grains/Cereals', 'Breads crackers pasta and cereal')),
 (6, ('Meat/Poultry', 'Prepared meats')),
 (7, ('Produce', 'Dried fruit and bean curd')),
 (8, ('Seafood', 'Seaweed and fish'))]

The sortByKey method does not require a function as a parameter if the data is structured into a tuple of the shape (key, value).

In [0]:
cat3.sortByKey(ascending=False).collect()

[(8, ('Seafood', 'Seaweed and fish')),
 (7, ('Produce', 'Dried fruit and bean curd')),
 (6, ('Meat/Poultry', 'Prepared meats')),
 (5, ('Grains/Cereals', 'Breads crackers pasta and cereal')),
 (4, ('Dairy Products', 'Cheeses')),
 (3, ('Confections', 'Desserts candies and sweet breads')),
 (2,
  ('Condiments', 'Sweet and savory sauces relishes spreads and seasonings')),
 (1, ('Beverages', 'Soft drinks coffees teas beers and ales'))]

Read in another CSV file.

In [0]:
prod = sc.textFile('/home/student/ROI/SparkProgram/datasets/northwind/CSV/products')
print(prod.count())
prod.take(4)


77


['1,Chai,8,1,10 boxes x 30 bags,18.0,39,0,10,1',
 '2,Chang,1,1,24 - 12 oz bottles,19.0,17,40,25,1',
 '3,Aniseed Syrup,1,2,12 - 550 ml bottles,10.0,13,70,25,0',
 "4,Chef Anton's Cajun Seasoning,2,2,48 - 6 oz jars,22.0,53,0,0,0"]

Split it up and just keep the ProductID, ProductName, CategoryID, Price, Quantity values.

In [0]:
prod1 = prod.map(lambda x : x.split(',')).map(lambda x : (int(x[0]), x[1], int(x[3]), float(x[5]), int(x[6])))
prod1.take(5)

[(1, 'Chai', 1, 18.0, 39),
 (2, 'Chang', 1, 19.0, 17),
 (3, 'Aniseed Syrup', 2, 10.0, 13),
 (4, "Chef Anton's Cajun Seasoning", 2, 22.0, 53),
 (5, "Chef Anton's Gumbo Mix", 2, 21.35, 0)]

Reshape it to a key value tuple.

In [0]:
prod2 = prod1.map(lambda x : (x[2], (x[0], x[1], x[3], x[4])))
prod2.take(5)

[(1, (1, 'Chai', 18.0, 39)),
 (1, (2, 'Chang', 19.0, 17)),
 (2, (3, 'Aniseed Syrup', 10.0, 13)),
 (2, (4, "Chef Anton's Cajun Seasoning", 22.0, 53)),
 (2, (5, "Chef Anton's Gumbo Mix", 21.35, 0))]

In [0]:
cat3.collect()

[(1, ('Beverages', 'Soft drinks coffees teas beers and ales')),
 (2,
  ('Condiments', 'Sweet and savory sauces relishes spreads and seasonings')),
 (3, ('Confections', 'Desserts candies and sweet breads')),
 (4, ('Dairy Products', 'Cheeses')),
 (5, ('Grains/Cereals', 'Breads crackers pasta and cereal')),
 (6, ('Meat/Poultry', 'Prepared meats')),
 (7, ('Produce', 'Dried fruit and bean curd')),
 (8, ('Seafood', 'Seaweed and fish'))]

Both cat3 and prod2 are in key value tuple format so they can be joined to produce a new tuple of (key, (cat, prod)).

In [0]:
joined = cat3.join(prod2)
joined.sortByKey().take(15)

[(1,
  (('Beverages', 'Soft drinks coffees teas beers and ales'),
   (1, 'Chai', 18.0, 39))),
 (1,
  (('Beverages', 'Soft drinks coffees teas beers and ales'),
   (2, 'Chang', 19.0, 17))),
 (1,
  (('Beverages', 'Soft drinks coffees teas beers and ales'),
   (24, 'Guarana Fantastica', 4.5, 20))),
 (1,
  (('Beverages', 'Soft drinks coffees teas beers and ales'),
   (34, 'Sasquatch Ale', 14.0, 111))),
 (1,
  (('Beverages', 'Soft drinks coffees teas beers and ales'),
   (35, 'Steeleye Stout', 18.0, 20))),
 (1,
  (('Beverages', 'Soft drinks coffees teas beers and ales'),
   (38, 'Cote de Blaye', 263.5, 17))),
 (1,
  (('Beverages', 'Soft drinks coffees teas beers and ales'),
   (39, 'Chartreuse verte', 18.0, 69))),
 (1,
  (('Beverages', 'Soft drinks coffees teas beers and ales'),
   (43, 'Ipoh Coffee', 46.0, 17))),
 (1,
  (('Beverages', 'Soft drinks coffees teas beers and ales'),
   (67, 'Laughing Lumberjack Lager', 14.0, 52))),
 (1,
  (('Beverages', 'Soft drinks coffees teas beers and ales'
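
To make the shape of the join result concrete, here is a small plain-Python sketch of an inner join on (key, value) pairs. The function inner_join and the sample data are hypothetical, and this is not how Spark implements join internally.

```python
def inner_join(left, right):
    """Join two lists of (key, value) pairs into (key, (left_value, right_value))."""
    by_key = {}
    for k, v in left:
        by_key.setdefault(k, []).append(v)
    # Keep a right-hand pair only if its key also appears on the left.
    return [(k, (lv, rv)) for k, rv in right for lv in by_key.get(k, [])]

cats = [(1, 'Beverages'), (2, 'Condiments'), (3, 'Confections')]
prods = [(1, 'Chai'), (1, 'Chang'), (2, 'Aniseed Syrup'), (9, 'No such category')]

joined_local = inner_join(cats, prods)
```

Keys that appear on only one side (3 and 9 here) are dropped, which is what makes this an inner join.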

**LAB:** Load territories into HDFS and join it to regions.

The groupBy methods are seldom used, but they can produce hierarchies where child records are embedded inside a parent.

In [0]:
group1 = prod2.groupByKey()
group1.take(3)

[(1, <pyspark.resultiterable.ResultIterable at 0x7f9c154c4860>),
 (2, <pyspark.resultiterable.ResultIterable at 0x7f9c154c4278>),
 (7, <pyspark.resultiterable.ResultIterable at 0x7f9c154c4048>)]

In [0]:
list(group1.take(1)[0][1])

[(1, 'Chai', 18.0, 39),
 (2, 'Chang', 19.0, 17),
 (24, 'Guarana Fantastica', 4.5, 20),
 (34, 'Sasquatch Ale', 14.0, 111),
 (35, 'Steeleye Stout', 18.0, 20),
 (38, 'Cote de Blaye', 263.5, 17),
 (39, 'Chartreuse verte', 18.0, 69),
 (43, 'Ipoh Coffee', 46.0, 17),
 (67, 'Laughing Lumberjack Lager', 14.0, 52),
 (70, 'Outback Lager', 15.0, 15),
 (75, 'Rhonbrau Klosterbier', 7.75, 125),
 (76, 'Lakkalikoori', 18.0, 57)]

In [0]:
group2 = [(key, list(it)) for key, it in group1.collect()]
for k, v in group2:
    print('Key:', k)
    for x in v:
        print(x)

Key: 1
(1, 'Chai', 18.0, 39)
(2, 'Chang', 19.0, 17)
(24, 'Guarana Fantastica', 4.5, 20)
(34, 'Sasquatch Ale', 14.0, 111)
(35, 'Steeleye Stout', 18.0, 20)
(38, 'Cote de Blaye', 263.5, 17)
(39, 'Chartreuse verte', 18.0, 69)
(43, 'Ipoh Coffee', 46.0, 17)
(67, 'Laughing Lumberjack Lager', 14.0, 52)
(70, 'Outback Lager', 15.0, 15)
(75, 'Rhonbrau Klosterbier', 7.75, 125)
(76, 'Lakkalikoori', 18.0, 57)
Key: 2
(3, 'Aniseed Syrup', 10.0, 13)
(4, "Chef Anton's Cajun Seasoning", 22.0, 53)
(5, "Chef Anton's Gumbo Mix", 21.35, 0)
(6, "Grandma's Boysenberry Spread", 25.0, 120)
(8, 'Northwoods Cranberry Sauce', 40.0, 6)
(15, 'Genen Shouyu', 13.0, 39)
(44, 'Gula Malacca', 19.45, 27)
(61, "Sirop d'erable", 28.5, 113)
(63, 'Vegie-spread', 43.9, 24)
(65, 'Louisiana Fiery Hot Pepper Sauce', 21.05, 76)
(66, 'Louisiana Hot Spiced Okra', 17.0, 4)
(77, 'Original Frankfurter grune Sosse', 13.0, 32)
Key: 7
(7, "Uncle Bob's Organic Dried Pears", 30.0, 15)
(14, 'Tofu', 23.25, 35)
(28, 'Rossle Sauerkraut', 45.6, 2
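
The parent and child structure that groupByKey produces can be sketched in plain Python with a dictionary of lists. The sample pairs are shortened and the variable names are made up for illustration.

```python
from collections import defaultdict

# Shortened (CategoryID, ProductName) pairs standing in for prod2 (illustrative only).
pairs = [(1, 'Chai'), (1, 'Chang'), (2, 'Aniseed Syrup'), (7, 'Tofu')]

# Collect every value under its key, like groupByKey does across the cluster.
groups = defaultdict(list)
for key, value in pairs:
    groups[key].append(value)

grouped = sorted(groups.items())
```

Each key now appears once, with all of its child values embedded in a list.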

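The reduce methods take a function as a parameter that tells Spark how to accumulate the values for each group. The function takes two parameters: the first is the accumulated value and the second is the next value in the list. Because Spark combines partial results from different partitions, the function should be associative and commutative.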

In [0]:
shake4.map(lambda x : (x, 1)).reduceByKey(lambda x, y : x + y).sortBy(lambda x : x[1], ascending = False).take(10)

[('', 506672),
 ('the', 23407),
 ('I', 19540),
 ('and', 18358),
 ('to', 15682),
 ('of', 15649),
 ('a', 12586),
 ('my', 10824),
 ('in', 9633),
 ('you', 9129)]
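
The empty string tops the list because split(' ') produces an empty string for every extra space and for every blank line. The plain-Python sketch below shows that effect on one line from the file, together with the pairwise accumulation that the lambda performs; count_words is a hypothetical helper, not a Spark API.

```python
from functools import reduce

line = 'almost no restrictions whatsoever.  You may copy it'

with_empties = line.split(' ')   # the double space yields an empty string
without_empties = line.split()   # no argument: splits on runs of whitespace

def count_words(words):
    """Count occurrences per word, adding 1 at a time like reduceByKey with x + y."""
    counts = {}
    for w in words:
        counts[w] = counts.get(w, 0) + 1
    return counts

# reduce applies the two-parameter function pairwise: ((1 + 1) + 1) = 3.
total = reduce(lambda x, y: x + y, [1, 1, 1])
```

Switching the flatMap to x.split() with no argument, or filtering out empty strings, would keep '' out of the word count.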

**LAB:** Use the territories RDD to count how many territories are in each region. 
Display the results in regionID order and then descending order based on the counts.

In this example, we are adding up all the prices for each categoryID.

In [0]:
red1 = prod2.map(lambda x : (x[0], x[1][2])).reduceByKey(lambda x, y: x + y)
red1.collect()

[(1, 455.75),
 (2, 274.25),
 (7, 161.85),
 (6, 324.04),
 (8, 248.19),
 (4, 287.3),
 (3, 327.08),
 (5, 141.75)]

To accumulate more than one value, use a tuple to hold as many values as you want to aggregate.

In [0]:
red1 = prod2.map(lambda x : (x[0], (x[1][2], x[1][3], 1))).reduceByKey(lambda x, y: (x[0] + y[0], x[1] + y[1], x[2] + y[2]))
red1.collect()

[(1, (455.75, 559, 12)),
 (2, (274.25, 507, 12)),
 (7, (161.85, 100, 5)),
 (6, (324.04, 165, 6)),
 (8, (248.19, 701, 12)),
 (4, (287.3, 393, 10)),
 (3, (327.08, 386, 13)),
 (5, (141.75, 308, 7))]
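
Carrying a count alongside the sums makes averages a simple follow-up step. The sketch below works on two of the result rows above in plain Python; in Spark the same division could be done with a mapValues call on the RDD.

```python
# Two rows copied from the output above: (key, (price_sum, quantity_sum, count)).
totals = [(1, (455.75, 559, 12)), (5, (141.75, 308, 7))]

# Divide each sum by the count to get the average price and average quantity.
averages = [
    (key, (price_sum / n, qty_sum / n))
    for key, (price_sum, qty_sum, n) in totals
]
```

Averages cannot be reduced directly, because an average of averages is wrong when group sizes differ; summing and counting first avoids that.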

Some Python magic can make things easier in the long run.
Named tuples make accessing the elements of a row easier.
Unpacking with * is a neat Python trick that is widely used.
The datetime class has a strptime function to convert a string into a date.

In [0]:
mort = sc.textFile('/home/student/ROI/SparkProgram/datasets/finance/30YearMortgage.csv')
head = mort.first()
mort = mort.filter(lambda x : x != head)

In [0]:
from datetime import date, datetime
from collections import namedtuple
Rate = namedtuple('Rate','date fed_fund_rate avg_rate_30year')
mort1 = mort.map(lambda x : Rate(*(x.split(','))))
mort2 = mort1.map(lambda x : Rate(datetime.strptime(x.date, '%Y-%m').date(), float(x.fed_fund_rate), float(x.avg_rate_30year)))
mort2.take(5)

[Rate(date=datetime.date(1971, 4, 1), fed_fund_rate=0.0415, avg_rate_30year=0.0731),
 Rate(date=datetime.date(1971, 5, 1), fed_fund_rate=0.0463, avg_rate_30year=0.07425),
 Rate(date=datetime.date(1972, 2, 1), fed_fund_rate=0.0329, avg_rate_30year=0.07325),
 Rate(date=datetime.date(1979, 8, 1), fed_fund_rate=0.1094, avg_rate_30year=0.11094),
 Rate(date=datetime.date(1979, 9, 1), fed_fund_rate=0.1143, avg_rate_30year=0.113)]
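
The three tricks can be tried on a single line without Spark. The raw line below is an assumption about the file format, inferred from the '%Y-%m' pattern and the first output row; it is not copied from the CSV.

```python
from collections import namedtuple
from datetime import datetime

Rate = namedtuple('Rate', 'date fed_fund_rate avg_rate_30year')

raw = '1971-04,0.0415,0.0731'   # assumed line format, for illustration only
fields = raw.split(',')

# * unpacks the list so each element becomes one positional argument.
as_text = Rate(*fields)

# Fields are read by name instead of by index, then converted to real types.
typed = Rate(
    datetime.strptime(as_text.date, '%Y-%m').date(),
    float(as_text.fed_fund_rate),
    float(as_text.avg_rate_30year),
)
```

strptime fills in day 1 when the pattern has no day, which is why every date in the output falls on the first of the month.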

In [0]:
mort2.filter(lambda x : x.fed_fund_rate > .1 ).collect()

[Rate(date=datetime.date(1979, 8, 1), fed_fund_rate=0.1094, avg_rate_30year=0.11094),
 Rate(date=datetime.date(1979, 9, 1), fed_fund_rate=0.1143, avg_rate_30year=0.113),
 Rate(date=datetime.date(1979, 10, 1), fed_fund_rate=0.1377, avg_rate_30year=0.11637499999999999),
 Rate(date=datetime.date(1979, 11, 1), fed_fund_rate=0.1318, avg_rate_30year=0.1283),
 Rate(date=datetime.date(1979, 12, 1), fed_fund_rate=0.1378, avg_rate_30year=0.129),
 Rate(date=datetime.date(1980, 1, 1), fed_fund_rate=0.1382, avg_rate_30year=0.128775),
 Rate(date=datetime.date(1980, 2, 1), fed_fund_rate=0.1413, avg_rate_30year=0.1304),
 Rate(date=datetime.date(1980, 3, 1), fed_fund_rate=0.17190000000000003, avg_rate_30year=0.15282500000000002),
 Rate(date=datetime.date(1980, 4, 1), fed_fund_rate=0.1761, avg_rate_30year=0.16325),
 Rate(date=datetime.date(1980, 5, 1), fed_fund_rate=0.10980000000000001, avg_rate_30year=0.14262),
 Rate(date=datetime.date(1980, 9, 1), fed_fund_rate=0.10869999999999999, avg_rate_30year=0.1

HOMEWORK:
