## Module submission header
### Submission preparation instructions 
_Completion of this header is mandatory; failure to complete it incurs a 2-point deduction on the assignment._ Only add plain text in the designated areas, i.e., replace the relevant 'NA's. You must fill out all group member names and Drexel email addresses in the markdown list below, under the header __Module submission group__. You are required to add descriptive notes about any tutoring support received in completing this submission under the __Additional submission comments__ section at the bottom of the header. If no tutoring support was received, leave NA in place. You may also list other optional comments pertaining to the submission at the bottom. _Any disruption of this header's formatting will make your group liable to the 2-point deduction._

### Module submission group
- Group member 1
    - Name: Kasonde Chewe
    - Email: kc3745@drexel.edu
- Group member 2
    - Name: Kholoud Hamed M Al Nazzawi
    - Email: ka974@drexel.edu
- Group member 3
    - Name: NA
    - Email: NA
- Group member 4
    - Name: NA
    - Email: NA

### Additional submission comments
- Tutoring support received: NA
- Other (other): NA

# Assignment Group 1


## Module B _(23 points)_

__B1.__ _(5 points)_ You will be using part of the [Goodbooks 10k dataset](https://github.com/zygmuntz/goodbooks-10k) for this module. To start, complete the function and read the `data/goodreads-books.csv` file into a list. Your output should be a list of dictionaries (one for each book), each containing the following fields: `'authors'`, original `'title'`, original `'publication year'`, `'average rating'`, and `'ratings count'`. Note: you must convert the average rating and ratings count into `float` and `int` types, respectively.

In [3]:
# B1:Function(5/5)

import csv

def read_books(filename):
    """
    Read a books CSV file into a list of dictionaries, one per book:

        [
            {'authors': ..., 'title': ..., 'publication year': ...,
             'average rating': ..., 'ratings count': ...},
            ...
        ]

    @param filename: path to the .csv file
    @return: list of book metadata dictionaries
    """

    # list that collects one metadata dictionary per book
    books_metadata = []

    # utf-8 avoids decoding errors on non-ASCII names (e.g. 'GrandPré');
    # newline='' is the setting the csv module documentation recommends
    with open(filename, 'r', encoding='utf-8', newline='') as file:
        # csv.DictReader maps each row to {column name: value}
        reader = csv.DictReader(file)
        for row in reader:
            book_dict = {}

            book_dict['authors'] = row['authors']
            # NOTE: the prompt asks for the original title. The 'title' column
            # used here also carries series information, e.g.
            # '(The Hunger Games, #1)', so titles differ from the reference
            # output. An 'original_title' column, if this file has one (an
            # assumption), would match the reference.
            book_dict['title'] = row['title']
            # kept as a string (e.g. '2008.0'), as in the reference output
            book_dict['publication year'] = row['original_publication_year']
            book_dict['average rating'] = float(row['average_rating'])
            book_dict['ratings count'] = int(row['ratings_count'])

            books_metadata.append(book_dict)

    return books_metadata


"""
*** alternative version that does not use csv.DictReader

import csv

def read_books(filename):
    books = []
    with open(filename, 'r', encoding='utf-8') as file:
        reader = csv.reader(file)
        headers = next(reader)  # read the header row
        for row in reader:
            book = {}
            book['authors'] = row[0]
            book['title'] = row[1]
            book['publication year'] = int(row[2])
            book['average rating'] = float(row[3])
            book['ratings count'] = int(row[4])
            books.append(book)
    return books

books = read_books('data/goodreads-books.csv')
print(len(books)) # prints the number of books in the list

"""; 




For reference, your output should be:
```
(10000,
 [{'authors': 'Suzanne Collins',
   'title': 'The Hunger Games',
   'publication year': '2008.0',
   'average rating': 4.34,
   'ratings count': 4780653},
  {'authors': 'J.K. Rowling, Mary GrandPré',
   'title': "Harry Potter and the Philosopher's Stone",
   'publication year': '1997.0',
   'average rating': 4.44,
   'ratings count': 4602479},
  {'authors': 'Stephenie Meyer',
   'title': 'Twilight',
   'publication year': '2005.0',
   'average rating': 3.57,
   'ratings count': 3866839},
  {'authors': 'Harper Lee',
   'title': 'To Kill a Mockingbird',
   'publication year': '1960.0',
   'average rating': 4.25,
   'ratings count': 3198671},
  {'authors': 'F. Scott Fitzgerald',
   'title': 'The Great Gatsby',
   'publication year': '1925.0',
   'average rating': 3.89,
   'ratings count': 2683664}])
```

In [4]:
# B1:SanityCheck

metadata = read_books('data/goodreads-books.csv')
len(metadata), metadata[:5] # shows the number of books and the first five entries


(10000,
 [{'authors': 'Suzanne Collins',
   'title': 'The Hunger Games (The Hunger Games, #1)',
   'publication year': '2008.0',
   'average rating': 4.34,
   'ratings count': 4780653},
  {'authors': 'J.K. Rowling, Mary GrandPré',
   'title': "Harry Potter and the Sorcerer's Stone (Harry Potter, #1)",
   'publication year': '1997.0',
   'average rating': 4.44,
   'ratings count': 4602479},
  {'authors': 'Stephenie Meyer',
   'title': 'Twilight (Twilight, #1)',
   'publication year': '2005.0',
   'average rating': 3.57,
   'ratings count': 3866839},
  {'authors': 'Harper Lee',
   'title': 'To Kill a Mockingbird',
   'publication year': '1960.0',
   'average rating': 4.25,
   'ratings count': 3198671},
  {'authors': 'F. Scott Fitzgerald',
   'title': 'The Great Gatsby',
   'publication year': '1925.0',
   'average rating': 3.89,
   'ratings count': 2683664}])
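
The parsing step can also be checked without the data file. The following is a minimal sketch, assuming a hypothetical helper `parse_books` and a two-row in-memory sample (`SAMPLE_CSV`) that reuses the column names read above; neither is part of the assignment template.

```python
import csv
import io

# Hypothetical two-row sample that mimics the column names used above.
SAMPLE_CSV = (
    "authors,title,original_publication_year,average_rating,ratings_count\n"
    "Harper Lee,To Kill a Mockingbird,1960.0,4.25,3198671\n"
    "F. Scott Fitzgerald,The Great Gatsby,1925.0,3.89,2683664\n"
)

def parse_books(file_obj):
    """Parse an open CSV file object into a list of book dictionaries."""
    books = []
    for row in csv.DictReader(file_obj):
        books.append({
            'authors': row['authors'],
            'title': row['title'],
            # left as a string, matching the reference output
            'publication year': row['original_publication_year'],
            'average rating': float(row['average_rating']),
            'ratings count': int(row['ratings_count']),
        })
    return books

sample_books = parse_books(io.StringIO(SAMPLE_CSV))
print(len(sample_books), sample_books[0])
```

Accepting a file object rather than a path is a design choice made only for this sketch: it lets `io.StringIO` stand in for the real file.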

__B2.__ _(4 points)_ Complete the function to sort this list of book metadata in descending order of `'average rating'`. The function should take the list of metadata dictionaries as an input argument, and output the same format with the list sorted by the specified column.

In [14]:
# B2:Function(4/4)
def sort_metadata(books_metadata):
    #---your code starts here---
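    # list.sort() sorts in place, so the caller's list is reordered as well;
    # the sort is stable, so books with equal ratings keep their relative order
    books_metadata.sort(key=lambda x: x['average rating'], reverse=True)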
    #---your code stops here---
    return books_metadata

For reference, your output should be:
```
[{'authors': 'Bill Watterson',
  'title': 'The Complete Calvin and Hobbes',
  'publication year': '2005.0',
  'average rating': 4.82,
  'ratings count': 28900},
 {'authors': 'Brandon Sanderson',
  'title': 'Words of Radiance',
  'publication year': '2014.0',
  'average rating': 4.77,
  'ratings count': 73572},
 {'authors': 'J.K. Rowling, Mary GrandPré',
  'title': '',
  'publication year': '2003.0',
  'average rating': 4.77,
  'ratings count': 33220},
 {'authors': 'Anonymous, Lane T. Dennis, Wayne A. Grudem',
  'title': '',
  'publication year': '2002.0',
  'average rating': 4.76,
  'ratings count': 8953},
 {'authors': 'Francine Rivers',
  'title': 'Mark of the Lion Trilogy',
  'publication year': '1993.0',
  'average rating': 4.76,
  'ratings count': 9081}]
```

In [15]:
# B2:SanityCheck
sorted_metadata = sort_metadata(metadata)
sorted_metadata[:5]

[{'authors': 'Bill Watterson',
  'title': 'The Complete Calvin and Hobbes',
  'publication year': '2005.0',
  'average rating': 4.82,
  'ratings count': 28900},
 {'authors': 'Brandon Sanderson',
  'title': 'Words of Radiance (The Stormlight Archive, #2)',
  'publication year': '2014.0',
  'average rating': 4.77,
  'ratings count': 73572},
 {'authors': 'J.K. Rowling, Mary GrandPré',
  'title': 'Harry Potter Boxed Set, Books 1-5 (Harry Potter, #1-5)',
  'publication year': '2003.0',
  'average rating': 4.77,
  'ratings count': 33220},
 {'authors': 'Anonymous, Lane T. Dennis, Wayne A. Grudem',
  'title': 'ESV Study Bible',
  'publication year': '2002.0',
  'average rating': 4.76,
  'ratings count': 8953},
 {'authors': 'Francine Rivers',
  'title': 'Mark of the Lion Trilogy',
  'publication year': '1993.0',
  'average rating': 4.76,
  'ratings count': 9081}]
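
If reordering the caller's list is not wanted, a non-mutating variant is possible. This is a minimal sketch on made-up sample data; the name `sort_metadata_copy` is hypothetical and not part of the assignment.

```python
from operator import itemgetter

def sort_metadata_copy(books_metadata):
    """Hypothetical non-mutating variant: returns a new list sorted by rating."""
    return sorted(books_metadata, key=itemgetter('average rating'), reverse=True)

# Made-up sample data for illustration only.
sample = [
    {'title': 'A', 'average rating': 3.9},
    {'title': 'B', 'average rating': 4.8},
    {'title': 'C', 'average rating': 4.8},
]
ranked = sort_metadata_copy(sample)
print([b['title'] for b in ranked])  # B and C tie, so they keep their input order
print([b['title'] for b in sample])  # the input list is unchanged
```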

__B3.__ _(5 points)_ Now complete the function below to take two arguments: the list, and an integer value for the minimum ratings count. The new function should sort _and_ filter the list, returning a list of books, sorted by average rating, that have been rated by more than the specified number of users. Here, you must use loops and `if` statements; in subsequent parts, you must use comprehensions and the built-in `filter()` instead (look up documentation and examples).

In [None]:
# B3:Function(5/5)

def sort_and_filter(books_metadata, min_ratings_count):
    new_metadata = []
    
    #---your code starts here---
    
    #---your code stops here---
    
    return new_metadata

For reference, your output should be:
```
[{'authors': 'J.K. Rowling, Mary GrandPré',
  'title': 'Harry Potter and the Deathly Hallows',
  'publication year': '2007.0',
  'average rating': 4.61,
  'ratings count': 1746574},
 {'authors': 'J.K. Rowling, Mary GrandPré',
  'title': 'Harry Potter and the Half-Blood Prince',
  'publication year': '2005.0',
  'average rating': 4.54,
  'ratings count': 1678823},
 {'authors': 'J.K. Rowling, Mary GrandPré, Rufus Beck',
  'title': 'Harry Potter and the Prisoner of Azkaban',
  'publication year': '1999.0',
  'average rating': 4.53,
  'ratings count': 1832823},
 {'authors': 'J.K. Rowling, Mary GrandPré',
  'title': 'Harry Potter and the Goblet of Fire',
  'publication year': '2000.0',
  'average rating': 4.53,
  'ratings count': 1753043},
 {'authors': 'J.K. Rowling, Mary GrandPré',
  'title': 'Harry Potter and the Order of the Phoenix',
  'publication year': '2003.0',
  'average rating': 4.46,
  'ratings count': 1735368}]
```

In [None]:
# B3:SanityCheck

sorted_filtered_metadata = sort_and_filter(metadata, 1000000)
sorted_filtered_metadata[:5]
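
One way the loop-based approach could look is sketched below on made-up data. The name `sort_and_filter_loop` is hypothetical, and the strict `>` comparison follows the "more than" wording of the prompt; this is an illustration, not the graded solution.

```python
def sort_and_filter_loop(books_metadata, min_ratings_count):
    """Keep books with more than min_ratings_count ratings, best-rated first."""
    kept = []
    for book in books_metadata:
        # strict comparison: a book with exactly min_ratings_count is dropped
        if book['ratings count'] > min_ratings_count:
            kept.append(book)
    # sort the filtered copy, leaving the input list untouched
    kept.sort(key=lambda b: b['average rating'], reverse=True)
    return kept

# Made-up sample data for illustration only.
toy_books = [
    {'title': 'A', 'average rating': 4.1, 'ratings count': 500},
    {'title': 'B', 'average rating': 3.9, 'ratings count': 2500},
    {'title': 'C', 'average rating': 4.6, 'ratings count': 1500},
    {'title': 'D', 'average rating': 4.7, 'ratings count': 1000},
]
result = sort_and_filter_loop(toy_books, 1000)
print([b['title'] for b in result])
```

Filtering before sorting means only the surviving books are sorted.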

__B4:__ _(8 points)_ Now use the other two approaches to create different versions of the function. See if you can condense your code into a single line for some of the approaches! 

First, use a list comprehension _inside_ of the `sorted()` function:

In [None]:
# B4:Function(4/8)

def sort_and_filter(books_metadata, min_ratings_count):
    
    #---your code starts here---
    
    #---your code stops here---
    
    return new_metadata

For reference, your output should be:
```
[{'authors': 'J.K. Rowling, Mary GrandPré',
  'title': 'Harry Potter and the Deathly Hallows',
  'publication year': '2007.0',
  'average rating': 4.61,
  'ratings count': 1746574},
 {'authors': 'J.K. Rowling, Mary GrandPré',
  'title': 'Harry Potter and the Half-Blood Prince',
  'publication year': '2005.0',
  'average rating': 4.54,
  'ratings count': 1678823},
 {'authors': 'J.K. Rowling, Mary GrandPré, Rufus Beck',
  'title': 'Harry Potter and the Prisoner of Azkaban',
  'publication year': '1999.0',
  'average rating': 4.53,
  'ratings count': 1832823},
 {'authors': 'J.K. Rowling, Mary GrandPré',
  'title': 'Harry Potter and the Goblet of Fire',
  'publication year': '2000.0',
  'average rating': 4.53,
  'ratings count': 1753043},
 {'authors': 'J.K. Rowling, Mary GrandPré',
  'title': 'Harry Potter and the Order of the Phoenix',
  'publication year': '2003.0',
  'average rating': 4.46,
  'ratings count': 1735368}]
```

In [None]:
# B4:SanityCheck

sorted_filtered_metadata = sort_and_filter(metadata, 1000000)
sorted_filtered_metadata[:5]
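
A list comprehension inside `sorted()` might be condensed as in the sketch below. The function name `sort_and_filter_comprehension` and the sample data are hypothetical.

```python
def sort_and_filter_comprehension(books_metadata, min_ratings_count):
    """Filter with a list comprehension, then sort the result by rating."""
    return sorted(
        [book for book in books_metadata
         if book['ratings count'] > min_ratings_count],
        key=lambda book: book['average rating'],
        reverse=True,
    )

# Made-up sample data for illustration only.
toy_books = [
    {'title': 'A', 'average rating': 4.2, 'ratings count': 300},
    {'title': 'B', 'average rating': 4.0, 'ratings count': 900},
    {'title': 'C', 'average rating': 4.5, 'ratings count': 700},
]
result = sort_and_filter_comprehension(toy_books, 500)
print([b['title'] for b in result])
```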

Next, use the built-in `filter()` function:

In [None]:
# B4:Function(4/8)

def sort_and_filter(books_metadata, min_ratings_count):
    
    #---your code starts here---
    
    #---your code stops here---
    
    return new_metadata

For reference, your output should be:
```
[{'authors': 'J.K. Rowling, Mary GrandPré',
  'title': 'Harry Potter and the Deathly Hallows',
  'publication year': '2007.0',
  'average rating': 4.61,
  'ratings count': 1746574},
 {'authors': 'J.K. Rowling, Mary GrandPré',
  'title': 'Harry Potter and the Half-Blood Prince',
  'publication year': '2005.0',
  'average rating': 4.54,
  'ratings count': 1678823},
 {'authors': 'J.K. Rowling, Mary GrandPré, Rufus Beck',
  'title': 'Harry Potter and the Prisoner of Azkaban',
  'publication year': '1999.0',
  'average rating': 4.53,
  'ratings count': 1832823},
 {'authors': 'J.K. Rowling, Mary GrandPré',
  'title': 'Harry Potter and the Goblet of Fire',
  'publication year': '2000.0',
  'average rating': 4.53,
  'ratings count': 1753043},
 {'authors': 'J.K. Rowling, Mary GrandPré',
  'title': 'Harry Potter and the Order of the Phoenix',
  'publication year': '2003.0',
  'average rating': 4.46,
  'ratings count': 1735368}]
```

In [None]:
# B4:SanityCheck

sorted_filtered_metadata = sort_and_filter(metadata, 1000000)
sorted_filtered_metadata[:5]
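
The `filter()` approach can be sketched the same way. `filter()` returns a lazy iterator, which `sorted()` consumes directly, so no intermediate list is needed. The function name `sort_and_filter_builtin` and the sample data are hypothetical.

```python
def sort_and_filter_builtin(books_metadata, min_ratings_count):
    """Filter with the built-in filter(), then sort the result by rating."""
    return sorted(
        filter(lambda book: book['ratings count'] > min_ratings_count,
               books_metadata),
        key=lambda book: book['average rating'],
        reverse=True,
    )

# Made-up sample data for illustration only.
toy_books = [
    {'title': 'A', 'average rating': 3.5, 'ratings count': 80},
    {'title': 'B', 'average rating': 4.4, 'ratings count': 20},
    {'title': 'C', 'average rating': 4.9, 'ratings count': 50},
    {'title': 'D', 'average rating': 4.1, 'ratings count': 60},
]
result = sort_and_filter_builtin(toy_books, 50)
print([b['title'] for b in result])
```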

__B5.__ _(1 point)_ Objectively, should the solution in __B3__ be faster or slower than either of the solutions in __B4__? Print your answer as `'Faster'` or `'Slower'` in the code cell below.

In [None]:
# Inline(1/1)

# Objectively, should the solution in B3 be faster or slower than either of the solutions from B4? 
# Print your answer as 'Faster' or 'Slower', below
print("<Faster or Slower>")