## Introduction

This report demonstrates the code developed in the `assignment3_code.py` file, illustrating how the classes and methods can be used to process data effectively.

#### Design
Throughout this assignment I have used lists of dictionaries to act as databases. I have hardcoded the tabular data from question one as a list of dictionaries in the accompanying .py file, as this data is small and will not change. This format is ideal for structured data, as each field can be assigned a meaningful key (e.g. 'ISBN' in the books database or 'postId' in the comments database). The list groups the dictionaries together, which makes it easy to add, remove, or modify data. Python's built-in functions can also be used for tasks such as sorting, searching, and filtering, and this structure allows continuity in approach throughout the assignment.
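
As a minimal sketch of this design, the snippet below sorts, filters, and edits a list of dictionaries using built-ins alone. The records are illustrative placeholders rather than the assignment's dataset; only the key names echo the report.

```python
# Hypothetical records for illustration only.
records = [
    {"ISBN": "000-0-000-00000-1", "Title": "Example A", "Price": 9.50},
    {"ISBN": "000-0-000-00000-2", "Title": "Example B", "Price": 4.25},
    {"ISBN": "000-0-000-00000-3", "Title": "Example C", "Price": 12.00},
]

# Sort by a field with sorted() and a key function; the original list is untouched.
by_price = sorted(records, key=lambda row: row["Price"])

# Filter with a list comprehension.
over_five = [row for row in records if row["Price"] > 5]

# Add and remove entries with ordinary list operations.
records.append({"ISBN": "000-0-000-00000-4", "Title": "Example D", "Price": 7.00})
records = [row for row in records if row["Title"] != "Example B"]

print([row["Title"] for row in by_price])
print([row["Title"] for row in over_five])
print(len(records))
```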

I have used three classes to organise the code: `helperClass`, `bookData`, and `commentData`. The `helperClass` contains utility functions that help make the code more modular, such as `wrap_text()`, `get_url()`, `gen_rand_time()`, and others. `bookData` and `commentData` follow similar structures to one another. However, each class has extensions and changes that are relevant only to the books or comments database, such as validating ISBN numbers in the `bookData` class and adding `post_info()` to the comments database. This mirrored functionality makes the methods in each class intuitive to use. It is worth noting that the filtering methods in `commentData` allow the user to specify the field to filter by, in addition to the value to filter on. This makes the `commentData` methods more modular and reusable, and allows them to be used even if field names change (as is the case after using the `post_info()` method).

#### Implementation highlights
This implementation prints all books in a well-formatted table, with column widths dynamically adjusted based on the content. It also filters the data case-insensitively, to allow for minor input errors. The full list of books is stored in `self.books`, and each filter operation returns a new `bookData` object containing a filtered subset. Similarly, in the `commentData` class, each filter operation returns a new `commentData` object with a filtered subset, ensuring that the original list of dictionaries remains unchanged. This approach promotes clean, modular code by preserving the integrity of the original dataset. There are a couple of exceptions to this, for instance when the `post_info` method is called, which adds post_info data to the existing dataset, modifying it directly.
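
The dynamic column widths described above can be sketched as follows. This is an illustration under assumptions, not the code in `assignment3_code.py`: the function name `format_table` and the sample rows are hypothetical.

```python
def format_table(rows, columns):
    """Return a table string whose column widths fit the longest cell."""
    # Each column is as wide as the longest of its header and its values.
    widths = {
        col: max([len(col)] + [len(str(row[col])) for row in rows])
        for col in columns
    }
    header = "| " + " | ".join(col.ljust(widths[col]) for col in columns) + " |"
    rule = "-" * len(header)
    body = [
        "| " + " | ".join(str(row[col]).ljust(widths[col]) for col in columns) + " |"
        for row in rows
    ]
    return "\n".join([rule, header, rule, *body, rule])


# Hypothetical rows: the table narrows or widens with the data it is given.
sample = [
    {"Title": "Short", "Author": "A. Writer"},
    {"Title": "A much longer title", "Author": "B"},
]
print(format_table(sample, ["Title", "Author"]))
```

Because the widths are recomputed on every call, a filtered subset with shorter values produces a narrower table than the full dataset.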

The `bookData` and `commentData` classes include optional parameters in their methods, such as `show`, `limit`, and `wrap_width`, providing flexibility in how the data is displayed. These parameters allow users to:

 - Control whether to display all text for each value (useful when dealing with long text fields like the body of comment data).
 - Limit the number of entries printed to avoid excessive output.
 - Wrap text to ensure the table remains neatly formatted.
 - Chain methods without printing intermediate results: each filtering method returns a new object, and `show` can suppress its output.

The `commentData` class has an extra parameter, `preview_only`, to handle the typically larger and more complex data associated with comments. This parameter controls how text is displayed, allowing longer string values to be cut off at a specified length to maintain overall readability. This approach keeps the code modular and flexible, enabling users to customise the display and behaviour of the output as needed.
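
A preview option of this kind might be sketched as below. The helper name `preview` and the default cut-off of 20 characters are assumptions made for illustration; the cut-off used by `preview_only` in the assignment code is not stated in this report.

```python
def preview(text, max_chars=20):
    """Cut text to max_chars characters and mark the cut with an ellipsis."""
    text = str(text)
    if len(text) <= max_chars:
        return text  # Short values are shown in full.
    return text[:max_chars] + "..."


print(preview("short value"))
print(preview("laudantium enim quasi est quidem magnam", max_chars=10))
```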

In [1]:
import assignment3_code as at
printer = at.bookData(at.books)

#### Question 1: Book Data - Function Demonstrations
The code below demonstrates functions to print book data based on different search and filter criteria. This includes printing all book data, printing by book title or author, printing books over a specified price, and sorting and printing the database. To keep the notebook clean and succinct, I have limited the sorted output to 4 books. Each method below prints the results of a filtered books database using the `printAllBooks()` method, demonstrating modularity. The formatted tables dynamically adjust to the length of the data in the filtered results. Note that the `helperClass.wrap_text()` method specifies the width at which to wrap text.

In [2]:
print("1. All books") 
printer.printAllBooks()

print("\n2. By title: 'Catch-22'")
printer.printByTitle('Catch-22')

print("\n3. By author: 'George R. R. Martin'")
printer.printByAuthor('George R. R. Martin')

print("\n4. Books over £12.99")
printer.printOverPrice(12.99)

print("\n5. All books sorted by price (limit = 4)")
printer.printAllBooksSorted('Price', reverse_param=True, limit=4);

1. All books
------------------------------------------------------------------------------
| ISBN                | Title              | Author              | Price (£) |
------------------------------------------------------------------------------
| 978-1-785-15028-9   | Go Set a Watchman  | Harper Lee          |      9.89 |
| 978-0-744-01669-7   | The Legend of      | Prima Games         |     14.99 |
|                     | Zelda: Tri Force   |                     |           |
|                     | Heroes             |                     |           |
| 978-0-099-52912-6   | Catch-22           | Joseph Heller       |      6.29 |
| 978-0-007-44783-1   | A Clash of Kings   | George R. R. Martin |      4.95 |
| 978-1-853-26000-1   | Pride and          | Jane Austin         |      1.99 |
|                     | Prejudice          |                     |           |
| 978-0-099-57685-3   | Casino Royale      | Ian Fleming         |      6.79 |
| 978-0-099-54948-2   | To Kill a      

#### Question 2: ISBN Validation
The `valid()` method returns a boolean value depending on whether a given input is a valid ISBN number or not. The `validateISBNs()` method removes books with invalid ISBN numbers, if there are any. 

In [3]:
printer.valid(9781785150289) # Check for a valid ISBN

True

In [4]:
printer.validateISBNs(); # Demonstrate how to remove books with invalid ISBN numbers

Removing book: 'Catch-22' from the database.
Removing book: 'Fundamentals of Computer Architecture' from the database.


In [5]:
printer.validateISBNs(); # No invalid ISBN numbers

All books in database have valid ISBN numbers.
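
The results above are consistent with the standard ISBN-13 check-digit rule, in which the 13 digits are weighted alternately by 1 and 3 and the weighted sum must be divisible by 10. The sketch below applies that rule; the function name `is_valid_isbn13` is hypothetical, and the `valid()` method in the assignment code may be implemented differently.

```python
def is_valid_isbn13(isbn):
    """Return True if isbn passes the ISBN-13 check-digit rule."""
    # Accept integers or strings, and ignore hyphens and spaces.
    digits = str(isbn).replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    # Weights alternate 1, 3, 1, 3, ... across all 13 digits.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return total % 10 == 0


print(is_valid_isbn13(9781785150289))        # Go Set a Watchman
print(is_valid_isbn13("978-0-099-52912-6"))  # Catch-22, as listed in the table
```

Under this rule the first call passes and the second fails, which matches the removal of 'Catch-22' shown above.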


#### Question 3: Comments JSON data

Question 3 loads JSON data from a given URL and processes it using the `commentData` class. The code demonstrates filtering and printing, and display parameters such as `limit` and `preview_only` have been used to keep the report neat. The methods used mirror the structure of question 1, highlighting the modularity of the code.

In [6]:
# Load JSON data from given URL
decoded_data = at.helperClass.get_url("https://jsonplaceholder.typicode.com/comments")
comments = at.commentData(decoded_data)

In [7]:
# Demonstrate how to filter and print comments
print("Filter by ID = 11")
comments.printByVar('id', 11, preview_only=True);

print("\nFilter by postId > 3 (limit = 1)")
comments.printOverVar('postId', 3, preview_only=True,limit=1);

print("\nSort by ID (reverse, limit = 2)")
comments.printCommentsSorted("id", reverse_param=True, preview_only=True, limit=2);

Filter by ID = 11
------------------------------------------------------------------------
| ID  | Post ID | Name       | Email        | Body                     |
------------------------------------------------------------------------
| 11  | 3       | fugit      | Veronica_Goo | ut dolorum nostrum id    |
|     |         | labore     | dwin@timmoth | quia...                  |
|     |         | qui...     | y.net        |                          |
------------------------------------------------------------------------

Filter by postId > 3 (limit = 1)
------------------------------------------------------------------------
| ID  | Post ID | Name       | Email        | Body                     |
------------------------------------------------------------------------
| 16  | 4       | perferendi | Christine@ay | iste ut laborum aliquid  |
|     |         | s temp...  | ana.info     | ve...                    |
------------------------------------------------------------------------

Simulated information about each post's time, date, and IP address can be added using the `add_field()` method. This method generates a random date between 1st January 2010 and yesterday's date, avoiding future timestamps. The helper method `gen_rand_ip()` is also used to generate an IP address. These helper functions rely on the `random` and `datetime` modules. The additions are previewed below, with just the first entry displayed to demonstrate successful addition of the metadata. The new dataset is then re-encoded as valid JSON using the `encode_json` method.

In [8]:
# Add new 'post_info' field including date, time, and IP
comments.add_field(show=True, preview_only=True, limit_field=1);

------------------------------------------------------------------------------
| ID  | Post ID | Name       | Email        | Body        | Post Info        |
------------------------------------------------------------------------------
| 1   | 1       | id labore  | Eliseo@gardn | laudantium  | Time: 02:46:11   |
|     |         | ex et ...  | er.biz       | enim quasi  | Date: 2013:04:14 |
|     |         |            |              | est ...     | IP address:      |
|     |         |            |              |             | 70.71.165.153    |
------------------------------------------------------------------------------


In [9]:
# Encode the data back to JSON format
json_data = at.helperClass.encode_json(comments.comments)
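
The metadata generation and JSON encoding steps can be sketched with the standard library alone. All names here (`random_timestamp`, `random_ipv4`, `with_post_info`) and the keys inside `post_info` are hypothetical, and the date format differs from the output shown above; this is an illustration of the idea rather than the assignment's helpers.

```python
import json
import random
from datetime import date, datetime, time, timedelta


def random_timestamp(start=date(2010, 1, 1), rng=random):
    """Pick a random date between start and yesterday, plus a random time."""
    yesterday = date.today() - timedelta(days=1)
    span_days = (yesterday - start).days
    day = start + timedelta(days=rng.randint(0, span_days))
    moment = time(rng.randint(0, 23), rng.randint(0, 59), rng.randint(0, 59))
    return datetime.combine(day, moment)


def random_ipv4(rng=random):
    """Return four random octets joined by dots."""
    return ".".join(str(rng.randint(0, 255)) for _ in range(4))


def with_post_info(comment, rng=random):
    """Return a copy of comment with a simulated 'post_info' entry."""
    stamp = random_timestamp(rng=rng)
    info = {
        "time": stamp.strftime("%H:%M:%S"),
        "date": stamp.strftime("%Y-%m-%d"),
        "ip": random_ipv4(rng),
    }
    return {**comment, "post_info": info}


# Placeholder comment; a seeded generator makes the run repeatable.
comment = {"id": 1, "postId": 1, "name": "placeholder"}
enriched = with_post_info(comment, random.Random(0))
encoded = json.dumps([enriched], indent=2)
print(encoded)
```

Capping the range at yesterday's date is what rules out future timestamps, whatever time of day is drawn.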

### Testing

In [10]:
# No matching values
printer.printByTitle("Not found");
comments.printByVar('id', 5000);

No books found.
No comments found with 'id' matching '5000'.


In [11]:
# Case insensitivity
printer.printByAuthor("ian fleming");
comments.printByVar('name', 'Id Labore ex Et Quam LABORUM');

----------------------------------------------------------------------
| ISBN                | Title              | Author      | Price (£) |
----------------------------------------------------------------------
| 978-0-099-57685-3   | Casino Royale      | Ian Fleming |      6.79 |
----------------------------------------------------------------------
-------------------------------------------------------------------------------------------
| ID  | Post ID | Name       | Email        | Body                     | Post Info        |
-------------------------------------------------------------------------------------------
| 1   | 1       | id labore  | Eliseo@gardn | laudantium enim quasi    | Time: 02:46:11   |
|     |         | ex et quam | er.biz       | est quidem magnam        | Date: 2013:04:14 |
|     |         | laborum    |              | voluptate ipsam eos      | IP address:      |
|     |         |            |              | tempora quo              | 70.71.165.153    |
|

#### Chaining methods

To build in flexibility, each filtering method (e.g. `printByTitle()` or `printByVar()`) is designed to both print results immediately, if the optional parameter `show` is set to True, and return a new object containing only the filtered subset of the data. This approach ensures users can see the results of each filtering step if required and allows method chaining. The example below demonstrates the books data being filtered by price, and then results of this filter being sorted. 

In [12]:
printer.printOverPrice(6.50, show=False).printAllBooksSorted('Price', limit=3);

-------------------------------------------------------------------------
| ISBN                | Title              | Author         | Price (£) |
-------------------------------------------------------------------------
| 978-0-099-57685-3   | Casino Royale      | Ian Fleming    |      6.79 |
| 978-1-785-15028-9   | Go Set a Watchman  | Harper Lee     |      9.89 |
| 978-0-701-18935-8   | Simply Nigella:    | Nigella Lawson |     12.50 |
|                     | Feel Good Food     |                |           |
-------------------------------------------------------------------------


Similar functionality is demonstrated in the `commentData` methods, where I have set a limit on the number of rows printed to the screen for neatness.

In [13]:
comments.printByVar('postId', 10, show=False).printCommentsSorted('id', reverse_param=True, show=True, preview_only=True, limit=3);

-------------------------------------------------------------------------------------------
| ID  | Post ID | Name       | Email        | Body                     | Post Info        |
-------------------------------------------------------------------------------------------
| 50  | 10      | dolorum    | Kiana_Predov | eum accusamus aut        | Time: 05:21:10   |
|     |         | soluta     |              | delectus...              | Date: 2021:09:25 |
|     |         | q...       | ic@yasmin.io |                          | IP address:      |
|     |         |            |              |                          | 242.183.252.205  |
-------------------------------------------------------------------------------------------
| 49  | 10      | rerum      | Camryn.Weima | id est iure occaecati    | Time: 12:50:08   |
|     |         | placeat    | nn@doris.io  | quam...                  | Date: 2014:12:24 |
|     |         | qu...      |              |                          | IP addr

The examples used in this report demonstrate a range of behaviour in the code. Basic functionality with correct and incorrect inputs, edge cases such as those with no matching values or case-insensitivity, method chaining, ISBN validation and post generation have all been demonstrated thoroughly yet succinctly.  

### Difficulties

One of the main challenges was allowing method calls to chain while still letting results be printed immediately after each call. Chaining is helpful for building up a set of filters before displaying results, but it is clunky if the results of every method call are printed along the way. Therefore, my approach returns a new object containing the filtered data to allow chaining, and I have added an optional parameter, `show`, so users can specify whether or not to print the results of filter or sort methods.
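
The pattern can be sketched as follows. The class `Table` and its method names are hypothetical stand-ins for `bookData` and `commentData`, and the rows are placeholders; the sketch only illustrates how returning a new object and gating output on `show` fit together.

```python
class Table:
    """Hypothetical stand-in for a class such as bookData."""

    def __init__(self, rows):
        self.rows = list(rows)

    def filter_equals(self, field, value, show=True):
        """Keep rows whose field matches value, ignoring case."""
        wanted = str(value).casefold()
        subset = [r for r in self.rows if str(r[field]).casefold() == wanted]
        return self._finish(subset, show)

    def filter_over(self, field, threshold, show=True):
        """Keep rows whose field is greater than threshold."""
        subset = [r for r in self.rows if r[field] > threshold]
        return self._finish(subset, show)

    def sort_by(self, field, reverse=False, show=True):
        """Order rows by field."""
        ordered = sorted(self.rows, key=lambda r: r[field], reverse=reverse)
        return self._finish(ordered, show)

    def _finish(self, rows, show):
        # Wrap the result in a new object so self.rows is never modified.
        result = Table(rows)
        if show:
            for row in result.rows:
                print(row)
        return result


books = Table([
    {"Title": "Example A", "Price": 9.50},
    {"Title": "Example B", "Price": 4.25},
    {"Title": "Example C", "Price": 12.00},
])

# Only the final step prints; the intermediate filter stays silent.
cheapest_over_five = books.filter_over("Price", 5, show=False).sort_by("Price")
```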

This approach does have some limitations. Returning a new object after each method call could lead to memory inefficiency for very large datasets, as many copies may be generated. On balance, this issue seems manageable due to working with small datasets in this instance. 

Another difficulty was displaying long comment data in a readable format. Values in the 'name' or 'body' fields can contain many characters, so printing them in a neatly formatted table was a challenge. To address this I developed a `wrap_text()` function that wraps text within a fixed width, introduced a `preview_only` parameter that shows partial values instead of long displays, and added a `limit` parameter to control the number of results printed. These additions add complexity to the code, but they offer greater control over the way outputs are printed. From this process I learnt to consider user experience more carefully and developed skills in string formatting. If the table still displays poorly, adjusting the width of the pane may allow it to be presented more neatly.
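
Wrapping within a fixed width can be sketched with the standard `textwrap` module. The helper names `wrap_cell` and `format_row` are hypothetical, and this is not necessarily how `wrap_text()` is written; the values are taken from the books table earlier in the report.

```python
import textwrap


def wrap_cell(text, width):
    """Split text into lines no wider than width; keep at least one line."""
    return textwrap.wrap(str(text), width=width) or [""]


def format_row(values, widths):
    """Lay out one table row, adding continuation lines for wrapped cells."""
    cells = [wrap_cell(value, width) for value, width in zip(values, widths)]
    height = max(len(cell) for cell in cells)
    lines = []
    for i in range(height):
        # Cells with fewer lines than the tallest cell are padded with blanks.
        parts = [
            (cell[i] if i < len(cell) else "").ljust(width)
            for cell, width in zip(cells, widths)
        ]
        lines.append("| " + " | ".join(parts) + " |")
    return lines


for line in format_row(["978-1-853-26000-1", "Pride and Prejudice", 1.99], [17, 10, 5]):
    print(line)
```

A wrapped cell makes the whole row taller, which is why the other columns in the tables above show blank continuation lines.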


### Conclusion

The code effectively processes structured data, such as tabular and JSON data, by storing it as a list of dictionaries. This implementation allows intuitive filtering and sorting, and the methods developed follow consistent logic. This project goes beyond the assignment brief, as the methods include additional display controls that allow the user to decide how filtered results are displayed. This demonstrates thoughtful awareness of user experience. Moreover, the code is modular and intuitive to follow due to the mirrored structure of methods in `bookData` and `commentData`.