# Coursework 2B: CSV and HTML

### Last updated 21.11.2019

This coursework is designed to illustrate and develop techniques for
extracting information from files and transforming it into a format
suitable for display. In particular, you will be concerned with extracting
information from CSV files and displaying it in an HTML format. You will
also see how information can be enriched by automatic addition of links
to other relevant information.


### Questions Overview

* Extract and display information from a CSV playlist file (4 marks)

* Generate an HTML search query from textual query strings (2 marks)

* Generate an HTML table from a playlist file (9 marks)

### Files Provided
A couple of example CSV playlist files are provided:
* [```geek-music.csv```](https://teaching.bb-ai.net/PythonCoding/Files_and_Web/geek-music.csv)
* [```snake-music.csv```](https://teaching.bb-ai.net/PythonCoding/Files_and_Web/snake-music.csv)

### Provided Code for Extracting Data from CSV Files

You have seen how we can quite easily extract data from a CSV file in the form of a _list of lists_, which I call a _datalist_.

In [1]:
## Import module for handling csv files.
import csv

## You can use the following function to read data from a csv format file
# The data is extracted from the file as a list of lists

def get_datalist_from_csv( filename ):
    ## Create a 'file object' f, for accessing the file
    with open( filename ) as f:
         reader = csv.reader(f)     # create a 'csv reader' from the file object
         datalist = list( reader )  # create a list from the reader
    return datalist

def test():
    data = get_datalist_from_csv("geek-music.csv")
    for row in data: 
        print(row)
        
test()
    

['Track', 'Artist', 'Album', 'Time']
['Computer Love', 'Kraftwerk', 'Computer World', '7:15']
['Paranoid Android', 'Radiohead', 'OK Computer', '6:27']
['Computer Age', 'Neil Young', 'Trans', '5:24']
['Digital', 'Joy Division', 'Still', '2:50']
['Silver Machine', 'Hawkwind', 'Roadhawks', '4:39']
['Start the Simulator', 'A-Ha', 'Foot of the Mountain', '5:11']
['Internet Connection', 'M.I.A.', 'MAYA', '2:56']
['Deep Blue', 'Arcade Fire', 'The Suburbs', '4:29']
['I Will Derive!', 'MindofMatthew', 'You Tube', '3:17']
['Lobachevsky', 'Tom Lehrer', 'You Tube', '3:04']
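
Note that the first row of the datalist is the header row, not a track. The sketch below shows one possible way of separating the header from the data rows. It is only an illustration: the helper name `split_header_and_rows` and the in-memory sample `SAMPLE_CSV` are assumptions made for this example and are not part of the provided code.

```python
import csv
import io

# Hypothetical in-memory sample, in the same four-column format
# as the provided playlist files.
SAMPLE_CSV = """Track,Artist,Album,Time
Computer Love,Kraftwerk,Computer World,7:15
Digital,Joy Division,Still,2:50
"""

def split_header_and_rows(datalist):
    """Return (header, rows): the first row, and all the rows after it."""
    if not datalist:
        return [], []
    return datalist[0], datalist[1:]

# csv.reader accepts any iterable of lines, so a StringIO object
# can stand in for an open file.
datalist = list(csv.reader(io.StringIO(SAMPLE_CSV)))
header, rows = split_header_and_rows(datalist)

print(header)
print(len(rows), "data rows")
```

Keeping the header separate means that counts and comparisons over `rows` do not accidentally include the column titles.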


## Q1: Extracting information from files


Download the files ```geek-music.csv``` and ```snake-music.csv```.
These files contain music playlists, specified in a fixed four column format. The contents of  ```geek-music.csv``` are:

<pre>
Track,Artist,Album,Time
Computer Love,Kraftwerk,Computer World,7:15
Paranoid Android,Radiohead,OK Computer,6:27
Computer Age,Neil Young,Trans,5:24
Digital,Joy Division,Still,2:50
Silver Machine,Hawkwind,Roadhawks,4:39
Start the Simulator,A-Ha,Foot of the Mountain,5:11
Internet Connection,M.I.A.,MAYA,2:56
Deep Blue,Arcade Fire,The Suburbs,4:29
I Will Derive!,MindofMatthew,You Tube,3:17
Lobachevsky,Tom Lehrer,You Tube,3:04
</pre>

The first three columns give the track name, artist and album. These are all strings.
For this exercise, you do not need to worry about commas and quotations within the CSV file.
I will not put these in my test file, and in any case, you are recommended to import the ```csv``` 
module, which will handle such characters automatically. 

The format for the playing time is usually ```m:s``` but could also be ```h:m:s``` or ```s``` (for very long
or short tracks),  where ```h``` ```m``` and ```s``` are any appropriate 1 or 2 digit numbers representing hours, minutes and seconds, respectively.
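
One possible way to handle all three time formats is to convert each time string into a total number of seconds, which makes times easy to compare and add. The following is a minimal sketch under that assumption; the function names `time_to_seconds` and `seconds_to_time` are hypothetical, and other approaches are equally valid.

```python
def time_to_seconds(time_str):
    """Convert 'h:m:s', 'm:s' or 's' into a total number of seconds."""
    seconds = 0
    # Each further field is worth 60 times less than the one before it.
    for part in time_str.strip().split(":"):
        seconds = seconds * 60 + int(part)
    return seconds

def seconds_to_time(total_seconds):
    """Format a number of seconds as 'm:ss', or 'h:mm:ss' if an hour or more."""
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    if hours:
        return "{}:{:02d}:{:02d}".format(hours, minutes, seconds)
    return "{}:{:02d}".format(minutes, seconds)

print(time_to_seconds("7:15"))     # 7 minutes and 15 seconds
print(time_to_seconds("1:02:03"))  # 1 hour, 2 minutes and 3 seconds
print(seconds_to_time(2732))
```

Working in seconds avoids having to carry between the minutes and seconds fields by hand when summing the playing times.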

Your task is to write a function ```display_playlist_info``` that can load a playlist CSV file and extract and print out the following information:
* the number of tracks,
* the longest and shortest tracks,
* the total playing time of all tracks.

So, for ```geek-music.csv``` the output should be something like:

<pre>
Displaying info for playlist file: geek-music.csv
Number of tracks: 10
Longest track: Computer Love by Kraftwerk (7:15)
Shortest track: Digital by Joy Division (2:50)
Total playing time: 45:32
</pre>


In [2]:
import csv
def display_playlist_info( filename ):
    print( "Displaying info for playlist file:", filename )
    ## replace the following with code to produce the required info
    print( "Sorry, display code not written.")

### Grading for Q1

The following cell is for the marker to use to run your ```display_playlist_info``` function. It will be run on a different playlist file from the ones you have been given, but that file will have the same format in terms of its columns, i.e. 4 columns with the
contents described above.

In [3]:
from PP_CW2B_CSV_HTML_tests import *
grade = None
grade_PP_CW2B(display_playlist_info, grade)


Grading: display_playlist_info
calling  display_playlist_info( "geek-music.csv"  ) .....
Displaying info for playlist file: geek-music.csv
Sorry, display code not written.
calling  display_playlist_info( "snake-music.csv" ) .....
Displaying info for playlist file: snake-music.csv
Sorry, display code not written.


### Q1 Feedback
_Marker's feedback on Q1 will be given here._

## Background and Examples on Displaying and Transforming Data
The ```tabulate``` module provides a simple and easy way
to display a datalist as a table.

(Note: ```tabulate``` came installed with my Anaconda3 Python installation, but I have been told it is not pre-installed with all distributions. So you may need to install it. However, this
example is not essential to the assignment, so you could
just comment out the code in the next code cell and the one
below that also uses ```tabulate```.)

In [2]:
from tabulate import tabulate
    
TABLE = [["Sun",696000,1989100000],
         ["Earth",6371,5973.6],
         ["Moon",1737,73.5],
         ["Mars",3390,641.85]]

print( tabulate(TABLE) )

-----  ------  -------------
Sun    696000     1.9891e+09
Earth    6371  5973.6
Moon     1737    73.5
Mars     3390   641.85
-----  ------  -------------


### Displaying a Table using HTML

We will first do this using the ```tabulate``` module, which provides functions for creating tables, 
and the ```IPython.display``` module, which enables various types of information representation, including HTML,
to be displayed in Jupyter output.

In [5]:
from IPython.display import HTML, display
import tabulate

table = [["Sun",696000,1989100000],
         ["Earth",6371,5973.6],
         ["Moon",1737,73.5],
         ["Mars",3390,641.85]]

display(HTML(tabulate.tabulate(table, tablefmt='html')))



0,1,2
Sun,696000,1989100000.0
Earth,6371,5973.6
Moon,1737,73.5
Mars,3390,641.85



The preceding code example shows a quick and easy way of visualising tabular information. However, it hides the HTML formatting of the table from the user. This is often fine, because the user does not really care about the formatting and just wants to see the data laid out in a readable way. However, it gives little control over the way the data is displayed, so it may not present the data the way we would like.

To get more control over how information is presented, we can explicitly specify HTML formatting codes in order to display information. This is more flexible, but does involve dealing with
the somewhat complex details of HTML, which can be time consuming.

HTML can be used
more efficiently if we write general purpose functions that
enable data to be automatically transformed into appropriate
HTML code. In fact, this is just one case where we want to transform
information represented in one data format into another format.

The code example in the following cell illustrates such a data transformation by showing how a datalist (such as can be extracted from a CSV file) can be automatically converted into an HTML table.
The example also shows how we can add flexibility to the
transformation by having default settings that can be overridden
by specifying alternatives.


In [6]:
## We need the following modules to display HTML directly
## in the output of a Jupyter code cell
from IPython.display import HTML, display

## Here we specify default values for certain HTML elements
## Note that here we are using a "dictionary of dictionaries".
HTML_DEFAULTS = {
  # Main dictionary key  
  "table_style"         : { "text-align" : "center",
                            "border"     : '4px solid black' },

  "element_style"       : { "text-align" : "center",
                            "border"     : '1px solid black' } ,
    
  # Colors are specified by 6 hexadecimal digits: "#RRGGBB"
  # The digits represent the amount of Red, Green and Blue
  "element_color"      : "#ddffff", 
}

def make_html_table_from_datalist( 
       datalist,      # This is the actual data
    
       # These are optional styling parameters:
       table_style    = HTML_DEFAULTS["table_style"   ], 
       element_style  = HTML_DEFAULTS["element_style" ],
       element_color  = HTML_DEFAULTS["element_color" ]  ):
    
    ## Get html code representing the table style options
    ## using function defined below
    table_style_str = html_style_string( table_style )
    
    ## Start of table (the table style string is inserted)
    html_string = "<table {}>\n".format(table_style_str)
    
    ## Get html code representing the style of table elements
    elt_style_str   = html_style_string( element_style )
    
    ## Add html for each row of the table
    for row in datalist:
        
        # start table row:
        html_string += '<tr style="background-color:{}">\n'.format(element_color) 
        
        # add html for each element of the row 
        for element in row:
            html_string += "<td {}>{}</td> ".format(elt_style_str, str(element)) 
            
        html_string += "\n</tr>\n" # end table row
        
    html_string += "</table>\n" # end of table (close the opening <table> tag)

    return html_string ## return html string for the full table


## Create an HTML style string from a dictionary of style options
def html_style_string( style_dict ):
    style_str = ""
    if style_dict:
        style_str = 'style="'
        for style in style_dict: 
            style_str += "{}:{};".format(style, style_dict[style])
        style_str += '"' # add final close quote
    return style_str

## Now use functions from the imported modules display and HTML
## to display our HTML string as formatted HTML output.
def display_datalist_as_html_table( datalist ):
    html_table = make_html_table_from_datalist( datalist )
    display(HTML( html_table ))
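
With the functions above, overriding a default means passing a complete replacement style dictionary. A possible variation, sketched here as an assumption rather than as part of the provided code, is to merge a few overriding entries into a copy of the defaults, so that only the properties you want to change need to be given. The names `DEFAULT_TABLE_STYLE`, `merged_style` and `style_attribute` are hypothetical.

```python
# Hypothetical defaults, mirroring the "table_style" entry of HTML_DEFAULTS.
DEFAULT_TABLE_STYLE = {"text-align": "center", "border": "4px solid black"}

def merged_style(defaults, overrides=None):
    """Return a new dict: a copy of defaults updated with any overrides."""
    style = dict(defaults)         # copy, so the defaults are never modified
    style.update(overrides or {})  # overriding entries replace the defaults
    return style

def style_attribute(style_dict):
    """Build an HTML style attribute from a dictionary of style options."""
    if not style_dict:
        return ""
    body = "".join("{}:{};".format(key, value)
                   for key, value in style_dict.items())
    return 'style="{}"'.format(body)

# Change only the border, keeping the default text alignment.
custom = merged_style(DEFAULT_TABLE_STYLE, {"border": "2px dashed red"})
print(style_attribute(custom))
```

Copying the defaults before updating matters here: changing a shared default dictionary in place would silently alter every table created afterwards.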
    

### An example conversion from CSV to HTML
The following cell tests the functions defined so far by showing how we can extract a datalist from the file "geek-music.csv", and display it as an HTML table.

In [7]:
datalist = get_datalist_from_csv("geek-music.csv") 

display_datalist_as_html_table( datalist )

0,1,2,3
Track,Artist,Album,Time
Computer Love,Kraftwerk,Computer World,7:15
Paranoid Android,Radiohead,OK Computer,6:27
Computer Age,Neil Young,Trans,5:24
Digital,Joy Division,Still,2:50
Silver Machine,Hawkwind,Roadhawks,4:39
Start the Simulator,A-Ha,Foot of the Mountain,5:11
Internet Connection,M.I.A.,MAYA,2:56
Deep Blue,Arcade Fire,The Suburbs,4:29
I Will Derive!,MindofMatthew,You Tube,3:17


### Automatically Adding URL Links
Perhaps the most distinctive feature of an HTML document is that it is connected to other documents or information sources via links.

Suppose I want to create a link that, when clicked, will carry out a search for some given words on YouTube. 

YouTube supports access to its search engine via a special type of URL that will perform a search and generate a results page, just as if a normal search was being performed. But in this case the words that it searches for are in the URL, so do not need to be typed in.

For instance, the URL to search YouTube for "Computer Love" by "Kraftwerk" is:

```html
https://www.youtube.com/results?search_query=Computer+Love+by+Kraftwerk
```

Similarly, to search for "Kraftwerk" in Wikipedia, I can use the following URL:

```html
https://en.wikipedia.org/?search=Kraftwerk
```
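
Search text containing spaces or punctuation has to be encoded before it can be placed in a URL. As a minimal sketch, the standard library function `urllib.parse.urlencode` can build the query part of the Wikipedia URL shown above. The function name `wikipedia_search_url` is an assumption made for this illustration.

```python
from urllib.parse import urlencode

def wikipedia_search_url(search_text):
    """Build a Wikipedia search URL of the form shown above."""
    # urlencode turns spaces into '+' and percent-encodes characters
    # that are not safe to put in a URL.
    return "https://en.wikipedia.org/?" + urlencode({"search": search_text})

print(wikipedia_search_url("Kraftwerk"))
print(wikipedia_search_url("Joy Division"))
```

Simply replacing spaces with `+` by hand works for simple names, but an encoding function also copes with characters such as `&` or `!` that would otherwise change the meaning of the URL.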

To actually create a link in an HTML document, we need to use an _anchor_ tag, of
the form:
```html
<a href="the_url">Link Text</a>
```

This is illustrated in the following code cell, which uses the _magic_ declaration ```%%HTML```
to specify that the cell should be interpreted and displayed as HTML, when run. If you run it
you should see two active HTML links in the output.

In [8]:
%%HTML
<a href="https://www.youtube.com/results?search_query=Computer+Love+by+Kraftwerk">Computer Love</a>
<br>
<a href="https://en.wikipedia.org/?search=Kraftwerk">Kraftwerk</a>
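
The same kind of anchor tag can be produced from Python as a string. The following sketch assumes a hypothetical helper called `html_link`; it uses the standard library function `html.escape` so that characters such as `&` or `"` in the URL or link text cannot break the surrounding HTML.

```python
import html

def html_link(url, link_text):
    """Return the HTML code for an anchor tag pointing at url."""
    return '<a href="{}">{}</a>'.format(
        html.escape(url, quote=True),  # safe inside a quoted attribute
        html.escape(link_text))        # safe as visible text

print(html_link("https://en.wikipedia.org/?search=Kraftwerk", "Kraftwerk"))
```

In a Jupyter notebook, the returned string could then be shown as an active link by passing it to `display(HTML(...))`.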

## Q2: Automatically Generate an HTML link from data

In order to dynamically create a web-page with useful links from some data source
(e.g. a playlist csv file), we need a function that will generate the correct
HTML code for such a link from some given query terms.

To answer this question, code a function ```youtube_query_link( track, artist, link_text)```, which will create an HTML link that will search for the given
```track``` by the given ```artist``` and use ```link_text``` for the text of the link.

Thus for example:
```
youtube_query_link( "Computer Love", "Kraftwerk", "Computer Love" )

```
would produce the link:
```
<a href="https://www.youtube.com/results?search_query=Computer+Love+by+Kraftwerk">Computer Love</a>
```
as in the previous example.

In [9]:
from IPython.display import display, HTML

def youtube_query_link( track, artist, link_text):
    ## Modify the following code to produce the correct youtube query link
    return( '<a target="_blank" href="https://teaching.bb-ai.net">Hello!</a>' ) 

## Note: the cryptic HTML link option target="_blank", makes the link open in
##       a new tab or new window.


### Q2 Grading
The following cell will be used by the marker to grade Q2. 
You can use it to check that your ```youtube_query_link``` function generates a link as required.

In [10]:
from PP_CW2B_CSV_HTML_tests import *
grade = None
grade_PP_CW2B(youtube_query_link, grade)


Grading: youtube_query_link
Testing: display(HTML(func( "Hallogallo", "Neu!", "Hallogallo by Neu!")))


### Q2 Feedback
_Marker's feedback on Q2 will be given here._

## Q3: Generate an HTML Display from a Playlist CSV File

Your goal for the last part of the coursework is to write a function ```display_playlist_as_html_table```,
which will take the filename of a CSV playlist file, in the format described above, and display
its contents in an HTML table.
The table should be made attractive and informative by the use of HTML styling options
and data enrichment such as adding additional information and links.


In [11]:
from IPython.display import display, HTML

def display_playlist_as_html_table( filename ):
    ## Define a function to display a playlist attractively using HTML
    ## The playlist is read from filename, which is a CSV file.
    print( "This should display the playlist in the form of a nice HTML table.")

### Q3 Grading

In [12]:
from PP_CW2B_CSV_HTML_tests import *
grade = None
grade_PP_CW2B( display_playlist_as_html_table, grade )


Grading: display_playlist_as_html_table
This should display the playlist in the form of a nice HTML table.


### Q3 Feedback
_Marker's feedback on Q3 will be given here._

### Overall Grade Processing
Once grading has been completed this will show the marks awarded.

In [13]:
from PP_CW2B_CSV_HTML_tests import *
show_grades_PP_CW2B()

This function will be used to display your grades once it has been marked.
Q1:  None
Q2:  None
Q3:  None
