# Print Simple Tables
> Personal notes on different methods for printing plain text tables in Python.

- toc: true
- badges: true
- comments: true
- categories: [general]

In [1]:
#hide

# PyPI
from bs4 import BeautifulSoup as BS
import requests

# Standard
import pprint
import json
import re

# SUPPORTING FUNCTIONS
def build_title(title: str) -> str:
    return f'-------------------------------\n{title}\n-------------------------------\n'

In [2]:
#hide

import requests, json

query_url = 'https://idph.illinois.gov/DPHPublicInformation/api/COVID/GetResurgenceData?regionID=8&daysIncluded=3' #+ selectedRegion + '&daysIncluded=' + chartRange

page = requests.get(query_url)

print(build_title('CLI ADMISSIONS'))
pprint.pprint(page.json()['CLIAdmissions'])

print(build_title('COUNTY TEST POSITIVITY REPORTS'))
pprint.pprint(page.json()['CountyTestPositivityReports'])

print(build_title('HOSPITAL AVAILABILITY'))
pprint.pprint(page.json()['HospitalAvailability'])

print(build_title('HOSPITAL BEDS IN USE AVG'))
pprint.pprint(page.json()['HospitalBedsInUseAvg'])

print(build_title('TEST POSITIVITY'))
pprint.pprint(page.json()['TestPositivity'])

print(build_title('LAST UPDATED DATE'))
pprint.pprint(page.json()['lastUpdatedDate'])

print(build_title('REGION METRICS'))
pprint.pprint(page.json()['regionMetrics'])

-------------------------------
CLI ADMISSIONS
-------------------------------

[{'CLIAdmissionsRA': 50.0,
  'regionDescription': 'West Suburban',
  'regionID': 8,
  'reportDate': '2020-11-24T00:00:00'},
 {'CLIAdmissionsRA': 50.0,
  'regionDescription': 'West Suburban',
  'regionID': 8,
  'reportDate': '2020-11-25T00:00:00'},
 {'CLIAdmissionsRA': 47.0,
  'regionDescription': 'West Suburban',
  'regionID': 8,
  'reportDate': '2020-11-26T00:00:00'},
 {'CLIAdmissionsRA': 47.0,
  'regionDescription': 'West Suburban',
  'regionID': 8,
  'reportDate': '2020-11-27T00:00:00'}]
-------------------------------
COUNTY TEST POSITIVITY REPORTS
-------------------------------

[{'countyTestPositivities': [{'CountyName': 'DuPage',
                              'dailyPositivity': 0.0,
                              'positive_test': 777,
                              'positivityRollingAvg': 12.3,
                              'regionID': 8,
                              'totalTest': 6411},
            

## Manually Created Tables

- Creating plain text tables from scratch is tedious, especially if they need to be flexible.
- Libraries exist that remove all of this work (one is explored further below), but as with any programming library, it's helpful (if not very important) to understand the general concept of what is happening behind the scenes.
- This section covers that "basic concept" part. It walks through sample methods for creating tables (from crude to dynamic).

### Crude separators

**Sample Data 1**

In [3]:
table_list = [
    ['INDEX', 'COLUMN1', 'COLUMN2', 'COLUMN3'],
    ['1', 'A', 'B', 'C'],
    ['2', 'D', 'E', 'F'],
    ['3', 'G', 'H', 'I']
]

- At the simplest level, you could just print out each sublist with its items separated by bars. The issue with this approach is that nothing guarantees the columns will line up. In this case rows 2-4 line up with each other, but not with the column headers.

In [4]:
for item in table_list:
    row = f'| {item[0]} | {item[1]} | {item[2]} | {item[3]} |'
    print(row)

| INDEX | COLUMN1 | COLUMN2 | COLUMN3 |
| 1 | A | B | C |
| 2 | D | E | F |
| 3 | G | H | I |


### Basic column width
- A basic fix for this is to decide on a column width and pad each string to that length with a simple expression.
- In the case below, the column width is 10 characters, and the expression is similar to: `' ' * (10 - len(string_variable))`
  - Working from the inside out, this does the following:
    1. `string_length = len(string_variable)`      => counts the number of characters in the string that contains the value
    2. `length_diff = 10 - string_length`          => subtracts the result of step 1 from ten
    3. `string_fill = ' ' * length_diff`           => multiplies a space character by the length difference between the original value and the required column width
    4. `final_str = string_variable + string_fill` => length now meets the column width requirement
- This approach is used below. For each item in each list, the code takes the original string, calculates the required filler spaces, then builds the padded value.
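
The four steps above can be sketched as a small helper. As an aside, Python's built-in `str.ljust` and the `<` format specifier produce the same padding. The helper name `pad_manual` and the sample value are illustrative assumptions, not part of the original notes.

```python
# Hypothetical helper (not from the original notes) that follows the four steps.
def pad_manual(string_variable: str, width: int = 10) -> str:
    string_length = len(string_variable)    # step 1: count the characters
    length_diff = width - string_length     # step 2: spaces still needed
    string_fill = ' ' * length_diff         # step 3: build the filler
    return string_variable + string_fill    # step 4: value plus filler


value = 'COLUMN1'
manual = pad_manual(value)

# Built-in equivalents: str.ljust and the '<' (left-align) format specifier.
with_ljust = value.ljust(10)
with_format = f'{value:<10}'

print(repr(manual))
print(manual == with_ljust == with_format)
```

The manual version is kept in these notes because it shows what the built-ins are doing behind the scenes.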

In [5]:
for item in table_list:
    # HORIZONTAL LINES ADDED FOR REASONS
    print('+---------------------------------------------------+')
    item0 = item[0] + " " * (10 - len(item[0]))
    item1 = item[1] + " " * (10 - len(item[1]))
    item2 = item[2] + " " * (10 - len(item[2]))
    item3 = item[3] + " " * (10 - len(item[3]))
    row = f'| {item0} | {item1} | {item2} | {item3} |'
    print(row)
print('+---------------------------------------------------+')

+---------------------------------------------------+
| INDEX      | COLUMN1    | COLUMN2    | COLUMN3    |
+---------------------------------------------------+
| 1          | A          | B          | C          |
+---------------------------------------------------+
| 2          | D          | E          | F          |
+---------------------------------------------------+
| 3          | G          | H          | I          |
+---------------------------------------------------+


- The output above now resembles an actual table of data, but the current approach comes with drawbacks. For example: what happens if a string is longer than the fixed column width?
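
To make the drawback concrete, here is a minimal sketch (the sample values are made up for illustration). Multiplying a string by a negative number yields an empty string in Python, so an over-long value is not truncated. It simply pushes the rest of the row out of alignment.

```python
# Illustrative values only: the second one is longer than the fixed width of 10.
short_value = 'B'
long_value = '9999999999999'   # 13 characters

for value in (short_value, long_value):
    filler = ' ' * (10 - len(value))   # a negative count produces an empty string
    cell = value + filler
    print(f'| {cell} | width={len(cell)}')
```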

### Dynamic Column Width

**Sample Data 2** (increases the `COLUMN2` value in row 4 beyond 10 characters)

In [6]:
table_list = [
    ['INDEX', 'COLUMN1', 'COLUMN2', 'COLUMN3'],
    ['1', 'A', 'B', 'C'],
    ['2', 'D', 'E', 'F'],
    ['3', 'G', '9999999999999', 'I']
]

- In the next example, each column's width is dynamically calculated based on the longest string in that column.
- Doing this requires iterating over each index position in every row to find the max length, then storing that information in another list (`width_list`).
- The index positions in `width_list` match up with column index positions, so `width_list[n]` can directly replace the "10" used above.
- NOTE: The horizontal line is now calculated as well. There is no magic in the calculation, just trial and error until I figured it out.
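
For comparison, the same column widths can be computed more compactly by transposing the rows with `zip(*table_list)` and taking the longest string per column. This is a sketch rather than the approach used in these notes, and it assumes every row has the same number of items. It also gives the line-length calculation a simple reading: each column contributes its width plus two padding spaces, and there is one `|` separator between each pair of adjacent columns.

```python
table_list = [
    ['INDEX', 'COLUMN1', 'COLUMN2', 'COLUMN3'],
    ['1', 'A', 'B', 'C'],
    ['2', 'D', 'E', 'F'],
    ['3', 'G', '9999999999999', 'I'],
]

# zip(*table_list) yields one tuple per column; max(...) finds the longest string.
width_list = [max(len(value) for value in column) for column in zip(*table_list)]

padding = 2 * len(width_list)       # one space on each side of every cell
separators = len(width_list) - 1    # inner '|' characters between columns
row_length = sum(width_list) + padding + separators

print(width_list)
print(row_length)
```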

In [7]:
# CREATE EMPTY LIST FOR COLUMN WIDTHS
width_list = []

# ITERATE OVER `table_list`, ONCE FOR EACH LIST/ROW
# - Sets an index of 0, which is increased each time an index position (representing a column) is fully accounted for throughout `table_list`
# - NOTE: the loop bound is the number of rows, which only works here because this sample has as many rows as columns
index = 0
while index < len(table_list):
    # FOR EACH INDEX POSITION ITERATE OVER THAT INDEX IN EVERY LIST TO FIND LONGEST STRING
    for row in table_list:
        try:
            if len(row[index]) > width_list[index]:
                width_list[index] = len(row[index])
        # HANDLES EXCEPTIONS THROWN FOR THE FIRST TIME THE CODE ATTEMPTS TO SET/CHANGE EACH INDEX IN `width_list`
        # - Creates the index position by appending a 0, then sets that position as the length of the current value being evaluated
        except IndexError as e:
            if str(e) == 'list index out of range':
                width_list.append(0)
                width_list[index] = len(row[index])
    index += 1

# RESULT IS LIST WITH LONGEST STRING IN EACH COLUMN
print('Longest string in each column:', width_list, '\n')

# CALCULATE HORIZONTAL LINES BASED ON `width_list`
row_length = len(width_list)*2 + (len(width_list) - 1) + sum(width_list)
# BUILD LINE BASED ON LENGTH CALCULATION
line = f'+{"-" * row_length}+'

# BUILD AND PRINT TABLE
for item in table_list:
    item0 = item[0] + ' ' * (width_list[0] - len(item[0]))
    item1 = item[1] + ' ' * (width_list[1] - len(item[1]))
    item2 = item[2] + ' ' * (width_list[2] - len(item[2]))
    item3 = item[3] + ' ' * (width_list[3] - len(item[3]))
    print(line)
    print(f'| {item0} | {item1} | {item2} | {item3} |')
print(line)

Longest string in each column: [5, 7, 13, 7] 

+-------------------------------------------+
| INDEX | COLUMN1 | COLUMN2       | COLUMN3 |
+-------------------------------------------+
| 1     | A       | B             | C       |
+-------------------------------------------+
| 2     | D       | E             | F       |
+-------------------------------------------+
| 3     | G       | 9999999999999 | I       |
+-------------------------------------------+


### Dynamic Number of Columns

**Sample Data 3** (increases number of rows and columns)

In [8]:
table_list = [
    ['INDEX', 'COLUMN1', 'COLUMN2', 'COLUMN3', 'COLUMN4', 'COLUMN5'],
    ['1', 'A', 'B', 'C', 'CB', 'CC'],
    ['2', 'D', 'E', 'F', 'FB', 'FC'],
    ['3', 'G', '9999999999999', 'I', 'IB', 'IC'],
    ['4', '!', '@', '#', '$', '%'],
]

- Every example up to this point assumes that the number of columns in the table is known.
- The example below populates `width_list` dynamically (this simply involves adding `[0]` to the `while` condition, which comes with its own limitations) and builds each row dynamically (rather than hard-coding index positions).
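
As a point of comparison, the whole idea (validate the rows, measure each column, pad, and join) can be condensed into one function. This is only a sketch: the name `render_table` is hypothetical, and it swaps in `zip`, `str.ljust`, and `str.join` for the explicit loops used in these notes.

```python
def render_table(rows):
    """Return a plain text table for a list of equal-length lists of strings."""
    # A valid table needs every row to have the same number of items.
    if len({len(row) for row in rows}) != 1:
        raise ValueError('Not all rows are the same length.')

    # Longest string per column, found by transposing rows into columns.
    widths = [max(len(value) for value in column) for column in zip(*rows)]

    # Each column adds its width plus two spaces; columns are split by one '|'.
    line = '+' + '-' * (sum(widths) + 3 * len(widths) - 1) + '+'

    output = [line]
    for row in rows:
        cells = [value.ljust(width) for value, width in zip(row, widths)]
        output.append('| ' + ' | '.join(cells) + ' |')
        output.append(line)
    return '\n'.join(output)


sample = [
    ['INDEX', 'COLUMN1'],
    ['1', 'A'],
    ['2', '9999999999999'],
]
print(render_table(sample))
```

Because nothing is indexed by hand, the same function works for any number of rows and columns.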

In [9]:
# CREATE EMPTY LIST FOR COLUMN WIDTHS
width_list = []

# VALIDATES THAT EVERY LIST/ROW IS THE SAME LENGTH (required for valid table)
# - The map function runs `len` against every list in table_list and records the results in a map object
# - Wrapping `list()` around the map function converts the map object to a list of ints representing list lengths
lengths = list(map(len, table_list))
# - The list comprehension tests the first value in `lengths` against every value in the list.
#   - If the values match, True, else False.
# - The conditional expression around the comprehension checks if False exists anywhere in the resulting list.
#   - If False exists one or more times `same` is set to False (which will skip the rest of the code).
#   - If False does not exist `same` is set to True
same = False if False in [lengths[0] == x for x in lengths] else True

# ITERATE OVER `table_list`, ONCE FOR EACH COLUMN
# - Sets an index of 0, which is increased each time an index position (representing a column) is fully accounted for throughout `table_list`
if same:
    index = 0
    while index < len(table_list[0]):
        # FOR EACH INDEX POSITION ITERATE OVER THAT INDEX IN EVERY LIST TO FIND LONGEST STRING
        for row in table_list:
            try:
                if len(row[index]) > width_list[index]:
                    width_list[index] = len(row[index])
            # HANDLES EXCEPTIONS THROWN FOR THE FIRST TIME THE CODE ATTEMPTS TO SET/CHANGE EACH INDEX IN `width_list`
            # - Creates the index position by appending a 0, then sets that position as the length of the current value being evaluated
            except IndexError as e:
                if str(e) == 'list index out of range':
                    width_list.append(0)
                    width_list[index] = len(row[index])
        index += 1

    # RESULT IS LIST WITH LONGEST STRING IN EACH COLUMN
    print('Longest string in each column:', width_list, '\n')

    # CALCULATE HORIZONTAL LINES BASED ON `width_list`
    row_length = len(width_list)*2 + (len(width_list) - 1) + sum(width_list)
    # BUILD LINE BASED ON LENGTH CALCULATION
    line = f'+{"-" * row_length}+'

    # BUILD AND PRINT TABLE
    for list_row in table_list:
        width_index = 0
        row = ''
        # THE BUILD RELATED STATEMENTS ARE NOW DYNAMIC INSTEAD OF HARDCODED
        for item in list_row:
            item_str = " " + item + " " * (width_list[width_index] - len(item)) + " "
            row = row + '|' + item_str
            width_index += 1
        print(line)
        print(row + '|')
    print(line)
else:
    print('Not all rows are the same length. Valid table can\'t be built.')


Longest string in each column: [5, 7, 13, 7, 7, 7] 

+---------------------------------------------------------------+
| INDEX | COLUMN1 | COLUMN2       | COLUMN3 | COLUMN4 | COLUMN5 |
+---------------------------------------------------------------+
| 1     | A       | B             | C       | CB      | CC      |
+---------------------------------------------------------------+
| 2     | D       | E             | F       | FB      | FC      |
+---------------------------------------------------------------+
| 3     | G       | 9999999999999 | I       | IB      | IC      |
+---------------------------------------------------------------+
| 4     | !       | @             | #       | $       | %       |
+---------------------------------------------------------------+


- This gets the job done, but it is a **lot of work** just to print a plain text table.
- Fortunately, there is a library that does a lot of this work behind the scenes

## Tables Created with **Tabulate**

- PyPI: https://pypi.org/project/tabulate/
- Conda-Forge: https://anaconda.org/conda-forge/tabulate
- GitHub: https://github.com/astanin/python-tabulate (includes docs)

**Sample Data 3** (same as previous example)

In [10]:
table_list = [
    ['INDEX', 'COLUMN1', 'COLUMN2', 'COLUMN3', 'COLUMN4', 'COLUMN5'],
    ['1', 'A', 'B', 'C', 'CB', 'CC'],
    ['2', 'D', 'E', 'F', 'FB', 'FC'],
    ['3', 'G', '9999999999999', 'I', 'IB', 'IC'],
    ['4', '!', '@', '#', '$', '%'],
]

### Very Basic Tabulate Table (no headers)

- The simplest form of table just involves passing a list of lists into a tabulate call.
- The result is not very pretty, but we just dropped from 30+ lines of code to... ONE LINE.

In [11]:
from tabulate import tabulate

print(tabulate(table_list))

-----  -------  -------------  -------  -------  -------
INDEX  COLUMN1  COLUMN2        COLUMN3  COLUMN4  COLUMN5
1      A        B              C        CB       CC
2      D        E              F        FB       FC
3      G        9999999999999  I        IB       IC
4      !        @              #        $        %
-----  -------  -------------  -------  -------  -------


### Basic Tabulate Table with Headers

- The first example was nice and easy, but it didn't separate the header list from the data.
- Tabulate allows headers to be set using a `headers` parameter.
- In the example below, notice the headers list was also removed from `table_list` (otherwise they would show up twice).

In [12]:
print(tabulate(table_list[1:], headers=['INDEX', 'COLUMN1', 'COLUMN2', 'COLUMN3', 'COLUMN4', 'COLUMN5']))

  INDEX  COLUMN1    COLUMN2        COLUMN3    COLUMN4    COLUMN5
-------  ---------  -------------  ---------  ---------  ---------
      1  A          B              C          CB         CC
      2  D          E              F          FB         FC
      3  G          9999999999999  I          IB         IC
      4  !          @              #          $          %


- Typing out all of the headers can be tedious.
- Fortunately, an existing list object can be passed into the `headers` parameter, and a headers list already exists: `table_list[0]`
- Below is a cleaned up example with the header list sliced out of `table_list`, but added to `headers`

In [13]:
print(tabulate(table_list[1:], headers=table_list[0]))

  INDEX  COLUMN1    COLUMN2        COLUMN3    COLUMN4    COLUMN5
-------  ---------  -------------  ---------  ---------  ---------
      1  A          B              C          CB         CC
      2  D          E              F          FB         FC
      3  G          9999999999999  I          IB         IC
      4  !          @              #          $          %


- Similar to how `headers` works, a literal list can also be typed out as the first positional argument (rather than passing in an existing list object).

### Tabulate Table with Index

- It's also easy to add an index (our example already has a manually created one, but a generated index can still be useful).

In [14]:
print(tabulate(table_list[1:], headers=table_list[0], showindex="always"))

      INDEX  COLUMN1    COLUMN2        COLUMN3    COLUMN4    COLUMN5
--  -------  ---------  -------------  ---------  ---------  ---------
 0        1  A          B              C          CB         CC
 1        2  D          E              F          FB         FC
 2        3  G          9999999999999  I          IB         IC
 3        4  !          @              #          $          %


### Tabulate Table Formats

- One of the nicest features of tabulate is that it has a large number of formats available

In [15]:
print(tabulate(table_list[1:], headers=table_list[0], tablefmt="github"))

|   INDEX | COLUMN1   | COLUMN2       | COLUMN3   | COLUMN4   | COLUMN5   |
|---------|-----------|---------------|-----------|-----------|-----------|
|       1 | A         | B             | C         | CB        | CC        |
|       2 | D         | E             | F         | FB        | FC        |
|       3 | G         | 9999999999999 | I         | IB        | IC        |
|       4 | !         | @             | #         | $         | %         |


In [16]:
print(tabulate(table_list[1:], headers=table_list[0], tablefmt="grid"))

+---------+-----------+---------------+-----------+-----------+-----------+
|   INDEX | COLUMN1   | COLUMN2       | COLUMN3   | COLUMN4   | COLUMN5   |
+=========+===========+===============+===========+===========+===========+
|       1 | A         | B             | C         | CB        | CC        |
+---------+-----------+---------------+-----------+-----------+-----------+
|       2 | D         | E             | F         | FB        | FC        |
+---------+-----------+---------------+-----------+-----------+-----------+
|       3 | G         | 9999999999999 | I         | IB        | IC        |
+---------+-----------+---------------+-----------+-----------+-----------+
|       4 | !         | @             | #         | $         | %         |
+---------+-----------+---------------+-----------+-----------+-----------+


In [17]:
print(tabulate(table_list[1:], headers=table_list[0], tablefmt="fancy_grid"))

╒═════════╤═══════════╤═══════════════╤═══════════╤═══════════╤═══════════╕
│   INDEX │ COLUMN1   │ COLUMN2       │ COLUMN3   │ COLUMN4   │ COLUMN5   │
╞═════════╪═══════════╪═══════════════╪═══════════╪═══════════╪═══════════╡
│       1 │ A         │ B             │ C         │ CB        │ CC        │
├─────────┼───────────┼───────────────┼───────────┼───────────┼───────────┤
│       2 │ D         │ E             │ F         │ FB        │ FC        │
├─────────┼───────────┼───────────────┼───────────┼───────────┼───────────┤
│       3 │ G         │ 9999999999999 │ I         │ IB        │ IC        │
├─────────┼───────────┼───────────────┼───────────┼───────────┼───────────┤
│       4 │ !         │ @             │ #         │ $         │ %         │
╘═════════╧═══════════╧═══════════════╧═══════════╧═══════════╧═══════════╛


In [18]:
print(tabulate(table_list[1:], headers=table_list[0], tablefmt="psql"))

+---------+-----------+---------------+-----------+-----------+-----------+
|   INDEX | COLUMN1   | COLUMN2       | COLUMN3   | COLUMN4   | COLUMN5   |
|---------+-----------+---------------+-----------+-----------+-----------|
|       1 | A         | B             | C         | CB        | CC        |
|       2 | D         | E             | F         | FB        | FC        |
|       3 | G         | 9999999999999 | I         | IB        | IC        |
|       4 | !         | @             | #         | $         | %         |
+---------+-----------+---------------+-----------+-----------+-----------+


## Summary

- It's very helpful to know the basics of how a plain text table can be constructed.
- Once that concept is understood, it's also very helpful to have a tool that does most of the work for you.
- The documentation at the GitHub link above shows that the **tabulate** library has a lot of functionality not shown here, as well as a number of additional formatting options.