<a href="https://colab.research.google.com/github/ProfessorPatrickSlatraigh/CIS2300/blob/main/DataFrameFormattedOutput.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Formatted Output from pandas DataFrames  
  
*by Professor Patrick: May 2025*  
  
*This notebook can be accessed using -- https://bit.ly/DataFrameFormattedOutput*  
  
    
## Contents  
- Using `pandas` built-in `.to_markdown()` (and saving it as a file)   
- Using `tabulate` for Pretty Terminal Output of pandas DataFrames  
- Using IPython `display()` for Pretty Output of pandas DataFrames  




---



## Using `pandas` built-in `.to_markdown()`  

To render the content of a DataFrame as **Markdown** text, use the pandas `.to_markdown()` method. It relies on the `tabulate` package, so `tabulate` must be installed.

In [None]:
import pandas as pd
from tabulate import tabulate

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Ted', 'Carol'],
    'Age': [25, 30, 22, 28],
    'Office' : ['Milwaukee', 'Duluth', 'Duluth', 'Chicago'],
    'Department': ['HR', 'Engineering', 'Finance', 'Marketing']
})

print(df.to_markdown())

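Given a file path, the same `.to_markdown()` method writes the Markdown text to a file instead of returning it. A `.md` extension matches the file's content.

In [None]:
df.to_markdown("markdown4u.md")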



---



## Using `tabulate` for Pretty Terminal Output of pandas DataFrames

This section demonstrates how to use the `tabulate` library to print well-formatted tables from pandas DataFrames in a variety of styles. The `tabulate` module is useful for displaying tables in plain-text environments, such as the terminal or logs.

### Sections
- Introduction
- Basic Usage of `tabulate`
- Examples of Available Styles
- When to Use or Avoid `tabulate`

### Introduction

Install the library using pip:
```bash
pip install tabulate
```
Import it in your Python code alongside `pandas`:

In [None]:
import pandas as pd
from tabulate import tabulate

### Basic Usage

Create a simple DataFrame and format it using the `tabulate` function.

In [None]:
import pandas as pd
from tabulate import tabulate

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Ted', 'Carol'],
    'Age': [25, 30, 22, 28],
    'Office' : ['Milwaukee', 'Duluth', 'Duluth', 'Chicago'],
    'Department': ['HR', 'Engineering', 'Finance', 'Marketing']
})

print(tabulate(df, headers='keys', tablefmt='psql'))

### Examples of `tabulate` Styles
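Below are examples of the table formats supported by `tabulate`, all using the same DataFrame. Some of these formats (such as `asciidoc` and the newer `*_grid` and `*_outline` variants) are only available in recent releases of `tabulate`, so upgrade the library if a format does not render as expected.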

#### <u>A Variety of General-Use Styles</u>    

##### Format: `asciidoc`

In [None]:
# Format: asciidoc
print(tabulate(df, headers='keys', tablefmt='asciidoc'))

##### Format: `double_grid`

In [None]:
# Format: double_grid
print(tabulate(df, headers='keys', tablefmt='double_grid'))

##### Format: `double_outline`

In [None]:
# Format: double_outline
print(tabulate(df, headers='keys', tablefmt='double_outline'))

##### Format: `grid`

**Description:** Fully bordered grid with `+`, `-`, and `|` characters. Clear in terminals and good for fixed-width fonts.

In [None]:
# Format: grid
print(tabulate(df, headers='keys', tablefmt='grid'))

##### Format: `heavy_grid`

In [None]:
# Format: heavy_grid
print(tabulate(df, headers='keys', tablefmt='heavy_grid'))

##### Format: `heavy_outline`

In [None]:
# Format: heavy_outline
print(tabulate(df, headers='keys', tablefmt='heavy_outline'))

##### Format: `mixed_grid`

In [None]:
# Format: mixed_grid
print(tabulate(df, headers='keys', tablefmt='mixed_grid'))

##### Format: `mixed_outline`

In [None]:
# Format: mixed_outline
print(tabulate(df, headers='keys', tablefmt='mixed_outline'))

##### Format: `outline`

In [None]:
# Format: outline
print(tabulate(df, headers='keys', tablefmt='outline'))

##### Format: `rounded_grid`

In [None]:
# Format: rounded_grid
print(tabulate(df, headers='keys', tablefmt='rounded_grid'))

##### Format: `rounded_outline`

In [None]:
# Format: rounded_outline
print(tabulate(df, headers='keys', tablefmt='rounded_outline'))

##### Format: `simple_grid`

In [None]:
# Format: simple_grid
print(tabulate(df, headers='keys', tablefmt='simple_grid'))

##### Format: `simple_outline`

In [None]:
# Format: simple_outline
print(tabulate(df, headers='keys', tablefmt='simple_outline'))

##### Format: `textile`

In [None]:
# Format: textile
print(tabulate(df, headers='keys', tablefmt='textile'))

#### <u>Styles for Different Contexts/Uses</u>    

##### Format: `pretty`

**Description:** Readable and compact with aligned columns. Balances visual appeal and conciseness.

In [None]:
# Format: pretty
print(tabulate(df, headers='keys', tablefmt='pretty'))

#### <u>Platform- or Document-Specific Styles</u>    


##### Format: `fancy_grid`

**Description:** Unicode box-drawing characters for a visually pleasing output in modern terminals. Best for enhanced CLI tools.

In [None]:
# Format: fancy_grid
print(tabulate(df, headers='keys', tablefmt='fancy_grid'))

##### Format: `fancy_outline`

In [None]:
# Format: fancy_outline
print(tabulate(df, headers='keys', tablefmt='fancy_outline'))

##### Format: `github`

**Description:** Markdown-style format compatible with GitHub README files. Ideal for documentation in repositories.

In [None]:
# Format: github
print(tabulate(df, headers='keys', tablefmt='github'))

##### Format: `html`

**Description:** Generates HTML `<table>` output. Suitable for embedding in web pages.

In [None]:
# Format: html
print(tabulate(df, headers='keys', tablefmt='html'))

##### Format: `jira`

**Description:** Jira markup syntax. Useful when pasting tables into Jira tickets or wiki pages.

In [None]:
# Format: jira
print(tabulate(df, headers='keys', tablefmt='jira'))

##### Format: `latex`

**Description:** Standard LaTeX tabular environment. Good for academic papers or PDFs.

In [None]:
# Format: latex
print(tabulate(df, headers='keys', tablefmt='latex'))

##### Format: `latex_booktabs`

**Description:** Uses LaTeX booktabs style with professional typography. Excellent for polished academic output.

In [None]:
# Format: latex_booktabs
print(tabulate(df, headers='keys', tablefmt='latex_booktabs'))

##### Format: `latex_longtable`

**Description:** LaTeX format for multipage tables. Ideal when printing large datasets.

In [None]:
# Format: latex_longtable
print(tabulate(df, headers='keys', tablefmt='latex_longtable'))

##### Format: `latex_raw`

**Description:** Like `latex`, but LaTeX commands and special characters in the cell text are not escaped. Use it when the cells already contain LaTeX markup.

In [None]:
# Format: latex_raw
print(tabulate(df, headers='keys', tablefmt='latex_raw'))

##### Format: `mediawiki`

**Description:** MediaWiki table format. Use for publishing to wikis like Wikipedia.

In [None]:
# Format: mediawiki
print(tabulate(df, headers='keys', tablefmt='mediawiki'))

##### Format: `moinmoin`

**Description:** MoinMoin wiki syntax. Useful in legacy documentation systems.

In [None]:
# Format: moinmoin
print(tabulate(df, headers='keys', tablefmt='moinmoin'))

##### Format: `orgtbl`

**Description:** Emacs org-mode table style. Suitable if you're generating tables for Emacs users.

In [None]:
# Format: orgtbl
print(tabulate(df, headers='keys', tablefmt='orgtbl'))

##### Format: `pipe`

**Description:** Another Markdown-style format using `|` pipes. Works well in Markdown files.

In [None]:
# Format: pipe
print(tabulate(df, headers='keys', tablefmt='pipe'))

##### Format: `plain`

**Description:** Minimalist format with no borders or separator lines; columns are aligned with spaces. Useful for exporting raw data to logs or simple text files.

In [None]:
# Format: plain
print(tabulate(df, headers='keys', tablefmt='plain'))

##### Format: `psql`

**Description:** PostgreSQL-style output. Familiar to those who use SQL and want structured terminal output.

In [None]:
# Format: psql
print(tabulate(df, headers='keys', tablefmt='psql'))

##### Format: `presto`

**Description:** Similar to psql but tailored for Presto/Trino CLI output. Great for database logs.

In [None]:
# Format: presto
print(tabulate(df, headers='keys', tablefmt='presto'))

##### Format: `rst`

**Description:** reStructuredText format for Sphinx documentation. Ideal for Python project docs.

In [None]:
# Format: rst
print(tabulate(df, headers='keys', tablefmt='rst'))

##### Format: `simple`

**Description:** Compact and readable with minimal horizontal lines. Good for quick CLI output.

In [None]:
# Format: simple
print(tabulate(df, headers='keys', tablefmt='simple'))

##### Format: `tsv`

**Description:** Tab-separated values. Useful for exporting data to TSV files or copy-paste to Excel.

In [None]:
# Format: tsv
print(tabulate(df, headers='keys', tablefmt='tsv'))

##### Format: `unsafehtml`

**Description:** Same as `html`, but the cell content is not HTML-escaped. Use it only if you are sure the input is safe.

In [None]:
# Format: unsafehtml
print(tabulate(df, headers='keys', tablefmt='unsafehtml'))

##### Format: `youtrack`

**Description:** YouTrack wiki-compatible syntax. Designed for JetBrains YouTrack users.

In [None]:
# Format: youtrack
print(tabulate(df, headers='keys', tablefmt='youtrack'))



---



<i>Note: the `tabulate` library does not provide arguments to select or reorder columns directly. It simply formats the data it is given. However, you can control selection and order by modifying the DataFrame before passing it to `tabulate()`.</i>

In [None]:
import pandas as pd
from tabulate import tabulate


df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Ted', 'Carol'],
    'Age': [25, 30, 22, 28],
    'Office' : ['Milwaukee', 'Duluth', 'Duluth', 'Chicago'],
    'Department': ['HR', 'Engineering', 'Finance', 'Marketing']
})

# Select and reorder columns
selected_columns = ['Department', 'Name']
df_selected = df[selected_columns]

# Format with tabulate
print(tabulate(df_selected, headers='keys', tablefmt='psql'))

#### Arguments for `tabulate()`

The `tabulate()` function supports various arguments to control formatting and display behavior. Below is a summary of commonly used arguments.

##### 🔹 Commonly Used Arguments

| Argument             | Type                        | Description |
|----------------------|-----------------------------|-------------|
| `tabular_data`       | list of lists, dict, DataFrame, etc. | The data structure to format into a table |
| `headers`            | `'keys'`, `'firstrow'`, list, or dict | Column headers; `'keys'` uses dict keys or DataFrame columns, `'firstrow'` uses the first row of data |
| `tablefmt`           | string                      | Table format style (e.g., `'grid'`, `'pipe'`, `'psql'`) |
| `numalign`           | `'right'`, `'left'`, `'center'`, `'decimal'` | Align numeric columns; the default, `'decimal'`, lines numbers up on the decimal point |
| `stralign`           | `'right'`, `'left'`, `'center'` | Align text columns |
| `floatfmt`           | string or list              | Format for floating-point numbers (e.g., `".2f"`); a list sets one format per column |
| `missingval`         | string                      | Replacement text for `None` values |
| `showindex`          | bool, `'always'`, `'default'`, `'never'`, or iterable | Whether to display row indices; an iterable supplies custom index values |
| `disable_numparse`   | bool or list of column indices | Prevents automatic detection of numeric types, for all columns or only the listed ones |

##### 🔹 Less Common / Advanced Arguments

| Argument                   | Type      | Description |
|----------------------------|-----------|-------------|
| `colalign`                 | list      | Alignment per column (overrides `stralign`/`numalign`) |
| `maxcolwidths`             | int or list | Maximum column width; longer cell text is wrapped onto multiple lines |
| `maxheadercolwidths`       | int or list | Maximum width for header cells only; longer header text is wrapped |

---

##### 📌 Notes
- You can use `help(tabulate)` to view documentation interactively.
- The output formatting depends on the chosen `tablefmt`. Try styles like `'fancy_grid'`, `'github'`, or `'psql'` to see different effects.


In [None]:
from tabulate import tabulate
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Ted', 'Carol'],
    'Age': [25, 30, 22, 28],
    'Office' : ['Milwaukee', 'Duluth', 'Duluth', 'Chicago'],
    'Department': ['HR', 'Engineering', 'Finance', 'Marketing'],
    'Salary': [70000.1234, 80000.5678, 125500.1255, 1470100.8917]
})


print(tabulate(df, headers='keys', tablefmt='grid', floatfmt=",.2f", numalign="right"))

<i>To show `float` values as currency, convert the column to currency-formatted strings before passing the DataFrame to `tabulate()`.</i>  

In [None]:
from tabulate import tabulate
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Ted', 'Carol'],
    'Age': [25, 30, 22, 28],
    'Office' : ['Milwaukee', 'Duluth', 'Duluth', 'Chicago'],
    'Department': ['HR', 'Engineering', 'Finance', 'Marketing'],
    'Salary': [70000.1234, 80000.5678, 125500.1255, 1470100.8917]
})

# Format the 'Salary' column as strings with currency format
df['Salary'] = df['Salary'].apply(lambda x: f"${x:,.2f}")

# Use tabulate (no floatfmt needed, since values are now strings)
print(tabulate(df, headers='keys', tablefmt='grid'))

### When to Use or Avoid `tabulate`

**Use when:**
- You need readable terminal output
- You're logging data summaries
- You're working in plain-text reports or markdown

**Avoid when:**
- You need interactive or rich HTML (use `display()` in Jupyter)
- You're working with very wide DataFrames
- You require interactivity like sorting/filtering



---



## Using IPython `display()` for Pretty Output of pandas DataFrames

This section explains how to use IPython’s `display()` function to produce rich, well-formatted visual outputs of pandas DataFrames in Jupyter and Colab environments.



### Introduction

The `display()` function from `IPython.display` is commonly used in Jupyter notebooks and Google Colab to show DataFrames with rich formatting. Unlike plain `print()`, which shows the text representation, `display()` asks the object for its rich representation. For a DataFrame that is an HTML table, which notebook interfaces render as a readable, styled table.

In [None]:
# Import necessary modules
import pandas as pd
from IPython.display import display
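
The HTML that a notebook renders comes from the DataFrame itself. As an illustration only, the sketch below calls the DataFrame's `_repr_html_()` method directly to look at that markup; in normal use you would call `display(df)` and not this method.

```python
import pandas as pd

df = pd.DataFrame({
    'Product': ['Laptop', 'Monitor'],
    'Price': [1200.00, 300.00]
})

# In a notebook, display(df) renders the HTML that this method returns
html = df._repr_html_()
print(html[:200])   # first part of the generated markup

# print(df) uses the plain-text representation instead
print(df)
```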

### Basic Usage of `display()`

Here's how to create a sample DataFrame and render it using `display()`.

In [None]:
# Import necessary modules
import pandas as pd
from IPython.display import display

# Create sample DataFrame
df = pd.DataFrame({
    'Product': ['Laptop', 'Monitor', 'Keyboard'],
    'Price': [1200.00, 300.00, 50.00],
    'Stock': [15, 34, 120]
})

# Use display instead of print
display(df)

You can also use `display()` multiple times in a cell to show multiple DataFrames without printing them as plain text.

In [None]:
# Create additional DataFrames
df1 = df[df['Price'] > 100]
df2 = df[df['Stock'] > 50]

# Display both DataFrames separately
display(df1)
display(df2)

#### Summary: Capabilities of `display()` vs `pandas` for Column Control

| Task                   | Supported by `display()`? | Use `pandas` instead |
|------------------------|---------------------------|----------------------|
| Display formatted table | ✅ Yes                    |                      |
| Select specific columns | ❌ No                     | ✅ Yes                |
| Reorder columns         | ❌ No                     | ✅ Yes                |
| Rename columns          | ❌ No                     | ✅ Yes (`df.rename`)  |
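
The pandas side of the table above can be sketched as follows: select, reorder, and rename the columns first, then hand the result to `display()`. The new column label `'Units in Stock'` and the `view` variable are assumptions for this example, and the `print` fallback is only there so the sketch also runs where IPython is not installed.

```python
import pandas as pd

try:
    from IPython.display import display
except ImportError:
    display = print  # fallback so the sketch also runs outside a notebook

df = pd.DataFrame({
    'Product': ['Laptop', 'Monitor', 'Keyboard'],
    'Price': [1200.00, 300.00, 50.00],
    'Stock': [15, 34, 120]
})

# Select and reorder with a column list, then rename for presentation
view = df[['Stock', 'Product']].rename(columns={'Stock': 'Units in Stock'})

display(view)
```

Because `rename()` returns a new DataFrame, `df` keeps its original column names.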


### When to Use or Avoid `display()`

**Use `display()` when:**
- Working in Jupyter or Colab where rich HTML output is supported
- Displaying multiple outputs in the same cell
- You want readable, scrollable, styled DataFrames
- Creating instructional content, tutorials, or reports with rendered output

**Avoid or don't rely on `display()` when:**
- Running scripts from the terminal or environments without HTML rendering
- Exporting plain text outputs to logs or command-line interfaces
- Needing cross-platform, consistent output in terminals (use `tabulate()` instead)

`display()` is ideal for notebooks but not for non-interactive use cases or where formatting must be preserved in plain text.



---

