# Pandas presentation tips I wish I knew earlier
https://towardsdatascience.com/pandas-presentation-tips-i-wish-i-knew-earlier-8e767365d190

## To Step Up Your Pandas Game, read:  

- [5 lesser-known pandas tricks](https://towardsdatascience.com/5-lesser-known-pandas-tricks-e8ab1dd21431)  
- [Exploratory Data Analysis with pandas](https://towardsdatascience.com/exploratory-data-analysis-with-pandas-508a5e8a5964)  
- [How NOT to write pandas code](https://gum.co/vxxiV) (paid)  
- [5 Gotchas With Pandas](https://towardsdatascience.com/5-gotchas-with-pandas-974df6595e61)  
- [Pandas tips that will save you hours of head-scratching](https://towardsdatascience.com/pandas-tips-that-will-save-you-hours-of-head-scratching-31d8572218c9)  
- [Display Customizations for pandas Power Users](https://towardsdatascience.com/become-a-pandas-power-user-with-these-display-customizations-6d3a5a5885c1)  
- [5 New Features in pandas 1.0 You Should Know About](https://towardsdatascience.com/5-new-features-in-pandas-1-0-you-should-know-about-fc31c83e396b)  
- [pandas analytics server](https://towardsdatascience.com/pandas-analytics-server-d9abceec888b)  
- [Pandas analysis of coronavirus pandemic](https://medium.com/datadriveninvestor/pandas-analysis-of-coronavirus-pandemic-2b0d784e0806)  
- [Sports analysis with Pandas](https://towardsdatascience.com/sports-analysis-with-pandas-real-vs-barca-94f85819bf6)  
- [3 hidden mistakes with pandas](https://towardsdatascience.com/3-hidden-mistakes-with-pandas-712792dfb91a)  
- [Pandas Pivot — The Ultimate Guide](https://towardsdatascience.com/pandas-pivot-the-ultimate-guide-5c693e0771f3)  


## Setup

In [2]:
import os
import platform
import random
from platform import python_version

import jupyterlab
import numpy as np
import pandas as pd
import lxml

print("System")
print("os name: %s" % os.name)
print("system: %s" % platform.system())
print("release: %s" % platform.release())
print()
print("Python")
print("version: %s" % python_version())
print()
print("Python Packages")
print("jupyterlab==%s" % jupyterlab.__version__)
print("pandas==%s" % pd.__version__)
print("numpy==%s" % np.__version__)
print("lxml==%s" % lxml.__version__)

System
os name: posix
system: Darwin
release: 19.4.0

Python
version: 3.7.7

Python Packages
jupyterlab==1.2.6
pandas==1.0.3
numpy==1.18.1
lxml==4.5.0


In [6]:
# Added by Rob: make sure the output folder exists before writing to it
output_folder = './output'
os.makedirs(output_folder, exist_ok=True)
output_folder = output_folder + '/'

notebook_outbook_prefix = output_folder + "PandasPresentation_"

### Create dummy dataframe  

In [3]:
n = 10
df = pd.DataFrame(
    {
        "col1": np.random.random_sample(n),
        "col2": np.random.random_sample(n),
        "col3": [[random.randint(0, 10) for _ in range(random.randint(3, 5))] for _ in range(n)],
    }
)
df.shape

(10, 3)
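
The frame above is built from unseeded random generators, so the values change on every run. If the exported files need to be reproducible, one option is to seed both generators. This is a minimal sketch: the helper name `make_dummy_df` and the seed value `42` are assumptions for illustration, not part of the original notebook.

```python
import random

import numpy as np
import pandas as pd


def make_dummy_df(n=10, seed=42):
    """Hypothetical helper: same column layout as the dummy frame, but seeded."""
    rng = np.random.default_rng(seed)  # NumPy generator for the float columns
    py_rng = random.Random(seed)       # stdlib generator for the list column
    return pd.DataFrame(
        {
            "col1": rng.random(n),
            "col2": rng.random(n),
            "col3": [
                [py_rng.randint(0, 10) for _ in range(py_rng.randint(3, 5))]
                for _ in range(n)
            ],
        }
    )


df_a = make_dummy_df()
df_b = make_dummy_df()

# Two calls with the same seed should produce identical contents.
same = df_a.to_dict("list") == df_b.to_dict("list")
print(df_a.shape, same)
```

Using separate generator objects rather than the global `random` and `np.random` state keeps the seeding local to the helper.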

In [8]:
df.head()

       col1      col2              col3
0  0.840138  0.292938  [10, 4, 5, 7, 6]
1  0.085789  0.832004        [2, 10, 9]
2  0.910500  0.270349   [0, 4, 4, 2, 5]
3  0.689022  0.308323   [7, 1, 9, 9, 3]
4  0.958880  0.523427         [1, 6, 0]


### Saving to (and reading back from) HTML  

In [9]:
df_html = df.to_html()

In [10]:
output_file_name = notebook_outbook_prefix + "analysis.html"
with open(output_file_name, 'w') as f:
    f.write(df_html)

In [12]:
df_list = pd.read_html(output_file_name)
df_list

[   Unnamed: 0      col1      col2              col3
 0           0  0.840138  0.292938  [10, 4, 5, 7, 6]
 1           1  0.085789  0.832004        [2, 10, 9]
 2           2  0.910500  0.270349   [0, 4, 4, 2, 5]
 3           3  0.689022  0.308323   [7, 1, 9, 9, 3]
 4           4  0.958880  0.523427         [1, 6, 0]
 5           5  0.038681  0.793978      [1, 8, 0, 5]
 6           6  0.357967  0.954240      [7, 5, 0, 6]
 7           7  0.535167  0.029869     [10, 9, 4, 1]
 8           8  0.706353  0.673613         [5, 8, 1]
 9           9  0.927402  0.693116      [1, 5, 7, 4]]
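
The frame read back above has an extra `Unnamed: 0` column: by default `to_html` writes the index as a header-less first column, and `read_html` then treats it as data. A minimal sketch of one way to avoid this, using a small stand-in frame (its values are assumptions for illustration), is to drop the index when exporting:

```python
import pandas as pd

# Small stand-in frame; the values are illustrative only.
df = pd.DataFrame({"col1": [0.84, 0.09], "col2": [0.29, 0.83]})

html_with_index = df.to_html()                # default: index labels are written as <th> cells
html_without_index = df.to_html(index=False)  # index labels are left out of the table

# The row label "0" appears as a header cell only when the index is exported.
print("<th>0</th>" in html_with_index)
print("<th>0</th>" in html_without_index)
```

If the index carries meaning and should be kept in the file, passing `index_col=0` to `pd.read_html` when reading back is an alternative worth trying.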

### LaTeX  

In [15]:
df.to_latex()

'\\begin{tabular}{lrrl}\n\\toprule\n{} &      col1 &      col2 &              col3 \\\\\n\\midrule\n0 &  0.840138 &  0.292938 &  [10, 4, 5, 7, 6] \\\\\n1 &  0.085789 &  0.832004 &        [2, 10, 9] \\\\\n2 &  0.910500 &  0.270349 &   [0, 4, 4, 2, 5] \\\\\n3 &  0.689022 &  0.308323 &   [7, 1, 9, 9, 3] \\\\\n4 &  0.958880 &  0.523427 &         [1, 6, 0] \\\\\n5 &  0.038681 &  0.793978 &      [1, 8, 0, 5] \\\\\n6 &  0.357967 &  0.954240 &      [7, 5, 0, 6] \\\\\n7 &  0.535167 &  0.029869 &     [10, 9, 4, 1] \\\\\n8 &  0.706353 &  0.673613 &         [5, 8, 1] \\\\\n9 &  0.927402 &  0.693116 &      [1, 5, 7, 4] \\\\\n\\bottomrule\n\\end{tabular}\n'

The returned string contains escaped newlines and backslashes. Pass it to print to get readable output that can be pasted into a LaTeX document.

In [16]:
print(df.to_latex())

\begin{tabular}{lrrl}
\toprule
{} &      col1 &      col2 &              col3 \\
\midrule
0 &  0.840138 &  0.292938 &  [10, 4, 5, 7, 6] \\
1 &  0.085789 &  0.832004 &        [2, 10, 9] \\
2 &  0.910500 &  0.270349 &   [0, 4, 4, 2, 5] \\
3 &  0.689022 &  0.308323 &   [7, 1, 9, 9, 3] \\
4 &  0.958880 &  0.523427 &         [1, 6, 0] \\
5 &  0.038681 &  0.793978 &      [1, 8, 0, 5] \\
6 &  0.357967 &  0.954240 &      [7, 5, 0, 6] \\
7 &  0.535167 &  0.029869 &     [10, 9, 4, 1] \\
8 &  0.706353 &  0.673613 &         [5, 8, 1] \\
9 &  0.927402 &  0.693116 &      [1, 5, 7, 4] \\
\bottomrule
\end{tabular}



### Markdown  

In [17]:
print(df.to_markdown())

|    |      col1 |      col2 | col3             |
|---:|----------:|----------:|:-----------------|
|  0 | 0.840138  | 0.292938  | [10, 4, 5, 7, 6] |
|  1 | 0.0857885 | 0.832004  | [2, 10, 9]       |
|  2 | 0.9105    | 0.270349  | [0, 4, 4, 2, 5]  |
|  3 | 0.689022  | 0.308323  | [7, 1, 9, 9, 3]  |
|  4 | 0.95888   | 0.523427  | [1, 6, 0]        |
|  5 | 0.0386814 | 0.793978  | [1, 8, 0, 5]     |
|  6 | 0.357967  | 0.95424   | [7, 5, 0, 6]     |
|  7 | 0.535167  | 0.0298689 | [10, 9, 4, 1]    |
|  8 | 0.706353  | 0.673613  | [5, 8, 1]        |
|  9 | 0.927402  | 0.693116  | [1, 5, 7, 4]     |


### Excel  

In [23]:
output_file_name = notebook_outbook_prefix + "analysis.xlsx"
df.to_excel(output_file_name, index=False)

In [24]:
df_xlsx = pd.read_excel(output_file_name)
df_xlsx

       col1      col2              col3
0  0.840138  0.292938  [10, 4, 5, 7, 6]
1  0.085789  0.832004        [2, 10, 9]
2  0.910500  0.270349   [0, 4, 4, 2, 5]
3  0.689022  0.308323   [7, 1, 9, 9, 3]
4  0.958880  0.523427         [1, 6, 0]
5  0.038681  0.793978      [1, 8, 0, 5]
6  0.357967  0.954240      [7, 5, 0, 6]
7  0.535167  0.029869     [10, 9, 4, 1]
8  0.706353  0.673613         [5, 8, 1]
9  0.927402  0.693116      [1, 5, 7, 4]
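
One thing to watch after a round trip like this: a cell holding a Python list is most likely stored as text, so `col3` probably comes back as strings such as `"[2, 10, 9]"` rather than lists. A minimal sketch of restoring them with `ast.literal_eval` follows. The series is built by hand here (an assumption, so the example runs without an Excel engine installed):

```python
import ast

import pandas as pd

# Stand-in for the col3 column as it would look after being read back as text.
raw = pd.Series(["[10, 4, 5, 7, 6]", "[2, 10, 9]", "[1, 6, 0]"], name="col3")

# literal_eval safely parses Python literals (lists, numbers, strings) without running code.
parsed = raw.apply(ast.literal_eval)

print(type(raw.iloc[0]).__name__, type(parsed.iloc[0]).__name__)
print(parsed.iloc[1])
```

`ast.literal_eval` is preferred over `eval` here because it only accepts literal syntax.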


### Text  

In [25]:
print(df.to_string())

       col1      col2              col3
0  0.840138  0.292938  [10, 4, 5, 7, 6]
1  0.085789  0.832004        [2, 10, 9]
2  0.910500  0.270349   [0, 4, 4, 2, 5]
3  0.689022  0.308323   [7, 1, 9, 9, 3]
4  0.958880  0.523427         [1, 6, 0]
5  0.038681  0.793978      [1, 8, 0, 5]
6  0.357967  0.954240      [7, 5, 0, 6]
7  0.535167  0.029869     [10, 9, 4, 1]
8  0.706353  0.673613         [5, 8, 1]
9  0.927402  0.693116      [1, 5, 7, 4]
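
For a presentation, the plain-text output can be tidied further. The sketch below, on a small stand-in frame (values assumed for illustration), hides the index and rounds the floats to two decimals using the `index` and `float_format` parameters of `to_string`:

```python
import pandas as pd

# Small stand-in frame; the values are illustrative only.
df = pd.DataFrame({"col1": [0.840138, 0.085789], "col2": [0.292938, 0.832004]})

# index=False drops the row labels; float_format takes a callable applied to each float.
text = df.to_string(index=False, float_format="{:.2f}".format)
print(text)
```

The same `float_format` idea also applies to `to_html` and `to_latex`, though the accepted argument types can differ between pandas versions.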
