Add pl.DataFrame.to_markdown #13907

Open
mkleinbort-ic opened this issue Jan 22, 2024 · 6 comments
Labels
A-formatting (Area: display/repr of Polars objects)
enhancement (New feature or an improvement of an existing feature)

Comments

@mkleinbort-ic

Description

It'd be good to have a to_markdown method at the dataframe level.

The string repr is almost right; it just needs some tweaking to be valid markdown.

A very bad implementation in Python would be:

import polars as pl

def to_markdown(df: pl.DataFrame) -> str:
    '''Returns the repr of the dataframe as valid markdown'''

    # Swap the box-drawing characters of the default repr for markdown ones
    lines = (
        df
        .pipe(str)
        .replace('┆', '|')
        .replace('╪', '|')
        .replace('╡', '|')
        .replace('╞', '|')
        .replace(' --- ', '     ')
        .replace('═', '-')
        .replace('┌', '|')
        .replace('┐', '|')
        .replace('└', '|')
        .replace('┘', '|')
        .replace('─', '')
        .replace('┬', '')
        .replace('┴', '')
        .replace('||', '')
        .split('\n')
    )

    # column names + values (skips shape & data type)
    lines_to_show = [lines[2]] + lines[5:]

    ans = '\n'.join([line.strip('│ ').strip('|') for line in lines_to_show])

    return ans

@mkleinbort-ic added the enhancement (New feature or an improvement of an existing feature) label on Jan 22, 2024
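
A side note on the repr-rewriting snippet above: a cell value that itself contains a `|` would pass through unescaped and split the column. A rough sketch of the alternative, building the table from the data instead of the repr, could look like the following. `rows_to_markdown` is a hypothetical helper, not part of Polars; it works on plain column names and row tuples, which with Polars could presumably be taken from `df.columns` and `df.rows()`.

```python
def rows_to_markdown(columns, rows):
    """Build a Markdown table from column names and row tuples (illustrative helper)."""

    def cell(value) -> str:
        # A literal "|" would end the cell early, so escape it;
        # a newline would end the row, so flatten it to a space.
        return str(value).replace("|", "\\|").replace("\n", " ")

    header = "| " + " | ".join(cell(c) for c in columns) + " |"
    delimiter = "| " + " | ".join("---" for _ in columns) + " |"
    body = ["| " + " | ".join(cell(v) for v in row) + " |" for row in rows]
    return "\n".join([header, delimiter, *body])


# Assumption: with Polars, the inputs could come from `df.columns` and `df.rows()`.
print(rows_to_markdown(["name", "note"], [("a", "x|y"), ("b", "z")]))
```

This skips column padding entirely, which Markdown renderers do not need, so the raw text is less readable than the repr-based output.
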
@baggiponte
Contributor

I also received a feature request to cast to HTML; if one is accepted, then the other would be pretty much straightforward.

@deanm0000
Collaborator

Doesn't _repr_html_ already cast it to HTML, or am I misunderstanding something?

@cjackal
Contributor

cjackal commented Jan 22, 2024

Another (less-effort) implementation:

import polars as pl
import io

def to_markdown(df: pl.DataFrame) -> str:
    buf = io.StringIO()
    with pl.Config(
        tbl_formatting="ASCII_MARKDOWN",
        tbl_hide_column_data_types=True,
        tbl_hide_dataframe_shape=True,
    ):
        print(df, file=buf)
    buf.seek(0)
    return buf.read()

One may need more config options like tbl_rows=-1 though.

@MarcoGorelli
Collaborator

pandas does this by deferring to tabulate. There's a feature request there about adding support for Polars: astanin/python-tabulate#258
I guess contributing Polars support to tabulate would be the easiest way to make it happen.

@cjackal
Contributor

cjackal commented Jan 22, 2024

@MarcoGorelli I came to the opposite conclusion when trying to add pyarrow table support to it a while ago: the way tabulate supports pandas is a bit too rough (checking hasattr(df, "value"), hasattr(df, "index"), hasattr(df.index, "name"), ...) for other third-party dataframe libraries to support it without first converting to pandas.

@alexander-beedie
Collaborator

alexander-beedie commented Jan 23, 2024

Another (less-effort) implementation:

Nice one; I think it's relatively unknown, but the Config object can also function as a decorator [1], which works well with that pattern, e.g.:

@pl.Config(
    tbl_formatting="ASCII_MARKDOWN",
    tbl_hide_column_data_types=True,
    tbl_hide_dataframe_shape=True,
)
def frame_to_markdown(df: pl.DataFrame) -> str:
    return str(df)

If we could add left/right/center align options to the markdown output we'd have the basics covered at least 🤔
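
For context on what such alignment options would involve: in GitHub-flavored Markdown, column alignment is set by colons in the delimiter row (`:---` left, `---:` right, `:---:` center). A minimal sketch of post-processing an existing table string that way is below. `apply_alignment` is a hypothetical helper, not a Polars API, and it assumes the second line of the input is a delimiter row of the form `|-----|-----|`.

```python
def apply_alignment(markdown_table: str, alignments: list[str]) -> str:
    """Rewrite the delimiter row of a Markdown table with alignment markers.

    `alignments` holds one of "left", "right" or "center" per column.
    """
    lines = markdown_table.strip("\n").split("\n")
    # Assumption: line 0 is the header and line 1 is the delimiter row.
    segments = lines[1].strip().strip("|").split("|")
    if len(segments) != len(alignments):
        raise ValueError("expected one alignment per column")

    cells = []
    for segment, alignment in zip(segments, alignments):
        width = max(len(segment), 3)  # keep at least one dash next to the colons
        if alignment == "left":
            cells.append(":" + "-" * (width - 1))
        elif alignment == "right":
            cells.append("-" * (width - 1) + ":")
        elif alignment == "center":
            cells.append(":" + "-" * (width - 2) + ":")
        else:
            raise ValueError(f"unknown alignment: {alignment!r}")

    lines[1] = "|" + "|".join(cells) + "|"
    return "\n".join(lines)


table = "| name  | qty |\n|-------|-----|\n| apple | 3   |"
print(apply_alignment(table, ["left", "right"]))
```

This only changes how a renderer aligns the columns; the padding of the cell text in the raw string is left as it was.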

Footnotes

  1. https://github.com/pola-rs/polars/pull/9307

@stinodego added the A-formatting (Area: display/repr of Polars objects) label on Feb 19, 2024