Unicode text table output from pandas dataframe: turns string objects into numbers and changes their representation #44

mjb-v9-5-2 · 2021-06-22T11:38:56Z

This brilliant tool breaks when rendering a pandas table as text.

The data contains very long numbers stored as strings: long decimals, and long ints with thousands separators. In the pandas DataFrame they are stored as objects. When I output the table using a pytablewriter Unicode writer, the long ints are rendered faithfully, but the strings representing long decimals appear to be processed as numbers. They are shown as though they had been converted from strings into floats, with all the problems of float string representation: unwanted zeros after the last significant decimal digit on short decimals, and precision too short to show the whole number on long decimals.

For example:

"0.000000000000001" is represented by pytablewriter as "0.000000"
"0.001" as "0.001000"

Yet, with long numbers:

"1,000,000,000,000" is represented faithfully as "1,000,000,000,000".
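The two decimal examples above match what plain Python produces when a string is parsed as a float and formatted with the default fixed-point precision of six decimal places. The following is only a sketch of that symptom under that assumption; it is not pytablewriter's actual code path.

```python
# Illustration only: reproduce the reported output with plain Python,
# assuming the string is parsed as a float and then formatted with the
# default fixed-point precision (six decimal places).
values = ["0.000000000000001", "0.001"]

rendered = [format(float(v), "f") for v in values]

for original, shown in zip(values, rendered):
    print(f"{original!r} -> {shown!r}")
```

The first value loses all of its significant digits and the second gains three trailing zeros, which is the same pattern as in the report.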

The problem seems to be a general issue with decimals. Although the values are held as strings precisely so that they are rendered as strings and not as numbers, pytablewriter justifies the strings that represent integers differently from those that represent decimals: the former are left-justified, the latter right-justified. So it seems to treat the strings that contain decimals as numbers, convert them to float, and then output them as numbers.

It also right-justifies the string "1", along with the decimals.
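One possible explanation for the different treatment is that only some of these strings are accepted by a numeric parser. The sketch below uses Python's built-in `float()` as a stand-in; `looks_like_float` is a hypothetical helper, and pytablewriter's own type inference may follow different rules.

```python
def looks_like_float(text: str) -> bool:
    """Hypothetical helper: True if Python's float() accepts the string."""
    try:
        float(text)
    except ValueError:
        return False
    return True


samples = ["0.001", "1", "1,000,000,000,000"]

# The string with thousands separators is rejected by float(), so under
# this assumption it would stay a string, while "0.001" and "1" would be
# treated as numbers.
parsed = {s: looks_like_float(s) for s in samples}
print(parsed)
```

Under this assumption, "1" and the decimals fall into the numeric group and the comma-separated value does not, which is consistent with the justification described above.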

thombashi · 2021-07-18T03:34:05Z

@mjb-v9-5-2
Thank you for your feedback.

The problems that you described are fixed for certain values in pytablewriter 0.62.0:

import pandas as pd
import pytablewriter as ptw

writer = ptw.UnicodeTableWriter(
    dataframe=pd.DataFrame(
        {"realnumber": ["0.000000000000001", "0.000000000000002"], "long": ["1,000,000,000,000", "1"]}
    ),
    margin=1,
    column_styles=[
        ptw.style.Style(thousand_separator=","),
        ptw.style.Style(thousand_separator=","),
    ]
)
writer.write_table()

┌───────────────────┬───────────────────┐
│    realnumber     │       long        │
├───────────────────┼───────────────────┤
│ 0.000000000000001 │ 1,000,000,000,000 │
├───────────────────┼───────────────────┤
│ 0.000000000000002 │                 1 │
└───────────────────┴───────────────────┘

However, in the case of mixed decimal place values, the problem still exists as before:

import pandas as pd
import pytablewriter as ptw

writer = ptw.UnicodeTableWriter(
    dataframe=pd.DataFrame(
        {"realnumber": ["0.000000000000001", "0.1"], "long": ["1,000,000,000,000", "1"]}
    ),
    margin=1,
    column_styles=[
        ptw.style.Style(thousand_separator=","),
        ptw.style.Style(thousand_separator=","),
    ]
)
writer.write_table()

┌─────────────┬───────────────────┐
│ realnumber  │       long        │
├─────────────┼───────────────────┤
│ 0.000000000 │ 1,000,000,000,000 │
├─────────────┼───────────────────┤
│ 0.100000000 │                 1 │
└─────────────┴───────────────────┘

I will also fix this in a future version.
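The mixed-precision output above looks like a single precision of nine decimal places applied to the whole column. The sketch below reproduces that with plain Python and contrasts it with `decimal.Decimal`, which keeps every digit of the original string. The nine-digit precision is read off the table above; how pytablewriter chooses it internally is an assumption here.

```python
from decimal import Decimal

column = ["0.000000000000001", "0.1"]

# Assumed behaviour: one precision (nine decimal places, as seen in the
# table above) is applied to every value in the column.
as_floats = [format(float(v), ".9f") for v in column]

# Decimal preserves the digits of the source string; the "f" format
# avoids scientific notation for very small values.
as_decimals = [format(Decimal(v), "f") for v in column]

print(as_floats)
print(as_decimals)
```

With a shared precision the small value collapses to zeros and the short value is padded, whereas the `Decimal` rendering returns each string unchanged.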

thombashi · 2021-09-20T15:55:07Z

The problem is fixed in pytablewriter 0.63.0.

thombashi added the enhancement label Jul 4, 2021

thombashi added a commit that referenced this issue Jul 17, 2021

Improve output precision for numbers: #44

eab4370

thombashi added a commit that referenced this issue Jul 18, 2021

Improve output precision of real numbers: #44

75f9811

thombashi closed this as completed Sep 20, 2021

araffin mentioned this issue May 31, 2022

Round reward mean and std in benchmark markdown file DLR-RM/rl-baselines3-zoo#243

Closed

13 tasks
