Tabulators break formatting #113

Open
dremerb opened this issue May 13, 2021 · 6 comments

dremerb commented May 13, 2021

I am trying to print a diff between different commits in Git. The string contains a tabulator character that seems to interfere with the centering.
Example string from diff:

diff --git partdiff.c partdiff.c
index 05ced0c..804cd7d 100644
--- partdiff.c
+++ partdiff.c
@@ -377,6 +377,7 @@ main (int argc, char** argv)
        struct calculation_arguments arguments;
        struct calculation_results results;

+       printf("Some random text!");
        askParams(&options, argc, argv);

        initVariables(&arguments, &results, &options);

The same diff with the tab characters spelled out as "\t":

diff --git partdiff.c partdiff.c
index 05ced0c..804cd7d 100644
--- partdiff.c
+++ partdiff.c
@@ -377,6 +377,7 @@ main (int argc, char** argv)
 \tstruct calculation_arguments arguments;
 \tstruct calculation_results results;

+\tprintf("Some random text!");
 \taskParams(&options, argc, argv);

 \tinitVariables(&arguments, &results, &options);

Printing this in a PrettyTable breaks the layout, as the output below shows. If you replace each "\t" with any other character, it works fine.

|   4   | 2021-05-13 17:41:21.512372 | BUILD |    make   |       0       |       f0252dd       |         diff --git partdiff.c partdiff.c         |     |
|       |                            |       |           |               |                     |          index 05ced0c..804cd7d 100644           |     |
|       |                            |       |           |               |                     |                  --- partdiff.c                  |     |
|       |                            |       |           |               |                     |                  +++ partdiff.c                  |     |
|       |                            |       |           |               |                     | @@ -377,6 +377,7 @@ main (int argc, char** argv) |     |
|       |                            |       |           |               |                     |                                struct calculation_arguments arguments;                          |     |
|       |                            |       |           |               |                     |                                struct calculation_results results;                          |     |
|       |                            |       |           |               |                     |                                                  |     |
|       |                            |       |           |               |                     |                         +      printf("Some random text!");                          |     |
|       |                            |       |           |               |                     |                                askParams(&options, argc, argv);                          |     |
|       |                            |       |           |               |                     |                                                  |     |
|       |                            |       |           |               |                     |                                initVariables(&arguments, &results, &options);                          |     |
|       |                            |       |           |               |                     |                                                  |     |
+-------+----------------------------+-------+-----------+---------------+---------------------+--------------------------------------------------+-----+

Member

hugovk commented May 15, 2021

How are you actually using PrettyTable? Please provide some example Python code.

Author

dremerb commented May 15, 2021

from prettytable import PrettyTable

# page, num_elems and inputarr are defined elsewhere in the calling code
startindex = page * num_elems
showarr = inputarr[startindex:startindex + num_elems]
headers = ["Index", "Date", "Type", "Command", "Internal Hash", "Project Commit Hash", "Diff from last commit", "Tag"]
table = PrettyTable()
table.field_names = headers
# Workaround: strip "\t" from the diff column (index 6); without this the layout breaks
table.add_rows([(*x[:6], x[6].replace("\t", ""), x[7]) for x in showarr])
table.align = "l"
print(table)

Member

hugovk commented May 15, 2021

Thanks, please could you create a "Minimal, Reproducible Example"?

https://stackoverflow.com/help/minimal-reproducible-example

Author

dremerb commented May 15, 2021

from prettytable import PrettyTable

data = [[   "first",
            "second",
            """diff --git partdiff.c partdiff.c
index 05ced0c..804cd7d 100644
--- partdiff.c
+++ partdiff.c
@@ -377,6 +377,7 @@ main (int argc, char** argv)
 \tstruct calculation_arguments arguments;
 \tstruct calculation_results results;

+\tprintf("Some random text!");
 \taskParams(&options, argc, argv);

 \tinitVariables(&arguments, &results, &options);""",
            "fourth"],
        ]

headers = ["col1", "col2", "col3", "col4"]

table = PrettyTable()
table.field_names = headers
table.add_rows(data)
print(table)

produces

+-------+--------+--------------------------------------------------+--------+
|  col1 |  col2  |                       col3                       |  col4  |
+-------+--------+--------------------------------------------------+--------+
| first | second |         diff --git partdiff.c partdiff.c         | fourth |
|       |        |          index 05ced0c..804cd7d 100644           |        |
|       |        |                  --- partdiff.c                  |        |
|       |        |                  +++ partdiff.c                  |        |
|       |        | @@ -377,6 +377,7 @@ main (int argc, char** argv) |        |
|       |        |                              struct calculation_arguments arguments;                          |        |
|       |        |                              struct calculation_results results;                          |        |
|       |        |                                                  |        |
|       |        |                         +    printf("Some random text!");                          |        |
|       |        |                              askParams(&options, argc, argv);                          |        |
|       |        |                                                  |        |
|       |        |                              initVariables(&arguments, &results, &options);                          |        |
+-------+--------+--------------------------------------------------+--------+

Setting table.align = "l" positions the content correctly inside the column, but the column delimiters are still broken in exactly the same way.

Member

hugovk commented May 16, 2021

Thanks for that!

So it looks like this is Python's print outputting the \t as a tab character.

For example:

>>> print("a\tb")
a	b
>>> print(" \tstruct calculation_arguments arguments;")
 	struct calculation_arguments arguments;
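To make the mismatch concrete, here is a minimal sketch of how a tab is one character to Python but usually several columns on screen. It assumes a terminal with tab stops every 8 columns, which is a common default but not guaranteed, and it says nothing about how PrettyTable measures cell width internally.

```python
# One of the problematic lines from the diff, with a real tab character.
line = " \tstruct calculation_arguments arguments;"

# Python counts the tab as a single character.
char_count = len(line)

# Assumption: the terminal uses tab stops every 8 columns.
# str.expandtabs pads each tab with spaces up to the next tab stop.
expanded = line.expandtabs(8)
display_columns = len(expanded)

print(char_count, display_columns)

# Smaller case: three characters, but the "b" lands on column 9.
print(len("a\tb"), len("a\tb".expandtabs(8)))  # 3 9
```

Any padding computed from the character count will therefore be too short or too long once the terminal expands the tab.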

If I preprocess the data and replace \t with \\t, to escape the backslash:

from prettytable import PrettyTable

data = [[   "first",
            "second",
            """diff --git partdiff.c partdiff.c
index 05ced0c..804cd7d 100644
--- partdiff.c
+++ partdiff.c
@@ -377,6 +377,7 @@ main (int argc, char** argv)
 \\tstruct calculation_arguments arguments;
 \\tstruct calculation_results results;

+\\tprintf("Some random text!");
 \\taskParams(&options, argc, argv);

 \\tinitVariables(&arguments, &results, &options);""",
            "fourth"],
        ]

headers = ["col1", "col2", "col3", "col4"]

table = PrettyTable()
table.field_names = headers
table.add_rows(data)
print(table)

It prints nicely as:

+-------+--------+---------------------------------------------------+--------+
|  col1 |  col2  |                        col3                       |  col4  |
+-------+--------+---------------------------------------------------+--------+
| first | second |          diff --git partdiff.c partdiff.c         | fourth |
|       |        |           index 05ced0c..804cd7d 100644           |        |
|       |        |                   --- partdiff.c                  |        |
|       |        |                   +++ partdiff.c                  |        |
|       |        |  @@ -377,6 +377,7 @@ main (int argc, char** argv) |        |
|       |        |      \tstruct calculation_arguments arguments;    |        |
|       |        |        \tstruct calculation_results results;      |        |
|       |        |                                                   |        |
|       |        |          +\tprintf("Some random text!");          |        |
|       |        |         \taskParams(&options, argc, argv);        |        |
|       |        |                                                   |        |
|       |        |  \tinitVariables(&arguments, &results, &options); |        |
+-------+--------+---------------------------------------------------+--------+

But I guess it doesn't make much sense to show \t in the output.


Another idea is to preprocess each \t into four spaces, or two spaces, whichever you prefer:

from prettytable import PrettyTable

data = [[   "first",
            "second",
            """diff --git partdiff.c partdiff.c
index 05ced0c..804cd7d 100644
--- partdiff.c
+++ partdiff.c
@@ -377,6 +377,7 @@ main (int argc, char** argv)
     struct calculation_arguments arguments;
     struct calculation_results results;

+    printf("Some random text!");
     askParams(&options, argc, argv);

     initVariables(&arguments, &results, &options);""",
            "fourth"],
        ]

headers = ["col1", "col2", "col3", "col4"]

table = PrettyTable()
table.field_names = headers
table.add_rows(data)
print(table)

Produces:

+-------+--------+-----------------------------------------------------+--------+
|  col1 |  col2  |                         col3                        |  col4  |
+-------+--------+-----------------------------------------------------+--------+
| first | second |           diff --git partdiff.c partdiff.c          | fourth |
|       |        |            index 05ced0c..804cd7d 100644            |        |
|       |        |                    --- partdiff.c                   |        |
|       |        |                    +++ partdiff.c                   |        |
|       |        |   @@ -377,6 +377,7 @@ main (int argc, char** argv)  |        |
|       |        |          struct calculation_arguments arguments;    |        |
|       |        |            struct calculation_results results;      |        |
|       |        |                                                     |        |
|       |        |          +    printf("Some random text!");          |        |
|       |        |             askParams(&options, argc, argv);        |        |
|       |        |                                                     |        |
|       |        |      initVariables(&arguments, &results, &options); |        |
+-------+--------+-----------------------------------------------------+--------+

How does that sound? Can you preprocess \t into 2 or 4 (or whatever) spaces?
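The preprocessing described above could be done in code rather than by editing the string by hand. This is only a sketch: the name replace_tabs and the choice of four spaces are assumptions for illustration, and the sample row is shortened from the one in this thread.

```python
# Sample row modeled on the thread: the third cell holds diff lines with real tabs.
data = [
    [
        "first",
        "second",
        "index 05ced0c..804cd7d 100644\n"
        " \tstruct calculation_results results;\n"
        '+\tprintf("Some random text!");',
        "fourth",
    ],
]

TAB_REPLACEMENT = " " * 4  # assumption: show one tab as four spaces


def replace_tabs(cell):
    """Return the cell with tabs replaced; non-string cells pass through unchanged."""
    if isinstance(cell, str):
        return cell.replace("\t", TAB_REPLACEMENT)
    return cell


# Sanitize every cell before the rows are handed to the table.
clean_data = [[replace_tabs(cell) for cell in row] for row in data]
```

The cleaned rows can then be passed to table.add_rows(clean_data) exactly as in the examples above.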

I'm not sure whether this should be handled by PrettyTable itself, since it would have to assume whether \t means 2 or 4 spaces. I'm also not sure whether the centering algorithm could be adapted to take a tab into account, or how it would do that.

Author

dremerb commented May 16, 2021

Printing \t as a literal character does not make sense in my case, since the tab is only there for formatting and carries no information.
IMO PrettyTable could handle the replacement itself. That makes for a better user experience: just throw in some data and it works. I found the workaround pretty quickly, but a Python beginner might not. For me \t is 4 spaces, but you could introduce an optional parameter to set it to whatever value.
You could also argue that users have to sanitize their own data. In that case, maybe document the behavior somewhere so that the fix is easy to find?
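As a rough illustration of the optional-parameter idea, the replacement could live in a small user-side helper until something like it exists in the library. The function add_rows_without_tabs and its tab_width parameter are hypothetical and not part of PrettyTable; the helper only relies on the table object having an add_row method, which PrettyTable provides.

```python
def add_rows_without_tabs(table, rows, tab_width=4):
    """Add rows to a table, replacing each tab in string cells with spaces.

    Hypothetical helper, not a PrettyTable API. `table` can be any object
    with an add_row(list) method, such as a PrettyTable instance.
    """
    spaces = " " * tab_width
    for row in rows:
        table.add_row(
            [
                cell.replace("\t", spaces) if isinstance(cell, str) else cell
                for cell in row
            ]
        )
```

With the example from this thread, the call would be add_rows_without_tabs(table, data) in place of table.add_rows(data), or add_rows_without_tabs(table, data, tab_width=2) for two spaces per tab.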
