When getting the repr of a pandas dataframe in the repl, don't do any customization to avoid getting a big output. #1078

Closed
zljubisic opened this issue Oct 6, 2022 · 37 comments

Comments

@zljubisic

zljubisic commented Oct 6, 2022

Type: Bug

Please look at the picture:

image

If I am debugging a pytest test, when I reach the breakpoint, debug console behaves as I have described in the picture above.

I have a few questions:

  1. Why are not all columns shown (why the "...") when there is enough width in my terminal?
  2. Why is the 3rd row swallowed after setting max_columns = None?
  3. Why is the 3rd row swallowed after setting expand_frame_repr = False?

Extension version: 2022.14.0
VS Code version: Code 1.71.2 (74b1f979648cc44d385a2286793c226e611f59e7, 2022-09-14T21:03:37.738Z)
OS version: Windows_NT x64 10.0.19042
Modes:
Sandboxed: No
Remote OS version: Linux x64 3.10.0-1127.el7.x86_64

In case you need it:

df = pd.DataFrame(
    [
        ['A', 'short text', 'very very very very very very very very very very long text', 1],
        ['A', '1 very 2 very 3 very 4 very 5 very 6 very 7 very 8 very 9 very 10 very text', 'short text', 2],
        ['A', 'not so long text', 'also short text', 3],
    ], columns=['first', 'description', 'middle', 'last']
)
df
  first  ... last
0     A  ...    1
1     A  ...    2
2     A  ...    3

[3 rows x 4 columns]
len(df)
3
pd.options.display.max_colwidth = None
df
  first  ... last
0     A  ...    1
1     A  ...    2
2     A  ...    3

[3 rows x 4 columns]
pd.options.display.max_columns = None
df
  first                                        description  \
0     A                                         short text   
1     A  1 very 2 very 3 very 4 very 5 very 6 very 7 ve...   
2     A                                   not so long text   

                                              middle  last  
0  very very very very very very very very very v...     1  
1                                         short text     2  
2                                    also short text     3  
pd.options.display.expand_frame_repr = False
df
  first                                        description                                             middle  last
0     A                                         short text  very very very very very very very very very v...     1
1     A  1 very 2 very 3 very 4 very 5 very 6 very 7 ve...                                         short text     2
2     A                                   not so long text                                    also short text     3
print(df)
df.columns
Index(['first', 'description', 'middle', 'last'], dtype='object')
df.index
RangeIndex(start=0, stop=3, step=1)

@karthiknadig karthiknadig transferred this issue from microsoft/vscode-python Oct 6, 2022
@fabioz
Collaborator

fabioz commented Oct 6, 2022

Regarding the print not showing, this is because pytest will grab the output itself (the issue being that you need to add --capture=no when running so that it doesn't collect the output -- see issue: microsoft/vscode-python#19322).

Regarding pandas customization, the debugger deliberately sets lower limits (you can use print to get output that follows your pandas configuration directly, provided you set up pytest to stop capturing the output).

To change the limits used in the debugger, set the following environment variables (to the values you find appropriate for your use case):

PYDEVD_PANDAS_MAX_ROWS=300
PYDEVD_PANDAS_MAX_COLS=300
PYDEVD_PANDAS_MAX_COLWIDTH=80

Note that raising those values could make interacting with pandas slower when you're stepping in general in the debugger (I'm not sure why, but getting the representation of a pandas data frame is very slow in pandas and the debugger will ask for that representation whenever a pandas data frame is found).
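
As a rough illustration of how limits like these could be read from the environment, here is a minimal sketch. It is not the debugger's actual code: the helper name read_pandas_limits and the fall-back-on-bad-value behavior are assumptions made for illustration. Only the three variable names and the values 300 / 300 / 80 come from this thread.

```python
import os

# Variable names and defaults as quoted in this thread.
DEFAULT_LIMITS = {
    "PYDEVD_PANDAS_MAX_ROWS": 300,
    "PYDEVD_PANDAS_MAX_COLS": 300,
    "PYDEVD_PANDAS_MAX_COLWIDTH": 80,
}


def read_pandas_limits(environ=None):
    """Hypothetical helper: return the limits, taking overrides from environ."""
    environ = os.environ if environ is None else environ
    limits = {}
    for name, default in DEFAULT_LIMITS.items():
        raw = environ.get(name)
        try:
            limits[name] = default if raw is None else int(raw)
        except ValueError:
            # Assumption: an unparsable value falls back to the default.
            limits[name] = default
    return limits


print(read_pandas_limits({"PYDEVD_PANDAS_MAX_COLWIDTH": "1000"}))
```

Running the same helper with no argument inside the debugged process shows whether the variables actually reached that process.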

As microsoft/vscode#162965 is tracking the remaining issue (where the contents of the debug console messages seem clipped in your use case), I'm closing this one.

@fabioz fabioz closed this as completed Oct 6, 2022
@karthiknadig
Member

We have a PR on python extension to set --capture=no by default. microsoft/vscode-python#19903

@zljubisic
Author

You closed the issue, but the problem remains even if settings.json is like this (please notice "-s", which is equivalent to "--capture=no"):

{
    "python.testing.unittestEnabled": false,
    "python.testing.pytestEnabled": true,
    "python.testing.pytestArgs": [
        "--no-cov", "-s"
    ]
}

@zljubisic
Author

Regarding pandas customization, we really set up lower limits in the debugger (you could use the print to use the pandas configuration directly if you setup pytest to stop capturing the output).

First, when you say "setup pytest to stop capturing the output" am I doing it correctly with settings.json above or there is another procedure?

To setup the limits used in the debugger you need to set the following environment variables (to the values you find appropriate in your use case):

PYDEVD_PANDAS_MAX_ROWS=300
PYDEVD_PANDAS_MAX_COLS=300
PYDEVD_PANDAS_MAX_COLWIDTH=80

How can I set pd.options.display.expand_frame_repr = False in env. vars case?
How to set env vars to None? unset?

Note that raising those values could make interacting with pandas slower when you're stepping in general in the debugger (I'm not sure why, but getting the representation of a pandas data frame is very slow in pandas and the debugger will ask for that representation whenever a pandas data frame is found).

I really don't get it, why you have so many problems with pandas dataframes.
Under the hood, they are numpy arrays which is very fast type of storage.

Actually, vscode way of debugging code with pandas dataframes is the worst way I have ever experienced.

For example, we have a lot of pydantic models that have several dataframes as attributes, something like this:

class MyObj(BaseModel):
    df1: pd.DataFrame
    df2: pd.DataFrame

To be able to see the dataframes in the debugger, I have to add a __repr__() method to each model in which I print each dataframe's len() (because of the swallowed-last-line problem) and print the dataframes in their entirety. I have to print a few "\n" at the end to raise the chances of all lines being printed.
So if I create an object x = MyObj(), when I am in debug console I can just type "x " and now chances are that I will get the output I need, but now I am getting all the dataframes all the time.
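
A minimal sketch of the workaround described above, under stated assumptions: a plain dataclass stands in for the pydantic BaseModel so the example stays self-contained, and the exact layout of the output is made up for illustration. DataFrame.to_string() is used because it renders the whole frame.

```python
from dataclasses import dataclass

import pandas as pd


@dataclass(repr=False)
class MyObj:
    # Stand-in for the pydantic model from the post (assumption).
    df1: pd.DataFrame
    df2: pd.DataFrame

    def __repr__(self):
        parts = []
        for name in ("df1", "df2"):
            frame = getattr(self, name)
            header = f"{name}: {len(frame)} rows x {len(frame.columns)} columns"
            # to_string() renders every row and column of the frame.
            parts.append(header + "\n" + frame.to_string())
        # Trailing newline, mirroring the padding mentioned in the post.
        return "\n\n".join(parts) + "\n"


x = MyObj(pd.DataFrame({"a": [1, 2, 3]}), pd.DataFrame({"b": ["x" * 100]}))
print(repr(x))
```

The row count in the header makes it easy to notice when a rendered line has gone missing in the console.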

Do you think this is normal for an IDE?

As microsoft/vscode#162965 is tracking the remaining issue (where the contents of the debug console messages seem clipped in your use case), I'm closing this one.

I think you shouldn't.

@zljubisic
Author

Furthermore, if this is an idea:

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Current File",
            "type": "python",
            "request": "launch",
            "program": "${file}",
            "console": "integratedTerminal",
            "justMyCode": false,
            "env": {
                "PYDEVD_PANDAS_MAX_COLWIDTH": "1000",
            }
        }
    ]
}

then it doesn't work.

@fabioz
Collaborator

fabioz commented Oct 7, 2022

Many issues in the same place... let's see one at a time:

  1. Printing doesn't work (this is due to pytest capturing and setting -s in the args -- in the settings.json -- does work for me):

i.e.:

{
"python.testing.unittestEnabled": false,
"python.testing.pytestEnabled": true,
"python.testing.pytestArgs": [
"--no-cov", "-s"
]
}

When I add the -s printing does work for me (see image below).

Can you check sys.argv to make sure it's actually using that value in your use case? Maybe the settings aren't being loaded from where you expect in your use case?
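
One way to run that check from the debug console is sketched below. The helper name capture_disabled is hypothetical, and the sketch only inspects the argument list: options that pytest picks up from a configuration file would not appear in sys.argv, so a False result here is not conclusive.

```python
import sys


def capture_disabled(argv=None):
    """Return True if the argument list asks pytest not to capture output."""
    argv = sys.argv if argv is None else list(argv)
    for position, arg in enumerate(argv):
        if arg in ("-s", "--capture=no"):
            return True
        # "--capture no" passed as two separate arguments.
        if arg == "--capture" and argv[position + 1:position + 2] == ["no"]:
            return True
    return False


print(sys.argv)
print(capture_disabled())
```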

image

@fabioz
Collaborator

fabioz commented Oct 7, 2022

  1. Customizing the repr for pandas which is given by the debugger.

Right now the limits are much smaller because pandas is slow to print: it can easily take 0.3 seconds even on a small dataframe, and having many dataframes or a single big one without the current limits can easily take a few seconds, which makes for a very annoying debug experience.

Regarding pandas customization, we really set up lower limits in the debugger (you could use the print to use the pandas configuration directly if you setup pytest to stop capturing the output).

First, when you say "setup pytest to stop capturing the output" am I doing it correctly with settings.json above or there is another procedure?

To setup the limits used in the debugger you need to set the following environment variables (to the values you find appropriate in your use case):

PYDEVD_PANDAS_MAX_ROWS=300
PYDEVD_PANDAS_MAX_COLS=300
PYDEVD_PANDAS_MAX_COLWIDTH=80

How can I set pd.options.display.expand_frame_repr = False in env. vars case? How to set env vars to None? unset?

That isn't really handled by the debugger (so, you have to set it at runtime as usual).

Note that raising those values could make interacting with pandas slower when you're stepping in general in the debugger (I'm not sure why, but getting the representation of a pandas data frame is very slow in pandas and the debugger will ask for that representation whenever a pandas data frame is found).

I really don't get it, why you have so many problems with pandas dataframes. Under the hood, they are numpy arrays which is very fast type of storage.

Yes, storage is good as is usage of its api, just converting that to a string the user can see is very slow in pandas (they probably didn't really optimize those code paths).

You can't really unset those values, but you can change them to different values as you see fit. If you're interested, the code which handles this is:

https://github.com/microsoft/debugpy/blob/main/src/debugpy/_vendored/pydevd/pydevd_plugins/extensions/types/pydevd_plugin_pandas_types.py

-- you can locally comment the code which customizes the pandas display options (i.e.: customize_pandas_options -- just make it yield and comment all the other code).

-- The values for PANDAS_MAX_ROWS / PANDAS_MAX_COLS / PANDAS_MAX_COLWIDTH are loaded from those constants. I guess we could provide a way to set them to None there (but that's currently not supported and yours is the first complaint regarding it... I guess most of it stems from the fact that you should be able to use print to get any format you want, but that's not working in your case due to another library capturing your output).

@fabioz
Collaborator

fabioz commented Oct 7, 2022

Furthermore, if this is an idea:

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Current File",
            "type": "python",
            "request": "launch",
            "program": "${file}",
            "console": "integratedTerminal",
            "justMyCode": false,
            "env": {
                "PYDEVD_PANDAS_MAX_COLWIDTH": "1000",
            }
        }
    ]
}

then it doesn't work.

Note that this will only override the current value if the current value is too big (i.e.: it only makes the repr from pandas smaller, but if it's small already it'll not make it bigger).

So, you probably want to check the current values you have for pandas and customize those...

i.e.:

import pandas
pandas.get_option("display.max_rows")
pandas.get_option("display.max_columns")
pandas.get_option("display.max_colwidth")

if those are smaller you should raise those first...
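
The "only makes it smaller" behavior described above can be sketched as a context manager. This is an illustration under assumptions, not a copy of the debugger's code: the name clamp_display_options and its keyword arguments are made up, and the defaults are the 300 / 300 / 80 values quoted in this thread.

```python
from contextlib import contextmanager

import pandas as pd


@contextmanager
def clamp_display_options(max_rows=300, max_cols=300, max_colwidth=80):
    """Temporarily lower pandas display options that exceed the given limits."""
    overrides = []
    for option, limit in (
        ("display.max_rows", max_rows),
        ("display.max_columns", max_cols),
        ("display.max_colwidth", max_colwidth),
    ):
        current = pd.get_option(option)
        # None means "unlimited", so it counts as larger than any limit.
        if current is None or current > limit:
            overrides.extend([option, limit])
    if overrides:
        with pd.option_context(*overrides):
            yield
    else:
        # Nothing is larger than the limits: leave pandas untouched.
        yield


with pd.option_context("display.max_colwidth", 50):
    with clamp_display_options(max_colwidth=1000):
        # Still 50: a limit of 1000 never raises a smaller current value.
        print(pd.get_option("display.max_colwidth"))
```

Under this logic, setting a limit of 1000 while pandas itself is at 50 changes nothing, which is consistent with the advice to raise the pandas options first.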

Also, just to note, if you're running a test from the testing view you also need to add a "purpose": ["debug-test"], to the configuration...

@zljubisic
Author

zljubisic commented Oct 7, 2022

@fabioz Do you have an explanation for this ?

image

Version: 1.72.0 (user setup)
Commit: 64bbfbf67ada9953918d72e1df2f4d8e537d340e
Date: 2022-10-04T23:20:39.912Z
Electron: 19.0.17
Chromium: 102.0.5005.167
Node.js: 16.14.2
V8: 10.2.154.15-electron.0
OS: Windows_NT x64 10.0.19044
Sandboxed: No

@fabioz
Collaborator

fabioz commented Oct 7, 2022

@zljubisic on the first line the header is aligned (it's just that it appears in the same line as the tests/...:test_print_one from pytest) and the default print seems right for me.

I don't have an explanation for the smaller print after changing pd.options.display.max_colwidth = 500 though (and I can't reproduce it here) -- could be dependent on the pandas version you're using...

Given that the print just calls __str__ which is implemented by pandas itself you'll have to ask the pandas maintainers... when you're dealing with that print the debugger doesn't customize anything, it just calls the print and I don't know how pandas itself is customized in that case (you could set "justMyCode": false in the launch config and step into it if you're curious about why it does that...)

@fabioz
Collaborator

fabioz commented Oct 7, 2022

Actually, I was able to reproduce the case where it gets smaller by making the terminal window smaller (apparently pandas checks the size of the terminal and as it sees that the string wouldn't fit in the terminal it just elides the column).

You'll have to refer to the pandas docs / ask the pandas maintainers how to override that though...
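
To see what terminal width the running process detects, and whether a rendered frame would fit in it, something like the sketch below can be run in the debug console. It uses only the standard library; the helper name fits_terminal is hypothetical, and how pandas itself detects the width is not shown here.

```python
import shutil


def fits_terminal(text, fallback_columns=80):
    """Compare the widest line of text with the detected terminal width."""
    columns = shutil.get_terminal_size(fallback=(fallback_columns, 24)).columns
    widest = max((len(line) for line in text.splitlines()), default=0)
    return widest <= columns, widest, columns


fits, widest, columns = fits_terminal("first  description  middle  last")
print(f"widest line: {widest}, terminal columns: {columns}, fits: {fits}")
```

Passing df.to_string() as the text gives the width the full frame would need.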

@zljubisic
Copy link
Author

zljubisic commented Oct 7, 2022

@fabioz you must be kidding me. I have just executed pytest -s --no-cov in Windows Terminal (outside of vscode) at two different terminal sizes. Both times the dataframe is printed correctly.
Please be careful when you are accusing pandas because pandas is a remarkable python library.
All options from display are very well tested because they are used very often.
No one in data science has small dataframes, so you have to use different display options to see what you want. There is an option context manager (https://pandas.pydata.org/docs/reference/api/pandas.option_context.html) to change options on the fly for a particular case only. I am mentioning this just to explain to you how important it is if you work with dataframes.
Furthermore, if you try "View Value in Data Viewer", it is slow, but more importantly, all column widths are the same; you can make them wider (except the very last column), but if you press refresh, the columns are again of the same width, so the data viewer is useless.

Some of things like jupyter notebooks are quite pleasant to use, vscode as editor is quite good, but working with pandas dataframes is not acceptable.
Try to run this example in pycharm to see how it will look like... and you closed this issue.

image

image

@fabioz
Collaborator

fabioz commented Oct 7, 2022

Please be careful when you are accusing pandas because pandas is a remarkable python library.

I'm sorry if I gave you the impression that pandas is a bad library, that wasn't my intention... all I'm saying is that the debugger has no control over the printing, which is done by pandas itself, and that the printing itself is really slow on the pandas side, but this is just a note on that particular case, not on the awesomeness of pandas ;)

Try to run this example in pycharm to see how it will look like... and you closed this issue.

As far as I know, PyCharm doesn't have a real tty when you're doing that run, so you're not comparing the same thing... I took a quick look at their manual and apparently you can override that using pd.options.display.width = None (or you can make PyCharm behave the same way by setting something like pd.options.display.width = 60).

Furthermore, if you try to "View Value in Data Viewer", it is slow, but more important, all column widths are the same, you can make them wider (except the very last column) but if you press refresh, columns are again of the same width, so data viewer is useless.

This isn't controlled by debugpy. You should check with the vscode jupyter team regarding this.

Try to run this example in pycharm to see how it will look like... and you closed this issue.

Well, the clipping issue is from vscode, the stdout capturing is from pytest, the data viewer is from jupyter and the printing is from pandas.

But now, after thinking a bit about it, I'm going to reopen because there's one thing which could be improved on debugpy which is not customizing anything when the representation is gotten on the debug console repl (right now it makes no distinction if it's being printed for the debug console or a watch window, which is why when you just type df in the debug console the result is different from print(df) -- I guess that particular bit can be improved, so, I'm going to reopen and change the title to reflect that).

@fabioz fabioz reopened this Oct 7, 2022
@fabioz fabioz changed the title from "Pandas dataframes in (py)test debug console are broken" to "When getting the repr of a pandas dataframe in the repl, don't do any customization to avoid getting a big output." Oct 7, 2022
@zljubisic
Author

@fabioz I must say that I am surprised by the way you (as an organization) are dealing with issues.
In my very first post in this thread I showed several problems:

  1. default column width = 80 (PYDEVD_PANDAS_MAX_COLWIDTH=80) is not working
  2. pd.options.display.max_colwidth = is not working
  3. missing rows when df is executed directly in debug console
  4. non-working print(df) (even with "--capture=no")

If I put "env": {"PYDEVD_PANDAS_MAX_COLWIDTH": "1000",} it is still not working as well as if I execute "export PYDEVD_PANDAS_MAX_COLWIDTH=1000".

You said that you have an issue which solves default "--capture=no" (microsoft/vscode-python#19903) but I don't think it is relevant if I put in settings "python.testing.pytestArgs": ["--no-cov", "-s"], and in sys.argv there is --no-capture. So, it is not a matter of defaults; the point is that even when it is set, it is not working.

You also said that playing with pandas options can make interaction slower, but we are not talking about performances here, we are talking about pandas options that don't work at all.

As for the pandas __str__(), it works everywhere except vscode. I never had an issue with it in pycharm, shell, jupyter(lab) notebook... if you execute print outside of vscode and change the terminal size, the dataframe is printed correctly every time.

And then, from time to time last line is not displayed.

After all my remarks, you are changing the subject of the issue from "Pandas dataframes in (py)test debug console are broken" to "When getting the repr of a pandas dataframe in the repl, don't do any customization to avoid getting a big output".
You would like to say that everything is OK with pandas dataframes in vscode?
At least it would be fair to put somewhere that vscode doesn't support pandas dataframes.

Maybe I am a bit irritated by your (as an organization) attitude, because after the presentation of the problem, everything said was practically ignored.
That will not improve the product. There is no point of introducing new features if something like famous pandas library is not working.

At the moment my colleagues and I are using import pdb; pdb.set_trace() for debugging where we are dealing with pandas dataframes.
pdb shows everything as expected. All lines are presented, it reacts to the display options... strange, isn't it?

Anyway, in case that you think what I have reported here is not OK, maybe I can help in the process.

@int19h
Contributor

int19h commented Oct 13, 2022

Please bear in mind that this is a repository specifically for the debugger - which, by the way, is not even VSCode-specific. As explained above, this is really a set of different issues, some of which have to do with the simple fact that Debug Console is not a real terminal (and thus cannot be compared to e.g. running pdb in one), and some are bugs in code that is not even in this repo.

So let's try to disentangle this issue into pieces that can be individually tackled and create separate issues for them in the appropriate repositories. So far as I can tell, these are:

  1. Missing row in output - looks like you've already filed The representation of a variable introspected in the Debug Console can be clipped on VSCode. vscode#162965
  2. Various issues with the Data Viewer - these should go into https://github.com/microsoft/vscode-jupyter; I took a quick look but couldn't find anything that would be an obvious dupe.
  3. print() not outputting anything - the puzzle here is that this only repros without -s, and there's a known issue around pytest, hence the assumption. If this is something else, we'll need to dig further; can you please file a separate issue (in this same repo)? I think it would also be interesting to see what sys.stdout is set to - this would be the most reliable way to tell if pytest output capture is still somehow in effect or not.
  4. Not being able to set column width via PYDEVD_PANDAS_MAX_COLWIDTH - this one definitely looks like a bug on our end.
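
For item 3, a small sketch of the sys.stdout check that can be pasted into the debug console. The helper name describe_stdout is hypothetical. A replaced stream does not by itself prove that pytest is the one capturing, since other tools may wrap sys.stdout as well, so the reported type is the more useful hint.

```python
import sys


def describe_stdout():
    """Summarize what sys.stdout currently points to."""
    stream = sys.stdout
    stream_type = type(stream)
    isatty = getattr(stream, "isatty", None)
    return {
        "type": f"{stream_type.__module__}.{stream_type.__qualname__}",
        # False means something replaced the interpreter's original stdout.
        "is_original": stream is sys.__stdout__,
        "isatty": bool(isatty()) if callable(isatty) else False,
    }


print(describe_stdout())
```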

Did I miss anything?

@zljubisic
Author

Hi @int19h and thank you for stepping in here.
0. yes, looks like we are trying to solve it in microsoft/vscode#162965 but with no luck

  1. thanks for pointing out where the issue should be opened (I will open it there)
  2. looks like this is partially solved through print() doesn't work in debug console while debugging pytests vscode-python#19949, but still there is a problem with additional one or two empty line(s) that shows from time to time (print() doesn't work in debug console while debugging pytests vscode-python#19949 (comment))
  3. yes, I am not able to manage anything using
    PYDEVD_PANDAS_MAX_ROWS=300
    PYDEVD_PANDAS_MAX_COLS=300
    PYDEVD_PANDAS_MAX_COLWIDTH=80

image

As you can see on the picture above max width of all df columns = 136 (df is from the very first post here). We could add some characters for space between columns, for index... you can see what is dataframe representation in the debug console with default settings.

Default column width is PYDEVD_PANDAS_MAX_COLWIDTH=80 and if you consider dataframe like this one (df2 = pd.DataFrame({'first': range(3), 'description': ['0123456789' * 8]*3, 'middle': ['0123456789' * 8]*3, 'last': range(3)})), you can see that PYDEVD_PANDAS_MAX_COLWIDTH is not respected at all.

image

Even this is truncated:

image

Furthermore, after creating several issues I still don't know what to do to be able to see a non-truncated dataframe in the debug console. For example, if I created a dataframe described in the very first post, what I have to do to see it in its entirety with no "..." nor breaking lines, let's say in the same way as I would see it in regular python console or jupyter(lab) notebook, or pdb...?

@fabioz
Collaborator

fabioz commented Oct 17, 2022

what I have to do to see it in its entirety with no "..." nor breaking lines, let's say in the same way as I would see it in regular python console or jupyter(lab) notebook, or pdb...?

Besides the other configurations for the column width you also need to configure the display width for pandas (because the default value is based on the number of columns from the terminal) with:

pandas.options.display.width = None

@zljubisic
Author

@fabioz, please enter the debug console, create this dataframe as:

df = pd.DataFrame(
    [
        ['A', 'short text', 'very very very very very very very very very very long text', 1],
        ['A', '1 very 2 very 3 very 4 very 5 very 6 very 7 very 8 very 9 very 10 very text', 'short text', 2],
        ['A', 'not so long text', 'also short text', 3],
    ], columns=['first', 'description', 'middle', 'last']
)

and tell me what I have to do to see all rows and columns in their entirety.

@fabioz
Collaborator

fabioz commented Oct 19, 2022

The defaults that pandas has when it initializes when running in the VSCode terminal don't allow for much to be shown, so, what you have to do is configure pandas to disable those limits with:

pd.options.display.max_colwidth = None
pd.options.display.width = None

See screenshot below:

image
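
The same settings can also be applied for a single call only, using pandas.option_context. The sketch below is one possible way to do that and uses the dataframe from the first post. The helper name untruncated_repr is hypothetical, and the extra max_rows, max_columns and expand_frame_repr options go beyond the two settings above. Note that this only controls what pandas produces; clipping done afterwards by the console is a separate matter.

```python
import pandas as pd


def untruncated_repr(frame):
    """Return repr(frame) with the display limits lifted for this call only."""
    with pd.option_context(
        "display.max_rows", None,
        "display.max_columns", None,
        "display.max_colwidth", None,
        "display.width", None,
        "display.expand_frame_repr", False,
    ):
        return repr(frame)


df = pd.DataFrame(
    [
        ["A", "short text", "very very very very very very very very very very long text", 1],
        ["A", "1 very 2 very 3 very 4 very 5 very 6 very 7 very 8 very 9 very 10 very text", "short text", 2],
        ["A", "not so long text", "also short text", 3],
    ],
    columns=["first", "description", "middle", "last"],
)
print(untruncated_repr(df))
```

Because the options are restored when the with block exits, the rest of the session keeps whatever display settings it had before.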

@zljubisic
Author

zljubisic commented Oct 19, 2022

@fabioz I could say that this works, although I don't know why dataframe is not printed from very first multiline command and also, I don't see the difference between df and print(df).

image

Thanks.

In case you need it:

import pandas as pd
import numpy as np
nrows = 10
ncols = 30
col_names = [f"col_{cnt:02}" for cnt in range(ncols)]
df = pd.DataFrame(np.random.randint(0,100,size=(nrows, ncols))*1000000, columns=col_names)
df

@zljubisic
Author

But on the second try it looks like you are not right:

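import string
import pandas as pd
import numpy as np
nrows = 10
ncols = 50
col_names = [f"col_{cnt:02}" for cnt in range(ncols)]
# pd.util.testing.rands_array (used in the original post) is not available in
# recent pandas versions, so the random 50-character strings are built with numpy.
chars = np.array(list(string.ascii_letters + string.digits))
data = [["".join(np.random.choice(chars, 50)) for _ in range(ncols)] for _ in range(nrows)]
df = pd.DataFrame(data, columns=col_names)
df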

image

@fabioz
Collaborator

fabioz commented Oct 19, 2022

Just doing df will end up being constrained by the values in the environment variables in the constants:

PYDEVD_PANDAS_MAX_ROWS=300
PYDEVD_PANDAS_MAX_COLS=300
PYDEVD_PANDAS_MAX_COLWIDTH=80

So, you'll need to set those to really large numbers if you don't want those constraints when doing just a df (because in this case the value is further constrained by the debugger -- the plan is to remove this constraint when evaluating in the repl as a part of resolving this ticket).

As for why it doesn't work from the start like that it's because pandas itself sets small limits by default when running inside the VSCode terminal. You can see that by checking the values it has set with:

print(pd.options.display.max_colwidth)
print(pd.options.display.width)

The debugger doesn't really have a say in that, it's pandas that decides/sets those values...

i.e.:
image

@zljubisic
Author

@fabioz maybe you haven't noticed, but in the picture above you can see that I haven't got the full column size with either df or print(df) after applying

pd.options.display.max_colwidth = None
pd.options.display.width = None

That is the reason why I said that your solution is not working.
I am repeating the picture with arrows that will show up what is wrong.
Please notice what has happened with the row with index 0.
Do you agree that this is wrong?

image

@zljubisic
Author

@fabioz regarding these three variables:

PYDEVD_PANDAS_MAX_ROWS=300
PYDEVD_PANDAS_MAX_COLS=300
PYDEVD_PANDAS_MAX_COLWIDTH=80

can you please provide instructions to me when and where to put them if I want their influence in debug console?

@fabioz
Collaborator

fabioz commented Oct 19, 2022

@fabioz regarding these three variables:

PYDEVD_PANDAS_MAX_ROWS=300
PYDEVD_PANDAS_MAX_COLS=300
PYDEVD_PANDAS_MAX_COLWIDTH=80

can you please provide instructions to me when and where to put them if I want their influence in debug console?

You can put them either in your OS environment variables (and then restart VSCode so that it picks up those values) or you can put them in the launch configuration env.

i.e.:
image

      "env": {
        "PYDEVD_PANDAS_MAX_ROWS": "999999",
        "PYDEVD_PANDAS_MAX_COLS": "999999",
        "PYDEVD_PANDAS_MAX_COLWIDTH": "999999"
      }

Note that if you're running from a shortcut -- such as test run -- you have to set the launch configuration purpose too (i.e.: "purpose": ["debug-test", "debug-in-terminal"])

@fabioz
Collaborator

fabioz commented Oct 19, 2022

That is the reason why I said that your solution is not working.
I am repeating the picture with arrows that will show up what is wrong.
Please notice what has happened with the row with index 0.
Do you agree that this is wrong?

I agree that does seem weird, but that seems some issue in the layout in VSCode (possibly the same one you opened a report about in VSCode already)... If you scroll the console all the way to the right, do you see the start of the line 0? Can you reproduce it if you print it multiple times?

When I try it here it seems correct -- i.e.:

image

@zljubisic
Author

@fabioz I have just tried it. First time it went well, but second print destroyed index line 2.

image

there is no line 2 index at all.

image

And then if I repeat print I am getting mixed results. Please notice the different vertical space between print(df) and the dataframe content.

image

For example, look at last dataframe line with index 0. It starts with "fmnPNf4WEmTGCy" which is ending of col_28. Here is csv of the dataframe's first line so you can find it (bold):

0,5jqlC8Kb97v77FLgpagGmIdAyrJjIzzayYHJU54Tu3XU3GyTWI,Wkppuo4RuuTpWkyeQrEyyofrEGVBfa2sc6yzOSLkf4LuttHqW5,GjkPkMenosi8zN7j28sJT33YbHRT8PWFDb8qrTmgxVE4IhUImQ,10KURM36dyFzKBn7YXVoKwlKogbh255UUKbv1wyGk0mEpTQyDs,u2qlIPWq3N2xWxb5BvP3uKWtqg2EVGSGd8XPCwyN6qSved3L4R,VqJiiTIlrnoqsh8s1GvsMQOmSImkHoU7QBU75mcSsEN5W8lhj7,L85qwggHLcsRpxqU6iMgRsuI1KyNdnlfqdng2yy9WgXONOy2jD,7cbQQlzgkzI22Gi8e4IQabmn4CA7SsHTfMIEkQgp7XCsK71x7i,5XnpCVmlgBIh0B8w4HlvTEpYQt7UhCR0P0xgEAIBYSP5oWtZEx,wFvweheFLPpxDSVRucNPSnEj8pR3yLr5dS5pRcosHrvqRNUafG,pK7YyqEAX3ccLWqWPyNK6cVKRDLnpiSmDMNxV0Mbw95Sspn8Lj,nje3RxD4nZ4Q2s1RmIwpBfH7bXXXkPOvT8OOHIAY4brOBD7FEs,gBQ5DT3P71hwYHVM3z59nUeNaph5znK4SgtTSJNwaHR8Wyv214,dd3Ldhv8R9qUt0JwHbtVy8Oytz43A3eQo1KlzWnugQSyS3Y67X,cYB2pPgFkl68Z5YKekfnvZ6rdVSsWyp2J1oMAOUbvLwjRDRbsr,XRD91yYpPAUmXRW89dpWtGPdiIcijjoOAVTymDp8tCctoLP0K1,hiLS1q2yoirwMOQQxcBb97ap6jHKxir0hXNklsOgxn5smB8kzj,AcTqLjt41BQNCDogJxYRGoB2EWqUwyFvPKBfpeBQHrigmPrwTA,mKdulC6kp6yB5S5FvsiCfDQugs8kOe6Slbg86Te40pC5c1RZ5B,IK56HsHiGel2B6cVOu2Uolcg1bcxzMOU0ZGRrAX0Au38yKe1N8,QJkpaGVOnu6lwWOtcRyik8A5OBW9E1YcGH5RgfFuRO1yz6N4ua,67LsvSspUPT2TSTtQxjrc2XzU5uZ4S0yflUUIlZRpbbkbok5jT,nouQE70AENiP6PXA5YkKSidkhPulWowgCjknjFJ2N2aWqpbfnw,C3zUh6axDfzjYGnSHB8pOy9XHigwu4HM6VsbeACIhopR0tiEqB,mWVsz5BdJCZ0SXD4DUZLDIavROAg9I1MKVxcyvPLr9lx7R0uaM,D7w7G9OoeJZ8UVTTLnjz9qZDvnmEdd7Ui9ZBsAfMTnT5hhNHPJ,KZhuxleg5Stdb1oL40FIWOCpHOLveXzahNp8PbvYuYODkrORYp,dG5CYKMr5rxZQEVG5bsENbyi9FgWrKWV82xyF4hBEQubBgOHqQ,h4YhPoJbdbaKAEN2Bmf3kC9EhaDXLVGLmAIOfmnPNf4WEmTGCy,FfKPSCMmbcGmIDiDXDxeIucefgMhI4mmGOEZ7EwFNY01FogkwM,wHPh9EnXJCUTwQXpRHSxOhrro5TuNR06LnclCjAlPCfZbJh8L4,QJ0y4lNMdlHRS0dTgt9Yee9C2MFYUKwpwsIeNlYEuaclVV3M79,YffsyWmandk8E1A1jrrXZfm6rdhuSyI6LqcfMOipefLYAB19kr,TxlkKMEVYRPM9K286SHgTr0jLqp42W9WLDI9SdHidSFnnfWMaL,9tYVaeMRp1Q0ARWlFmZilpZJsSc1WCMCkleQCsr6w7tyi1ghJc,Ys2EkSh242RZbFe30tuE1IOhVv0LTGePXEam9kOO80kJtYUuFr,7nOeZsVrV7bUVXRpeILnuMOcv2L6RPf8UHteDuxWVDHzRyMEYd,UI55qkQCu3sOOeDgSNzThgN4pCLjcBAek6tned7KkvAKyNzWWl,vwNqIfon4MztGtQ76pEiqbxDhs5YmxaOsimEVky1CVn6q1WXjm,3jeKYOunV
VjPbGwQrQHnC3LtCjcgEt33hsg6zs8Skwm5LKoFO5,1UPoLAdbVmpI3J8CRLCtIbTYFMDguU7O2WRBLv7HQCJa3zvMvt,FYjwrlOVKCs19bQ4mM9XiJj0geNoMgMp4iSTpUMovCKQIct8c1,RycVNrr4zKlQLYnocNl4Mq3zqo2EbhDQgczo0cSvs7XJbNn4A5,aoEmCjSvcSRQCE40PEnIBopY9hDgK4blkG3kkuvlZbpi9IeU5P,z2jsgsiihZVMxCyIxwEIcR2LnTsqaVW0ESLyimG4lzHKIYAEvN,NX0dka4EdEvN1cRgtnN2F1lGI5W8UbPYh8w43PCAArUVPdXqkI,sIkOJ4j54SIFHDmGHWEqSXIkDmz1mf57wYUb8wuw6X7VLMZlZQ,qT3rJi5jhh8hpZ8QTLssydrjSuJZxLqNr9QAAJlO1mQyCxvD5k,09YYGQBaVJ6W585d1EFNebwwdg1sWuC2QDnnfgdgEeBRUXnmZ7,DkQYOnyqt8wdlu4mtIX5h56BZMxpl5XhLK1wrsKUUBj9HfbzGs

When I add last dataframe line that is swallowed from time to time, you see how it is impossible to work with pandas dataframes in debug console.

@fabioz
Collaborator

fabioz commented Oct 20, 2022

I see, unfortunately this seems to be a bug in VSCode itself and thus is not fixable in debugpy.

Please add this information to the existing issue you have reported at the VSCode repository so that the VSCode team can take a look at it and fix it accordingly (you may want to mention that you're using the debug console with word wrap disabled as that could be playing a part in it).

@zljubisic
Author

I see, unfortunately this seems to be a bug in VSCode itself and thus is not fixable in debugpy.

Please add this information to the existing issue you have reported at the VSCode repository so that the VSCode team can take a look at it and fix it accordingly (you may want to mention that you're using the debug console with word wrap disabled as that could be playing a part in it).

Can you please help me with it? I have several issues opened and I am not sure what to put where.

@zljubisic
Author

@fabioz regarding these three variables:

PYDEVD_PANDAS_MAX_ROWS=300
PYDEVD_PANDAS_MAX_COLS=300
PYDEVD_PANDAS_MAX_COLWIDTH=80

can you please tell me when and where to set them so that they take effect in the debug console?

You can put them either in your OS environment variables (and then restart VSCode so that it picks up those values) or you can put them in the launch configuration env.

My setup is the following:
I am running VS Code on Windows 10 and then connecting to a Hyper-V CentOS 7 VM.
When you say "You can put them either in your OS environment variables", you mean I could use terminal in centos 7 in order to set variables?

Another option is to use launch.json like this:

{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Current File",
            "type": "python",
            "request": "launch",
            "program": "${file}",
            "console": "integratedTerminal",
            "justMyCode": false
        },
        {
            "name": "Python: Debugging",
            "type": "python",
            "request": "launch",
            "program": "${file}",
            "console": "integratedTerminal",
            "env": {
                "PYDEVD_PANDAS_MAX_ROWS": "999999",
                "PYDEVD_PANDAS_MAX_COLS": "999999",
                "PYDEVD_PANDAS_MAX_COLWIDTH": "999999"
            },
            "purpose": ["debug-test", "debug-in-terminal"],
            "justMyCode": false
        }
    ]
}

First configuration is the default one, second is for debugging.
Please do confirm that second configuration is what you have meant.

@fabioz
Collaborator

fabioz commented Oct 20, 2022

When you say "You can put them either in your OS environment variables", you mean I could use terminal in centos 7 in order to set variables?

You can do that if that's the same terminal that'll make the run afterwards (i.e.: the environment variables need to be inherited in the run from that command -- the important part is that the environment variable needs to be available in os.environ in the target python process).
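One way to check that condition is to run a small script in the target Python process (or paste it into the debug console) and see what `os.environ` actually contains. This is only a sketch: the helper name `pydevd_pandas_env` is made up for illustration, and the variable names are the ones mentioned in this thread.

```python
import os

# Variable names as mentioned in this thread.
PYDEVD_PANDAS_VARS = (
    "PYDEVD_PANDAS_MAX_ROWS",
    "PYDEVD_PANDAS_MAX_COLS",
    "PYDEVD_PANDAS_MAX_COLWIDTH",
)


def pydevd_pandas_env(environ=None):
    """Hypothetical helper: return {name: value or None} as seen by this process."""
    if environ is None:
        environ = os.environ
    return {name: environ.get(name) for name in PYDEVD_PANDAS_VARS}


if __name__ == "__main__":
    # A variable shown as "<not set>" was not inherited by this process.
    for name, value in pydevd_pandas_env().items():
        print(f"{name} = {value if value is not None else '<not set>'}")
```

If a variable shows up as not set here, it was not inherited from the terminal or the launch configuration, so the debugger running in this process would not see it either.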

@fabioz
Collaborator

fabioz commented Oct 20, 2022

First configuration is the default one, second is for debugging.
Please do confirm that second configuration is what you have meant.

Yes, that's what I've meant.

@zljubisic
Author

@fabioz does this help?

[2022-10-25 14:26:10.356] [renderer1] [error] Could not find pty on pty host: CodeExpectedError: Could not find pty on pty host
    at a._throwIfNoPty (/home/vagrant/.vscode-server/bin/d045a5eda657f4d7b676dedbfa7aab8207f8a075/out/vs/platform/terminal/node/ptyHostMain.js:27:7252)
    at a.updateTitle (/home/vagrant/.vscode-server/bin/d045a5eda657f4d7b676dedbfa7aab8207f8a075/out/vs/platform/terminal/node/ptyHostMain.js:27:2984)
    at Object.call (/home/vagrant/.vscode-server/bin/d045a5eda657f4d7b676dedbfa7aab8207f8a075/out/vs/platform/terminal/node/ptyHostMain.js:16:8386)
    at c.onPromise (/home/vagrant/.vscode-server/bin/d045a5eda657f4d7b676dedbfa7aab8207f8a075/out/vs/platform/terminal/node/ptyHostMain.js:15:5833)
    at c.onRawMessage (/home/vagrant/.vscode-server/bin/d045a5eda657f4d7b676dedbfa7aab8207f8a075/out/vs/platform/terminal/node/ptyHostMain.js:15:5216)
    at /home/vagrant/.vscode-server/bin/d045a5eda657f4d7b676dedbfa7aab8207f8a075/out/vs/platform/terminal/node/ptyHostMain.js:15:4502
    at m.invoke (/home/vagrant/.vscode-server/bin/d045a5eda657f4d7b676dedbfa7aab8207f8a075/out/vs/platform/terminal/node/ptyHostMain.js:11:145)
    at D.deliver (/home/vagrant/.vscode-server/bin/d045a5eda657f4d7b676dedbfa7aab8207f8a075/out/vs/platform/terminal/node/ptyHostMain.js:11:2275)
    at O.fire (/home/vagrant/.vscode-server/bin/d045a5eda657f4d7b676dedbfa7aab8207f8a075/out/vs/platform/terminal/node/ptyHostMain.js:11:1853)
    at process.X (/home/vagrant/.vscode-server/bin/d045a5eda657f4d7b676dedbfa7aab8207f8a075/out/vs/platform/terminal/node/ptyHostMain.js:9:20837)
    at process.emit (node:events:526:28)
    at emit (node:internal/child_process:938:14)
    at processTicksAndRejections (node:internal/process/task_queues:84:21)

@fabioz
Collaborator

fabioz commented Oct 28, 2022

@fabioz does this help?

No, logs are needed on the python side (the logs from vscode don't really help).

@fabioz
Collaborator

fabioz commented Dec 2, 2022

I fixed the part that required users to set the following in the environment in order to get the original pandas value when evaluating in the repl:

"PYDEVD_PANDAS_MAX_ROWS": "999999",
"PYDEVD_PANDAS_MAX_COLS": "999999",
"PYDEVD_PANDAS_MAX_COLWIDTH": "999999"

So, now when evaluating in the repl the debugger does no additional customization: it just calls repr(pandas_object), and the resulting behavior is whatever pandas is configured to do.
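As a rough illustration of what "whatever pandas is configured to do" means in practice, the sketch below builds the dataframe from the original report and takes its `repr` under explicit display options. The option values are only an example chosen to show every column on one line; they are not something the debugger sets.

```python
import pandas as pd

# Same data as in the original report.
df = pd.DataFrame(
    [
        ['A', 'short text', 'very very very very very very very very very very long text', 1],
        ['A', '1 very 2 very 3 very 4 very 5 very 6 very 7 very 8 very 9 very 10 very text', 'short text', 2],
        ['A', 'not so long text', 'also short text', 3],
    ],
    columns=['first', 'description', 'middle', 'last'],
)

# Example settings (an assumption, pick your own): show all columns,
# do not truncate cell text, and do not wrap the frame across lines.
with pd.option_context(
    'display.max_columns', None,
    'display.max_colwidth', None,
    'display.expand_frame_repr', False,
):
    text = repr(df)

print(text)
```

The same options can be set once for the session with `pd.set_option(...)` or `pd.options.display...` before evaluating `df` in the debug console.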

Other issues reported here aren't fixable by debugpy (those should be fixed in jupyter or VSCode itself).
See #1078 (comment) for details on those.

@zljubisic
Author

Thanks @fabioz, I am waiting for the new release to check it.

@carschandler

Props for your patience, persistence, and many responses throughout this thread @fabioz!
