ENH: A new method that will more efficiently display 'tall' df #42837

Open
adamrossnelson opened this issue Jul 31, 2021 · 4 comments
Labels
Enhancement · Needs Discussion (Requires discussion from core team before further action) · Output-Formatting (__repr__ of pandas objects, to_string)

Comments

@adamrossnelson

Is your feature request related to a problem?

I wish there were a way to preserve vertical screen space when I'm inspecting 'tall' data frames. These are data frames with many observations but few columns. Thus, they're tall. I use the word tall to avoid conflating with long data.

Describe the solution you'd like

I believe this function accomplishes the goal:

import pandas as pd

def insp(df, n=5, parts='ht'):
  # Move each part's original index into a 'loc' column so the parts
  # line up row by row when concatenated side by side.
  def part(rows):
    return rows.reset_index().rename(columns={'index':'loc'})
  if parts == 'ht':
    display = pd.concat([part(df.head(n)), part(df.tail(n))],
                        axis=1,
                        keys=['head','tail'])
  elif parts == 'hs':
    display = pd.concat([part(df.head(n)), part(df.sample(n))],
                        axis=1,
                        keys=['head','sample'])
  elif parts == 'st':
    display = pd.concat([part(df.sample(n)), part(df.tail(n))],
                        axis=1,
                        keys=['sample','tail'])
  elif parts == 'hst':
    display = pd.concat([part(df.head(n)), part(df.sample(n)), part(df.tail(n))],
                        axis=1,
                        keys=['head','sample','tail'])
  else:
    # Reject unsupported values of parts explicitly.
    raise ValueError("parts must be one of 'ht', 'hs', 'st', 'hst'")
  return display
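
To make the shape of the result concrete, here is a minimal sketch of the same concat-with-keys pattern on a made-up tall frame. The helper name `side_by_side` and the sample data are assumptions for illustration only, not part of the proposal.

```python
import pandas as pd

# Hypothetical tall frame: 1,000 rows, 2 columns.
df = pd.DataFrame({"x": range(1000), "y": [i * 0.5 for i in range(1000)]})

def side_by_side(df, n=5):
    """Head and tail next to each other, original labels kept in 'loc'."""
    pieces = [
        part.reset_index().rename(columns={"index": "loc"})
        for part in (df.head(n), df.tail(n))
    ]
    # keys= adds an outer column level naming each block.
    return pd.concat(pieces, axis=1, keys=["head", "tail"])

view = side_by_side(df, n=3)
print(view.shape)                       # (3, 6): 3 rows, 2 blocks x (loc, x, y)
print(view[("tail", "loc")].tolist())   # [997, 998, 999]
```

The 'loc' column is what lets the tail block share rows 0..n-1 with the head block while still showing where each row came from.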

API breaking implications

Not applicable and/or unsure.

Describe alternatives you've considered

The function above works well for me. This Colab notebook demonstrates it:

https://colab.research.google.com/drive/1mcNLlG6RVbhoXGCMdVyUryxKSgcaRPNC?usp=sharing

But I'd propose a new method, something like pandas.DataFrame.inspect() or pandas.DataFrame.insp(), to complement the .head(), .sample(), and .tail() methods.
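
Until something like this exists in pandas itself, one way to try the method-style call is pandas' accessor registration API. The following is only a sketch: the accessor name `insp`, the class name, and the choice to accept any combination of the letters h, s, and t are assumptions for illustration, not an agreed design.

```python
import pandas as pd

@pd.api.extensions.register_dataframe_accessor("insp")  # accessor name is an assumption
class InspAccessor:
    # Map each letter in `parts` to the DataFrame method that selects rows.
    _PARTS = {"h": "head", "s": "sample", "t": "tail"}

    def __init__(self, pandas_obj):
        self._obj = pandas_obj

    def __call__(self, n=5, parts="ht"):
        if not parts or any(p not in self._PARTS for p in parts):
            raise ValueError(f"parts must use only 'h', 's', 't'; got {parts!r}")
        names = [self._PARTS[p] for p in parts]
        pieces = [
            getattr(self._obj, name)(n).reset_index().rename(columns={"index": "loc"})
            for name in names
        ]
        return pd.concat(pieces, axis=1, keys=names)

df = pd.DataFrame({"x": range(100)})
out = df.insp(n=2, parts="hst")
print(out.shape)  # (2, 6): head, sample, tail blocks of (loc, x)
```

Because the accessor object is callable, `df.insp(...)` reads like a regular method while staying outside the pandas codebase.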

Additional Context

Inspired by the first suggestion in this article:
https://towardsdatascience.com/pandas-hacks-that-i-wish-i-had-when-i-started-out-1f942caa9792
(See the first 'hack' of the three).

@adamrossnelson adamrossnelson added Enhancement Needs Triage Issue that has not been reviewed by a pandas team member labels Jul 31, 2021
@jreback
Contributor

jreback commented Jul 31, 2021

there have been multiple requests for something like this

pls search and link similar issues

@adamrossnelson
Author

I thought that would be the case. I'm not finding the other related issues. I believe you that they're there... seems like this would be a common request. Apologies that my keyword search skills are falling short here.

@jreback
Contributor

jreback commented Aug 2, 2021

you can look thru these:
#38827, #27000, #18691, #9179, #7005

@adamrossnelson
Author

Thanks for the finds. My reads...

#38827 - Expanded views.
This one is different from what I submitted. I am a fan of the idea presented in #38827.

#27000 - Updated logic on display truncation.
Also different. But also not a bad idea.

#18691 - Add an 'ends' method.
Yes. This is what I'm writing about. Though I'm also suggesting to display a sample (not just head and tail).

#9179 - Interactivity
Another good idea. Give an option to inspect data with a GUI.

#7005 - Left and Right methods . . .
Also different. But clever idea. I suggest a method chain as a solution...

# Display left five columns
df.transpose().head().transpose()

# Display right five columns
df.transpose().tail().transpose()
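
As a possible alternative to the transpose chain, positional slicing with `.iloc` selects the same columns directly. A sketch on a made-up frame (the column names are hypothetical); it also shows that the round trip through `transpose()` turns the columns into object dtype when the frame mixes types, while `.iloc` keeps the original dtypes.

```python
import pandas as pd

# Hypothetical frame: eight integer columns plus one string column.
df = pd.DataFrame({f"c{i}": [i, i + 1] for i in range(8)})
df["label"] = ["a", "b"]

left = df.iloc[:, :5]    # left five columns
right = df.iloc[:, -5:]  # right five columns

print(list(left.columns))   # ['c0', 'c1', 'c2', 'c3', 'c4']
print(list(right.columns))  # ['c4', 'c5', 'c6', 'c7', 'label']

# Same columns via the transpose chain, but the dtypes are no longer integer.
via_transpose = df.transpose().tail().transpose()
print(right["c4"].dtype, via_transpose["c4"].dtype)
```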

Thanks to @jreback for the leads. I'm not sure about next steps. My goal was to put this out there as a possible enhancement. I don't think I have the time to contribute right now. Happy to chat a bit more about other options.

@mroeschke mroeschke added Needs Discussion Requires discussion from core team before further action Output-Formatting __repr__ of pandas objects, to_string and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Aug 21, 2021
3 participants