Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add handling of multiple level column names in DataSets #212

Merged
merged 2 commits into from Oct 31, 2021
Merged

Add handling of multiple level column names in DataSets #212

merged 2 commits into from Oct 31, 2021

Conversation

baronet2
Copy link
Contributor

@baronet2 baronet2 commented Apr 2, 2021

Hello @swar ,

Thanks for this amazing repo!

I've figured out a patch for #211 and #98 that is robust to handle DataSet objects with an arbitrary number of levels for the column names.

For example, the following command should now work:

from nba_api.stats.endpoints import leaguedashteamshotlocations
raw_data = leaguedashteamshotlocations.LeagueDashTeamShotLocations()
print(raw_data.get_data_frames()[0])

When running the tests, I got the same number of failures (113, mostly on test_play_by_play_regex_integration) as before starting my work, so I believe my fix doesn't crash any previously-working features.

Hope this helps, and happy to discuss the changes.

Ethan Baron

@quepasapedro
Copy link

I tested this with responses from two the two endpoints below, and it worked well! I don't know how many other endpoints return multi-level column structures, so I'm not sure if there are any edge cases we need to look out for, but this seems like a solid solution for these cases.

endpoints.LeagueDashPlayerShotLocations(distance_range='5ft Range', per_mode_detailed='PerGame')
endpoints.LeagueDashTeamShotLocations(distance_range='By Zone', measure_type_simple='Opponent')

One usability question: for columns listed as columnsToSkip, the logic here is naming the first level '', which could potentially lead to some confusion, as it's not visually apparent that those columns have a multi-level label, or what it is. It looks really nice visually (see PLAYER_ID through AGE columns below), but do we think it would be worth adding an explicit label to that level, so that it's easier for users to call it?

image

I'm not sure what a nice, generalizable name would be, but we could change this:

test_df['']['PLAYER_NAME']

to something like

test_df['grouping_cols']['PLAYER_NAME']

Just a thought! Thanks for the quick fix on this, @baronet2!

@baronet2
Copy link
Contributor Author

baronet2 commented Apr 2, 2021

Hi @quepasapedro ,

Thanks for having a look and I'm glad it seems to be working.

You raise a good point about usability...

In the commit above, I added functionality to give the levels names. That should make the MultiIndex structure a bit more clear. Now, the PLAYER_NAME column in your example can be accessed using one of these two options:

from nba_api.stats.endpoints import LeagueDashPlayerShotLocations
raw_data = LeagueDashPlayerShotLocations(distance_range='5ft Range', per_mode_detailed='PerGame')
test_df = raw_data.get_data_frames()[0]
test_df.xs('PLAYER_NAME', level=1, axis=1) # Using the level number
test_df.xs('PLAYER_NAME', level='columns', axis=1) # Using the level name (new)

I can't really think of a more intuitive way to structure things. Perhaps a note in the documentation for the relevant endpoints on MultiIndex dataframes should be added?

@baronet2
Copy link
Contributor Author

Hi @swar ,

Just wondering if there is anything else I'm supposed to do to get this reviewed? It's my first time contributing here, so not too sure how this process works.

Thanks!

Ethan

@swar
Copy link
Owner

swar commented Apr 13, 2021

Hey Ethan,

I will be picking up all the work I left behind here hopefully soon. I've just had a very busy last few months and upkeep on this library ends up eating a few days because of all the changes made by the NBA Stats page and just difficulty in testing. I will update you when I get some time to review.

Sorry for the delay!

@baronet2
Copy link
Contributor Author

Sounds good, thanks for the quick response! 🙂

@swar swar merged commit 95e656b into swar:master Oct 31, 2021
@baronet2
Copy link
Contributor Author

baronet2 commented Nov 1, 2021

Thanks for merging!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants